Information. Is it an asset or a liability? Well, that depends on who you're speaking to in an organization. With today's trends in big data, most organizations are looking at big data as an opportunity. At the same time, many IT organizations are struggling with data growth and flat budgets. So how do you reconcile the opportunity of big data with the challenges of data growth and the implications for the organization in terms of risk management? Well, we're here today with Randolph Kahn, who is the principal of Kahn Consulting. Randy is an expert on legal, compliance, and policy issues related to business information, electronic records, and information technology. Randy, welcome to the Cube.
Hey, thanks so much, Dave.
So we're here to talk about a new book that Randy is in the process of writing, which should be out in a couple of months. It's called Chucking Daisies: How Companies Deal with Big Data. So, well, first of all, congratulations on almost having the book done. And this is, I think you said, your sixth book, which is fantastic. So tell us about the book. What's the premise behind it?
So, you know, just the way you started the conversation, this idea that information is an asset, well, that's true. But if organizations can't actually find this stuff, it's no longer an asset. It's a liability, it's a cost, it's a pile of risk, it's a pile of inconvenience, it's a pile of inefficiency, right? So Chucking Daisies is a bible for IT professionals with simple rules that say, you know, thou shalt: you, IT professional, do the following things to manage information, right-size your information footprint, keep the stuff that you need to keep, get rid of the crud. And it's really a book to help IT professionals walk through that problem. You've got this big pile of data: what do I keep, what do I get rid of? How do I do it in a legally defensible kind of way? And that's really what Chucking Daisies is about.
So, where'd you get the name?
Well, if you think about a flower, right, you know, we're going into fall, it's sort of the end of the growing life cycle, right? You go outside and see these beautiful leaves, you know, that are just a couple of weeks away from looking like death. Well, if you take a look at the life cycle of information, right, that daisy in spring looks beautiful, and it's grabbing sun from the sky, and it's crisp, and the colors of the leaves are brilliant. And then over the course of its short little life, right, the value of that thing declines. I don't think that most IT professionals think about information as having a life cycle. I mean, in fact, if you look at most organizations today, most folks are keeping everything without regard to what the heck it is, and it's costing them huge amounts of dough, right? So this idea of the daisy, the idea of chucking the daisy at the end of its useful life, when its value has declined, when it's no longer attractive, that's the same with information. At the end of the information's life cycle, the idea of parking it somewhere in a repository or a shared drive, it's an expense, it's a liability. And I think IT professionals need to understand the importance of that life cycle.
You also use big data in the title. And as I was saying up front, a lot of people are looking at data as an opportunity. And many organizations don't want to throw away that data because there might be some diamond in the rough that they can find years down the road. Are you suggesting that that's the wrong strategy? And can you give us some insight in that regard?
The idea of big data is that you have this pool of information that you're able to harness and extract from, to see trends and to understand your business in a deeper, more significant kind of way, so that you can be a more efficient business and you can also plan for the future. But the problem is, for most organizations, you think about it this way.
For the last two or three decades, organizations have been applying information technology and systems to all kinds of business processes. Today, everything is electronic. And for every system, there's electronic information, right? The pool grows and it grows and it grows and it grows. And at some point, big companies have hundreds of terabytes of extraneous stuff, or even petabytes of extraneous stuff. What the heck is it? So big data is this idea that I can take tools and I can understand my business and be a better, more efficient business if I could harness that stuff. Well, actually, most businesses today have so much stuff, and they don't have the tools that they need, so that pile of data is wholly underutilized. They can't find the record they need to find. Organizations now regularly have to recreate information because they can't find the stuff that they have. Litigation response tells us over and over and over again that companies don't have their act together. So, you know, I would tell you that harnessing information in the big data context is aspirational for most organizations. I would say ubiquitous mismanagement is much more the theme of the day.
So, Randy, with the Federal Rules of Civil Procedure in the mid-2000s, I think 2006 is when they came to the fore, the general counsel in various organizations had a lot of power to essentially dictate the policies of organizations in terms of records retention, deleting data, and the like. Have they not been successful in that regard? And is the pendulum swinging back, the marketing pendulum, if you will, toward, you know, big data becoming this opportunity? And what risks does that pose to organizations?
So, you asked a whole bunch of stuff. Let me start with the general counsel. Let me also address records management. So, for the records managers, they're not going to want to hear this, and if you've heard me speak before, you've heard me say it before. But this is the reality.
Records management for most organizations is a total, utter failure. I mean, think about it this way. The belief about records management is: I will create rules. Those rules will tell me how long I need to keep stuff, and then at the end of its useful life, that stuff will go away. Well, I've got to tell you something. For most big organizations today, that stuff is not going away. Those rules, if they exist, are not being applied to the vast majority of electronic content. Rules are way too complex. Policies are way too complex. And so to the extent that there was a huge push for records management, I mean, I have a business, Kahn Consulting, that does nothing but records management, day in and day out, and I can tell you, for most organizations, even big ones, it's not been particularly successful.
So, let me ask you, what qualifies you to talk about this topic? And what can an IT executive learn from a lawyer who writes books with interesting names?
Yeah, so I've spent about 20 years in the information management space, helping organizations and governmental agencies get their information management act together. This last year, I started a business called Dell. And what Dell does is help big organizations really operationalize that chucking daisies thing. We have hundreds of terabytes of stuff. We don't know what to do with it. It grows unfettered, and it grows and it grows and it grows, and at some point, we need to get rid of that stuff. So what Dell is doing as a new business, actually rather successfully, is going into big businesses and helping them, in a legally defensible kind of way, clean their storage bins of the old information that doesn't have any legal need to be around. It doesn't have any business utility to be around. It's basically digital junk, right? So we're going to help them get rid of it. But beyond that, I mean, I've been thinking about this problem for a whole bunch of years. I started my legal career as a litigator.
And if you look at most businesses today, and I would say the litigation response problem is a manifestation of it, most organizations, as I said earlier, have ubiquitously failed to manage information. Most of them are not applying retention schedules to the vast quantity of electronic stuff. And when a lawsuit happens, or an investigation happens, the idea of finding everything and anything that's even potentially relevant, not only is it a gargantuan burden and an expense and a gargantuan pain in the butt, but in real life, do you think that they can actually grab anything and everything that's potentially relevant? If you think about the LAN, the WAN, the SAN, all the places that information can be parked today, think about the idea of being successful in that, the idea of finding a record when you need it for business purposes and being successful at that. So, you know, to the heart of your question, I've spent my life helping organizations figure this problem out. And I just think about it maybe slightly more than the next guy.
So in case you just joined us, we're here with Randolph Kahn, who's an attorney. He's an expert in legal and compliance issues. He helps organizations, you know, squint through the muddle of records management and records retention policy issues and come up with ways in which they can be more effective, looking at information as both an asset and a liability and trying to strike that balance. And he does this every day with his clients. Randy, talk a little bit about why it is so hard to chuck daisies.
Yeah, it's a really interesting question, right? So there's a couple of things that are, I think, driving organizations to the wrong place. The first thing is this erroneous or fallacious belief that storage is cheap. Oh, don't worry about it. Storage is cheap. Well, you know, I hear this with Kahn Consulting and Dell clients every single day. Why are we going to take it on? Storage is cheap.
We'll just keep dumping that stuff in a big parking lot, and who the heck cares? Well, this is the deal. Information is growing, for most organizations, at somewhere between 20 and 50 percent per year. The actual storage cost is going down slightly every year. But in real dollars, what they're spending to store stuff goes up, because there's so much more, you know, this year than last year. Fact is, storage is not cheap. Storage is a huge cost. And I'll give you an example. We're looking at a project right now for a large insurance company, and this is a Dell project, where we're helping them chuck the daisies and right-size their information footprint. We're legally, defensibly getting rid of the information crud. By doing that project, they save millions and millions of dollars per year, just on the storage savings, on a net basis. So the ROI makes sense, the TCO makes really rapid sense, right? So the first thing you have to break down is: storage is not cheap. The second thing is, nobody owns the information anymore. IT says, hey, it's my repository, it's my pipe, it's my box, it's not my information. I don't know what to do with the stuff inside there. The lawyers have said, well, we have lawsuits, we have investigations, you know, be very, very hesitant about what you do. And all of a sudden, between storage is cheap, and it's not my problem, I don't know what's in that system, and being very hesitant because of the lawyers, people say, okay, forget about it, I'm not going to touch it. Now there's a fear. And the information grows and it grows and it grows. And again, from, you know, a Dell legal defensibility perspective, the only way that you're going to clean house and not worry about being nailed for destruction of evidence, or spoliation, as lawyers talk about it, is having some methodology that says: I went through that stuff. I know that that stuff is not a record. I know that stuff is not otherwise needed for audit, litigation, or investigation.
It must be digital junk. Because we don't have any business utility for that stuff anymore, let's get rid of it. But there's got to be some diligence around that. Otherwise, you run a substantial risk in getting rid of it.
You talked about how storage is not cheap, and lawyers aren't cheap either. And so I wanted to ask you whether you've found with your clients that you've helped them save not only storage costs but also legal discovery costs. I mean, discovery is a volume-driven activity. If you've got less storage, you're paying less for discovery, aren't you?
Yeah. So as it relates to the way in which Dell makes the business case for our clients, we never, ever go to the issue of risk mitigation, or litigation cost and response avoidance, or being a more efficient business. Those are absolutely real. But they're soft costs; some can be quantified. And a Dell project, a, you know, chucking daisies project, a defensible disposition project for a big IT department, makes sense purely on storage. Now, having said that, is the cost of a lawyer a significant cost in terms of information review in the context of litigation and investigations? Absolutely. Lawyers are incredibly expensive. And they love the big piles of information. In fact, what I would say to you is, there's no question that defensibly disposing of stuff makes you a more efficient business, without question, right? You have much more, you talk about big data, it's big data on actually the relevant information that you need, as opposed to trying to find the information nugget or that needle in that gargantuan information haystack, if you will, right? So I mean, as it relates to the litigation response and inconvenience costs, there's no question that that's a gargantuan cost. And there's no question that you're going to find substantial benefit by getting rid of the crud, right? No question.
So you've used this term defensible disposition a couple of times.
Can you talk a bit more about what that is and define that for our audience?
Sure. Sure. So, you know, think about it this way. The issue is, I have a shared drive. And that shared drive has dozens or hundreds of terabytes of content in it. I don't even know what that stuff is anymore. It sits, and somebody maybe babysits the system or not. And somebody maybe accesses some of that stuff or not. And what you find is, it just sort of sits. Well, for an organization to go in and look at hundreds of millions of files today, and for a big organization, we have clients that have billions of files, it's such a substantial volume. There's no way you're going to have your employees do it. First of all, they're really bad at classification. That's number one. But even if they weren't bad at it, compared to technology, it's an incredibly bad use of their time. So really, to chuck the daisies or defensibly dispose, I need to have a methodology that says: I know that the stuff that I'm looking at is not a record. It's not needed any longer for business purposes. And then also, I know that it's not otherwise needed for audit, litigation, or investigation. And I have to find a way to do that with different systems and different kinds of content and different kinds of business arenas and environments, so that I can say, when we dispose of content, that that stuff wasn't otherwise needed. That idea of defensibility: I dispose of it without my people looking at it. I use technology. One of the things that Dell does really efficiently is use technology tools, the auto-classification tools or the machine learning tools, to do the heavy lifting, right? So it's a lot more efficient. They're a lot better than people. It saves them a boatload of dough. But in the end, I need to do that in a kind of way that's going to make the lawyers comfortable. It's going to make the compliance people comfortable, the business people comfortable.
Otherwise, at the end of that process, they're not going to want to pull the trigger to get rid of the crud, right? So that defensibility, or that defensible methodology, allows me to evaluate content in a kind of way that, at the end of that analysis, I can, in a legally defensible kind of way, blow it away in the ordinary course of business, and not cause the lawyers some heartburn.
So I want to come back to this notion of defensible disposal and disposition in a second. But before I do, it sounds a little bit like this is records management 2.0. Is it?
Yeah, it's funny that you say that, because that's exactly how I think about it. If you think about it, and I talked about this earlier, records management fails because the rules are too complex. They're too voluminous. There's nobody there to apply them, and technology can't take 1,000 rules and apply them to anything. It just is a failure. Really what Chucking Daisies talks about, and what Dell does day in and day out, is take that old retention schedule, simplify that thing, operationalize it, and then apply it with technology, because people can't actually get to that.
So let me come back to defensible disposition. I mean, essentially, technology has gotten us into this problem. You're saying you and your clients work together to use technology to help us get out of this problem through classification. And it sounds like you're helping them automate that classification. Talk about that a little bit more.
Sure. So when Dell goes into a client, right, you can assume two things. They're going to have structured content, and they're going to have unstructured content. Let's deal with the structured first. The structured content, the stuff in databases, it sits there. And typically, it's the kind of content that individual employees or businesses don't interact with on a regular basis.
Someone needs to go in, determine what that stuff is, and determine what business rules or retention rules apply to it, so that it can go away. There are clients that Dell has where they have huge volumes of structured content, where they've never applied any archiving technologies. So, you know, simple compression, a simple way to manage that content. Irrespective of the end-of-life disposition rules, you need to come in with tools and technology and an understanding of what's out there and what can be done with this, again, gargantuan storage footprint, just for the structured stuff, right? So there's tools and technology and methodology and experts at Dell for it. On the other side of the equation is the unstructured content. Unstructured content sits in all kinds of systems. There are certain kinds of content where auto-classification and machine learning technologies are pretty darn good at discerning what it is. There are some file types that, because of the uniqueness of the technology or the file type, make it much more challenging. You know, that said, there are technologies today that are able to discern what something is and able to apply a business rule to it. And those kinds of technologies need to be harnessed more often by more companies. Because, as I said before, again, volumes are so great, people simply can't do it anymore.
So I want to come back to this issue a little bit and help our audience understand. So can you actually go back and classify, you know, with machines, an existing, you know, corpus of data, or are you suggesting, all right, moving forward from, you know, day zero, let's start to auto-classify? How do you deal with that?
So when Dell goes into a client, two things are happening. And I think that the direction that you're going in is exactly correct. The first thing is, I need to clean up the past. How do I clean up the past? How do I know what something is? What will Dell actually do?
We'll take their retention schedule and simplify it, take their content, and teach their content to a computer, so that when it crawls through hundreds of terabytes of stuff, it actually has learned what that content is, and it can apply the business rule, right? Whether it's an accounting record or an HR document or a contract, whatever it is, we're able to actually go in and teach an incredibly robust classification engine what your records are, so that instead of people doing it, at night, when the system is not being utilized, or whenever, the system can crawl through, again, terabytes or petabytes of content and make business judgments, business decisions, and classifications against your rules and the content that you taught it. So that's the part about cleaning up the past. Dell comes in, cleans up the past. On the other side of the equation, again, auto-classification just to clean up the past: if you're a small company and you don't have a lot of data, you're not going to want to do it. Too complicated, too much money. And again, to take it on, I don't want to minimize the upfront exercise to actually train the software on what your content is and the rules. It's not an undertaking that's without cost, without inconvenience and expense. But if you have a lot of content, it delivers incredible business value. The business value, again, the idea at the end of the exercise, is that we can legally, defensibly get rid of huge, vast quantities of stuff. In a macro sense, it reduces that storage footprint, which equates to millions of dollars a year. So once those rules are built, then of course the question is, why shouldn't we use that as a new information management paradigm going forward? So what Dell is really doing is coming in and cleaning up the past.
And then once you've built the rules, you might as well use them on a go-forward basis and be a more efficient business, to actually apply retention rules in a totally different way than you have before, without human intervention, and doing it, I guess, in real time in systems, seamlessly, right? And that's really what Dell is about.
So I wonder if you could talk a little bit about the impact of mobile and bring your own device. Risk is becoming increasingly decentralized by its very nature. How should IT organizations ensure that when they think something gets deleted, it actually does get deleted?
So let me deal with both parts of your question, because they're both really interesting, right? So one of the things that Chucking Daisies does is lay out a series of rules. One of the rules is that for every new technology, there's a chunk of informational output, and you need to have policies up front and a way to manage that content before you actually implement the system, right? Well, what that tends to do is force the business question: do I need this content as a business record? And if I do, is there a system in which we can keep it and store it and access it? Does it make economic sense to do it this way? But if we can't actually readily retain this stuff and store this stuff, how are we going to actually satisfy our legal requirements, or how are we going to take care of our business needs? Forcing that policy conversation up front forces you to deal with those issues. At the end of the life cycle, the issue is, what am I going to do about all this content that exists that I otherwise need to get rid of? And this is, again, from my perspective, something most organizations have to build into the process when they develop policy and that new technology implementation up front, to say, at the end of its useful life, who is going to own disposition? Who is going to actually effectuate the disposal of this content?
How are we going to do that in a legally defensible kind of way? And if organizations, and in particular IT executives, build that into the process up front, you'd find a great deal less inconvenience, expense, business inefficiency, and litigation response bloodshed happening. And that's what IT organizations, and in particular IT executives, need to start wrapping their heads around.
So talk about who in the organization cares or should care about defensible disposition.
So in our Dell experience, the people that really care are IT executives. So the mid-level storage guy says, yeah, this seems to make sense. When you take it all the way up the food chain and say, we can save you 20 or 30 or 40 million dollars per year, are you interested? It's a really easy sale. Oh, and by the way, the lawyers are going to like it, because litigation response is going to be a heck of a lot easier. And it's going to be a heck of a lot cheaper. And your privacy guy is going to like it, because we're going to reduce your privacy and your PHI information risk footprint. And that's a good thing. And your business executive is going to like it, because my employees, who now spend 10 percent up to 25 percent of their time looking for stuff, that would be reduced. So it makes my employees more efficient. And with my customers, who can actually get answers from me much more readily, you'll find that business will be augmented. Selling defensible disposition is incredibly, incredibly easy on storage alone. There are so many other really significant benefits, but the person who finds the greatest value and understands it immediately is the CIO, right? Immediately. Why? Because I can hand them a whole chunk of money for them to utilize, especially in a crummy economy, for buying new technology, hiring new people, building other efficiencies. This is dead money. Why not chuck the daisies and do something with that dough?
So the book is Chucking Daisies, by Randolph Kahn: How Companies Deal with Big Data.
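The storage argument made in this conversation, that volume grows 20 to 50 percent a year while the unit cost of storage falls only slightly, can be checked with quick arithmetic. The sketch below is only an illustration: the starting volume, the cost per terabyte, and the 5 percent annual price decline are made-up numbers, not figures from the interview, and the function name is hypothetical.

```python
def projected_storage_spend(start_tb, growth_rate, cost_per_tb, cost_decline, years):
    """Return a list of (year, terabytes, annual_spend) tuples.

    All inputs are illustrative assumptions:
      growth_rate  - yearly data growth, e.g. 0.35 for 35 percent
      cost_decline - yearly drop in unit cost, e.g. 0.05 for 5 percent
    """
    rows = []
    tb, unit_cost = start_tb, cost_per_tb
    for year in range(years + 1):
        rows.append((year, tb, tb * unit_cost))
        tb *= 1 + growth_rate          # volume compounds upward
        unit_cost *= 1 - cost_decline  # price per terabyte drifts down
    return rows


if __name__ == "__main__":
    # Hypothetical organization: 100 TB today at 1,000 dollars per TB per year.
    for year, tb, spend in projected_storage_spend(100, 0.35, 1000.0, 0.05, 5):
        print(f"year {year}: {tb:8.1f} TB, spend {spend:12.2f}")
```

Under these assumptions total spend rises every year, because the growth factor (1.35) multiplied by the price factor (0.95) is still greater than 1. Spend would only fall if prices dropped faster than volume grew.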
So tell us a little bit more about the book. How did you organize it? What can people expect to see when it hits the stands?
Yeah. So the book is structured as simple rules, right? There's 18 or 19 simple rules that help organizations and, again, primarily IT professionals really understand, right? So a rule might be, for example, never implement technology unless you have policy first. We talked about that. Now, that may seem like a no-brainer, but think of the number of organizations that technology sneaks into. Social networking is a perfect example, right? You wake up one day and you're a big insurance company and you realize that all your sales agents are using Facebook to sell policies. Wonderful for business. But wait, you know, we have compliance requirements. We have retention requirements. What if we have to respond to litigation? We have some privacy concerns, because it's their Facebook account. Every single organization that looks at social networking and says, hey, there's some value here, needs to stop first and build that policy construct, right? As we talked about. So the rules are really very straightforward, very pragmatic, very easy to understand. And we use real life to actually make it come alive. And I probably should point out to you that I have a co-author. Galina Datskovsky is my co-author, so I'm not doing it by myself.
And the book will be out roughly when?
In the next couple of months.
Excellent. All right, Randy, really appreciate you coming on the Cube, spending some time with us, and sharing your best practices and your knowledge with our community. We'd love to have you back, and hopefully we'll see you at IOD. The Cube will be at IOD next week. We'll be there Monday and Tuesday in Las Vegas. And I hope to have Randy on again, talking about these and other issues. So again, Randy, thanks very much for coming on the Cube.
Love to join you. Thank you so much.
All right, thanks for watching, everybody. We'll see you next time.
This is Dave Vellante at Wikibon headquarters. And this is the Cube.
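The defensibility test described in the interview (the content is not a record, it has no remaining business use, and it is not needed for audit, litigation, or investigation) can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the method used by any company mentioned above: the `Item` fields, the `RETENTION_YEARS` table and its values, and the function name `disposition` are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical, simplified retention schedule: record class -> years to keep.
# The classes and durations are placeholders, not legal guidance.
RETENTION_YEARS = {"accounting": 7, "hr": 7, "contract": 10}


@dataclass
class Item:
    path: str
    record_class: Optional[str]  # None: the classifier found no record class
    last_modified: date
    on_legal_hold: bool = False  # needed for audit, litigation, or investigation
    business_need: bool = False  # still has business utility


def disposition(item: Item, today: date) -> str:
    """Return 'hold', 'retain', or 'dispose' for a single item."""
    if item.on_legal_hold:
        return "hold"      # never touch content under a hold
    if item.business_need:
        return "retain"    # still useful to the business
    if item.record_class is not None:
        years = RETENTION_YEARS.get(item.record_class)
        if years is None:
            return "retain"  # unknown class: keep until someone reviews it
        age_years = (today - item.last_modified).days / 365.25
        return "retain" if age_years < years else "dispose"
    return "dispose"       # not a record, no hold, no business need


if __name__ == "__main__":
    today = date(2012, 10, 1)
    junk = Item("share/old_draft.doc", None, date(2005, 1, 1))
    print(junk.path, disposition(junk, today))
```

The legal hold check comes first on purpose: in this sketch nothing else can override it, which mirrors the point in the conversation that disposal has to leave the lawyers comfortable.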