Okay, we're back here in Las Vegas, Nevada. This is HP Discover 2013. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, the founder of SiliconANGLE. I'm joined by my co-host.
Hi everybody, this is Dave Vellante at wikibon.org. Mike Sullivan is here. He's the general manager of the information archiving and e-discovery business at HP Autonomy. Mike, welcome to theCUBE.
Thank you, pleasure to be here.
Good to see you. You guys are really starting to get the Autonomy message out, get the integration going. It took a while, obviously, but you see a lot of leverage of Autonomy across different business units. We saw it with the Haven announcement. We're hearing about people doing analytics, starting to embed that into their platforms. So how do you feel?
We feel great. I mean, there's a lot of excitement at the show, as you know, about big data, and that is a big topic for me. I happen to work in the information governance space, so we have a different spin on big data. We're here for the same reasons that big data is such an interesting topic to people: lots of information, a big variety of types of information that people have to deal with, very dynamic. Those all present opportunities, and that is probably what's getting the most buzz in the marketplace, and even at the show, because people can mine a lot of value from that information. But what I end up helping our customers do is focus on how you govern that information, how you manage the risk that can be associated with all that information, which is a big challenge for a company today.
Yeah, so is information an asset or a liability?
Both. Yeah, both, literally.
And you're in the liability mitigation side of the business, is that fair to say?
That's right. I think that's how we get in the door on the part of the business that I run.
We're helping companies be compliant with rules and regulations, and we're helping them understand the risks that exist in all that unstructured information, especially communications. Today you have social media, Twitter feeds, and messaging is obviously a big area. And there are data privacy rules and all kinds of new regulations coming out every day. It's very difficult for companies to stay in front of all that, and we help them do that.
Well, plus the whole big data meme has changed the way in which people look at information. They're increasingly looking at it as an asset, and that creates problems; that creates a lot more liabilities. Because let's face it, since 2006 with the Federal Rules of Civil Procedure, the whole Enron debacle, subsequent to that, the general counsel had a lot of juice in the organization, and now the CMO has a lot of juice, and the revenue often trumps what the GC says, although there's still that balance. So it is a balancing act, but it feels like the pendulum is swinging, which again makes your job that much tougher or that much more lucrative, I don't know.
No, absolutely. I mean, I think when you go out and poll CIOs even these days, what's top of mind for them? Information governance and compliance are big topics. And as you pointed out, it's not just an IT issue anymore. The kind of stuff that we deal with makes it to the boardroom. We deal with a lot of sensitive situations where your company could end up on the front page of the newspaper, and I wish I could mention some of them right now because it'd be really interesting, but we get involved with some pretty interesting things.
Well, for a period of time, you'd open the Wall Street Journal and every week there was a story in there. I mean, even going back to when the government said electronic email is an admissible record now.
You've got to maintain that for X number of years, and there were hundreds of millions of dollars lost as a result, and so that was a big boon for your business, obviously. I mean, Autonomy really started to accelerate after that, and at the same time, it's a problem that never gets solved because of data growth, complexity, and cloud. So let's talk about that. Let's start with information governance. What does that mean?
Well, information governance, a good question. It's evolving all the time, but today it broadly means: how do you manage the risk that's inherent in your information? And that can range from activities like just complying with the rules that you're subject to, and depending on your industry, that could be any number of things. It could be managing the disposition and retention of information. That's a very common thing. If you're a public company, a financial services institution, bio, chem, pharmaceuticals, they all have those kinds of books-and-records types of rules that they need to comply with. The second area is mining that information, just as you would in a big data application, to understand what's in there. But the difference is, you want to understand what the risk is in there, whereas often the use case for big data applications is to find the opportunity.
Give me that golden nugget.
That's right. And in this case, people kind of want to buy it.
Give me the smoking gun.
You got it, that's right. And then the third thing for information governance is that companies need to be able to respond when it's appropriate. So if you're a part of an investigation, an internal audit, a litigation situation, you need to be able to respond and meet your obligations. There are preservation obligations where you have to issue legal holds and lock things down. You have to produce massive amounts of information to the government. You have to make sure you produce everything.
You have to make sure you produce it in a certain format. And you probably have a deadline that's not a reasonable one, because the courts haven't really caught up to the problem, and it's a very, very difficult standard to meet.
So a lot of organizations use, you know, FIFO, first in, first out. A lot of them use FINO, right? First in, never out. So they retain it forever, which dramatically increases your risk. So go ahead, talk about that a little bit.
Well, I'd say in the last few years, that's absolutely true. We had a lot of companies come to us and say, especially in the regulated industries, we keep everything forever. And I think it's just now that we're seeing an interest in actually changing that. So obviously that's an impractical thing to do, I think, in most companies.
You mean just from a cost standpoint.
From a cost perspective. But it's also risky. It sounds odd, but it's a very hard thing to do, to know what you can delete and when. And nobody in any company wants to be the person that pushes that button to actually do the deletion. So we're working on solutions with our customers to spread that risk over a lot of people and have them all press the button.
This is such a complicated topic too. A lot of policy. And then you've got cloud and mobile.
Yeah.
So risk is now inherently distributed onto devices and in the cloud. So let's assume I delete something. I hit the delete key. How do I even know that it's been deleted? In other words, what does the opposing attorney know that I don't know, that he or she might find in my records? So how do you help solve that problem?
That's a great question. And one of the other really important topics today for companies, one that they're worried about and that they should be thinking about, is defensibility.
So it's exactly what you're saying: how can you prove that you're doing the right thing? You're held to a reasonable standard, and that's really the burden that you have, to prove that you're doing the right things and you have a defensible approach. So are you keeping everything that you're supposed to? Are you producing everything that you're supposed to? And I'll tell you one thing that's changed. I've been in the litigation and discovery business for a lot of years. And one of the things that's absolutely changed in the last five years is that five years ago, in a litigation or an investigation, you'd produce what you were supposed to and you wouldn't get questioned. You would just get the benefit of the doubt that you gave everything that you have. Now I would say in every single lawsuit that we see, every investigation that we see, there's absolutely a question from the other side or an accusation from the other side that says, prove to me that you did it right, that you did it in a forensically sound way. Show me that you have a defensible position in the way that you collected things, the way you instructed people, and so forth. So you need a really good audit trail and proof. You need to be able to cover your tracks and make sure you can show that you did the right thing.
Well, what happened in the mid-2000s is a lot of attorneys would go after the process, because the companies didn't have one. And then there was the famous case where they kept submitting. They kept getting asked for all the electronic records, and it was, oh, judge, we have some more, we have some more, and the judge threw it out. It was a couple hundred million.
I think they're a customer of ours, yeah.
I don't want to name the name because I think I'll get it wrong, but it began with an M. All right, so that's interesting. But then this data influx comes in.
Now, as I said, you've got the CMO saying, I don't really care about the risk. We see opportunity, go, go, go, go, go. So how has big data, and the perception that there's gold in them thar hills, changed what you guys are able to deliver? Has it made your jobs harder? Has it made things harder for your advocates within companies? Are they collaborating, or is it just a nightmare?
You know, I'd say for us, we're benefiting from the fact that there's so much interest in these technologies to manage this massive amount of data. And again, it's not just the size. Obviously that's the thing that gets the press, just the sheer amount, but it's the way the data moves and how dynamic it is, and a lot of it, as you pointed out, is not in your control anymore. It's out somewhere in the world. A Facebook message is the example we used this morning; Robert said, yeah, that's an electronic record. So we have good technologies to deal with that, and I think our customers on the governance side and the risk side are benefiting from the advances in the R&D that we're doing to handle these other big data problems, and the technology that we create for one side is helping the other side. So it's all a big data problem; it's just more about the use case.
So, tell me what's changed in terms of the following. It used to be I would shove everything into an archive, sometimes a physical archive, or I presume at this point it's a virtual archive. Okay, so you've got this virtual container that has all my electronic records in there, and I've maybe got a stub on my system or whatever it is, and so I kind of know where the data is. All right, like I said, if it's on a device, or it's in Facebook, or it's in a voicemail, I mean, I know you've dealt with those problems. And I get a legal hold, so I can't delete it. Okay, you guys do a pretty good job on that. But when something becomes defensibly deletable, let's assume we can figure that out, and that can happen.
How do you ensure that it actually gets removed?
That's an excellent question, especially as it relates to cloud. With cloud today, if you go to an Amazon cloud or a public cloud, you really don't know where your data is for sure, and those platforms are probably not compliant platforms from the storage perspective or from the regulatory perspective. Our platform has the same kind of grid computing infrastructure that you would see in a typical cloud environment, but it's designed so you can ensure that you have spindle separation on your data, so that your data's not commingled with anybody else's. In case there's a subpoena to pull that data out, it's very important that you know where it is and that, when it comes time to delete, you know that you can actually wipe that data from a disk. We have very specific features to deal with that. So when we talk about cloud from a compliance perspective, we're really talking about a private type of cloud environment, but leveraging a lot of that.
Okay, so I believe you can tell me that something got deleted in my virtual archive. But can you also assure me, as the general counsel, that it got deleted on all the laptops, all the mobile devices, and in the cloud? Not so easy, right?
Yeah, that's still a difficult problem. We have technologies that can address that, but I think the general strategy for companies is to try to pull that in so you have a copy. But you're absolutely right, one of the things that people want to deal with in governance is controlling those endpoints, which is very difficult, and I think it's really about managing the risk there, because you can't control what's out in Twitter and social media, or on all the mobile devices, and you want to be able to detect that as much as you possibly can. We have some great security things that we're doing at HP, leveraging our technology to try to detect and prevent those things from happening in an inappropriate way.
Yeah, HP is an awesome security source as well. And you kind of use, and I think others do too, it's not just to pick on HP Autonomy, Oracle buys Endeca for a big number because search is key, but you kind of use search as a blunt instrument. You sort of aim it at the corpus and say, all right, let's go find what we need to find, and it works, but it's kind of expensive. I mean, a large pharmaceutical company can spend hundreds of millions a year on e-discovery, because it's volume-driven. So can technology help solve that problem? For instance, are we at the day where technology can actually auto-classify data so that you can scale this problem?
I would say that's the biggest trend in discovery right now, the movement towards leveraging technologies, which we really pioneered, to be able to understand the content. I mean, if you think about discovery, one of the two big areas of spend is collecting all the data, but the biggest area is actually reviewing it to determine what is relevant and needs to be produced.
Paying lawyers.
Paying lawyers, you got it. So, you know, the holy grail for companies is to be able to not pay those lawyers and automate it.
Well, you can't automate that process unless you have technologies that can actually understand the content that's in any document or piece of content. That's what we have. So we really have a very unique and highly differentiated offering, because since we can understand that meaning through very sophisticated pattern-matching technology, we can actually automate the process and put in a very defensible process that is now backed by court cases and precedent.
Oh, it's been tested.
Absolutely. Now the courts are really starting to be a little bit sympathetic with companies and how difficult this is.
Right, because at first, they were just blinded. The courts didn't have a clue, and it was dangerous for a technologist, going, they just don't get it. I can't explain it to them. They're just like, where's the stuff? So, okay, so that's been tested in court now.
It really has, but it's in the very early stages, and most companies that have a lot of litigation are really starting to test this. It's called computer-assisted coding. Technology-assisted review is another term you'll hear.
So that can help me with my e-discovery costs and dramatically lower them, because again, it isn't just the volume, but the volume's growing. The more the volume grows, the more lawyer bills I get, which just ticks off a lot of people. So, all right, good. This is a good discussion, Mike. I really appreciate you coming by theCUBE, and good luck with all these challenges. I predict good things for you guys.
Thanks for coming on.
Thank you very much. Pleasure meeting you.
Appreciate the time.
Thank you very much.
Next guest after the short break. This is SiliconANGLE and Wikibon's theCUBE, at HP Discover in Las Vegas, day three of wall-to-wall three-day live coverage. Go to siliconangle.com and wikibon.org. For videos, go to youtube.com slash siliconangle. We'll be right back with our next guest after the short break.