Live from New York, it's theCUBE, covering Big Data NYC 2015. Brought to you by Hortonworks, IBM, EMC, and Pivotal. Now your hosts, John Furrier and George Gilbert.

Okay, welcome back everyone. We are live in New York City. This is SiliconANGLE's theCUBE, our flagship program where we go out to the events and extract the signal from the noise, or have our own events. The event we're having is Big Data NYC, right in conjunction with Strata Hadoop, which is one block, 100 yards away at the Javits. We've been here for three days, third day of wall-to-wall coverage. I'm John Furrier, with George Gilbert, Big Data Analyst at Wikibon, and Omer Trajman, CEO and co-founder of Rocana, formerly ScalingData, if you're inside the ropes and know the industry history. Welcome back to theCUBE, great to see you.

Thank you, it's great to be back, and it's great to be here. Just a block away from Strata, it's fantastic.

Just for the folks out there, some background on you. You were an early employee at Vertica.

That's right.

Built that out early on, then they ended up selling to HP, and it's a flagship, by the way, of their entire Big Data offering.

It's gotten pretty significant traction there. It's pretty impressive.

Then at Cloudera, you built out the first go-to-market organization with customers, taking Hadoop to the market for that first leg of the journey.

That was a very adventurous time in the early, early days, where it took weeks and weeks just to set up a Hadoop cluster.

Then WibiData, which one of the co-founders started, and now you've started a company, once called ScalingData, now branded Rocana. You guys are out doing business, so let's get down to it. Rocana is the new name.

Yep.

What's going on with the company? Give us the update. Funding, staff, customers, what do you got?

So we raised our Series B round earlier this year, and that was also in concert with rebranding the company. Rocana is root cause analysis. That's exactly what we do.
We bring a big data take to global IT operations. We announced our product GA earlier this summer, and we've been talking a little bit about what customers are doing with it: in retail, monitoring end to end from the point-of-sale systems all the way to the back office; in financial services, globally across the company; in gaming, some of the largest gaming providers in the world are using us to make sure that their users have a very good experience, low-latency access to the games. That's the kind of high-end, next-generation operations that our customers are using us for as well.

So you're really, to me, the poster child of what's going on in this big data world. Certainly a startup, you've got a beachhead, you've got a nice use case, root cause analysis. But really if you look at it, you have experience understanding the database, the large-scale piece of it, you've seen the worlds of the Facebooks, you were obviously early at Cloudera. You see the big picture going on under the hood, but the killer app is analytics. That's the end game for everyone, and as a startup, you're navigating a very competitive landscape. Splunk has certainly done some great stuff with log file analysis.

They have, they have.

And they're now trying to get into security, move up the food chain, if you will, from data exhaust, taking out the trash, if you will, log analysis, which essentially no one wanted to do, with log files.

Yeah, just sitting there, waiting for someone to look at it and do something with it.

But ingestion and turning the data into value really is what the successful companies are doing. That's kind of what you guys are doing. Can you explain the entry strategy for you? Where are you putting your beachhead stake in the ground? How are you guys setting up camp, and what's your growth strategy?
Sure. This is actually what we realized coming from a big data background: it wasn't just about picking yet another source of data and trying to apply some kind of search-based, brute-force understanding of it. That was a good first step, and I think that's going to be useful in some domains. Really, it's looking across the organization and saying, how do all these streams of data fit together, and what kind of purpose-built analytics and visualization can help my operators, not just my analysts, actually take advantage of all that data and improve efficiency across the entire organization?

So a couple of notables: Don Brown and Eric Sammer, both rock star DevOps kind of guys, I call them, for lack of a better word. No, I don't think that's personal, I think it's a compliment.

Eric would take slight offense maybe to DevOps, but that's okay.

Well, he wrote the book.

He did.

On a lot of the operations, enterprise-related stuff, and Don has a security background. Both were Cloudera employees early on.

That's right.

How are they piecing into it? I mean, if you just connect the dots, you've got a little bit of security, not sure that's enterprise table stakes.

What's amazing is that Don and Eric were both customers. Don was building out data center software at EarthLink back in the day. He was working in the security space. Eric was running data centers at Experian. They've used every piece of software under the sun, and then they spent years deploying big-data-based solutions, and now they're like, well, big data is actually what we've been looking for our entire careers, let's go build it. So when we go and talk to customers, it's not like this is the first time we're hearing their use case. We've been living it. And so we know exactly what to build. We know exactly how they need to see the data, and we just deliver it out of the box.

So Eric, obviously a great guy, and we got to know what he's doing. I'll call him the ops dev guy.
Not the DevOps guy, because he wrote the book, Hadoop Operations. But this is really, if you look at cloud and IT, a couple things. Operations IT guys are heavily involved in this hybrid cloud game. We heard that at VMworld, loud and clear. But now you move over to what's going on in the big data space. Data in flight, as Merv Adrian says, is the new trend. And what's happening with this is that there's a lot of stream processing going on. That is the big deal. And George's Systems of Intelligence kind of teases that. So I've got to ask you: as data moves, sometimes it may not even hit a database. So to be real time, you've got to have an architecture that can be data-centric, but not constrained by some data storage. What's your take on all that?

Yeah, so yes and no. What we're finding is that you have a lot of real-time processing. You want the business to be nimble and responsive. But that's not always going to go right. There's no such thing as 100% perfect technology. There's always going to be some problem. And if you don't have a record of that, if you can't merge the real time with the historical, build analytics on the historical, and understand whether the real time's going right or not, then even if it's seconds after the fact, you've just missed how many cycles of processing that were just wrong, right? You have errors that you didn't catch. And those cascade.

It's basically a miss. Missing key data. So the faster you go, statistically, the more misses you're going to have.

So you need to marry the historical and the real time in one system.

And so since every customer's configuration for their application services and their infrastructure is different, how do you build that conceptual model of the way it should behave? And then how do you know when it's not behaving?

Right. It's actually less a function of good and bad.
It's very difficult to build software, as you point out, that knows going in that something is behaving well or behaving poorly. But it turns out humans are really good at that. They just can't look at a billion data points and a bunch of log files and figure that out. What we can tell them is, this system is behaving differently. It was like this; we built a model and we understand that this is the pattern, the fingerprint for this collection of systems: this application server, database server, network. This is how they interoperate. And now the application has slowed down a little. Your top KPIs, your retail transactions, are slowing, your gamers are not as active, and we've been seeing these glitches in the network at the same time. If you pop that pattern up, then a human's going to make the connection and they're going to be able to solve the problem very quickly.

But how do you help identify which are the root causes and which are the symptoms?

Yeah. That's part of what's the secret sauce under the covers.

That's why I'm asking you.

Basically what we do with the analytics is we present the information in a way where people can visually look at correlations and try to understand what the causality is. It's hard, again, for a machine to say authoritatively, this is where the root cause happened. But if I can guide you to, here are two systems that are connected and this one started misbehaving before this one did, then you can start piecing together: oh, this happened, then this happened, then this happened. You've now narrowed down from hundreds of billions of data points to just the systems that are actually related to a given problem.

If I were to pin down what I just heard, it almost sounds like time is the key.

Time is the key.

Okay. Time is the key.

Yeah, Eric's a musician, actually, originally as a hobby, and he had it as an occupation before he got into software. He likes to say that the music is in the rests, right?
It's a lot about what's missing in the data sets. When you look at it from that perspective, you're looking for the gaps, the differences, not just whether the melody is right and the harmony behind it. That's the data center functioning properly, right? Where are things deviating from that? And you have to look at that over a time spectrum.

Can you use the abnormal behavior, put back into history, to help you identify slightly different anomalies when they happen in the future?

Yeah, part of the stuff we've been looking at is how do you trace, for example, an issue that's emerging and say, well, it's been happening a lot, right? Every Monday morning we have this little glitch. Can you show me every Monday for the past few months, and just time-align those for a specific subset of systems that are related to it, and piece in and out what I want to compare? That's an incredibly powerful tool. If someone can just point and click, they don't have to write a magic incantation. They don't have to be a super power user. It's like a broker on Wall Street just picking: these are the indices and these are the stocks that I want to look at, and they just pop up.

So which verticals are you guys playing in? Security comes to mind as one. Where are your beachhead customers? When does a customer know when to call you? I mean, you're a startup, so again you're navigating a competitive space. So give an example of the verticals you're successful in, what verticals you're eyeing, adjacent verticals, and then what's the use case for a customer who's watching to say, hey, I've got to call these guys up and get some help?

So where we're seeing early adoption is in people who have pretty complicated systems, right? Our customers are still buying and deploying new mainframes, and they have private cloud.
They run their own data centers. Sometimes they're hybrid: they have data centers and they use cloud providers. So it's complex environments. They are the Fortune 500 class type of customer, and what they find is that their business is converging, right? The example I like to use is, I can do banking on my mobile phone and access dozens of different banking applications all through one interface. But there are probably 50 monitoring systems running each of those applications. So when I have a problem, there's no single IT person who can look at one place and say, oh yeah, this is exactly where Omer's transaction went awry. It's because we had a database glitch because the disk died.

Does that mean the legacy systems management systems that were put in place to support the old applications, the old stovepipes, you have to go around them to make sense of what's going on across the whole landscape?

I think you need to take a universal look in order to understand which you should even pay attention to. There's no way you can put all the monitors side by side and, as a human, process those and say, oh yeah, this is the one that actually looks different, and it's correlated to that one. So before you even get into the legacy purpose-built or siloed monitoring, you want to say, am I even looking in the right place? And can I, as even a level-one operator, actually get to where the root cause of any given problem is, and do that extremely quickly?

So it sounds like you could point someone to one of the earlier systems management products and just say, it sounds like the root cause is in there, go there.

We can definitely do that. In many cases you can actually just solve the problem within the system. So for example, we had a customer who deployed our software, and within a few minutes they noticed some misconfiguration that was causing a spike in activity.
And he started looking at the log files and, oh yeah, we don't know those IP addresses. This is an attack, and what happened? Oh, there was a firewall misconfiguration that was flagged just a few minutes earlier. So within the tool, they don't have to go to their networking tool or their firewall tool. They can see exactly what the time sequence of events was and identify the root cause.

So I was interested, and I noticed that Rick Miner's on your board.

Rich, yeah.

Rich Miner, and he was the co-founder of Android, which we all know. It was interesting, we had an interview last week with the head of Swisscom. And I asked him about malware. He said it's 50% Android, 50% iPhone.

Really?

Well, you know, so it's not all Android that's got the malware, and I bring this up to talk about security. And then he said, and I wanted to get your thoughts on this, security is an assumed-penetration game now. Not that they're out there waiting; the hackers are in. So it's a question of how bad. So this goes back to some of the things you're talking about. This might be a really killer tool for that, because if you assume they're already in, you're looking for the patterns. This is where the tech will add value. So I've got to ask you, one, do you believe in that, and what's your position on it? And two, what's under the hood? With machine learning there's a lot of math involved. There are a lot of events happening, a lot of processing happening.

We do. I mean, we believe in this world model where you're already penetrated, right? The hackers are lying in wait. If you think you're not, you're just not looking at the right systems. And step one in the answer is to actually look everywhere. So much data is falling on the ground because people think it's not relevant or not important, until you discover an IP address from some unknown rogue nation in your HVAC system, your point-of-sale system, right?
The little thing on the side, the mobile phone app that no one thought was actually relevant because it was just a project you were rolling out. And so step one is collecting it all. Once you have all these data points, that's the point where-

It's not just the network either. Talk about that, because it's not just the network, it's apps too.

It's apps, it's embedded applications, it's the services behind them, it's the endpoint network devices. We're talking about firewalls and proxy servers. It's the database servers. Everything: if it's software, it's an entry point. If it's firmware, it's an entry point, right? That's the challenge that people are seeing today. There are so many places that people can get in.

I know the other investor you have is General Catalyst, big-time investor. Steve Herrod's been on theCUBE many times saying the perimeter's dead. And he's been on this thesis all day long.

He's exactly right, he's exactly right.

Do enterprises get this? I mean, I get it. I mean, you're out there. You're getting traction.

They're starting to. They're starting to. What we're seeing is that both the CIO and the CSO are starting to realize that there's so much they don't know. And when it comes to their responsibilities, control over infrastructure, control over security, if they don't have visibility, if they don't know what's going on, they can't control what's going on.

I want to ask you what's going on at Hadoop World, now called Strata Hadoop. It was once called Hadoop World when we were there. But the reality is that there are some things going on here that no one's talking about, but everyone's talking about. It's kind of like the public secret. It's cloud and security. The conversations are just starting this year, yet Gartner's data shows that 50% of big data is deployed in the cloud, and security is still being kicked around. So those are two key conversations.
Can you comment on what's going on at the show around cloud and security? And then how do the big data piece of your solution and the bigger macro ecosystem deal with these? I mean, are people putting their heads in the sand? Is it just that they don't know? What's the status of the model?

That's a weird visual. I think they've been buried in the sand, and the sand's finally starting to clear off and they can see the sky. The R&D projects, which are what's driven a lot of early cloud adoption, are starting to come home. People are trying to figure out which workloads are cloud and which aren't. And people are starting to get more comfortable with the fact that the cloud is secure, or it's at least as secure as the perceived perimeter that they were creating around their data center. If you don't have a perimeter, if it's been breached, what's the difference between on-premise and cloud? And so now we're starting to see this movement. People on-premise are adding the cloud. People on cloud are starting to add a little on-premise. It's becoming a very hybrid world, and security just gets 10 times harder.

And security is just one of those things where everyone's scratching their heads looking for answers, pretty much, at this point.

Yeah, it's the fear of the unknown unknowns, right? Harking back to a nice catchphrase. It's, what have I missed? And, to your earlier question, that's where analytics come into play. Can we surface things? Can a machine, by virtue of using analytics and machine learning, visually tell you that something looks different than it did, and that's where you should focus your time?

George and I were at the Facebook @Scale conference two weeks ago. One of the things we noticed was that what they're processing in terms of streams is so massive in volume that we were just scratching our heads saying, statistically, can they see everything? I mean, so that's another problem, the volume and velocity of the data.
So how do you guys view that? Are you doing anything in that area? Is that a use case for you? Or is that something on your radar?

We view it as a function of how do you surface it so that someone can get it, right? Until we get to the point where there's a perfect circle, full automation where the machines run the world, right? The AI phase. Until then, humans are pretty critical to running things from a security and operations perspective.

We had over 100 people here last night for our presentation of George's new research, and we had the end user panel. So I've got to ask you the end user question. I'm going to put my end user hat on, practitioner, CXO at a corporation. When do I have a problem that I need you for? Describe to me my use case, where you can come in and help me. For the folks out there who might not have heard of you guys, as a startup that's growing, when do I know when to call you? What does my environment look like? When do I pick up the phone and call Rocana right away?

We have very, very simple questions, right? Are you breached? Where are you breached? If you don't know, you need Rocana. Where are you having capacity and performance issues? If you can't point to the exact place, you have to call us. What are the top 10 issues plaguing your users that you can fix from an infrastructure and software perspective? If you don't have that sitting on your desk, already printed out, you've got to call Rocana.

So it's an ops solution. This is operations and security operations.

Security operations, that's what we say.

Yeah, you put it under that bucket.

Exactly.

Okay, so the use case would be, okay, shoot, I've got security. If I had a breach, I'm fired, because that's what happens when someone gets breached. They might already be camped inside the facility. And two, are my servers getting whacked, and are they misconfigured?

Yeah, well, am I delivering the experience that my users expect?
Or SLA, internal or external?

Ultimately, it's the KPI of the business. It could be something at the millisecond level because you're online gaming. It could be something that's in a physical location because people are doing self-service checkout. It could be just people checking their retirement accounts or trying to do online transactions.

So you've got the application piece, which is SLA-based, could be internal or external. You have a misconfiguration, some weird ops thing that happened through some automation gone bad. You see lots of, you know, a server got rebooted or the new Docker image got floated up.

Yeah. Yeah, orchestration is new, right? I mean, people are automating workloads. And it has failures, and you've got to figure them out, and they don't surface immediately, right? It's something that you notice well down the line.

So you guys are ops dev. I mean, I would put you guys in the category of...

I like ops dev.

Ops dev. Well, DevOps, this is what we find. So Pat Gelsinger at VMworld said that, of all the DevOps stuff they've been getting into, they've done the surveys, and the attendees were 80% IT ops. Not so much dev, test/dev on the localhost.

The IT ops guys are underwater, right? And they need help and they need smarter software. The dev guys, they don't really want to carry pagers, and they know how to write software. And they feel good right now. There's a lot of good tooling out there for the dev guys. Not so much for the ops guys.

The ethos is pushing toward infrastructure as code. And then the ops guys are saying, oh my God, we know this trend's coming. So it seems like they're changing the airplane engine out midstream, for a lot of these companies.

It is, it is. And as I said, if you're in a world where you just migrated to a new mainframe and you're deploying a hybrid cloud solution, try to fit both of those in your head and build an operating model around it. That's challenging.
Going back to the Facebook @Scale conference that we were at a couple of weeks ago, one of the Google PMs said something interesting about running their native services: that the way they'll make it easier to consume is DevOps tools to put the whole thing together. And how they fit in third parties, or whether they fit in third parties, is a philosophical discussion between the different service providers. But how does Rocana fit into that world when you're going to use the cloud native services exclusively?

If you are using cloud native exclusively, I think you're still at the scale where you can probably build a single... you have one silo. You have one application domain.

Oh, but like the banking example.

Very few companies are multi-billion dollar companies and have one platform. In fact, I'd say probably no one at this point. If you look at all the large successful companies, they grow over time, they adopt new technology, they can't retire the old technologies. And so they work in this hybrid environment, and there isn't a single one-stop shop to build their monitoring.

What he's basically referring to, what I think you're saying, is the outliers are the Googles. And then everyone in this IT world-

They build their own servers, right? They can control everything. And they need-

It's not transformational. It's not like they have to do a transformation. All right, so final question. What's your take on the show this year? You've seen the evolution-

I've seen this since it was like 500 people across town.

Give me the color on that. But put your industry hat on, take your CEO hat off. What is Hadoop World? How has it evolved? And what's your take on it? Do you feel like the big whales are in here too heavy? Does Cloudera have a shot? Storage is now key. What's your take on everything?
I think it's finally gotten to the point where there's no longer a question as to what the platform technologies are, right? Even this whole, is it Spark? Or is it Hadoop? Or is it Cassandra? I was talking to someone today. He said, you know what? Our customers, they use everything, right? They used to buy DB2 and SQL Server and Oracle, right? They used to have this, I'm an Oracle shop, I buy that. Now you buy this collection of next-generation data management, and everyone's looking for, how do I get the most value out of it, or what can I do that everyone else is doing to transform their business? There's a lot of more sophisticated R&D, and we're seeing some buyers show up at the show: technologists, senior technology leaders who are saying, you know, we're investing in this. Now how does it become strategic to our business?

I'll give you the final word. Share with the folks out there watching about your startup. Take the opportunity to plug the company, your goals for the year.

We haven't been plugging it for-

Why should they work with you? Well, we're just getting the word out for you guys, but I think it's valuable. As the CEO, share why they should do business with you and what your plans are for the next year or so.

So what we see is we're really the first company that's starting to look at monitoring and operations from a global perspective. And so if you are a VP of infrastructure, if you're a CIO, if you're a CISO, and you're starting to figure out how this massive tooling actually makes sense, and you think big data might actually be the answer behind it, we have an application that solves that problem for you.

Omer, thanks so much for sharing. Congratulations on your success.

Thank you very much.

The startup is Rocana, here inside Big Data NYC. We'll be right back with more after this short break.