Live from San Jose, California, in the heart of Silicon Valley, it's theCUBE, covering Hadoop Summit 2016. Brought to you by Hortonworks. Here's your host, John Furrier.

Hey, welcome back. We're here live in Silicon Valley, in San Jose, for Hadoop Summit 2016. Soon it'll be called DataWorks Summit; they've announced the name change. This is theCUBE, SiliconANGLE's flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier. My next guest is Joey Echeverria, platform technical leader at Rokana. Welcome to theCUBE.

Thank you.

So Rokana is one of those companies that has come out of the DNA of this ecosystem. Omar, Don, the whole team, Eric Sammer: former Cloudera guys, but they've also been doing big data. And I remember when you guys started, it was interesting, because Splunk had just gone public around that time, or was pre-public. It was pretty clear that the data value was there. So I remember that was where you guys started. Give us the update, because now it's all about instrumentation. It's all about ops.

Exactly.

Give us the update on Rokana.

Well, basically what we're seeing out there in the marketplace is that our customers, a lot of them traditional Fortune 100 and Fortune 500 companies, are trying to compete with these really digital-native companies. We've got retailers that are competing with Amazon, and they really need to change the way that they do business. They need to move away from individual monitoring of systems to what we call IT operational total visibility, or digital transformation, and actually collect all of the data, not leave any of it on the floor. That allows new use cases, like year-over-year analysis and things like that.

Yeah, it's interesting. I was talking about that at DockerCon last week. Docker madness is going crazy, but one of the things that was common on theCUBE was, you have Airbnb, and then you have, you know, the Hilton or Marriott website.

Yep.
I mean, it's a stark comparison between, you know, cutting edge, cool, a lot of dynamic action, very data driven, a lot of social graph, very cool, relevant, and then kind of the boring hotel sites.

Right.

And so I use that as an example, but this is the phenomenon you're referring to: a lot of these existing enterprises need to get modern, they need the agility. So ops has now kind of been embracing DevOps.

Yep.

I mean, DevOps has won, basically. You look at the Docker momentum, DevOps is now mainstream.

Yep.

That's going to put a lot of pressure on IT ops.

Oh, definitely.

And so that conflict between dev and ops has been there from the beginning, but now that DevOps has kind of won that battle, ops has got to transform, but they're not going to reduce their SLAs. They've still got to have security. How do you guys help with that? Because I know that's something that you guys are bullish on.

Yeah.

Give us a taste of what that world's like right now.

Absolutely. I mean, we're really focused on scalability, security, all of those concerns that IT ops is going to have, especially as their operational data is merged with business metrics. They're not just going to do IT system monitoring to know that a server's been down. They care how it's impacting their customers, how it's impacting their bottom line. And so one of the things that we've recently announced with our 1.5 version is a brand new search engine. It lets us scale to volumes of data that none of our competitors have been able to reach, makes that data instantly searchable, and allows long-term retention. And the way that we store the data is actually designed so that you can have years upon years of data retention without a loss of performance on the query side.

So give me an example of how deep you guys go on the storage versus the query. Obviously latency is key.

Yeah.

What kind of performance do you get, and what depth of date range?
Yeah, I mean, we've done testing internally and have hit 400,000 events per second on 24 commodity nodes pretty easily. And we've tested data retention on the order of three or four years, because we only ever touch data that's actively being queried. The archived indexes don't impact the performance of the live indexes.

So give us an example of what this means for the customer.

Yeah, it means that you can actually do year-over-year analysis. You can compare what you were seeing a year ago or a week ago to what you're seeing now. And that's really critical for new analytics like anomaly detection. One of the features built into our system is anomaly detection across every metric we collect. That way you don't have to think about: what do I want to model, what do I not, what do I care about? Everything is available to you at your fingertips.

So does IoT get you guys excited? Because you guys are doing something that's kind of a real cutting-edge category called embedded analytics.

Yes, exactly.

This is what we're talking about: in real time, data, value.

Exactly.

And putting the value in the data, wherever it's stored or could be pulled in, that is key. Talk about what that is. Let's spend a minute to describe that dynamic of embedded analytics and the impact, whether it's IoT or whatnot.

Yeah, absolutely. I mean, all these big data systems are great, but if you're not getting results that matter to you, then it's not as useful. It's not contextualized. And so we really focus on going beyond search and giving you things like anomaly detection and built-in real-time analytics, so that you're notified when something is going wrong and you're actually able to use the analytics to drive you to root cause analysis, rather than having to rely on just sort of dumpster diving in a giant search index.

So if you had to boil it down, what main problem do you guys solve? I mean, when you guys go into the customer base, what's the main problem?
Yeah, it's really about that visibility of the data, the ability to archive all of the data. I mean, we've got customers that are using us to monitor the real-time traffic on their websites, to make sure that if something starts to go wrong or starts to deviate from the norm, they're able to jump on it immediately and get to it before there's an impact on revenue.

What does total visibility mean to you guys? I know it's something you guys talk about a lot.

Yeah, it means that you can collect all of your IT operational data. You don't have to pick and choose what you think is important. This is really critical for security use cases. Most intrusions aren't detected until nine months after the fact. And by that point, your normal IT operational systems have dropped all of that log data, and you can't do any kind of retrospective analysis.

Is security driving a lot of the value proposition for you guys? That seems to be a big market right now for this kind of stuff.

Yeah, absolutely. Security is really big. It's top of mind with our customers. It's driving the need to retain the data. It's driving the need to analyze older data and compare it to new data. So it's definitely a big driver.

So you guys are doing really well right now. Give an update on the company. What is Rokana up to? Growth, headcount, customers.

Yeah, sure. So we've got 59 full-time employees. We've got three customers in production, nearly a dozen customers in active POCs, and a huge pipeline, a ton of interest, just constant activity.

The thing I love about your company is you've got a lot of seasoned vets out there. Omar, Chane, Eric Sammer, Don Brown. A lot of technical depth on the management team, but also down and through the company.
So I've got to ask: when you guys go to your off-site meetings and talk as a group, when you look at this ecosystem that's happening now, and obviously at the Hortonworks event, Hadoop Summit is being renamed DataWorks, which really speaks volumes about this transformation in the industry. What are some of the, I won't say old-timers, because it's only been 10 years, but I know Omar's been around. What's the discussion internally? Share some of the conversations, like, oh, the ecosystem's a mess, it's transforming, where's the value? How do you guys talk about this ecosystem?

Yeah, I mean, we definitely see the shift in the ecosystem from a batch-only world to a real-time streaming world. That's where all the innovation is, that's where all the demand is. We're constantly seeing evolution of the Hadoop system and related technologies from a security perspective. In the early days, security was: don't let anyone in, because we don't have any control over it. So the newer features around being able to control your data sets, apply sort of rich labels to them, and control access that way are really powerful.

So I'm going to ask you the question about cloud, because obviously hybrid cloud is where a lot of your enterprise customers are moving to or building. How do the instrumentation and embedded analytics affect your offering and/or the customers' move to the cloud?

That's a great question. So we very specifically provide our software as a downloadable that our customers can install, because we know that they have infrastructure in the cloud and they have infrastructure in on-prem data centers. We don't want to restrict their usage of our platform, and they really want to bring all that data together and enrich it, because they don't want to have to explain: well, we didn't know that this part of our system was down because it's in the cloud, and we don't have monitoring on that, we only have monitoring on our older internal data centers.
So that sort of hybrid approach, that sort of unified approach, really means that you have to bring all that data together, analyze it in real time, and present those analysis results to the users of the system so they can take action right away.

So that doesn't put any constraints on the customers, because you're basically saying, this is an old-school technique, you know, when everything's SaaS and platform as a service, but realistically, they're embedding it into their architecture, they don't have to worry about what you guys do as a business, and they don't have to make any adjustments. Is the feedback positive on that? I mean, give me an example of a use case.

Yeah, I mean, the feedback's been really positive on that. A big part of our messaging is that we're built on open source technologies, we're built on open data, we store all of our data in open source formats. We don't think that we're the owners of your data; we're not the gatekeepers of it. Customers depend on that. If they're going to be storing data in these systems for the next seven or 10 years, they need to know that it's going to be around regardless of what happens out there.

Yeah, it's interesting, Joey. The API economy has been a phrase that's been kicked around. Okay, we all get that. The perimeter's gone. That's going to increase your value on the security side, but we're also an event notification economy. And no one's really talking about that, because it means different things. Event-driven stuff is big for you guys.

Absolutely.

How are customers looking at that trend? Because you could have a zillion notifications going off, and you've really got to start thinking about which ones are more important. I'm assuming you guys must have some machine learning, some stuff going on. How do you guys look at this whole tsunami of alarms and events and all these notifications?

Yeah, absolutely.
I mean, it's really about taking the power of complex event processing and making it accessible to the end user. You don't want to deluge them with a million additional alerts and events that they need to look at. You want to surface to the top the things that are really relevant. And we've actually done a lot of work on our anomaly detection engine to specifically look for deviations that are of significance, not just relative to what normal data is, but relative to the amplitude of the data. You have a lot of signals in your data center that are usually more noise than signal, and we can actually differentiate those. We don't send you a bunch of alerts on those, where sudden spikes may actually be the norm, versus other data that's a lot more stable, where if it deviates a little bit, you want to know right away.

It's like going to a hospital: you hear all these beeps and alarms, but you want to know about the code blue or whatever.

Exactly.

So you have to kind of understand the signal-to-noise ratio. Is that what you're talking about with the anomaly detection?

Exactly, yeah. Basically, the signal-to-noise ratio in the actual metric streams that we're getting gives us enough information, using a technique called prediction intervals, to provide contextualized alerts that really matter, versus noisy alerts based on every change.

So I'm going to ask you about the cost, because one of the things that it seemed at the beginning, when you guys launched, was that you have a price model that's disruptive to other competitors, where, you know, they're trying to ratchet up their prices because, I mean, Splunk's a public company, so they've got to produce earnings. So I imagine they've got to increase their value and change their product. How do you guys compare to the competition on pricing and value?

That's a great question. So we price per user.
The reason why we do that is that our system is designed to make your users 10 times more efficient. And so we're really letting you invest in those users, and we think that that's the right unit to measure you on. We don't measure your data ingest. We don't restrict how much data you can have archived. We don't use any of those kinds of things. It's really just based on the total count of users that you want to have access.

So you guys are betting on the long game, pretty much.

Exactly.

Give the customer some value, lower the surface area of value and allow them to get in. And hopefully, if it works, your premise is you're betting on happy users.

Yep, that's exactly right. Happy users drive our business.

And so how's that going? I mean, you're funded, so you've got a lot of cash. You're not going to go out of business or run out of cash anytime soon. But I mean, how's that tracking?

It's been great. I mean, the number one reason why customers come and talk to us is oftentimes our business model. They hear about the business model. They get it instantly. They're like, yes, that's exactly what I need. I don't want to have to pick and choose what data I can bring into the system because if I bring more data, it's going to be more expensive. I want to make sure that I'm paying for the value that I get out of it. And they can measure that value based on the capabilities that users actually get.

That's an ethos, kind of a philosophy, because the concept of open data is to increase the surface area or observation space for the customer.

Exactly.

So why wouldn't you want more? The more the better, right? I mean, that's the whole purpose of big data.

Oh, absolutely, 100%.

All right, Joey, thanks so much for coming on theCUBE. Really appreciate it. Rokana, you guys are hiring? Give a quick plug.

Yep, we're hiring across the board: rokana.com/careers.

All right, Joey's here inside theCUBE, breaking it down at Hadoop Summit 2016. We're live in Silicon Valley.
We'll be right back with more after this short break.
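
The alerting idea from the anomaly detection exchange above (stay quiet when a spiky metric spikes, speak up when a stable metric drifts a little) can be illustrated with a minimal Python sketch of a prediction-interval check. This is not Rokana's implementation, which the interview does not describe in any detail. The class name `PredictionIntervalDetector`, the rolling window, the normal approximation to the interval, and every parameter value are assumptions chosen for illustration.

```python
from collections import deque
from math import sqrt
from statistics import NormalDist, mean, stdev


class PredictionIntervalDetector:
    """Hypothetical per-metric detector: a value is anomalous when it falls
    outside a prediction interval estimated from recent history."""

    def __init__(self, window=30, coverage=0.99, min_points=5):
        # Rolling window of recent non-anomalous values (assumed design).
        self.history = deque(maxlen=window)
        # Two-sided normal quantile, about 2.58 for 99% coverage.
        self.z = NormalDist().inv_cdf(0.5 + coverage / 2)
        # stdev needs at least two points.
        self.min_points = max(2, min_points)

    def interval(self):
        """Return (low, high) for the next value, or None during warm-up."""
        n = len(self.history)
        if n < self.min_points:
            return None
        center = mean(self.history)
        spread = stdev(self.history)
        # Normal-approximation prediction interval for one new observation.
        half_width = self.z * spread * sqrt(1 + 1 / n)
        return (center - half_width, center + half_width)

    def observe(self, value):
        """Return True if value is anomalous; otherwise record it."""
        bounds = self.interval()
        is_anomaly = bounds is not None and not (bounds[0] <= value <= bounds[1])
        if not is_anomaly:
            # Anomalies are kept out of the history so they do not widen
            # the interval for later values.
            self.history.append(value)
        return is_anomaly


# Two synthetic metrics with the same mean but very different amplitude.
stable = PredictionIntervalDetector()
noisy = PredictionIntervalDetector()
for i in range(30):
    stable.observe(100 + (i % 3 - 1) * 0.5)  # cycles 99.5, 100, 100.5
    noisy.observe(100 + (i % 3 - 1) * 40)    # cycles 60, 100, 140

print(stable.observe(105))  # True: a small move on a stable metric
print(noisy.observe(105))   # False: well inside the noisy metric's range
```

The same reading of 105 produces an alert on the stable series and none on the noisy one, because the interval width scales with each metric's own variability. A production system would likely also need to handle trend and seasonality, which this sketch ignores.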