Live from the MGM Grand Convention Center in Las Vegas, Nevada, it's theCUBE at Splunk .conf2014. Brought to you by headline sponsor Splunk. Here are your hosts, Jeff Kelly and Jeff Frick. Hey, welcome back. I'm Jeff Frick. We're here at the MGM Grand in Las Vegas at Splunk .conf2014, the fifth annual Splunk user convention, over 4,000 Splunkers, aficionados, customers, partners here, learning about Splunk, use cases, best practices, a lot of energy, the room is full. If I could turn the camera around, you'd see people all over the place. Joined here in this next segment by my co-host. So we're joined by Todd Papaioannou, who is the CTO of Splunk, also a CUBE alum. Todd, welcome back to theCUBE. Thank you. Good to be back. You've been with Splunk now almost a year. What's the year been like? You're basically riding a rocket ship here, aren't you? Absolutely. Rocket ship is a great way to describe it. It's coming up on a year. I've been learning. I spent the first three months, I think, just learning the people, learning the products and the architecture, something about the customers, then spent the next three or four months on the road, talking to our customers, learning all the fantastic stuff people are doing with Splunk in their business. It felt like I was finally up to speed and was dangerous enough to talk about the company, talk about what we're doing, understand the market, where we fit and all the rest of it. What's your take on the show here? Obviously, a lot of great customers, a lot of good energy. The show has been fantastic. Bigger than last year, a ton of customers. The best thing about this show, I think, is just the amount of customers and use cases and folks I get to talk to. It's never a dull day when I get up and know that I'm going to talk to half a dozen high-profile customers doing a real variety of different things with Splunk.
I definitely want to talk about some of those novel use cases that you're seeing, but I think what would be beneficial to our audience is if you can give us a Splunk 101 from a technology perspective, because Splunk gets kind of mentioned with the big data term, and I'm not sure everybody quite understands where Splunk fits. Maybe if you could walk us through the technology stack from a Splunk perspective, and then maybe we can put it in context with the larger big data world. Absolutely. So I think of Splunk as doing kind of three or four very simple things. The first is we collect data. So we have a set of agents, we call them forwarders, that go and collect data. So we collect data from the moment it is created. The server, the log file, the application, on the wire, on the network, wherever it is, we collect that data and we allow customers to just forward that and pour it into our platform. So our platform is a scale-out distributed data fabric. I think of us as the next generation data fabric for unstructured data. People collect it, pour it in. They don't have to do anything to it. There's no ETL, no processing, no MDM, none of that. We just let the raw data get poured in. And then we allow you to extract value. And we do that using our search language. Think of it a little bit like Google. Everybody uses Google every day to extract value out of unstructured data mostly. So our search language is the interface to the underlying data. And on top of that, we build applications on top of our analytics platform. So there's a bunch of high value ones that we build like Enterprise Security, MDM, VMware, Exchange and stuff like that. There's a bunch that customers build. But at its heart it's very simple. Collect data, do nothing to it, pour it into our platform, extract value using our search language and then use applications on top of it to drive business decisions.
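The collect, store raw, then search flow described above can be sketched in a few lines of Python. This is an illustration only, not Splunk's implementation or its search language: the sample events and the names `extract_fields`, `search`, and `count_by` are all hypothetical. The point it tries to show is that fields are pulled out of the raw text at search time, rather than through ETL before the data is stored.

```python
import re
from collections import Counter

# Hypothetical raw events, kept exactly as they arrived (no ETL, no schema).
RAW_EVENTS = [
    "2014-10-07T10:00:01 host=web01 status=200 path=/cart",
    "2014-10-07T10:00:02 host=web02 status=500 path=/checkout",
    "2014-10-07T10:00:03 host=web01 status=500 path=/checkout",
    "2014-10-07T10:00:04 host=web01 status=200 path=/home",
]

# Assumed event layout: whitespace-separated key=value pairs after a timestamp.
KV_PATTERN = re.compile(r"(\w+)=(\S+)")


def extract_fields(raw):
    """Pull key=value pairs out of a raw event at search time."""
    return dict(KV_PATTERN.findall(raw))


def search(events, keyword=None, **field_filters):
    """Yield (raw, fields) for events that contain the keyword, if one is
    given, and whose extracted fields equal every requested filter value."""
    for raw in events:
        if keyword and keyword not in raw:
            continue
        fields = extract_fields(raw)
        if all(fields.get(k) == v for k, v in field_filters.items()):
            yield raw, fields


def count_by(results, field):
    """Count search results grouped by the value of one extracted field."""
    return Counter(fields[field] for _, fields in results if field in fields)


# "Application" layer: count server errors per host from the raw data.
errors_by_host = count_by(search(RAW_EVENTS, status="500"), "host")
print(errors_by_host)
```

Because nothing is parsed up front, a new question (say, filtering on `path` instead of `status`) needs only a different search call, not a change to how the events were stored.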
The application space in the big data world is an area that we've been covering for a while here on theCUBE and at Wikibon. It seems like every year people say this is the year of the big data application and it just simply hasn't happened for the most part. But I'd say Splunk is one of those exceptions. Talk a little bit about the application layer and your approach to that because, as you mentioned, and we had a guest on yesterday, Splunk both creates applications for specific use cases and you also open up the platform to customers and developers to build their own applications. We do. How are you able to kind of balance both of those things and what's your approach to applications? When does Splunk decide, well, this is an application we need to build and this is an area we need to focus on, versus let's let the Splunk community have at it and build their own applications? Yeah, so a couple of things. I think you're absolutely right. The applification of the big data space has been something that I think I was actually on theCUBE in like 2012 talking about. It's been so slow developing. I think personally when I look at the market, I say, look, if there are no applications in the big data space then it goes away and just becomes like a file system for Teradata or someone like that. So we need applications. We need those killer apps. Over the last couple of years I think we're starting to see a couple of patterns emerging. Security being one, I think. Security as a killer pattern for taking this mass of data, unstructured data, and being able to drive value out of it, and also personalization and targeting. That's another space that I see classes. At Splunk we deliver a platform. Our core product, Enterprise, is a platform you can build on top of. So we have a set of SDKs, a set of partner support stuff, developer support so you can actually go and build apps.
And so we build a couple and our approach really is to let the customer kind of lead us into the applications that they're building and the problems they're trying to solve on top of our platform, and so we take that and we learn from that, and that's kind of led us to Enterprise Security, one of our apps, and that led us to some of the other apps that we have around monitoring VMware, clouds, and we're now monitoring Amazon and that kind of stuff. But there's such a diverse set of things that you can do with Splunk and the data when you pour it into Splunk that there is a thriving ecosystem of other people building applications on top of our platform and we welcome that. We have a program internally that is specifically focused on outreach to developers, outreach to partners, the channel, and establishing an ecosystem of applications. I think for us as a business, we have big aspirations in the data space. For us as a business to become a piece of the cement, a piece of the fabric of the enterprise architecture, we need applications and so we focus on enabling people to build apps. So as you mentioned, you've got big ambitions to be a key part of that modern data architecture. So help put this in context. So you mentioned kind of the storage aspect of Splunk, the scale-out, distributed, it sounds a lot like what something like Hadoop does and you've got great Hadoop experience, you've got the tech chops in that space. So put Splunk in perspective. How does it relate to something like Hadoop, like the world of NoSQL? Where does it fit in this kind of emerging modern approach to data management and analytics? Yeah, that's a great question. So the way I think about it is this. Data gets created at some point and you want to collect it. So we're the front door for that. We collect the data and then we allow customers to pour it into our platform. So our core platform is Enterprise. It's an index.
We do some magic to the data to allow you to extract value out of it very quickly and that's really, really good for doing real-time decisioning, real-time alerting, real-time analytics on it, and we see customers using it for a whole wide array of things, but it allows you to act on the data immediately. And we see customers, they store a ton of data in there, but, you know, I would say traditionally it's a shorter window than, say, the amount of data you put in Hadoop. Let's say for us, we're looking at people making decisions on 30 to 90 days' worth of data, and in Hadoop it might be like four years. It's unlikely you're going to keep four years' worth of data in Splunk core, but that's okay because we've got you covered. I look at Hadoop as where the data lake is going to be, where all the data is going to end up over the next 10 to 15 years. HDFS is a fantastic file system. It scales out for everybody. So I see kind of a life cycle for the data, right? So we collect it, we run it through our index, you do real-time decisioning, and then it kind of lands in HDFS and then you start to do more long-term trending analysis, that kind of stuff on it. So we have a product called Hunk, which is really all of the goodness of Splunk, all of the analytic capability of Splunk, but it just layers over the top of HDFS and allows you to extract value out of any data in HDFS immediately, the first day you turn it on. So kind of leveraging that capability but leaving the storage behind and bringing the application and the analytics component and layering that on top of HDFS. Yeah, the way I think about it is kind of like a unified analytics layer, right? So Splunk is a vertically integrated application, quite different, I think, to some of the other data vendors out there.
We collect the data, we store it for you, we process it, we allow you to extract value through our analytics capability and also our visualizations and data enrichment and all that. There's a ton of stuff that we do in the analytics layer from the raw data and field extraction, data enrichment, modeling, pivoting, visualizations, the whole shebang, to the point where users are able to see stuff on screen. All of that goodness above the storage fabric, you can just put straight on top of HDFS and use the same analytics capability on stuff that's in our store and stuff that's on HDFS. You can actually hybrid across the two of them. Very cool. Jeff, I know Jeff wants to get in with a question. I just have one more. Actually, I wanted to ask on this thread. Curious, does Splunk consider itself a competitor to some of the SQL-on-Hadoop tools that are being developed? Are you trying to play that role as well? I definitely would say no to the SQL on Hadoop because we're not a SQL platform. Right, but isn't the goal to provide more self-service type analytics, easier analytics to get value out of that data in Hadoop? Is that not a similar goal? Yeah, absolutely. I think, so Hunk's a new product for us. We released it in the last year. It brings all of the capability of our search language and analytics and all the stuff you can do in Splunk to Hadoop, and it's really targeted at existing customers who want to roll data onto Hadoop for long-term storage, cheap batch storage. It's also targeted at Hadoop customers and, like you said, I've been around the Hadoop ecosystem for quite a while. I think the state of the art in Hadoop is, charitably speaking, that it takes a little while to start to get value out of Hadoop. There are a lot of science projects that happen, six-, nine-month build-outs. There are a lot of people who are a little bit dissatisfied.
And so with Hunk, you can take it and point it at data that's in HDFS and use all of our analytics capability, visualization, charting, all the rest of it to extract value. So is it a competitor to Impala and Stinger and those kinds of things? I mean, in the sense that we help people extract value from the data in HDFS and Hadoop, sure, you could say that. I think we do it differently. We're a much more vertically integrated stack, focused more on, you know, IT and the business analyst rather than the programmer per se, right? Yeah, and I want to follow up on that, Todd, because one of the topics that always comes up is the consumerization of IT. And really, when we talk about that, as you said, everybody's used Google. It's really the expected behavior of applications and access to data that I'm used to when I'm on Google, right? That's kind of my benchmark now. And it seems like Splunk's taking a very different kind of Google-esque philosophy, if you will, in that it's really based on search. It's not based on data scientists who are very, very sophisticated at complex queries, et cetera, but really more an iterative process of delivering that search capability down to people that are not data scientists so they can make actionable decisions. Yeah, exactly. I think that's a very, very astute way of thinking about it. I think the company was originally started back in the day by the founders to basically bring Google to IT data. And so I've been on record saying I believe that search will be the de facto query language for big data going forward. I'm a bit of a Star Trek fan. I think most techies are. And when I think about how people interact with data on Star Trek, well, they talk to the computer: "Computer, show me all the planets that were in blah, blah, blah, blah, blah." Well, that's search, right? It's not like "select planets from planets table, cross join with local universe."
They don't do that, right? I think the vision that I have for where we're going with interacting with data will be driven by search. Data exploration and data discovery will be a search thing. And you're going to want to use the kind of capabilities we have in our products right now, the search language, and even more so, more towards NLP. I want to be able to log into a product and say, show me all of the customers within a 20-mile radius of this store who churned in the last six months. And that's the kind of interface that I want for, you know, data discovery, data exploration. And that's the business analyst thing. Now, there's one more access pattern to the data, I think, which is important, which is algorithms, right? Which is, you're seeing a lot in big data, people starting to do stuff with Spark and some of the other things on top of Hadoop, which is really driving machine learning algorithms. You still need that capability in the platform. That won't be search driven, that'll be algorithm driven. And I think the data scientists will, you know, use search to understand stuff, create an algorithm. The algorithms start to run, right? And then you start to feed that back into the decisioning system. And so ultimately, you want to build what I think of as data-driven applications, data-driven decisioning, right? Yeah, and I thought what was interesting in Godfrey's keynote is he talked about not only adding what we think of as traditional big data sources, you know, with mobile and now structured in social, but even, you know, machine to machine, hardcore, inside-the-machine data as well as now connecting to mainframe. So really expanding the breadth of data sources and data types that feed into this machine. Yeah, absolutely. I think when, you know, when I think about the opportunity for us at Splunk, anywhere there's data, machine data, that's where we go, right?
I mean, so when I talk to customers, I say, well, look, I think you probably think traditionally we're good at kind of, you know, ingesting your server logs. Well, think about the server logs and the application logs. Also the logs of all of the devices in the data center. By the way, we'll also get the data from the power in the data center and the building and cars and trains and planes and automobiles and all of that stuff, right? We go and get that and we pull it all into our platform. We can take data pretty much from anywhere and allow you to, you know, build really interesting insight out of it. And I think, you know, one of the best things about this job is I'm never bored, meeting new customers doing novel things. You know, I was telling some folks actually last night, I was speaking at a conference, you know, a couple of months ago for one of our customers and they build fuel pumps and gas stations and stuff like that. So you know, you guys do a lot of these shows, right? You probably go to the partner show, you know, for us, right? You see a bunch of booths with, you know, monitors and people showing you software. So I did the keynote of this thing. I came out, I went to walk the floor and in the middle of the floor is a fuel pump, a fully functioning fuel pump, and I'm like, wow, this is the coolest thing. This is real, actual, tangible stuff that affects me day to day. And by the way, we're pulling all the data off that stuff and allowing them to do analytics on it in real time, right? From a fuel pump. I'm like, okay, that's just one customer. You know, next day I'm talking to someone who's monitoring elevators and escalators, right? Then the next day it's people doing buildings and then it's software and applications and, you know, security intrusions, and every day is a fantastic, fantastic experience of what people are doing with our software.
So we're getting short on time but there's another huge force in the marketplace that I wanted to touch base with you on. That's Amazon, right? Amazon on a whole bunch of fronts. Amazon, again, defining the way we interact with applications and our expected behavior, but then also obviously AWS and their drive to cloud, their drive on pricing, and I see AWS is a sponsor here, obviously a big partner. I wonder if you could tell us a little bit about, you know, the relationship between Splunk and AWS. Where is it today? Where is it heading? Yeah, they're a fantastic partner, a fantastic business partner and also a show partner. And we had a big announcement this week. You know, we've actually announced that they're going to be OEMing Hunk, rolling it out, so, I mean, clearly Amazon is setting the trend. They've been setting the trend for years in where cloud computing is going. There's a ton of data in S3, a ton of data that people process in EMR. They're OEMing Hunk and allowing people to use Hunk on top of data in EMR and S3, which I think is huge because that data set is just going to continue to grow exponentially, right? I think it's a very, very tight relationship. You know, we have a ton of fun working with them and a ton of really good, you know, high-level synergy on where the future is for these applications that people are going to want to build on top of their platform and where we can help drive value. Yeah, as I was going to say, it's just great that you guys, you're on Amazon, you've got Hadoop, you've got your own SaaS platform. You guys seem to be really covering the bases for your customers and really offering a plethora of options for them to be able to implement this. Absolutely. Ultimately, what we want to do is offer people a single unified analytics capability and say, if you have data on-prem, we've got you covered. You've got data on Amazon, we've got you covered.
Your private cloud, we've got you covered, right? You know, by the way, we run our cloud too, all through the same unified analytics fabric, right? You can point it anywhere the data is and we'll Splunk it all together and, you know, drive insight and value out of it. That's kind of, you know, the big vision for us: data anywhere, we'll make it accessible and usable. All right, Todd. Well, thank you very much. We're getting the hard hook. Give Matt grief for not scheduling a longer segment with you. So next time, Matt, we need longer segments. So Jeff Frick here, we'll be back in the next segment after this short break.