From Austin, Texas, it's theCUBE, covering Pure Storage Accelerate 2019, brought to you by Pure Storage. Welcome back to theCUBE, Lisa Martin, Dave Vellante's my co-host. We're at Pure Accelerate 2019 in Austin, Texas. A couple of guests joining us next, please welcome Bharath Aleti, Director of Product Management for Splunk. Welcome back to theCUBE. Thank you. And guess who's back? Vaughn Stewart, VP of Technology from Pure. Hey Vaughn, welcome back. Hey, thanks for having us guys. Really excited about this topic. We are too. All right, so Vaughn, we'll start with you since you're so excited and your nice orange pocket square is peeking out of your jacket there. Let's talk about the Splunk-Pure relationship, long relationship, new offerings, joint value. What's going on? Great setup. So Splunk and Pure have had a long relationship around accelerating customers' analytics, the speed at which they can get their questions answered, the rate at which they can ingest data, right? To be able to ingest more sources, look at more data, get faster time to take action. However, I shouldn't be leading this conversation, because Splunk has released a new architecture, a significant evolution, if you will, from the traditional Splunk architecture that was built off of DAS and a shared-nothing architecture leveraging replicas, right? Very similar to what you'd have with, say, an HDFS workload or an HCI, for those who aren't in the analytics space. They've released a new architecture that's disaggregated, based off of a caching and an object store construct called SmartStore, which Bharath is the product manager for. All right, tell us about that. Sure. So we released the SmartStore feature as part of Splunk Enterprise 7.2 about a year back, in the September timeframe. The very genesis of Splunk SmartStore goes back to a key customer problem that we were looking to solve.
So one of our customers was already ingesting a large volume of data, but they needed to retain the data for twice as long. And in today's architecture, what that required was for them to linearly scale the amount of hardware. What we realized is that sooner or later all customers are going to run into this issue: if they want to ingest more data or retain the data for longer periods of time, they're going to hit this cost ceiling. And the challenge is that today's distributed scale-out architecture, which evolved about 10 years back with Hadoop, co-locates compute and storage. Because compute and storage are co-located, it allows us to process large volumes of data. But if you look at the demand today, we can see that the demand for storage is outpacing the demand for compute. So these are two directly opposing trends that we are seeing in the marketplace. And if you need to provide performance at scale, there needs to be a better model, a better solution, than what we had. So that's the reason we brought out SmartStore and announced availability last September. What SmartStore brings to the table is that it decouples compute and storage, so that now you can scale storage independent of compute. So if you need more storage, or if you need to retain data for longer periods of time, you can just scale the storage independently. And we leverage remote object stores like Pure FlashBlade to provide that data repository. But most of your active data set still resides locally on the indexers, right? So what we did was break the paradigm of compute and storage co-location, and we added a small twist. We said that compute and storage can now be decoupled, but we bring compute and storage back together only on demand.
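For readers curious what this decoupling looks like in practice, SmartStore is configured in Splunk's `indexes.conf` by defining a remote volume and pointing an index at it. A minimal sketch follows; the bucket name, endpoint address, and credential placeholders here are illustrative, and a real FlashBlade deployment may need additional settings:

```ini
# indexes.conf -- minimal SmartStore sketch (illustrative values only)
[volume:remote_store]
storageType = remote
path = s3://smartstore-demo-bucket         # bucket on the object store
remote.s3.endpoint = https://10.0.0.10     # e.g. an object-store data endpoint
remote.s3.access_key = <access-key>
remote.s3.secret_key = <secret-key>

[main]
# Route this index's warm buckets to the remote volume
remotePath = volume:remote_store/$_index_name
```

With this in place, the indexers keep hot and cached data locally while warm buckets live on the object store and are fetched back only when a search needs them.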
So that means that whenever you're running a query or a search, and the data is being looked for, that is the only time we bring the data together. The other key thing we do is around the active data set. SmartStore has a very powerful cache manager that ensures the active data set is always in the cache. Very similar to the RAM on your laptop: the RAM keeps your active data set always in memory. So, very similarly, the SmartStore cache allows you to have your active data set always locally on the indexer, so that your search performance is not impacted. Yeah, so this problem of scaling compute and storage independently, you mentioned HDFS, you saw it early on there, the hyper-converged guys have been trying to solve this problem. Some of the database guys like Snowflake have solved it in the cloud. But if I understand it correctly, you're doing this on-prem. So we're doing this both on-prem as well as in the cloud. The SmartStore feature is already available on-prem. We're also using it to host all of our Splunk Cloud deployments, and it's available for customers who want to deploy Splunk on AWS as well. Okay, where do you guys fit in? So we fit in with customers anywhere from, and I hate to say it this way, the small side at hundreds of terabytes, up into the tens and hundreds of petabytes. And that really just shows the pervasiveness of Splunk, through the mid-market all the way up through the enterprise, every industry and every vertical. So where we come in relative to SmartStore is that we were a co-developer and a launch partner, and because our object offering, FlashBlade, is a high-performance object store, we are a little bit different from the rest of the Splunk S3 partner ecosystem, who have invested in slower, more archive-oriented modes of S3, right?
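The cache-manager behavior described above, keep the active data set local and fetch from the remote store only on demand, can be modeled as a simple LRU cache in front of an object store. This is an illustrative sketch, not Splunk's actual implementation; the class and names are invented for the example:

```python
from collections import OrderedDict

class SmartCache:
    """Toy fetch-on-demand cache in front of a remote object store.

    Loosely models the behavior described for SmartStore's cache manager:
    recently searched buckets stay local, cold buckets are pulled from the
    remote store only when a search asks for them.
    """

    def __init__(self, capacity, remote_store):
        self.capacity = capacity      # max buckets held locally
        self.remote = remote_store    # dict standing in for an S3 store
        self.cache = OrderedDict()    # bucket_id -> data, in LRU order

    def get(self, bucket_id):
        if bucket_id in self.cache:
            # Cache hit: mark the bucket as most recently used.
            self.cache.move_to_end(bucket_id)
            return self.cache[bucket_id]
        # Cache miss: fetch from the remote store, then evict if over capacity.
        data = self.remote[bucket_id]
        self.cache[bucket_id] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # drop least recently used bucket
        return data
```

The payoff of this pattern is exactly what's described in the interview: most searches hit the local cache, and only older, colder data incurs a trip to the object store.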
We have always been designed around, and kind of betting on, a future based on high-performance, large-scale object storage. And so we believe SmartStore is a perfect example, if you will, of a modern analytics platform. When you look at the architecture with SmartStore, as Bharath shared with you, you want to satisfy a majority of your queries out of cache, because of the performance difference between reading out of cache, let's say that's NAND-based or NVMe-based or Optane, if you will, and falling back to read data out of the object store, where you can take a significant performance hit. We significantly minimize that performance drop because you're going to a very high-bandwidth FlashBlade. We've done comparison tests against other SmartStore search results that have been published in other vendors' white papers, and we show that FlashBlade, when we run the same benchmark, is 80 times faster. And so what you can now have with that architecture is confidence that, should you find yourself in a compliance or regulatory issue, something like maybe GDPR, where you've got 72 hours to notify everyone who's been impacted by a breach; maybe a cybersecurity case, where the average time to find that you've been penetrated occurs 206 days after the event and now you've got to go dig through your old data; legal discovery; questions around customer purchases or credit card payments. Anytime you've got to go back into history, we're going to deliver those results an order of magnitude faster than any other object store in the market today. That translates into savings of hours to days, and days to weeks, and we think that falls to our advantage. Almost two orders of magnitude, you said, right? Not 80%, sorry, 80 times. 80x, yes, that's what I heard. So Bharath, does Splunk consider what FlashBlade is doing here an accelerant of Splunk workloads in customer environments?
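To see why the speed of the object store matters even when most reads hit the cache, the expected read time can be written as a weighted average of the hit and miss latencies. The numbers below are purely illustrative, not published benchmarks from either vendor:

```python
def effective_read_ms(hit_ratio, cache_ms, miss_ms):
    """Expected per-read latency for a cache backed by a remote object store."""
    return hit_ratio * cache_ms + (1 - hit_ratio) * miss_ms

# With a 90% cache hit rate, the 10% of reads that miss dominate the average.
archive_s3 = effective_read_ms(0.9, 1.0, 800.0)   # slow, archive-class store
fast_object = effective_read_ms(0.9, 1.0, 10.0)   # high-bandwidth object store
```

Even at a 90% hit rate the miss penalty dominates, which is the core argument for pairing SmartStore's cache with a fast object tier rather than an archive-class one.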
Definitely, because with the SmartStore cache we allow high performance at scale for data that resides locally in the cache. But now, by using a high-performance object store like Pure FlashBlade, customers can expect the same high performance both when data is in the cache and when it's in the remote store. Splunk's an interesting animal. You have a point before we change subjects? Well, I don't want to cut you off. Oh, it's okay. So I was going to say, the performance is just part of the equation. When you look at common operational activities that a Splunk team, not a storage team, but a Splunk team has to incur, right? Patch management, whether it's the Splunk software, maybe it's the operating system like Linux or Windows that Splunk is running on, or any of the other components on that platform. Patch management, data rebalancing because data is unequally distributed, hardware refreshes, expansion of your cluster, maybe you need more compute or storage. Those operations, in terms of time, when they're on SmartStore versus the classic model, are anywhere from 100 to 1,000 times faster with SmartStore. So you could have a deployment where, for example, it takes you two weeks to upgrade all the nodes, and it gets done in four hours on SmartStore. That is material in terms of your operational costs. So I was going to say, Splunk, we've been watching Splunk for a long time. This is our 10th year of doing theCUBE, not our 10th anniversary, but our 10th year. I think this will be our ninth year of doing .conf. And so we've seen Splunk emerge, a very cool company, like Pure, with a hip vibe to it. And back in the day we talked about big data. Splunk never really used that term widely in its marketing. But then when we started to talk about who's going to own the big data space, was it Cloudera? Was it going to be MapR? We came back and said, it's going to be Splunk. And that's what's happened.
Splunk has become a workload, a variety of workloads, that has now permeated the organization. It started with log files and security, kind of cumbersome, but now it's everywhere. So I wonder if you could talk to the explosion of Splunk and its workloads, and then what kind of opportunity this provides for you guys? So, a very good question here, right? What we have seen is that Splunk has become the de facto platform for all unstructured data, as customers start to realize the value of putting data into Splunk. And the virtue of Splunk, and this is a huge differentiator for Splunk, is schema on read, which allows you to put in all of the data without any structure and ask questions on the fly. That allows you to do investigations in real time, to be more proactive versus being reactive. And it's a scalable platform that handles large data volumes, a highly available platform. All of that is the reason why you're seeing increased adoption. We see the same thing with all our other customers as well. They start off with one data source, with one use case, and then very soon they realize the power of Splunk, and they start to add additional use cases and ingest more and more data sources. But this no-schema-on-write approach, what you call schema on read, has been so problematic for so many big data practitioners, because it just became this data swamp. That didn't happen with Splunk. Was that because you had very defined use cases, obviously security being one, or were there architectural considerations as well? So it is the architectural considerations, right? Security and IT were the initial use cases, but the fact is that schema on read opens up the possibilities for you, right? Because there's no structure to the data, you can ask questions on the fly, and you can use that to investigate, troubleshoot, analyze, and take remedial actions on what's happening.
And now with our new acquisitions we have added additional capabilities, where we can orchestrate the whole end-to-end flow with Phantom, right? So a lot of these acquisitions are also helping enable the market for us. So we've been talking about TAM expansion all week. We definitely hit it with Charlie pretty hard. I think it's a really important topic. One of the things we haven't hit on is TAM expansion through partnerships and that flywheel effect. So how do you see the partnership with Splunk supporting that TAM expansion over the next 10 years? So analytics, particularly log analytics, have really taken off for us in the last year as we've put more focus on it. We want to double down on our investments as we go through the end of this year and into next year, with focus on Splunk as well as other alliances. We think we are in a unique position because of the rollout of SmartStore, right? Customers are always at a different point in terms of when they want to adopt a new architecture, right? It is a significant decision that they have to make. And so we believe the combination of FlashArray for the hot tier and FlashBlade for the cold tier is a nice way for customers with the classic Splunk architecture to modernize their platform, leverage the benefits of data reduction to drive down some of the cost, and leverage the benefits of flash to increase the rate at which they can ask questions and get answers. It's a nice stepping stone, and when customers are ready, because FlashBlade is one of the few storage platforms in the market that is scale-out and bandwidth-optimized for both NFS and object, they can go through a rolling, non-disruptive upgrade to SmartStore and have investment protection. And if they can't repurpose that FlashArray, they can use Pure as-a-Service to have the FlashArrays as the hot tier today and drop them back off to us when they're done with them tomorrow. And what about FlashArray//C for big workloads, like big data workloads? I mean, is that a good fit here?
Does it really need to be more performance-oriented? Yeah, so FlashBlade is high-bandwidth-optimized, which is really designed for workloads like Splunk, where you have to do a sparse search, right, that find-the-needle-in-the-haystack question, right? Were you breached? Where were you breached? How were you breached? Go read as much data as possible. You've got to ingest all that data back to the servers as fast as you can. A bandwidth beast. Yes, that's what you need. And Cloud Block Store, I'm sorry, FlashArray//C, is really optimized as a tier-two form of NAND for that secondary, maybe transactional database or virtual machines, things of that nature. All right, I have one more and then I'm going to shut up. The SignalFx acquisition was very interesting to me for a lot of reasons. One was the cloud, the SaaS portion of it. Splunk was late to that game, but now you're making that transition. You know, you saw Tableau, you saw Adobe rip the Band-Aid off, and it was somewhat painful, but Splunk is doing it. So I wonder, any advice that you as Splunk would have for Vaughn, as Pure makes that transition to the SaaS model? So, definitely, I think it's going to be a challenging one, but I think it's a much-needed one in the environment that we are in. The key thing is to always be customer-focused, and I'm sure that you already are customer-focused, but the key thing is to make sure that the service is up all the time, and make sure that you can provide that uptime, which is going to be crucial to your customers at least. That's good, that's good guidance. Just wanted to cover that for you. Thank you, Dave. So Vaughn, you gave us some of those really impressive stats in terms of performance. They're almost too good to be true. Well, what's the customer feedback? Let's talk about the real world. When you're talking to customers about those numbers, what's the reaction?
So I don't want to speak for Bharath, so I will say, in our engagements within their customer base, what we hear, particularly from customers at scale: the larger the environment, the more aggressively they say they will adopt SmartStore, right? On a more aggressive scale than the smaller environments, and it's because the benefits of operating and maintaining the indexer cluster are so great that they'll actually turn to the storage team and say, this is the new architecture I want, this is the new storage platform I want. And again, when we're talking about patch management, cluster expansion, hardware refresh, I mean, for some large installs you're talking weeks, not two or three, 10 weeks, 12 weeks. And you can reduce that down to a couple of days. It changes your operational paradigm, your staffing. And so it's got high impact. So one of the messages that we are hearing from customers is that with SmartStore they get a significant reduction in infrastructure spend. It almost drops by two-thirds. That's very significant for a lot of our large customers who are spending a ton of money on infrastructure, right? So just dropping that by two-thirds is a significant driver to move to SmartStore. This is in addition to all the other benefits they get with SmartStore, the operational simplicity and the agility that it provides. You also have customers who, because of SmartStore, can now actually burst on demand. And so you can think of this in two paradigms, right? Instead of having to pre-purchase and pre-provision a large infrastructure, to try to avoid some of the operational pain, and hope you fill it up, they can do more of a right-size and grow in increments on demand, whether it's storage or compute. That's something that's net new with SmartStore.
They can also, if they have a significant event occur, fire up additional indexer nodes and search clusters, which can be bare metal, VMs, or containers, right? Try to push the FlashBlade to its max. Once they've found the answers that they need and gotten through whatever the urgent issue is, they can just de-provision those assets on demand and return back down to a steady state. So it's a very flexible, kind of cloud-native, agile platform. Awesome. Well guys, I wish we had more time, but thank you so much, Vaughn and Bharath, for joining Dave and me on theCUBE today and sharing all of the innovation that continues to come from this partnership. All right, great, love to see it. Thank you, appreciate it. For Dave Vellante, I'm Lisa Martin, and you're watching theCUBE.