Hi, I'm Peter Burris and welcome to another CUBE conversation from our wonderful studios in beautiful Palo Alto, California. Today we're going to be talking about the new architectures and new disciplines required to really make possible the opportunities associated with digital business. And to do that, we've got Neil Vachharajani, who is the technical director of Pure Storage. Neil, welcome to theCUBE. Thank you for having me, Peter. So Neil, we have spent a fair amount of time within Wikibon and within theCUBE community talking a lot about what digital business is. So, give me a second, let me run something by you, and tell me if you agree. We think that there is a difference between business and digital business, and specifically we think that difference is that a digital business uses data assets differently than a business does. Walmart beats Sears because it used data differently. Amazon is putting the pressure on Walmart because it uses data differently. So that is at the centerpiece of a lot of these digital transformations. How are you using data to re-institutionalize your work, realign your resources, re-establish a new engagement model with your marketplace? Would you agree with that? Yeah, I'd absolutely agree with that. And I think a lot of it has to do with the volume of data and where the data is coming from. If you look at traditional business, it really was about just putting into computers what we used to do on paper. And digital business today, I think, is about generating huge volumes of data by really looking at every interaction we have, no matter how small or how big. So putting telemetry on as many things as possible: IoT for machines, mobile for human beings. But it used to be, as you said, a known-process, unknown-technology world for a long time. And now these are data-driven processes.
We're actually using data to describe what that next best action should be, what the recommendation should be. So as we think about this, business has been around for a long time. There's this notion of evidence-based management, which is the idea that we use data differently from the boardroom all the way down to the drivers. How does a business start to bring forward the discipline required to really make possible this data-driven world? Well, I think the first thing is to really recognize why this new paradigm shift changes things. And I think in the old world, if you looked at a piece of data, you actually could articulate, all the way from the boardroom down to the stockroom, every use of the data. And that meant that you could build a lot of siloed applications, and that wasn't a big deal. You got your money's worth out of the data. So for example, recording transactions in store number 17. That's right. But in the new world, you actually don't know what the value of the data is ahead of time. In some sense, you're trying to capture a lot of data and then use technology to correlate it with things, mix and match, mash it up, and then drive business decisions that, a few weeks ago, you didn't even know you would be making. And that means that you can't really lock up your data. You can't constrain it, because that's going to limit your possibilities and it's going to limit your ROI on that data. Yeah, we like to say that data as an asset is different from all other assets because it is inherently shareable and reusable. It doesn't follow the laws of scarcity. And so in many respects, what the IT organization has had to do is find new ways to privatize that data through things like security. But as you're saying, they don't want to introduce technologies that artificially constrain derivative and future uses of that data. And I think that's where the really big architectural shift is happening in the data center.
Because if you look traditionally, we have siloed the data. It wasn't an intentional thing, that we wanted to put it into a silo, but that's how we packaged our applications and that's how we deployed our applications. And now we need a new discipline inside the data center that makes the data available, lets people put policies on it, like security policies, but then also makes it available for the innovators all throughout the company to get access to that data. And we're trying to crystallize this whole philosophy into something we refer to as the data-centric architecture, where data is at the center, people have access to the data, and then there are just applications all around it that are all hitting this common pool of data and doing different things, driving new business processes. Now, you're talking not about a physical pool of data, but rather a logical pool of data. Data's still going to be very distributed, right? Well, you know, data gets generated in a distributed way. Data's very large. I think it would be a bit naive to think you could point to one rack in one data center and say all your data's going to be right here in this one rack. Or in one cloud. Or in one cloud, for that matter. But just from a philosophical perspective, you do want to pull your data out of anything that is, like you said a minute ago, constraining it. So I think one really good example of this is when we went, quote unquote, web scale, we saw a lot of applications move onto direct-attached storage, to dive deep into a technology. And that was great if you wanted to only come in the front door and access the data through the application that was managing that DAS. But if you wanted to do anything else, you were kind of stuck. So, to summarize this point, we're moving from a world in which data is a place to one in which data is a service. Have I got that right? That's absolutely right.
I mean, the way I like to think about it is that data and storage need to really be different things. And storage's job is to give you access to the data. Storage in its own right, you know, doesn't solve a business problem. It's the data that solves a business problem; storage is the vehicle that gets you there. And so I think it's pretty exciting that there are new technologies that are coming out, or that, you know, honestly are here, that are enabling that, things like flash and NVMe and its future. Well, let's talk about that, because the observation that I've made to clients for quite some time is that if you go back, disk was a great technology for persisting data. So again, store number 17: a transaction happened at a certain time, it's already occurred, and we have to record it. So we record it, we persist it on disk. Now what we're trying to do is utilize technologies that are inherently structured to deliver data, so that we can have the data be very distributed but still look at it from a logical standpoint and have that data be delivered to a lot of applications, whether that is local or, as long as we don't undermine basic physics, perhaps further away. But even more importantly, deliver it to different roles: the same data being delivered to developers, the same data being delivered to a new application. What are some of those core technologies that are going to be necessary to do this? You mentioned NVMe, let's start there. Yeah, if I can just back up a little bit, right? In some sense, even for that data-recording workflow that you talked about, we made disk work, but it was actually a pretty challenging medium, and so we put a lot of optimizations and things in place, because we said we know the usage pattern, and if we know the usage pattern, we know how to organize our data. And so step one, the transformation that I think is in pretty full swing these days, was moving from disk to flash.
And that was a huge transformation because it meant that random access to the data was just as performant as this carefully crafted sequential access. That meant you could start accepting unknown workloads into your application, but you were still stuck behind this very serial, very antiquated SCSI protocol. NVMe is now bringing a lot more parallelism into play, and that's going to help us drive things like just simple, plain old data center stuff: density and performance density and power and that kind of thing. So that's sort of step one in terms of the technology, that you can package all of this stuff in a pretty dense package and put in petabytes of storage with enough IO to actually access that data. That's the key: you can have petabytes, but if you only have one IO for each gig, you're not going to get a lot out of that data. So just to stop right there, that leads to a world in which, as long as you're disciplined in your architecture, you do not have to know what workloads are going to access that data, in your terms. Well, you know, that's only step one, right? Because the other challenge is that very few people access storage directly, right? We hide this behind databases and we hide this behind a whole bunch of other technologies. Now, those technologies might have their own limitations in place, but we have a lot of really rich things we can do at the storage level to present the same data out of multiple front ends. And so the simplest idea is that we don't have one copy of a database. We often will have the transactional database that's recording those transactions, but then we'll have an analytics copy of the database, and then we need to keep the two of those things in sync.
And this is where the discipline and the architecture really come into play. We kind of have a lot of that figured out for things like relational databases and the best practices there, but in the meantime, the world also moved over to the new world of NoSQL databases, queues, Kafka, things of that nature, and those brought direct-attached storage as the best practice. And so I think where the discipline comes in, and where some of the new technologies that we're talking about right now come in, is how do you bring those old disciplines that we figured out in, let's say, the relational world to bear on the new technologies that are meeting the scale requirements that we have today? Well, one of the more important workloads that is going to require scale is, for example, AI. So how are we going to organize some of these technologies, add them to these new disciplines, to be able to make some of these AI workloads run really, really fast? You know, I think a lot of this really comes down to pulling the storage out and putting it into its own tier. And so Pure Storage has an offering called AIRI, which is packaging NVIDIA DGX boxes with FlashBlades, and we say, hey, you don't need a whole bunch of direct-attached storage, which is siloing your data; you can go put it into this common shared pool. And I think that the other side of the house, our FlashArray business, is doing something really similar with NVMe. FlashArray//X is essentially commoditizing NVMe; it's saying everybody has access to this high performance density. And looking into the future, with technologies like NVMe over Fabrics, what we're really saying is that for your apps that used to use direct-attached storage, there's no reason why they can't go to a SAN-based architecture that offers rich data services and not compromise one iota on latency. Or access, or any number of other activities as well.
So we've got NVMe, NVMe over Fabrics, flash, and new approaches for thinking about packaging some of these things. Are there any other technologies that you envision on the horizon that are going to be really important to customers and that Pure is going to take advantage of? Yeah, you know, I really think that the other thing is, once you collect all this stuff, you need a way to tame the beast. You need a way to deploy your applications, you need a way to catalog everything, and honestly, things like Kubernetes and container orchestration are becoming this platform where you deploy all of this stuff. And some of the assumptions that are baked into that really go back and tie in nicely with those other technologies. In particular, they assume that I can schedule this compute wherever I want and I have access to the data. So in that way, having a fabric, if you will, between your compute and your data is essential, and it's just another reason why siloing things off into particular units of compute is really the architecture of the past. The architecture going forward is going to be to logically centralize and maybe put some smarts at that other layer, saying, hey, if this data is in the public cloud, let me schedule the compute up there, but if this data is in my data center, let me schedule the compute down there, but then not having to worry about the micro decisions about whether it has to be in this rack or on this particular physical node. All your data is accessible. But increasingly we're going to do things that move the compute both physically as well as logically closer to the data. You're 100% right, but the question is at what scale. You really want to get the data center right: your compute should be running in the correct data center. Or the center of data, right? Or the center of data, right. You know, get it in the right spot, but then you don't want to have to worry about all the other micro constraints.
You know, if you look on the networking side of the world, leaf-spine networks are all about saying, hey look, there really is a uniform fabric for networking. We're trying to do the same thing in storage and just say, look, the storage is so performant that there's no reason to silo. You can run your compute wherever you want. If you've got a good networking fabric and you've got a good storage fabric, at the end of the day all your data is accessible to whatever new application you envision, and there's just no reason why you have to lock it up. You mentioned security before. You know, you should absolutely be able to orchestrate things like taking a snapshot of your data and putting it through masking or whatever anonymization you need to make it safely accessible to new applications and innovators inside of your company to drive that digital business. Yeah, so we like to talk about moving from a world that is focused on infrastructure, taking cost out, making it static by removing all uncertainty, to a world where we have known workloads and elastic capacity or elastic scale, to a plastic world, where plastic, used in the physics sense, means unknown workload, unknown scale, and just making sure that we have the option to use data any way we want, as much as possible, in the future. And I think that's why you see the rise of service catalogs and self-service coming up in IT. It's that plasticity: you have the brightest minds in your company trying to figure out what to do, and you don't want to have infrastructure be this bottleneck that's causing everything to go slower or causing people to say no. You just always want to say yes. And that's why I think it's really exciting to see these technologies come out: NVMe, to say we've now got the performance to say yes.
NVMe over Fabrics, to say there's no compromise on latency. And then, honestly, having this stuff packaged in things like FlashArray//X, where the CIO or the CFO doesn't complain about breaking the bank as well, because now these technologies are the status quo. They're the standard, there's no premium for them, and if anyone's trying to charge you that premium, you should really ask them why. This is the new architecture; this should be the only thing you offer, in some sense. Yeah, we're bringing all these new technologies into the economic envelope that IT has to be in for business today. That's right, and you know, you look at something like flash memory: it's not a new technology. I remember in college having a flash card to put into a digital camera in the early days of digital cameras, but for it to make it into the data center, the thing that was critical was that economic aspect of it. So it's not just about being on the bleeding edge of technology, but it's packaging that in a way that's actually palatable for the entire C-suite to consume inside your organization. And I remember my disk pack that I carried around in college from the PDP system that we had to use. All right, Neil Vachharajani, technical director of Pure Storage, talking about the relationship between new technologies, data-centric architectures and digital business. Thanks very much for being on theCUBE. Thanks so much, Peter. And once again, I'm Peter Burris, and you've been participating in another CUBE conversation. Until we talk again.