Hi, I'm Peter Burris and welcome once again to another CUBE Conversation from our studios here in beautiful Palo Alto, California. Today we've got a really special guest. We're going to be talking about AI and some of the new technologies that are making it even more valuable to business. And we're speaking with Roy Kim, who's the lead for AI solutions at Pure Storage. Roy, welcome to theCUBE. Thank you for having me, very excited. Well, so let's start by just, how does one get to be a lead for AI solutions? Tell us a little bit about that. Yeah, first of all, there aren't that many AI anythings in the world today, but I did spend eight years at NVIDIA helping build out their AI practice. And I'm fairly new to storage. I'm about 11 months into Pure Storage. So that's how you get into it. You cut your teeth on real stuff, and you started at NVIDIA. Well, let's talk about some real stuff. I have a thesis, I want to throw it by you and see what you think about it. The thesis that I have, Wikibon has been at the vanguard of talking about the role that Flash, that memory and Flash storage systems, are going to play in changes in the technology industry. We were one of the first to really talk about it. And we believe, I believe very strongly, that if you take a look at all the changes that are happening today with AI and the commercialization of AI, and even big data and some other things that are happening, a lot of that can be traced back directly to the transition from disk, which had very, very long lag times, millisecond-speed lag times, to Flash, which is microsecond speed. And when you go to microseconds, you can just do so much more with data. And it just seems as though that transition from disk to Flash has kind of catalyzed a lot of this change. Would you agree with that? Yeah, that transition from disk to Flash was the fundamental transition within the storage industry. So the fundamental thing is that data is now fueling this whole AI revolution.
And I would argue that with the big data revolution, with Hadoop, Spark and all that, the essence underneath it is that you use data to get insight. And so disks were really fundamentally designed to store data, not to deliver data. If you think about it, the way a disk is designed is really just to store as much data as possible. Flash is the other way around. It's to deliver data as fast as possible. And so that transition is fundamentally the reason why this is happening today. Oh, it's good to be right. You are definitely right. So the second observation I would make is that we're seeing, and it makes perfect sense, a move, or a turn, to start moving more processing closer to the data, especially, as you said, on Flash systems that are capable of delivering data so much faster. Is that also starting to happen in your experience? That's right. So there's this idea that you take a lot of this data and move it to compute as fast as possible, or move the compute even closer to the data. And the reason for that, and AI really exposes this as much as possible, is that AI is this idea that you have these really powerful processors that need as much data as quickly as possible to turn that around into neural networks that give you insight. And that actually leads to what I'll be talking about, but the thing that we built, this thing called AIRI, is the idea that you pull compute and storage and networking into this compact design so that there is no bottleneck, the data lives close to compute, and it delivers the fastest performance for your neural network training. So let's talk about that a little bit. If we combine your background at NVIDIA, the fact that you're currently at Pure, the role that Flash plays in delivering data faster, the need for that faster delivery in AI applications, and now the possibility of moving GPUs and related types of technologies even closer to the data, you guys have created a partnership with NVIDIA. What exactly?
Tell us a little bit more about AIRI. Right, so this week we announced AIRI. AIRI is the industry's first complete AI platform for enterprises. AI-Ready Infrastructure for enterprises, that's where the name AIRI comes from. And it really brought NVIDIA and Pure together, because we saw a lot of these trends within customers that are really cutting their teeth building AI infrastructure, and it was hard. There are a lot of intricate details that go into building AI infrastructure. We have lots of mutual customers with NVIDIA, and what we found is that there are some best practices that we can pull into a single solution, both hardware and software, so that the rest of the enterprises can just get up and running quickly. And that is represented in AIRI. Well, we know it's hard, because if it was easy, it would have been done a long time ago. So tell us a little bit, specifically, about the types of technologies that are embedded within AIRI. How does it work? Right, so if you think about what's required to build a deep learning and AI practice, you start from data scientists, and then you go into frameworks like TensorFlow and PyTorch, you may have heard of them. Then you go into the tools, and then GPUs, InfiniBand, which is typically the networking of choice, and then Flash, right? So these are all the components, all of these parts, that you have to get access to. That's right, that's right. And so enterprises today have to build all of this together by hand to get their data scientists ready for AI. What AIRI represents is everything but the data scientists. So starting from the tools like TensorFlow, all the way down to Flash, all built and tuned into a single solution, so that all an enterprise needs to do is give it to a data scientist and they're up and running.
So we've done a fair amount of research on this at Wikibon, and we discovered that one of the reasons why big data and AI-related projects have not been as successful as they might have been is precisely because so much time was spent trying to understand the underlying technologies and the infrastructure required to process the data. And even though it was often easy to procure the stuff, it took a long time to integrate, a long time to test, a long time to master before anyone could bring an application orientation to bear on the problems. So what you're saying is you're slicing all that off so that folks that are trying to do artificial intelligence-related workloads can have a much better time to value. Did I get that right? That's right. So think about just what's within that stack, everything that I just talked about. InfiniBand, right? Enterprises are like, what is InfiniBand? GPUs. A lot of people know what a GPU is, but enterprises will say they've never deployed GPUs. Think about TensorFlow or PyTorch, these are tools that are necessary to data scientists, but enterprises are like, oh my goodness, what is that? So all of this is really foreign to the enterprises, and they're spending months and months trying to figure out what it is, how to deploy it, how to design it. And how to make it work together. How to make it work together. And so what NVIDIA and Pure decided to do is take all the learnings that we had from these pioneers, these trailblazers within the enterprise industry, and bring all those best practices into a single solution, so that enterprises don't have to worry about InfiniBand or Ethernet or GPUs or scale-out Flash or TensorFlow. It just works. So it sounds like it's a solution that's specifically designed and delivered to increase the productivity of data scientists as they try to do data science. That's right. What about some of those impacts? What kinds of early insights about more productivity with data science are you starting to see as a consequence of this approach?
Yeah, you'll be surprised that for most data scientists doing AI today, when they kick off a job, it takes a month to finish. So think about that. If I'm a data scientist, I come in on a Monday in early February, I kick off a job, I go on vacation for four weeks, I come back and it's still running. What do you mean by kicking off a job? It means I start this workload that trains neural nets. It requires GPUs to start computing, TensorFlow to work, and the data to get consumed. You're talking about it taking weeks to run a job that does relatively simple things in a data science sense, like train a model. Train a model, it takes a month. And so the scary thing about that is you really have 12 tries a year to get it right. Just imagine that. And that's not something that we want enterprises to suffer through. So what AIRI does is cut what used to take a month down to a week. Now, that's amazing, if you think about it. Where they used to have only 12 tries in a year, now they have 48 tries in a year. Transformative, right? The way that works is, in AIRI, if you look at it, there are actually four servers with FlashBlade. We figured out a way to have that job run across all four servers to give you 4x the throughput. You'd think that's easy to do, but it actually is not. And how are we doing it? We parallelized it. And that is not necessarily easy to do. These are often not particularly simple jobs. And that's why no one else is doing it today. But if you think about it, going back to your point, it's like the individual who takes performance-enhancing drugs so they can get one more workout than the competition. And that lets them hit another 10, 15 home runs, which leads to millions of extra dollars. You're kind of seeing something similar. You used to be able to get only 12 workouts a year. Now you can do 48 workouts. Which business is going to be stronger and more successful as a result? That's a great analogy.
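The parallelization described here can be sketched in miniature. The toy Python example below uses hypothetical names and numbers (it is not Pure or NVIDIA code) to show the data-parallel pattern AIRI applies at scale: replicate the model on several workers, give each worker a shard of the batch, and average the gradients — the all-reduce step that InfiniBand accelerates in the real system.

```python
def local_gradient(weight, shard):
    """Gradient of mean squared error for a one-parameter model y = w*x."""
    return sum(2 * (weight * x - y) * x for x, y in shard) / len(shard)

def data_parallel_step(weight, batch, num_workers=4, lr=0.01):
    """One synchronous data-parallel update across num_workers workers."""
    shard_size = len(batch) // num_workers
    shards = [batch[i * shard_size:(i + 1) * shard_size]
              for i in range(num_workers)]
    grads = [local_gradient(weight, s) for s in shards]  # runs on each server
    avg_grad = sum(grads) / num_workers                  # the all-reduce step
    return weight - lr * avg_grad

# Data generated by y = 3*x; four workers converge toward w = 3.
batch = [(x, 3 * x) for x in range(1, 9)]
w = 0.0
for _ in range(200):
    w = data_parallel_step(w, batch)
print(round(w, 2))  # → 3.0
```

Real frameworks hide this loop behind their distributed-training machinery, but the arithmetic is the same: four workers processing disjoint shards give roughly 4x the throughput, which is the month-down-to-a-week reduction described above.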
Another way to look at it is that a typical data scientist probably makes about half a million dollars a year. What if you get 4x the productivity out of that person? Then you get a return of $2 million out of that $500,000 investment that you make. That's another way of seeing performance enhancement for that data scientist. But I honestly think it's even more than that, because there's a lot of other support staff that are today doing a lot of the data science grunt work, let's call it. Lining up the pipelines, building the testing pipelines, making sure that they run, testing sources, testing sinks. And this is reducing the need for those infrastructure types of tasks. So you're getting more productivity out of the data scientist, but you're also getting more productivity out of all the people who, heretofore, you were spending on doing this type of stuff. What were they doing? Just taking care of the infrastructure. Is that right? That's exactly right. We have a customer in the UK, one of the world's largest hedge fund companies, publicly traded. And not every customer is at this point yet, but they're actually doing AI with FlashBlade from Pure today. And what they said is that with FlashBlade, they had two engineers who were full-time taking care of infrastructure. Now those engineers are doing data science, right? To your point, they don't have to worry about infrastructure anymore because of the simplicity of what we bring from Pure. And so now they're working on models that help the firm make more money. So on top of the half million dollars a year that you were spending on each data scientist and a couple of administrators, where you were getting a two million dollar return, you can now take those administrators and have them start doing more data science without necessarily paying them more. It's a little secret.
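The return-on-investment argument in this exchange can be written out as simple arithmetic. The figures below are the speaker's own illustrative numbers, not audited data:

```python
# Back-of-the-envelope ROI sketch using the conversation's own figures.
cost_per_scientist = 500_000     # ~half a million dollars a year, per the speaker
productivity_multiplier = 4      # 12 training runs a year becomes 48

# The "$2 million return out of that $500,000 investment" claim:
return_per_scientist = cost_per_scientist * productivity_multiplier
print(return_per_scientist)      # → 2000000

# The UK hedge fund example: two infrastructure engineers freed up to do
# data science add their output without adding headcount cost.
freed_staff = 2
total_return = return_per_scientist * (1 + freed_staff)
print(total_return)              # → 6000000
```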
But you're now getting four, five, six million dollars in return as a consequence of the system. That's right. So as you think about where AIRI is now, and you think about where it's going to go, give us a sense of how this presages new approaches to thinking about problem solving as it relates to AI and other types of things. Well, one of the beauties of AI is that it's always evolving. What used to be what they call CNNs as the most popular model is now GANs. CNN stands for? Convolutional neural nets, typically used for image processing. Now people are using things like generative adversarial networks, which is putting two networks against each other to improve one another. So each network makes the other more productive. Right. And that happened in the matter of a couple of years. So AI is always changing, always evolving, always getting better. And it really gives us an opportunity to think about how AIRI evolves to keep up and bring the best state-of-the-art technology to the data scientists. So there are actually boundless opportunities. Well, and even if you talk about GANs, or generative adversarial networks, the basic algorithms have been in place for 15, 20, maybe even longer, 30 years. But the technology wouldn't allow them to work. And so really what we're talking about is a combination of a deep understanding of how some of these algorithms work, which has been around for a long time, and the practical ability to get business value out of them. And that's kind of why this is such an exploding thing, because there's been so much knowledge about what this stuff could do, and now we can actually apply it to some of these complex business problems. That's exactly right. You know, I tell people that the promise of big data has been around for a long time. People have talked about big data for 10, 20 years. AI is really the first killer application of big data.
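The adversarial idea Roy describes can be illustrated with a deliberately tiny sketch. Assuming only the Python standard library, the example below pits a one-parameter "generator" against a logistic "discriminator" to match the mean of some real data; it shows the alternating min-max loop that defines a GAN, not a production implementation.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

real_mean = 3.0      # the "real" data is drawn from N(3, 1)
theta = 0.0          # generator parameter: the single value it outputs
w, b = 0.5, 0.0      # discriminator D(x) = sigmoid(w*x + b), "probability x is real"
lr = 0.05

for step in range(2000):
    x_real = random.gauss(real_mean, 1.0)
    x_fake = theta
    # Discriminator ascent: push D(real) toward 1 and D(fake) toward 0.
    d_real, d_fake = sigmoid(w * x_real + b), sigmoid(w * x_fake + b)
    w += lr * ((1 - d_real) * x_real - d_fake * x_fake)
    b += lr * ((1 - d_real) - d_fake)
    # Generator ascent on log D(fake): move theta so the discriminator
    # scores its output as real.
    d_fake = sigmoid(w * theta + b)
    theta += lr * (1 - d_fake) * w

print(theta)  # drifts into the neighborhood of real_mean as the two models spar
```

Each side's improvement forces the other to improve, which is why the technique, though built on long-standing neural network ideas, only became practical once the hardware could keep both networks fed with data.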
Hadoop's been around for a long time, but we know that people have struggled with Hadoop. Spark has been great. But what AI does is really tap into the big data platform and translate it into insight, whatever the data is. Video, text, all kinds of data you can use AI on. And so that really is the reason why there's a lot of excitement around AI. It really is the first killer application for big data. Well, I would say it's even more than that. It's an application, but we also think there's a bifurcation. We think that we're seeing an increased convergence inside the infrastructure, which is offering up greater specialization in AI. So AI is an application, but it, combined with the tooling, especially for data scientists, will also be the new platform by which you build these new classes of applications. So you won't even know you're using AI. You'll just build an application that has those capabilities, right? Right, that's right. I mean, I think it's as technical as that, or as simple as when you use your iPhone and you're talking to Siri, you don't know that you're talking to AI. It's just part of your daily life. Or having it recognize your face. That is AI processing. The algorithms have been in place for a long time, but it was only recently that we had the hardware that was capable of doing it. And Pure Storage is now bringing a lot of that to the enterprise through this relationship with NVIDIA. That's right. So AIRI does represent all the best of AI infrastructure from all of our customers. We pulled it into AIRI, and we're both really excited to give it to all of our customers. So I guess it's a good time to be the lead for AI solutions at Pure Storage, huh? That's right. There's a ton of work, but a lot of excitement. This is really the first time a storage company was spotlighted and went on the grand stage of AI, right?
There's always been NVIDIA, there's always been Google, Facebook and the hyperscalers, but when was the last time a storage company was highlighted on the grand stage of AI? I don't think it'll be the last time now. And it goes to your point that this transition from disk to Flash is the big transition in the industry, and as fate would have it, Pure Storage has the best Flash-based solution for deep learning. So I've got one more question for you. We've got a number of people that are watching the video, watching us talk. A lot of them are interested in AI, trying to do AI, and you've got a fair amount of experience. What are the most interesting problems that you think we should be focusing on with AI? Wow, that's a good one. Well, there are so many. Other than using storage better. Yeah, I think there are so many applications. Just think about customer experience. One of the most frustrating things for a lot of people is when they dial in and have to go through five different prompts to get to the right person. That area alone could use a lot of intelligence in the system, right? By the time they actually speak to a real live person, they're just frustrated, and the customer experience is poor. So that's one area where I know there's a lot of research into how AI can enhance the experience. And in fact, one of our customers is Global Response, a call center services company as well as an offshore services company, and they're doing exactly that. They're using AI to understand the sentiment of the caller and give a better experience. All of that's predicated on the ability to do the data delivery. So I'd like to see AI be used to sell AI. All right, so Roy Kim, the lead of AI solutions at Pure Storage. Roy, thank you very much for being on theCUBE and talking to us about AIRI and the evolving relationship between hardware, specifically storage, and new classes of business solutions powered by AI. Well, thank you for inviting me.
And again, I'm Peter Burris, and once again, you've been watching theCUBE. Talk to you soon.