 Live, from Las Vegas, it's theCUBE. Covering IBM Think 2018, brought to you by IBM. Welcome back to IBM Think 2018. This is theCUBE, the leader in live tech coverage. My name is Dave Vellante, and I'm here with Peter Burris. We're doing wall-to-wall coverage; this is day two. Everything AI, blockchain, cognitive, quantum computing, smart ledger, storage, data. Bina Hallmann is here. She's the Vice President of Offering Management for Storage and Software Defined. Welcome back to theCUBE, Bina. Thanks for having me back. Steven Eliuk is here. He's the Vice President of Deep Learning in the Global Chief Data Office at IBM. Welcome to theCUBE, Steve. Thanks, you guys, for coming on. Pleasure to be here. That was a great introduction. Thank you, appreciate that. Yeah, so this has been quite an event, consolidating all your events, bringing your customers together. 30,000, 40,000, too many people to count. Very large event. Standing room only in all the sessions. It's been unbelievable. Your thoughts? It's been fantastic. Lots of participation, lots of sessions. We brought, as you said, all of our conferences together, and it's a great event. So, Steve, tell us more about your role. We were talking off camera. We've had Inderpal Bhandari, Chief Data Officer at IBM, on here before. You're in that office, but you've got other roles around deep learning, so explain that, sort of a multi-tool star here. For sure. So, roles and responsibilities at IBM in the Chief Data Office, kind of two pillars. We focus in the Deep Learning Group on foundation platform components, so how to accelerate the infrastructure and platform behind the scenes, to accelerate the path from ideation to product. We want the data scientists to be very effective and for us to churn projects very, very quickly. That said, I mentioned projects. So, on the applied side, we have a number of internal use cases across IBM. And it's not just a handful. It's on the order of hundreds. 
And those applied use cases are part of the cognitive plan, per se. And each one of those is part of the transformation of IBM into a cognitive enterprise. Okay, now we were talking to Ed Walsh this morning, Bina, about how you collaborate with colleagues in the storage business. You know, you guys have been growing, four straight quarters. And that doesn't even count some of the stuff that you guys ship on the cloud in storage. So, talk about the collaboration across the company. Yeah, yeah, we've had some tremendous collaboration, you know, the broader IBM and bringing all of that together. And that's one of the things that we're talking about here today with Steve and team: as they built out their cognitive architecture, being able to then leverage some of our capabilities and the strengths that we bring to the table as part of that overall architecture. And it's been a great story. Yeah? What would you add to that, Steve? Yeah, absolutely, you know, refreshing. You know, I've built up supercomputers in the past, specifically for deep learning, and coming on board at IBM about a year ago, you know, seeing the Elastic Storage Solution, or server. Elastic Storage Server, yep. It handles a number of, you know, different aspects of my pipeline, very uniquely. So for starters, I don't want to worry about, you know, rolling out new infrastructure all the time. You know, I want to be able to grow my teams and grow my projects. And that's what's nice about ESS: it's extensible. I'm able to, you know, roll out more projects, more people, more multi-tenancy, et cetera. And it supports us effectively. Especially, you know, it has very unique attributes, like the read-only performance, speed, and random access of data. It's very unique to the offering. So, Steve, you're a customer of Bina's, right? Okay, so what do you need from infrastructure for deep learning, AI? You mentioned some attributes before, but take us through them. 
Well, the reality is there's many different aspects, and if anything kind of breaks down, then the data science experience breaks down. So we want to make sure that everything from the interconnect of the pipelines is effective. You know, you heard Jensen earlier today from NVIDIA. We've got to make sure that we have compute devices that, you know, are effective for the computation that we're rolling out on them. But that said, if those GPUs are starved of data, if we don't have the data available, which we're drawing from the ESS, then, you know, we're not making effective use of those GPUs. It means we have to roll out more of them, et cetera, et cetera. And more importantly, the time for experimentation is elongated. So that whole idea-to-product timeline that I talked about is elongated if anything breaks down. So, we've got to make sure that the storage doesn't break down, and that's why this is awesome for us. So, let me, especially from a deep learning standpoint, throw out a little bit of history, and let me hear your thoughts. Years ago, the data was put as close to the application as possible. About 10, 15 years ago, we started separating the data from the application, the storage from the application. And now we're moving the algorithm down as close to the data as possible. Yeah. At what point do we stop calling it storage and start acknowledging that we're talking about a fabric that's actually quite different? We put a lot more processing power as close to the data as possible. We're not just storing; we're really doing truly deeply distributed computing. What do you think? There's a number of different areas where that's coming from, everything from switches to storage to memory that's doing compute very close to where the data actually rests. Still, I think you can look all the way back to the Google File System. 
Moving computation to where the data is, as close as possible, so you don't have to transfer that data. I think that as time goes on, we're going to get closer and closer to that. But still, we're limited by the capacity of very fast storage. NVMe, very interesting technology, still limited. How much memory do we have on the GPUs? 16 gigs; 24, interesting; 48, interesting. The models that I want to train are in the hundreds of gigabytes. But you can still parallelize it. You can parallelize it, but there's not really anything that's true model parallelism out there right now. There's some hacks and things that people are doing, but yeah, so I think we're getting there. It's still some time out, but moving it closer and closer means we don't have to spend the power, the latency, et cetera, to move the data. So does that mean that the rate of increase of data and the size of the objects that we're going to be looking at is still going to exceed the rate of our ability to bring algorithms and storage, or algorithms and data, together? What do you think? I think it's getting closer, but I can always just look at a bigger problem. I'm dealing with 30 terabytes of data for one of the problems that I'm solving. I would like to be using 60 terabytes of data, if I could do it in the same amount of time without having to transfer it. But that said, if you gave me 60, I'd say I really want 120, so. You're one of those kind of guys. I'm definitely one of those guys. I'm curious what it would look like, because what I see right now is it'd be advantageous, and I would like to do it. But I ran 40,000 experiments with 30 terabytes of data. It would be four times the amount of transfer if I had to run that many experiments with 120, right? So. Bina, what do you think? What is the fundamental, especially from a software defined side, what does the fundamental value proposition of storage become as we start pushing more of the intelligence close to the data? 
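Steve's point about GPUs being starved of data comes down to overlapping storage reads with compute: batches are fetched in the background while the accelerator works on the previous one. A minimal sketch of that idea in Python, where `read_batch(i)` and `train_step(batch)` are hypothetical stand-ins, not real APIs from ESS or any framework:

```python
# Sketch: keep a GPU busy by prefetching batches from storage in the
# background, so reads overlap with compute instead of serializing.
import queue
import threading

def prefetching_batches(read_batch, num_batches, depth=4):
    """Yield batches while a background thread reads up to `depth` ahead."""
    q = queue.Queue(maxsize=depth)
    done = object()  # sentinel marking the end of the stream

    def reader():
        for i in range(num_batches):
            q.put(read_batch(i))  # blocks once `depth` batches are queued
        q.put(done)

    threading.Thread(target=reader, daemon=True).start()
    while (batch := q.get()) is not done:
        yield batch

# Usage: the training loop consumes batches that were read ahead of time.
# for batch in prefetching_batches(read_batch, num_batches=1000):
#     train_step(batch)
```

The bounded queue is the point: if storage falls behind, the consumer stalls (the "starved GPU" case Steve describes); if storage keeps up, the read latency is hidden entirely.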
Yeah, you know, the storage layer, fundamentally, software defined, you still need that set of protocols and the file system, the NFS, right? And so some of that still becomes relevant, even as you kind of separate some of the physical storage or flash from the actual compute. So I think there's still a relevance when you talk about software defined storage there. Yeah, yeah. So you don't expect that there's going to be any particular architectural change? NVMe is going to have a real impact. NVMe will have a real impact, and there will be this notion of composable systems, and we will see some level of advancement there, of course. That's around the corner, actually, right? So I do see it progressing from that perspective. So what's underneath it all? What actual products? Yeah, let me share a little bit about the product. So what Steve and team are using is our Elastic Storage Server. I talked about software defined storage; as you know, we have a very complete set of software defined storage offerings, and within that, our strategy has always been to allow the clients to consume the capabilities the way they want: software only, on their own hardware, or as a service, or as an integrated solution. And so what Steve and team are using is an integrated solution with our Spectrum Scale software, along with our flash and Power9 servers, power systems, right? And on the software side, Spectrum Scale, you know, this is a very rich offering that we've had in our portfolio, a highly scalable file system. It's one of the solutions that powers a lot of our, you know, supercomputers, a project that we're still in the process of delivering on around CORAL for the national labs, right? So it's the same file system, combined with a set of servers and flash systems, right? Highly scalable, erasure coding, high availability, right? As well as throughput, right? 40 gigabytes per second. 
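The erasure coding Bina mentions trades a little extra capacity for the ability to rebuild lost data from what survives. The simplest instance is single-parity XOR, sketched below; production systems such as ESS use stronger codes that tolerate multiple simultaneous failures, so this is only an illustration of the principle:

```python
# Sketch: single-parity erasure coding. XOR all equal-sized data blocks
# into one parity block; any single lost block can then be rebuilt by
# XOR-ing the surviving blocks with the parity.
def make_parity(blocks):
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

def rebuild(surviving_blocks, parity):
    """Recover the one missing block from the survivors plus the parity."""
    # XOR is its own inverse, so folding the parity back in cancels
    # every surviving block and leaves only the missing one.
    return make_parity(list(surviving_blocks) + [parity])

blocks = [b"data-one", b"data-two", b"data-six"]
parity = make_parity(blocks)
# Lose blocks[1]; rebuild it from the other two blocks and the parity.
assert rebuild([blocks[0], blocks[2]], parity) == b"data-two"
```

One parity block protects against one lost block per stripe; tolerating more failures means more parity (or a Reed-Solomon-style code), which is the availability-versus-capacity trade these systems tune.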
So that's the solution, that's the storage and system underneath what Steve and team are leveraging. Steve, you talked about wanting more. What else is on Bina's to-do list from your standpoint? Specifically targeted at storage, or? What do you want from the products? Well, I think long-stretch goals are, you know, multi-tenancy and, you know, the wide array of dimensions that, especially in the Chief Data Office, we're dealing with. Like, we have so many different business units, so many of those enterprise problems, on the order of hundreds. How do you effectively use that storage medium, driving so many different users, right? I think it's still hard. I think we're doing it a hell of a lot better than we ever have, but it's still an open research area. How do you do that? And especially, there's unique attributes to deep learning, like most of the data's read, read-only to a certain degree. You know, when data changes, there's some consistency checks that could be done, but really, for my experiment that's running right now, it doesn't really matter that it's changed, right? So there's a lot of nuances specific to deep learning that I would like exploited if I could, and that's some of the interactions that we're working on to kind of alleviate those pains. So at the CDO conference in Boston last October, Inderpal was there to present this enterprise data architecture, and there were probably about 300 or 400 CDOs, Chief Data Officers, in the room. Could you sort of summarize what that is, and how it relates to what you do on a day-to-day basis and how customers are using it? Yeah, for sure. So the architecture is kind of like the backbone and rules that govern how we work with the data. So the realities are, there's no sort of blueprint out there. What works at Google or works at Microsoft or works at Amazon is very unique to what they're doing. 
Now, IBM has a very unique offering as well. We're a composition of many, many different businesses put together. And now, with the Chief Data Office, it's come to light across many organizations. Like you said, at the conference, 300 to 400 people, the requirements are different across the orgs, right? So bringing the data together is kind of one of the big attributes of it. Decreasing the number of silos, making a monolithic, kind of reliable, accessible entity that various business units can trust, and that's governed behind the scenes to make sure that it's adhering to everyone's policies, whatever their own specific business unit has deemed to be their policy. We have to adhere to that, or the data won't come. And the beauty of the data is, we've moved into this cognitive era. Data is valuable, but only if we can link it, right? If the data is there, but there's no linkages there, what do I do with it? I can't really draw new insights. I can't address all those hundreds of enterprise use cases. I can't build new value in them, because I don't have any more data, right? So it's all about linking the data, and then looking for alternative data sources or additional data sources, bringing that data together, and then looking at the new insights that come from it. So in a nutshell, we're doing that internally in IBM to help our transformation, but at the same time creating a blueprint that we're making accessible to the CDOs around the world and our enterprise customers around the world, so they can follow us on this new adventure. New adventure being two years old, but. Yeah, sure, but it seems like if you're going to apply AI, you've got to have your data house in order to do that. So this sounds like a logical first step, is that right? Absolutely, 100%, and the realities are, there's a lot of people that are kicking the tires and trying to figure out the right way to do that. And it's a big investment. 
An investment, drawing out large sums of money to kind of build this hypothetical better area for data, you need to have a reference design. And once you have that, you can actually approach the C-level suite and say, hey, this is what we've seen, this is the potential. And we have an architecture now, and they've already gone down all the hard paths, so now we don't have to go down as many hard paths, right? So it's incredibly empowering for them to have that reference design and learn from our mistakes. Already proven internally, now bringing it to our enterprise clients. So we heard Ginni this morning talk about incumbent disruptors. So I'm sort of curious as to any learnings that you have there. It's early days, I realize that. But when you think about, you know, the discussion, are banks going to lose control of the payment systems? Are retail stores going to go away? Is owning and driving your own vehicle going to be the exception, you know, not the norm? Et cetera, et cetera, et cetera. Big questions. How far can we take machine intelligence? Have you seen your clients begin to apply this in their businesses, incumbents? We saw three examples today. Good examples, I thought. I don't think it's widespread yet. But what are you guys seeing? What are you learning? And how are you applying that to clients? Yeah, so, I mean, certainly for us, for these new AI workloads, we have a number of clients and a number of different types of solutions, right? Whether it's in genomics, or it's AI deep learning, analyzing financial data, you know, a variety of different types of use cases where we do see clients, you know, leveraging capabilities like Spectrum Scale, ESS, and other flash system solutions to address some of those problems. We're seeing it now. Autonomous driving as well, right, to analyze data. Give us a little roadmap to end this segment. I mean, where do you want to take this initiative? 
What should we be looking for, as sort of observers on the outside looking in? Well, I think, drawing from the endeavors that we have within the CDO, what we want to do is take some of those ideas, look at some of the derivative products that we can take out of there, and how do we kind of move those into products? You know, because we want to make it as simple as possible for the enterprise customer. Because although, you know, we see these big hyperscale companies and all the wonderful things that they're doing, the feedback we've had, which is similar to our own experiences, is that those use cases aren't directly applicable to most of the enterprise customers. Some of them are, right? Some of them, vision and brand targeting and speech recognition and all that type of stuff, are. But at the same time, the majority, in the 90% area, are not. So we have to be able to bring down, sorry, it's just the echo. Yeah, it gets loud in here sometimes. There's a lot going on. Exactly. We have to be able to bring that technology to them in a simpler form, so they can make it more accessible to their internal data scientists and get better outcomes for themselves. And we find that they're on a wide spectrum. Some of them are quite advanced. It doesn't mean just because you have a big name you're quite advanced. Some of the smaller players have smaller names, but are quite advanced, right? So there's a wide array. So we want to make that accessible to these various enterprise customers. So I think that's what you can expect: the reference architecture for the cognitive enterprise data architecture, and you can expect to see some of the products from those internal use cases come out to some of our offerings, like maybe IGC or Information Analyzer, things like that, or maybe Watson Studio. Things like that. You'll see that trickle up. All right, Peter, we'll give you the final word. You guys, the business is good. Four straight quarters of growth. Got some tailwinds. 
Currency is actually a tailwind for a change. Customers seem to be happy here. Final word. Yeah, no, we've got great momentum. And for 2018, we've got a great set of roadmap items and new capabilities coming out. So we feel like we've got a real strong future for IBM Storage here. Great, well, Bina, Steve, thanks for coming on theCUBE. Thank you. Thank you. All right, keep it right there, everybody. We'll be back with our next guest right after this. This is day two of IBM Think 2018. You're watching theCUBE.