Live from San Jose in the heart of Silicon Valley, it's theCUBE, covering DataWorks Summit 2018. Brought to you by Hortonworks.

Welcome back to theCUBE's live coverage of DataWorks here in San Jose, California. I'm your host, Rebecca Knight, along with my co-host, James Kobielus. We have with us Eric Herzog. He is the Chief Marketing Officer and VP of Global Channels at the IBM Storage Division. Thanks so much for coming on theCUBE once again, Eric.

Well, thank you. We always love to be on theCUBE and talk to all theCUBE analysts about various topics: data, storage, multi-cloud, all of these.

And before the cameras were rolling, we were talking about how you might be the biggest CUBE alum, in the sense of having been on theCUBE more times than anyone else.

I know I'm in the top five, but I may be number one. I'd have to check with Dave Vellante and the crew and see.

Exactly, and often wearing a Hawaiian shirt.

Yes, I was on theCUBE last week from Cisco Live. I was not wearing a Hawaiian shirt, and Stu and John gave me a hard time about why I was not wearing a Hawaiian shirt. So I made sure I showed up to the DataWorks show in a Hawaiian shirt.

You're a Californian with a tan, so it fits.

So we were talking a little bit before the cameras were rolling, and you were saying that one of the points central to your professional life is that it's not just about the storage, it's about the data. So riff on that a little bit.

Sure. At IBM we believe everything is data-driven, and in fact we would argue that data is more valuable than oil or diamonds or plutonium or platinum or silver or anything else. It is the most valuable asset, whether you're a global Fortune 500, a mid-sized company, or Herzog's Bar and Grill. Data is what you use with your suppliers, with your customers, with your partners. Literally everything around your company is built around the data.
So it's about managing it most effectively and making sure, A, that it's always performant. Because when it's not performant, people go away. As you probably know, Google did a survey showing that after a second or two, they go off your website; they click somewhere else. So it has to be performant. Obviously, in today's 24/7/365 company, it also needs to be resilient and reliable, and it always needs to be available. Otherwise, if the storage goes down, guess what? Your AI doesn't work, your cloud doesn't work, whatever the workload. If you're more traditional, your Oracle, SQL, SAP, none of those workloads work if you don't have a solid storage foundation underneath your data-driven enterprise.

So with that ethos in mind, talk about the products that you are launching, the newly launched, and also your product roadmap going forward.

Sure. For us, storage is the critical foundation for the data-driven, multi-cloud enterprise. And as I've said before on theCUBE, all of our storage software is now cloudified. So if you need to automatically tier out to IBM Cloud or Amazon or Azure, we will automatically move the data placement around, from on-premises out to a cloud. And for certain customers who may be multi-cloud, in this case using multiple cloud providers, which happens for legal reasons or procurement reasons or geographic reasons at the larger enterprises, we can handle that as well. So that's part of it.

The second thing is, we just announced earlier today an artificial intelligence reference architecture that incorporates a full stack, from the very bottom, both servers and storage, all the way up through the top layer and then the applications on top. We just launched that today.

AI for storage management, or AI for-

Regular AI: artificial intelligence from an application perspective.
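The automatic data placement described here, moving data between on-premises storage and a cloud tier based on how recently it was touched, can be sketched as a simple policy loop. This is an illustration only, not IBM's actual tiering software; the tier labels and the 30-day threshold are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class DataSet:
    name: str
    last_access: datetime
    tier: str = "on_premises"  # hypothetical tier label

def place(datasets, cold_after_days=30, cloud_tier="ibm_cloud"):
    """Move datasets untouched for `cold_after_days` out to a cloud tier.

    A toy policy engine: real tiering software also weighs cost,
    performance, and legal or geographic constraints, as noted in
    the interview.
    """
    cutoff = datetime.now() - timedelta(days=cold_after_days)
    for ds in datasets:
        ds.tier = cloud_tier if ds.last_access < cutoff else "on_premises"
    return datasets

# Example: a hot dataset stays on-prem, a cold one tiers out to cloud.
hot = DataSet("orders", datetime.now())
cold = DataSet("logs_2017", datetime.now() - timedelta(days=90))
place([hot, cold])
```

In practice the policy would run continuously against access metadata rather than a one-shot list, but the placement decision itself is this simple comparison.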
So we announced that reference architecture today. Basically, think of the reference architecture as your recipe, your blueprint, for how to put it all together. Some of the components are from IBM, such as Spectrum Scale and Spectrum Computing from my division, and Power servers from our Power Systems division; some are open source: TensorFlow, Caffe, things like that. Basically it gives you what the stack needs to be and what you need to do for various AI workloads, applications, and use cases.

I believe you have distributed deep learning as an IBM capability that's part of that stack.

That is part of the stack. It's in the middle of the stack.

Is it, correct me if I'm wrong, the containerization of AI functionality for distributed deployment in an orchestrated Kubernetes fabric? Is that correct?

Yeah. When you look at it from an IBM perspective, while we clearly support the virtualized world, the VMwares, the Hyper-Vs, the KVMs and the OVMs, and we will continue to do that, we're also heavily invested in the container environment. So for example, one of our other divisions, the IBM Cloud Private division, has announced a solution that's all about private clouds. You can either get it hosted at IBM or literally buy our stack.

Rob Thomas in fact demoed it this morning here.

Exactly. And you can create your own private cloud initiative. There are companies that, whether for security purposes or legal reasons or other reasons, don't want to use public cloud providers, be it IBM, Amazon, Azure, Google, or any of the big public cloud providers; they want a private cloud. And IBM will either, A, host it, or B, deliver it with IBM Cloud Private. All of that infrastructure is built around a containerized environment. So we support the older world, the virtualized world, and the newer world, the container world. And in fact, our storage allows you to have persistent storage in a container environment, Docker and Kubernetes.
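Persistent storage in a Kubernetes environment, as mentioned here, is typically requested through a PersistentVolumeClaim backed by a storage class. As a rough sketch, the claim manifest can be built like this; the storage class name `ibm-block-storage` is an invented placeholder, not an actual IBM class name.

```python
def pvc_manifest(name, size_gi, storage_class):
    """Build a Kubernetes PersistentVolumeClaim manifest as a dict.

    A container that mounts the resulting volume keeps its data across
    restarts -- the "persistent storage in a container environment"
    described in the interview. The storage class name is a placeholder
    for whatever block-storage provisioner the cluster exposes.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }

claim = pvc_manifest("analytics-data", 100, "ibm-block-storage")
```

Serialized to YAML and applied to a cluster, a claim like this is what lets a stateful analytics or AI workload survive pod restarts.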
And that works on all of our block storage; that's a freebie, by the way, we don't charge for it.

So you've worked in the data storage industry for a long time. Can you talk a little bit about how the marketing message has changed and evolved since you first began in this industry, in terms of what customers want to hear and what assuages their fears?

So nobody cares about speeds and feeds, except me, because I've been doing storage for 32 years. And him, he might care. But look at the decision makers today, the CIOs: in 32 years, including seven startups, IBM, and EMC, I've never, ever, ever met a CIO who used to be a storage guy. Ever. So they don't care. They know they need storage and the other infrastructure, including servers and networking. But think about it: when the app is slow, who do they blame? Usually they blame the storage guy first. Secondarily, they blame the server guy. Thirdly, they blame the networking guy. They never look to see whether their code stack is improperly done. So really what you have to do is talk applications, workloads, and use cases, which is what the AI reference architecture does, and what my team does for non-AI workloads. It's all about, again, data-driven, multi-cloud infrastructure. And they want to know how you're going to make a new workload like AI fast, and how you're going to make their cloud resilient, whether it's private or hybrid. In fact, IBM Storage sells a ton of technology to large public cloud providers that do not have the initials IBM. We sell gobs of storage to other public cloud providers, big, medium, and small. So it's really all about the applications, workloads, and use cases. That's what gets people excited. And you basically need to position it just like I talked about with the AI foundation: storage is the critical foundation. So we happen to be, right now, knocking on wood, let's hope there's no earthquake, since I've lived here my whole life. And I've been in earthquakes.
I was in the '89 quake. Look, I fell down a bunch of stairs in the '89 quake. So in an earthquake, as great as IBM Storage is, or any other storage or servers, it gets crushed. If there's a big one, boom, you're done. So you need to make sure that your infrastructure, really your data, is covered by the right infrastructure, and that it's always resilient, always performant, and always available. That's what IBM Storage is about; that's the message. Not how many gigabytes per second of bandwidth, or what the options are. Not that we can't spew that stuff when we're talking to the right person, but in general, people don't care about it. What they want to know is: oh, that SAP workload took 30 hours and now it takes 30 minutes. We have public references that will say that. Oh, you mean I can use eight to ten times less storage for the same money? Yes, we have public references that will say that. So that's what it's really about. Storage has really moved on from being a speeds-and-feeds, nerd-burger sort of thing. Now all the nerd burgers are doing AI and Caffe and TensorFlow and all of that. They're all hackers, right? It used to be the storage guys who did that, and to a lesser extent the server guys, and definitely the networking guys. That's all shifted to the software side. So you've got to talk their language. What can we do with Hortonworks? And by the way, we were named in Q1 of 2018 as the Hortonworks infrastructure partner of the year. So we work with Hortonworks all the time, at all levels, whether it be with our channel partners or with our direct end users, however the customer wants to consume. We work with Hortonworks very closely, and with other providers as well, in that big data analytics and AI infrastructure world. That's what we do.

So, the containerization side of the IBM AI stack, and then the containerization capabilities in Hortonworks Data Platform 3.0.
Can you give us a sense of how you plan, or whether you plan, at IBM to work with Hortonworks to bring these capabilities, your reference architecture, into more alignment with their environment?

So we haven't made an exact decision on how we're going to do it, but we interface with Hortonworks on a continual basis, and we'll work with them to figure out the right solution, whether that be an integrated solution of some type, something we do as an adjunct to our reference architecture, or some reference architecture that they have. But we always make sure, again, we are their infrastructure partner of the year, named in Q1, and that's because we work very tightly with Hortonworks to make sure that what we do ties out with them and hits the right applications, workloads, and use cases: the big data world, the analytics world, and the AI world. So we're tied together to make sure that we deliver the right solutions to the end user. Because that's what matters most, what gets the end users fired up; not what gets Hortonworks or IBM fired up, but what gets the end users fired up.

So when you're trying to get into the headspace of the CIO and get your message out there, what would you say it is that keeps them up at night? What are their biggest pain points, and how do you come in and solve them?

So I'd say the number one pain point for most CIOs is application delivery, whether that be to the line of business or elsewhere. Put it this way: let's take an old workload, that SAP example. That CIO was under pressure because, in this case, it was a giant retailer shipping stuff every night all over the world. Well, guess what? The green undershirts in the wrong size went to Paducah, Kentucky. And one of the other stores, in Singapore, which needed those green shirts, ended up with shoes.
And the reason is they couldn't run that SAP workload in a couple of hours. Now they run it in 30 minutes; it used to take 30 hours. Since they're shipping every night, you're basically missing a cycle, and you're not delivering the right thing, from a retail infrastructure perspective, to each of their nodes, if you will, their retail locations. So CIOs care about what they need to do to deliver the right applications, workloads, and use cases to the business on the right timeframe. And they can't go down; people get fired for that at the CIO level.

Right, if something goes down, the CIO is gone.

And obviously for certain companies in the more modern mode, people whose primary transactional vehicle is the internet, not retail, not partners, not people like IBM, but a website: if that website is not resilient, performant, and always reliable, then guess what, they are shut down and they're not selling anything to anybody. Which is not true if you're Nordstrom; someone can always go into the store, buy something, and figure it out. And almost all the old retailers have not only a connection to the core, but literally a server and storage in every retail location, so if the core goes down, guess what, they can still transact. In the era of the internet, you don't do that anymore. If you're shipping only on the internet, you're shipping on the internet.

So that's an old workload; now take a new workload, the whole IoT thing. For example, I know a company I was working with, a giant private mining company. They have those giant three-story dump trucks you see on the Discovery Channel. Those things cost them $100 million. So they have 5,000 sensors on every dump truck. It's a fricking dump truck, and guess what, they've got 5,000 sensors on there.
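Conceptually, what those thousands of sensors feed is a streaming check of each reading against its allowed operating range, so maintenance can act before a failure. A minimal sketch; the sensor names and limits here are invented for illustration, and a real telemetry pipeline would add time-series models on top of simple thresholds.

```python
def check_readings(readings, limits):
    """Flag sensor readings outside their allowed range.

    `readings` maps sensor name -> latest value; `limits` maps
    sensor name -> (low, high). Returns the sensors needing attention,
    so crews can intervene before a $100M truck goes down.
    """
    alerts = []
    for sensor, value in readings.items():
        low, high = limits[sensor]
        if not (low <= value <= high):
            alerts.append((sensor, value))
    return alerts

# Hypothetical telemetry from one truck: the engine is running hot,
# hydraulic pressure is within range.
limits = {"engine_temp_c": (60, 110), "hydraulic_psi": (1500, 3000)}
readings = {"engine_temp_c": 118, "hydraulic_psi": 2200}
alerts = check_readings(readings, limits)
```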
So they can monitor them and take proactive action, because if one goes down, whether these be diamond mines or uranium mines or whatever it is, it costs them hundreds of millions of dollars to have a thing go down. So that's, if you will, taking it out of the traditional high-tech arena we all talk about, whether it be Apple or Google or IBM, and putting it to another workload. In this case, it's the use of IoT in a big data analytics environment, with AI-based infrastructure, to manage dump trucks.

Something coming out of that is what's called digital twins in a networked environment, for materials management, supply chain management, and so forth. Are those requirements growing, in terms of industrial IoT requirements of that sort? And how does that affect the amount of data that needs to be stored, and the sophistication of the AI and the stream computing that needs to be provisioned? Can you talk to that?

The amount of data is growing exponentially. It's growing at zettabytes and yottabytes a year now, not just exabytes anymore. In fact, look at everybody's iPhone or laptop: I've got a 10-gig phone, and my laptop, which happens to be a PowerBook, has two terabytes of flash, on a laptop. So just imagine how much data is being generated in a giant factory, or in the warehouse space, or in healthcare, government, the financial sector. And now, with all these additional regulations, such as GDPR in Europe, and other regulations across the world about what you have to do with your healthcare data and what you have to do with your finance data, the amount of data being stored keeps growing. And then on top of it, quite honestly, from the AI big data analytics perspective, the more data you have, the more valuable it is, because the more you can mine it. It's like oil; it's as if the world were just oil.
Forget the pollution side; let's assume oil didn't cause pollution. Okay, great. Then guess what? You would be using oil everywhere. You wouldn't be using solar, you'd be using oil. And by the way, you'd need more and more and more, and how much oil you had and how you controlled it would be the power. That, right now, is the power of data. And if anything, it's getting more and more and more. So again, you always have to be resilient with that data. You always have to interact with things like we do with Hortonworks, and with other application workloads. Our AI reference architecture is another perfect example of the things you need to do to provide the right foundation at the base infrastructure. If you have the wrong foundation, a building falls over, whether it be your house, a hotel, or this convention center; with the wrong foundation, it falls over.

And to follow the oil analogy just a little bit further: the more of this data you have, the more PII there usually is in it, and the more the workloads need to be scaled up, especially for things like data masking, when you have compliance requirements like GDPR. You want to process the data, but you need to mask it first. Therefore you need clusters that are optimized for high-volume, highly scalable masking in real time, to feed the downstream applications and to feed the data scientists, the data lakes, and so forth.

That's why you need things like incredible compute, which IBM offers with the Power platform, and why you need storage that, again, can scale up, can get as big as you need it to be. For example, in our reference architecture we use what we call Spectrum Scale, which is a big data analytics workload performance engine that's multi-threaded and multitasking. In fact, one of the largest banks in the world runs on it; if you happen to bank with them, your credit card fraud detection is being done on our stuff.
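The masking step described here, process the data but obscure the PII first, can be sketched in a few lines. The field names and the hash-based approach are assumptions for illustration; production masking tools typically use tokenization or format-preserving encryption rather than a bare hash.

```python
import hashlib

PII_FIELDS = {"name", "email", "card_number"}  # assumed schema

def mask_record(record, pii_fields=PII_FIELDS):
    """Replace PII values with a truncated one-way hash.

    Masked records can still be joined, grouped, and counted by the
    downstream applications and data scientists mentioned in the talk,
    without exposing the raw personal values.
    """
    masked = {}
    for key, value in record.items():
        if key in pii_fields:
            masked[key] = hashlib.sha256(str(value).encode()).hexdigest()[:12]
        else:
            masked[key] = value
    return masked

record = {"name": "Ada Lovelace", "email": "ada@example.com", "amount": 42.50}
safe = mask_record(record)  # non-PII fields pass through untouched
```

At GDPR scale this function would run inside a parallel pipeline over the whole data lake, which is where the "clusters optimized for high-volume masking" come in.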
Okay, but at the same time we have what's called IBM Cloud Object Storage, which is an object store. You want to take every one of those searches for fraud, and even when they find out that no one stole your MasterCard or your Visa, you still want to put it in there, because then you can mine it later and see patterns in how people are trying to steal, since it's all being done digitally anyway. So you want to handle it very quickly and resiliently, but then be able to mine it later.

As you said, mining the data: high-volume anomaly detection in the moment, to tag the more anomalous data that you can then sift through later, or maybe in the moment, for real time. Well, that's highly compute intensive, it's AI intensive, and it's highly storage intensive on the performance side.

And then you store it all for, let's say, further analysis, so you can tell people, when you get your Amex card, do this and they won't steal it. The only way you can do that is to use AI on this ocean of data, analyzing all the fraud that has happened to look for patterns, and then you tell me, as a consumer, what to do. So whether it be the financial business, in this case the credit card business, healthcare, government, or manufacturing: one of our resellers actually developed an AI-based tool that can scan boxes and cans for faults on an assembly line. They've sold it to a beer company and a soda company, so that instead of people watching the cans, like you see on the Food Channel, to pull the damaged ones off, guess what? It's all done automatically. There are no people pulling cans off the line, saying, oh, that can is damaged; and by the way, with people, sometimes they slip through. Now, using cameras and this AI-based infrastructure from IBM, with our storage under the hood, they're able to do this.

Great. Well, Eric, thank you so much for coming on theCUBE.
It's always a lot of fun talking to you.

Great. Well, thank you very much. We love being on theCUBE and appreciate it. And I hope everyone enjoys the DataWorks conference.

We will have more from DataWorks just after this.