from Las Vegas. It's theCUBE covering Dell EMC World 2017. Brought to you by Dell EMC. Hey, welcome back everyone. We're here live in Las Vegas for Dell EMC World 2017. This is theCUBE. I'm John Furrier with my co-host, Paul Gillin. And our next guest is Don, Senior Vice President of Product Management, Dell EMC, formerly in a historic role at Microsoft Azure, and with EMC for a few years now. Welcome to theCUBE. Good to see you. Thank you. It's nice to be here. So last year we had a conversation. We were talking about some of the technology and kind of the direction it was going. So first question is, from last year to this year, what's changed and what's the news? Yeah, so we've brought together two pretty well-known platforms, Isilon for scale-out file and ECS for scale-out object, into one team that's now called the unstructured data storage team. And we've done this really because, from the point of view of the customer, what we see is this confluence between file and object in the space of unstructured storage. And we have some ideas of how to put that together in just the right solution for the customer. So that's why we've brought these teams together, and we've got a lot of great stuff to talk about this year. So how are you positioning file versus object right now? It seems like object is the rage, but file is still going to be around for a long time. How do you position it? Yes, I think it will be. I think basically, if I may, it's not just two, but we see three pillars of unstructured storage. The first is file, which is really more towards compatibility with traditional workloads. A lot of the application ecosystem is comfortable programming against NFS or SMB. And that ecosystem is going to remain for a long time, for instance, in a space like video surveillance. So that's where we see file. It's optimized more for performance rather than scale, although you do get scale.
The next level was really object, which is more for your modern workloads, for your web and mobile sort of workloads, optimized more for scale rather than performance. And then the third pillar that we see that we've been working on now is really real-time data or what you call streaming data from things like IoT, where you're getting a fire hose of information coming out and you got to store it very, very quickly. So we see these are the three different pillars of unstructured storage. And really what we've been working on in our unstructured data storage team is how to bring all these three together in the right solution for the customer. So how about the group that you're in because this is kind of a new, not new industry, we've been talking about unstructured data for many years, going on eight years. But it's becoming super important now as you have this horizontal data fabric developing. We talked a little bit about it last year, but you can see a clear line of sight now with apps using data very dynamically. So you need under the hood storage, but now you need addressability of data. And so there's a challenge of getting the right data from the right database to the right place on the app in less than 100 milliseconds. And that's like the nirvana. Yeah. So I think there's a couple things happening, right? Firstly, the advances in hardware have changed the game a fair bit because you can take a software stack that was not optimized for latency to begin with. You can put it on all flash hardware and you can reduce that round trip a lot, right? That's one thing. The other thing I see is that, especially with the advancement of object, one of the things people don't immediately realize about object is it comes with metadata. And so you can tag everything that you write into an object store, and this makes it a lot easier to find your stuff and to index it, right? Which is a unique capability of object. 
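The metadata point above, that every object write can carry tags which make the data findable without scanning it, can be sketched in a few lines. This is a hedged, in-memory illustration with made-up names, not the ECS API; real object stores expose the same idea through S3-style interfaces.

```python
# Minimal sketch of an object store with metadata tagging. All class and
# method names here are illustrative, not a real product API.
class ObjectStore:
    def __init__(self):
        self._objects = {}  # key -> (data, metadata)

    def put(self, key, data, **metadata):
        # Every write carries arbitrary key/value tags alongside the payload.
        self._objects[key] = (data, metadata)

    def find(self, **tags):
        # Metadata makes objects findable without reading the payloads.
        return [key for key, (_, md) in self._objects.items()
                if all(md.get(k) == v for k, v in tags.items())]

store = ObjectStore()
store.put("cam1/clip001", b"...", camera="lobby", kind="video")
store.put("cam2/clip002", b"...", camera="garage", kind="video")
print(store.find(camera="lobby"))  # -> ['cam1/clip001']
```

The useful property is exactly the one described in the conversation: the tags are written at ingest time, but the queries against them can be invented years later.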
Well, IoT, I can see an instant advantage for IoT, but also analytics is going to be impacted big time. Yes. What's the impact on analytics from your perspective? Yeah, I think the data, you've got to have all that, you've got to have all the metadata. But I think especially from the file perspective, we've seen, for instance, with Isilon, our scale-out NAS, a lot of adoption into big data. And that's because if you look at how Hadoop or HDFS started, and then you look at how it's evolved in terms of the capability roadmap, it's converged more and more with what you would expect from a regular file system. So I think we've really seen that sort of convergence, where what used to be thought of as Hadoop has taken on capabilities, encryption, compliance, tiering, et cetera, that you would expect in a traditional file system. And so that's why we've really pushed Isilon in that direction of being a first-class Hadoop store, right? All right, so a big question, kind of let's zoom out. Yeah. Look down from the balcony onto the stage of life in IT. You're from a research background, PhD in computer science. I mean, it's a pretty awesome time to be in computer science right now. It is. There's a ton of opportunity to apply some of that machine learning, all this goodness there. What's your vision of how the next 30 years are going to play out? Because Michael Dell said, hey, it's been 33 years since you started the company. The next 33 are going to be amazing, and I believe that to be true as well, given the science opportunities. How do you look at this, from a personal level and then also from a Dell EMC level? Yeah, I think what's really going to change is, up to now, a lot of things that have been done with computing have started with the thought of how much data can I really have? And then once I've decided how much data I can really have, what am I going to do with it?
And I think the innovation that's happened in storage that I'm a part of, what has really changed is it said, you don't have to stop and think about how much data you're going to have. You can just keep all of it, right? And so what that means is I don't have to figure out upfront what I'm going to do with this data. I'm just going to bring it all in. It's going to be really cheap, the systems that hold it are really scalable, and everything is tagged in such a way that after the fact, five years from now, I can go do something with this data that I hadn't envisioned when I brought it in. And I think that just opens up a range of things that were hard to imagine. The other thing I think is the... Programmatically meaning, from a software standpoint, discoverability. That's right. I think as you said, machine learning is a big part of it, right? Because I think machine learning unlocks opportunities to mine the data that people hadn't really thought of before. And it comes back to the same thing, that when I bring data in, whether it's from sensors or aircraft engines or what have you, I have no idea what I'm going to do with the data. So I have no idea which part of the data is important and which part of the data is less important, right? But when I can apply things like machine learning after the fact, I don't actually have to worry about that. I just bring it all in and the algorithms themselves will figure out what part of the data is the useful part. So, yeah. You have a scale-up product line and a scale-out product line. How are you positioning those two, application-wise, to your customers? So I think there is a distinction between tier one storage and tier two storage, right? I think when you think about tier one storage, it's not just about the numbers like latency and IOPS, but it's about the whole experience of tier one storage.
Do I have, for my disaster recovery, RPO zero, which means I can recover to the exact point in time I was at when I failed over a data center, right? How does my replication work? What data services do I have? So I think our scale-up technologies are very well-oriented towards the tier one kind of capabilities, right? And then our scale-out technologies are very well-oriented towards sort of the ubiquitous tier two of storage, which is much more deployable at scale. Pretty good performance and throughput, actually, but not with that complete set of capabilities you think about with tier one in terms of RPO zero, synchronous replication, those kinds of things, right? So I think there's a very natural sort of mix between the two. And really, I think from a storage vision, what we see is the tier two storage is so scalable and so cheap that all of your pools of tier one storage on the top tier down automatically into the tier two storage. And what that means is, for our customers, if you think about how much tier one storage they have to provision today, they should be able to provision less of that, because they should be able to tier more of it down to the tier two storage, which is now capable enough to hold the rest of the data, right? So... And be available. And be available. So this is not the old world where I had tier one and tape, and because I know tape is slow and hard to recover from, I have to over-provision the tier one storage. I now have a capable enough tier two that I can aggressively offload a lot of the tier one into the tier two. But I still need the tier one with my scale-up model because I have those capabilities that tier one storage... And tier three is even cheaper when you get to the cold storage. Generally, yes. I just think that you're onto a game changer here. This is fundamentally the architecture. Tier one is hot, whatever you want to call it. Tier two is getting cheaper and cheaper. It scales, you have machines.
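The tier-down idea described above, keeping expensive tier one small because anything cold enough moves automatically to cheap, scalable tier two, amounts to a simple policy loop. Here is a hedged sketch; the data structures, names, and the 30-day threshold are all illustrative assumptions, not how any Dell EMC product implements it.

```python
import time

# Illustrative tier-down policy: entries not accessed within the window
# move from the hot tier to the cheap tier. Threshold is a placeholder.
TIER_DOWN_AFTER_SECS = 30 * 24 * 3600  # e.g. 30 days

def tier_down(tier_one, tier_two, now=None):
    """Move stale entries from tier_one to tier_two.

    tier_one / tier_two: dicts of name -> (data, last_access_timestamp).
    """
    now = time.time() if now is None else now
    for name in list(tier_one):
        data, last_access = tier_one[name]
        if now - last_access > TIER_DOWN_AFTER_SECS:
            tier_two[name] = tier_one.pop(name)

hot = {"orders.db": (b"...", 0), "live.log": (b"...", 10_000_000)}
cold = {}
tier_down(hot, cold, now=10_000_000)
# 'orders.db' is old enough to move down; 'live.log' stays hot.
```

The provisioning consequence is the one stated in the interview: the hot dict only ever needs to be sized for the working set, not for everything ever written.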
So you really have to optimize, because what you said earlier, I think, is the big fundamental thing in the industry people are missing: you don't know the value of data at any given time. Exactly. Because some database record that's in some data lake or data warehouse in the tier two could be super valuable in context to, say, a retail transaction at any given moment. Exactly. Or just think about video surveillance, right? A particular clip from some moment might be really valuable three years from now when you're doing some legal work and you want to figure out what happened at precisely that point in time. Okay, so customers want to do this, no brainer. All right, so we hear Amazon talk about this all the time. Jeff Bezos was just talking about it the other day, and Andy Jassy, they've got the recognition software, so you see facial recognition, a lot of great stuff happening all over the cloud world with this kind of modeling, with the power of compute that's available. What do the customers do now? Because now they get it, it's a no brainer, obviously. Now they've got to change how they did IT for 30 years to be agile for tomorrow. What's the playbook? So what we're seeing is the step one that we're seeing more and more today, and have seen really for the last couple of years with Isilon and with ECS, is what I would call consolidation of the tier two, right? So where we had 12 different clustered silos of storage for the different use cases, let's buy into this model that I can just build one large storage cluster and it can handle the 12 different use cases at the same time, right? And that's what we've been proving out for the last few years. I think enterprise customers are really getting there. And now what we're beginning to see this year is the next phase, whether it's the industrial internet with the automotives, et cetera, the more IoT-style use cases.
So in fact, on Wednesday we'll be talking about a new thing we've got called Project Nautilus, which is the third leg of our stool, with the streaming storage that is built on top of Isilon and ECS. And we are now at the point where our first customers are beginning to work with that, where they're saying, from my sensors in the automobiles, on the cameras, I'm going to bring in this fire hose of data. I'm going to store it all with you, but later on I'm going to do analytics on it. As it's coming in, I'm going to do some real-time analytics on it, and then after the fact I'm going to do the more batch-style Hadoop analytics. I know Paul wants to jump in, but I want you to just back up because I missed the three pillars. The three pillars were file, for which we have Isilon; object, for your modern applications and web workloads, for which we have ECS; and then streaming storage for IoT. Which is Nautilus. Which is Project Nautilus. Okay, got it. And Dave, the way I put it to people is, traditional storage systems, scale up or scale out, file or object, they need resilience, right? So when you write the data, you have to write and think at the same time, because you have to record all kinds of information about it. You have to take locks, et cetera. For IoT, you need a storage system that writes now and thinks later, so that you can just suck it all in. It sounds like an operating system. I mean, you've got storage that's turning into, like, LUN provisioning hardware, essentially intelligent software that has to compile, run time, assembly, all this stuff's going on. And there's all these fancy names like Lambda architecture and all that kind of stuff, and what that's all saying is, I bring the data in and as it's coming in, there's some things I already want to do with it. I do that analytics in real time. There's other things where I go tag it, who was in the photo, where was it? And then the rest of it I'm going to do later.
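The "write now, think later" pattern just described can be sketched as an append-only log whose ingest path does no tagging at all, with enrichment and indexing deferred to a later pass. This is only an illustration of the pattern under assumed names, not the internals of Project Nautilus.

```python
from collections import deque

# Hedged sketch of streaming ingest: the hot path only appends; all
# tagging/indexing ("thinking") happens after the fact in a batch pass.
class StreamStore:
    def __init__(self):
        self.log = []            # append-only log of raw events
        self.pending = deque()   # offsets awaiting enrichment
        self.index = {}          # tag -> list of event offsets

    def append(self, event):
        # Hot path: persist the raw record and return immediately.
        self.log.append(event)
        self.pending.append(len(self.log) - 1)

    def think_later(self, tagger):
        # Deferred path: enrich and index events already stored.
        while self.pending:
            offset = self.pending.popleft()
            for tag in tagger(self.log[offset]):
                self.index.setdefault(tag, []).append(offset)

s = StreamStore()
for reading in ({"sensor": "a", "v": 7}, {"sensor": "b", "v": 99}):
    s.append(reading)
s.think_later(lambda e: ["hot" if e["v"] > 50 else "normal"])
print(s.index)  # -> {'normal': [0], 'hot': [1]}
```

The split mirrors the Lambda-style point in the conversation: some analytics run on the event as it arrives, and the rest runs later over the same stored log.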
And who knows what and when. And that's the beauty of it. You're way along the thinking curve on this, obviously, but where are your customers? I mean, you're talking about a pretty radically different approach to processing and storing data, even in real time, machine learning, meta-tagging. I mean, there's a lot for them to absorb. So, I think that part is really a vertical-driven, use-case-driven thing. So there are some industries where we see a lot of uptake on that. Automotive is a great example, right? Financial services. Financial services, fraud detection, those kinds of things. And there are other verticals where it's not time for that yet, right? Like, I'd say healthcare is a great example, right? So in those verticals, we see more of just the storage consolidation. Let me build one pool of tier two storage, if you will, and consolidate my 12 use cases, sort of what we refer to as the data lake in our world. But I think it's specific verticals. And you know, that's fine, right? If you look at even the traditional unstructured storage, I think it really started with certain verticals like media and entertainment, life sciences, and that's sort of where it kicked up from. And I think for the streaming storage, it's these verticals that are more oriented towards IoT, your automotive, your fraud detection, those kinds of things, where it's really kicking off. And then it'll sort of broaden from there. How's this playing into the Dell server strategy? You know, it's really a fantastic thing, I don't want to say so much for us as for our customers, because I've talked to a number of people in these verticals where the customer wants a complete solution for IoT, right? And what that means is, number one, on the edge, do I have the right equipment with the right horsepower and the right software to bring in all the data from the edge and do the part of the processing that needs to be done right there on the edge in real time?
And then it has to be backed by stuff in the backend environment that can process massive amounts of data, right? And with Dell, we have the opportunity for the first time, that we didn't have when we were EMC alone, to do the complete solution on both ends of it, right? Both the equipment on the edge as well as the backend IT. So I think it's a great opportunity. You bring up so many awesome conversations, because storage used to be boring. Storage is not boring anymore, because it's fundamental to the heartbeat of a company. So here's the question for you, okay? Kind of thinking out loud and riffing with you. So some debate, like, listen, I want to find needles in the haystacks, but the haystacks are getting bigger. So there's a problem with that. I've got to do more design and more geek digging, if you will. And the second point is, customers are saying, at least to us in theCUBE and privately, I've got a data lake, it's turning into a data swamp. So help me not have swamps of data. And I want more needles, but the haystack's getting bigger. What's your advice to those CXOs? Could be a CDO, Chief Data Officer, a CISO? And these are the fundamental questions. I would say this: whatever technology you're evaluating, whether it's an on-premise technology or a hosted technology from a vendor like us, or it's a service that's out there in the public cloud, if you will, ask yourself two questions. One is, if I size out what I need right now and I multiply it by 10 or 100, what is it going to cost? And is it really going to work the same way? Is it going to scale the same way? Look at the algorithmics inside the product, not the PowerPoint, and ask, the way they've designed this thing, when I put 100 times the data on 100 times the number of servers on this storage system, are things actually going to work the same way or not? That's a scale question, kind of order-of-magnitude thinking. You need to kind of go out and size it up a bit.
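The "multiply it by 10 or 100" exercise above is easy to make concrete as a back-of-envelope model. Every figure below is a made-up placeholder; the only point is that once replication factors and operations staff enter the model, cost does not scale as a clean multiple of capacity.

```python
# Back-of-envelope sizing model for the 10x/100x question. All numbers
# (replication factor, node density, salary) are illustrative placeholders.
def projected_cost(tb, cost_per_tb, replication=3.0,
                   tb_per_node=100, ops_per_100_nodes=1, salary=150_000):
    raw_tb = tb * replication                  # usable TB -> raw TB stored
    nodes = -(-raw_tb // tb_per_node)          # ceiling division
    ops = -(-nodes // 100) * ops_per_100_nodes # ops headcount scales with nodes
    return raw_tb * cost_per_tb + ops * salary

today = projected_cost(500, cost_per_tb=20)
at_100x = projected_cost(50_000, cost_per_tb=20)
print(today, at_100x, at_100x / today)
```

Whatever the placeholder numbers, running the same model at 1x and 100x makes the hidden assumptions (replication overhead, node count, people) explicit, which is the evaluation discipline being recommended.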
Because I see right now the landscape is full of new technologies for storage, and a lot of them sound great and look great in the PowerPoint, and you go to a POC with four nodes or eight nodes and you put flash in there and it works really well. But the thing is, when you have 200 nodes of that, when you've got a 30-petabyte cluster and you've got to fail it over because your data center went down, how does that go? It's also who's going to run it, too, right? I mean, you want fewer ops people, not more, and you don't want them to be huge, expensive developers. To your point, that's the other thing. We really don't talk to our customers in terms of storage acquisition costs anymore. We talk in terms of TCO, total cost of ownership, right? You look at power, you look at cooling, you look at the human side. That kills Hadoop, basically. It was so hard to run in total cost of ownership terms. Michael Dell was just on. I was interviewing Michael and I asked him, where's the cloud strategy? I was just busting his chops a little bit, because I know it's messaging, trying to get him off his messaging. But he made an interesting comment and metaphor. He goes, well John, remember during the internet days, where's your internet strategy? Look what happened, the bubble popped, but ultimately everything played out according to plan. Pet food online back then is now food delivery, DoorDash, all this stuff's happening. So he was using it to compare to the cloud today. There's a lot of hope and promise, where's your cloud strategy, but his point was it's going to be everywhere. Yeah, and I would say this, right? I think people sometimes confuse cloud with public cloud, right? And I think what happened is, having done Azure myself, right, I would say that public cloud exposed a certain model that had some benefits for the customer base that were new. That is, I consume as a service. I don't worry about operationalizing things.
I can pay as I go, so I get that it's elastic. But it also came with a lot of drawbacks, right? I don't have the kind of control that I would like to have. You know, a normal thing that any person who takes a dependency on infrastructure has is, today is my Super Bowl Sunday, don't touch my environment today. Now you go to the public cloud and you use a service that is used by thousands of other customers. Which day is Super Bowl Sunday? Every day is Super Bowl Sunday for somebody, right? It was a metaphor. Public cloud was a metaphor for virtualization. And so I think the journey we're all on, all the vendors, the public cloud suppliers, everybody, is, what is the right set of models that are going to cover the space for all our customers? There's not going to be one, there are several. I think the dedicated private cloud models are certainly very appealing in a number of ways if you do the economics right. And I think that's the journey we're all on sort of together. Well, Wikibon just put out research. I tweeted a little bit of the jewels out there this morning. True private cloud is going to be a $265 billion market. And they were the first ones to actually size that. When they say true private cloud, they mean essentially hybrid, but on-prem with the data center. Those are huge numbers. It's not like rounding errors. I mean, we believe that too. And that's why one of the neatest things we've announced this year with ECS, object storage, is something called ECS Dedicated Cloud, which is basically saying, you can take the object storage from us, but it's going to run in our data centers. We operate it. It's actually the developers who wrote the code on my team who are operating it. And you can do a variety of hybrid things. You can keep some of it on-prem, some of it off-prem. You can keep all of it off-prem. But regardless, it's your stuff. You can hug it. It's dedicated to you. You're not sharing the cluster with anybody else.
And you get to decide, when do you upgrade to a version? When do you take a maintenance window? What have you, right? So we're all searching for that sweet spot, if you will. I want to ask you about something different: containers. The hottest thing right now in infrastructure. Lack of persistent storage has been a real problem for containers. Is that a problem that's yours to solve? Or is it Docker's to solve? No, I think it is ours to solve with them. So I'll say a couple of things. Firstly, our modern products, ECS, our object storage, as well as ScaleIO, our scale-out block storage, are built with containers. So for instance, if you take ECS today, every ECS appliance that we ship, if you look inside, every server is running SUSE Linux with Docker, and all the ECS code is running in Docker containers. That's just how it works, right? So one, we're believers in containers. And two, I think we have been doing the work to provide that persistence ecosystem for containers using our storage. So we have a great team at Dell EMC called EMC Code. And these are people who do a lot of this integration stuff. They work very closely with Docker and a number of the other frameworks to really plug our storage in. And I think it's a very open ecosystem. There are APIs there now, so you can plug anybody's storage in. And I think if you compare VM-based infrastructures with container-based infrastructures, that's really the gap, right? Because when you operationalize this stuff, you need things like that. You need persistent storage. You need snapshots. You need a DR story. You need those kinds of things, right? But I think that'll all come. Well, we're looking forward to continuing the conversation. I know time's tight. We'd like to follow up with you after the show, maybe bring you into our studio via Skype to drill down. You're in a hot area. You've got the storage. You've got the software. You've got some cloud action going on.
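The persistent-storage gap for containers discussed above is typically closed with named volumes backed by a volume driver. As a minimal, hedged config sketch (the service, image, and volume names are illustrative; the built-in "local" driver stands in for any vendor volume plugin that could be substituted):

```yaml
# docker-compose sketch: a named volume that outlives the container.
version: "3"
services:
  db:
    image: postgres:9.6
    volumes:
      - dbdata:/var/lib/postgresql/data   # data survives container restarts
volumes:
  dbdata:
    driver: local   # a vendor volume plugin driver could be swapped in here
```

The point made in the conversation is exactly this pluggability: the compose file stays the same while the driver line decides whose storage actually holds the data.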
And we thank you very much for coming on theCUBE. Appreciate it. Thank you for having me. This is theCUBE, live coverage here at Dell EMC World 2017. I'm John Furrier with Paul Gillin. We'll be right back. Stay with us.