In January 2008, EMC landed a haymaker by incorporating enterprise flash drives into the Symmetrix array. At the time, Hitachi was pressuring EMC on performance with its million-IOPS claims. The enterprise flash drive EMC announced took performance off the table entirely and catapulted EMC back into performance leadership. This started, in Wikibon's opinion, the IO-centric era for high-end storage. Over the next four years, we saw a spate of flash innovation come to market: copycats to EMC, PCIe flash, all-flash arrays, flash cache, and, importantly, a host of software innovations.

Hi, everybody. This is Dave Vellante of wikibon.org, and I'm here today to talk about an announcement that EMC made today, February 6th. I'm here with Barry Ader, who's the senior director of product marketing and product management in EMC's flash business unit, and Dan Cobb, who is the CTO of that business unit. Gentlemen, welcome, and thanks for coming on theCUBE.

Thank you, Dave.

So it's great to have you guys here today. We're talking about your announcement. We're going to talk about flash and the vision that you have for that new architecture and new infrastructure. Barry, let's start with what exactly you announced today.

Yeah. So we announced a new product called EMC VFCache. Back in May of last year at EMC World, we previewed it as Project Lightning; this is the culmination of that project. We're announcing it today, and general availability was actually last week. Basically, think of VFCache as an extension of our VMAX and VNX product lines. It extends the storage into the server from a performance perspective. So now we have a cache that sits closer to the application, closer to the server, to enhance performance. But it's not just about giving the customer 3X performance or cutting response time by about half. It's also about being able to deliver this performance with protection, and to deliver it with intelligence. So, complementary to the EMC storage, you still get the ability to have all of your data sit down on the storage array, but the hot data sits closer to the application to deliver incredible performance.
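That response-time claim is easy to sanity-check with a simple hit-ratio model. Below is a back-of-envelope sketch; the latency figures are order-of-magnitude assumptions for PCIe flash and a disk-backed array, not measured VFCache numbers.

```python
# Illustrative only: hit ratio vs. effective read latency for a server-side
# flash cache in front of a disk array. Latencies are assumed round numbers.

def effective_read_latency_us(hit_ratio, cache_us=100.0, array_us=5000.0):
    """Average read latency when hits are served from local PCIe flash
    and misses go down to the backing array."""
    return hit_ratio * cache_us + (1.0 - hit_ratio) * array_us

for h in (0.0, 0.5, 0.8, 0.9):
    print(f"hit ratio {h:.0%}: {effective_read_latency_us(h):7.0f} us")

# Under these assumptions, even a 50% hit ratio roughly halves average
# response time, which is consistent with the claim above.
```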
So, Dan, for 20 years we've seen storage function move out of the server into the array, and EMC itself sort of started that trend with Symmetrix. Now we're starting to see the pendulum swing back the other way in some cases, while in other cases the data is still going from server to storage. Talk about what you see as the big trends here; help us squint through what's really going on.

Well, I think you made a really good point, Dave. What we're seeing is this whole notion of what I like to think of as the north-south computing model, where I've got a server up at magnetic north and storage at south, and it's all about the path from server to storage and the levels of availability, quality of service, and so on in that whole intelligent IO stack. What's happening now, with the emergence of flash, high-speed networks, and new application models, is that there's been really no change to that north-south model, and there continues to be innovation there, but we're seeing the emergence of more and more of what I call east-west models, where the traffic is just as much server to server, or service to service, as it is from storage up to a server. So we're seeing new application models, we're seeing new device models, and as a result we're seeing new deployment models for some of these disruptive technologies throughout the IO stack.

Yeah, the flattening of the network is a theme you hear from people like Arista and Cisco and many others, and that's really part of what's driving this. How about active data and where it sits? What are you seeing there?

So I think it all comes back to something EMC came out with back in 2009: how we leverage flash in new ways. FAST is really all about moving the hot data to the right place at the right time, and active data is really a question of how much active data you have and how you leverage the best-performing technology, be it flash or PCIe flash in the server. How do you move that data to the best-performing technology to give it the best performance, while at the same time leveraging the larger-capacity spinning drives for the economics? So here we have a solution that allows you, from a caching perspective, to move that hot data closer to the application.

And the sub-LUN volume management capability of FAST really changed the way in which FAST and flash interacted, correct? Can you talk about why that is?

Absolutely right. When flash first came to market, some of the initial deployments were about looking at the most demanding applications, the hottest data, the highest throughput requirements, and specifically provisioning a certain technology against that workload. What we found with customers is a couple of things. One is that there is a natural life cycle to this data, so the things that are hot today may be warm tomorrow and a little cooler next week, and vice versa. And it's not just about coarse-grained, LUN-by-LUN management; it's really about the fluidity of being able to move data between tiers at a lower granularity, so that just the hot data can be on a more performant tier and just the cold data can be on a more cost-effective tier, at a granularity that's much more native to the application and the workload than to the physics of some storage device itself.
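A minimal sketch of that kind of sub-LUN, heat-based movement might look like the following. The extent size and thresholds are hypothetical, and this illustrates the general technique rather than EMC's actual FAST algorithm.

```python
# Hypothetical sub-LUN tiering sketch: track IO heat per extent (a slice of a
# LUN, far smaller than the whole LUN) and periodically promote hot extents
# to flash and demote cold ones to SATA. All numbers are invented.

from collections import defaultdict

EXTENT_MB = 768       # assumed sub-LUN extent granularity
PROMOTE_IOPS = 50     # assumed per-extent heat threshold for promotion
DEMOTE_IOPS = 5       # assumed per-extent threshold for demotion

class SubLunTierer:
    def __init__(self):
        self.io_count = defaultdict(int)           # IOs per extent this window
        self.tier = defaultdict(lambda: "sata")    # extent -> current tier

    def record_io(self, extent_id):
        self.io_count[extent_id] += 1

    def rebalance(self, interval_sec):
        """Run periodically: move hot extents up, cold extents down."""
        for ext, count in self.io_count.items():
            iops = count / interval_sec
            if iops >= PROMOTE_IOPS:
                self.tier[ext] = "flash"   # hot: performant tier
            elif iops <= DEMOTE_IOPS:
                self.tier[ext] = "sata"    # cold: cost-effective tier
        self.io_count.clear()              # fresh measurement window
```

The point of the design is the granularity: decisions are made per extent, not per LUN, so only the genuinely hot slices consume flash capacity.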
So, VFCache: why now, why today?

I think one of the trends we've obviously been seeing over the last decade is that, from a Moore's law perspective, server and chipset technology continues to double every 18 months, and there's a gap between what the disks can deliver and what the server and the application want to consume. So with this new technology in hand, PCIe flash, we can use it in new and exciting ways to really cover that IO gap. That's really why now.

Okay, so the product is a read-acceleration cache, and it's a PCIe product today, right? So talk a little bit more about what it is today, and maybe give us a sense of where it's going.

So today it is a combination of hardware and software. The hardware is a PCIe card that goes into the server; the software is a filter driver that goes into the operating environment. It is completely application agnostic: it sits in the SCSI stack, intercepts the IOs, and makes sure the hot data is sitting close, in the server on that PCIe card, to deliver that better performance. But it writes through down to the storage array for protection reasons, because you want to make sure your mission-critical data is still on a VMAX or a VNX for those five-nines HA and data-reliability concerns.
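The write-through behavior just described can be sketched in a few lines. The class and method names here are hypothetical, and this is an illustration of the general write-through read-cache technique, not EMC's filter driver.

```python
# Sketch of a write-through read cache: reads are served from local flash
# when possible; writes always go through to the array, so the array remains
# the protected, authoritative copy.

class WriteThroughReadCache:
    def __init__(self, backing_array, capacity_blocks):
        self.array = backing_array      # any object exposing read()/write()
        self.capacity = capacity_blocks
        self.cache = {}                 # block address (LBA) -> data

    def read(self, lba):
        if lba in self.cache:
            return self.cache[lba]      # hit: low-latency local flash
        data = self.array.read(lba)     # miss: fetch from the array
        self._insert(lba, data)
        return data

    def write(self, lba, data):
        self.array.write(lba, data)     # write-through: array always current
        if lba in self.cache:
            self.cache[lba] = data      # keep any cached copy coherent

    def _insert(self, lba, data):
        if len(self.cache) >= self.capacity:
            self.cache.pop(next(iter(self.cache)))  # naive eviction for brevity
        self.cache[lba] = data
```

Because no write is acknowledged from the card alone, losing the card or the server costs you nothing but warm cache state.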
So how do you see that back end changing, Dan, over time?

What we're noticing, with the emergence of tiering to handle hot, warm, and cold data, is this: you can think of applications as being comprised of read traffic and write traffic, and for a product like VFCache it's interesting to note that applications that are read-mostly, applications with a well-defined working set of information, benefit tremendously from the high performance and low latency of PCIe flash. Correspondingly, since Lightning, now VFCache, is handling the low-latency read traffic, it actually frees up resources on the storage array to do a more performant, even more scalable, job of handling the write traffic. So the protection, the persistence, the data services, the virtual provisioning, the five-nines, the business continuance, all the logic that exists in the storage arrays is still there to do its job and provide those data services. But essentially VFCache now acts as an amplifier for the read capabilities of those arrays as well.

So a storage practitioner said to me last week, "My storage budget is not growing at 50%, but my data is." I thought that was a pretty poignant quote. So for the practitioners in our audience, a lot of them, I know, are sitting there saying: well, flash, isn't it really expensive? And these guys are talking about putting in another layer and keeping everything else the same. How am I going to afford this? Can you talk about that a little bit?

Sure. So clearly flash has come down in price. It is definitely more economical today than it was when we introduced it back in 2008. But flash is still more expensive than, let's say, hard drives. And that's really where FAST comes into play. By being able to leverage the right technology for economic reasons, you make sure you use flash for your hot, active data and then leverage SATA and nearline SAS technologies for the colder data, thereby bringing your entire TCO down.

I've seen statistics that suggest, and this is maybe a year or even two years ago, that the vast majority of data on tier-one storage really shouldn't be there. With the experience you've now had with FAST, can you confirm that? And second, are you saying to the practitioners out there that you can actually lower their costs by putting in this new infrastructure and making it intelligent?

You make a really good point, especially to clarify things for the practitioners out there. First of all, it's not unusual at all to see this whole notion of IO density, where 80% of the IOPS lands on 20% of the data. And naturally, that's what drives a lot of the application response and a lot of the infrastructure throughput, and making that tier and that part of the workload performant is really key to meeting service levels. But what about the other 80% of the data? Does that need to live on a storage tier that costs more per gigabyte? You buy flash for its dollars-per-IOPS performance. You buy ever-increasingly dense hard drives for their dollars-per-gigabyte economics. And the whole notion EMC is driving is that one size doesn't fit all. If all you have is a flash hammer, everything looks like a performance nail. Well, that's really not what real-world workloads are like. You have performance needs, you have capacity needs, and the magic, the art, is in mixing and matching those so that real-world workloads run against the service levels that are important to an enterprise. And while we do see use cases for all-flash arrays, and VMAX and VNX have configurations to do that, most of the time, for the bulk of customer use cases, there is going to be cold data and there is going to be hot, active data, and that data is going to change over time. Technologies like FAST, and technologies like VFCache, which extends FAST into the server, are really the use case for the majority.

But the economics of flash are actually really intriguing, because prices are coming down somewhat faster than spinning disk. And you have the case, particularly with FC drives or even SAS drives, where you're short-stroking; people have been short-stroking for a decade, so you're not able to take advantage of the capacity. And then there's the whole power and cooling aspect. So when you look at the total cost, it starts to become very interesting and very attractive, and it feels like flash is encroaching on those high-spin-speed FC disks, maybe even SAS disks, and then SATA becomes the most efficient place to store data for the long term. Do you guys agree with that?

Absolutely. And I think what we're starting to see, from a lot of the data we've been getting over the last four years, is customers with a little flash and the rest all SATA or nearline SAS. We're getting there. The average customer might deploy 4% to 5% flash, let's say 25% Fibre Channel, and the rest all SATA and nearline SAS. That's the majority use case, though of course it depends on the customer and how they play off of that.

And I have seen statistics, and I think they were from EMC, that about 5% of the world's data are candidates for flash. So am I understanding that you don't see that changing dramatically over time because disk drive prices are going to continue to come down? Or do you see that number growing over time, more in flash?

Actually, I'm not sure I do. It's a really good question. Every customer is going to be different, but for the bulk of customers, 5% of the data being hot is probably where it really stands.

So the petabytes are going to go to the spinning disk. And yet the value of the data that's going to be on flash is going to be substantially higher, and so the revenue opportunities associated with that are potentially greater.
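To see the shape of that TCO argument, here is a back-of-envelope calculation using roughly the tier mix quoted above (5% flash, 25% Fibre Channel, 70% SATA). The per-gigabyte prices are invented placeholders purely to show the structure of the argument, not 2012 street prices.

```python
# Illustrative blended $/GB for a tiered configuration vs. an all-FC one.
# Prices are made-up placeholders; only the shape of the comparison matters.

mix = {            # tier: (fraction of capacity, assumed $/GB)
    "flash": (0.05, 20.00),
    "fc":    (0.25,  2.50),
    "sata":  (0.70,  0.60),
}

blended = sum(frac * price for frac, price in mix.values())
all_fc = 2.50
print(f"blended $/GB: {blended:.2f} vs all-FC $/GB: {all_fc:.2f}")

# With a small flash tier absorbing the hot 20% of data (and most of the
# IOPS), the bulk of capacity sits on cheap SATA, so the blended cost comes
# in below an all-FC build while performance on the hot set improves.
```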
Dave, if I could butt in for a second: this notion of finding the perfect medium is a chase I think we've all been part of at one time or another. Back in our archiving and backup days, we heard the prediction of the death of tape for years. We've heard it. You tried your best. And it's really an economical storage medium for a bunch of interesting use cases, so there's a reason it's still with us. The same thing goes for spinning disk: every couple of years, or even more often, we hear that disk is dying, that the physicists have run out of tricks for storing bits. We went from horizontal to perpendicular recording; we improved areal density; now we're on to shingled recording and the ability to overlap and squeeze tracks closer and closer together. There's a tremendous amount of fundamental device physics, a tremendous amount of innovation, that continues to astonish me in the hard drive space, and it probably doesn't make sense for us to ring the death knell there, or handicap against it, yet. But I think what we can agree on is that putting the appropriate data on the appropriate medium, and the less critical data on a different medium, in a way that meets service levels and accomplishes business goals, is really what it's all about. It's kind of like that old serenity prayer: the secret is in the wisdom to know the difference, to know where it should belong.

The disk drive engineers are kind of like Scotty in Star Trek, right? "She's going to blow, Captain. We can't do it." They always get by.

One of the interesting things is that PCIe flash is obviously a relatively new technology, and we see customers starting to deploy it. One of the use cases is just to stick it in the server and treat it as a DAS device. The problem with that is, first off, how do you know what data is hot? How do you know that data doesn't change over time? And how do you know that that server is the only server that wants to access that data? So you get all the problems associated with stranded servers and stranded storage, et cetera. One of the beauties of VFCache, to bring us back to it, is that it's not only a caching product: we also give the customer the flexibility to use that PCIe card as a cache or as a DAS device. And what we're hearing from a lot of customers is: use the DAS mode for temporary, ephemeral data that might not need that protection. So we're giving them the flexibility to leverage a new technology, PCIe, in a variety of different ways.
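As a rough illustration of that cache-versus-DAS flexibility, the sketch below models the two modes side by side. The mode names, interface, and objects are hypothetical, not VFCache's actual API.

```python
# Hypothetical dual-mode PCIe card: write-through cache mode keeps the array
# as the durable copy; DAS mode uses the card as fast, local, unprotected
# storage for scratch or ephemeral data.

from enum import Enum

class CardMode(Enum):
    CACHE = "cache"   # hot data cached locally, writes persisted to the array
    DAS = "das"       # direct-attached storage, no array protection

def place_write(mode, lba, data, card, array):
    if mode is CardMode.CACHE:
        array.write(lba, data)   # array remains the protected copy
        card.write(lba, data)    # keep the local copy warm for fast re-reads
    else:  # CardMode.DAS
        card.write(lba, data)    # temp/ephemeral data: fast but unprotected
```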
Great. I want to bring up a chart that David Floyer produced just today; I think it hit the wiki. He lays out this notion of an IO-centric infrastructure in three layers, with cost on the vertical axis. The top layer is essentially an extension of memory, but a persistent extension, and the most expensive; it's where data is created, ingested, or input. Then you've got a management layer, the distributed shared flash storage layer, an active management layer where this whole life cycle comes in. And then, presumably using technologies like FAST, you have the ability to demote data down to what I call the bit bucket: the distributed archive, backup, and data protection layer, the archive, backup, and delete layer. So first of all, is that a reasonable way to look at the storage hierarchy as it's emerging? He's got flash playing, as you can see, in each of those layers, probably including a small bit in the distributed archive layer, to get data back if you have to. I wonder if you could comment on that model and help us understand where EMC fits, both as the flash business unit and as an organization, as a storage company.

Sure, sure. So I think we'd look at this and see a lot of parallels to the way we're looking at the hierarchy. We actually think of this as the IO hierarchy, and I think David would probably agree with that. For a while now, if you needed very low latency, very high throughput access to information, it lived in DRAM. DRAM, as compared to flash, isn't persistent, but it's accessed at processor speeds, processor latencies, processor cache-line granularity; it's very much part of the processor ecosystem, if you will. At the other end of the spectrum is the magnetic media and everything that gets done with hard drives, which has very much been part of the whole storage ecosystem all these years. The biggest difference between the two has been persistence: information in memory is there for the processor to add one to or subtract three from, whereas information in storage has a life cycle measured in years. So what we're seeing here, and I think you alluded to this a little earlier, Dave, is the notion of a new type of ecosystem, a new type of infrastructure.

When you asked Mark earlier whether this was EMC moving from being a storage company to a systems company, I think the backdrop is a chart very much like this one, and it's really about customer-driven solutions. What problems are customers solving? Where does low-latency persistence become important, and how do you manage that persistence over its life cycle? How do you manage placing the right high-value data at the right point where it's to be consumed? How do you manage it when it becomes warm, when it becomes cold, when it needs to be recovered to a remote site, when it needs to be reflected in additional compute resources added to accommodate changes in workload priority, and so on? Being able to manage those things across all of those disparate ecosystems is, on the one hand, what makes EMC a solutions company, and on the other hand is about delivering the value customers are looking for. Don't make me choose a bunch of different technologies and knit them together; give me an ecosystem that lets me build out the pieces I need and know that they work together.

So that leads me to the whole discussion of competition and differentiation. On theCUBE, we like to present things to our user audience in terms of the decisions they're trying to make.
So they're out there looking at Fusion-io, the new kid on the block; at the systems guys, HP, IBM, and Oracle, who are talking about systems expertise and, in IBM's and Oracle's case, really focusing on database affinity; at the pure plays, the VC-funded startups whose valuations are going through the roof and who are excited because a lot of them are going to get bought and a bunch of people are going to make a ton of dough; and then at the specter of Intel on the horizon. So two questions. One: where does EMC fit, and how do you differentiate? And two: what are the critical success factors in this space?

So I'll start. First, as it relates to flash, there are clearly many different use cases, whether it's all-flash configurations, flash on the storage array, or flash in the server. Flash will play in a lot of different places, going back to the model we were just talking about. And that's clearly one of the things EMC has been doing since 2008 and continues to do: look for opportunities and ways to leverage flash at different places in the IO stack, but always in a complementary way. How do we make sure that, across the complete IO stack, the storage array works in conjunction with the network, with the server, with the application, in a variety of different ways, and delivers all of those data services while at the same time delivering on the performance? So one of the key things is that strategy. We've also talked a little bit about some of the strategies we might have versus the competition: DAS versus cache, having that protection layer, the flexibility. Those are a couple of the things I would point out. We're also not seeing as much pull from some of the other competitors out there. Where is their flash strategy? We see a lot of things from some of the startups, but from some of the systems companies we're not really seeing a broad-based strategy, at least right now.

And you feel, I mean, it's obvious, that you guys are going for it.

Absolutely.

Okay. And Dan, the second question, around critical success factors: from a CTO perspective, what are the things you really need to do well in this market space?

Well, I think the backdrop, as Barry was saying, is that a lot of the traditional elements of an end-to-end system are having their lines redrawn right now. A lot of the traditional boundaries are becoming, to use a great word, semi-permeable. We're seeing elements of compute move close to storage, and elements of storage move close to compute. So with a product like VFCache, you take a step back, look at the technology, and then look at what it's good for. The most important things about VFCache from a technology perspective, in this new, refactored world, are a couple of things. One is that you need a place to stand in the IO stack. And EMC has been in the IO stack for a long time, obviously in storage arrays, but equally prominently on the host side.
With intelligent multipathing, to enable better throughput, better fault tolerance, and more scalability; and with inline replication, local and remote, across all kinds of interconnects and storage types, to enable business continuance and protection: being in the IO stack is our home turf, whether it's in the array or in the server. With Project Lightning, with VFCache, being in the IO stack gives us a place to stand. It gives us a place to look at IO patterns, look at reads and writes, and intelligently intercept that information to find the things that can be accelerated and deliver them to the application.

The next important factor is the caching logic itself. How do you decide what data is hot? How do you decide when to throw cold data out of the cache and replace it with something else? In a product like VFCache, that logic is based on years and years, and millions and millions, of workload analyses across all of our storage platforms, which gives us real insight into how real-world applications behave: the decisions that let you fetch certain data early or eject certain data at other times, the right IO sizes to cache, the right caching granularity. It's a number of things we all learned in computer science 101, applied to real-world applications and real-world problems.

The last important part is this whole notion of highly performant hardware. Barry talked about PCIe flash and some of its innate performance capabilities. It's not just about having flash on a PCIe card; it's about having flash that can achieve high levels of parallel throughput, that can be handling 64, 128, 256 IOs at a time, because that's what real-world applications request of their storage tier. It's about a PCIe flash device that can respond to those kinds of outstanding IOs with very low latency. And it's about doing all of that in the context of the fact that customers buy servers to run applications, so they want to use the DRAM and the compute cycles of those servers for their applications, not to run the flash device. The notion of parallelism and throughput, the notion of latency, and the notion of being CPU- and memory-efficient all come together into a solution that runs real-world workloads.
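As a minimal sketch of the kind of admission and eviction decisions just described: the policy below admits a block only after it has been read twice (filtering out one-touch scans) and ejects the least-recently-used block when space runs out. The threshold and structure are illustrative assumptions, not EMC's actual heuristics.

```python
# Sketch of a cache with reuse-based admission and LRU eviction.

from collections import OrderedDict, defaultdict

class AdmitOnReuseLRU:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()      # lba -> data, maintained in LRU order
        self.seen = defaultdict(int)    # miss counts per lba

    def lookup(self, lba, fetch_from_array):
        if lba in self.cache:
            self.cache.move_to_end(lba)          # refresh recency on a hit
            return self.cache[lba]
        data = fetch_from_array(lba)             # miss: go to the array
        self.seen[lba] += 1
        if self.seen[lba] >= 2:                  # reuse observed: worth caching
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)   # eject the coldest block
            self.cache[lba] = data
        return data
```

The admission filter is the interesting design choice: a block touched once by a backup or scan never displaces genuinely hot data.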
Yeah, I was going to say: spoken like the CTO of a solutions company, right? What you just described is all the different pieces you've got to put together. So my last question is around ecosystem and partnerships. Can you talk about what kinds of new partnerships this innovation is leading you toward, and anything you can share today?

Nothing specific that we can talk about today. I can tell you that many of the workloads VFCache is directed at are the bread and butter of what we've been working with for many, many years: the Oracle apps, the Microsoft apps. These are the same bread-and-butter apps our customers are running mission-critical, and VFCache is the perfect complement to our storage arrays to go after them.

Yeah. Well, we're excited to have Pat on as well; we've got him coming on a little later, and we definitely want to talk to him about the role of Intel in this whole thing, since presumably he's got some connections there. Because we see Intel really starting to think about this in a big way and going after it, and we're seeing totally new ways of thinking about designing applications when we talk to application developers and CIOs. It's a very exciting space.

Dan and Barry, thanks very much for coming inside theCUBE and sharing the details of this announcement. Congratulations, and good luck with it.

Thank you, Dave. Great to see you guys. Really happy to be here. Thank you.

Thanks for watching, everybody. We'll see you next time.