theCUBE at EMC World 2014 is brought to you by EMC: Redefine. VCE: innovating the world's first converged infrastructure solution for private cloud computing. And Brocade: say goodbye to the status quo and hello to Brocade.

Okay, welcome back everyone. We're here live in Las Vegas for EMC World 2014. This is theCUBE, our flagship program. We go out to the events and extract the signal from the noise. I'm John Furrier, founder of SiliconANGLE, joined by my co-host Dave Vellante of Wikibon.org. Our next guest is Miroslav Klivansky. Did I get that right?

You got it.

All right, he's with XtremIO. Hey, welcome to theCUBE.

Thank you. Happy to be here.

So one of the things about all-flash arrays is, I don't believe you. The doubting Thomases are the people saying that's all BS: show me the numbers. Dave and I were just talking about that last segment. Flash is in prime time right now; people are loving it. And you guys have a million-dollar contest, not a giveaway, if someone does something. So tell us, how do you test this stuff? If it's a new animal I've never seen before, how do I evaluate its success? If I've got apples over here and oranges over there, or dogs and cats, what do I do?

Well, you know, doing apples-to-apples comparisons of storage arrays is always a little tricky. But when it comes to testing flash, it's even more different from testing a spinning-disk array than most people are used to. Flash is a different beast. I'm sure this has been covered in a bunch of different segments over the years, but it behaves differently when it's new versus when it's been in production for a while. The latency tends to be really low, it's a random-access medium, and all of that has implications for how you go about testing the array.

So what's the biggest thing you focus on right now? Do you build a comprehensive suite? How do you build it from the ground up?
How do I trust it? All of the above are questions I'd want to ask you.

Great. So there are a lot of different tools out there for evaluating storage. What we've noticed over the last few months, since we've been generally available, is that we're getting increasing requests from customers to help them figure out how to evaluate an all-flash array they're looking to purchase, whether it's from XtremIO or they're evaluating different vendors. So what we've done is come up with a suite of tools, and that suite is built entirely on freely available software. We built things around Vdbench. I'm not sure how familiar you are with it, but it's a tool that was originally developed at Sun and is now owned by Oracle. It's part of the Storage Performance Council suite and is used in SNIA work, so it's a very neutral tool for evaluating storage. And we liked it because it meets a number of great criteria. One of the most important is that it's very savvy about generating data that works with storage-efficiency algorithms, whether you need to dial in a certain dedupe and compression factor or you want to avoid triggering dedupe and compression by accident. So that's why we selected Vdbench as the load generator, but we've done a bunch more around it: we've written a number of scripts that implement the IDC methodology for evaluating all-flash arrays.

What's the biggest difference, bottom line, between flash arrays and the old way? What are the big issues to worry about?

Well, one is how the performance looks out of the box. If you test a spinning-disk array out of the box, it's going to look pretty much like how the array will always look. But when you test a flash array brand new out of the box, the performance is going to be artificially high.
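The dedupe/compression dialing described above can be sketched as a generated Vdbench-style parameter file. This is a minimal illustration, not the actual toolkit: the device path is a placeholder, and the ratios and run length are example values.

```python
def vdbench_params(device="/dev/sdX", dedup_ratio=2, comp_ratio=2,
                   read_pct=70, xfersize="8k", seconds=600):
    """Build a minimal Vdbench parameter file that dials in a target
    dedupe/compression factor, so the generated data cooperates with
    (or deliberately defeats) the array's storage-efficiency features."""
    return "\n".join([
        # General parameters: control dedupability/compressibility
        # of the generated data stream.
        f"dedupratio={dedup_ratio}",
        "dedupunit=8k",
        f"compratio={comp_ratio}",
        # Storage definition: the raw device under test (placeholder path).
        f"sd=sd1,lun={device},openflags=o_direct",
        # Workload definition: fully random access with the chosen read mix.
        f"wd=wd1,sd=sd1,xfersize={xfersize},rdpct={read_pct},seekpct=100",
        # Run definition: drive at max rate for the elapsed interval.
        f"rd=rd1,wd=wd1,iorate=max,elapsed={seconds},interval=5",
    ]) + "\n"

print(vdbench_params())
```

The key design point is that plain zero-filled test data would dedupe and compress almost completely and flatter any efficiency-aware array; dialing explicit ratios keeps the comparison honest.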
And it's important to precondition that all-flash array, which basically means making sure you've written to all of the flash cells, and making sure the RAID stripes, or whatever the data-protection scheme is, have been aged so they represent what the space will look like after it's been in production for a few months.

So it's a false picture, if you will, right out of the box, because you've got to do a lot of preparation to really understand the media. Is that what you're saying, or the capabilities?

I think the capabilities of the media change over time, and if you evaluate it when it's brand new, you're going to get a very false impression of the capabilities of that box. Once the media ages in, the performance of flash can drop to a third of its brand-new, out-of-the-box performance. So the preconditioning is really important, but another factor is that flash is fast. You can throw a bunch of flash into a box, and there are solutions out there that do that. But just because the flash is fast doesn't mean it's going to be fast under all conditions, under all circumstances, under heavy load as well as light load. So if all you're doing is throwing a couple of VMs on there and lightly tickling the box, you're not going to get a real impression of what the array can do.

Can we go back to the objectives of the testing? From a customer standpoint, what are the objectives? Are you trying to understand raw performance, performance on specific workloads, durability, all of the above?

I think all of the above, but where we seem to get a lot of questions is around characterizing the performance of the array and comparing arrays, and looking at the consistency of the behavior and performance. People buy flash arrays and pay premium dollars for them because they really care about low latency, and they want that low latency to be predictable.
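The preconditioning step described here, writing every cell before measuring anything, can be sketched in miniature. This is a toy illustration against a regular file, not the toolkit's procedure; real preconditioning targets the raw device, takes hours, and typically overwrites the full capacity more than once with incompressible data.

```python
import os

def precondition(path, capacity_bytes, passes=2, block=1 << 20):
    """Toy preconditioning pass: overwrite the full capacity with random
    (incompressible, non-dedupable) data so that every cell has been
    written and garbage collection is active before measurement begins."""
    for _ in range(passes):
        with open(path, "wb") as f:
            written = 0
            while written < capacity_bytes:
                n = min(block, capacity_bytes - written)
                f.write(os.urandom(n))  # random data defeats dedupe/compression
                written += n
            f.flush()
            os.fsync(f.fileno())  # force the writes out of the page cache
    return os.path.getsize(path)
```

Only after a pass like this does the array show its aged, steady-state behavior rather than the artificially high fresh-out-of-box numbers.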
That low latency needs to keep showing up no matter what you throw at the array. If there are spikes in demand or changing workloads, latency still needs to stay relatively low, and performance needs to be predictable across a broad range of workloads.

Okay, so the objective was not a hero benchmark; we'll set that aside right away. You set out to develop something a customer could actually use to predict the performance of the array in their environment.

Yes. And the toolkit we developed around Vdbench tries all sorts of different workloads, with an emphasis on random access, because that's how people tend to deploy flash arrays. One of the things that's really cool about the toolkit is that it automatically analyzes the data and generates the graphs for you. You get graphs and tables that show how the array will perform, and it makes it really easy to compare one platform against another.

So let's talk about the platform comparison. Am I trying to compare different all-flash arrays? Or an all-flash array against an old disk subsystem, a modern disk subsystem, a hybrid, any of those?

Well, it's a toolkit; you can point it at pretty much any storage array. It was really meant for testing and comparing different all-flash arrays, so it emphasizes random performance. If you were going to test, say, a spinning-disk array for streaming media, you'd probably test it in a slightly different way. So even though it has an emphasis on all-flash, you could use it for a spinning-disk evaluation. You could, say, characterize your existing array and then characterize the all-flash array you're considering, compare the two, and see how it'll do.

So am I right, Miroslav, that when I look at such comparisons, I'm going to zero in on writes? Writes are always the problem.
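The side-by-side tables the toolkit's auto-analysis produces might be summarized roughly like this; the result format and the two metrics chosen here (mean IOPS, worst-case latency) are assumptions for illustration, not the toolkit's actual output.

```python
def compare(results):
    """results: {array_name: {workload: [(iops, latency_ms), ...]}}
    Collapse raw samples into a comparison table: one row per array per
    workload, with mean IOPS and worst-observed latency."""
    rows = []
    for name, workloads in sorted(results.items()):
        for workload, samples in sorted(workloads.items()):
            iops = [s[0] for s in samples]
            lats = [s[1] for s in samples]
            rows.append((name, workload, sum(iops) / len(iops), max(lats)))
    return rows

# Hypothetical samples for two arrays on the same random 8K workload.
table = compare({
    "array_a": {"rand_8k": [(100000, 0.4), (90000, 0.6)]},
    "array_b": {"rand_8k": [(50000, 1.5)]},
})
for row in table:
    print(row)
```

Reporting the worst-case latency alongside the average is what exposes an array that is fast on average but unpredictable under load.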
That's where the big old bottlenecks exist. So I presume with your testing you can vary the write intensity, is that right?

Well, yeah, and it's actually programmed to automatically try different read/write ratios and different IO sizes.

Ten, twenty, thirty, forty, fifty percent, if I want to go nuts.

Exactly. And we built this to be an enterprise-grade, reliable information tool. So honestly, it's not like a ZDNet benchmark you can download and run on your laptop. It's meant to be set up and run for two or three days of heavy, IO-intensive load on the array. It will characterize a lot of different combinations and give you information about how the thing will perform under stress in an enterprise environment.

So I'm envisioning the metrics. I'm sure there are a gazillion metrics, but I'm simplifying: I want to see IOPS, I want to see latency, and I want to see how they change. We saw some graphs today; Joe, I guess, or maybe more David, was showing some. So what have you been able to find in applying it? I couldn't agree with you more: customers want consistent performance from a latency standpoint, and they want, I guess, sub-two-millisecond latency. I think Joe even said sub-one-millisecond. Is that what you're prescribing in the field?

That's what we find tends to be a good match for all-flash arrays. When someone actually has an application need for consistent sub-millisecond or sub-two-millisecond latencies, we're a good fit. And along with that raw performance, we bring all sorts of data services: thin provisioning, deduplication, really awesome VAAI-enabled performance for virtualization. All of those are data services that come along with the great, consistent performance.

We have a question from the crowd I want to ask you here. This is Dave Vellante with a question from the crowd: ask about native flash architectures versus caching controllers.

Well, that's a good question, right?
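The automatic sweep over read/write ratios and IO sizes, plus a simple predictability check against the latency ceilings being discussed, might look like this. The specific grid values and the 99th-percentile criterion are assumptions for illustration, not the toolkit's actual settings.

```python
def workload_grid(read_pcts=(10, 20, 30, 40, 50, 70, 100),
                  io_sizes=("4k", "8k", "64k")):
    """Enumerate the read/write-ratio x IO-size combinations a multi-day
    sweep would drive, one run definition per combination."""
    return [(r, s) for r in read_pcts for s in io_sizes]

def consistent(latencies_ms, ceiling_ms=2.0, pct=0.99):
    """Predictability check: True if the 99th-percentile latency stays
    under the ceiling (sub-2 ms here, matching the conversation)."""
    xs = sorted(latencies_ms)
    idx = min(len(xs) - 1, int(pct * len(xs)))
    return xs[idx] < ceiling_ms
```

An array can post a great average while failing this check; it is exactly the occasional multi-millisecond outlier that the percentile test catches and the average hides.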
Because if I have a caching controller, I'm going to asynchronously trickle to disk and signal, to use a mainframe term, "device end" when I hit the disk. Flash is different, right? How so? Talk about that.

It is different, and probably one of the biggest differences is that flash, as a random-access medium, is great for your entire data set. With caching controllers, part of what happens is you have a working set, and when that working set is relatively small, it fits in cache and everything is golden. But working sets change; they shift over time. As you add and remove applications, the working set is going to be different. So when you start experiencing problems because the working set no longer fits into cache, you can either replace your array with something that has a larger cache, or you can consider an all-flash array. Many of the people coming to us, or coming to look at all-flash arrays, have applications where the working set is very large. And when you're evaluating an all-flash array, you also want to test across a very large data set, where the working set might not fit into a cache.

I know we've got to leave it there, but where do I get this benchmark? How do I get it?

It's up on the EMC Community Network. We have the URL and can share it with you; it's publicly accessible right now. We've created some how-to videos so everyone can use it, and we have pre-built VMs that make it really easy to get started.

Fantastic. We love what you guys are doing; we love data. We actually geek out on data in our data science operation, so we know what this is all about. We need an engine to run it. Thank you for coming on; I really appreciate your commentary. Preconditioning that flash and understanding the capabilities: a big part of it. You've got to know the engine and the car you're driving, know the machinery, before you push the envelope. This is theCUBE. We'll be right back after this short break.

Great.
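The working-set-versus-cache effect described in this segment can be demonstrated with a small LRU simulation: hit rate stays near 100% while the working set fits in cache, then collapses once it outgrows the cache. This is a toy model of a generic caching controller, not any vendor's implementation.

```python
import random
from collections import OrderedDict

def lru_hit_rate(cache_blocks, working_set_blocks, accesses=50_000, seed=1):
    """Simulate uniform random reads over a working set through an LRU
    cache of fixed size; return the fraction of accesses served from cache."""
    rng = random.Random(seed)
    cache = OrderedDict()  # insertion/recency-ordered: front = least recent
    hits = 0
    for _ in range(accesses):
        blk = rng.randrange(working_set_blocks)
        if blk in cache:
            hits += 1
            cache.move_to_end(blk)       # refresh recency on a hit
        else:
            cache[blk] = None            # miss: fault the block in
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict least recently used
    return hits / accesses
```

With a 1,000-block cache, a 500-block working set hits almost every time, while a 100,000-block working set hits only about 1% of the time; on an all-flash array the media is uniformly fast, so there is no such cliff to fall off.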