From the SiliconANGLE Media office in Boston, Massachusetts, it's theCUBE. Now, here's your host, Dave Vellante.

Hi everybody, this is Dave Vellante, and this is theCUBE, the leader in live tech coverage. For this CUBE conversation I'm really excited: Craig Hibbert is here. He's a vice president at Infinidat, where he focuses on strategic accounts. He's been in the storage business for a long time and he's got great perspectives. Craig, good to see you again. Thanks for coming on.

Good to see you, Dave. Good to be back.

So there's a saying: don't fight fashion. You guys fight fashion all the time. You've got these patents, you've got this thing called NeuroCache. Your founder and chairman, Moshe, has always cut against the grain and done things his own way. I'd love for you to talk about some of those things, the patents that you have, the architecture, the NeuroCache. Fill us in on all that.

Sure. So when we go in and talk to customers, we say we have 138 patents. A lot of them say, well, that's great, but how does that relate to me? Many patents are AND/OR gates and things that don't obviously fit into day-to-day life, so I think this is a good opportunity to talk about several of ours that do. Obviously the NeuroCache is dynamic: instead of having a key and a hash, which all the other vendors have, our position in the table alone lets us determine all the values we need from it. But it also monitors, and this is an astounding statement, every I/O that flows through the array from the moment it's powered on. We track that data for the life of the array, and for some of these customers that's five and six years. Are those blocks of data random? Are they sequential? Are they hot? Are they cold? When was each last accessed? This is key information, because we bring intelligence to the lower block layer, where everybody else is just dumb: things come into a queue and get shipped along with no idea of what they are. We do know, and the value is that we can predict when workloads are aging out (this kind of bookkeeping is sketched in code below). Today you have people manually maintaining profiles in tools like Easy Tier or FAST or Tiered Storage Manager or competing products, all of it human intervention. We do it dynamically, and that feeds information back into the array and helps it determine which virtual RAID group data should reside in and where on the disk spindles, based on the age of the application and how it's trending. These are very powerful things in a day and age when information has to reach the consumer immediately. So that's one of the things we do.

Another one is the catalyst for our fast rebuilds. We can rebuild two fully failed 12-terabyte drives in under 18 minutes. If those drives are half full, then it's nine minutes. That comes from understanding where all the data is and spreading the rebuild process across the drives. That's another one of our patents.

Perhaps the most challenging one we have is this: storage vendors tend to do error correction at the Fibre Channel layer, and once data enters the storage array, there is no mechanism to check its integrity. A couple of vendors have an option to do this, but they can only do it for the first write, and they also recommend you turn that feature off because it slows down the box.
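The life-of-the-array tracking Craig describes can be pictured as a small per-block bookkeeping structure that classifies I/O as hot or cold and sequential or random. Here is a minimal sketch under invented names (`AccessTracker`, `BlockStats`, a five-minute hotness window); Infinidat's actual NeuroCache internals are not public.

```python
import time
from dataclasses import dataclass

@dataclass
class BlockStats:
    last_access: float = 0.0   # when this block was last read or written
    access_count: int = 0      # lifetime access count
    sequential_hits: int = 0   # accesses that directly followed the prior block

class AccessTracker:
    """Track every I/O so hot/cold and random/sequential patterns emerge."""
    def __init__(self):
        self.stats = {}        # block address -> BlockStats
        self.prev_addr = None

    def record_io(self, addr):
        s = self.stats.setdefault(addr, BlockStats())
        s.access_count += 1
        s.last_access = time.time()
        if self.prev_addr is not None and addr == self.prev_addr + 1:
            s.sequential_hits += 1   # looks sequential relative to last I/O
        self.prev_addr = addr

    def is_hot(self, addr, window_secs=300):
        """Call a block 'hot' if it was touched within the recent window."""
        s = self.stats.get(addr)
        return s is not None and time.time() - s.last_access < window_secs
```

A tiering engine built on something like this could demote blocks where `is_hot()` returns False and favor prefetching where `sequential_hits` dominates, which is the flavor of decision Craig says the array makes automatically.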
So where Infinidat is unique, and I think this is for me one of the most important patents that we have, is that every time we write a 64K slice into the system, we assign metadata to it. Obviously it has a CRC checksum, but more importantly, it has the locality of reference. So if we subsequently go back and do a reread, and the CRC matches but the location has changed, we know that corruption has happened: somewhere a bit has flipped, one of those things that constitute silent data corruption. And that's not even the most impressive part. At that point we dynamically deduce that the data has been corrupted, and using the parity, we're RAID 6, a dual-parity configuration, we rebuild that data on the fly, without the application or the end user knowing there was a problem, and serve back the data that was actually written. We guarantee that. We're the only array that does that today, and it is massive for our customers.

Yeah, the time to rebuild. You said a 12TB drive; I would have thought... I always joke, how long do you think it takes to rebuild a 30TB drive? Because eventually, you know, it's getting there. It's like a month.

With us, it's the same. So if you look at our 3TB drives, it was 18 minutes. The 4TB drives, 18 minutes. The 6TB, 18 minutes; the 8TB and 12TB, the same. We'll be good all the way up to 28TB drives with the configuration we have.

Now, I want to come back to a conversation we've had many, many times with you guys. We were early on in the flash storage trend, we saw the prices coming down, and we felt the days of high-speed spinning disk were numbered; we were correct in that prediction. But, you know, disk drives have kept their cost advantage, and you guys have eschewed going all-flash because of the economics. Help us understand this, because you've got this mechanical device, and yet you're able to claim performance that's equal to, or oftentimes much better than, a lot of your all-flash competitors. I want to understand that. It suggests to me there's so much other overhead and so many other bottlenecks in the system, and that you're dealing with them both architecturally and through your intelligent software. Can you talk about that a little bit?

Absolutely, absolutely. The software is the key, right? We are a software company, and we have some phenomenal guys doing the software piece. As far as the performance goes, the back-end spinning disks are obfuscated by two layers of virtualization, and because we have massive amounts of DRAM, all of that data flows into DRAM. It will sit in DRAM for an astonishing five minutes. I say astonishing because most other vendors try to evict cache straight away so they've got room for the next I/O, and that gives you no mechanism to inspect those dumb pieces of data. If you get enough dumb data, you can start to make it intelligent, right? You can take discarded data from cell phone towers and find out where people go to work and what time they work, and from that, what demographic they're in, and now you're predicting the election based on discarded cell-tower data. So if you can take dumb data, put patterns around it, and make it sequential, which we do, we write out in a log-structured write, you end up really, really fast at the front end. And some customers say, well, how do you manage that on the back end?
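A minimal sketch of that read-path integrity check, with invented names and a stand-in `rebuild_from_parity` hook: each 64K slice carries a CRC plus the location it was written to, and a read that fails either test is healed from dual parity instead of being returned corrupted.

```python
import zlib

def write_slice(store, lba, data):
    # Persist the slice with its checksum and its locality of reference.
    store[lba] = {"data": data, "crc": zlib.crc32(data), "lba": lba}

def read_slice(store, lba, rebuild_from_parity):
    entry = store[lba]
    data = entry["data"]
    crc_ok = zlib.crc32(data) == entry["crc"]
    location_ok = entry["lba"] == lba   # catches a misdirected or lost write:
                                        # the CRC can match while the slice
                                        # sits at the wrong address
    if not (crc_ok and location_ok):
        # Silent corruption detected: reconstruct from RAID-6 dual parity
        # and heal in place, invisibly to the application.
        data = rebuild_from_parity(lba)
        write_slice(store, lba, data)
    return data
```

This mirrors the logic Craig outlines; the actual on-disk layout and parity math are Infinidat's own.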
Here's something that our designers and architects did very, very well. The speed of DDR3, which is what's in the DRAM right now, is about 15 gig per second. We have 480 spindles on the back end. If you say each one of them can do 100 meg per second, and they can do more than that, about 200, that gives us 48 gigabytes per second of backplane destage bandwidth, roughly three times faster than the DRAM. So when you look at it, the box has been designed all the way through so there is no bottleneck flowing out of the DRAM. Anything still being accessed when it comes out of that five-minute window is destaged to all the spindles, incidentally in a log-structured write, so a write hits all 480 spindles at once. And then you've got the random reads still on the SSD, which helps keep the response time at around two milliseconds. Just one last point on that: I have a customer with 1.2 petabytes written on a 1.38-petabyte box that's still achieving a two-millisecond response time. That's unheard of, because with most block arrays, as you fill them up to 60, 70%, the performance starts going into the tank.

So I've got to go down memory lane here. The most successful storage array in the history of the industry, my opinion, probably fact, was Symmetrix. And when Moshe designed that, he eschewed RAID 5. Everybody was crazy about RAID 5, and he said, no, no, no, we'll just mirror it, and that's going to give us the performance we need. And he would write to DRAM, and then, of course, you'd think that the destage bandwidth was the bottleneck. But because they had such a large number of back-end spindles, the bandwidth coming out of that DRAM was enormous. You just described something actually quite similar. So I was going to ask you, isn't the destage bandwidth the bottleneck? And you're saying no, because your destage bandwidth is actually higher than the DRAM's.

Yep, it is. So with the Symmetrix and typical platforms, you would have a certain amount of disk in a disk group, you would assign FAs and Fibre Channel ports to it, and there'd be certain segments of cache dedicated to those disks. We have done away with that. We have two layers of virtualization at the front, as we talked about, and because we've optimized each component, nothing is a bottleneck. Take the DRAM, and I talked about the SSDs: we don't write heavily over those, we write in a sequential pattern to the SSDs so that their wear life is elongated. And we have all of the virtualized RAID groups configured in cache. So what happens is, as we get to that five-minute window where we're about to destage, the algorithms are telling the cache how to lay out the virtual RAID structure based on how busy the other RAID groups are at the time. So if you were to pause it and ask us where data is going, we could tell you: it's the machine learning, the artificial intelligence, saying this RAID group just took a destage, or there's a lot of data in the cache heading for these, based on the hot-and-cold prediction I talked about a few minutes ago. And so it will make a determination to use a different virtual RAID group, and that's all done in memory as opposed to relying on the disk. So we don't have the concept of spare disks, we have the concept of spare capacity, and it's all shared.
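The back-of-the-envelope math behind that claim, using Craig's own figures:

```python
# Destage-bandwidth arithmetic from the figures Craig quotes.
spindles = 480
mb_per_sec_per_spindle = 100        # conservative; he says ~200 is achievable
backend_gb_per_sec = spindles * mb_per_sec_per_spindle / 1000
dram_gb_per_sec = 15                # his figure for DDR3

print(backend_gb_per_sec)                    # 48.0 GB/s out of the back end
print(backend_gb_per_sec / dram_gb_per_sec)  # ~3.2x the DRAM rate
```

With the conservative 100 MB/s per spindle, the back end can drain roughly three times faster than the DRAM can fill, which is why the destage path is not the bottleneck.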
And because it's all shared, it's this very powerful pool that just doesn't get bogged down, and it continues to operate all the way up to full capacity.

So I'm struggling with this "there is no bottleneck." There's always a bottleneck in the system. So where is the bottleneck?

The bottleneck for us is when the array's full. So it's if you overrun the maximum bandwidth, and historically, in 2016, 2017, that was roughly 12 gig per second. We upped it in the fall of 2018 to around about 15, and we're about to announce that we've made tectonic increases, where we'll now have write bandwidth approaching 16 gig per second and read bandwidth of about 25 gig per second. That 16 is going to move up to 20. Remember what I said: we release a number and we gradually grow into it and maximize and tweak the software. When you think that most all-flash arrays can do maybe one and a half gig per second of sustained writes, that gives us a massive leg up over our competition. And instead of buying an all-flash array for this, another mid-tier array for that, and cold storage for this, you can just buy one platform that services it all, all the protocols, all accessed the same way. So you write to the API one way. Mark Shuttleworth is a big fan of this, of writing code once, obviously with Spinnaker and some of those other things he's been involved in. We do the same thing: our API is the same for block as it is for NAS as it is for iSCSI (there's a sketch of the idea below). So it's very consistent. You write it once and it carries across multiple products.

I want to pick your brain about customers for a bit. Everybody talks about digital transformation, and it's this big buzzword, but when you talk to customers, they're all going through some kind of digital transformation, or they want to get digital right, let's put it that way. They don't want to get disrupted. They see Amazon buying grocers and Apple getting into financial services and content, and it's all about the data. So there's a real disruption scenario going on for every business, and the innovation engine seems to be data. Okay, but data just sitting there in a data swamp is no good. You've got to apply machine intelligence to that data, and you've got to have scale. So you guys make a big deal about petabyte scale. What are your customers telling you about the importance of that, and how does it fit into that innovation sandwich I just laid out?

Sure, no, that's a great question. So we have some very big customers. We now have customers with 70 petabytes in production.

Seven-zero?

Seven-zero, yep. We have a couple of those, both financial institutions, very, very good at what they do. We worked with them previously on another of Moshe's products, XIV, which introduced the concepts of self-healing and no tuning and things like that. We haven't even talked about that, that there are no tuning knobs on the Infinidat; I probably should mention that. But our customers have said to us: we couldn't scale before. We had a couple-hundred-terabyte boxes that were okay; you've raised the game by bringing a much higher level of availability and much higher capacity. We can take one of our boxes, and I'm in this process right now with a customer, and collapse three VMAX 20s or VMAX 40s onto it.
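A hypothetical sketch of that "write it once" idea: one provisioning call where the protocol is just a parameter. The endpoint and field names here are invented for illustration and are not Infinidat's actual REST API.

```python
import json
import urllib.request

def create_volume(base_url, name, size_gb, protocol):
    """One call shape for all three access protocols (hypothetical API)."""
    assert protocol in ("fc", "iscsi", "nas")
    body = json.dumps({"name": name, "size_gb": size_gb,
                       "protocol": protocol}).encode()
    req = urllib.request.Request(f"{base_url}/api/volumes", data=body,
                                 headers={"Content-Type": "application/json"})
    return urllib.request.urlopen(req)

# The same calling code covers all three protocols:
# create_volume(url, "db01",   1024, "fc")
# create_volume(url, "logs01",  512, "iscsi")
# create_volume(url, "home01", 2048, "nas")
```

The design point is that automation written once against such an API does not have to branch per protocol, which is the consistency Craig is claiming.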
On numerous occasions we have gone into establishments that have 11 or 12 23-inch cabinets, two and a half thousand spindles of the old EMC VMAX estate, and we've replaced it with one 19-inch rack of ours, right? That's phenomenal when you think about it. And think what that was paid for: some of these VMAX 40s have up to 192 Fibre Channel ports; we have 24. So there's the Fibre Channel port reduction, and the power, heating, and cooling go from an entire row down to one eight-kilowatt draw. By the way, our power consumption is the same whether the drives are three, four, six, eight, or 12 terabytes; they all use the same power plant. So as we increase the geometry and capacity of the drives, we decrease the cost per usable terabyte. We're actually far more efficient than an all-flash array. We're the most environmentally friendly hybrid spinning-disk storage on the planet.

So let me ask you about cloud. When cloud first came into vision, the financial services guys said no, cloud is a bad word. They're definitely leaning into it now, adopting it more. But still, there are a lot of workloads they're going to leave on-prem, and they want to bring that cloud experience to the data. What are you hearing from the financial services customers in particular? I single them out because they're very advanced, they're very demanding, and they've got a lot of dough. What do you see in terms of them building cloud, hybrid cloud, and what it means for them, and specifically for the storage industry?

Yeah, so I'm actually surprised that they've adopted it as much as they have, to be honest with you, and I think the economics are driving that. But having said that, whenever they want to get the data back, or bring it back on-prem for various reasons, that's when they run into problems, right? It's like, how do I get my own data back? Well, you've got to open up the checkbook and write big checks. So I think Infinidat has a nice strategy there, where the same capabilities you have on-prem, you have in the cloud, and don't forget, nobody else has that. One of the encumbrances to people moving to the cloud has been that it lacks the enterprise functionality people are used to in the data center. But because our cost point is so affordable, we become very attractive not only for on-prem but for cloud solutions as well. And of course, we have our own Neutrix cloud offering, which allows people to use it for DR or replication, however you want to do it, where you can use the same APIs and code that you run in your data center and extend that out to the cloud, which is very helpful. And consider snapshots: if you take a snapshot on Amazon, it may take four hours, because it's being copied over to an S3 device; that's the only way they can make it affordable. And then if you need that data back, it's not immediate: you've got to rehydrate from S3 and copy it back over your snapshot. With Infinidat, it's instantaneous. We do not stop I/O when we take snapshots, another one of the patents. We use a time-synchronous mechanism: every I/O that arrives has a timestamp, and when we take a snapshot, we just pick a point in time; any I/O with a timestamp greater than that instantiation point belongs to the volume, and anything previous belongs to the snapshot. We can do that in the cloud. We can instantly recover hundreds of terabytes worth of databases and make them instantly available.
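A minimal sketch of that timestamp-based snapshot mechanism, using a logical clock in place of real timestamps; the class and method names are invented. The key property is that a snapshot is just a recorded instant, so nothing is copied and no I/O is paused.

```python
import itertools

class Volume:
    def __init__(self):
        self.clock = itertools.count(1)  # logical timestamps stand in for time
        self.writes = {}                 # lba -> list of (timestamp, data)

    def write(self, lba, data):
        # Every arriving I/O is stamped; old versions are retained.
        self.writes.setdefault(lba, []).append((next(self.clock), data))

    def snapshot(self):
        # The instantiation point: record an instant, copy nothing.
        return next(self.clock)

    def read(self, lba, as_of=None):
        # Latest write overall, or latest stamped before the snapshot instant.
        versions = self.writes.get(lba, [])
        if as_of is not None:
            versions = [v for v in versions if v[0] < as_of]
        return versions[-1][1] if versions else None

# vol = Volume()
# vol.write(7, b"v1")
# snap = vol.snapshot()
# vol.write(7, b"v2")
# vol.read(7)              # -> b"v2"  (the live volume)
# vol.read(7, as_of=snap)  # -> b"v1"  (the snapshot view, instantly available)
```

Because the snapshot is only a timestamp comparison, recovery is as fast as reads, which is the "instantly available" behavior Craig describes.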
So our story, again, with the innovation: it wasn't just for on-prem, it was to be facilitated anywhere you are, and that same price point carries forward from here into the cloud. When Amazon and Microsoft wake up and realize that we have this phenomenal story here, I think they'll be buying from us in leaps and bounds; it's the only way to make cloud storage affordable.

So the interesting thing is you talk about bringing data back, bringing workloads back, and there are tool chains that are now on-prem, Kubernetes is a great example, that are cloud-like. And so when you bring data back, you want to have that cloud experience, so automated operations plays into that. You know, automation used to be something people were afraid of. They wanted to do manual tiering; remember, they wanted their own knobs to turn. Those days are gone, because people want to drive digital transformations; they don't want to spend time doing all this heavy lifting. Can you talk about that a little bit, and where you guys fit?

Yeah. You know, I say to my customers too, not to knock our competition, but you can't have a service processor as the intermediation point between what the customer wants and the array deciding when it's going to talk and reconfigure. It has to be instantaneous. And so we don't have any Java, we don't have any Flash, we don't have massive host servers around the data center collecting information. We just have an HTML5 interface, and so our time to deployment is very, very quick. When we land on the customer's dock, the box goes in, we hook up the power, we put the drives in. I hate to use the word VTOC because it brings back bad memories for a lot of customers, volume table of contents, for those who remember; now we're going back in time, right, to the mainframe. And so we're very dynamic, both in how we face the customer and on the back end for ourselves. We eat our own dog food, in the sense that we have an automation team, and we've automated our migrations from non-Infinidat platforms to ours; that uses some level of artificial intelligence. We've also built a lot of integration around things like ServiceNow, because you can do with our API what takes other people page after page of code.

I'll give you an example. One of our customers said, I need OCI, the NetApp management product, to support this. We called NetApp, and they said, hey, listen, you know, it usually takes six months to get an appointment, and then it takes at least six months to do the code. And we said, no, no, no, we're not like any other storage array. We don't have all these silly RAID groups and spare-disk capacity; there are three commands we can show in the API. We showed them, and they said, wow, can you send us an array? We said, no, we can do something better. We were designed SDS, right? When Infinidat was coded, there was no hardware, and the reason we did that is that software developers will always code to the level of resilience of the hardware. So if you take away that hardware, the developers have to code to withstand any type of hardware that comes in. Only at the end of the coding process did we start bringing in the hardware pieces. So because we were written SDS, we can send vendors and customers an OVA, a virtual appliance of our box. They were able to do the integration in a week.
They told the customer, we still have to go through full QA, but there's no reason why it wouldn't work. And they did it for us and got it done; it was a massive customer of theirs and ours. That's a powerful story: the time to deployment for your home-grown apps, as well as things like ServiceNow and OCI, is incredible with Infinidat. Three API calls and we were done.

So you guys had a little shadow partnership with NetApp in the field.

We did, yeah. I mean, it was great. They had a massive license with this particular customer, they wanted our storage under the platform, and we worked very, very quickly with them; they were very accommodating. We'd love to get our storage qualified behind their heads right now for another customer as well, so yeah, there's definitely some synergy there. People realize what we have. Splunk's massive for us: what we're able to do with Splunk in one box, the competitors can't do in a row. So it's very compelling what we bring and how we do it. And that API layer is incredibly powerful, and we're utilizing it ourselves. I would like to see some integration with Canonical; Mark Shuttleworth and his guys have done a great job with SDS plays. We'd like to bring that here, do Spinnaker, do Galactic Fog, do some of those things as well that we're working on with the automation team. We just added another employee, another FTE, to the automation team at Infinidat. So we do these things, we engage with customers, and we help you get out of that trench that is antiquity and move forward into the vision of doing one thing well so it permeates the cloud: on-prem, off-prem, hybrid, all those kinds of things.

Well, that API philosophy that you have and that infrastructure-as-code model you just described allow you to build out your ecosystem in a really fast way. So Craig, thanks so much for coming on and doing that double-click with us. Really appreciate it. I'd love to have you back.

Great, thanks a lot, Dave. Thank you.

You're welcome. And thank you for watching theCUBE. This is Dave Vellante. We'll see you next time.