And I'm here today with guests from Caringo. We have with us Gene Cheshire. Hi there. Nice to meet you. And Mike Melson. Correct. Pleasure to meet you. Pleasure to meet you. So how's the show going for you guys? Excellent. Is it good for you? Fantastic. Yeah, really good. You got here when? I flew in Monday. OK. Oh, I got here Monday evening also. Yeah, OK. Just in time for the party. Good. Oh, just in time for the party. Absolutely. Yeah.

So Caringo's in the object store business. That's correct. And what's the heritage of the company, just to give us a little background? Sure, sure. Well, Caringo was founded by three gentlemen who have a tremendous legacy in the startup business: Mark Goros, Jonathan Ring, and Paul Carpentier. Their names make up the name Caringo. Paul Carpentier in particular is the father of CAS; he invented what became the Centera product. Right, right. I knew that there was an EMC Centera connection there. There very much is. So he ended up taking the things that he wished he had perhaps done differently, went with 2.0, if you will, and parlayed that into the object storage server that exists currently.

So one of the things I'm interested in is: what were the things they learned in generation one that shaped the generation two CAS products? Sure. One of the things, of course, with Centera, is that the original addressing scheme was based on the MD5 hash of the content. When MD5 became a cracked algorithm, that became insecure, and it no longer held up in a court of law for compliance: you could no longer prove that the content was immutable. They've had to deal with that. So he separated the address from the ability to prove forever that the content was truly never changed. That was one thing. The other was dealing with small files. Centera has a history of perhaps not being real speedy with smaller files, so he fixed that with this product.

The way Centera works, correct me if I'm wrong: in order to deal with billions of small files, they aggregate groups of files and then deal with them as a single object. Is that right? Yeah, and they keep the index for the metadata in their access nodes, in the server, so it's not really stored with the object. With the DX, Dell's implementation of Caringo, we actually store the metadata with the object, physically down on the disk.

OK, and Gene, you're with Dell. I'm with Dell. And your title is? I'm a storage strategist. I work in our PG product development group and in our advanced engineering. We were working with the DX product and Caringo before it was released, in the early stages of bringing in all the integration partners and bringing the solution to market. OK, and were you with Caringo from the beginning? No, I was probably about employee 20. I've been there three and a half years now.

And the relationship with Dell started when? Well, our advanced engineering group effectively went out and tried to analyze the options for this type of object storage. We wanted an object storage platform to have an archive, as part of our longer-term intelligent data management strategy. So it ties in with the other acquisitions that we've had recently: Compellent because of their tiering of storage, Ocarina because of their compression and dedupe, and Exanet because of the file system.
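Going back to the addressing change for a moment, here is a minimal sketch, not Caringo's actual code, just an illustration of the design shift described above: separating the object's address from the proof of its integrity, so the integrity hash can be upgraded without ever changing the address.

```python
import hashlib
import uuid

content = b"immutable archive object bytes..."

# Generation one (Centera-style CAS): the address IS the MD5 of the content.
# Once MD5 collisions became practical, two different payloads could share
# an address, so the address alone no longer proved immutability.
gen1_address = hashlib.md5(content).hexdigest()

# Generation two (the approach described above): the address is independent
# of the content, and integrity is proven separately with a hash that can
# be upgraded (e.g., to SHA-256) without changing the object's address.
gen2_address = uuid.uuid4().hex
integrity_proof = hashlib.sha256(content).hexdigest()

print(gen1_address, gen2_address, integrity_proof)
```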
So one of the components of a complete family of intelligent data management products was to have this intelligent active archive, primarily with the HTTP RESTful interface and the ability to scale to millions of objects. So it was a platform. And we simply looked at the way Caringo had done it compared to everybody else and said, this is what we want; this is effectively the way we wanted to implement the technology. So we've become extremely close technology partners. We really work deeply together.

And you've got a specific device that you ship, right? But are you also looking longer term to embed the functionality across all the platforms? Or is it going to stay as a separate appliance? All of the above is the proper answer. Most everything will come to market initially as individual appliances: the archive store, the compression engine, the Exanet file head. Each of those things will come out that way. But eventually it's very logical to see the compression technology of Ocarina in an EqualLogic controller, and to see the file systems and things get more integrated as you go forward. It's about collecting the proper intellectual property and applying the right thing in the right place. So there is a true architecture that Dell is investing in; our new term for it is the Fluid Data architecture, now that we've got Compellent in the family. But it was conceived as an intelligent data management fabric, so that we have the proper relationship of a data mover, a deduplication engine, workload managers and so forth: we break the way data should be handled down into smaller components and then apply the technologies to move it more efficiently to where it needs to be. It's a long road.

And obviously object store is really important in, sort of, medical records. It's important across all industries in the context of legal and regulatory compliance, anywhere there's really fixed content. CAStor, the Caringo software inside the DX, was built from the ground up to manage fixed content. So yes, medical images, video, audio, any of those types of things. We're seeing that in media and entertainment, of course. A feature-length movie now is 12 terabytes of data, 350,000 files. A feature-length movie is only 12 terabytes? Post-rendering, yes. That's the HD raw that you see in the theater: 350,000 frames after it's rendered, ready for broadcast. Pre-rendering, it's a lot bigger than that. Much, much more than that. So there's an explosion of fixed content, if you will, because once those frames are shot, they're in the can; they stay. They will edit them and create new versions, new renderings. But yes, medical imaging, satellite photos, electronic discovery, e-evidence, all of that is creating tremendous growth in fixed content.

And I've talked to banks who said, and I don't remember if it was Bank of America or one of the others, if you could solve the problem of storing tens to hundreds of billions of check images, so I could retrieve a single check image without having to retrieve a bundle of them, I'd buy the product today. So how deeply are you guys in that space? A lot. You know, the old traditional problem is that banks doing different types of banking would have these millions of little 2K check images, and they'd be laying them down in full file system blocks. Is that how small they are? Yeah, as small as 2K.
And then the smallest block in the file system is like 4K, so you're wasting 50% of the space by putting a 2K image there. The way Caringo lays the software down on the DX, they lay the objects down end to end to end, so there is no wasted space out there. We did a session earlier today where we really got down and did the comparison of traditional file systems versus archive storage, to look at the different efficiencies. When you put an object in an object store, you get back a unique identifier, and that's just the way that you find it. So it's literally one step to go retrieve and pull that object back. If you look at a file system on RAID 5, I don't know, what was it, 12,000 different pointers get hit to pull back one file, compared to one. It's those types of efficiencies. If you're going to add 10 storage nodes a month for 70 years, you've got to be sure you're not wearing out those disks and that you've got a pattern of accessing that data that will last you through the years. File formats and types and operating systems change, but we're native HTTP, and HTTP is probably going to be there a long time. So that's the access method. That's the access method, yeah.

OK, and is there a theoretical limit in terms of the number of objects? Well, we can use a class B network addressing scheme today, so we can logically get to about 65,000 nodes, and we can store about 30 million items per node, depending on their size. So yeah, it's a pretty big number. It's a big number. The storage capacity, as well as the namespace to address that capacity, is fully distributed; it's a symmetrical architecture. So it is hard to predict where that theoretical limit would be with the DX. As you add nodes to the system, it just keeps adding capacity.

Is there an operational limit? Sometimes there are theoretical limits, and then there's the operational, logistical limit. Yeah, like I say, we've looked at scaling. In a mid-release in December, we went to supporting class B versus just class C networks. At first we didn't think anybody needed more than about 250 storage nodes, but if you can get to 65,000 storage nodes with two-terabyte drives, 12 drives per system, you can get to about one and a half exabytes worth of data today. We haven't sold one that big yet, but we'd like to. We'd be pretty happy to find out what that limit is in reality.

So, medical imaging is big. What about the area of social media? Because some of these social media platforms are building a lot of their own technology. Are you seeing opportunities for maybe the tier-two kinds of applications, maybe once it exists in the cloud? Absolutely. Well, the first place is CDNs, content distribution networks. They have a tremendous number of edge servers where they cache content as they deliver it, but all of that needs to come back to a set of origin servers for the original content. And for that golden master, which they may want in two or three strategic geographies, the DX is an absolutely outstanding platform for the origin server. And those would be what kinds of files? In the CDN world they range across everything, from the 2K thumbnails for a social networking site to full feature-length videos that are streamed across the web. Is Netflix a customer of yours? We don't know, okay. I don't know.
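For reference, the efficiency and scale figures quoted in this exchange work out as straightforward arithmetic; the numbers below are the ones from the conversation:

```python
# Block-waste point: a 2K check image stored in a file system with 4K blocks.
image_kb, block_kb = 2, 4
waste = 1 - image_kb / block_kb
print(f"{waste:.0%} of each block wasted")  # 50%

# Scale point: class B addressing, ~65,000 nodes, 12 x 2 TB drives per node.
nodes, drives_per_node, tb_per_drive = 65_000, 12, 2
total_tb = nodes * drives_per_node * tb_per_drive
print(f"{total_tb:,} TB, about {total_tb / 1_000_000:.1f} EB")  # ~1.6 EB

# Object-count point: ~30 million objects per node.
print(f"{nodes * 30_000_000:,} objects")  # roughly two trillion
```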
It seems like they're soaking up a lot of traffic these days. They are. Well, anything that is native HTTP: so many things, starting with cameras and PDAs and everything, produce pictures in a web-native format. You know, one of the big wins for Caringo a long time ago that they talk about was Vodafone in Europe. The content starts out as HTTP, and they were putting it on a file system, going across a block system, replicating to another file system, and then delivering it over HTTP on the back end. If they could go HTTP end to end, from the time you create data until you retrieve it, you've simplified it; every time you go through another format, you have an opportunity to corrupt or lose data. So it's just a matter of having a platform. For all practical purposes, a DX just looks like a web server. It's HTTP-addressable; it's just there. And it scales and is very easy to manage. You just plug in additional nodes, and they boot up and join the cluster. There's no backup and restore, and when you run out of capacity, you don't have to do a forklift upgrade whenever you've filled up a frame or something like that.

And it fits Dell's model. All of our storage products, we work for them to be peer-scalable. It's the same with EqualLogic: when you add more EqualLogic, you get more controllers, more horsepower, more NIC ports, more drives. DX does the same thing. The way Exanet scales is the same way. The way Ocarina will scale will be the same way. So as you add more appliances, so to speak, you grow that power as you go forward.

Your biggest competitor is probably EMC in this space. Is that right, or is it somebody else? Well, we like to think we don't have any competition. Of course you'd like that. But EMC Centera has gone end of life, and that was the original platform. They have Atmos out there today. In both cases, it's heavy lifting to do business with them. They have a very heavy API; you have to totally rewrite your software to take advantage of it. Whereas we're really HTTP 1.1: instead of a write, you do a PUT, and instead of a read, you do a GET. It's just extremely simple, so the porting effort is on a whole different scale. There are a few others out there, but they're making it look like an object store while they have a file system below, or they keep the index in a server instead of really down with the data. So this is the purest implementation we've seen of what we truly call a true object store.

So you would say the wave of the future is access by HTTP? It absolutely is. It'll be there for a long time. That was probably the last of the major differences that Paul Carpentier looked at: a protocol as the API, something that was industry standard, that didn't need a big SDK, so that you could actually point a web browser at the DX and pull your content out if you happen to know the key. And it's being validated out there, of course, by folks like Amazon with S3: HTTP is the way to access cloud and object storage. You can think of it as a private cloud, but then there are people who are learning different use models. It was originally thought of very much as a second or third tier of storage. But again, in medical, if you're reading 100-megabyte files, it's primary storage to them. There's a whole lot of need because of that type of application.
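To make the "PUT instead of write, GET instead of read" point concrete, here's a minimal sketch using Python's requests library. The cluster hostname and path are placeholders for this example, not a documented Dell endpoint:

```python
import requests

# Hypothetical DX cluster endpoint; any node in the cluster can serve requests.
BASE = "http://dx-cluster.example.com"

# "Instead of a write, you do a PUT": store an object under a name we choose.
with open("xray-0001.dcm", "rb") as f:
    resp = requests.put(f"{BASE}/radiology/xray-0001.dcm",
                        data=f,
                        headers={"Content-Type": "application/dicom"})
resp.raise_for_status()

# "Instead of a read, you do a GET": retrieve it with plain HTTP.
# A web browser pointed at the same URL would work, too.
img = requests.get(f"{BASE}/radiology/xray-0001.dcm")
print(img.status_code, len(img.content))
```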
You need some performance characteristics that are fairly significant, right? Well, and that's why there are features such as what we call Darkive, where you have perhaps petabytes of data, but most of it is long tail; it won't be accessed. Medical images will sit there, and you need them available. You don't know when the doctor's going to need that X-ray, but when he does, it's got to be there. So we'll fill up DX nodes and then power them down, and until the content is needed, those disks stay spun down. You get 30% power savings until you need the content, and then it's available almost instantly.

And again, this relationship has been going on for about how long? Probably about three years, I guess. Yeah, two or three years from the time development started. The first Dell release of the product was just over a year ago, in May. Then we had a second feature bump that came out in the December timeframe, where we added some significant features, like the ability to name an object instead of having to use just that unique identifier. So you can actually... So it's got a worldwide name now? Yeah, you can do that. Yeah, yeah. So it made it more applicable to other areas. We've also come out with a file protocol gateway, so that you can have CIFS and NFS access to it through a gateway, because some people aren't completely ready to rewrite their software, but they want the advantages of that long-term archive. So we developed a gateway that, again, allows people to do a normal mount. They lose some features when they do that: if you write native to the application, you can write all your metadata on a per-write basis, so I can make five copies of your email or two copies of mine, or set lifepoints for when it's going to be deleted, anything I want. If you do it through a gateway, you have to set your policies on each mount. So a D: drive might live for seven years, and an F: drive you can make live 19 years or whatever you want. You set static policies, compared to being dynamic. That's about the biggest difference.

If I'm an application developer today with some sort of long-term need where I need immutability, let's just leave it at that, but I'm writing a new application, it's pretty straightforward if I write...? It is. You can write standard HTTP calls, but to make it even easier than that, on support.dell.com there's a complete open-source SDK that has bindings in multiple languages: Java, C#, C++, Python. You can download that and literally be writing content into the DX in a matter of minutes.

OK, but if I was a guy who had already developed an application, and maybe I was integrated with a Centera or Atmos kind of approach, what's the process for making a migration, or making this an alternative? Right, it's fairly simple, and that's where named objects, as we call them, came in. Originally, the first version of the DX only had keys assigned by the DX, which you then had to store in your application. Some applications didn't want that. They said, we create our own name; we even build semantics into the name. It might have an account number, that type of thing. So now, with both naming schemes available, it really is simply a matter of changing your writes from the file system to POSTs over HTTP, and either storing the key you get back or using the name you've already been using all along.
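Here's a hedged sketch of the two naming schemes just described. The assumption that a POST returns the cluster-assigned key in the Location header reflects how such interfaces commonly work and is illustrative, not a quote of the DX API; the endpoint is again a placeholder:

```python
import requests

BASE = "http://dx-cluster.example.com"  # placeholder endpoint
data = b"account statement PDF bytes..."

# Scheme 1: cluster-assigned key. POST the content and store the returned
# key (assumed here to arrive in the Location header) in your application.
resp = requests.post(BASE + "/", data=data)
resp.raise_for_status()
key = resp.headers["Location"]  # your app must remember this key

# Scheme 2 (added in the December release): named objects. The application
# chooses the name, possibly with semantics like an account number baked in.
requests.put(f"{BASE}/statements/acct-1234/2011-05.pdf", data=data)

# Retrieval is symmetrical: GET by key or by name.
print(requests.get(BASE + key).status_code)
print(requests.get(f"{BASE}/statements/acct-1234/2011-05.pdf").status_code)
```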
It really is that straightforward. And so there are a lot of hybrid use cases. They'll write specific application servers to write all this data, but they may have a simple web browser application for people to pull data. If you want to pull back a map or a picture, you just type in the HTTP URL, and you can hide that behind a little plug-in to any type of app. So for the access method, you can use the SDK and write everything you need to with the software, but then retrieval is simple. Very often when I'm doing a demo, I'll write a script or something and store a bunch of things, and then we'll just type in an IP address and the URL, and back comes the picture of the Parthenon or whatever you want to see.

Which actually brings up some interesting use cases that I don't think you would have ever thought of in the storage world. You can actually post HTML and JavaScript into a DX cluster, use it as a web server, and bring those pages back. So there are some things like that. And you can plug a search appliance into it like you would normally. I was going to ask. Full-text index and metadata index the entire storage system, and then have full-text search capability across your entire system.

With some of the unstructured data, particularly around images and stuff like that, hopefully there's been some tagging, so I can do some searching on that. Are there other capabilities in terms of search that are interesting to you, to enable people to search through things that don't have a lot of metadata? Or maybe you use it to create metadata? All of the above. Can you talk about it? Well, let me jump on that first, and then Gene, feel free to chime in. In addition to full-text search, that type of thing, we also have a product called the Content Router. When you write data, you can do custom tagging, as you said. So in addition to standard metadata, such as timestamps and content length, you can add any custom metadata you want. With this product called the Content Router, you can write rules against that metadata to generate lists of content in the system that meet criteria, such as everything tagged "send to the Southwest region", whatever it is. That stream of data is then available to do whatever you want with. By default, we have a system that will replicate it to other DX clusters anywhere in the world. That's why it's called the Content Router: it will route subsets of the data anywhere in the world. But it's an open API, so you can do whatever you can dream up with that API as well.

In our December release, our second release, Dell actually created a set of standardized metadata tags, and we're strongly encouraging, not completely requiring, our ISV partners to write that set of standardized metadata tags down with the object whenever they create it. Medical, for instance, was a big ask for this, because a hospital may have five or six applications that all need to be archived, and the only search was within each individual application. The payroll app, the pharmacy, the cardiology system: each one had its own index, and they couldn't search across all of them. But if they get each of those apps to fill in the metadata fields, then they can search and find all of the instances of Mike Melson, and we can mark him as dead or whatever we need to, there in the metadata.
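As an illustration of custom tagging at write time: the sketch below ships tags as HTTP headers alongside the object, which is the idea described above. The x-app-meta- header prefix is a made-up convention for this example, not a documented Dell/Caringo header name:

```python
import requests

BASE = "http://dx-cluster.example.com"  # placeholder endpoint

# Store an object with custom metadata carried as HTTP headers, so the
# tags travel with the object, physically down on the disk.
headers = {
    "Content-Type": "application/dicom",
    "x-app-meta-patient": "Mike Melson",      # illustrative header names
    "x-app-meta-department": "cardiology",
    "x-app-meta-region": "Southwest",
}
with open("scan-789.dcm", "rb") as f:
    requests.put(f"{BASE}/imaging/scan-789.dcm", data=f, headers=headers)

# A Content-Router-style rule could then match on, say, region == "Southwest"
# to generate the list of objects to replicate to another DX cluster.
```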
We can modify it, we can modify it, you know. But we can say Mike found a cure for cancer, and we can update that across all of the records. Thank you for going positive. Yeah, I appreciate it. Yeah, yeah.

Well, actually, Michael Dell talked a little bit about that in a keynote today: how the medical community has in some ways been very non-science-based, and some of that's driven by privacy, right? Because I'd like to know how it worked out for you when you took that medication, but for privacy reasons, I don't get to. So are you seeing interesting applications? Some of the most cutting-edge things I've seen have come out of conversations with people in medical, because they usually write their own software anyway, so they're not scared of making a bolder move. We were in a conversation with a big medical company out on the West Coast, and they said, well, we could just do HTTP range reads and make it work. And I went, what? Explain that. Well, in HTTP you can read a chunk of a file without having to read the whole file end to end. So they want to write, say, 10 copies of the file, retrieve 10 chunks of that file in parallel, and put them back together, so they're retrieving it 10 times faster than they would from a file system reading it from beginning to end. And we're just going, well, that's a standard part of HTTP. It just exists out there, but it took someone to create that creative application. They want to see their medical images faster than their competitors, so they're out figuring out ways to do that.

The other thing they can do, since we have the ability to write multiple copies (the default is a minimum of two and a maximum of 16, but you can change those): they want to write, say, 15 copies of the object when it's first created, so that a lot of doctors can get to it for the first couple of weeks. Then two weeks later, it has a tag that drops it down to two copies for longer-term retention. That type of metadata manipulation and tagging can be set at the time the object is written, instead of having to be post-processed. They can also have a tag that says delete this file at the end of seven years, so your application doesn't have to go crawl file systems to create deletes. They're just lifepoints that exist. So it's a different kind of tool.

Now, Perot Systems has a lot of expertise in the area of medical, right? Correct. And so how do you three, well, you're not part of Perot, Perot's part of Dell, but how do Caringo, Dell proper, and Perot Systems work together? Very well together. We have a large application, still in prototype stage I think, for Stanford Medical that Perot was involved in supporting out in California. We actually set up DX clusters with TeraMedica, one of our ISV partners. TeraMedica performs the whole PACS layer to connect to all the modalities, and they spool the data onto EqualLogic for their short-term storage; it has to stay there until it reaches a level of maturity, until the doctor has updated it, the pharmacy, the different things. Then it automatically moves through and is stored into the DX with the long-term archive function. And then they replicate between their two hospitals, so they've got complete copies of everything at one and the other. And that was all a Perot integration.
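The range-read trick is standard HTTP/1.1 (the Range header), so it can be sketched with ordinary client code. The URL is a placeholder, and production code would also verify that the server honors Range requests (206 responses):

```python
import requests
from concurrent.futures import ThreadPoolExecutor

URL = "http://dx-cluster.example.com/imaging/scan-789.dcm"  # placeholder
CHUNKS = 10  # mirror the "10 chunks, 10x faster" idea from the conversation

# Find the object's size, then split it into contiguous byte ranges.
size = int(requests.head(URL).headers["Content-Length"])
bounds = [(i * size // CHUNKS, (i + 1) * size // CHUNKS - 1)
          for i in range(CHUNKS)]

def fetch(byte_range):
    start, end = byte_range
    # Standard HTTP/1.1 Range header: read a chunk, not the whole file.
    r = requests.get(URL, headers={"Range": f"bytes={start}-{end}"})
    r.raise_for_status()
    return r.content

# Retrieve all chunks in parallel and reassemble them in order.
with ThreadPoolExecutor(max_workers=CHUNKS) as pool:
    data = b"".join(pool.map(fetch, bounds))

assert len(data) == size
```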
Perot did all of it; there's some significant networking that can get involved in this, and that's where Perot and those people do an awesome job, because they'll go in and consult and analyze and figure out a customer's network, so that they overcome the fears of going through with it. So they've been great partners. And it's fun, because they're smart people and they learn fast. The more you work with them, the more they instantly start coming up with new ideas: I can use this over here, I can use it over there, because they see the use cases evolving.

We were talking to some of the Ocarina team, and when you start thinking about replicating data over distance, some of this data, particularly medical and entertainment, those files get mighty big and suck up a lot of bandwidth. So what's the opportunity? But you've got to preserve content in its immutable state. Correct. How do you guys work together? What's the opportunity for you to work together long term? What can you do, and what can't you do? I think it's a tremendous opportunity when you look at the intelligence of the Ocarina solution. They go about compression and dedupe very, very differently and very, very smartly. So it really is just a beautiful marriage of two very compatible technologies. I think you heard earlier that Ocarina and DX are going to be integrated very soon; the DX will be among the first to integrate the Ocarina technology. I think it makes a lot of sense to bring that out first, because they add so much value to the immutability story because of the way they do it. I appreciate you guys coming on theCUBE.