Live from the Wigwam in Phoenix, Arizona, it's theCUBE, covering Data Platforms 2017. Brought to you by theCUBE.

Hey, welcome back, everybody. Jeff Frick here with theCUBE. We are at the historic Wigwam Resort, 99 years young, just outside of Phoenix at Data Platforms 2017. I'm Jeff Frick here with George Gilbert from Wikibon. He's been co-hosting with me all day, getting through to the end of the day. And we're excited to have our next guest. She is Kellyn Gorman, technical intelligence manager in the office of the CTO at Delphix. Welcome.

Thank you, thank you so much.

Absolutely. So what is Delphix, for people who aren't familiar with it?

Most of us realize that the database, and data in general, is the bottleneck, and Delphix completely revolutionizes that. We remove it from being the bottleneck by virtualizing data.

So you must love this show.

Oh, I do, I do. I'm hearing about all kinds of new terms that we can take advantage of.

Right, cloud native, separating storage from compute, and I think just the whole concept of atomic computing: breaking it down into smaller parts. Sounds like it fits right into your guys' wheelhouse.

Yeah, I kind of want to containerize it all and be able to move it everywhere, but I love it, yeah.

So what do you think of this whole concept of DataOps? We've been talking about DevOps for, I don't know how long we've been talking about DevOps, George.

Five years, six years, a while.

Yeah, a while. But now...

Actually, maybe eight years.

You're dating yourself, George. But now we're talking about DataOps, right? And there's a lot of talk of DataOps. This is the first time I've really heard it coined in such a way where it really becomes the primary driver in the way that you deliver value inside your organization.

Oh, absolutely. You know, I come from the database realm. I was a DBA for over two decades, and DevOps was a hard sell to a lot of DBAs. They didn't want to hear about it.
I tried to introduce it over and over: the idea of automating, and taking us out of the kind of manual intervention that, many times, introduced human error. So DevOps was a huge step forward in getting that out of there. But the database, and data in general, was still this bottleneck. So DataOps is the idea that you automate all of this, and if you virtualize that data, we found with Delphix that that removed the last hurdle. My session was on virtualizing big data: the idea that I could take any kind of structured or unstructured file and virtualize that as well. And instead of deploying it to multiple environments, I was able to deploy it once and actually do I/O on demand.

So let's peel the onion on that a little bit. What does it mean to virtualize data? And how does that break a database's bottleneck on the application?

Well, right now, when you talk about relational data or any kind of legacy data store, people are duplicating that through archaic processes. So if we talk about Oracle, they're using things like Data Pump. They're using transportable tablespaces. These are very cumbersome. They take a very long time, especially with the introduction of the cloud. There are many opportunities for failure. It's not made for that, especially as the network becomes our last bottleneck, which is what many of these folks are feeling too. When we introduce big data, many of these environments, many of these, I guess you'd say, projects came out of open source. They were done as a need, as a necessity to fulfill, and they've got a lot of moving pieces. And to be able to containerize that, and then deploy it once, and then virtualize it. So instead of, let's say, you have 16 gigs that you need to duplicate here, and over and over again, especially if you're going on-prem or to the cloud, I'm able to do it once, and then do that I/O on demand and go back to a gold copy, a central location. And it makes it look like it's there.
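The "deploy once, do I/O on demand against a gold copy" idea described here is essentially copy-on-write sharing. Here is a minimal toy sketch in Python (an illustration of the concept, not Delphix's actual implementation): each virtual copy reads through to one shared gold copy and stores only the blocks it changes.

```python
class GoldCopy:
    """The single source copy; its blocks are shared and never modified."""
    def __init__(self, blocks):
        self.blocks = dict(enumerate(blocks))

class VirtualCopy:
    """A thin, writable view: reads fall through to the gold copy,
    writes land in a small private overlay (copy-on-write)."""
    def __init__(self, gold):
        self.gold = gold
        self.overlay = {}  # only this copy's changed blocks live here

    def read(self, block_no):
        return self.overlay.get(block_no, self.gold.blocks[block_no])

    def write(self, block_no, data):
        self.overlay[block_no] = data  # gold copy stays untouched

gold = GoldCopy(["alpha", "beta", "gamma"])
dev = VirtualCopy(gold)   # provisioned instantly, ~zero extra storage
qa = VirtualCopy(gold)

dev.write(1, "beta-changed")
print(dev.read(1))       # beta-changed  (private to dev)
print(qa.read(1))        # beta          (unaffected)
print(len(dev.overlay))  # 1 -> storage cost is only the delta
```

Every consumer gets a full read-write copy, but the storage cost per copy is just its own edits, which is why a 16-gig file can be "deployed" to many environments in under a minute.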
I was able to deploy a 16-gig file to multiple environments in less than a minute. And then each of those developers has their own environment, each tester has their own, and they actually have a full, robust, read-write copy. That's amazing to folks. All of a sudden they're not held back by it.

So our infrastructure analyst and our Wikibon research CTO, David Floyer, talks about this. If I'm understanding this correctly, it's almost like a snapshot, a read-write snapshot, although you're probably not going to merge it back into the original. And this way dev, test, and whoever else wants to operate on live data can do that.

Absolutely. It's full read-write, what we call data version control. We've always had version control at the code level. You may have had it at the actual server level, but you've rarely ever had it at the data level, for the database or for flat files. What I used was the CMS.gov data. It's available to everyone; it's public data. And we realized that these files were quite large and cumbersome. I was able to reproduce it, enhance what they were doing at Time Magazine, and create a use case that made sense to a lot of people, of things that they're seeing in their real-world environments.

So tell us more. Elaborate on how DevOps expands on this. I'm sorry, not DevOps, DataOps. I'm still stuck eight years ago.

Yeah.

Take that as an example and generalize it some more, so that we see how, if DBAs were a bottleneck, they can now become an enabler.

One, it's getting them to embrace new skills. Many DBAs think that their value relies on those archaic processes: it's going to take me three weeks to do this, so I have three weeks of value. Instead of saying, I am going to be able to do this in one day, and those other resources are now also valuable because they're doing their jobs. We're also saying that data was seen as the centralized pain point, and people were trying to come up with solutions to those pain points.
We're able to take that out completely, and people are able to embrace agility. They have agile environments now. DevOps means that they're able to automate that very easily, instead of having that stopping point of constantly hitting that data and saying: I've got to take time to refresh this. How am I going to refresh it? Can I do just certain pieces? We hear about this all the time with testing. When I go to testing summits, they are trying to create synthetic or virtualized data. They're creating test data sets that they have to manage, and it may not be the same as production. Where I can actually create a container of the entire development or production environment and refresh that back, people are working on the full product. There's no room for the kind of error you would see if you were just taking a piece of it, or if you were only able to grab one tier of that environment because the data was too large before.

So would the automation part be the generation of one or more snapshots, and then the sort of orchestration and distribution to get it to the intended audiences?

Yes. And we would use things like Jenkins and Chef. Normal DevOps tools work along with this, along with command-line utilities that are part of our product, to allow people to create what they would create normally; but many times that's been siloed and, like I said, worked around the data. We've included the data as part of that, so they can deploy it just as fast.

So a lot of the conversation here this morning was really about putting the data all in S3, or pick your favorite public cloud, to enable access for all the applications through APIs, through all different types of things. How does that impact what you guys do, conceptually?

If you're able to containerize that, it makes you capable of deploying to multiple clouds, which is what we're finding. About 60% of our customers are in more than one cloud, two to five, exactly.
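The automation flow just described, one snapshot followed by orchestrated distribution to each consumer, can be sketched roughly like this. All names below are illustrative stand-ins, not a real Delphix, Jenkins, or Chef API; an actual pipeline would drive the vendor's command-line utilities from the CI tool.

```python
# Hypothetical snapshot-then-distribute pipeline (illustrative only).

def take_snapshot(source):
    """Simulate capturing a point-in-time gold copy of the source."""
    return {"source": source, "snapshot_id": f"{source}-snap-001"}

def provision_virtual_copy(snapshot, owner):
    """Simulate projecting the snapshot as a thin read-write copy."""
    return {"owner": owner, "from": snapshot["snapshot_id"]}

def refresh_pipeline(source, consumers):
    """One snapshot, many near-instant copies: no per-consumer duplication."""
    snap = take_snapshot(source)
    return [provision_virtual_copy(snap, c) for c in consumers]

envs = refresh_pipeline("prod-oracle", ["dev-anna", "dev-raj", "qa-1"])
print(len(envs))        # 3
print(envs[0]["from"])  # prod-oracle-snap-001
```

The point of the shape is that the expensive step (the snapshot) happens once, while the per-consumer step is cheap, so adding a tenth developer environment costs the same as adding the first.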
As we're dealing with that and recognizing that, it's kind of like looking at your cloud environments like your phone providers. People see something shiny and new, a better price point, a lower dollar. We're able to support that: one, by saving all that storage space, because it's virtualized, it's not taking a lot of disk space. Second of all, we're seeing them say, you know, I'm going to go over to Google. Oh, guess what? This project says they need the data, and they need to actually take that data source over to Amazon now. We're able to do that very easily, and we do it for multi-tier environments, flat files, the legacy data sources, as well as their application tier.

Now, when you're doing these snapshots, my understanding, if I'm getting it right, is that it's not a full Xerox. It's more like a delta: if someone's doing test-dev, they have some portion of the source of truth, and then as they make changes to it, it grows to include the edits until they're done, in which case then the whole thing's blown away.

It depends on the technology you're looking at. Ours is able to track that. So when we're talking about a virtual database, we're using the native recovery mechanisms; kind of think of it as a perpetual recovery state inside our Delphix engine. So those changes are going on, and then you have your VDBs, which are a snapshot in time that they're working on.

Oh, so you take a snapshot, and then it's like a journal. The transactional data from the logs is continually applied.

Of course, it's different for each technology. So we do it differently for Sybase versus Oracle versus SQL Server, and so on and so forth. Virtual files, when we talk about flat files, are different as well. With the parent, you take an exact snapshot of it, but it's really just projecting that NFS mount to another place. So with that mount, if you replace those files or update them, of course, then you would be able to refresh and create a new snapshot of those files.
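The snapshot-plus-journal mechanism described here, a base snapshot with transaction-log changes continually applied, can be modeled with a toy key-value "database." This is a deliberate simplification: a real engine replays native redo or transaction logs, but the shape is the same — keep one base image, append log records, and materialize any point in time on demand.

```python
# Toy model of snapshot + log replay (not a real database engine).

base = {"balance": 100}      # snapshot taken at t=0
log = [                      # journal of changes after the snapshot
    (1, "balance", 150),
    (2, "balance", 120),
    (3, "owner", "kellyn"),
]

def view_at(t):
    """Rebuild the state as of time t: base snapshot + log replay."""
    state = dict(base)
    for ts, key, value in log:
        if ts <= t:
            state[key] = value
    return state

print(view_at(0))  # {'balance': 100}
print(view_at(2))  # {'balance': 120}
print(view_at(3))  # {'balance': 120, 'owner': 'kellyn'}
```

Each VDB is then just one of these point-in-time views made writable, while the engine keeps ingesting new log records behind it.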
So somebody said: we refresh these files every single night. You would be able to then refresh and project them out.

So it's almost like you're sub-classing them.

Yes.

Okay, okay. Interesting. So when you go into a company that's got a big data initiative, where do you fit in the discussion? You know, in the sequence, how do you position the value-add relative to the data platform that is sort of the center of the priority of getting a platform in place?

Well, that's what's so interesting about this: we haven't really talked to a lot of big data companies. We have been very relational over a period of time, but our product is very much a Swiss Army knife. It will work on flat files; we've been doing it for multi-tier environments forever. It's that our customers are now going: I have 96 petabytes in Oracle, and I'm about to move over to big data. So I was able to go out and say, well, how would I do this in a big data environment? I found this use case being used by Time Magazine, and then created my environment and did it off of Amazon. But it was just a use case, a proof of concept that I built to show and demonstrate that. And my guys back at the office are going, Kellyn, when you're done with it, you can just deliver it back to us.

Yeah. All right, Kellyn. Well, thank you for taking a few minutes to stop by; a pretty interesting story. Everything's getting virtualized: machines, databases. Soon enough. Soon, George, right? Not me, George. All right. Thanks again, Kellyn, for stopping by.

Thank you so much.

All right, I'm with George Gilbert. I'm Jeff Frick. You're watching theCUBE from Data Platforms 2017 in Phoenix, Arizona. Thanks for watching.