Live from Nashville, Tennessee, it's theCUBE, covering Commvault GO 2018. Brought to you by Commvault.

Welcome back to Nashville, Tennessee, the home of hot chicken and Commvault GO this week. I'm Stu Miniman with my co-host Keith Townsend. Keith wasn't expecting that one.

I'm looking forward to the hot chicken.

Absolutely, and happy to welcome to the program first-time guest Rahul Pawar, who is the head of research and development at Commvault. Thank you so much for joining us.

Thanks for having me on this one.

All right, we said, like the hot chicken, that we need to roll up our sleeves and really get into the sauce of what we're talking about. All right, enough of the puns from my standpoint. Tell us a little bit about R&D inside Commvault: what's your role, what's your team, what's your charter?

So we have a team of about 650 very dynamic young engineers, and my role, which I'm very excited about, is that I get to talk with a lot of our customers and partners and understand their pain points. The majority of my research comes from what the customer is really looking to do and what is hurting them, and from trying to solve and describe that. Once I have a problem defined, the team is very, very good at solving it, and they come up with various ways to do so. Getting that customer satisfaction high is what gives me a high, and that's really what has kept me at Commvault for over 17 years now.

Yeah, 17 years. Rahul, I think back: 17 years ago I was working for a storage company, and we talked about data, but it was usually about storing data and protecting data. Now we're talking about how we can get more value out of the data. One of the things I was looking at coming into this show is, okay, you talk about the AI and the ML, but how does that fit into the environment? Maybe you can explain why it's different now in 2018. What can you do now that you wouldn't have been able to do 10 years or even five years ago?

So Stu, you made a good point. Backup especially was: make a copy, put it on tape, send it somewhere, typically Iron Mountain. That has changed. Now everything is available online all the time, and even our thermostats are much smarter than they were five years back. So everybody is expecting a lot more from the data that is available, from all the information that is there, and they want to make use of it. Backup can no longer be "hey, I'm backing up these five servers, go figure it out." Backup now means tons of VMs, tons of new applications popping in, and various cloud applications coming in, so the IT team is really in the middle of this data revolution, getting so much information thrown its way. And that data is the liquid gold, as Bob and I like to call it. It has a lot of valuable information: it has information about your patents, about who's accessing what files and whether they should really be accessing them, about what data is really not needed, and about the sensitive data lurking behind that could become a problem for you. So that data is a gold mine, and systems, hard disks, and storage have become so much cheaper that having that data accessible all the time is something we take for granted.

So Rahul, I like to say scale breaks things. When I was a young administrator, I literally had a spreadsheet to keep track of my tapes: where my tapes were, what systems were backed up.
So even if I lost my index in my backup software product, I could still know where my tapes were. Now, with organizations holding petabytes and petabytes of data, how important are ML and AI to knowing where your data is, and how important is the index to that relationship?

I really want to say that ML and AI have become what deduplication was five years back: pretty much everybody expects you to have it. Like I said, if my car knows it, if my home knows it, if my thermostat knows it, if even my phone knows where I'm going every week when I travel to a certain place, then it is something that is expected to be known. And our backup environments have become so dynamic. There are network failures and tons of things beyond the control of the backup admin, or even the storage admins, the DBAs, or the app developers, that just happen, and with all of that going on, you need a system that is learning from what is happening and being very smart about what it does. So we learn from yesterday's failures, or the failures on the backups. We look at the network load that is on right now and the disk load that is on right now, and adapt our backups or backup schedules accordingly. We know your SLAs: you're trying to hit an SLA of a certain number of hours versus minutes, and based on that we prioritize certain servers over others, brand-new VMs over other VMs, and VMs on certain data stores over others, because we want to keep the load on the data store, the ESX server, and even your network and proxies to a minimum. At the same time, we know we are racing against the clock, because we want everything to be backed up, with a secondary copy and all of that. So we are prioritizing and reprioritizing our backups and schedules and everything.

One of the challenges we talk about with automation is that there's the technology and then there's the people. In the open of the keynote this morning, the poet was using the GPS analogy and talked about, oh, okay, you have arrived. Well, the admins today kind of have their turf that they control, versus: do I trust that it's doing the job, so I can automate some of those things and shouldn't have to worry about them? Does your team get involved in that dynamic, since I know you listen to the customers? How do you help bridge that gap? I think of autonomous cars; we will soon get to the point, hopefully in the not too distant future, where it's not that I don't trust the computers, it's really that I trust them more than I do the people.

Okay, so I'll tell you: trust develops as you use it more. There's a reason why autonomous driving cars still have a steering wheel and a brake, because I'm not sure whether I can trust them yet. But as time passes by, you really see the software in action, you see that it's doing the smart thing, and you yield control to it more and more. Today, I'm from the old era: when I have something important, I make an extra copy. Whereas my kids? They are on Google's file system or a cloud file system. They never even think about making an extra copy. The same thing is going to happen here. We do have people who can take control and put in their own priorities and all of that, but we are saying, hey guys, you shouldn't have to do it. We are here to help you, and we are going to show you, and in case you don't like it, you can always put the brake on that self-driving car, or the self-driving backup.
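The SLA-aware, load-aware reprioritization Rahul describes can be pictured as a simple scoring heuristic. The sketch below is only an illustration of the idea, not Commvault's scheduler: the BackupJob fields, the weights, and the score function are assumptions made up for this example.

```python
# Minimal sketch of SLA-aware backup prioritization, assuming hypothetical
# job attributes. This is not Commvault's scheduler; weights are invented
# purely for illustration.
from dataclasses import dataclass

@dataclass
class BackupJob:
    name: str
    sla_hours: float          # target protection window
    hours_since_backup: float
    is_new_vm: bool           # never-protected VMs jump the queue
    datastore_load: float     # 0.0 (idle) .. 1.0 (saturated)
    last_run_failed: bool     # learn from yesterday's failures

def score(job: BackupJob) -> float:
    """Higher score means back up sooner."""
    urgency = job.hours_since_backup / job.sla_hours   # above 1.0 the SLA is at risk
    s = urgency
    s += 0.5 if job.is_new_vm else 0.0                 # protect new workloads first
    s += 0.3 if job.last_run_failed else 0.0           # retry recent failures sooner
    s -= 0.4 * job.datastore_load                      # defer jobs on busy data stores
    return s

def prioritize(jobs: list[BackupJob]) -> list[BackupJob]:
    return sorted(jobs, key=score, reverse=True)

if __name__ == "__main__":
    jobs = [
        BackupJob("sql-prod", 4, 3.5, False, 0.2, False),
        BackupJob("new-web-vm", 24, 2.0, True, 0.1, False),
        BackupJob("file-server", 24, 30.0, False, 0.9, True),
    ]
    for j in prioritize(jobs):
        print(f"{j.name}: {score(j):.2f}")
```

In practice the weights would be learned and re-evaluated continuously rather than hard-coded, which is the "prioritizing and reprioritizing" Rahul refers to.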
So we'd be remiss, with a researcher on theCUBE, if we didn't talk about the art of the possible, looking a few years ahead, or even a couple of years ahead. If you've ever been a backup administrator, you know nothing beats the bandwidth of a station wagon full of tapes. However, in this modern, digitally transformed environment, we have to get data to the cloud as soon as possible. What are some of the unique ways Commvault is tackling getting big data from where it's ingested into the cloud provider, so that we can take advantage of things like AI and ML based workloads in Amazon or Google?

So one thing that we have done with the cloud, or anything else, is that we have always kept the data independent of where it is going. Even if I'm taking data from on-prem to a cloud provider, we will play to their full strengths, but we will still keep the data independent, so that in case you want to move from one cloud vendor to another, you have that flexibility with Commvault. As far as using the cloud and its efficiency, we always send only deduplicated, encrypted data to the cloud, and we have various ways of consuming cloud. Cloud is where your storage has become so cheap that you don't have to think about it. In fact, I had a customer who got rid of their whole secondary DR data center, and now they're using the cloud as their DR location. Every three months they do a DR test with Commvault, where they bring up the infrastructure machines, and it's all scripted and orchestrated. They bring up the infrastructure machines, followed by all the VMs and the applications in a certain order: the database has to come up first, AD has to come up before Exchange, and the database has to come up before the web server. They have SLAs of four hours and 24 hours on certain servers. After all of their testing is done, they power it off, they get rid of the infrastructure, and then they are back to paying only their storage bill on the cloud. That's just one usage, but cloud has made life so flexible that I don't have to think about my rack space, where does this server go, when do I order it, and when does it ship. If I need something, I experiment with it, give it more memory and size, and do stuff. So protecting that data in the cloud, and protecting it well, is what we do. And we have made use of all the technologies, like replicating across regions and replicating across clouds. We have done all of that.

So let's talk about the importance of metadata in all of that. If I have bits and pieces of data distributed across cloud providers and on-prem, how do I keep track of that data?

So that's where our 4D index comes into play, Keith, because what is happening is that the data is spreading faster than some of the cloud growth. You have data with so many copies, and people have made extra copies just to be safe, so keeping track of everything, knowing what is where, who has access to what, and, when people change roles or leave, who still has access after all of that, is very vital and critical for an organization to function. So our index is keeping track of not just the bare minimum of who has the files and what the files are. What we have done is work with several customers to allow them to insert their own custom tags, or custom information, along with the data. So it's not just the file and the file information, or the file content awareness: they are able to keep third-party extra data along with every piece, automatically queried from their other databases and inserted for that file. Those are their custom properties that are tagged along.
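The custom-tag enrichment Rahul describes, pulling third-party attributes out of a customer's own databases and attaching them to each indexed item, might look roughly like the sketch below. The index record layout and the lookup_business_tags function are hypothetical stand-ins, not Commvault's 4D index schema.

```python
# Minimal sketch of enriching an index entry with custom tags pulled from an
# external source. The schema and lookup are hypothetical, purely to show the
# idea of third-party metadata riding along with each indexed file.
from datetime import datetime, timezone

def lookup_business_tags(path: str) -> dict:
    """Stand-in for a query against the customer's own database
    (project codes, data owner, retention class, and so on)."""
    sample = {
        "/finance/q3-forecast.xlsx": {"owner": "finance-team", "classification": "sensitive"},
        "/eng/patents/draft-001.docx": {"owner": "legal", "classification": "patent"},
    }
    return sample.get(path, {})

def index_file(path: str, size: int, acl: list[str]) -> dict:
    """Build an index record: the usual file metadata plus custom tags."""
    return {
        "path": path,
        "size": size,
        "acl": acl,
        "indexed_at": datetime.now(timezone.utc).isoformat(),
        "custom_tags": lookup_business_tags(path),   # third-party attributes ride along
    }

if __name__ == "__main__":
    record = index_file("/finance/q3-forecast.xlsx", 204_800, ["finance-team", "cfo"])
    print(record)
```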
Yeah, it's interesting to think about metadata. I remember five or ten years ago we were talking about the importance of metadata, but it seems like it's the convergence of the intelligence and the AI paired with it. It used to be: make sure you tag your files, or set up your ontologies, and things like that. Now our phones do a lot of that for us, and the enterprise is following a similar methodology. Did we hit a certain kind of tipping point recently, or is it just some of these technologies coming together?

I think a lot of that was in the making. We used to have this technology called index cards, where we were keeping track of things, and who thinks of that now, right? Now everything is by search, and that's the new normal. Searching for your thing, and expecting that something will know what I'm trying to do and tell me ahead of time, is where the future is, and that's what we are trying to keep up with.

You're saying my kids don't know the Dewey Decimal System because they have Amazon, and now we have a similar thing in business.

And it really strikes you with the calculator on a Windows desktop: when the kids go and search on the web for a calculator instead of using the calculator app on the desktop, you really know that things have changed and shifted a lot.

So thinking about that change and shift: with the 4D index, I'm able to add these custom tags to net-new data. I'm going to throw you a softball from a use case perspective, but it's a hard technical challenge. I have 20 years of Commvault data, data that I've backed up with Commvault. Wouldn't it be great if I could teach an ML or AI algorithm to go back and tag that data based on the tags on my new data? Any requests for that, or a road map to add that type of capability?

All right, so if you are a 20-year Commvault veteran customer, first of all, thank you. But secondly, the fact is that your index is there, and we have built on our existing index and added a lot more attributes to it, so we already know a lot about you. And if you are starting to beam to our cloud, we know even more about how your backups are, how much you're backing up, how your licensing is, what the typical workloads are, the top error rates, and how the health conditions are. And that is even on your own server dashboard; you don't have to beam it to any public cloud, you can see all of those statistics on your own dashboard. So we already know all of that information. What we have started doing is inserting more and more pieces of intelligence that we are finding, because things have changed over the last 20 years. What used to be just file metadata, users, ACLs, and all of that now comes with a lot more attributes for each file.
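Keith's question about back-filling tags on 20 years of already-indexed data could, in principle, be approached by learning from freshly tagged items and scoring legacy index entries against them. The toy token-matching below stands in for a real ML classifier; the data and function names are invented for illustration and say nothing about Commvault's actual road map.

```python
# Toy sketch: learn token profiles from newly tagged paths, then suggest tags
# for legacy index entries. A real system would use a trained classifier over
# content and metadata; this only illustrates the back-filling idea.
from collections import Counter

def tokens(path: str) -> set[str]:
    cleaned = path.lower()
    for sep in "/_-.":
        cleaned = cleaned.replace(sep, " ")
    return set(cleaned.split())

def learn(tagged: dict[str, str]) -> dict[str, Counter]:
    """Build a token profile per tag from already-tagged paths."""
    profiles: dict[str, Counter] = {}
    for path, tag in tagged.items():
        profiles.setdefault(tag, Counter()).update(tokens(path))
    return profiles

def suggest_tag(path: str, profiles: dict[str, Counter]) -> str:
    """Pick the tag whose token profile overlaps the path the most."""
    toks = tokens(path)
    best_tag, best_score = "untagged", 0
    for tag, profile in profiles.items():
        overlap = sum(profile[t] for t in toks)
        if overlap > best_score:
            best_tag, best_score = tag, overlap
    return best_tag

if __name__ == "__main__":
    newly_tagged = {
        "/finance/2018/q3_forecast.xlsx": "sensitive-finance",
        "/finance/2018/payroll_aug.csv": "sensitive-finance",
        "/eng/designs/widget_spec.docx": "engineering",
    }
    profiles = learn(newly_tagged)
    for legacy_path in ["/finance/2009/payroll_jan.csv", "/eng/old/widget_notes.txt"]:
        print(legacy_path, "->", suggest_tag(legacy_path, profiles))
```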
Yeah, one of the biggest challenges we see: I'm a networking person, and when I go to, say, the Cisco show this year, most of the network that the network administrator is responsible for isn't under their purview. And I think we have the same thing in data, in that a lot of the data I care about for my business is no longer within my four walls; it's spread out in so many different environments. Opportunity, challenge, or both?

For us, it's very exciting and full of opportunity, and for our customers and a lot of IT admins, if you are dealing with multiple tools to handle that kind of thing, it's a big challenge. I have met several customers, and they wouldn't admit it, but they know that even though their company policy is not to use certain clouds, people are using them. If their company policy is not to use some document-sharing service, people are using it. So there are two ways you could look at it: you could ignore it and take the risk, or you could accept it, analyze everything with Commvault, and move ahead.

So let's talk about Commvault and this ability to know where your data is, alongside an adjacent technology, data protection. It's about protecting the data not just from "oops, I lost my data," or even ransomware specifically, but from a security standpoint. What is the role of the index, or metadata, in protecting your data from intruders?

So as far as ransomware is concerned, we have taken a few approaches. One is, and we are not a ransomware protection product per se, but because we are in there, we look at your backups: how often they happen and how much data is changing, adjusted for seasonality. We know that at quarter end you have a lot of files changing, versus weekends and so on, and how things change. Adjusted for seasonality, if we see something that is out of the norm, we are going to alert you, and that alert is an actionable alert, where you could say, hey, I want to disable data aging on this particular client, or I want to take away someone's access. Then, for data risk like a rogue admin or an accidental admin, we have added almost a two-signature kind of control. If somebody accidentally deletes a client or a storage policy, one admin wouldn't be able to do that alone; the business workflow asks, do you also have authorization from Stu that, hey, Keith is trying to delete this? It's a two-person approval, an email to which you reply yes or no. The moment it is approved, it goes ahead and deletes it; otherwise it stops, because, oops, that was an accident, Keith didn't really want to do it. So there's that aspect. The second thing is our own media. We have protected it completely with our drivers, so that you can't get to it; only Commvault-authenticated processes are able to write to our media. So when the customer came on this morning and was talking about it, all their infrastructure was affected except the Commvault media agents, because we had them secured and the ransomware simply wasn't able to write to them.

All right, well, Rahul, I really appreciate you giving us an update. We look forward to catching up in the future to see exactly where the research is going. For Keith Townsend, I'm Stu Miniman. We'll be back with lots more coverage here from Commvault GO in Nashville, Tennessee. Thanks for watching theCUBE.

Thank you, Keith. Thank you, Stu. Thank you.
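The seasonality-adjusted change-rate alerting Rahul describes above can be pictured along these lines. The per-weekday baseline, the sigma threshold, and the alert text are assumptions made for illustration, not Commvault's detection logic.

```python
# Rough sketch of seasonality-aware anomaly detection on backup change rates.
# Baselines are kept per weekday so a busy Monday isn't compared to a quiet
# Sunday; the threshold and structure are illustrative only.
import statistics
from collections import defaultdict

class ChangeRateMonitor:
    def __init__(self, threshold_sigmas: float = 3.0):
        self.history = defaultdict(list)   # weekday -> list of % files changed
        self.threshold = threshold_sigmas

    def record(self, weekday: int, pct_changed: float) -> None:
        self.history[weekday].append(pct_changed)

    def is_anomalous(self, weekday: int, pct_changed: float) -> bool:
        samples = self.history[weekday]
        if len(samples) < 5:               # not enough seasonal history yet
            return False
        mean = statistics.mean(samples)
        stdev = statistics.pstdev(samples) or 1e-9
        return (pct_changed - mean) / stdev > self.threshold

if __name__ == "__main__":
    monitor = ChangeRateMonitor()
    for week in range(8):                  # normal Mondays: roughly 2-3% of files change
        monitor.record(0, 2.0 + 0.1 * week)
    today = 45.0                           # a sudden mass rewrite looks like ransomware
    if monitor.is_anomalous(0, today):
        print("ALERT: change rate far above seasonal norm; review client and consider disabling data aging")
```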