Live from Denver, Colorado, it's theCUBE, covering Commvault GO 2019. Brought to you by Commvault.

Hey, welcome to theCUBE. Lisa Martin in Colorado for Commvault GO 19. Stu Miniman is with me this week, and we are pleased to welcome one of Commvault's longtime customers from the University of Leicester: Mark Penny, systems specialist in infrastructure. Mark, welcome to theCUBE.

Hi, it's good to be here.

So you have been a Commvault customer at the uni for nearly 10 years now. Just to give folks an idea about the uni: you've got 51 different academic departments, about five research institutes (cool research going on, by the way), and between staff and students, about 20,000 folks, I'm sure all bringing multiple devices onto the campus. So you came on board in 2010, which is hard to believe was almost 10 years ago, and said, all right guys, we really have to get a strategy around backup. Talk to us about, way back then, what you were doing, what you saw as an opportunity, and what you're doing with Commvault today.

So at the time there was a wide range of backup products, and there was no real assurance that we were getting backups. We had a bit of Commvault 7 that was backing up the Windows infrastructure, there was Tivoli Storage Manager backing up a lot of the Linux, there was an Amanda open-source setup, and then there were all sorts of scripts and things. So, for instance, VMware backups were done by creating an array snapshot with a script, then mounting that snapshot into another server and backing that server up with Commvault, and the restore process was an absolute pig. It was very, very difficult, long-winded, and required a lot of time and a lot of checks. It really was quite difficult to run and it used a lot of staff time. As far as the corporate servers were concerned, we were exclusively on tape; Tivoli Storage Manager was using disk, and Amanda was again tape, but a different, completely isolated system.

Coupled with this, there had been a lack of investment in the data centers themselves, so the network hadn't really got a lot of throughput. This meant we were using private backup networks in order to keep backup data off the production networks, because there were real challenges over bandwidth contention. Backups also overran, and so on; if you've got a backup running into the working day, it affects the students.

So we started with a blank sheet of paper in many respects and went out to see what was available. There were the usual ones: NetBackup, Tivoli obviously again, Commvault, Arcserve. But what was really interesting was that deduplication was starting to come in, and at the time Commvault 9 had just been released and it had an absolute killer feature for us, which was client-side deduplication. This meant we could now get rid of most of this private backup network that was creating a lot of complexity. It also did backup to disk and backup to tape. So at that point we went in with six media agents and a few hundred terabytes of disk storage. The strategy was to keep 28 days on disk and then do long-term retention on tape into a tape library. We kept that through to about 2013, then took the decision that disk was working, so let's just do disk only and save a whole load of effort; even with a tape library, you've got to refresh the tapes and so on, so keep it all on disk. With the deduplication, we were basically getting a one-to-one.
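As a rough illustration of what client-side deduplication buys you, as Mark describes it: the client hashes each chunk of data locally and only ships chunks the backup target has not already stored, so unchanged data never crosses the network. This is a minimal Python sketch under assumed chunk sizes and interfaces, not Commvault's actual implementation.

```python
import hashlib

CHUNK_SIZE = 128 * 1024  # assumed fixed-size chunks; real products use smarter chunking


def backup_file(path, known_hashes, send_chunk):
    """Client-side dedup sketch: hash each chunk locally, transmit only new chunks.

    known_hashes: set of chunk digests the backup target already holds.
    send_chunk:   callable that actually ships (digest, data) across the network.
    """
    sent = skipped = 0
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest in known_hashes:
                skipped += 1               # only a reference is recorded; no data moves
            else:
                send_chunk(digest, chunk)  # new data is the only traffic on the wire
                known_hashes.add(digest)
                sent += 1
    return sent, skipped
```

Because repeated backups of mostly unchanged data deduplicate back to the same chunks, the back-end footprint stays close to the front-end data size, which is the near one-to-one ratio Mark quantifies next.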
So, to put figures on that, take my current numbers: with about 1.5 petabytes of front-side protected data, we've got about 1.5 petabytes in the backup system, and that's with all the synthetic fulls and everything, with 12-month retention and 28-day retention. It works really, really well in that respect, and that relationship of almost one-to-one between what's in the backup, with all the retention, and the client-side data has been fairly consistent since we went all disk.

Mark, I wonder if we can actually step back a second and talk about the role and importance of data in your organization, because we went through a lot of the bits and the bytes and the pieces there, but as a research organization, I expect that data is quite a strategic component of it. Data forms your intellectual property; it's what is core to your research; it's the output of your investigations.

So where we're doing Earth observation science, we get data from satellites. That is brought down raw as tiny little files. They then get a dataset, which will consist of multiple packages of these files and maybe even different measurements from different satellites. That's then combined and can be used to model scenarios: climate change, temperature, pollution, all these types of things. And it's how you then take that raw data and work with it. In our case we use a lot of HPC, high-performance computing, to manipulate that data, and a lot of it is how smart the researchers are in getting their code to get the maximum out of that data. The output of that becomes a paper, a project and a finalized set of data, which is the result and which all goes with the paper.

We've also done a lot of genetics and things like that, because DNA fingerprinting was down to Alec Jeffreys, and what was very interesting with that one is that it was those techniques which then identified the bones that were dug up under the car park in Leicester, which was Richard III.

That's right, I saw that documentary.

Yeah, and that really was quite exciting, the way that worked. It really was quite fitting that it was techniques the university had discovered which were then instrumental in identifying that.

So one of the interesting things I've found in this part of the market is that you used to talk about just protecting my data. A lot of times now it's about how do I leverage my data even more, how do I share my data, how do I extract more value out of the data. In the 10 years you've been working with Commvault, are you seeing that journey? Is that something organizations are going down?

There's actually two conflicting things here, because researchers love to share their data, but some of the data sets are so big that that can be quite challenging. With some of the data sets, we take other people's data, bring it in, combine it with our own to do our own modeling, and then that goes out to provide something more for somebody else. And there are also issues about where data can exist.
So there are a lot of very strict controls about NHS data. Health data from NHS England, for instance, can't then go out to Scotland, and sometimes the regulatory compliance almost gets sidelined by the excitement about the research. So we have quite a dichotomy: making sure that, where we know about the data, the appropriate controls are there and we understand them, and that people don't just go and put it somewhere it shouldn't be. Some of the data sets for medical research are given to us with personally identifiable information in them, and that then has to be stripped out so that you've got an anonymized data set which they can then work with. It's about ensuring that the right data is used and the right information is removed, so that you don't inadvertently go and expose something. So it's not just pure research, with this going in this silo and that in that silo; it's actually ensuring that you've got the right bits in the right place and that it's being handled correctly.
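As a toy illustration of the stripping Mark describes, here is a minimal Python sketch. The column names, salt and file layout are all made up for the example; a real medical-research pipeline would also have to handle quasi-identifiers and formal anonymization requirements, not just drop the obvious fields.

```python
import csv
import hashlib

# Hypothetical column names; a real data extract will differ.
DIRECT_IDENTIFIERS = {"name", "nhs_number", "date_of_birth", "postcode", "patient_id"}
SALT = b"project-specific-secret"  # kept separate from the released data set


def pseudonymise(patient_id: str) -> str:
    """Stable, non-reversible token so rows can still be linked per patient."""
    return hashlib.sha256(SALT + patient_id.encode()).hexdigest()[:16]


def anonymise(src_path: str, dst_path: str) -> None:
    """Copy a CSV, dropping direct identifiers and replacing patient_id with a token."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        kept = [c for c in reader.fieldnames if c not in DIRECT_IDENTIFIERS]
        writer = csv.DictWriter(dst, fieldnames=["pseudo_id"] + kept)
        writer.writeheader()
        for row in reader:
            out = {c: row[c] for c in kept}
            out["pseudo_id"] = pseudonymise(row["patient_id"])
            writer.writerow(out)
```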
So talk to us about, as you pointed out, this massive growth in data volumes, from a university perspective, a health data perspective, a research perspective; the files are getting bigger and bigger. In the time since you started this foundation with Commvault, the last nine, 10 years, there have been tremendous changes, not just in data but, talking about compliance, you've now got GDPR to deal with. Give us a perspective and a snapshot of your Commvault implementation and how you've evolved it as the data changes, compliance changes and Commvault's technology has evolved.

So if you take where we started off, we had a few hundred terabytes of disk. Just before we migrated to our on-premise S3 cloud libraries, at that point I think I'd got 2.1 petabytes of backup storage. The volume of data is growing exponentially because the resolution of the instruments increases, so you can suddenly have a four-fold growth in your data. But some of those are quite interesting stories. When I first joined, there was great excitement about a project which was just known as BepiColombo, the European Space Agency mission to Mercury. They wanted 50 terabytes, and at that time that was quite a big number, and we were thinking, well, we need to be careful here; okay, 50 terabytes is a big allocation for one project. Now that's probably just to get us going. Not much actually happened with it, then storage systems changed, and they still had their 50 terabytes with almost nothing in it. We then understood that the spacecraft hadn't even been launched, and that once it had been launched, which was earlier this year, it was going to take a couple of years before the first data came back, because it has to go to Venus; it has to go around Venus, in the wrong direction against gravity, to slow it down, and then it goes to Mercury and the real bulk data starts coming back in. You would have thought going to Mercury was dead easy, you just go boom, straight in, but actually if you did that, because of the gravity of the Sun, it would just keep going; you'd never stop, it would go straight into the Sun and you'd lose your spacecraft.

Nobody wants that, it's too expensive.

Nobody wants that. Another really interesting example: have you heard of the Gaia satellite?

Yes.

This is the one which is mapping a billion stars in the Milky Way. It's now gone past its primary mission and it's got most of that data, huge data sets, and that data has already been worked on, but there are other universities tasked with packaging it and cleansing it. We're going to get a set of that data to host. We're currently hosting a national HPC facility which is for space research; that's being replaced with an even bigger, more powerful one that'll probably fill one of our data centers completely. It's about 40 racks' worth, just to process that data, because there's so much information that's come from it, and it's the resolution, it's the speed at which it can be computed and holding so much in memory. If you take it across our current HPC systems, we've got 100 terabytes of memory across two systems, and those numbers were just unthinkable even 10 years ago; a terabyte of memory, you...

So Mark, Lisa and I would like to keep you here all week to talk about space data.

Yeah, exactly, Mark. Two of our favorite topics.

But before we get towards the end, there have been a lot of changes at Commvault: a whole new executive team, they bought Hedvig, they launched Metallic.io, they've got new things. As a long-time customer, what's your viewpoint on Commvault today and what have you been seeing over the last year?

So it's been quite interesting to see how Commvault has evolved, and the change which happened between V10 and V11, when they took the decision on the next-generation platform that it would be, by industry standards, quite an aggressive pace of service packs. Those have come out on that schedule, and to be fair, that schedule has been stuck to, so we can plan ahead and we know what's happening. It's interesting that they're both patches and new features, and it's really great to have that line to work to now. The platform now supports so much natively, and this was actually one of the things which took us round to using our own on-prem S3 cloud library. We were using Azure to put some tier-one data off site, and with all of that working well, we asked: can we do S3 on-prem? It's supported by Commvault; it's just a cloud library. Now, when we first started, that didn't exist. We took the decision to do a proof of concept and so on, and it all worked.
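For a sense of what "it's just a cloud library" looks like from the storage side, here is a minimal Python sketch using boto3 with a made-up on-prem endpoint, credentials and bucket name. An on-prem S3 service simply exposes an endpoint URL and access keys like any public cloud, and this is roughly the manual user-and-bucket setup that, as Mark goes on to say, he would rather have the storage policy drive automatically.

```python
import boto3

# All values below are assumptions for the example, not the university's real setup.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.backup.example.ac.uk",  # on-prem S3-compatible endpoint
    aws_access_key_id="BACKUP_ACCESS_KEY",
    aws_secret_access_key="BACKUP_SECRET_KEY",
)

bucket = "commvault-cloud-library-01"  # made-up bucket name for a new cloud library
s3.create_bucket(Bucket=bucket)

# Confirm the target exists before pointing a backup storage policy at it.
print([b["Name"] for b in s3.list_buckets()["Buckets"]])
```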
Then you've got HyperScale as well, and it's interesting to see how Commvault has gone down to that appliance level too, because people want to be able to just have a box, unpack it and plug it in. If you haven't got a big technical team or strong skills in those areas, why worry about putting your own system together? HyperScale gives you backup in a box. And on the partnerships: we were an HP customer, so we were using Apollos for our S3 storage, and Apollo is actually the platform; if we had bought HyperScale, it would have gone on an HP Apollo as well, because of the agreements we've got in place with HP. It's quite interesting how they've gone from software; the hardware has now come in and it's evolving into this platform. And with Hedvig, I mean, there was already a Commvault object store buried in the product, but it was very discreet and no one really knew about it. You could occasionally see a term appear, but it wasn't something they published. Yet with increasing data volumes, object store is the only way to store these volumes of data in a resilient and durable way.

So Hedvig, buying that and integrating it in, provides a really interesting way forward. From my perspective, I'm using S3, so if we had gone down the Hedvig route, what I would like to see is: I have a storage policy, I click, I point it at S3, and it goes out, provisions the buckets, does the whole lot in a couple of clicks, and that's it, job done. I don't need to go out, create the user, create the bucket and then go and add every little bit and piece in there. It's that tight integration where I see the benefits coming in. It's giving value to the platform and giving the customer the assurance that you've configured it correctly, because the processes and automation in Commvault ensure that every step of the way the right decision is being made. And with Metallic, that's what it's all about: tried and tested products with a very, very smart workflow process put around them, so that with the decisions you make, you don't need to be a Commvault expert to get the outcome and get good backups.

Excellent. Well, Mark, thank you for joining Stu and me on theCUBE, talking about the evolution the University of Leicester has gone through and your thoughts on Commvault's evolution in parallel. We appreciate your time.

Thank you very much.

For Stu Miniman, I'm Lisa Martin. You're watching theCUBE from Commvault GO 19.