Hello everyone. My name is Fabrizio Fresco. I joined HP one year ago, and I work in the Professional Services team based in Galway, Ireland. We're here today to speak about Freezer. It's a fairly new project, launched by HP.

A little bit of history. When HP started to implement its own public cloud, a lot of people, in different teams, were working on the deployment. As with a lot of newly implemented things, they started out not very well synchronized, and each one of the teams was solving its own backup problem in a very different way. This was causing problems: a lot of duplicated code, a lot of effort, maintenance problems, bugs, and all the classical operational issues. Then they started running checks on the consistency of the backups every couple of months, and they got very bad news: as you might expect, there were many inconsistencies, and when they tried to restore, not everything worked the way they were expecting.

So we started to investigate existing solutions, and we couldn't find any cloud-oriented backup application available at the time. So we started development, and we started to coordinate things between the different teams: engineering, operations, ops automation, and so on. We were embracing the open-source community and we wanted to be more present, so this project has been open source since the beginning, reusing and leveraging all the functionality and libraries that OpenStack provided us.

At that point our customers (I am in Professional Services, so we have customers) were having exactly the same problems, and they liked the idea. They started making requests, giving feedback, and testing solutions. They wanted Windows support, for example, and they pushed really hard on the idea of disaster recovery capabilities, and all of this motivated us quite a lot.

So in our public cloud we started the tests, and the first problem we addressed is a classical one: our CI/CD platform. It's quite big, mainly for internal use, and each node has something like 400k files and half a terabyte of data. It's the classical difficult case for backups, because the change rate is quite high, so the incrementals are a challenge. At the same time, some customers wanted to solve quite similar problems, mainly telcos using the cloud, at the moment for internal use, for NFV and so on, but also as storage for their really critical data. Their feedback was really important for us; it was what pushed us to put a lot of effort into this project. As of today we have it deployed in Helion 1.1.1, and the inclusion of Freezer in Helion 2.0, which is currently in development, is under evaluation.

Out of all these requests, and because we wanted to learn from our own mistakes, we started to design the architectural goals we wanted to achieve. The idea was to have one unique solution for the OpenStack infrastructure, for the virtual machines, and even for ordinary computers, let's say laptops, workstations, whatever. Consistency is completely our focus, because from our own mistakes we realized that having a lot of backups of useless data is not worth it.
One thing where OpenStack couldn't help us very much: we wanted to move the workload away from the infrastructure, because on the customer side there are big deployments with a lot of data, and generating the backups during the night, all at the same time, resulted in a denial of service on the internal OpenStack control nodes. We have a lot of object storage, so from the beginning we wanted to use that for the backups. It's redundant, it's easy to distribute over multiple sites, so we thought that it was the way to go. And we love OpenStack, and we wanted to do something to help bring OpenStack to the users, not wait for the users to come to OpenStack, or leave OpenStack only to the operators and not the end users.

And so the requests and the challenges grew. They started asking us to provide a point-in-time backup of the entire infrastructure. This, as you can imagine, is a big problem, because it's a lot of data, a lot of I/O, a lot of transfers; it's not an easy problem to solve. And then, even better, they wanted to be able to restore everything, or at least the more critical pieces of the infrastructure, in another data center in case of a disaster in the first one.

Once this was defined, we realized how big the problem is from the technical perspective, because OpenStack, as I said, doesn't help a lot: there is no automated way of generating consistent backups, whether of volumes or images or whatever. The file systems won't help us a lot either, because you can sync the file system, but you can't forget about the applications. For the easy backups there are a lot of solutions on the market, in the open-source world or wherever, but we wanted to solve more than only this simple problem.

So we started to implement different strategies, from the simplest, which as I said is flushing and syncing the file system and then backing up a simple directory tree. Then we started to do something better: we started to interact with the application, which can be your database or whatever, to flush what's held in memory and take more consistent backups. This obviously comes with a cost, it's not free. The main idea is to leverage the potential of the files, because everything is a file in the end: interact as little as possible with the application, and then work on the files directly.

The next step was to have multiple choices over the efficiency of the solution, because in some cases we want very fast backups with low CPU usage, and we thought that the tool that really does this is tar. It's an old technology, it's well tested, and it's been there for twenty years or more. But customers wanted a more advanced solution: they wanted something more bandwidth- and size-efficient, and after a lot of brainstorming we came to the point that the rsync way of working was probably the most intelligent one. It's quite a different strategy, a lot more CPU- and memory-intensive, and obviously slower.

We had a request about automating the image backups with Nova and Cinder, and Glance didn't help a lot. The only thing that's possible to do there is to create an image from a VM, and Glance has no idea what's happening inside the VM, so on its own this is sometimes completely useless as a backup, because not even the file system gets synced inside the VM. This is very often a big problem.
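As a rough illustration of the simple strategy above (flush and sync the file system, then back up a directory tree with tar into object storage), here is a minimal sketch in Python, assuming the python-swiftclient library; the function, credentials, and names are illustrative assumptions, not Freezer's actual code:

    import os
    import subprocess

    import swiftclient  # python-swiftclient, assumed installed

    def simple_tar_backup(path, container, obj_name,
                          auth_url, user, key, tenant):
        os.sync()  # flush dirty pages so the archive sees settled data

        # Stream tar+gzip to stdout instead of a temp file:
        # no local space is needed on the machine being backed up.
        tar = subprocess.Popen(["tar", "-czf", "-", path],
                               stdout=subprocess.PIPE)

        conn = swiftclient.client.Connection(
            authurl=auth_url, user=user, key=key,
            tenant_name=tenant, auth_version="2")
        conn.put_container(container)
        # put_object accepts a file-like object, so the archive is
        # streamed straight into Swift without touching the disk.
        conn.put_object(container, obj_name,
                        contents=tar.stdout, chunk_size=65536)
        tar.stdout.close()
        tar.wait()

This covers only the fast tar mode with no application interaction; the consistency and rsync modes discussed here are where the real work is.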
On the Cinder side it's a little bit more advanced, because Cinder already provides backups: it can do compression, a kind of deduplication, and recently even encryption. But again the problem was the same: no consistency inside the VM. So for our main use cases we started from the most-used applications. Think about how many LAMP platforms are implemented inside OpenStack these days: MySQL was a classical problem to be addressed. There are some solutions on the market, but none is automatic and easy, and none is cloud-oriented; we wanted object storage for storing our data.

For MySQL, after a lot of evaluation, we realized that probably the best way to go is to interact with the database: connect to it, lock it, flush the tables, flush the file system, and take a snapshot of the file system. At that point we can remove the lock and everything goes on quite well. Strictly speaking there is no downtime, only a small interruption of the writes. Sometimes even this is not acceptable, but then another setup can do a lot better: you can have, let's say, a standby replica dedicated to backups, take the data from there, and never interrupt your service.

MongoDB, for example, is quite a lot easier, because it has an internal journaled way of working, so we know that flushing and syncing the file system is enough to have the binary files in a consistent state. On Windows we have the big problem of SQL Server, but with the same strategy we used for MySQL it was mostly solved: you can lock your database, flush it, take your snapshot using VSS, which is much the same thing as LVM on Linux, and work from there. We are working on Elasticsearch; it's widely used, and we have it in Helion by default. Again, the same strategy in different terms, the same as MySQL: we can lock the database, take a snapshot, and work on the binary files. We have now been successfully using Freezer for almost a year in our public cloud, and it's working.

We are planning the implementation of the OS X and BSD clients; we have some ideas, but we are still defining the way to go. It was probably a mistake not to start with Cinder and Nova from the beginning, but we finally realized that this is important too. We have almost completed the backup leveraging Cinder: again, we have our client inside the virtual machine, so we can work through the application and then create the backup in Cinder. Without this kind of approach it's very difficult to have useful backups with Cinder; there isn't even a way to have that done automatically by default. You can work through Horizon, but you have to go and take the snapshot yourself, and you can't, at the same time, sync the file system inside and then create the backup from the snapshot; for a lot of virtual machines that's a big effort. Obviously you can work through the APIs, but then you need to orchestrate everything, write your own scripts, and synchronize all the steps by yourself. So we have that working already; it's in the merge process and it's going to be there very soon. We have even talked about doing more than what Cinder does with the volume during the backup, because we received some requests about having the chance to manage the encryption keys outside of the cloud: the customers want to manage them by themselves.
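To make the MySQL flow described above concrete (lock, flush, snapshot, unlock), here is a minimal sketch, assuming the pymysql driver and an LVM-backed data directory; the volume names and snapshot size are illustrative assumptions, not Freezer's actual implementation:

    import os
    import subprocess

    import pymysql  # assumed driver, for illustration only

    def consistent_mysql_snapshot(host, user, password,
                                  vg="vg0", lv="mysql"):
        conn = pymysql.connect(host=host, user=user, password=password)
        cur = conn.cursor()
        try:
            # Quiesce writes and force table data out of memory.
            cur.execute("FLUSH TABLES WITH READ LOCK")
            os.sync()  # flush the file system as well

            # Copy-on-write snapshot: only a small interruption of writes.
            subprocess.run(
                ["lvcreate", "--snapshot", "--size", "1G",
                 "--name", lv + "_snap", "/dev/{}/{}".format(vg, lv)],
                check=True)
        finally:
            cur.execute("UNLOCK TABLES")  # writes resume immediately
            conn.close()
        # The backup can now read the binary files from the snapshot
        # volume at its leisure, then remove it with lvremove.

The same shape applies to SQL Server with VSS in place of LVM, and to Elasticsearch; for MongoDB the lock step is unnecessary thanks to its journal.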
So here the idea was that after creating the snapshot, we can create an image from the volume and then download the image from Glance. This is not optimal, but since it's impossible to get the volumes directly out of Cinder, it's the only way to go; we couldn't figure out better ideas. And with Glance we have to do everything on the outside, because we can create a snapshot of the image, but it's not encrypted, it's not compressed, no deduplication, nothing of this. So we create the image from the VM, download it into Freezer, do our processing, and store the backup in Swift.

How does Freezer work in reality? There is usually no space to store anything inside the VM, and doing so would be illogical anyway: it means a lot of extra I/O, which is really something we didn't want to happen. So we leverage multiprocessing and pipelines a lot; we use Python pipes, so each process or thread does its piece of work, like the compression or the encryption or the deduplication, and then passes the stream through the pipe to the next process, which does the subsequent work in the chain, and at the end the data is copied directly into the object storage (there is a sketch of this below). This has big advantages: almost no additional space is required inside the machine being backed up, only the snapshot, which is really minimal; no additional space is required for the restore of the backup; and the memory usage is quite low. It's not zero, but the world is not perfect.

For the rsync way of working we had really big problems, because the standard rsync algorithm (of which there is no Python implementation anyway) uses both copies of the data, the old one and the entire new file to be restored, since it relies heavily on deduplication: each chunk of data is stored only once and can be applied multiple times inside the file being restored. So it was almost impossible to proceed that way. We struggled a lot to find a solution, and so far the best idea we came up with is to split each big file into blocks of data, by default 50 megabytes each, and inside each of these blocks we apply the rsync algorithm, which takes 16-kilobyte blocks and calculates their checksums; we store all this information in the metadata (this is also sketched below). Afterwards we are able to compress, encrypt, deduplicate, and do all the other things the rsync algorithm allows us to do. This is not optimal; we are still thinking about better solutions, and we have a few ideas that we are trying to turn into a proof of concept to see whether they can actually be implemented.

In the big picture, how does Freezer work? It can work alone: you install the client on the machine to be backed up, and it simply stores your backups in Swift, on a mounted NFS file system, or over SSH to a machine with storage attached. But we wanted more than this, obviously, so we started implementing the APIs, and with these APIs we were able to implement a nice user interface to make things even easier for the end user. So here on the left side you have the Freezer client, which could be anything, as I said: a VM, a desktop, a Windows machine, or a physical node. If you have the entire architecture implemented, you register the Freezer APIs in the Keystone catalog and deploy the client with a very basic configuration, mainly the username, the tenant ID, the password, and the Keystone endpoint. At that point the client will auto-register through the APIs, so it will show up in your user interface, and from there you can create and apply your detailed configuration.
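Here is a minimal sketch of the pipe-chained multiprocessing described above, using only the Python standard library; the stages and names are illustrative, not Freezer's actual internals. Each stage runs as its own process, reads from the previous pipe, does one job, and writes to the next, so the data streams through without needing local space:

    import multiprocessing
    import zlib

    END = b""  # empty message used as an end-of-stream sentinel

    def reader(path, out_conn):
        # Stage 1: read the file in chunks and feed the pipe.
        with open(path, "rb") as f:
            while True:
                chunk = f.read(65536)
                if not chunk:
                    break
                out_conn.send_bytes(chunk)
        out_conn.send_bytes(END)

    def compressor(in_conn, out_conn):
        # Stage 2: compress whatever flows through, never touching disk.
        z = zlib.compressobj()
        while True:
            chunk = in_conn.recv_bytes()
            if chunk == END:
                break
            out = z.compress(chunk)
            if out:  # zlib may buffer and emit nothing yet
                out_conn.send_bytes(out)
        tail = z.flush()
        if tail:
            out_conn.send_bytes(tail)
        out_conn.send_bytes(END)

    def uploader(in_conn):
        # Stage 3: in Freezer this would stream into object storage;
        # here we just count the bytes to keep the sketch self-contained.
        total = 0
        while True:
            chunk = in_conn.recv_bytes()
            if chunk == END:
                break
            total += len(chunk)
        print("streamed", total, "compressed bytes")

    if __name__ == "__main__":
        r1, w1 = multiprocessing.Pipe(duplex=False)
        r2, w2 = multiprocessing.Pipe(duplex=False)
        stages = [
            multiprocessing.Process(target=reader,
                                    args=("/etc/hostname", w1)),
            multiprocessing.Process(target=compressor, args=(r1, w2)),
            multiprocessing.Process(target=uploader, args=(r2,)),
        ]
        for p in stages:
            p.start()
        for p in stages:
            p.join()

An encryption or deduplication stage would slot into the chain the same way as the compressor, which is what keeps both the disk and memory footprint low.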
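And a sketch of the block-splitting idea described above, with the 50-megabyte segments and 16-kilobyte rsync blocks mentioned in the talk; the weak/strong checksum pairing mirrors the rsync algorithm, but the function and names are illustrative:

    import hashlib
    import zlib

    SEGMENT_SIZE = 50 * 1024 * 1024  # 50 MB segments, the stated default
    BLOCK_SIZE = 16 * 1024           # 16 KB rsync blocks

    def segment_signatures(path):
        # Yield per-segment metadata: (weak, strong) checksums per block.
        # The weak checksum is cheap to compare; the strong one guards
        # against collisions, as in the rsync algorithm.
        with open(path, "rb") as f:
            seg_index = 0
            while True:
                segment = f.read(SEGMENT_SIZE)
                if not segment:
                    break
                blocks = []
                for off in range(0, len(segment), BLOCK_SIZE):
                    block = segment[off:off + BLOCK_SIZE]
                    blocks.append((zlib.adler32(block),
                                   hashlib.md5(block).hexdigest()))
                yield {"segment": seg_index, "blocks": blocks}
                seg_index += 1

A later backup then only needs to ship the blocks whose checksums changed, which is why this mode is far more bandwidth- and space-efficient than tar, at the price of CPU, memory, and speed.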
We are using Elasticsearch as the storage for the configurations, the events, and all the metadata of the current backups, and we implemented a dashboard in Horizon for OpenStack; it's already there.

What we have so far is a simple list of what's already implemented. We can back up and restore from Swift. We have the MongoDB journaled-database functionality, MySQL, and SQL Server. We have the APIs, which are working smoothly; we are still working on them, adding more functionality, and once we have a functionality in the API we can implement it in the user interface. We have the Windows client working in both ways, the tar mode and the rsync mode. We have encryption; we use SSL, and whatever kind of encryption we want, we can use. We have the incremental and differential backup policies, and the automatic removal after the expiration period is gone. We are actually working on the Cinder backups: it's there, but we need a lot more testing. We have just started the implementation for Glance and Nova images.

In our priorities there are more functionalities: the OS X and BSD implementations; the capability of storing the backups on NFS or over SSH; backing up multiple file systems or directory trees on different file systems in one pass; concurrency prevention on the client; and probably Oracle and SAP. We are even trying to have a collaboration with other products, like the internal HP one, Data Protector, and we really want to have a mobile client implementation done. A very nice idea would even be a workload integration between different technologies, but it's not easy to achieve.

Some screenshots of the actual user interface. As I said, we use Elasticsearch behind it, which allows us to create nice analytics about how things are going: the number of successful backups, the failures, the size of the files contained in the backups, and some nice charts. This is the main view: you can see all your machines, with the name, who is executing the backup, when it was added to Freezer, and some more additional information, such as the number of backups and whether there are broken links. This is obviously in heavy development, so things are changing quite quickly.

This is the configuration management. As I said, a new machine will pop up automatically on this dashboard, and you can create new configurations, or you can select one of the configurations, like the Elasticsearch one here, and you will see all the machines the configuration is applied to and the status of the last backup; in this case there was no backup done yet. And you have the button to be able to restore the backup on the machine automatically.

Here is the configuration creation. It's quite simple: the configuration name; the container where the data is stored in Swift; the mode of the backup, whether it's a simple file system or a database or whatever you're willing to back up; and the source directory where the data to back up lives. You can choose the way of working, as I said before, the fast one or the space-efficient one; you can choose the compression algorithm and the encryption. Here all your Freezer clients will show up, and you can apply the configuration to whatever clients you want, plus the scheduling: when the backups are going to be taken, at which hour, and after how much time. And there are some more additional options, like excluding some file patterns, where to store the logs, or whether you need a proxy to access your object storage. That's almost it.

Here is where the code is: we are in the OpenStack incubator, on Launchpad, and you can install Freezer directly
from the PyPI repository. And obviously, if you want to give some feedback, or have bright ideas, or you're willing to join the team and help us out, you're very welcome. That's it. If you have any questions...

Q: Can you restore at the file level?
A: Yes. It's not yet implemented in the UI, but if you do it by hand, let's say, you can restore only one file. That was a request that we implemented quite recently, but it makes sense.

Q: What about restoring the virtual machines themselves?
A: The idea here is to leverage libvirt and back up the XML definition of your virtual machine. This is not done yet; as I said, the disaster recovery is something that we are working on and willing to do, so we have the solution and we are working on the implementation. We will retrieve your XML, we can show you the network configuration of your virtual machine, and we will provide you the way to change it to the new addressing, and then restore the XML onto the new compute node in the new data center, let's say. It's not easy, so it's something that we want to do, and we are working on that.

Q: In which release of OpenStack can we use Freezer?
A: There's no restriction. We develop it in DevStack, and our public cloud is not completely Juno, but it's compatible with Juno.

Q: You mentioned that you're merging code for Cinder?
A: We are not working on Cinder itself. We are using Cinder, leveraging all the functionality that Cinder provides us, but we are not working in Cinder.