Hello everyone, thank you for being with us. We are OVHcloud. I'm here with Xavier Nicole, our Public Cloud Infrastructure Director — master of the universe for public cloud, let's say — and Antoine, our hiring officer, who I hope will have a huge success today. Myself, I'm the product manager for the infrastructure layer of our public cloud. Today we'll talk about SRE, but not only: first the way we do the cloud, and then how we maintain quality of service at scale.

Maybe everything here is something you already know, and maybe you have heard all the information I will share, but it's always necessary to know where we come from to better understand where we want to go. Our journey started in 1999 in the North of France, and we quickly showed that there is not only one way to do the cloud. We were motivated from the start by efficiency, and I would say sustainability as well, and we showed that we could basically do the cloud an alternative way. We started to build our own data centers, moving away from the legacy standards inherited from the on-premises era. Building our own data centers is not only about how we use the space or rack the hosts, but also about how we cool the infrastructure: no air conditioning, everything is water cooled. This is the model we have had since the beginning, and it is still our model today. I'm not up to date on the number of patents we hold around water-cooling technology, but it's huge. All our infrastructure is water cooled — a lot of water — and last year our average power usage effectiveness was around 1.1, which is quite good.
So if you got that: efficiency along with sustainability is what drives the way we create and operate the cloud at scale, and it is what made us a sort of industrial player in this cloud market. I can testify as well that visiting an OVHcloud data center is really something apart. Since 2002 — to go really quickly through our story — we have also designed and built our own servers. We basically create our racks out of metal sheets and assemble ourselves all the components necessary to get our servers. Today we have two factories: one in Croix, in the North of France, which delivers everything we need for Europe, and another in Beauharnois, which delivers everything we need for the American continent. We are in deep control at every step of the chain, in a fully integrated model from the data center to the server. As a consequence, you now know there is no trap behind this price-performance ratio that made us a kind of success; it is just a consequence of this model. We have the same parallel with the network: we invested a lot in it, we own our backbone, and this is, for example, why egress is still included in the price of compute at OVHcloud.

Some figures about the scale, because size matters sometimes: more than 450,000 instances running, more than 360 petabytes of physical storage currently used, and more than 7.8 billion requests per month. That's really impressive, and given the number of OpenStack regions, the scale is becoming huge. So: we are in control, which makes us a cost leader; we are in control, which makes us a sustainable cloud.
Now let's talk about how to be in control at that level of scale — in control of the quality of service for end customers. I'll take it from here. Hello everyone. As we said, we run nearly half a million instances; we will soon be in the club of million-core clusters. We run about 40 regions all over the world. That means we needed to find an operating model that allows us to keep growing, to maintain and improve the quality of service for our customers the right way, and to keep developing new features at the same time. So about three or four years ago, we decided to implement and adopt the SRE methodology in the infrastructure and in the team.

The main secret: know what's going on in your infrastructure. We started a large observability project, and we gather billions of SLIs on the infrastructure to know what's going on live — either to self-heal the infrastructure, or to drive the work of the squads in our teams that develop and improve the system. We are basically SLO-driven in our infrastructure. To go back in time: just before the pandemic, the team I have the chance to lead today was about 30 people, and we were about to double its size during the pandemic, which was a very big challenge. We were able to do it, and today we are about 60 people running this. It's not a lot of people, so we need to be very focused on quality of service while at the same time developing new features, upgrading the clusters, and staying as close as we can to the latest version of OpenStack, which is not easy for us sometimes. So, how did we do that?
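Being "SLO-driven" boils down to tracking how much of an error budget each SLI has consumed. Here is a minimal sketch of that calculation in Python; the 99.95% target and the request counts are hypothetical examples, not OVHcloud's actual numbers.

```python
# Hypothetical error-budget check for an availability SLO.
# The 99.95% target and the request counts are illustrative only.

def error_budget_remaining(slo_target: float,
                           total_requests: int,
                           failed_requests: int) -> float:
    """Return the fraction of the error budget still unspent (can be negative)."""
    budget = (1.0 - slo_target) * total_requests   # failures allowed in the window
    if budget == 0:
        return 0.0
    return (budget - failed_requests) / budget

# Example: a 99.95% SLO over 10 million requests, 3,000 of which failed.
remaining = error_budget_remaining(0.9995, 10_000_000, 3_000)
print(f"{remaining:.0%} of the error budget left")  # 40% of the error budget left
```

When the remaining budget trends toward zero, a squad shifts effort from feature work to reliability work — that is the feedback loop the SLO model gives you.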
We are organized in squads. Our team is very spread out across the world — there are people from India, Poland, the UK, France, Canada, and the US — and we all work together in the same team, on the same project, with the same goal. People can work on the specific technology or part they are most interested in, and anyone can move from one squad to another as soon as they want to work on a specific topic. The goal, which we all have in mind, is always to improve the quality of service we deliver to our customers.

This organization allows us, as I said, to do rolling upgrades of the infrastructure. We have tens of thousands of compute hosts, as you can imagine, and we are able to upgrade those regions on a monthly basis right now, staying as close as we can to the latest version of every module — which is, again, another challenge. To improve this process, two years ago we started a project to run OpenStack on top of Kubernetes: we deliver a Kubernetes cluster running on bare-metal servers, run the OpenStack infrastructure on it, and that infrastructure in turn runs Kubernetes clusters for customers on top of it. That's one of the big challenges we have today, and it will allow us to improve, update, and fix the infrastructure faster.

We have also achieved some important goals in the self-healing of the infrastructure; I'll dive quickly into one of them. When you run that amount of compute hosts, you have hardware failures every day, and we need to manage them automatically. Today we are able to detect failures in advance, migrate the VMs, and take out of production the compute hosts that are not working well. Then we have people in the data centers handling them, but on our side all of this is automatic.
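The detect-and-evacuate loop described above can be sketched in a few lines of plain Python. This is a stand-in simulation, not OVHcloud's code: the `Host`/`HostInventory` classes, the ECC-error health signal, and the threshold are all illustrative assumptions (the real pipeline runs on their observability stack and workflow engine).

```python
# Illustrative sketch of a self-healing loop for compute hosts.
# Host and HostInventory are stand-ins, not a real OpenStack client.
from dataclasses import dataclass, field

@dataclass
class Host:
    name: str
    ecc_errors_per_hour: float                       # example health signal
    instances: list = field(default_factory=list)
    disabled: bool = False

class HostInventory:
    def __init__(self, hosts):
        self.hosts = {h.name: h for h in hosts}

    def failing_hosts(self, threshold: float):
        """Hosts whose correctable-error rate predicts a coming failure."""
        return [h for h in self.hosts.values()
                if not h.disabled and h.ecc_errors_per_hour > threshold]

    def drain(self, host: Host):
        """Disable scheduling on the host and migrate its instances away."""
        host.disabled = True
        healthy = [h for h in self.hosts.values() if not h.disabled]
        for vm in list(host.instances):
            target = min(healthy, key=lambda h: len(h.instances))
            host.instances.remove(vm)
            target.instances.append(vm)

inv = HostInventory([
    Host("compute-01", ecc_errors_per_hour=12.0, instances=["vm-a", "vm-b"]),
    Host("compute-02", ecc_errors_per_hour=0.1),
    Host("compute-03", ecc_errors_per_hour=0.2),
])
for bad in inv.failing_hosts(threshold=5.0):
    inv.drain(bad)

print(inv.hosts["compute-01"].instances)   # [] — the failing host has been emptied
```

In production the "migrate" step would be a live migration through the compute API, and the emptied host would then be handed over to the data-center team for repair, exactly as described above.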
So there is no human intervention on our side in this. The people in the data centers fix the server, and if we know exactly what's wrong with it, we tell them so they can fix it even faster — that's how we work today. We have reached a level of automation in the self-healing that allows us to run very large clusters with very little human intervention. As I was saying, we are 60 in the team today, and on average we need only one person on call 24/7 to manage manually whatever is left; the rest is fully automatic. That's our biggest achievement of the last years, and it lets us focus on new features.

About the new features — I have some notes because there are many things, sorry. Today we have in alpha and beta for our customers the Octavia service, load balancer as a service; it's going to be generally available this fall, running with Barbican, which we of course need for SSL certificates. The second big project is Ironic: it's going to be the bridge between the main business of OVHcloud for the last 20 years — bare-metal servers — and the cloud way, starting bare-metal servers with OpenStack APIs.

Let me give the big picture around the OpenStack ecosystem — I have two minutes left. OVHcloud has acquired two companies in the last years, both around storage: an object storage company, OpenIO, and a block storage company, Exten, which runs block storage on NVMe over fabric.
We are working very hard to have those products live. We already have the high-performance tier for object storage, with the standard tier in beta, and next year we will have the high-performance block storage — we're talking about millions of IOPS per block device. We are very involved in the OpenStack community, and the open source community more broadly, because we want to open source those two technologies as soon as they are ready. On another side, we are developing services around data management: database as a service, machine learning, and AI pipelines for our customers to consume as a service.

As a short conclusion: if you want to be part of a team that runs one of the biggest public clouds on OpenStack, wherever you are in the world — again, we are very spread out, and location is not an issue — contact Antoine or myself, come speak with us, you will be very welcome. We are hiring right now, and we need to grow fast. Thank you.

Thank you very much. Do we have a few minutes for Q&A? Sorry? One minute? Okay, does anyone have a question about our technology stack, our company culture, or how we implemented the SRE model at OVHcloud? There is a microphone over there in the front. "You talked about achieving high availability of virtual machines and automating the failover of compute nodes and VMs. My question is: how do you achieve this — are you maybe using Masakari with something else?" I'm not sure I get your question — about the availability of instances? "The failover, the automation of failover of virtual machines."
The automation is the self-healing I was explaining: when we detect that a server is going to fail — most of the time you can see that in advance — we evacuate the server, moving its instances to another live host. "You do it manually?" No, it's fully automatic; we have Mistral workflows managing that. Mistral runs it, and the detection is all based on our observability system. "By monitoring — but are you using the Masakari project?" No, no — our observability runs on Prometheus and Thanos, and then we trigger a Mistral workflow if we need to. "Oh, thank you." Thank you very much. Thank you.
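A Prometheus-to-workflow pipeline like the one described in that answer could be fed by an alerting rule along these lines. This is a sketch: `node_edac_correctable_errors_total` is a real node_exporter metric (edac collector), but the threshold, durations, and the `action` label are illustrative assumptions, not the rules OVHcloud actually runs.

```yaml
# Hypothetical Prometheus alerting rule predicting a failing compute host.
# Threshold, durations, and the routing label are illustrative only.
groups:
  - name: compute-self-healing
    rules:
      - alert: ComputeHostPredictedFailure
        expr: rate(node_edac_correctable_errors_total[1h]) > 0.05
        for: 30m
        labels:
          action: evacuate   # consumed by the workflow engine (e.g. Mistral)
        annotations:
          summary: "Correctable memory errors rising on {{ $labels.instance }}"
```

The alert fires before the host actually dies, which is what makes the "evacuate in advance" behavior possible: the workflow engine receives the alert, disables the host, and live-migrates its instances while the hardware is still serving traffic.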