Good morning, good afternoon, good evening, wherever you're hailing from. Welcome to another episode of Cloud Tech Thursdays here on OpenShift.tv. I'm Chris Short, executive producer of OpenShift.tv. We are joined by a bunch of friends here from Red Hat, as well as a special guest whom Amy will tell us about. Amy, hi, how are you doing today?

Good. Let me go ahead and introduce my compatriots first. We have Josh Berkus, who is the Kubernetes community person here at Red Hat; we have Mike Perez, who is the Ceph storage community architect; and myself, Amy Marrich, the OpenStack community person here at Red Hat. And we are very pleased to announce that today we have Belmiro Moreira from CERN to talk about scaling the OpenStack cloud at CERN.

So, hello, my name is Belmiro Moreira. I'm a computer engineer at CERN; I joined around 12 years ago. Initially I was working on server consolidation and on how to virtualize the batch service, which is a huge service at CERN, but I rapidly moved my focus into cloud computing and how to deploy and manage large-scale infrastructures. I'm also a member of the OpenStack Technical Committee and co-chair of the OpenStack Large Scale SIG, and I'm really happy to be invited here to talk with you about the CERN infrastructure.

Awesome, so cool. You said you have a short presentation?

It's not that short. I think it will be enough for the hour. I can start sharing it.

It's CERN; we're talking large scale here, right? All right, can you see it?

Yes. Yeah. Come on, they accelerate things to the speed of light.

That's true; that's why I'm going to go through this very fast. All right, so, yeah, I can start. This session is about how we scale OpenStack at CERN. Currently we run thousands of nodes and thousands of virtual machines, and we'll go through the steps from the beginning, in 2013, to today. But maybe the audience is not familiar with CERN, so in the next couple of slides I'll give you an overview of the organization and the role of the CERN cloud infrastructure in it.

CERN is the European Organization for Nuclear Research. It was established in 1954, initially with only 12 member states, and this number has been growing over the years; currently there are 23 member states. The lab sits on the border between France and Switzerland, very close to Geneva. The mission of the organization is to do fundamental research in the particle physics field. CERN is the biggest international scientific organization in the world: more than 10,000 scientists from more than 100 countries work in the organization. Not everyone is a staff member; it's mostly people from different universities around the world. To help understand the universe, CERN provides a unique range of particle accelerator facilities. The accelerator complex at CERN is a succession of different machines, accelerators, that accelerate particle beams to higher and higher energies, very close to the speed of light. The LHC is the one you can see clearly in the satellite picture; it's CERN's largest accelerator.
It's also the world's largest accelerator, with a 27-kilometer circumference, and it crosses two countries, France and Switzerland. For comparison, you can see the Geneva Airport here, and to give you an idea of the size of this machine, here is the CERN main site. These are the other accelerators through which the particle beams travel before being injected into the Large Hadron Collider. In the LHC there is not just one but two beams of particles, travelling in opposite directions, and they collide at very precise points, which are the experiments: ATLAS, ALICE, CMS and LHCb. What is quite fascinating to me is that all of this is 100 meters underground.

This is how the tunnel looks; you can see the magnets. These big blue pipes that you see here: there are around 10,000 of them in the tunnel. Each magnet can measure between 5 and 15 meters and can weigh up to 35 tons. Inside each magnet there are two pipes where the particle beams travel in opposite directions. These are superconducting electromagnets, which means they conduct electricity without any resistance. However, to reach the superconducting state they need to be at very, very low temperatures, minus 271.5 degrees Celsius, which is even lower than the temperature of outer space. As you can imagine, the cooling process takes a few weeks and requires tons and tons of helium.

These are the experiments: ATLAS, CMS, LHCb and ALICE. These are the particle detectors where the collisions occur. These machines are huge: they are up to 45 meters long and 25 meters in diameter and weigh more than 12,000 tons. And of course, everything is 100 meters underground. Mike visited this some time ago.

A detector is basically a digital camera, but one that can take up to 40 million pictures per second. This produces up to one petabyte of raw data every second. Of course, we cannot handle all this data; our storage systems don't support that. So what physicists do is put triggers in the experiments that try to identify the interesting events in real time, and everything else is discarded. At the end we keep a few gigabytes per second that are stored for analysis. Even so, per year we store around 90 petabytes of data that then needs to be analyzed. This is how an event looks after reconstruction, and with all these pictures physicists can have a representation of the collision events. The analysis of all this data gives physicists insights into how the particles interact.

But the detectors are not only underground at the CERN site; they are also in space. This is AMS, the Alpha Magnetic Spectrometer, which was installed on the International Space Station in 2011 to measure antimatter and cosmic rays and to search for dark matter. All the data that AMS generates is transferred to Earth to be analyzed, and most of it is analyzed in the CERN cloud.

OK, so this was a very brief introduction to the mission of the organization and what it does. Now let's start talking a little bit about the cloud infrastructure. To process all of this data and to support scientists all around the world, CERN also provides compute resources to the scientific community. Over 90% of the compute resources in the data center are provided through CERN's OpenStack private cloud. And to understand the motivation, why we built a private cloud, we need to go back to the beginning.
So, 2009 to 2011. Then we will see the evolution of the cloud infrastructure over the years and some of our architecture decisions. If you have any questions, please interrupt me at any time.

This is the data center in Geneva. It's a building from the '70s; it was designed to have a Cray supercomputer in the middle at that time. Of course it has evolved; it was upgraded over the years, and now it's a normal data center. It has two floors; this is one of them. But one of the limitations we have in this data center is the power capacity. Currently it has a power capacity of four megawatts, and it is not easy to extend the data center. That's why, if you visit the data center now, you will see that most of the racks are not completely full; power constraints are one of the reasons.

This is another data center that CERN operated from 2013 to 2019, for six years, and there we only ran compute nodes for the OpenStack cloud; all the compute nodes there, for computing, for processing, were for the OpenStack cloud. This is in Hungary, and it was a huge challenge for us, because when we launched our cloud infrastructure in production we had two different locations, one at CERN in Geneva and the other in Hungary. So the challenge was not only to deploy OpenStack at that time, but also to run these different locations transparently; we're going to talk a little bit more about this later. And this is another location where we run our cloud infrastructure: these are compute containers, with high density for computing. You can see them when they were installed, and the cooling hardware being installed.

All right, so this is one of our dashboards for monitoring, and you can see the size of our current cloud infrastructure. We have around 300,000 cores in the cloud, 3,400 users, more than 4,000 projects, and around 30,000 virtual machines. This changes a lot; we are in the process of commissioning a lot of hardware because we are replacing it, and that's why you see this big drop in compute nodes and in the number of VMs at the beginning of the month. We also have a lot of services in the cloud. We have Ironic to provision bare metal, with around 8,000 bare-metal compute nodes; Magnum clusters, more than 600 of them, mostly Kubernetes; and also volumes in Cinder; you can see that we have a lot of block storage, more than three petabytes.

A quick question from the audience, kind of unrelated: can you direct our audience member to where they can look at the jobs that are available at CERN?

Jobs, yep. I've already sold everybody: everyone who wants to work at CERN, with y'all. There is a web page for jobs; I think if you search for "jobs CERN" you will immediately find it.

I'll go ahead and look it up. "Jobs CERN", let's see. Oh, here we go: careers. Careers dot cern, that's all it is.

Yep, that's all you need.

I'll drop it in the chat. Oh, you're already there. Thank you. Y'all are faster than I am.

All right. So, going back to 2011. That was a period of change. Our computing requirements were increasing a lot: the LHC was running, there was a need for more computing resources, and we had these power constraints in the computer center in Geneva.
So we needed expansion options at a different site. We also needed business continuity. It's nice to have different locations, because the data center doesn't only run processing jobs, batch jobs to analyze the LHC data; it also runs all the services for the organization. So, a business continuity plan, and on top of that, disaster recovery. That's why CERN opened an international tender across the member states to have another data center, and the Hungarian one won; that's why we got that data center in Hungary. The project started in 2011, and the data center was ready in 2013, just in time for the launch of our private OpenStack cloud infrastructure.

Where in Hungary is it?

It's very close to Budapest. When I say Wigner, Wigner was the exact location, but it's Budapest.

OK.

And then we were in the situation where we knew we could have a new data center and more servers, but we were still using our management tools from the beginning of the 2000s, tools that were developed in house. That was a time when there was actually nothing available to manage a data center of our size, around 2000, so we needed to build all these tools ourselves. However, the reality in 2011 was completely different. There were a lot of open-source projects that were definitely doing a better job, with much more functionality than the tools we had made in house. And the other problem is that when an organization builds its own tools, attracting people to work on those tools is actually very difficult, because they only have value inside your organization: when you arrive, you need to learn them, and if you leave, that knowledge is not interesting for other organizations and companies. So it was time for us to really adopt the open-source tools available to manage our data center, and we started looking at all the options.

So, why build a cloud infrastructure? At that time everything was running on physical machines, and this was a complete shift, not only in the way we managed the data center, but also for all the users. That's what we needed to tell them: well, now we need to transfer these workloads to virtual machines. As you can imagine, a lot of them were server huggers; they liked to have their machines in the data center and control them, and this was a huge cultural change. But as everyone knows, a cloud infrastructure brings a lot of advantages: improved operational efficiency; resource efficiency, the possibility to consolidate a lot of servers onto one resource. And something that was quite easy to sell was the improved responsiveness, because at that time, if someone needed a machine, they had to fill in a lot of forms, and maybe after a few weeks or even months they would get the physical machine to work on. With a cloud infrastructure, with an API, as self-service, they could get that machine immediately.
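To make that self-service responsiveness concrete, here is a minimal sketch of the kind of API call a user makes, using the openstacksdk Python client; the cloud entry, image, flavor, and network names are hypothetical placeholders, not CERN's actual values.

```python
# A minimal self-service sketch with openstacksdk; all names are hypothetical.
import openstack

conn = openstack.connect(cloud="cern-dev")  # reads credentials from clouds.yaml

image = conn.compute.find_image("cc7-base")        # hypothetical image name
flavor = conn.compute.find_flavor("m2.medium")     # hypothetical flavor name
network = conn.network.find_network("vm-network")  # hypothetical network name

# One API call instead of weeks of paperwork for a physical machine.
server = conn.compute.create_server(
    name="my-analysis-vm",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
server = conn.compute.wait_for_server(server)  # block until ACTIVE
print(server.name, server.status)
```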
So we started to identify a new tool chain. One thing we clearly needed was a configuration management tool, and there were a few options at the time. We decided on Puppet, and that is what we have been using since then. It is not only used to configure our OpenStack infrastructure; it is used across the organization to configure all the IT services. Monitoring tools: there were a lot of open-source projects in that field that we are using today, such as Kibana, Elasticsearch, collectd and Fluentd, which we use to manage not only the OpenStack resources but all the resources in the data center. And then the cloud management tool: that was a time when there was not much available. When we started really looking into this, it was 2009, and I started looking at OpenNebula. OpenNebula was an open-source tool; actually, we started looking at OpenNebula to virtualize the batch system, and we were quite successful. We did huge scalability tests, because one of the concerns at that time was whether these open-source tools would be able to scale to our needs, and we were able to create more than 15,000 virtual machines on OpenNebula, which was amazing at that time. So that was one of the options.

But also, CERN, beginning in 2006, had been running a small virtualization service built on top of Microsoft System Center Virtual Machine Manager, where the CERN team had basically built a web interface on top of it. It was a basic web interface where CERN users could go and create virtual machines by selecting an image, and that was it; there was no API interaction, only that web interface. It was a virtualization service, but it was quite popular: by 2011 it had thousands of virtual machines running on that Microsoft infrastructure.

But that was the time when OpenStack was released, in 2010, and that was a game changer. With all the industry support behind this new cloud tool, I think it was clear from the beginning that the right choice for us was to invest in OpenStack, to understand this tool, and to join the community. So we started investigating OpenStack basically from the beginning. This is a presentation that I gave in January 2011 to my management, to describe my findings about OpenStack; it was based on the first release of OpenStack, Austin. You can find the presentation at this link. It's quite funny now, going back after all these years and seeing this presentation; I'm glad that I did it.

Given the timing, did you consider Eucalyptus or CloudStack at all?

CloudStack was not there yet. Eucalyptus was an option, but there were not a lot of deployments running Eucalyptus, at least at our scale, so it was never considered a good option for us.

Yeah, I started with Grizzly. I can't imagine starting with Austin.

Well, it was quite a challenge. Going back into these slides now and revisiting all of this is amazing. In this slide you can see the architecture of Nova at that time, and we believed Nova was complicated then, with these diagrams. Well, when you see it today, it's completely different. It was also a time when there were only two projects, Swift and Nova, nothing else. Glance, I think, only became available in Bexar or Cactus. Even to create users, you needed to do "nova-manage user create", something like that.

Nova was doing more than just virtual machines at that point, right?
So it was doing storage, and at that time, hey, we even had networking being done there. Some people would call those the good old days of the flat network, before we moved on to Quantum and then on to Neutron.

So, nova-network still works for us, actually.

And the Texan in me has to clarify: even though it is spelled B-E-X-A-R, it's pronounced "bear", because we pronounce things funny.

Yeah, Texans do. We have some weird city names; Bexar County is one of them.

Thank you, Amy, for raising that. All right, so this was 2011, and we needed to get our hands dirty with this. We created several prototypes, and the goal was to add functionality across these different prototypes. We started with what we called Guppy, because it is a very small and fragile animal, and you can see that the animals get stronger over time, with more functionality. This first prototype was deployed with Fedora 16. Why? Because at that time the RPMs were released for Fedora Cloud 16; they were not available for RHEL or CentOS yet. So we used Fedora because of that. Also, it was a big change for us because it was using KVM, and we wanted to use KVM because we thought it was the future; until then, in our OpenNebula tests, we had always been using Xen. That was a pivotal moment for KVM too: it was integrated in RHEL 6, I believe, and it was the only hypervisor supported there. From the beginning we used the OpenStack Puppet modules; actually, we helped develop the initial Puppet modules with Puppet Labs at that time. Those were funny times. That was just the initial test; we then moved on with a different release, and this one was still closed, only for us to test. You can see that by this time we had already moved to CentOS 6, or Scientific Linux 6, and also Hyper-V. You remember that infrastructure I told you about, running at CERN on top of the Microsoft stack? We wanted to continue to support it, and at that time we believed the easy way to do that was to also have Hyper-V in the OpenStack infrastructure, and then move those machines into OpenStack while keeping the same virtualization layer. So you see the challenge: a completely new product, trying to scale it and move our own infrastructure to OpenStack, two data centers, and two different virtualization technologies. Keystone LDAP integration was also tried in this version; you can imagine that we have a huge Active Directory. Then the last prototype we opened to some of our community; we tried to make all of the services highly available, and we added more than 600 compute nodes to this prototype. This was already the beginning of 2013, and we launched our cloud infrastructure in July 2013.

From the beginning it was clear that if we were getting serious about this, we needed to engage with the OpenStack community, because there was no way we alone would be able to solve all the issues; we would need the help of the community. So you see that from early on we started attending meetups, and also helping the community and organizing meetups; this one was at CERN at the end of 2013. And this was my first OpenStack Summit, in 2012, I think, in San Francisco. A long, long time ago. This was the keynote room; it was all small at that time, right?

So, the CERN cloud infrastructure. We started using Scientific Linux 6, that was in 2013; later we moved to CentOS 7. From the beginning we have been using the RDO packaging.
It's great to have all these prebuilt packages and all the testing that Red Hat does. However, we still have projects where we need some internal patches, so we need to rebuild all of that. And from the beginning we have been using the upstream Puppet modules for OpenStack.

Some considerations when we started building this. The number of compute nodes: we started very small, with only a few hundred compute nodes, but we knew we wanted to move the whole data center to OpenStack, so in the end it would be a few thousand compute nodes. Would this tool be able to scale to those numbers? At that time, if you look back, there were not a lot of big sites using OpenStack, so that was always a concern. Different locations: the data centers I just mentioned. The growing number of OpenStack projects: that was a time when every week a new OpenStack project was popping up. It was very hard to follow all of this, and then there was all the splitting of functionality that was also happening, for example nova-volume moving to Cinder and nova-network moving to Quantum. This was very difficult to manage when we were trying to deploy a serious infrastructure. Then, we have a large number of users and projects, and a lot of users leave every month: around 100 users leave the organization and around 100 users join it or come back every month. There is this constant movement of people at CERN, so we needed to automate all of this: the automation of project creation, and, when people leave the organization, all the project removal. All of this automation needed to be invented. And then, of course, with a large infrastructure, there is all the automation that is needed to manage the infrastructure itself, and all the procedures that need to be figured out, because everything is new.

The kinds of workloads we run in the infrastructure: mainly it's physics data analysis, so all the data from the LHC experiments and many other experiments; IT services; and then the other infrastructure that the organization requires: the experiment services, to run the experiments themselves; engineering services, to develop different tools for the experiments; and also the personal VMs, since any user at CERN has the possibility to have a project and run their desktops, their personal VMs, in the infrastructure. Most of our virtual machines are ephemeral. They consume more than 80 percent of all the CPU cores available in the cloud, and this is what is used for the LHC data processing. Because this is a very specific use case, we have a special configuration for these virtual machines: we do CPU passthrough, and we have NUMA-aware flavors. But they are ephemeral, because they only run processing jobs, so things like live migration are not really interesting for this kind of virtual machine. Then we have the pets, which are all the service VMs, where performance is less important, but where what is really important is that we can keep the virtual machines running, so live migration is a huge requirement for all of them. It's a mix of operating systems that we have here; we have a lot of Windows VMs as well.
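As a rough illustration of that special batch configuration, here is a hedged sketch of creating a NUMA-aware flavor with dedicated CPUs through openstacksdk. The names and the exact extra specs are assumptions for illustration, not CERN's real configuration, and CPU passthrough itself is a compute-node setting (for example, cpu_mode = host-passthrough in nova.conf's libvirt section) rather than a flavor property.

```python
# Sketch of a NUMA-aware "batch" flavor; names and specs are hypothetical.
import openstack

conn = openstack.connect(cloud="cern-dev")  # hypothetical clouds.yaml entry

flavor = conn.compute.create_flavor(
    name="batch.fullnode",  # hypothetical flavor name
    vcpus=16,
    ram=32768,  # MiB
    disk=160,   # GiB
)
# Pin guest vCPUs to host cores and expose two NUMA nodes to the guest;
# these are standard Nova extra specs, applied here as an assumption.
conn.compute.create_flavor_extra_specs(flavor, {
    "hw:cpu_policy": "dedicated",
    "hw:numa_nodes": "2",
})
```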
So are these mostly, like, people's personal environments?

These are people's personal environments, but also a lot of services, most of the services of the organization.

So, 2013 is when we finally opened the private cloud infrastructure to our users. We started with two cells, and cells at that time was a quite new concept; not a lot of people were using it. We decided to use cells instead of regions, even though we had two different data centers, mainly because we wanted this to be as easy as possible for the users, so they would migrate their workloads to the infrastructure. These were users who, most of them, are not computer scientists, but they need to manage their applications: physicists who have their projects. We wanted it to be as easy as possible for them to move their workloads from physical servers to the cloud infrastructure, and that's why we wanted to reduce the number of concepts to a minimum. That's why we decided from the beginning to use Nova cells, and this was cells v1 at the time. So we had one cell in Geneva and the other cell in Wigner.

At that time we wanted to have high availability everywhere, because we believed in that architecture: we didn't want to have a single point of failure. We used Ceilometer; bad idea, we moved back some time after. Glance with Ceph backends; but at the very beginning, in 2013, all the images were actually stored on AFS, because the Ceph cluster was not ready at that time, so for a few months we had all the images on AFS, and we were still doing some Cinder tests. HAProxy, which continues today. And we had three master nodes per cell, for high availability. RabbitMQ again: RabbitMQ clustered, with mirrored queues. Nothing special; a very, very common architecture. This is a diagram where you can see that we had the two cells, Geneva and Wigner, and these are all the services that were running at that time. For Ceilometer we were running MongoDB as the database; there was no Gnocchi at that time. StackTach we also started running then; it was very good for having a perspective on the service. And we kept more or less the same architecture. This is the top cell. The architecture of cells v1 is completely different from cells v2, so you may not recognize the services you see here from the current architecture of OpenStack.

This is the VM growth since we launched for our users in July 2013; you can see the cumulative number of VMs created in the cloud. This is only until April 2017, but the pattern stays the same after 2017. And with this number of VMs growing, you can see that this was adopted very well by our community, and we also worked quite hard to move all the physical compute nodes from the data center into the infrastructure. There were weeks when we were moving more than 100 nodes into the cloud infrastructure, and this was not new hardware; this was converting the existing servers into compute nodes.

So, when did you... have all of the standalone servers been converted, or is that something you're still working on?

All the servers that are dedicated to computing are converted. However, not everything in the data center was converted to compute nodes, and not everything runs on top of OpenStack. One example is storage: it doesn't make sense to run the storage servers on top of OpenStack, so they continue as bare-metal machines, managed by the storage team.
OK, so those bare-metal machines aren't in Ironic, but you do have Ironic, as well as Magnum clusters, in the system?

Right. So Ironic is a quite recent product in our cloud; if I remember well, it has been available since 2018, 2019. Ironic started as a requirement for us because some of the use cases we had didn't fit well in virtual machines: people who really needed huge virtual machines, full-node virtual machines. It didn't make a lot of sense to virtualize that environment for them, because they were losing a little bit of performance. So we deployed bare metal, the Ironic service, and initially our goal was to have an API for the users to interact with bare metal the same way they interact with virtual machines. But we had bigger goals for Ironic: not only having a pool of bare-metal nodes for people to use, but also changing all the workflows in the data center. Our goal at that time was to manage all the resources in the data center using bare metal, including the compute nodes. Currently, all the compute nodes in the infrastructure are managed by Ironic. We have this kind of inception, and we have a lot of it in our infrastructure.

Was that you, Amy?

Yep. So, another follow-up question I kind of want to ask is: what determines whether it's a VM, a container, or bare metal? What determines where the workload goes?

It's quite ironic that we are talking about all this inception, all of these concepts, and I have a slide, "scale implies simple"; we'll get to it. No, that is a decision of the user. The user has all these APIs, and if they have quota, they can create a bare-metal node, or a virtual machine, or a container, depending on the application they want to deploy. OK, I think I still have one slide about Ironic, so I can go deeper on it then.
OK, so as I said, scale implies simple, because if from the beginning you know that you are going to manage thousands of nodes, the architecture needs to be simple. This is an overview of our architecture. Something we decided from the beginning was to isolate the OpenStack services. We don't have just a few physical nodes, like most people, three physical nodes running the entire OpenStack control plane; we try to distribute as much as possible. We have machines that only run Keystone, around 16 of them; we have machines that only run the Glance API, Neutron, and so on. Why? Because all this isolation allows us to upgrade the different OpenStack components independently, and allows us to focus on one problem at a time. For Nova we also have this kind of architecture: we run the Nova APIs completely isolated; then we have the top level, which is the main control plane, or cell0, where we have the schedulers, the conductors, and the noVNC proxy. Placement as well is completely isolated from Nova. And even the databases: we have completely isolated instances for each OpenStack project, so we have one MySQL instance for Keystone, one MySQL instance for Glance, and for Nova we have one independent MySQL instance per cell. Again, it's to have this isolation. And then we have the cells themselves; they are very, very simple. They have their control plane, which is what it is, and then all the compute nodes. What this represents is that we have one control plane, only one server acting as a control plane, for around 200 compute nodes, and in total we have around 80 cells.

Can you speak about the benefits of isolating things into different cells?

Can you repeat that, Mike?

Yes, can you talk about the benefits of the cells approach, of isolating different services within child cells? How did you reach that requirement in your infrastructure?

And, for my sake, for cells we're talking about sort of groups of physical servers?

Exactly. It's a logical partition of the Nova deployment. A cell can be a rack, it can be a data center, but it's a set of servers that go together. I think this slide is good to explain this. So, we decided to go the cells route because we didn't want, at the beginning, to expose this region concept to our users. But actually cells have a lot of benefits. Basically, they can act as failure domains. They also allow you to configure the servers in one cell in a particular way, and we use those advantages to deploy our infrastructure. For example, you can see here that our availability zones are basically sets of different cells, meaning that if a cell goes down, the availability zone is not completely down; it's just degraded. And because we have so many cells, the control plane that we have for each one is only one server, meaning that if that server goes down, the workloads continue to run; it's only the APIs for that particular cell, for those particular resources, that will not be available. It also has the advantage of distributing the whole control plane, meaning that for each cell we have a completely independent RabbitMQ, so scaling RabbitMQ will never be a problem for us, because each cluster is very small, a maximum of 200 compute nodes. That's why we never reach scalability issues with RabbitMQ in Nova. I think I have a slide where I go through the advantages of cells and compare them with regions.

That's a real question, Mike.

Yeah, I just wanted to give the audience a sense of the benefits of it. Thank you.

But what I would like to show with this slide is that we also run all these services that I showed you, Keystone, Glance, Neutron, the whole OpenStack control plane, on top of the cloud itself. The control plane doesn't run on physical machines dedicated to the control plane, and it doesn't run in a different cloud that exists only for the control plane; it runs in the cloud that it manages. That's why we have this inception. Again, it's like Ironic, as I mentioned earlier, which manages the compute nodes even though it's an OpenStack project. You can see that on the server dashboard, side by side with the user VMs, we have the Keystone VMs, the Glance VMs, and so on. And Keystone, for example, like all the other services, is distributed between the different availability zones. The same availability concepts that we give to our users, we use ourselves, distributed between the different cells as well. That's how we reach high availability for the control plane.

It's nice to see you using availability zones; I don't think people take advantage of them as much as they should.

We have been exposing availability zones to our users from the beginning.
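Since availability zones came up: as a small sketch, this is roughly what that user-facing choice looks like through openstacksdk. The zone and resource names here are hypothetical, not CERN's.

```python
# Listing zones and pinning a server to one; all names are hypothetical.
import openstack

conn = openstack.connect(cloud="cern-dev")  # hypothetical clouds.yaml entry

# List the availability zones the cloud exposes to users.
for zone in conn.compute.availability_zones():
    print(zone.name, zone.state)

# Pin a server to one zone; each zone maps to a set of cells, so losing
# one cell only degrades the zone instead of taking it down.
server = conn.compute.create_server(
    name="service-vm-a",
    image_id=conn.compute.find_image("cc7-base").id,        # hypothetical
    flavor_id=conn.compute.find_flavor("m2.medium").id,     # hypothetical
    networks=[{"uuid": conn.network.find_network("vm-network").id}],
    availability_zone="zone-a",                             # hypothetical zone
)
```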
Yeah, because you started out with two widely separated data centers.

Yeah. Actually, a weird techie question here: what's the actual line lag between CERN and Hungary?

Oh, exactly; just in time, right? So, between CERN and Wigner it's 1,600 kilometers, and that translates to around 24 milliseconds of latency. (Light in optical fiber covers roughly 200 kilometers per millisecond, so the 3,200-kilometer round trip alone accounts for about 16 of those milliseconds, before any routing and equipment overhead.) And this was at the beginning, when we were trying not only to figure out how to set up the new data center, but also to set up OpenStack on top of it; then we had this latency issue as well. What you also see in this slide are the connections between Wigner and CERN. We had two network links of 100 gigabits of bandwidth each between the two data centers, completely redundant; that's why you see two. After a few years we added a third one, so we had a connection with a total of 300 gigabits per second between the two centers. For us this was basically like a cloud interconnect with peering networks, for people who are used to public clouds, because it was the same network.

Of course, having the data center there had some architectural implications. For example, the databases: we started running the databases in Geneva, but the latency was very high, and that was a time when we didn't have nova-conductor, so all the compute nodes were connecting to the databases directly. Also, because we were using cells v1, there was a scheduler per cell, so we had a scheduler in Wigner connecting to the databases in Geneva. The user experience in Wigner at the beginning was not that great, because it was sometimes very slow. So we continued to iterate on this, and the databases were deployed in the Wigner data center. Another thing was Ceph: the Ceph cluster was in Geneva, so at the beginning, because of the latency, it was a very bad experience for users to have block storage in Wigner. So in 2015 the storage team deployed a Ceph cluster in Wigner for this use case, to have block storage for the cloud. And other things, for example a Glance cache: initially we thought that 24 milliseconds was a lot of latency and that we needed to transfer a lot of images to Wigner, so we decided to deploy a Glance cache at Wigner. But it turned out not to be really needed, because after some time the images are cached at the compute-node level, so it was just an overhead in the architecture. We removed it, and in that case the nodes were actually contacting Glance in Geneva without any issue. So you see there were several architecture constraints that we figured out over the years.

So, the data center in Wigner was operational, with CERN using it, between 2013 and 2019. We knew this from the beginning; it was a contract for only four or five years, which was extended one more year. By the end of 2018 we were running 17 Nova cells and 3,300 compute nodes in the availability zones there. So in early 2019 we started decommissioning the cloud in the Wigner data center, and as you can imagine, that was another challenge.
So we needed to remove all these cells from the infrastructure, and this was completed in November 2019. One interesting part is that 2,500 servers actually returned to Geneva, because they were a late purchase and were still very good servers; these were the ones added to the compute containers that I showed you in the picture at the beginning.

Now, is Wigner totally going away, or are you just moving some servers around?

So now we don't use Wigner at all anymore.

OK. All right.

So, cells versus regions. I list here some of the advantages of cells and of regions, to try to make it more clear why we went with cells at the beginning. Basically, cells shard the Nova deployment. It only applies to Nova; there is no other service that has cells, and that is actually an issue. Cells isolate failure domains, and they are completely transparent to users. They are also a logical partition for operators, which allows us to have different configurations for a particular cell, to distinguish the different cells with different configurations. That is important for us, for example, for the batch use case, which has a completely different configuration compared to the service cells. Regions are completely independent OpenStack environments, so you need to deploy all the services, all the OpenStack projects, independently in each region; but you get fault tolerance. It's a completely separate environment that can be managed in a completely isolated way, and that is the big advantage.

And actually, now we are running multiple regions; by multiple, I mean three. In 2013, for us, it was simpler to manage one small cloud than two small clouds. However, now that the infrastructure has grown to this point, it's actually simpler to manage two or three small clouds than one big one. One of the main reasons we also moved to regions is Neutron: Neutron doesn't have this logical partitioning with cells, and it was a big point of failure. If Neutron was down, or anything was affecting Neutron, it was visible across the whole cloud; all the users would see it. Partitioning Neutron between the three different regions allowed us to improve the reliability of the cloud a lot. Neutron agents are quite chatty with the RabbitMQ cluster, so it needs to be a very big RabbitMQ cluster, and that is always an issue to maintain. Also, the regions that we have now are per use case: one region is focused on the IT services and user VMs; the other two regions are only for LHC data processing, so they only have that use case, which for us also makes sense logically, in the way we manage this.

The way you're describing Neutron, it sounds like a downgrade. So what are you getting out of upgrading?

Well, we were forced to move to Neutron, because nova-network is not supported anymore. However, in the old regions we are still running nova-network; we have six cells where we still run it, because we are still evaluating how we're going to migrate. It's quite scary, migrating from nova-network to Neutron without interrupting the VMs. Just thinking that the VMs could lose network connectivity for some time is very scary for us, and those cells are running important services for the organization.
So that's why we are still trying to figure out how to do it in the right way.

So, simple doesn't always play well with availability, but we try to work around this. That's why we have multiple cells per availability zone, which allows us, if a cell goes down, to have the availability zone not be completely down, which is a very good feature. The cell control plane is not highly available: if it dies, there is no workload interruption; people are just not able to connect to their VMs using the OpenStack APIs, or to do OpenStack operations through the APIs, in that particular cell. However, this simplifies the deployment a lot, because we have around 80 cells; if we had high availability across all these cell control planes, we would have a lot, a lot of control plane to manage. And then, as I already mentioned, we also run the control plane on top of the cloud itself. RabbitMQ is very challenging to scale and maintain, so we try not to run RabbitMQ clusters at all, and what we find is that if we have very small deployments, like we have per cell, RabbitMQ is quite stable. Not having the complication of RabbitMQ clusters simplifies the deployment and operations a lot.

Do you want me to go faster? I don't know if we have a hard deadline to finish this. How are we doing?

I mean, we're fine on time. You can go over if you want, no problem.

OK, I'll try to go faster now. So, MySQL databases: again, like RabbitMQ, we don't have clusters for the MySQL databases. We have independent MySQL instances, and the funny thing is that most of them run on top of the cloud infrastructure. The storage is on a NetApp solution, but the instances themselves, most of them, run on top of the cloud infrastructure, except for a few exceptions that we run on physical servers because of bootstrap issues.

So, OpenStack. When we have a cloud infrastructure where we want a lot of functionality, that translates into a lot of OpenStack projects. These are the OpenStack projects that we currently run, and you can see the version of each one in our cloud. You can see that they are on different versions, and it is this isolation of the deployment that allows us to have this. Nova is still on Stein; one of the main reasons is that we still have those cells running nova-network, and upgrading now is very, very risky. We have a lot of patches for nova-network to keep it running. But you can see that most of the services are running very recent releases, and that is one thing we always try to do: keep up with the OpenStack release cycle. Having so many OpenStack projects and managing all of this is always tricky, because most of them manage thousands of resources; for example, Cinder manages thousands of volumes, with petabytes of storage behind them. For example, Ironic: we now have around 8,000 nodes managed by Ironic, and we started reaching scalability issues in Ironic. Again, that is one of those things you get when you scale the infrastructure. Fortunately, there is this functionality, conductor groups, which is more or less like Nova cells in Ironic, and now we are taking advantage of it, splitting the Ironic deployment logically.
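For readers curious what that splitting looks like mechanically, here is a hedged sketch using openstacksdk's bare-metal proxy; the group name and naming scheme are hypothetical, and each Ironic conductor opts into a group via the [conductor]/conductor_group option in its own ironic.conf.

```python
# Sketch of sharding an Ironic deployment with conductor groups.
import openstack

conn = openstack.connect(cloud="cern-dev")  # hypothetical clouds.yaml entry

# Assign a slice of the registered nodes to a dedicated conductor group,
# so that only the conductors configured with that group manage them,
# roughly the way Nova cells shard the compute deployment.
for node in conn.baremetal.nodes():
    if node.name and node.name.startswith("p06-"):  # hypothetical naming scheme
        conn.baremetal.update_node(node, conductor_group="group-1")
```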
So trying to have a CICD testing everything before we deploy it to production is quite important And we have a staging process to have everything first on pre-stage testing In a small number of nodes and qa and then going through different master levels until reach everyone in the infrastructure Test stack is what we call to our testing infrastructure very few few nodes to test upgrades and new configuration options Scale also translates into automation Um, we are using several projects for automation. Uh, for example, rally to grow the infrastructure every day Rally deploys thousands of virtual machines in infrastructure Just to make sure that every cell is okay and um Everything is running as intended Uh, run deck is um a project that we use a lot for operations For example, we have different teams Um, the repair team, for example, doesn't have access to the open stack resources Uh, however, we have all these procedures and run deck jobs that they can trigger For example, when a node is uh needs a repair They can trigger a job that will basically try to lie migrate all the instances in that node notified users If that is not possible that that node needs any repair intervention And then we have mistral mistral is also an open stack project that we use for workflows. For example Uh, for all the projects creations and all the projects removal when a user leaves organizations So going through all the resources from users and make sure that they are deleted And all this is automated Scale also means permanent changes So upgrades through the open stack release cycle is every six months. Um, so we run 15 open stack projects So as you can imagine every It's almost an upgrade day for us Um, and then also we have the open operating systems distributions upgrade So we started with scientific Linux six Uh, at some point we needed to upgrade to a center seven. This is never easy um There is no easy way to move from six to seven It's a required reinstallation in our case And now we are facing again the move from center seven to synthesize and stream Um, and we are working on this Hardware commissions so every around five years Copy notes need to be the commission um And as you can imagine a lot of live migrations need to happen To try to do this transparent to the users. So recently, uh, we just Removed around Um Or we migrated around 900 virtual machines because we are commissions some cells and we continue to do it Um, this is a lot of work. We wrote a recent blog post. You can follow our work here Security, um As you all know so meltdown specter a couple of years ago created a lot of fuzz so we needed to actually reboot Um, most of our cloud infrastructure because of this Um, also disable hyperthriding Uh, reducing the number of cores available And these operations when you have thousands of nodes, it's a lot of planning A lot of work Currently for kernel upgrades We are trying to automate this because when we have all these these compute nodes running they run for years and Operating the kernel is quite difficult without Resurrecting the user. So we are trying to automate this to basically having a tool that continues live migrating instances in the infrastructure And when the compute node is empty Just reboots the compute node for the kernel upgrade Scale it's of course teamwork. 
Scale is, of course, teamwork. At CERN the core OpenStack team is six, seven people, but over the years we have had the participation of many different students, fellows, and project associates who joined the team for a few years at a time and contributed a lot to this project. So, these are my slides. I didn't intend to take so long to go through them.

No worries. No, this was great. Yeah, for people who didn't know what CERN was, I think they got a really good understanding of it, and of the infrastructure. The fact that you are running different versions of OpenStack depending on the project is something most people don't do, and y'all have really good reasons for why you're doing it.

Right. So, I'm happy to answer your questions. The audience can also follow me on Twitter and ask me questions there, or send some questions through email; I'm happy to answer them.

Wonderful, great. Thank you so much, Belmiro, for joining us today. This was really good, very informative. Thank you. I never would have expected the number of problems CERN has with infrastructure. A question just came in, though.

Yep, I got it: have you ever considered any other bare-metal provisioning besides Ironic?

So, Ironic was a quite easy decision for us, because we had all this infrastructure based on OpenStack, so it was the natural choice. Also, having the possibility of using exactly the same API to create virtual machines and bare-metal nodes was quite attractive, and I think it's a real advantage for us: the user doesn't need to learn a different API, a different command line, for this. So Ironic was always, I think, the most attractive solution for us.

I know we're going to have a little bit of a lag, but go ahead and toss one in. Did that answer your question? Well, he's responding. I do have one of my own, since I'm coming here from container land, right? One of my questions is: are you already running some workloads that are distributed as container images? And if not, are you planning on it?

So, at CERN there are different teams that are using containers to deploy their applications. As you saw in one of the slides, we have more than 600 Magnum clusters, and most of them, almost all of them, are Kubernetes clusters, so there are a lot of applications using containers as their deployment method. We have also started playing with containers to deploy OpenStack itself: we are experimenting with one region, trying to deploy OpenStack on top of Kubernetes using the Helm charts, the OpenStack-Helm charts. We are experimenting with this, and currently, for example, in production we have all of the Glance requests going through a Glance that is deployed in a Kubernetes cluster.

Does that answer your question, Josh?

Yeah.
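For context, creating one of those Magnum-managed Kubernetes clusters is a one-call operation against the same cloud. This is a hedged sketch assuming openstacksdk's container-infra proxy and an operator-provided cluster template; all names are hypothetical.

```python
# Sketch of creating a Magnum Kubernetes cluster; names are hypothetical.
import openstack

conn = openstack.connect(cloud="cern-dev")  # hypothetical clouds.yaml entry

# Operators publish cluster templates; users instantiate them.
template = conn.container_infrastructure_management.find_cluster_template(
    "kubernetes-template")  # hypothetical template name

cluster = conn.container_infrastructure_management.create_cluster(
    name="analysis-k8s",    # hypothetical cluster name
    cluster_template_id=template.id,
    master_count=1,
    node_count=3,
)
print(cluster.id)
```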
Out of sheer curiosity, how many maintainers of OpenStack do you have? I'm always curious about team topologies and sizes and things like that.

So, we are seven core members, but then we have all these fellows and project associates who join our team. Usually they don't do operations; they do more investigative work, like evaluating different OpenStack projects or Kubernetes-associated projects, and then, if we think it's worth investing in those projects, we go further and try to implement them, to deploy them in the cloud.

Awesome. So another... go ahead.

Let me just continue on your question. Currently we have some people doing work on GPUs, trying to understand how to offer GPUs in the cloud; other people are looking at how to have functions as a service in the cloud, for example. We have all these different projects that are always going on.

I can only imagine how many different projects are going on at any given moment, given the resources that are available. Have you collaborated with any other scientific organizations about your infrastructure specifically?

Oh yeah, sure. At the beginning we collaborated a lot with Nectar, for example, which is a scientific research cloud in Australia. At that time they were using OpenStack, and they still are, and they were quite big, so we exchanged a lot of ideas on how to deploy OpenStack. More recently we have been collaborating with SKA, the Square Kilometre Array. It is, or it will be, the biggest telescope in the world, with sites in South Africa and also Australia for observation. We did interesting projects with them, for example preemptible instances: they are not available in OpenStack by default, so we collaborated with SKA to develop this. Also running Kubernetes clusters on bare metal; a lot of work was done in collaboration with SKA in that area.

Cool. Awesome. All right, so I dropped links to both Nectar and SKA in the chat, if folks are curious about what those organizations are all about. If there's anything else, feel free to reach out to me, question-wise, at short at redhat.com, and I can pass it along to Belmiro and team here. But without any further questions, I think we'll wrap up here. Thank you very much, Belmiro; this was an awesome presentation.

Thank you so much.

Yeah, thanks, Belmiro.

Thank you so much for having me.

Yeah, nice seeing you.

All right, take it easy and stay safe. A reminder, folks: Red Hat is having a recharge day tomorrow, so we will be off the air completely, and Monday is a US holiday, so we will see y'all on Tuesday. Stay safe out there. Take care.