Hello, my name is Wendech. I'm a software engineer at Red Hat and part of the Anaconda team, and I've been the lead developer of the Anaconda backend. So welcome to my presentation about the story of the Anaconda backend.

Let's start with a simple question: what is Anaconda? Anaconda is a system installer used by Fedora, RHEL, and other operating systems based on them. And it basically does three things. Usually, when you want to install something on your system, you download an ISO, put it on a USB stick, and boot it. What happens first is that Anaconda performs the runtime configuration: it modifies the installation environment based on some preferences and detected values. Then it collects the user's preferences for the final system. You can select the language you would like to see on your system and the time zone, choose specific software you want to install, define the partitioning, and create user accounts. Then we verify that all of this is valid and that the new system should be viable and bootable. Only after that do we allow the system installation to start. This is important: while we collect the user's preferences, we don't touch any of your disks and we don't modify your data. But once you confirm that you want to start the installation, that's where all the magic happens, and we use your preferences to perform real actions on your hardware.

So quickly, how does it look? The runtime configuration happens in the very early stages; you usually don't even see that screen because it goes by very quickly. Eventually, you end up on something like this, where you can specify what you want to do. As I said, nothing really happens yet; we just collect your data. When we have everything we need, we enable the Begin Installation button, and only then do we perform the real actions. While I'm mentioning this: Anaconda is very difficult precisely because of this part.
The part where we just collect stuff but cannot actually do anything, because we have to simulate what's going to happen and how the system is going to look. And that's not easy. Also, you can visit and configure everything in a completely random order: you can start with the user account, then specify the software, then specify the partitioning. But in reality, what happens during installation is that we start with the storage, then install the software, and then create the configuration. So yeah, that's the difficult part.

So what is the Anaconda modernization? It's an initiative that started in 2017, and the idea was pretty simple. We started with something like this: one huge monolithic Python application that had everything in one process. The user interface had access to all the data, whether it needed it or not. Everything could use anything, it was all very interconnected, and it was just a huge mess. So the idea was to separate the data and business logic into D-Bus services, where the D-Bus services can talk to each other and the user interface can talk to the D-Bus services. What do you gain with this? Eventually you can replace the user interface with something that's not even based on Python, for example a web-based UI, which you might have heard about this year. So yeah, the idea of the web UI is actually very old, and it started here, because it was the ultimate goal to get there.

That was the plan; what was the reality? This is roughly what we had. Anaconda is a very old project and it's not pretty. There are no nice boxes that you could just take and move into a service; you have to create those nice boxes first. So you just start somewhere. Let's say you are able to identify a piece of code that looks kind of isolated, and maybe you might be able to move it onto the bus.
The next thing you have to check is what kind of data it uses. If there is anything that's not on the bus yet, you cannot really touch the logic, because you would be missing the data. So instead you target the data: you see where it's used and migrate the data first, making sure that all the code that uses the data goes through the D-Bus API to read and change it. Only then can you go back to the piece of code you wanted to move and actually move it.

So based on that, we targeted the data. Anaconda had a lot of weird data objects, for some reason. We had a lot of global variables, which was lovely, but we eventually got rid of them. Another weird object was the kickstart data. Anaconda has a special mode where you can run the installation automatically using a Kickstart file, and the kickstart data is a Python representation of this file. But the kickstart data was also used for interactive installation, because we used it as the main holder of your preferences. That doesn't make sense, because Kickstart doesn't support everything that Anaconda does, so you end up with questionable workarounds, and that caused a lot of issues.

Another funny object we have is the storage model. As I said, we have to somehow simulate the actions first. With the storage model, we have a Python representation of your device tree; we perform actions on this device tree and just check the result, and when we are happy, we actually apply these changes to your real storage. Unfortunately, our storage model didn't have an undo button, but sometimes you had to reset it, because you ended up with a completely invalid model. And this object was already propagated into all corners of the UI, so you couldn't just throw it away and create a new one; you had to somehow reset it in place. That was also an issue we solved later in the modularization.
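The simulate-then-apply idea behind the storage model can be sketched in a few lines. This is a toy illustration only, not Anaconda's real storage code (which lives in the blivet library); the class and method names here are made up:

```python
from dataclasses import dataclass, field

@dataclass
class DeviceTreeModel:
    """Toy model of the simulate-then-apply pattern the talk describes."""
    devices: dict = field(default_factory=dict)   # committed state
    _planned: list = field(default_factory=list)  # scheduled actions

    def schedule(self, action):
        # Record an action (a callable mutating a state dict)
        # without touching the real devices.
        self._planned.append(action)

    def preview(self):
        # Run the planned actions against a copy to validate the result.
        state = dict(self.devices)
        for action in self._planned:
            action(state)
        return state

    def reset(self):
        # The "undo button" the old model lacked: drop all planned actions.
        self._planned.clear()

    def commit(self):
        # Only here would real disks be modified.
        self.devices = self.preview()
        self._planned.clear()
```

The point is that `preview` and `reset` are cheap, while `commit` is the only place where anything irreversible happens.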
Then we had the payload object. By payload, we mean all the support needed for installing software on your system. And then there is a special category of product data, where every product can have slightly different defaults that it wants to show users, so that's also something we needed to take into account.

The most problematic part was actually the kickstart data, so we decided to start the planning around that. We collected all the Kickstart commands that are supported by Anaconda and split them into areas that made sense, and this is basically the foundation of the D-Bus modules. Then you finally have something to work on: it has very clear goals, and you just go module by module, command by command, and figure out how to handle each command via a D-Bus module.

So this was the plan. Phase one targeted the system installation, so everything you need to actually finish an installation, and it was actually finished in April. So yay, we are done with that one. The second phase targets the runtime configuration; it's about the Runtime module and the Boss module. The Boss module is kind of special because its main purpose is to orchestrate the other modules, send them data and collect data from them; basically, it oversees the whole D-Bus API.

So what kind of challenges did we face? The first question was where we would develop this code, and we had two options. The first was to have a development branch separate from the production branch. That would be nice because if we made a mistake, it would not affect any critical workflows like Fedora Rawhide. Unfortunately, it also means that until you release the thing, you don't get any feedback about what you are doing, so you don't really know if your idea of how it should work actually works in the real world.
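The orchestration role described for the Boss module can be sketched roughly like this. The class and method names are hypothetical; the real Boss talks to the other modules over D-Bus rather than through direct Python calls:

```python
class Module:
    """Stand-in for one backend module owning a subset of the settings."""
    def __init__(self, owned_keys):
        self.owned_keys = set(owned_keys)
        self._settings = {}

    def configure(self, settings):
        self._settings.update(settings)

    def report(self):
        return dict(self._settings)


class Boss:
    """Toy orchestrator: fans configuration out to the modules
    and collects their state back into one result."""
    def __init__(self, modules):
        self._modules = modules

    def distribute(self, settings):
        # Hand each module only the settings it owns.
        for module in self._modules:
            module.configure({k: v for k, v in settings.items()
                              if k in module.owned_keys})

    def collect(self):
        # Gather every module's contribution.
        result = {}
        for module in self._modules:
            result.update(module.report())
        return result
```

The user interface then only ever talks to the Boss and to the individual modules' public APIs, never to shared global state.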
Another thing we were afraid of was keeping it in sync with the production branch, because this project didn't have a very high priority, there were always new features and requests coming up, and we still continued development on the production branch. We didn't want to lose those features and bug fixes, so we would have had to port everything to the development branch, and we knew that would just be too much work. The other option was to use the production branch directly, which has a lot of benefits, but unfortunately you can very easily break Fedora Rawhide, and you don't want to do that. So there was a lot of pressure to get it right. We were doing very thorough pull request reviews, and we spent most of the time just making sure we didn't forget any use cases before we actually made a change. That was very difficult to do, but considering the amount of work we did, I think we didn't mess up too much.

What we had, and have had for all these years, was a kind of hybrid solution: some of the Kickstart commands were migrated to the D-Bus modules, but a lot of them were still handled by the user interface. That created a lot of interesting situations and challenges, because you somehow needed to take the Kickstart file, tear it to pieces, send the pieces to the right components, collect the feedback about possible issues and validation errors, and later collect the pieces again and generate an output Kickstart file. Another thing we had to make sure of was that there was no overlap: that we didn't forget to remove the handling of a Kickstart command from the user interface once it was already handled by a D-Bus module. We wrote a lot of unit tests to make sure this was fine.

Yeah, another challenge we had was how to quickly and safely develop the D-Bus API.
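The tear-apart-and-dispatch step, together with the no-overlap check, might look something like this sketch. The command-to-module mapping here is abbreviated and purely illustrative, not Anaconda's actual table:

```python
# Hypothetical ownership table: which component handles which command.
MODULE_COMMANDS = {
    "Storage": {"autopart", "clearpart", "part"},
    "Network": {"network", "firewall"},
    "Users": {"user", "rootpw"},
}

# Commands still handled by the UI during the hybrid phase.
UI_COMMANDS = {"keyboard", "lang"}

def split_kickstart(lines):
    """Distribute kickstart command lines to their owning component."""
    owners = {cmd: name
              for name, cmds in MODULE_COMMANDS.items()
              for cmd in cmds}
    pieces = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        command = parts[0]
        owner = owners.get(command,
                           "UI" if command in UI_COMMANDS else None)
        if owner is None:
            raise ValueError(f"unknown kickstart command: {command}")
        pieces.setdefault(owner, []).append(line)
    return pieces

def check_no_overlap():
    """The unit-test idea from the talk: no command may have two owners."""
    seen = set()
    for commands in list(MODULE_COMMANDS.values()) + [UI_COMMANDS]:
        overlap = seen & commands
        assert not overlap, f"command handled twice: {overlap}"
        seen |= commands
```

A test like `check_no_overlap` is what catches the case where a command was migrated to a module but its UI handling was forgotten and left in place.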
I don't know if you have ever read the D-Bus specification; don't, it's very difficult to understand and grasp. We needed to make sure that what we were writing was right, that we didn't have to spend a lot of time on all the little tweaks and weirdness of the D-Bus API, and that we could focus on the code. We knew we would develop this in a very iterative way, so there was going to be a lot of refactoring of the D-Bus API. For example, one of the things you need to provide at some point is the XML specification of your D-Bus objects, and that is really not something you want to maintain by hand through constant refactoring, because it won't work. So what we actually did was start with the pydbus library, but we built a lot of functionality around it that simplified a lot of things for us. Eventually we threw away the pydbus library and created a new one, which solved some other issues we had, and we put into it all the new support we had created. It's now available on PyPI and in other operating systems, so you can use it if you want.

Another issue we had was the management of default values. The problem with defaults is that we have a lot of sources of default values: Anaconda has some ideas about what the defaults are, but the products have other ideas about what the defaults should be, and then there are kernel arguments and boot options that can override them. We didn't want to propagate all these sources to the D-Bus modules and let each of them pick one. So instead, we introduced the Anaconda configuration files, which are just text-based. In the very early stages we process all the sources, at some point generate a temporary runtime configuration file, and only then start the D-Bus modules. The first thing a D-Bus module does is look for this runtime configuration, and it uses only that one.
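The merging of the default sources into a single runtime file can be sketched with the standard-library configparser. The section and option names below are invented for illustration; the real Anaconda configuration files have their own schema:

```python
import configparser
import os
import tempfile

def build_runtime_config(anaconda_defaults, product_config, boot_options):
    """Merge the three sources in priority order and write one runtime
    file; the backend modules would then read only this file."""
    config = configparser.ConfigParser()
    config.read_string(anaconda_defaults)    # lowest priority
    config.read_string(product_config)       # product overrides
    for key, value in boot_options.items():  # boot options win
        section, option = key.split(".", 1)
        if not config.has_section(section):
            config.add_section(section)
        config.set(section, option, value)
    fd, path = tempfile.mkstemp(suffix=".conf")
    with os.fdopen(fd, "w") as f:
        config.write(f)
    return path
```

Because the priority logic runs exactly once, before the modules start, no module ever has to know which source a value originally came from.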
So a module doesn't have to care about the other sources, which helped us a lot.

The final question was: how can we test this thing? The main goal was to be able to test the backend with unit tests very easily, and luckily we were able to do that with the libraries we had. We didn't have to do any weirdness with D-Bus daemons, and we didn't have to test the D-Bus API itself; we could just take the Python representations of the D-Bus objects and unit test them directly, which simplified a lot of things. What we also focused on was end-to-end testing. Anaconda is very difficult to test, so the best we could do was focus on the kickstart tests. That was actually a good fit, because we were targeting Kickstart commands: when we were migrating a command like autopart and it wasn't covered by the end-to-end tests, we could add a new end-to-end test to make sure it stays covered forever. We spent a lot of time on this and improved the infrastructure a lot, and that was also great.

So, what's the current situation? As I said, we finished the first phase, and this is the code distribution of our current modules. As you can see, storage consumes most of the code, then payload and network, and the other modules are pretty small. So you can guess that we spent years on the storage development and more years on the payloads module, and then the network and the other stuff was comparatively easy. Here are some milestones, and yeah, it took forever, but as I tried to explain, there were reasons for that.

So what were the benefits, and was it even worth it to do this horrible mega thing? We have pretty good code coverage of the new code. This number is a little confusing, so let's actually check how it looks right now.
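The key point, that the backend modules are plain Python objects, means they can be unit tested without any bus at all. A hypothetical sketch (the real modules' APIs differ, and this `TimezoneModule` is invented for illustration):

```python
import unittest

class TimezoneModule:
    """Stand-in for a backend module class. The real modules are
    published on D-Bus, but the objects themselves are plain Python."""
    def __init__(self):
        self._timezone = "America/New_York"

    def SetTimezone(self, timezone):
        if "/" not in timezone:
            raise ValueError(f"invalid timezone: {timezone}")
        self._timezone = timezone

    @property
    def Timezone(self):
        return self._timezone


class TimezoneModuleTestCase(unittest.TestCase):
    """Runs against the Python object directly: no D-Bus daemon,
    no connection, no introspection XML needed."""
    def test_set_timezone(self):
        module = TimezoneModule()
        module.SetTimezone("Europe/Prague")
        self.assertEqual(module.Timezone, "Europe/Prague")

    def test_invalid_timezone(self):
        module = TimezoneModule()
        with self.assertRaises(ValueError):
            module.SetTimezone("nonsense")
```

Such tests run with a plain `python -m unittest`, which is what makes the fast, daemon-free test loop possible.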
When we focus on the pyanaconda library, which holds most of the Python code, you can see that the core package, which is a general library of functionality, has pretty good coverage, and the modules have pretty good coverage. The UI: we don't have tests for the UI, so obviously not so good there. As for the modules themselves, most of them have very high code coverage, except for the ones that are really big, like the network module and the storage module, and I forgot the last one, payload. Oh, payload is pretty good. So yeah, I think we did a pretty good job. When we started, I don't think we even measured code coverage, but I believe it was around 20%, and it wasn't great.

The number of end-to-end tests that we run daily on Fedora Rawhide with our upstream changes is over a thousand, which is a lot, because when we started, these tests didn't even work properly. Having them run daily makes me sleep at night.

Another side effect of all this was that we kind of accidentally stabilized the GUI and the current user interfaces. When you are modifying the user interface to interact with the D-Bus API, you have to touch the code and test it, and if you find a bug, it doesn't make sense to ignore it and leave it there; you fix it. So we were finding and fixing a lot of bugs just by working with this code, and later, when we had bug reports, for example for RHEL, we could say: okay, we already saw this upstream and fixed it, so it was very easy to just port the fix to RHEL, and it didn't cost us any additional work. That was great.

Another benefit of the development of the backend is that it enabled the development of the web UI, because the web UI wouldn't even be possible if it didn't have something to talk to. So this was a crucial part of that, and the fact that we can work on it now is great; but as I said, this all started years ago.
Another thing we improved is the simplified way of customizing Anaconda for products. As I mentioned, we had some issues with that and had to handle the defaults differently, and a side effect is that it's now very easy to provide new defaults for your product, and you don't have to understand Python. It's no longer a weird Python class that breaks all the time; it's just a simple text file that everyone can understand and change, and you can find it in our repository.

Another thing, and this one was intentional, is that the support for add-ons is much better at the backend level. There is basically almost no difference between add-on D-Bus modules and our own D-Bus modules: they use the same base API and we treat them the same. We were able to remove some weirdness around add-ons and make them easier to develop. Add-ons can now also be developed in other languages, if you are interested, because it's D-Bus and we don't really care what's running behind it. So that's another nice thing.

And one controversial thing I want to mention: since we are dropping the dependency on the kickstart data object as the data holder, and using Kickstart just for input and output, there is a possibility to support more formats, or maybe switch the format, or something like that, because we no longer depend on pykickstart so much. And I personally don't like the current format. So if anyone is interested, this is definitely something to think about, because I think we could do much better.

So what's the future of this? Right now my colleague is working on phase two, which means he's writing the D-Bus support for the runtime configuration. In the future, we want to clean up and stabilize the D-Bus API, because it's still a little rough and messy. After all, it was developed over six years, so some of the early parts are not as nice as the later parts.
Also, we have zero documentation of the D-Bus API, which is not great. Unfortunately, all the resources are currently working on the web UI, so I cannot promise you when we will have documentation of the D-Bus API, but we will get there.

Yeah, so that's all from me. Very quickly, because I have time: one of the side effects of this work was that we created some new libraries. As I mentioned, we have the dasbus library for the D-Bus communication. And there's another library called Simpleline, which is a very simple Python framework for text-based user interfaces, because we don't have only the graphical user interface, we also have a text-based one. And because of some weirdness on s390, we couldn't use the existing libraries. So this is code that used to live in Anaconda for a very long time; we cleaned it up and created a completely new library that is independent of Anaconda. You can also use it if you are interested.

Okay, here is some additional info. As I said, don't look for the D-Bus API in our documentation, it's not there. If you have any questions, reach out on Matrix; that's the best way to get answers. So, does anyone have any questions? Yeah, okay.

There was a question from the audience: regarding the idea of replacing the Kickstart format, has it been evaluated to talk with the authors of other Linux installers in order to create some kind of lingua franca, and to evaluate the formats that already exist, for example elsewhere in the Fedora ecosystem?

Sorry, the question? Yeah, yeah. So the question was: if we consider a new format instead of Kickstart, have we considered using the existing formats that are used by other installers? So yeah, this is not really an initiative yet.
This is an idea that I'm throwing out for anyone who's interested. But yeah, it would definitely make sense to look first at what's already there. I personally would like to be able to use Ansible, but we looked into that, and it would be very difficult to support. But yeah, it makes sense to try to unify the formats and make it happen. Yeah, please.

Next question: given everything you've done in the backend, have you considered writing some kind of simple command line interface to execute all these things in sequence, to simulate a workflow, before running the web UI or whatever other front end someone develops? In another installer project, the first interface wasn't the web UI; it was a CLI tool executing a sequence of steps, and they used it to build all the other front ends on top, which also let them document the interface and use it for automation and other things.

Yeah, okay, so the question was difficult. So, did we consider creating a simple client that would allow running the process first, and then writing the web UI based on this client tool, right? Yeah, okay. So, no. Basically, you have very limited resources, so you have to justify everything you are doing, and there wasn't really a need for this kind of tool, because there are other tools that are better at just doing the installation in a non-interactive way, and Anaconda's focus is the interactive installation. So it's definitely possible to use the backend and write a very simple tool that just uses the backend without any user interface, but there was no demand for that, so we didn't really explore this area. Yeah, yeah, definitely. Yeah. Well, there was no demand; no one asked for it, so we couldn't just go and write a piece of code that no one really needed at the time. Any other questions? Yeah.
How does the web UI application talk to the D-Bus services? Do you use something like the Cockpit bridge, or have you created something of your own? Yeah, okay. So the question was about the web UI and how it actually talks to the D-Bus services, which is a very good question. And yes, we are using the Cockpit bridge, and basically the whole Cockpit setup, to communicate with the D-Bus services, because the support is already there and it was very easy to reuse for this use case.

Are we using all of it already? Maybe you were not here for the beginning. Basically, we did this in an iterative way, migrating over the years, so there was more and more backend. And since we finished phase one, all the modules related to the system installation are basically finished, so all of that runs via D-Bus. The missing parts are just some edge cases related to the runtime configuration, which is currently being worked on; otherwise it's all on D-Bus, and most of it has been on D-Bus for years. Yeah, yeah.

So the question was when we will be able to leave the hybrid solution and fully switch to the D-Bus modules, and we should be able to do that at the end of phase two, which is currently under development, because phase two targets the missing kickstart commands. Once all of these commands are handled by D-Bus modules, we can drop the remaining support in the user interface, and the user interface will use only the data in the D-Bus modules. It's also critical for the web UI, because it cannot access any data that is not available on D-Bus.

Another question: wouldn't it have been worth spending, I don't know, a few more months of development to get the code coverage to 90-plus percent, covering all the scenarios, while the modernization was under way?
Yeah, so basically, I guess the question was why the coverage isn't higher, and whether we should have spent more time up front covering all the scenarios. Do you mean the unit tests or the end-to-end tests? Yes. So the code coverage is related to unit tests, and you really don't want to write unit tests for something you are going to refactor three weeks later. As we moved a piece of code into a module, we tried to cover it with unit tests, but since Anaconda is very complicated and it's a lot of code, really something like 100,000 lines of Python, it's not so easy to cover all the edge cases. So I would say we did our best, and this is what we have. But there's definitely space for improvement, and it can be better.

A comment from the audience: I just wanted to say I think you've done an amazing job with this. I test these builds when they're done, and it's amazing to me how well it has worked the whole way through this slow migration to D-Bus, so great job on that. And the question: as you've been going along with the D-Bus work, have you considered dropping support for certain difficult-to-maintain things, and how have you made that decision?

Okay, so the question was whether we considered dropping some difficult-to-maintain support, and what the process of doing that was. What we were often able to do was find something that hadn't worked for years, and we kind of silently dropped it, because no one was complaining that it didn't work. But sometimes there were people who noticed later, and we had to say: okay, it's gone, sorry, we are not reviving it. As for specifically targeting difficult parts, I guess that was never on the table. We always tried to support all the use cases we had, which maybe we shouldn't have, because it was more work. But yeah, that was kind of it.
We didn't really try to drop things that were already working, because we were afraid someone would complain: okay, but this used to work and now it's gone, and you did a horrible thing. But yeah, maybe we should have. Okay, so thank you so much for coming.