Okay. Hello. So welcome. My name is Lubomir, this is Mohan, and we will try to tell you something about what happens with Fedora composes: how they are done, what issues there are, and what might be possible to do about fixing them. In the keynotes you heard a lot about the bright future and the high-level goals. This is nothing like that. We will basically just tell you what we can do to keep going the way we are right now, just slightly faster, so that we eliminate some of the problems. So first we'll try to describe what problems we're facing and why that's bad, then try to describe what's actually the cause of the slowness we are seeing, and then we'll try to brainstorm some solutions. And this is actually where you come in, because we are looking for suggestions.

So the main problem is that Fedora composes are taking way too long. For Rawhide, on a good day it's eight and a half hours, but there were cases where it took maybe 18 or 20 hours as well. That's not really doable, especially given that there's basically a single person who usually runs those composes. Especially before release, we need to make it possible to test changes in the installer and all the important packages, and if that takes eight hours, it's not really fast iteration. So that's what we would like to fix.

Let's actually try to describe what's happening in the compose. As you might know, the composes are done with a tool named Pungi, which does a whole lot of work and tries to do it in a parallel way so that it's not completely, stupidly slow, but it's still not perfect. The actual work is split into multiple phases. I'll try not to ruin everything here... okay, I just hit the screen, it will calm down. So this is basically the overview of what's happening in there, and I will go into more detail about each part of it. Okay, next please.

So first, some of the housekeeping. These are things that are fairly necessary for the compose but are relatively quick. I will have these average times; they are averages taken from the last five Rawhide composes. This housekeeping takes almost 40 minutes. In the init phase, we start by preparing the compose files, which includes checking out the Git repo, including translations. And some housekeeping at the end of the compose is computing checksums for all the images that are generated, and there are actually quite a lot of them. It's not just the Server installer: we have netinst images, there are live media, there's a bunch of spins and labs. For all of these, we need to make sure that we have a checksum; people might want to verify those at some point.

The test phase at the end is relatively simple. Essentially all it does is run repoclosure on all the repos that we created. Usually in Rawhide there are problems; for actual released versions of Fedora, like on the GA date, we probably shouldn't have many of those, but to be honest, there still are some. And we also run some tests on the images themselves. If something claims to be an ISO file, we actually check the headers to see whether it really is an ISO file. If it claims to be bootable, we check some flags and magic bits in the file to verify that it can actually be booted at least in some way, because in the past we have been bitten by this: the process changed a little bit and suddenly we had ISOs that wouldn't boot in a VM, I think. It worked if you burned it on physical media, but who does that nowadays?
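As a rough illustration of what that checksumming housekeeping amounts to, here is a minimal Python sketch that hashes every image in a directory and writes one checksum file next to them. The file names and layout here are made up for illustration, not what Pungi actually writes.

```python
import hashlib
import os

def checksum_file(path, algo="sha256", blocksize=1024 * 1024):
    """Stream a file through hashlib so large ISOs don't need to fit in memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(blocksize), b""):
            h.update(block)
    return h.hexdigest()

def write_checksums(image_dir, output="CHECKSUM"):
    """Write a BSD-style checksum line for every image in a directory."""
    lines = []
    for name in sorted(os.listdir(image_dir)):
        path = os.path.join(image_dir, name)
        if os.path.isfile(path) and not name.startswith("CHECKSUM"):
            lines.append("SHA256 (%s) = %s" % (name, checksum_file(path)))
    with open(os.path.join(image_dir, output), "w") as f:
        f.write("\n".join(lines) + "\n")
```

With dozens of images per compose, the time spent here is mostly just I/O over the generated artifacts.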
So, next slide, please. The first really slow part is the package set phase. This is essentially about talking to Koji and figuring out what packages were built and should be included in the compose. Historically, we have started with a single Koji tag. You find the signed copies on the file system, because you can't really get those nicely from the API. And once we have all the signed copies, we create a temporary RPM-MD repo on the file system that's used in the following phases. This just includes really everything that's in the tag, and there's one repo for each architecture; there's no filtering at this point. And as you can see, it takes over an hour, so it's not ideal. Next, please.
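To make that package set step more concrete, here is a hedged sketch of how one could ask Koji for the RPMs in a tag and locate the signed copies on the volume using the Koji Python bindings, as I understand them. The hub URL is the public one; the top directory and the signing key ID are placeholders, and this only illustrates the idea, not how Pungi actually implements the phase.

```python
import os
import koji

HUB = "https://koji.fedoraproject.org/kojihub"
TOPDIR = "/mnt/koji"          # assumption: the NFS mount discussed above
SIGKEY = "deadbeef"           # placeholder key ID, not a real Fedora key

def tagged_rpm_paths(tag="f29"):
    """Ask Koji which RPMs are in a tag and compute their (signed) paths on disk."""
    session = koji.ClientSession(HUB)
    pathinfo = koji.PathInfo(topdir=TOPDIR)
    # latest=True keeps only the newest build of each package in the tag
    rpms, builds = session.listTaggedRPMS(tag, inherit=True, latest=True)
    builds_by_id = {b["build_id"]: b for b in builds}
    paths = []
    for rpm in rpms:
        build = builds_by_id[rpm["build_id"]]
        builddir = pathinfo.build(build)
        signed = os.path.join(builddir, pathinfo.signed(rpm, SIGKEY))
        # fall back to the unsigned copy if no signed one exists on the volume
        unsigned = os.path.join(builddir, pathinfo.rpm(rpm))
        paths.append(signed if os.path.exists(signed) else unsigned)
    return paths
```

Doing this for every architecture and then running createrepo over the result is roughly where that hour goes.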
The first part that uses this repo is called buildinstall, for historic reasons, because originally it used to call a script called buildinstall. Nowadays we just run Lorax here to create the install tree: there's a boot ISO, there's configuration for GRUB. On average it takes maybe 43 minutes, but it varies quite wildly: on a good day it can be done in 10 minutes, on a bad day it can be two hours. It also depends on, for example, how busy Koji is, because all of this happens on Koji builders. That's the time of the individual task, actually, because if you check the compose logs for how long the phase took, due to how it's implemented it will only report that it's finished after these other two things are finished as well, so you can't really read it from there. So I just looked at, I think it was 90 tasks in Koji from those five composes, ran the average over that, and that was the result. Sorry, the comment was that the builders in Koji used for these tasks are in their own channel, so it shouldn't vary based on load. In that case, I have no idea why it varies so much. Can it be affected by the overall I/O load on /mnt/koji? It quite possibly can. The variation is most likely caused by bad things happening on /mnt/koji, but I've never actually seen Anaconda run quickly through an image build either, so there's an Anaconda slowdown too. The other check: if you download all the packages locally and rerun Lorax five times, it should take about the same time each run. No, it'll be consistent if they're all downloaded, except it doesn't cache anything. So there was a very long comment that Lorax is just slow, and that's why it might take this time.

The next part is called gather, and this is basically when Pungi decides what packages should go into each part of the compose. Because technically it's not just one big compose; there are separate parts. There's Everything, which as the name suggests includes pretty much everything. There's Server, Workstation, Cloud. And for each of these, we include a subset of the packages that are in the overall tag. This is configured mostly by the comps file: each of those variants basically says, I want these comps groups plus all the dependencies. So that's how we decide what goes in there. And once we know what goes in there, we create hard links to every single RPM, which also happens in this phase. Overall, this takes over two hours, so there's definitely room for improvement here as well, because it probably shouldn't be taking this long.

So, a question: why are we creating trees for every single variant and architecture combination if nothing other than the compose process is using them? Well, the answer is that something in the compose process is using them, so we sort of need the files and the repos there. That's a different question, then: why are we doing the entire tree every time instead of... We know what got built since the last time we did a compose, so why aren't we just removing the things that aren't there anymore and adding the new ones? That's a suggestion, which is exactly what we are looking for: why are we rerunning this whole process every time and not just applying the changes that happened since the last compose? And we would know things like the build IDs that are the source of this, and as the build IDs change you could just compute the differences and change only those... Yes, then we wouldn't need to rerun the whole createrepo for everything; we could just include the new builds. And that's a great suggestion. The reason this isn't done is that no one has actually implemented it yet. In addition, it would still have to handle the full case, right? Because you still have the initial condition, or if you mess something up you need to do a full run. Yeah, we need to make sure that we can actually run the process as a whole, so the part doing the incremental changes would still be sort of an add-on. We need to be able to say, for GA, we create the whole thing and run it all at once, and not just collect random pieces from Rawhide and say, hey, this is Fedora 29 now.

So I have another suggestion here. I noticed that it does the hardlinks in a separate phase, so it creates all the repos and then runs hardlink over them, essentially. Here we don't yet create the repos; that's over here. This is just figuring out what packages will be in that repo and hardlinking them into place. Okay, so it is hardlinking them as it goes... So we're doing this so that you can run createrepo against the directory, right? Not necessarily. Sorry, the question is: are we linking the packages just so that we can run createrepo over the directory? And the answer is no. We could run createrepo against packages in random locations; that's not a problem, we already did that in the package set phase. We need to copy the packages into one location so that we can rsync them out to mirrors, so that people can consume them. So when you validate the tree, you can just go and sync the whole thing and you're done, like you're pushing to the mirror master. Yeah. I mean, for pushing to the mirrors, we need to make sure that we have something that is rsyncable. We don't want to rsync random stuff out of /mnt/koji. Technically it's possible, but I don't want to be the one who has to implement that.

So one of the questions was related to why we generate install trees for things we don't ship. Well, one, we stopped doing that for Cloud, because we don't ship an ISO for Cloud. Right. Well, I'm just saying we don't do it anymore for Cloud. And the reason we have done it in the past is that we actually ship an ISO with an embedded repo on it for Server, right? So we need a separate repo for Server specifically, because we need a repo that will fit on a 4GB DVD image, right?
For Workstation, we ship an ISO, but it's a live install, right? So there's a long thread on a Pagure releng issue asking, do we really need to do this for Workstation? There's a long conversation there, and I don't think the answer is no; we just haven't explored everything yet. Also, there is some value in having the repos even if we are not shipping them, for creating, say, the Workstation live media, in that we know exactly what set of packages was there. In theory, a package that's not listed in comps could get onto the media if it was pulled from some other repo rather than this subset. It's not a big deal in Fedora, but for other use cases where this process is used, it is sort of a requirement: we need to make sure that the images consume only the stuff that should be there.

Of those two hours, what is the slow part? The slow part here is mostly figuring out the dependencies for the packages, because from the comps groups we only know the starting set... Is it reading the repo data? It's not really a question of reading it; that's still relatively quick. We read the repo data that we prepared earlier. The thing is that we actually have to run a transaction for every architecture and variant combination and figure out all the dependencies, what should go in there, apply rules for multilib plus a few other magic rules that accreted over time because someone needed to fix some problem. Is this effectively slow because it doesn't use something like libsolv? Yes, using libsolv could possibly speed this up quite significantly. There is some work in progress on implementing a backend that would use libsolv. Right now, it's basically Python code using libdnf to parse the repo data and ask for dependencies, like "what provides this thing", and that's a big reason for the slowdown. OBS does it similarly: they use perl-BSSolv, and it's just a thin wrapper around libsolv itself to process that information and export it. The comment was that OBS is using libsolv to solve similar problems and it's very fast, and I can actually confirm that from the tests on the new backend I've been running. It's a lot faster if we just run libsolv. Technically, we have to run it multiple times to add the multilib packages, but once it's all moved into C, it's a lot faster than the Python implementation we have now.
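To illustrate what that per-variant depsolve boils down to, here is a minimal sketch using the DNF Python API to resolve one comps group plus its dependencies against a single repo. The repo URL, group name, and cache directory are placeholders; the real gather code applies multilib and other special rules on top of this, so this is only the core of the idea.

```python
import dnf

def depsolve_group(repo_url, group_name="core"):
    """Resolve one comps group plus its dependencies against a single repo,
    roughly what the gather phase has to do per variant and architecture."""
    base = dnf.Base()
    base.conf.cachedir = "/tmp/gather-cache"        # throwaway cache location
    base.repos.add_new_repo("variant-repo", base.conf, baseurl=[repo_url])
    base.fill_sack(load_system_repo=False)          # only look at the compose repo
    base.read_comps(arch_filter=True)
    group = base.comps.group_by_pattern(group_name)
    if group is None:
        raise RuntimeError("group %r not found in comps" % group_name)
    base.group_install(group.id, ["mandatory", "default"])
    base.resolve()
    # every package (including dependencies) that would end up in the variant
    return sorted(p.name + "-" + p.evr + "." + p.arch
                  for p in base.transaction.install_set)
```

Running something like this for every variant and architecture combination, in Python, is where the two hours mostly go; a libsolv-based backend does the same resolution in C.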
Maybe we should move questions to the end. We'll let you guys get there. That's fine. Next slide, please.

As I said, this is just creating the repos for the files that are already in place. That's fairly fast. It could be faster, because right now it all happens on the machine that's running the compose; there's usually one compose machine for Rawhide, one for releases, and so on, and it could run anywhere. We could ship it to Koji builders and run each createrepo on a separate builder. It might help us a little bit, but it's not a big pain point, it's just five minutes, and we already run multiple processes at the same time.

I've actually bundled two OSTree parts here: creating the OSTree commit itself with the newly updated packages, and then creating a bootable image that includes the OSTree. This does take a fair bit of time. The thing we can do here, and that will hopefully happen fairly soon, is to move this and run it in parallel with the other bits and pieces. So to be clear, this is the OSTree and the OSTree installer phase. Yes. So that includes a run of Lorax. That includes a run of Lorax, in the OSTree installer phase.

Quick question: the times you're using are current times, so this has the s390x stuff in it? I think so. The times are from the five last successful Rawhide composes, which is as of about two weeks ago. Yes, some bits and pieces are missing from that, but... So that's another thing that could be slow, although I don't have a solution there. The comment was that a slow part might be that this runs over all architectures, and some of them are relatively slow. The solution here is basically to move this and run it earlier than we do now. The reason it's not done is that there was a bug somewhere between rpm-ostree and libdnf, where it got confused when it saw a repo with binary and source packages. This should hopefully be resolved now, but we need to make sure it's tested.

So a question, then: if this is the time for the OSTree creation plus creating an installable image with the OSTree embedded in it, what is the time for actually creating the OSTree commit by itself? Does anyone actually know how long that takes? I've never actually made an OSTree commit from RPMs myself, so I don't know. Because I'm wondering how much of that time is each thing... The question is how the time splits between creating the OSTree commit itself and creating the image. My guess would be that the installer creation should be roughly the same as for the regular installer, so about 45 minutes; that would leave about 20 minutes for the OSTree commit itself. The comment is that a lot of that is copying images, sorry, RPMs, not doing actual work.

That seems to be a thing that happens in a bunch of phases, so a question I've got is: why don't we actually fetch the stuff just once? Since this is all being carried through the same machine, why don't we hold a cache and pass it forward to the rest of the phases so that we don't keep re-downloading every time? The question is, why do we download the files over and over again? And the answer is that we basically don't; it's all on the same NFS volume, and it's all running in the same data center as far as I know. Okay, yeah. What I'm saying is: keep a cache directory and just keep filling it up, and that way you don't have to keep re-fetching. That skips like half of the process. A DNF install transaction is simple: about three quarters of it is fetching metadata and downloading the packages; the rest is relatively quick. Okay, so the suggestion here, and it's a good one, is to create a local cache with the packages so that we don't have to fetch them again and again for multiple things.

And the follow-up to that: can we also do all the Lorax runs on tmpfs? Oh yeah, that's fast. Yeah. Moving Lorax to tmpfs is also a good suggestion. That one might be slightly problematic, because we can't run all the Lorax tasks on a single machine. For each architecture, it needs to run on that particular architecture: you can't create the installer for s390x on something else. Theoretically, with a runroot you should be able to run it anywhere; we already potentially do cross-arch stuff when creating roots for images. The question was, why can't we create installers for a different architecture? And the answer to that is, I have no idea. So take it down as a note, and possibly run things in RAM. Yep.
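The local package cache suggestion could look something like this minimal sketch: copy an RPM off the shared volume once, keyed by its file name, and hand later phases the local path. The cache directory is hypothetical, and this assumes builds are never overwritten, so the file name is a safe key.

```python
import os
import shutil

CACHE_DIR = "/var/cache/compose-rpms"   # hypothetical per-host cache location

def cached_copy(src_path):
    """Return a path to a locally cached copy of an RPM, filling the cache on miss.

    RPM file names already encode name-version-release.arch, so the file name
    alone is a reasonable cache key as long as builds are never rewritten.
    """
    os.makedirs(CACHE_DIR, exist_ok=True)
    cached = os.path.join(CACHE_DIR, os.path.basename(src_path))
    if not os.path.exists(cached):
        tmp = cached + ".part"
        shutil.copy2(src_path, tmp)      # copy from the NFS mount only once
        os.rename(tmp, cached)           # atomic publish, readers never see partial files
    return cached
```

Later phases that need the same package would then hit the local copy instead of going back to the shared volume.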
This brings us basically to the last part, and that's actually the biggest, slowest part: creating all the live images and spins and whatnot. Technically, all that stuff is split into four different phases, because there are four different kinds of things we generate. But overall, it's four, or no, five, I can't count today; basically, it's all the same thing. We just run a Koji task and wait for it to finish. Once it finishes, we copy the file, the actual image, into the compose directory, and we're done with it. Most of the time here is really just spent sitting and waiting until the task finishes. Overall, it takes about three hours on average. So there might be some possible improvements here, but they would have to happen in the actual tools that create those files.

The live media task, I know, runs Lorax yet again. It should not run Lorax, because it should be consuming the output of the Lorax run we did before. LMC is Lorax: livemedia-creator executes Lorax to produce the tree and runs Anaconda again to install back into the ISO environment to create the live media. That's how that works. So each piece of live media is probably taking about 30 minutes by itself just downloading, because it doesn't understand caching, then it actually installs and produces the tree, and then takes about five minutes to make the ISO. So the comment here is that at least livemedia-creator is running Lorax again, and we might be able to save some time there. Good thing to look into, thank you. Another thing we might want to look at, because all these things are in different channels, or many of them are: we might want to add some logging or something to indicate when we're filling those channels up, because right now we can't really tell where the time goes. It might be that we have eight images we're trying to build and only seven builders in the channel; if we add one, we could save ourselves a lot of time waiting for that last one. So another comment is that we might get some speedups by reallocating builders in Koji so that capacity matches the actual demand. It's not obvious from the process itself what takes how long. And we don't run everything on the same architecture; there are more images for x86_64 than for the other architectures.

So I think that's it for my part, and I'll hand it over to Mohan for his more ambitious ideas on how to fix our problems. Hello. Sorry, I was just taking some notes with all the solutions, don't mind me tapping on the mobile. Anyway, you've heard the pain points, why it is slow, and thanks for all the suggestions. Now I want to talk about a couple of solutions that I have in mind, and these are the four solutions I'm thinking of right now. Well, the fourth one is not my preferred choice, but I'm putting it out there anyway.

The first one: we are running depsolving as part of the compose process, and I want to move it away from the compose and run it on demand. Let's take the example of Rawhide. Whenever a build happens, it goes into the f29-pending tag, gets signed, and moves into f29, 29 being Rawhide right now. So I want to run the depsolve once a build gets signed, and store the results by variant and by arch, so that it also records the locations of all the builds. Then whenever a compose runs, we can eliminate the package set phase as well as the gather phase. Well, part of the gather phase, because the gather phase also does some hardlinking, but the package set phase and part of the gather phase can be removed, which essentially saves about two to three hours, because it's taking more than three hours right now for both of those phases, and they're not run in parallel.
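A minimal sketch of that first option, assuming a hypothetical hook that fires when a signed build lands in the tag, and a depsolver function supplied from elsewhere (for example, the libsolv or dnf based resolver mentioned earlier). The storage location and file layout are made up; the point is only that the results sit ready for the next compose to pick up.

```python
import json
import os
import time

RESULTS_DIR = "/srv/depsolve-results"        # hypothetical storage location
VARIANTS = ("Everything", "Server", "Workstation")
ARCHES = ("x86_64", "aarch64", "ppc64le", "s390x")

def on_build_tagged(nvr, tag, depsolve):
    """Hypothetical hook: run whenever a signed build lands in the Rawhide tag.

    `depsolve` is whatever function actually resolves a variant/arch package
    set; it is just passed in here. The result is stored per variant/arch so a
    later compose can read it instead of re-running pkgset and gather.
    """
    os.makedirs(RESULTS_DIR, exist_ok=True)
    for variant in VARIANTS:
        for arch in ARCHES:
            packages = depsolve(tag=tag, variant=variant, arch=arch)
            result = {
                "tag": tag,
                "trigger": nvr,
                "generated": int(time.time()),
                "packages": packages,    # list of NEVRAs plus their on-disk paths
            }
            path = os.path.join(RESULTS_DIR, "%s-%s-%s.json" % (tag, variant, arch))
            tmp = path + ".tmp"
            with open(tmp, "w") as f:
                json.dump(result, f, indent=2)
            os.rename(tmp, path)         # atomic replace of the previous result
```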
So, that being said, there are a couple of things we need to fix, especially in Koji, or in some service that Koji calls to do that depsolving, and changes in Pungi, because obviously we then don't need all of this in the gather phase. So some changes there. That's the first option, and my favorite option.

And the second one is also a favorite: basically, tracking all the changes in the images, what goes into each image. Currently the images are generated by looking at the kickstart, and the kickstart might include some comps groups as well as some other inputs. If we can track all those changes and identify whether anything has changed from yesterday's compose to today's, we can eliminate creating the same image over and over, and just hardlink the image from the previous compose. Essentially, what happens today is that we have to rerun everything, because the Everything variant changes every day, so any image creation related to Everything we have to run. But all the other labs and spins, and Workstation, and even Server sometimes, we wouldn't have to run. So that would save a lot of time; when we looked, it's about three hours plus, so I'm guessing it would come down to less than half of that. Guessing, because we don't know how much of this actually changes every day, but definitely spins and labs don't change much.

To quantify that, can you just look at the previous composes and see how many of the images would have changed? Because my guess would be that almost always something changes, right? Yes, but not in all the variants and all the arches, right? So, for example, ARM maybe changes a little bit more than, say, i686, or s390x or ppc64. The s390x is special. Yes. This is something that reduces the average time, but not the maximum time, because one day you get lucky with no changes, and the next day something core changes and the whole world is rebuilt. Yes, and then it takes eight or ten hours again; yes, there is a possibility of that. Can you repeat the question? Oh, sorry. So the question is: if we implement this change, it's going to reduce the average time of these phases but not the worst-case total time. I would say that any time we can save is really helpful, especially during the freeze. That's what is actually driving me to make these changes happen, because during freezes we get about one or two days to create the RC compose, and sometimes we find more bugs, and every time we end up postponing the release by a week because we just found a bug, yeah, right before we go into the Go/No-Go meeting. So I would like to implement as much as I can to get that number down. And some of the things we cannot do; basically, as I said, asking for a new s390x machine is never going to happen. Or adding more builders, probably, but still. Why would that never happen? We're buying a mainframe? Sure. I didn't say buy one, but why would there not be more builders or something like that? No, no, it's not about that. There are creative ways to get more capacity for those architectures. We do have access to s390x builders. Yes. Just in Boston. Yes. The physical location difference is the problem, actually. Yeah. All of our builders and /mnt/koji are located in Phoenix, whereas the s390x boxes are in Boston. So that means it's the network? Yes. Yes. Well, it's a slow link.
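Going back to the second option, the change tracking for images could be as simple as fingerprinting each image's inputs and hardlinking yesterday's artifact when the fingerprint is unchanged. This is a hypothetical sketch with made-up file names, not something Pungi does today; the actual image build step is passed in as a callback.

```python
import hashlib
import json
import os

def inputs_fingerprint(package_nevras, kickstart_text):
    """Fingerprint everything that goes into one image: package set + kickstart."""
    h = hashlib.sha256()
    for nevra in sorted(package_nevras):
        h.update(nevra.encode("utf-8"))
    h.update(kickstart_text.encode("utf-8"))
    return h.hexdigest()

def reuse_or_rebuild(image_name, fingerprint, old_compose, new_compose, rebuild):
    """Hardlink yesterday's image if nothing it depends on changed, else rebuild."""
    state_file = os.path.join(old_compose, "fingerprints.json")
    old = {}
    if os.path.exists(state_file):
        with open(state_file) as f:
            old = json.load(f)
    target = os.path.join(new_compose, image_name)
    if old.get(image_name) == fingerprint:
        os.link(os.path.join(old_compose, image_name), target)   # same content, no rebuild
    else:
        rebuild(target)                                           # run the actual image task
    return fingerprint
```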
One response on this helping during the freeze: I don't know if this particular change would help us that much during the freeze, because I feel like when we're doing an RC, we want to validate everything, do a total compose as part of an RC and not a partial compose that's optimized based on the previous one. Plus the compose version would be different between the two, the dates and things like that; we would probably want it all to come from one compose. Another case, though, where this would be very helpful is Rawhide: a compose goes along, it fails. Right. Oh, I think that might be DNF, but I'm not sure... and then I rebuild everything and have to wait another eight hours. So this is great for iterating. Yes. Well, even in the freeze: definitely the first RC compose is going to be the full everything, right, because it's going to be a different directory and we haven't done any RC compose before that for the release. So the first RC compose will have everything, but let's say we found issues. For the second one, probably we add one or two builds; QA found something and they want to add one or two builds to the compose. Right, yeah. Then it will save a lot. Yeah, I would say the first RC, definitely not, and the last one that we actually release is a full compose, but we can do optimized composes in between. Yes. And get them to QA immediately, and if they say yes, it works, then kick off the full compose. Yes. And in my experience, I have done exactly one RC compose for a release only once; every other time it's been multiple RC composes for one release, out of all the six or seven releases I've worked on. Cool. And definitely, that needs some changes to Pungi, plus some smartness to identify changes in the things we talked about.

And dist-repo, which you might have heard about: it's kind of implemented, but we're still not using it because there is a small issue there. Anyway, with dist-repo, what we can do for the package set phase: in Pungi it currently runs over all the tags combined, on one machine. With this, we can split that up and run multiple dist-repo tasks on different tags on different builders, basically calling Koji, and then split them again based on arch, sending them to builders of that architecture. So it will save some time. I'm not sure how much, but the package set phase is a long phase, about two hours, so this will definitely save some time there. The idea is basically to call the dist-repo task, which is part of Koji, whenever a new build gets tagged, and then use that repo. So it definitely needs some changes to Pungi, and they have to fix that bug in Koji. So, the bug is multi-arch? No, it's that the debug packages and source packages were going into the dist-repo. That's actually fixed in Koji 1.16; it's just never been rolled out in our infrastructure.
Yeah, I was working with Mike and a couple of other people on finalizing the fix for this, and it has actually been fixed in the latest Koji release; it's just never been packaged for Fedora. I actually built it last week. Oh, you did? Okay, that's awesome. And I'm holding off on it for the rest of the branches because 1.16.1 is probably due before too long. Yeah, it is, yeah. And does Koji still create the repo from scratch, or does it cache the previous one? It merges those internally; it caches the previous one. I'm not sure. It does, it caches the previous one. I opened that. Okay. The design is that it uses the previous repo as input, the same as it does for the builder repos. Yeah, it's very efficient. It's supposed to be; if it's not, then it's a bug, but I've not seen that reported. I reported it, but... I haven't checked it for some time, so I will check it now.

The last option, which is not my favorite, is creating these images on demand, meaning whenever a build that's involved in one of those images lands. Definitely it will be slow: say a package gets built and it goes into one of the images, then we're creating an image and waiting 40 minutes for it to complete, which is not optimal, hence not my best option. And also, it's very complicated to implement, because we have to create the repos and then create the images out of them, and then we carefully have to copy all these repos into the Pungi compose later when we run the actual Pungi. Not ideal. You're also not saving time on depsolving, because that task still gets run for each image build, and on top of that you're still downloading the packages again. Yes, yes. So probably an easier, quote-unquote, win would be to first make it so that it doesn't force you to download everything every time, that there's a way to tell it there's a permanent cache. Yes, yes. And then the second would be making it so that if you've got a package tree already in place, you can use that as a source to feed in, so you can skip some steps. Yeah. Thanks for the suggestion. And not much else: it saves time in image building, but it needs changes to Koji and Pungi and some smartness; as I said, not my favorite.

So these are the four options I was thinking about, and I've taken suggestions from you, which is really helpful. But this is the goal of my predecessor as release engineer, and now it's become my goal: Pungi needs to be a gatherer, not a creator. Everything should be ready; Pungi just goes there, collects things, puts them in place. That's the goal. I mentioned this to you. Why is it not done yet? You're the one moving things into Koji. I don't work here anymore, I'm a manager. Huh? It's your job. Any questions at this point?

So, the work here is going ahead and looking at how long everything takes. I'm wondering if it would make sense to basically create some kind of chart for every compose, so you can just look at it and say, wait, we're waiting on this for two hours; you know, have a graph showing why it took that long on every compose. The comment was, would it make sense to have an overview of where time is spent in the compose process so that we can actually see what the slow part was? And you can sort of get this information in part from the logs, because there are timestamps for everything. The only downside is that, due to some implementation details, especially for the image building, you don't actually get run times for the tasks themselves; you would have to query Koji for that. And if there are multiple phases running in parallel, like in the diagram, they all start in a given order and finish in a given order, so a phase might just be waiting for another phase to finish. Even if we only got the duration of each task and sorted by the slowest task, that would be useful. It would be useful if we got the duration for every single task; that's a fair point. And I just wanted to add one more point: yes, we can do that, but they just act differently. Yeah, but if you can see at a glance how they act differently, maybe that would give you a better idea of everything that's going on here over time. I'm not saying that any single one would be a serious problem, but yeah, definitely. Just to repeat the comment: if we have this information, we can actually spot these weird outliers, and see what suddenly starts taking longer than it did before.
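A crude version of the suggested per-compose timing overview, assuming you can scrape phase start and end times from the Pungi logs or from Koji task info. The phase names and numbers below are made up for illustration.

```python
from datetime import datetime

def summarize_phases(phases):
    """Print a rough text 'Gantt chart' of compose phases, slowest first.

    `phases` is a list of (name, start, end) tuples with ISO 8601 timestamps,
    e.g. scraped from the Pungi logs or from the corresponding Koji tasks.
    """
    parsed = []
    for name, start, end in phases:
        s = datetime.fromisoformat(start)
        e = datetime.fromisoformat(end)
        parsed.append((name, s, e, (e - s).total_seconds()))
    compose_start = min(p[1] for p in parsed)
    for name, s, e, dur in sorted(parsed, key=lambda p: p[3], reverse=True):
        offset_min = int((s - compose_start).total_seconds() // 60)
        bar = " " * (offset_min // 10) + "#" * max(1, int(dur // 600))
        print("%-15s %5d min  %s" % (name, dur // 60, bar))

# Example with made-up numbers:
summarize_phases([
    ("pkgset",       "2018-08-01T00:00:00", "2018-08-01T01:10:00"),
    ("buildinstall", "2018-08-01T01:10:00", "2018-08-01T01:55:00"),
    ("gather",       "2018-08-01T01:10:00", "2018-08-01T03:20:00"),
    ("image_build",  "2018-08-01T03:25:00", "2018-08-01T06:30:00"),
])
```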
Actually, I was doing some profiling on the package set phase a few weeks ago, and I found out that the second biggest function by time was the interpreter lock for threads, the Python interpreter lock. Over the whole package set duration, it's mostly blocking on acquiring that lock. Okay, so the comment is that, from some profiling, the actual slowdown might be an inefficient implementation in Python, that we're using threads in a non-optimal way. Again, this might actually be solved by the suggestion of moving this logic into Koji and not preparing the package set in Python at all. That part is mostly an enabler for further optimization.

There was an item, the first step, that was computing checksums or something like that. Is that actually necessary, or is it possible to track checksums that already exist, or cache them somehow? The question is whether we actually need to compute the checksums for the images, or whether we could cache them somehow, and I'm not actually sure that's possible, because in each compose you get a different image. Even if the packages are the same, I believe you will get a bitwise different file. It will be different. I was just looking at it because I was wondering the same thing: it's just not that hard to make genisoimage spit out the checksums itself. We do not always run that command directly. The comment was that when we run genisoimage, or whatever the command is, it can spit out the checksum as well. The downside with that is we do not have direct access to that for creating, say, live media; it's somewhere behind... so it would need to be changed in four different places. We also checksum things that aren't ISOs. You're checksumming two files, and we can make it so you don't have to checksum one of them; the fact that we'd still checksum the other files isn't a reason not to do it, even if you don't want to work on that. Yeah, we should just take the checksum data that's output by the image creation tool. The image creation tool, either old genisoimage or xorrisofs, is going to output that information; there's no reason to do it twice, because it's five minutes of waiting.
The comment is: use the checksums from the tools that create the image. The reproducible-builds folks are doing work on making ISO creation and such reproducible. It's currently not totally possible because of date and time stamps and things like that, but people are working to make it reproducible, so they could do things like that. The reproducible-builds folks actually do have some of this work already in place in xorrisofs, which is one of the reasons I'm working on moving to it in livecd-creator, which you guys don't use anymore, but it's one of the reasons I have a branch where I'm working on changing over to it. The other thing is that genisoimage is basically dead; there hasn't been much going on there in a while. Yeah, it's good to see. The Kiwi guys have switched over, and I'm in the process of switching over for the livecd tools. I know there are apparently a couple of problems with how Lorax produces bootable media that are blocking xorrisofs from being used instead of genisoimage, but... Well, the problem is that Lorax assumes the ISO tool cannot create the hybrid ISO in one step. xorrisofs can, so Lorax manually mangles the image once it's done and screws it up in the process, because the padding is slightly different between what genisoimage does and what xorrisofs does. Yeah, it's easy enough to fix; you just see how it's getting around that, and I found in my own code it was like, this is really weird, but yeah, there are a couple of ways to improve that. Okay, so this was a very long comment about possibilities for improving image creation with different tools that might be more efficient.

So I've got a question, which is not really a question, because there was a discussion... There's another slide that says "special", so let's put that up, they want to show it right now. I can change this; one more thing. So this is one more image, one more file coming out of the compose at the end. The good thing is it takes roughly five seconds to make, so it's not going to make the timeline any worse, but I don't really understand how to fit it into all of that, and I would love to have a discussion with you guys about it. Like, I want to make a base image after the compose, but I don't know where to plug it in, so right now it sits outside of the whole structure. And then, on top of that, you're going to have to produce the runtime Flatpak, so that'll be a whole other set of things. That goes for making Flatpaks in general, most likely. I currently don't have a plan to create that as part of the compose process; I'd consider it similar, in that we don't have to create the runtimes as part of the compose, it can be done separately. Wait, what are the runtimes? The base images that the applications are built out of. It could definitely be something that you could even put in there, or there's another approach to running it; we just want to figure out what people are going to be happy with. We just want to give you guys five lines of shell. I mean, I've got a Python program that will basically do it, with support for caching and stuff; we don't know where to plug it in, and that's one of the reasons we're here. The simplest thing is just to make it a runroot task in Koji and then plug that into Pungi.
Basically, Pungi doesn't have a generic runroot of its own; it knows how to call Koji with a runroot, which is basically a generic "do these things" task, essentially, which is not exactly a container, but yeah. So that's what I would do, because that's what we do for Lorax and other things now, just run them in a runroot. Okay. Absolutely. Yeah, okay. I'll explain it to you afterwards. Okay. The buzzword of the project makes it fantastic. Except for Koji. Well, I like Koji.

So, this will probably be productive, or not. A lot of us have mentioned caching things so that we get away from hitting /mnt/koji for everything, every time. Do we have any good ideas on how to implement something like that? Because there are a lot of ways to do caching wrong and basically confuse the hell out of yourselves, so that it becomes impossible to debug, or you thought something was happening but it really wasn't because of the caching, right? So, do you guys have any ideas on the best way to do something like that? The question is: caching is easy to get wrong, do we have some ideas on how to actually implement it? And at least myself, I don't, really. The issue with the caching is that if we actually want to reuse the same cache in multiple places, it gets a lot more complicated. If we create a new cache for every single step in the process, it makes no sense. So we would need to make sure that the cache is used by the dnf code, and we'd need to reuse the same cache in different tools as well. That part is also tricky; I have no idea how that would work right now.

That's another question, though. It seems like some of the low-hanging fruit would be to have it write to more than one place, and then just put a copy of /mnt/koji sitting next to the s390x boxes in Westford. Well, oh boy. My thoughts are basically that the cache would be per builder, and we'd fill it up over time. If this task goes to this builder this time, it creates a cache; if it goes to a different builder next time, it creates a separate cache, but that's okay, because the cache is an optimization: if it's not there, it will basically go fetch the stuff and fill the cache. We could also do an optimization where you say, you went to this builder last time and that builder is currently free, so go there again. But the idea is that you make it generic enough so that all of these different tasks, including pulling with dnf and pulling other things, could possibly reuse this per-builder cache. So if you're going to create images using Anaconda and you need to pull down stuff from the buildinstall output, we can also pull those down, not just RPMs, right? That's Koji; that's already on the builder. So it's actually per host. Yeah, because then you just have a shared disk you can pull that in from. So let me summarize: are you talking about writing a cache for all these things, so that it includes not just the tools that would consume the cache but also a bunch of infrastructure around it to maintain the cache, which makes it even more complicated? Right. The question, rather, is: can we filter it through something? Currently it's NFS, which is basically something you can't really create a proxy for, right?
However, HTTP is something you can create a proxy for, and there is already a cache there: kojipkgs caches the packages, because all the downloads happen via HTTP. None of the dnf processes are using NFS directly, they're all going via HTTP, and kojipkgs has a huge cache that holds all the RPMs, so they're not hitting disk; they're being served out of memory 90% of the time. It's very rare that they actually hit disk. The only time you hit the disk is when you're doing things like running createrepo to actually generate the repo data, and even then Koji does optimizations like skip-stat and so on, where it assumes that if the file name is exactly the same, nothing's changed, checks what already exists, pulls the metadata from the existing metadata, and just updates it rather than creating it from scratch. So what you're saying is it is cached, so we're not hitting /mnt/koji, but we are hitting a proxy that is outside of the build. It's cached at the wrong point, basically, because the tools are still fetching every time and resolving everything and having to go through all that work every single time; they're just not hitting /mnt/koji. The reason it's intentionally done that way is to ensure that every time you make a new image, a new whatever, you can recreate it. So any caching you put in there as an optimization elsewhere, you then also need to take extra steps to make sure that, however you do it, you can reproduce the exact same thing regardless of the state you start from. Right. I think what a lot of us are thinking right now is that those extra steps are smaller than what we're doing. They should be, anyway. Yeah. It's literally just: what are the build IDs of the builds we're using, as a list. That was the question that essentially started this conversation: how would we do it? Right. And I think what you're saying is, if you're invested in what we're already doing, then it gets harder to do it right. But is there a happy medium in there that helps? Is our assumption that /mnt/koji is the bottleneck not valid? Probably not valid. Things like the Anaconda runtime, we don't actually know what's inside of that. So this has been a very enlightening discussion, but we're out of time, so maybe you can carry it on afterwards. Thank you. Thank you very much.