Hi everyone, this is the Tech Talk, and today I will be presenting the CPE ARC dist-git investigation. The outcome of this investigation is the document you can find in our ARC docs. Here is the link, if anybody needs it; I hope all of you know where the ARC documentation lives.

The purpose of this investigation was to find something that could potentially replace Pagure dist-git and make it more git forge agnostic, so that we have a solution that is not too hard to use with something else. Pagure dist-git is tightly tied to Pagure itself. It's actually just a plug-in or theme for Pagure, so it uses most of the Pagure codebase, and we would like to have something that is more agnostic regarding the forge we use. Okay, so that is more or less the background.

I will first talk about some of the terms used here, like what dist-git actually is. Dist-git is the distribution git, and it's the way we manage the distribution packages and their sources in Fedora. I'm not sure whether this term is used by other distributions, but in Fedora we say that dist-git is the git for distributing the distribution, or distribution git. It just holds the sources for packages, and that is more or less it. This is what the packages are built from.

As part of this ARC investigation, we looked at various services in the Fedora ecosystem and how they interact with Pagure dist-git. If there wasn't any interaction, they are not here, so if you don't see something that should be here, it's probably because there isn't any interaction, or at least no direct interaction. I will just quickly go through the interactions, because there isn't much to see. Each page usually just maps what we found out: what dist-git is using the service for, or the other way around.
That is, what the service is using dist-git for. So there are some API calls, some configuration values, and the changes we propose to make this work with the new dist-git solution. This is mostly the same for all of them, so I will not go through everything.

What I want to highlight here is the Fedora packagers page, which maps the interactions between dist-git and Fedora packagers, the group that creates the packages and works with dist-git directly. Here are a few workflows that are being used. For example, in the push workflow you have the packager's machine, where you use fedpkg to upload new sources. The sources are downloaded based on the spec file. Packagers make changes in their local git and then just push them, either with fedpkg push or git push, and this creates them on dist-git. These are mostly standard git operations, not much else. The PR-based workflow is similar, with a PR being created along the way. It's usually used when you have a new branch and are trying to create it. As far as we know, this is all there is regarding the packager workflow.

Then what is interesting is the summary. There is a map that tries to capture what is communicating with dist-git, what dist-git is communicating with, and how. For example, dist-git publishes messages which are consumed by other services, and it uses the Pagure schema for this, so it doesn't even have its own schema for publishing messages. Otherwise, there are plenty of other services that just use the dist-git API or interact with it through HTTP requests. There are a few that do git interactions directly with dist-git, but only a few. So it's either API requests, HTTP requests, or just git commands. Let me go back; here is the list of all the API calls.
These are the ones we actually found being called by something. Because dist-git has Pagure as its base, we needed to find out what is actually being used in the case of dist-git, because not everything is. There are a few APIs that are dist-git specific, and those have the prefix _dg, for dist-git.

In the case of the messaging schema, these are the messages published by dist-git that are consumed by other services. This is just a subset of the whole Pagure messaging schema, so only a few things are actually used.

For the HTTP endpoints, there are three that are specific to dist-git, or not really specific, but they are not something basic or simple. They produce JSON responses or JSON files that can be accessed at specific URLs, which is something that needs to be taken into consideration when creating a new solution.

Then there are the git interactions, which are plain git interactions, usually git push and git commit. Git commit doesn't actually do anything with dist-git, so with dist-git you have git push, git clone, and git fetch, the basic git commands.

Access control is something that needs to be re-implemented, because it is completely implemented in Pagure. It controls users' access to a specific branch or repository. It handles the questions of who can commit to this repository, who can push to this repository, and so on.

For web interface features, there are a few worth mentioning that are not part of Pagure itself and that dist-git adds. It will probably be easier to just show it. If you open any project, you can see links to Koji, Bodhi, Bugzilla, Fedora Packages, and Koschei. Let me open one here. So these links here are what I'm talking about.
You can see direct links to this repo or this package on various other services. The other thing that is specific is that the Issues tab is not really doing anything; it just redirects you to Bugzilla. You can also set the monitoring status for the repository, which is used by release-monitoring.org, and there is orphaning and taking of packages: you can orphan a package or take one, if you have the permissions to do that. Otherwise, it's nice that you can see the package versions in the various releases, but that is not any special functionality; it's just a table on the page. Okay, let me get back to this.

The last two things mentioned here are these. First, the FAS integration. This is just to authenticate users, and it's based on the Pagure FAS authentication, so it has the same issues. And the last thing is the lookaside cache, which is on the server where dist-git is hosted. It is just some disk space used for the package sources. These are the tar files that you get when you run fedpkg sources, and when you upload sources, they are put in the lookaside cache, which is later used for building the packages themselves. Okay, and that is the current dist-git deployment.

And here is the replacement that we want to use. This could be somewhat confusing at first, but the basic idea is that we will have dist-git as just the front end, with the git forge behind it. Through the compatibility layer, it will convert the HTTP API calls from various services or scripts into calls for the specific git forge. So the only thing that will need to be replaced when we change the git forge in the future is the compatibility layer, which will then call a different HTTP API on the other end, or the git API, depending on what we change.
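To make the idea of the swappable compatibility layer a bit more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the design from the ARC document: the `ForgeBackend` interface, the two backend classes, the `translate` helper, and the `forge.example.org` host are all hypothetical names, and the URL shapes only mimic Pagure-style and GitLab-style project endpoints. The point is that callers keep sending the legacy path, and changing the forge means swapping the backend object.

```python
from dataclasses import dataclass
from typing import Protocol
from urllib.parse import quote


class ForgeBackend(Protocol):
    """Hypothetical interface: what the compatibility layer needs from a forge."""

    def project_url(self, namespace: str, name: str) -> str: ...


@dataclass
class PagureStyleBackend:
    base: str  # e.g. "https://forge.example.org" (illustrative host)

    def project_url(self, namespace: str, name: str) -> str:
        # Pagure-style project endpoint: /api/0/<namespace>/<repo>
        return f"{self.base}/api/0/{namespace}/{name}"


@dataclass
class GitLabStyleBackend:
    base: str

    def project_url(self, namespace: str, name: str) -> str:
        # GitLab-style APIs address a project by its URL-encoded full path.
        encoded = quote(f"{namespace}/{name}", safe="")
        return f"{self.base}/api/v4/projects/{encoded}"


def translate(legacy_path: str, backend: ForgeBackend) -> str:
    """Map a legacy path such as /api/0/rpms/kernel to the backend's URL."""
    parts = legacy_path.strip("/").split("/")
    if len(parts) != 4 or parts[:2] != ["api", "0"]:
        raise ValueError(f"unsupported legacy path: {legacy_path}")
    namespace, name = parts[2], parts[3]
    return backend.project_url(namespace, name)


if __name__ == "__main__":
    legacy = "/api/0/rpms/kernel"
    for backend in (
        PagureStyleBackend("https://forge.example.org"),
        GitLabStyleBackend("https://forge.example.org"),
    ):
        print(type(backend).__name__, "->", translate(legacy, backend))
```

In this sketch the services and scripts never learn which forge is behind dist-git; only the object passed to `translate` changes.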
The git API should stay the same, since the git commands will not change, but the HTTP API for GitLab or GitHub would be different. So this is what will be changing. Other than that, we will not need to change anything.

There will be a few new things, like for managing the sources. We will use Git LFS (Large File Storage), which is for handling large files in git. This is something we would like to use right now, but we are limited because we don't have it implemented. So this will be an improvement overall. The second improvement is being able to change to anything else in the future. There is even a legend here showing which applications are managed by which team, and these three components are the replacement.

Okay, I will skip this, because it is just a list of the services. And this is the message handler for the new dist-git. It will not be that different. At least at the start, to not break anything, we will use the same schema as Pagure has. In the future, we will replace it piece by piece with something dist-git specific, and it will send the messages and listen to anything that we need.

And here is the dist-git service. This is just a more detailed view of the dist-git service itself. There is public access, which will be the HTTP access for git and the HTTP API. There will be core services, like the access control list. This will probably be based on the forge, because it is usually set on the forge side. It depends on which one we go with, but we will just ask the forge who has access, and that would be it. Then there is git, a standard git deployment, and Git LFS, which will be for large source binaries and source tarballs. And there are webhooks, which will be for event-based activity execution, the same as Pagure has right now, so that you can do something based on CI or anything else.
Or you can create a CI event in case something happens. Okay, and the last thing here is the HTTP resources, which will be just the front end where you can see all of this in one place.

This gets us to the last part, the compatibility meta-service, which is the translation layer in our case. We will have the dist-git dependent projects and, on our side, the new dist-git service, and between them the compatibility layer, which will take the requests from the dist-git dependent projects, go through the API server, do the translation, and pass them to dist-git in a form dist-git can understand. It will probably be a translator used only for whatever needs to be changed, which in our case is any operation that goes through dist-git to the new git forge, because this will probably change. All the requests that are now made directly against Pagure will go through this compatibility meta-service and will be translated into something the new dist-git understands and can act on. The same goes for the responses. And that is all we have.

I will just mention that we made a roadmap, and that this should be done slowly. We should first know which git forge will be used, if we want to get rid of Pagure or just use something other than Pagure. Then we will need to look at its API and hope we can use it with the current API that is available. The bonus will be adding Git LFS support, which will be a win for us in every way. Then there will be the compatibility meta-service, which will come first, and we can build anything on top of it, such as some kind of web front end. We will create the staging deployment, probably on some sub-domain first, so we don't break the old one. We can try it, play with it, and see whether it works and what needs to be done.
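As a small illustration of trying a client against such a staging sub-domain, the sketch below swaps only the host of a configured dist-git URL and leaves the path and query untouched. The `retarget` helper and both host names are hypothetical, chosen just for the example; a real service would more likely change a single configuration value.

```python
from urllib.parse import urlsplit, urlunsplit


def retarget(url: str, staging_host: str) -> str:
    """Return the same URL with only its host replaced by staging_host."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=staging_host))


if __name__ == "__main__":
    # Hypothetical hosts, used only to show the effect.
    old = "https://distgit.example.org/api/0/rpms/kernel?per_page=1"
    print(retarget(old, "distgit-new.stg.example.org"))
```

Keeping the path and query unchanged matches the ideal case in the roadmap, where the API scheme stays the same and only the endpoint moves.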
We will try to point all the dist-git staging URLs to the new sub-domain, so we can check that everything works. There is a follow-up step with the externally maintained applications. If we keep the API scheme that we have right now, we don't need to do that and can just skip this step. It will only be needed if we have to change something in the API, like something that is not great in the current setup or not usable in the new one. In the ideal case, we wouldn't need to change anything in the other services that use dist-git: just point them to the new one, and everything should work. We would like to get feedback at this stage, so people can start testing it and see whether it works for them and whether anything should be changed. If this goes well, we will follow the same process for production, announce the deprecation of the old one, and decommission it.

That is probably everything. Our estimate is that we will probably need three or four people working on this for two quarters, because it will be a lot of work, and one of them should be a sysadmin who can help with the deployment of the new solution. There will definitely be documentation as part of these steps; it's just not mentioned there. That is all I have to share. I see that this has already taken half an hour. Does anybody who joined have any questions? Sorry for being so late with that.

I have one or two questions about it. Firstly, thank you for the in-depth investigation and the excellent write-up, and of course for going through it again. Does the investigation deal purely with the dist-git part and relate mostly to packagers, or are there other project teams within the Fedora project that would be affected by this change?

Actually, everything I mentioned here will be affected. Those are all the services that are really interacting with dist-git.
In an ideal case, there wouldn't be any change or disruption on their side. They would just get a new endpoint to point to for dist-git. But for Fedora packagers there will probably be changes in how dist-git will look, and maybe the packages as well. In the ideal case, we will just point to the git repository in whichever git forge we go for. So if you click... I already closed it. But if you click on the source in dist-git, it will redirect you to the page on the git forge itself. This may confuse people at first, but it will make things easier, because they will be working with a git forge they actually want to work with, or at least one they know, hopefully. The dist-git part will be just for an overview of everything that is happening, for communicating with the services that need it, and for some other things like changing the orphan status or adding the monitoring settings.

The reason I ask is that there's a discussion open on Discourse, where I'm sure it's on the to-do list to post these findings. I think it's been referenced a few times that this work has been underway, but the discussion is around long-term planning for dist-git, basically the Fedora setup with dist-git. There is probably going to be some sort of community effort to investigate various git forges as options, and this investigation would be a perfect baseline for people to start with, because it's so comprehensive. It gives you an overview of exactly what packagers interact with and all the services involved. So this is more of a suggestion to bring some visibility to this work over there.

Yeah, I remember I did add a link to this investigation to some ticket; I think it was a FESCo ticket. I'm not sure whether it's the same one you are talking about.

No, there was a split-off discussion.
The original FESCo ticket was about git forge evaluations, and then a point was made in the discussion that the focus of the conversation should be narrowed. To me, the most pressing, most immediate, most impactful part of the git forge puzzle is the dist-git piece, that front layer that packagers actually interact with. So there's a discussion around long-term planning for the service. I just can't remember what I called that topic.

Yeah, I'm looking now, and I see I added it to the Git forge evaluation 2024 one.

Yeah, I will link it, or I'll find you the link to the other one.

But you can just point to the thread that we have on Discourse for the investigation. We are asking people for feedback on it there.

Excellent, okay, good. And my last question: I assume that the work CPE takes on for this will be alongside community involvement, once a formal decision has been made, and that the work here is not going to start until the project itself has kicked off. Is that correct?

Yeah, we don't plan to start on this until we know which git forge will be used. I hope this will start as a Fedora initiative, or Fedora project initiative, and not as an initiative by CPE itself.

Thank you again, to you and the rest of the team, for putting all the work and effort into this.

Thanks. I will stop the recording now, if there aren't any other questions. Okay, it doesn't seem like it. Thanks, all, for listening.