So, hi everybody, welcome to this new Jenkins infrastructure meeting, the first one after FOSDEM. The good news is that nothing major crashed during FOSDEM. I'm not sure who added "FOSDEM results" to the agenda, or whether it really belongs in this meeting, but let's briefly talk about it. It went great. I followed the Jenkins stand and the CI/CD dev room. For the Jenkins stand, there are things that could have been improved, because we couldn't stream demos, so people had to join the video call. But the CI/CD dev room went really well. I saw some numbers posted on the web mentioning 33,000 attendees over the weekend; considering that an in-person FOSDEM maxes out around 8,000 attendees, that's a pretty impressive number, especially since this was the first time that system was used for FOSDEM. So yeah, that was a great weekend.

Regarding the Jenkins stand, I have the feeling that we had fewer attendees than in previous years, probably because people had to join the call. During an in-person FOSDEM we usually get people who just show up at the booth to watch the demos without really engaging, and this time we could not see those people; we only saw the people who started a discussion. So from my point of view it's hard to say whether we had more or fewer attendees than the previous year. But at least nothing major broke during FOSDEM from an infrastructure point of view. Well, except the VPN, but that was not critical for the Jenkins community, so we could still do demos and such. So that was great. Any questions before I continue? No? Sounds good.

So let's briefly talk about the VPN outage. Last Wednesday we discovered that the VPN would not work anymore. The incident started after we updated the OpenLDAP Docker image, which was really weird. We tried to replicate the issue locally and identified that the problem was related to TLS. For some reason we could establish a connection from an LDAP client to OpenLDAP, but the LDAP plugin used by OpenVPN could not establish its connection to OpenLDAP. We investigated with Garrett and Damien, and what we discovered was in the LDAP configuration: the LDAP plugin from OpenVPN was configured to use the old CA. Previously, OpenLDAP was using a certificate from a specific CA (I don't remember which one), and back in June we switched to Let's Encrypt. For some reason it kept working until now, and then it stopped. Our guess is that a default configuration in OpenLDAP refuses connections presenting a certificate from the wrong CA. Anyway, we fixed that this morning, so everything is back on track.

In terms of incidents, that was annoying but not a major issue, because we had workarounds and we do not rely heavily on the infrastructure running inside the VPN, so it was okay to delay that work for a few days. We also identified an improvement for the OpenVPN setup: we can now easily reproduce the environment locally using a Docker Compose file. Damien opened a pull request, already merged, containing fixes so we can use self-signed certificates locally and easily replicate the production environment (see the sketch below). Any questions before I move to the next topic?
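For reference, here is a minimal sketch of what such a local reproduction could look like. The image names, environment variables, and paths are illustrative stand-ins, not the actual contents of Damien's pull request:

```yaml
# Hypothetical docker-compose.yaml for reproducing the OpenVPN + OpenLDAP stack locally.
# Images, ports, and paths are illustrative, not the real infra configuration.
version: "3.8"
services:
  ldap:
    image: osixia/openldap:1.5.0        # stand-in for the production OpenLDAP image
    environment:
      LDAP_TLS_VERIFY_CLIENT: "try"     # the kind of TLS/CA setting suspected in the outage
    volumes:
      - ./certs:/container/service/slapd/assets/certs:ro  # self-signed certs generated locally
    ports:
      - "636:636"
  openvpn:
    image: kylemanna/openvpn:2.4        # stand-in for the production OpenVPN image
    depends_on:
      - ldap
    volumes:
      - ./openvpn:/etc/openvpn:ro       # config pointing the LDAP auth plugin at ldap:636
    cap_add:
      - NET_ADMIN
```

With both containers on one Compose network, the TLS handshake between the OpenVPN LDAP plugin and the directory can be exercised against self-signed certificates without touching production.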
The next thing I want to briefly mention: I sent an email to the mailing list this afternoon. I've been monitoring the status of Serverion, the mirror. The IP was stable and the DNS record was stable, so I put that mirror back into our infrastructure. I had to remove the configuration and add it back again, because for some reason mirrorbits would not let me update the location: it would not detect the new location of the public IP. I suspect it only configures the location the first time we add a mirror. Removing and re-adding the mirror correctly discovered the location, so it should be back soon, and if anything is wrong we'll investigate. We still have to monitor the mirrors; I haven't had the time to open a PR for that, but it should be a pretty quick fix.

Oh, so Olivier, the location, do we know where it's declaring itself, location-wise, now? Is it correctly declaring itself in the Netherlands, or is it still appearing somewhere in the United States?

No, if you look at my screen, I'm listing the mirrors, and it's correctly detected in the Netherlands. For the moment it reports the mirror as down, but it should come back. Basically, the way mirrorbits detects whether a node is up or down is to take a random file and test whether that file is present on the mirror (see the sketch at the end of this topic). The problem we have right now is that almost every mirror only contains part of the files, so if mirrorbits tests a very old file, it flags the mirror as down even though it is not. So sometimes it can take some time to add a mirror to the pool. An improvement would be to configure archives.jenkins.io to be the source of the mirrors, so they would download every file. That would simplify mirrorbits management, and from an end-user point of view it would change nothing, because who cares to download Hudson binaries anyway? archives.jenkins.io contains every artifact generated under the Hudson and Jenkins projects, whereas people mostly rely on recent artifacts. It's not mandatory, but it would be a way to simplify the mirrorbits management. Unless you have a question, I'll move to the next topic, which is Jenkins updates.

Last week we had issues with updatecli, and we had issues with the VPN, so we have quite a lot of pending PRs updating various things in our infrastructure. I've been waiting before merging them, because merging the PRs that define which Docker image we use automatically restarts the Jenkins instances, and I didn't want to do that today because we had the weekly release today. Both release.ci and trusted.ci would be affected. And I think we have a stable release coming tomorrow, right? Yes. So it's definitely not the right time to play with Jenkins updates; we'll probably wait until Thursday before merging the PRs related to Jenkins.

We could attempt to start the 2.263.4 build early in your day; it's only a three or four hour process, if I remember right, so you could have Wednesday afternoon to work in, if you'd like. But it would mean someone would have to launch that build early in the day.

To be honest, I wouldn't bother for that specific version; I would just wait until Thursday.
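Coming back to the mirror checks mentioned earlier, here is a rough sketch of that detection logic. It illustrates the idea and why partial mirrors get flagged; it is not mirrorbits' actual code:

```python
import random
import urllib.request

# Illustrative only: mirrorbits' real scan logic differs, but the principle is the same.
def mirror_seems_up(mirror_base_url: str, known_files: list) -> bool:
    """Probe a randomly chosen file from the source tree on the mirror."""
    path = random.choice(known_files)  # may be a very old file the mirror never synced
    try:
        with urllib.request.urlopen(f"{mirror_base_url}/{path}", timeout=10) as resp:
            return resp.status == 200
    except OSError:
        return False

# A partial mirror can fail this probe even though it serves recent artifacts fine,
# which is why a mirror carrying only new releases can be reported as down.
```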
While we are also talking about Jenkins, I would like to update the EC2 plugin on ci.jenkins.io. There is one feature I discovered that I had been missing for a while. On EC2, with the EC2 plugin, we had to specify a specific AMI, which means that each time we built a new AMI we had to manually update the Jenkins configuration, which was cumbersome. With the more recent versions of the EC2 plugin we can filter and fetch the latest version, so we would always use the latest AMI, which is probably not what we are doing at the moment. But yeah, we'll probably wait until Thursday, to be sure that everything is fine before taking those services down. Yes?

Would you be willing to use that same time slot to do the upgrade to 2.263.4? That way we do them both.

So both the upgrade to 2.263.4 and the EC2 plugin update? Yes, that sounds good to me. Okay, great.

The next topic is PagerDuty. Last week I did a demo for Garrett, Damien and Cara about how to use PagerDuty, and I realized at that time that I did not have permission to invite people. Basically, what we would like to do now is start using PagerDuty again, have more people in the loop, and fix either the incidents or the monitoring. Right now a few people receive the monitoring alerts, but they don't necessarily respond, so we end up just ignoring alerts; the idea is to start using it properly again. I sent a bunch of invites today, in case other people are interested in participating in the on-call rotation. What we try to do is only be on-call during our working hours, because we have enough people in different time zones to cover issues. The priority right now, when we get a notification, is to identify whether the issue is relevant. If it's not relevant, we update the check accordingly so we don't get notified again. If it is relevant and we don't have documentation, we update the documentation so other people can deal with that issue next time. So yeah, the idea is to be more rigorous about the way we handle PagerDuty issues. We already have a Git repository with documentation, so this would be a perfect moment to update that documentation.

Next topic. I leave the floor to Damien: Damien and Cara have been working on rootless JNLP agents, so I propose that Damien explain the current status of that work.

The technical work addresses the initial issue, which was having the default user on these Docker images be something other than the root user. We use these images in different places on the infrastructure, and there were scenarios where running as the root user triggered issues, not to mention the security concern, especially because we don't run the Docker engine as a rootless engine as of today. In that context, the technical work has been done by Cara, and she took the opportunity to add a test harness that is common to all the images, because all the images from the jnlp-agents repository share the same expectations in terms of behavior: the same entry point, the same default user. Most of these images are built by inheriting from the upstream image of the tool they provide, like Golang, Maven or PowerShell, and then some files or elements are copied or duplicated from the Jenkins inbound-agent images: the agent jar file, a script. So the resulting image might differ from the inbound agent in terms of behavior or content.
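As a minimal sketch of the kind of change this rootless work involves, here is a hypothetical Debian-based tool image given a non-root default user; the actual jnlp-agent Dockerfiles are more involved:

```dockerfile
# Illustrative only: a tool image (here Maven) made to run as a non-root default user,
# in the spirit of the rootless JNLP agent work. Names and versions are hypothetical.
FROM maven:3.6-jdk-11

# Create a dedicated user; the exact commands differ per base OS (Debian vs Alpine vs CentOS).
RUN groupadd -g 1000 jenkins \
 && useradd -u 1000 -g jenkins -m -d /home/jenkins jenkins

# Agent jar and entrypoint script copied in, mirroring what the inbound-agent images provide.
COPY --chown=jenkins:jenkins agent.jar /usr/share/jenkins/agent.jar
COPY --chown=jenkins:jenkins jenkins-agent.sh /usr/local/bin/jenkins-agent

USER jenkins
WORKDIR /home/jenkins
ENTRYPOINT ["jenkins-agent"]
```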
That is why Cara wrote this test harness: to be sure that deviations from the general behavior we expect from all these images are detected really early in the process. With this we are able to deliver; however, there is still one blocking concern, which is the naming. After some discussion, we made a proposal about moving the jnlp-agent images that live under the jenkinsci namespace on Docker Hub. Everyone agrees on the renaming of the image-name part, so inbound-agent instead of jnlp-agent, following the parent images. But since these images are, at least initially, only known to be used on the Jenkins infrastructure, we proposed to move them to the jenkinsciinfra namespace, so that the existing images, which are not documented and not updated as of today, would be deprecated and no longer presented as an artifact to the community. However, the discussion on the mailing list raised that some people from the community are consuming these images, so it looks like there is still an expectation of keeping the jenkinsci namespace. No decision has been taken; I think the discussion was delayed by the VPN outage and FOSDEM. However, we need to take a decision.

So the proposal would be the following. First, we deliver the change for the Jenkins infrastructure: we rename the images, including the move to the jenkinsciinfra namespace, and we announce the deprecation of the old image names. Then we can totally reintroduce, next month or in the following months, a new jenkins/inbound-agent-<something> image, if and only if the image is documented and there is someone from the community willing to help maintain it. That's the proposal. If no one comes forward to maintain these images, then the images won't be provided, and they will just disappear because no one is using them. The goal is to ensure some quality level, because providing images that are updated once a year is a disservice to everyone: to us as maintainers, but also to the community, because they will keep using an image assuming it works while it does not. And I don't even mention the security issues, because if we ran a security scan on these images, the results would be really, really bad.

Some questions came out of that discussion. First, do we really want to provide these images, given that we already have different dimensions to maintain: JDK version, operating system, inbound versus outbound agent? And this adds another dimension: we want Maven, but do we want to maintain Maven 3 and the incoming Maven 4? Do we want two versions of Ruby? Different versions of Golang as well? What are the deprecation notices? It depends on upstream tools that have different deprecation policies and life cycles than ours. So we have to ask ourselves what use cases are solved by these images; that's really important. That's also the reason why I propose we move them to the jenkinsciinfra namespace: because we know why we use them. Then, if the community asks questions, that means there are people using them with interesting use cases, and it's important to capture those use cases to build the correct artifact for them. Maybe they will need these images back, but we need to be sure not to waste our time there, because it's a complex topic, in particular around maintenance and updates.
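For context, the existing harness is built on bats; here is a hypothetical flavor of what one of the shared behavioral checks could look like (the image name and expectations are illustrative, not the actual suite):

```bash
#!/usr/bin/env bats
# Illustrative bats tests in the spirit of the shared harness; not the actual test suite.

IMAGE="${IMAGE:-jenkinsciinfra/inbound-agent-maven:latest}"  # hypothetical image name

@test "default user is not root" {
  run docker run --rm --entrypoint id "$IMAGE" -u -n
  [ "$status" -eq 0 ]
  [ "$output" = "jenkins" ]
}

@test "agent jar is present" {
  run docker run --rm --entrypoint ls "$IMAGE" /usr/share/jenkins/agent.jar
  [ "$status" -eq 0 ]
}
```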
Some improvements that we see coming, which might not be a priority, came out of this discussion and of the work Cara did adding a specific test harness per image, because the Golang image has some differences from the Maven image, for instance. We need to improve the process so that it is able to build and test images in parallel efficiently, with first a common test harness and then a specific test harness per image. Second, we see another improvement: thinking about all the images we produce for the Jenkins infrastructure, all the agents and even the controller, and maybe using the Container Structure Test (CST) harness. The goal would be to test half of the features or expected behaviors with CST, which is really fast to run, and leave only the complex acceptance tests on the already existing harness built on bats (see the sketch at the end of this topic). This would improve not only the speed of these builds but also security, because CST does not need a Docker engine for most of the tests it describes. And by doing this, we could also build Docker images without a Docker engine, allowing the builds to be moved to the Kubernetes cluster. Builds would be faster, with more agents available, and no need to ensure that builds don't run concurrently: we could have concurrent builds without any risk.

Finally, we have two ideas that are completely exploratory, but I want to mention them. If we have to maintain all these build dimensions, maybe the Dockerfile is not the right tool for the job. Packer is able to build images, and we could use Packer to maintain that matrix in the future. Cara is currently experimenting on a draft pull request; it's a proof of concept, not aimed to be productized, but still interesting as an experiment.

The second element is a Dockerfile generator idea. That's a project we might want to add to GSoC with Cara. The idea is that a contributor would go to the Docker page on the Jenkins website and find a form: you select that you want the Docker image for an agent or a controller, you pick a base operating system (Windows, Alpine, Debian, whatever), then JDK 8, 11, maybe 15 in the future, and then you customize with supported tools from a list, say Terraform, Golang, whatever. Then you click "I want this", and local JavaScript in your web browser generates the Dockerfile. It's completely static, needing no backend or database; it only works from the file system and produces the Dockerfile recipe: "we recommend you build your image with this template", with pinned versions and eventually checksums. The database would be a JSON file served alongside the HTML file, so we could totally run it as a static service. That could be a great way to provide the service to the community, because the idea is to provide a recipe and not the cooked meal: a cooked meal can go bad if we keep it out of the fridge for months, and it's the same for a Docker image. By providing a generator of recipes, anyone can cook and maintain the whole meal whenever they want, and the recipe can be shared as a service. But just to clarify: these alternate ways to maintain the images are just suggestions. If the community wants to participate in them, that would be nice, but from an infrastructure standpoint we don't plan to spend too much time on them.
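Coming back to the CST idea above, here is a hypothetical container-structure-test configuration showing the kind of fast, daemon-light checks it enables; the image under test, paths, and expectations are illustrative:

```yaml
# Hypothetical container-structure-test config in the spirit of the CST idea;
# the image under test and file paths are illustrative.
schemaVersion: "2.0.0"
commandTests:
  - name: "default user is jenkins"
    command: "id"
    args: ["-u", "-n"]
    expectedOutput: ["jenkins"]
fileExistenceTests:
  - name: "agent jar present"
    path: "/usr/share/jenkins/agent.jar"
    shouldExist: true
```

Because checks like these inspect the image contents and metadata rather than exercising a long-lived container, they run much faster than acceptance tests that shell into running containers, which is the split described above.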
So these are more ideas that we are exploring; I don't think the Jenkins infrastructure should focus on building Jenkins agent images for the community, because that's definitely a complex situation. Let me explain why building these Jenkins images is challenging. The Jenkins project builds an inbound agent: a small Docker image containing the inbound agent, which can establish a connection with a controller. But we also use images with Maven, Ruby and Python, and for those images we had a choice: either maintain our own images and control the way we install Maven, Python and Ruby, or reuse upstream. The latter is basically what we did: for Ruby we use the upstream Ruby Docker image, for Python the upstream Python Docker image, and so on. But because we don't control those images, some use Alpine, some use Debian, some use CentOS; and because in our case we want those inbound agents to have a jenkins user, we have different ways to create a user depending on the operating system, for instance. That's where the challenge comes from, and the same goes for the version of Java. That's what Damien explained with the matrix of Dockerfiles. Again, we have to see how we can use it, but it's definitely challenging.

So that's why we had that discussion: should we build inbound agents for different tools for the community, or do we build them only for the Jenkins infra project? Building them for the Jenkins infra project means we build what we need; it's a decision on our side. On the other side, once we push and publish an image under the Jenkins Docker Hub organization, people assume that we are shipping and maintaining those images, which is not necessarily the case. So if you are interested in maintaining those images, we would like to add the person to the CODEOWNERS file, to really identify who is responsible for each image. Maybe just one image: if you're interested in maintaining just the Python one, that's fine as well. But we definitely have to clean up and simplify those inbound agents; that's what we are working on at the moment.

Just a side note regarding the Jenkins infra project: the problem will be solved in a different way as soon as we are able to switch to Kubernetes agents, because with the concept of a pod with multiple containers, you only need highly specialized Docker containers and you don't need these combined images anymore. As soon as every Jenkins infra job using these images has switched to Kubernetes-based agents, we won't have to maintain these images for our own infrastructure, and that should be a good indicator of our progress.

Technically we do need some care there, because one of the issues we had, for instance, was with the Maven Docker image, which runs as root by default, and running Jenkins tests as root fails at some point. That's how all of this started, basically. So you need to be sure that if you are using Maven, you are using it with a different user than root; I hit that issue while working on this. That can totally be provided through parent templates for the Kubernetes agents: we can provide these rules in the pipeline library or in parent pod templates (see the sketch below). But the goal is to get rid of these images, and that would be a good indicator of the simplification of the tooling we are using.
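As a minimal sketch of what such a multi-container Kubernetes agent could look like in a pipeline: the image, label, and runAsUser value are illustrative, not the infra's actual pod templates:

```groovy
// Illustrative scripted-pipeline pod template (Jenkins Kubernetes plugin).
// The image name and securityContext are hypothetical examples.
podTemplate(yaml: '''
apiVersion: v1
kind: Pod
spec:
  securityContext:
    runAsUser: 1000            # run the tool containers as a non-root user
  containers:
  - name: maven
    image: maven:3.6-jdk-11    # plain upstream tool image, no custom agent image needed
    command: ["sleep"]
    args: ["infinity"]
''') {
  node(POD_LABEL) {
    container('maven') {
      sh 'mvn -version'        // tool steps run in the upstream container, not as root
    }
  }
}
```

The point is that the agent connection is handled by the plugin's own jnlp container, so the tool containers can stay as unmodified upstream images, which is what makes the combined inbound-agent-plus-tool images unnecessary.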
Yep. I propose to finish quickly, because we are over time for this meeting. In terms of releases, we had the weekly release today and everything went well. I'm not sure if Mark already published the GitHub changelog. Yes, you did, that's awesome. We have the stable release coming tomorrow.

And the last topic before we close this meeting is about Oracle Cloud. Have you created that account yet? Or is it still... so you created the account? Do you need some help to deploy the mirror? Yes, but I think given the pending infrastructure changes we should look at that maybe next week, not this week. Let's let this Oracle experiment wait another week, if that's okay. If you have some time on Friday, I would like to do it on Friday. Great. Because basically what I fear is that once the account is started, the time we can use under the sponsoring program starts running down, and we always have excuses to delay work. So let's pick a date and stick to it. Great. All right, I'll schedule some time with you for Friday. Awesome.

Then thank you for participating in this infra meeting. I propose we stop here and continue the discussion on IRC, basically. Thanks for your time. Bye bye.