Hello everyone, welcome to the Jenkins Infrastructure Team weekly meeting. Today is the 26th of September 2023. This will be a solo session because I'm alone today: Mark Waite is currently travelling, Hervé Le Meur is unavailable, Stéphane Merle is ill, Bruno Verachten is giving a course, and Kevin Martens is ill as well. So anyone trying to poison the team must be brought to justice, right? Okay, let's run through the meeting quickly.

First of all, announcements. The weekly release 2.425 has been released successfully: I believe packages and Docker images are out, and it is already deployed to infra.ci.jenkins.io, where it's running well for now. The next-release checklist item will be done later, I assume tomorrow, since Kevin is ill. No more announcements on my side as far as I can tell.

There was a quick incident on the incrementals publisher, which was in a weird state after a Helm deploy operation, so we had to delete the Deployment and recreate it. That was about 5 minutes of outage today; I'll open an a-posteriori status page incident afterwards with an explanation. What happened is that the Helm release wasn't able to deploy and was stuck: the older ReplicaSet was being deleted but never terminated, which left things in an unwanted state. We had to remove the ReplicaSets and the Deployment, and then applying the chart again worked. In short, the ReplicaSet was not atomically deleted, so we had to remove it manually and redeploy (a rough sketch of the cleanup is at the end of this update).

The next weekly, 2.426, will happen next week as usual. Next LTS: I don't know, so I will put N/A. I assume it will be mid-October; we'll confirm next week when everyone is there. As far as I can tell, there are no security advisories: as you can see, none announced. We had one last week for core, as everyone remembers.

Next major events: Mark Waite is currently attending the DevOps World tour in Chicago this week, then Santa Clara and London. I believe London might change, so be careful; I'm removing the dates. And don't forget the 1st of October: Hacktoberfest starts, and we need to be ready.

Okay, so let's get started with the work. We had a task about a user blocked by an anti-spam system: it looks like some IP ranges that are considered spam ranges are shared by different ISPs, so we had to unblock the user. There was an issue with a plugin not showing up, but it was on the plugin side; the maintainer took care of it and it fixed itself, no action required from us.

We had an issue with the latest Launchable version: it's still failing. We tried to deploy it, and it broke Jenkins core builds for half a day, so we had to roll back to 1.66.0. This has been raised on the Launchable command-line issue tracker, and we'll see when the problem is fixed. The important takeaway is that future Launchable updates require us to be more careful: we probably want to test them directly, using the pipeline library or a custom installation function in a pipeline. Another issue due to anti-spam was fixed: same kind of problem, same result.

Finally, the list of mirrors for jenkins.io is now stable and enabled. BelNet is back, and the one I think was Servana is back as well. We removed the mirrors that weren't usable (no rsync or FTP): we had sent an email last week to the maintainers of those mirrors, and they weren't responsive enough, so we removed the mirrors for now. If they contact us back to re-enable them, we will take care of adding them back.
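For reference, here is the rough sketch of the incrementals publisher cleanup mentioned above. It is a minimal, hypothetical illustration using the official Python kubernetes client: the namespace and Deployment names are invented, and the real incident was fixed interactively, so treat this as a conceptual note rather than our actual runbook.

```python
# Hypothetical sketch of the "stuck ReplicaSet" cleanup, using the official
# Python kubernetes client. The namespace and deployment names below are
# invented for illustration; the real incident was fixed by hand.
from kubernetes import client, config

NAMESPACE = "publisher"                # hypothetical namespace
DEPLOYMENT = "incrementals-publisher"  # hypothetical deployment name

config.load_kube_config()  # use the current kubeconfig context
apps = client.AppsV1Api()

# 1. Find the ReplicaSets owned by the stuck Deployment.
for rs in apps.list_namespaced_replica_set(NAMESPACE).items:
    owners = rs.metadata.owner_references or []
    if any(o.kind == "Deployment" and o.name == DEPLOYMENT for o in owners):
        # 2. Delete the ReplicaSet that was never terminated.
        apps.delete_namespaced_replica_set(rs.metadata.name, NAMESPACE)

# 3. Delete the Deployment itself, then re-apply the Helm chart
#    (e.g. `helm upgrade --install ...`) to recreate everything cleanly.
apps.delete_namespaced_deployment(DEPLOYMENT, NAMESPACE)
```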
Back on the mirrors: it's not written in the notes, but we now have a full and clean list of mirrors. Just to note, removing a mirror triggers a full re-scan, which means the fallback might be used for a few hours, delegating most of the traffic to the fallbacks.

Another point that might turn into an issue, as raised by Daniel Beck on IRC: archives.jenkins.io should be used as the fallback, or as a mirror with a lower weight, because some old packages currently fall back to OSUOSL, which seems to have a retention window, and after that window they delete the packages. Why archives.jenkins.io? We are moving it to DigitalOcean, so it can handle being a fallback mirror by default in terms of bandwidth and performance, quite a step up from the old machine that was used as a low-priority mirror. There has been discussion on that topic; I remember Mark and Olivier saying that we want both. But in that case, given the infrastructure effort, and since archives.jenkins.io has everything, the goal is to make it a fallback mirror or a lower-weight mirror. By default it has all the content, so being only a fallback should be the way to go (a toy sketch of the weighting idea appears at the end of this update). Issue to create on the new milestone.

Now, what about the work in progress? I'm taking the items in the order of the list. First, thanks, Stéphane Merle, for working with and helping our new contributor. There is work on the Packer image, the only one image we are building, since we are back to trying to build Windows containers. We are trying to move the whole test harness to a tool named goss, which will make testing easier and more portable. There has been an effort on goss, especially with the incoming changes. Stéphane worked a lot to help on the Playwright issue from our new contributor; that will help with the incoming Windows support and will help make the builds faster. That issue is a set of tiny step-by-step tasks; thanks, Stéphane, for taking care of it. There is some automation with updatecli, and the next step will be to move all the sanity checks to either an acceptance test or a sanity test; we need both of them. Details are on the issue.

Work in progress on Oracle Cloud and its tools: the goal is to remove them. Status: we have removed the job, we are going to archive the Terraform project, and then we will remove all access to the organization, but only once archives.jenkins.io has been migrated away.

Next step: we have effectively switched from tfsec to Trivy for static analysis of Terraform files. Trivy is sometimes a bit complicated to manage, because the inline exceptions are a bit more sensitive due to the regular-expression support. It's a work in progress; it has already been reported to the Aqua Security Trivy project, and they're working hard on it. The next step before closing the issue will be to remove tfsec from everywhere in the infrastructure.

Speeding up the Docker image library to create and push tags at the same time to both GitHub and DockerHub: that is an important part required for the next step of the ARM transition. Alas, Stéphane was absolutely full of other topics that are more important than this one, but now we have shared the burden, so that one should be okay to be worked on during the next milestone.

I haven't made any changes and didn't have time to work on the feature requested from the login page, so to be continued; if I don't have time during the next milestone, it will go back to the backlog of issues.
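Circling back to the mirror-weight discussion above, here is a toy, self-contained sketch of the concept. This is not mirrorbits code; the mirror names and weights are made up. It just illustrates why a lower-weight mirror still receives a share of normal traffic, while a pure fallback only serves requests when no regular mirror is available.

```python
# Toy model of mirror selection: not mirrorbits code, just the concept we
# discussed. A "lower-weight mirror" still gets a share of normal traffic,
# while a pure "fallback" only serves when no regular mirror is available.
import random

MIRRORS = {  # hypothetical mirrors and weights
    "mirror-a.example.org": 50,
    "mirror-b.example.org": 45,
    "archives.jenkins.io": 5,   # low weight: still serves ~5% of traffic
}
FALLBACK = "archives.jenkins.io"  # pure fallback: 0% of normal traffic

def pick_mirror(available: set[str]) -> str:
    """Weighted random choice among available mirrors, else the fallback."""
    candidates = {m: w for m, w in MIRRORS.items() if m in available}
    if not candidates:
        return FALLBACK
    names, weights = zip(*candidates.items())
    return random.choices(names, weights=weights, k=1)[0]

# Normal operation: traffic splits roughly by weight.
print(pick_mirror({"mirror-a.example.org", "mirror-b.example.org"}))
# Re-scan or outage: everything goes to the fallback.
print(pick_mirror(set()))
```

Since archives.jenkins.io has all the content anyway, the fallback-only configuration (weight 0 in the model above) is the simpler of the two options, which is why it should be the way to go.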
Ensuring HA for the replicated services: huge work from Stéphane on that part. We have identified that we need to be careful with anti-affinity and PodDisruptionBudgets (PDBs), and we now have a list of the replicated services that we are treating one by one to ensure each has both. Anti-affinity ensures the services are spread across nodes, so if we lose a node or do maintenance, a replicated service still answers. A PDB ensures that during normal operations, such as rotating the operating system or upgrading Kubernetes nodes, the scheduler knows how to orchestrate things, i.e. how many pods can go offline at a given moment (a hedged sketch of both objects is at the end of these notes). That's a nice job. Now we are going to iterate on the PDBs to open the road to fully migrating these services to ARM64. Thanks, Stéphane, for that great work; the list is really useful, so we can split the burden.

A word on the Matomo GitHub/Docker repository: we have decided on the architectures, and everything is ready to roll for starting the MySQL work. I'm taking over the Docker image build to help Stéphane so he doesn't have too many tasks. We should then be able to deploy the new MySQL instance to support that new machine. Reminder: Matomo should run on ARM64 from the start.

A word about the Artifactory bandwidth reduction options: every issue has been fixed, technically speaking, so now it's in Mark Waite's court to confirm, from the statistics he receives weekly from JFrog, that we have reduced the bandwidth. Next month we should be able to report success or failure to JFrog. Let's see the results on the real-life system; if the bandwidth hasn't decreased, we will need to analyze where that traffic comes from.

The topic of deleting the pages on jenkins.io that aren't accessible or indexed anymore will be put on hold, because Hervé is busy with the update center and other tasks and has less availability this week. I will put this issue on the next milestone, and in the short term I will try to at least remove the faulty page, so everybody can continue the work around backing up and doing the migration.

Finally, a word on the updates.jenkins.io migration. Hervé, thanks for the huge work here: he is currently trying to find a, let's say, sustainable setup for Cloudflare with Terraform, involving DNS, TLS, and R2 buckets. Then Hervé should be able to start testing the new mirrorbits instance.

We weren't able to create or spend time on the Hacktoberfest issues; I hope we will have time this week, otherwise we'll just say no Hacktoberfest for the Jenkins infra. I hope anyone ready to help will; don't hesitate if you're interested. We still have a few issues that could be done and marked for Hacktoberfest at the last minute.

Okay, so that's all for me; let's see each other next week. All the issues from the current milestone will be moved to the next milestone, and new issues will be triaged automatically as they come in. Thanks everyone, and see you next week.
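As an appendix to the HA discussion above, here is a minimal, hedged sketch of the two Kubernetes objects we mentioned, built with the Python kubernetes client models. The service name and label are hypothetical, not one of our actual services: anti-affinity spreads replicas across nodes, and the PDB caps how many replicas voluntary disruptions (node drains, upgrades) may take down at once.

```python
# Hedged sketch of the two objects discussed above, using the Python
# kubernetes client models. The "app: my-service" label is hypothetical.
from kubernetes import client

LABELS = {"app": "my-service"}  # hypothetical service label

# Anti-affinity: never schedule two replicas on the same node, so losing a
# node (or draining it for maintenance) leaves at least one replica serving.
anti_affinity = client.V1Affinity(
    pod_anti_affinity=client.V1PodAntiAffinity(
        required_during_scheduling_ignored_during_execution=[
            client.V1PodAffinityTerm(
                label_selector=client.V1LabelSelector(match_labels=LABELS),
                topology_key="kubernetes.io/hostname",  # one replica per node
            )
        ]
    )
)

# PDB: during voluntary disruptions (kubectl drain, node upgrades), keep at
# least one replica available at any given moment.
pdb = client.V1PodDisruptionBudget(
    api_version="policy/v1",
    kind="PodDisruptionBudget",
    metadata=client.V1ObjectMeta(name="my-service-pdb"),
    spec=client.V1PodDisruptionBudgetSpec(
        min_available=1,
        selector=client.V1LabelSelector(match_labels=LABELS),
    ),
)

# Render both as plain dicts (what the equivalent YAML manifests contain).
print(client.ApiClient().sanitize_for_serialization(anti_affinity))
print(client.ApiClient().sanitize_for_serialization(pdb))
```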