Hi everybody, welcome to this new Jenkins infrastructure meeting. Before we cover the main topics, I just want to mention two things for next week. First, we'll have Jira maintenance next Wednesday; if you go to status.jenkins.io you'll see the announcement there. It shouldn't be too long, maybe one hour, maybe two. Well, in fact it's a bit long, but it's needed, and everything is explained there.

Actually, that will be Thursday, right? Not Wednesday, because of the risk around the 303.2 release.

Yeah, exactly. Second, as you mentioned, we're going to have a security release next week. That means we try not to modify the Jenkins instances related to the release environment, so pay attention if you're planning to apply any changes. Those are the two most important pieces of information. Let me put that in the notes for the announcement: status page, Jira maintenance. That's it.

So, what are the main activities happening in the infrastructure project? Like a lot of other open source organizations, we're participating in Hacktoberfest, which officially starts today. If you're interested in contributing to the Jenkins project, feel free to register for Hacktoberfest. We prepared some issues that we labeled friendly issues; they're located on the Jenkins Jira instance. We tried to identify issues that don't require strong knowledge of a specific technology or of how the Jenkins infrastructure works, so they should be pretty simple. If they're not, feel free to ask questions and we'll be there to help you. And if you have some time to help us review PRs, that's another way to contribute to Jenkins and to Hacktoberfest this month.

The next topic is the infrastructure calendar. You may have seen the announcement on community.jenkins.io this week: I created a new calendar.
The purpose of that calendar is to keep track of important deadlines that affect the Jenkins infrastructure project: certificates expiring, major events, whatever. That calendar is not meant to be public. We don't have critical information there, but we want to be able to put important information in it. If you want access and you already contribute to the Jenkins infrastructure project, it won't be an issue; we just don't want it to be public by default. The rule is: if it's something that affects everybody, and everybody should be aware of the event, then I'll still consider putting it in the main Jenkins calendar. Everything else goes in the private calendar.

The next topic is the Confluence data. You should be aware that we turned off the Confluence instance and we don't want to bring it back to the Jenkins project. The challenge we're facing now is that we had a lot of data on Confluence and we don't want to lose it, so we're looking at how we can move that information to, let's say, the jenkins.io website or somewhere else. The first thing I did was export every page to HTML. I uploaded those exports to a new Git repository, which I named jenkins-infra/confluence-data. At this stage the repository is private. If you're interested in helping, I'd be glad to grant you access. I don't think there's any private information in there, but because the Confluence content predates my joining the project, I can't tell you for sure. That's why I prefer to keep it private for now. So if you want to help migrate the information from there to a better location, feel free to speak up.
Now that we can share the content of those exported HTML pages with other people, we can start thinking about the next steps. The first one is to make sure that every link, every URL, still works. For that we have different options. One of them would be to deploy a simple web service with Nginx or Apache and make sure we can map the existing URLs to what sits behind Nginx. Let me share my screen so we can see what the data looks like at the moment, because that's the annoying part. Let me see if I'm connected to GitHub with this account. Yeah, perfect. These are just the different namespaces: infrastructure and so on. If I open, let's say, infrastructure, you can see that every page has a random checksum at the end of its name, followed by the .html extension. The challenge is that because of those two things, we cannot directly map an existing endpoint to an HTML file. The second thing I noticed when exporting is that the export sometimes converts a plus character to an underscore and sometimes to a dash, so it's not easy to work out how to map those files to existing URLs. That's the next challenge we have to face.

Nginx has a lot of useful tooling for that, especially the try_files directive, which lets you define a set of patterns to try, with a fallback. It's generally used with a pattern like: is there a static file corresponding to the requested URL? If not, do you want to try a folder, or another name? Then you can add a fallback: if nothing matches, send the request to an upstream Tomcat, PHP, whatever. So we could use at least that part.
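A minimal sketch of what that try_files fallback could look like (the port, paths, and fallback page here are assumptions for illustration, not our actual configuration):

```nginx
server {
    listen 8080;
    # Root of the exported Confluence HTML tree (assumed path).
    root /usr/share/nginx/html;

    location / {
        # Try the exact URI, then the URI as a directory,
        # then the URI with .html appended, and finally fall
        # back to a "page not found" listing page.
        try_files $uri $uri/ $uri.html /not-found.html;
    }
}
```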
So that means using a specific Nginx container that would act as a backend for Apache, because we already have an Apache vhost with a bunch of rules already written. But the goal of the Nginx container is that it could be standalone, so we could try it locally. It would only be a backend service, and it also provides a bunch of regex matching patterns on the virtual host, like Apache does. I don't know if Apache allows the same thing as try_files, and I really like that one, because it would be: OK, let's try that URL with the plus. If it doesn't exist, let's try that URL with the minus. Otherwise, fall back to a default page that says: hey, we haven't found it, maybe there's an issue, or go to the listing, or whatever.

So, Damien, it's a concept that would say: the user asked for X_Y, and we would check X_Y, X+Y, X-Y. That sounds like an interesting idea.

It's important to note that the performance of such a web server might not be great, because try_files can be quite CPU-heavy. However, I don't expect a lot of workload on such a container, and it serves mostly static files, so if we allocate two CPUs for such a machine, that should be enough for what we want to do.

Yeah, that could make sense. That would be nice. The first thing to keep in mind is that we could reuse the massive vhost we had on the Confluence machine, the one that rewrites URLs to different destinations. That's one of the things we need to implement. We could just have a Dockerfile at the root of this confluence-data repository with the Nginx configuration, a copy of the vhost, and so on. Let's have a look at what it looks like right now. Obviously we lost the CSS, and when we navigate, these are just directories.
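That candidate-spelling idea (X_Y, X+Y, X-Y) could also be precomputed: for a requested page name, generate every spelling the exporter might have produced and test each one. A small illustrative sketch in Python; the naming scheme is an assumption based on what we observed in the export, and the helper is hypothetical:

```python
from itertools import product

def candidate_names(slug: str) -> list[str]:
    """Generate every spelling of a page slug where the exporter may have
    turned a '+' separator into '_' or '-' (or left it alone)."""
    parts = slug.split("+")
    variants = []
    # Each separator between parts can independently be '+', '_' or '-'.
    for seps in product("+_-", repeat=len(parts) - 1):
        name = parts[0]
        for sep, part in zip(seps, parts[1:]):
            name += sep + part
        variants.append(name)
    return variants
```

For a page like `Remoting+Best+Practices`, this yields the nine possible spellings to probe against the exported files.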
So, for example, Jenkins: it lists all the files and then you can open one of them. In this case the CSS is actually OK. But that's what we have at the moment. So if people are interested in jumping into this project, please speak up. Let's see what else there is.

First of all, thanks for taking the time, Olivier, because it will help a lot of people, developers and users alike. There are already a bunch of links to that wiki in the current documentation, so with Hacktoberfest I think it will be a huge, cool, and useful effort to track these things and update them as soon as possible. I also propose that we start that work by extracting a list of well-known wiki pages, so we have a list of URLs that we know we need because those pages are expected to exist; we link to some of them from the existing jenkins.io website. Having that list would also act as a kind of health check for us: whenever we build an image, whether it's Apache, Nginx, or any other application, these URLs should exist in the built image every time we rebuild it. That list of links could be a great starting point for any contributor, newbie or experienced, and it would serve as self-documentation to track which pages still need to be migrated. We have a migration path list for the plugins; I don't know if we have one for the other pages of the Confluence wiki.

Yes. Mark, you wanted to say something?

No, go ahead, Olivier.

You were mentioning that we could work on this with other people. The question I have is: how do we validate that it doesn't contain sensitive information? I had a quick look, and so far the security space contains nothing interesting, because everything there links to jenkins.io web pages.

That's a good question. I think we should ask a bit more broadly; the three of us are not enough to answer it. What do you think about bringing that to the advisory board?
Or at least the security team, just to be sure, because there's a lot of content, so it's really hard to know whether there's sensitive data. But most of these pages were public. That's why I mentioned the list of URLs. Put differently: once we have a list of URLs, we know those URLs pointed to public pages, so we can immediately conclude that the content associated with them can be public and shared for sure.

Last time I talked to Daniel, he didn't think there was anything in there. He removed the security content a long time ago, years ago.

For the security part, that's what I noticed: everything is gone.

Yeah, I didn't look through it exhaustively, but I couldn't see anything. I did a quick scan as well through the copy I've got locally, and Daniel had asserted that he hadn't stored anything sensitive there. The last time he even considered storing anything sensitive was years ago, and he stopped very quickly and deleted it.

Okay, so it sounds like I can make this repository public then. Any objection, Mark?

None from me. It would actually be a great help for Hacktoberfest if it were public, because there are things the wiki exporter did not handle correctly that people will have to reconstruct from these HTML pages, even for plugin documentation. I found that with the Ansible plugin: the wiki exporter did not handle the table export correctly, so now someone has to reconstruct those tables wherever there was a table inside the documentation.

Okay, sounds good. Then I guess the next step for me is to make it public; I'll do that just after the meeting.

And since you're going to make it public, you don't mind if I reference it tomorrow when we launch Hacktoberfest, in the notes on how to contribute?

That sounds good to me.

Great, thanks. I don't have any other topic. Well, there's one last one, yep, sorry. Yeah, a good one.
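Coming back to the well-known-pages list suggested earlier: once we have it, a build-time health check could be as simple as verifying that each known page has a matching exported file. A sketch, assuming the exporter's `<name>_<checksum>.html` naming convention; the helper name is hypothetical, not an existing tool:

```python
from pathlib import Path

def check_known_pages(export_root: str, known_slugs: list[str]) -> list[str]:
    """Return the well-known page slugs with no matching exported HTML file.
    An empty result means the rebuilt image still contains every page."""
    root = Path(export_root)
    # The exporter appends a checksum before '.html', so match on prefix.
    return [slug for slug in known_slugs
            if not any(root.rglob(f"{slug}*.html"))]
```

Run against the export tree in CI, a non-empty result would fail the image build, which is exactly the health-check behavior described above.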
Just to finish, unless you have anything new: my only other topic was that if we realize for some reason that we need more compute on the different cloud instances we have, we can still increase the workloads.

That's related to the topic I wanted to bring up. Thanks to Mark's feedback from yesterday, it seems we have issues with some of the agents on ci.jenkins.io. First of all the Ruby agent, but only us, the infrastructure team, use it for our deliveries, so it doesn't have much impact on the real users. However, the Maven 11 Windows ACI containers are not being allocated. I haven't found anything in the logs; I don't see any error message. I've tried multiple times to re-save the cloud agent configuration, which is the magic trick for the Azure VMs at least, and still I don't see a single log line mentioning Maven 11 Windows on ci.jenkins.io. I don't see anything about it in the cloud statistics UI, and I don't see anything in Azure in the incoming requests or events. So there is something wrong with Maven 11 Windows. After my meetings, I plan to first check the XML files in the JENKINS_HOME of ci.jenkins.io to see whether the configuration is persisted there. Second, I'll plan a restart of ci.jenkins.io, which shouldn't have much impact, just to check whether there's a divergence between CasC, the XML, and the in-memory state, though I don't think that will change anything. And third, I want to try manually changing the configuration to do the same thing we do with the Azure VMs, one ACI cloud per kind of container, and see if that changes anything. I don't know, Tim, if you have other ideas.

Did you say... is it that they're not spawning, or that they start and then fail?

No, nothing happens at all. I don't see any event inside the Jenkins logs saying "I need to spawn an agent", and Azure has no events at all. So it's not that they fail; it's that nothing happens.
Did you increase the log level for those specific events?

I should do that, but I'm not sure it's that easy. That's why I wanted to try the other options first, since they're quick to check. But yes, the log level should be checked first. I feel like we're hitting some weird issues because we have so many different clouds: when one of the templates of one of the clouds is not in good shape, it results in weird behaviors. I've realized over the past few days that each time after a CasC reconfiguration, we had to either restart the controller or manually use the UI and click on Save. That one will go in the runbook in case we have any issue in the upcoming days. But yeah, there is something weird; I can't get to the bottom of it.

I've had that condition repeatedly on my Jenkins instance, just as a natural thing, the one you just described: after I've reloaded CasC, I need to visit an agent and save it in order to get the labels to apply correctly. I've just been using that as a standard workaround, but that was just me assuming.

That normally means there's a field the UI is sending that isn't present in the CasC, so it's not saved properly.

Oh, OK. Because at work I changed stuff yesterday in our CasC and everything worked fine just by reloading it.

OK, so we'll try that; maybe it's the root cause. That means we'll have to compare the CasC export, once it works, against the current configuration generated on disk.

OK, so that's the technique to make sure all fields are being provided, right?

Yep, exactly. And I can check that online.

Sorry, Tim; it's worth checking the XML as well, but yeah, the CasC export should hopefully do it too.

And then there's something Mark saw yesterday: there was an issue with Maven 11, the Linux version this time.
Based on the load statistics from the previous days, we saw a peak of around 600 tasks in the queue while we hit the limit of exactly 50 allocated executors. I've provided some data on the IRC channel, but it seems that since we moved the containers back to 4 CPUs instead of 2 a few days ago, the effect has been only one pod per worker node on the Kubernetes cluster. I've tried it and validated that: I can see it cannot schedule more. My assumption is that 4 CPUs should theoretically allow two pods per node, but it looks like Kubernetes reserves about one CPU for itself, or for other pods, on each machine. That means only seven CPUs are actually available on each worker node, so only one 4-CPU pod fits. And the maximum scale was 25... no, sorry, 50, 50 machines, so that accounts for the limit. My proposal is to increase the size of each worker node and, with Hacktoberfest this month, to also increase the maximum size of the scaling cluster for the month, to close the subject. Then, to validate it, I plan to run a BOM pull request build, one of those that failed last week, just to check that it's able to handle the scaling quickly.

Sounds good to me.

Does it make sense? Do you agree?

Makes sense, and I totally agree with you.

OK, so expect a pull request before end of day on that topic. I have one last topic. We're three minutes before the end of the meeting, so there's just one thing I want to briefly cover. We faced a bunch of issues this week with the accounts app, accounts.jenkins.io. I won't go back over all the issues we faced and fixed this week, but I want to take a moment to remind everyone why we're not yet able to jump to beta.accounts.jenkins.io. The only reason is that we run extra validations in the accounts app, and we would need to implement those validations in Keycloak, based on the Keycloak documentation.
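Returning to the scheduling math above for a second, the back-of-the-envelope calculation looks like this. The 8-CPU node size and the roughly 1 CPU of system reservation are assumptions inferred from the symptoms, not measured values:

```python
import math

def pods_per_node(node_cpus: float, reserved_cpus: float,
                  pod_cpu_request: float) -> int:
    """How many agent pods fit on one worker node once the system/kubelet
    reservation is subtracted from the node's CPU capacity."""
    allocatable = node_cpus - reserved_cpus
    return math.floor(allocatable / pod_cpu_request)

# With ~1 CPU reserved on an 8-CPU node, a 4-CPU pod request leaves room
# for a single pod per node, while a 2-CPU request would have fit three.
```

That single-pod-per-node result, multiplied by the 50-machine cap, matches the 50-executor ceiling observed in the load statistics.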
It shouldn't be too difficult to override the class that implements the registration, but we need someone to jump into that topic and compare the rules we have on accounts.jenkins.io with what's needed on beta.accounts.jenkins.io. That's the only reason we're still using accounts.jenkins.io. It's to avoid situations such as someone creating an account with a name ending in "-admin", for instance. We have a bunch of rules like that which need to be applied, and we also register some specific accounts to make sure nobody can use those names as user accounts. So I'm inviting anyone who wants to play with Keycloak to override that registration configuration; it just requires Java skills. That's it for me on the last topic. Sounds like we can stop the meeting two minutes before the end. Thanks, everybody. Have a great weekend. Goodbye.

You too. Thanks for all the work, people. Take care.
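As a footnote on the accounts discussion: the kind of registration rules involved look roughly like this, sketched in Python purely for illustration. The real implementation would be a Java extension of Keycloak's registration flow, and the reserved names and patterns below are hypothetical examples, not the actual accounts.jenkins.io rules:

```python
# Hypothetical reserved names; the real list lives in the accounts app.
RESERVED_NAMES = {"admin", "jenkins", "root"}

def is_username_allowed(username: str) -> bool:
    """Reject reserved names and names that could impersonate an
    administrator, such as anything ending in '-admin'."""
    name = username.strip().lower()
    if name in RESERVED_NAMES:
        return False
    if name.endswith("-admin") or name.startswith("admin-"):
        return False
    return True
```

Porting checks like these to Keycloak, and verifying they match what the current accounts app enforces, is exactly the task waiting for a volunteer.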