Hi everybody, welcome to the weekly infra team meeting. Today is the 12th of March 2024. Attending the meeting are Mark Waite and myself, and we may be joined by Hervé Le Meur a little later. Congrats to Kevin for yesterday, by the way. Announcements: we got the latest weekly release today, 2.449, which is almost done, I think, Mark. We are just having a little hiccup because the OSUOSL FTP server in Chicago is doing maintenance right now. As it is sometimes used as the main mirror, there can be a little delay in propagation to the mirrorbits nodes. But we did check that the latest weekly, 2.449, is on at least two mirrors already, and we checked that the Docker image is up and running. So we'll be doing the updates for the weekly after this meeting. As for announcements, I don't have any; let's check what we had last meeting. No, nothing much. The next weekly, 2.450, will be next week, on the 19th of March 2024, and the day after that we have the LTS on Wednesday. Kris Stern is the release lead. Yes, thank you, Mark. Anything needed for it? Well, it needs review and merge of the changelog and the upgrade guide. Oh, cool, I haven't done that yet. Kevin has written it, but it hasn't been merged yet. You're talking about the next LTS? Yes. I'll do that review and merge; that's my job. Okay. There is no security release coming up; the last security release was last week, and it went quite well on the plugins side. The next major event is for you, Mark: you're going to travel to Los Angeles, the 17th to 19th of March, with Alyssa and Basil, lucky you. And CDCon is in April. Right, and at CDCon we will actually announce the Jenkins award winners. Oh, cool. And we should remind people that voting is open, until, I believe, the 18th of March 2024, maybe the 20th; see the announcement. Okay. So let's start with what has been done during the last week.
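The release-propagation check mentioned at the top (is the weekly on the mirrors, is the Docker image up?) can be sketched as the URLs one would probe. A minimal sketch: the get.jenkins.io WAR layout and the jenkins/jenkins Docker Hub tag are the public conventions, but treat the exact paths as illustrative, not the team's actual tooling.

```python
# Illustrative sketch: build the endpoints to probe when confirming a weekly
# release has propagated. Paths follow the public get.jenkins.io layout and
# Docker Hub tagging; this is not the infra team's real check script.

def weekly_war_url(version: str) -> str:
    """URL of the weekly WAR behind the mirror redirector."""
    return f"https://get.jenkins.io/war/{version}/jenkins.war"

def docker_tag(version: str) -> str:
    """Docker Hub tag for the matching weekly image."""
    return f"jenkins/jenkins:{version}"

if __name__ == "__main__":
    print(weekly_war_url("2.449"))
    print(docker_tag("2.449"))
```

A real check would issue an HTTP HEAD against the WAR URL on each mirror and a `docker pull` of the tag.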
We were a little short on people, so not much has been done. Some permission cleanup — oh, Tim Jacomb did handle that, so thank you, Tim. Oh yes, that one, it's for me. We had expiration notifications from three or four different things. That one was the DigitalOcean token; I took care of it, I think yesterday. And we took advantage of that to use a new way of logging in to DigitalOcean and sharing the 2FA in a more secure way. So the three of us on the infra team can now use the same access for DigitalOcean and renew all the tokens. I think that's a good point: the idea is that Damien is not the single point of failure there anymore. You don't say SPOF? You say single point of failure. You did it right. I don't like calling him our point of failure, but that's okay. No, no, that's fine — not the critical path of the lone wolf. So this one is for you. What I do here is, when I detect spam in the Jenkins Jira, I disable the user, change their email address so that they are clearly identified as a spammer, and change their password. It's a consistent pattern. I don't actually delete the account, because I wanted a record that they were a spammer. Is that okay? We'll ask the question when everybody's back, but for me it feels right, because we keep the account: if that account has been used to spam anywhere else, we still have a link to who did what, right? And maybe one day we will have a tool to remove all that spam automatically. Oh, that's good, because I use a consistent naming pattern when I change the email address, so it's an easy search; if we decide later that we want to delete the accounts, we can do so. Great. So what I did is okay, at least for now? For me it's okay; let's check with the others, but for me, you did good. Great. Okay.
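The "consistent naming pattern" for quarantined spammer accounts described above could look like the following. The marker, prefix and domain here are hypothetical, not the team's real convention; the point is that a single marker makes the flagged accounts trivially searchable for later bulk cleanup.

```python
# Sketch of a consistent rename for quarantined spammer accounts.
# SPAM_MARKER and the domain are made-up placeholders, not the real pattern.

SPAM_MARKER = "spam-quarantine"  # hypothetical marker

def quarantine_email(username: str) -> str:
    """Rewrite a spammer's address so the account is clearly flagged."""
    return f"{SPAM_MARKER}+{username}@example.invalid"

def is_quarantined(email: str) -> bool:
    """Later cleanup tooling could select flagged accounts by marker alone."""
    return email.startswith(SPAM_MARKER + "+")
```

With a marker like this, "delete them all later" becomes one search instead of a manual audit.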
And I'll keep doing that each time I detect a spammer, and disable their account. Okay. Oh, this one was on Sunday. I discovered that we had missed the event in the calendar and the VPN CRL had expired, so we didn't have any access to the VPN anymore. I renewed it on Sunday. It's not a hard task; the only thing which is annoying is that when you wait too long and it's already expired, you cannot get into infra.ci and trigger the builds to go faster, so you have to wait, or trigger them from a local computer. But I did it, and I added more team calendar events: one week, two weeks, three weeks and even a month before. Like that, we should not miss it anymore — and even if we miss it, it's still recoverable. On this one I was surprised: I didn't have to change anything, and I was able to access the VPN this morning, so this was just a server-side thing that you did? Yes, it's just renewing the CRL, which is what is used to revoke people. Okay, thank you. Next, a token not renewed — what's that? Oh yeah, same thing: we had a service principal expiration on trusted.ci. Hervé managed this one on Saturday. We also missed the calendar event there — we missed three calendar events on the same weekend, and that one is part of it. And I did manage others. Let me check. I managed two other service accounts, which I don't find here; I probably forgot to put them in the milestone. But yes, we had to work on Saturday, Sunday and yesterday to renew everything. This one is for you, I think, Mark. Right. We found that a plugin that we use in the Jenkins Jira had its license marked as expired, and as it turns out, there are actually two plugins whose licenses had expired.
The Jira instance license had not expired, just the licenses for these plugins. Yeah. And all we had to do was ask the Linux Foundation: please, could you renew the licenses? They clicked a button inside the Jira admin interface, and it renewed the licenses. I had started a discussion — I was like, okay, I'm going to delete this thing, shall we get rid of it? But the ultimate decision was that we don't have to worry about it: it's easy, they did it for us, and it's done. Perfect. And I'm assuming it's free? Yes, it is, because Atlassian donates the Jira Data Center license that we use, and the vendors who provide those plugins have agreed that they are also free for open source projects. I'm wondering if this issue I'm opening right now is not a duplicate of the other one. If there was an earlier mention of QTest, then it is a duplicate, because Tim fixed it in both cases — he just granted them the permissions they need, in the first one and in the second one. And you probably took care of this one. Right. And that "talkmatic" thing is the same as the earlier one? Right, same pattern as the earlier one. "You are not getting builds by just a click" — oh, that one failed at the very wrong moment, but it did work. In fact, they created that pull request at the exact moment the CI was down. Yeah. And I've seen cases where branch creation was ignored, and I've not investigated further, because, as Damien says, if we really want no loss of webhooks, we need something other than Jenkins to queue the webhooks. Yeah. It seems there was a project for that at one time. There was, exactly. It's a service you run in multiple copies on Kubernetes; it listens and does all the good Kubernetes goodness, but it's a lot of work to create and maintain, et cetera. Okay. So maybe, but we don't have much time. Okay.
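The "queue the webhooks outside Jenkins" idea from the discussion above can be sketched as an intermediary that always accepts deliveries and replays them once the controller is back. Everything here is invented for illustration — it is not the project that was mentioned, just the shape of the idea.

```python
# Illustrative sketch of an external webhook buffer: accept GitHub deliveries
# immediately, replay them to the CI controller later, so nothing is lost
# while CI is down. Names and structure are made up for this example.
from collections import deque

class WebhookBuffer:
    def __init__(self) -> None:
        self._pending: deque[dict] = deque()

    def receive(self, delivery: dict) -> None:
        """Always accept: store the payload even if CI is unreachable."""
        self._pending.append(delivery)

    def replay(self, deliver) -> int:
        """Flush queued payloads in arrival order via `deliver`; return count."""
        count = 0
        while self._pending:
            deliver(self._pending.popleft())
            count += 1
        return count
```

A production version would run replicated on Kubernetes with persistent storage, which is exactly the "lot of work to create and maintain" the discussion flags.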
Exclude GB space from the ACP. Oh, okay — so that one was taken care of directly, not by me, by Alexander, and he did it. Good. Okay, perfect. Archive — that one was probably seven years old. Yeah, that's been closed, cleaned up and archived. Thank you. Oh, and there was a little job that has been removed too. Okay, perfect, so some cleanup. "Crawler failed" — oh, that was a SAS token expiration that we had to fix, and Hervé took advantage of that to switch to his new SAS tokens, which are short-lived. That's awesome. Those tokens have a really short lifetime and are now working almost everywhere. Great. And this one? Oh, someone had been blocked. Yeah, I remember; we kept that open. As long as we don't have news — I know we have an answer for that one — it's to open two new mirrors in Romania. That's work in progress, and we should have two new mirrors: one from a provider whose name escapes me, and RDS was the other one, if I remember right. Yes, you're right, thank you. "Close as not planned" — it closed by itself. Great. "Evaluate cleanup of code owners" — yeah, that's a lot of work, so it's been cancelled, I'm assuming. Okay. And the one about removing duplicates — Jesse and Basil disagree on the plugin, and we chose to leave both of them, which is fine. Yes, it's good, thank you guys. So, we did already talk about that: we have a new mirror coming. We have a request from Basil to mirror Maven Central too and avoid the errors; that can be done, we just have to spend some time on it. Oh, and we need to take care of this one too, and check if it's a real account.
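The short-lived SAS tokens mentioned above need an expiry computed at issue time. A minimal sketch, assuming a one-hour lifetime and the `az storage container generate-sas --expiry` UTC format; the lifetime and CLI usage are assumptions for illustration, not the team's actual settings.

```python
# Sketch: compute a short-lived expiry timestamp for a SAS token, in the
# YYYY-MM-DDTHH:MMZ UTC form the Azure CLI accepts for --expiry.
# The one-hour default is an assumed, illustrative lifetime.
from datetime import datetime, timedelta, timezone

def sas_expiry(now: datetime, hours: int = 1) -> str:
    """UTC expiry string for `az storage container generate-sas --expiry`."""
    return (now + timedelta(hours=hours)).strftime("%Y-%m-%dT%H:%MZ")
```

Short lifetimes like this are what turns a leaked token from an incident into a non-event.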
I did not have time to check on that these two weeks, sorry; I need to check on it. Plugin health scoring: configure a new job on CI. Okay, so I need to ask — not Alex, sorry, Adrien — what exactly we need to do. Oh yeah, he developed a new way of archiving the data in a file, meaning that when the process restarts, it doesn't rebuild the whole file, and the file is in the report. Excellent. I've been hoping for a static file like that, that's great. That's exactly it. I need to confirm what he expects us to do exactly, but we'll see. Okay. This one was about to be closed and was reopened; we'll do that tomorrow, if that works. No response on this one yet; it's pending, so we need to wait for news — maybe we need to call for action again. Okay. This one is API rate limiting from updatecli. We need to spread the load of those updatecli runs and use more GitHub Actions or GitHub App tokens to spread the load, but that's some work and we didn't have time for it. At least we know how we can avoid it. I don't know what this one means, sorry. I know that for the migration to the sponsored account, Damien didn't have much time to work on it. All right. But that's work in progress too; I think it's a priority for him. This one I can talk about, because I did some of it: two weeks ago I did DigitalOcean, and last week, with Damien, we did the AWS clusters, cik8s and eks-public, and they are both on Kubernetes 1.27.
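Spreading updatecli's API load across several credentials, as discussed above, amounts to rotating which token each run uses. A naive round-robin sketch — the token names are made up, and the transcript only says the load should be spread, not how:

```python
# Purely illustrative round-robin for spreading API calls across several
# credentials (e.g. distinct GitHub App installations, each with its own
# rate-limit budget). Token names here are placeholders.
from itertools import cycle

def make_picker(tokens: list[str]):
    """Return a callable that yields tokens in round-robin order."""
    rotation = cycle(tokens)
    return lambda: next(rotation)
```

Each GitHub App installation has its own rate-limit budget, so even this naive rotation multiplies the available request quota.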
I started to prepare the Azure one, which needs to be done by the end of the month, and this time I will have to prepare all the prerequisites, because it's kind of sensitive — but it's underway. Four of the six are done; we've got privatek8s and publick8s left to do, right? It went well, so good. Yes, and we also want to add a private Kubernetes cluster on the sponsored Azure subscription, for us to be able to move load onto the sponsored subscription. I didn't have time to work on that yet, because the Kubernetes 1.27 upgrades and the ARM64 work took most of my time. This one — we thought it was caused by DigitalOcean, if I remember correctly, and in fact, no: we thought it was caused by the Kubernetes upgrade to 1.27 on DigitalOcean, but that didn't change anything, so it's not due to DigitalOcean. It's between the ACP and the way Maven works. Basil pointed out another issue that we will see later, which we have seen earlier: caching Maven Central may help to get better, more consistent performance, or show that the ACP has a problem somewhere. We have to try. This one — I haven't had much time to deal with it. Yes, it's versioning for the jenkins.io docs, nothing done. This one is on me. I have migrated two more pipelines to use the latest image. I'm dealing with the image builder: I did this one in the jenkins.io docs, and this one, the plugin site, is almost done, waiting for a review to merge. It's a little slower than the others, because most of those pipelines are used on both sides, by infra.ci and ci.jenkins.io, and the same pipeline is run on both controllers, infra.ci having a few more rights to deploy images and such. So it's taking time, because I need to make sure that both still work. But yes, those two are ready to run the all-in-one build on ARM64, and I had to choose VMs and not Kubernetes pods, because we need more memory, and we had memory issues on those pipelines.
We thought about creating bigger pods and a bigger node pool on the ARM64 Kubernetes side, but it's kind of expensive, and we would run into the problem that, with the autoscaler, starting from zero is not possible. We would need to start at one, meaning we would spend a lot of money for maybe nothing. So we chose not to build that new node pool and, for now, to use the ARM64 VMs as we have them — and we have them on both sides, so it's easier. So this one is just waiting for a review. And next steps — you see, I've got all of that to deal with. Excellent. Yes, a lot of work, but that's good. Next. Hi, Hervé. So on that theme, before we leave the ARM64 theme: I've added a topic — no need to move your page — I've added a topic to the announcements. An ARM64 server has been lent by Ampere to the Jenkins project. It is currently configured as an agent in my test environment. We'll need a lot more conversation about what to do with it and how to use it best, but it's allocated, it's running, it's building — it has built many plugins, it has built Jenkins core. ARM64 works great for us. Back to you, Stéphane. Thank you. Thanks. Hervé, you're right on time. Can you tell us a little more about this issue? Because I'm thinking you've got news. Yes, I've got the last bit working: the crawler is now working with azcopy instead of blobxfer. The final step will be the replacement of blobxfer in the mirror scripts, the scripts that are used on the pkg virtual machine. I didn't capture that in the notes, Hervé — could you say it again? So it's replaced already in — which was it? Since last week, there is a crawler which is using azcopy, and the week before, I replaced blobxfer in jenkins.io, the plugin site and the javadoc — and I think that's all. Great. And we got really great improvements: for example, on the javadoc, it was spending between 40 minutes and an hour and 10 or 20 minutes, and it's now spending 11 to 13 minutes with azcopy. Great.
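The azcopy migration described above boils down to replacing a blobxfer invocation with an `azcopy sync` against the blob container. A sketch of assembling that command line — the account, container and SAS query string are placeholders; only the `azcopy sync` subcommand and its `--recursive` flag are real azcopy usage.

```python
# Sketch of the blobxfer -> azcopy switch: build an `azcopy sync` invocation
# for an Azure blob container. account/container/sas are placeholders.
def azcopy_sync_cmd(local_dir: str, account: str, container: str, sas: str) -> list[str]:
    """Command list suitable for subprocess.run (no shell quoting issues)."""
    dest = f"https://{account}.blob.core.windows.net/{container}?{sas}"
    return ["azcopy", "sync", local_dir, dest, "--recursive"]
```

Returning an argument list rather than a shell string keeps SAS tokens out of shell history and quoting trouble.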
So we've got improvements in time, in cost by using premium storage, and in security, because we don't use access keys or long-lived SAS tokens anymore. Yeah, that's really great — a win-win-win, thank you. I think the next one is the update center. Yes — no, it's not that one; the update center is next. I've also updated the pull request where we put in place a synchronization to a file share and to our two buckets, in addition to the existing synchronization. The pull request was merged this morning by Daniel Beck from the Jenkins security team, and the job is running. We're spending a little more time, around four minutes and 30 seconds, but Daniel said that if we need to increase the period of this job to five minutes, it's not a problem. I've looked into using MD5 to see if it could improve the azcopy time, but it doesn't. So I think I'll increase the period to five minutes and go with that. I think the only way to speed it up is to use parallelism. No, it's already using parallelism — it's already massively parallel. So, just to be sure I understood: we're currently spending four minutes and 30 seconds to complete the copy? Yes, with the three syncs in parallel, instead of two minutes and 30 seconds. Okay, all right, great. We have the syncs in parallel, and we also have a kubectl call on the publick8s cluster, which triggers the mirrorbits scan — that's 55 seconds added after the parallel part. In fact, that's the cron job we worked on together — I'm sorry, I forgot this one. So, just to be sure I understand: with what Daniel merged and the extra time, on a five-minute cycle, we're getting very close to the five-minute cycle. Do we need to take immediate action? Tell me more about what's coming. I've put the option back: we have an environment variable allowing us to activate or deactivate the parallel sync, and it takes four minutes and 30 seconds.
And the cron is currently configured to run every three minutes, and I will update it to run every five minutes. I see, okay. So Daniel has agreed it's okay? Yeah, it's okay for Daniel if this job spends five minutes instead of three. This job has been merged, but it's still in beta, because we can switch it on and off with a variable. Right. But Daniel has agreed we can change the cron cycle from three minutes to five minutes? Yes. And now we just have to be sure that the variability of the job execution never exceeds five minutes. Okay. I've run six already and they all take the same time. Oh, good. Well, it's currently been running for, I don't know, ten minutes or so. I can check the last three, but I think the real time will be the same. But did you change the cron period? No, but it doesn't matter, because the next builds are pending — it doesn't pile up, so we don't get multiple builds running concurrently. Great. Okay, that's all. So, a last check on the weekly build: our mirror list still shows only three mirrors — only two mirrors plus archives. So worldwide service of 2.449, for now, will come from only the University of Aachen and ftp-nyc. But we think that will resolve itself once ftp-chi is back, and their plan is to have it back by end of day tomorrow. So we think we're no more than 48 hours running with only two mirrors serving a weekly release. Yeah — and this weekly release. Yes, I'm sorry, yes, exactly this weekly release, because Stéphane and I confirmed that preceding weekly releases are all being served by many, many of the mirrors. It's just this weekly release, and for relatively few days, that will be served from those two locations. And if we have a look at the triage, we've got this one, to know if we can handle it — the Delphix plugin. Yeah.
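The arithmetic behind the cron change above is simple enough to state as code: a job averaging 4m30s no longer fits a 3-minute cycle, but fits a 5-minute one with 30 seconds of headroom (builds queue rather than overlap, as noted, but headroom is still wanted). A trivial sketch:

```python
# Sketch of the cron-cycle headroom check discussed above: a 270 s job
# against 3-minute (180 s) and 5-minute (300 s) cycles.
def fits_cycle(job_seconds: int, cycle_seconds: int) -> bool:
    """Does one run complete before the next scheduled trigger?"""
    return job_seconds <= cycle_seconds

def headroom(job_seconds: int, cycle_seconds: int) -> int:
    """Seconds of slack (negative means the job overruns the cycle)."""
    return cycle_seconds - job_seconds
```

With only 30 seconds of slack, the remark that execution-time variability must never exceed five minutes is the real operational constraint.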
And for that one, we will have to work with Delphix, because — yeah, we certainly must comply with license terms, and if they're bundling a proprietary dependency, that means we can't distribute them. Okay, I understand; that's for Daniel. And then we've got this one with IPv6. We'll work on it soon, I think. We've got an expert on IPv6, have we? No, no, no — and this will be difficult, so I don't think we'll work on it as experts. If I remember correctly, that's because of JFrog not handling IPv6 for now? It's not that — it has nothing to do with JFrog here, sorry. Okay, so we'll have to wait. And a wrong email registration — we've got someone; we have to check if it's spam or not. Okay, so that's all. Nope, nothing new except that. So we're done. All right, I'll stop the recording.