Hello everyone, welcome to the Jenkins Weekly Infrastructure Team Meeting. Today is the 4th of April. Around the table we have myself, Damien Duportal, Mark Waite, Stéphane Merle, Bruno Verberne, Kevin Martens, Gokul Kumar and Sonali Rajput. Am I pronouncing your name correctly? It's "Rajput". Okay, sorry, I'm trying my best with my terrible French accent, I'm really sorry. Welcome. And Hervé Le Meur is joining us just as I was saying his name. Hello, Hervé. Hervé is the last one.

Okay, so let's start with the usual announcements. Today's weekly is not out yet: due to the current DigiCert issues, the release is blocked at the verify stage. The war file has been generated, tagged and pushed as usual, and it used the new GPG key we added last week. However, it is still signed with the old, expired DigiCert certificate, which is why the verify step fails. We have different options that we will discuss later: we might have to trigger a new weekly release, or we could finish the packaging, depending on what we want to achieve. Once the packaging step is done, and once we have decided whether to fully use the new DigiCert certificate, we can proceed to the usual delivery bits. That's a quick summary; as part of the announcements, I propose we discuss these options a bit later. Yes.

The second announcement is that we have the new DigiCert certificate, and that's good news. The new signing certificate is there and should be okay to use for tomorrow's LTS release. Do you have other announcements? Okay, I don't either, so I propose we go to the upcoming calendar. We might or might not have a weekly soon, could be next week, maybe tomorrow, or eventually later today, but in any case the next LTS release will be tomorrow. The goal for this week is to have that LTS release with both code signing using the new DigiCert certificate and repository package signing using the new GPG key. So everyone will stop seeing an expired key once they have imported the current one. Any questions?

About security advisories: we don't have any mail on the security mailing list, and no security advisory is planned. For the next major events, as usual we have cdCon and Devoxx; I will copy and paste from last week, unless you know a new event where you can meet the Jenkins community. Okay.

So, that has been a really, really packed week. Thanks, mainly Hervé and Stéphane, for the huge work you did on all of these tasks, and Mark for taking care of the certificates. Those greetings are required, because that week was pretty, pretty hard.

Among the tasks we were able to run and finish: the Robobutler service has been sunset. It was removed from Puppet, removed from the virtual machine, cleaned up, and its repository and image have been archived. We haven't seen any error; if we see any breakage due to that, we'll have to open an issue. I think the edamame machine, an OSUOSL-sponsored machine, is now empty of any service, so we should be able to add that machine, one way or another, as an agent for ci.jenkins.io. Food for thought.

cert.ci.jenkins.io now has automated certificate renewal using Azure DNS and Let's Encrypt: everything is managed as code and it's working. The certificate we created a few months ago, which was bound to expire on the 11th of April, is now renewed every three months, even though that is a private instance.
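Since the renewal itself is now automated, the remaining risk is the automation failing silently. As food for thought on the "alert before it expires" idea that comes up just below, here is a minimal sketch of an expiry check; the hostname and the 30-day threshold are assumptions, and since cert.ci is a private instance, such a check would have to run from inside the network.

```python
import datetime
import socket
import ssl

# Hypothetical target: cert.ci is private, so this check would need to
# run from inside the private network to reach it.
HOST, PORT = "cert.ci.jenkins.io", 443

def days_until_expiry(host: str, port: int) -> int:
    """Return the number of days before the server certificate expires."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    # 'notAfter' looks like 'Jun  1 12:00:00 2023 GMT'
    expires = datetime.datetime.utcfromtimestamp(
        ssl.cert_time_to_seconds(cert["notAfter"])
    )
    return (expires - datetime.datetime.utcnow()).days

if __name__ == "__main__":
    remaining = days_until_expiry(HOST, PORT)
    if remaining < 30:  # arbitrary alerting threshold
        print(f"ALERT: certificate expires in {remaining} days")
```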
It was also the opportunity to clean up the DNS, with Stéphane and Hervé: not only did we have to migrate the current DNS record for cert.ci to the new child zone where permissions are restricted, but there were also former, unused, let's say legacy records that have been removed. We haven't heard from anyone on the security team, and no one is ringing at my door to hang me, so I assume cert.ci is working as expected.

trusted.ci and cert.ci now have all the permissions and Azure credentials required to spawn virtual machine agents in Azure. All of these credentials are now restricted to only the resource groups where they're required, and everything is managed as code, so thanks Stéphane for that work. That includes the expiration dates of these credentials directly inside the code. So even if we don't find the time to switch to managed identities, which would remove the need for credentials at all, we could have something that checks our code periodically and sends alerts before a credential expires. Again, food for thought, new ideas; it's ready to be worked on.

DigitalOcean: we had a token that was about to expire in a few days; it has been rotated. Maybe it won't be needed for long, but at least we have a new three-month-valid token for the infrastructure, and everything is working.

Someone proved that they weren't a spammer, so we created their account with no spamming issue. I assume the spam flag came from their IP, a public IP that was being abused, because I was able to create the account from my home, reusing the exact information they gave. So either they had something in their web browser that was sending spammy requests during account creation, or the anti-spam was triggered for this person by their IP; I'm not really sure. Anyway, the account was created, so I let the user carry on.

Social links have been updated on our GitHub organizations: that's a new GitHub feature, thanks Alex and team, and they have been updated on our GitHub organization pages.

Miscellaneous: someone was trying to create an account but wasn't using the correct email. I purposely avoided giving out details until that person uses the correct email or creates a new account; I don't want to risk any account takeover in that area.

At the end of the presentation: again, we were able to close the "introduce artifact caching proxy for ci.jenkins.io" issue. Well, it was reopened, but only because of issues that are tracked in another issue; the work for the ACP itself is done, we finished all the tasks that remained last week. It's no longer about creating a new service; it's about finding why the service, in some edge cases, is not working as expected, which is a different topic.

Another account issue, about account sync log status: the user was able to find and fix the issue, so that's closed. We added a new repository to be built on ci.jenkins.io, coverage-model, as per a contributor's request.

Some numbers: when we need to run an organization scan on the whole jenkinsci organization for the plugins, it takes one hour and 20 minutes, and it succeeds. And if it's not running at the same time as 10 or 12 other BOM builds, it succeeds without any problem. The main challenge is that it has to pause because of the rate limits imposed on the GitHub API: it needs a 30-minute pause in the middle.
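To illustrate that pause: rather than sleeping a fixed 30 minutes, a scan could ask the GitHub API how much quota is left and when it resets. A minimal sketch of that idea; this is not what the GitHub Branch Source plugin actually does, and the token and threshold below are placeholders.

```python
import json
import time
import urllib.request

# Placeholder token; the real scan authenticates as a GitHub App.
GITHUB_TOKEN = "ghp_example"

def wait_for_rate_limit(threshold: int = 100) -> None:
    """Sleep until the GitHub core rate limit resets, if we are close to it."""
    req = urllib.request.Request(
        "https://api.github.com/rate_limit",
        headers={"Authorization": f"Bearer {GITHUB_TOKEN}"},
    )
    with urllib.request.urlopen(req) as resp:
        core = json.load(resp)["resources"]["core"]
    if core["remaining"] < threshold:
        # 'reset' is the Unix timestamp at which the quota refills.
        pause = max(0, core["reset"] - time.time())
        print(f"Rate limit nearly exhausted, pausing {pause:.0f}s")
        time.sleep(pause)

wait_for_rate_limit()
```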
Without that pause it would take more like 40 to 45 minutes. There might be a solution, but I haven't found any, even using a different GitHub App. I'm not sure; theoretically we should not hit any API rate limit with a GitHub App for this kind of permission, but it sounds like a limitation of the GitHub Branch Source plugin inside Jenkins. I assume it comes from the time we were using tokens instead of GitHub Apps, but I don't have enough knowledge in that area. So, as a general matter of fact, it works. Let's try to run this without a BOM build running, and let the others know before you run it, because ci.jenkins.io uses a lot of CPU when doing this. That might even be the reason for the rate limiting: not the GitHub API, but the CPU usage.

On the account issue: the user account was created. We had issues with agent instability a few weeks, even months ago, but Jim never answered our questions, so I took it on me to close the issue. The instabilities looked weird: we assumed it was out of memory, but it wasn't, based on the metrics we were able to get. And now we don't have enough information, especially since we shifted the workloads from EC2 to Azure and DigitalOcean and back. So I propose to close this one and see if it comes back; it wasn't reproduced after the pull request was merged, so it could have been a network issue or something else. Any objection? I don't mind reopening, but I never had feedback from Jim, so I assume it's closed. No objections from me; closing seems like a reasonable response.

Maven 3.9.1 was released and deployed, so thanks everyone involved in that step. It's generally available and we sent an email to the mailing list.

Thanks a lot, Hervé, for helping the Jenkins security team, and specifically for taking that one on alone: you were able not only to grant release.ci access to two new security team members, but also to restrict, or at least clean up, some parts of the RBAC model. We might need more restrictions there, but thanks Daniel for raising that, and Hervé for taking care of that step. Hervé, were we able to restrict all the groups, or do we have follow-up tasks on this one? We have follow-up tasks: we should open a helpdesk issue to ask the other users which restricted permissions should be applied to the other groups. May I ask you to take care of opening that issue and starting the discussion, so we can ask Daniel for a double check? Is that okay for you? Yes. Cool. Don't hesitate to add it to this week's milestone; I haven't created it yet. Thanks. I haven't heard back from the security team, so I assume it's okay. I believe you checked with at least Kevin, right? And Yaroslav as well, is that correct, Hervé? Sorry? Were you able to check with Kevin and Yaroslav that they were able to reach the CI? I've checked their access with them: private token and release.ci. Cool, thanks. So: a follow-up issue for the RBAC restrictions on release.ci.

Okay, we were able to close the big issue as well. Congratulations, Hervé: we now have a brand new privatek8s cluster, a private Kubernetes cluster in Azure, which is hosting infra.ci, release.ci, and our bots that don't require public access. It's completely managed as code, so now we can iterate, and it's able to run builds on Linux, on big Linux machines, and on Windows, using two different subnets for the agents to avoid any security issues. Great work. Just a question again, Hervé.
Were we able to clean up the former cluster, or do we still have that manual task to run? As indicated, we still have some public services to migrate to the new publick8s cluster. Yeah, but I was thinking about the temporary private one. The temporary one has been removed. Oh, you cleaned up everything? Cool, thanks.

Next, there was an account question about deleting an account: Alex was using dummy accounts, and that has been done.

Quite a tricky issue: disable the pull request "merge" mode everywhere on ci.jenkins.io. That was a long-due request, but that one, well, it triggered a full organization scan and put ci.jenkins.io in outage. The lesson learned for everyone is that before triggering an organization scan, we must let the others know in advance. And if we accidentally trigger one: no stress, just cancel it before ci.jenkins.io becomes unresponsive, especially when we already have more than 1,000 builds in the queue. There might be improvements to make there, of course, but that's the status for today. The good thing is that this change is now going to drastically decrease the number of builds for plugins, because pull requests won't trigger a build each time the destination branch changes, which was the case before. Now we rely on the GitHub branch check that says "your branch is not up to date", and it's up to the developer to either update the branch, which generates a new commit, a new history, and triggers a new build, or accept that it stays as it is. Thanks, Jesse and Tim, for taking care of that one; that will help on the billing part.

Finally, we finished the work on the Azure credentials for ci.jenkins.io, worked on by Stéphane. Same as for the other controllers: the Azure credentials required to spawn virtual machines for ci.jenkins.io are now managed as code, with a clear expiration date, in clear text, so we should be able to track it next time.

Okay, so that was all for the work that we did. We also closed a few issues that were out of subject; those were closed as "not planned". And now we can go on to the work in progress.

First of all, we have an issue with the ACP in one specific case: only when using DigitalOcean, with the BOM builds that run 200 to 300 parallel steps, each one on a different pod. When we have that, our clusters are scaled to the maximum, and after 90 minutes of that maximum workload we start having weird issues. Only on DigitalOcean, so it's not even sure that it's the ACP setup we have, because why don't we see this on AWS or Azure, which sustain the same kind of workload? So we are trying to find the differences, which could be related to really low-level topics. There have been multiple leads; thanks, Basil, for pointing out that it could be related to the maximum number of connections: despite what the metrics say, it's still important to check. We delivered a set of nginx tunings earlier today, because we were able to reproduce the issue. We spent quite some time on this one during the past days, and some more earlier today. Hervé, I asked you to help re-trigger a build forced onto DigitalOcean, using the ACP with the new settings; did you have time to check the result of this one? Yes. Did it fail or did it succeed? It failed because of a timeout the first time; I restarted it, but I did not have time to check why it failed again. Okay. So right now there is no blocker.
First, because thanks to Hervé's work during the past days (I think it was Friday), when the builds are running on DigitalOcean, they use JFrog directly. It's a bit more bandwidth, but it allows developers to run the big amount of BOM builds that we're having. So: currently not blocked, no ACP when on DigitalOcean, and work in progress on enabling the ACP on DigitalOcean again. Thanks for checking this one.

And I think that's all... there has been a discussion triggered by Tim about why we are using nginx instead of our own Artifactory instance. The discussion is healthy; it sounds like there are different solutions. I'm not sure if it's a willingness to change the whole setup, or just asking questions to think it through; I'm not really sure right now. We are going to check this one, because DigitalOcean (that's a subsequent subject) might need to be stopped in a few days, so maybe we won't have to deep-dive on that one. So, outside of the build issue I just mentioned, which we are checking with the new nginx setup, I propose that we don't spend too much time here. A timeout again? Timeout. Oh, yeah. Okay, good. Gotta check the timeout reason, then. Thanks.

"GPG key expires on March the 13th": that issue is only there waiting for tomorrow's LTS, which will be the end of the whole thing. Right now the weekly Debian repository and Red Hat repository are signed with the new key, but not the LTS ones. That will be okay tomorrow, and then the issue should be closed, so it will mechanically move to the next milestone. Any questions? Okay. I'm really sorry for the users that are having issues and are frustrated by that problem; you just have to wait until tomorrow and the problem will be gone.

Yeah, and I have the action item from the board meeting on Monday: make sure we convene a retrospective to figure out what we do to prevent that in the future. We may want an admin monitor months ahead, etc., etc. There are lots of things to improve. My apologies, we made a bunch of mistakes on that PGP expiry and on the code signing certificate; we'll get better. Thanks, Mark. So let me write this down: incoming postmortem, to improve for next time; closed after the LTS release tomorrow.

The DigitalOcean credits are almost all exhausted, because of the unusual activity on the BOM that we had last month. Not only does the activity seem to be more and more above normal, but with the EC2 issues we had and the shift of traffic and workloads, we consumed $8K instead of $1.5K during the past month on DigitalOcean, to handle the big amount of builds. So we only have a few days of credits left; I think we are under $1,000, and it's my credit card behind this one, so I will stop the service once we have depleted all the credits. Hopefully Hervé was able to communicate that to DigitalOcean today, so we should have an answer soon. The question is: are they okay to extend the credits or not? We'll see. Maybe that will be the end of the DigitalOcean platform, I don't know. If it's the case, we will have to search for another sponsor, and if it's not the case, we still have to search for sponsors to sustain the increasing amount of builds from the developers. That's a good sign of life for the community, but it's quite the nightmare for us: we have to search for funding in that area. Any questions on that topic, or anything I could have forgotten? Okay, so let's wait for the answer from DigitalOcean about adding credits again.
We are really thankful to DigitalOcean for helping us, because we really had the opportunity to shift our workloads directly to DigitalOcean with almost no pain; that was quite easy. The only issue, and it's not even sure it's DigitalOcean-related, is the ACP. The API and docs were quite nice. So again: thanks for what you did for the Jenkins project; I hope we will be able to continue.

We have a user saying that the password reset email is not coming. They tried multiple times and they don't receive it. The Jenkins infra discovery is that accounts.jenkins.io sends email through the SendGrid SMTP, which means we have a working SendGrid account. I've asked Kohsuke for access to that SendGrid account, but he gave me access to a Mailgun account instead, so I might have missed something. So from now on we have two options: wait for Kohsuke to tell us if he has access to SendGrid and if he can share that account... I don't even know if we pay for it, so, Mark, I might need your help in that area. SendGrid is definitely in the list of infrastructure that we use; I'll have to read that page to be sure. Okay. The main ask is: can you ping Kohsuke, just so he sees two different persons asking him to check SendGrid? Because I might have missed something, but I need help to get an answer. I'll ping Kohsuke, yes. I will communicate with the user, but right now we cannot analyze why the user is not receiving the email. My proposal is that we wait until the end of the week, and if we don't have any news from Kohsuke, or anyone else who could help us get access to SendGrid, then we shift accounts.jenkins.io to Mailgun, since we have access to it. At least we would then be able to monitor the state of the emails that could have been refused by the remote mail servers. Does that make sense? Wait until end of week; if no SendGrid access, then shift to Mailgun, so we can diagnose. I'm going to tell the end user that we have an issue accessing the email SMTP provider, so we cannot check; either they can use another email address, or wait for us to get access.
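A first diagnostic step, once we do have provider access, would be proving that SMTP authentication and relay work at all. A minimal sketch assuming SendGrid's documented SMTP settings (host smtp.sendgrid.net, port 587, literal username "apikey"); the key and addresses are placeholders.

```python
import smtplib
from email.message import EmailMessage

# SendGrid's SMTP endpoint uses the literal username "apikey";
# the key and the addresses below are placeholders.
SMTP_HOST, SMTP_PORT = "smtp.sendgrid.net", 587
API_KEY = "SG.example"

msg = EmailMessage()
msg["From"] = "noreply@example.org"
msg["To"] = "infra-test@example.org"
msg["Subject"] = "accounts.jenkins.io SMTP connectivity test"
msg.set_content("If you can read this, SMTP auth and relay work.")

with smtplib.SMTP(SMTP_HOST, SMTP_PORT, timeout=10) as smtp:
    smtp.starttls()                # TLS is required on port 587
    smtp.login("apikey", API_KEY)  # fails fast if the credential is wrong
    smtp.send_message(msg)
print("Message accepted by the SMTP server")
```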
We have work in progress on using Azure ARM64 virtual machines. As a reminder, Azure announced in December 2022 that virtual machines with ARM CPUs were generally available, so now we can use them. That's good news, because last month we shifted all the EC2 virtual machine workloads of our private controllers to Azure VMs, to decrease the AWS bill, to have less bandwidth cost, and to have centralized management inside Azure. But we still need ARM64 Linux virtual machines for building images, and right now we keep using EC2 for that. This issue is about doing the same on Azure. Do you have a status report on that one, Stéphane? Yes: this morning I was able to switch the image from Ubuntu 20.04 to 22.04, thanks to Hervé's work, so it seems to work better. I am waiting for another pull request, to add that 22.04 image to the compute gallery in Azure, to be able to spawn those VMs. So it seems good. There are some pull requests pending for the gallery management. Okay. Exactly. Cool, thanks. So we'll have to check this one. Any questions in that area? I don't, same thing.

A question that we had privately; it was just an idea, and we are going to open it up more and more. We were thinking about being able to use these ARM64 machines, on both AWS and Azure, for the workloads that shouldn't be concerned by the kind of CPU. I think you asked about the BOM. The question is: the BOM, the whole set of PCT steps, those 300 steps that cost us a lot, are there some that could run on ARM64, or even all of them? Because the cost saving is between 15 and 25%, depending on the cloud and the instance types. It's not the only thing we have to do, but at least gaining 10 to 15% of the cloud cost for this bill could be interesting. I'm not aware of anything in the BOM, or in the PCT builds inside the BOM builds, that is dependent on Intel architecture, on AMD64. I would expect, at least, that the plugins I maintain run happily on ARM64, just as well as they run on AMD64. So that could be interesting for this one, and eventually for the plugin CI, though not for all plugins. That could also be something proposed to the developers: saying that by default we might want to shift to ARM64, unless you need specific Intel bindings, in which case we should provide something in the pipeline library for building plugins. Well, and we certainly could already provide an argument in the pipeline library that allows you to opt in to ARM64, or opt in to "platform independent"; maybe ARM64 is the wrong choice, but saying "hey, as far as I know this plugin is platform independent, pick any one", and it would allow us then, on occasion, to run on System/390 if we wanted.

Of course, building Jenkins itself and running the ATH won't work on ARM64. In the case of the ATH, not only because acceptance tests want to reproduce real-life usage (we could add specific acceptance tests only for ARM), but you have to know that most of the Docker images used in the acceptance test harness are Intel images, and the required work for the ATH would clearly be too high for now. That's why the ATH is out of the question. In the case of Jenkins core the question can be raised, but I think core itself might be a bit complicated.

That's also something for us inside infra to think about: all of our workloads should be able to run on ARM, to decrease our bill: Kubernetes management, Terraform projects. Most of the tools that we're using are statically compiled with Golang, so ARM is one of the targets and we can use binaries for this platform. Is that okay, or do you have questions, objections, things that make you think? So, in terms of net cost savings, there are other things that are higher priority, right? There are other things that we think could save us more cash. But I like platform portability, and the cost savings are nice. I see that one as not top priority but still important, because we have to do it step by step. And as you say, if we start with opt-in, we can do a little bit of this task every week until we have something working. The task itself, as noted, is mandatory, because we need to be able to opt out of AWS at any moment, so we need the same feature parity between both clouds.

The next task is the same kind: a long-running task. We have to upgrade all of our Ubuntu instances out of 18.04, and ideally out of 20.04; that is, to update everything we can to Ubuntu 22.04. So thanks, Hervé: you did the first release of the Packer image using that new Ubuntu version. We are going to roll out that version first on our private controllers and verify that we can run Docker and our tools without any issue.
Then we should be able to roll out to ci.jenkins.io. So I propose, if it's okay for you, that we deploy to infra.ci as soon as we can. And my proposal, arbitrarily, if it's okay for you, is to roll out to ci.jenkins.io, with an announcement to developers, on Thursday. So tomorrow, Wednesday, will be the LTS, and Thursday will be Ubuntu 22.04 for most of the agents. Does that make sense for you? Is it okay? Cool. And the next step, obviously, will come from the team, so it can be Hervé but it can be someone else: planning the migration of all the Linux node pools that we use on all of our Kubernetes clusters, so that the underlying machines switch to Ubuntu 22.04. We can control that for sure on Azure; on AWS, I'm not sure, I think we're using Amazon Linux, so we shouldn't be concerned, but at least on Azure. And I don't know for DigitalOcean. Is that okay for you folks? Taking notes: we have the Packer image to roll out to ci.jenkins.io Thursday, testing on infra.ci first; we have the Kubernetes node pools, at least Azure, maybe DigitalOcean. And the next step will be checking our Docker images, checking Jenkins... I think we have at least the OpenVPN image that is still on Ubuntu 19, so that should be upgraded. And we've got the VMs. Yeah, but afterwards. The two next issues, unless you have another point on Ubuntu 22.04? Okay.

So: "document the code signing certificate renewal process" and "renew the signing certificates". These ones are quite linked right now, so I propose that we give a summary of what we did today, Mark, Stéphane, Hervé, and what the upcoming next steps toward the release could be. Is it okay if we discuss that now? Yeah, sounds good to me.

So, documentation: we have shared, the four of us, in a private channel, because of the sensitivity of the data we exchanged. We were able to upload the new DigiCert certificate, after fighting to convert it between different OpenSSL certificate formats, but it sounds like Azure accepted the certificate and was able to parse it and determine all the meta information, which is a good sign. So we should be able to update the documentation soon, once this is finished: there were just a few missing elements compared to what we had; it wasn't complete, but it was good enough for us to get started. Now we need to be sure that the new DigiCert certificate we added inside the vault can be used for two things. First, to generate a signed MSI installer for Jenkins: new cert inserted in Azure, need to check MSI generation. And second, the jar signing by Maven, the tricky part. Could we change that wording from "war generation" to "war signing"? Because we generate the war file just fine; it's that we want to be sure it's signed. Yep, "signing" is the correct word; thanks, Mark, for raising this one. Yeah, otherwise it's confusing.

Today, the weekly has been generated and signed with the former certificate; the verify step, which is the last step of the release pipeline, failed. Now it's time to decide what we do with the packaging: do we want to start packaging this version? Let me write this down: today's weekly is released with the old cert. The packaging is currently configured to not generate an MSI, signed or not: no MSI at all for today's weekly.
The goal was to avoid a stressful situation with that MSI: it would have been signed by the old certificate, and every user downloading it would have seen a big red error message on Windows, which is pretty scary. We don't want that; that's why we said better no MSI at all. It's a weekly release: users can wait, or take last week's release, or the LTS. Now, do we want to trigger the packaging build as it is? Or do we want to roll back Mark's change that disabled MSI generation, and add a second change that uses the new DigiCert certificate, and check that we can produce an MSI with the new certificate? Meaning we can validate that the actions we took in the past hours are okay, at least with the new certificate. In that case the jar would still be signed with the old one, not verifiable, but the MSI should be okay.

So, to be sure that I captured the alternatives. Alternative one is: continue the packaging as is, no MSI. Right. Alternative two is... and after that, then revert... well, no, that's okay. Alternative two: add the MSI back to the packaging, with the new certificate signing, and continue the packaging. Right, with the new cert signing. Exactly. And a new weekly with this one? No, no, we could continue the existing one, because it hasn't run the packaging step yet, right, just with the new signing. So in that case the current weekly includes an MSI and a signed jar, right, or a signed war file. Exactly. The danger is that there may be some surprise in there where we say "oh whoops, something failed", and alternative one might have detected that failure. But for me, alternative two is a much more attractive choice. Alternative three would be a full new weekly with the new cert. Right, alternative three: a new weekly build using the new certificate, with the risk of failure because we missed something somewhere. Right, and then we're stuck in the mode of trying to diagnose what failed, and having diagnosed what failed, we've got to figure out how to fix it. So we may choose between alternatives one and two, and add three on top. Correct: alternative three is possible even if we choose one or two. That's correct. And really, we could do one, and then we could even attempt two... no, no, I take it back, we can't: we choose either one or two, and then three. We cannot do both one and two, because as soon as we've performed the packaging step, if it's successful, it will not repackage that same release again.

I would choose two and then three, personally. Yeah, same here, because two includes one. Same here. My goal in proposing alternative two right now is to have an intermediate step to validate, as we said, the certificate's state: is the new certificate correctly encoded and usable? Well, and for me alternative two is more attractive because it's also much faster: when we choose alternative three, we have at least a two-hour window before we get any feedback, whereas the packaging step completes in 30 to 45 minutes. And if there's something wrong with alternative two, we can stop, think, find the cause, and then do three. Right, exactly. Anyway, there is a risk that, even with alternative two working as expected... the way packaging consumes the DigiCert certificate is different from all the Maven consumers, so we could still have a bad surprise when Maven tries to read the file, but I assume that...
Yeah, at least we have a smoke test with alternative two: if it smokes, it's bad. And then Maven will fail immediately, before generating any data, as far as I can understand. And remember, we can start it again. Exactly. The only thing is that we are time-constrained, because we must finish everything before the LTS tomorrow. That's a request from me as Jenkins infrastructure officer, because I don't want the LTS build to discover bad surprises at the last minute. It would be great to finish before midnight. There isn't a fixed timeline for the LTS tomorrow, so as soon as we take the decision, we can say "don't start the LTS process", because it's a human that triggers it. We can ask that human to wait for the weekly to be finished, even if it's delayed by a few hours. Is that correct, Mark? That is correct: the LTS is triggered by the release lead, in this case Kris Stern, and he'll trigger it. All we have to do is ask him: hold, please don't trigger until we give the go. Perfect. Okay for everyone?

Should we ask Tim Jacomb, the release officer, for a last stamp on this one? Is it okay if I ask him directly? I think that makes sense, yeah; we certainly want him informed about what we're doing, and asking for a pause on the launch of tomorrow's LTS... he's a good one to ask, and copy Kris Stern on it. Okay, and we should probably copy Alex Brandes as well, just because Alex has been a frequent release lead.

And what are, let's say, the annoying parts of triggering a new weekly in an unexpected way like this? What additional work will it generate for everyone here? Documentation, as I understand, maybe more changelog things? Sorry, ask your question again, Damien. What are the consequences, that I don't foresee, of triggering a second weekly? We would have to create a new changelog, but the changelog is pretty easy to do, and that's a very low-risk item. Yes, people will wonder why the changelog is so brief, and we'll put a banner on it that says "this changelog is so brief because we were verifying the MSI". Great. Okay. Cool.

Alternative two does not generate a new weekly version: it packages the one that we've got, with the same version. The only one that triggers a new version is alternative three, and this one would be packaged correctly, with everything signed with the new, non-expired cert; it's not only a check, it's the real good one. So, if alternative two is successful, it will result in a signed jar or war file and a signed MSI... or no, I take it back, that's wrong: alternative two only validates MSI signing, not the war file. You're correct. So one of the changes in alternative three's changelog, as you said, Stéphane, would be "the war file is signed again". Correct. Yeah. I'm not sure the signing was even applied with that expired certificate, so I suspect the current war file just looks unsigned. Okay. Yep.

So, for that, taking notes: we will ask validation from Tim before proceeding, and we will let Kris Stern know, as the LTS lead, just before proceeding with alternative three. For alternative two we don't really need it, because two is just packaging, so they don't really care, I think. Yeah. Yeah, absolutely: before proceeding with the three.
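Coming back to the "is the new certificate correctly encoded and usable" smoke test: one cheap check, before involving the packaging at all, is loading the certificate bundle and inspecting its metadata, much like Azure did when it parsed the upload. A minimal sketch using the Python cryptography library; the file name and password are placeholders, not our real setup.

```python
import datetime
import sys

# Requires the 'cryptography' package. File name and password are
# placeholders; the real certificate lives in the (private) vault.
from cryptography.hazmat.primitives.serialization import pkcs12

def inspect_p12(path: str, password: bytes) -> None:
    """Load a PKCS#12 bundle and print the signing certificate metadata."""
    with open(path, "rb") as fh:
        key, cert, _chain = pkcs12.load_key_and_certificates(fh.read(), password)
    if key is None or cert is None:
        sys.exit("Bundle is missing the private key or the certificate")
    print("Subject :", cert.subject.rfc4514_string())
    print("Expires :", cert.not_valid_after)
    if cert.not_valid_after < datetime.datetime.utcnow():
        sys.exit("Certificate is expired, do not sign with it")

inspect_p12("digicert-2023.p12", b"changeit")
```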
So, as you said, Mark, the board asked for a postmortem for both the GPG and the code signing issues, the main conclusion of which, I believe, will be "do it earlier". Yes. Well, I mean, there are a number of other things hiding in it, but the most obvious is "do it earlier". Yes, absolutely. A set of different users complained about specific details, like changing the key name or not; okay, we can discuss that, it's healthy to raise these points. But I see in those proposals mostly the frustration of someone not being able to finish their day-to-day job, which I can understand. Yeah. There are consequences of doing, of not doing, of not saying no to end users sometimes. But okay. Do we have other elements in that area, about the signing certificates and everything? Okay. Any questions, objections, anything unclear that you want to bring up on these topics? Nope.

Now, next issue: realign repo.jenkins-ci.org. It's easy: nothing was done. I started on the Helm chart and then we had a lot of other tasks, so I didn't have time. Anyone interested in helping to find something viable that we can run on virtual machines or Kubernetes is absolutely welcome to help here. I still have that Helm chart to test locally.

Finally, an issue I saw just before the meeting: you were able to send a pull request, Hervé, on that topic, allowing us to close that part. It's about the disk space of the container agents used for the BOM builds, which had triggered an issue; we don't have the issue today, but it's an improvement, especially for performance. Can you remind me of the status of this one? I've absolutely forgotten. So, I have opened a pull request to mount /tmp and the /home/jenkins folder as emptyDir volumes, as they are currently on the overlay filesystem. Okay, not on the overlay anymore: that's a good thing. Will you be able to sync up on this one after, or tomorrow maybe? I'm wondering about the testing process. I believe we should be able to test it. So yeah, is it okay for you, or do you want to discuss it there? Okay, just to let you know that we might have to select an annoying path, a slow path, annoying because slow, to be sure that we don't break all the agents at the same time. I will give you the elements, because I feel I wasn't clear, and I didn't take the time to explain the testing path there; I wasn't expecting the test path to be a question. So the goal is to discuss and learn all together how we could test this kind of element in production without breaking everything.
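For context on that pull request: the idea is that /tmp and the agent's Jenkins home sit on emptyDir volumes (node-local scratch space) instead of the container's copy-on-write overlay filesystem. The real change lives in our pod template definitions; this sketch only illustrates the shape of it with the Kubernetes Python client, and every name in it is a placeholder.

```python
from kubernetes import client

# Placeholder image and names; the real definition lives in our pod
# templates. Each emptyDir is scratch space on the node's own disk.
volumes = [
    client.V1Volume(name="tmp", empty_dir=client.V1EmptyDirVolumeSource()),
    client.V1Volume(name="jenkins-home", empty_dir=client.V1EmptyDirVolumeSource()),
]
mounts = [
    client.V1VolumeMount(name="tmp", mount_path="/tmp"),
    client.V1VolumeMount(name="jenkins-home", mount_path="/home/jenkins"),
]
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(generate_name="bom-agent-"),
    spec=client.V1PodSpec(
        containers=[
            client.V1Container(
                name="jnlp",
                image="jenkins/inbound-agent:latest",
                volume_mounts=mounts,  # writes bypass the overlay filesystem
            )
        ],
        volumes=volumes,
    ),
)
```

An emptyDir is backed by the node's own disk, so the heavy I/O of a BOM build stops going through the overlay layer.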
Hervé, I just saw your message about DigitalOcean: an answer? What did they say? Give us the gist. They approved the extension and the renewal of the sponsorship, and Oliver will help us in a day or two. Great. Okay, so does that mean they are extending again, or just putting in what you requested? That's the surprise; we will see, "we will help you in a day or two". Okay, so at least that means we won't have to shut down DigitalOcean; we didn't renew the token for nothing. And that means we will have to continue our analysis with the BOM. Excellent. The message from them sounds like they're willing to continue funding us, at least at the level we were funded before, other than the one extraordinary month. I have no idea about the amount: I just read you almost the whole response, and I have no idea of how much. Well, thanks to DigitalOcean, and thank you for asking them, that's great. If it's okay, I propose that we decrease the capacity a bit, just as a safety measure for my credit card. If it's okay for everyone, I will open a pull request that decreases the maximum number of builds back to normal, so we won't have the extra burst capacity, but it will still be used. Is that okay for everyone? Yes. Fine for you, Damien? Yes. Fair.

A few new issues, if it's okay for you. We received an alert about the update center certificate that needs to be renewed; the last renewal was done one year ago. We have two months, but I propose that we do it in advance, if it's okay. Thanks, Stéphane, for converting the calendar alert into an issue. I'm going to add this one to the next milestone; we'll have to work on that. I started already, I sent you some stuff. Cool. I'm looking at the new issues; let me add them to the notes, I will format the notes afterwards. So we have this one; the "DigitalOcean credits depleted" one is already in.

This one: migrate trusted.ci.jenkins.io from AWS to Azure. That's the next big thing for you, Stéphane. We have three virtual machines that allow trusted.ci to run, currently on EC2. We want to move those workloads to Azure, so we can have centralized management and less bandwidth cost. We might need to move only the virtual machines, depending on the, let's say, network security solution we select. The third machine, called "bounce", is an SSH bastion; we can keep the same pattern, or have an Azure Bastion, since they provide one as a cloud resource, or we could use a VPN. That's something to be discussed, but at least starting the Azure Terraform resources for two new virtual machines, with their associated data disks, installed with Ubuntu 22.04, inside a private network of their own, would be a great start. Is it okay for you to take this one, Stéphane? Yes. The sensitive part will be the permanent agents. Once we're able to manage these two machines with Terraform and Puppet, empty, inside the private network, as a first step, we will have to think carefully, with Daniel (and Daniel is the bus factor here), about the way update center generation works, because its cache uses hard links. We'll have to rsync and migrate this while preserving the hard links, not symlinks, and hard links are really sensitive things: several names point at the same file content, so mishandling them silently breaks or duplicates the associated data. That's the most tricky part. Low risk for us, because we will only run rsync from one machine to the other, but still: be careful. I was about to say something there, but no, I'll keep it. So Daniel will be our source of knowledge for that part; we already asked him for elements. There will be plenty of ways to test it without any risk; it's knowledge sharing. The goal is to control costs on AWS, and to upgrade that machine to Ubuntu 22.04; that part will be the easiest, because we only run Docker images or JDK builds. Any questions or objections on this one?
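Since the hard links keep being called the scary part, a minimal illustration of why: two directory entries point at the same inode, so a naive copy, or an rsync without its -H / --hard-links option, silently turns one file into two independent ones, while the update center cache relies on them staying the same file.

```python
import os
import tempfile

os.chdir(tempfile.mkdtemp())

with open("original.txt", "w") as fh:
    fh.write("update center cache entry\n")

os.link("original.txt", "alias.txt")  # second name for the same inode

same = os.path.samefile("original.txt", "alias.txt")
links = os.stat("original.txt").st_nlink
print(f"same inode: {same}, link count: {links}")  # True, 2

# Removing one name does NOT delete the data while another name remains,
# but a link-unaware copy would have produced two separate files here.
os.remove("alias.txt")
print(os.stat("original.txt").st_nlink)  # back to 1, content intact
```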
There is a third one that I've added that we need to start on. It's not top priority, but it's really important: we are spending more than 1,000 bucks per month just on outbound bandwidth from the ci.jenkins.io controller. We have identified two potential sources of outbound bandwidth. The first one is people browsing ci.jenkins.io. At first sight I'd say that's just a few web pages, but as some people here already mentioned: downloading the logs of the BOM builds. And I'm sure that both Mark and Hervé can confirm that clicking on "full console output" takes some time, because it's 30 to 40 megabytes of log output for a BOM build. So that one could be a source. It looks like we have the Apache logs parsed in Datadog, so we need to go to Datadog and run whatever magical query, or put them in SQLite and run whatever; I'm neither a Python nor an SQL person, so I will happily use the Datadog interface or a shell.

The other source, which could also be quite big, is the stash and unstash. For a BOM build, we are stashing what is called a "megawar": a war file with the set of plugins being tested as part of one of the 300 builds. It's generated one time, stashed one time during the preparation phase, and then unstashed for each of those 200 or 300 steps. So if you generate it on DigitalOcean, you stash it, which sends it to Azure; stash is only inbound, that's okay. But then, for every AWS or DigitalOcean BOM PCT build, you have to unstash it 200 or 300 times, and that is outbound bandwidth from the controller. That is a lot of data. One of the solutions is using what we call an external artifact manager. These are alternative implementations of stash and archiveArtifacts: instead of zipping the files on the agent and sending them to the controller (and unstash sending them back to the agents through the same protocol, inverted), they say "let's use that S3 bucket": the agent copies the data to the S3 bucket, and the other agents reuse it from the same S3 bucket. The part I don't know is: can we do that with one S3 bucket inside DigitalOcean and another inside AWS? I don't know; that's the part to be studied. But that would avoid a bunch of stash and unstash traffic hitting ci.jenkins.io itself: less pressure on it as well. Yeah, or could we stash to the artifact caching proxy? Right, that's another angle, you're reading my mind: we've got an HTTP server sitting there. Right. But the problem then is the permissions to write to that thing, and that may not be what we want. Yes, it might not be usable for this; I don't know what it would require, and I don't think it has been built for that use case. Right, but yeah, that's an option. Can I ask you to add your comment on the issue? Yes. So that one, on the budget side, would allow us to host the trusted.ci virtual machines without any overhead cost, if we're able to decrease the bill here. Any question on this one? Any objection to adding it? It's the next big cost item for us.
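The external artifact manager idea in one picture: the heavy megawar goes to an object store once, and the 200-300 PCT steps pull it from there instead of from the ci.jenkins.io controller. A rough sketch with boto3 against S3, with made-up bucket and key names; DigitalOcean Spaces exposes an S3-compatible API, so the open question above is mostly about configuration, not code.

```python
import boto3

# Placeholder bucket/key; DigitalOcean Spaces would use the same calls
# with endpoint_url="https://<region>.digitaloceanspaces.com".
s3 = boto3.client("s3")
BUCKET, KEY = "jenkins-stash-example", "bom/12345/megawar.war"

def stash(local_path: str) -> None:
    """Upload the artifact once, from whichever agent built it."""
    s3.upload_file(local_path, BUCKET, KEY)

def unstash(local_path: str) -> None:
    """Each of the 200-300 PCT steps downloads from the bucket,
    so the controller never serves these bytes itself."""
    s3.download_file(BUCKET, KEY, local_path)

stash("target/megawar.war")
unstash("megawar.war")  # called once per parallel step, on each agent
```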
Some less important issues, but I just wanted to mention them; I'm going to move them to the next milestone. Launchable on agents: basically, someone is working on Launchable and requires some tooling, and created that issue just to keep track of it. It's not a priority now, it's low priority, but we can do it if we have time: install the Launchable command line tools inside our machines and VM templates, so it won't need to be installed each time Launchable is called. It's an optimization.

Then we have one, thanks, Hervé, for adding it: create a script to lock and unlock Azure resource groups. I almost deleted the production ci.jenkins.io last week, due to a human mistake with Terraform. Azure provides a way to lock some critical elements, so the direction we are all going, the consensus, is having an external script where we identify the sensitive elements and lock them. That way, any change, through Terraform once it's managed as code, or even today while it's managed manually, that would accidentally delete and recreate the resource will be blocked and forbidden, even if the user has the permission. I think it's a bit too early to spend time on that, we already have enough, but let's keep in mind that it should be done. It's like backups: the more operations we are doing, the more critical it becomes, so better to plan for it instead of suffering for not having had the time.

Then we have an issue that is also for later, so no priority: removing credentials and using workload identity management. We could do it for cert.ci and ci.jenkins.io today, but that's, let's say, a bonus. The rest is for later. I don't have any more issues or recent changes to speak about. Do you have other topics, or is it okay for you folks?

You don't want to speak about the discussion you had to get budget from the CDF, I think, for Amazon? I think you may be referring to a request to adjust the settings in the Azure account to allow us greater flexibility. As far as I can tell, that's resolved, at least resolved in the sense that the question has been asked and clarified with Fatih. Did you mention the CDF? Yep, you're right. Oh, correct, thank you.

Yes, good point: we received an answer from Docker about the open source program, despite us closing the issue, because they went back on sunsetting the free Team tier. They confirmed that jenkinsinfra and jenkins4eval are both part of the open source program; they just weren't labeled as such publicly. Thanks for the reminder, good point. I confirm that this is what we want, because there is no need, and I would even say it's safer for us, to not advertise this publicly. Not because it's secret, but because these two organizations host images for our own usage, in the case of Jenkins. So I don't see the point: the goal is not to publish them for external usage, because the effort of helping and supporting people on our own use cases would bring nothing. That's up for debate; we can always ask them for the badge. As Hervé checked, we have more than 50 million pulls on some of our images on the infra organization, so these images are popular. That's bothering me: it means millions of people are downloading images that have no documentation, no guarantee, no lifecycle. Out in the open... I mean, we could run some Bitcoin miner and gain some money for the sponsorship, right? Sorry, I can't believe that's coming from your mouth: a sponsoring proposal now. Oh good, the richest open source program in the world, right? No, jokes aside, that's bad for them. But my proposal, and again you can disagree, but think about it: I don't want us to be advertised as "hey, come reuse their tools". We don't even have time to support our own use cases, so supporting our use cases on someone else's machines is... anyway, it could be a way to get some contributions, though, so mixed feelings about that. But right now, the proposal is that we keep things as they are, and we think about deleting the jenkins4eval organization from Docker Hub in the future and using something else instead. Please don't advertise this element; this one must not be advertised. I can accept it for jenkinsci and jenkinsinfra, but if you go advertising jenkins4eval, yeah, you will have the Jenkins security team coming for you.
They will find your address, they will find you, and we will never hear about you again. No: jenkins4eval is only untrusted workloads. Please don't advertise it; it's only for testing in short time windows, and we cannot be sure the content there is safe to run. We can vouch for jenkinsci and jenkinsinfra because we use them, but not jenkins4eval. Other topics?

Okay, so before I close, I just want to thank everyone here for the huge work you have done during the past two weeks, all of you. Honestly, these have been two really challenging weeks, with a lot and a lot of things. And I'm really happy, because two years ago Olivier was alone, he was suffering, and he did that for years. I know the amount of time and the difficulties; we all have a lot of work, we don't all work together the same way, we all have different ways of working. But I'm really proud of the team and the amount of tasks and things we do. And personally, it's a relief, because I know that when I'm not there, I can come back and all my tasks have been reviewed by you folks, and you have the ability to say no, to challenge what I said. It's a really, really safe place for me. So really, thanks, team, because that makes the project really healthy and really efficient. And on these good words: see you next week for everyone watching this, and for the others, see you tomorrow.