Hello everyone, welcome to the weekly Jenkins Infrastructure meeting. Today is the 16th of August 2022. I'm adding in the chat the link to the collaborative notes, which will be exported to the Jenkins Infra documentation on GitHub as usual, and the recording will be published. Today we have myself, Damien Duportal, and Hervé Le Meur. Mark is on PTO, Stéphane also, Basil, I think so, Tim is not there, and we have Bruno Verachten. Can everyone hear me correctly, see the link, and access the notes? OK for me. OK, let's go.

The new weekly: the WAR has been published successfully. I don't know about the rest of the checklist, to be quite honest. So we might have to wait a bit before updating the Docker image, which should be updated automatically in the next few hours. I think it's done, if I'm right. So the release checklist is still to be finished. I don't remember who is in charge of that. We can double-check with Tim in the Jenkins release IRC channel. But no error whatsoever, so we can start breaking the infrastructure again, as usual: Tuesday is the weekly, and then we break the infrastructure. No other announcement. I don't know if you have something to share, folks? Nope, OK.

Upcoming calendar: next Tuesday we have a weekly. I don't know about the next LTS. We had an LTS last week, so I assume the next one will be at the end of September. I can never remember when, so let's just use the Jenkins public calendar. Next security release: none, as far as we know. Next major event: DevOps World, end of September. Just kidding. Hervé, Bruno and I will be there, so anyone interested in infrastructure can come and meet us there.

Let's get started. What did we do since last week? Not so much, because Hervé just came back from PTO today and I was alone. I finished the Trilead API plugin fixes, which involved specifying the Java binary path used to spawn the Jenkins agent process on all of our templates. That created a lot of chaos, I'm sorry for that.
That's the value of having a team: when there is a team, they restrain me from breaking things. The idea is that for the JDK 8 and JDK 17 agents, we had to add JDK 11. The goal is to run the Jenkins agent process with the same JDK as the controller, while the default JDK stays the one expected by developers. As a developer, when you use a JDK 8 label, you expect "java -version" to report Java 1.8, while the Jenkins agent process should run with JDK 11, same as our controller as of today.

One exception: the EC2 Windows machines. They are using JDK 11 because it's the default, so they are not blocking that issue specifically. However, when JDK 17 comes, we will absolutely need to be able to change the default JDK for the agent. Right now, there is a feature request that I've opened on the EC2 plugin. Tim commented that it should be quite easy to add and should be easier than hacking around. I agree with Tim. Next step: contributing to the plugin. Anyone interested is welcome to help us, of course. Yeah, I will probably try it. I looked at the modification for Azure and took a quick look at the EC2 plugin. I think I've located some part of the change for the Mac, Linux and Windows launchers. That's cool. Those templates, thanks Hervé. Not blocking the issue.

Another consequence: I've opened a bug and contributed a fix to the Kubernetes plugin. It's in review. I don't know since when, but the pod agents on Jenkins were missing the system information item once connected. You were only able to check the pod logs and events, but not the system information, which is about the JVM: information about the operating system, the JVM parameters, the environment and so on. All the other agents have this, and it's quite useful in the UI for admins, specifically when you want to check which JDK, which version and which path is used to spawn the agent process, which was the core of this issue. That is a plugin contribution.
Let's see if we can improve that situation for the upcoming JDK 17. The production proof will come when we change to JDK 17. If you get there and you have Trilead plugins breaking your infrastructure, please do the same as what we did on our infra: check the JDK used for your agents. Is there any other question?

Next one: Puppet upgrade campaign. Our Puppet infrastructure is up to date with the latest 6.x mainline. All agents are using open source Puppet, but I rediscovered (I should have known, but I did not; thanks Olivier for the reminder) that the Puppet master is using the Puppet Enterprise edition with the free license for 10 nodes. That means that if we want to add more Puppet-managed nodes, we are close to the limit. The proposal is to switch to an open source Puppet master for the upgrade to version 7.x. We keep Puppet Enterprise for today because it's not blocking us right now. At first sight, the only added value of Puppet Enterprise in our use case is the installer, which would be replaced by a few command lines for the upgrade. Outside of this, I'm not sure there are other operational changes, and we don't need support. So let's see next time. But at least we are up to date.

Let me check. It has been updated. All the issues that we had contain a lot of details, since I was alone. So anyone wanting to do the next upgrade will have to read the previous pull requests carefully. I tried to put in as much detail as possible. Writing a runbook was not really worthwhile because it's a one-off process specific to this version. What else? Both of these issues created a lot of chaos and I apologize for that. But now that should be clear, and we have a road to the next Puppet configuration. I hope it will help. As I said, next item, unless you have questions.

So, the Maven 11 issue caused by Trilead: fixed. Sorry for that.
It was a side effect of working around the Trilead issue. Same thing for the Windows Maven agents. I was in the middle of working on changes and I lost track of things.

Next item, [unclear]. There was an issue caused by the Update Center. It was on Daniel's side. Daniel spotted it and fixed it. It looks like Mark and Daniel did not communicate officially; they merged pull requests [unclear]. Thanks a lot, Daniel, for stepping in. He checked the Update Center over 24 hours and fixed the hard link. That was not an easy task. It was a link issue, at least. It's OK. Users confirmed it was fixed within hours. There is nothing to do, nothing expected from the infra team. We detected the issue at the same time as the users. [unclear]

Granting triage permission on the jenkinsci [unclear] app: done by Tim. Thanks, Tim.

And finally, we added the LTS release: the deployments for the four controllers, that is, the LTS deployments. Thanks, Alex, for this release. We are up to date. Any question? No.

So now, the items in progress. Access to an npm space. So, Hervé, shall we move that task to the next milestone? Yes. OK. Do you have just a sentence for a quick status on this one? Yes. I'm currently retrieving the jenkinsci account with Tom Fennelly. I've opened a reclaim ticket with GitHub/npm support to retrieve the Jenkins [unclear] scoped domain. And jenkins-cd is owned by M. Wilding, so I don't think we have to... [unclear]. Cool, many thanks for that. For Jenkins [unclear], I'm looking into how to create such an organization. Cool, so I understand that we have to wait for Tom on one side and for the npm people on the other. Yes. Let's move it to the next milestone then.
Thanks a lot, Hervé. I didn't have time to look at this one.

Next one: introduce an artifact caching proxy for ci.jenkins.io. Why? Because we absolutely need to decrease the bandwidth usage on repo.jenkins-ci.org. That's a long-running subject [unclear], and right now we are crossing some limits. The idea is to start with the Docker image and install one to three instances, and then see how we can configure ci.jenkins.io, and eventually others, to use this caching proxy. It's an nginx server that receives requests from mvn downloads: if it has the file, it serves it from its local file system; otherwise it requests it from repo.jenkins-ci.org and caches it locally. The cache can be cleaned either by deleting files, by restarting the instance, or by reaching the maximum age. Why three instances? One in Azure, because we have virtual machines and agents running in Azure. The goal is to stay in the same region and avoid cross-region bandwidth costs. So we also need one in AWS for the container agents on EKS, and the same on DigitalOcean. But let's get started with one at first and see what we get.

So, I asked you this morning: do you confirm you are interested in working on that? OK, moving it to the next iteration. We have to work together once, both of us, to start a local instance; I need to share knowledge with you. Once you are able to run a Maven build with local caching against a local Docker image, you should be able to proceed with the install, and eventually reuse the old code, because there were services that we removed two years ago. Yes, I took a look at the Azure settings XML file which was present before. So, about the way we distribute the settings XML file depending on the cloud: Mark told us that the settings XML could be managed by a Jenkins plugin named Config File Provider, I think. That's a plugin that allows passing reusable configuration files to the tools, and Maven is a tool known by Jenkins.
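As a rough illustration of the caching behaviour described for the proxy in this item (serve the file locally if it is there, otherwise request it upstream and cache it, with cleanup by deletion or by maximum age), here is a minimal Python sketch. It is not the nginx configuration discussed in the meeting: the class name CachingProxy, the fetch_upstream callable, the in-memory dictionary standing in for the local file system, and the max_age_seconds parameter are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class CachingProxy:
    """Hypothetical read-through cache: serve locally when possible, else fetch upstream."""

    def __init__(
        self,
        fetch_upstream: Callable[[str], bytes],
        max_age_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_upstream = fetch_upstream  # stands in for a request to the upstream repository
        self._max_age = max_age_seconds        # entries older than this are fetched again
        self._clock = clock                    # injectable so the expiry can be tested
        self._store: Dict[str, Tuple[float, bytes]] = {}  # path -> (time cached, content)

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]                    # cache hit: no upstream bandwidth used
        content = self._fetch_upstream(path)   # cache miss or expired entry
        self._store[path] = (now, content)
        return content

    def purge(self) -> None:
        """Drop everything, like deleting the files or restarting the instance."""
        self._store.clear()
```

With one such instance per cloud, only the first request for a given artifact in each cloud would reach the upstream repository until the entry expires or the cache is purged.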
So the goal is that for each agent or each build, it adds the settings XML with whatever credentials are needed. That plugin should be able to select different settings XML files based on the label of the agents. So potentially we could use labels with the correct region or cloud to specify the correct settings XML. We might have other solutions. Let's see, but let's start with one instance. Looks good? Yes. My dog disagrees but doesn't have the power [unclear]. So let's get started with this one.

I've put "finish cleanup of mirrorbrain". It's blocking the migration of updates.jenkins.io to Oracle. It's Puppet work. There is some template cleanup on the packaging machine, with a lot of things managed manually. I need to prioritize this one now that we have a fixed Puppet environment. That one required the Puppet 6 upgrade and Hiera version 5 for Hiera key merging. That was done last week, so I can proceed. Which means I automatically remove "migrate updates.jenkins.io" from the milestone and move it back to infra next, because it's blocked. So no reason to work on it. Any question? OK.

Next, I'm going to prioritize the Java 17 Windows agents. The idea was to switch all the agent images to Packer. You did that successfully, at least on the build part for the Linux containers. So in order to finish that issue, we first need to start using the Linux containers in a test environment before deploying, and secondly to start building the Windows containers. Does it look good to you? Should we split the burden there? Or do you want to proceed? It's good. I won't mind doing some pairing with you if you have some time. So I'm adding both of us and moving it to the next iteration. We could go faster by adding a custom-made Windows image, but the build times are absolutely terrible. So if we have to spend some time, better to spend it on one definition to rule them all. No question? OK.

Next one, opened by Tim: we have some parts of jenkins.io built on ci.jenkins.io and deployed from ci.jenkins.io that should not be.
So the idea is to work on moving that generation to either trusted.ci or, better, infra.ci. I propose to postpone this one for two weeks, until Stéphane is back, because right now we have limited bandwidth. Yep.

Publish the acceptance tests on the Jenkins organization. Everything is set up. Tim described the need; that one should be quick as a first step. So unless there is something unforeseen, I propose we keep it in the next milestone. I need to add the Job DSL configuration, that's a few lines, three or four, and then that should be OK. I will ask Tim for validation. If it's not working as expected, I will postpone. Looks good to you? Yes. Unless someone wants to pick it up.

So, updates removed because it is blocked by "finish cleanup". Replace the [unclear] CPU machine: I will remove it from the milestone. Mark shared with me the SSH key to that machine, so there is no action there, except that I want to try an experimental Ruby gem installation of the Puppet agent, because there is no package for that CPU architecture. It's not an emergency, and Mark will take care of that once he is back. So that issue in itself is postponed because Mark didn't have any time. Any question? OK.

Alert fatigue, Datadog and PagerDuty. Thanks Stéphane and Hervé, you did very well on this one by disabling the noisy alerts. So, first, a manual alert silencing was done three weeks ago, and Stéphane disabled PagerDuty for it. The monitor in Datadog is still alerting us, but since that monitor sometimes works but most of the time generates false positives, it is now silenced. The next step requires more work, so I propose to postpone it as well. Is that OK for you? Yes.
No more false positives, and I confirm that we received alerts for the services that failed last week. So thanks Hervé for all of these new monitors. Of course there is room for improvement, but that's not the priority for the next iteration.

Weekly release build does not resume: I need to spend some time on this one. I thought it was a quick one, but some questions need to be asked. So I'm adding it to the next milestone for the analysis part, just to be sure that we want builds to be retried.

Tim asked for a valid certificate for trusted.ci. That requires configuring Let's Encrypt certificates in Puppet with the DNS challenge instead of the HTTP one. No emergency for this one, so let's postpone it as well. There has been the same request for cert.ci, with the same status. Both services are private, so we cannot use the default Let's Encrypt HTTP challenge. We need to create restricted technical accounts on Azure, limited to one or two DNS records, and add these credentials so that certbot can use them to create the DNS TXT record and validate the challenge. Any question?
Nope. Metrics for ephemeral virtual machines: I'm adding that to the next iteration. The goal for next week is to install the Datadog agent on the VM templates with Packer. I lacked the time to do that during the past milestone. The next step after that will be to configure the controller to pass the Datadog API token and start the Datadog agent service when starting an agent, but that one will be in two weeks.

Finally, last item, opened by Daniel: there is a Twitter jenkins_release account which feeds on RSS. The RSS feed is absolutely OK, with the correct times and the correct plugin releases, and that Twitter account should publish tweets based on the plugin releases. It's not working as expected and we're not really sure why. Thanks Gavin, who pointed us to different elements. It seems that neither Mark, I, Olivier, Daniel, Gavin nor Tim have access to the account itself, only Tyler. But the bot behind that account might be running where we run bots; I think it's the Puppet master. So we have to diagnose that part. I'm not allocating time to this one, it's a bonus for the next iteration. That's an analysis process. I'm not sure; it appeared that Oleg had ideas, but we don't know if he has access to the account. If we cannot retrieve the access and cannot find anything, we need to contact Tyler personally. Any question? OK, so the milestone can be closed.

Let's now check if you have new elements. Do you have new elements, folks, on your side? Here at the bottom of my screen you see the issues I've postponed: migrating [unclear] Jenkins.io CN, which is blocked by "publish pipeline steps doc generator". Hervé, "use case and repository for GitHub comment ops": I saw you added content. Is there any action to be done?
Yeah, I'd like some more... yeah, it was an open issue for the community to express themselves. I don't see a lot of comments, so I don't know. OK, was there a community thread on that topic? No. If you want user feedback, that's a possibility: you might want to open a community topic linked to that issue to get a broader audience. Yeah, I think that would be a good start. You could write a blog post, but that's a lot of effort. Yeah, maybe convince Tim to write a blog post. Nice try, nice try.

The rest, "enable [unclear]" and "start using JDK 17": all of these have been postponed to another day. OK, I don't see new issues. Do you have new ones on your side? Nope. OK, so I think we can start the new milestone. For the release, I need to add the final notes to the GitHub release, because of the note generation there. Sorry, when I want to publish that release I need to add the correct notes, so I will do that afterwards. OK, I'll take care of that. I don't have anything else. Do you have something else to add? Oh no, it's OK. OK, so I'm stopping the screen share and stopping the recording. Bye bye everyone.