Okay, hello everyone. Welcome to the weekly Jenkins Infrastructure team meeting. Today is the 16th of November 2021. With us today are Mark, Hervé, and I, Damien. Let's get started with the announcements.

I see that Mark is updating the notes. So the weekly release 2.321 has been released. Is that correct? It is. Do you want to share your screen, Damien? Oh, sorry, I thought it was already shared. My apologies. Okay, can you see my screen? Yes, we see the split view. If you want to switch to the non-editable view. Will do. And zooming in. Okay.

So yes, the weekly release has completed successfully. So we have packages and Docker images for today's weekly. Is that correct? Yes: Docker, WAR, Deb, RPM, all verified. Issues updated. So the release checklist is done. Okay. And the Windows MSI as well? I didn't yet execute the Windows MSI, but I've got it ready to download. I'll download it to my computer and check it later. Okay, cool.

My question was because we meddled with the Kubernetes configuration last week, in particular the Jenkins installations for the release and infra.ci controllers. We just wanted to double check that there wasn't an issue with the pod and agent allocation. And there definitely was not. I checked the job status, and the job health showed that the job completed successfully. Nice. As for the mirror status, I was a little surprised that only one or two of the mirrors had the latest release, but it's unpredictable how long it will take for mirrors to see a new release. That's correct. Okay. So good work. That means the infrastructure is still working as expected.

Another announcement: last Friday, during this long weekend, we had a really quick security release on some plugins. There has been a security bulletin published by Daniel. So thanks, Daniel. That was done in less than one hour.
So that's really cool, because it was short notice and everything went correctly. So good job. I think that was the only announcement. Do you have another announcement on your side, Mark? Well, it's a forewarning more than an announcement: the release candidate for 2.319.1 LTS should be delivered tomorrow. The pull request proposal is out with the backports. And then two weeks later, December 1, will be the release of 2.319.1. I encourage people to test it. There are some changes in there and some backports that really do need thorough testing before we deliver an LTS. Let me take note: we should add it to the team calendar. Good point, yes. For both: the RC tomorrow, the 17th, and the LTS on the 1st of December. Okay. Thanks, Mark. Is that all for announcements? Cool.

So first, I saw today that a few days ago Kohsuke sent us an email about the TLS certificate for the domain repo.jenkins-ci.org, which is the domain name that points to the Artifactory service hosted by JFrog, if I'm correct. So thanks, Mark, for answering KK. That email thread was in my spam folder. So if either of you sees such emails, it could be useful to ping the others just to be sure, because I don't always get them. So thanks for stepping up on that one. KK opened a case with JFrog directly through email, at least on that one, because it seems there is another subject that I'm not aware of that crossed it. So for the certificate, I'm not sure; we're waiting for JFrog since this is a managed service. KK asked them if they handle Let's Encrypt renewal on their own: since the DNS points to their platform, they should be able to generate the certificate autonomously. However, we haven't received an answer.
As I understand it, so far we have had to generate the certificate and the key ourselves, and then send the certificate and private key to JFrog, through email I assume, totally secure, so they can install the certificate on their web server. So I don't know what the next step will be. Did I miss something? Yeah, so the last reply from KK seemed to indicate that what you just described was right and he's working the issue with them. Okay, so either he needs to issue the private key to them or, in our dreams, they've switched and can use Let's Encrypt. Yeah, I assume that KK has access to the DNS records for the domain, because I'm not sure there is another solution with Let's Encrypt. Since the DNS is pointing to their IP, if we switch the IP to KK's machine, that might break some things. Should we ask him if he needs help, or if we should take over on that part? I don't know if he has time for that or not. I was just letting it run, but I'm certainly open if you feel that we should ask him. I didn't see a lot of problems with it yet, but if KK asks for help or says this is a problem, we certainly don't want the certificate to expire. Yep. So it will expire in March, if I'm correct. Right. So it would be better to do it before the end of the year. Okay, I will send him an email to ask if he has access to the DNS, and whether he wants support or a complete takeover.

Okay, and he pointed out that if we have to generate a new certificate, the old one will remain valid for only three days before Let's Encrypt expires it. So that means we have to schedule that correctly with JFrog, because if JFrog takes more than three days to install the certificate we send them, that means a lot of trouble for us. Right. Yep. And he mentioned that, yeah, it was described in the email too. So better to be sure we've communicated with everyone on this timeline than to rush to do something. Absolutely.
We have to be sure that JFrog will be ready on their side, otherwise that might start to be hard. Okay, I saw you added the DNS renewal of the jenkins-ci.org domain. Yeah, we can do it after the Helm charts discussion. It's low priority; I just wanted to remind everyone, and I need to give myself an action item to put it on the calendar. Okay. Cool. So for the certificate renewal for repo.jenkins-ci.org, is that all? Or is there any question?

Okay, so switching to the next subject: Kubernetes Helm charts. So Hervé, thanks for the huge work you did on improving the structure of the helmfiles, which helps to keep our environment clean. We removed a bunch of dead code. We updated the Helm charts we have, and we have tried to apply some conventions to the way we use Helm charts and values. There is still some work to do, but yeah, that was a big change. We were expecting an improvement and the ability to install the Helm charts in parallel to have a quicker process. I think it doesn't work as expected. Sometimes it's even slower when we try to increase concurrency. So we'll have to double check that.

However, now the main work is around updatecli: we are doing a complete audit of the updatecli manifests in the charts repository to ensure that it keeps track of all Helm charts. We're also working on all the Docker images, because we are starting to have more and more custom-made images. Each time there is an automatic or manual release of these images, we want them to be updated on the production cluster as soon as possible. The main goal is that we don't want to slow down the maintainers of these images: each time they release a new tag with a new version, we want the update to be automated as soon as possible. That work is almost done. That was the past week, and that's the reason why we wanted to be extra careful with the weekly release today: we had to remove old homemade Helm charts and rely as much as possible on upstream Helm charts only. So great job, Hervé.
Is there any question, or a point that I forgot about in that area? How about the update now? The next point, with the issue. I've opened a pull request to fix the problem on the Keycloak Helm chart. Yep, that's the link I've put in the description. Yeah, good point. So please do not upgrade the docker-helmfile image until Keycloak has updated their Helm charts. Their recent Helm chart, at least the latest one, which was published in September, is failing the lint step with the latest version of Helm. So this is outside our area. That's why Hervé opened the issue. So right now, we have to pin the Helm version. It hasn't been done, but we should be able to use updatecli to only track the 3.6 branch of Helm. That could be a safer way to avoid merging the upgrade by mistake. Yep, that's all.

Hello, Oleg. Hi. Happy to start joining from time to time. Do you have the link to the notes? Yes, I do, but I'm on the laptop right now and, let's say, GitHub connectivity is not trivial right now. I will need to register because I now have two GitHub accounts, and there are many other things I will need to adjust before it becomes habitable. If you have any notes, share them in the Zoom chat and we'll add them to the document, since it's open on our machines.

Okay, so next subject. Mark, I'll let you guide us on this one: the DNS renewal of the old jenkins-ci.org domain. Yeah, it's one of those where Tyler Croy has automated processes that renew that domain. The renewal happens, I believe, roughly once every two years. It renewed successfully with the automation two years ago. So I'm expecting it to be successful, and I've set myself an alarm to red-flag it: if we get to less than 30 days before renewal and it has not renewed, then we'll escalate it to Tyler. Okay. And do we have any services left on this domain that default to it? Because since we moved the wiki, Artifactory also redirects to jenkins-ci at the moment, if I recall correctly.
Yeah, we have at least one: repo.jenkins-ci.org is still there. Artifactory is now... yeah, I thought it's now on jenkins.io. Let's see: repo.jenkins.io. No, it's definitely not; it's just repo.jenkins-ci.org. But I mean, we need the alias no matter what, right? Because there are many old links that go to that old location. Yeah, that would be nice. Well, we will still have to keep them for ages. Yeah, we want to own that domain; we really do not want someone else to own it. Yes. Mark, could you add a reminder, like 20 or 30 days before, in the team calendar, so everyone gets a reminder in case you are not there that day? Will do, yes. Thanks. Something else about the renewal? That's it.

Next topic: the infra costs. So on the AWS part, we should be around 10k or maybe a bit below, thanks to the effectiveness of using spot agents. And no complaint yet about spot agents being reclaimed. So either we are lucky, or maybe the strategy of targeting six different kinds of machines and letting AWS select one kind for us also helps. As a reminder, AWS automatically picks from that collection the instance type that is the most available from its point of view, and therefore the cheapest, because that's where they have the most capacity for us. So that's good, but it's not finished yet: the virtual limit is 8k per month, not 10k. Next, we have a bunch of other elements that we could migrate out of EC2. However, if we want something in the order of magnitude of the 2k that we are over-consuming today, the next candidate in line is pkg.jenkins.io, because that machine costs us 3k of bandwidth per month, which is a lot. So we should think about the destination for that machine.
First, that machine will be split into two virtual machines in the future, because right now there are two main services there. But we have to check the bandwidth cost on Azure, because Azure would be the easiest for us. We can have VMs or Kubernetes charts there, which is good. So we have to see what the theoretical cost of that bandwidth would be on Azure, to first have an idea of the cost and whether we can afford it, in reference to the next point. However, there was also an idea that we exchanged, purely an idea, and I'm bringing it back to you now: creating a virtual machine on Oracle, because the bandwidth is theoretically unlimited. The projection we did with Olivier is that the bandwidth should cost 50 bucks per month in the worst case. Which is completely crazy. I don't know how they make money, but yeah. And the second reason is that we can use ARM machines instead of Intel machines, especially because the use case that requires that bandwidth is a web server: an Apache web server that serves files from a file system. So there is the cost of the machine, and the visibility it could give to the Oracle cloud because they provide ARM machines. That could be interesting to maintain the relationship with them in terms of partnership, and the cost in terms of money for us is really good. So that would be the alternative. So I propose that we compare the theoretical cost on Oracle and Azure. Unless you have other ideas for providers that we could ask. That sounds good to me. And I think you're right. It's the next win. 3k euro for bandwidth is a big chunk of our total spend right now. Thanks to the spot reduction, the next target becomes very obvious. So I don't know if there are other questions on that area, things that are not clear, or a counter-proposal.

Okay, so the next cost is Azure. We are stable: for three months now we have been at around 8k per month.
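As a quick sanity check on the figures quoted in this cost discussion (about 10k of AWS spend per month, an 8k monthly target, 3k of monthly bandwidth for pkg.jenkins.io, and a worst-case projection of 50 per month on Oracle), the arithmetic can be sketched as below. This is only an illustration under assumptions that were not stated in the meeting: all amounts are treated as being in the same currency, the bandwidth cost is assumed to leave the AWS bill entirely once the service moves, and the cost of the replacement virtual machine itself is ignored. The function and variable names are hypothetical.

```python
# Rough monthly figures quoted in the meeting.
# Assumption: every amount is expressed in the same currency.
AWS_MONTHLY_SPEND = 10_000         # "around 10k or maybe a bit below"
AWS_MONTHLY_TARGET = 8_000         # "the virtual limit is 8k per month"
PKG_BANDWIDTH_ON_AWS = 3_000       # monthly bandwidth cost of pkg.jenkins.io
ORACLE_BANDWIDTH_WORST_CASE = 50   # projected worst case on Oracle


def projected_aws_spend(current_spend: int, bandwidth_removed: int) -> int:
    """Return the AWS bill once the bandwidth-heavy service has moved away.

    Assumption: the whole bandwidth cost disappears from the AWS bill.
    """
    return current_spend - bandwidth_removed


overspend = AWS_MONTHLY_SPEND - AWS_MONTHLY_TARGET
aws_after_move = projected_aws_spend(AWS_MONTHLY_SPEND, PKG_BANDWIDTH_ON_AWS)
net_saving = PKG_BANDWIDTH_ON_AWS - ORACLE_BANDWIDTH_WORST_CASE

print(f"Current overspend on AWS: {overspend}")
print(f"AWS spend after moving pkg.jenkins.io: {aws_after_move}")
print(f"Under the monthly target: {aws_after_move <= AWS_MONTHLY_TARGET}")
print(f"Net monthly saving with the Oracle worst case: {net_saving}")
```

Under these assumptions, moving that single service would more than cover the roughly 2k overspend, which is consistent with it being described as the next candidate in line.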
As a reminder, the CDF asked us to have an average of 10k per month, because we have a budget for one year, and that budget is 120,000. So that means 10k per month. So we are now below the limit for the year, which is good. That means we have a margin there if we have to move services out of EC2, or if we cannot for whatever reason. However, given the effectiveness of the spot instances on EC2, that is also something we could try on Azure, at least in the areas where we can afford it, just to see the impact there.

I've put the note about the new provider sponsorships. So I've started to work on the Terraform part. The goal is to have the same way of managing infrastructure everywhere, but with one repository for each provider: one for Azure, one for AWS, one for the next ones. Once that part has been done and applied to AWS and Datadog, which is what we have today, then we can start the work for DigitalOcean and Scaleway. We won't have more than two VMs on each, but that's a start. There is also some work on the machines that are sitting there doing nothing at the OSUOSL, the machines that we use for Confluence and Jira. So we can still reuse these machines as well to stop paying that much on clouds.

And then there is the area of what we could start in terms of sponsorships in the upcoming months or quarters. I mentioned Civo cloud last week, which provides managed Kubernetes based on k3s. They are using Jenkins, and they provide Jenkins in their installation marketplace, based on the official operator. And Hervé, I think you had something. I wrote Nomad, but can you remind me of the name? SquareScale. It's an infrastructure provider. I met their tech lead during a demo and they're using Nomad as their engine. And I think they could be really interested in sponsoring, as they're seeking visibility. I think it could be good for both of us. I've put the link in the chat.
So if I'm not mistaken, there is a Jenkins plugin for Nomad. I don't know the state of it, but that could eventually be a good start with stateless agents. I have zero knowledge around that part, but it's a container orchestrator. And if there is a plugin, I assume that if they lend us a bit of capacity, we should be able to start agents dynamically on it. At least on paper. What do you think? It's a bit risky, because Nomad is not necessarily designed for SaaS use cases. So there is a high probability that the plugin... I'm not sure I caught that. What do you mean, it's not designed for that? So if you want to connect agents on, say, SquareScale with the Nomad plugin, it's probable that the plugin just doesn't support this mode. Last time I used Nomad, which was actually quite long ago, we were using it internally. So it was not a SaaS option where the service is external and you need somehow to connect agents by SSH; everything was directly accessible through the network. Okay. I wouldn't put bets on that until somebody tries it out. Okay, so that could be a great experiment. Hervé, if you are up to the task, that means trying it. I don't know if they provide free credits, at least to build, or we could contact them to tell them that we have that use case. I've kept contacts, and I will reach out to them and ask about it. I've got the previous email thread about sponsoring, and I have to look at it to find arguments and the different points I can give them. I think they could help us if we have trouble. It costs nothing to try. I've added the link to the Jenkins plugin for Nomad. It looks like that plugin is still mentioning the wiki. So worst case, that could be the opportunity to update the README of the plugin. Still worth it.

Maybe one quick note about Civo, which we were discussing before. A couple of months ago, we had a meeting with a few people at Civo.
And we asked how we could actually collaborate between them and the Jenkins community. One of the follow-ups was that they updated the default template to the official Helm chart. So this was done outside the meeting. But we still have an open topic about potential collaboration in terms of hosting or some kind of sponsorship. And Civo is really interesting, especially since it's probably the only Kubernetes provider where you can provision a cluster in just one minute or so. So it's not as painful as with other vendors. So it could be a good opportunity. And actually, last week there was a CI/CD week organized by the same people, where I was giving a presentation about Jenkins. So the contact is active, and if somebody in the Jenkins infrastructure team is ready to explore it a bit, I'm happy to use this contact.

Okay. In the context of the Jenkins infrastructure particularly, even though the fast provisioning is technically impressive, we don't really have that need. It's more the ability to have something static. But it's still a good thing for Jenkins use cases, though. Well, maybe some kind of preview environment would be nice, because currently we don't do preview environments in Jenkins anywhere. So that's what I meant about the Jenkins infrastructure part. That's a completely legit topic, and I would love to have that in Jenkins, but right now we don't have that need in the scope of the infra. But yeah, okay, thanks for sharing that. That's good to know, because it means they are also in a position of listening. Yeah, I believe I dropped it somewhere on the developer mailing list, but it was months ago. So cool, that means they should remember us. They do. Jenkins is still a part of their user base. There are people running Jenkins on Civo. And of course, given the volume of Jenkins usage, they are really interested in exploring it.
Nice, great service. So they were quite excited about it. So there are definitely some elements. Cool. If it's okay for the infra cost topic: do you have any other question? Okay.

So moving on: the Kubernetes 1.20 upgrade will be the next major topic for us. Hervé, you volunteered to lead that one. Once we have finished the Helm chart work, we'll have to upgrade our two Kubernetes clusters to the 1.20 line. The main one is publick8s on AKS, which hosts our stateful applications, and the other is the EKS cluster on Amazon that hosts the Kubernetes agents for ci.jenkins.io. That one should be easier. So that will be the next topic. Nothing more to say about that unless you have a question or something is unclear. Okay.

And one last note. Thanks for taking care of the labels for the IBM Z agents on ci.jenkins.io. I saw the label updates to avoid building JDK 8. Thanks. I totally forgot to work on that topic; I should have done it last week. So thanks for that.

And the Puppet cleanup. I started working on that locally. That will be on many levels. But we have security issues with the Puppet dependencies and we need recent Puppet modules. That's really important. So the first step will be removing all the dead and unused code, to be sure that we can then iterate. And the next step will be either deleting the Puppet tests forever, if we cannot find a solution, or upgrading them to something else. Because right now we are not testing what we have in production. The tests only catch errors that will never happen, and they did not catch the last three production outages we had due to Puppet. So these tests are worthless. And worse, we need VirtualBox to run these tests most of the time. These tests had a meaning in another context, when Puppet was managing all the infrastructure; it now manages only a few machines for us. And being able to upgrade would allow us to manage the PowerPC, IBM Z, and ARM machines.
So I hope we should be able to advance on these topics. And that's all for me for the notes. Do you have other topics to share, ask, or mention? Okay. Yeah, just a quick question. Are there any requirements from the Jenkins governance board or the CDF with regard to infrastructure? So we're working on the assumption that our budget will continue at the 10k level for the next year, and we haven't heard anything. So that's a topic for discussion with the CDF. Yes. What's the sense you have, Oleg? Have you heard anything on this topic?

So one thing I recently discovered is that the chair of the technical oversight committee is also expected to be a member of the governance board this year. I'm still waiting for official onboarding. Once I'm officially on board, they should have meetings in December where they will also be discussing the budgets for the next year. So this is what I know. I also know that Tracy has submitted a request to Microsoft regarding Azure sponsorship. If it gets approved, of course, it will be a massive relief for us. I can't say anything about the progress there; I have no visibility into what happens there. I should talk to Tracy next week if everything goes as I plan. But yeah, currently there is basically a pending discussion about infrastructure in general, because the CDF would like to consider the infrastructure more, including Amazon accounts. Again, no movement with regard to the Jenkins account transfer. Yeah, I probably need to bug Kara to see whether it could be somewhere on her priority list. But I think she's busy with the landscape and, well, basically there are no program managers at the CDF, so if Tracy is not doing that, it means that nobody is at the moment. But I will look into it. I will keep bringing up these topics. But yeah, right now I don't have any immediate updates.
I wouldn't expect that there would be a massive drop, because currently there is no problem with regard to CDF members and paying sponsors. Also, you may have seen that there are more organizations joining. So currently it's not about the budget, let's say; it's rather about how the budget is distributed. And currently, let's say, Jenkins is the only graduated project, but it still consumes the vast majority of the CDF infrastructure budget. So yeah, this is a tricky point about how things work. But I'm looking forward to being on board so that I can actually understand how the budget works under the hood. Thanks for sharing this insight; that's really important for us. It brings clarity on where we get money from. That's really helpful. Thanks. Thanks for sponsoring the Jenkins project. Thanks. I will wait for your news.

Okay, another topic in the area? Yeah, I have one topic. I actually raised it some time ago at the governance meeting. We are going over time, so if everyone is fine, just a quick question. The Linux Foundation expects that this community, as well as other communities, adds its account to all GitHub organizations as owner, basically from the project sustainability point of view. Sorry, could you say that again, Oleg? There is an expectation that a Linux Foundation GitHub account is added as owner to all GitHub organizations of projects that are members of the Linux Foundation, including Jenkins. There was a request something like one month ago from Tracy. There, I asked them to please provide details. I didn't get any details so far. But I know that, for example, in the CNCF it's a mandatory requirement for new projects. And it will be the same for the CDF at some point. So my question is basically to Damien, as the incoming Jenkins infrastructure officer, et cetera: what should the position on the Jenkins community side be, should such a request arrive? I suspect it will be a major pain for us.
They could access private repositories and security repositories, and the number of people who have access to this Linux Foundation account is undetermined. So in the current situation, my understanding is that it goes against how Jenkins governance and security practices work. I still need to discuss it with Wadeck. But yeah, I would expect that this is a discussion where we really need to get the governance board together with Wadeck, Daniel, and the security team, and basically build a consensus on how we approach that.

So in that area, there are still some places where even being owner of the GitHub organization does not get you access everywhere, though. For instance, most of the secrets are encrypted using a GPG key, or eventually require owner access on the Azure KMS. So we could use that threat model: if we don't have any other solution, it could be interesting to still control who can access which secrets. However, this is only for the infrastructure scope. There is still that question to be asked to Wadeck, as you said, because if you are owner, you can inject code into virtually any plugin or part of Jenkins. So all plugins with continuous delivery. Exactly. All infrastructure components, including components handling critical credentials, and also just access to a GitHub organization with personal data, and other funny things. I'm not saying it's not scary. I'm just saying there are mitigations that could be put in place for the infra, but you're correct.

So that feels like one, for me at least, where the security team and governance board are both crucial. Just like you said, we've got to have that discussion with them before we would ever consider adding an additional owner to those organizations. That's terrifying for me. I'm not a security expert, but just the thought of adding somebody that is not a deeply trusted person strikes fear in my heart. Sorry, the Linux Foundation is a great organization. I don't want to disparage them, but I would really want to understand why, and all sorts of other things.
Well, it's actually written in the Linux Foundation collateral. The reasoning is that if something happens, the Linux Foundation can take over the project. It's especially important, for example, for sandbox projects and for incubating projects, where it's quite common that the project is driven by one company, and basically there is no contingency plan if things go in an unexpected direction. So yeah, let's imagine James stops contributing to Jenkins X. It's just an example from the CDF space. And basically this request makes sense in principle, and it is a requirement for new projects being onboarded. I believe the foundation is currently working on this onboarding checklist. For Jenkins, since it's already on board, the situation is a bit more tricky. And yeah, I guess just saying a radical no is not going to be a good answer. But discussing what would be the rules, what would be the grace period, what would actually be the assurances and additional guarantees from the Linux Foundation, all of it is open for discussion.

Good point. Treating the jenkinsci and jenkins-infra organizations separately, because the concerns are different on one and the other, might also help. So in my case, of course, jenkinsci-cert is probably the biggest thing. And you see, for example, we are maintaining Jenkinsfile Runner, where I want to have continuous delivery. I don't want the Linux Foundation to have access to releases of this. If there is an admin account, I have literally no way to prevent it. So, yeah. Well, I guess I would have to live with that. But there are people who will definitely be pissed off by this change if it happens. Thanks for sharing that. I mean, that's a topic for the security team and governance board now. But yeah, it's interesting to know it. You won't get away. The election is not finished yet.
There was no other candidate. So, unless something unexpected happens. Okay. We have competition for the documentation officer and for the Jenkins governance board. So it keeps Mark on his toes. But yeah, otherwise, for a lot of positions, there is no competition. Nice. Okay, I think we can stop recording and sharing. We have reached the end of the notes, unless you have other topics.