Hi, everybody. So let's start this new Jenkins infrastructure meeting. Considering that we had the contributor summit last week, we have quite a lot of topics to talk about today. The first one is that we made progress with Algolia. The plan is to use Algolia to improve search results, both on the plugin site, plugins.jenkins.io, and on the documentation site, www.jenkins.io. First, on the plugin site: it's now enabled. Let me just show you. When we go to the plugin site, you can see the "search by Algolia" box, and basically it just returns more results. For instance, if you type just "gi", it will already show you the git plugins. It's simply more powerful than what we had before. We also have some analytics now in the Algolia search console, and one of the topics we want to analyze is what kind of information people are searching for, either because it's not correctly documented, or maybe because there is a missing plugin, or whatever. So we have more analytics information now and we have to see what we can do with it. I was impressed that if you look at this page, we already see one hit: notice number four in the "searches without results", someone searched for the words "git" and "plugin" and got no result. And that's a bad sign, because that should have had hits. So already we're seeing hints that there are things we need to tune and improve. Yeah, and that's quite a surprise, right? Why would the word "plugin" suddenly make the search less useful? Yeah, definitely. It sounds a bit redundant, but that's interesting, because now we know how people are searching for information on the website.
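The "gi" example above boils down to prefix matching on the words of each plugin title. This is not Algolia's actual engine, just a minimal sketch of the idea:

```python
# Tiny illustration of why a prefix-based search already returns
# results for a partial query like "gi" (sketch only, not Algolia).
def prefix_search(query, titles):
    q = query.lower()
    return [
        t for t in titles
        if any(word.startswith(q) for word in t.lower().split())
    ]

plugins = ["Git", "Git Parameter", "GitHub Branch Source", "Docker Pipeline"]
print(prefix_search("gi", plugins))  # the three git-related plugins match
```

A real engine like Algolia adds typo tolerance, ranking, and analytics on top, but the user-visible effect described above is the same: partial queries still produce useful hits.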
So those analytics are surprisingly useful, even with as little as we have right now. Yeah, we still have to understand the analytics we get by default, in case we have to turn some of them off, but right now we are really still at the beginning. I was in contact with Algolia employees and they were really happy to sponsor us. They are using Jenkins internally and they are really happy with it, so that was a really good conversation we had last week. We should add Algolia to the sponsoring page as well. The plan is to continue with the documentation, so the next step is www.jenkins.io. It appears that they have a specific offer for documentation, I think it's called DocSearch. It's slightly different, but that's on the roadmap. I'm not sure whether Mark, Olblak, or Gavin will look at it. It's actually Gavin that will look at it, at least that's what he said, and they've already given us the DocSearch enablement instructions. Gavin, who knows JavaScript well, is a great candidate to do that. Awesome. Any question before we move to the next topic? No? Awesome. The next topic is about PagerDuty. I added a few people to the PagerDuty system: Garrett, Damien, and Mark. If you look at it, right now we have four different time slots. Basically layers one and two are more European time zones, while layers three and four are US time zones, so it covers the morning and part of the afternoon, and then the afternoon and the evening. For the European layers, because that was quite easy, we already had Daniel and Arno in the loop and I added Damien and Garrett. Basically you'll be on call once a week, that's the default behavior. And then, if you get notified and you don't respond, I'll be notified at the end of the day.
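For context, the DocSearch enablement mentioned above is driven by a small crawler configuration. A minimal sketch of the legacy DocSearch config format, where the index name, URL, and selectors are illustrative placeholders, not the values actually submitted:

```json
{
  "index_name": "jenkins",
  "start_urls": ["https://www.jenkins.io/doc/"],
  "selectors": {
    "lvl0": "h1",
    "lvl1": "h2",
    "lvl2": "h3",
    "text": "p, li"
  }
}
```

The `selectors` map headings to hierarchy levels so the search results can show where on a page a match occurred; tuning them to the site's actual HTML structure is most of the enablement work.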
Anyway, we have the same for the US time zones, but we don't have a lot of people there right now. For layer four it's Olblak, who does not respond anymore, Tyler from time to time, depending on how important the notification is, same for Andrew, and I just added Mark to the loop. We try to catch at least the most important issues, let's say when the website is down. When we are on call, the idea is to work on the system during our day-to-day working hours. So typically, if something goes wrong in the middle of the night for me, I usually try to find someone in the US to deal with the issue; otherwise we just delay until we have the time to work on it. Most of the time, the kind of issue is either that we don't have enough disk space, so we have to clean up the system, or that for some reason an SSL certificate did not renew correctly, and we have some manual fixes to do. Any questions? So, I need a tutorial on how to be effective using PagerDuty. Others may not need it, but I've not been terribly helpful with it, I think. Is that something we should schedule separately for another time? I think we can just quickly do it now, because it's quite simple. The only thing PagerDuty does is receive a notification from Datadog and forward it to you. The way you get the notification depends on how you want to be notified. When you go to your profile, you can provide an email address, you can provide a phone number for SMS, or you can install the PagerDuty application on your phone. I used to provide my phone number, but I stopped doing that because I find it quite intrusive; I still receive an SMS when something goes wrong. More importantly, I have the PagerDuty app installed on my phone, so if something goes wrong and I'm not available, I can just acknowledge that the issue is there.
And I'll work on it when I have some time, let's say the day after or in two days. That's the setup I prefer, but it's really up to you; it depends on what contact information you provide in your profile. So PagerDuty is really simple to use. Okay, so the simple steps are: I go to jenkins.pagerduty.com, log in as myself, configure my profile, and that should already be enough. Oh, and install the PagerDuty app on my phone. Okay, got it. That's basically it. The next step is that when you get notified about something, we have a specific Git repository, jenkins-infra/runbooks, where we typically document the kind of things we do depending on the situation. So the process now is to first check whether the documentation there is still relevant, and if not, either open a Jira ticket so we can track that some documentation is missing, or write it, or whatever. The idea is that ideally someone on call should be able to solve the issue with the correct documentation. Right now, the challenge is that we have two kinds of infrastructure. We have virtual machines, with some people comfortable with those machines and some people having access to them, and then we also have the whole Kubernetes environment, with different people comfortable with that environment as well. That's the biggest challenge, but otherwise most things are documented in this Git repository. Thank you. One last mention on PagerDuty, which is also the next point: over the weekend we got a lot of notifications on PagerDuty. Basically, I enabled more monitoring on Friday afternoon, which is not something I should have done, and it just spammed people over the weekend. I was only on call Sunday, so I only discovered it on Sunday.
Then I looked in the infra channel and discovered that Daniel was complaining about the amount of notifications from PagerDuty. Yeah, that was my fault, which I will explain in the next topic. Any question before I continue? So, basically what I did is something I had wanted for a while: monitoring the third-party mirrors. I wanted to be able to detect when a given mirror is down, or slow, or whatever issue may happen. Typically, the reason for that is when we have people complaining about get.jenkins.io because they cannot install a plugin: even if the Jenkins services are running fine, sometimes the root cause is just the mirrors. So I just enabled basic HTTP monitoring for those services, and it turned out those services were not reliable over the weekend; some were really slow. The challenge we had here was that we received a notification, acknowledged it, then the problem was gone because the mirror went back to its normal state, and then 15 minutes later the issue reappeared, so it was a different alert, a different notification. So yeah, we just got spammed over the weekend. I disabled PagerDuty alerting for those mirrors, because in the end we cannot do anything about those mirrors ourselves. The only thing we want to know is whether something wrong is happening, so we can maybe contact their maintainers or whatever, but we don't have to do that in the middle of the night. That's basically the rule. But yeah, for Garrett and Damien, if it becomes an issue for you to stay on PagerDuty, because ideally it should not be a problem, then instead of just ignoring your alerts, tell me, and let's first work on reducing the number of alerts instead of just ignoring the issues.
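The basic HTTP monitoring described above is the kind of thing the Datadog agent's `http_check` integration does. A minimal sketch, where the mirror URL and thresholds are placeholders, not our actual mirror list:

```yaml
# conf.d/http_check.d/conf.yaml on a Datadog agent host (sketch only).
init_config:

instances:
  - name: mirror_example_org
    url: https://mirror.example.org/jenkins/
    timeout: 5
    # Checking less often is one way to reduce the flapping
    # acknowledge/resolve/re-alert cycle described above.
    min_collection_interval: 300
```

Routing the resulting monitor to a low-urgency notification channel instead of the PagerDuty escalation policy is the change that stops the overnight pages while keeping the visibility.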
Next topic, Damien. So, Joseph proposed a PR to switch to Traefik for the ingress. Traefik is a different web server with more features compared to the NGINX ingress we have right now, and this is something I had wanted to test in production for a while. Joseph started working on it and Damien finalized the PR. So we now have both ingresses in place, one for the private network and one for the public network. We still have to plan the switch of services to those ingress controllers. We'll start with the private one, and if everything is fine, we'll do the same for the public one. Once we decide to switch from NGINX to Traefik, nobody will notice, because we have mainly stateless applications on the community cluster at the moment. Yeah, I think it's fine. The migration should be quite easy for each application: it's only a matter of redirecting the DNS record of a given application to the new IP, because we have created both pairs of ingresses with different, separated IPs, the private ones through the VPN, and the same for the public ones. So the idea is that we can do an A/B deployment here. The risk is quite low, in the sense that if we see something go wrong when migrating a given service, we can always roll it back to NGINX, and NGINX won't go away until we have moved everything and we are sure that Traefik fulfills all the needs. Starting with the private one is a good exercise, because if we break things, it will only affect us. Joseph did really good work on that part; he did all the heavy lifting, and my involvement was only the last turn of the screw at the end. It looks to work pretty well. And the compelling win there is that we get additional capabilities from Traefik as an ingress controller that we didn't have from NGINX, so this transition will give us additional flexibility in the future.
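The per-service A/B switch described above can be as small as changing the ingress class on one Ingress object, then pointing the DNS record at the new controller's IP. A sketch with illustrative names:

```yaml
# Moving one service from the NGINX controller to Traefik (sketch;
# names, namespace, and host are placeholders). Rolling back is the
# reverse edit plus the DNS change.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app
  namespace: example
spec:
  ingressClassName: traefik   # was: nginx
  rules:
    - host: example.jenkins.io
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-app
                port:
                  number: 8080
```

Because each controller has its own load-balancer IP, a service is only reachable through Traefik once its DNS record is repointed, which is what keeps the blast radius to one application at a time.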
In terms of features, what we're doing with NGINX is almost the same, but with NGINX we had to add more components. One of the things I have in mind is, for instance, certificate management: with Traefik we don't need cert-manager for the ingress-specific certificates, which is less heavy. It's less code to maintain. So the feature set is the same, there is no new feature; it will just be easier to have less code and less configuration to manage for the same feature set. Okay. Also, Traefik provides features that we needed in the past and don't need for now, but may need again in the future, like an OAuth proxy and things like that. It's not that we need it right now; it just provides more features, and I prefer to have it ready in advance, when we have the time to work on it, rather than when we really need something in particular. Another point here is that Traefik supports dynamic configuration, not only from Kubernetes: we could add different backend providers on the same instance. So the ingress would still be an ingress, but multi-system. You could imagine having different Docker instances, like a virtual machine with Docker on a private network, and it would be easy to auto-configure this. You can still do this with Kubernetes, but you have to manage the configuration yourself. So in the context of diversifying our sponsorships, with different platforms where we would want to run different services, that could also be a great help, because we would be able to have the same entry point for everyone and then distribute to back ends on different cloud systems. Thank you. Thanks for the clarity. The next topic I put here is that Damien is fine-tuning the Jenkins master. We are affected by a weird issue with the LDAP connection: from time to time, the LDAP connection times out. Damien has been looking at it, and he did not find anything conclusive.
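The two Traefik points above, built-in certificate management and multiple configuration providers on one instance, both live in Traefik's static configuration. A minimal sketch, where the email, storage path, and Docker endpoint are illustrative placeholders:

```yaml
# Traefik static configuration (sketch only). The ACME resolver
# replaces a separate cert-manager deployment for ingress certificates,
# and extra providers let the same entry point route to back ends
# outside the Kubernetes cluster.
entryPoints:
  websecure:
    address: ":443"

certificatesResolvers:
  letsencrypt:
    acme:
      email: ops@example.org        # placeholder address
      storage: /data/acme.json
      tlsChallenge: {}

providers:
  kubernetesIngress: {}             # regular cluster ingresses
  docker:                           # e.g. a Docker VM on the private network
    endpoint: "tcp://docker-vm.internal:2375"
```

This is what "same entry point for everyone, distributed to different cloud systems" looks like in practice: one Traefik instance, several providers feeding its routing table dynamically.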
So what he did is install the advisory plugin to collect some information and fine-tune the Jenkins instance. Right now this was about the memory settings. Oleg made a good suggestion that maybe four gigabytes was not enough for infra. I think for now it is, because we are using Grafana to collect some metrics, and nothing tells us that we don't have enough memory or CPU at the moment. This is something that may change in the future, but at least for the moment it's fine. The idea is to make one change at a time. Regarding LDAP: if we continue to have the LDAP issue, we may switch to a different system or whatever. Sorry. Yes, the tunings that have been applied only keep the same resource usage as defined today; it's just optimization based on the current state. In order to change the state, e.g. adding or removing CPUs or memory on the Jenkins instance, and to finish analyzing the LDAP issues, which are tightly related to garbage collection inside the JVM, the goal is to have precise metrics. All the information sources on that say that we need fine-grained information and metrics, so we will have to add them. Most of the time this comes from CloudBees or Jenkins experts' knowledge, but that will be the next step, because almost everyone I asked for help told me "you need to measure", which makes sense. So the goal is: let's optimize what we have right now and see whether there is a change, and then, before bisecting the problem, let's measure precisely what we need and want to check, and iterate based on that. The important information here is that, given the split state of Jenkins between JDK 8 and JDK 11, the Helm chart we provide tries to be agnostic, but most of the JVM options we are using should be good, sane defaults for any Jenkins instance most of the time.
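A common first step toward the fine-grained GC metrics mentioned above is enabling the JVM's unified GC logging. A sketch of the kind of options one might add to the Jenkins `JAVA_OPTS`, where the heap sizes and log path are placeholders, not our actual settings:

```shell
# Sketch: JDK 11 unified-logging syntax for detailed, rotated GC logs.
JAVA_OPTS="-Xms4g -Xmx4g \
  -XX:+UseG1GC \
  -Xlog:gc*,gc+heap=debug:file=/var/jenkins_home/gc.log:time,uptime:filecount=5,filesize=20M"
```

The resulting log can be fed to a GC analysis tool to confirm or rule out long pauses as the cause of the LDAP connection timeouts, which is exactly the "measure before changing resources" approach described above.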
And this will be a topic in the future as a feature for the Helm chart: providing the set of JVM flags that are known to be a good rule of thumb to start Jenkins with. Something that should go back to the Helm chart, I think, is to use the JDK 11 image by default: the chart is built for running containers, so it makes sense to use Java 11 instead of deploying Java 8, but this is a topic specific to the upstream chart. I think it would make sense for the chart to default to Java 11 and to have specific default parameters. Yeah, any question regarding this topic? Okay, so. I want to mention that there is a recent report about a memory leak on Java 11, so check whether you observe the same on other instances, because it might be a real issue. Okay, good to know. So it's related to the Pipeline CPS plugin. It's related to which plugin? Pipeline CPS. Sorry, sorry about that, I've got a lot of activity inside. So the last topic is about Jenkins inbound agents. Several weeks ago we got issues because we are using specific inbound agent images on ci.jenkins.io which by default use the default user of the upstream image, most of the time root. So Damien and Garrett worked on those images to use the jenkins agent user by default; the idea is to not run as the root user by default. There were some discussions about whether we should maintain those images under the jenkinsci or the jenkins-infra organization. The first step, because the need is for the Jenkins infra project, was to build and publish those images under the jenkins-infra organization on Docker Hub. We did not have the time to update ci.jenkins.io yet because of the contributor summit last week, but at least those images are now published on Docker Hub. So yeah, in the coming days we'll have to update ci.jenkins.io; this configuration is done manually, so we still have that work to do.
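The non-root change described above typically looks like this in a Dockerfile. A sketch where the base image, packages, and UID/GID are illustrative, not the actual jenkins-infra images:

```dockerfile
# Sketch: derive an agent image from an upstream tool image that runs
# as root, then make a non-root "jenkins" user the default.
FROM maven:3-openjdk-11

# Create the UID/GID the Jenkins agent expects (values illustrative).
RUN groupadd -g 1000 jenkins && useradd -m -u 1000 -g jenkins jenkins

# Everything from here on, including the agent process, runs non-root.
USER jenkins
WORKDIR /home/jenkins
```

Running the agent as a dedicated non-root user means a compromised build can no longer modify the image's system files, which is the motivation given above for not using the upstream default user.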
But yeah, I put some links to Docker Hub and the Git repository. The next step is to see how we can publish tags under the jenkinsci organization as well, so the community can use them. What we fear here is that we just create a new set of images that the community then expects us to maintain. The problem is that this introduces a lot of new Docker images, and we would like to first identify maintainers for some of them. This is something that Damien already explained at length in previous weeks, so we won't go back to that explanation now. But what you have to understand is that those images can be used and can be tested. I put the link to Docker Hub, and from the Jenkins infra side that thing is almost solved. Any question? Then, on ci.jenkins.io, there was one additional topic. I think you had mentioned an EC2 plugin upgrade. Is that likely coming soon, or should we plan for that? I think we just need either you or me to take 30 minutes to update the plugin and test that everything goes correctly. Oh, you are such an optimist. I love working with optimists. I was so terrified of that plugin in the past. Okay, all right. That's why I say we need 30 minutes or one hour. If we face issues, worst-case scenario we still have the Azure plugin in place, so we could temporarily switch to the Azure virtual machines for the time it takes to fix the EC2 plugin. Okay. So in my nervousness, I'll just schedule some time with you so the two of us can pair on that. That's probably the healthiest thing to do. Great, I'll do that. The thing is, we are quite a few versions behind; I think we are using a one-year-old version, and a lot of things have happened since. The change I'm personally most excited about is that in the past you had to specify the AMI that you wanted to use; now you can specify a pattern.
So let's say you always want the latest image matching a specific tag: that means you don't have to modify the Jenkins configuration each time a new AMI is available. It's just a lot easier. This is the way it works with the Azure plugin, and I find it more convenient to use. But yeah, this was introduced several months ago, so when we have some time, let's just bump the version and see what happens. Thanks. I have just two pieces of information. I've started sponsorship discussions with two French cloud providers. I already have an answer from the first one, which is Scaleway. They mostly do bare metal, but they also have a managed Kubernetes service, which is CNCF-certified, and object storage. They are open to starting the discussion, so it could be interesting to check. I was thinking initially and mainly about the managed Kubernetes service, because it could add additional capacity for the future of ci.jenkins.io outside of AKS. It could be a great way to have a fallback, because we can add multiple Kubernetes clusters and reuse the same pod templates on all of them. And the object storage and bare metal machines might be more interesting if we want to set up mirrors. So if you have other ideas based on the services they provide, don't hesitate. The other provider is less well known; it's named Outscale, in a single word. There are a few of that sort. They provide an EC2-compliant public cloud and private cloud. Their differentiator is security: they host most of the French state cloud because of their security focus. They are heavy users of Jenkins open source, so I'm trying to get them to give a testimonial at least on "Jenkins Is The Way", and it looks like they are interested. They could provide their EC2 service with a few credits, and the latest EC2 plugin is also able to handle their API, which they use internally. So that could also be interesting.
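The AMI-by-pattern feature mentioned above can be expressed in Configuration as Code. The exact keys depend on the EC2 plugin version; this is a sketch assuming the AMI-filter support added in recent releases, and all names, regions, and tag values are illustrative:

```yaml
# Sketch of a JCasC fragment for the EC2 plugin (keys and values are
# assumptions for illustration, not our actual ci.jenkins.io config).
jenkins:
  clouds:
    - amazonEC2:
        cloudName: "aws"
        region: "us-east-1"
        templates:
          - description: "linux-agent"
            # Instead of a fixed AMI ID, select the newest image
            # matching these filters each time an agent is provisioned.
            amiOwners: "123456789012"
            amiFilters:
              - name: "name"
                values: "jenkins-agent-*"
              - name: "tag:purpose"
                values: "ci"
            labelString: "linux amd64"
```

With a fixed AMI ID, every image rebuild meant editing the Jenkins configuration; with filters, publishing a new AMI under the same naming convention is enough, which is the convenience described above.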
I've started the discussion with both. I don't know the amounts yet, but yeah, diversifying sponsorship is always good. Yeah, that would be awesome. Our CI instances are where most of our spending goes, so if we could reduce the cost of ci.jenkins.io, that would be really awesome. Thanks, Simon, for driving that. Any last topic? No? Then I propose to stop here. Thanks, everybody. See you. Bye-bye.