Hi everybody, welcome to this new Jenkins infrastructure meeting of the year, and obviously happy new year. For this specific event I would like to take some time to step back and look at the different stats from last year, so let me share my screen with you. Where is that? Can you see my screen? Yep, perfect, sounds good. So what I wanted to remind us of, looking back at last year, is that the Jenkins infra project is an open infrastructure project where everybody is invited to participate, to contribute, to learn. And so I wanted to look at the things we did to improve that over the last few years. The first thing was the way we organized this meeting: we started the Jenkins infra meeting four years ago. Initially it was a way for Tyler and me to synchronize. Two years ago we started taking notes of the meeting so other people could follow what we were doing on the Jenkins infra project; I had a look this morning and we have written 65 pages of notes, which is quite impressive. And a year ago we reached another milestone, which was to record this meeting so other people could follow what we were doing here. I was really happy about that; I wasn't expecting it when I started working on this project. Another thing I looked at was the stats that GitHub can tell us. Everything is public; most of our code is in the jenkins-infra organization. We had more than 200 contributors to the Jenkins infra project, and those people contributed to 31 repositories, which is quite a lot. We also start to see a shift across time zones: while people initially contributed to the Jenkins infra project from the United States, in the graph I have here we see more and more people in the European time zones, which kind of makes sense with Tim and me.
So yeah, I was really happy to see that, because it accelerates the feedback loop. When we look at the top repositories, I don't know if you can see the information, it's quite small, I'm not sure it's readable for you, but what we can see is that we have quite a lot of different kinds of repositories: we have Puppet code, we have the main websites, and depending on the repository we had more or fewer contributors. Another thing I wanted to share with you is the way we finance the Jenkins infra project and how big that infrastructure is. We don't receive money directly; everything runs on sponsoring. We have many different kinds of sponsors: some provide monthly credits that are renewed every month, others provide a fixed amount of money for, say, a year, and some sponsors just say "you can use whatever you want on our platform, it's free for you." At least what I was able to quantify is that we spent more than $200,000 on the infrastructure over the last year. In reality it's more, because for instance Datadog is free for us, so we have no visibility on how much we would otherwise pay for it, and the same goes for JFrog or IBM. So it just gives a big overview of the different things. Another piece of information I looked at today was the stats we can find for at least the two main websites. A major website I did not consider here is the update center: we use Google Analytics and Google Search Console on the main website and the plugin site, but we don't really have anything like that to collect stats on the update center, so I voluntarily avoided spending too much time there. The first one is the main website.
When we look at the Google Analytics information, we can see that the number of visitors over the last year was more than 6 million, an increase of almost 30% compared to the year before. And here is where those people are coming from; no major surprise: North America, Western Europe, and Eastern Asia are the regions most visitors come from. What surprised me in these stats is that the share of visitors from North America decreased: it grew by only 3%, while visitors from the other regions grew faster. From Europe, for instance, visits to the main website grew by more than 4%. I also looked at Fastly. Fastly is the CDN used on our infrastructure; they started sponsoring us in March last year. The kind of information we have there is the network bandwidth we are using, and what is interesting is that it's a daily average: in June we had on average 1.3 million requests per day on the main website, and more than 30 gigabytes of network transfer per day related to the main website, which is quite impressive for a project. To clarify, that data transfer is just from www.jenkins.io? So what I think of as the documentation site is still transferring gigabytes a day? Wow. Yep, this is each time someone goes to www.jenkins.io. Thank you, okay, that's much bigger than I expected. Wow. If we look at the Google Search Console, we also have information like the number of clicks, but the one that interests me is the pages: what are the top pages people look at. Specifically, we can see that it's the documentation for the Pipeline syntax, then Pipeline with Docker, and the Jenkinsfile page.
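Those per-day averages add up quickly. A back-of-the-envelope sketch, annualizing the two Fastly figures quoted above (treating them as constant all year, which is an assumption):

```python
# Annualize the Fastly per-day averages for www.jenkins.io quoted above.
REQUESTS_PER_DAY = 1_300_000   # average requests per day (June figure)
TRANSFER_GB_PER_DAY = 30       # average network transfer per day, in gigabytes

requests_per_year = REQUESTS_PER_DAY * 365
transfer_tb_per_year = TRANSFER_GB_PER_DAY * 365 / 1000  # decimal terabytes

print(f"~{requests_per_year / 1e6:.0f} million requests per year")
print(f"~{transfer_tb_per_year:.1f} TB transferred per year")
```

That is on the order of half a billion requests and roughly 11 TB of transfer a year for the documentation site alone, which puts the sponsorship value of the CDN in perspective.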
What you have to keep in mind is that it was, I think, in September that we started using www.jenkins.io by default, so we still have some stats only for the top domain, which is jenkins.io. When we look at the stats for the plugin site, we don't have Google Analytics there, but we can still look at Fastly. And what Fastly tells us is that we also have quite a lot of traffic per day: 17 gigabytes of transfer every day on average since June, and almost 700,000 requests per day on the plugin site, which is really impressive, especially for a website that was refactored a little more than a year ago. It's really nice to see how heavily it is used today. Another thing we can look at is the Google Search Console for the plugin site; specifically, we can see how many times people look for information, let's say for the Pipeline Maven plugin, or the Git plugin, or the Docker plugin. When we look at the number of clicks, the top one is the plugin search page, and then individual plugin pages like the Git plugin. We can collect a lot of information there. So yeah, this was just a quick overview of the Jenkins infra project over the last year, which I think was really nice and interesting to share. Any questions? Thank you. Would you be willing to consider doing a blog post with this content, or would you like some ghostwriting help for a blog post? This, for me, feels like really cool data that we should highlight. Yeah, sure, I could write a blog post. Since we've got a recording of it, I can do writing help if you would like; just having you present the data has been great, thank you. Yes, we can see, if I have the time to write a blog post, or, let's say in the coming days, how we can organize it. I propose to continue now.
And if you don't have any more questions... So yeah, before we continue, if you are interested in specific stats or any other information, feel free to ask and I can see if I can collect it. It took me the morning to prepare those stats; we have quite a lot of services with a lot of information, and I think it's interesting to use them. So, I propose to continue. You mentioned homepage size and resource loading. Is that something we could envision improving significantly without losing content, or are those things critical, so that we actually have to load them? Yeah, there are quite a lot. It seems like the YouTube player is the biggest resource; it's the jumbotron stuff, basically, which makes the page big. Well, if it's the YouTube player, that may be the thing I added with the below-the-fold Jenkins video, so we might be able to reduce it dramatically, but that was only added in December, so... Hmm, okay, that needs more investigation. Okay, can you see my screen again, with the notes? Yeah, sorry, it's very small, zoomed out and barely readable. Yeah, I have a big screen, and I have good eyes. I have a big screen too, but your thing is long. So we put a few things on the agenda today. The first one I want to highlight is that the GitHub organization for Jenkins infra has been switched to the free tier. This is something we discussed several weeks ago: we were paying $300 a year, and the free tier now offers everything that was included in the previous plan. The second one is that we had an incident over the Christmas period: apparently the container was in a broken state, and Mark had to jump in and restart the container. I looked at it; we didn't lose any data and we still have backups, so everything is fine there.
Considering the low traffic we had on the infrastructure during the Christmas period, I don't think it's a big deal. Something I have to investigate is why archives.jenkins.io is terribly slow. I noticed that from time to time the service isn't available anymore. I suspect we just have too much traffic for the machine, but I haven't confirmed it. Olivier, when I look at my — okay, I'm doing a different status check — it's been socket timeouts. Oh no, it's relatively recent; there was a period a week or two ago where it was offline, and now it looks like it has been online again recently, and then offline. So it does look like a performance thing; previously it was simply offline. Could you help me understand who we should contact when it goes offline? I thought that was hosted by OSUOSL. No, that machine is hosted on Rackspace. Okay, it's hosted on Rackspace. I had the issue right before taking some vacation: the machine was not reliable, so I could not SSH into it. When I looked at the Rackspace console, nothing told me the machine was overloaded, but I couldn't connect to it, and then suddenly I was able to SSH into the machine again. I looked at the logs and I couldn't find anything; the uptime showed it had been running for a really long time, and everything seemed fine. So I really suspect too much traffic from the mirror infrastructure, but this is something I have to investigate; we could investigate together. I thought nothing should be pointing to it anymore; I thought we moved back to the Azure file storage for the fallback. Me too, me too; it's just a guess here. Normally — and this is something I also would like to work on in the coming weeks — the problem is that we use a fallback.
But the fallback uses the same Azure file storage as mirrorbits, which means that if something goes wrong with mirrorbits, the same issue happens with the fallback. We should use a different fallback machine. Okay. Now, is there content on archives.jenkins.io that isn't elsewhere? And if so, should I copy its content to a home server or to some other place? Is the archives.jenkins.io content entirely saved somewhere else? It's saved elsewhere, but I think it would still be nice to have another copy of archives.jenkins.io. It's on the Azure file storage and on pkg.jenkins.io. Yeah, so one of the copies is on Azure file storage and the other one is pkg.jenkins.io, exactly. Okay, and those are both full copies, not partial, I thought? No, no, those two are full copies, but the mirrors don't hold full copies: the retention policy of the different mirrors varies, and some remove artifacts after a year or some amount of time. It really depends on the mirror, but the full copy of the archive is right now in two different locations. So it's not critical, but yeah, another copy would be nice to have. Another issue I highlighted right before taking vacation was that the Serverion mirror was not down but really slow, and because it was really slow we still had a lot of timeout issues. We were heavily affected by Serverion around the 17th of December because of the way we deploy Jenkins on our infrastructure: our Helm chart is configured to install the plugins when the container starts. For one reason or another I had to restart the container on release.ci.jenkins.io, and the container would not start because it couldn't install the plugins: it tried to download them from Serverion, which was timing out at that time.
It took me a while to understand, but basically what happened is that some of the plugins were not installed, so the container restarted, tried to reinstall every plugin, and the loop went on over and over. It took me a while to identify that. So, to avoid this kind of issue in the future, we should monitor every mirror we add to our infrastructure, so that if one is slower than, let's say, five seconds, we are notified and can take that into account. Another thing would be to package our own Docker image of Jenkins containing every plugin we need, so we don't have to install everything when we start the container. This is something we discussed with Damien and Garrett quite recently; we may start working on that. Any questions? So I see that Serverion is still not mirroring, for instance, the most recent release of the Git plugin, but they are mirroring the previous version. So is Serverion included in the mirror list right now, or excluded? I assume it's included. There are two things here: I re-enabled it today, so that's quite recent. And because we don't control the mirrors, we don't know how often that machine synchronizes with our own mirrors, so we cannot rely on that; that's how mirrors work. As long as another mirror has the data, it's fine. And is there a contact list we could have used to reach out to the Serverion administrators, or any other mirror administrator? I didn't know how to do that. Yeah. Right now mirrorbits is configured manually because we don't have many mirrors, so we run around ten commands, and those commands are documented with the mirrorbits Helm chart.
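The "notify if a mirror is slower than five seconds" idea could be sketched roughly like this. This is a minimal illustration, not the actual monitoring setup: the mirror URLs are hypothetical placeholders, and it assumes a plain HTTP probe is a good enough proxy for mirror health:

```python
import time
import urllib.request

SLOW_THRESHOLD_SECONDS = 5.0  # the "slower than five seconds" rule of thumb

def probe(url, timeout=10.0):
    """Return the response time of a simple GET, or None if unreachable."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            pass
    except OSError:
        return None
    return time.monotonic() - start

def classify(latencies, threshold=SLOW_THRESHOLD_SECONDS):
    """Split mirrors into healthy / slow / down from measured latencies."""
    report = {"healthy": [], "slow": [], "down": []}
    for mirror, latency in latencies.items():
        if latency is None:
            report["down"].append(mirror)
        elif latency > threshold:
            report["slow"].append(mirror)
        else:
            report["healthy"].append(mirror)
    return report

# Hypothetical mirror list -- the real ones live in the mirrorbits configuration.
mirrors = ["https://mirror-a.example/jenkins/", "https://mirror-b.example/jenkins/"]
# report = classify({m: probe(m) for m in mirrors})
```

Anything landing in the "slow" or "down" buckets would then feed an alert, so a degraded mirror like Serverion would be noticed before it breaks plugin installation.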
So if you look at the documentation, you see that for each mirror we provide the rsync endpoint, the HTTP endpoint, and a contact email. Great, thank you. Yeah, and it does have a contact. Yeah. And it already happened in the past that we could not reach a maintainer; in that case we just disabled the mirror for the time being. Yeah, if you log into mirrorbits you can also just run mirrorbits list or mirrorbits show, and it gives you all the information about the mirror, including the contact details. Thank you, thanks for the clarity. The next point is about the way we build the Docker images on the Jenkins project; in this case I propose to give the floor to Damien, if you want to present that. Just for information, you have seven minutes left. No problem. The whole context is improving how we build operational Docker images, in order to execute operations on the infrastructure, for instance in the process of working around Terraform. We also want to add some testing and linting, so we need a reproducible environment to execute these steps. A Docker image is kind of the de facto standard to be sure you will get the same behavior whether you are running on macOS, on a Kubernetes cluster, on a Linux machine, or a Windows machine. So the idea is to build Docker images. There is already a lot of work around building Docker images using img, because the challenge here is that infra.ci.jenkins.io relies on a Kubernetes cluster to run the agent workloads. Running a Docker engine inside a Kubernetes pod would allow some privileged use cases, while the processes that run inside the pod should be as unprivileged as possible. Even though Docker is now able to run rootless, it still requires some specific kernel capabilities as of today.
So our challenge here is around security: we want to avoid any malicious Dockerfile, whether from a malicious source or from a change to the Dockerfile, breaking out. For building the Docker image, img or Kaniko already solve that issue somehow, or at least provide useful security, and we already use them. The next challenge is testing, because we need to test such an image. For instance, for Terraform, we want a specific version of Terraform for our infrastructure, or it will break future usages of Terraform; so the day you update the Terraform version, you need to update the tests. The goal is to write tests at least validating what is inside the Docker image, which is easy to run with some tools we are adding. Finally, we have the reproducibility issue. I ran into cases where the Docker image produced by img or Kaniko behaved a bit differently from the one I was generating on my local Docker engine. The question is how a contributor could run the same steps as the build process without sacrificing security: I don't want any untrusted workload to run on infra.ci.jenkins.io. So the work we are doing with Garrett and Kara starts there, and as part of the knowledge sharing we try to make that work as public as possible. The idea is to first define a Makefile that holds all the logic: to build the image it's make build, make test, and make deploy. That Makefile should be trusted, as well as the Jenkinsfile, so it should live inside the shared library as a static resource. And the goal is to improve the pipeline library of the Jenkins infra project to provide a Docker build, a Docker test, a Docker publish, and a combination of them: four functions that provide these specific features, relying on a Makefile located inside the shared library. The target for contributors: if they want to build or test an image, or contribute, they install make.
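The build/test/deploy split described above could look something like this hypothetical Makefile sketch; the image name, the img commands, and the smoke test are illustrative assumptions, not the actual file from the shared library:

```makefile
IMAGE_NAME ?= jenkinsinfra/terraform   # hypothetical image name
IMAGE_TAG  ?= latest

build:
	img build -t $(IMAGE_NAME):$(IMAGE_TAG) .

test: build
	# Minimal smoke test: check the pinned tool is present and prints its version.
	docker run --rm $(IMAGE_NAME):$(IMAGE_TAG) terraform version

deploy: test
	img push $(IMAGE_NAME):$(IMAGE_TAG)

.PHONY: build test deploy
```

Because each target depends on the previous one, `make deploy` cannot push an image that has not been built and tested, which matches the trust model Damien describes: contributors run the targets, but only the shared library can change what the targets do.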
They curl the Makefile, and then they can run make build. But they cannot change the Makefile itself unless they open a pull request against the shared library, which avoids security issues while still providing a repeatable workflow. The status of this part: we had to bootstrap the process to be sure we have a first image to build the image itself. Now that we have bootstrapped that part, we can restart the work on INFRA-2849 by building a Docker Terraform image, which we will have to test as well. This is the current status of that work. I don't know if it raises questions. To me it totally makes sense. I mean, if something is unclear, don't hesitate. No, no. Thanks for that. Thanks. There is one last item that I realized I forgot to talk about, which is automated releases. Around mid-December I started working on building release candidates on the Jenkins infrastructure. I created a ticket, INFRA-2853, if you want to participate in the discussion. Some feedback was already provided there that I have to answer, but if you're interested in understanding the current state, feel free to look. Mark, are you the one who added the ci.jenkins.io item, or you — I — sorry, go ahead, Tim. That's me, just getting annoyed. This is the single most annoying thing I have with the Jenkins infrastructure right now: I can't get the agents to stay connected, they keep dying randomly, and it'd be great to spend some time with someone on it. I did try, but I just don't really have access; they're AWS agents and I can't see anything there. I tried to get at them but I couldn't find the credentials. I can take some time to look at it during the week, because I haven't looked at ci.jenkins.io for a while. Yeah, it'd be good if we could detect it, so that we can monitor it and see whether it goes up or down.
I think detection might be feasible just by looking for certain patterns in the build logs, but correction — I don't know what to do about correction, Tim. I could envision — go ahead. I just meant that we could at least measure how much it is happening, and then we know when we have solved the problem. Ah, okay. We're currently relying on people complaining about it. Yep. Right. And today, people just go quiet and keep rerunning their builds; I did that a number of times over the holidays. And Olivier, it's an interesting one: certainly our costs increase as a result of running builds a second time because the agent was disconnected the first time. So there's an element of this where fixing it would actually be a cost saving; I just can't predict how much. So yes, it's an issue. Unfortunately, I don't think I'm going to have time to investigate any time soon, Tim, I apologize. I could look at it. Regarding the cost, I had updated my document with the various costs, and the thing is, because we had such low traffic in December, the costs decreased anyway during December, so I could not detect that effect. Okay. Well, I don't know that the reliability hit here is going to show up as a measurable cost saving; it's just that we are absolutely causing users to rerun their builds because the previous one failed. Yeah, I totally agree, and I had to restart a bunch of jobs this way several weeks ago. So okay, I could look at it. We are two minutes over the limit of the meeting, so I propose to stop here unless you have one last element you want to bring up. So again, feel free to add your comments for the next meeting in one week, add any topic you want to discuss, and meanwhile we will all be on IRC anyway, so see you there. Bye-bye.