Welcome to the Jenkins infra team meeting. It's the second of December 2021. We excuse Damien Duportal, he's stuck in traffic at the moment. I think we've got several topics, so let's go through them and talk about each briefly. As attendees we've definitely got Hervé, me, and Tim. And I'm hoping Damien will join us, so we'll leave him tentatively there.
So in terms of topics: previous weekly build failure, upgrade campaign, outages, Kubernetes 1.20, TLS certificate for repo, Oracle Cloud access, Terraform, costs, updatecli, domain renewal, Fastly. Wow, we've got lots of topics. Oh, and I've got one more topic to add. Oh no, here it is, it's already here. Okay. So it was the new solutions topic. Any other topics than what are listed here that you need to put on the list? I think we're going to run out of time long before we run out of topics.
Maybe the other Stefan, next week. Oh yes, new. Okay, we'll have a new team member joining, that's good. So that's an announcement thing: new team member joining at CloudBees, on infrastructure, Stefan. And we'll introduce him at next week's meeting. Great. Thank you. Okay, good. Any other things that need to go on the agenda? I think we're good. Okay.
All right, so in terms of the previous weekly failure. I have the background on that. This is, was this the weekly build? So I assume it was the weekly Jenkins build. Sorry. I don't know. I don't know. It wasn't. It was last Tuesday, with a bunch of packages that disappeared when no-install-recommends was removed. Right. No-install-recommends was added. Got it, right, okay. So this one was a build failure on 2.322. It was repaired by active work that day. Got it, that makes that one clear. Excellent, thanks. Okay. So that's more status, we think it's healthy. The next topic then was this apt-get upgrade campaign, and here it is.
So what I think this is really saying is we've got a number of machines that are not running the most recent patches, and systematically we need to get them up to date. So one example is instances that wouldn't reboot, that we don't have access to, that others on the team don't have access to, that it's only Olivier and me. Another has had to receive major rework. Damien, you're here, great, so we can switch to have you running instead of me. Good. Sorry.
Okay. So you were describing the issues on the upgrade campaign. Right. So it was successful. The goal was to check as many machines as possible and have a turnover. We had some issues on machines, direct issues like the machine not rebooting. We had an issue that had an impact on the weekly release, where pkg had a lot of held broken packages. And so Mark and I debugged that. In fact, the packages were not installed correctly because it was a mix of Xenial and Bionic sources, even for the core packages. And most of the packages were from an initial Ubuntu non-server image, so a bunch of X11 running. Yeah, so we had to clean.
Mark, while you're there, or maybe Tim: I'm not sure what the service census is. I haven't searched. But if you have access to the VM, because I see the VM running on AWS. Like, I'm not allowed to connect to it and there are references in Jenkins infra. So if one of you knows what it is. Which one, census, C-E-N-S-U-S? I thought it was used for survey results, for who's using it and how. Mark, I don't have access either. Okay, so we have to ask Olivier or Tyler. I assume, Tim, you also probably can't SSH to census.jenkins.io. Is that just census.jenkins.io? Yeah. Oh, you are. Okay. Yes, I've got root access. Oh, very good. Okay. So after the meeting, I might need your help, at least to ensure that we copy the public keys of me, Mark and Hervé.
And then we'll check a bit and stuff afterwards, if it's okay. Is it just the one you have, or is it different? It's a different one. It's the same as ci.jenkins.io, or whatever machine you have access to. I will send you a public key and I will take care of that after the meeting, if it's okay for you. Yeah, so you won't waste your time searching for the correct key. Great. Thank you. Excellent. Great. Tim, glad that you've got access.
Another impact that happened yesterday is on the machine usage.jenkins.io, which is hosted on AWS. There was an old issue that I've mentioned in the notes. In fact, one of the data volumes, mounted to serve bigger usage, wasn't remounted after the reboot because it was not written in fstab. So it's an old issue. I took care of that, and Andrew opened a new issue about it, and we took care of it. It's a kind of temporary situation because we plan to move that virtual machine to Azure or, even better, to Kubernetes. So we keep that intermediate state where one volume is managed by Puppet and the other has been managed manually but is persisted. That's one line of fstab. Yeah, so that's all.
So we will have some work to do in the future to ensure that these upgrade campaigns are not exceptional. I would say we should have at least a weekly upgrade and reboot on all the virtual machines, because a bunch of issues are caught and it's better if we control when we could have the issue. So that's something we could plan for the short-term future. We could use, what's the name, unattended-upgrades on Ubuntu. That will take care of rebooting the machines, and having a weekly reboot could be a good thing. We can have some machines on Monday, some machines on Thursday. The reboot is to ensure that the kernel upgrades are applied, even though there are methods on Ubuntu to live-patch the kernel without a reboot. However, the reboot also has a second effect.
We change hypervisor a bit more often on the cloud provider, which ensures that we have recent hardware, or at least hardware that is not heavily used, at least for the M instances on AWS. Is there any question, anything unclear on that part? That sounds positive to me.
A point on the recent outages. So the jenkins.io postmortem was published last week, right after the team weekly. So it's going to be frozen. I haven't seen any comments here, and neither did I receive any feedback. I apologize, Damien. I did not do anything with this. Would you be willing to leave it open for another day or two to allow more time? Yeah, I'm going to wait until tomorrow, end of the day. That is sufficient. Yeah, that would be sufficient.
There was an issue on archives on Monday, so I have to write a postmortem for that one. It was a consequence of the apt-get upgrade, which upgraded the kernel and some low-level modules that Apache relied on for the bandwidth limitation. So it sounds like the bandwidth limitation was not working with the old kernel, and it started working, which was the outage. So we were way past the bandwidth limit, and the calculation for the bandwidth was done for Rackspace, which is not an issue on Oracle right now for that machine. So we removed that, after confirmation by Olivier that it makes no sense anymore. Worst case, if we add all the bandwidth of all the infrastructure on all the providers we have, the worst case will be 10 bucks per month on Oracle. I mean, two petabytes per month. Great. So I mean, okay. Yes, so for the postmortem, Hervé and I were involved on that one. So that's also a good thing. That means that we are trying more and more to work together and not alone. Great.
Kubernetes 1.20 upgrade. Okay, the mic is yours. Yes, so it went well. For the EKS cluster, we had to add the add-ons in Terraform so we could manage them. It was the... I don't have them, sorry.
There was CoreDNS, the network CNI driver that allows pods to be on private networks in AWS, and the third one I don't remember. These add-ons are components installed in Kubernetes that we could install as a chart or whatever, but that are managed by EKS and Terraform, because they are the bare minimum to be sure that we can start using Kubernetes for real. So CoreDNS is an example. The DNS implementation depends on the Kubernetes distribution that you have. At the time, that installation is managed by the cloud provider. That's the goal of these add-ons, and we were not aware of their existence, because they were installed one-shot and not managed automatically during the EKS installation.
So that was quite a success. Great job, Hervé, because that went very well. The preparation and the communication were really good, better than Olivier's or mine. So congrats on that also. That's a good thing for our users. And for the next one, you will be the master of the timeline again. We will have to improve a bit, to work a bit more on Terraform. But now we have proved that you have the ability to open pull requests and change elements if needed in Terraform. So you should be autonomous, and I can be there either pairing with you or as a backup, or you can do it alone, your choice. Probably we'll do it with Stefan. We'll see.
Just one thing, Mark: you and I should contact KK to see if he's able to renew the certificate for repo.jenkins-ci.org. I don't know if he acted with JFrog or not. And if he doesn't have time or if he's busy, we can totally take over, but we have to contact him for that. Okay. So JFrog is ready. The ball is with KK. He has to send the certificates to JFrog so they can act directly. They are waiting for him. Okay. At least from the information we had, and we were in CC. Maybe that's changed, but we have to check with him. Would you be willing to check with him? I'm happy to do it. Yep. Great. Okay. I'll take care of that then.
Oracle Cloud. We need access.
I haven't put your name, Tim, because I was sure that you have access to the Oracle Cloud console. I don't. I should. Yep. And I propose Gavin at the same time as well, if it's okay for everyone here. That makes sense. I posted a link in IRC about integrating it with Azure AD. What do you think about that, rather than manually doing it for everyone? That could be interesting. Does it work for all the cloud systems we plan to use? Should do. I mean, it works with AWS, at least. That should be good enough. I don't know how we would do that, but it sounds great. Tim, would you be willing to work with me on it? I think Olivier and I are the ones who have access to it currently. It's too much of a one-off right now. Integrating with Azure AD feels like a big win, to not have it be such a one-off. Yeah. Thanks, Tim, for pointing that one out. I didn't know it was possible. That's really good. So short term, we need access to the Oracle Cloud right now, and integrating with Azure AD will make our life easier in the future. Great. Thanks. Yeah. And Tim, is it best if I just send you a proposal for some time we sit together and have you coach me through the whole thing, what we need to do? It's up to you. I mean, there's an end-to-end guide. I can sit there with you if you want. Okay. There's probably a template application already in Azure AD. It's normally pretty straightforward. We are happy to do it with you. It would be a great help for me, just because of my lack of expertise in that area.
A word about Terraform for more clouds. So, work in progress. There is a new epic created as an umbrella to follow up the short-term Terraform-related tasks. There is one task in progress, which is blocking all the others: that's the pipeline library. And for the rest, we will be able to parallelize tasks depending on who is willing to take which one. Great.
A word about the costs as well. We were able to decrease the cost of AWS to under 9K.
It was 15K last week. So that's really good. I hope we could gain way more by moving some virtual machines. Almost there. And the costs on Azure are also way lower than 10K. So congratulations, first of all, to everyone. There are still some improvements that we can make on the cost, but that's already a lot. We cost half of what we did at the beginning of the year. So that's a really good improvement. Congratulations, Damien. Congratulations. That's great. Thank you.
A word on the updatecli campaign, unless you have questions about Terraform or costs. Okay, so Hervé continued the big effort of trying to put updatecli on a lot of repositories. Most of the time we only tracked one or two elements on our repositories, so the idea was to track as much as possible, so we have way more updates that we control. There are still two repositories coming for the Docker images. And we will have the Terraform repositories to add, but that's part of the blocking task I mentioned earlier. So that's really nice. I expect that we should be able, in a few weeks, to start contributing to jenkinsci repositories with updatecli proposals, like Tim did for the BOM. I'm sure there are a bunch of elements where what we've learned on jenkins-infra could help the community.
.io domain renewal. Thanks for the pointer, folks, for that one. That means we should start the renewal before end of year, if I understand correctly. Yeah, for now the renewal price hasn't increased, but I don't know where the domain is registered. What is the registrar for jenkins.io? I don't know either. I would have to double check. Go ahead. Tyler, I'm pretty sure. Yeah, Tyler. I think Tyler Croy had it. The one that is expiring soon is jenkins-ci.org. jenkins.io doesn't actually expire for 420 days, so we have over a year till it expires. If the registrar allows renewing several years in advance, the cost could be smaller.
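As a rough illustration of the timeline mentioned above, the following Python sketch turns the quoted 420 days into a calendar date. It assumes the figure was counted from the meeting date (2 December 2021). The function names and the 60-day reminder lead time are hypothetical and only for illustration, not something the team decided.

```python
from datetime import date, timedelta

# Assumption: the "420 days" quoted in the meeting was counted from the meeting date.
MEETING_DATE = date(2021, 12, 2)
DAYS_LEFT_JENKINS_IO = 420


def expiry_date(as_of: date, days_left: int) -> date:
    """Return the calendar date that lies `days_left` days after `as_of`."""
    return as_of + timedelta(days=days_left)


def renewal_reminder(expiry: date, lead_days: int = 60) -> date:
    """Return the date `lead_days` before expiry, as a point to start the renewal.

    The 60-day default is an arbitrary illustration, not a team decision.
    """
    return expiry - timedelta(days=lead_days)


if __name__ == "__main__":
    expiry = expiry_date(MEETING_DATE, DAYS_LEFT_JENKINS_IO)
    print("estimated jenkins.io expiry:", expiry.isoformat())
    print("start renewal by:", renewal_reminder(expiry).isoformat())
```

Under that assumption the estimate lands in late January 2023, which matches the "over a year" remark. The authoritative date is whatever the registrar reports for the domain.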
Yeah, because of the direction it's taking with the private company behind it. Yeah, the price will increase. That would be completely scandalous. Okay, good. So that's one where we need an action item: someone will check with Tyler Croy. Likely through Tyler Croy. Are you okay to take that action and drive it? I will be there to put you in contact with Tyler, if it's okay for you. Mark, or do you want to take over on that one? Oh, that would be great. I would love to not take over on that one. I had this conversation last year with Tyler about jenkins-ci.org, and it auto-renewed, so he just left it to do what it did. But in this case, if you've perceived that there's a cost savings by changing our renewal pattern, certainly nine more years of jenkins.io is a predictable thing we want, right? We don't want to lose the jenkins.io domain. So it would be interesting to have Tyler's feedback on that part, that might be valuable as well. It's probably not a huge increase, like it's kind of from 40 to 55. If that's the only increase, I would just ignore it. But I think it's worth a conversation with Tyler to see if he wants to extend it for many years so that we don't have to pay the renewal every year. In the worst case we will learn something about how it works, so it won't be a waste of time.
Gavin asked about Fastly as code. Right now the Fastly configuration for the domains is managed through the Fastly UI, which means you need access, etc. And there are a bunch of ways to configure it remotely. Here we've put an example with Terraform, since this is our trend. I don't want to do this right away, but it will be interesting, because that could mean maybe less access management, and we could propose changes, review them, or at least be able to roll back with shared knowledge. So honestly I think that's a good idea, but there might be downsides. I don't know which ones right now.
Not sure, in terms of security, if there are sensitive elements in this configuration. I don't think so either. So that could be interesting to share. So thanks, Gavin, that will be an item to take. It's my proposal: if there are no voices against that, I will take care of adding an issue to the Terraform umbrella, unless someone wants to manage it with another tool, which I don't mind. If there is another tool, you do it and you own it. And I will ask if Gavin is interested. Anyone interested in working on that one? Happy to let you work on that one.
Mark, about the solutions you asked about. I'd say let's just continue the discussion in the thread, I don't think we need to discuss it here. I'd like time in this meeting for a presentation by Gradle Enterprise and probably a presentation by Launchable. But this will wait till next week for further discussion. The goal of mentioning it is to acknowledge that we saw your message: integrating Gradle and Launchable as new services in the infrastructure could be worth it for the contributors and developers of Jenkins. We have to evaluate, let's say, the overall cost in terms of time and techniques. Right. But yeah, there is no strong no, that could be a good idea for both.
Thanks, Tim, for reminding us about Karpenter, which has been presented by AWS. That's also something to check, whether it could be valuable to improve the autoscaling of the Kubernetes cluster. It's not Amazon only. So we have Karpenter and UniversalCube from Scaleway. These are two tools that work at different levels of the Kubernetes stack and are Kubernetes-provider agnostic. So there are a bunch of interesting things to help us with managing. Scaleway, Scaleway. For Scaleway, it's one control plane replicated on different cloud providers, and you can use worker pools wherever you want. And Karpenter is an autoscaling solution for workers, to have different kinds of workers as well. So that might be almost the same feature. Okay. And that's all.
Unless you have other topics or questions. None from me. Cool. I'll take care of publishing on the GitHub repository, and if it's okay for everyone, I will also publish these notes on Discourse, on the community site, as well. That might be a link to the published notes, so the link will be a permanent one on GitHub. Great. That's a new thing. And if you'll wait till after I get the meeting recording published, then we get the benefit of embedding the meeting recording as a clickable link. Thanks. Good idea. Excellent. That's all. Thanks, everyone, for your involvement and your reports. Thanks. Take care.