Okay. All right, welcome everybody. This is the Jenkins infrastructure meeting. It's the 11th of August 2020. We've got a number of topics on the agenda. Olivier, do you want me to share the agenda while we go through this? Sure. Okay, so I'm going to go ahead and share my screen. And you've got the first topic, Olivier, so let's bring it to you. Everybody see my screen? Okay. Yep, I do. Yes, perfect. So basically, the outage that we had this morning is in the Kubernetes resource configuration. What we had in the past is a certificate that we bought from GoDaddy. And when that certificate expired three months ago, we moved to a Let's Encrypt certificate, so the certificate is renewed every three months. But the LDAP configuration needs three components: the ca.crt, the key, obviously, and the certificate for LDAP. But cert-manager does not provide a ca.crt, so I had to retrieve that information from the Let's Encrypt website. And basically what happened is, because of the way the LDAP image was designed, we needed to pass those three parameters in the configuration when the container started. And the problem is, because of how Kubernetes mounts volumes, we needed to have those three components in the same volume. So what I did is I slightly modified the Docker image to fetch the ca.crt from a different directory, so I didn't need to have the ca.crt in the Kubernetes resource. But what I did to go faster is I just manually modified the secret, in order to have that information in the secret as well on the Kubernetes side, even if the container does not use it. So if you look at the two PRs that I created this morning, I remove the mention of ca.crt from the StatefulSet, so we do not try to mount that information, because it does not exist in the secret and it should not. That's one of the things.
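For readers following along, the volume-mount situation described here might look roughly like the sketch below. All names (`ldap`, `ldap-tls`, the image, the mount path) are illustrative assumptions, not the actual Jenkins infra manifests; the point is that a cert-manager-issued secret is mounted as a single volume, so every file the container expects must be a key of that one secret.

```yaml
# Hypothetical sketch of the LDAP StatefulSet mounting a cert-manager secret.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: ldap
spec:
  serviceName: ldap
  selector:
    matchLabels:
      app: ldap
  template:
    metadata:
      labels:
        app: ldap
    spec:
      containers:
        - name: ldap
          image: example/ldap          # rebuilt weekly in this setup
          volumeMounts:
            - name: tls
              mountPath: /etc/ldap/tls  # tls.crt and tls.key land here
              readOnly: true
      volumes:
        - name: tls
          secret:
            secretName: ldap-tls        # issued by cert-manager; no ca.crt key
```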
And the second thing is I updated the Docker image to not fail if it cannot find the ca.crt in the expected location. So that's basically all. The way I found it was quite simple: I tried to connect to one of our services, Jira. Initially I thought that my password had been changed for some reason, or compromised, whatever. And then I realized that the container was not running anymore. So I just looked at what happened in the Kubernetes cluster: the container was not running, so I looked at the logs, and the logs told me that it had tried to read a file that did not exist, obviously, because that file was not generated from the secret. So what I did this morning to solve this, because we needed that service to be running again, is I just modified the Kubernetes secret that contained the certificate and the private key, and I also added the ca.crt to the secret. So I did the same thing that I did three months ago. I opened two PRs; the two PRs are documented in the Google Doc here. Once we merge those two PRs, it should be fixed. The second main issue that we may have in the future, though it's mitigated at the moment by the way we use the Docker image, is that because we switched to Let's Encrypt certificates, the certificate is renewed every three months, so the container needs to be restarted within those three months. Because we rebuild the Docker image every week to pick up the latest updates and the container gets updated along with it, we haven't hit the issue, but yeah. That's consistent with our Jira container, which we must restart every two weeks. If we don't restart that thing every two weeks, it has exorbitant disk usage and causes itself problems. So Jira container restarts, for right now at least for me, are just a fact of life. Good. Okay.
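The quick manual fix described above, adding a ca.crt entry to the existing TLS secret by hand, might look something like this. The secret name is a hypothetical, and the base64 values are truncated placeholders; the key names follow the standard `kubernetes.io/tls` convention.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: ldap-tls               # hypothetical name
type: kubernetes.io/tls
data:
  tls.crt: LS0tLS1CRUdJTi...   # base64-encoded certificate (truncated)
  tls.key: LS0tLS1CRUdJTi...   # base64-encoded private key (truncated)
  ca.crt: LS0tLS1CRUdJTi...    # added by hand from the Let's Encrypt chain
```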
But in the case of LDAP, it shouldn't be an issue, because there is a health check on it. So if the LDAP container does not work for some reason, it gets restarted, and it just takes a few seconds for the container to come back. So I mean, that's one of the risks, and we still have things to improve. But anyway, if we move the LDAP system to the Linux Foundation, we won't have to work on that. So for now, I would just merge my two PRs regarding LDAP once we are ready. Because we have ongoing releases at the moment, I don't want to impact the work being done there, so I prefer to hold for a few days before merging the PRs. Thanks, Olivier. Any questions from you, Alex, or from you, Tim? No, I just got online, so I didn't know about the LDAP downtime. The service was down for 13 minutes or something this morning. I mean, this morning, nothing for me. I was still logged in, so I was fine. Okay, super. Olivier, thanks very much for that status. Are we ready for the next topic then? Upcoming releases tomorrow. So for this one, Olivier and Daniel Beck are going to monitor delivery tomorrow. I'm acting as backup if, for some reason, Olivier were unavailable. If Olivier becomes unavailable, Daniel's plan is to start a little later so that he can do it while I'm normally awake, rather than starting at roughly midnight my time. We don't expect any surprises there. We believe the infrastructure is stable, steady, and ready to go. The upgrade guide and release notes have been written, haven't been merged yet, but they look good. And I think, Alex, this is the first delivery of the Windows installer on LTS that's going to be delivered the same day as the build. We delivered 2.235.3 a few days after; this one, we hope, will be right on time. And you've seen the upgrade guide proposals there. Yep, looks good to me. Okay. Olivier, anything you need to highlight there? No, that sounds great.
Also, we have a bunch of PRs on the jenkins-infra/release Git repository, and I think that we should not merge them before tomorrow. We have to be careful about changing the release process, and if we want to change it, then we have to wait for the next weekly, because that's less critical than the security release. So we don't merge anything before tomorrow, and, I mean, globally, we don't merge anything that could impact the release process. That's why, for example, we do not merge the LDAP changes. Yeah, that's a good point. We're intentionally taking a conservative approach until after the release tomorrow. And on that topic, I want to highlight something that happened, and I was really surprised that it worked now, because it did not in the past. Last week, while Dennis was preparing the release, I accidentally merged a PR that updates the Jenkins version used by the release environment, so I updated release.ci. The Jenkins master was restarted, and it did not impact the agents: the agents were able to correctly reconnect to the master and do the work as usual. This is not something that was working in the past; previously, a restart would break the agents. That was quite a surprise for me. Great. While we talk about the release, there is one topic that I put at the end of the agenda that I would like to mention here. I will just copy this here. So, basically, I've been investigating different ways to reduce the costs on the Azure accounts. And one of the main costs on Azure, taking 30% of the bill, is the packages that we store on Azure and the bandwidth to fetch the packages from there. So, right now, when you go to pkg.jenkins.io and try to download a package, there is an .htaccess rule that redirects the request to our Azure blob storage, which means that the Jenkins project is paying for the bandwidth for everything. And it would be better to just rely on our mirror infrastructure.
We have quite a lot of traffic; just for the network bandwidth, we are paying $2,000 per month. And this is something that we could reduce if we move to the mirrors. Until today, we weren't able to use our mirror infrastructure, because it was not supporting HTTPS. That's one of the motivations why we investigated MirrorBits. MirrorBits is deployed on get.jenkins.io. While we still have some improvements to do there, I think that for pkg.jenkins.io, it should be ready. Something that I would like to keep in mind for the release tomorrow: the behavior with MirrorBits is that it builds a hash for every file and then compares that hash with the one on the remote mirror. So if, for some reason, the file differs between MirrorBits and the remote mirror, then it falls back to a specific endpoint that I configured. For that endpoint, I configured archives.jenkins.io to be the fallback, because we control it and we know that we have all packages available there. And so, something that I would like to test tomorrow is, when we do the release, obviously the file will not yet be propagated to every mirror that we have in our infrastructure, but I would like to be sure that we are still able to download the latest version from get.jenkins.io. While it should be working, tomorrow will be the first release where I can test that behavior. And if it really works, then I would like to update the pkg.jenkins.io .htaccess rules to redirect not to Azure, but to get.jenkins.io, to download every package. I created a Jira ticket about that work, because pkg.origin.jenkins.io is not managed by Puppet anymore, at least at the moment; the Puppet agent is disabled, so it's just a manual fix.
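The hash-check-with-fallback behavior described here can be sketched roughly as follows. This is a heavy simplification under assumed names (`choose_mirror`, `FALLBACK`, the example URLs are all hypothetical); the real MirrorBits is a separate Go service that does much more, but the selection logic is the same idea: serve from a mirror only if its copy's hash matches, otherwise fall back to the archive endpoint we control.

```python
import hashlib

# Fallback endpoint under our control, known to hold every package.
FALLBACK = "https://archives.jenkins.io"

def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest, standing in for MirrorBits' per-file hash."""
    return hashlib.sha256(data).hexdigest()

def choose_mirror(path: str, expected_hash: str, mirrors) -> str:
    """Return a download URL: the first mirror whose reported hash matches
    the expected one, else the fallback archive server."""
    for base_url, reported_hash in mirrors:
        if reported_hash == expected_hash:
            return base_url + path
    return FALLBACK + path
```

For a freshly released file that no mirror has synced yet, every hash comparison fails and the request lands on archives.jenkins.io, which is exactly the behavior to be verified after tomorrow's release.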
So, I documented in the ticket all the files that need to be modified. What I suggest, as a first iteration, is to just update the Debian packages first, and look at whether it breaks something and whether it keeps working over the coming weeks. So, to be clear, I'm planning to update the file tomorrow after the release, if we can correctly fetch every package from there. That's great. And, yeah, I'll put the link to the ticket in the... I know, it's already there. So, any question on this topic? Have we tried to see how long it takes to download the packages? So, it depends on the mirror you get. It really depends on the mirror you get, because in this case you're really downloading from it. For me, it was faster to use local mirrors, because there are two mirrors in Germany and one in Holland. But I guess it would depend on where people are. We have six mirrors at the moment, and all six of them are HTTPS-enabled. Let me share my screen so I can show you. Just a minute. Okay, got it. So, trying to share again; sorry for the delay, Olivier. Now the window is loading and Zoom is not responding. Okay, now it's working. Can you see my screen? I will just increase the size. So, I'm going to get.jenkins.io. It's exactly the same content that you have on the download site. So, the thing that I wanted to test first: if you take, for example, a link for Hudson, copy the link location, you can specify parameters in the URL to ask for the mirror list. And in this case, for this specific file, you can see that no mirrors contain the file, because that file is really old; it's like one of the first releases, back when Jenkins was still Hudson. So, if you try to download it, then you will fetch the package from archives.jenkins.io.
And we do not test the hash or whatever there; that's purely a fallback service. But on the other side, if we want to test, let's say, one of the latest versions, this one, Jenkins... I'm going to take the link, same mirror list. So, everything is happening over HTTPS. And right now, we have six mirrors that contain that specific file, with a hash that matches the one on MirrorBits. And obviously, once we do the release, the first thing, because it's part of the process, is that we update the mirrors that we control. But then XMission, Cerberium, and Grun, those will be updated once the files are available to them. They are all working over HTTPS only anyway, because I disabled HTTP; the purpose of get.jenkins.io is to work over HTTPS only. But yeah, that's what it looks like. And something that I would also like to test is having stats about how many people... oops. So, Olivier, just for my clarity then: today, when I access mirrors.jenkins-ci.org, it's actually consistently going to Azure storage, even though we have mirrors available? If you're going to mirrors.jenkins.io, you are going to MirrorBrain, and then it will redirect you to one of the mirrors. That's the current behavior. It does not support HTTPS at the moment. But if you're using pkg.jenkins.io... basically, it's here: pkg.jenkins.io. Okay, I'm on the machine. This is the content for the pkg.jenkins.io website. And you see that we have a bunch of directories: Red Hat, Debian, and so on. And if you go to Debian, for example, you have an .htaccess file. And in the .htaccess file, we have a rewrite condition: if it's binary/*.deb, then redirect to the Azure blob storage endpoint on blob.core.windows.net. And what I want to do, basically, is to replace this rule, and in this case I just want to replace it with get.jenkins.io. That's basically what I want to do.
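The rule swap being described might look roughly like the following. This is a sketch only: the exact pattern, flags, and blob-storage account name in the real .htaccess on pkg.origin.jenkins.io are assumptions here, not quoted from it.

```apache
# Before: redirect .deb downloads to Azure blob storage
RewriteEngine On
RewriteRule ^binary/(.*\.deb)$ https://<account>.blob.core.windows.net/debian/$1 [R=302,L]

# After: let the MirrorBits instance on get.jenkins.io pick a mirror
RewriteRule ^binary/(.*\.deb)$ https://get.jenkins.io/debian/$1 [R=302,L]
```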
What I tried several months ago... oops, I will just revert this. What I tried several months ago was to reuse the initial rewrite rule, to redirect to mirrors.jenkins.io. But basically it fails, because mirrors.jenkins.io does not support HTTPS, and the Debian package manager will fail if you try that; you must have HTTPS. Okay. So, that's the change that I would like to make. Just a reminder: tomorrow we test whether get.jenkins.io immediately contains the right packages, and if it does, I will start to change those rules manually. Any other question on this topic? No, not for me. I just tried; it was about a second faster for me on get.jenkins.io. You say it was faster, Tim? Yeah, it was slightly faster for me. Interesting. Okay. So, you said get.jenkins.io was faster for you. Slightly. Yeah, that's impressive, because, I mean, Azure's got a lot of work that they do in their content delivery stuff. That's very impressive. Okay. Yeah, but in the case of Tim and me, because we are quite close in terms of location, for us it's a little bit faster. Something that we have to keep in mind is that the Azure blob storage that we are using is located in East US 2. So, for example, probably for you, Mark, Azure would be faster, but we have mirrors in your location too. And, for example, for people from China or Asia, or globally, it will be faster to use a mirror, because they have mirrors closer to them than Azure. Because, again, we are not using the Azure CDN for this. Got it. Okay. So, that's something that... Sorry? I just wanted to say that once we have more mirrors, the experience should be even better. Yeah, and something regarding the mirrors: we regularly had people proposing mirrors, and we declined in the past. The main reason is that we initially thought we had enough money to distribute the packages ourselves, because then you have full control over the pipeline.
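For context on why apt breaks without HTTPS here: users' machines point straight at pkg.jenkins.io in their apt sources, so whatever the rewrite rule redirects to must serve HTTPS. The line below is the standard Jenkins install instruction of that era; double-check the current documentation before copying it.

```
# /etc/apt/sources.list.d/jenkins.list
deb https://pkg.jenkins.io/debian-stable binary/
```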
But, yeah, today we have to find alternatives, and relying on mirrors is maybe more important, more interesting. So, we could definitely promote that and have more mirrors in the future. Another thing, because I'm still trying to simplify the... So, any question regarding pkg.jenkins.io? Then I will just stop sharing. There is one last topic. It regards... I can just continue on this one, Mark, if it's okay for you. Sure, that sounds great to me. Do you want me to share again, or do you want to just... You can share again, yeah. Okay, great. Go ahead. Basically, what's happening right now is that we have two mechanisms configured in the infrastructure, and it's not consistent between the distribution packages. Either we push a symlink, a "latest" link, on the mirrors, so if you go to, let's say, windows/latest, you're redirected using the symlink; or we are using .htaccess. So, for example, in the case of Windows, we had the two configured at the same time. And so, depending on the mirror, sometimes it was using the symlink, sometimes it was using the .htaccess, because we have mirrors that disable .htaccess, basically. And so, we have to find a way to maintain that link. Personally, and this is just a personal opinion, I would like to not use .htaccess anymore, because it puts a strong dependency on Apache: it means that everybody who provides a mirror needs to use Apache in order to work with our infrastructure. So, what I would like to do, instead of relying on the configuration, is to use our scripts. Because the symlink, for example, on the pkg.origin.jenkins.io machine, is managed by Puppet, if I'm correct. What I would like to do is to create that link when we do a new release, so it would be the responsibility of the jenkinsci/packaging repository to maintain that link.
And then we could clean up all the other locations where we generate that link. Great. Thanks. And weekly and LTS are doing that now, so that's a relief. Great. Anything else on the pkg.jenkins.io "latest" link? No, that's all. That's the other thing that I wanted to highlight. I just want to be sure that you are aware that Puppet generates some of those links, and so we have both .htaccess rules and symlinks. Great. Thank you. I'll remember that. All right. Next topic; we are almost out of our 30 minutes, so very brief. The Jira upgrade plan: we are currently running Jira 7.13, which will reach end of support at the end of November 2020. Our proposal has been to ask the Linux Foundation to host our Jira instance, but using our identity management, and they would then do the upgrade to a Jira 8 version as part of that transition. We're meeting with them today; others are welcome to join if they wish. It's a conversation about what the limits are of what they are willing to do, what they can do for us, et cetera. The upgrade plan is linked here; you're welcome to refer to it. It's my attempt to describe what our expectations are. We've shared it with the Linux Foundation IT team and invited them to guide and direct, to give suggestions, et cetera. Go ahead, Olivier. You were talking about identity management; I think it's important to say that it's probably related to the Jira upgrade plan as well. Well, my argument in the discussion with them is going to be that we would prefer to keep them separate, because I fear that identity management is a much larger thing than transitioning Jira, than upgrading Jira. I'm not yet confident, personally, that we can complete an identity management transition by November 28th. And if we don't want to go out of support with Jira, we need to complete that transition by the 28th of November. So, discussions with the Linux Foundation will certainly continue there. Thanks for following that.
Any other topics we need to discuss here in the infrastructure meeting? No, I think it's just better to stick to the schedule. I wanted to organize a session to talk more about the Puppet infrastructure. I wanted to do a small demo, because I gave Tim access to the machines last week, but we probably won't have the time to talk about that today. So, I will probably try to schedule something with Tim in the week and post a link in the chat to the Google Meet, Zoom, or whatever we'll use for the session. Great. Excellent. All right. Thanks, everyone. Let's go ahead and call an end to this meeting. I will post the recording to the Jenkins YouTube channel. Thanks very much. Thanks for your time. Have a good day. Bye-bye.