Hello everyone, welcome to the Jenkins Infrastructure weekly team meeting. Today is 20 February 2024, and around the virtual table we have myself, Damien Duportal, with Hervé Le Meur, Mark Waite, Stéphane Merle, and Bruno Verachten. One announcement before the weekly release: let's speak about next week's meeting, which we propose to cancel. Three of us will be on holiday, so poor Hervé will be alone managing production, so better to skip it. That means we will have to schedule a two-week milestone, like we did during past holidays. Any objection or question on this one? The next team meeting won't be on the 27th then; we will see each other on 5 March. OK for everyone? Yes. I will miss that meeting as well. OK, good to know, thanks Mark. Weekly release, 2.446: the release process was successful. I haven't checked the package step, though; I don't know if it was successful or not. Easy to check: usually we go to jenkins.io, Download, and... it's successful. The container image is not there yet as far as I can tell. Yes, which is not part of the package step, absolutely. Let's check the Windows installer, which is usually the worst one. Let's see... oh, it's not there, because it's not stable yet. Of course, that's expected. OK, and we have at least 3 mirrors. So, package successful. Docker image: there was an issue. I don't know if it was me or someone else; most probably me. I might have created last week's tag on the wrong commit: I most probably forgot to pull the upstream master before I created the tag. So that tag was recreated. Thanks Mark for catching and sharing that, which allowed us to quickly recreate the tag, which triggered a rebuild. That is expected, because from Jenkins' point of view the git ref changed, so the image is republished. And the idea was to let that build finish before triggering a build of 2.446. Unfortunately, the initial 2.446 build created by the tag you made earlier today, Mark, failed on trusted.ci.
I've added a message in the thread: first build failed, waiting for 2.445 to be finished, because we want 2.445 to be finished and the `latest` tag to point to it first, and only then the second build. Does it make sense? Because I hadn't finished my sentence, sorry. It does make sense, that's very reasonable. So 2.445 is still in progress, or may still be in progress, and there's no harm in us waiting until we're confident in it. I was impressed that you recreated 2.445; it wasn't that far out of date, but that's great, thank you very much for doing it. I have added the steps in the thread, just to be sure they're not stuck in my head. So, another good trigger for us to see if we can find some time to work on automating that tag creation. That would help, at least partially, because it would avoid the mistake I made: removing the human from the process would help. It wouldn't solve the 2.446 build failing, which requires us monitoring build status on trusted.ci; that is an upcoming topic, but right now we don't have the capacity to work on it yet. About the changelog: since Kevin is not here, I assume it will be finished later. Is that correct? Finished already? Oh no, the 2.446 changelog. Yes, to be finished later today. Is there something else on that weekly release to share or discuss? Any question? Do you have other announcements? Oh yes, I have an announcement; did you want it here or somewhere else, Damien? No, here is perfect. Okay: so Ampere Computing has donated 2 arm64 servers to the Jenkins project. Two? Oh no, and "donated" is the wrong word, because they're not our property: they have lent them to us. But a loan is good, that's great. They will be hosted at my house, and we'll figure out how to use them most effectively. These are used server-class machines, similar to the machines deployed at many cloud providers already.
So we're looking forward to getting good benefit from these things. We'll figure out all the details about how we get access, what security measures we need, etc. But we won't do any of that until early March, when I return. So we wait for early March and Mark's return from holidays to get started. Right. I'm just really pleased; it's very, very kind of Ampere to lend us the computers. We've been working with arm64 server-class systems since 2021, when we got involved with Oracle in their deployment of arm64, even before they announced it publicly. So the Jenkins project has been doing Arm for a long time, and it's great that they're willing to lend us computers to let us do Arm without having to pay a cloud provider to host them. Absolutely. I had a discussion with Bruno earlier this week, since Bruno had that kind of discussion about RISC-V or other potential servers. I wonder, and that's only a hypothesis, if we could find and ask for sponsoring from, I don't know the English word, in French we say "salle blanche": a colocation hoster. A company that provides the racks, the electricity and the hosting; you provide your own server, which you manage remotely, and they provide the services you would usually pay for, instead of Mark Waite providing them. The idea in that case would be that we could benefit from these two servers running somewhere that doesn't make your electricity bill explode, especially when we start running BOM builds on them or something like that. That would help with the automation, and avoid depending on your ISP, because, same thing, I don't know if your ISP has a quota on downloading or uploading; we never know. Maybe a local cache server might help there, though. But yeah, that was just an idea; it doesn't restrain us from using those machines in your basement. In the medium term, though, that could be interesting: if we have a few in the US, that could be a way forward, because it's not the first time we have a company ready to lend or give us servers. Right.
Well, I think it's a very good, very interesting idea. I actually asked my ISP if they did white-box hosting like that; I didn't get a response, but there are places like that. And oddly enough, not far from where I live is a large Hewlett Packard Enterprise facility, where we might persuade them to donate or lend us some physical equipment as well. In either case, having it hosted somewhere is a good idea; I think we should explore it. Absolutely. Because it's nice to get it out of one place and into a place where people care even more about hardware. Right. But I believe March will be a lot of Mark Waite rebooting the servers a few times. Of course. Yeah, well, and there are still some things to do; I haven't even turned the power on on the servers, so there's much more to be done. But I have taken a photograph of them. That's nice. But yeah, let's wait for you to be back from holidays. And I want a validation from your wife that we can keep them running once we have started the bootstrap process, because I ran an Apple Xserve and a ProLiant in my student basement for a year and a half, and I remember the sound, not to mention the electricity bill and such. And the vibration. Right. Yeah. But it's in the basement, so I assume it's okay for that part. There are already noisy machines near it, so this is just adding to existing noise, not bringing noise where there wasn't any before. Okay. But yeah, please don't blow up your electricity bill. In the winter time I have to heat the house anyway, whether I heat it with electricity or with natural gas. In the summertime, it's a different thing.
Fair. But many thanks, Bruno and Mark, for the work you did and the work you will most probably do in that area: sharing knowledge, showing off our usage, managing the relationship with Ampere. That's a really huge win. Thanks. Thank you. Is there any question? Any other announcement you want to make, folks? Nope. Okay, so let's have a look at the upcoming calendar. 2.447 is expected to happen next week, 27 February. Nothing else to add on the next weekly. We have an LTS tomorrow, is that correct? Yes, that is correct: 2.440.1. 2? Sorry, I misheard: 2.440.1. Tomorrow will be 21 February. Do you remember who the release lead is? Alexander Brandes. Okay, so that will be early morning for us in Europe. Perfect: Alexander Brandes is release lead. Hervé, I believe you will still be busy tomorrow; my question is related to that. That means by default we will have to designate Stéphane and me as release leads on the infra part. Is that okay? Stéphane, do you want to be primary or backup? At what time in the morning? Given Alex's schedule, I would say early morning for us, to be confirmed and asked by the primary. Okay, so I can be the primary. Yeah. Okay, I'm here tomorrow morning, so yeah. And I got your number; you don't know where I live yet, but still. Stéphane is primary lead on infra. Okay. Is there any question about that LTS release? No. Remember, we should have fixed the issues, as the weekly showed: the issues related to the file storage that someone in this room deleted a few weeks ago, and then let his colleagues deal with the consequences. That should be fixed. That's right, you dealt with it when you came back. Yeah. In the end, everyone found the culprit again; sorry for that. But yeah, tomorrow, if we have issues, they most probably will be consequences of this. The weekly went carefully though; we'll see.
But that should be okay. Do we have an announced security advisory? I haven't checked since 24 January. We don't. And the next major event: as far as I remember, we have SCaLE. It's in March; I don't remember the dates. Yeah, the dates... SCaLE is in the Los Angeles area, and it's mid-March, sometime around the 17th through the 19th or so. Okay. Alyssa and I and Basil will be there. And Mark will be there. Cool. Any other major event, or question about the calendar, LTS, weekly stuff? Okay, so let's proceed to the work we were able to finish in the past milestone. I'm going to go with what we have on the screen; it might be a different order than the notes, and I will update the notes accordingly. We have two closed issues. The first one was to decouple the incremental publisher from ci.jenkins.io. Two or three weeks ago, I don't remember exactly, we had an outage on ci.jenkins.io: all the builds were dated the first of January 1970, you know, the epoch. The dates and build times were corrupted, leading Jenkins to print the epoch time by default. It was caused by something in the plugins that has been solved since then. But one of the funny consequences is that that breakage broke the incremental publisher: when a request for publishing an incremental artifact was sent, the service was sending an HTTP 400 status code, as reported early on. What happened is that the incremental publisher was trying to reach a JSON file which is the source of truth for permissions: which username is able to read or write which jenkinsci GitHub organization projects. That was required to check whether the user triggering the build was allowed to publish an incremental artifact. And it was failing on the back end, so the error was 400, which is the first problem, because that means the incremental publisher application is not good: it should have said 500, because that was a server-side error from the incremental publisher.
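As an aside, the 400-versus-500 distinction discussed above can be shown with a tiny sketch. This is purely illustrative Python, not the actual incremental publisher code, and the JSON layout and names are hypothetical stand-ins for the file produced by the repository-permissions-updater job:

```python
import json

# Hypothetical excerpt of the permissions source of truth
# (the real file is generated by repository-permissions-updater).
PERMISSIONS_JSON = """
{
  "org/my-plugin": ["alice", "bob"]
}
"""

def check_publish(permissions_source, user, artifact_path):
    """Return an HTTP-like status for an incremental publish request.

    A missing or unreadable permissions source is a *server-side*
    problem (500), not a client error: the caller did nothing wrong
    if the source of truth cannot be fetched.
    """
    if permissions_source is None:
        return 500  # permissions source unreachable: server-side failure
    permissions = json.loads(permissions_source)
    allowed = permissions.get(artifact_path, [])
    return 200 if user in allowed else 403
```

The point of the sketch is only the status-code semantics: "user not allowed" is a client-side answer, while "I could not read my own permission file" must surface as a 5xx.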
But that's something I don't want to put my hands in; that's just the fact. And we realized that that JSON file came, by default, from ci.jenkins.io: the repository-permissions-updater job, on its master branch, was archiving it as an artifact. With that job archiving no artifacts due to the date/time corruption, the publisher wasn't able to work. That means that if ci.jenkins.io goes down, the incremental publisher is not able to work as expected. Most of the time that wasn't a problem, because usually ci.jenkins.io is the place from which we request incremental publication; if it's down, you don't care. However, it's way better to rely on a file generated by trusted.ci.jenkins.io and deployed on a highly available system, instead of something on a public system that could more easily be taken over. What I'm trying to say is that it's not safe to rely, for a permission source of truth, on something that we consider not safe. So that's why we have changed the incremental publisher to use the report published on reports.jenkins.io from trusted.ci: we control the supply chain here. Is there any question on this one? Second one: we had a request to mirror the Atlassian public Maven repository on repo.jenkins-ci.org. First, it was already the case. And second, it was a realization that the Blue Ocean plugin was using ancient dependencies that weren't provided and published by Atlassian themselves anymore; it was hidden somewhere. Thanks to that discovery, to the dismay of the Blue Ocean maintainers, but thanks to them for the extra effort, they were able to bump that dependency to a more decent version that should now be there. So no action expected, but a good reminder to update your dependencies from time to time, because they can disappear from the public internet. Any question on this one? Okay, and for the rest... Sorry, so the Blue Ocean maintainer was okay with the solution we chose, that we're not going to publish those outdated, non-public or incorrectly licensed packages?
The dependency has been updated to a current version of the Jira plugin. Yep, great, that's my understanding. Yeah, it's a bump to the parent POM, as I understood it. Cool. Ah, okay, so that had the effect. So we can thank the Blue Ocean maintainer, and Basil for doing the huge triage work. Cool. We had a few user-facing issues closed with no effort, because they were not related directly to the Jenkins infra: they were more general questions about people's own Jenkins installations, which is out of scope for us. Any question on these issues? Okay, so let's move on to the issues we are working hard on, following what I see on screen, which is sorted by priority; I didn't have time to sort the notes by priority. First one: replace blobxfer with azcopy. Hervé, can you give us a summary of the part you worked on for this one? I will show my part after. Is that okay? Yeah. So I've got the SAS token generation from a service principal working. I implemented some bash commands to do that in the contributors.jenkins.io pipeline to synchronize its file share. I then created a function in our shared pipeline library, using the file share service principal, allowing us to just pass the credential ID, the file share ID and the storage account ID to retrieve a signed URL including the SAS token, with a short expiry set to 10 minutes by default. That duration can be adapted if needed; it can be passed as a parameter of the shared pipeline library function. I started to look at implementing that function for jenkins.io. During that, I noticed that we're using a StorageV1 file share for jenkins.io, and also that it's a standard file share. Looking at the Azure portal billing, we noticed that most of the cost is related to transactions. So we'll use this jenkins.io opportunity to upgrade this file share to premium and StorageV2.
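To illustrate the idea of the short-lived signed URL that the shared-library function returns, here is a generic HMAC sketch in Python. This is not Azure's actual SAS signing algorithm (which has its own string-to-sign format and query parameters); it only shows the concept of an expiring, verifiable token, with the same 10-minute default mentioned above:

```python
import hashlib
import hmac
import time

def signed_url(base_url, key, ttl_seconds=600):
    """Append an expiry timestamp and an HMAC-SHA256 signature to a URL.

    Conceptual stand-in for an Azure SAS token: the holder of `key`
    mints a URL valid for `ttl_seconds` (10 minutes by default).
    """
    expiry = int(time.time()) + ttl_seconds
    payload = f"{base_url}?se={expiry}"
    sig = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}&sig={sig}"

def is_valid(url, key, now=None):
    """Verify the signature, then check the token has not expired."""
    payload, _, sig = url.rpartition("&sig=")
    expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False  # tampered, or signed with a different key
    expiry = int(payload.rpartition("se=")[2])
    return (now if now is not None else time.time()) < expiry
```

The short expiry is what makes it acceptable to hand the URL to an agent: even if it leaks, it stops working minutes later.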
This will reduce the cost of this storage account; currently it's $70 per month. We think that most of the other file shares used like that are in the same situation, and we might be able to migrate them and reduce our costs. Nice work. Just something to add, to let Mark know: we have an opportunity here, around the SAS token generation, to create a new plugin. The work we are doing here needs a few shell commands that get a credential, a service principal credential, from the controller. Right now we have no solution other than spinning up an agent: the credentials are serialized on that agent, and that agent runs a few shell commands, a script in this case. That's why we have put it in a pipeline library, so we trust and control that shell script and its code. That code generates a short-lived token with the customizable parameters Hervé mentioned, and then we have to retrieve that value back to the controller in a Groovy variable, so we can use it in subsequent shell calls. Eventually we could improve the pipeline library so that we stay inside the boundaries of the same agent. But that's still limited, because we are using an environment variable which is not known to the controller, so we don't benefit from the masking of sensitive values. It could be a great opportunity to say: we don't want the agent to be the identity responsible for generating that token; instead, we want the controller to have that responsibility. Users could then allow only the controller to generate that short-lived token, and the short-lived token would be passed to the agent just for the duration of the build, not requiring the agent to hold a long-lived identity. It's the same as the idea behind what we can do with GitHub Apps: you specify the GitHub App ID and key, and when Jenkins runs a Git operation against GitHub, it generates a short-lived token, called an installation access token, which is a one-hour token.
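For reference on the GitHub App analogy: the exchange of an App's signed JWT for a one-hour installation access token goes through one REST endpoint. A minimal sketch of the request being built (Python; the function and its shape are illustrative, only the endpoint and headers are from the GitHub REST API):

```python
def installation_token_request(app_jwt, installation_id):
    """Build the GitHub REST call that exchanges a GitHub App JWT for a
    short-lived (one hour) installation access token, which Jenkins then
    hands to the agent only for the duration of the build."""
    return (
        "POST",
        f"https://api.github.com/app/installations/{installation_id}/access_tokens",
        {
            "Authorization": f"Bearer {app_jwt}",
            "Accept": "application/vnd.github+json",
        },
    )
```

The proposed Azure SAS credentials plugin would follow the same pattern: the controller holds the long-lived secret, and agents only ever see the expiring token.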
And that token is then specified inside the agent environment as a credential. So that's why I think there is an opportunity here to create a credentials plugin, an "Azure SAS token" plugin or something, that could help users of Azure and Jenkins in that area. Is it clear, and does it make sense? It does make sense. It's something that matches the concept of the GitHub App token. Yeah, I think that's interesting. Okay, back to the work. So Hervé described his next steps in that area. Then we have jenkins.io: for each of the usual file storages, if the migration is worth it, we will create a brand new one in Terraform. Once we have the brand new storage, we also need to migrate data and services to the new one, and then update the systems so they can fill it with content, or move the content already there, and then switch. So we will have migration operations to do here, and that might slow down the whole process. Let's see, step by step, depending on the time, because that might also create outages on the services when we migrate, since we will need to recreate the CSI PersistentVolumeClaims in Kubernetes for most of these services. Is that okay for you, Hervé? Did I forget something on your part? Okay. In parallel, we have the azcopy installation: azcopy is installed, and its version tracked, on the VMs, which include pkg.origin and agent.trusted. These two virtual machines are managed by Puppet, and they are running blobxfer commands to copy to the file storage. They cannot use the pipeline library, either because these are shell scripts running as cron jobs or run remotely by something else, or because they use freestyle jobs not able to use a pipeline library. So now we have azcopy added next to blobxfer, and the work in progress here is to install the SAS token shell script from Hervé's pipeline library on the VMs pkg.origin and agent.trusted.
That's the next step for me, which means adding an updatecli manifest that will take care of getting the latest script from the pipeline library's master branch and updating the copy for these machines, which is then taken over by Puppet. So if we make a change to that script in the pipeline library, updatecli detects the change and proposes a change to Puppet, and we will have a pull request to audit and decide when we want to release it or not. Does it make sense for everyone? Yeah. I hadn't understood that when we spoke about it earlier. I was about to bring it up when we arrived at the update center helpdesk issue, but my first intention was to put the script in the update center repository and have an updatecli manifest to update it there too; I hadn't understood that the script would be present on the VMs. So I hadn't talked about the update center, and that's a good point to mention, because it could benefit from this. Absolutely, yeah. For which use case? At least the crawler, and there is another one, and for agent.trusted at least. The crawler needs it, so the update center will benefit from it, and pkg.origin too, because on every core release, and every three minutes, we run something that copies to the Azure file storage from the mirrorbrain user on that virtual machine. Yeah. I'd like to discuss it later if you don't mind. For the agent it's okay; for pkg there is no other solution. I really want to underline that there is nothing else we can do for this one, so it's mandatory; we have to do it. That's what I said: okay for pkg, yeah. Because if we don't do that on pkg.origin, we cannot remove blobxfer from there. That's really mandatory, and there is no pipeline or Jenkins involved here: it's a cron shell script running every five minutes on the machine.
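The updatecli flow described above (detect drift between the script in the shared pipeline library and the copy Puppet deploys on the VMs, then open a pull request) boils down to a content comparison. A minimal sketch, with hypothetical fetch callables standing in for the real git and Puppet sources; the real updatecli manifest expresses this declaratively:

```python
import hashlib

def sync_needed(fetch_upstream, fetch_deployed):
    """Return True when the SAS-token script on the pipeline library's
    master branch differs from the copy Puppet manages on the VMs,
    i.e. when updatecli should propose a pull request to Puppet.

    Both arguments are callables returning the file content as str.
    """
    sha = lambda text: hashlib.sha256(text.encode()).hexdigest()
    return sha(fetch_upstream()) != sha(fetch_deployed())
```

The pull request step is what keeps a human in the loop: the copy only changes after the Puppet PR is audited and merged.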
Yeah, but for the crawler, I don't want to take too much time here, but referencing the script directly in the repository that uses it seems easier to me than making something depend on the assumption that a script is imported from elsewhere. No, otherwise you will create an issue again on pkg. On agents that could be doable, like you said; it's not possible with pkg, because that would mean that if trusted.ci goes down, or if GitHub goes down, we cannot have the update center being updated. We need the most minimalistic system to keep working, and that's why updatecli with Puppet will create a copy of the script while keeping track of the changes. I think it's important, if you disagree, that we have that discussion, because this is the core of the whole infrastructure we are speaking about, so we need to solve that problem right now. You said that if GitHub isn't available... I'm not sure, but isn't the crawler requesting GitHub anyway? Yes, but I'm speaking about the sync script running on pkg; the crawler on agent.trusted is a detail. I'm speaking about the thing that synchronizes data from the pkg machine, which holds the set of plugins, cores and elements that are streamed to OSUOSL, to the archives, and to the file storage for get.jenkins.io. That's the process which runs on each core release and every three minutes. As we want pkg to be the source of truth, we need it to be autonomous. We can discuss it if you want more details on this one. I mean, we have to discuss this as a team. No, I think that if you disagree with what I said, you have to bring it up. I don't disagree; again, I need more time to discuss this with you, but I don't think this is the right time, because we are in a meeting which is already quite long, and I'm not sure we want to spend two hours discussing this detail of a detail. I know, I know it's important... I mean, it's important as in the top-level priority of our tasks.
Yes, that's why I wanted to discuss it right now. I have to look at something... okay, you need this later, okay, but that means it's of utmost importance, so we need to focus on this one as soon as possible. Okay, next step: Kubernetes 1.27. Stéphane? Yes, I did the DigitalOcean part, we did the DigitalOcean part. It was not as smooth as we thought, but the good news is that from now on we will be able to do the next upgrade as code directly. We were able to get rid of a little hack that was holding us back, because we bumped the Terraform CLI version to 1.7, I think, if I remember correctly, and that allows us to have a fully as-code setup for DigitalOcean now. The problem we bumped into, which prevented a fully as-code upgrade process, was that minor updates are applied automatically by DigitalOcean every Sunday, and the last minor one didn't go through because of a problem with the cert-manager Helm chart: a webhook timeout that was set to 30 and needed to be between 1 and 30, so 25 is good. We had to fine-tune the values of that Helm chart, and then the minor patch went through in the DigitalOcean UI, which allowed us to get the as-code bump to 1.27 working after that. So we had one issue, and we took advantage of that upgrade to clean up the code that was no longer needed for the hack. That's all, I think. So, next step... no, you also worked on getting kubectl updated. Yeah, the kubectl CLI is now in the all-in-one agent image, which is used by some agents, and for kubectl the source of truth is now the image that we are using in production, and all the projects using kubectl can now match that version with updatecli. Cool. As a reminder, it's used to run the kubectl exec mirrorbits scan on the new update center work when needed. And not only that, if I remember correctly; is there another one you forgot? No, that's the only one. Yep.
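For reference, the 1-to-30-second bound mentioned above comes from Kubernetes admission webhooks, whose `timeoutSeconds` must be between 1 and 30. A sketch of the corresponding cert-manager Helm value (the key name is from memory and worth checking against the chart's values.yaml):

```yaml
# cert-manager Helm values: keep the admission webhook timeout
# safely within Kubernetes' allowed 1-30 second range.
webhook:
  timeoutSeconds: 25
```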
Sorry, Hervé, any news? Oh sorry, sorry, my mind is rushing. Do you think we will be able to start the upgrade of EKS this week? I don't think so. Next step: the EKS upgrade is delayed to March. Is that okay? Yes, I think it's okay with the delays. Okay, yep, absolutely. Nothing else to wrap up on Kubernetes 1.27? Any objection, question, things unclear? Nope. Okay. No news on the Jira update from the Linux Foundation, as far as I can tell; let me just check... okay, no news, so nothing to say on this one. Next: updating ci.jenkins.io to another cloud. I don't remember if we were able to run an action on this one or if we still have to. I tried, yeah, I started to. I've used the same script content, and I put it in the pull request where we are testing the VM image generation for this one. Yes, we continue. I've copied the script file from the shared pipeline library, and I intend to update it with updatecli with a dedicated manifest. Okay. I have to create the service principal credential on trusted.ci; last time I tried I got an error, so I have to try again. When this test is done, I'll update the main pull request, so we'll be able to ask the Jenkins security team to take a look at it, then merge it, deploy it to production and try it on production. Okay, cool, nice work. I have something to add on the next steps: migrate to premium storage. Sorry, but we must for this one; it will clearly be the case, because the transaction costs will be terrible with that same every-three-minutes pattern as get.jenkins.io. Yep, transaction costs; I'm adding it. I don't know if you will have time, of course, but that's a task you can absolutely do without any risk for production, so don't hesitate if you need an extra task next week; you can spend some time on this one as well. Any question, objection? Okay. The next top-priority task was infra.ci on arm64. Stéphane? That's on me. I am still on, I think, the last
Docker image that we use as an agent, which is not the all-in-one; it's the "builder" one. That image is used on both infra.ci and ci.jenkins.io. My problem with it for now is that the builder uses a version of Ruby which is not the one we have in the all-in-one, and on the all-in-one we need an old version of Ruby for the Puppet builds we are running. So I had to find a way to install both versions and have a kind of switch to use them. Yes, 3.2.x for now, and then I will work on an updatecli manifest to track the new version and match it with the Windows version. And we already had a Ruby 3.something on the Arm version, because we use Vagrant, which installs the 3.0 version. So yes, work in progress. Cool, then the next step will be... I'm not sure all the tools... I'm doing that step by step, in fact: I'm running the builds which use builder and trying to run them with the new all-in-one, checking all the tools that may or may not need to be installed or changed. So, step by step. Just a note: since we made an effort in the past months to merge the pipelines between ci and infra.ci, some projects have two pipelines and some have already been merged, so this will create a temporary discrepancy between ci.jenkins.io builds and infra.ci: we won't have exactly the same environment on both during a short period of time. Is that okay for both of you? Yes. Is that clear? Yes. Are we okay? Cool. There will be different roads to fix that; I don't have a strong opinion, because most of the time it will be case by case. One of the solutions would be to start defining the all-in-one arm64 templates, but that means eventually having to check how we can run arm64 workloads on ci.jenkins.io. Understood. Yes, I don't think we will get to that; it will be something to think about for March, so we take note and we will resume that process in March. Is that okay? Yes, I will not think about that during my holidays, but yes. And until you go on your
holidays, do you think you will be able to continue working on this? I plan to; that's why I put it in priority. Okay. The new private Kubernetes cluster in the new sponsored Azure subscription: I propose that we delay that one to March. The idea is to have these arm64 agents run on a cluster only for infra.ci, on the new subscription. I propose we delay it to March because we don't have a credits problem this month with Azure. Is that okay for everyone? Okay. A word on the Golang part, Stéphane? Yes, I think this one you can move, because I think it's merged. Hervé, can I ask you to double-check? Since it has been merged, is there something else to run for that issue, or can we call it closed? Because I remember both of you discussed it before the issue was created by Stéphane, and the question was whether it was already up to date. Okay, so it looks like the pull request from Stéphane would have avoided the problem; now I have to look at it. Okay, so I propose that we close it once that's done, unless you think you will have time to review it before closing it, Hervé; what do you think? Okay, so you are responsible for closing it, is that okay? Yes. Okay: closed, to be verified by Hervé; I assign it to you and it will be kept on the current milestone. Then: use separated pipelines and organization scanning for updatecli on infra.ci.jenkins.io. We had an unfortunate issue: the GitHub checks are mixed between the distinct updatecli jobs and the main jobs, which could lead to problems. We had a discussion as a team, privately, on Monday. One of the ideas we discussed is that we might want to use our Job DSL templating to create a custom GitHub check, a bit like the updatecli one; the name would be static, "infra.ci.jenkins.io" if I remember correctly, and the goal would then be to change the branch checks, so the required check allowing a pull request to be merged wouldn't be "continuous-integration", which can be
mistaken, but the new one. Yeah, that one must be done this week, because we had additional issues with Stéphane yesterday and today. Hervé, is that okay if we take care of at least that part, the new check definition for the existing jobs, and then you see if you have a subsequent action, continuing to separate the jobs, next week? What do you think? I'm not actually following you; I don't follow what you're asking me. Are you okay with us taking care of the new definition for the checks? I've understood that part; I don't understand what you are expecting from me after that. Do you want to work on that task then? So first: do you agree that we do this, since you are busy this week? Yes. Only the fixing-that-issue part, is that okay? Yes. Because we initially discussed it and you said you would take care of it, but you have a lot on your plate and you will be alone next week, so the question is: is that okay? And the secondary question, since you just answered yes: do you think you can continue working on separating the jobs next week, for the next repositories like azure-net, or do you think we would delay that too much? I can't say. We can delay it; for me it's a long-term issue on the flow, for when we have some time, so I don't know if we should keep it in the milestone. Will you have time to work on this one? I can't answer yet; I think my other tasks will be done first, but I'm not sure, so I can't really say if I will have time to work on this long-term issue. Okay, so we delay it to March. It's because we need to plan ahead for one week: if you are not able to plan ahead, you say no and you take fewer tasks; that's the goal of this meeting. And even if it's in the backlog, you can bring it back if you see that you have time. That's all; that's why I ask the question. So your answer is no? I hoped you were working on it this week, so I don't know.
Oh, okay, okay. You first said, "are you okay if we look at it," then you asked me, "do you have time to look at it" — not the splitting — so it wasn't clear for me; it's not the same job. Okay, so we delay to March, we do the fix tomorrow, and we'll discuss this privately, because there is starting to be too much misunderstanding in how we work as a team.

Updatecli — what is it? It looks like we need to close the meeting soon; it's getting too long to stay concentrated. So let's run the fix ASAP and then delay the rest to March. Okay. Since this is taking some time: do you see other issues in that list that we shouldn't move to March because we don't have time — something we should take care of at all costs? I don't think so. Uplink, maybe, but I'm not sure. Oh yeah, true — we had an issue with Uplink this morning; we had an outage. That's a team thing: knowledge about that outage must be shared, because if it goes down, you must have a bit of knowledge next week. So: outage today due to the database being re-indexed, fixed by setting replicas to one to avoid concurrent database operations. The problem with that application is that there are no shared locks, so sometimes during a restart it can try to apply migration jobs concurrently, and that creates locks. We have to add a startup probe; that part can be done in the Helm chart. That's something that could be done either by us this week or by you next week, Hervé, but it should be taken care of to avoid other outages. Yeah, at least we need to create an issue; taking care of it takes time. The other thing is that it is still working on cleaning up the corrupted records. The reason it was corrupted, per the PostgreSQL documentation, is related to a hardware issue, so there might have been something in December that happened that Microsoft didn't communicate to us. Early this morning we thought that it could have been because of the
replicas, but in the end that should not corrupt the data. Worst case we would have lost data with a server-side error, but not corrupted data in the database itself. Yes, that would just have put too much pressure on it, but no corruption. Exactly — the operation I'm doing shows that we can go further and further into the data set we want to retrieve. It's not top priority, so there is no need to continue that part; Daniel does not need this data anymore, so if we have other priorities, we don't work on it. I believe that's the part you wanted to mention, Hervé. Yep, thanks for reminding us of this one; it's just that the outage part was important to bring up here today.

Another one: the delay building small plugins. That should be solved by the fact that DigitalOcean is now in Terraform 1.27. We have to check that; we have to re-enable DigitalOcean, and also enable the agent on ci.jenkins.io to check if it is still slow. You will decide whether you have time to deal with this one or not; I will add it to the backlog by default. Is that okay? By default, okay. The others are minor tasks, so I will move them: delay, delay, delay. docs.jenkins.io — is that okay for you to delay too? Yeah, this one I delayed, since I have more important things to do. I might take some time to ensure that when we are querying get.jenkins.io and no mirrors are returned, it doesn't update the report. Yes. For this one I thought we would find another page to get the information, one that does not depend on it. I can do that for right now; I think both are needed. True — I just don't update the report if there is only one result. Yeah, but even if there is only one, for the sake of clarity we should use get.jenkins.io as the source of truth. Sure. The thing is, we don't have a way to automatically query the latest WAR; that's why we are using updates.jenkins.io to redirect to the latest WAR. Absolutely, but you
should not query the latest WAR. If you have to spend time on that, the goal is to find a source of truth. Yeah, I think it's my fault, because I'm the one who chose the wrong source of truth. No, no, it was there already. Maybe, but Damien is right: we shouldn't request the latest WAR, because "latest" is a pain. There is a problem for the weekly — every Tuesday that will be a problem, and you need to patch it, which is silly. If we find a page listing the nodes of the mirrors, it's done; it's simply correct. It will be simpler, it will be better. Go for it — that's the idea, and I believe we all agree. Is there something else to discuss on this one? Okay, the rest is okay for me; we delay the others to March. I just wanted to check whether we had new issues, because I remember we had new ones. At least we have to answer the person on aryan. I believe you already did the last one, but you are overworked, so is that okay if we take care of this one, or do you want to continue on that area? No, you should take it. Okay. Do you see new issues? The EKS cluster — that one will be moved to March, so I don't include it today; the goal is to start consuming credits from the current subscription. There is nothing left in triage? Yep. Something else? No. Okay, then I'm going to stop sharing my screen, and we'll work on the milestone. See you in two weeks, for the people attending the meeting. I'm stopping the recording. Bye bye.
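[Editor's note] The mirror-report guard discussed above — do not refresh the report when only a single mirror answers, so a Tuesday weekly release does not silently overwrite a good report — could be sketched as below. This is a minimal illustration under stated assumptions, not the team's actual tooling: the function name `should_update_report` and the mirror hostnames are hypothetical.

```python
# Hypothetical sketch of the guard discussed in the meeting: only update the
# mirror report when at least two mirrors are serving the artifact, so a
# single straggler result never replaces a previously complete report.

def should_update_report(healthy_mirrors):
    """Return True only when two or more mirrors responded successfully."""
    return len(healthy_mirrors) >= 2

# Illustrative poll results (hostnames are made up, not real Jenkins mirrors).
mirrors = {
    "mirror-a.example.org": True,   # served the expected file
    "mirror-b.example.org": False,  # still syncing the weekly WAR
    "mirror-c.example.org": True,
}

healthy = [name for name, ok in mirrors.items() if ok]
print(should_update_report(healthy))  # two healthy mirrors -> True
```

With only one healthy mirror the function returns False and the previous report is kept as-is, which matches the "don't update the report if there is only one result" rule mentioned in the discussion.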