Okay, welcome. This was a BoF session; I got shifted around a couple of times, which is why we're not in the BoF room, but I tried to set up something that we can properly discuss. I prepared a wiki page for this BoF, "autopkgtest gating migration", and I do hope that some people are joining in on the IRC stream, especially Niels from the release team and maybe later Steve Langasek, who offered to help actually get some of this work done. Antonio, who runs the debci service, has the other mic.

I started working on this about last December. I said: well, I do want to have this in Debian and nobody's doing the work, so I'm going to do the work. In this session I'd like to briefly explain what I want, what the current status is, and a little bit of the technical background — it's not just a simple fix in one place. We had a discussion on debian-devel in January; I think there were some concerns, so I'd like to discuss those here in the open as well. There are some possible choices on how we're going to do this, and I added some links, one of them being an example of Ubuntu filing bugs where a test failure was blocking their migration.

In general, my idea is that I want something similar to what Ubuntu is doing: that we run the tests in testing, instead of the current setup where we test unstable. Currently, for all the autopkgtests — I hope everybody here knows what that is, but: a package can define a set of tests that are run on a central server after it's built, in an environment where it's going to be used — what we currently do is run these tests, and that's it. Nothing happens with this information except maintainers looking at it. But what Ubuntu does is take a package that is in what we call unstable and test it in testing, such that you can see whether, when the package migrates, it would break testing — not only the package itself, but also its reverse dependencies.
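For context, a package declares its tests in `debian/tests/control` (the DEP-8 format). A minimal, hypothetical example — the test name and extra dependency are made up for illustration — looks like:

```
Tests: smoke
Depends: @, python3
Restrictions: allow-stderr
```

Here `@` expands to the binary packages built from this source, so the executable `debian/tests/smoke` is run by the CI in an environment with the package installed as a user would have it; `allow-stderr` keeps stray stderr output from being counted as a failure.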
And that's the important thing: that you can actually test whether it breaks other things, which an uploader would typically not test, especially with a big reverse-dependency chain. So my goal would be that we end up doing the same thing in Debian, and that we block migration if there's a regression in an autopkgtest.

— Grab a microphone. — Is it on? Check, check. Okay. So a weakness in that is that the tests are run on, say, the leaf package, but then the tests fail, and — what I've noticed just during DebCamp — for the package that fails, it's actually not easy to pick out which package caused that. It actually wasn't one of the trigger packages; it was another package that was brought in that had been updated, moved from 1.2 to 1.10, and that caused the tests to fail in the package that was reported.

— At least in the Ubuntu instance of britney's proposed-migration, under each package you get a set of results which show you which package triggered what and why, and which package got triggered, and each test's results are separate. And in addition, we have done a lot of work to make sure that we do not test broken packages against broken packages: if you're trying to migrate a particular package, it triggers the tests of all the reverse dependencies that are currently in testing, not in unstable, such that you're testing against things that have not regressed yet. So really the idea is to take the current testing, which by and large must work, and if adding one package breaks it, then it's that package or its dependencies — you can have further dependencies too — but if you would add that package to testing, it would break testing.

— Mic, please. — Just as an introduction: I'm sort of maintaining this stuff in Ubuntu now, since Martin left the company, so I have a bit of insight into how we're doing this. What you're saying is, you try and trigger
for each upload, right? But we still have this problem in Ubuntu: we haven't managed to get down to the case where, for each test result, only one package has changed since the last test. You might trigger for a dependency, but then maybe a dependency of the dependency has changed and that's the thing breaking it — or maybe we don't trigger for build dependencies, or recommends, or whatever, somewhere down the stack. So ideally you probably do want to get to a place where every single change triggers a test result, but if you think about how many tests that would trigger, it's an extremely large number. We're not there in Ubuntu yet; we're trying to get as close as we can, but we don't have the capacity, and honestly it's a lot of analysis to do for every upload. Think about uploading glibc — it's just crazy.

— Yeah, so even just the direct depends is bad enough, but then there's everything else transitively down the stack. So what the release team has said until now is: what we will do in Debian once this works is change the age of a package — the time it takes to migrate from unstable to testing. If your package has an autopkgtest and it passes all the regression tests, the age will be reduced. That's what they promised so far.

— So in this specific case, the package itself passes all its own tests, so it would have its age reduced and go in earlier — but the package that is picking up the error… — Yes. Because we can't run enough tests, we fail to pick up that the problem is actually in a dependency. What you sometimes need to do is look up which version was run the last time and track back to what was actually installed. — Yeah, but that's a problem today already.
So yeah, we're just not going to solve everything. My response to that is: however you look at it, the quality of testing will at least improve. One of the items I put on the list is indeed the social concern of blaming a package when it's really the dependency. — I'm not sure if this mic works. Is it okay? — So that is a concern, and it is valid. However, it's quite a minor one, and it's a problem I would like to have. The majority of what I see now is this: a new upstream release is uploaded into Debian; that package alone fails its own autopkgtests; it migrates to testing anyway; Ubuntu syncs it and it gets stuck in proposed forever; and furthermore it starts breaking all its reverse dependencies, because it itself is just broken. At least catching that class of bugs will help — your case is real, but it's actually a much smaller subset than "the new upload is just broken".

— Clearly, in practice, the uploaded package needs to have its autopkgtests run quickly enough that the results are there before it migrates. That's critical.

— So one way you could do this in Debian, I suppose, is to have a failing test file a bug against the package, and then in the case where the maintainer decides… — Yeah, this is really about how we're going to handle it. That's a separate thing from blocking the package in the migration software itself, right? Then it's annoying for the maintainer… — No. I think, to start with, what Niels promised the project in his email is indeed that we start with the aging: we implement the stuff, and we start with the aging. — And are you going to increase the age if things are regressing?
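The aging scheme under discussion — a reduced age on passing tests, possibly a raised age on regression — might look like this; the concrete day counts are assumptions for illustration, not release-team policy:

```python
DEFAULT_AGE = 5  # assumed default migration delay in days


def migration_age(test_verdicts):
    """Sketch of the proposed aging policy for one migration candidate.

    test_verdicts: verdict strings for all autopkgtests the candidate
    triggered (its own and its reverse dependencies').
    """
    if "REGRESSION" in test_verdicts:
        return DEFAULT_AGE * 2        # hypothetical penalty: wait longer
    if test_verdicts and all(v == "PASS" for v in test_verdicts):
        return 2                      # hypothetical bonus: migrate sooner
    return DEFAULT_AGE                # no tests, or only neutral results
```

The point of the penalty branch is exactly what is asked above: a regression doesn't have to hard-block migration from day one, it can first just slow the package down while people investigate.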
I do hope so, but I haven't heard that statement from the release team yet. But I do request it, yes.

Another thing that the security team, at least in Ubuntu, likes a lot is running autopkgtests in stable releases for the proposed updates, such that stable updates and security uploads do not regress stable releases. It's in the miscellaneous ideas on my wiki page. It's not there yet because we don't have the container images to actually do this for the security team. What will happen is that the debci support is going to be generic enough that it doesn't care whether it's britney requesting something against testing or the security team requesting something against stable. So the plan is that that will be supported. — Yeah, I think currently the security team doesn't have a sort of proposed area; they just have the security archive, so that needs to be fixed first. — If I get this running and this project that's now on the table works reasonably, I hope I can take on the next challenge, the PPA bikeshed thing; that could solve a couple of these things as well, and then this could be used there too.

So indeed, one of the choices that also came up in the discussion at the beginning of the year: in principle, migration is a matter for britney, which is the release team's implementation that decides whether a package can migrate. This is where Ubuntu implemented this logic — they just say it doesn't migrate — and for the Ubuntu maintainers there's a possibility, I guess, to override and tweak a bit. It probably doesn't scale in Debian to put all of this on the release team if things need to be overridden often; at least that was, I guess, a fair concern. Another possibility that was raised is to interact with the BTS and file bugs instead. But what I think we really need to do in that case is not only inform the maintainer of the package that is not going to migrate, but also inform the
maintainer of the reverse dependencies that trigger the block, especially. Because the maintainer knows what he changed in the upload, and the maintainer of the reverse dependency knows what its tests cover, and I really think they should figure out together why it's not migrating and what fails, instead of just the maintainer of the package that wants to migrate. I guess that is really a social thing, so we have to align on that and get enough traction that it's feasible.

— In practice we find that the two people argue between themselves and say "no, it's your problem", and then a third person comes in and fixes both packages. Sometimes by means of removal. — Right, and in the end, for migration, it's the release team that is there to arbitrate.

Obviously, for me as the one now trying to implement this, it would be way easier to implement it in britney, because basically Ubuntu already did the work. That's not an extremely good argument against doing it via the BTS, but for me it is an argument. Implementing stuff in britney, and supporting hints in britney, is useful for the release team if the release team has buy-in into all of these autopkgtests and they embrace it and like it. — The Ubuntu release team really likes it; the comment during the release, when they were asked what they'd do right after the release, was that they'd go and bug people about their test regressions.
Yeah, and in Ubuntu, the release team — at least the 8–10 people who have the access rights to commit hints for britney — have hints specific to the autopkgtests: ignore all tests triggered by this package, or ignore a specific chain of packages on specific architectures or any architecture, for a specific version, or for any version — the normal set of hints with which you can override a test. Because ideally you want to say: yes, this has regressed in testing, or this has regressed in stable, and we know about it, there's a bug filed about it — and then you just override it with a britney hint, and the rest of the tests are still run and still monitored so things don't regress further. So bugs are useful for people and package maintainers to bounce things between themselves, but hints are useful for the release team to do release-y things.

— Yeah, but in that sense you then have, as the Dutch expression goes, two captains on the same ship: if you file RC bugs for autopkgtest failures and you also implement the blocking in britney, I think you're doing it sort of twice. — You can also kind of split the problem, right?
There's the half where you request the tests from the CI infrastructure, which is one part, and then there's the part where you collect the results back and do something with them to implement the blocking of the packages. — Yeah, I skipped that part because we'd immediately be going into the weeds. — Ubuntu has obviously done both of those halves in britney, but it would in theory be possible to split that: have britney request the tests, and then, when it collects the results, file bugs instead of feeding them back into britney.

— So my idea on that is: if we as a project really want to go via the BTS, I think it makes most sense to not implement the blocking in britney at all — just rely on the BTS — but still have britney request the tests from somewhere in the first place, since it does make sense for that. If we go that route, I will create a mechanism to properly file these bugs in the BTS against the package that tries to migrate; then the BTS can handle it: if it's RC it blocks, and it can get downgraded.

— Mic. — The question was whether these would be RC by default. So: it's not an RC bug if your autopkgtests fail. What I consider — and Ubuntu considers — an RC bug is a regression in the autopkgtests. If it passed in the past and now it fails, then there's something that needs to be fixed. — But that's covered by the normal testing-migration rules: if your bug affects both the version in testing and in unstable, it can still be RC and yet not block the package. — So it works perfectly in our case: no, it will only affect the version in unstable and not the one in testing, because — well, in the case of a problem — then, from my point of view,
it's fine. What I want to do now is improve the situation — or rather, make sure that the situation in testing doesn't regress. If somebody wants to do something about already-failing autopkgtests and start filing bugs for those, fine — anybody can implement that. What I'm working on is preventing regressions in testing. So the version the RC bug would be on is the one in unstable, not the one in testing, because there everything was fine — it's a regression, not something that always failed.

— Regarding filing RC bugs, I would be careful to not file them right when the test fails, but only after a few days, while the package is stuck in migration. Because sometimes there are flaky tests, and they could cause a lot of bugs to be filed in the BTS. So the release team could delay the migration first. Okay, let me make that a little clearer: when an autopkgtest fails in a package or in its reverse-dependency chain, it gets blocked in Ubuntu — it doesn't migrate from proposed to the release pocket. If we do this semi-automatically — block the migration for a few days and wait for it to be cleared up by maintainers without any bugs filed — we would file a lot fewer bugs against innocent maintainers. — Okay, so the proposal would be to raise the age first, yes, and after four days or so… — But then how do you inform the people that need to fix anything? In Ubuntu we have a nice page showing the queue. — In Debian, my way of communicating with the maintainer is the BTS. So what I could do is file a bug saying "I'm going to raise the age in three days" — but I'd still have to file a bug. — We often see bugs that fail the first time but pass the second, because of network issues. Those are flaky tests, and the transient failures can cause a lot of bugs to be filed, and everyone will be really, really happy about that.
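The notion of "regression" used throughout — a failure only counts against a migration candidate if the same test passes against plain testing — can be made precise with a small classifier; the verdict names here follow the style Ubuntu's tooling uses, as an assumption:

```python
def classify(result_in_testing, result_with_trigger):
    """Compare a baseline run against plain testing with the run that
    adds the migration candidate from unstable.  Results are "pass"
    or "fail"."""
    if result_with_trigger == "pass":
        return "PASS"
    if result_in_testing == "pass":
        return "REGRESSION"      # worked before, broken by the candidate
    return "ALWAYSFAILED"        # broken either way: not the candidate's fault
```

Only `REGRESSION` would block (or age-penalize) a migration; an `ALWAYSFAILED` test is the existing situation in testing and, per the discussion, is not treated as RC.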
— The point about transient failures is important. For example, once we did the perl transition: the new perl got uploaded, and the first thousand test runs all failed because the packages could not be installed — there was uninstallability in the unstable suite, effectively. Then, once we rebuilt all the packages, they became installable again. But obviously you then had to rerun the tests with everything from unstable rather than just perl from unstable, so the first few runs gave us about a thousand failures, all transient and known, and we needed to rerun the tests in a specific way. — Yeah. — So if you file six thousand bugs, I think people will complain — especially when you know it's a single class of transient failures rather than real failures.

— But again, I don't see this as a hard problem; it's solved for the archive rebuilds. The way it works there is: when you do a full rebuild, you get a list of everything that fails, and you first try to identify big classes of new failures — if there are 500 new failures related to perl, you just don't file those bugs. — But this is going to run automatically. — Yeah, I'm not going to trigger it manually. — I don't think we should ever file bugs in a fully automatic way. Probably never. It doesn't cost that much to have a human in the loop at some point; you can have a very much semi-automated process, and that works really well. — But then how am I going to inform the right people? Because if I don't block in britney and I don't file the RC bug, it's just going to migrate. — It's not a review that you do when you have time; it's something that's running all the time. It's not possible to have someone check every hour — it's going to run every hour and try to do this. — So let me make an analogy: today we already have migration blocking due to piuparts regressions.
There's not necessarily a bug in the BTS for those; the maintainer needs to care about their packages migrating to testing, and they have to go to the tracker and check why a package isn't migrating. So this could be the same thing. — That's sort of also what I tried to get out of this: from the perspective of implementing it in one place, it's britney that decides what migrates or not, and there's a good argument for saying, okay, this is in the hands of the release team in that sense. I think the piuparts maintainers are doing a great job, and they were also filing bugs — but the piuparts result in itself is already blocking. Yes, and you have to find that out some way, and I'm not fully comfortable with how that currently goes. If the person filing bugs is on holiday for three weeks, you're not getting these bugs, your package is still blocked, and you're not notified.

— Well, in the end what we want is good stable releases. If for some time no bugs get filed, it's not such a big deal, as long as nothing bad can get in; we can fix things later on. — Yeah, okay. So, in the part of the ecosystem that Debian sits in — I care most for Debian, but I don't only care for that. I'm hardly doing anything in Ubuntu, but that's a derivative that does this differently than we do, and they take our packages, so it influences them as well.

— Well, this has gone on long now: I'm not sure what the problem is with filing bugs, as long as we don't file them immediately. Not filing them immediately seems sane, to avoid flaky tests and the like. But after that, what is the problem with filing the bug, once you know the test is actually failing? We've avoided the flaky tests — what's the issue with filing a bug then?
— Transient, yeah: a test that fails the first time you run it, but then succeeds when you run it again. — But then how do you know whether to run it again? — I suppose that's the problem, right; that was the question I had: how do you rerun the test? Somebody notices, and then people trigger the test to run again — so it's manual. — Get the mic, please.

— So, I've been rerunning tests to make my packages migrate, because I upload systemd quite often in Ubuntu and it triggers a few tests. When I see regressions, and when I retry them and see that they pass on retry, I file bugs with a vengeance: "this is a flaky test which is not consistently passing — it passed three runs out of ten — this is not acceptable, it's blocking my work, please fix this." So, you know: retry the tests a few times, like three or four times, and if it still fails after a week, then file the automatic bug. I think that should be fine — if your test cannot pass five times in a row, or at least once out of five runs, something is bad.

— It's the same way we set it up for the archive rebuilds: when a build failed, we retried it immediately, just to avoid all the random failures. — It can be problematic to retry immediately, because for example a mirror didn't sync and you get failures downloading a package twice in a row; giving it some time can be useful. — Yeah, but given there's a human in the loop, you see it when you look at the whole list of failures, and you don't file bugs for those. All of this has been solved; we're not reinventing something new here. — I mean, we kind of are, though, because we're talking about a more automated process than what you do for archive rebuilds, right?
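The retry policy suggested here — rerun a failure a few times, and treat a pass-on-retry as a flaky test rather than as a regression — could be sketched as follows; the names and run counts are illustrative, not from any existing tool:

```python
def verdict_with_retries(run_test, max_runs=4):
    """run_test() performs one test run and returns "pass" or "fail".

    Rerun failures before passing judgment, so transient problems
    (network hiccups, unsynced mirrors) don't produce bogus bug reports.
    """
    if run_test() == "pass":
        return "pass"
    for _ in range(max_runs - 1):
        if run_test() == "pass":
            return "flaky"   # candidate for a bug against the test itself
    return "fail"            # consistent failure: worth blocking on
```

A real scheduler would also space the retries out in time, as noted above, so that a mirror that hasn't synced yet doesn't fail the same way twice in a row.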
We're not saying "let's run tests and have somebody look at the list and decide which ones are bad and then file bugs" — that's not a reasonable task to ask somebody to take on all of the time, if you want this as a systematic thing standing in the way of migration to testing. — I would prefer filing bugs only about issues the maintainer can act upon. For example, if there's a failing mirror and you get a hundred bugs for random packages, it helps no one and maintainers get upset. — I think you can probably build in safety valves and things like that: if something's crashing down around you and you see ten thousand failures at once, then just don't file those and take a step back for a bit. There are things you can do. — Right, yeah: if the archive goes down, you'll see every build failing, and likewise every autopkgtest. — Or, you know, if perl makes all of its modules uninstallable, you'd suddenly see two thousand failures at once, and you might be able to express: perhaps this is something to look at, and not do the automatic stuff on. — But doesn't britney already check for installability before anything else? So maybe it just won't request tests if it knows the perl modules aren't installable. — That's only based on the archive metadata; if there's a mistake and it determines that wrongly, right…

— Can I ask a question about ci.debian.net, actually? I believe you have capacity problems there — is that true? — We had that in the past. — And I also believe you're only running tests inside containers and not inside full VMs, so you can't run isolation-machine tests. Maybe there's a status update? — Yeah, I'm working on that too. We have a QEMU backend now.
So I'm ironing that out, and I want to switch to it, at least for the packages that say they need isolation-machine. — Do you have enough machine capacity for, say, every perl upload triggering 5,000 tests and getting through those in a reasonable time? — The container tests run on Amazon, and we basically have infinite resources there, so that's fine. For VMs it doesn't work, because Amazon doesn't support nested KVM. — Last time I checked, it didn't. In Ubuntu we don't use nested KVM; we talk to our OpenStack API to launch the VMs, and the tests run on those. — We don't have an OpenStack. — But can't you spin up an Amazon VM, use the SSH runner to run the tests on it, and then shut down and discard the VM — so you use it as a bare-metal host? That's one of the ideas. — Though some things do need nested KVM; for some tests, like systemd's, some stuff just won't work — but I guess you can deal with that. — Yeah, some stuff, as was pointed out. I don't really want to be tied to the Amazon infrastructure, or to anyone else; I just want to use them as one of the providers of machines. — So maybe you can look at how the SSH setup script for nova works in autopkgtest and tweak that for Amazon.

— I know I'm late, but concerning the VMs: we did something similar where I work — creating virtual machines, deploying, and destroying them — and you could use that for builders. We already did that with GitHub, a full pipeline starting with the VM creation; using a preseed you can put in everything you need to build or to run a test, and at the end you just remove the VM when you're done. And you can redo that for any technology, KVM or Xen. — Yeah, sure, there are ways to fix it. The point is that I've started working on it and have plans for the future. — I had the impression that you were really struggling for
hardware, and I was wondering what the status is — so that was good to hear. There was also the question of which architectures. Yes: we have Intel now, and I have access to a ppc64el OpenStack provided by the PowerPC people; we had arm64, but that didn't work out because it wasn't really production-ready hardware, so there were lots of issues — but we should have ppc64el in the near future.

— I'm asking Niels whether he'd like to raise the age too, and he's happy to do that — so on a regression we'll at least have the age raised as well. — Yeah. My aim was actually to have this really early in the buster release cycle, so that we could start experimenting early. Sorry, it didn't happen — I didn't have the time — but I'm picking it up right now, and I do hope we don't end up getting this only just before the buster release.

— Can I ask one implementation kind of question? As I'm maintaining this on the Ubuntu side, I'm wondering whether you're thinking of borrowing our cloud-running stuff that we use there, or not. — I'm not considering any change to the infrastructure, because that's not mine. — I mean the part about picking up uploads, picking up requests, and dispatching them — you know, we've been discussing the interface. — Right, so there's going to be an interface for debci to receive test requests, and that's what britney is going to use. And the release team works with britney slightly differently than in Ubuntu, so they really have two different runners.
It's britney1 that does all the interaction, and britney2 doesn't talk to the world. — We have a system with a box somewhere in the cloud which receives requests: britney puts an AMQP request into a queue, and the box picks the requests up and dispatches them, that kind of thing. — Yeah; here britney is going to talk to a debci HTTP API. — Okay, I just wondered whether you're planning to use the same thing we developed, and if so, whether you'd like me to help you with that. — Where is it? — Oh, it's on Launchpad, in git, under the Ubuntu release team; it's called autopkgtest-cloud. It's basically just: connect to these AMQP queues, receive the requests, and call autopkgtest with the right parameters from a config file. — Because during the QA BoF we were discussing working on a common scheduler for QA tests. — Okay, that could be relevant.

— One thing to mention: the lesson we learned there is that it's quite useful to have multiple queues of requests. If you have something like perl, which triggers 5,000 tests, you don't want every other small upload to have to go to the back of that queue. So we have a threshold — I think it's 20 or so — and if you trigger more than 20 tests, you go into a different queue, the "huge" queue as we call it; everything else stays in the normal queue. And then you round-robin between these queues, so that you don't get denial of service. — debci already supports priorities, so we can give a lower priority to those. — Yes, but we round-robin between them so that you also don't get starvation of the huge queue by the normal queue; things keep flowing through both of them.

Okay, I think we have two minutes left.
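The multi-queue dispatch just described — big batches go to a separate "huge" queue, and the dispatcher round-robins so neither queue starves the other — can be sketched as follows; the threshold matches the one mentioned above, everything else is illustrative:

```python
from collections import deque

HUGE_THRESHOLD = 20  # batches triggering more tests than this go to "huge"


def dispatch_order(batches):
    """batches: list of (trigger, [test requests]).

    Requests from large batches (e.g. a perl upload triggering thousands
    of tests) are queued separately, and dispatching alternates between
    the normal and huge queues so small uploads aren't stuck behind them.
    """
    normal, huge = deque(), deque()
    for _trigger, requests in batches:
        (huge if len(requests) > HUGE_THRESHOLD else normal).extend(requests)
    order = []
    while normal or huge:
        for queue in (normal, huge):   # round-robin: one from each per turn
            if queue:
                order.append(queue.popleft())
    return order
```

With plain priorities instead of round-robin, the huge queue could itself starve, which is the point made in the exchange above.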
So, to wrap up, what I do want to mention: the current state is that I had a version one of the changes required to autopkgtest, and I now have a version two, which is a different implementation that is more generic, also for other derivatives, because it doesn't need hardcoding of Debian or Ubuntu or Kali — that change is pending. You don't want to keep adding all the derivatives to that file; it just doesn't make sense. Antonio and I have discussed what needs to be changed in debci; I think that's rather limited. So the big next thing is to work on what Debian would need in britney. I've had access for a long time already, and probably Niels will also drop in to help me there; Steve is willing to help as well. So that should be the next challenge.

— So can we agree on how the back end links up with you? — I'll first make sure that britney is aware of the results, and then I guess the release team will activate the aging, such that at least the age is reduced when everything's fine and increased when it's not. Obviously I'll definitely not start automatic bug filing — definitely not in the beginning, and from what I'm hearing, I'm not sure automatic filing is a great idea at all. — Don't do that, please. — No; for bug filing, the way you should do it is start with manual filing and then slowly grow into automation, similar to how piuparts is doing it right now.

— Okay — we're meeting with Balint during lunch; I run the archive rebuilds, so if anyone else is interested, just to get a grasp of how much automation there is, feel free to join.

— Well, I do think we have to improve the communication towards the maintainers when there's an issue, especially with a reverse dependency failing. We need to communicate that better, because just waiting for maintainers to pick it up is not a good idea. — I think one intermediate step would be to send email to the maintainer without filing a bug — but still not in the first stages, right?
First you need an idea of how many emails you would be sending; but once you know you'd be sending a not-terrible amount, then it could be emails instead of bugs. — What does that really change, other than that it's not archived? — Yeah: you don't have to go and close it. If it was a transient error, there's no bug to go and close. — I think if we ever go that route: emails. — I don't know; for cinnamon I get emails when CI fails — I get emails from jenkins. — That has nothing to do with autopkgtests. — No, I know, but I'm saying: it sends emails when things fail, and then it turns out it was a transient error, and I get an email saying "oh, now it's passing again" — I didn't do anything, and I don't care, as long…

— At least in Ubuntu we've started sending emails saying "your package has not migrated for so many days", and that email keeps repeating. — I never got one. — In Ubuntu, not in Debian, right — and it goes to the uploader if it's a direct upload rather than a sync from Debian. — Yeah, I would love to get those, by the way; can you send them to me? — Yeah. We could probably, maybe in tracker, add a new subscription type. — Yes. In the Debian Maintainer Dashboard in UDD there's the possibility to have an RSS feed you can subscribe to; I'm subscribed to those. — And especially once britney is aware, it will be in the excuses anyway.

Yeah, let's stop here. For any feedback or more info, just email me. Actually, there's a wiki page where I try to keep track of progress, so if you're interested you can follow that as well, and there's the autopkgtest mailing list if you're extremely interested. Okay, thank you for your contributions.