All right, all right. Well, welcome back everyone to the Fedora leads and Linux distribution development track here at Flock. It's my pleasure today to introduce Adam Williamson, who will be doing a presentation on Fedora CI and automated testing. I'll hand it over to you.

Okay, thanks a lot. So yeah, I'm Adam Williamson. I'm the Fedora QA team lead at Red Hat. I've been doing this for quite a long time; I started in 2009, as you can see from the next slide in my deck, which is a picture Mo Duffy took in 2010 in Zurich. So there I am, young and optimistic, with Peter Robinson and, I think, Jesse Keating. So yeah, I've been doing this for a while. Just so you guys know, my policy for talks is to hold Q&A and feedback to the end, otherwise it's hard to get through on time. So I'll definitely make sure to leave space for questions, but just hold them to the end.

So yeah, let's get into it. Next slide... I guess that's the next slide. So, how do we test Fedora? This is some background; it's all going to come up later on, so I just want to establish it. There are different levels you can test at. There's dist-git, which is where we keep the sources for the packages, so you can actually run tests before an official build has ever happened: you can run tests on pull requests, you can run nightly tests on the git repo if you like. You can test after an official build happens in Koji, or an unofficial build, any kind of build in Koji; that's another place you can test. You can test updates, which are when we put related builds together into an update and say "this is the thing that needs to go into the distribution"; you can test at that level. And you can test composes. Composes are when we put all the builds together and make images out of them, and make repositories out of them. We do that nightly for most streams of Fedora, and you can run tests after composes happen. So those are the main levels you can run testing at.

Some background, too. The topic of the talk is where we are with automated testing, but how we got there in the first place is useful to the story. So in the olden days, this is what we did. In an official QA context we basically didn't do any testing of package sources; maintainers might do it for themselves, but that was outside the scope of QA. Testing at the build level, again, we really only had what maintainers did themselves in the %check section of the package, where a package build can run test scripts and fail if they don't work (there's a little sketch of that below).

At the update level we had manual testing. So this is all circa, say, 2013: there was manual testing of updates, which was kind of haphazard. We had this little mechanism where if a test in the wiki is specially tagged, it will show up on the update page, so we could give you some guidance: "this is a test that might be useful to run on this update". But that was about it. Other than that, people would test the update and say "yeah, it's good". What did they actually test about it? We don't know; they just said it was good. If they said it was bad, they would normally tell us why, which is useful. But it's very haphazard: there's no way of knowing that what you're testing from update to update is the same, and sometimes an update just wouldn't get any testing.
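(For anyone unfamiliar with that %check mechanism, here's a minimal sketch of how it looks in a spec file; the test command is illustrative, and whatever %check runs, a non-zero exit fails the build:)

```spec
# Excerpt from a hypothetical package spec file.
%check
# Run the upstream test suite inside the build environment;
# if this exits non-zero, the whole Koji build fails.
make test
```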
So do we know if an update works? No, we don't. We have no idea. And it's very difficult, manually, to check if an update breaks initial install of the system, or breaks the compose process; we'll get to that later, but that's an area where automation has a real advantage, because it was very hard to do with manual testing.

For compose testing, that was probably where we used to focus our manual testing the most. We would put a lot of work into manually doing what we called release validation testing, which was focused on: okay, we have a release candidate; is this release candidate ready to go? We would run over a hundred manual tests to decide if that was the case. That gave us pretty decent coverage (we didn't ship many completely broken releases) but it's very time-intensive. It's very boring, because you just sit there spinning up virtual machines all day. Kamil remembers those days, when we would spend weeks just spinning up a virtual machine, running an install this way: does it work? Okay, do it again, do it very slightly differently; do this until you're tired of life. So that was an issue, and even with the amount of resources we put into it, we couldn't test every compose. There's a compose every day; we would have gone insane trying to do that. So that was another limitation.

The consequence of only being able to test periodically is that you get problems where there's a bug, and then by the time you've noticed the bug, there's another bug. So you have these stacks of bugs: by the time you get to testing a compose, there are four or five different bugs in it, all of which you could have caught one at a time if you could test every compose. And one of the nice things about testing very frequently is that it's easier to identify what broke something. If you don't test very frequently, say we ran a test on a compose and then we do another test two weeks later and see a problem, we know that something that changed in the last two weeks caused the problem, which is better than nothing, but that's a lot of things to check. So there are a lot of issues with only doing manual testing.

Just some quick history of how we used to do validation testing and where we got to before we started automating. On the left here (I had to shrink these down a bit) is the earliest organized QA testing of Fedora that I could identify, from Fedora Core 5 in... God knows when that was, 2001 or something... no, 2006, I have the note here. There were 26 tests, which I counted out of this table. By the time we got to Fedora 21 in 2014, which was the release before we started automating things, we had 138 tests, and that's not all of them, that's just some of them, that had to be done manually for every compose. That was a lot of work. So we'd gone up six times, and that still wasn't testing everything, and it was taking up a lot of our resources. That was the point at which we were like: okay, we need to change something or we're all going to go insane. So that's where we got up to. And just as a side note, after we started automating things we've continued to grow the test set: as of 2023, the Fedora 39 matrices have 202 tests on them, as I count it.

I've done this talk three times before, for anyone who doesn't know. I have a bit longer today, so I can be a bit goofier. And this morning
I found a talk that Will Woods did at FUDCon Toronto in 2009, and if I'd had a little more time I was going to rewrite this entire talk to do it off Will's slide deck, which would have been great, but I didn't have the time. So instead I've dumped a few of his slides into my talk just for fun. I don't know the license on these slides, unfortunately; I asked Will but he hasn't got back to me yet. I'll update that later.

So this is from 2009, from Will's slide deck, when they were starting AutoQA, which was the automated testing thing at the time. And the fun part is, a lot of this stuff is still relevant: this is the stuff that, 14 years later, we have finally managed to fix. They were discussing: okay, what makes Rawhide broken or good? Can we write code to check that? How can we run that code automatically? All of these are the things we have finally managed to get around to solving in automation recently. So this is all the stuff that we've actually managed to implement. It took a while, but we got there.

So what do we have now? Where did we get to? What are the automated test systems in Fedora, and what are they testing? This is the main meat of the talk. The two main automated test systems are Fedora CI and openQA. The point I want to get across here is: why do we have two? What are they each doing? How do they complement each other?

This has evolved over some time. openQA effectively came out of that problem of having to do a lot of manual testing that still wasn't anywhere near the amount of testing we wanted to get done. At that point in history, AutoQA had evolved into something called Taskotron, and we were working on that, but it wasn't at the point of being able to automate all of those tests for us, and we wanted to do something. So we picked up this tool called openQA that the SUSE folks wrote, and said: look, we can use this right now to help us; we're just going to pick it up and use it. It was basically a skunkworks project which ran on someone's computer under a desk; it ran on openSUSE at first. So that's where openQA came out of, and we've just gradually built it up since then.

Fedora CI is a more planned effort that had more resources behind it. It had some involvement from the RHEL side of things; it was an idea to standardize and share how automated testing could work across Fedora, CentOS, and RHEL, and to make it so that Red Hat can push some of the testing it was doing internally upstream, providing a kind of shared environment so we can get some of those tests running outside of the Red Hat firewall. Fedora CI is trying to offer tools and workflows to packagers in a form that will be familiar to people who are used to working with CI systems upstream, whereas openQA is very different from that. So that's the background we have here.

The way it's worked out is that CI is intended to be self-service. If you're a Fedora packager, it's a way you can write tests for your thing and get them run at a time, and in a way, that makes sense to you. You can have the tests run on pull requests; you can have the tests run on package builds; the results can come back to you in dist-git or in Bodhi. There's nobody centrally writing the Fedora CI tests, in general;
that's not how Fedora CI works. It's a developer workflow, is the idea; as I say, think of it as providing CI services to people working in Fedora. openQA is more about testing Fedora itself. It's not something for Fedora people to use to test the little bit of Fedora that they're working on; it's about making sure Fedora, the thing we define as Fedora, is working well.

With Fedora CI, the system is the thing: the people who work on CI wake up every day and think about how they can make the system better. With openQA, the system isn't the interesting thing. Me and Lukáš are the main people working on openQA, and we wake up and say: are any of the tests failing? How can we fix that? Can we write some interesting new tests? We don't think much about improving the test system itself. So that's the difference there.

Getting into some more detail on both test systems: what is openQA? openQA is a system for testing which is designed to test the way a human tests. openQA spins up a virtual machine, and then it knows how to type, it knows how to see things on the screen, it knows how to click on them, and all of this is happening basically at the level of the virtual machine. It has no idea what operating system it's testing. You could use it to test a firmware interface; you could test Windows. It doesn't hook into any toolkits or anything like that; it's just looking at the screen, clicking on things, and typing things. The way it originally worked, it was all about looking at the screen and matching screenshots. These days it's been developed so that it also has some integration for Linux consoles specifically: you can run commands at a console, you can get the output from them, you can get the exit code from them, in a programmatic way. So you can write tests that are like "okay, run this command, did it pass or fail?", and that can be a requirement. It has that capability (there's a small sketch of what a test module looks like below).

What do we use it for? As I said on the previous slide, we use it for high-level functional testing. We build off the release criteria and things like that, and we say: okay, what are the core requirements for Fedora? What are the things we want to know, at all times, are still working? That's what openQA focuses on testing. So we run a set of tests on all Fedora composes, also Fedora CoreOS composes, IoT composes, Cloud composes; there's probably some other kind of compose I'm forgetting. Any kind of Fedora compose, we run some tests on it. We're testing ELN now, which is a fairly new thing. It runs a slightly smaller subset of tests on every critical path update for Fedora: nearly every single critical path update will have at least some of the openQA tests run on it, which is pretty big, across all releases, not just stable but also branched and Rawhide. So it's running a lot of tests. I'll get to some numbers in a bit, but that's a lot of testing.

So openQA is great; the screenshot-driven testing is really useful. As I said, what we started out wanting to do with openQA was test the installer, automate our installation testing, and it's fantastic for that, because almost nothing else can do that, but openQA is really good at it. So that was the entry point, but it's also useful beyond installs.
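To make that concrete: openQA tests are Perl modules built on the os-autoinst "testapi". Here's a minimal sketch of what a test module can look like; the base class, needle name, and package used here are illustrative, not taken from Fedora's real test suite:

```perl
use base "installedtest";   # hypothetical base class for a booted system
use strict;
use testapi;

sub run {
    # Screenshot-driven step: wait up to 300 seconds for the reference
    # image ("needle") named graphical_login to match, else fail.
    assert_screen "graphical_login", 300;

    # Console-driven step: run a command and require exit status 0.
    assert_script_run "dnf -y install hello";

    # Capture command output programmatically for further checks.
    my $pkg = script_output "rpm -q hello";
    die "unexpected package" unless $pkg =~ /^hello-/;
}

1;
```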
We use it for testing things like desktop applications, which Lukáš has been working on a lot lately, and we also managed to implement client-server testing with it, which is really useful. So we're using it to test things like FreeIPA and database servers, just because we have a good architecture for having multiple test machines interact with each other. So it's good for that too.

Just to go briefly over how openQA looks, because I wanted to have a more practical element to this. You're not necessarily expected to look at this; it's more something for me or Lukáš to be looking at most of the time, as we see openQA as a service QA provides. But if you are interested, this is how it works. This is what it looks like: on the left is the overview for a given thing we're testing, in this case a compose. If you click on one of those little dots, the green or orange or red dots, you get to the interface on the right, which is the details about that specific test. You can see it's got thumbnails for the whole time it's running. Green ones mean the thing that was meant to happen there happened: either a screenshot matched or a command succeeded. Grey ones are informational, or it was waiting for something. And red ones mean something failed. So here we've got a real failed test: we click on the red thumbnail and we can see it's booted, but it's at the emergency mode prompt, which is not where it should be. So we can see what went wrong, and then you can get logs out and start investigating the failure. Okay, I'd better keep moving, otherwise I'm going to mess up my timings. So yeah, that's how openQA looks from my end of it.

Here are some details on what we're actually testing in openQA today. I wanted to give this talk partly because we've been developing this stuff for a long time now, and I'm not sure people understand the extent to which it's grown and what it's covering, because we really are testing a lot of stuff now. For composes, we're covering 75 percent of the validation test suite; those are the tables I showed earlier, the 138 tests as of Fedora 21, 200-something tests now. 75 percent of those are automated; we do not need to run them manually, which is fantastic and makes life so much easier. We also run some additional tests that aren't technically release-blocking but are useful. For instance, we test Silverblue, because it seems important; it's good to know whether it's working.

In more detail, what it's actually testing: it runs a lot of install tests in different configurations. It tests the different images; it tests installing different package sets. It tests a lot of partitioning stuff: can you install to ext4, can you install to XFS, can you install to Btrfs, can you do thin provisioning, can you resize existing partitions? All of this stuff that was incredibly annoying to do manually, it does all of that. It installs in different languages: since the last time I gave this talk, Lukáš has implemented a Turkish install test, which is great.
So we test French, Turkish, Japanese, Russian, I think one other, and those are all languages that are interesting for various reasons, to do with character set rendering or right-to-left text; Turkish is interesting because it has weird case rules. So this lets us cover oddities that tend to pop up with different languages. It tests UEFI and BIOS. So it covers a whole combinatorial mess that we don't have to test manually anymore.

But we go a long way beyond install testing these days. What we call the base tests cover the really core functionality: can you install a package, can you remove a package, can you update the system, can you log in, can you log out, can you reboot? Is SELinux in enforcing mode, like it should be? Can you manipulate services? That, again, is incredibly boring to test manually, but it enables a service, disables the service, starts the service, stops the service, reboots about five times, and you don't have to do that by hand anymore, which is great. Does logging work?

We also test upgrades, which again is huge, because that was a massive time sink to do manually. So we test: starting with a clean install of each supported package set, for the previous release and the release before that, can you upgrade to the current release? And then we also test whether stuff works after you do the upgrade. So we know, for instance, that you can upgrade a FreeIPA server and it will keep working as a FreeIPA server, which is super valuable.

We do quite a lot of graphical desktop testing. We test the desktop itself: we test things like whether notifications are working, can you log in and log out. From within the desktop, we test whether every single pre-installed application starts up and quits successfully, at least, and we do some detailed testing of quite a lot of apps now, which Lukáš has been working on. So there's really quite a lot of testing of whether the desktop works. We test printing; we test updating from within the desktop. And we test all of these, as I said, not just from a fresh install: we do an upgrade and then we test them again. So that's pretty useful coverage, and we test on Silverblue as well.

We have some pretty advanced tests for server functionality. As I said, we test FreeIPA. Very recently, this is a new one as of last week, we test Active Directory, using a Samba Active Directory server. We have a database server-client test, so we make sure PostgreSQL is working. We check Cockpit pretty extensively. Not on here, but we also have some tests for Podman. So we're testing a lot of stuff, and all of this is being tested on every single nightly compose of Rawhide and branched.

This is partly just for validation, so we don't have to do so much work for validation. But it also means we know whether this stuff is broken way earlier in the cycle; all the time now, we know what's broken, rather than in the past, where we would get to two months before release, start testing, and find everything that was broken. Some of these tests, like the Silverblue tests, CoreOS, which we test, and a lot of the server tests, were driven by the SIGs. The Server SIG has been very good at working with us and saying "hey, this is what we want to have tested", and we're like "okay, great, that's valuable to test, so we'll add tests for that". So there's that kind of process going on.

So that was composes. What do we test for updates?
We don't test all of that stuff, frankly, just because we don't have the resources. So we do a subset, but it's a fairly big subset. Recently it got more complicated, because I enhanced things so that it now tries to run only the relevant tests for each update. Before, we had this problem where there'd be a KDE update and it would run all the GNOME tests, which was kind of dumb, because a KDE update isn't going to break GNOME. So we tweaked things around, and now the critical path is defined in groups, and everything along the pipeline knows which groups an update is critical path for, so openQA can say "hey, this is only in the critical-path GNOME group, I'll only schedule the GNOME tests".

Assuming an update was in all of the critical path groups, you would get about sixty-something tests. It tests KDE, it tests Workstation, it tests Server; not every single test we run on composes, but a pretty big subset of them. It does the most important base tests; it tests the FreeIPA stuff, the Samba AD stuff, Cockpit, the database tests. It does do an upgrade test. So it's pretty wide coverage.

Something we only do in the update tests is we try to test whether the update will break the compose. Like I said on an earlier slide, this is very difficult to do manually; in automation it's quite easy. So for every single relevant critical path update, we build a network installer image, we build a GNOME live image and a KDE live image, and we build a Silverblue installer image; then we run an install from them, and then we test that the result boots properly. So for every update we know: does this break the compose process, basically? That's very valuable, one of the most important things we get out of the update tests.

And just as a note, as I keep emphasizing: for openQA the idea is we want to know, does this update break Fedora? Does it break something important? It's not so much that we're testing PostgreSQL because we're really interested in databases; it's because database functionality is part of the Server release criteria. It's something that is meant to always work in Fedora Server, therefore we test that it always works. The tests aren't meant to tell you whether this update to package X is a good or bad version of X; it's more: does this version of X break the things that are important to us in Fedora? That's the idea. The main limitation on how many tests we run is just capacity: if I had more openQA machines we could run these tests on every update, but I don't, so we can't.

And just to give some of those numbers, as I mentioned earlier: it's getting pretty big, this thing that started out as a tiny little skunkworks project under a desk. We have two instances now, a prod and a staging instance. Across the two we have over a hundred simultaneous test executions. We've run over three million tests since 2015, so that's kind of a lot. We've discovered hundreds of bugs.
I tried to get a precise number, but it's hard, because often we report the bug upstream, and sometimes I just fix the bug and never file an issue. But there are at least 358 bugs tagged in Bugzilla as having been found by openQA, and there are a lot more in upstream trackers and things like that. So that's a lot of bugs. On a typical day we'll test one or two composes, depending on whether branched exists; we'll test two or three CoreOS and Cloud composes; and we'll test about 20 updates. So that's thousands of tests a day, for sure.

I should have updated the third bullet point; this is "recent" as of DevConf, a couple of months ago. But just some recent things that it found at that time. We had a change go into Fedora 39 to make the EFI system partition bigger, which broke a surprising number of things, and openQA found most of them and we were able to fix those. There was an update that made Firefox crash on startup; we caught that, and it didn't ship to users and give them a broken Firefox. The Arabic translation just disappeared from the installer one day, so we caught that. Yeah, that was a fun one. GNOME was notifying about updates when running live, which is something it's not supposed to do: when you're running live, we don't want you to try and install updates, because it'll try to install them to memory and the whole thing will hang. It's specifically not supposed to do that, and we caught that and fixed it. So it's catching real issues all the time that are getting fixed. And an example of something we caught with update testing was that a new util-linux update mounted the root partition read-only (not on Silverblue, on RPM installs), which obviously breaks everything. Because of the update testing, we were able to make sure that never actually landed in Rawhide; nobody got that update and ended up with a broken system. So that was pretty cool.

openQA resources. I've mentioned this in passing, but the idea with openQA is that it's a full-service system. As I said, you don't really need to go and look at the openQA web UI. When something fails in openQA, one of us on the QA team will investigate it and try to figure out why it failed. If it was just a blip, we restart the test; if it's a real bug, we'll investigate it and file a report in a format that's useful for you as a package maintainer, whether that's in Bugzilla or upstream, and we'll try to provide the useful information on why it's failing, so you don't have to go and do that debugging work yourself. That's the idea of how it's meant to work: we develop the tests, we run the system, so we're trying to provide an end-to-end service there. If you need to contact QA about this, we have a mailing list; we're on Discussion as well, you know, the new forum thing; and on Fedora Chat
there is a Fedora QA room where you can find us. The other things are just references, if you download the slide deck: the upstream site for openQA, and there's a wiki page where I explain the whole downstream setup, and if you want to get involved with it, you can. I added the last bullet point after DevConf, because Stef Walter had a really good talk there about open source services, the idea that you should be able to inspect and change services as much as you can open source code, and openQA does try to be like that. Everything involved in the deployment of it is open, so if you are interested, you can dive in from the wiki page, and you can look at and contribute to any part of this system.

Moving on to Fedora CI. These slides, all the CI slides, were contributed by Miroslav Vadkerti, who is one of the leads on Fedora CI and Testing Farm, so thanks a lot to him. At DevConf he co-presented and did these slides way better than I can, so you can watch the recording of that if you want to see what he has to say.

For Fedora CI, as I mentioned, it's not an end-to-end service which is testing whether Fedora is broken; it's a system you can use to improve the testing of your package. There's a docs site which gives you a quick start on how to onboard yourself to this system. The tests it runs are defined in package repositories: you keep the tests in dist-git alongside your package, with a little bit of configuration that tells the system how and when to run them, and they're expected to be maintained by the packagers. They're not maintained by the team behind Fedora CI; that's not how it works. And it's meant for component-level testing: Fedora CI isn't really targeted at the kind of high-level integration testing that we're doing with openQA; it's more at the lower, component level of testing. There are two formats you can write tests in, but you should use TMT, as STI is going away; there's a link to the quick start guide there, and a small sketch of the TMT format below. There isn't a workshop tomorrow; that was also a DevConf slide, I should have fixed that, sorry about that.

There are a few generic tests that Fedora CI runs, which are a hangover from AutoQA slash Taskotron: tests that get run on every single package. Those are rpminspect (David Cantrell, who maintains that, is around), rpmdeplint, which tries to check for broken dependencies, and installability, which just tests whether the package can be installed. They run in CI just because it's a sensible place to run them.

Triggers: this is when the tests can run and where you can get results. Fedora CI can run on pull requests. Some packages use pull requests, some don't, but if you do, you can have tests run. When a pull request is created, a scratch build is run (I think it's run in Copr, I'm not sure) and CI will test that. So it doesn't ever have to touch Koji, and you can get the results back on the pull request page, which is a good workflow if you're committed to using pull requests for your package. It also tests whenever a package is built in Koji: if there are tests defined for that package, it'll run them, and then the results will be shown.
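To give a flavor of the TMT format mentioned above, here's a minimal sketch of a test definition as it might live in a package's dist-git; the file name, script, and package are made up for the example:

```yaml
# tests/smoke.fmf -- one test, in TMT's fmf metadata format
summary: Package installs and the binary runs
test: ./smoke.sh        # hypothetical script sitting next to this file
require: [hello]        # packages to install into the test environment
duration: 5m
```

A plan file alongside it then tells the system how to discover and execute the tests; the quick start guide on the docs site walks through the full layout.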
So that's another place where they get hooked in And yeah under the hood stuff isn't so important, but Testing farm is kind of the back end It's the kind of combined back end for all of the fedora ci But also sent off stream ci uses testing farm some of ral's testing runs in a testing from instance So that's part of the project of trying to unify things Yeah, same experience with ral ci sent off stream ci and packet which is a cool thing if you haven't heard of it If you buy into packet if you're a maintainer upstream of the thing you're packaging in fedora, you can kind of Do everything in the upstream upstream repo You can have your spec file defined there and then all of the downstream stuff is kind of done for you by the system You don't have to do all of that stuff You will get pull requests for the package when you tag a new version upstream and tests will run on that and you can Just say yeah, this looks good, and then the build will happen an update will happen It's a really cool system if you're if you control the full stack as it were Oh, yeah hardware crimes in TMT. That's just if your test needs some kind of specific hardware. You can now define that There are a few things that I kind of see as Testing systems like fedora ci and open qa are the main ones, but some of these things Kind of feed in as well So Koshay is It's a system that tests any time a package's dependencies change It will kind of do a test build on it and see if it works which is you know a tool for developers But effectively it's quality testing as well. So I see that as kind of a quality system Fedora lease auto test Lily is not here, but she's around She's a member of our team who we're meeting for the first time here and she wrote this thing Which is really cool. We have this problem with some of the requirement the tests that are required are used very weird hardware Like high-end enterprise storage stuff that we just don't have lying around in our houses or in the fedora test system So fedora lease auto test runs those tests in red hats test farm called beaker, which has access to all Lily is that Sorry, I didn't see you at the back there. Good job Lily, which has access to a lot of really Exotic hardware that we don't have anywhere else. So if she it used to be a nightmare We would have to find someone to run those tests manually So now those all get run automatically by that system, which is great. It saves a huge amount of pain every cycle This stupid tool I have called rail valve which is related to creating the validation events as Just because it was a sensible place to put it actually runs the size checks So if an image is oversized rail valve is a thing that finds out it's oversized and files a bug on it Fedora coro s because of the history of where coro s came from has its own ci system Which is kind of cool and does a lot of testing on fedora coro s and they have their whole separate release workflow So that's another automated testing system that's out there I mentioned packet and zool-based ci is kind of part of fedora ci and the testing farm back end But that is if you have a project hosted on You know pager.io you can get testing for that run Via, you know testing farm and stuff. It's not very important more guest slides So, yeah, great. Why more guest slides? I must is this a messed up version of this talk? Dang it I'm completely ruining my joke here. No, mind. Oh How did I get past this? Skipped over two slides. No, I did I talked about that. Okay. I did. I'm sorry. 
More guest slides! So, yeah, great... wait, why more guest slides? Is this a messed-up version of this talk? Dang it, I'm completely ruining my joke here. Never mind. How did I get past this, did I skip over two slides? No, I did talk about that. Okay, I did. I'm sorry, I've had three hours of sleep, do excuse me.

So yeah, we have a couple more slides from Will's talk here, which I just thought fit in really well. This ties into the bits I just talked about; this was in 2009, remember. "Coming soon-ish": this is the stuff they were trying to figure out back then. Easy AutoQA for packages: this was the idea of making it so you can write tests for your package. That's what Fedora CI does, and it works now, 14 years later. We got there; it's pretty good. Post-ISO-build hook: this is where we're saying, yeah, we have this huge install test plan (he's talking about all those wiki test pages); it would be great to be able to automate those, right? We got there: openQA does that. And "use the Fedora message bus": at the time, AutoQA was doing this crazy thing called watchers, where it would just wake up a script every hour and see if something had changed. Obviously now we have a proper message bus, and that's how all of these systems work. So yeah, we got all those.

And then "coming eventually, someday, probably, maybe": they were like, hmm, how can we do multi-host testing? That seems hard. openQA does that: we do test httpd, we do test NFS installs. Functional testing for GUIs: openQA is great at that; that's how you script that. So we got there. World domination is still a work in progress, but we're getting there; we'll be there soon, don't worry.

Okay, quickly, the piña coladas; I've drunk a few piña coladas. If you've ever noticed on the wiki, the results from openQA are filed under the name "coconut", because one name for this whole thing was Project Coconut: the idea being that we wouldn't have to do any testing anymore, we would just let the robots do it, and we'd sit on the beach and drink piña coladas. It doesn't quite work that way, but it's better than it used to be.

Anyway, getting back to the serious topics. Testing is only one part of this whole process; this is something I'm kind of big on. It is useless to run a bunch of tests if nobody ever looks at the results and does anything with them. One of my saddest things is when you go to an upstream project and their little badge says 60% of the tests are failing, and have been for the last three years. So it's crucial to make sure the results are going somewhere where somebody is doing something with them. And different ways of getting results make sense to different people. As I said, the openQA web UI is kind of for me and Lukáš to go and look at and figure out what's going wrong; we don't necessarily want packagers to have to look at that. What makes sense for packagers is to use the interfaces that they're already working with: code, dist-git, Bodhi. That's where you want to go and look and get your results. So different routes make sense to different people.

One really key thing: I'm going to start saying the word "gating" a lot, and gating just means that instead of letting a thing through, testing it, and finding out if it's broken, we don't let it through unless the tests pass. Which is a really key thing
we've been discovering over time. The thing we're trying to achieve with gating is to catch problems before they infect other processes, and to catch them before people see them on their systems. We don't want problems getting to where composes are broken, or where you can't run package builds because another broken package got in and broke the buildroot. And we don't want people updating their systems and seeing bugs. So that's why gating is important.

So in practice, where do you get results? As I mentioned, for Fedora CI the earliest point you're really going to get your results, in a way that's useful to packagers, is in dist-git. If you're using pull requests, you can get the results of tests in there. This is a fairly old screenshot, and it's a little squished down so you can kind of see it, but this is a pull request for a package, and down at the bottom you've got three little green blobs, and those are results from Fedora CI: okay, this one passed its tests, this one's fine. So that's one level. The benefit of this level is that it's the earliest possible point; if you're testing at this level, this is before anything even gets into Koji. And if you're using pull requests for everything, this is great, because you know exactly which pull request broke it. You have a very tight development loop if you're testing at this point. The drawback is that you do have to use pull requests, which a lot of packagers are not used to doing. If you're the only person working on a package, it's kind of weird to go and create pull requests and look at the web interface; you just want to fedpkg commit, fedpkg push, fedpkg build. And if you have cases where packages depend upon each other, so you need to update two packages at once and neither of them works without the other, it's kind of difficult at this level to do that.

The other major integration point we have is Bodhi, which is the update system, so this is integration at the update level. The key advantage is just the thing I was talking about: if you have packages that are interdependent, Bodhi is where you group them together and say "these packages go together", and if we test at that level, we can test the group of packages that should work together and see if they do work together. It's also, by default, where you're probably going to see your test results if you don't use pull requests. It displays the results from openQA and Fedora CI. Fairly recently, it's been updated to also show when tests are queued or running: before, until a test actually completed, Bodhi would just tell you the result was missing, which was a bit confusing and scary, and that was just because the system hadn't got to it yet. Now it does actually tell you the test is in the queue or currently running.

We had some issues with this in the past, and I've been working on it to try and make it as consistent and accurate as possible. There are still a few things that we know are problems. For instance, it will let you try to waive the result while a test is running, but that doesn't work: you can click the waive button all day long and it'll file a waiver, but the actual system that decides whether the update can go or not doesn't
So that's something we want to improve But we're generally basically trying to make this as accurate and you know consistent as we possibly can The bull the in both is there because it's kind of a big deal as of since the last time I gave this talk all updates for all Releases are now gated including raw hide So for the first time in basically Fedora history when you do a build for raw hide It does not go straight into raw hide It now waits for the open QA test to complete and if any of the critical ones fail your update does not go into raw Hide so this is this is one of the biggest things we've done for a while So I'm kind of proud of it and also scared of it, but that's happening and nobody's killed me yet So I guess it's going all right So, yeah, that just means nothing gets in unless it's passing the tests basically which is pretty huge On a per on a package by package level you can actually configure the gating requirements So for the open QA tests, they're all kind of the same But if you put tests for your package in CI You can also add a file in the repository which says gate on those tests and then your update won't go through unless those tests Pass as well As I said, there's a button for waving a bogus of failure So if you're sure if you're really really sure that a failure doesn't actually mean things are broken you can wave it I would like for people to be very sparing with the use of that button Because the problem if an update that has failures goes through is that then possibly all subsequent updates may have the same failure Which is going to be an issue. So but it is there if we need to use it And yeah, the way Bodie works is it listens out from messages, you know Fedora messages from results DB and sort of updates it's situation a Couple of things that we know about this at the moment If you've noticed problems with Fedora CI test results like being wrong or just not showing up for Dorsey I is having issues with DNF 5 and Python 3.12 Which are kind of affecting the test running So that's why the CI folks are working on that really hard but just because of the details of that system it's quite difficult to fix all of that and Some open QA tests recently have been affected by this annoying problem that Kevin knows all about where we're getting a 404 From a repository, which is making the test fail. I'm hoping something I did this morning We'll kind of make that less of a problem But if you've had an update held up for a few hours that was probably why and we're sorry about that As a package or if you see failures in Bodie, what should you do? This is a question people ask me. So I wanted to explain it to you If The test failure is for a Fedora CI test. That's not one of those generic CI tests Then that's kind of on you because as we said Fedora CI is something for you to use So you should kind of debug your own failures for Fedora CI fails When you click on a failure, it should take you to the logs It takes you to Jenkins page. 
As I said, there's a button for waiving a bogus failure. So if you're sure, if you're really, really sure, that a failure doesn't actually mean things are broken, you can waive it. I would like people to be very sparing with the use of that button, because the problem is that if an update with a genuine failure goes through, then possibly all subsequent updates will have the same failure, which is going to be an issue. But it is there if we need it. And the way Bodhi works is that it listens for messages, you know, Fedora messages, from ResultsDB, and updates its state from those.

A couple of things that we know about at the moment: if you've noticed problems with Fedora CI test results, like being wrong or just not showing up, Fedora CI is having issues with DNF 5 and Python 3.12, which are affecting the test running. The CI folks are working on that really hard, but just because of the details of that system, it's quite difficult to fix all of it. And some openQA tests recently have been affected by this annoying problem, which Kevin knows all about, where we're getting a 404 from a repository, which makes the test fail. I'm hoping something I did this morning will make that less of a problem, but if you've had an update held up for a few hours, that was probably why, and we're sorry about that.

As a packager, if you see failures in Bodhi, what should you do? This is a question people ask me, so I wanted to explain it. If the test failure is for a Fedora CI test that's not one of those generic CI tests, then that's kind of on you, because, as we said, Fedora CI is something for you to use, so you should debug your own failures there. When you click on a failure, it should take you to the logs, a Jenkins page I think, and from there you can get to the Testing Farm page with more details. If it's one of those Fedora CI generic failures, check the logs from the failure and see if it actually seems to be caused by your update, in which case you should fix it, obviously; if not, talk to the CI team and see if there's a problem with the system. The slide has some links to where you can do that.

For openQA, in general you don't have to do anything. If you're impatient, you can go and look at openQA and try to debug it yourself, but what we aim for is that within 24 hours one of us will investigate it and either fix it or tell you what the problem is, in a Bodhi comment, a bug report, an issue report, something like that. If that doesn't happen, or if you really need to figure it out right now, you can contact us via Fedora Chat or via our mailing list or any of the other ways you can get in touch with us, and we will try to help. There is also a button in Bodhi to re-run tests: when you click it, it sends out a message that both CI and openQA listen to, and they re-trigger all the tests for the update. If there's a failure and you're not sure whether it's genuine, you can click that button and have all the tests re-run, which is much safer than waiving the result; only waive the result if you're really sure it's okay to do so. And for problems with Bodhi itself, you can contact the CPE team in the Fedora applications channel, or you can contact QA, but we'll probably just go and ask Kevin to fix it.

And one more integration point: the Fedora wiki. We still have those giant wiki pages full of results, and the reason we use them is simply that some of the tests are still manual, so we need a place where all the manual and automated results go together. When we're doing the meeting that decides whether we release Fedora, we just want to have something we can look at and see all of the results, and sadly the wiki is still, apparently, the best way to do that. So there is a dumb system I wrote which reports the results from openQA into the wiki automatically, and then humans file their own results into the wiki, and then we look at it and can see all the results together. The relval size check results also go in that way.

So, behind the scenes. We've got, what, 15 minutes left? So we can go into this a bit; this part of the talk is a little optional, but here it is. Behind the scenes there's a lot going on to make all of this work together. It's quite complicated,
I think because a lot of this stuff came out of an idea we had ten years ago called Factory 2.0, when the buzz concept of the time was microservices. I don't know if anyone remembers, a few Flocks back, we had these giant whiteboards which Ralph Bean would stand in front of, with all these little boxes about how Fedora was going to get built, and this bit would talk to this bit and this bit. About half of that got done, and because those bits are still around, we still use them. So there's a lot of complication in how all these things talk to each other.

The most important thing is fedora-messaging, which is a message bus. The way CI and openQA know how to kick off tests is that they listen for messages from Koji, or from dist-git for pull requests, that say "hey, a new package build happened" or "an update is ready for testing". So they listen for the message, and then they test the thing. And they also communicate back via messages: both openQA and Fedora CI publish messages when they've completed tests. (There's a small sketch of a bus listener below.) ResultsDB is where we keep the results: both Fedora CI and openQA file their results in ResultsDB, which was kind of part of Taskotron, so Taskotron is still with us. WaiverDB is where the waivers live, which is just a really simple database of waivers with a JSON interface. And Greenwave is the thing that decides: when we talk about gating, the way gating actually works is that Bodhi asks Greenwave whether this update is okay, basically, and all Greenwave does is look at the results and the waivers, from ResultsDB and WaiverDB, and say yes or no. If we were writing this from scratch today, we'd probably just dump it all in Bodhi; the idea was that lots of different things would use Greenwave, but as it turns out, the only thing that uses Greenwave is Bodhi. But this is how it got written, so we have a lot of different pieces.

So that's how it's all done. I guess an interesting thing about that is that Fedora CI and openQA look like very different systems, but there is a lot of integration going on, and we have this concept that, at the point where you're looking at the results in Bodhi, it doesn't matter that much which system they ran in, because everything is integrated in terms of where the results go and how the gating works. Which system the test ran in is just a detail. So that's how we looked at integrating those systems.
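As an illustration of that bus-driven wiring, here's a minimal sketch of a fedora-messaging listener; the topic filter and callback body are illustrative, and the real schedulers do far more than this:

```python
# Listen on the Fedora message bus and react to Koji build events.
# Assumes a fedora-messaging config pointing at Fedora's public broker.
from fedora_messaging import api

def on_message(message):
    # Koji publishes topics like "...buildsys.build.state.change";
    # a test system would schedule tests for the build described here.
    if message.topic.endswith("buildsys.build.state.change"):
        print(message.topic, message.body)

api.consume(on_message)  # blocks, invoking the callback per message
```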
This was the most popular slide that I added in the second version of this talk. If you'd like to say you were in this talk, but it's been very boring and you've just been looking at cat pictures for the last 40 minutes, just look at this slide and you can say you were here; this is the entire talk in one slide. openQA tests updates and composes; Fedora CI tests package builds; composes we also test manually; updates get tested through Bodhi, and we do gate those. And that's the whole thing. You can grab the slide deck, it's on Sched, if you want to see all of this, and it has my notes on it, which have some good stuff in them.

[Audience] Something quick: just a question that popped up in my mind. I received a Bugzilla report a few days ago; one of my packages was causing a conflict, an executable by the same name. Would that be caught by QA gating these days, or no?

So, that's not something we explicitly test for. It's something that could potentially go into CI, I guess, as a generic test, but we don't have one right now. And as I say, openQA is very focused on functional testing, you know, "is Fedora broken", so we would catch something like that if it caused something else to break. For instance, we'll catch dependency problems if they make one of the things we're testing fail, but we won't catch just any dependency problem in any package in your update, because it might not affect the tests. Same thing here: if that conflict actually caused, say, the FreeIPA test to fail, then we would catch it. But we can't say we would catch all of those issues at this point in time, no.

[Audience] Would it be Koschei eventually catching it? Yeah, possibly Koschei; that's the kind of thing Koschei could be extended to. Or, as I say, it could be a generic test in Fedora CI. [Audience] rpminspect? Yeah, among the things that we run, that would probably be something that would go into the Fedora CI workflow somewhere.

[Audience] That was my question, and I have an anecdote: you just mentioned an old FUDCon, 2009. Apparently there were two that year. Yeah, I was talking about the Toronto one. [Audience] I was at the other one. Okay, yeah.

[Audience] All right, so you mentioned rpminspect, and I was just looking at the last eight updates that I made, and they all fail, because rpminspect is very inflexible and it lectures me: I should talk to product security at Red Hat to allow a code point in my test file; the use of a function in my test file means I should rethink how I program; and the third message was something else. I would love to have systemd installation tested by rpminspect, but it doesn't happen, because it just refuses to install systemd. How can we get, like, manageable opt-outs and opt-ins of tests?

That's a good question. So, David Cantrell is actually here; he's the maintainer of rpminspect, and he did a talk on rpminspect this morning, so he would be a great person to ask. I know he's trying to make it as configurable as possible. From the perspective of this talk, rpminspect is not a gating test unless your package decides to make it one, so nothing will ever be gated on rpminspect unless you say that in a gating.yaml file. I was looking at the same stuff during David's talk, just out of interest, and one thing I noticed is that rpminspect has results like "good", "inspect", or "bad", and a lot of the things are "inspect", which is kind of between a pass and a fail; but any time we get anything but all goods, the rpminspect result is reported as a failure. So that's one thing we could maybe improve on. And the other thing: I know that he will take feature requests for being able to configure it to exclude a given warning for your package and say "I think this is fine". So you can get that into rpminspect at some level, but he could give you a much more detailed answer on that part of it.

[Audience] You can add an rpmlintrc to ignore these kinds of errors. Yeah, there's supposed to be some way you can basically exclude specific failures, but I don't know the details on it. [Audience] If you put a foo.rpmlintrc next to your foo.spec and write in that config which errors should be ignored, then they'll be waived; they will be ignored.
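For reference, the file being described there is a plain rpmlint filter list; a minimal sketch, with illustrative filter patterns:

```python
# foo.rpmlintrc -- shipped next to foo.spec in dist-git.
# Each addFilter() takes a regex matched against rpmlint's complaints;
# matching ones are suppressed for this package.
addFilter(r"explicit-lib-dependency")
addFilter(r"spelling-error .* specfile")
```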
Yeah. But I guess another thing: if rpminspect doesn't run the installability test when the static test fails, I guess that could just be changed so it runs both tests always; but David might have a reason not to do that, I'm not sure. In openQA we run all the tests; we don't stop if one test fails.

So, I just want to get through my last slides before we do more questions: the future. It's kind of funny to think about whether this is a lot of testing, because to me it feels like a lot of testing, but you can also look at it as not really enough testing, because an operating system is huge. There is so much stuff we could test that we're not testing. It's an infinite ocean, and we're just trying to make our drop bigger, right? But we can always try to make the drop a bit bigger.

Our limitations, again, as I said, are both the capacity to run tests (openQA has a certain amount of hardware, and I can't run too many tests or it just gets backed up) but also the capacity to analyze and act on the results. As I said, a test system which is just spewing out failures that no one has time to look at is a useless test system, so we also need people. I try to keep openQA down to the level where we can inspect every important failure and do useful things about it. And, as I say, we've come a long way; we're not at the point where we're doing CI for an operating system yet, which is very difficult, but we can keep trying to get more.

For openQA, we keep writing tests; especially Lukáš, who is always writing more tests. So we're going to test more applications. I want to do more Podman testing; the Podman team is keen on that. We want to test Flatpaks, because there was a case a few weeks ago where Flatpak install just stopped working; we weren't testing it, so we want to test it so we can catch that. We've got validation requirements for Toolbox which need to get automated. So we have a whole pipeline of new tests to write. I'm always working on trying to make the Rawhide gating thing as smooth as possible, so that's a future plan. More architectures would be nice: we have aarch64 right now, but it's very hard-worked and strained, so we don't run the update tests on aarch64, because we don't have enough hardware. It would be great to have s390x; we do PPC on staging but not prod, again because there isn't enough hardware. We've got a cool plan to do bare-metal testing in openQA using a thing called PiKVM, which is a little Raspberry Pi-based box which would basically let us treat a real system the way openQA treats a virtual machine: it can inject mouse movements and stuff into a real system. So we should be able to run our tests on real hardware using that, which will be cool; Lukáš has been working on that too. And more tailored update tests: since we now have this mechanism for running different tests on different updates, we could maybe run the entire install suite for an Anaconda update, for instance. It would be nice to do that. And possibly move it to the cloud, which I've been looking at for years, but maybe it'll happen.

Fedora CI plans. This, again, is Mirek's slide, so I don't know what all of these are, but: multi-host testing is that thing where you have different test machines talking to each other; more back-end improvements; finally getting rid of STI; and Testing Farm reservations,
which is, again, reserving specific hardware; and just improving the error rate and stuff.

Oh, and another thing I want to work on: there are things that aren't packages that change and break stuff. You can make a change to the kickstarts and break the compose, and right now we won't catch that, because we don't run a test when you change the kickstarts. So I would like to write all the integration bits so that when someone has a pull request to the kickstarts, we will run the openQA tests and find out if it breaks. It's just a lot of stuff to wire up. Same thing for comps. pungi-fedora is the repo where the compose configuration goes, and again, if you change that and get it wrong, you break the compose. And workstation-ostree-config is misleadingly named: it's where all of the ostree definitions go, so not just Silverblue but also Kinoite, Sericea, all of those, and again, they have configuration which, if you change it, can break the compose, and we don't test any of that. So I would love to be able to test all of those things, if we could.

And yeah, so: thanks to Miroslav again for the CI slides, thanks to Mo Duffy who made the template I used here, and if anyone has any more questions, I think we have like three minutes.

[Audience] Not a question, but a comment on the "coming" goals: TMT now has multi-host testing. Yay! [Audience] Yes, and there are reservations in Testing Farm now. So yeah, I should have got Mirek to update this slide again; it's very recent, in the last month. It's kind of funny: I've done this talk four times, and every time I've had to revise things. If you go back to the first version of it, we weren't doing Rawhide gating, the CI was totally different. It's kind of fun how fast things are changing.

Any more? Oh, three more. I don't know whose hand went up first. Amita? [Audience] What do you mean by moving to the cloud? Is it the testing moving to the cloud, or are you going to test the cloud? I see what you mean; no, running the test system in the cloud, running openQA itself in the cloud. We do do some testing of the cloud in openQA already; we test cloud images. But as I keep saying, our constraint is resources: we have literal hardware systems in the Fedora data center running the openQA tests, and it's quite difficult to get more of them, get them racked, get the networking set up. If we could move it to the cloud, we could scale up and down as much as we have money for, as much as Amazon will give us, right?
So there is actually a project going on with some folks from Amazon and Meta to get an intern, I think a Meta intern, who will look at doing this, because it's a several-week project, and me and Lukáš never have several weeks to set aside to look at it. So it would be really cool if a person could do it and lay the groundwork, at least, for us to do that, because that would let us scale much further.

[Audience] Just a real quick one that was prompted by the rpminspect stuff. Fedora CI: one of the tests that it does is install all the subpackages to make sure they're installable. That fails every time on the fedora-release package, because it has all the variants, and they conflict with each other. So if there's some way we could fix that, that would actually be great, but I looked at it and I couldn't see how to exclude that. Yeah, again, I think it would be really good to talk to David about it, because he is very open to "this is a problem, how can we fix this problem" kind of feedback. He would definitely be the guy for that.

Do we have time to take Zbigniew's quickly? I don't think we have another after that. [Audience] So I was thinking about this, because I feel the same problem in systemd, and basically there should be a way to say "treat those as groups". And in openQA, you had this slide where the list of screenshots goes from green to red. What does it mean that it's green at the end?

Let me get back to the screenshot; there are a lot of slides in this talk, my word, I just kept adding them. Okay, so that is a thing in openQA called the post_fail_hook. When a test fails, we want to get information out, to know why it failed. These things down the left are all the test modules that make up a test, like "graphical wait login"; when a test fails, it jumps to a special one called the post_fail_hook. What you use it for is: grab the journal and upload that, grab a bunch of log files and upload those. We have special handling, like if it goes to the emergency boot prompt, it recognizes that, and instead of trying to get things out the normal way, it gets them out over the serial console; stuff like that. So that green one at the end is actually just the post_fail_hook succeeding. If you are working from the web UI, when you get a failure, you generally want to look for the last red square: that's where the test actually failed, and anything that happened after that is just post-failure informational stuff.

Anybody else? All right, thanks a lot for coming out, guys!
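For reference, a post_fail_hook like the one described in that last answer looks roughly like this inside an openQA test module; the console name and file paths are illustrative:

```perl
# Runs only when a test module fails; its job is to salvage evidence.
sub post_fail_hook {
    my ($self) = @_;
    # Switch to a text console and dump the system journal...
    select_console "root-console";
    script_run "journalctl -b > /tmp/journal.txt";
    # ...then attach it to the openQA job as a downloadable log.
    upload_logs "/tmp/journal.txt";
}
```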