Well, hi everyone. Thanks for coming to this presentation. We're going to be talking about the graphics CI system that we have at Intel. It's something we've been working on for the past two years or so, and it really started to get traction about a year ago, so we've already got pretty far. We're going to be talking today about what services we provide.

So yeah, we're going to talk about the Intel graphics CI, but also about the test suite that we run, IGT, and what changes went through it recently. We're not really going to talk about lessons learned unless we have time, so that's more like backup material. Dealing with Linux in products is also something interesting, but I don't think we're going to have time for it; at least you know it's there if you want it.

So, the Linux kernel: it is a massive beast. We get a release every 63 to 70 days, which is pretty fast, and it's difficult to do QA on it properly, because Linus just cuts the final version whenever he is happy with the patches; it doesn't mean anyone had the time to do proper QA on it. There are about 14,000 changes per release; on average that comes out at nine commits per hour, so good luck keeping up with that. That's 1,500 developers, roughly 10% of them hobbyists, and a lot of companies. What I mean by this is: good luck building a centralized system that would cover everyone's use case. So we are just going to cover our own, and it actually covers quite a lot; it's pretty big. What is also really interesting is that there are a lot of integration trees, hundreds of them, and then after the release you get six stable trees, which also need to be verified, because that is what users are actually running; no one is running the integration trees.

One interesting aspect of Linux is that it has no architects, which means you don't have people deciding what the project is going to look like. I don't know if you've read "The Cathedral and the Bazaar", the essay by Eric S. Raymond about how Linux is different from anything else. The Linux development model is: everyone just shows up, proposes changes, and either they get accepted or they don't. But there are some rules, and the first one, the strongest one, is that there should not be any user-visible regressions. Whatever new kernel you get, it is not allowed to work less well than it used to. That is a strong requirement from Linux, and it is very hard to actually enforce when you've got a lot of hardware. Then of course the kernel changes need to be open source, obviously, and kernel features cannot land unless you've got a user space that goes with them. Those are the general rules of Linux; not all of them are problematic for us. The problem is the first one.

So why do we need continuous integration?
When you've got that many developers, like 1,500, working on different parts of the kernel, at some point a developer is going to change, say, the memory subsystem, and it's going to affect other parts of the kernel, like graphics, in very weird ways. So how do you teach people what the dependencies between the different components of Linux are? Well, you can't, really. But if you make a test system that shows that there is a regression, and lets the developer know about that interaction as early as possible, then they can rework the patch before it gets inside the Linux kernel. That's why pre-merge testing is so important: once something has been merged, it is very difficult to get it reverted. Some people are fine with it and will revert very quickly, but our experience with the rest of the kernel is that reverting anything is almost impossible. Sometimes even before it lands in -rc1 we say, hey, this is not okay, please revert, and they just say "I'm gonna fix it, I'm gonna fix it", and then sometimes it even gets released like that. So the sooner you know, the better.

One very important aspect for our development team is that we have our own integration tree that pulls in a lot of other integration trees. We don't want developers to be using one blessed version of the integration tree while the rest drifts; we really want to be rebasing all the time and checking that the current merge of every other integration tree still works for us. This is still the idea of testing as early as possible, so you have more time to react. One problem with that is that it would very often be broken. So how do we make sure that it isn't, and keep it as stable as possible, so developers can concentrate on making their patches work without fearing regressions? By doing pre-merge testing; that's the goal of it. Also very important is that it scales better with the number of developers, because we put the cost of integration on the developer who is actually making the change. When you've got 1,500 developers, there is really no other way.

Okay, so of course it's all nice in theory, but what are the actual challenges you get out of this? Well, keeping the integration tree healthy is indeed very difficult, because whenever you get a back-merge from Linus or from another integration tree, you can get thousands of lines of code, even more, coming into the integration tree that have not been tested, because other CI systems are not necessarily as thorough as what we do, or at least they don't concentrate on the workloads we care about. That's a lot of untested code landing at once, which, as you can guess, is a big problem for us. It used to take, I don't know, two weeks to stabilize everything; now we are usually down to two or three days. So, huge improvement.
Thanks to CI — but it is still problematic. Another problem is that when you make a bug fix for a particular branch, it may work great on the integration tree, and then you ask for it to be merged to the stable kernels and it actually makes the thing it was supposed to fix worse. That's something we've seen. So every time you are about to apply a patch to a tree, you need to test the integration of this patch on the tree you actually want to apply it to, and that means even for the stable branches. Okay, and I'm going to hand over the mic to Arek.

Yeah, so I'm Arek, and I'm one of the maintainers of the IGT GPU Tools, as it happens, so I think it's good that I will be talking about that. I'm also maintaining some parts of the infrastructure that we run the tests on, so I'm helping with that quite a lot. So, IGT GPU Tools: it's a collection of tools and tests that aid the development of DRM drivers. Actually, they are mostly tests: we have 60-something thousand tests right now, 61,000, and we run them quite often; not all of them, a sensible subset of around 4,000, let's say, because we skip gem_concurrent.

We recently made a couple of changes to the project. We changed the name: previously it was named Intel GPU Tools, but we dropped the "Intel" from the name because we support other drivers as well. It's not only Intel now: we support amdgpu, we have a couple of tests for that, we have some integration with the nouveau driver, and a couple of tests for VC4. So it's more than Intel. We also moved the mailing list, so people won't be put off by sending mail to a company-named list; there is a separate, vendor-agnostic list for IGT now. And we are starting the migration from autotools to the meson build system, because it's much faster and nicer; it's the new shiny thing all people like to work on.

So, let's take a look at what tests we have in IGT: 61,000 tests in total. There are 18 tests dedicated to AMD, 27 tests for VC4, mostly by Eric Anholt, and we have about 1,500 tests for KMS, and those should be mostly vendor agnostic, because KMS — kernel mode setting — is a cross-vendor API: it's for setting resolutions and displaying images from the framebuffers. And then we have the Intel-specific GEM tests, which are the most numerous. As you can see, we have around 2,300 tests if you exclude gem_concurrent, which has a combinatorial explosion of cases and takes a very long time to execute in full, so we don't execute it on a regular basis, but it aids development; it's that kind of tool.

So, let's talk about why IGT is more than Intel. We are not the only DRM driver, and we can share huge parts of the code base for the testing, and the testing infrastructure as well, because KMS is not driver specific, as I mentioned before. All the DRM drivers follow KMS, and we can test that together, because the API has to be consistent: if something passes on one driver and not on another, well, the behavior should be consistent across all of them — VC4, Intel, AMD. So that's good to test using one suite. And why should we duplicate effort? Eric had started something called vc4-gpu-tools, a smaller project, but then he ported most of that stuff, I think, into IGT.
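Since the whole point is exercising different drivers with one suite, here is a minimal sketch of what a vendor-agnostic, IGT-style subtest can look like. The subtest name and the specific check are hypothetical, invented for illustration; real tests live in the igt-gpu-tools tree.

```c
/*
 * A minimal sketch of an IGT-style subtest, loosely modelled on how the KMS
 * tests are structured. The test name and the check itself are made up.
 */
#include "igt.h"
#include <unistd.h>
#include <xf86drmMode.h>

IGT_TEST_DESCRIPTION("Sketch: check that KMS reports at least one connector");

igt_main
{
	int fd = -1;

	igt_fixture {
		/* DRIVER_ANY keeps the test vendor agnostic: i915, amdgpu, vc4, ... */
		fd = drm_open_driver(DRIVER_ANY);
	}

	igt_subtest("basic-connector-count") {
		drmModeResPtr res = drmModeGetResources(fd);

		igt_require(res);	/* skip if the driver does not expose KMS */
		igt_assert_lt(0, res->count_connectors);
		drmModeFreeResources(res);
	}

	igt_fixture
		close(fd);
}
```

The fixture/subtest structure is what lets the runner enumerate and execute subtests individually, which is also what gives the CI its per-subtest results.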
As for what still has to be done: IGT started as a single-vendor project, so there are a lot of Intel-isms in there, and as we'll show you on another slide, not all of the tests pass on other GPUs because of the Intel-isms that sneaked in here and there. Cleaning them up should be rather trivial, but we need your help with that. Also, we are not handling multiple GPUs per host yet, so it usually defaults to the first, Intel, one — you can guess why: one Intel GPU at a time, because of the history. So handling multi-GPU configurations, so you can test the one you want if you happen to have a discrete GPU next to an Intel GPU, still has to be done; it's one of the requested features.

So let's see how it looks. Okay, the slide got cut off, but I will read it to you: we tested the nouveau, VC4 and NVIDIA blob drivers, and for nouveau we have 125 passes, for VC4 it's 118, and for the blob driver it's 20. But don't blame the drivers — it's not a fair comparison, because most of the fails are due to requirements on Intel hardware that shouldn't be there in KMS tests. Those are the low-hanging fruits: you can just jump in, run the tests on other GPUs and fix them quite easily; it's about 90% of the fails and skips for KMS, I think, for the open source drivers. For the proprietary driver, not all of the API is implemented, and that's why it just crashes on it, but it's good information, right? Yep, so we know that 20 tests pass, so that's not bad; at least the standard things work.

So let's also talk about Intel GFX CI. Now we know that we have a test suite; now about the infrastructure we run it on. We want to provide an accurate view of the state of all the hardware and software configurations we support, so we try to run on them as often as possible, and we believe the results should be transparent. We have a page that lists all the hardware configurations and the assigned labels, and in the results you receive from the system you get the name of the configuration, the results, and every delta, every change in test state.

We also believe it should be fast, so results should basically come in around 30 minutes. That's heavily dependent on how many patches are sent to the mailing list. Usually the test run takes around 10 to 15 minutes for the fast feedback; this is a small set that is just a sanity check on whether you haven't broken anything seriously. So that takes 15 minutes, but if there are multiple other series in the queue it will take longer; around 30 minutes is a safe bet.

It should also be visible: we don't want a system that makes it a struggle for the developer to find the results and explore them, so we send them to the mailing list. If you send a patch series to the mailing list, in 30 minutes you can expect the first feedback from the CI system saying, okay, those tests are fine, you're good to go, or those tests failed, or those tests changed state from pass to skip, or something like that, with all the devices listed and all the associated bugs, if we know them. Also, if the series fails to build, the feedback is even faster: when we fail to build, or fail to boot, it's much quicker to send that out.
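Going back to the multi-GPU point from a moment ago: the suite currently tends to default to the first (Intel) device, but picking a device by kernel driver name is conceptually simple. A hedged sketch of the idea using plain libdrm; this is not how IGT actually does it, and the helper name is made up.

```c
/*
 * Sketch: open the DRM device whose kernel driver matches a requested name
 * ("i915", "amdgpu", "vc4", ...). Illustration only, not IGT code.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <xf86drm.h>

static int open_gpu_by_driver(const char *wanted)
{
	for (int i = 0; i < 16; i++) {
		char path[64];
		snprintf(path, sizeof(path), "/dev/dri/card%d", i);

		int fd = open(path, O_RDWR | O_CLOEXEC);
		if (fd < 0)
			continue;

		drmVersionPtr ver = drmGetVersion(fd);
		if (ver && strcmp(ver->name, wanted) == 0) {
			drmFreeVersion(ver);
			return fd;	/* caller owns the fd */
		}

		if (ver)
			drmFreeVersion(ver);
		close(fd);
	}
	return -1;
}

int main(void)
{
	int fd = open_gpu_by_driver("vc4");
	printf("vc4 device: %s\n", fd >= 0 ? "found" : "not found");
	if (fd >= 0)
		close(fd);
	return 0;
}
```

Something along these lines, wired into the existing device-opening helpers, is roughly the kind of selection the requested multi-GPU feature would need.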
And the results should be stable, so we are blacklisting all the tests that we don't trust yet, and we try to work on them, on making them more stable, and on exploring whether the issue is actually in the kernel driver or in the tests. For the known failures we file bugs and we suppress them at runtime: the tests are still run, as long as they don't put the hardware into some strange state, but they are constantly failing for some reason, and we have issues filed for them. The freedesktop or kernel.org bugzilla entries are listed along with the results, and they do not affect your final pass or fail result. Okay, go.

So, yeah, let's talk about the current state. We are providing pre-merge testing for drm-tip, the integration branch that Martin mentioned; all the patches developed for i915 should be on top of drm-tip. We also have pre-merge testing for the test suite itself, so whenever you submit a new patch for the test suite, it also gets tested on our infrastructure and compared with the previously merged version. Post-merge, we are doing many more trees: drm-tip as well, just to be sure that nothing was broken, Linus' tree, linux-next, all the DRM fixes branches, and we also have one branch for Dave Airlie that he can just push to and test whatever he likes using our infrastructure; he's the maintainer of the whole subsystem.

We have 74 systems in total in our CI, running constantly, and that's about 21 different configurations; those show up as distinct entries in the results because of the different displays connected, or different SKUs, and those are actual generations, low-power and high-power systems. So it goes from Gen3 up to upcoming systems that have not been released yet: from Gen3 up to Cannon Lake, which will be released sometime in the future. And Gen3 is from 2004, so that's 14 years — quite a lot of time. Developers can access the older ones, but they are not part of the CI.

Okay, so we also have what we call sharded machines. As you could see on the previous slide, we have quite a lot of tests and they take quite a lot of time; if you ran them on a single host it would take multiple hours. So we came up with a solution, the sharded machines: those are identical machines in seven or eight copies, Gemini Lakes for instance, and we just split the workload among them. So you can get the full IGT run, excluding gem_concurrent, in about 40 to 50 minutes if there's no queue, and that helps us deliver timely results. We have Haswells, Kaby Lakes, Sandy Bridges, Apollo Lakes and Gemini Lakes there. It's a smaller pool of machines, but those are pretty important, because we test the whole suite on them; that's the machines from the past five years or so.

We also have Xeon machines, the big ones, and we have GVT-d on Broadwell and Skylake for the virtualization testing. We have plenty of different monitors, KVMs and dummy plugs connected to those machines: HDMI, DVI, DisplayPort, embedded DisplayPort, MST displays, DSI, displays connected over Thunderbolt, and LVDS — so pretty much everything you can get. And we are running IGT on all of that: we have the fast feedback list of around 288 tests, those are the ones that run in 15 minutes and provide the first feedback, and then we run the full KMS suite and quite a lot of GEM tests.
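To make the sharding idea concrete, here is a deliberately trivial sketch of splitting a test list round-robin across identical shard machines. The test names and the shard count are placeholders, and the real scheduler is of course more involved than this.

```c
/* Sketch: round-robin split of a test list across N identical shard machines. */
#include <stdio.h>

int main(void)
{
	const char *tests[] = {
		"kms_flip@basic", "kms_cursor_legacy@basic",	/* placeholder names */
		"gem_exec_basic@basic", "gem_mmap@basic",
		"kms_addfb_basic@basic", "gem_sync@basic",
	};
	const int num_tests = sizeof(tests) / sizeof(tests[0]);
	const int num_shards = 3;	/* e.g. identical Gemini Lake boxes */

	for (int shard = 0; shard < num_shards; shard++) {
		printf("shard-%d:", shard);
		for (int i = shard; i < num_tests; i += num_shards)
			printf(" %s", tests[i]);
		printf("\n");
	}
	return 0;
}
```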
That's around 2,071 tests, and they take about 50 minutes on the sharded machines. As for throughput: we started in August 2016 with 22,000 tests executed a day, and we grew quite a lot, so we now total around 850,000 tests executed a day for the whole CI system.

We file bugs for all the issues we find in the CI, usually within half a day; if there's something new, it gets filed during the workday in less than four hours, which is a pretty good result. And this game is called "spot the holiday week" — yeah, Christmas. You can see that we execute quite a lot of tests: on the left side, I don't know if you can read it, it's millions of tests executed, and it went up to 5.5 million just before Christmas. And on the other side it's the number of builds made per week, up to 300 or 400.

Okay, and here we have the number of builds per week, per tree. The top one, the orange one, is drm-intel-fixes; then we have the rest; drm-tip is the biggest one, the long blue one in the middle — no, that one is Patchwork. So what this slide is showing is that post-merge testing is using about half our time, and the rest is the pre-merge testing — pretty much testing of drm-tip and IGT, that's the main thing.

Okay. Well, we've been going pretty fast, that's surprising, actually. So what we're going to show you now is a demo of what we provide. This is the landing page of our graphics CI, and, you know, we are engineers, we don't really care about the design. Sure, we do graphics, but ultimately it's an engineering UI: you just want to get to the point. And I don't want to be writing as much HTML as I'm currently doing, actually.

First things first: knowing what our system is running on. You've got two things: the short hardware list and the full one. So let's have a look at what we have here. Whenever we report a bug, we use the host name as an indicator of what it is, because you don't want to be explaining the entire configuration of every machine you report a failure on; it's easier to just have a nickname, and then if people are interested they know where to go, and if they don't know, we can teach them — the people who are going to read that are our developers anyway, and I think we can teach a few tens of developers to understand what it is. So you've got the host name, then what users would usually be reporting, so you can see the brand name of the machine and what generation the CPU and GPU are; those are the nice public names, not the ones we use most of the time. Then you can see which connectors are available on the machine, and what is not connected. For instance, on the first one DVI is not connected, but if a developer were interested in testing DVI, because he or she has a reason to believe it may be more problematic than anything else, then we can get a request to connect it, so we don't have to have as much communication as one would expect. For instance, this ELK machine here has been very problematic with DisplayPort: on DisplayPort the machine just dies when executing certain tests, but on HDMI it's fine; who would have thought? So that's the sort of thing that is very important to know. Oh, and this page is not up to date. Oh well.
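Those per-machine connector lists are essentially what KMS itself reports. A hedged sketch of enumerating connectors and whether something is plugged into them with plain libdrm; this is just an illustration, not the CI's actual tooling.

```c
/* Sketch: list KMS connectors on a device and whether a display is plugged in. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

int main(void)
{
	int fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC);
	if (fd < 0)
		return 1;

	drmModeResPtr res = drmModeGetResources(fd);
	if (!res) {
		close(fd);
		return 1;
	}

	for (int i = 0; i < res->count_connectors; i++) {
		drmModeConnectorPtr conn =
			drmModeGetConnector(fd, res->connectors[i]);
		if (!conn)
			continue;

		printf("connector %u (type %u): %s\n",
		       conn->connector_id, conn->connector_type,
		       conn->connection == DRM_MODE_CONNECTED ?
		       "connected" : "disconnected");
		drmModeFreeConnector(conn);
	}

	drmModeFreeResources(res);
	close(fd);
	return 0;
}
```

Run against /dev/dri/card0, this prints one line per connector with its type and connection status, which is roughly the information the short hardware list shows.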
Then, if you want the full information about the hardware, you can go, for instance — yeah, there's a Braswell here — and you get all the details. CPU info is, well, /proc/cpuinfo, so you can check that; then you can also check what is available in the BIOS, the hardware configuration. If you want to look at the RAM, you can see what RAM is available — so this one only has one module and it's in memory channel A — and you can see the clock, the voltage, really a lot of things from this. There's also the iomem, to see where the BIOS is putting every peripheral in the physical address space, and things like this. So those were the first two links; a lot more to go.

Next thing: post-merge. That's all the trees that we test: IGT, drm-tip, Linus' tree, and so on. What matters is that on the side you have "fast" and "full", so we're going to take drm-tip and go for the fast issues. Here you don't see the list of the 288 tests that we are running, you only see the issues, for the past five runs or more — six runs, for the past six runs; oh, five, okay, I knew how to count. And you can see what is going on, and here at the top you've got the list of machines that are being tested; there's a decent amount, especially when you don't have a 4K monitor. You can see that, well, most things are green, but not across the board: on this Sandy Bridge this test is a problem, and actually not every time, because apparently it only failed once in these five runs. So if you want to know what is going on, you can click on the test, and you can see, across all the machines on the vertical axis and many more runs, that on this Sandy Bridge this test has been problematic, but quite sporadically. That's a nice view. And then you can click on a run and see what is going on: you can see the list of commits that were added between the previous version and this new version, so if there is a regression you can find more easily who to blame — don't assign blame, sorry — who to contact.

Okay, very good. Then you can see the log of the execution of IGT. Well, in this particular case it's just a timeout: the watchdog just killed the machine, nothing moved for 10 minutes, so it just freaked out. This one is probably... since this test is about reading all the entries in debugfs, and there is one entry that allows you to read the CRC of an output, if there's no output active you're just going to wait for it. So likely the problem is that the kernel thinks it has an active display, but actually doesn't, and so it just waits for it; oopsie. But you can also see the boot log, whatever happened at boot time; apparently this one was taken from the serial console, so nothing interesting there.
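Back to that hanging CRC read for a second: the failure mode is a read on a debugfs file that only produces data when frames are actually being scanned out. A hedged sketch of reading such an entry with a bounded wait instead of blocking forever; the path is an example (the CRC data of the first CRTC of the first DRM device), not necessarily the exact file the test tripped over.

```c
/*
 * Sketch: read a debugfs CRC entry with a timeout instead of blocking forever.
 * In practice a CRC source has to be selected through the neighbouring
 * "control" file first; the point here is only that the read can block when
 * no display is actually active, which is what the watchdog ends up catching.
 */
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	const char *path = "/sys/kernel/debug/dri/0/crtc-0/crc/data";
	char buf[256];

	int fd = open(path, O_RDONLY | O_NONBLOCK);
	if (fd < 0)
		return 1;

	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ready = poll(&pfd, 1, 10000);	/* give up after 10 seconds */

	if (ready > 0) {
		ssize_t n = read(fd, buf, sizeof(buf) - 1);
		if (n > 0) {
			buf[n] = '\0';
			printf("crc: %s", buf);
		}
	} else {
		fprintf(stderr, "no CRC within 10s, output probably inactive\n");
	}

	close(fd);
	return 0;
}
```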
So that's what happened before we started running the tests, and then the dmesg here is what happened after we started running them. You can see the watchdog being opened, and what timeout it's set to, like a hundred seconds; then you can start seeing that a certain test started executing here, then it starts this particular subtest, then this subtest is done, and so on. Okay, so that was the fast run.

Obviously, if you look at the full one, you're going to get a lot more issues. So that's the list of issues, only the issues, because otherwise loading the entire page takes freaking forever. You don't have as many platforms: as we said, we only have the Sandy Bridge, Haswell, Apollo Lake, Kaby Lake and Gemini Lake, the more recent platforms. And you can see that there is sporadic noise, and this is enemy number one in CI. You really want to have stable results, because otherwise developers are not going to trust you when you tell them: hey, actually, your patch is problematic. So this is what we need to address. I don't know if I can move further left, but all this area here — that's bad, that's really really bad. If a test is consistently failing, that's super simple, right? It went from fail to fail, who cares. But when a test randomly goes from, say, pass to dmesg-warn — or whatever it is; oh, it's a bug, okay, dmesg-warn probably — that is problematic, because the patch has nothing to do with it. But how can a stupid CI system know that? So you have to have another layer of bug filtering, and we're going to talk about that later. So that is the first layer: just the presentation of the raw results.

Okay, we're not going to click on all the things, because you know what that means: it's going to take forever. And we're not going to look at the other trees either; it's exactly the same logic. So now for pre-merge testing. We do pre-merge on the graphics CI mailing list, basically for drm-tip and for IGT, and, well, that's it; let's not worry about the trybot. So, I don't know if you're familiar with Patchwork. Patchwork is something that was originally developed at Ozlabs, which is a part of IBM. It was developed quite a long time ago, then it was not developed for a long time, until our peer Damien Lespiau picked it up for development. Then the original Patchwork started catching up, modernizing the user interface and gaining all the nice features, so we are now kind of in a fork situation: we have the freedesktop flavor of Patchwork with all the modern features, and there's also the original one. We are using the freedesktop one, obviously, and Damien is also the maintainer of this fork. One nice thing about it is that it has support for patch series, rather than having a lot of patches and just selecting individual ones — yeah, well, the way the original does it is terrible anyway; UI-wise it's really not how I would do it, at least. So what you can see here are just series: one line is one patch series, and if a patch series contains a hundred patches, then you don't see them individually in this view.
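Circling back to the watchdog that showed up at the top of that dmesg: a minimal sketch of how a test runner can arm the hardware watchdog so that a machine that stops making progress gets rebooted on its own. The 100-second timeout mirrors the log above; everything else is illustrative, not the actual runner code.

```c
/*
 * Sketch: arm /dev/watchdog with a timeout and ping it between tests.
 * If the runner stops pinging (machine hung), the watchdog reboots the box.
 */
#include <fcntl.h>
#include <linux/watchdog.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/dev/watchdog", O_WRONLY);
	if (fd < 0)
		return 1;

	int timeout = 100;			/* seconds, as in the dmesg above */
	ioctl(fd, WDIOC_SETTIMEOUT, &timeout);
	printf("watchdog armed, timeout %ds\n", timeout);

	/* ... run a test ... */
	ioctl(fd, WDIOC_KEEPALIVE, 0);		/* ping while things still move */

	/* Magic close: write 'V' so closing the fd disarms the watchdog. */
	write(fd, "V", 1);
	close(fd);
	return 0;
}
```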
Since we test only patch series, we don't test individual patches; that would take too long. Then you get test results on a patch series, and here you can have a look at, for instance, this patch that was sent by Chris Wilson. You can see when, and you can see that there were two revisions of it. And, oh yeah, this one is bad, because of course, apparently, he's working on a Saturday; yes, Chris Wilson. And this one is more interesting, so you get a lot of different results. The first one is Fi.CI.BAT; that's the basic acceptance test, the fast feedback — your fast feedback. You can see here that there are two problems: there was one pass-to-fail and one pass-to-incomplete. But these were potentially not due to this patch series, because we have a bug associated with this particular entry and this particular machine; we're going to go into that again a bit later. So basically: no changes, so no worries. You get a link here, so you know where to go if you're interested in this failure, so you can check whether it is actually a false positive of the system that suppresses the noise, or whether it's exactly what you were expecting. Then you can see the list of machines, how many tests got executed, how many passes and so on, how long it took, and the list of commits between the previously tested version and the newly tested version. And if you're interested you can have a look at the full results, which is the complete A/B comparison: the first column is the previous run and the right column is the new one. So here CI_DRM_3270 was the baseline, then we applied the patches and that build was named Patchwork_7875, and so yeah, it's A/B testing.

So let me take over for a second. With Patchwork, the idea is that it's as unobtrusive as possible — can you go back to Patchwork? Yeah. So it picks up the patches from the mailing list: it has an email address associated with it, it is subscribed to the mailing list, and it parses the patches and arranges them for you, for free. And when the CI system sends the results, what you see by expanding this section is also sent out as a reply on the mailing list. So we don't expect developers to know about the tool, to go there, to discover it, or to have that knowledge from somewhere: they actually get a reply with the results on the mailing list, with a link to Patchwork and the expanded results too. So it's as unobtrusive as possible. Thanks.

Okay. So yeah, we're not going to talk too much about the rest, but there's a checkpatch run, so you can see if that was fine or not, and also a sparse check, which is something we started working on, so every patch in this case gets checked. Okay, so that was Patchwork. We can also verify that the CI system is indeed sending results to the mailing list, so let's pick one. Well, here for instance, you've got this particular patch, and then you get a first result here. This patch got sent at 8:29, and let's see how fast it was: well, exactly 30 minutes later — no, a bit more actually, oopsie. But generally that was it: you get the results immediately on the mailing list, which makes it much nicer. Not everything is there, but the most important things are.

Okay, then what else? Ah, it was here. You may also be wondering where your patch is in the queue.
If it's still waiting, you want to know when your patch is going to be tested, and if 20 patch series are sent at the same time, the system is not going to be able to test them all in 30 minutes. So we've got a queue page that allows you to check that. Unfortunately, we made an update to our CI system and now this page is broken, but we'll fix it on Monday. But it's fine, because right now it would be empty anyway. What you don't see on the right is the time posted, so you can see how long ago the patch series got posted, and indeed the last one that we saw — the one that didn't get all its results yet — was posted 18 minutes ago. And we have this for all the pre-merge queues. (I stopped it; accidentally pressed something.)

Okay, so that was pre-merge; let's keep going down the list. We've seen how to get the results, very good. Now, where do the issues get reported? Because if you have a CI system, it's wonderful, but if no one is tending to it, reporting bugs and making sure that the bugs get addressed, it's not going to go anywhere. We use two bug trackers: there's the freedesktop bugzilla, which is where we file all the bugs for IGT and i915, and the second one, the kernel.org bugzilla, is for everything else, everything that is not DRM or X or whatever.

Then we can have a look at the system that we use to suppress tests — what we called blacklisting before, but it's really more like suppression. So that's the old tool, and later I'm going to show you the new tool that I've been working on, which is a bit shinier. What you can see here is the list of active issues that we have. So for instance, this FDO-something-something: you can see the status of the bug, when it was last updated, the summary of the bug, then what it affects — so it's actually only this test on this particular machine — and you can see the failure rate, which is quite useful if you want to make a bug report. And you can see the history of when it was last seen and all that. Well, okay, this one was seen only once, which makes it not that interesting, but this one is a bit more interesting: this test is problematic on three machines, and we can see the history on all of them. So you can see that it was problematic five runs ago, six runs ago, seven runs ago — well, basically all of them — and it's potentially fixed already, because it was failing very consistently and now it isn't anymore, so we could probably close it.

So that's the old tool, and it has some statistics and metrics. You can see, for instance, the number of issues over time: it just grew, because we added a lot of machines and we kept on adding machines, but it's slowly going down again, so that is nice. Well, there are plenty of metrics, actually, but one that can be interesting for you is the distribution of testing requests over time, and you can see what the origin is. So this one here, the blue-ish one, is drm-tip; then you get the trybot, which we're not going to talk about; and then Intel GPU Tools is the green one — yeah, there are only a couple of those.
So it's not very common for them. And then what you can also see here is how long it usually takes: you've got weeks at the bottom, and basically each column is showing a distribution of how long it took to test. For instance here, it means that most series got tested in about two hours, but then we had a long tail, and the highest one was 18 hours; sometimes it just happens that there's a lot of things going on, with priorities and all that. And then you can see the median here, and that's the deadline — that's what we would like to keep the results under.

So, okay, we'll continue with the demo if we still have a little bit of time — actually, we should. Yeah, okay, so I can show you the new tool. This is the old one that I was mentioning, not exactly the cutest looking one, but I've been working on this new tool here, which is a bit nicer. Right now there are no issues filed, but you've got a ton of untracked issues, because I imported some runs that our CI system made. And then we're going to have a look into one of them; let's open the first one. The first one is on this test, on this particular hardware and software configuration; it was a fail by IGT standards. You can see when it got started, the duration, how it got run; you can see the standard output of the test, the error output of the test, and even the kernel messages generated during the execution, if there were any. And you can see the history of this one: I imported three runs, you can see them here — 3669, 3670 and 3671 — and it got them ordered properly.

So what we're going to do now is file a bug for this, because why not, right? We want our CI system to be able to categorize, to know which bug it is, in order to filter the results in CI, so we know whether a patch series is introducing a new problem or whether it's something that is already known. This is the tool that is going to tell the system: this is known, or this is not known. So we're going to teach the system how to know this now. I'm going to just copy this — okay, it's copied, whatever — and I'm going to file a new issue. So you get this lovely thing: we're going to create a new filter, and of course it doesn't fit the screen. So there is this string that was very typical of this bug — this string here, you see, where it failed with 1 — and this is what I want to match. I want to match this, and actually I would like to know on which machines it's problematic, and the tool is telling me that out of the 858 failures that I currently have to teach the system about, 131 are matching this one, so that's pretty good. So it is available on — okay, the tag doesn't really matter, but it's drm-tip — then on 31 machines (okay, that's a lot), 10 tests, and one status, all of them fails. Since I'm very lazy, I just added a button that selects everything for me. Of course, this is unreadable because the screen is super small, but I can show you what it is: here are the tags, so if you've got bugs only in one tree, or in linux-next, or whatever tree, then you can select it there; then you've got which machines the bug is visible on — well, here it's really impossible to see — and then the list of tests that we want to cover, and the list of statuses, so the outcome, like fail, pass, whatever.
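To make the filter idea concrete: conceptually, a filter is just a pattern over the failure output plus the machines, tests and result states it applies to, and a result that matches is attributed to a known bug instead of being blamed on the patch series. A hedged, heavily simplified sketch of that matching (single machine/test/status instead of lists, and all names and strings are illustrative), not the actual CI Bug Log code:

```c
/*
 * Sketch: the kind of matching a CI-Bug-Log-style filter does. A filter is a
 * regex over the failure output plus the machine, test and status it applies
 * to. Names and strings here are illustrative only.
 */
#include <regex.h>
#include <stdio.h>
#include <string.h>

struct result {
	const char *machine, *test, *status, *stderr_text;
};

static int filter_matches(const struct result *r,
			  const char *machine, const char *test,
			  const char *status, const char *pattern)
{
	regex_t re;
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB))
		return 0;

	int hit = strcmp(r->machine, machine) == 0 &&
		  strcmp(r->test, test) == 0 &&
		  strcmp(r->status, status) == 0 &&
		  regexec(&re, r->stderr_text, 0, NULL, 0) == 0;

	regfree(&re);
	return hit;
}

int main(void)
{
	struct result r = {
		"fi-snb-2600", "igt@pm_rpm@basic-rte", "fail",
		"rtcwake failed with 1\n",
	};

	printf("known issue: %s\n",
	       filter_matches(&r, "fi-snb-2600", "igt@pm_rpm@basic-rte",
			      "fail", "rtcwake failed with [0-9]+") ?
	       "yes" : "no");
	return 0;
}
```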
So you just put all of these in, and then you create the filter... come on, what are you doing... come on, seriously. Well, it doesn't matter; let's pretend that it added it here. Of course, it's not supposed to fail like this — I use this in production and it works, except now, in, you know, a development environment. Yeah, exactly. But then what you need to do is associate this filter with a bug, because we want to report bugs; it's not just about knowing or not knowing, we want developers to know as well. So the tool forces you to associate filters with bugs, one or more. I already know that this bug is actually this one — it used to be "rtcwake failed with 256", but the output changed recently, but whatever — so then you just say: here is the bug, here's the bug tracker, then you put your email address so it's known and people know who to contact, and then you press save. So I need to put one in, and then I just press save, and... it's complaining because there are no filters. Oh well. But I can show you how it looks on the production one — I can show you the address, that's why it's in a different tab. On this one you can see it's a bit like the other one, except a bit more stylish. You can see the failure rate, for instance, of this one: this bug is, okay, a performance counter bug, and we have three filters matching this issue, because we consider that there are somewhat three ways for the problem to appear, and you can see the failure rate of the issue as a whole: out of the last 21 runs, it was problematic six times. And then you can see that this failure mode was more problematic than this one, and than this one. Yeah, okay.

So what we want to do now is keep on doing this: adding more hardware configurations, new platforms and new display types, and especially ones that allow you to plug and unplug stuff automatically — that's Google's Chamelium. We want to add more test suites, and more tools to do auto-bisection, so let's go over that now. CI Bug Log is what I was showing you, and we are looking into open sourcing the tool so everyone could be using it. And then there's EzBench — I've talked about this project before: it's an auto-bisector for changes in performance, rendering and unit tests. This way we could improve the bug reports automatically by having auto-bisect on them: it would tell you, hey, this was actually introduced by this commit.

And if you want to help us: you can help us on the infrastructure, of course, because we're trying to open source it; on IGT, because you can write your own KMS tests, or tests for your own driver; and if you've got self-tests that somewhat impact graphics, like suspend and things like this, then we could even run them in our farm and file bugs, because it affects us, so we'd have to do it. Of course, we're not going to do your QA, but it can be, you know, a synergy, so that would be good. And here are the contacts if you're interested. I don't know, how much time do we have? Okay. So yeah, any questions?

So the question is how we select the hardware we run on, and especially — you were talking about VC4, I guess — hardware that we don't really deal with. Well, that's a good question: why not? I don't know, and we did not investigate. Any other questions?

Okay, how do we implement sandboxing for the builds? Yeah, there's a separate network.
They're separate machines, and if something fails, we just roll them over. The machines are not running inside the Intel network; it's really something separate, and the whole infrastructure is completely isolated from the internet and from the corporate intranet. So we just roll the machine over and get a fresh one. Yeah, okay.