Hi everyone. I'm Guillaume Tucker. This talk is about KernelCI. Who is familiar with KernelCI already? Can you raise your hand if you know it? Okay. This looks a bit like science fiction, but it's a real project about real problems: testing the Linux kernel.

Of course, some people are testing the Linux kernel already. You have individuals, like developers and maintainers, who test their code on their own platforms. Then you have more downstream people, like users and distributions, who test things on their own products and make sure they work, but always very targeted at the things they really care about. Sometimes this ends up fixing things upstream, but that's a second priority most of the time. Then you have wider projects for testing the upstream Linux kernel, like Intel's 0-Day and Linux Kernel Performance systems. They actually build a lot of kernels on many architectures, but the tests are mainly about Intel x86, which kind of makes sense because it's run by Intel and they care about having their platforms working in the Linux kernel. Then you have Linaro's Linux Kernel Functional Testing (LKFT). That's mostly working on Linaro member platforms, like 96Boards, and overall on ARM platforms. There are a couple of exceptions, a couple of x86 boards and other things in LKFT, but again it's driven by Linaro's motivations, which is to test on their platforms. And it all makes sense, because that's what these organizations were set up for.

So basically what you end up with is that there are things in the kernel that are very well tested, because people care about them, and other things that people don't really test very well. If you look at the total test coverage, all the code, all the APIs, all the drivers in the kernel, some things are really well tested and some are not, and it's always the same things being tested. What KernelCI is trying to do is basically go off-road a little bit: test the other things and bridge the gap.

The actual power of KernelCI is to apply the same idea of free software to testing. You have different people running different tests on different platforms. If you combine them, if you make your hardware available to others and your tests available to others, you can have your tests running on someone else's platform, and the other way around, so people can exchange things and you multiply the number of things being tested. That's how you expand the test coverage. In the upstream kernel everybody contributes code in the same place; the idea of KernelCI is to have one place to test all of this together.

So maybe people are starting to know a bit more about KernelCI, but it actually started around 2013. I'm not an expert in the history of KernelCI; some people here started it, and I joined in 2017. If you go on the Linaro website it says, I think, that it was a Linaro project in 2014. Originally it targeted mostly the ARM ecosystem, and it did build and boot testing but not more than that. For a long time it didn't have a mailing list, it was hard to know what was going on there, and it happened pretty much behind closed doors. But now it's starting to move again, and it needs a new home.
So right now it's held together in a very loose way, but we are hoping that the project will become part of the Linux Foundation very soon. That will give it some structure and organization, as well as a sustainable funding mechanism through a membership scheme. Collabora, where I work, is committed to becoming a member of that project. Like I've explained, it's about filling gaps in kernel test coverage. Some big companies have already provided some servers, so we know there is interest from different kinds of businesses; it's not just embedded, it's also the cloud business. Hopefully that will give us more sustainable funding, a more sustainable structure, and infrastructure as well, which I didn't put here. Right now people are giving servers, giving hardware, but there are no contracts. We don't really know how long they're going to keep doing that, and if something changes we can't plan ahead. If we want to build more trees, maybe we can now, but who knows if we'll still be able to do that in six months' time if we have fewer servers. So hopefully that will help the project.

That was a summary of where the project is coming from. Hopefully I'll have a bit of time for questions at the end, so I'll go through the next parts, but if something here doesn't make sense, please stop me and I'll try to clarify.

So where are we now with KernelCI? Here's a summary of what it does. You have developers, people making changes in Linux, pushing changes; typically that's maintainers, but it could be individual trees as well. There are about 100 branches monitored by KernelCI, from mainline to individual developer trees. Changes are detected and built on various kinds of servers, all managed by Jenkins at the moment. Then all these kernels are tested to see if they boot, and we also have some functional tests; I'll come back to that in a bit. Most of the time it's using LAVA to run the kernels and capture the results, but there are a couple of labs that are non-LAVA, so it's possible to include other things in KernelCI.

The results are stored in a database, and there's some processing done to detect regressions: if something worked on one kernel revision and failed on the next revision of the same branch, that's detected. We also now have automated boot bisections, so if a board booted and then doesn't boot, a bisection is run, and if it succeeds it will send an email with the commit that it found and some detail. We're starting to add this for all kinds of tests as well, not just boots. So that's basically what you get: emails telling you what was built, what was booted, what failed, plus regressions and bisection results if it finds them. Then there's a web dashboard on kernelci.org, and we're hoping to develop other kinds of dashboards as well, because the one we have is aging a little bit.
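As a side note, here's a minimal sketch of that regression detection idea, assuming results are kept per branch and per test as an ordered list of (revision, status): a regression is simply a pass followed by a fail on consecutive revisions. The data layout and names are hypothetical, not the actual kernelci.org schema.

```python
# Minimal regression detection sketch: flag tests that went from PASS to
# FAIL between consecutive revisions of the same branch. The data below
# is illustrative only.

# (branch, test) -> list of (revision, status), ordered oldest to newest.
results = {
    ("mainline", "boot"): [("v5.0-rc1", "PASS"), ("v5.0-rc2", "FAIL")],
    ("mainline", "v4l2-compliance"): [("v5.0-rc1", "PASS"), ("v5.0-rc2", "PASS")],
}

def find_regressions(results):
    """Yield (branch, test, last_good, first_bad) for each PASS -> FAIL step."""
    for (branch, test), runs in results.items():
        for (good_rev, good_status), (bad_rev, bad_status) in zip(runs, runs[1:]):
            if good_status == "PASS" and bad_status == "FAIL":
                yield branch, test, good_rev, bad_rev

for branch, test, good, bad in find_regressions(results):
    print(f"{branch}: {test} regressed between {good} and {bad}")
```

A real system would also track which lab and device produced each result, since the same test can pass on one platform and fail on another.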
In numbers, we're building and booting a lot of kernels; it's hard to deal with big numbers, but basically, averaged over the year, there's a new kernel being built every 24 seconds and a board booted every 40 seconds. We have 76 device types. We have more devices than that, but these are different types of devices; sometimes we have, say, three Raspberry Pis, but 76 is the number of different pieces of hardware. These are the main labs; there are some other, smaller labs as well. And these are the architectures we build for, quite a lot, almost all of them, and we're testing on most of them. I hear Kevin is just about to get testing going on RISC-V as well, which is cool. We're not testing on ARC yet. So we're almost covering everything, and it's growing all the time.

About boot bisections: they currently run on mainline and the stable branches, linux-next, and some subsystem and maintainer trees. I've put a few links here if you want to look, some older ones and more recent ones. This one was actually the first where the fix commit said "Reported-by: kernelci.org bot", which was quite cool, if you look here. The others were reported manually: initially we would let the bisections run, look at the results, verify and curate them by hand, and if they looked good we would forward the report to the people involved. Now it works out the recipients automatically, based on the code that was changed, the author of the commit, and all the trailers in the commit message.

Functional tests: these are the things we're running. We're not running them very intensively yet, but we're starting. For IGT we're running a subset, the common subset that's architecture-independent; it's not just Intel, it's the DRM KMS tests, about 100 test cases I think, on a few platforms. We're not really sharing the results much yet, we're just starting with that, and hoping to grow it in the next few months. For the media subsystem, we're running the V4L2 compliance test suite on vivid and on uvcvideo on several hardware platforms, and we're working with the media subsystem maintainers to find the best way to run the tests and report the results. I was actually talking to Hans Verkuil this morning about how to improve that and how to expand coverage, so there's good interaction going on here. It's the first subsystem where we're having this kind of collaboration to get some really good tests working in automation, and we're hoping to expand to other subsystems quickly. We're also doing some basic suspend-resume and USB tests. There are plenty of things we could do involving hardware being automatically plugged in and unplugged; we're not quite there yet, but it's the kind of thing we could do in the future.

So what's actually planned to happen next? Like I said, we're hoping for KernelCI to become a Linux Foundation project, with a membership scheme where people subscribe and provide funding that way. I don't know all the details, so I don't really know, from an infrastructure point of view, whether people will still need to contribute their own infrastructure, or whether people are going to be hired; I guess everything is possible. But it will give some sustainability and some stability to the project. We're still waiting for the project launch, so stay tuned and see what happens.

More build power: we've got some builders from Microsoft and Google, Collabora provides some builders, and BayLibre and some KernelCI founders as well.
So what we can do with more build power is build more trees. That means we can build more subsystem trees, and test them as well, which means we can catch problems before they get merged into linux-next or mainline. There are a lot of trees we're not testing at the moment, and those are especially the kind of trees where people test on only one or two platforms, because the changes are not very widespread at that point. If we could do that, we would reduce the number of failures further down the line. It's easier to catch problems early, because you have fewer changes; you can even test each commit on a branch that doesn't change very often. So maybe in the future we won't even have to run bisections, if we find all the problems on the subsystem trees, unless it's an integration problem, which happens in linux-next.

And of course we can add support for multiple compilers, and we've got that almost running now, thanks to Matt over there. It's funny: about four years ago I went to a Linaro Connect presentation about KernelCI, and the only question I asked was "can you choose a compiler?", and then it was merged last week, so that's pretty cool. We're adding GCC 7, GCC 8 and Clang 8 on some trees, on linux-next, on Intel and ARM.

Here's a concept I came up with: horizontal and vertical testing. KernelCI is good at testing small things, like build and boot, on a lot of platforms. Then you have other projects that are more about vertical integration, like LKFT and others. LKFT is typically about running long things, like the Linux Test Project; if you want to run the whole suite it takes hours, but it runs only on a few platforms. If you combine the two, maybe you won't be able to run all the tests on all the platforms, but at least it gives you the possibility. If you have the capacity to run all the tests, then it's just a question of whether you can scale: if you have enough devices and lab time, you could in theory run all the tests everywhere. That's the kind of idea I was trying to explain, where the breadth you have at the beginning gives you more possibilities for expanding tests. That's basically this picture of the beach: where everybody is walking on the same path, the idea is to be able to walk everywhere and discover the whole place.

Like I said, I was talking to Hans this morning: there are some areas of the kernel that don't even have a test suite, and these could be tested. Some areas have a test suite, of course. When it's a hardware thing you can really only test it with the driver, but some things could have a framework test or an API test.

What's also really important, and I've been working with Hans on V4L2 for example, is that KernelCI people can't fix all the problems; it's infinite. What's really important is to enable other people who have a board, a piece of hardware, or a test suite to make it work within the KernelCI project. There are two ways to facilitate that: having a standard way of running things in KernelCI, so people can make their tests work with KernelCI, and also helping to integrate things, so it comes both ways. If you have a nice test suite that doesn't run in LAVA, maybe we can make it work in LAVA; that kind of interaction has to happen. And of course there's very little standardization in test result formats, which means every test suite needs a special script to parse the results.
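As an illustration, here's the kind of small per-suite adapter script that ends up being written in the absence of a common results format. The raw output here is made up, loosely modelled on v4l2-compliance-style "test NAME: OK" lines; the point is only that every suite needs its own variant of this.

```python
# Hypothetical per-suite adapter: turn one test suite's raw text output
# into a common (name, status) form. The input format is an assumption,
# not a real KernelCI format.

import re

RAW_OUTPUT = """\
test VIDIOC_QUERYCAP: OK
test VIDIOC_G_FMT: OK
test VIDIOC_S_FMT: FAIL
"""

# One line per test case: "test <NAME>: OK|FAIL"
LINE_RE = re.compile(r"^test (\S+): (OK|FAIL)$")

def parse_results(raw):
    """Return a list of {"name", "status"} dicts in a common form."""
    parsed = []
    for line in raw.splitlines():
        m = LINE_RE.match(line)
        if m:
            status = "pass" if m.group(2) == "OK" else "fail"
            parsed.append({"name": m.group(1), "status": status})
    return parsed

for case in parse_results(RAW_OUTPUT):
    print(case)
```

With a shared standard for result formats, this whole layer of glue code would disappear.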
Having a small standard for that would be really nice. And the most crucial thing is to be able to rerun only the smallest piece of the test suite that actually failed, so you can track things down. If you want to run a bisection and there's a very long test suite that failed at some point in the middle, and you can't rerun just that part, it's going to slow down your whole bisection. These are the kinds of things we're trying to explain and make people aware of, so we can hopefully scale better, with people contributing tests and hardware. It's like exploring the unknown, mapping the kernel.

To summarize what I've been mentioning a few times during this talk about the media subsystem: we're now building the media subsystem branch, not very regularly yet because we're still turning this on, but the idea is to do it maybe every day or every week. And we're expanding the V4L2 test plan to cover more things. We've done this on QEMU with vivid and also on uvcvideo, and it's been great at improving test results. We also use it as an example for tracking regressions: now, if one test case in v4l2-compliance fails, you get this in an email report. It will tell you that it started failing, or if it keeps failing it will say that the last time it passed was with that version, it's been failing since that version, and it's still failing now. So there's been a good dynamic there, and hopefully we'll do the same thing with IGT; there's also GPU testing we might be doing soon in this area. Hopefully we won't have to come up with a completely different approach for each test suite. Unfortunately we probably will; well, we'll see.

Okay, so we found a couple of issues with uvcvideo. You can see the details here; I'm not going to spend too much time on this, but basically it's a driver that's been in mainline since 2.6.26, and if you run the compliance test on it you find tests that are failing, which you wouldn't normally expect. That's kind of an interesting thing: if people had been using CI all the way, I think it would have helped push people to fix things earlier. Showing failures pushes developers to fix them; if you don't run the test suite, you think everything's fine and you forget about the problems.
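To give a feel for what automating a compliance run involves, here's a rough sketch of driving v4l2-compliance from a script. Selecting the device with -d is standard, but the exact summary line ("Total: ..., Succeeded: ..., Failed: ...") varies between v4l2-compliance versions, so the parsing here is an assumption and only illustrative.

```python
# Rough sketch: run v4l2-compliance against a device node and extract the
# pass/fail counts from its summary line. The summary format is assumed
# and may differ between versions of the tool.

import re
import subprocess

def run_v4l2_compliance(device="/dev/video0"):
    """Run v4l2-compliance on a device, return (total, failed) counts."""
    proc = subprocess.run(
        ["v4l2-compliance", "-d", device],
        capture_output=True, text=True,
    )
    # Expected summary line, e.g. "Total: 43, Succeeded: 42, Failed: 1, ..."
    # (newer versions prefix it with "Total for <driver> device <node>:").
    m = re.search(r"Total[^:]*: (\d+), Succeeded: (\d+), Failed: (\d+)",
                  proc.stdout)
    if not m:
        raise RuntimeError("could not find v4l2-compliance summary line")
    total, _succeeded, failed = map(int, m.groups())
    return total, failed

if __name__ == "__main__":
    total, failed = run_v4l2_compliance()
    print(f"{failed} of {total} compliance tests failed")
```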
Okay, thank you very much. I think we have a few minutes left for questions. And all the pictures are Creative Commons or public domain, so you can find them here.

Yes? Do you have the ability, for someone trying to fix a failure, to explicitly run individual tests on demand, or is it all automatic?

Yeah, we don't have that. So the question is: right now we have this automated system that tests all the kernels automatically and finds failures; is there a way for someone to manually trigger a rerun of a test that failed? Right now, no. So if you don't have the hardware, if you can't reproduce it yourself, then you're in trouble, you can't really fix it. However, if you push a change to a tree and you hope it's going to fix the problem, it will go through the system, but that's a long loop. It's possible in principle; it's just a matter of giving people access to the labs. Collabora has a lab, others have labs, but it's your own hardware and your own environment, and if you gave everyone access to your lab it could easily go wild. So yeah, it's a hard problem to solve; if someone has an idea about how to do that, I'd be interested.

Yes? I think I've read about them. So these are made by Google, are they made by Google? So the question is: there are tools made by Google, I think syzkaller, and whether there's any plan to integrate them in KernelCI. I don't know right now, but if they can be used to run tests automatically, then yes, that could be integrated. There's no plan right now, but that would be a way of extending test coverage.

How do you deal with random failures? Say that a test has a probability of failing of one percent, and that's just what it is.

So the question is how we deal with random, intermittent failures. Well, in that kind of extreme case, one percent, we don't really deal with them in a specific way. For bisections there's something that can be done, which is to run each iteration several times. If the test fails most of the time, or the other way around, you can request that it's run several times, to be sure that it really passed or didn't pass. Also, the bisections have some checks: they check that the bad revision is really bad and the good revision is really good, they check that the result is actually failing, and that if you revert the change it's actually working. These checks run three times at the end, so if a failure is intermittent, that normally helps filter it out. But yeah, these are hard things to deal with. The original test reports will still say that it failed, but you only get a bisection report if it found a commit that passed all these checks, and it's very difficult to pass all the checks and still be a false positive. It can happen, but it's really rare; it's some weird combination of caching when trying to download kernel builds, some weird infrastructure issue. It normally doesn't happen.

Maybe related to your question: there are tests that are not pass or fail but more like a measurement, like performance or energy consumption, where it's not just yes or no. We don't really deal with those at the moment, and they're really difficult to bisect, because the curve over time can change many times and it's hard to find where it dropped. But these are things we can start doing.
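Here's a minimal sketch of that repeated-runs idea: a revision only counts as passing if the test passes on every attempt, and both bisection endpoints are confirmed before the result is trusted. run_test() is a stand-in for a real boot-and-test cycle, and the failure rates are made up to simulate an intermittent test.

```python
# Sketch of handling intermittent failures by repetition: confirm the
# "good" and "bad" endpoints of a bisection by running the test several
# times on each before trusting the bisection result.

import random

def run_test(revision):
    # Placeholder for booting a kernel and running the real test. The
    # "bad" revision fails most of the time, the "good" one ~1% of the
    # time, to simulate an intermittent test.
    fail_rate = 0.6 if revision == "v5.0-rc2" else 0.01
    return random.random() > fail_rate

def passes_reliably(revision, attempts=3):
    """Treat a revision as passing only if every attempt passes."""
    return all(run_test(revision) for _ in range(attempts))

good, bad = "v5.0-rc1", "v5.0-rc2"
if passes_reliably(good) and not passes_reliably(bad):
    print("endpoints confirmed, the bisection result can be trusted")
else:
    print("endpoints unstable, discard this bisection")
```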
Okay, changes in rendering of games; was it called ezbench? So there's a tool for that called ezbench, as in easy bisecting, and if you've got, let's say, 10,000 tests that you execute and 300 of them fail, it's not going to bisect 300 times, it's actually going to group them together. Sounds good, so maybe something to look at, something we could collaborate on. Great question; we'll talk about all that.

We're just getting the right hardware, but maybe you've run into problems with hardware that was difficult to test, or into trouble trying to expand the range of the hardware?

So the question is: do we have issues trying to expand the range of hardware, because some hardware may be difficult to automate? Yes. I think Kevin especially is an expert at taming boards that don't want to be automated, with special workarounds. Some boards, basically Android phones turned into dev boards, will only support fastboot to flash a kernel, so you need to turn the kernel into something you can flash, and then flash that. Whereas with most boards, what we normally like to do is use U-Boot or UEFI, then use TFTP to download the kernel, and access the file system as a ramdisk over TFTP, or over NFS. There are plenty of boards that cause issues with that. There are a few ways around it, but every one of them is different, so the key is normally to have a special tool to prepare the binary and handle the hardware in a very specific way.

Okay, we're out of time for more questions, so thank you, thank you very much.