So good morning, everybody, and thanks for joining us today. I'm Mark Brown. This is Kevin Hilman. We're both upstream kernel developers, and we've both been involved with kernelci.org, Kevin much more so than me; he's one of the main developers. And we're here today to talk to you about how you can contribute to KernelCI. So the goal here is to answer one of the most common questions that gets asked in any KernelCI presentation, which is: what can people do to get involved, to work with the results, to maybe put hardware in there, get the things they're interested in tested? The purpose of this talk is to answer those questions.

So in case people aren't familiar with KernelCI, we'll do a bit of an overview to start off with, before going on to talk about how you look at the results and how you follow up on them, and then moving on to getting hardware into KernelCI: sending boards to the team, and also setting up your own lab and working to get that integrated into KernelCI. And I'll cover LAVA, which is the most commonly used software for doing this. Finally, we'll go over what the plans are for the code and how you can get involved with that, because the code behind the kernelci.org web service is all open source.

There's a couple of other talks this week which we think would also be interesting for people who are interested in this. On Wednesday, first thing in the morning, Antoine and Quentin will be talking about the board farm they have at Free Electrons. This is actually one of the kernelci.org labs, so if you're interested in seeing how this works in a work environment where there's a big team using the boards, that's probably going to be a good talk to check out. And also later today, this afternoon, Geert will be talking about his own board farm. The details of the two talks are there. Geert's not involved in KernelCI, but he's doing very similar things upstream.

So what is KernelCI? In a nutshell, it's automatic build and boot testing for the kernel. It's mainly focused on upstream, but there's nothing really tying it to that, and we do have a few trees that aren't very upstream in there. At the minute, KernelCI covers ARM, ARM64, MIPS and x86. We could cover other architectures; that's just what people have been interested in so far. We cover a lot of kernel trees. We started off with the upstream trees, so Linus's tree, linux-next and stable, but we've branched out to cover other things. We now have a lot of developer trees for individual people. I have all my trees in there, for example, and I see some other people in the room who also have trees in KernelCI. And we've got things like arm-soc and the RT tree in there, as well as the Android trees.

What the service does is it monitors those trees, or a particular branch on those trees, and every time it sees a new commit appearing, it will grab the commit and do build tests. It will try to build every defconfig it knows about for the relevant architectures, plus some extras. So, for example, we build versions of the defconfigs with KASAN turned on to see if KASAN shakes out any bugs. And then once it's done the builds, it picks up the build artifacts and tries to boot them on as many boards as it can. Currently that's about 260 builds, which work out at about 500 boots for most commits. The exact number of boots we try varies depending on what's being tested, because we only try to boot boards we know are supposed to work on the given kernel tree, and because sometimes boards are busy and not available at the time we're doing a test.
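To give a rough idea of what one of those extra config variants amounts to, here is a sketch of building a defconfig with KASAN switched on. This is only an illustration, not KernelCI's actual build scripts, and the architecture and cross-compiler prefix are just example choices:

    # example only: an arm64 defconfig rebuilt with KASAN enabled
    make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- defconfig
    ./scripts/config --enable KASAN          # turn on CONFIG_KASAN in .config
    make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- olddefconfig
    make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -j"$(nproc)" Image dtbs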
So over the couple of years that kernelci.org has been running, that's resulted in about 7,500 jobs and over 2 million individual boots. You can check out the exact numbers by going to the website. But all this wouldn't really matter if all we did was build things and boot things. What's important is that humans pay attention to the results of the tests, and that's where most of the work in kernelci.org has gone. Whenever KernelCI is doing anything, you can see live results on the website, and you can see historical results on the website. And once a given build job or boot job is finished, we will email out results for it, collecting together results from all the different builds and all the different boards, to the lists and to the relevant maintainers for the tree.

So what's this ended up doing? Well, the most obvious thing, or the easiest thing to pull statistics for, is that it's much, much more likely that if you download the kernel for any given tree, it will build. It used to be quite common for many of the embedded targets to fail to build, because people wouldn't be testing with every single one of them all the time on mainline. But since we've got KernelCI and some other build systems doing this, you can see the numbers here. If you look at the old stable kernels, 3.10 was before anybody really started doing this: there were 50-odd failed configs. By 3.14 people had just started doing it a little better. By the time we got round to 4.1, there was only one configuration failing to build. And these days, if you look at mainline outside of the merge window, usually nothing fails to build. Right now it's not actually so good, partly because we're in the merge window, but also because we just recently added MIPS and we're still sorting through some of the issues that we've found testing MIPS. It does take a little while for a new architecture to catch up, but we expect to get there soon with MIPS. If you look at linux-next, things are often quite a bit worse, but that's really to be expected: the whole purpose of linux-next is to be an integration tree where errors get caught. So on any given day, next might build cleanly or there might be a bunch of failures that have been newly introduced. The important thing with next, from the point of view of the build results, is that any issues that are found get fixed quickly.

The other bit of it, the boot testing, shows a similar set of results. It's much, much more likely that if you download the kernel, build it for your platform and try to run it, it's gonna at least boot. It's not quite so easy to pull numbers for this to show you, because we blacklist things that are known to fail. So if you look at an older kernel, we just won't try booting things that are expected to fail. This is because boot results tend to be a lot more noisy than build results, and we want people to be able to look at the results, understand them, and not write them off as some sort of glitch in the system.

All this work is important because it makes the kernel a much more solid basis for development. People are much less likely to be running around trying to figure out why the kernel they've only just downloaded doesn't work. Tony Lindgren was at Linaro Connect a couple of weeks ago and he put it very clearly.
He was saying that as the OMAP platform maintainer, he used to spend the merge window and the early RCs running around trying to figure out why some platforms had regressed, work out what had broken, identify the problem, report it, get it fixed. Since KernelCI is now sitting there doing the build and boot tests constantly, it's much, much less likely that a problem is gonna filter into Linus's tree during the merge window. So there's much less stress for the platform maintainers. And it's much easier to identify what went wrong, because you're not looking at all the changes that happened in the last development cycle when you're trying to isolate things; you're only looking at changes that were introduced in the past couple of days.

So that's what KernelCI is. But the main purpose of this talk is to talk about how you can get involved, and there's a bunch of different ways. You can work with the results, make use of them, make sure they're having a benefit for the community. You can contribute hardware, either just by sending boards or by setting up a lab and running things yourself. And you can also work on the code. So I'm gonna go through each of those in turn.

The results reviewing is fairly straightforward at a high level. You just look at the report emails, look at the web UI, see what's failing, analyze it, look to see where it broke, try to find out what broke, and report the problem as closely as you can to the relevant people. It's important when you do this that, as well as identifying that there is a problem, you follow up on the problem. It's very easy to just send a failure email saying, oh, this didn't work, and then somebody might miss it or be busy or whatever and not get around to it. So it's important to check up that anything you do find actually gets addressed.

So let's take a bit of a look at how you might do that, working live with the KernelCI web UI. So, actually, let me ask: is that legible for everybody? That better? Okay, so yeah, that definitely ought to be legible. So this is the main website for KernelCI. Along the top here you can see tabs, which are the main bits to go into. Usually you want to go into jobs. Jobs is a list of the individual commits that are being tested for each tree. Builds and boots are linked from jobs, so you can just go in there, and SoCs is primarily useful if you're an SoC maintainer who just wants to see how your SoC is doing.

So let's go into jobs. And we have zoomed in so much it's not laying out nicely. Milo, the web designer, will be a bit upset with that, I'm sure, but never mind. So if we look down here, we can see a bunch of the trees that have been built recently. We can see how many builds were attempted for that tree, how many of them succeeded, how many failed, and how many we couldn't quite tell the result of. For builds, that should always be zero. But for boots, sometimes we have more than one copy of a given board available to us, so we will find that one lab manages to boot the board and one lab fails to boot the board. We're not quite sure what's going on; maybe it's a lab failure, maybe it's just an intermittent failure. So we mark it as unsure, which is the yellowy orange. Currently we're in the merge window, so mainline is a bit unstable, but let's click through and take a look there at the boot results. So for the latest commit, which is just running a test just now, we've got 85 boots attempted, 67 worked, six failed.
Let's see if we can look at some of the failures and see what's going on. So that's the details of what's been built, and that's a nice graph; you can see it all on one page normally. So we can see here the BayLibre lab, everything's fine; another lab in Cambridge, everything's fine; but there's a couple of failed boards here. I'm going to skip over and look at this i.MX6 board in the Pengutronics lab. I see some Pengutronics people in the audience looking happy. So you can click on the build, and it gives a few more details. There's links to the logs, and we can expand this more info here. So there's some details on what was tested, and quite importantly here, the status is a first-pass look at why the boot failed. So often we can do some diagnosis there, especially if it's something like a boot that failed. One of the problems with boot testing is that there's a lot of infrastructure involved in booting the boards, and sometimes that infrastructure fails, so we do make an effort to filter those things out.

So let's click through to... oh dear. Let's try that text format log, see if it's zoomed in on that. Nope. There you go, that's better. You've fixed the bootloader? Yeah. Okay. Yeah. So we can scroll down here through the U-Boot output. This is the raw log that came off the board. So we can scroll down through here, take a look at things, a bunch of bootloader messages, and then, if you've used U-Boot, you'll be familiar with the image printout. It looks like it's barebox? It looks like U-Boot, I don't know. So yeah, this is something you'll see quite commonly if the kernel just fails to boot at all or the bootloader fails: it will try one boot, fail, give it another go just to see if it might have worked, but if it didn't, then eventually it will give up and report an error. So the thing to do if you find an error like this is to go and tell the lab maintainers. In this case, that's Pengutronics. Sorry. And we should have contact information there, but we don't. But the thing to do is to report that to the lab maintainers so they can go and have a look and see if we can get it to the point where it at least tries to boot the kernel.

Let's go and look at another one of these. And that looks like it's another, might be another similar bootloader failure. Nope, this one tried to boot the kernel. So this is the next platform, and we've got the kernel log here. So if we scroll down to the bottom of the boot, we find that it just decided to stop doing anything at some point. So that's gonna be annoying too. Yep, so we can look at the log, we can see... yeah, it got stuck somewhere after the random initialization. So the labs are responsible for doing that, but yes, they will time out if they get bored.

So if we scroll further down the page, we can see some information here which will hopefully help figure things out. We can see how the same board is doing in other trees and other labs. So this is a test for mainline, and we can see that the same board in linux-next is also seeing problems. That's to be expected at the minute because it's the merge window, so next and mainline should be coming closer together rather than going further apart. And we can also see the commit where it started failing and the last commit in this tree that KernelCI is aware of having worked. So if you have access to the board, you can use that to do a bisection. If you don't have access to the board, you can still use it to take a look at the Git log and see what changed.
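If you do have the board on your bench, the manual bisect that information enables might look roughly like this; the commit IDs, defconfig and cross-compiler are placeholders for whatever the boot report and your board actually need:

    # mark the range using the commits from the boot report
    git bisect start
    git bisect bad  <first-failing-commit>
    git bisect good <last-working-commit>
    # at each step: build, boot the board, then tell git how it went
    make ARCH=arm multi_v7_defconfig
    make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- -j"$(nproc)" zImage dtbs
    git bisect good    # or: git bisect bad
    # when git announces the first bad commit, clean up
    git bisect reset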
Obviously, for a high-volume tree that's not gonna be so useful, but for a low-volume tree where there's not so many changes going in, it can be quite an effective way of finding things even without access to the board. So yeah, that's how you go through the build and boot results. The boot results that you see via email are essentially the same; they're just presented in a slightly different format.

Can you show one more tree, the stable stuff? Okay, switch to the... yeah. So yeah, actually, Kevin was just gonna take over after me and talk about how you set up a lab, so let's just hand over to Kevin now.

So I also wanted to show one more bit in here about one of the sets of trees we test, which is the stable trees. Greg maintains the stable trees for several kernel releases, and a few other guys like Sasha maintain some too; he maintains the stable trees for 4.1 and 3.14. So one of the sets of trees we test are the stable trees. But one of the things we're also testing is what's called stable-rc. Before Greg actually releases a stable kernel, he collects stable patches and pushes them into this tree called stable-rc. And Greg wants to actually keep track of this, so Greg started watching KernelCI recently, because he wants a sanity check before he pushes a final stable release. So he's monitoring this and monitoring the email reports for stable-rc. As patches go into stable and he sees things start breaking in KernelCI, he will start putting the brakes on actually releasing the stable kernel. So it's just another area where KernelCI has become useful. Greg is using the email reports, and in our email reports there's links back to the website, so he can actually dig a little bit, look at the logs and make a decision. Sometimes he asks us and sometimes he decides for himself whether it looks like a lab failure or something. If the board never even powered on, he doesn't pay as much attention to it. But if the kernel actually panics or does something like that, he'll use that to start looking into it, or ask us to look into it, or ask us to bisect it or something. So this is just another area where KernelCI has become a really kind of integral part of the actual kernel release process, for stable as well as the bleeding-edge kernels.

By the way, if you have questions any time, just yell and we can bring a mic; I can't quite see everybody that great. But if you have a... yeah. I had a question. Do you also perform runtime tests once the kernel has booted, or is it just that you get a login and then everything's okay? So right now it's basically just boot testing, but we do have a set of tests that are running on top of the boot testing. We'll get to that in a little bit, but one of the things we're still working on is that we don't have the analysis tools and the visualization stuff for those types of tests. We're running things like the kernel selftests and simple things like hackbench and a couple of things like that. But that's one of the areas where we really want help. We want help not only in writing those tests and picking which tests to run; actually running the tests is the easy part. The hard part is doing this type of thing: actually catching regressions.
A lot of these test suites have tons of failures, and so what you don't want is to nag people about the existing failures; you want to catch new failures and regressions and stuff like that. So it's kind of the post-processing and analysis stuff that needs quite a bit of work and needs a lot of help, which is partly why we're here kind of begging for more help on the project, because there's only a few of us working on it and none of us are working full-time on it. And the other thing with the test results is that before we start pushing the results out to the community, we need to make sure that the tests are in a fairly clean state. If we just push something out and there's lots of errors, people probably aren't going to pay much attention to it. If we push it out and we can say, look, this was working yesterday, it's not working today, you need to fix it, then it's going to be a lot more effective. But we need, like Kevin says, the UI and the tooling, and we also need the tests to be in a state where people don't just write the test suite off. Thanks.

Do you have some board statistics? Can you show which boards are in the system? Which boards are in the system? Yeah. So if you go up top here to SoCs, this will list all the SoC families. Down the column on the left you can see the SoC families; most of these are ARM or ARM64 families. And then within each of those families, you can click and see which labs they're in and how many unique boards are in each SoC family. So, for example, in the Samsung Exynos family, it'll list all the different boards here, and you can click through and see all the various boards. And if you click on an individual board, then you'll get results just for that board, the history of that particular board. So that's all under this. Was there a particular board you're interested in, or did you just want to see the... I think it's generally interesting to see which boards are actually booting in the next column, for example. Oh yeah, yeah. Was there another? It is, yeah. There it is.

This was a board that Tim gave me, and actually they ripped it open and it has cables hanging out of the back, so it's easy to power cycle and get a serial console on it, and now it's in my lab. So you can't actually see the results for this board here, but it's pretty reliable. It actually started failing; these failures here are something I was actually debugging just recently with the Qualcomm maintainers, because when you build this kernel with CONFIG_PROVE_LOCKING enabled, the zImage gets really big, and the Qualcomm-specific firmware has problems: you can only boot this by creating Android boot images and using fastboot, and the way it packs images together, when the zImage is really big, actually screws up and it overwrites the DT. But of course it's a closed-source bootloader, so I need to have the Qualcomm guys help me actually fix it. So thanks for asking, Tim. Was there another question back there? So, Tim, yeah.

So one thing I've been thinking about is, when you're doing a diff of different boot sequences, you end up with this problem that you can't just do a straight diff because there's too many things that change. Do you guys have anything that addresses that? I mean, I assume you can go back and look at a good log and a bad log, but is it just manual inspection at that point, or are there any tools?
Right now it's pretty manual, and diff doesn't help because most of these kernels have printk times enabled, so the printk timestamps basically screw up any sort of useful diff. Yes, we should.

I have a question. So you're tracking boot and build; are you also looking at code quality issues such as static analysis of the builds, to ensure that people aren't regressing things that could be caught from analyzing just the build, with things like sparse? So we're not doing that as part of KernelCI, but the Intel guys with the kbuild test robot, the 0-day builder that Fengguang is working on, catch a lot of that stuff. They have a ton more hardware to throw at the problem, so the static analysis, the Coccinelle stuff, yeah, all that stuff is being run by those guys. And people are looking at the warnings in KernelCI; Arnd's been doing a lot of work there in particular, because the build reports do include the warnings. So Arnd's been doing a lot of work there, and I think we're down to about zero warnings from normal GCC in mainline at the minute.

So the question I had was, first, for the kernel timestamps, those are easy to strip out because the width is always fixed, so you just strip out the first whatever eight characters or so, and after that you can diff the dmesg pretty easily. That actually shows you something like the probe order: if the probe order changes, in some cases boot may fail. So it is somewhat useful, but I think we are sort of past that with deferred probe. But the question I had was, what's the plan for auto-bisecting? That seems to be where I'm spending quite a bit of time every week from RC2 until RC6 or 7, where I notice that something broke again and I need to start bisecting. So it would be great to wake up in the morning, check the page and see the auto-bisect and boot results. Yeah. So, like Mark showed on one of those pages, we're far enough to actually detect the pass and fail, but we haven't quite got to the step of actually kicking off the automatic bisect. It's close, but it needs a little work. Yeah. And some of that's bandwidth on the boards, so just having enough boards to run those additional tests, and some of it is that we want to have some level of first-pass triage to make sure we're not bisecting a fluke or a lab failure or something. That would be great to have. Thanks.

Any chance there's a bottle of water in the room someplace? My throat's starting to give up. So one of the other ways that we're asking for contributions is basically with hardware. And we usually say there's an easy way and there's a hard way to do it. The easy way is just to send one of us hardware and we'll add it to our own lab and get it into KernelCI, but that doesn't scale very well. So the other way that we're doing it is adding labs. Already in KernelCI we're up to like 10 different labs. Like we already saw on the list, the Pengutronics guys have a lab, Free Electrons has a lab, Collabora has a lab, BayLibre has a lab, Embedded Bits has a lab, and there's some common hardware in all the labs, but there's also a lot of hardware that's unique to each lab. So KernelCI was designed from the beginning to be a distributed model. The only thing that's centralized is basically the database where we're pushing all the results, so that we can do analysis and reporting. So this is a picture of my lab. A very small piece of my lab.
I'm gonna show a little diagram also that shows how I put things together, because that's usually the thing most people ask about. But this is another little piece. So I have a little backyard office at my home in Seattle, and this is what some of my shelves look like. And usually when people come to my office they don't say, oh, what kind of hardware is that? They say, why do you collect cables? So now you know why. This is actually a relatively clean picture, because those top shelves are also now full.

But so, setting up a lab. This is why we say there's the easy way and the hard way. The easy way is just to send a board to us; those of us who have labs can easily automate it because we have the stuff set up. But setting up your own lab is really nice, and the trick is actually automating everything, especially automating the power cycling and connecting all the UART cables and everything, and making sure they're reliable and that you don't have really cheap USB serial cables that work for a couple of months and then stop working, these types of things. So, oh, thank you. I'm not gonna go into a ton of detail on this because the Free Electrons talk and Geert's talk a little bit later are gonna go into more detail. One of the tools that I use a lot in my lab is the BayLibre ACME board, because we're also doing power measurements, which is not automatically part of KernelCI right now, but we have a little board that we built at BayLibre that's a BeagleBone cape, and it can automatically power cycle things but also measure power at the same time. So if you care about energy measurement in your lab, as well as just being able to easily power cycle things, it's a cool little tool.

So this is a little bit more about how I set up my lab. This is just one example; of course, there's a million ways to do it. I have about 80 boards in my lab now. I started with small, piecemeal things, but now I've scaled up a little bit more. So I use 28-port USB hubs. I have four of them, and they don't all plug into the same machine anymore, because after the third one the PC, my ThinkPad, did not like having three 28-port hubs plugged in. So now I've split this across a couple of different host PCs. But so I use these 28-port hubs, and you just have USB all over the place, especially for embedded boards. Some boards use USB for power, and they have a separate USB for the serial console, and then maybe even a third one for fastboot or something like that. So you end up with a lot of USB cables. You could argue KernelCI is a stress test for the USB subsystem in the kernel as well, just by itself.

So yeah, I use these 28-port hubs, and then I use these little 16-channel USB-controlled relays. So I got rid of all the wall warts and all the little power bricks that do nothing but generate heat, and I just use ATX supplies and run them through these relays. So I toggle 5-volt and 12-volt DC directly and just run those to the boards depending on what they need. Pretty much all my boards are 5-volt or 12-volt. I have a couple of boards that are a weird voltage, so I guess I do have a couple of power bricks lying around. And that's one place where you can use the BayLibre ACME board: you can still use a wall wart and run it through one of the ports on the ACME board and toggle power that way. So the ACME comes with little probes.
You can use just barrel connectors, or you can switch power over USB as well. So this one shows, for example, a USB-powered board that goes through the ACME. I mean, it's relatively simple when there's only a couple of boards, but like I said, I've got 80 boards, so this is multiplied by a lot, which is why my lab just looks like a bunch of cables. But this is just an example; this is a way that I know scales up. Some folks are still using just power bricks for all the boards and running them through relays as well. That works too. No matter what you do, when you add a lot of boards to a small space it's gonna be messy. Any comments on this, or critiques or complaints? If you're ever in Seattle and you wanna come and see my messy cable collection, you know, just let me know.

Pardon? The network is pretty straightforward. I mean, I just have a handful of switches. There are situations where vendors test their U-Boot only at 100 megabit or something, and so then you plug it into a gigabit switch and it doesn't actually work. And so you either have to upgrade the U-Boot, or in some cases it's a closed-source bootloader or something, so I just end up having a couple of 100-megabit switches around just for U-Boots that have never been tested on gigabit. There's always boards that are unique in their own special ways.

I do upgrade the firmware; it depends on the board. If I can upgrade the firmware, I'll put mainline U-Boot on it. Some boards I can't upgrade the firmware at all, or it's painful enough that I don't, but typically I'll start with the vendor firmware if it just works. If it can DHCP, if it can TFTP a kernel, I often won't mess with it, because I wanna test what most people are testing as well. But if that doesn't work, then I'll try to put mainline U-Boot on it, or if the board maintainer for that board tells me to upgrade U-Boot because they know the kernel only works with a newer U-Boot, I'll upgrade U-Boot on it. And if we get differences between labs, that's often one of the causes: skewed U-Boot versions.

How do you manage boards where you have soft power-on buttons? Do you ever need to modify boards to overcome that, or do you put probes on, or something different like that? For some boards that have power buttons, I just use a relay and run wires to the button. That's all. It's got like the service maintenance push button, which is quite hard. The TK1 is special too, because you have to power cycle it and then push the button. So for that one I actually have to use two relays to power cycle the board. But on the TK1 the button is actually brought out to some headers, so I could just clip some wires onto the header from one of the relays, and then you cut wall power, turn on wall power, wait just a little bit, and then push that button.

Do you support every strange bootloader thing they do? So do you depend on them booting from TFTP, or can you also do arbitrary scripts which flash something, or...? So right now the labs are mostly using U-Boot. The Pengutronics lab is using a lot of barebox, and we also do fastboot for Android devices. But LAVA will basically allow you to write whatever scripts you want to run your board. So the kind of stuff that works really easily is the standard open source bootloaders, but if you can script it in any sort of way, you can use it.
Yeah, we have some boards in the Linaro LAVA lab which we boot using JTAG.

Would it be possible to set up the software part locally for private boards? Yeah, with LAVA you can do that. In fact, the LAVA lab in Linaro, well, most of the LAVA labs, have a lot of boards in their farm, and you can decide which ones you allow KernelCI to send jobs to. So you can keep boards private; you can tell KernelCI which boards you want it to boot. Yeah, you can do that. And the whole KernelCI server, can I run it locally? The KernelCI server itself, yeah, you could run locally; all the code's on GitHub. Okay.

So the kernel image creation and the toolchain installation, is that done on the server side or on the client side? Sorry, I didn't quite hear that. Kernel compilation and the toolchains, is it running on the... oh, sorry, server side or client side? The build stuff is all centralized, so the compiler versions and the kernel versions are all on the main KernelCI servers, and the clients just download images. Sorry. So on the server side KernelCI compiles the images, and then the clients, the local labs, download these images and run them? That's right, the labs are... And then send the statistics back to the server? Yeah, the builds are centralized on the server side, and then what the labs do is, when the builds are ready, they get notified and they just suck down the images and boot test those. I mean, there's no reason you can't use your same lab infrastructure to boot whatever kernels you want, but the fully automated stuff is the centralized build. Okay, yeah, that's good.

Just in time, because my throat's about gone right now. Right, so I'm gonna rush through this bit of the talk a little because we had so many questions earlier, but still, especially because I am rushing, if you do have questions, please feel free to jump in and interrupt me. So Kevin talked a bit there about the hardware setup, and during the discussion we mentioned LAVA a few times. Almost all the kernelci.org labs use LAVA. It's what we would recommend if you're setting up a new lab, because it's very easy to integrate: all the scripting's there, you just need to install it and then talk to us. LAVA is management software for board farms. It's a job runner and scheduler. It doesn't do anything about working out which tests to run, and it doesn't do anything about understanding the results. All it does is take in jobs from an application like KernelCI and put them onto boards in sequence. The main website for LAVA is validation.linaro.org, which is also the running instance that Linaro uses internally, so you can see both the running instance there and the documentation.

One thing you'll see when you look at the LAVA documentation is that there's two versions of LAVA, two interfaces LAVA presents: a version one interface and a version two interface. Version two is just in the process of being rolled out now. It's the result of several years of experience on the part of the LAVA developers and the LAVA administrators in Linaro, and it makes things a lot easier to use. So I would recommend not paying too much attention to version one for new installs and moving to version two as fast as possible. Unfortunately, because it is in the process of being rolled out, the documentation is sometimes a little incomplete compared to version one, but the LAVA developers are generally really helpful.
And if you come onto #kernelci on Freenode, we can try and help you out as well. So yeah, I'm gonna go through how to install LAVA and how to get it up and running for simple use cases. I'm gonna do this in terms of Debian, because that's what I use and because LAVA is developed and distributed using Debian, so that's gonna be the easiest thing to do. If you're not running Debian, you could set up a VM or a container to get Debian there, or you can try an install from source or one of the other methods. Those things are supported; they're just not quite as well tested as the Debian installation. And if you're setting this up with a view to contributing to KernelCI, which of course we hope you are, then you're gonna need a way for KernelCI to talk to your main LAVA instance, which basically boils down to having a static public IP, or at least a domain name that we can put into the configuration. If you don't have a static IPv4 address available to you, it's quite easy to get an IPv6 tunnel set up; we can work perfectly happily with IPv6. And it's also possible to host your LAVA master instance on a VM in the cloud. I'll cover how you go about doing that on one of the later slides. That's something people do.

So let me briefly take you through a look at the LAVA web UI, just so you know what I'm talking about when I get to some of the later bits. This is the production Linaro instance; you can see the address there. This is the home page. If you go along the top, there's a bunch of things here that are mainly of interest to people developing tests with LAVA and to administrators. The most interesting bit if you're actually running jobs is the scheduler section. So here, if you go into the status section, we can see an overview of all the devices that are in this LAVA install. There's quite a lot of them; there's a few non-public ones as well behind it. And then you can also list things by job that's been run, and by individual devices. Let's click through here quickly to look at what's going on with BeagleBone Blacks. You'll see I've been running a lot of jobs on these BeagleBone Blacks recently. Here's one that's been running, and if we click through we can get the boot log, which is as you'd expect. All these results are available programmatically, so KernelCI will use a web API that LAVA offers to access them. Even if you're using KernelCI, you can also do things by hand. And finally, I'll just point out the authentication token section here, which you will need later. I'm not gonna actually log in, in case I show you something that should be private. But you will need that to actually integrate your lab into KernelCI.

So the installation process for getting LAVA onto your machine is fairly straightforward. If you're running Debian stable, it's really, really strongly recommended that you use the backport of LAVA to ensure that you have something reasonably current, especially just now with v2 rolling out. So add this line to your sources.list, do an update, then just install the package. You only really need to install LAVA from backports; there's no need to use the backports version of anything else. Once you've got it installed, like you saw, the main interface is the web UI, so you need to get that up and running. There's a pre-canned setup for Apache, so you can just say a2ensite lava-server. If you want to use SSL, which, if it's gonna be on the public internet, is strongly recommended, that just uses the standard Apache configuration.
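Put together, those install steps look roughly like this on a Debian jessie machine; the backports line is from memory, so check it against the LAVA installation docs for your release before using it:

    # enable backports so you get a current LAVA rather than what shipped in stable
    echo "deb http://ftp.debian.org/debian jessie-backports main" >> /etc/apt/sources.list
    apt-get update
    apt-get install -t jessie-backports lava-server
    # enable the pre-canned Apache site and restart Apache
    a2ensite lava-server
    service apache2 restart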
And once you've got it installed, you can create your first user from the command line by saying lava-server manage createsuperuser, which will prompt you for a username and password. And once you've done that, you can log in via the web UI. There's a question from Ben. Yes, in fact, if we go back to the login page, you'll see there's an LDAP login section. That's how Linaro does it in production. But obviously you may not have LDAP, so you can have local accounts as well.

So that'll get you a bare LAVA installation with no... I think we're running out of time here... that'll get you a bare LAVA installation with no devices in it. So once you've done that, you'll need to go and tell LAVA about the devices you want to run. There's a command line interface for this and a web interface for this. At the minute the command line interface is quite poorly documented, so I recommend using the web interface as far as possible. And I also recommend that the first device you try to install is a QEMU device. LAVA has a pre-canned setup for QEMU, and QEMU doesn't involve any of those scary cables Kevin was showing you pictures of, so it's the easiest way to make sure you've got things up and running, the basic system is working, and you can control it, you can submit jobs, and all that other good stuff. The process for installing a QEMU device is just the same as for installing a hardware device; there's just some slightly different configuration options for each.

Because we're short on time, I'll not run through all the steps exactly. I'll just point out here that LAVA has general configuration for a type of device, called a device type. So it knows, for example, how to control a BeagleBone Black connected somewhere, and then you can have many BeagleBone Blacks in your system. There's sample configurations shipped for a bunch of devices. Usually a good one to look at if you're setting up a new lab is the BeagleBone Black, because that's a fairly simple, straightforward U-Boot device that's very easy to take as a basis if you've got something else running U-Boot, and which is very easy to obtain if you want to try with something where you know all the software's already supported. So, yep, you run through all the steps we've got listed here. I've used lava.example.com for the URLs in the web interface; you should obviously substitute in the address of your own LAVA server. Once you've installed the device, it's recommended that you set up a health check job for the device. There's a fairly obvious paste box to do that in the web UI, and a very well-commented sample job here, which is rather too big for slides, so I recommend you look at that.

Then, when you're happy your lab is ready and you want to get it into KernelCI, all you need to do is contact us with an API token, having set up a user for KernelCI, and a bundle stream. A bundle stream is something that programs using LAVA can use to organize results; KernelCI wants to drop all its results into one bundle stream, which it can then use to pull things from. If you're using LAVA interactively as a local user, you really don't need to worry about that too much. And if you're submitting jobs locally by hand, there's a command line tool which you give an API token to, in the same way as you would give one to KernelCI. You can use the superuser for that; there's no need to create a separate user, although you can if you want.
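As a sketch of that by-hand submission with the command line tool, it looks something like the following; the user, host and job file names here are made up, and the job definition itself would be YAML for LAVA v2 (JSON for v1):

    # store the API token for this user and instance (lava-tool prompts for the token)
    lava-tool auth-add https://myuser@lava.example.com/RPC2/
    # submit a job definition; lava-tool prints the job ID it was assigned
    lava-tool submit-job https://myuser@lava.example.com/RPC2/ boot-job.yaml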
And if you've got multiple users locally, then obviously they can individually do that. Finally, I mentioned before that you can have the main LAVA instance that KernelCI talks to in the cloud, and have your lab somewhere on a private network, perhaps, behind that. I recommend splitting your installation up over multiple servers even if you're installing locally in a lab. Let me take a step back. One of the problems Kevin was talking about there was that if you have a lot of boards connected to one machine, it becomes very difficult to wire things, and sometimes the central machine gets very stressed. So one way to mitigate that is to have a separate central LAVA server and what are called dispatchers, separately. These dispatchers are very dumb; they don't have any real configuration. All they do is run jobs: they do the connection to the board and they just feed the results back to the server for analysis and redistribution. And you can also use this to put your main LAVA instance outside of your network, because they just communicate over IP. So if you want to install a separate dispatcher, you don't need to install the full LAVA package; you just need to install the lava-dispatcher package. It doesn't need a database or anything. So you install it, you edit one configuration file to tell it where the server is, and then on the master server you go into the web UI and tell the master server to expect that worker to connect. If any of this is going anywhere near the public internet, then it's essential that you enable authentication and encryption. There is a very clear walkthrough of that in the LAVA documentation, which I've linked to there. Yeah. And then all the configuration of the devices is done on the LAVA master in exactly the same way as it is for locally connected devices.

So, like Tim said, we're out of time, so we'll take any questions in the hall. Thanks for your attention.