Nice to have you all here at this early hour, in parallel to the keynotes. Today I want to have a discussion with you about automated testing and board farming. So it's not just me talking; it's us having a discussion. We have microphones, so when you have anything to say, just raise your hand, we'll give you the microphone, and we'll have your part on the stream and in the recording too.
Some words about me. I'm Chris. I work at Pengutronix. Pengutronix mostly does software development, but I'm a hardware developer, so I'm basically working on hardware for embedded Linux testing. I've done things like the USB-SD-Mux, and we're a team of three at the moment. If you want to contact me, drop me an email, find me on Twitter or something like that.
Maybe a few words about board farming. We've had some talks on this topic already, but just to recap what we've already heard: when I'm talking about board farming, I mean we have some devices under test, usually embedded Linux devices, that we can control automatically. We can switch power on and off. We have serial console access. We can switch boot modes, stuff like that. And at least when Pengutronix is talking about board farming, we always mean that we have interactive access to these devices in our board farm, because that's one of our main use cases. So we have devices in our lab, our colleagues can work from our office or from home using these devices in the lab, and we can share those devices between developers.
The other part is automated execution of stuff, and stuff is usually automated testing, I guess. We can, of course, run tests interactively. Someone has to develop tests and someone has to debug them if they fail. But there's also this whole area of continuous testing. So when you have projects that want to test a branch that has been pushed in a merge request, they build artifacts and they can deploy them on a device under test somewhere.
We've heard that in the Collabora talk. They do that quite a lot. But on our side, it's usually testing of artifacts that we have built. So we are doing nightly builds for our board support packages, and afterwards they're deployed onto a target and the test suite runs. Those targets in our lab are usually highly specific, because we work with customers that build highly specific hardware. So test suites are always tailored for one device under test in our case. But of course, we can also run some generic test suites or something that's specific to a project, depending on your use case.
And with that said, I want to start with the discussion part of this. If you scan this QR code (you will also find the link in the talk description on the chat platform), there's a Google Docs document, and I invite you all to join there. We will collect topics there, and you can rate them with an emoji if you want to upvote them, like on GitHub and stuff. And then we'll see where the discussion brings us. I still see some smartphones up, so I'll wait before switching the slide. I've prepared some questions and some initial answers, maybe to share a little bit about what we do.
Okay, so first question here. I'll make this a little bigger in a moment. How are you using automated testing? What does your board farm look like? And especially, what has changed in the last two years over the pandemic? Or did that not affect your board farm at all?
Well, our board farm: we have around 100 devices under test in our lab, mostly different devices. Sometimes there are two of one type, but usually they are all different. And when we have more than one, we usually share them between interactive use and automated use. So if a developer works on one during the day and forgets to unlock it overnight, the automatic test can still run on the other one. And if one of them fails, we still have the other one, which is quite neat too.
Most of our lab is built on 19-inch server racks. We have seen the Collabora ones with the clean cable management. We've tried that too, but we're really bad at it. Those racks are usually scattered around our office, so every colleague has one in their room, or at least in the room next to it, and everyone can work on it. If you want to have a device under test in a lab, you just find a free space and put it there. It's about 16 devices under test in such a rack. We have a test server in the middle, somewhere here in that area, and an Ethernet switch. The older revisions also had a serial server, RS-232 to Ethernet, but we've skipped that in the newer revisions we've built since 2020, I guess, since they caused a lot of trouble because they don't behave well. So we've moved our serial ports to USB, which has a lot of problems too, but at least we know these problems and know how to handle them. The older revisions also had a CAN bus on the rack. Newer ones don't have that. Yeah, that's what it looks like.
Additionally, colleagues have labs on their desks. So they have a power switch and a serial server. Mostly these are the serial servers we've taken out of the remote labs recently. And they have their desktop PC for controlling USB, for example. All this is merged together in a large lab network, so everybody can access every device in every position if they want to.
And what has changed in our labs? We've built more of them. We've replaced our GPIO infrastructure. Until 2020-ish, we had GPIOs on a 1-Wire bus that was scattered around that remote lab. Now we've moved that to CAN, on hardware we've built ourselves, to make it more robust. So we've got rid of USB in that chain, and we now have all software under our control and can work on that. One thing that's really used a lot is multiplexing of USB devices.
So if you've got a USB thumb drive thingy and you want to simulate automated updates using USB, which some of our customers do in the field, you can switch that thumb drive to your host PC, put a new image on there, switch it back to your device under test, and then you can simulate an update from there without anybody having to interact with it. And we're still working on our one-test-server-per-device-under-test concept, using the test automation controllers we're building ourselves. But buying components is hard, so it takes longer than we want it to.
Are you using your new TAC board to do your USB multiplexing?
No, that's still an external device. So that's not in there. Yeah, the other hand, in the first row, I think.
Does the test server include serial?
The test server includes serial.
And a power relay as well?
Pardon?
A power relay.
Yeah, and a power relay. We want to cover the 80% case. So we can switch power up to 48 volts, 5 amps, and measure current and voltage there; we have serial, and 3 USB ports for whatever you want to connect. There's a serial port that has some GPIOs, and a CAN bus for more automation. There's an Ethernet switch in there, so you can do VLAN untagging if you want to have a device under test in another network. So quite a neat thing.
Okay, so maybe let's just continue there. What does your lab look like? Do any of you already do board farming and want to share a little bit of what you're doing, what it looks like? Tim.
I'm going to talk about my personal lab, not the Sony lab. Well, it's a long story. But in my lab, I've got a single PDU that has multiple ports on it. And Sony, a couple of years ago, did their own debug board that was kind of similar in spirit to the TAC, not exactly, but it's controlled over USB; you actually send it commands over a USB serial thing. It does things like control GPIOs and do USB multiplexing. And it also does power measurement and power control.
But the problem is it's custom hardware and it's dated, right? It hasn't been updated for a while. I'm really interested in what specific PDUs people are using. I know someone mentioned yesterday the Sonoff, or I can't remember, the Wi-Fi controller. That seems pretty interesting and easy to use. But that's my lab. I actually don't have that much. Almost all of my serial stuff is USB. My USB hubs are way overloaded. And I know you guys talked in a previous conference about USB being quite flaky, and that's been my experience as well.
Yeah, over there.
Hi, yep. In our lab, we're also seeing some of the USB issues that you're seeing. And we're really trying hard to get away from a one-to-one setup, because right now we're using basically a sidekick device, like a Raspberry Pi or some other controller, and going directly through there to avoid some of the USB multiplexing issues. So if we could talk more about some of the USB issues that you're seeing, that would be really helpful to me.
Yeah. It works when you have one board. But even if you just have one board, we've seen issues where USB serial adapters just stop working after weeks or months of operation. So with our test automation controllers, you can actually switch every USB port on and off, so you can just power cycle your USB devices to work around that.
So what PDUs are we using? Tim, that was your question. We've got one that's hidden here. That's the large one; it's 24 ports. It's from Gude, a German manufacturer, I think the 8080. Yeah, it has 24 outlets, and you can switch it via Ethernet. You've got power measurement for like eight ports in a row, so you can check what's going on there.
But that PDU, it switches at 230 volts, I guess?
Yeah, it's 230.
So you've got lots of wall warts plugged in for that.
Yeah. Maybe we can even see that. Not really. So there are cables coming out of this PDU and going to every device under test.
Yeah. PDUs are just there.
Do you also use a low-voltage distribution somewhere?
Not at the moment.
I'm running my board farm of 12 boards from an ATX power supply. And I run the two BeagleBone Blacks that are controlling it from the 5-volt standby power. So I just have one power supply and I don't need all the bricks there.
That doesn't work for us, because our customers all do their own thing with regard to power supply. Some are five volts, some are like 48 volts. So we've gotten used to the wall warts.
Yeah, so I do something similar to what you have, except I just took an old laptop power supply. I've got a couple of different setups of about four boards each in my own home lab, just for me, so I don't have the complicated needs that you have here. It's a 12-volt power supply. I just use a power distribution board and then a four-port relay board with an ESP on it, ESP something or other, 8266 or something, running Tasmota. A combination of that with MQTT and Home Assistant gives me kind of a web UI. It's not great, but it's there. I'm excited about labgrid. I really want to get that integrated. But what I would love to see next is some kind of web front end for all that, so that I could just have different tabs for each of my boards, with on/off switches on the left-hand side for the USB or the serial console and that kind of thing. If there's anybody that has recommendations on what kind of web frameworks and things to use for somebody who knows nothing about web design, I would love to hear that.
labgrid has support for those MQTT Tasmota switches integrated. Not using Home Assistant, but directly on MQTT.
Sure. That's the next step for me.
Oh, okay. Yeah, I hacked up a small Perl script which implements a web server. And since all controls for the boards are shell scripts, I just launch the shell script from the web server when you press the button in the web UI.
Okay.
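As a rough illustration of the Tasmota relay board mentioned above: Tasmota firmware listens for MQTT commands on a topic of the form `cmnd/<device topic>/POWER<n>` with a payload of `ON` or `OFF`, assuming the default topic layout has not been changed. The sketch below only builds the topic and payload; the board names, the device topic `rack1-relays` and the relay numbering are hypothetical, and actually publishing the message is left to whatever MQTT client you already use.

```python
# Hypothetical mapping from board name to (Tasmota device topic, relay number).
# These names are made up for illustration; adapt them to your own lab.
BOARD_RELAYS = {
    "beaglebone-1": ("rack1-relays", 1),
    "rpi4-1": ("rack1-relays", 2),
}


def tasmota_power_command(device_topic, relay, on):
    """Return the (topic, payload) pair that switches one Tasmota relay.

    Assumes Tasmota's default topic layout, "cmnd/<topic>/<command>".
    """
    if relay < 1:
        raise ValueError("Tasmota relays are numbered starting at 1")
    return f"cmnd/{device_topic}/POWER{relay}", "ON" if on else "OFF"


def board_power_command(board, on):
    """Look up a board in the (hypothetical) table and build its command."""
    device_topic, relay = BOARD_RELAYS[board]
    return tasmota_power_command(device_topic, relay, on)
```

A small web front end, like the Perl one described above, would then only need to map a button press to `board_power_command("rpi4-1", False)` and hand the result to an MQTT client's publish call.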
I think there was a hand over there in the back.
So I work for TI, and we have around 40 racks in Dallas and another one. We are moving to around 80 racks in India. So a pretty large setup from the outside. Each of the racks has around 16 to 20 boards on average, close to 200 to 300 boards overall in the system, distributed all over the place. We use an ancient system called OpenTest, and we are considering moving to labgrid. Some of the challenges that we have had are time-sensitive network testing, camera testing, display, and interactive remote verification testing. If there are techniques for us to interact from that perspective, that would be interesting for us.
Yeah. So everything that's a camera or HID devices is still a problem for us. When it comes to everything graphics or x86 user input, we are still doing that at our desks. So we have nothing to work on that remotely. Maybe anyone else has a solution? Tim?
So I've talked about this a lot, so I feel bad bringing it up, but I've been working on a project for the last two years to do hardware testing, and it includes APIs for video and camera capture. It's not streaming though, so it's not live. So what you do is, if you're trying to see what's going on during boot, you can have an external camera that you control, that watches the display of the device and saves that to an output. And then you have to post-process that output to look at it, either with a human, or, well, I haven't really got to the part where I try to automate analysis of that display. So there's work in progress, but it's not ready for prime time yet.
I can imagine having some way to automatically look at what you've captured could be interesting for testing in the end, yeah.
Yeah, for the x86 case, we have something that's working nicely. We evaluated multiple solutions. There are KVM switches that can capture the HDMI signal or VGA signal. They can produce keyboard input to the device.
They can plug USB devices into our board. So we are actually able to power switch with a normal switchable power line. We can switch it on and off. We can go into the BIOS. We can change the BIOS settings. Everything is captured via HDMI. So that's like operating the device locally. There are several vendors on the market, and actually also one very interesting open source project. It's called PiKVM, and it was actually one of those that performed best.
Okay.
But our manager wanted to have a commercial solution, actually, because I think he was afraid that we would start hacking on PiKVM during work.
Okay.
Yeah, just a side note on capturing displays. I have a really, really little pet project. It's called PicoCalfoM. It's a bit smaller than PiKVM and just uses an HDMI USB grabber and a small USB HID simulator. It's on GitHub. Yeah, it's a pet project, bad code in there, but we can have a look at that.
Maybe you can drop the link into our notes and we'll find it there. Okay. I've got one more right here in front. And having a look at the clock, we should maybe go to the next topic. It's 10:30.
Years ago, there was a project called stb-tester that basically did capturing of HDMI or even LVDS, I think, and then analyzed the outputs, but I don't know what came of it. I just saw it here at ELC. I never heard of it again.
Okay. Anybody?
Related to network testing and so on: it is the next thing we are working on. We spent last week extending labgrid with network testing support, so we can export network devices and integrate them into labgrid as a kind of abstraction. My personal goal was to be able to test switches, the DSA drivers and so on. But time-sensitive networking, PTP and so on can be added on top of this work as well.
Okay. Maybe the question with the next most rockets is: do you have any new control gadgets, anything you want to share that you've bought and think is really cool, even if it's maybe simple?
Because simple things can help a lot.
I don't know if those are really new, but we use YKUSH devices to unplug and re-plug USB cables remotely. And that's pretty useful, because in our lab we have a lot of Chromebooks and we do pretty much everything over USB. We switch power on and off and we also access the consoles from there. And since USB is very buggy in our lab as well, it's really useful to have something to just unplug and re-plug remotely.
Okay. Yeah. Y-K-U-S-H. Is it already here? I cannot. Oh, yeah, it's there.
Yeah. So two things I've learned about USB that are helpful for our farm. First, it's much better to have a wide topology of USB devices than a deep one, because if you chain USB devices and one of the ports decides to reset, it affects everything downstream from that. Also, in /sys/bus/usb, you'll often find an authorized file, and if you write zero to it, it'll tell Linux to unbind all the devices attached to that, maybe a hub or a downstream device. And that's a good way, I find, between board users, of kind of unloading the devices and putting them back on. It also has the effect that if someone has a file handle open to, say, a USB serial console, it will kick them off. And that's quite a nice way to reset USB devices as well. And also, buying good quality hubs makes a difference.
Because of the whole USB devices power cycling problem, we actually got a patch into a recent kernel where you can disable individual ports, and the USB device won't come back until you re-enable the port. So if you have a newer kernel, there's a new sysfs file you can use for that, and it works reliably.
And it switches off power if the USB hub supports switching off power?
Yeah, right.
So we still have to adapt uhubctl for that?
I guess so, but yeah. The kernel already supports it.
Just a follow-up question to that. Is it possible to power toggle from inside?
Okay. Your USB hub has to support it. They need a switch for that.
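The `authorized` trick described above can be scripted in a few lines. This is a minimal sketch, assuming the usual Linux sysfs layout where each USB device has a directory such as `/sys/bus/usb/devices/1-1.2/` containing an `authorized` file; the device name `1-1.2` is only an example, the helper name is made up, and writing to the real file needs root privileges. The `sysfs_root` parameter exists so the function can be tried against a scratch directory first.

```python
import time
from pathlib import Path


def reauthorize_usb_device(device, sysfs_root="/sys/bus/usb/devices", delay=1.0):
    """Deauthorize and re-authorize one USB device through sysfs.

    Writing "0" makes the kernel unbind the drivers from the device (and
    kicks off anyone holding, say, a USB serial console open); writing "1"
    lets the device be configured and bound again.
    """
    auth = Path(sysfs_root) / device / "authorized"
    auth.write_text("0")
    time.sleep(delay)  # give the kernel a moment to tear things down
    auth.write_text("1")
    return auth
```

On a lab host this might be called as `reauthorize_usb_device("1-1.2")` between two users of the same board. It does not cut port power; that still depends on the hub hardware.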
But the uhubctl GitHub page has a list of hubs that do that. That's quite a good resource to find them.
Okay. I'll move on to the next most rocketed question here. The question is: is it possible to test hardware features this way? I would say, yeah, definitely we can test hardware features. That's what we are all doing. We want to test the software we have for the boards, for the devices under test, and we want to test hardware features too. So we try to connect all the hardware that's around a CPU and test that. If you want to test secure and encrypted boot and stuff like that, you have to steer your device under test through those states so that you set it up for a test. In labgrid, there's a thing called a strategy that you can write for a board. That's a state machine that describes how you bring a board from one of those states to another. Basically, everything you can do by hand can be automated too. I'm not sure, I'm looking at the Pengutronix guys. Have we done that? Have we tested secure boot, and whether we can break it?
Yeah, so we did secure boot testing for one of our projects. You basically have to use the vendor APIs to find out whether your board is in the correct state. And then you have to write tests that try to break secure boot in that case. And that can totally be done. We also implemented this to verify that the AppArmor profiles used in that project worked. And that worked out fine.
Okay, next most rockets, I think, is this question here: which existing test suites are you using? On our end, we're usually using tests we've written for a specific device under test for a specific customer. We share some test cases between all of them, like health checks from time to time, and basic things for user space. But it's mostly very individual. And we're wondering if anybody is using something like kselftest or test suites built into other software projects on a regular basis. I think Collabora does that very heavily.
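To make the strategy idea mentioned above a bit more concrete, here is a plain-Python sketch of such a state machine. It deliberately does not use labgrid's own classes; the state names, the `power` and `console` callbacks and the console commands are all hypothetical stand-ins for whatever drivers a real board would need.

```python
import enum


class Status(enum.Enum):
    unknown = 0
    off = 1
    bootloader = 2
    shell = 3


class BoardStrategy:
    """Steers one board between states, always via a known starting point."""

    def __init__(self, power, console):
        self.power = power      # hypothetical callable: power("on") / power("off")
        self.console = console  # hypothetical callable: console("some command")
        self.status = Status.unknown

    def transition(self, target):
        if target == self.status:
            return  # already there, nothing to do
        if target == Status.off:
            self.power("off")
        elif target == Status.bootloader:
            # Always go through "off" so the board starts from a clean state.
            self.transition(Status.off)
            self.power("on")
            self.console("interrupt autoboot")
        elif target == Status.shell:
            self.transition(Status.bootloader)
            self.console("boot")
        else:
            raise ValueError(f"cannot transition to {target}")
        self.status = target
```

A secure boot test would add further states, for example one where the board is booted with a deliberately broken signature, and each test would request the state it needs before running.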
But for the others having labs or running labs: do you use those test suites, and when and how do you use them?
So from TI at least, we kind of forked from LTP and created our LTP-DDT. That is a target-based test, which is the device driver test. And we have something called VATF, which is the host side, for example when you're doing PCI or USB DFU testing. So that's the combination that we have been working with. We haven't found anything upstream that we can contribute back to, which we could probably share with the rest of the community here. So it's a forked-out tree that is essentially TI-specific right now.
I was working with a Linux testing framework for some time. But it is hard to integrate into the actual targets which we have to test as devices for production. I mean, you can integrate all of the needed scripts and software into the end software, but it affects the testing results. So probably all of the testing which should be done at the end is kind of outside of the scope of most of the testing frameworks that exist.
I see no other hand. Having a look at what's the next question. Maybe Tim's; it had three rockets here. Is anyone doing hardware testing in the lab? What kind of hardware testing do you mean?
Well, the ones that I focused on were audio and serial port, off the board. So instead of doing loopback testing, which almost everybody does to test Linux drivers, I wanted to test all the way through the hardware. My project is lab control, and I'm working on APIs to control hardware in your lab that's not on your device under test. So for a serial port test, for instance, you can capture on another device that's connected by a cable to your device under test and then analyze the data there. We've done a couple of different things. I mentioned the video and the camera before, and audio and serial are the ones we've done so far.
But I'm wondering if anybody else is doing things like that, where they're manipulating external pieces of hardware, and how they're doing that: if it's all just ad hoc scripts, or if they have a framework set up for that. That's my question.
I don't know if that really is the question, but often, to test the software, because it's almost always the point to test the software, you need some additional hardware to really test the interaction, right? Is that what you're asking?
Yeah, I think that's a given. I mean, when it's customer software that is in the end going to be deployed as a product, there's almost always something else that you need to interact with, like a Bluetooth client or a Wi-Fi access point or something like that. My question is: what are people doing now with their own scripts? Are there any other frameworks that do that?
What I'm missing a bit is kind of a management of the other device we need. For a Bluetooth test, you don't have a Bluetooth device for every device under test in your lab, so you have one or maybe two devices there. So in labgrid, or whatever management thing you use, you need to allocate both the device under test and another device, depending on the test, and that's something that is missing at the moment. And you have to coordinate that multiple devices under test are not using the same Bluetooth thing that you're using in a test at the moment.
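The coordination problem described above, where a test needs a device under test plus a shared helper such as a single Bluetooth dongle, is essentially "acquire several locks at once". The sketch below shows the idea inside one Python process, assuming made-up resource names; a real lab would need a coordinator that works across hosts, and this is not how labgrid or any other framework implements it. Acquiring in sorted order is what keeps two tests that want overlapping sets from deadlocking each other.

```python
import threading
from contextlib import contextmanager


class LabResources:
    """Hands out named lab resources so no two tests share one by accident."""

    def __init__(self, names):
        self._locks = {name: threading.Lock() for name in names}

    @contextmanager
    def acquire(self, *names, timeout=-1):
        """Hold all requested resources, or none of them.

        timeout is per resource, in seconds; -1 waits forever.
        """
        taken = []
        try:
            # A fixed global order prevents deadlocks between tests that
            # request overlapping sets of resources.
            for name in sorted(set(names)):
                if not self._locks[name].acquire(timeout=timeout):
                    raise TimeoutError(f"{name} is busy")
                taken.append(name)
            yield tuple(taken)
        finally:
            # Release whatever was taken, also on failure or test errors.
            for name in reversed(taken):
                self._locks[name].release()
```

A Bluetooth test would then wrap its body in something like `with lab.acquire("dut-a", "bt-dongle"):`, while tests that do not need the dongle only request their device under test.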
For labgrid, we actually support audio capture, so you can get audio traces either live, streaming them to your computer, or record them and then do the analysis afterwards. And for Wi-Fi and Bluetooth testing, we just put a Wi-Fi and Bluetooth router on top of the racks. Okay, I'm not going to the coffee break, so thank you. So for Bluetooth and Wi-Fi testing, we actually put a Wi-Fi router with OpenWrt on top of our racks. It also has a USB Bluetooth adapter and just sends out discovery messages, so you can at least test scanning for every device in the rack, and you can test whether Wi-Fi scanning and connecting to the Wi-Fi works. But we don't really have a solution for how to control the OpenWrt router at the moment. You could add it as another resource in your lab, I guess, but I think we're not really doing that.
So labgrid has support for controlling USB Wi-Fi sticks and other USB devices via NetworkManager. You can connect to an access point hosted by the device, and you can also start access points so the device can connect to them, and you can basically test iperf and network traffic over the device.
So when it comes to Wi-Fi testing, personally I do a lot of manual testing, because I'm working a lot with drivers. In that case I'm using quite a nice box from PC Engines. This box has two mini PCI Express slots, and in there you can put two 11ac or 11ax cards. I actually haven't found an 11ax card that works yet, but you can use these for 11ac cards. Then you can run hostapd on your host, and hostapd has this Unix domain socket control interface that you can use. I haven't used labgrid, but if I'm going to check that out, then I will see if I can also hook the PC Engines box into that, so that I can manipulate the hostapd control dynamically for things like switching channels on the fly. If anyone is doing something similar, then we can talk afterwards.
This is not what I do, but I think it's pretty obvious. I think
what we need from labgrid is this, because it's not a matter of having very specific support for Wi-Fi, for Bluetooth or USB. What Tim really needs, and I guess all those that said something here, is a way to control multiple places in labgrid. Right now, when you write a test case for labgrid, you have an implicit place as a target. You can have multiple places, because that basically is the tool here. It's not a matter of having generic support for a lot of stuff, because a lot of the time this is very custom stuff. So we need to just use multiple places and do normal labgrid control of those; then you can do anything, I guess. And there is a reservation method in labgrid, so you could wait for a specific place to become available, and then you could control your access point using labgrid when it's available and then start the test. So the thing is, we have to be able to wait for multiple places, when that is supported in labgrid.
It's like 5 minutes, so maybe going down the list a little further. One question is: is there any group for discussing this further? It would be great to keep talking to everyone after this meeting. Someone has added the eLinux automated testing page and board farm page. I guess that's a great place to start. And there's a mailing list, and there's the monthly automated testing call, I think it's called. Yeah, it always collides with another meeting I have, so I'm not there that often. So yeah, I guess those are good places to continue the discussion. Does anyone have anything else that I've missed?
Okay, then Bastian, from the remote questions: does anyone have experience with automated repeated tasks, maybe like automated git bisecting? My first idea would be to use labgrid scripting for that. You can use labgrid as a library in Python, and then you can write a small wrapper that does your git bisect using labgrid. But I'm more the electronics engineer, so I'm looking at my colleagues: are we actually using it that way?
Usually, as a kernel developer and some
user of labgrid, if I have some bugs, then I use git bisect run and execute actual labgrid tests which trigger this particular bug. So probably I didn't understand the question. What is the actual problem?
So I think what the actual problem, or what the question is about, is: if you detect that your test suite fails for whatever reason, maybe because you updated your BSP and now there's a bug in there somewhere, you could try to automatically roll back or start a git bisect session on the BSP to find out what the buggy commit was, right? And there's no real automation for that yet, as far as I know, because you have to hook back into the build system to roll back to a specific revision. There isn't at the moment, and I don't think we have planned any of that yet, any connection between labgrid and the build system itself. So it would require external scripting at the moment.
KernelCI has automated bisection integrated as part of the CI, so that could be worth looking at as a reference implementation.
I'm not sure if it works only on the LAVA labs at the moment.
It doesn't matter what the lab is, right, because it just drops and it fills it, bisect; it's not a job. So it could be labgrid (don't throw your laptop away), it could be anything, really. But it's a bit outside of the board farm, right? From your board farm you just want to know: does this build work? And then you can decide whether you bisect or do anything else.
Okay, it's like two more minutes. Anything else you want to add? Or I'll pick another question, maybe one that we can answer in time: what software do people use to manage access to boards, e.g.
labgrid, and how many homegrown systems are there? So maybe a show of hands: who uses something like labgrid in their labs? Quite a few hands, let's say, for labgrid. Who has a LAVA lab? Quite a few other hands. Fuego? Fuego is the test engine underneath the foot source for 15 years. What else do you use? Test infra, your own scripts? Good question: how many are using a homegrown solution that's not something public? That's like another third or fourth, I guess. And there's a hand. Last question? Okay. Okay, I guess we're done. Thank you very much. It was great to have you all here, and I guess we'll hear each other on the next automated testing monthly call or something like that.