Thank you all for coming, all of you who are here. The talk is about testing hundreds of OS images with Jenkins. Today I wanted to talk about something that you can take home with you, something you can use, and especially something that is not overly technical. We are going to be talking about the whole journey from pull request to release, and it's going to be quite fun, I hope, for all of you.

I'm Vipul, and I work for a company called balena. You might have heard of balena from a product called balenaEtcher. I am a product owner there, and I also lead documentation. I also run a documentation initiative that writes documentation for open source startups and projects. I do quite a bit of open source work and volunteering with PyCon India and ALiAS. My pronouns are he/him, and my name is pronounced "we pull". And we are going to pull quite a few concepts together today.

What is balenaOS? This is the test subject for today, so let's talk about what it is. It's an open source, Yocto-based embedded operating system. That's a lot of jargon, but it's meant to run containers on small devices. Maybe you have heard of Raspberry Pis, Intel NUCs, BeagleBones, NVIDIA Jetson devices. We support them all, and it's completely free and completely open source. The source code is out there, and anyone can use it. This is the OS that we are going to be releasing by the hundreds and testing by the thousands. I'm pretty sure you haven't heard about this, but we were the first people to deploy containers in space. The operating system runs everywhere from tech farms to underwater drones to Dyson, with lots of work being done down there.

The major challenge that comes with building an embedded operating system is the problem of testing it and releasing it, in a world where even your smart toaster gets firmware updates.
You have to be very careful about what you're releasing. You need to make sure it's fault tolerant. You need to make sure it's completely reliable before releasing. How do you do that? While testing software is quite hard, testing software on hardware is downright painful. There is no standardization, there is no open source framework available, and that's what we are trying to solve.

As I said before, there are several challenges involved in building an embedded operating system: maintaining backwards compatibility, adding support for new devices, and all the good things that come with maintaining an operating system. But now you're maintaining operating systems for 100 different devices. Our solution: an Autokit.

What's an Autokit? An Autokit is a device that controls a device under test. It's just that. Let's say you are building an operating system for a Raspberry Pi 4. The Autokit is the one that actually controls the Raspberry Pi 4 in order to test on it. It automates it. How does it automate it, you ask? In every way possible. An Autokit is built so that we can use every single interface the Raspberry Pi 4 has, from power to network to serial to GPIO. We can check every USB port, CSI, DSI, we can overclock, we can do lots of things with an Autokit, and that's where the power of building a hardware-in-the-loop testing pipeline comes from. This is something we are going to talk about quite a bit in this talk.

What is hardware in the loop? We have a diagram, and I apologize for my drawing skills here, but the actual prototype looks quite a bit worse. We are still in beta and still releasing improvements, but like I said, Autokit comes with quite a few features, like flashing the storage of your device, connecting the power, and checking whether the device is on or off. It does quite a few things, and all of this helps in creating a pipeline that can test an operating system.
This is critical. And again, we have tons of features: hard drive support, modems, a serial interface. And the best part is that it's all available for 3D printing. There is no custom hardware, only off-the-shelf components, the design is completely free, and you can build it yourself today. That's the best part about an Autokit.

And here's why I showed you the diagram. This is an Autokit. It's one of the prototypes we started working with. On the top right, you can see the host that controls the black box, which is the DUT, the device under test. It has an Intel NUC inside, and all of these wires and components are used to control that DUT fully. So when you run an operating system test, it gets tested by this hardware right here. We'll talk more about how this works and what the architecture looks like.

The second part, something you need in order to understand the journey, is Leviathan. Leviathan is our software layer that runs and talks with the Autokit. It configures the artifacts. Basically, if you have hardware, you will need software as well. It's free and open source. It reports results as well. And the most important part is that you can run the same test suite for hundreds of devices, so you don't have to alter your test suite for every new device that comes in. Leviathan takes care of all the complexity and runs it for you.

And here's the architecture. This is the big picture of how we are testing. We'll explain this as we go; this is our journey. But if you want an overview right now of what we just talked about: the client, Jenkins, gives the artifacts to the Leviathan container. The artifact is a balenaOS image. Leviathan checks it out and gives it to the Autokit. And the Autokit tests this balenaOS release, the operating system, on an actual device. That's how our loop goes, and that's how our feedback cycle goes as well.
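
As a rough illustration of the loop just described (a build artifact is handed to a dispatching layer, which routes it to a free rig attached to the matching device), here is a minimal Python sketch. It is not Leviathan's or Autokit's actual API: the class names `Artifact`, `Worker`, and `Dispatcher`, the device-type strings, and the image paths are all hypothetical, and the "test" is a stub that never touches hardware.

```python
from dataclasses import dataclass


@dataclass
class Artifact:
    """Hypothetical build output: one OS image for one device type."""
    device_type: str  # illustrative label, not a real device slug
    image_path: str


@dataclass
class Worker:
    """Hypothetical stand-in for one rig wired to a single device under test."""
    name: str
    device_type: str
    busy: bool = False

    def run_suite(self, artifact: Artifact) -> bool:
        # A real rig would flash and boot the device here. This stub only
        # checks that the image was built for the attached device.
        return artifact.device_type == self.device_type


@dataclass
class Dispatcher:
    """Hypothetical stand-in for the software layer that routes artifacts."""
    workers: list

    def next_available(self, device_type: str):
        # Pick the first idle rig attached to the requested device type.
        for worker in self.workers:
            if worker.device_type == device_type and not worker.busy:
                return worker
        return None

    def test(self, artifact: Artifact) -> str:
        worker = self.next_available(artifact.device_type)
        if worker is None:
            return "queued"  # no free rig for this device type yet
        worker.busy = True
        try:
            passed = worker.run_suite(artifact)
        finally:
            worker.busy = False  # always release the rig
        return "passed" if passed else "failed"


rigs = Dispatcher([Worker("rig-1", "pi4"), Worker("rig-2", "nuc")])
print(rigs.test(Artifact("pi4", "/builds/pi4.img")))
print(rigs.test(Artifact("jetson", "/builds/jetson.img")))
```

The point of the sketch is the routing: the CI side only submits an artifact, and the dispatching layer decides which rig runs it, which is what lets one pull request fan out to many device types.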
Let's start our journey. If you have been following along with all the jargon, we'll be explaining all of it on the way. As you know, we are all open source contributors here, so the journey starts with a pull request. Here is me adding some additional logs to the testing suite. What this pull request on the meta-balena repo, where our balenaOS work lives, does is start a Jenkins build for all supported devices. We build every single device type that we support on Jenkins first, because with Yocto, even the build process can tell you a lot about what has gone wrong. Even if you change just one component, maybe you just upgraded the NetworkManager version, it can go wrong and the build can fail. So we build everything again.

And you might agree that this takes quite a bit of time. How would it not? But with the power of Jenkins and the power of the Yocto pipeline, it takes just about an hour to build all of this. That's about a hundred different device types whose releases are being built, and we validate them at this point.

Hello, yes. I guess this is the presentation now. One sec. Production issues, no worries. We'll mitigate that. Let's keep it going.

So right now, at step two, we have built every balenaOS release from just one pull request. For the hundreds of pull requests that come to us, we build it again and again and again, each time. If you take a look at the top right, it took just about an hour and five minutes to build hundreds of releases.

When the releases are built, Leviathan takes over. Leviathan is an open source tool that I've been working on. It's a distributed hardware testing framework that takes this release and gives it to the next available Autokit to start testing on. So let's say you have a Raspberry Pi 3 balenaOS image that you want to test.
You can take Leviathan, and Leviathan will configure it for you, make it ready, and then pass it on to the Autokit. It does all of it automatically. There are no fingers involved. There are no humans involved. Once you make that pull request, it's all taken care of.

So here's how Leviathan works. I've added a test suite where we are flashing a device. If you look at the highlighted part, all it takes is to turn the Raspberry Pi off, flash it, and turn it on. It's just three simple commands. It abstracts the complexity so that you don't have to worry about what device type you're using. It will just do it. It already knows which device type you're going to be testing with, which operating system it's for, and which version it is, and it abstracts everything. You just give it an image path and it flashes the operating system for you. At the end, there's an assertion. This is node-tap, if someone has already used it, and the assertion tells you whether the image has been flashed or not.

Our next headliner is the hardware-in-the-loop testing pipeline. The image that you see here is from our Galati office, where we have Autokits set up to test all the releases that are coming in. If we are building hundreds of operating systems, we need the hardware, a complete rig. This is our testing rig for testing releases on actual devices. That's how a hardware-in-the-loop testing pipeline works, as the slide says. The Autokit automatically receives commands from Leviathan: power on devices, flash them, turn them off, break them, do whatever you want. With Autokit, you have ultimate control over scripting your actions on a device.

And maybe you have a question in mind: why are we doing this? This is a lot of complexity for just testing an operating system. But when the stakes are so high, you need to make sure the operating system can perform everything well.
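
The flash test described earlier (power off, flash, power on, then one assertion) can be sketched as follows. The suite shown in the talk is written against node-tap; this Python version only mirrors its shape. `FakeWorker`, its methods, and the image path are hypothetical names invented for illustration, and nothing here talks to real hardware.

```python
class FakeWorker:
    """Hypothetical rig stand-in that records state instead of driving hardware."""

    def __init__(self):
        self.powered = True
        self.flashed_image = None
        self.log = []

    def off(self):
        self.powered = False
        self.log.append("off")

    def flash(self, image_path):
        # Assumption for this sketch: flashing a powered device is an error.
        if self.powered:
            raise RuntimeError("refusing to flash a powered device")
        self.flashed_image = image_path
        self.log.append("flash")

    def on(self):
        self.powered = True
        self.log.append("on")


def flash_test(worker, image_path):
    """Run the three steps, then assert that the image was flashed."""
    worker.off()
    worker.flash(image_path)
    worker.on()
    assert worker.flashed_image == image_path, "image was not flashed"
    return worker.log


steps = flash_test(FakeWorker(), "/images/os.img")
print(steps)
```

Note that the test body never mentions a device type. In this sketch that knowledge would live inside the worker, which matches the talk's point that the suite only supplies an image path.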
And that's why we use Jenkins and Autokit together. And just like this, with the Jenkins pipeline, we can fit in so much testing. This is how much testing we've written: a total of 200-plus assertions and 62 tests that we run back and forth using our pipeline right here. It takes about 90 to 120 minutes to run. That's mainly because we are testing on a device that has one gigabyte of memory and maybe a 2.4 gigahertz processor, just dual core. These devices are meant to be slow and steady, so it takes time.

We break the devices quite a bit so that we know balenaOS can recover. One of my favorite tests is where we, accidentally, intentionally, break the engine in the OS and then start an update process to see whether it can recover or not. Because the engine has been broken, it won't recover, and we want to see a failover attempt happening. This is the power of actually testing on a device: you can see all that, you can test all that. It isn't possible when you are emulating, because in an ideal scenario this can't be done.

Once all the testing happens, Jenkins reports the results back to GitHub. We get the feedback from the tests on the GitHub pull request, and we know: okay, this is our pull request, these are the features we want, these are the fixes we want. And when everything is done and all the tests go green, which every engineer loves, the PR gets merged.

When the PR gets merged, we don't stop. We don't want any human intervention blocking our pipeline. At balena, we trust our tests. We have built a pipeline that we can trust. If the PR is merged, the PR is deployed. Done. There is no further testing that we do. Every release that you see here is completely up to date with the newest operating system we have released. So I can tell you a very nice story.
One time, a customer reached out to us on support and mentioned that the redsocks proxy in our OS was broken. Our engineers checked it out in the morning, validated it, fixed the OS, made a pull request, and then went to sleep. In four hours, the OS was built and tested, tested twice actually, and then released. And the next day the customer reached back out to us, not the other way around, thanking us for having already released a new version of the operating system with the fix in just under 12 hours.

That's the power of using a hardware-in-the-loop testing pipeline with Jenkins, because in the game of embedded operating systems, you need that much reliability on hand. And this takes your release pipeline from weeks to just hours. I don't want to name any names, but try to remember the last time you saw any operating system ship a new release within a matter of hours of a bug being found. It takes weeks of testing, and even beta testers get involved. Having this pipeline in place at your workplace can dramatically change your release strategy.

This is the summary of our journey. You can read it; I have these slides available as a link, and it will help you understand what we have gone through. This is basically hardware-in-the-loop testing.

If you think about what our gains are: we have reduced our OS testing and development cycle from weeks to hours. We can focus on building support for new devices rather than maintaining old support. This has enabled an IoT company to follow TDD when testing software on hardware. And specifically, it helped quite a bit during the chip shortage, when we were dealing with it and so were our customers, because we could test new releases as fast as they could possibly be tested.

The major question, and something that a lot of attendees and a lot of speakers miss, is: how do I use this? Sure, it's a 25 minute presentation.
We have memes. You had good laughs. But how do I actually use this? More bad drawings. What can we plug into commercial hardware so that you can start testing or maintaining that hardware?

Let's say you have Mac minis in your office. You have five of them, and one of them needs an OS update. You have to either go through a wizard or start the update from another Mac mini. And that's costly, right? You don't know if it went well. You don't have a way to roll back the release if it messes up. Things can happen, and your business gets affected, which is our major concern, because the IoT industry depends on the cornerstones of trust and reliability. We can't mess with that. With an Autokit, you can control all of your devices, maintain them, automate them using scripts, and if anything goes wrong, just flash them again instantly.

What's our next use case? Let's say you are a software developer building Chromium or Pi-hole or VS Code for the Raspberry Pi. How do you actually test this on a real Raspberry Pi? You can always emulate ARMv8, but you never know whether your application actually works with all the features of the Raspberry Pi. Maybe there's a new NetworkManager in the OS. There are tons of things that can change when you are actually running your application on a Raspberry Pi 4 or any other device. With an Autokit and a hardware-in-the-loop testing pipeline, you can actually know. At balena, we like to say that we want to be ASAP, as soon as possible, but we want to be ACAP as well, which is as confidently as possible. We want to release our operating system and know that it actually works rather than going on intuition.

And the last use case is very much about quality assurance.
Let's say you are a hardware manufacturer building devices. You can use Autokit to test new hardware: test whether the serial port is working, test that your chip doesn't burn out at 40 degrees Celsius, test whether the device works in 100% humidity. Stress testing, assurance testing, random testing, we can do it all. Autokit is built to handle most of the scenarios that you can throw at it. And of course you can test operating systems; that's what we use it for and that's the top.

So what have we learned? Testing operating systems is important. If you don't do it, you already know how things go with other operating systems. Whenever a new operating system comes out, be it Apple, be it open source, be it Windows, there are always bugs, and you always wonder: why did the OS engineers not test this? We think about that too, and we always think about how we can improve our own operating system. That's why it's important, but it's incredibly painful at scale. It's so painful, in fact, that not many people go the distance of creating this pipeline, and that's why we created a free, open source, and hopefully standardized way of using this pipeline. It's painful, but it doesn't have to be. With tools like Leviathan and Autokit, you can achieve exponential gains with a hardware-in-the-loop testing pipeline.

I have resources here. You can scan this, you don't have to click. I have a QR code for all these slides and everything; you can find it all there. That's the talk, and that's the QR code. Thank you.

But... People, what do you do for... Yes. I don't have a key. Indeed. I can give you a very nice example. We have tons of stories; I'll tell one specifically. So, for people who didn't hear that, your name? Not Chris. Chris asked how we test operating systems on different device types. One thing that Leviathan does quite well is that you don't have to worry about it.
With Leviathan, you can write one test suite that can target all of those devices. So that's one problem solved. But there are still quirks that come up. I have a very good story from when the Raspberry Pi Foundation released the Raspberry Pi 4. Everyone has seen a Raspberry Pi since, but what they did was release a new board revision, going from 1.4 to 1.5, and that failed for us. The firmware or the bootloader had an issue, so balenaOS had to be fixed. That's where a testing pipeline like this helps: it catches the issue pre-production, not post-production where the customer would be affected. With Leviathan and a hardware-in-the-loop testing pipeline, you can write test suites that target multiple device types, multiple iterations of those device types, and multiple flavors of those device types, whichever you want.

Indeed. Indeed. And like I said, one of the challenges is to provide backwards compatibility. We have very healthy backwards compatibility for balenaOS, and how we do that is this. Let's say you have uncovered an issue in an old operating system. First, we try to follow TDD as much as we can. If it's an actual bug that we missed, we add a test for it in the new operating system, but we don't change the releases for the old ones. We want to keep them as they are. We recommend that people update to the new operating system, as they should, because what is very good and very impactful about the hardware-in-the-loop testing pipeline is that we release almost weekly. You can never stay off the edge. If you don't update for a year, you'll probably be 30 versions behind.
That's 30 minor versions in semantic versioning, not 30 major versions. But that's the power: if you want to stay on the edge, and be reliably on the edge, you can count on balenaOS, or count on this pipeline for your own customers, to deliver the best software as confidently as possible. Yeah, hopefully that answers your question. Yeah.

And I won't hold you for long. It's already a very nice day in Vancouver. So you folks have lunch, and thank you all for attending my talk. Here are the slides if you want. I posted a super bug. Thank you.