The theme is more or less "automate all the things" and make development easier and better. We have a bunch of talks today, each of which is about 20 minutes. There's room for Q&A at the end or in the middle, depending on what each individual speaker feels like. Before we even start, could some of you move to the centre, because there are going to be people coming in late? If you don't want them interrupting you by walking across you, moving to the centre might be best. For questions, there'll be a microphone somewhere, and I'll be running it up and down the stairs. Thank you.

Yes, I'm Nick Coghlan. I work as a provisioning architect for Red Hat, working on a bunch of our internal systems, how we deploy them, and so on. This presentation was supposed to be given by a colleague of mine, who's one of the developers on our hardware integration testing system. Unfortunately he wasn't able to make it, so I'm filling in. One of the big projects we work on is Red Hat's hardware integration testing system, the thing that lets everybody else not worry about the CPU details and all that sort of stuff. This is... thank you, Firefox. I'll go through a little bit about what Beaker is, what the hardware inventory system is and some of our unique requirements, how we go about creating that inventory, and then the actual bulk of the talk will be about an interesting migration problem we've had recently: having to replace one of the key components for collecting that inventory data. If people want to follow along with some of the links in the slides, that link there is Amit's Fedora People page. I'll give you a chance to grab that if you want. So, what is Beaker? Red Hat is an operating system company; operating systems are one of the main things we do.
Our hardware integration testing requirements: a lot of things that other folks can abstract away, just testing against the OS and letting the OS worry about the hardware, we actually need to worry about testing against the hardware. And so Beaker is the open source (beaker-project.org) hardware integration testing system that we use to actually test Fedora, CentOS, Red Hat Enterprise Linux, Project Atomic, all that sort of stuff, running on bare metal, preconfigured VMs, dynamically created VMs, Docker containers. We basically try to make sure that the operating system actually works in all those contexts so that the software running on the OS doesn't need to worry about it. Beaker basically lets you set up tests across multiple systems, combinations of bare metal and VMs, different hypervisors, all sorts of interesting stuff. And then you can also group them together, such that you can ask: does this test work across all the architectures? Which architectures are failing? All the stuff operating system companies do so everybody else doesn't have to worry about it. I've got a much more detailed talk about this from last year; there's a link to the videos in the slides. So, as part of this, we need to maintain a hardware inventory: Red Hat has thousands of systems in our main Beaker instance. We need to be able to give people access to those, and people need to be able to find the systems they want in order to do the testing they need to do. Now, most hardware inventory systems, if you look at things like OpenStack Ironic and other bare-metal provisioning tools, have a fairly abstract view of what constitutes a computer. They'll say: you can specify what CPU architecture you want, how much RAM you want, how much disk you want. But it's not very detailed. It's the abstraction that the operating system gives you.
When you're trying to test the operating system itself, you need a bit more detail. You want to know not just what architecture the CPU is, but what version it is, down to what specific options that CPU provides, whether the BIOS has hypervisor acceleration enabled, all this sort of fun stuff. I'll show a little bit more detail shortly. So, for example, if you want to check that the operating system works properly on Intel Celerons, then, well, an Intel Celeron is actually family 15 from Intel's point of view, and Beaker will say: oh, we know that this is a family 15 CPU. That's a bit obscure for most people to try to remember, so we actually have predefined host filters, where you can say: look, give me an Intel family 15 Celeron. You can just search through the list of predefined filters. That's currently only available through the command line interface. So, like a lot of techy things, the web UI will get you started, but if you're doing this day in, day out, you just want the computer to take care of it for you. In some ways the command line is actually easier to use, because we've got some of these niceties built into it that we haven't made available through the web UI yet. So it's things like Celerons: you can get down to the detailed CPU family rather than just the architecture. You can pull out stuff like not just how big the hard drive is, but who the manufacturer is and what the drive controller is. A lot of it's built around actually testing driver compatibility for all of the abstractions that the operating system provides; this is the system that lets us test that that actually works, and works in a way that we can support. However, there's no way you can realistically maintain that stuff manually across thousands of machines. It's just not going to work. It's going to get stale.
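As a rough sketch of the idea behind those predefined host filters (the filter names and the criteria they expand to here are made up for illustration; Beaker's real filter definitions live in its own configuration):

```python
# Illustrative sketch of Beaker-style predefined host filters: a friendly
# name expands to the obscure low-level CPU criteria, so users don't have
# to remember that "Celeron" means "Intel family 15".
# The names and values below are hypothetical, not Beaker's actual data.

PREDEFINED_FILTERS = {
    "INTEL__CELERON": {"cpu_vendor": "GenuineIntel", "cpu_family": 15},
    "INTEL__XEON": {"cpu_vendor": "GenuineIntel", "cpu_family": 6},
}

def expand_filter(name):
    """Translate a friendly filter name into detailed host criteria."""
    try:
        return PREDEFINED_FILTERS[name]
    except KeyError:
        raise ValueError("unknown host filter: %s" % name)

print(expand_filter("INTEL__CELERON"))
# {'cpu_vendor': 'GenuineIntel', 'cpu_family': 15}
```

The point is just that the obscure detail lives in one shared table, searchable by name, instead of in every user's head.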
It's going to be wrong. So we automate the task of collecting it. You just run a job on the system that says: OK, look at the box. What's here? What are all the details we need? Push that back up to the server, and then the server makes it available for searching and selecting systems and all that good stuff. That's a subproject of Beaker called beaker-system-scan, and you can see a subset of the data there. This particular scan is from a PowerPC 64-bit machine. It's virtualized, so there are no CPU flags, because it's not running on a real machine. The model number often gives a lot of detail about what the capabilities are. Model name: this particular one is a POWER7, four processors, virtualized so there are no physical sockets, details of the speed, who the vendor is. That's the CPU stuff. There's then generally a lot more information about various other bits of hardware. beaker-system-scan itself is mostly a Python application, and you can run it independently of Beaker; you can install beaker-system-scan and see what info it gives you about your own system. That's then showing some additional info about the hard drives: it tells you the actual model of the disk, sector size, total size. There's a lot of this stuff around large-scale systems, non-uniform memory architecture, that kind of thing, so you can scan for how many NUMA nodes a platform has, all that sort of stuff. Linux exposes a lot of this through the /proc virtual file system, and we get a bunch more out of libparted. And then there's stuff that's not built directly into system scan itself: we try to use the OS's own capabilities for self-reporting what the available hardware is. For a very long time, Beaker used a project called Smolt for this. Smolt was part of Fedora's hardware profiling effort, which reported back to the Fedora team what sort of systems Fedora was being run on.
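As a simplified sketch of the kind of /proc scraping involved (the real scanner handles many per-architecture quirks; this only handles the common "key : value" layout of /proc/cpuinfo):

```python
# Minimal sketch of pulling CPU details out of /proc/cpuinfo-style text.
# Real scanners deal with per-architecture formats (x86, POWER, s390, ...);
# this only parses the simple "key : value" layout, one block per CPU.

def parse_cpuinfo(text):
    """Parse /proc/cpuinfo-style text into a list of per-CPU dicts."""
    cpus, current = [], {}
    for line in text.splitlines():
        if not line.strip():            # blank line separates processors
            if current:
                cpus.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:                         # flush the final block
        cpus.append(current)
    return cpus

# In production you'd read open("/proc/cpuinfo").read(); a canned sample
# keeps the sketch self-contained.
sample = """\
processor\t: 0
vendor_id\t: GenuineIntel
cpu family\t: 15
model name\t: Intel(R) Celeron(R) CPU
"""
info = parse_cpuinfo(sample)
print(info[0]["cpu family"])   # -> 15
```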
So it was one of the things that you could opt into during Fedora's first boot, to say: we're running on this sort of system, here are the hardware details, all that sort of stuff. Smolt, unfortunately, got retired quite some time ago. Smolt's native reporting capabilities actually stopped working years ago, which is why it got retired from Fedora. Beaker wasn't using the reporting capabilities, though, just the hardware scanning capabilities, and those kept working right up until relatively recent Fedora releases. But unfortunately even those eventually stopped working, and Fedora dropped Smolt entirely. What that ended up meaning was that we got into a situation where our hardware scanning capability would run on RHEL 6, but wouldn't run on recent Fedora and wouldn't run on RHEL or CentOS 7. And then we had ARMv7, ARMv8, and new generations of PowerPC (PowerPC 64-bit little-endian, I should say) where our inventory scanning wouldn't run at all, because those architectures aren't supported in RHEL 6 and the old scanning software only ran on RHEL 6. So we had to fix something. One of the options we looked at was just taking over Smolt maintenance: adopt the project, either fork it or take over maintenance. We didn't really want to do that. Smolt had been dropped from Fedora, and they'd replaced it with a different project called lshw, for "list hardware". So it seemed like a more sensible option to go forward and figure out: OK, can we take Smolt's designated successor in the OS and migrate our own scanning capabilities over to that? That's something we're currently in the process of doing. And it turns out that, yes, this is actually entirely possible, because lshw has an XML output option. You can basically tell lshw "give me the XML", and then process the XML to transform it into whatever format you want.
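Processing that XML only needs the standard library. Here's a sketch; the XML below is a simplified stand-in for what lshw actually emits (in production you'd feed in the real output of `lshw -xml`, e.g. via `subprocess`):

```python
# Sketch of post-processing lshw's XML output with the standard library.
# The sample document is a simplified stand-in for real `lshw -xml` output,
# which nests <node> elements with class attributes like "processor".
import xml.etree.ElementTree as ET

SAMPLE = """\
<node id="machine" class="system">
  <node id="cpu" class="processor">
    <vendor>IBM</vendor>
    <product>POWER7</product>
  </node>
</node>
"""

def extract_processors(xml_text):
    """Pull out (vendor, product) for every processor node."""
    root = ET.fromstring(xml_text)
    found = []
    for node in root.iter("node"):           # walk all nested <node>s
        if node.get("class") == "processor":
            found.append((node.findtext("vendor"),
                          node.findtext("product")))
    return found

print(extract_processors(SAMPLE))   # [('IBM', 'POWER7')]
```

The same walk-and-extract pattern covers disks, network devices and so on; each is just another `<node>` class to match on.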
And so what that lets us do is create a new version of the system scan based on lshw rather than Smolt. That lets us still support Itanium, because RHEL 5 is still supported (yay), plus 32-bit x86, 64-bit x86, IBM S/390 mainframes, PowerPC, ARMv7 and AArch64 — ARMv8, hence the 64 on the end there. The new lshw-based version basically lets us support all of those by running the inventory on RHEL 7 or CentOS 7 or a recent version of Fedora. Now, that said, we've got some interesting requirements. We need to do things on R&D's timeline rather than necessarily being able to wait until new versions get into a release of Fedora, or on even longer timeframes for RHEL or CentOS. So what we actually do is maintain our own lshw fork. The purpose there is not to replace lshw; what the fork lets us do is try out our fixes and improvements ourselves and update on our own timeline, and then work with Lyonel upstream on getting our changes back into upstream lshw, and from there into Fedora and so on. What that means is a far more positive dynamic in the way we work with the upstream project, because we're never in the situation where we have to get something into upstream to meet our own deadlines. Our fork runs on our timeline, based on what we need to do, and then we push things back upstream and work with Lyonel to get the changes into a form that he's happy with. That downstream fork just creates a really good separation between the interests of the two projects and lets us work together far more effectively. The main situation where we need to step in and add stuff is mostly around the fact that the vast majority of the open source world is working on x86 most of the time, while Red Hat supports a whole bunch of other architectures that have more of a presence in the data center than they do on end users' desktops.
And so that's where a lot of our contributions are, and the PR link there goes through to some of them. One of the other things we're working on is trying to provide an automated test suite for lshw that includes reference data for these other architectures, such that you can run it against dummy data, a dummy /proc filesystem, and so on. That way, even if you're running on x86, you can check that you haven't broken the scanning capabilities for any of the other architectures. That still needs some work to get it into a form Lyonel's happy to accept upstream, which is fair enough. In the meantime, we pretty much just throw new versions at whatever hardware we have access to, report that we broke such-and-such, file bugs, and keep stuff working upstream as needed. However, there are always fun times when you're replacing a long-lived system. Beaker has been around several years now; if we break major parts of it, for some reason the company gets very upset. So we need to be careful: we can't just rip Smolt out, throw lshw in, and say, hey, here's the new one. We really need to check: are we collecting all the same stuff that we used to be able to collect? If people go searching for systems, are they still going to be able to find them? So essentially what we've been in the process of doing is this: we still have the existing Smolt-based task, we have the new lshw-based task, and we take multiple systems across all the different architectures, run both scans on them, and ask: are we getting the same answers? Obviously we can't do this for ARMv7 and ARMv8, because the old one didn't work there, so on those we just use the lshw one — that's the only way we can scan them. But for the existing architectures, we want to be able to drop the old Smolt-based one entirely and just use lshw.
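The idea behind that reference-data test suite can be sketched roughly like this (a hypothetical sketch; the layout in the actual proposal is in the linked PR, and the scanner here is a toy stand-in):

```python
# Hypothetical sketch of architecture reference-data testing for a hardware
# scanner: run the parser against canned input captured from each
# architecture and compare with stored expected results, so a developer on
# x86 can still catch regressions in, say, the POWER code path.

# Canned scanner input and expected output per architecture (made up).
REFERENCE = {
    "x86_64":  {"input": "vendor_id : GenuineIntel", "expected": "GenuineIntel"},
    "ppc64le": {"input": "vendor_id : IBM",          "expected": "IBM"},
}

def scan_vendor(raw):
    """Toy stand-in for the scanner: extract the vendor field."""
    for line in raw.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "vendor_id":
            return value.strip()
    return None

def check_all_architectures():
    """Return [(arch, wrong_result)] for every architecture that regressed."""
    failures = []
    for arch, case in REFERENCE.items():
        got = scan_vendor(case["input"])
        if got != case["expected"]:
            failures.append((arch, got))
    return failures

print(check_all_architectures())   # [] means no architecture regressed
```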
And so what we basically do is go through and say: right, here are the answers we get from Smolt, here are the answers we get from lshw, and any time we get a discrepancy, we try to figure out what's wrong. So in this case, for the CPU, you can see that we're getting the same answers from both. Good. But in other cases, we're finding that lshw has actually gone backwards in some areas relative to what Smolt used to be able to provide. This is an example where, if the capitalization changes but the information's all there, we don't really care. But we're seeing things like, in this one, lshw currently not picking up what the driver is for virtio devices. That's kind of a problem, because if you're trying to test a particular driver, you need the inventory data to report what you've got. The current situation is that we're actually pretty far along in the process of replacing it. Most of the stuff that was missing in lshw and needed fixing is fixed in our branch. We still have some issues with the driver and device reporting, where we're not quite sure why lshw is giving us the wrong answer, or is missing data or missing devices. At the moment that's still under investigation: we're trying to figure out whether lshw is just not providing the info, or whether we're not reading it out of lshw's reporting correctly. So that's the current status. That's mainly what Amit is working on in this area: doing the rest of that analysis and getting us to the point where we can drop the old Smolt-based task entirely and migrate fully to lshw. There's a lot more detail about this in the upstream design docs and on the dev mailing list. It's a problem I find quite interesting, just because it's at that lower level of detail — all the stuff that operating systems hide from everybody. We worry about it so other people don't have to. I still think it's cool.
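The comparison itself is conceptually simple. As a sketch (the real tasks compare full inventory documents, not two flat dicts):

```python
# Sketch of comparing Smolt-style and lshw-style scan results.
# Differences that are only capitalization are ignored; real discrepancies
# (missing drivers, wrong values) are reported for investigation.

def normalize(value):
    """Case-fold strings so capitalization-only changes don't count."""
    return value.lower() if isinstance(value, str) else value

def compare_scans(old, new):
    """Return {field: (old_value, new_value)} for every real discrepancy."""
    problems = {}
    for field in set(old) | set(new):
        a, b = old.get(field), new.get(field)
        if normalize(a) != normalize(b):
            problems[field] = (a, b)
    return problems

# Made-up example records: the vendor differs only in case (fine),
# but the new scan has lost the disk driver entirely (a real gap).
smolt_scan = {"cpu_vendor": "GenuineIntel", "disk_driver": "ahci"}
lshw_scan  = {"cpu_vendor": "genuineintel", "disk_driver": None}

print(compare_scans(smolt_scan, lshw_scan))
# {'disk_driver': ('ahci', None)}
```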
So again, this actually started off as an intern project and has since been migrated into the main development cycle of Beaker. Cool. Any questions? Stunned silence, maybe. Oh, it works.

Does the list of hardware that Beaker supports affect the hardware that Fedora supports? For instance, is there any sort of tail wagging the dog?

So, Fedora historically hasn't run its own Beaker instance. We're trying to get that fully up and running at beaker.fedoraproject.org. That's partly driven by the fact that with the Anaconda rewrite a couple of Fedora releases ago, which then came into RHEL 7, we found some interesting regressions when we started doing RHEL 7 testing. We thought, OK, it's kind of silly that we're not doing that testing upstream in Fedora. So we're actually starting to move more of that upstream into Fedora. Historically it's mostly been focused on the RHEL and CentOS side, but we're certainly trying to move more of it upstream, just to shorten that cycle time.

Does Red Hat have a massive room with thousands of every kind of machine?

Yeah, we've got several thousand machines scattered across the planet, all hooked up to a giant Beaker instance. Cool. Thanks, everyone.