Good afternoon. Welcome to this afternoon's session of the embedded track. Just a reminder that this is being recorded, so it'll be available on the SoCal Linux YouTube channel. The slides will also be uploaded to the website, so they'll be available as well if you want to see those. This afternoon's speaker is Steve Arnold, and he will be talking to us about IoT without a net. Thank you. Take it away. Well, it's without a net because I'm without my Linux runtime and bootloader stuff. So this is basically my attempt to share what I've learned over the last several months, maybe a year or so, working on some projects. This is actually a picture of the BeagleBone SPI slaves connected to another BeagleBone, which was one of our projects for Google Summer of Code last year. So this is our topic reference for tonight. I have some examples, and they're basically things that don't do anything until you load firmware into them. 
And that's going to be basically one blob that you either build with GCC or you build with another tool chain. The examples I'm going to talk about are essentially some ARM Cortex examples for the Altera FPGA hybrid board that we got to use a little while ago, and the BeagleBone, since they have real-time cores on them. So who am I? I'm a Gentoo developer. I contribute to OpenEmbedded once in a while. I think I probably harass those guys more than I contribute, but we use OpenEmbedded for a lot of our stuff. I've been doing Google Summer of Code the last couple of years, so if we have any students that are interested in this stuff from a BeagleBoard, BeagleBone standpoint, do check it out. I'm kind of a part-time grumpy tool chain guy, I maintain software tools for developers and whatnot on GitHub, and I'm the systems architecture and kind of OS guy at VCT Labs. So what are we talking about here? The primary characteristics of the kind of devices that we're talking about are essentially that they're standalone, separate from a normal CPU. They might be on a board with another CPU, or they might be just by themselves. Like I said, they require a firmware blob to do anything. They can include both hard and soft processor cores. In the FPGA world, everything is a soft core, and the FPGA is essentially a blank slate that has absolutely nothing in it until you define what you want and load it in. So you can define peripherals. You can define a whole NIOS soft core in there if you want, with some peripherals, but that starts to take more space. These things have interfaces for debug and peripherals. One of the examples I mention in here is the ESP8266 Wi-Fi module, which has a small Cortex-ish compatible core in it. So it has a little bit of flash, it has a microcontroller, and it has pinouts. 
So you can connect sensors, because the thing has an I2C bus, a SPI bus, UARTs, and it's all on a tiny little board about this big. Typical applications that you see these things in: some of them are in your car, they're in consumer devices and set-top boxes, they're used in fixed and mobile sensor platforms, wearable stuff, autopilots, or microcontrollers for your drone. So they get used in a lot of different things. Your lovely automated entry system that unlocks the door when your phone gets close to it most likely has one of these things in it. So there is a reference architecture, kind of, for this stuff. It's not very detailed, let's say, at this point, but there is a reference architecture, and these things have a definition. They have a little bit of RAM, a little bit of ROM. They have peripheral interfaces for different things, mostly sensors and whatnot. You have UART interfaces. You have all the interfaces that you would usually find on an embedded ARM board. I have slides that will take more time. So here are some of the examples that we will talk about. Probably not so much the Arduino side, although since they've switched away from AVR for some of the more powerful Arduinos, those are now essentially the same architecture as these other Cortex boards. And the same is true of the UDOO boards that have an onboard Arduino-ish controller and the ARM on the same board; they use Cortex-compatible SAM controllers on those. The only one that you can see to scale, the one that has a reference there in that clip, is the ESP8266. That's about as small as you can get that board, and it's actually powered up off that battery. The one that's kind of an oddball, and I threw it in here because it is a microcontroller, but it's not the architecture that you would necessarily expect, is one of the massively parallel ones. That one only has eight cores, but it has eight tiny real-time cores, and it's not an interrupt-driven system. 
But that type of architecture goes back to the 80s, when the transputer guy first came out with the transputer stuff, and we had cards for Mac IIs that had like 16 transputers on each card, and you would slap a bunch of those cards into your Mac and do parallel programming stuff on them. So that's what these little guys are actually most similar to, probably, the P8X one. That's the Propeller QuickStart board, and then there are a couple of other related Propeller boards from Parallax that have this similar architecture. So the most common families: some of these go back to probably before I ever even touched Linux, and they've been used in the past in industrial applications, but they're not as popular now in the open-source world. You can still buy plenty of them, I would assume, but I'm not sure that you can buy them in the same kind of format, let's say, as some of the more open-source-focused ones, because they're essentially industrial controllers that you would buy if you worked at Ford, and you would buy a whole truckload of those things. The 8051 and the PIC are kind of the older ones that go back quite a ways. They and the AVR are Harvard architecture, which separates the program and RAM memory spaces. ARM is actually a von Neumann architecture, where program and RAM share the same memory space. And this is ARM Cortex-M now, not your ARMv7-A Raspberry Pi or whatever. These are the microcontroller ARMs, so there's a whole separate architecture definition for those. The ones that you see are Cortex-M, and they go from like M0 to M4 in terms of horsepower and capability. That ESP one, I know I read somewhere that it had an M0 in it, but apparently it's not, but it's close enough. So anyway, the ARM is a 16- or 32-bit architecture, where the other ones are the old 8-bit architecture ones. So the ARM Cortex ones are quite a bit more sophisticated than some of these older ones that you find in industrial automation applications. 
Still around also are PowerPC and MIPS and all kinds of other things. But the most popular ones that you see now are basically still the AVR-based Arduinos and compatible-ish ones, and also the ARM Cortex ones. Pretty much what you see in the open-source world is mostly those two. You also have some combination hybrid kinds of architectures, and we have a couple of oddball ones to talk about in that frame. The PRU subsystem is a TI thing, and you find it on BeagleBones and BeagleBoards; these are separate real-time cores that run independently of the CPU. So it's kind of similar to the UDOO ones, where you have an ARMv7 host and then a Cortex-compatible microcontroller on the same board, and they'll each have some dedicated pinouts and they'll share some. The PRUs are not quite as powerful as, say, a Cortex processor, but they're not supposed to be. They're just supposed to be 200 MHz cores that will run essentially whatever you load in there. They can execute almost all the instructions defined in one clock cycle. They have their own tool chain and their own architecture definition. There's a GCC port that works on PRUs. It's not as mature as the TI code generator tools, but it does work. You'll also find DSPs, but we're not covering the DSP stuff; that's basically a special-purpose processor compared to a PRU, so it's not one of the topics here today. And as an example, the FPGA side is what I mentioned before. Those are essentially nothing. That's literally a blank slate for your firmware. But in this case, the one example, the Altera board, is truly a hybrid board, because the peripherals are connected to both sides. Some of the peripherals are connected to the ARM side: you get Ethernet and a console. The rest of the peripherals that are on the board, the physical audio connectors and all that stuff, are actually connected to the FPGA side. 
So they don't do anything until you have a blob loaded in there that actually fires them up and does something. And then there's the Parallax Propeller transputer category at the very bottom. So the hybrid boards: these are examples. BeagleBone is one of them; I mentioned they have PRUs on them. The one in the middle there is the cheapest example of the hybrid FPGA board. I think that one's less than $100. The UDOO Neo Full is basically the size of a BeagleBone, but it has a SAM Cortex-compatible controller on it, a Cortex-M3 I think, that's connected to the peripherals for sensors and stuff. So these are all basically general-purpose microcontrollers. As far as software tools, it's time for another tool chain, for the most part. I mean, GCC essentially supports all of these devices at one level or another, although not the FPGA stuff; that's a whole other set of tools. But for all the embedded processor stuff that actually is hardware, you should be able to find a GCC port that works. The AVR and Cortex support is very mature on the GCC side, but not so much with the PRUs. Those are a little bit unique, and that particular GCC port has not been around for very long. They've taken a slightly different approach to supporting the instruction set than the TI tool chain does. You can see some differences. There are links in here, and you can go see some of the information. So for a typical embedded cross tool chain, you have your compiler, binutils, libraries and headers and stuff, an SDK, and a set of kernel headers. In this case, there is no set of open-source libraries to make your SDK with. You get a small C library, typically. You get the compiler bits that you can link into your program, and that's it. You get to write everything. So the better vendors actually have SDK-ish things. 
They have basically collections of C code that do different things, and you can use those to bootstrap your project and build something quite a bit quicker than if you had to write it all from scratch. In this case, the Altera SDK provides little bootloader sample projects and other kinds of sample projects. So you can basically copy one of their sample projects, build the bootloader, and then try out some of their stuff without doing too much work. This is just an example from one of my machines at home. I have tool chains for all of this stuff, but they're all built from source, since I'm a Gentoo guy and I tend to use crossdev to build my cross tool chains. But you can have them all installed at the same time. It's no big deal. So, as he mentioned a minute ago, the armv7a versus armv7m in the triplet will tell you what ARM device that tool chain is for. The fun part about compiler triplets is that you can use some of those fields for different things, and you can leave some of them out, and people do. So in some cases you might not be able to tell everything. You can see in, say, the first one there, that's a Cortex-M tool chain, or no, sorry, that's the A tool chain. And you can see the differences: one of them has Linux in the triplet and the other one doesn't. And then the ABI part at the end is a little bit different too, because on the full tool chain we're using the EABI with full glibc, and in the other case we're only using newlib. So, bare metal firmware. This is just a little bit of a digression, I guess, and a couple of pointers to some people that have been working on some of this stuff for quite a long time. They both have a lot of good resources in their blogs. The first guy is actually really good, and he's kind of where I learned how to hand-spin my own tool chains, from some of the stuff that he's done. 
And then the other guy has a bunch of Propeller stuff, and you can play around looking at the upstream vendor stuff, or you can just use his installer script, and that's really the best way to get the Propeller tools if that's the board you have. But ultimately you need to do your homework up front. If you're going to use these things for a project, you do need to understand what the limitations are as far as the hardware itself, the tool chains that are available, SDKs, and vendor stuff, depending on what you're doing. If you're just trying to learn something, then just pick one, because they all have interesting things to try and do with them. And if you just pick one, a lot of them cost like 20 or 30 bucks, so it's not that big of a cost just to get something and try it and see. That's really the best way to get some experience with this stuff. So here we are with some vendor-y tools and some open-source tools. I keep getting ahead of myself; I always do that. The Cortex-M GNU tool chain is the official ARM tool chain. If you go to the ARM site, they just point you back to the GNU tool chain and say, here, this is the official thing, and there is an upstream for that. They spin roughly quarterly releases. It was up to 5.4 a little while ago, and I think 6 is going to be their next release. So they're very current. On the FPGA side, that is definitely not true. There are some open-source tools for building FPGA code. This is a little bit different, because the firmware people actually don't write that code for the most part. They use a tool that generates the VHDL code from their hardware design tool. But that VHDL has to be compiled into a blob so you can actually load it into the FPGA. You're probably going to want to start with the vendor tool chain if you're going to try one of these things. 
They are a little bit more expensive than your common... well, yeah, that's what I was going to say, with the vendor SDK type stuff. You tend to see the BSP thing more on the full ARM side, or MIPS or whatever else you're building with OpenEmbedded or something. But yeah, if you're dealing with an Arduino and you're using the Arduino IDE, there's a bunch of libraries and stuff behind that that go with it, along with the tool chain. That's exactly what I was trying to get at before. There really aren't open-source equivalents for all of these things. There are for the ones that are popular: the AVR Arduino side got that going quite a while ago, and they should have the Cortex stuff in there now. It wasn't that long ago that they actually didn't have any Cortex support in the regular Arduino IDE package; you had to go to the beta version to try one of those. That should be a lot more mature by now. We have the TI tool chain for doing that stuff. The code generation tools are basically the compiler itself, the basic tool chain, and CCS is their Code Composer Studio, which is I guess their IDE for that. I don't use that. And then as far as the Propeller, you can do that in C now. You can use GCC and some other stuff. But their original tool chain was based on this Spin language that the Propeller guy, the guy that designed the hardware, actually designed as well. So it's very specific to the way that the Propeller boards work. So again, like I said, the rub with the FPGA is that it's very vendor-specific. That's one of the places where you need to do a little bit of homework if you're going to buy one of those things. And there are some, like I said, that are under $100. The Spartan FPGA board that I think Adafruit has is less than 100, right? And that DE0-Nano one is less than 100. So you can get started with FPGA stuff for not too much money. 
But before you pick one, you do need to understand what tool chain you need for it and how much support is there. There is a lot of support for the Cyclone FPGAs through Altera. They've been around for quite a while, and their tool chain is at like version 16.1. It's not the best documented. Actually, I don't want to say that, because there's a lot of documentation, but there are holes. And there's so much documentation that you sometimes can't find what you're looking for. It takes a while. The other thing is their integration glue is broken: the glue that links their FPGA blob with U-Boot and the Linux kernel. So you can't just build their stuff out of the box right now. You have to use... pardon? Which? Okay, I would like to see that, because it took us a few weeks to work through the issues with that. And the basic process that you need to follow on the U-Boot and kernel side is documented now in one of our GitHub repos. That integration glue used to generate DTS files to go with your FPGA, and you have to write that by hand now. Yeah, there are still a lot of tools for those languages. They've been around for quite a while. So you can do some level of verification on that code with tools from a couple of different places. But you're not going to be typically writing that from scratch like you would in another language. The hardware design tools absolutely need to generate that stuff based on the hardware design that you lay out in the visual side of the tool. So I think the last version of the Quartus stuff where that worked was 13-something, and you had to use a fairly old U-Boot and kernel if you wanted that stuff to still work. But I'm not sure if it does. It does, but it lost all the integration stuff. So yeah, there is quite a bit of try-it-and-see kind of testing that goes into this stuff. 
I didn't think it would take us this long to work out something for the customer on that. Yeah, a big part of those tool chains is typically being able to simulate what you've designed before you actually burn it into something. Got a good discussion going; I get all the ringers in my talks. The custom Linux integration link down there points to the repo where we have some notes and documentation on that broken Altera FPGA stuff. Yeah, you wanted me to do that. I've got to remember. So the other thing is, when you're working with an FPGA, you're going to be playing around with device tree stuff. Unless there's some other vendor that has way better tools, you're going to have to write some device tree stuff to go with the hardware that you implement in your FPGA. Otherwise the Linux kernel won't know anything about it. So, examples. We have a tool chain, we have an SDK, we have some kind of a dev kit, a dev board. Now what? Well, you can start out by compiling some of the vendor-y stuff initially. Like I said, I expected more of that stuff to work. In the case of the Altera stuff, most of their demos were broken as well with a new tool chain and new kernel, and I didn't want to roll everything back to the old one that they don't really support anymore. So we worked through it. But the first thing is to start with some of the vendor stuff, and I would say that even for the PRUs. So the first thing would be to just follow some of the TI hands-on labs and use their CGT tool chain, and then you'll start to get a feel for it, and then you can do more stuff. So connect up. The first thing you want to do is compile one of their demos. Build one of those demos, flash it to the thing, then connect up any cables that you need and fire it up. I couldn't believe there's actually a Wikipedia page for the ON/OFF switch. That's Air Force terminology. 
Each of the boards is going to have its own kind of standard set of interfaces, as well as vendor-specific things. So you won't be flashing FPGAs the same way across various vendors' FPGAs, most likely. You're going to have to use the Altera stuff for the Cyclone FPGAs. So you can't mix and match quite as much as you can in other ways. It depends; they have a couple of different interfaces. The Altera boards have a USB Blaster interface, which is actually just a host port, so you just plug a normal cable into that and you can load your FPGA stuff that way. I kind of like the U-Boot loader myself, because it makes it really easy to redeploy stuff and test it. But that only lives on the machine for as long as it's powered on. If you load it from U-Boot, it goes away when you power it off. There is a flash location where a vendor blob is loaded in the normal flash place, and that's where you would load yours if you were actually shipping something to somebody. But that flash has very limited write cycles, essentially. So you don't want to do that; you want to write stuff there as little as possible. So like I said, the Altera boards have that Blaster interface; the other boards will have different interfaces. It depends on how they physically connected their FPGA and what kind of hardware they have in front of it, like a SPI flash or whatever it is. Let's see, that one we haven't talked about; we're about to get to it in a second. But the Nordic Semiconductor nRF52 is one of those Cortex-M boards. Their dev kit board has a bunch of stuff brought out on it. It's a real, actual dev kit that has hardware connectors and peripherals so that you can evaluate things and figure out what you want to use. You won't get that on a lot of other ones, especially if it's a custom board spin that somebody's done to make it super small so it fits into a door lock or some other kind of oddball place. 
Those boards might have just a couple of tiny pads for you to connect your debug thing to. So it all depends on what you're actually playing with. The ESP Wi-Fi one typically just has a micro USB connector; you can power it with that and you can flash the firmware with that. And there's a little Python tool, esptool.py, that you use to flash whatever you want onto that thing. You can flash anything from MicroPython to NodeMCU. That thing has a ton of different open-sourcey things you can load into the flash. So yeah, the DE0 and DE1 Altera boards, like I said, have a USB Blaster, they have JTAG and serial UART connectors, and you can also load them through U-Boot. And then the BeagleBone, essentially the TI ones: if they can find firmware for the PRUs in /lib/firmware, they just load it when it boots. And if you want to load something different, you can use device tree overlays. You can stop the PRUs, load new firmware into them, and start them again right from the BeagleBone. Then normally you're going to want a debug interface, and that's going to depend on the specifics of what you have in your hands. But you can pretty much count on at least a JTAG interface. It might not be pinned out, but there's JTAG on pretty much every board you have. Not the BeagleBone, but most of them. Most of them are either 5 or 3.3 volts. A lot of those smaller ones are going down to essentially 3.3 volts, and you can run them off one of those lithium batteries. A lot of them, like the ESP chip, usually come with a battery connector too. If you get the board from Adafruit, their Feather board or whatever, it's also got a battery connector on it, so you can power it off a 3.7-volt LiPo. So I think I've brought some of this up already, but those Altera hybrid boards, like I said, are very different from any of the normal ARM hosts that you've played with, or that you have, like a Raspberry Pi or something. 
It's truly a hybrid board, so with just Linux on it you can't really do anything. There's not much useful that you can do with that board if you only boot it with Linux and you don't have anything loaded in the FPGA. So that's where their integration stuff gets important: the integration stuff in the tool chain, in U-Boot, and somewhat in the kernel. That's kind of the part that's not well documented right now, and hopefully we can make that a little bit better. Again, you don't necessarily want to roll back to some old, unmaintained set of vendor tools just to get the stupid thing to work. That's something that I have a real hard time doing. So if I can possibly get it to work with current stuff, I will. And you can check some of those. These boards are actually used in engineering courses on processor architectures and that kind of stuff. So there's a lot of extra information that you won't find in the Intel Altera documentation. You can find it on some of these university sites; that ECE 5760 course is one of them, because they've been using that DE1 board to teach that class for a while, so they have a lot of good documents. Did I run out of questions? So, examples on the PRU side. The only thing that's really potentially confusing, I think, about the BeagleBone PRU stuff is all the different kernels that get used and that are still used. You'll see people using the old 3.8 kernel, and they have PRU stuff that they do via the old UIO interface. That's basically a user-space interface into the kernel, essentially. You want to try to stay away from the older stuff if possible. So there are a couple of long-term support kernels, the 3.8 and 3.14, but most of the work these days has been going into the 4.x series. 
But what changed over time was essentially all of the important bits that you use on the kernel side, so the device tree overlay stuff got better. Actually, I think Beagle started that stuff. A lot of other hardware uses it now, because they share pins between different peripherals, so you can only fire up one of them, not the other one, unless you load a device tree overlay that switches the hardware configuration, essentially, so the kernel knows about the new configuration. So the old kernels use that old UIO interface. For the newer kernels there are newer software support packages; there's a TI support package and a BeagleBoard support package. The BeagleBoard one was not quite keeping up, but the TI support package is all up to date, version 5 or newer, and it only talks about the new way and doesn't talk about the old way. So there are a couple more good links in there, and there are a few other kernels. You can run a mainline kernel on a BeagleBone; you just won't get quite everything working just right. For the most part it does, but there are still patches going in, and the TI staging kernel is probably the one that you want, because it's the latest version of the official TI support for that hardware. So that's kind of where you want to stay. We have a little bit of stuff on GitHub for BeagleBoard; there's a BSP platform repo that you can use to build OpenEmbedded images for BeagleBone if you want to, and that will also install the tool chain for the PRUs, because you can run those compilers on the BeagleBone as well. And then you want to try the TI hands-on training labs, because those are updated to go with version 5 of their PRU package. There's a little bit of hardware that you would need to do all of those training labs, but a lot of them are fairly easy to change a little bit so you can just plug your own LED into the BeagleBone, because they're using a cape. 
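An overlay like the ones described here is just a small DTS fragment that gets compiled and loaded at runtime. A minimal sketch of the shape follows; the node names, the pin offset, and the mux value are invented placeholders, and the real ones come from the board's pinmux tables and the SoC reference manual:

```dts
/* Sketch of a pinmux overlay; values are placeholders, not from a
   real pinmux table. */
/dts-v1/;
/plugin/;

/ {
    fragment@0 {
        target = <&am33xx_pinmux>;
        __overlay__ {
            my_led_pins: my_led_pins {
                /* <register offset, mux mode> pairs */
                pinctrl-single,pins = <0x048 0x07>;
            };
        };
    };
};
```

Loading a fragment like this is what lets the kernel hand a shared pin to one peripheral instead of another without rebuilding the base device tree.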
There's some kind of a... anybody know about that one? You can tell me. Yeah, but what does it have that you don't get otherwise? Okay. Yeah, right, so that's what I was trying to get at: you can go ahead and breadboard that stuff if you really want to, but you could just try a couple of them with basic LEDs and stuff, and that's good enough to get going and see if it works. So they are slightly different formats. There's a little bit of GCC versus TI CGT stuff. The ELF format is now the same in both of them, but the relocation stuff is different and the ABIs are different. So you can do a little bit of homework and see if you're okay with the GCC side, or just try both of them. But if you're unsure about anything, or you're just starting to get into it, then start with the TI tools, because at least those are pretty mature and pretty well documented, and there's lots of demo stuff out there that should work with them. Are you talking about mixing? Yeah. No, no, you don't. You want to use one tool chain to build your firmware. It's just that there are differences, so if you expect a certain behavior, you might not get it. The big things for the TI stuff are all in basically their processor wiki. They have a lot of stuff on the web, so it's fairly easy to find what you're looking for. Not as bad as the Altera stuff, I will say that. And some interesting applications for BeagleBones and Altera stuff. I don't really have much that's interesting for that Altera FPGA except a working demo. That's the only thing. It actually fires up the video and stuff, so it makes all that stuff work that's not connected to the ARM side. So that's their demo, not the sound one, but the frame buffer one. But the sound one is probably the most interesting one that I could find, because it's not one of theirs. It's actually some other guy that did it, and it uses open-source FPGA cores. There's a whole OpenCores repository that's full of open-source IP cores for FPGAs. 
It's stuff that you don't have to write, essentially, or have your tools generate in this case. So BeagleBone has a couple of cool projects, and some of that work was related to Google Summer of Code stuff: the SPI slave drivers, the picture on the title page. That's kind of interesting. The PRUDAQ is a nice piece of hardware that fits on your BeagleBone and lets you do much higher-speed data acquisition than you can with the ADC that's built into the BeagleBone itself. You can get high enough sampling rates with the PRUDAQ to make logic analyzers and oscilloscopes and stuff. That's what BeagleLogic actually is: a logic analyzer that can use the PRUDAQ cape, but it doesn't have to, I don't think. So what do you do? How do you choose? It depends. Seriously, it depends on what exactly you're thinking of, what you want to do, because there's so much out there that's potentially useful, right? So you do need to kind of define what your requirements are, especially if you're looking at doing a potential commercial product, a Kickstarter, that kind of thing. That's where you want to do a real good requirements analysis. But if you want to play around at the makerspace, if you want to learn something, bring something to your Linux user group, then you can essentially just pick, right? But there are some things you want to think about. And like I said, just buy one, try it and see, but use some of this to help you decide where to look. And of course you want to look at places like Adafruit and Element14 and SparkFun. These are probably the best vendors for most of this stuff. Whether you're looking for a breakout board with a sensor on it, or a breakout board with an actual microcontroller on it, or an IMU, or whatever it is, they've got it. So you can look at the Linux on ARM wiki. There is both FPGA and PRU-enabled hardware on the list. 
And the thing about the Linux on ARM wiki is that the guy that maintains that stuff supports it with patches and Debian repositories and kernel builds and all kinds of stuff. So it's one of the better places to go to get something that you can actually work with, as opposed to picking something off of the place where you're buying stuff when you don't really know what the support is. So that's actually a good place to go look and see what version of U-Boot he's using and what version of the kernel, things like that. If it's a combo board, it's not really going to help you too much if you're just looking for an ARM Cortex, but then you've got Arduino stuff. That's probably the place I would start.

But do think about things like the community around the device that you're thinking about buying. If there's an active community and they have a lot of projects and people doing things, then you can pretty much be guaranteed that there's reasonable support for it and that there aren't any giant holes. But if you're going to do the other thing, if it's not a fun project, then you really do need to understand what your requirements are. And you need to start with evaluating more than one board, most likely. And you need to figure out what some of those hardware limitations are, to feed back into your project requirements. Because you can think up some cool stuff and assume you could do it, but it may not be true once you get the actual hardware in your hands. So he's laughing because we had some of those fun issues.

So where to go? Like I said, the Nordic Semiconductor stuff is actually fairly well supported and documented. If you want to start with something like their development kit, or something based on one of those, it's not a bad option. And like I said, the official ARM toolchain for those Cortex-M guys is actually the GNU ARM Embedded toolchain. So if you go to the ARM developer site, they'll just point you back at the GNU ARM Embedded site.
And then we have a few extra tools for that kind of stuff on GitHub. The little programming wrapper there is basically just so you can use the Nordic Semi tools from another ARM box, like a Raspberry Pi. Because for some of the tools they provide, like the debugger, you can download their debugger as an ARM build for the J-Link stuff. And you can build the toolchain on an x86 box. You can build it on ARM too, and I actually did that, because I wanted to do this stuff from the Pi. The Pi was already integrated into their test and deployment process for this little tiny Cortex board, because they were using the Pi 3's Bluetooth to talk to the Bluetooth on the Cortex. So I just went the next step and built the toolchain for ARM, the ARM toolchain on ARM. And then it turns out that the last thing we needed to get that stuff working was a little wrapper, because there are some x86-only pieces to the tool suite from those guys. So you just need a little wrapper program to be able to flash the Cortex board from your Raspberry Pi via the J-Link thingy.

Yeah. Well, are you talking about like RTEMS or something, or one of those? FreeRTOS? It depends on what the board is. I know that the ESP chips come with a tiny RTOS in their flash when you buy them, I think. I mean, it depends. If it comes with the NodeMCU stuff, it won't have that, but they do have a little RTOS for the ESP chips. And there are non-RTOS flash loads for them, too. I think MicroPython doesn't use the RTOS, and I think the NodeMCU flash does not use the RTOS. But it does have one, and it's available. Now, as far as other RTOSs go, that's a bigger question. That takes a little bit of research, I think. But there is an RTEMS Google Summer of Code group, at least there was last year, so that might be a place to start, since I think they do some stuff on ARM already. But I don't know if there's an RTEMS that's small enough for something like a Cortex-M4.
So, BeagleBone and BeagleBoard info. Like I said, there's plenty of stuff. The Linux on ARM wiki has some good Beagle stuff, and the BeagleBoard stuff is all on the eLinux site. And then an example from our thing. And the DE0 stuff, the Altera, there is a little bit of support there. And Robert's got a page on that one too, on the DE0, and that was helpful for me. So even though we were using the DE1 and trying to do something completely different, it was useful for me to run through his stuff a little bit just to see, because the DE1 and the DE0 are very close in terms of what they are, so a lot of the support is actually reasonable and usable on both.

Am I pushing the time limit here? Okay, well, this is pretty much the end. So that's it, yeah. So, questions. Got them all asked during the talk, huh? This is what I learned in the last few months, so. Thank you. All right. Thank you for coming. Actually, I was forced to learn. Again, this has been videoed. If you came in late, you missed part of it. So that will be online at the SoCal Linux YouTube channel, and the slides will be uploaded to the website. So thank you very much for coming, and we'll be back in here in half an hour.

Okay, welcome everyone to the 4:30 session of the embedded track, on the second day of SCaLE. Just a word of warning, this is being recorded back there for posterity. Wave to the camera. Okay, there we go. Thank you. The slides will be available afterwards, uploaded to the SCaLE website, and the video will be located on the SoCal Linux YouTube page. So that'll be available. I always botch your name, and I know I shouldn't, because I've known you for a long time. Hanyu. Hanyu will be speaking today, and he'll be talking about all the different roles that embedded Linux plays within IoT. So take it away. Thank you. Thank you, Tom. Hello everyone.
Thank you for coming to my session about the main roles of embedded Linux in the IoT ecosystem. Briefly about me: I'm an embedded Linux developer. In my day job I do embedded Linux consulting, so I help people build things; I help people do all sorts of stuff with Linux. This is something that I deal with on a pretty much everyday basis. I might sometimes go a little fast, so feel free to interrupt me if I'm going a little bit fast.

So with that, let's begin. Today we're going to talk about several things. First, we'll explain what IoT is. We'll look at the different roles of an IoT device and what the different pieces are. We'll then look at what embedded Linux is, and how it relates to the regular desktop Linux that you may be familiar with. We'll look at the different roles that embedded Linux can play in the IoT ecosystem. And we'll compare Linux versus bare metal, because for a lot of IoT devices, bare metal seems like the natural, obvious choice: why throw something as heavy as Linux onto a lightweight IoT device? So we'll look at some of that. Then we'll look at some real devices. First we'll look at scenarios of how these things fit together, then a few examples of what's on the market. And I built up a little IoT device to demonstrate various things.

And one thing that's often not talked about is security. You may remember the news from a few months back: there was this big denial-of-service attack where the hackers were leveraging IoT devices to build up a huge botnet to launch a denial-of-service attack. They were so successful because there were so many devices out there, we're talking about millions if not billions of devices, that they were able to do a denial-of-service attack without using the traditional amplification techniques that were used in earlier attacks.

So what is IoT? One thing is that it's a big buzzword. It stands for the Internet of Things.
These are connected devices, devices that you can talk to using your phone, which generally implies a device connected over IP, or at least eventually connected over IP. These devices are accessible over the Internet. By Internet, I mean both the public Internet, the capital-I Internet, and also private networks. For the rest of this presentation, unless I specify otherwise, I'm using "Internet" to refer to both the public and the private. Yes, at the edge it might not be IP, but I will mention that in a few slides. So the things I'm talking about would be things like sensors and switches, devices that you can control over the Internet, or at least access and make use of from the Internet. And the useful thing is that you can talk to them over a phone. A device could be a thousand miles away and you can still interact with it. So incredibly useful stuff, and it's a new thing and everybody's talking about it.

Ultimately, we want to be able to talk to these devices over IP, either IPv6 or IPv4; those are the traditional way. And note that a device, if it's doing that directly, would typically have either a wired or a wireless connection. In the case of a wired connection, that means having Ethernet, or USB if you have another device to act as a proxy for it. Or you could do IP directly over wireless; for that, you'll typically be speaking Wi-Fi or classic Bluetooth. Those are relatively high-powered options. Alternatively, you could have IoT devices that natively speak something like ZigBee, a custom proprietary RF link, Bluetooth Low Energy, or even a cellular 3G network. Those generally do not natively speak IP. What this means is that you're going to need another device somewhere in the ecosystem to translate ZigBee, Bluetooth Low Energy, and these other protocols into IP. And we'll look at how Linux is a natural fit for such a function.
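As a rough sketch of what that translation and aggregation function looks like, here is a minimal Python illustration. Everything here is hypothetical: the non-IP sensors are simulated as plain callables standing in for ZigBee or BLE reads, and `SensorGateway` is not any real library, just the shape of the idea.

```python
import json

class SensorGateway:
    """Aggregate readings from non-IP sensors and expose one IP-friendly
    JSON snapshot that a web server on the gateway could serve."""

    def __init__(self):
        self.sensors = {}

    def register(self, name, read_fn):
        # read_fn hides the non-IP protocol (ZigBee, BLE, custom RF)
        # behind a simple function call
        self.sensors[name] = read_fn

    def snapshot(self):
        # One aggregated document instead of a thousand exposed IPs
        return json.dumps({name: read() for name, read in self.sensors.items()})

gw = SensorGateway()
gw.register("greenhouse-temp-c", lambda: 21.5)  # stand-in for a BLE thermometer
gw.register("door-switch", lambda: "closed")    # stand-in for a ZigBee contact
print(gw.snapshot())
```

The point is only the partition: the sensors stay on their cheap, low-power protocols, while the gateway side speaks IP on their behalf.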
The roles for IoT. Well, this is actually pretty straightforward. In IoT you have sensors and devices, and they can be divided up. There's the self-contained device, meaning something that natively speaks IP. It's one piece: it has some kind of processing function and a sensor interface, and it also handles the interface to your phone or whatever, running its own IP server for access, or some IP connection. It's self-contained. Then there are also self-contained devices that don't directly speak IP. By that I mean, say, a thermometer out there that might speak Bluetooth Low Energy. It's also self-contained, it's the device itself, it speaks a protocol, but it's not native IP.

Then there is the gateway. That's mainly used for devices that don't natively speak IP. So you have a phone that connects to your gateway, and the gateway in turn speaks ZigBee, Bluetooth Low Energy, or whatever, back to the device. Another reason why you might have a gateway in an IoT ecosystem is that you want to aggregate things. Consider an environment where you're trying to measure temperature over a large area with a thousand devices. You could potentially put a thermometer at each one of those points and give each one its own IP address, but that becomes a management nightmare. A gateway lets you aggregate all those things and not expose all those individual IPs to the world. In fact, you may not even want to use IP to talk to those devices; you could use some kind of proprietary protocol which is better adapted to handling a large number of devices. But in order for that to work in an IoT ecosystem, you need a gateway.

So what is embedded Linux? I know some people here that I recognize are familiar with embedded Linux, but if you're not, well, embedded Linux is basically the same as desktop Linux. It's built from the same source base. It runs the same Linux kernel.
The biggest difference is that the user-land piece is put together differently. As an analogy, you could look at what's different between desktop Linux and Linux for a server. It's still pretty much the same Linux kernel, but the user-land pieces are different. On a desktop you might be running KDE or Unity, some kind of UI, whereas in a server environment you might be running something else for managing the server. Embedded Linux is a similar idea. You may have a UI that's a web server for configuring it, in the case of an IoT device. Or you might have some simple UI that's just an LED, a push button, an inertial sensor, something you tap to configure the device. But that's all confined to user-land. It doesn't really impact the kernel piece. So the kernel piece is common, but the individual user-land pieces are different.

Another thing to look at is Android. There are plenty of phones out there, and that's just another example of an embedded Linux device. It runs the Linux kernel, but then you have a different user-land: that's the Android part. They don't call it Java, but essentially it's a virtual machine that's very similar to Java. It runs all those things in user-land, and that's different, but it's still the common core Linux code.

Another thing that's generally specific to embedded Linux is that it's targeted at particular hardware. Because you're running this on a phone, it's probably tuned for your phone. In contrast, desktop Linux is designed for all sorts of unknown PC hardware out there; during installation it will try to auto-detect, and it runs things like udev, which scans for devices, tries to load different modules, and adapts to different configurations. In embedded Linux, that's generally not the case. Oftentimes in embedded Linux you'd have things compiled in, so you wouldn't have as many modules out there.
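That compiled-in-versus-modular distinction shows up directly in the kernel configuration. A hypothetical fragment, just to illustrate the idea (the specific option names are examples, not a recommended config):

```
# Desktop-style config: build drivers as modules, loaded on demand
# after udev detects the hardware at boot.
CONFIG_USB_STORAGE=m
CONFIG_IWLWIFI=m

# Embedded-style config for a known board: the one driver the board
# actually needs is built in, and everything else is simply not built.
CONFIG_USB_STORAGE=y
# CONFIG_IWLWIFI is not set
```

The embedded build trades flexibility for a smaller image, faster boot, and no module-loading machinery at runtime.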
You may have embedded Linux that's targeted for your specific hardware. So, an example: you have a device that has a USB port. In the case of desktop Linux, you have a USB port and you can plug in all sorts of stuff: it could be a USB thumb drive, it could be a TV tuner. In the case of embedded Linux, you only handle the specific things that you support. You could go as far as locking out particular devices that you don't support; you could go as far as a whitelist for USB devices that says if you don't have the right USB vendor ID or product ID, the device is just ignored. So that's one thing that is different from regular desktop Linux.

So let's look at embedded Linux IoT roles. The obvious one, as I mentioned earlier, is acting as a gateway and aggregator, in the case where you're talking to, say, a whole bunch of sensors that do not natively speak IP. It serves to address a resource issue. The sensors may be required to run for a year or more on something small like a CR2032 coin cell. Perfectly doable with a simple processor system, but if you want to run IP over Wi-Fi, that just does not work. So a gateway lets the sensor itself stay small, while Linux speaks whatever protocol is necessary to the device and translates between that and the IP side.

Another case where you might use embedded Linux for an IoT device is a self-contained device. For example, there are plenty of IP cameras out there, and a lot of them simply run Linux. The Linux side has drivers that talk to the camera to acquire the images, and a web server that streams the data back out. It's self-contained: Linux has the drivers for all those pieces, and everything just naturally drops in. Another case where embedded Linux is great for IoT devices is augmenting what you have there.
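The USB whitelist idea mentioned a moment ago is often expressed with udev rules plus the kernel's USB authorization attribute. A hypothetical sketch (the vendor and product IDs are made up, and the file name is illustrative):

```
# /etc/udev/rules.d/99-usb-whitelist.rules (illustrative only)
# Default: de-authorize any newly added USB device...
ACTION=="add", SUBSYSTEM=="usb", ATTR{authorized}="0"
# ...then, since rules apply in order, re-authorize the one device
# this product supports, matched by vendor/product ID.
ACTION=="add", SUBSYSTEM=="usb", ATTR{idVendor}=="1234", ATTR{idProduct}=="abcd", ATTR{authorized}="1"
```

The same deny-by-default posture can also be set at boot with the `usbcore.authorized_default=0` kernel parameter; either way, unknown devices enumerate but are never handed to a driver.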
You may have a product out there that does great today. It doesn't connect to the outside world, but it does all the functions you need, and there may not be any good reason to redesign it. Or perhaps you started designing a device six or seven months ago, you're well down the design path, and all of a sudden the marketing department comes and says, we want IoT functionality. It's a great buzzword. We need it to sell the device. So your choices in that case are either to completely redesign the thing from scratch, throw away the last six months and restart, or to look at using Linux to help you augment the device, to just add the IoT functionality. Depending on how your device is designed, there are different things you can do, and Linux excels at bringing that functionality onto your device.

Linux as a gateway: like I said, that's a very natural role for Linux. Linux has one thing that's really going for it: the IP stack. It's mature. It's been tested. It's been around since probably before the year 2000, and it's been well tested. So let's say you have some weird network. You're going over the public Internet, and you can't tell what's in between. There's a slow link somewhere. For example, if you're using your phone, the phone goes through the carrier, and the carrier may in turn go through yet another link. And because you're in a remote location, you might be stuck behind whatever link connects back up to the real world. So you don't know the behavior: your packets may get reordered, may get dropped. Or if you're sending big packets, there may be fragments that get delayed. All sorts of stuff can happen. If your IP stack is not mature, you don't know what happens. So you try to deploy a product using an immature IP stack, and you get all these weird bugs.
The Linux IP stack, on the other hand, is well debugged. It's the same one that millions of users out there are running. They've tested it. They've complained about it when it doesn't work. You're leveraging all those years of test experience in your device. So that makes Linux a natural fit for the IP side of it.

Then comes hardware support. Linux has tons of hardware support for network connectivity. If you're going to be talking back to the IP network, possibly to the Internet, you're going to have probably either a Wi-Fi interface or an Ethernet interface, and there's plenty of support for that. You can see that in the number of routers out there. The famous WRT54G is basically an SoC running Linux; all it does is route between the different ports. And that same functionality is almost exactly what you need in an IoT device. Not completely identical, but very much the same, and a lot of times you can share the same hardware. You have drivers available. Even if a mature driver is not available, Linux is open source, so you may be able to find a driver that's not completely mature, or maybe not acceptable for mainline Linux because of coding style, different uses of buffers, or being written for an older kernel version, but at least that gives you a starting point. The alternative is starting from scratch, or hoping for the best from your vendor for the Ethernet chip or whatever chip you have there.

Linux naturally comes with Bluetooth support. There are multiple stacks, and picking which one can get a bit controversial. There's obviously BlueZ, which is open source. In recent years it's moved toward using things like D-Bus, which may introduce complications for the embedded side: all of a sudden, if you're using BlueZ, there's D-Bus, which pulls in other dependencies. But then there are other stacks available.
For example, Android in the last few revisions has moved to another stack called BlueDroid. It's open source, originally from Broadcom. It's not as well documented or as nice as the regular BlueZ stack, but it's an option. There are even commercial proprietary options available from vendors. For example, TI, for a lot of their reference designs, likes to push another stack from a third party. They'll provide it to you for free for use on their chips, but that's yet another stack. It runs entirely in user land. The bottom line is that you have all these stacks available to choose from, and they all pretty much drop into Linux.

On top of that, most of these stacks support both Bluetooth Classic and Bluetooth Low Energy. That means if you were to build a device in a very remote area where you need to backhaul Internet connectivity, you could potentially leverage a phone as your gateway. The phone could just be set up to do tethering: you could get Internet connectivity using the PAN profile, or on some devices the DUN profile. That's all naturally supported in Linux, so it's really easy to do. And if you do Bluetooth Low Energy, say you're making a device that needs to talk to one of the devices manufactured by a particular fruit company, where they try to lock everything down: well, they still leave the Bluetooth Low Energy protocol open, so by supporting Bluetooth Low Energy you can allow those devices to talk to your thing.

And even if Linux does not natively support all those protocols in the kernel itself, there's always the option of running protocols in user land. Things like ZigBee: there are stacks that support exactly that kind of thing in user land. Or maybe you need to do something proprietary.
There's always the user-land option available for it, and there are stacks out there sold by vendors, or made available by vendors to support their chips, that run in user land. So there are many options, and because of the user-land/kernel isolation, you get an additional layer of robustness.

Another thing Linux really excels at is adding IoT functionality. Like I was saying earlier, say you have a device, or you're designing a device and you're quite a bit down the process. You've committed time and resources, so you have an existing, almost ready-to-go device, but it doesn't provide IoT functionality, and for some reason, maybe the marketing department, maybe you decided you need that functionality, you want to add IoT to it. There are several different ways of doing it, and Linux is a natural way of doing that.

One way: say your current existing hardware, running the app you've designed or are working on right now, is reasonably large. It's a big SoC and you're not using 100% of its capacity. If it's running Linux, you can just drop in IoT support as another process. Just as on a regular Linux-based server, adding a web server is just another process: you could load in Apache right there, and it will provide IoT support right next to your app, on the same hardware, without any additional hardware.

Another case: you already have hardware, and you've decided to use something relatively small, maybe for BOM-cost reasons or because of resource constraints. You could always run Linux in series, where you have your existing part, which may not even be running Linux, and you connect it up to another external SoC that provides the IoT support, or the whole thing could be one particular SoC.
What you do then is virtualization: the existing thing keeps running your embedded RTOS, you use virtualization, and that talks back to the regular Linux side of it. There are many options; because Linux supports all these different things, you have many choices. But if you're really short on hardware, you can always throw in another small chip and you'll have Internet, and it can be done relatively quickly without a lot of redevelopment. Most of that, adding the IoT functionality, adding a web server, is relatively stock, general-purpose stuff. You can just pull it off the shelf, or you can find someone to put together exactly what you need. Some of my customers need some basic functionality and they're not familiar with Linux, so I go in there and help them add that right alongside their application. I never look inside their part; that may be their proprietary secret sauce. This approach is really flexible, and it just makes it really easy to do.

So the different ways that Linux can help with adding IoT functionality on top of an existing app come down to the different ways you can partition things. The simplest, but the one that offers the least isolation, is to just run a separate process. Like I was referring to, you could add Apache; Apache runs as a separate process. So you have one process that manages your device and holds your secret sauce, which you don't really have to expose, and there's another process that just does the IoT stuff. If there are any bugs on the IoT side, they don't really migrate into your thing. The drawback with processes is that they're still relatively tightly coupled to the whole system: two processes running with the same permissions can still interfere with each other. One could kill the other. So if you need more isolation, Linux offers the concept of multiple users.
By assigning permissions selectively, you can, say, allow your IoT side to talk to your main application and nothing else. An example of something that's already doing this is Android. Each time Android installs an app, it creates a new user, and each app runs as its own user. That effectively contains the permissions of a particular app, so one app cannot interfere with another app. The same idea can be applied to IoT devices, where you have your IoT-handling side and your main-app-handling side; by running them as different users, unless you grant them specific permissions, they're isolated. So in the case where you want to extend an existing design, you have your existing thing running as one user and your IoT side running as another user, which limits any potential damage the IoT side can do.

Going further down, Linux supports something called containers. This allows you to restrict a lot more resources. It still shares the same Linux kernel, which means that whatever your normal secret-sauce application side is doing, it's still sharing the same kernel, and it requires that side to be running Linux too. But this gives you a lot more granularity. It allows you to say specifically: this guy won't even see these resources, whereas the IoT side will see something else. So let's say you have something sensitive, like an FPGA, where you really, really want to isolate any potential interaction with the IoT side except for very specific interfaces. Containers can do that. It's getting to be more of a brute-force approach to isolation, but it's an option.

And if even that's not sufficient isolation, maybe because there's some regulatory thing, some certification to pass, or you really don't trust the developers to know exactly what resources they need, another thing you can do with Linux is virtualize things. In the case of virtualization, you have total isolation. Yes? Kind of.
What you would have to do is treat it as a completely separate machine: you could create an SSH login session, log in to that virtual machine, and start it that way. Or, if you don't want to do it from a command line, you could always use the startup scripts in that virtualized environment. Yes? In that case, you've got to trust the kernel to handle that. The kernel is really your gatekeeper there. That's why there are limits to it, and it might not be sufficient in some cases; hence the reason you might want to go further, to virtualization.

With virtualization, one thing is that you're actually emulating a full-blown machine. So whatever you virtualize will be running its own kernel, or its own bare-metal environment if you run bare metal in there. It's pretty heavyweight: you're replicating all the resources of running a kernel or whatever environment. But the good thing is that it gives you a lot more isolation. It depends on how you go about the virtualization; yes, you could potentially have leaks. If you send out addresses on a bus, you could bounce something off a peripheral and have it come back. That's certainly a possibility. So when you design something with virtualization, you've got to be very wary of things bouncing off of something else. There's no direct interaction, but yes, there are ways of bypassing it. If you really need isolation beyond that level, you might want to consider something more restrictive, such as two completely separate boards connected over a single piece of wire speaking one particular protocol. That covers most of the points there.

Embedded Linux versus bare metal. Why would you ever want to run embedded Linux? It uses more memory, it's heavier weight, it requires a bigger SoC. Well, let's look at the pros and cons of bare metal versus embedded Linux. Embedded Linux offers you a very large library of drivers.
For many devices, you can just pull the driver right out of the kernel tree; it's an option you enable, and it's there and supported. Or, because a lot of vendors, mainly thanks to the Android guys, have been writing drivers for Linux, you can probably just get one from the vendor's site. There are also many people on the mailing lists posting drivers. There are all sorts of options. Whereas if you're using a proprietary RTOS, or even a less proprietary one, because it's not as common and not shared across multiple environments, you probably won't have the drivers unless you're really lucky. I've actually worked with things like RTX, the one provided by ARM for free. It still lacks drivers for many things, so you wind up spending a lot of time just writing new code for that.

Then there's the SoC, the underlying device that you're trying to run on. Linux is widely supported; you'll probably find Linux support you can get directly from the chip maker. And if not, there's probably a port for the core, so you have some starting point. Let's say you have some arbitrary ARM core out there that you're trying to get Linux running on, and there's no port available for it. Well, ARM provides a reference Linux implementation for the ARM cores. It uses their so-called reference chips; it's not cheap to buy those, but it is something, and it's a starting point. The ARM core part is pretty much the same everywhere; it's the peripherals all around it that are different. So once you have that, you have a starting point, and your main work is asking: okay, is that IP shared with another chip, in which case I can probably reuse the driver? Or, worst case, you just write drivers for those specific pieces. That's one less piece of work that you have to do, and Linux gives you all that. Same thing elsewhere: MIPS has a similar situation.
Many of the cores out there that are used in SoCs are in that same situation. Yes, there are; Microchip is putting out new chips based on the MIPS core, and there are a lot of set-top boxes based on that.

Linux has a lot of application support. It's fine that you may be a low-level guy just doing all your little kernel hacks, but you eventually have to support somebody who's writing the UI or the other pieces. They may want to use all sorts of fancy libraries. They might want to run Python for things. They might want to run Java to control things. Some people are fans of things like Tomcat, all sorts of crazy things. Try getting those onto a bare-metal environment. Linux just supports that: you need it, drop it in. It may cost you more memory, in which case maybe you go back and decide it's truly too much memory, but that's a different issue. At least you can get something out there.

And Linux has tested application stacks, stacks that have been tested on the Internet and used all over. The famous one is the LAMP stack: Linux, Apache, MySQL, PHP. You can drop all those things into your device. Say you have an embedded Linux device, a very smart camera, as an example. You can drop all those pieces in and use whatever you have there. Or, the flip side: you can take the same thing, drop it onto your PC, do all your development there, verify that it works, and then move it over. So you have a PC environment with all sorts of debugging tools, a shell, local compilers; you can do your edit-and-test cycle, and once everything works, drop it into your embedded Linux environment. It runs pretty much the same kernel, so most things will just drop right in. The exceptions are if you try to be smart and, say, use hardware acceleration.
If you're on x86 and you write things for MMX, that's not going to work on ARM, where you'd use NEON instead. That's something you can potentially factor out of your code, but at least it gives you a starting point. So there are plenty of stacks available, and with embedded Linux it's just a drop-in. As I said earlier, it depends on what kind of functionality you need. You could go as far as using something like an STM32 if you're willing to go without an MMU, but that's an extreme case where you give up a lot. A lot of the time, for the overall BOM cost, you can't just look at the core itself; you've got to look at the peripheral trade-offs versus what you need externally. So it's a complicated question, not something I can answer here. Depending on your volume, you could go even lower in price. It really depends on your application. All right: the IP stack. Linux's IP stack is mature, so you don't wind up troubleshooting really weird IP conditions that you can't replicate in the lab and have a hard time debugging. That's a big plus for Linux. Another thing Linux offers is isolation between kernel and user space. RTOSes generally don't support an MMU. There may be some cases where they do, but when they don't, and you have a bug in your application, or someone on the IoT side sends you something that causes you to dereference a null pointer, you may not crash; you may just randomly corrupt memory. The isolation in Linux prevents one bug from crashing the entire system, so you can have another process watching over things, restarting them, and logging that there was a fatal error.
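That watchdog idea can be as simple as a supervisor process that restarts a crashed child. Here is a minimal sketch in Python; the command and restart limit are hypothetical, and on a real device you'd more likely use the init system (systemd unit restart rules, or BusyBox init respawn entries) rather than hand-rolling this:

```python
import subprocess
import time

def supervise(cmd, max_restarts=5, delay=0.5):
    """Run cmd; restart it whenever it exits with a non-zero code.

    Returns the number of restarts performed. A real supervisor would
    also log each failure somewhere persistent.
    """
    restarts = 0
    while True:
        rc = subprocess.call(cmd)
        if rc == 0:          # clean exit: nothing to do
            return restarts
        restarts += 1
        if restarts > max_restarts:
            return restarts  # give up; flag a fatal error instead
        time.sleep(delay)    # brief backoff before restarting
```

The point is simply that, because the crashing process is isolated, a second process can observe the failure and act on it, which you can't count on when everything shares one address space.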
Whereas if everything is mixed together, your system goes into an undefined state and all sorts of bad stuff can happen. So this user/kernel isolation can be leveraged to give you a more robust system, and one that's easier to debug, which is a big value-add if you're the one who gets dragged into debugging devices, as I usually am. Then there's user-to-user isolation. Most RTOSes are not designed for multiple users; there's really only one privilege context. They have tasks that let you do different things, schedule them, order them, and so on, but you really can't easily run different things with different privileges on those OSes. Linux, because it started out as a multi-user operating system, offers that, and you can use it to isolate the different parts of your system. Linux, of course, is open source. There are many RTOSes out there; VxWorks, for example, is completely closed source. It's very popular, but Linux is open source, and since all of you are here at SCALE, I assume open source matters to you. There are open-source RTOSes, but a lot of them are not; some of the bigger ones, like VxWorks, which is used in a lot of different things, are closed. Linux does nicely at all of this: you get all the value from the previous slide, and it's open source. And since it's the same thing that runs on your desktop, it's easy to simulate things. Say you're building some boards; your software guys can work on a desktop PC, simulating and developing the application layer before the hardware is ready. And yes, depending on how big your microcontroller is, there are versions of Linux that run on the ARM Cortex-M4 and M3. Those don't have an MMU; that's microcontroller Linux. The biggest drawback is that you lose the isolation between processes.
There's no hardware MMU to provide that support. Another big advantage of Linux is that so many people are developing for different parts of it that finding developers becomes a lot easier. Instead of having to find someone who knows a particular RTOS, you can just find a Linux developer and explain: yes, you're developing for Linux, but there are resource limitations; you can't go around randomly mallocing memory and forgetting about it. Other than that, it's pretty much the same, and they can do the development for you. So it's a lot easier to hire developers, be they employees or contractors. Now let's look at some of the problems with Linux. Linux has a bigger footprint; it requires more memory. Most of the time, Linux has a hard time fitting into SoCs with only on-chip RAM, say 512 kilobytes, so generally you're going to be forced to bring in DDR memory, either as part of a module or as an external chip. That's a drawback. And if you're not willing to give up the isolation the MMU offers, you're going to have to get a processor or SoC with an MMU on it. Because Linux has a file system made up of many different pieces, you'll probably end up using a bit more flash space as well, but with that flash space you get to work with packages instead of stamping out giant monolithic images. A big one against Linux is boot time. Linux generally takes anywhere from five to ten seconds to boot all the way to your application, whereas with bare metal you get an essentially instantaneous boot. If you're not used to dealing with Linux, that can be an issue you need to be aware of. There are tricks you can do.
For example, there's a concept called snapshot booting, where you boot once, snapshot the state, and every time after that you just do a resume. That can cut down your boot time. It's not perfect, and it takes a lot of work to get snapshot boot working right, but it's an option. A big one against Linux is that it's not real-time. There are patches out there, the PREEMPT_RT work, that claim to support real-time Linux, and they do give you some measure of real-time, but they won't give you the granularity you need if, say, you're trying to control a brushless DC motor directly. You're going to have a very hard time with Linux; if you need to react very quickly, pure Linux is not going to cut it. It's not real-time; don't even think about it. It depends on how you design the system, though, because you can always offload that work to an FPGA or something else with real-time capabilities. And yes, it requires that bigger SoC with an MMU, as I was saying earlier, unless you're really willing to give the MMU up; there is uClinux, which runs on something as small as the STM32, I think pretty much all members of that family, down to the Cortex-M3 on the smallest parts. You could also potentially wind up burning a lot more power with Linux if you're not careful. With bare metal it's pretty straightforward: you set a few registers, the device shuts down, all the clocks go away, and you're in a low-power state. Linux is a bit more complicated. There's a whole power-management framework, because each driver is supposed to do its part: the driver calls into the framework, and the clock manager is supposed to realize, oh, you don't need all those clocks, and shut them down. Unless the drivers have been written right, there are going to be debugging issues, so it becomes more complicated.
That's not to say it's impossible; it's just more complex in Linux. An example of how well it can be done is Nokia: the N800 and N900 series, if you're familiar with those, could run for a week on Wi-Fi, checking email, because they did a very good job of tuning the power management within Linux. Out of the box, Linux is not tuned for that, so it's definitely a drawback. Then there's open source licensing. For some people that's a problem, because the GPL is potentially infectious: if you incorporate GPL code into your own, you're under an obligation to provide the source to anybody you distribute binaries to. There are issues there, and some RTOSes are not subject to that, especially the proprietary ones, where you simply write them a check and you're done, whereas here you potentially have open source obligations to deal with. For the details, look at the license before doing anything. One way of addressing some of these limitations is to combine the two approaches: for some pieces you run bare metal, and for some pieces you run Linux. TI's Sitara AM335x, used on the BeagleBone, has these really fine things called PRUs; as the name suggests, programmable real-time units. They're there to handle the real-time aspects. They're little tiny microcontrollers; you run bare metal on them to handle your real-time tasks. Say you're driving a motor for eye surgery, like the example someone raised: you could use those to control it, it will stop in time, and it will flag back to Linux, which can make a decision and issue new commands. So there are ways around it: you get the advantages of Linux, all the different stacks, and you still get the real-time behavior. Another thing to look at is addressing the power situation.
You can use Linux as a gateway. As I was saying earlier, similar to the pacemaker point someone raised: say you have sensors out there, maybe gathering environmental data, a CO sensor, and the only room you have for power is something like a CR2032, a tiny little coin cell. You can't use Linux by itself there; in fact, you can't even run Wi-Fi on that budget. So what you can do is have that sensor run some kind of bare-metal OS, using either a proprietary wireless stack or Bluetooth Low Energy, which is low enough energy to give you the battery life and meet the other requirements, and then have Linux as a gateway that translates between those worlds. You get the advantage of aggressive power management in the place where it matters, and at the gateway, where you probably have more power available, be it a bigger battery, a bigger solar panel, or even mains power, you get the rest of your functionality. Another way of achieving this isolation, this marriage between bare metal and Linux, is obviously to use two different chips, though that's sometimes expensive in terms of both board space and BOM cost. Another way, as I just mentioned, is to use an on-board co-processor. And another way goes back to what I was saying earlier: virtualization. In one virtualized environment you run the RTOS doing what it excels at; you obviously have to tune the scheduling between the two to make sure the real-time requirements are satisfied; and the rest of the chip runs Linux. You can achieve a lot by combining the two. So that's all the basic stuff. Let's look at a sample IoT ecosystem. You have a bunch of sensors; take a house, a home automation example: you have a door sensor.
The door sensor tells you when the door is open or closed. A temperature sensor tells you how hot the room is; a humidity sensor, how wet it is. They all interface over Bluetooth Low Energy, so you need a gateway. You also have devices that are controllable over Bluetooth Low Energy: a fan, a light. They all go back through that gateway. And you have an IP camera; because a camera processes a lot of data, chances are it has a big enough SoC to run Linux natively, so it connects over Wi-Fi. The gateway aggregates all of this. You can also put logic in the gateway: whenever the door opens, turn on the fan or turn on the light; turn on the fan whenever the temperature exceeds a threshold. All of this can be accessed over the Internet. You could turn on your fan remotely: say you're about to go home today and want to cool the house down; you pick up your phone, turn on the fan remotely, and come home to a nicely ventilated house. There are all sorts of things you can do; that's some of the value IoT brings. Now, that's a hypothetical example; let's look at something a little more concrete. One thing that's on the market is the Wink Hub. You can find this device at places like Home Depot and Lowe's. It's an embedded Linux system. It provides ZigBee and Z-Wave radios plus Wi-Fi; I think it even has Ethernet. It links up with your phone. Before the latest firmware, I think the default case was that everything went through the vendor's cloud system, but with the latest version you can control it directly. The Wink Hub lets you bridge these protocols: say you have a light bulb that talks ZigBee; you can use the gateway to turn that light bulb on and off.
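The gateway logic just described (door opens, turn on the fan; temperature passes a threshold, turn on the fan) can be modeled as a tiny rule table. Here is a hedged sketch in Python; the sensor names and the threshold are made up for illustration, not taken from any particular product:

```python
# Each rule is (condition over the current sensor state, action name).
RULES = [
    (lambda s: s["door_open"], "light_on"),
    (lambda s: s["temp_c"] > 28.0, "fan_on"),
]

def evaluate(state, rules=RULES):
    """Return the actions whose conditions match the current state."""
    return [action for cond, action in rules if cond(state)]
```

A real gateway would run `evaluate()` each time a sensor reports over BLE and dispatch the resulting actions to the controllable devices.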
Same thing with Z-Wave devices. Another example on the market is this IP camera from D-Link, the DCS-93x series. There are different versions of it; some offer infrared LEDs for night use, and I think different lenses. It has both Ethernet and Wi-Fi, and it runs Linux natively. So I'm not just talking about a hypothetical example of running embedded Linux on IoT devices; these are a few examples of actual devices on the market. Yes, that was actually a password problem; I think that's the most recent one, the one you're referring to. What happened is that a botnet was scanning for commonly used passwords, logging into devices, modifying things, and then locking everyone else out. Didn't you have a question? Oh, yes, that doesn't help either. I picked these two devices because, one, I happen to be playing with them; they were right next to me and reminded me they're good examples. There have been several write-ups about hacking the Wink Hub, where you break into U-Boot to modify it, because the older firmware did not support direct access, and for what I wanted to do, I did not want to go through the cloud, so I was already looking at this device. The camera was just something else I was looking at. This one actually has a brand on it; there are other cameras out there I've been playing with that are generic, no-name, but essentially the same thing. Not really packaged distributions; there are websites, for example, about breaking into these things, and for the camera I believe there's an OpenWrt project building different images for it. So, I think you had a question? Yes, that's another potential way a cloud dependency like Wink's can come back and bite you. Okay, now let's look at something a little closer to home.
What I've done is build an IoT siren, just to show you how simple it is to build such a device. It's probably not the best example to model things on, but I have a write-up at this URL on some of the decisions I made and how I built it. It's built around a BeagleBone Green Wireless. I chose that device just because it has a Wi-Fi chip on it, so it's ready to go; it comes up as a wireless access point, so that's one less thing to deal with. The whole thing was constructed over the course of about two hours, not including the time to run out and get parts for the siren itself. The implementation took less than 20 lines of code; in fact, I think it's less than 10. That's the code saying I want to use these particular pins as an output signal to control the siren, plus some logic hooked into the web server saying: every time this URL is called, sound the siren. So here's the demo. I have my phone connected over Wi-Fi to the board itself, and every time I access the URL, which is the board's address followed by /iot (the details are at that link), the siren sounds. The reason I chose a siren is simply that I didn't know what room I'd be in and whether people could see an LED; an LED would be the obvious, simplest thing to do, but a siren is audible, and reasonably loud. The siren is not the most practical thing, but this is pretty much it: a little alarm sounder from a dollar store. It was originally powered by, I think, three coin cells inside; I modified it so a transistor just switches the power, and this cable goes back to the BeagleBone. This is just a little breakout board to make it a bit neater instead of soldering down the wires. So: security. Why do we want to talk about it? I didn't press any buttons just now; obviously someone's playing with it. It looks like the system got hacked.
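The pin setup for something like the siren amounts to a few writes to the kernel's GPIO interface. This is a hedged sketch, not the code from the write-up: the pin number is hypothetical, it uses the legacy sysfs GPIO interface for clarity, and newer kernels prefer the character-device API (libgpiod):

```python
import os

GPIO_ROOT = "/sys/class/gpio"  # legacy sysfs GPIO interface

def set_gpio(pin, high, root=GPIO_ROOT):
    """Export the pin (if needed), make it an output, drive it high or low."""
    pin_dir = os.path.join(root, "gpio%d" % pin)
    if not os.path.isdir(pin_dir):
        with open(os.path.join(root, "export"), "w") as f:
            f.write(str(pin))  # kernel creates gpio<pin>/ in response
    with open(os.path.join(pin_dir, "direction"), "w") as f:
        f.write("out")
    with open(os.path.join(pin_dir, "value"), "w") as f:
        f.write("1" if high else "0")

# A web handler for /iot would then just call set_gpio(60, True),
# wait a moment, and call set_gpio(60, False).
```

That is essentially the whole "under 10 lines" implementation: one GPIO helper plus a URL hook in whatever web server the board image ships with.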
Well, in this case it was hacked by a colleague, Sarah; she's another embedded Linux developer. The reason I'm doing this is to show you that security matters for these things. You might have a thing that sounds a siren or does something, and somebody might want to play with it; those are the less malicious people. Other people may want to use it to do other things. IoT devices are often on a public network, like you were alluding to with the problems a few months ago. These devices are sitting outside your firewall, and if you don't deal with security, all sorts of stuff can happen. And when you have a compromise, unlike a PC, where you may lose your files or people may lock up your data, there's a lot more at stake. If attackers can tell when you come and go, they know when you're not home: let's go rob the place. That's a physical security hazard, not just a device hazard. Or if you have an IoT device that lets you control things, they could do something malicious, like turning on a cooker you intended to automate, or turning on lights when they shouldn't be on, burning power, effectively a denial-of-service attack for some people. There's a lot more significance to an IoT security compromise than to a desktop security compromise. A firewall generally filters only IP. The question was: do things like ZigBee and Bluetooth Low Energy go around a firewall? They run parallel to the firewall, because they don't run IP; the firewall doesn't do anything with them. So conceivably you could have an attack on the Bluetooth Low Energy side. That's why Bluetooth Low Energy, the protocol itself, has things like pairing, where the devices check each other's addresses. There are different tricks: addresses can be spoofed, so you can encrypt packets, and you can also control the enrollment process so you know exactly who's on each side.
So there is security provided by those other protocols; what I'm talking about here is mainly the IP side. But yes, those are things you should be aware of, and it varies by protocol and device. Security is a very complex topic; I'm definitely not going to cover all the different aspects of it. There isn't enough time; in fact, I only have about 10 or 20 minutes left for the rest of the material. But those are certainly considerations you should have. With IoT devices, the thing about security is that even if you don't care about your own security, even if you don't care whether someone robs your house because you have absolutely nothing for them to steal, they can still leverage your device as a drone in a botnet. That's what happened a few months ago: there were so many compromised devices, each making requests all at the same time, that together they launched a denial-of-service attack without any single device doing anything too unusual. Securing embedded Linux covers a lot of ground, and there's just not enough time for everything, so let's start with some common things, and hopefully inspire people who are looking at building an IoT device to take security seriously. First: if you're dealing with embedded Linux, start with the same basic security principles you'd apply to a regular desktop. It's still Linux; it's not that different. You want to make sure that whatever you're running has up-to-date patches. If you're building a device on a Debian-based distribution, make sure you have the right patches. If you're doing an embedded device with a custom distribution, you're going to be responsible for that yourself; you need to identify which patches apply to you. And then another thing: you need to configure your device properly.
A general principle in securing desktops and servers is: don't turn on unnecessary services. Why turn on a print service if you don't have a printer? If there's a bug in the printer service, all of a sudden your devices are vulnerable to it. By not turning services on, you reduce your exposure. Another thing you can do, if you're writing the applications, is restrict access: look at the source address a request is coming from. There's no reason to be answering requests from anywhere in the world if you don't need to; you can just look at the address and refuse. And within Linux and Unix there are libraries that can do that automatically, driven by the hosts.allow and hosts.deny files; with the right library linked in, they'll automatically drop connections coming from addresses you don't allow. Another thing that's important, and more specific to embedded Linux, is determining which security advisories are applicable. When you're running Ubuntu, they'll tell you: you have this package installed, so this advisory applies to you. But with embedded Linux, you may have things compiled differently; you may not have turned on certain features, or you might have turned on specific obscure features, or you might have hardware that needs them. It's up to you to decide whether an advisory is applicable. Beyond doing things properly, there are brute-force ways of improving your security. These should not be your first resort, but I should at least mention them; if for some reason you don't trust things, or you just don't have time, they at least give you some security, even if it's potentially only the illusion of it. There is SELinux. SELinux lets you restrict pretty much anything, even from the root user. It's a very powerful tool; you can use it to control all sorts of things.
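If you're writing the service yourself, the source-address check described above is only a few lines. A minimal sketch using Python's ipaddress module; the allowed networks here are the RFC 1918 private ranges, which is just one assumption (a LAN-only device) about what your application should accept:

```python
import ipaddress

# Only answer peers on these networks (assumption: LAN-only device).
ALLOWED_NETS = [
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("10.0.0.0/8"),
]

def peer_allowed(addr):
    """True if the client address falls inside an allowed network."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in ALLOWED_NETS)
```

A request handler would call `peer_allowed()` on the peer address from `accept()` and drop the connection before doing any work on it.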
The problem with SELinux is that you have to write a policy specifying what can and cannot be done. Unless you completely understand your system, you could be introducing new holes you don't know about, because you don't understand your own security policy. So with SELinux it boils down to: can you write a proper policy that you actually understand, where you know exactly what it will do? If not, SELinux is not going to be that helpful to you. Another approach is to restrict access to your device at the network level. Say you don't want to modify your code to look at source addresses, or maybe you can't modify the code. You can use iptables. It's a packet filter: you specify that a connection coming from a particular address and going to a particular port should be allowed, rejected, or dropped. If you reject, you can appear not to support certain services; if you silently drop the packets, the other side hangs for a while until it times out. This is a brute-force approach. It's not the best way of doing it; you should really be disabling services you don't need and handling this at a higher level, rather than relying on a packet filter. Within embedded Linux itself, the security issues are similar. The big thing is open services and accounts. As I was saying, if you have a print server turned on that shouldn't be, it should be disabled. Be aware that when you're building a device, you're probably starting from a demo image, and a demo image will have all sorts of stuff enabled just because it's a demo image; it's showing things off. If something is not applicable to your device, you should go through and disable it. On the kernel side, you should audit your configuration to do something similar.
You may not want to support certain USB devices, because plugging one in might trigger something in udev and lead to a whole chain of other events; if you just disable it in the kernel, the whole chain never happens. Then there are demo root file systems. Often, just to make things simple, they have open accounts: you can log in without a password, or worse yet, there's a default password. In the case of that earlier attack, the bot was going around looking for devices, trying commonly used passwords, and using them to log in. So you should audit your password files and make sure you don't have anything obvious, or preferably don't use a password at all: lock the account completely and use some other mechanism. For example, if you need to log in for a debugging session, enable public-key authentication with SSH; that way there's no well-known password. Then there's sudo. The sudo philosophy is that there should be no root password: as a normal user, using your normal password, you should be able to do things as root. The default configuration often lets you sudo to root privileges for pretty much anything. That can be very bad on an embedded device. Say attackers manage to compromise a process running as a non-root user. That seems contained, but if that user has access through sudoers, they can use it to run things as root; suddenly they've escalated their privileges and can attack anything they want. So if you must use sudo, and you can't manage your permissions some other way, at the very least tighten up the sudoers file. And yes, root file systems... I'm running short on time, so I'm trying to keep moving. Okay, these are some common bad practices. I've actually seen this in certain customers' products: they try to run, let's say, ifconfig.
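Part of that account audit can be automated. Here is a hedged sketch that scans /etc/shadow-style content for accounts with an empty password field, meaning no password is required to log in; a real audit would also compare hashes against lists of known vendor defaults, which this does not attempt:

```python
def open_accounts(shadow_text):
    """Return account names whose shadow password field is empty.

    An empty second field means the account logs in with no password.
    Fields starting with '!' or '*' are locked, which is what you want
    for accounts nobody should log into.
    """
    risky = []
    for line in shadow_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 2 and fields[1] == "":
            risky.append(fields[0])
    return risky
```

Running something like this against the image you're about to ship is a cheap way to catch the "demo image left root wide open" mistake before an attacker does.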
They call system() with something like "/sbin/ifconfig" plus arguments, and the parameters to ifconfig come from a web user interface. Well, if you're a clever attacker and you realize that, you can type an IP address, then a semicolon, and then some other command, and all of a sudden you're able to run that command. The reason is that system() passes whatever you give it to the shell, and the shell will expand and interpret everything: a semicolon is the shell's signal that one command has ended and a new command is starting. So avoid things like system(). As an alternative, you can do a fork and exec, something explicit, or otherwise find some way of not passing user data through the shell. Another bad practice is failing to check data. Any data coming from a user interface is potentially suspect. Even if you're not passing it to a shell, if you're sending it to a database, check for quote and escape characters, because those can be used to do things like drop an entire table; there's a famous XKCD cartoon about exactly that. You should also check the lengths of the data you're getting, because your application back end might expect data of a particular length and only allocate buffers of that size. If you take something much longer from the user interface and store it there, you can have a buffer overflow. Depending on where the buffer is located (if it's on the stack, for example, it can be used to affect the return address), your system could be compromised, could crash, or other bad things could happen. The last two, in the interest of time, I'll combine: IP addresses and DNS names. It may not be obvious why you'd want to sanitize those, but this goes back to a point the gentleman in the back mentioned earlier about being able to bounce requests off your device.
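The system() problem above is easy to demonstrate and easy to avoid. This is a hedged Python sketch (the products in question were presumably C calling system(), but the failure mode is identical): validate the input against a whitelist first, then execute with an argument vector so no shell ever interprets it.

```python
import re
import subprocess

_IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")

def valid_ipv4(addr):
    """Cheap whitelist check: only digits and dots, each octet 0-255."""
    return bool(_IPV4.match(addr)) and all(0 <= int(p) <= 255
                                           for p in addr.split("."))

def configure_iface(addr, iface="eth0"):
    """Validate first, then exec with an argument vector: no shell involved."""
    if not valid_ipv4(addr):
        raise ValueError("rejecting suspicious address: %r" % addr)
    # A list argument makes subprocess fork/exec directly, so a ';'
    # in addr would just be data, never a command separator.
    return subprocess.call(["ifconfig", iface, addr])
```

With `system("ifconfig eth0 " + addr)`, the input `"1.2.3.4; rm -rf /"` runs two commands; with the argument-vector form above it either fails validation or is passed through as a single harmless argument.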
For example, say you have a firewall, and your device accepts requests from outside it, takes an IP address, and uses it to connect to something else; maybe it acts as a gateway. If you're not careful with that address, and your internal side uses reserved addresses, say the 10.0.0.0, 192.168.0.0, or 172.16.0.0 ranges, an attacker can hand you an internal address and use your device to affect something inside; effectively, they're taking control from outside. They can achieve the same thing with DNS names: say you accept DNS names, and a clever attacker arranges for a name to resolve to 127.0.0.1. All of a sudden they have control of your internal network, or can at least affect it, just by bouncing off your device. So you should scrutinize all the data you're getting. Another big one is running things as root; that's generally a very bad idea. When designing a system, you should look at the potential users and how to limit the privileges you run with. For IoT devices, HTTPS is generally a good idea. It prevents people from snooping, which is apparently a very common thing for people to do, so you should consider using it. And with embedded Linux, that can be relatively simple: it's just a matter of dropping in a server and the right certificates. This is another big plus for embedded Linux over bare metal, where you'd probably have to find an explicit HTTPS implementation. Yes, certificates can bring their own issues: say you have a device that lasts 20 years and your certificate is only good for five years; there's an expiration problem. There are potential naming issues too, since some of these checks are done against DNS entries.
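For the gateway scenario just described, the sanitization amounts to refusing to connect outward to loopback, link-local, or private addresses, including after DNS resolution. A hedged sketch using Python's ipaddress module; whether some internal targets should be permitted is application-specific, so treat the blanket refusal here as an assumption:

```python
import ipaddress
import socket

def safe_target(host):
    """Resolve host and refuse private, loopback, or link-local results.

    This blocks both the '10.x.x.x from outside' trick and the
    'DNS name that resolves to 127.0.0.1' trick described above.
    Returns the resolved address string if it's acceptable.
    """
    addr = ipaddress.ip_address(socket.gethostbyname(host))
    if addr.is_private or addr.is_loopback or addr.is_link_local:
        raise ValueError("refusing internal target: %s -> %s" % (host, addr))
    return str(addr)
```

Note that the check runs on the resolved address, not the name; checking the name alone is exactly what the 127.0.0.1 resolution trick defeats.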
To support that, you might have to provide provisions for, say, uploading a new certificate for that particular use case. But yes, those are risks you run into, which goes back to the point that these are security issues like any other: you address them with appropriate patching and updating of the system. Hiding things by using non-standard ports, for example hiding SSH on a port other than 22, doesn't really help. People out there are scanning for services, and by looking at the responses they can determine what kind of service you're running. So that's generally a bad practice. And okay, I'm running very close on time. IoT data issues: one thing to be aware of when you're building an IoT device is privacy. Like I said earlier, the Wink Hub sends all your data to the cloud. You should, at the very least, be aware of that, because data that's not under your control, that's under third-party control, may be subject to other terms. It's possible your cloud provider is selling off your so-called anonymized data for whatever reason. I'm not proposing a generic solution; I'm proposing that you be aware of what you're doing and understand what's going on, instead of just saying, oh, the cloud is cheap, let's do that. That's not a good way of going about things. In conclusion, since I'm pretty much out of time: embedded Linux is a good way to build an IoT device. It's a fast way to add IoT functionality to new devices. Embedded Linux really excels at the gateway, because of the large amount of device and protocol support. And when building an IoT device, always be aware of security, because it will come back and bite you, either now or later in the form of regulations, when nobody does security and the government comes in with a big, heavy hammer. Questions? There's a microphone.
Can you please repeat it? Oh, you want me to repeat it? Oh, okay. The question is basically, are there any best practices for pushing out patches and updates? That really depends on your particular application and the amount of resources you have on there. For example, if you're running a Yocto-based distribution, you could generate updates as IPK packages that the device could accept; just make sure you have them signed so you can verify where they're coming from. For smaller systems, there might be other things you need to do, so it really depends on what you're doing; there's no generic answer. Android, where each phone vendor actually has their own different update system, is an example of an embedded system that has similar issues. Thank you. You had a question? Yeah. What do you normally use? A distribution like Debian, or do you use Yocto or OpenEmbedded, or what do you like to use for creating your file system and packages and things like that? It really depends. In my case, since I do this stuff for customers, I'm often faced with whatever the customer wants; that's number one for recommendations. Yocto slash OpenEmbedded is a great choice. It gives you a lot of control over exactly what you need in there, and it figures out the different dependencies for you. A very important thing is that Yocto actually sorts out the different licensing issues: it generates source packages so you can fulfill the requirements of the GPL. For other, smaller things, I've been known to hand-roll my own file system. It really depends on what the job is, how big and how complicated it is, and input from the customer. I think there was a next question. I just want to add a little bit to the comment about certificates on the device. If you have an IoT device and you're doing SSL serving, now your certificate and your private key are on the device somewhere.
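The advice above, sign your update packages so the device can verify where they came from, can be sketched very simply. Note the hedge: real Yocto/opkg feeds, swupdate, RAUC and similar systems use public-key signatures; the HMAC below, with a hypothetical shared provisioning key, stands in only so the sketch needs nothing beyond the Python standard library.

```python
import hashlib
import hmac

# Hypothetical per-fleet secret provisioned at manufacture; a real
# deployment would verify a public-key signature instead.
DEVICE_KEY = b"hypothetical-provisioning-key"

def verify_update(blob: bytes, expected_sha256: str, tag: bytes) -> bool:
    """Accept an update blob only if it is intact (hash matches the
    manifest) and authentic (tag was produced by the key holder)."""
    if hashlib.sha256(blob).hexdigest() != expected_sha256:
        return False  # corrupted or truncated download
    expected_tag = hmac.new(DEVICE_KEY, blob, hashlib.sha256).digest()
    return hmac.compare_digest(expected_tag, tag)  # constant-time compare
```

The two checks fail independently: a bad hash catches transfer corruption, while a bad tag catches an update that didn't come from the signer, which is the property the signed-IPK advice is after.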
If they're part of some kind of network and a device gets stolen, there's a whole issue there too, where somebody could pull your key off of a flash chip. Ideally you would have a unique key for each device, so if that one gets compromised, you get just that one revoked. But if you have a certificate that's signed all the way up the chain, you're not going to buy a new certificate for each device. Well, in the case of serving, you really should have a certificate per device. Each device is not going to be named the same way, so in order to authenticate the individual devices, you would have to have a new certificate for each. So in that case, it may make sense for you to try to... there are, I think, open certificate authorities that you could use to do something like that. I don't know the names, but they are open. Yeah, that's another way of going about it; there are some logistical management issues associated with that. We are kind of running out of time, but I'll take more questions. Yeah, if there are any more questions; if not, I'll be available here. In terms of the certificate signing and stuff like that, the devices that we see currently in today's market, like the camera or whatever you've shown, do they really do the certificate stuff? What I've seen on the market, not the name-brand ones but the generic ones, is that they use a self-signed certificate. So you have the risk of not knowing whether the party you're trusting is who they claim to be. On the other hand, they do have certificates, so at the very least it gives you some protection so people can't just be snooping in on your connection. Obviously, you can't use it to verify who it came from; that's the problem with self-signed certificates. Okay, I'll be available otherwise. Thank you for coming. The slides are up online. Thanks. Thank you. We're going to have to keep an eye on my dogs. My wife has to work on weekends and the dogs are home alone.
So until the talk gets started, we'll keep an eye on my cam of my dogs here. This is being broadcast to you by a GPL-violating device that I bought not realizing it violated the GPL. It's a web camera, and we haven't gotten it into compliance yet, unfortunately. But yeah, I figured out it was Linux as soon as I got it, and there was no source or offer, because 90% of the devices you'll buy will be violating the GPL; that's sort of the world we live in these days. But, unfortunately, today I'll be talking to you about a device that does not violate the GPL. But until that starts, we can watch my dogs on this GPL-violating device. They're sleeping. I saw two people brought their two pugs here, so I could have actually brought my two pugs. This camera, the resolution, I'm using the low-quality feed so that I don't use up network unfairly, but there are the two. It's actually giving me an IP number, so. I'm using the slow one; the fast one somebody said was broken into, so. Okay. Oh, I see. Yeah, I heard you were talking about that a minute ago, yeah. The dog walker is supposed to show up at 6 p.m. Pacific, so I'll turn it off soon because I don't want her to. I told her there was a camera there, and I said there's a new dog walker, but she doesn't necessarily know that I was going to put it on the screen before my talk, so I didn't warn her about that. Live-stream the dog walk to the world, yeah. I'm sure that's thrilling, going to be thrilling. She puts the harnesses on the dogs and the leashes and takes them out the door, you know. The camera doesn't move around, so you can't move it around to follow that, but I'm sure that's the thrilling content you were looking for. But yeah, she knows it's there. I told her; I always want our dog walkers to know there's a camera. Mainly I use it to do this, to make sure that they're sleeping, because that's most of what dogs do generally, pugs especially. Anyway, so now I'm going to close VLC here because it's time to start the talk.
So this is the first time ever that I've given a talk about GPL compliance issues where I actually had, in a competing talk slot, someone presenting a known GPL-violating technology, which is kind of funny. So I'm glad you gave me this talk rather than that one. I leave it as an exercise to the reader which talk competing with mine is about something that's violating the GPL. It should be easy to figure out if you cross-reference Conservancy's blog with the list of topics. But what I actually want to talk about today is something that I've wanted for 10 years, or maybe 15, to be able to talk about. But before I get there, I'll give you a little sense of the organization that I work for and what we do. So we're a charity based in the U.S. We have a booth in the expo hall if you want to come by and get more information. But one of the things I like to point out is that being a charity actually puts a very nice legal requirement on me. Now, most people work for a company. How many people in here work for, or somehow consult for, a for-profit company as your primary work? That's most of you. So you wake up in the morning, and your job, technically speaking, is to produce value for the owners of the company. In fact, if the company is public, they're legally required to produce value for shareholders, and the board of directors of a for-profit company can in fact get in trouble if they haven't been producing value for their shareholders. Now, I know a lot of people, particularly in this great free software community we have, have actually found really interesting ways to work for a company while actually mostly serving the free software world, while sort of convincing their bosses that it's also in the interest of the company, and it probably is true that it is. I was never very good at that. I need my sort of mandate to be perfectly aligned with what I think. Which means that I have to wake up in the morning and serve the public good as a job requirement. That's much more comfortable for me.
I'm not so good with the, well, I'm going to serve the company's interest by doing good in the world. I know a lot of people are good at resolving that; I'm just not. And there are other organizations out there in the world: for-profit companies, which I mentioned, and there are also these things called trade associations. They're kind of an aggregator of company interests: they get lots of corporate members who have similar interests, try to glean what the common interest among them is, and then promote that, which may or may not be in the public good. So I really like being able to work somewhere where I'm required to serve the public good. And what Conservancy ultimately is, is an organization that provides everything but software development to assist free software projects. So we provide all sorts of... the executive director of my organization is looking at me like I'm saying the wrong things. Karen, do you want to come up and give a preview of your keynote to say what I said wrong already? Okay. All right, that's fine. So yeah. So am I serving the public good adequately for today? Okay. So pretty much all the stuff we do is for free software projects, but we try to do pretty much anything they need other than actually develop the software for them. That's usually done mostly by volunteers in the project, but we help them collect donations, sometimes to fund that work. We help them run conferences, and we help them with legal and licensing issues as well. And our goal is to make sure these projects are serving the public good. And that means that we often deal with free software licensing issues for our projects, to make sure that for the projects that have asked for our help, or the developers who have asked for our help, the licensing and activities are happening correctly. So when we do our copyleft compliance work, which is a component of what we work on, we're trying to do that to help the public good.
And I want to give a little bit of background on what copyleft is and so forth. There are probably lots of you, I already see people in the audience, who know this topic pretty well; you can go read email for two minutes while I explain what copyleft is. So much of open source and free software uses what's called a copyleft license. Linux, I've often said, is probably the most famous and widely adopted copyleft program ever created. And the specific copyleft license that Linux and many other free software programs use is called the GNU General Public License, or the GPL. And copyleft is a strategy designed to reach a specific goal: to make sure that certain software freedoms, rights and privileges that those of us who believe deeply in software freedom think every user should have, are clearly and consistently given to downstream users. And I just want to make a little side point. I don't want to dwell on this, because I have other talks that focus on this issue, but just to point out that the GPL was always designed for downstream. It is a developers' rights issue in the sense that developers who receive copies of software have certain rights, but it's not about the initial developers; it's not a license designed to give the most power to whatever those initial developers desire. It's a tool that those developers can use to make sure their users have rights and privileges, because they might not know who all their users are, because once they release the software as free software, others may adopt it. And copyleft uses a share-and-share-alike model to ensure, as the software goes downstream, that each individual user who receives a copy of that software, modified or improved or otherwise, has the same right to copy, modify, redistribute and, very importantly, rebuild and reinstall that software, to make effective use of those rights.
If they have the right to modify and change it and improve it and distribute it further, and resell it further if they want to, they have to really, effectively be able to do that, which means being able to rebuild the software. And they have to be able to do that even after they modify and improve it. They want to take their modified version of the software and put it into production, use it in a real-world scenario. That's the whole point. Free software is useless if you can't do that, from my point of view. Because, well, some source code is beautiful to read; most of it in fact isn't. Most source code is messy, but when it's good source code, it works and does the right thing. But reading it in source code form is not all that interesting, generally speaking, unless you can actually compile that source code, run it and install it, and then do that whole edit-compile-test loop again. This is a formal definition of copyleft. It comes initially from Wikipedia; I've made some improvements myself and others have as well. We have a wiki at copyleft.org where we host this definition as well as a guide that I'm going to talk a little bit about later. But I like putting this definition up because I think people tend not to think very abstractly about free software licenses anymore. They tend to think very concretely and say, what do I need to do to comply with this license? These licenses are there to achieve certain governance strategies and goals in free software communities, and as such we should understand what the strategy and goal of copyleft was. So that's why I like to put this on a slide, so that you can look at it and read it and go to copyleft.org for more detail later. Now, I'm oft quoted, mostly thanks to Karen because she quotes me saying this all the time, that copyleft is not magic pixie dust. It does not function on its own.
Many developers tend to believe that the legal system is a virtual machine on which you run programs, which are licenses, and then you get the right results just by some sort of automated process. That's not how the legal infrastructure of our world works. When someone does something that's legally impermissible, there has to be a system that assures they are in some way prevented from doing it, or that the issue is corrected once they have done it. So GPL enforcement is the process that makes sure the GPL works: that when someone fails to do what the GPL requires, there is an entity or person or something to make sure that they are told that the license must be complied with. The goal of any ethical or moral GPL enforcement is primarily to ensure that those users who got a product which was not in compliance, and therefore did not receive both the practical and legal means to copy, modify, improve, reinstall and recompile the software, then get those rights so that they can actually make effective use of them. That's the primary goal, and that's why GPL enforcement tends to be a somewhat technical process. The GPL has technical requirements that you have to meet so that users can compile and build, because being able to compile and build software is a technical process, not a legal one. So the GPL is a legal mechanism to assure certain technical rights for users and developers. Our hope, ultimately, when we enforce the GPL, is that people who have engaged in a GPL violation become part of the larger upstream project long-term. That tends not to happen all the time, because many people who have forked the software for whatever reason can't reintegrate the changes, so that's somewhat of a secondary goal, because particularly in embedded environments I'm most worried about the device at hand.
I'm most worried about the user who bought the television or the wireless router or the refrigerator that runs Linux and now wants to rebuild and reinstall that version of Linux, because the GPL gives them that right. If you want to know more about the philosophy behind GPL enforcement, kind of the meta-morality above the GPL that we talk about when we do enforcement, you can take a look at the principles of community-oriented GPL enforcement that are published on Conservancy's website at that URL. And that's pretty much the most I'll say about that, because this is actually a relatively technical discussion about how it's done, and how sometimes it actually gets done right, as we'll see in a few minutes. So mostly we're focused, actually I talked about the points on this slide in the previous one, so we want, as I said, users to be able to make real and concrete use of the software that they get as part of GPL compliance. So how do we get to the point where someone's failed to do the things the license requires? Well, they've committed a GPL violation, which is effectively copyright infringement. The GPL is a copyright license that uses the mechanisms and systems of copyright law, and this is how it got the name copyleft, to flip copyright over: instead of restricting people, it assures certain rights and privileges for people. So in the actual text of these licenses there are various requirements. One of the requirements is that the entire work must be licensed under the same license. That's the strong copyleft idea: it encourages share and share alike; it encourages that modifications are licensed properly.
So that means, in kind of legalistic terms, that when you add new copyrighted material to a GPL'd work, either by making a combination in various different ways or by making modifications in various different ways, all of the material that's copyrighted in that whole single work has to be under what we tend to call, in the compliance world, GPL-compatible licenses, which means any license that allows for relicensing under the GPL. For example, the two-clause BSD and other such licenses that are highly non-copyleft do allow for relicensing under copyleft. But most importantly, and what I'm going to be focusing on for the rest of the talk, both GPLs have a concept of what's generally called complete corresponding source, or CCS. GPLv3 calls this "corresponding source"; GPLv2 uses the phrase "complete corresponding source". Because I started with GPLv2, I just tend to call it CCS, an abbreviation that I admittedly coined. The way that the GPL actually triggers in a legal environment is when you fail to do these things. When you fail to license the whole work under the GPL or GPL-compatible licenses, and when you fail to give that complete corresponding source code that can be rebuilt, your permissions under copyright evaporate. You no longer have permission to copy the software yourself, so you have now lost your right to distribute the software. This is the toughest concept in copyleft, I think, for some people to understand: you take away the rights under copyright law of the person who fails to comply, as a trigger, so that you can then say, well, if you want to start distributing that software again, you will have to do it in compliance with the GPL. You can't just keep doing it out of compliance, because you've lost your right.
So what an enforcement process does is say to the violator: you've failed to comply, you've lost your permissions under copyright; we want you to have them back, and the only way to get them back is to comply with the terms of the license. As I said at the very beginning, I'm going to do Q&A at the end, because I'm very easy to get off track, and I've left plenty of time. So if you don't mind jotting questions down so you don't forget, and asking them at the end, I'd appreciate it. There's also a mic that has to be passed around, which I don't know where it is, but we'll find it at the end. We'll find it at the end. Copyleft is an inherently technical set of requirements, because actually compiling and reinstalling software, those are technical acts you have to engage in, and the process of edit-compile-test is a technical process. And this matters particularly in embedded software. This is not something we thought a lot about in the very early days, when I first got involved with the GPL in the late '90s, because embedded products were uncommon, and they were virtually nonexistent with regard to copyleft. Today the most common uses of copylefted software, at least Linux, are in embedded products, be they mobile telephone devices, wireless routers, refrigerators, whatever random IoT device you might buy at Best Buy or Fry's. And that embedded situation means that the technical requirements of how to build, often on a non-self-hosted system, are difficult and complicated.
The GPL does require, though, that it be possible to compile and reinstall the software. So we have to talk about the CCS, the complete corresponding source, that they gave. Even if they licensed it properly, all under the GPL, and gave us all of it, which sometimes doesn't even happen, there is this thing in the world called proprietary Linux modules, which are failures to comply not just on the CCS requirements but on licensing the whole work under the GPL; but let's assume the scenario where at least they gave us all the source code, all of it under the right license. They also have to give us the ability to compile and install it. GPLv3 talks about it this way: it says you have to be able to generate, install and, for executable works, run the object code and modified versions of it, including scripts controlling those activities. GPLv2 is a little more wordy, a little more colloquially written than GPLv3; GPLv3 is much more formalistic. But it's basically the same requirement. Despite what you may have heard from critics of copyleft, there's really no difference, more or less, on this issue between GPLv2 and GPLv3. Now, since Linux is GPLv2, I'm going to mostly be talking about GPLv2; that's where I've certainly spent most of my time. Pretty much the only GPLv3 enforcement I've ever helped with is for the Samba project, which is under GPLv3. Now, GPLv2, at least in the embedded world, is all about those 11 words right there: the scripts used to control compilation and installation of the executable. I've spent a lot of time looking at those words. If I ever wrote an entire talk about those words, that talk would include 20 to 30 minutes about the word "the". "The" scripts: to give you the 30-second preview, does "the scripts" mean the actual scripts you used, or any scripts that work? Discuss amongst yourselves. That's what it would have to be if we were to go down that route, but let's not go down that route, unless we want to during Q&A. So, frankly, if people are really trying hard to comply with
the license, they are hopefully focused more on doing it the right way, so that developers will engage with them as an upstream. An ideal product on the market that seeks to use free software ought to be seeking to engage its community of users in contributions. This, I think, has been shown through projects like OpenWrt for wireless routers and SamyGO for televisions, as a way to make those embedded devices more valuable to consumers and more valuable for the company to sell. The WRT54G stayed on the market much, much, much longer than any electronics product should have, and is still made today, because the OpenWrt and LEDE projects run on it, and did for many, many years, because we enforced the GPL on it and got the source code released, which was the spawn of the OpenWrt project. Now, I'm not necessarily saying that Linksys and Cisco were really operating in the spirit of the community; that's not what I'm saying at all. My point is that when products do that, they're not just trying to eke their way across the line of compliance, because, you know, if you're only wearing those couple pieces of flair, just doing the bare minimum, are you really engaged in what you're trying to do in the project? And so our goal is really to encourage people to do that. But the fact of the matter is that most of the time, by the time they come onto Conservancy's radar screen, other people in the community have tried to convince them to do that and they have refused. So we're usually dealing with the companies that just want to barely come across the line. So we had to come up with a lot of rules over the years for how we evaluate a CCS candidate, as we call them: how do we look at one and decide whether it complies with the GPL? I use this basic rule for reference: have a developer who is reasonably skilled in the art of building embedded systems software attempt to build it, attempt to follow the scripts used to control compilation and installation that they gave us, and see if it actually works.
Now, one thing I want to talk about, I don't want to spend too much time having the Socratic seminar on "the scripts used to control", but one other thing you have to think about with "scripts" is that this is not a technical term; it's an English word. The English word script can also mean things like the script of a play. Which means, for example, you can't simply say, well, we never wrote any shell scripts for the thing, so we don't have to tell you how to build it. Well, no: there's a set of tasks that your engineers do to build it, so you have to write those down. A text file is fine, saying go do this step and this step and this step. It's not like the GPL requires you to write a nice makefile or scripts, or use some system like Yocto. What it actually says is: tell us how to do it, in whatever way makes the most sense for you. If what you do is run an awful list of commands over somebody's shoulder, just write those down, and that's what you need for compliance. And we try to use those instructions, so we often spend many rounds with a violator.
Most of the effort in GPL enforcement is evaluating CCS candidates. It's incredibly labor-intensive and incredibly frustrating, because imagine if your entire job was to build software that you know will not build when you try to build it. That's the job that my colleague Denver Gingerich has one day a week with us. That's not an enjoyable job, and I used to do it myself, and I'm glad that Denver's around to do it now. Sometimes you do one or two of these CCS candidates and the third or fourth one is in compliance. The most I've ever done is 22; that was during the BusyBox versus Best Buy case, and in fact in rounds, I think, 16 through 18, they gave us a byte-for-byte identical thing again. We sent them a list of problems; they gave us the same thing back. We sent them a list of problems; they gave us the same thing back. Those were easy to check, because we just checked the checksums and everything was the same, and we said, well, it's going to have the same problems; you gave us the same thing you gave us last time. So I admit fully that there's kind of a know-it-when-we-see-it standard. I mean, we could be stricter, right? We could say that "scripts" means you have to write beautiful makefiles that work perfectly under all conditions. I think that would actually be unfair, and because that would be unfair, we have to use this know-it-when-we-see-it kind of standard. We say, well, can we build the CCS you gave us? If so, then it's okay. And using this process over many years, I think I did my first CCS check in 1997, so I've done a lot of these, and now I manage someone who does them, we've gotten compliance; we've gotten what I call just-across-the-line-of-compliance CCS. And for all that time, I never pointed at one and said, that's an example to look at. And the reason I never pointed at one is because it was not an example of good practices. It was not an example of what you ought to do. It was an example that meets the GPL requirements at a bare minimum, and I didn't want to hold up a bare
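The byte-for-byte resubmission check described above, comparing checksums of one CCS candidate against the last round's, is easy to sketch. This is my own illustrative Python, not Conservancy tooling; the function names are invented.

```python
import hashlib
from pathlib import Path

def release_fingerprint(files):
    """Digest a CCS candidate given a mapping of relative path ->
    file contents. Identical fingerprints mean a byte-for-byte
    identical resubmission, which will fail for the same reasons
    as the last round without anyone having to rebuild it."""
    h = hashlib.sha256()
    for name in sorted(files):          # stable order regardless of dict order
        h.update(name.encode())
        h.update(b"\0")                 # separator so names can't bleed into contents
        h.update(files[name])
    return h.hexdigest()

def dir_fingerprint(root):
    """The same digest computed over a real unpacked source tree."""
    root = Path(root)
    return release_fingerprint({
        str(p.relative_to(root)): p.read_bytes()
        for p in root.rglob("*") if p.is_file()
    })
```

Hashing path-plus-contents (rather than just contents) also catches the case where the same files come back merely rearranged.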
minimum as an example. So for years I was looking for an example, and this is the first time I've given this talk with both of them in the room, but there are two people sitting in row 3 who made it possible for me to have an example like this. That's the guy from ThinkPenguin right there. They put together a product which, I am not embarrassed to say, is not just barely in compliance with the GPL; it is exuberantly in compliance with the GPL, with great and tremendous effort, primarily from Bob but also from Chris, to make sure that it complies with the GPL. And what we did is, Denver and I wrote this paper, which we included in the compliance guide that's part of the copyleft guide on copyleft.org. There's a shortened URL, compliance.guide slash pristine-example, that goes directly to the chapter about this, if you want to read the entire paper, which is, as I mentioned, part of that larger tutorial. It's a very comprehensive guide, and I am so glad we now have this example as part of it. So what I want to do is give you the highlights of that, and this is where it kind of turns into more of an academic talk about a paper, right? I'm going to give you the highlights, and then if you want the details you can go read the paper later. I'll just give you a flavor of the stuff we saw, the kinds of things the ThinkPenguin folks did that made for a really, really good compliance process. Now, the very first thing they did, and this is something I've talked about for years: they didn't use the darn offer for source. And I will tell you, every corporate lawyer who has used the offer for source was probably not giving their client very good advice, because for some reason, for many, many years, and this is starting to change, lawyers would look at this text and say, oh, that's great, all I need to do to comply with the GPL is put a little piece of paper in the box that offers source code. And they would often never bother to talk to engineers about whether the source code was ready for
distribution. And the reason I know this for sure is that the most common occurrence, when you discover an offer for source, is you follow its instructions, sometimes very complicated instructions, to contact a company, which is technically compliant as long as you can reach them eventually, and then they say, we will get back to you in 10 to 15 weeks. Now, if they had the source code ready, why does it take 10 to 15 weeks to send me a CD, or give me a URL to download, or something like that? The reason is the lawyer said: don't worry about complying; just put the thing in the box. If anybody ever asks, we'll figure it out when they ask. And by the way, if they don't ask for 3 years, we're safe, because then we don't have to provide it after 3 years. So hopefully no one will ask for 3 years, and then we never, ever have to really comply with the GPL; we can just pretend we're complying with the GPL. And there are companies today, in fact most IoT companies, where that's their compliance strategy, basically. Conservancy was the primary organization enforcing the GPL, particularly during the BusyBox era, and people discovered that our first check to see if somebody was compliant was whether they had an offer for source. And generally speaking, and I'll admit this now because it's not true anymore, in the early days of BusyBox, if somebody had an offer for source, we just went on to the next one, because there were so many who didn't have an offer for source at all. So everybody said, we'll just start making an offer for source; no one will ever contact us about it. Then we actually started going and testing offers for source. So if you ever get a product that has an offer for source, even if you don't want to recompile it, test the offer for source, because I will offer you 15-to-1 odds, at any amount you want, that it will not work. This is the only way I'll ever get rich doing this, because it probably won't work. And then you can record that it didn't work and email us, so at least we have a record that it didn't work. So you
need to test it, so the companies know it will be tested. So many people don't. I don't know why these so-called open source lawyers are obsessed with the offer for source. In the early days it was because it was so expensive to provide source code. We've generally come to the consensus, despite some ambiguities in GPLv2 regarding this issue, that giving a URL that does work is okay under v2. V3 clarified this and made it abundantly clear that a URL with reasonable access to get the source code is acceptable. V2 was unclear on that point, but I've never said you're not in compliance because you won't give me a CD, as long as I can get the source somehow in a reasonable way. By the way, that comes down to the question of what's a medium customarily used for software interchange. I will point out, on the argument on the other side: you don't know where your software can end up. A lot of these people are Silicon Valley companies, and everybody in Silicon Valley thinks everybody has 100 megabits up and down. That's not what the world looks like. Your product can find its way into Sub-Saharan Africa, because somebody can resell it there and redistribute it there, and that person in Sub-Saharan Africa has a right to the source. What's a medium customarily used for software interchange in the middle of a village in Africa? I don't think it's the internet. But anyway, the worst part about this, from a legal perspective, and why as a lawyer I don't get it: why wouldn't you advise your client, you have a way to end your obligations right now? If you give them the source code at the same time you give them the binary, your obligations are done. I mean, if the source code is not compliant, if it's not CCS, that's a different issue, but assuming it's good CCS, if you give them the source code at the same time they got the binary, they can call you all day and you don't have to give them another copy of the source code. If they lose their copy of the source code, that's their problem,
And it's not just your customers, which is what the lawyers get wrong all the time: it's any third party. Any third party can call you up and say: give me the source code for that product. In fact, it means more people have a right to get the source code than if you had shipped it with the binary, because if you give them the source code when they get the binary, you're only required to give the source to your customers: they bought the binary, you gave them a copy of the source. But if you use the offer for source, any third party can call you up. Can't I get a copy of the source code? Of course I can; I'm "any third party," am I not? So, as I said, using an offer for source usually means they're out of compliance. I'm sorry to say it's true, but it is true: 15-to-1, I'll bet it. And I think if you must use the offer, because of space, because of the distribution mechanisms, whatever it is, you've got to be ready with the CCS on day 0 or day 1. You can't wait: let's take 10 to 15 weeks until we do it.

I mentioned before that "scripts" don't necessarily mean shell scripts. They don't mean Makefiles; they don't necessarily mean something technical that a computer can understand. It can be a README, and as I mentioned, it's like the script of a play. It's okay under the GPL for your build process to require human intervention, even a lot of it, as long as the scripts clearly explain what a human reasonably skilled in the art needs to do to build it. That's completely okay, and in fact the ThinkPenguin folks decided to do it this way. Theirs was relatively terse, but it was completely understandable. Denver looked at it, knew what he needed to do, followed the instructions, installed the necessary build-dependency packages, ran the make menuconfig as instructed, and so forth, and he was able to build it. You can say it's not that complicated, the Makefiles mostly work, but there was a little bit of extra stuff you had to know to do. That doesn't mean you have to turn this into a Makefile that figures out what system it's running on and runs apt for you and installs everything; you can just tell somebody how to do it.

One of the things I tell people: test your CCS to make sure it works, especially if you're at a big company. Take the source release you're planning to use, find a developer in another department, hand it to them, and ask them to build it. If they can follow the instructions, you're probably fine. If they get a firmware at the other end that works and installs, you're probably fine. If they can't, you're probably not fine; you're probably not compliant with the GPL. I'm amazed that most organizations still don't bother to do this. They just put source out with the product, and I mean, you type make and it doesn't do anything; it errors out on some weird thing and you have no idea why. So I don't know why people don't bother to do this. In part, I assume, because they think they're going to kill copyleft and nobody's ever going to enforce again, they'll stop us from doing it somehow, and therefore they'll never have to comply. But this is really the most useful thing you can do to verify compliance. I know the compliance industrial complex is obsessed with the idea of cataloging every single license and every single line of source code that ever entered your company, as if that will magically make you compliant.
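That colleague test can even be approximated mechanically. Here is a rough sketch of a "clean room" build check; the function name, the shell-command interface, and the idea of naming one expected artifact are all my own choices for illustration, not anything mandated by the GPL:

```python
import shutil, subprocess, tempfile
from pathlib import Path

def smoke_test_ccs(source_dir, documented_cmd, expected_artifact):
    """Copy the candidate CCS into a pristine scratch directory, run only
    the command the README documents, and check that the promised
    artifact (e.g. a firmware image) actually appears."""
    with tempfile.TemporaryDirectory() as scratch:
        work = Path(scratch) / "build"
        shutil.copytree(source_dir, work)  # nothing outside the release leaks in
        result = subprocess.run(documented_cmd, shell=True, cwd=work,
                                capture_output=True, text=True)
        return result.returncode == 0 and (work / expected_artifact).exists()
```

If this comes back failing for someone who has never seen the tree, the CCS is probably not something "a human reasonably skilled in the art" can build either.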
If you just had an SPDX file for every single file in your company, there would be perfect compliance! I'm telling you, this is probably the best thing you can do to make sure you're in compliance: if you do this on every product, you're much more likely to be in compliance on every product than you are if you have an SPDX file for everything.

One thing that comes up a lot is the question of the toolchain. I've often been asked: is the toolchain part of "the scripts"? I don't think it controls compilation; I think it does the compilation. So I think it would be very difficult to argue the toolchain itself is part of the scripts used to control compilation of some other work. Excuse me, I have to cough. And the reason I believe that is this: think about having GPL software on Windows. If it were true that you had to hand over the compiler when you distribute a GPL binary, so that people can recompile the source, then if I made a GPL binary for Windows and built it with Visual Studio, or whatever the modern compiler for Microsoft environments is, I would have to give you a copy of that. Of course, that's proprietary software that I'm not allowed to distribute. So why would the GPL want to trigger a requirement that basically leaves you in a hopeless situation, where the GPL says you must do this and the proprietary license of the compiler says you can't? So I think what's really intended here, and how we've generally interpreted it, is: explain what toolchain you need. Something telling you exactly how to build would be adequate; in the Visual Studio example, telling the exact version of Visual Studio you need installed for the compile to work is reasonable. I don't think you have to actually give them a copy of it, and you can admit they have to get it from a third party and may have to get a proprietary license. I do think it's nice to include the toolchain; it's a kindness to developers, if you can, as much as you're able to. Now, people have actually
spread a lot of misinformation about Conservancy's past enforcement, because they said: well, Conservancy required us to provide the toolchain. And then they say in their talks: you don't have to. Let me tell you how that occurs. What happens is we send an initial information request: we need a candidate CCS to look at, and we say in that request, be careful not to include anything that might trigger further obligations for yourself. And the company just sends us everything that was on the hard drive of the machine they built it on, which includes GCC. At that point they have now distributed a binary of GCC to us, and they now have obligations under the GPL to us, because they gave us a copy of a GPL program, after we told them to be careful not to give us anything they weren't supposed to when sending the candidate CCS. And then they go out to other industry parties and say that we tricked them. Well, that's how the politics of this world work these days.

So, when GPLv2 was drafted, make install was a relatively easy part of your Makefile to write: it was 10 lines that copied things into /usr/local or something like that. Ultimately, on a server system, a typical make install that you might know from the 1990s, or that a Debian package might call as part of its build process, would be reasonable. Embedded products are not so easy to install onto, by a long stretch; there's a whole order-of-magnitude gulf you have to jump over. And I think the right way to do it is just to write out the instructions. I don't think you have to write magic scripts. A lot of times we've gotten into situations where you have to do strange things. One example we had once: to update the version of BusyBox, you basically had to run a bunch of weird commands against the initrd image in the firmware to insert it, and it was all complicated. The instructions were a long essay, sort of like: well, to get this back onto the firmware image, you're going to have to take apart the firmware, and the initrd is at byte offset whatever, and then you mount that, and then you have to copy the thing in. And that's fine; as long as it's explained, it's no big deal. But you can't say: oh no, you're not going to be able to install it, because it's an embedded device, it's complicated, you can't install. No. What are the scripts used to control installation? Installation on an embedded device means putting it on the actual device. There's no other possible reasonable interpretation of what "install" means for an embedded device.

Now, people often point out that sometimes you need specialized hardware to install, and I agree completely: hardware necessary for the process is not a script. I would not try to argue that to anyone. We actually encountered this in the ThinkPenguin scenario. There were instructions in the ThinkPenguin materials that said: you need this special USB-serial adapter to actually install it. And we went back to ThinkPenguin; they happened to sell it. I believe we bought it. Did we buy it from you, or did we have to go to a third party? I can't remember. (I don't remember either.) I don't expect you to remember, but even if we had to buy it from a third party, that's what you needed, and Bob wrote the instructions in actually pretty good detail. In fact, I think we bought it from a third party on purpose, to see whether a third-party one would work, and it did, because Bob described exactly, in his instructions: this is what you need to be able to install this firmware on this wireless router. And that's okay. I think it was like another $20 we had to spend. It's not a GPL violation that you have to go out and buy specialized hardware for an embedded device; that's totally reasonable, as long as you say what you need. You can't just say: oh, we have special hardware, and we're not going to tell you what it is if you ever want to install this. No, you have to say what it is.

But it might be something you have to build. Libreboot is a great example: you have to actually buy a lot of specialized hardware to install Libreboot, but it's all on the website. It's all there; the instructions are there. Yeah, it's not easy at all, at least the first time anyway. That's just the culture of the company: they want to engage with users, they want people to become upstream developers, so they wanted their users to actually be able to rebuild the stuff. And when you have that attitude, you're going to make a very compliant product, because your goal is to engage with the community and have other people develop with it.

It's important to note, though, that it doesn't matter what the host system is; you just have to explain which host system you need to build this stuff, and it's okay if it's a weird old build system. As recently as a couple of years ago I was still seeing instructions that said: the first step is to install Red Hat 7. Red Hat 7 was released in 2000. Denver and I, at one point a couple of years ago, collected every ISO install image we could find of all these different old distributions, because these embedded companies have a build computer that sits in the corner of someone's office, installed in 1997 or 2000, and that's been their build computer since the company started, running whatever distribution was modern that day. And that's absolutely okay. Anybody who's ever bootstrapped GCC on a host system to target another machine knows: what version of GCC do you use to compile the compiler that's going to compile the actual software? It's complicated. So it's totally cool to say: well, you need that random GCC that was in Red Hat 7; if you don't have that, you're not going to be able to build this. They gave us those instructions; we made sure it worked on Red Hat 7 in a VM, and it worked, and that's totally compliant, even though, technically speaking, we never got it to build with any modern distribution.
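One cheap way to capture the "you need that random GCC from the old build box" fact, while the box still exists, is to record the exact versions of the tools on it into the CCS. A small sketch of that idea; the tool list, output format, and the assumption that every tool answers `--version` are all mine, not anything the GPL prescribes:

```python
import shutil, subprocess

def toolchain_manifest(tools):
    """For each tool name, record the first line of `tool --version` so
    the CCS can state exactly which toolchain the build was tested with."""
    lines = []
    for tool in tools:
        if shutil.which(tool) is None:
            lines.append(f"{tool}: NOT FOUND (document where to obtain it)")
            continue
        out = subprocess.run([tool, "--version"],
                             capture_output=True, text=True)
        first = (out.stdout or out.stderr).splitlines()
        lines.append(f"{tool}: {first[0] if first else 'version unknown'}")
    return "\n".join(lines)

# e.g. write toolchain_manifest(["gcc", "make", "mkimage"]) into a
# TOOLCHAIN.txt shipped alongside the CCS
```

Run on the actual build machine, this turns folklore about the 1997 box into a document a third party can act on.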
And as I mentioned, telling people what toolchains are needed is the right thing; actually including the toolchain is really nice, and that's perfect. And that thing about having a colleague build it: I think that's probably the most important thing if you want to test CCS, getting someone who's never seen it before to build it. That's what the GPL intends: a third party, new to the software but skilled in the art, should be able to rebuild it.

I've often complained (and I'm going to finish up with this) about the compliance industrial complex being so focused on the wrong things. Things have gotten slightly better, not from that industry, but from other people, with other goals, doing the right thing in software in ways that will help us. There's an initiative that originally came out of Debian, though it's larger than Debian now, with a lot of Debian developers involved, called reproducible builds. This is the first technical initiative I've seen that's completely focused on the central problem of CCS, which is: how do I rebuild something when I have source code from a long time ago that I'm not sure how to build? How do I get up and running again, re-bootstrap the binary, and verify that each time I build it I'm getting a reasonably similar binary? I believe that if people adopted reproducible builds as a required industry standard, it would be the best thing to help compliance around the world. It's very interesting to me that absolutely nobody who is obsessed with things like OpenChain and these other so-called compliance initiatives has any interest at all in this stuff. They're much more focused on the SPDX, inventory-all-your-licenses side, which is not where the problem is. I'm the person who's seen the most GPL violations in history and studied them, and I've told them they're focusing on the wrong spot, but they just tell me they don't care. It's very interesting; it sort of speaks to how the GPL is
kind of a technical requirement. The reproducible-builds people had no idea. When I first got really into it, I was at DebConf, I think in 2015, and I saw the first talk on reproducible builds, and I asked: have you all considered that this might be helpful for GPL compliance? Is this helpful for GPL compliance? Yes, the most helpful thing I've ever heard of. They were pretty excited; they had no idea this was part of the whole picture. And it's really clear to me now, having studied the reproducible builds project for a while, that good CCS is simply a reproducible-builds problem. They're exactly the same type of problem.

Finally, I just want to say: my colleague Karen is keynoting tomorrow morning. If you want a detailed talk about the importance of software freedom, that's the place to go. This was a relatively technical talk; tomorrow morning will be more about ideas and what's right. So with that, I think I'm finished. There's a mic; please wait for the mic to ask a question. I believe they're recording and would like the questions in the recording. Oh my gosh, you're going to start a Socratic session, aren't you?

So: say there's a company that has a very expensive embedded product out now, like, say, several million dollars, and they've put GPL-licensed software into it. What would they do then?

It's a good question, thank you. There's an important thing to note, like I was saying about the offer-for-source situation: it's certainly okay to selectively distribute GPL software. If you have a big giant system that costs millions of dollars, built for one specific customer, you give that customer the big giant device along with the source code and all the stuff I've talked about. There's no obligation under the GPL that you post it on the internet. The GPL doesn't require immediate public disclosure of the software; what it requires is that the user who receives the software has all the rights that you had. So it's actually quite common. For example, we know pretty certainly that in some top-secret military systems there is GPL software. As far as I know it's not violating, because the contractor, be it Lockheed Martin or Halliburton, whoever it is, built the product using GPL software and distributed a copy of the source code to the government. The government got the source code, as required under the GPL; they were the only customer who ever got a copy, and therefore that was the contractor's only obligation under the GPL. So that does happen, and it was completely contemplated by the license. We're actually in a pretty good spot with regard to that. It's good that you asked, because there's a lot of confusion here: people assume that because the free software community is very open and people share their software, every single piece of software goes on the internet immediately. Well, that's very common in our community, but it's not required.

A lot of us in the embedded community are working on products that have safety regulation, products with the capacity to do harm as well as good, and these tend to be heavily regulated. A bunch of us work on automotive products that have ASIL, automotive safety integrity levels, or something like that, and there's a very strict audit of the development process of the product, sort of like ISO 9002. It seems to me that a build procedure anyone could follow, for the software in these products, would have to exist in order for them to pass any of these regulatory guidelines. So if you wanted to argue, as Conservancy, that you should be able to obtain such a document: I don't see how you could have a safety-implicated product that did not have such a recipe for a reproducible build, in the Debian sense. People who are putting software on products with the capacity to do harm should be able to prove that they're putting the right software on those products, and that the software is the software that's
been approved by regulators. So, I've heard this argument for a very long time. The first time was from Broadcom in 2001, who said there would never be a free software wireless device driver because of FCC regulations. Of course, many, many wireless device drivers are free software now, and there's a fight again in the FCC over the same issue, ongoing; there's a committee now convened, which does have Eric Schultz representing the free software community's concerns. And I think absolutely we should engage with regulators to help them understand the right to rebuild software, the right to repair your devices. That's an important right that has to be balanced, I agree with you, with the regulatory systems of safety. And I agree with the core of what you're saying: if you cannot reproduce the build, how do you know that what's running is actually the thing? The Volkswagen question, right? Volkswagen clearly modified their software in a seedy way to trick a regulator about emissions, and there was no way for even the EPA themselves to verify whether the software was doing what it was supposed to do, except by basically pen-testing the actual device. So yes, I agree with you completely. The thing I'm talking about exists, and you experience it all the time in your work; we've talked about it. This is Alison Chaiken, by the way, who's heavily involved in automotive embedded stuff, a big fan of free software, and a contributor for many years. So what needs to happen is that the rhetoric has to be toned down, the kind we hear that it's not safe to have free software in cars, which is what a lot of people in the automotive industry say. We need to break down the walls of that shrill rhetoric from industry folks and then switch to: hey, why don't we sit down with regulators and talk about the kind of things you're saying, like how reproducible builds are important for regulators to test. And we have to talk about what happens with the consumer. I don't think the consumer should be able to put a car into an unsafe state, but you'd probably agree that if you just use security through obscurity, the nefarious people will put it into that unsafe state anyway. And then what you've done is say: only nefarious people, who actually go through the work of reverse-engineering the proprietary software and sneaking in their nasty stuff, are able to make us unsafe. I think if we have a real conversation, as you suggest, about how reproducible builds matter for safety, and figure out how to make that into something the regulators are comfortable with, that's really where the conversation has to go. I think you'd probably agree.

ThinkPenguin is to be commended for what they've done, and I like what you guys are up to, but have you seen folks do things like virtualization? Like creating a Docker image that can be packaged alongside, to build the software? It strikes me that it would be really easy, at the time of compilation, to just freeze the environment and go: okay, everything that built this is packaged into this nice little thing on Docker Hub that you can pull down and make exactly the same thing. If we can just give people a process for this, it seems like it could be easily solved, given how software is being distributed right now.

I agree with you completely; that would be one of many possible, very convenient solutions, using modern virtualization technology. I think the problem we have is that there are a lot of players in the industry who don't want to
solve it. There are a lot of players in the industry. We have the system-on-chip manufacturers, who are on purpose violating the GPL and want to continue to do so. Their customers are in a bind, because there are only so many of those vendors, almost all of them are violating the GPL in one way or another, and they don't actually want to help their customers comply; they say, fine, we won't sell to you if you try. We have examples. There's a system-on-chip vendor, a Linux-based system-on-chip vendor, and if you go to them, they will get you to sign an NDA waiving all your GPL rights, which is in itself a GPL violation for both sides. Before you can even get a dev board for their product, you agree: we are not going to comply with the GPL, and you're okay with that. So that's the industry we're living in. While there are all these other ideas that could work, I think the best place your idea can help is entering these conversations, like, say, the OpenChain discussion, which is supposedly an open community; get on the OpenChain mailing list. If you don't know it, OpenChain is an industry consortium thing under the Linux Foundation, designed to build a series of checklists to help people get on the right path to compliance. Karen participated heavily in that and tried to tell them they should add this kind of stuff. You wrote a blog post about this; they largely ignored your comments on these issues. They don't say you should be able to reproduce your builds, and you did suggest it. So I think we need people saying: this has to be a major part of the compliance processes, and of the recommended compliance processes. (Yeah, I'm sorry, I meant you participated and tried hard to get them to change, is what I meant.)

Why not just build this into GCC? It is, at some level, or, you know, you are using these tools. There actually is a great project, not Docker-oriented but adaptable, called crosstool-NG, which is designed to help you conveniently keep track of past, rebuildable versions of GCC for cross-compilation. So yeah, that could be mixed with that kind of solution. Again, these solutions exist in the world, right? And there's the Yocto Project, which I hadn't mentioned yet; it's designed to help people reproduce builds for embedded products. There are various mechanisms that then help with compliance. And I always joke with the Yocto guys: well, I never see Yocto. The reason I never see Yocto (and I'm upsetting the Yocto folks in the room when I say this) is that it's not widely adopted in the embedded industry. The places where it is adopted are companies who already wanted to comply anyway; that's why they picked Yocto. Everybody else is using a Buildroot they forked 10 years ago, with no compliance story, and they don't want to change, because they don't actually want to comply. I mean, that's the real crux of the problem: there are so many companies now that want to get away with GPL violations.

Just to follow on to that: I think another thing is to be realistic about how things get built. Most of it's black magic. I type make; I've got no clue. I type gcc; I've got no clue. I use these tools and really have no clue. The reason that box is still sitting in that corner from 1997 is that nobody knows how to rebuild the thing. That's the actual reason this is occurring. I don't think it's really because people want to violate the GPL; it's that nobody knows how to move forward, because the actual build toolchains themselves are sort of locked. And I think that's a real issue to deal with: oftentimes it's historical baggage. They just don't have the expertise in-house to upgrade their systems to be compliant, and that causes more burden and more unknowns for them to
go research, and so it's easier to live with GPL violations than to really go figure out how to upgrade their stuff. So it's actually a confluence, a combination of what you're saying and what I'm saying, right? Everything you said, I think, is accurate. You're talking about the engineering side: there's no will on the engineering side, for exactly that reason; they're afraid of that box and don't want to touch it, because it works and it might break. But then there's no support from the management side either, because management are the ones thinking: you know what, we've been getting away with GPL violations for a long time, we know the 10 people in the world who actually enforce the GPL, we know they have limited resources, so we're going to play the odds and gamble. And it's better for us anyway, because we want things like proprietary kernel modules, which are a violation of the GPL, and all sorts of other things they want to get away with, like crypto lockdown and that kind of thing. So they're like: well, we're just going to keep doing that. And therefore the engineers can't get support when they say: hey, we actually need a build person in here, we need a release engineer. Well, why do we need that? Oh, because we'd comply with the GPL better. Well, we don't care about complying with the GPL. I think those kinds of conversations happen between engineering and management a lot in these companies.

I'm pretty skeptical of the idea of using any of these technologies to really force malicious actors into complying; they will not work for that. But it does seem like developing toolchains that build reproducibly, that make it easy to document these sorts of things, could make it possible for actors who are either good-willed or just lazy. I think you need to really target the lazy developers and provide them with toolchains that make doing it right easier than maintaining some random old docs from who knows when.

The reason I'm so excited about the reproducible builds project is that it has enough buy-in from so many different parties that it will hopefully become an expected industry standard. Mozilla, which is totally outside the Debian world in a lot of ways, is interested because they want Firefox to be reproducibly buildable, and that's a whole orthogonal concern to what Debian is looking at. So there are many people across the software world who build stuff, which is effectively everybody doing software. What I would like to see, and this is where my thinking goes (I agree with everything you said, generally, but there's one step further I want to take it): I would like it to be the case that, if things like reproducible builds are considered an industry standard by Mozilla, by Debian, and so forth, then companies that source embedded products change their demands. Because the usual violation happens this way: they source the system-on-chip, and the system-on-chip vendor is the real nefarious actor. The company gets it, maybe even pressured (I guess not forced) into agreeing to one of these horrible NDA things. They're now shipping a product; that product is in the hands of thousands and tens of thousands of people, and it's violating for all those people. So then we're chasing that vendor, who has to chase their system-on-chip vendor, whom they're locked into through a complicated business relationship, because it's hard to change production lines. I want to create a culture, ideally over the long term, where that company will not accept a system-on-chip that's not reproducibly built. If all the middle-layer people who integrate products and build them around a system-on-chip board together demanded "our build has to be reproducible," then the system-on-chip vendors could no longer do the thing we were just talking about
over here, of saying: well, we don't have a good process, and we don't care, because we don't comply. It's a way into compliance through engineering: framing it as an engineering demand that we need for good engineering reasons. So that's the extra step. I think that's some of what you were saying, but there's an extra step we could get to if it becomes an industry standard. Which is exactly why I'm so frustrated, and sometimes cautious, about the whole compliance industrial complex: they're trying to create industry standards around stuff that just doesn't matter. If you hear the Wind River people talk, they're like: what we're really going to do is get every single copyright notice absolutely correct in our product, and won't that be amazing; we'll comply with every BSD license everywhere. But imagine you violate the BSD license, which you can do by failing to put a copyright notice in the right place. That is so easy to fix: the copyright holder writes to you and says, you forgot to put my copyright notice on your website; you edit an HTML file, put the notice there, and you're done. It's such an easy problem to solve. So why is the entire industry investing so much resource, and so many people's time, into making that so perfectly automated that you never get it wrong, when it's so easy to fix, while this other problem, which is complicated, difficult engineering and requires lots of smart people thinking about it a lot, gets essentially less interest and resources? Yeah, go ahead, please; you should get a chance at this.

I mean, the answer to that is really simple: because it's a hard problem. You can solve the easy problem. Let's ignore the hard one for now, and if someone complains, well, we'll deal with it later; let's get our product out the door and shipping now.

Obviously you do that; I agree. But the funny part is that the easy problem is 90% solved, and they're now focused on the 10% that's hard to solve and doesn't matter that much, because that's a productizable solution to something that seems like a problem. This is exactly why I call it the compliance industrial complex: they're creating problems, scaring people into believing there are problems that need to be solved that are already more or less solved, and ignoring huge other problems that are difficult to solve, because there's no product there. And this is why it took a project like Debian to start this. Debian is not driven by those forces; they're driven by the forces of doing things right and good for the world. So they're the first ones to say: wait a second, we need reproducible builds. And they're getting buy-in that it's really important, which is great.

So, when you say "the scripts to install": thinking about something like my television. I've gone to, I think, my television manufacturer's site (I'm not going to say who it is, because I can't exactly remember whether all of this is entirely accurate or whether it was something else), and I seem to remember seeing a list of tar files for, you know, curl, BusyBox, and whatever. So it raises the question: the scripts to install that, right, the scripts to build and install, would mean something like generating a ROM image. That ROM image includes a proprietary component, a proprietary user-space app, for example. And then you start to think: well, is that ROM image a derived work now? Because it's one image that may or may not be easily separable. Well, if it is, then they shouldn't have made that user-space application proprietary, of course, right? Exactly. And the ROM image being, like, the file-system image, because it's on one ROM, raises things like: is a CD-ROM a derived work, or a composite work, or... maybe I don't remember the exact terminology. So there are a couple of interesting things about that. A German court has already held that a wireless router is a single
work. So if you put Linux in a wireless router in Germany, according to at least one German court, you have to GPL the entirety of everything, including the web interface and all of that. And when people talk about how I read copyright statutes so expansively: well, I don't necessarily agree with that decision. I think that you can have a file system, and it is permissible to have a user-space web interface for a wireless router that is licensed differently than GPL; I think that's permissible. German courts disagree; they're much more expansive copyright folks in Germany, I suppose. So that's one question, right: if you're in Germany, maybe you do have to do that. But I think it's generally been considered (and GPLv2 even talks about this explicitly) that user-space applications of that nature, which are completely separate and independent works and do not combine as a derivative work, can be proprietary.

I always think back to the analogy of how the GPL was drafted: it was drafted for proprietary Unix systems that had some free software on them, right? So let's take the simple example: GCC was under GPL in 1994, I'm sitting on SunOS, I run make install, and it copies GCC into /usr/local. That doesn't mean that I have the source code for, or can figure out how to install, Sun's kernel or Sun's copy of ls based on BSD, all that stuff, right? That's also proprietary software. The same thing is true in embedded, right? So you have your TV, and your TV has that smart-TV web thing or whatever they use: a proprietary app that runs as a user-space application on a Linux box, because your TV is basically a Linux box running a proprietary user-space app. So you should be able to put the copy of busybox that you built in place in that ROM image and boot the ROM image. That's Linux; that's your copy of Linux. Now, to be honest, when you do that, the user-space app may or may not work with
your modifications. Like, there's no guarantee that when you modified Linux you didn't screw something up and now it doesn't work. It's not required that they make all the proprietary software work fine. And by the way, this is one of the reasons why I hate the phrase "TiVo-ization" for the cryptographic lockdown, because TiVo would actually, as far as I can tell, be in compliance with even GPLv3. The way the modern TiVos work, you can recompile and reinstall Linux, it will boot and work, and you can now use it as your own DVR with, say, Kodi or something. But what won't work is the proprietary user-space app that TiVo provides: it does a check on the kernel and then won't run. But you can still reboot it with a new version of Linux. As far as I know, this was true a couple of years ago and it's still true now. And that's totally reasonable. I don't mind the fact that when I install my own version of the free software, all the proprietary software on the box breaks. I actually see that as a feature, not a bug, because it helps show you what is proprietary on your box (it no longer works), and now you can go download Kodi or whatever you want, or in the TV case you use SamyGo, which is a DVR-style system that runs on an SD card on a TV, and which of course builds on the release we got from Samsung during the BusyBox lawsuit against Samsung. So I think all the details of what you're asking about work out. Now, if there's something I'm missing in there where you're like, "no, you're missing this part," tell me.

Well, I think the question is more like: when you say "the scripts to install it," those would seem to include the scripts to install, or to produce, the ROM image, the GPL-related stuff, because you're going to be talking about the kernel and busybox and all of that. But it doesn't seem like manufacturers, big manufacturers, are doing that. Are those manufacturers that you've just given up on? I mean, are those just not good examples of what is compliant with the GPL?

I have estimated in the
booth earlier today that probably somewhere between 80 and 90% of all embedded electronics devices are violating the GPL today. I'm pretty sure that's true. I was joking when I put up my camera with my dogs on it: I bought that camera on the open market, got it, and it's violating the GPL. Am I going to get it into compliance as well? It's not