 next speaker was 12 years ago at a conference in Berlin that shall remain unnamed, where he presented a car hacking talk before it was called car hacking, where he injected RDS traffic and could reroute your navigation system. Luckily, he's no longer breaking cars, but instead went on to build secure stuff. And for one of these, the USB armory that you might have heard of, he has now come up with Tamago, a full bare metal Go runtime for your secure software development needs. Please give a warm round of applause to Andrea Barisani.

Can you guys hear me? Yeah. OK. Thank you for the introduction. Thank you for reminding me how old I am. And yeah, we still break cars, actually, but that's work now and not fun anymore. And the fun stuff is actually this one. So just a little bit of introduction about myself. I work for F-Secure. And some of you might know me for a company which I founded, which is called Inverse Path, which was acquired a couple of years ago by F-Secure. I'm one of the authors of the USB armory. And yeah, as was mentioned in the introduction, I work a lot with hardware and embedded systems on safety critical systems, such as airplanes, cars, industrial systems, and so forth. And because I'm getting old, basically, now I tend to like to build things, to make things, rather than only breaking them. I think this is an inevitable phase in the life of information security researchers. You get a little bit tired of just pointing your finger at things that are broken, and at some point, the industry becomes so good at breaking things that I think we should also stop a little bit and think about creating tools, hardware, and software, which can really serve the non-security community better in solving all kinds of security issues. Because we see that there are a lot of issues that never change, despite the fact that we're almost in 2020.
And one of the motivations for us in building open hardware, such as the USB armory and Tamago, which is directly linked to the USB armory, as you will see, is also to provide better tools, tools that are maintained, that work, that are clean, that are trusted. And I think this is a phase that a lot of information security researchers at my age are now going through, which, again, is just, I guess, getting old. So the USB armory is an open hardware computer, which is meant to be a secure enclave in a very, very small form factor, just a USB device. And Tamago is based on our need to build software in a better way for this device. So the whole inspiration comes from the journey of creating this hardware. And it comes from a very simple scenario that we face while testing all kinds of embedded systems. So I'm a strong believer in the fact that, just like with natural language, if any of you want to code for a specific device in whatever language you like and prefer, you should be able to do it. And if the implementation of that language, for some reason, generates code which is not fast enough for you to have a successful project on a piece of hardware, that is not necessarily your fault, or it shouldn't be your fault for choosing the wrong language. There shouldn't be any wrong language. The language should be adapted to your need, to your style. As programmers, as developers, we should, in an ideal world, not care at all about how the compiler is optimized or not. In an ideal world, all compilers should generate machine code with the same efficiency. If you like to do math in SQL queries or in Go, Rust, Python, assembly, whatever, in an ideal world, in a Star Trek, USS Enterprise-E world, it should all generate the same bytecode. Because the intention that you're giving to the code remains the same. You want to do the same operation. However, we do not live in an ideal world. We live in the real world.
And this means that developers, to make the choices that they need to make in selecting frameworks and languages, really need to be careful about the implementation behind the language and the hardware that it supports. And this is not ideal. So usually there are two very distinct scenarios. We test hardware for a living. We have hardware that has lower specification microcontroller units, which are used because engineers want to simplify their design or they want to save money on the parts, whatever the reasons. The only practical choice, or the only real world choice, for programming these devices in a production system is by using unsafe lower level languages, which typically means C. And so in our work we have tested cryptography tokens, wallets, hardware diodes that play a very important role in ensuring a separation of safety boundaries on things like cars and planes, and all your lower specification IoT and smart appliances. They all have firmware that, despite doing operations which are pretty basic, is written in a language which is unsafe from an implementation perspective. On the other hand, if we have hardware with higher level specifications, we can code in pretty much anything we want, but we need the support of a complex operating system to do that. So if we have a system on chip and we can run Go, Python, whatever higher level language on it, we're just shifting complexity around. So the complexity and the, let's say, unsafeness is taken away from us as programmers, but it's distributed everywhere else in the stack that allows us to run that code, because we're gonna have a Linux system, we're gonna have a lot of drivers that maybe we don't want. We're gonna carry millions of lines of code that are not strictly necessary for the task that we're doing. And as we know, complexity is an enemy of security.
And if I want to program a system in a higher level language, I don't want to just put all of that complexity under a carpet and have it there running underneath me. I just want it to go away. As a security person, that's the reason why I pick a higher level language. So we face these two scenarios and neither of them is ideal. And now we also see a shift toward systems on chip, away from microcontrollers, also in avionics, in any kind of system which needs to be a little bit more complex: your home router, higher specification IoT and smart appliances. And we also see, which is quite common, that despite having this power and the underlying OS, we still see C applications running in user space on these systems. Your infotainment system is very likely to do that, even if there's no good reason for doing so. And we pop them all the time because inevitably, C is a hard language to code in. We should realize that, no matter if you're a C lover or not, it is now vastly proven that it's very difficult for production grade code written by a lot of developers to be safe, because the effort needed to make it safe just takes too much of a toll. So our penetration testing success rate on this kind of system is always 100%. And as we build system-on-chip based hardware, we didn't want to face these situations. We didn't want to write bare metal code in C. And at the same time, we didn't want to have our higher level language applications running under complex operating systems. And our goal in doing this is to reduce the attack surface of embedded systems. We don't want to carry millions of lines of code that we feel are unnecessary. We want a system to perform only the bare minimum of what we need to do. And we think that this can be done by removing any dependency whatsoever on C code or complex operating systems. So we want to avoid shifting complexity around or having complexity hidden from us.
But we want to run a higher level language such as Go directly on the bare metal. And that is the motivation for Tamago, which is directly inspired by creating the USB armory in the first place. Now, of course, I would assume that some of you here know Go, and a lot of people are thinking, why not Rust? And we're gonna get to that. But the point that we're trying to make with this project is, why not both? We want people to have the choice of using the language that they want. And since we want to use Go, that's why we created this framework. So why Go? First of all, a disclaimer, because I know when we enter into these topics, people have feelings, I have feelings, you have feelings, everybody has feelings about languages and frameworks and so forth. So this is not a talk about saying that language X is better than language Y. I'm not here to say that Go is better than Rust. I'm here to say that we think that certain languages, which have less of a chance now to succeed in bare metal applications, can have this chance. So we want the ecosystem to be more diverse and this is why we made this effort. But it's not to say, or to force you, or to tell you that Go is better than Rust. In fact, we want you to have the choice, and we want to give you the choice of Go, which wasn't present in the past. So if we look at the speed versus safety axes, so to speak, this is not to scale. If you know Rust, if you're in love with Rust, you might decide to place the R of Rust in a different location on the chart and this is absolutely fine. Again, this is not to scale. The scale is subjective here. But we all agree that if we draw a line, Go is something which is of course slower than Rust in its end result. But it's easier, to a certain extent, to learn. The learning curve is certainly easier, more shallow, on Go than on Rust. Rust, of course, is much better than C.
Of course, it's a safe language, and if we go to the other side of the spectrum, we have C, which gives you more control, more hardware control, but it's also harder to implement correctly. And now we are in a situation where we have languages like Go, which are fairly fast, much faster than languages such as Python or Ruby. So they can really be used to create binaries that run on embedded systems. However, they're a little bit detached from the hardware. So if you want to either run on the bare metal or make firmware at a low level, they're not ideally suited for now. So we want to somewhat fix that, at least for the Go language. And one of the reasons why we want to do this is because this is the typical setup of a secure firmware that we make for our USB armory or other kinds of embedded systems. We have a bootloader, which is secure booted by the hardware, by the system on chip. So we have the first stage authentication of the bootloader. And then typically the bootloader authenticates and loads a Linux kernel image, because that's the operating system that most people use. And that's the operating system which bootstraps the whole decryption procedure for, let's say, an encrypted partition and so forth. And maybe it has drivers, it communicates with the system on chip to get some key material uniquely derived from something stored only in that chip. So this is a typical chain of secure and verified boot to achieve authentication of all of your code and also confidentiality of the data. And the problem is that we're typically faced with a scenario where we're developing something, let's say a cryptocurrency wallet or whatever crypto related firmware. And now we code it in a language like Go and we have very few lines of code in Go, a few thousand lines of code. We use the standard library of Go for everything, for TLS, for crypto. So we minimize the third-party dependencies and we have code which is clean and nice.
However, in order to boot this image on something like the USB armory, we need to carry around a Linux image to do fairly simple tasks such as decrypting something, talking to the system on chip, doing USB, and then launching our Go application. So in the end, to us, it's kind of inelegant, the fact that we spend so much time simplifying and cleaning up the code of the firmware and then we need to carry an operating system which is giant compared to what we need to do. And we need to update it very often because, despite the fact that you only have a few drivers exposed, you still want to keep it up to date because you never know. And of course, you also have user space tools which, I mean, you can try to reduce as much as you can. You use BusyBox, you use frameworks for generating compact Linux images such as Buildroot and so forth, but still it doesn't feel like the right thing to do. This could be more optimized. And while this is an example for the USB armory, it applies to pretty much all kinds of embedded systems that we test and that have some sort of secure boot and so forth. They all follow the same pattern when using a system on chip. So what we really want to do is take Go and move it down there on the axis. So we want to keep the same ease, speed and efficiency in development, but we want to have more hardware control, which also means that we want to kind of remove this red box over there. We want to take away the millions of lines of code that we don't own, that we don't maintain, that we're kind of stuck with. So this is the idea of Tamago. Of course, this is not a new concept. This is known as unikernels, or library operating systems, which are single address space images which typically run on the bare metal, and their focus is to reduce the attack surface.
The problem, however, with the available unikernels is that most of them, not all of them, are so-called fat unikernels. First of all, a good chunk of them are just, again, hiding complexity from you. So there's a good portion of unikernel projects that give you an API and documentation and they tell you, look, you're going to develop your application, you're going to compile it, and then that's going to be executed. But in the end, they do have an actual kernel underneath, which sometimes is even derived from fairly complex operating systems such as NetBSD and FreeBSD, and the whole framework just puts a lot of abstraction layers in the middle so that you don't see the kernel, you don't see the runtime, and you just deploy your application. Now, this is all well and good, but from a security standpoint, this doesn't really solve the problem. In fact, I think it creates the opposite problem. So while researching for this talk, I looked at all the unikernel projects that are around and for most of them, it was really hard to find which kernel they were running, and the documentation kind of gives you the illusion that they are this magical bare metal project, but they're actually not. A lot of them just pull in code from NetBSD or FreeBSD. Most of them are based on third-party kernels such as Mini-OS, which, granted, is still written in C, but is a much shorter and smaller code base than something like FreeBSD. And also most of them are actually not focused on the bare metal, in the sense that they're not focused on running on embedded systems, but they're focused on running in the cloud. And so they all support hypervisors, such as Xen, which is not what we want to do on embedded systems. So for all of these reasons, most of the existing unikernel projects are not really suited for embedded system development and they don't achieve what we want to achieve, which is kill C.
I don't want any dependency on C written code whatsoever while having my firmware running. And if I'm gonna have a hypervisor, or if I'm gonna have a kernel written in C, you can abstract that as much as you want, you can hide it, but it's really not gonna solve the need that I have. So this is really not what we wanted. The other problem is that, when it comes to security, these unikernel projects, and rightly so, want to support arbitrary applications. So they want you to be able to compile your application in whatever language you want and then execute it. Or they also want to be kind of like OSes and provide support for multiple applications. But the thing is, if you're having multiple applications and different trust domains under a unikernel, or if you're running an application which is written in an unsafe language like C, you kind of want an industry standard OS, because you kind of want address space layout randomization. You do want stack canaries. You do want all of the security features that are the good parts of complex operating systems and that are there for you. So in our approach, we are interested only in unikernels that allow us to run on the bare metal on embedded systems, and we want to run a single higher level language on that unikernel. We're not interested in everything else, because for everything else we think that actually maybe operating systems are a little bit better. So we don't want to focus on the cloud. We don't want to rely on a hypervisor. And again, I explained why we chose Go. It's what we use a lot, and so we wanted to give Go a chance because we think it has a shallow learning curve. So productivity can be very good with Go, and also, primarily, it has a very strong cryptographic library that we want to use. And again, Rust has already proven that it has a role in the bare metal world. So it has nothing to prove and it's going to succeed as well.
But Go doesn't have that chance, and that's what we want to give to Go, this chance, because we think it really can succeed and we're going to see why. So in a nutshell, this is what we're trying to achieve. The other message of this talk is that it's important how you do it, not only what you do, because anybody can understand that with the right effort you can put Go on the bare metal. That's fine, but the problem is how you achieve that, because there's an element of trust there. And we're going to get to that. So with Tamago the idea is that we want to find the path of least resistance in patching the Go compiler. We want a patch which is absolutely minimal to cleanly enable support for the bare metal. So our take is to provide a different OS variable to Go. Normally in Go you have GOOS to specify whether you're under Windows, Linux or other operating systems. So we created separate GOOS support and a minimal patch to enable that on the ARM architecture, so that we can run the runtime on bare metal. So this is one half of Tamago. The second half is a set of packages that provide support for hardware boards. So the drivers, so to speak. Right now we have support for the USB armory system on chip, which is actually a widely used system on chip, so not just specific to the USB armory. It is a member of the NXP i.MX6 family, and we're going to target more platforms in the future. And our goal in doing this is, again, to develop our security applications using the existing open source tooling that we have for signing secure images and so forth for the USB armory. There have been similar Go efforts in the past and there are similar Go efforts right now, but for a variety of reasons they all didn't quite fit what we needed to do. So we had two projects mainly, which are not maintained. There was a project called Biscuit, which wanted to actually create an OS kernel in Go.
So the idea there wasn't just to support Go applications but to support any application written in any language, with POSIX compliant interfaces and so forth. This is unmaintained. There's a lot of complexity in the project because they redo memory allocation and threading and so forth, and they hijack the existing GOOS=linux support despite not running on Linux. So for these reasons it wasn't exactly what we were looking for. There's another unmaintained project, which is called GERT, which is an ARM adaptation of Biscuit for running only Go applications. Again, it hijacks GOOS=linux and has more complexity to it than what we want. So that's also something that is not suited for what we wanted. There's another nice project called AtmanOS, which was presented I think three years ago, which is kind of similar to Tamago. However, it targets the Xen hypervisor and has limited runtime support, which is also something that we don't want. Now of course, if you like Go, if you know a little bit about the ecosystem, you might know about TinyGo, which is active and rocking. It's a great project. However, for our purpose TinyGo is not quite what we wanted, because it's a completely different re-implementation of the Go compiler. So it's a different compiler, not the original one, and because it targets microcontrollers and not systems on chip, it provides a different runtime with more limited language support, so it's not quite like having vanilla Go. So it has a different focus. And then, brand new, actually published a few days ago, we have Embedded Go, which is a new project which also targets microcontrollers and the Thumb architecture. So it actually adds new compiler support for it, because ARMv7-M is not native to Go, so it adds a noos GOOS for the Thumb architecture. So again it does something a little bit different than what we do, but it's actually quite an interesting project so we're gonna keep a close eye on that.
So all of these projects, whether they are maintained or not, whether they are complex or not, and whether they do what we like or not, really helped us in proving that this can happen. So throughout our project, our approach was not that we needed to understand if this was possible. We just needed to understand if this was possible without polluting the compiler, if it was possible to do it cleanly enough, and all of these projects gave us the assurance that this can be done. So we're really grateful to all of the people that put their effort into these projects. I work in information security, so for me at least we are entering a territory which is compilers and languages and so forth, so it's really a new domain for us, but we want to bring over our core principle, which is enabling trust. And we see that there are a lot of projects, most unikernel projects, that are something you would never see in production. They are really nice from a technical perspective. These are people that do something they believe in, they have passion and they push the boundaries of technology, but are you gonna find those in production? Well, not so much. So we want something that is done in a minimal, clean and trusted way, that is good enough to be eventually accepted upstream, because that's our final goal. So we really wanted to find out if we can patch the original compiler in a very minimal way, and much of the effort has been placed in that. We didn't want to pollute the Go runtime to levels which we think, as security people, are unacceptable. Less is more, that was the motto of our effort. We want to have the least number of modifications, still readable of course, that would make sense, so that we match the existing style and structure of the way the Go development team is working, because this also leads to code which is more verifiable and more maintainable in the future.
So we designed it for a hypothetical upstream inclusion in the future, so we're working for that, and we have a commitment to always sync against the latest Go release. In the end we ended up with about 3000 lines of code of compiler changes, and that's it, in order to support the runtime and enable the additional GOOS. We placed strong emphasis on reusing code which was already there within the Go compiler framework, and the final goal is for developers to be able to use this just by having one import in their code and that's it. If you don't need to use the hardware, you don't need to know about the hardware, and we want to support unencumbered Go applications, with no limitations, ideally zero limitations in the end. And also, the compiler is only half the story. We provide drivers so that you can actually run this on hardware which these days is relevant. And by using the original Go compiler we do inherit nice properties, such as the fact that the Go compiler is self-hosted, it can compile itself, and it has reproducible builds. So these are all nice things that we do want when creating our firmware code. So we have three different categories of Go compiler modifications that we've done. We have what we call glue code, which is merely code that just adds the tamago keyword to source code that needs to be compiled. So this code has no logic and is very benign. It's just stubs and definitions where we say there's a new architecture and its name is tamago, and so we update all of the lists which are required to enable this support. So this is about 350 lines of code across many files. So we change many files but the changes are really, really tiny and they have no impact whatsoever on the stability or security of the code. Then we have a second set of changes, which is the bulk of it, which is about 2,700 lines of code, which is reusing existing code within the Go runtime for execution on the bare metal.
So I'll give you an example, and this is what I call the Go Frankenstein, because it was like creating a Frankenstein monster, but it's much better than what it sounds. It's not as ugly as Frankenstein, and it actually works. So, memory allocation. A lot of projects that try to put the Go runtime on the bare metal completely reimplemented memory allocation and threading. We just saw that there's the memory allocation for Plan 9, which is included in the Go runtime and maintained, and with one line changed we can use that to run on the bare metal, because at some point the Plan 9 memory allocator just uses the brk syscall to allocate memory, but we're running on bare metal. We have our memory space, so we can just allocate pointers from it, and so with one line of change we can use all of that code, which is already there, tested and maintained. For locking structures and so forth, there's locking code within Go which is primarily for WebAssembly, and we can reuse it identically. The nice thing about this code is that it has three functions which hook into the external OS, and we're gonna use those to implement proper timer support. And the nice thing is that we can keep a clean separation between what we need to do to run things on the bare metal and what Go already has within its code. It's nice to do that rather than just hacking and changing Go code. It's nice to have clean entry points for doing things which touch the hardware a little more. And then we have an in-memory file system for now, but this is gonna change soon, in the next month, because we're gonna add MMC and FAT support, which is actually quite easy to do. But there's an in-memory file system from NaCl, which we just copy over.
We enable it for Tamago and it works. And this is actually the bulk, so this is where the highest number of lines of code change, because of the way the compiler works: we just need to copy the mem_plan9.go file into mem_tamago.go in order to use that code. And then we have new code, which is about 600 lines of code in 12 files, which is Tamago specific functionality, and it mainly provides initialization of the ARM core. So this is all code which is fairly standard. You will find it in any OS, any bootloader and so forth. And then we have code which provides hooks with your application and the board package, to understand how big the memory is and what the offset of the memory is and so forth. So with all of the changes, it was really surprising to us that the Go runtime is almost freestanding on its own, with not a lot of dependencies on the actual operating system apart from system calls, which we're gonna see now. So this is the extent of the modifications that we need to do to run Go on the bare metal, or at least to have a compiler which allows us to do that.
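As an illustration of the "we have our memory space, so we can just allocate pointers from it" idea, here is a minimal sketch of a bump allocator over a flat RAM region, runnable on any ordinary Go installation. It is not the Tamago runtime code: the `region` type, its fields, the `alloc` method and the example addresses are all hypothetical, standing in for the memory offset and size that the talk says come from the board and system on chip packages.

```go
package main

import (
	"errors"
	"fmt"
)

// region is a hypothetical description of a flat block of RAM, in the
// spirit of the offset and size hooks a board package would provide.
type region struct {
	start uintptr // first usable address
	size  uintptr // number of usable bytes
	next  uintptr // offset of the next free byte from start
}

var errOutOfMemory = errors.New("region exhausted")

// alloc reserves n bytes aligned to align (which must be a power of two)
// by bumping an offset; nothing is ever freed, as with a brk-style break.
func (r *region) alloc(n, align uintptr) (uintptr, error) {
	// Round the next free address up to the requested alignment.
	addr := (r.start + r.next + align - 1) &^ (align - 1)
	end := addr + n

	if end > r.start+r.size {
		return 0, errOutOfMemory
	}

	r.next = end - r.start
	return addr, nil
}

func main() {
	// Example values only: a 4 KiB region at an arbitrary address.
	ram := &region{start: 0x80000000, size: 0x1000}

	a, _ := ram.alloc(10, 8)
	b, _ := ram.alloc(16, 8)
	_, err := ram.alloc(0x1000, 8)

	fmt.Printf("%#x %#x %v\n", a, b, err)
	// prints: 0x80000000 0x80000010 region exhausted
}
```

The point of the sketch is only that, without an operating system, "asking for more memory" reduces to arithmetic on a region whose bounds are known in advance, which is why a single changed line can be enough to retarget an existing allocator.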
This is the memory layout that we use. So your Go application lives there in memory, we have a heap, a stack, an interrupt vector table and so forth. So all of this is pretty standard, and we use all the available RAM depending on the board that we have. Concerning Go runtime support, basically there are three components here. We have the support within the Go runtime itself, so this is an example of what happens in the file os_tamago_arm.go, and we see that we have hooks, we have variables and functions which need to be defined externally by the application. We don't wanna put information about all the different boards, all the different hardware and the hardware peripherals within the Go runtime. We don't wanna pollute it with that. So we have one generic function for hardware initialization, we have a function for printing on the console, and we have a function for getting random data and one for getting ticks, which the runtime expects the external board package to provide. And the same goes for the RAM: where memory starts, what the offset is and what the size is. And then the rest of the code in this file is just architecture related initialization, so not specific to a board but just ARM initialization and so forth. So this is part of the Go compiler modifications. Then we have the system on chip package, which is actually very simple, because the only thing that it provides right now in relation to hooks with the runtime is the variables for where the memory starts and the offset of the stack. And then we have the board package, which actually tells what the size of the RAM is, because the start of the memory is gonna be the same for a specific system on chip, but if you have different boards you might have more RAM, so the actual size is specified in the board package. And so for instance here in the USB armory package we say, okay, when I want anything to be printed out on the console, the console for the USB armory is actually the second UART, the second serial port, so 
that information belongs in the USB armory package. So this allows us to have minimal modifications in the Go runtime, to have what is system on chip specific information in the system on chip related package, in this case the i.MX6UL package, and any information that's specific to the board we have in the board support package, so in this case the USB armory one. So this is the clean way of doing it. Another example here: we have a timer definition. Within the Go runtime, at some point Go needs to get ticks or to understand what the time is, and that's provided externally by the i.MX6 package, which provides support for the generic timers for the USB armory, because that's what the architecture provides. And we can also of course mix in assembly when it's required. This is something that already happens in Go, it's not something that we're doing only ourselves. It's something which is common and accepted and it's the most efficient way to deal with very low level aspects such as getting timer information. So all of this is initialization code which accounts for about 500 lines of code, so not so much, and again it follows existing patterns in the Go runtime. So this is another example here. At some point we were developing and we saw that code was running slower than expected, and we were like, oh wait a minute, we need to change the clock speed, because this system on chip by default is clocked at about 400 megahertz, and if you wanna run it at full speed, 900 megahertz, you actually have to do it yourself. The bootloader doesn't do it. The bootloader always sets the default frequency. And so we quickly coded in Go, within our board package, actually within our system on chip package, the function for setting the frequency, and this is what a driver looks like in Go. So we have our functions for setting registers. Here we set the PLL register, we set two bits to zero at this offset. It's kinda what you would find in C, but just by using Go 
we can wait for a value to become one, because we're waiting for the lock on the clock. Here we're removing the bypass that we needed for changing the clock, we set a divisor, and so forth. So you can write drivers in Go, and the interesting thing about using memory safe languages on the bare metal is that every time you need to do something which is not safe, you have a specific keyword for that. So in Go, like in other high level languages, you have the keyword unsafe. So if you want to scout and look for all of the potentially dangerous places in the code where you're doing something like pointer arithmetic, you can just grep for it. You can just search for unsafe and you're gonna find all the occurrences where you're doing pointer arithmetic, because you do need to do that for drivers, but at least it's very easy to identify those within the code. That's also something which we thought was really nice about using a higher level language such as this one. Concerning syscalls, the Go runtime makes direct use of syscalls for a lot of functions and this was our main concern: do we need to emulate about 50 system calls in order to have the runtime working? And it turns out that only one is actually really needed, which is write, which is the one that eventually gets hooked to the printk function. So now we support the write syscall only for standard output and standard error, and we use that to print to the console, because that's the only thing that you actually need on bare metal. I mean, you're either writing to a file descriptor, which is handled in a different manner within the Go runtime with the file system, but if you wanna write to standard output, on this class of devices you don't have a screen, so you have a console, and that's what we do in the board package. And if anybody wants to do something different with that, in the board package, which again is outside the compiler, you can define whatever printk method you want. So in the end this is 
what it looks like. Normally you would have your Go runtime running in user space under a complex OS. The Go runtime would make system calls, and then the kernel space with its drivers would serve them, talk to peripherals, and so forth. With Tamago we live in a Go runtime process. Your package is linked with the runtime, the system-on-chip and board packages are also linked, and these are the ones that provide the drivers to the Go runtime. Every time a system call is made, which in this case is the write system call, it is just hooked to the actual driver support within the Go package. But we are all within Go, and we use the vanilla Go runtime, with the exception of a few initialization and runtime support functions which are specific to Tamago, which are the ones that are actually serving system calls and so forth. So this is the change. And again, we're not only dramatically reducing the lines-of-code count, we are completely eliminating C, because in this setup the only C is actually the bootloader, which goes away after boot, and anyway we're also working on replacing that. There's no C involved at all, not a single line, in all of this.
So how do you develop, build, and run this thing? In order to use it, you write Go as you always did and you just import the board package. If you're not using a driver specifically, that's the only thing you need to do. If you want to use a driver, like the random number generator or USB, then you also need to import that, as you would in Go. But to run basic operations that's the only thing that you need. So that's the first step. Then you compile with go build as usual, with the exception of a few flags to the linker, where we need to tell it, and this depends on the board, what the entry point is going to be and where the text segment of our application is going to be. But that's it. So we have GOOS=tamago, GOARM=7, GOARCH=arm, and then we just use go build. So Tamago
here is a variable where we just have the Go runtime compiled with Tamago support. And then this is U-Boot, the bootloader: you just load the resulting ELF and that's it. There's no intermediate bootloader needed; you just run this application as you would a kernel.
So we implemented drivers, security-oriented drivers, for our system on chip, to prove to ourselves that this can actually be used, and this was an important part of the process. The i.MX6ULL, which we use on the USB armory, has a few security peripherals that we needed to enable. The first driver that we developed was for the data co-processor, which is the element that allows you to do encryption, decryption, and key derivation with a hardware unique key, which is fused within the chip at the first power-up of the system on chip. It's fused, you cannot read it, you can only use it, and it's unique for each chip. So we wrote a driver for that. The driver takes about 240 lines of code, which is, I think, 10 times less than the Linux kernel module for this. Then, if you load its package, you can just invoke derive key and you can derive a key using the hardware. It also detects whether you're secure booted or not.
Also note, the nice thing is that we can use structures that we create in Go. They can be made C compatible with a little effort, so you can pass them to the actual hardware, to the memory, so that data can be allocated. We just allocate a structure here and then we actually pass a pointer to the structure, and it just works. Here at the bottom we're actually writing the address of our Go-allocated structure to the hardware register, and then the hardware is going to fetch the structure and do its work.
We wrote the driver for the random number generator. There's a true random number generator within the system on chip, which is useful for the very first boot, because this kind of hardware doesn't have a battery, there's no real-time clock, so the very
first boot you need an initial seed, and this is a good use for that. So we also wrote a driver for this, 150 lines, and we hooked it to the crypto/rand functions of Go. You just use Go normally, and if you use crypto/rand the random numbers are going to come from this.
Then I wrote a USB driver, which is something that makes you question your life choices, I tell you, when you're at that point in your life where you're almost 40 years old and you're writing a USB driver. However, my only concern was reading and studying the reference manual. At least I wasn't dealing with C and memory and so forth. So actually Go really helped keep me happy, because I could use goroutines, I could use channels, I could use mutexes whenever I wanted. So it was a delight. My only problem was actually understanding the reference manual.
When developing drivers with Go there are only two aspects that you need to care about which are unusual for Go programmers, because they never have to deal with them. You need aligned structures in memory, because most hardware will refuse to load data from an unaligned pointer, so we created a class for that. And to keep the garbage collector happy you need to carry around the underlying buffer which allows us to do the buffer alignment. But again, those are the only concerns that you really need to take care of. So we have a full driver.
Also, for every driver that we do, every time we touch the hardware we put in the page number, the name, and the section of the reference manual. Because, trust me on that, looking at code from the Linux kernel and other projects, there were so many quirks that if there had been just one comment pointing to the right page, you could have saved yourself hours of just learning. So if you want to learn about system on chip and driver development, we also put in all of the references that you need in order to understand what's going on. I think that's something that's missing a lot in kernel modules these days.
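The two concerns mentioned above, alignment and keeping the backing buffer alive, can be sketched roughly as follows. This is not the Tamago implementation: the `alignedBuffer` type and `newAlignedBuffer` function are hypothetical names, and the sketch assumes that heap allocations do not move, which holds for the current Go garbage collector.

```go
package main

import (
	"fmt"
	"unsafe"
)

// alignedBuffer pairs an aligned view with the allocation that backs it.
type alignedBuffer struct {
	raw  []byte // over-sized allocation, kept referenced for the garbage collector
	Data []byte // view into raw whose first byte sits on the requested boundary
}

// newAlignedBuffer returns size bytes (size > 0) aligned to align,
// which must be a power of two.
func newAlignedBuffer(size, align int) *alignedBuffer {
	raw := make([]byte, size+align)
	addr := uintptr(unsafe.Pointer(&raw[0]))

	// Distance from addr to the next multiple of align (0 if already aligned).
	off := int((uintptr(align) - addr&uintptr(align-1)) & uintptr(align-1))

	return &alignedBuffer{raw: raw, Data: raw[off : off+size : off+size]}
}

// Addr returns the numeric address of the aligned data, the value a driver
// would write to a hardware register. The collector does not track this
// number, so the alignedBuffer itself must stay referenced while it is in use.
func (b *alignedBuffer) Addr() uintptr {
	return uintptr(unsafe.Pointer(&b.Data[0]))
}

func main() {
	b := newAlignedBuffer(256, 64)
	fmt.Println(len(b.Data), b.Addr()%64 == 0)
}
```

Over-allocating by `align` bytes and slicing at the first aligned offset avoids any dependence on how the allocator happens to place the buffer.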
USB networking. Once I had the USB driver, we implemented USB networking in two hours. That was easy. Half of the code is just defining the descriptors, and then we define two functions for transmitting and receiving Ethernet-over-USB packets. We have the two functions and we hook them to Google netstack, which is a very nice full Go TCP/IP stack made by Google, so we just pulled that in.
And now I'm going to show you the demo of all of this, if the demo gods are kind to me. On the left side I'm going to boot my USB armory with Tamago. So this is Tamago running. What it did in these few seconds: the bootloader booted directly into Go, we self-tested the random number generator, we changed the clock speed, we said hello because we're polite, we launched seven goroutines, we derived a key, we read from memory, we slept for 100 milliseconds just to make sure that the time implementation is correct, we generated a few random numbers, we made a few ECDSA signatures, we signed a Bitcoin transaction, the goroutines completed, then we allocated about 1.5 gigabytes of memory just to check that the garbage collection works, and now we are waiting for USB. So if I plug it into my other USB armory, I have a USB armory connected to the USB armory, it's very meta. Now the USB descriptor has been evaluated and we already see network traffic. So if I connect to my USB armory now: hello, this is a simple UDP echo server. I can ask for a random number, I can debug the memory, and I can also do this: I can stream Star Wars in ASCII. And if it's not as smooth as you think it should be, the problem is not the armory, it's actually Windows, which doesn't support a console very well.
So yeah, all of this, except for the boot part, except for the bootloader, all you're seeing here, USB, TCP/IP handling, streaming, everything: there's not a single line of C code involved. It is pure Go and a little assembly, and I think this is pretty cool. I don't know about you, but yeah. So,
performance. We'll see if the movie ends while we go. So, performance: as expected, the speed is the same compared to running the same Go application under Linux. This is an example of ECDSA signatures from the Go compiler test suite, running under Linux on the same hardware and running under Tamago, and the times are actually identical. And this is what it's supposed to be: it's supposed to be either identical or even faster, because we have less overhead from the operating system doing context switching.
There are a few limitations. There are very few, and we're working on them. First of all, on this hardware we're single threaded, so if you have a tight loop and you have functions in this tight loop which don't go back to the runtime, it's going to be stuck there forever. This is not unique to us; this is what Go does every time you have GOMAXPROCS set to one and you're single threaded. So this is expected and normal, and you can also avoid it, by the way: you can force an invocation of the scheduler in tight loops. But usually, if you have really tight loops that don't do anything, it's just because you're testing Tamago, not because you're actually doing real work. We still have to implement file system and storage support, so we're going to do that. If you import a package that needs something which requires an OS, such as a terminal console and so forth, it's not going to work, but that's expected. You can link C code if you want, but why would you after my talk, right?
But you can if you want, as long as it's freestanding. There's no OS, there are no users, there are no signals, there are no environment variables. This is a feature, not a bug.
So, with the exception of a few surprises, Go is surprisingly well adapted to run on bare metal, and now we're going to use this in the future to write the secure firmware that we want. We want to write HSMs, cryptocurrency wallets, authentication tokens, TrustZone secure monitors, and much more. This is the baseline for developing secure applications on this kind of hardware. So again, we learned that we can reduce complexity, not just shift it around. We kill C completely, at least in this very specific implementation. And again, it's all about enabling the choice of a language which didn't have much of a chance on bare metal, but now we think it does. In the next months we just want to build trust in this and maybe have it accepted upstream. So thanks to all these people who enabled us to do this project, and now I have two minutes for questions, I hope, just a couple of questions. Thank you so much.
Thank you, Andrea. Perfect ending time, 13:37, so we still have 13 minutes for Q&A. If you want to ask questions, we have three microphones, please line up here. Microphone three is actually equipped with an induction loop if you're using hearing aids. And I get a signal that we have questions from the signal angel in the very back, from the internet.
Hello, I have two questions from the internet, from the IRC. The first one is: does the garbage collector somehow cause performance issues on bare metal?
No, not in our experience. And also, when working on bare metal, if you really want, you can turn it off. I mean, that's something that Go always had: you can turn off the garbage collection if you want, and you can run it at very specific times, or, if your application is short lived and has predictable memory allocation, you can also decide not to run it at all. It really depends on what you're doing. In our experience, for the operations that we need to do, we never stumbled into problems, and its performance and behavior are pretty much the same as what you would see with a normal Go application running under a normal OS. There's actually no difference; we're not changing its behavior.
Okay, thank you. The next question, and the last question, is: is Tamago suitable for real-time applications, and if so, how much?
I think that by disabling the garbage collection, possibly it can be. I'm not a big fan of real-time operating systems. In our work experience, every time somebody used a real-time operating system, they had so many bugs anyway that the real-time part wasn't really working well, and actually they didn't really need it. But of course there are some applications, such as financial applications, where you really need it. If you have the time and effort for that, Rust is probably a much better suited language. Having said that, I think there might be a chance that by turning the garbage collection off this can also work, because in the end the result is very predictable if you turn the garbage collection off.
Next question from microphone number one, please.
Thanks for your project. Three small questions. First, do you usually look at the assembly which you get after compiling on your platform? Second, Go is very famous for fuzzing: do you have some fuzzing of your applications on your platform? And the last one: did you find any bugs in the Go runtime while porting to your platform?
So, yes, we look at the assembly. We also use the Go assembler ourselves. The generation is identical, again, to what you have with normal Go on x86, or sorry, on ARM. The difference is that it runs on bare metal, but the efficiency that you're going to get when compiling is the same, because we're not touching that. We're not touching the Go assembler, we just use it. The second question, fuzzing: we want to use this to fuzz USB. Actually, one of our projects is to implement a low-level USB fuzzer with the USB armory that can fuzz the host. We're also trying to understand how we can integrate fuzzing of this externally with Go by using go-fuzz. But yes, it's something that we're thinking of. And the third one, did you find any bugs in the runtime itself? Yes. In fact, when you get the slides, just look at this slide: there's a fun Go bug in the bottom right corner about the garbage collection. So yeah, we found at least one. It's a weird property, but yeah, we found one. But we're working with the people who work on the Go compiler for a living, and they're being so supportive. So yeah, it's not a stopper. I mean, we didn't find anything that was a showstopper for us.
Thank you, excellent.
Thank you. We have another question from the internet via our signal angel.
Yes, there are three more questions.
I think we have time for that. One, okay, then we take...
Well, you get all three, but just one now.
Pick the easiest one.
Okay. How suitable would Tamago be for writing code for other microcontrollers, for example the 32U4?
So for microcontrollers just go with TinyGo, because the footprint of applications built with the standard Go compiler is pretty large. So TinyGo, which is a great project, has a very good reason to exist. For microcontrollers, TinyGo; for systems on chip, Tamago. That's the separation.
Next question, microphone three, please.
Hello, thank you very much for the talk and for the work. Will you be supporting other targets as well, like the armory Mk I and Allwinner chips?
Yes, we plan to support the armory Mark I, and we also plan to support the Raspberry Pi Zero. We're actually working on that right now, because of course we don't want to just support our own hardware. I think it's important to give other projects the chance to use this, and it's actually very easy to support other pieces of hardware, it's only a few days of work. So yes, definitely, and pull requests are welcome.
Okay, back to the signal angel.
Alright, another question is... oh, I got it. Can this be run on other Cortex R-class processors, or is it the same answer, use TinyGo?
It can be executed on any system on chip that has ARM architecture support within the Go runtime. So I would say that it would be trivial to run it on any ARMv7 system on chip, and it should be very easily adaptable to other ones. Again, the small number of modifications required and the hardware initialization make it easy to port to other platforms. So as long as we're talking about systems on chip with a Cortex, it should be fairly easy to do.
Yeah, thank you for the answer. I have more questions. Would Tamago also run on the USB armory Mk I?
Yes, yes, definitely. We're going to make sure that we provide that support soon enough.
A very interested C-only developer asked on Twitter: what about debugging, breakpoints, register maps, and register manipulation on the MCU using Tamago?
GDB works beautifully. We use GDB, we use breakpoints, we can stop anywhere we want, we see the code just like in any other application. Otherwise we would have gone insane. So yes, that works.
Okay, next up, microphone number two, please.
You mentioned file system support.
Yes.
Are you using... I think you mentioned FAT. Do you use the... There's a pure Go implementation for this... Sorry, no, I'm blanking on the name. The Fuchsia project, I think, has a full FAT implementation.
Can you speak up a bit, please?
Oh yeah, the Fuchsia implementation. I think they're using a user space driver for FAT that's all in Go. Is that what you use?
There are pure Go FAT implementations already out there. The GERT project already has that, and also has the MMC support, so we're going to try and get that and put it in. It should be a very trivial effort. And I just mentioned FAT because it's an easy file system format, and usually on an embedded system you just want to take a blob, write it, read it. You don't need fancy storage.
Thanks.
Microphone number one, please.
Thanks for the talk. Have you talked to upstream about getting it into the mainline?
Yes, we're working on that. We're very anxious about it because we want everything to be super clean, but it is our intention to give this the best possible chance, and we have contacts upstream. This was coded from the very beginning with the intention to make things clean, nice, and respectful of what's already there in the Go runtime. We didn't want to hijack things that are not meant to be hijacked. And that's our goal, because in the end I don't want to maintain the compiler part, I just want to maintain the drivers and everything else. So yeah, we're really trying hard to get this into a state where it has the best chance of being accepted upstream.
Do you have a timeline?
No. But I would hope by the end of next year.
Would it still be called Tamago, and why is it called Tamago?
So it probably won't be called Tamago. It is called Tamago because tamago means egg in Japanese, and you have Go, and Go lives on bare metal in its own shell, and so it's Tamago. And if you run it under QEMU, it's a Tamagotchi.
Thank you.
This is the only reason why we do these projects: we first come up with a name and then we're like, oh, what can I do with that name?
Do we have any more questions from the internet?
No, we do not.
I think we have another question at microphone number one.
What about the fact that today's embedded systems often have screens?
The USB armory has no screen, so it wasn't our focus right now. Having said that, there's no reason why you couldn't implement a video driver with this. Maybe it won't be as performant as it could be, but if you're doing DMA right and you're clever enough, of course it can work. But for now it's not our focus. Our focus now is having smarter smart cards, HSM tokens, authentication tokens. So we have Bluetooth on the USB armory, we have USB. For now, if we really want a UI with the USB armory, we either have a mobile app or you do it through networking. But yes, maybe in the future, who knows?
Are there any more questions? I guess not. Well then, thank you.
Thank you so much. Thank you.