Hi, everybody. I'm here to talk about how to make firmware for a microcontroller from scratch. That's not to say that you should normally do that, but rather to explain an avenue through which you can learn more about microcontrollers quickly. I'm a software engineer at Red Hat, and I work on the Kernel Continuous Integration project. I also maintain the DIGImend project, which makes drivers for graphics tablets. And in my free time, I do embedded and electronics.
Our subject will be this little development kit, which is called Blue Pill, not officially, just by people. It's very cheap, and all you need to program it is this TTL serial cable. By the way, I'm going very fast because we don't have much time. Links are in yellow, so if you open the slides you can find them on the internet and get all the source code and everything.
So what I managed to do with this is, for example, this little car made out of Lego and parts from printers and scanners. I tried to make it with an Arduino, but I burned it out quickly, and that was expensive. So I bought this one, and it served me very well. It's much cheaper, and I haven't burned one yet. Another project was extracting the EPROM from an old printer. Another one was a present for my wife for Christmas. I built the hardware quite quickly, but it took me until the next Christmas to make the software, so it was a little late. And the current project is an interface for connecting a thermal printer to an old ZX Spectrum using its ZX Printer interface. It's very educational and interesting.
What I'm trying to tell you here is how to program it from scratch, meaning that you don't have to use any libraries or any vendor tools, just open source software. But you will still need one library, of course: the documentation.
And we can organize the documentation in a stack here, going from top to bottom: the board documentation from the manufacturer and elsewhere, the particular microcontroller's datasheet, the reference manual for the microcontroller family, the programming manual with some programming tricks and routines, then the reference manual for the particular core, and finally the architecture reference manual for ARMv7-M, the architecture behind those microcontrollers. This is about 2,500 pages, but they're all very well organized and fun to read, actually.
So, the Blue Pill. It has the USB socket, the boot configuration jumpers, the reset button, the MCU itself, the system clock crystal, the real-time clock crystal, the power LED, the user LED, and the programming header, which we won't be using because it's the SWD interface. The microcontroller itself can run at up to 72 megahertz. Maybe you can overclock it, but I haven't tried. It has single-cycle multiplication and hardware division, which is awesome. It has 128 kilobytes of flash memory and 20 kilobytes of SRAM, and that's plenty for a microcontroller. It also has lots of general-purpose pins. Even this little package, which has, I don't know, 26 pins, was capable of talking to a parallel-interface EPROM, which needs a lot of pins. And finally it has lots of other communication interfaces, including, as you've seen, USB.
So the first thing you want to know is how to get your program in there. If you look at the datasheet for the particular microcontroller, you can see that it has a bootloader which is capable of programming the flash memory, and it does that over USART. That's the serial port, and that's why we are using the TTL cable. And this is the particular configuration of boot pins that gets it to boot from the system memory, where the bootloader is located. If we look at the board documentation, we can find that particular combination there as well. It's called ISP this time.
And this is the actual configuration. You can deduce it from the documentation. That's how you need to put the jumpers. Finally, we need to find out exactly which pins to connect our serial cable to. For that you can take a look at another document, the bootloader documentation, which says that you use PA10 for RX and PA9 for TX. So now we're ready to connect the serial cable. We don't forget to cross over RX and TX. We can power the board from the serial cable as well, but sometimes that doesn't work so well, so you can power it from, for example, a micro-USB cable. This is how it looks connected and powered on.
Now we need to find a tool that talks to that bootloader, and you can readily find one in your distribution's repositories. This is the awesome stm32flash, which has tons of features and is very nice to use. Even though you don't have a program yet, you can already run this tool to check your connection and see some basic board parameters.
People don't say it often enough, and if you look further in the slides it might look a little scary, but actually embedded is easy and fun. Unless you're paid to do it, of course. So, naturally, we'll start with blinking an LED. We take a look at the schematics to see how it is actually connected, and we find which LED is ours. It says there that D2 is our user LED, and here it is connected to PC13, but in a kind of quirky configuration, because one end is connected to power and the other end to the pin. That's a little unusual, you might say, but if you look at the documentation for the pins, there's a little note down there saying that PC13, our pin, cannot be driven faster than two megahertz and also must not be used as a current source, for example, to drive an LED.
But it doesn't say that it cannot be used as a current sink, and that's why the board designer used it this way: to basically take one of the most useless pins on the microcontroller, use it to drive the user LED, and solder the LED in. So that was very economical.
So we can start sketching out our program. We need to start on reset, then configure our pin for open-drain output with a two megahertz maximum speed, as required. Then we go into an endless loop, do a little delay, and toggle our pin. That's all we need to do. There is, of course, a little bit more setup involved.
The first thing we need to find out is how to start our code on reset. So we take a look at the architecture reference manual, and towards the end there's a little part that says that we take the address of the vector table from the VTOR register and use the first four bytes there to initialize the stack pointer. If you look further, for example in the programming manual, but it's mentioned in several places, it says that we have a full descending stack. That means that the stack pointer for an empty stack should point right above the stack. Then we load the next four-byte entry and jump to that address. So basically our vector table is first the stack pointer, then the reset address.
And now we can start making our vector table. First we allocate the stack, then we put the address right above it as the first entry, and then the address of the reset routine as the second entry. We still don't know what the VTOR register contains at reset, that is, where the vector table is supposed to be. So we take a look at the architecture reference manual and it says, well, it's implementation defined. So we take a look at the implementation and it says, yeah, it's fixed at address zero, which is fine. Great. So we take a look at the memory map.
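The vector table described above can be sketched in C roughly as follows. This is a minimal sketch, not the code from the slides: the names `vector_table`, `stack`, `reset`, `vectors` and the `.vectors` section name are assumptions made for illustration, and the 0x400-byte stack size is an illustrative choice.

```c
#include <stdint.h>

/* Assumption: the table goes into a section named ".vectors" so that a
 * linker script can place it first. The attribute is applied only when
 * building for ARM, so the sketch also compiles on a desktop machine. */
#if defined(__arm__)
#define VECTORS_SECTION __attribute__((section(".vectors")))
#else
#define VECTORS_SECTION
#endif

/* First two entries of the vector table, in the order the core reads them. */
struct vector_table {
    void *initial_sp;       /* loaded into the stack pointer on reset */
    void (*reset)(void);    /* address the core jumps to after that */
};

/* The stack itself; the size is an illustrative choice. */
static _Alignas(8) uint8_t stack[0x400];

/* Placeholder reset routine: the real one would set up the pin and blink. */
void reset(void)
{
    for (;;) {
    }
}

/* Full descending stack: the initial stack pointer is the address right
 * above the stack array, that is, one past its last byte. */
const struct vector_table vectors VECTORS_SECTION = {
    .initial_sp = stack + sizeof(stack),
    .reset = reset,
};
```

On the target both entries are four bytes wide, so the table occupies eight bytes. On a 64-bit desktop the pointers are wider, so the layout only matches when the code is compiled for the microcontroller.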
There we see address zero, and it says there is a block of memory that is aliased to flash or system memory, depending on the boot pins. That's how we get to boot from the system memory. By the way, that's where our bootloader lives, but if you change the boot pins, the flash memory appears there instead. The flash memory itself is, naturally, also at 0x08000000. And if we look further, we can see that our SRAM is at 0x20000000.
Now we can start writing the linker script and instruct the linker where to put all the parts of our program. First we define the memory regions: we say that flash is at 0x08000000 and it's read-only and executable, and that RAM is at 0x20000000 and it's read-write and non-executable. Then we start placing the sections. First we put the vectors in the flash, so they come first, at address zero. Then we put our code there, then the read-only data, and finally we put our read-write data into the SRAM. And finally, we instruct GCC to put our vector table into the vectors section, so that it ends up at the start of the flash.
Next we need to talk to the actual peripherals to control that pin and blink the LED. On ARM, the peripherals are controlled using memory-mapped I/O, where basically you talk to the peripherals by reading and writing certain memory locations. If you look at the memory map, the peripherals are here, all of them, and they usually sit in certain memory areas, as pointed out there in the table. Further on, and the documentation doesn't say this explicitly, it's more like widespread knowledge, you're normally supposed to enable your peripherals somehow. Not always: Atmel microcontrollers, some of them at least, don't require any configuration, and the peripherals can work right from the start. But sometimes you need to save power, and in the case of this microcontroller, you need to enable the clock.
For comparison, if you take the BeagleBone, the SoC there requires enabling both the clock and the power, and it's a little bit more complicated. But in our case it's just the clock. So we look at the clock tree. It might look a little complicated, but it's not that difficult to figure out. It says further on, in another part, that by default, when you apply power, the system runs from the high-speed internal oscillator, which is basically an RC oscillator inside the chip, running at about eight megahertz. One thing it doesn't say, though, is where our GPIOC port is connected, so we don't know how we are supposed to enable it. But on the system architecture diagram we can actually find it. There it is: it's connected via the APB2 bus and through the AHB system bus. So now we can find the APB2 bus here in the clock tree, and it's connected like this, through the APB2 prescaler, the AHB prescaler, and the clock switch. That's how the clock is fed to our GPIOC port.
Now, we didn't actually need to go and look at the clock tree, but it's instructive. We could have just searched the document for port C. Then we find immediately that the RCC peripheral has the APB2 peripheral clock enable register at offset 18 hex, and bit four controls the clock to I/O port C. So now we need to find out where our RCC peripheral is. There it is, and with that we can enable the clock to the port. So we define the absolute address of our register and raise bit four. Now the clock for port C is enabled.
Next we need to configure the GPIO for open drain and the maximum speed. Normally I would go through the whole chapter and read about the peripheral, to learn what it can do and what it cannot do. In this case I'll just run through the registers, and it's actually quite simple.
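The clock-enable step described above can be sketched in C as follows. This is a hedged illustration rather than the code from the slides: the macro and function names are made up for this example, and the RCC base address 0x40021000 is taken from the STM32F103 memory map, so check it against the reference manual for your part.

```c
#include <stdint.h>

/* Assumption: RCC base address from the STM32F103 memory map. */
#define RCC_BASE            UINT32_C(0x40021000)
/* APB2 peripheral clock enable register: offset 0x18 within RCC. */
#define RCC_APB2ENR_OFFSET  UINT32_C(0x18)
/* Bit 4 of that register gates the clock to I/O port C. */
#define RCC_APB2ENR_IOPCEN  (UINT32_C(1) << 4)

/* Absolute address of the register: peripheral base plus register offset. */
static inline uintptr_t rcc_apb2enr_address(void)
{
    return (uintptr_t)(RCC_BASE + RCC_APB2ENR_OFFSET);
}

/* Memory-mapped I/O: setting bits is a read-modify-write through a
 * volatile pointer, so the compiler cannot drop or reorder the access. */
static inline void reg_set_bits(volatile uint32_t *reg, uint32_t mask)
{
    *reg |= mask;
}

/* Only meaningful on the target: enables the clock to GPIO port C. */
static inline void gpioc_clock_enable(void)
{
    reg_set_bits((volatile uint32_t *)rcc_apb2enr_address(),
                 RCC_APB2ENR_IOPCEN);
}
```

Passing the register pointer into `reg_set_bits` keeps the bit manipulation testable on a desktop with an ordinary variable standing in for the register, while `gpioc_clock_enable` is the only part that touches the real address.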
There are two registers for general configuration, the low register and the high register, because there are 16 pins and four bits per pin. Then there is the input data register, where you read the pin state from, and the output data register, where you write the pin state to. There are two more registers which let you control the pins differently, by writing ones into different parts. And finally there is the configuration lock register, which can be used to prevent damage to external hardware from accidental misconfiguration. Finally, there is the table summarizing all the possible configurations of a GPIO port. It tells us how to configure general-purpose open-drain output, with these particular bits and a particular speed, and that we can then control it with the ODR register, the output data register. So we will be writing the output data register, and reading from it, because we're toggling the pin, and that will control the NMOS transistor connecting the pin to ground or not connecting it. So zero will connect it to ground and one will not connect it to anything. That's unlike the usual push-pull configuration, where the pin is alternately connected to ground or to power.
So we take the configuration register high, because it's pin 13 and it spilled over into the high register. Its offset is four. These are all the bits. We take the output mode with max speed two megahertz, as instructed, take the open-drain output, look up where our port C is located in memory, and we're ready to configure it. There's our register address, and there is a bit too much math there, which wasn't so necessary. I tried to make it clearer what's going on, but I failed. Then finally we need to toggle the pin, to actually turn it on and off. There is our output data register at offset 0C hex, and bit 13 controls PC13, port C, pin 13. Bit 13, clear enough. And there's our register, and we are toggling our bit already. All we need to add is a little wait. There you go. Crude but effective.
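The pin configuration and the toggle loop described above can be sketched in C like this. It is a minimal sketch under stated assumptions, not the code from the slides: all names are hypothetical, the GPIOC base address 0x40011000 is taken from the STM32F103 memory map, the delay count is arbitrary, and the port C clock is assumed to be enabled already.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: GPIOC base address from the STM32F103 memory map. */
#define GPIOC_BASE                  UINT32_C(0x40011000)
#define GPIO_CRH_OFFSET             UINT32_C(0x04) /* configuration high */
#define GPIO_ODR_OFFSET             UINT32_C(0x0C) /* output data */

#define GPIO_MODE_OUTPUT_2MHZ       UINT32_C(0x2)  /* MODE bits: 10 */
#define GPIO_CNF_OUTPUT_OPEN_DRAIN  UINT32_C(0x1)  /* CNF bits: 01 */

/* Return a CRH value with the 4-bit field of `pin` (8..15) replaced.
 * Each field holds CNF in its upper two bits and MODE in its lower two. */
static uint32_t gpio_crh_with_pin(uint32_t crh, unsigned pin,
                                  uint32_t mode, uint32_t cnf)
{
    assert(pin >= 8 && pin <= 15);
    unsigned shift = (pin - 8) * 4;
    uint32_t field = (cnf << 2) | mode;
    return (crh & ~(UINT32_C(0xF) << shift)) | (field << shift);
}

/* Flip one bit of the output data register. */
static void gpio_toggle(volatile uint32_t *odr, unsigned pin)
{
    *odr ^= UINT32_C(1) << pin;
}

/* Crude busy-wait; the volatile counter keeps the loop from being removed. */
static void delay(uint32_t count)
{
    for (volatile uint32_t i = 0; i < count; i++) {
    }
}

/* Only meaningful on the target: configure PC13 and blink forever. */
void blink_forever(void)
{
    volatile uint32_t *crh =
        (volatile uint32_t *)(uintptr_t)(GPIOC_BASE + GPIO_CRH_OFFSET);
    volatile uint32_t *odr =
        (volatile uint32_t *)(uintptr_t)(GPIOC_BASE + GPIO_ODR_OFFSET);

    *crh = gpio_crh_with_pin(*crh, 13, GPIO_MODE_OUTPUT_2MHZ,
                             GPIO_CNF_OUTPUT_OPEN_DRAIN);
    for (;;) {
        delay(100000); /* arbitrary count, tune by eye */
        gpio_toggle(odr, 13);
    }
}
```

For pin 13 the field sits at bits 23..20 of the high register, so a starting value of 0x44444444 becomes 0x44644444 after the update. Because the LED is wired between power and the pin, writing zero to bit 13 turns it on and writing one turns it off.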
So now we need to build it. The toolchain is generally available in distros. This is GCC for the arm-none-eabi target. "None" means that there is no host operating system, and that package is all you need to install. Then we tell the compiler to compile for Cortex-M3, which is our core, in Thumb mode. This is rather redundant, because it's the only mode the core supports, but older GCC versions require it. We just compile to an object file and then tell the linker to use the linker script to place the sections from the object file into the right places. This produces the ELF binary. Then we can inspect our ELF binary and see that everything went where it should. We see that the vectors went precisely to the flash address 0x08000000, and there are just eight bytes: two entries of four bytes. They're followed by the code. And finally our stack was placed in SRAM. Next we extract the flash part from the ELF binary and write it into a raw binary, and we can again take a look at how it turned out. So there is our stack address, 400 hex bytes into the SRAM, followed by the address of the reset routine, with bit zero set because that indicates Thumb mode, and the routine starts right after our vector table. There it is, our code.
Now we're ready to flash, and that's easy enough. We give the tool the file and tell it to verify after writing, and it goes quickly. And we're ready to run. But first we need to switch our jumper back from the bootloader to booting from the flash, so that the flash is aliased to address zero. We can take a look at the board documentation again, set our jumper, and off we go. It works.
So as you could see, this program was very short. That's thanks to the hardware designers tweaking the defaults in a way that is suitable for most basic use cases, and not so basic ones. And that's what actually makes it much easier to figure all this out.
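The inspection of the raw binary mentioned above can be illustrated with a small C sketch that decodes the first eight bytes of an image the way the core reads them on reset. The helper names and the example bytes are hypothetical: the bytes are constructed to match the description (a stack top 0x400 bytes into SRAM and a Thumb reset routine right after the eight-byte vector table), not copied from an actual build.

```c
#include <stdint.h>

/* Read a 32-bit little-endian word, as the Cortex-M3 does from flash. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* The first two vector table entries of a firmware image. */
struct image_vectors {
    uint32_t initial_sp;    /* first word: initial stack pointer */
    uint32_t reset_entry;   /* second word: reset address, bit 0 = Thumb */
};

/* Decode the first eight bytes of a raw firmware image. */
static struct image_vectors image_vectors_read(const uint8_t *image)
{
    struct image_vectors v = { read_le32(image), read_le32(image + 4) };
    return v;
}

/* Bit zero of a vector entry marks Thumb code. */
static int entry_is_thumb(uint32_t entry)
{
    return (entry & 1u) != 0;
}

/* The actual code address is the entry with bit zero cleared. */
static uint32_t entry_code_address(uint32_t entry)
{
    return entry & ~UINT32_C(1);
}

/* Hypothetical first eight bytes of an image, for illustration only. */
static const uint8_t example_image[8] = {
    0x00, 0x04, 0x00, 0x20,   /* 0x20000400: top of a 0x400-byte stack */
    0x09, 0x00, 0x00, 0x08,   /* 0x08000009: code at 0x08000008, Thumb */
};
```

Decoding `example_image` gives an initial stack pointer of 0x20000400 and a reset entry of 0x08000009, whose code address is 0x08000008, the first byte after the vector table in flash.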
You read the documentation, but you can trust that for whatever basic use case you're trying to do, the defaults will probably be very good, and you can check those in the register descriptions. Now, writing all those register values and addresses by hand gets boring, so naturally you would want to make a library, and there are many ways to do it. This is how I did it. You basically define the layout of registers for a peripheral, then the peripheral addresses, and then the bit meanings. And that will take you quite a long way towards making the peripherals functional.
Right now the microcontroller runs at eight megahertz, and it's capable of running at 72. So how can you do that? You take a look at the clock tree and figure out how to connect the external oscillator to the system clock, and that goes through this part: configuring the PLL to multiply it by nine, waiting for it to stabilize, and then switching the system clock over from the internal oscillator to the PLL that's being fed by the external oscillator. But there are of course some gotchas. The flash cannot keep up with the CPU at this high speed, so you have to tell it to slow down the CPU above certain system clock speeds, so that the CPU can actually read the instructions it's supposed to execute. Another one is that the APB1 bus, which has some of the peripherals, cannot work faster than 36 megahertz, so you're supposed to slow that down too. So here it is, that's the whole configuration for the speed.
And if you want more peripherals, that's also very well explained in the documentation. It's actually more than you need explained here, and this is all you need to transmit and receive one byte through USART. Here you can see how to configure the PWM mode, and that's how you blink an LED without the CPU.
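The library idea mentioned above (register layout, then peripheral addresses, then bit meanings) can be sketched in C for the GPIO peripheral. This is not the speaker's actual library: the struct, macro, and function names are made up for this example, and the GPIOC address 0x40011000 is taken from the STM32F103 memory map.

```c
#include <stddef.h>
#include <stdint.h>

/* 1. Register layout: one 32-bit member per register, in address order. */
struct gpio {
    volatile uint32_t crl;   /* 0x00: configuration, pins 0..7 */
    volatile uint32_t crh;   /* 0x04: configuration, pins 8..15 */
    volatile uint32_t idr;   /* 0x08: input data */
    volatile uint32_t odr;   /* 0x0C: output data */
    volatile uint32_t bsrr;  /* 0x10: bit set/reset */
    volatile uint32_t brr;   /* 0x14: bit reset */
    volatile uint32_t lckr;  /* 0x18: configuration lock */
};

/* Catch layout mistakes at compile time. */
_Static_assert(offsetof(struct gpio, crh) == 0x04, "CRH offset");
_Static_assert(offsetof(struct gpio, odr) == 0x0C, "ODR offset");
_Static_assert(offsetof(struct gpio, lckr) == 0x18, "LCKR offset");

/* 2. Peripheral address (assumption: from the STM32F103 memory map). */
#define GPIOC ((struct gpio *)(uintptr_t)UINT32_C(0x40011000))

/* 3. Bit meanings: one output data bit per pin. */
#define GPIO_ODR_PIN(pin) (UINT32_C(1) << (pin))

/* With the layout in place, register access reads like field access. */
static inline void gpio_pin_toggle(struct gpio *port, unsigned pin)
{
    port->odr ^= GPIO_ODR_PIN(pin);
}
```

On the target, `gpio_pin_toggle(GPIOC, 13)` would flip the user LED pin. Taking the port as a parameter also lets the same function run against an ordinary `struct gpio` variable on a desktop.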
That's how you can read a byte from the ADC. This is also simple enough, though there is a little bit more involved in the description. And that's how you can send a byte and receive a byte through SPI. And finally, don't be afraid: buy some of these boards and experiment. Thank you. So if you have any questions, I have a present for the best question, one of those development kits. Go ahead.
So I really know very little about this. What would happen if you ran the same program, but without the code that configures the timer and the GPIO?
You can actually run this program. Let's say, this is a working program, you don't need to do anything else. If you click on the link on the slide you get that program, and if you execute the commands that are in these slides, you will get a working program, exactly. You don't need to configure anything else. These are extras, that's where you can go later. That's the whole program that you need, that program and the linker script, and that's all. Was that your question? To run the program without which part?
Without setting the clock and the GPIO.
The clock and GPIO. What kind of program would that be?
Trying to write to that... To still be doing something.
I don't remember what happens. I suppose nothing would happen. Well, you won't see anything; most likely there will be no activity. A hard fault, probably. If the bus tells the processor that something is wrong, there will be a hard fault, probably, an exception, and it will try to jump to an address from the vector table, which will be wrong, and it will go haywire.
So I have a question a little bit from the practical side. We've seen a lot of documentation. Being someone who hasn't touched it before, how would you go about finding out what I don't know? Because right now I'm in a situation where I don't know what I don't know, so I don't know which documentation to look at, right?
Well, you can start by asking questions and searching through that documentation.
The first question is: how do you program this thing? So if you search the documentation for something like "program flash", you can stumble upon the mention of the bootloader and how you can program with it. You can take it from there, and then dig further, like how the bootloader gets booted and what controls it. That's how I was finding it. And also, once you buy the board, there's the board documentation that says ISP, aha, in-system programming, and that can give you a clue. Any more? Yes?
So, a reason for doing this, I would think, is that you want to constrain the size, or you want to make sure that all the code in there is code that you've audited somehow. So what about taking one of those libraries, like libopencm3 or CMSIS, understanding it, and then removing as much as you can? Would that be another approach to understanding how the microcontroller works? Would you take that approach?
That's a good question. I would say that without understanding how the microcontroller works, you don't know which parts of the library you're supposed to remove. That's the problem. And that's why I actually went with this first, before trying libopencm3, because I didn't know how it works. Okay, I can go and read it, but it's difficult to understand if you don't know how the microcontrollers work.
So when you were going through the slides I saw a mention of JTAG on the STM32, which raises the question: what sort of debugging tools do you have?
Well, usually I raise a pin or blink an LED. That's good enough for me. But of course you can use the SWD interface, and there are plenty of cheap tools available to debug through that interface.
Did you try them?
Yeah, I have an Olimex ARM-USB-OCD-H, which supports it. That one is actually expensive, but on eBay you can find tools which connect there and are able to debug this: stop the execution, step through instructions, and all the stuff you want. I just didn't find it worth the trouble.
More questions? Oh, oops.
Have you ever run into a case where the ROM bootloader has information or defaults or magic numbers that you need to look up and then use in your program once you jump to flash and run? Something kind of like on a desktop computer with UEFI? Does that ever occur with microcontrollers?
No, not really, because you only use the bootloader to flash the board. Once you switch the boot pin, the bootloader is out of the path. It's not getting executed. It jumps straight into your code. You're in full control. Oh, there's another one. There's another one there. We have time, yeah.
I haven't tried it. I'd like to. I think that the bootstrapping would probably be harder than in C, because Rust is a little more involved in that part. There's a link there discussing how to do the bootstrapping and make it work. But I assume that it could be more fun than with C. But C gives you that real, hard feeling that you're doing it. All right, so thank you. Thank you.