Hello and welcome everyone to my talk about building open hardware with open software. Today, I will tell you about my experience of using open source tools to build a system on chip, synthesizing the bitstream with an open source toolchain and running Linux on a soft RISC-V CPU. I will put a special emphasis on the challenges we faced and what I think should be considered in order to make the tools useful in an actual production system. My name is Michael Tretter. I work as an embedded Linux developer in the graphics team at Pengutronix. Usually, I am working in the Linux graphics stack: device drivers, OpenGL graphics libraries, video encoding and decoding. Sometimes, I am even writing applications with graphical user interfaces. So, why am I engaging with FPGAs and hardware topics? Usually, I am working with device drivers for commonly used IP cores on ARM SoCs; for example, these IP cores are video encoders or decoders, camera chips or display bridges. However, sometimes customers have special use cases and implement their own soft IP cores in an FPGA. In these cases, I also write the device drivers for these custom IP cores. Since these IP cores come in handy at times, I wanted to be able to write and experiment with custom cores myself. However, to be able to experiment, I need to be able to program and use FPGAs. So I needed to expand my knowledge into the field of FPGAs and hardware development. This is the story of how I started to get into FPGA development. The talk is structured as follows. I will start with some further motivation: what makes FPGA programming interesting, which use cases make FPGAs relevant as a platform, even for production systems, and which relatively recent developments made FPGAs a much more interesting topic. I will show which tools and projects you might use to build your own FPGA bitstreams, for example the Rocket and VexRiscv RISC-V CPU cores, 
the LiteX SoC builder, and the Yosys and nextpnr synthesis toolchains. Based on my experience with the tools, I will further explain what I think must be done to lift the current status from a proof of concept to an actual production-ready system that helps developers and doesn't drive them crazy. In the end, I will sketch the next steps that we want to take to widen our understanding of the FPGA ecosystem. I won't be able to cover everything related to building FPGA bitstreams in this short talk. Therefore, I am expecting the following background knowledge for the remainder of the talk. If you want to dig deeper, there are plenty of resources out there on the internet for further reading. I expect you to know that FPGA stands for Field Programmable Gate Array, which feels like hardware but is actually reprogrammable and can be changed after it has been manufactured. That an FPGA basically consists of many lookup tables that can be configured and connected using a bitstream. Special toolchains synthesize the bitstream from hardware description languages like Verilog or VHDL. The toolchains have to know about the capabilities and internals of the FPGA to produce a bitstream that is specific to each FPGA model. During FPGA startup, the bitstream has to be loaded into the FPGA, either externally or by the FPGA itself from a flash. Without the bitstream, the FPGA is just an empty shell. I also expect you to know that RISC-V is a relatively new royalty-free instruction set architecture, which gained a lot of traction during the last few years. There are already various RISC-V chips in the world. There is a whole lot of research ongoing with regard to RISC-V cores and extensions, which encourages the development of and experimentation with new CPU features. Therefore, there are ever more open source RISC-V designs out there, which can be synthesized into FPGAs to allow the evaluation of these features. 
Finally, I am expecting some basic knowledge about systems on chip: the CPU, caches, peripherals, interconnects, buses, etc. This won't be necessary to follow the talk, but once you start to synthesize SoCs yourself, that is, trying the stuff I am showing in the talk yourself, this will be helpful. Regarding FPGAs, I would like to start with a look at the use cases: in which situations would you want to use an FPGA instead of implementing a software solution or having a full custom hardware design? As said before, as part of the graphics team, I occasionally have custom products on my desk that feature one or more FPGAs. In most cases, these products have one or more of the following properties that lead to the usage of an FPGA. 1. The system has to handle a high data throughput. In graphics, this very often means video data. For example, a raw 4K video stream at 60 fps, which is a lot of data, is captured on an HDMI input port, maybe encoded and sent via network, or processed and forwarded to an HDMI output port. 2. There are at least some real-time requirements on the system. Again, for video data at 60 fps, there is a deadline of 16 milliseconds for the processing of one video frame, since the next frame will arrive after 16 milliseconds. This is not a limit for the overall processing pipeline, but each processing step has to be reliably done within that timeframe. 3. There is the possibility to capitalize on hardware parallelism, either by running the same task multiple times in parallel, or by using a processing pipeline with multiple tasks on the same data. On a CPU, all processing tasks would compete for the CPU bandwidth, but in hardware or an FPGA, different computing units might be used on different data at the same time. One example of parallel processing with a video stream is tiling. The video frame is split into smaller parts, the tiles, and we have multiple processing units that each operate on a single tile. 
After processing, the tiles are put together and we get the whole processed frame. I was using video as an example, because this is very close to my daily work. But another common use case is networking. You have a lot of network packets that have to be processed, but the packets are independent and can be processed in parallel. A lot of that is already done in real hardware, but for experiments and special use cases for which hardware is not available, you might implement it in an FPGA. To summarize, you would think about using an FPGA if you need parallelism to cope with real-time requirements for processing a lot of data. But you would only do this if the expected volume or the current development state does not justify the production of custom ASICs. Now, FPGAs are great, but why haven't you looked into FPGAs earlier? For a long time, there were only proprietary and closed source FPGA toolchains available. The bitstream format for an FPGA is usually proprietary and there is no publicly available documentation. Therefore, there was no other option than to use the software tools that the FPGA vendor provided. Usually, these tools come as a whole tool suite with a graphical user interface. They are very large, in terms of gigabytes, and running them on a host system that is not actively supported by the vendor is a hopeless task. Furthermore, each vendor provides their own tool suite with its own configuration menus and data formats. If you want to develop for Xilinx and Lattice FPGAs, you basically have to learn two totally distinct workflows. As the configuration and development mainly happen in graphical user interfaces with clickable menus and buttons, integration and automation are very hard to handle or require obscure programming languages. Having to install the entire tool suite is also a problem for continuous integration. Due to proprietary formats and other magic, putting projects into version control isn't that easy either. And don't get me started about licensing. 
This situation has significantly changed in recent years, though not thanks to the vendors. Reverse engineering of FPGA bitstreams has gained a lot of traction. Project IceStorm has led to an extensive documentation of the Lattice iCE40, and Project Trellis did the same for the Lattice ECP5. There is even an ongoing effort to document the Xilinx Artix-7 bitstream format in Project X-Ray. Based on this reverse-engineered documentation, it is now possible to use Yosys as an open source toolchain for FPGA synthesis to generate proper netlists for various FPGAs, and to use nextpnr to place and route the netlist and generate actual bitstreams, all of that without installing a load of vendor tools on your host. There are even new FPGA vendors like Cologne Chip that actively add support for their FPGA architectures to the open source toolchain. Additionally, these FPGAs are relatively affordable now. While the high performance eval boards by Xilinx still have a price tag of a few thousand dollars, you can get boards with an ECP5 FPGA for about 100 dollars or less. This lowers the bar a lot for hobbyists to start fiddling with FPGAs by themselves. Also for hobbyists and beginners, it helps that there are many open source IP cores available. These might be experimental RISC-V CPUs or other custom cores for SPI or video. You are able to look at other people's designs, use them, learn from them and improve them, and having an open source toolchain to observe how they are synthesized encourages learning as well. Thus, for hobbyists, using the open source tools is a great way to learn and start working with FPGAs, but what about actual production systems? For example, the Linux Automation GmbH provides various small hardware tools for developing, testing and debugging other embedded devices. It might be interesting to have prototypes or IP cores that are implemented in an FPGA for these tools as well. 
Possible examples are a custom display bridge as an early prototype for some eval board, or hardware for testing error states or injecting faults that are otherwise very difficult to reproduce. All the fuss around the open source FPGA toolchain raises the question whether we can also use the open source FPGA tools for actual small production systems. Once you start asking this question, many more questions appear. We read about Yosys and nextpnr. Are the tools really sufficient to build complex FPGA designs? Do we need other tools to create FPGA designs? What is the state of these and other FPGA tools? How do the different tools fit together? How well are they integrated? How do I use the tools to build a more complex FPGA design, potentially consisting of several IP cores? How do I control what I build into my FPGA design? Can I collaborate with colleagues and share the design like I would share source code? Can I use version control on the FPGA designs? What about potential regressions, reproducibility, continuous integration? Basically, the question is: how well do the FPGA tools work with regard to the requirements that we set for our embedded software use cases? We know that failure to meet these requirements hurts in the long run. Therefore, we set out to evaluate the current state of the FPGA ecosystem and how well it fits the use case of implementing small helper tools for embedded systems developers. First off, we need some hardware. Simulation plays a large role in FPGA and hardware development. However, the behavior of the simulation can significantly differ from the behavior of the actual hardware. Usually this affects the timing, but you might experience other effects as well. There are various boards featuring FPGAs on the market now, even some which are open hardware, for example the Fomu board. So far we have been working with two boards, both of which have a Lattice ECP5 FPGA and good support by the open source toolchain. 
The OrangeCrab is a very small form factor board. It works via a single USB connection to a host PC. USB is used to power the device. It appears as a DFU device on USB and allows you to send a bitstream to the FPGA. If your bitstream contains a respective USB serial converter, it will finally appear as a USB serial device with a serial console to whatever is running inside the FPGA. The form factor is neat, but having only this single USB connection and handling the physical button to switch to DFU mode is a bit difficult in our remote development labs, which we really like to use to share hardware between different developers. Our main development is now happening on a LambdaConcept ECPIX-5 board. The board also features an ECP5, but in a variant that is a lot larger than the FPGA on the OrangeCrab. It has 512 megabytes of external RAM and a separate power supply. Furthermore, it has built-in USB interfaces for JTAG and serial, which makes integration into our lab a lot easier. For the remainder of the talk, I will focus on these boards because these are the only ones that I have actual experience with. Things should work with similar boards if the FPGA and the toolchain are the same, but might not work as expected. Let's start with the hello world of hardware development: let's blink an LED on the board to verify that the toolchain works, that we are able to put the bitstream into the FPGA and that the board does what we want. The blink example for the OrangeCrab is a good starting point. We navigate to the examples repository and find a very simple Verilog file, a short readme and a Makefile. We check the Makefile and find more or less the following commands in it. First, we call Yosys with the synth_ecp5 command on the Verilog file. This creates the netlist for the ECP5 as JSON. Then we place and route the netlist for the OrangeCrab with nextpnr. 
Here we add board-specific definitions for the pins of the clock that drives our design and the LEDs that should blink. Finally, we pack the textual bitstream description into an actual bitstream to be able to bring it to the device. As said before, we are able to load the OrangeCrab via DFU. We do that with the generated bitstream, and the LED starts to blink. This was pretty easy, comprehensible, and feels really nice. You still have several steps involved in this process, but you could glue them together with a meson.build script or a Makefile to integrate everything into a single build process. As a side note, some people like to integrate loading the bitstream into their build script. I am not sure if this is really a good idea. From my point of view, this is something that really depends on the environment and thus should be done in a separate, host-specific script. For example, what happens if you are building on a different machine than the one that is connected to your target hardware? However, I understand the convenience. Having a blinking LED isn't something that's really useful for further evaluation, the same as Hello World is not really useful as a benchmark of a software environment. Therefore, we would really like to evaluate the tooling based on some more real-world example, which we can actually modify, test in different configurations and maybe even run benchmarks on. And we want to see how much logic we are actually able to fit into the FPGA. As said in the beginning, the RISC-V universe provides us with several RISC-V CPU cores that are meant for FPGA synthesis. These should have a reasonable size for evaluating the tools and seeing how well they perform. We more or less randomly picked two available CPU cores, the VexRiscv core and the Rocket CPU core. Both are freely available on the internet, there are tutorials with instructions how to build them for an ECP5, and they claim to be able to run Linux. 
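The blink flow described above can be sketched as a few shell commands. This is a sketch from memory: the file names, the nextpnr device flag and the pin constraint file are assumptions on my side, so check the actual Makefile in the OrangeCrab examples repository before using it.

```shell
# Synthesize the Verilog into a JSON netlist for the ECP5
yosys -p "synth_ecp5 -json blink.json" blink.v

# Place and route for the ECP5 on the board; the LPF file
# holds the board-specific pin definitions for clock and LEDs
nextpnr-ecp5 --25k --json blink.json --lpf orangecrab.lpf --textcfg blink.config

# Pack the textual bitstream description into an actual bitstream
ecppack blink.config blink.bit

# Load the bitstream onto the board via DFU
dfu-util -D blink.bit
```

Exactly this sequence is what a small Makefile or meson.build could glue together into a single build step.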
Let's look at both cores more closely. The VexRiscv implements the 32-bit RISC-V instruction set. It is highly configurable via plugins; that is the reason for all the brackets in the instruction set listing, because these instructions can actually be disabled. It is implemented in SpinalHDL, which is a Verilog code generator written in Scala. This is one of the reasons why it is able to support all these different variants. You can generate the Verilog code by running the generator with SBT and the intended configuration. For example, you can add an MMU, build it as an SMP cluster, enable and disable the privilege levels, and add or remove the data and instruction caches. The VexRiscv is optimized for FPGA usage without any vendor specific primitives, therefore we can use it on the ECP5 with the open source toolchain. This gives us a very flexible CPU core to play with. The Rocket core supports the RV64GC RISC-V instruction set. This is a 64-bit instruction set, compared to the 32-bit one of the VexRiscv. It is written in the Chisel hardware construction language, which is another Verilog code generator implemented in Scala. It has an MMU, branch prediction, a floating point unit, and supports the RISC-V privilege levels required for running Linux. We now have two different CPU implementations. These are nice for simulation, but in order to use them on actual hardware, we also need peripherals to interact with the outside world. Therefore, we need to integrate them into a system on chip that is suited for our test hardware. Both cores come with their respective SoC environments for evaluating and simulating more complete systems, but neither directly supports the ECP5. Furthermore, we want to do the evaluation of the different cores in similar environments to understand the effects of the environment. Thus, using different code generators and SoCs does not look constructive. 
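Generating the Verilog for a VexRiscv variant with SBT looks roughly like the following. The demo generator name is taken from the VexRiscv repository as far as I remember it; treat it as an assumption and check the repository's readme for the currently available configurations.

```shell
# Fetch the VexRiscv sources
git clone https://github.com/SpinalHDL/VexRiscv.git
cd VexRiscv

# Run one of the demo configurations with SBT;
# GenFull enables MMU, caches, etc.
sbt "runMain vexriscv.demo.GenFull"

# The generator writes a VexRiscv.v file, ready for synthesis
```

Each demo class corresponds to one plugin configuration, which is how the same code base produces all the different variants.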
This can be achieved with LiteX, which is a framework for creating FPGA SoCs and claims to provide all the common components required to easily create an FPGA SoC. Let's have a look at the LiteX feature list: support for mixed languages, including VHDL, Verilog, Migen, nMigen, SpinalHDL, etc.; build backends for open source and vendor toolchains; an ecosystem of cores like LiteDRAM, LitePCIe, LiteEth, LiteSATA, etc.; and a lot more, smiley face. This looks very helpful to bring up something more useful on the hardware and to be able to play around. Looking a bit further, we also find the Linux-on-LiteX project, which already provides support for Linux on a LiteX SoC. And even more interestingly, there are projects for both the VexRiscv and the Rocket CPU. Great. In order to understand how LiteX works, let's see how we can build a bitstream with a Rocket CPU core with LiteX. We go to the linux-on-litex-rocket repository and find a readme that contains instructions how to build a bitstream. First, I shall install a few host tools. Nothing really suspicious. Furthermore, I shall install the correct toolchain for my respective hardware. I have an ECPIX-5, so it tells me to install Yosys, Trellis and nextpnr in this case. Nice. But I should choose the latest version, and it might not work with the OS package version. It does not specify which version actually works. I also need a RISC-V toolchain in order to build the bitstream. At this moment, I am not sure why I would need a RISC-V compiler toolchain for synthesizing a bitstream, but okay, we will see later why this is needed. For now, I simply follow the instructions. Now we should install LiteX. We should wget the litex_setup.py script and execute it in the current directory. This script clones a number of Git repositories and installs the contained Python projects into my system. Looks safe so far. But I give you a moment to think about this and its implications. 
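The install step boils down to something like the following. This is a sketch from memory of the LiteX readme; the URL and the script arguments are assumptions, so double-check them before running, and keep in mind that the script really does install packages into your system.

```shell
# Fetch the setup script straight from the repository...
wget https://raw.githubusercontent.com/enjoy-digital/litex/master/litex_setup.py
chmod +x litex_setup.py

# ...then let it clone a pile of Git repositories and
# pip-install the contained Python projects for the current user
./litex_setup.py init install --user
```

Note that there is no version pinning anywhere in this flow: whatever is on the Git master branches at that moment is what you get.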
Now we are ready to build the bitstream. For that, we call a Python script for the board that we want to use, with the CPU and the variant that we want to use and which SoC features should be enabled. Things start to happen and a lot of output is produced. We can go and grab a coffee, as the build will take a few minutes. The command successfully runs through and we end up with a bitstream for the FPGA. The readme further instructs us how to build the Berkeley Boot Loader, or BBL, which implements the SBI that sets up the system to boot Linux and is later responsible for trapping some hardware exceptions. And we should build Linux and a busybox-based initramfs. The readme also instructs us how to copy the built images around to create a usable final image and how to load the bitstream via OpenOCD. Let's look a bit deeper into LiteX and what is happening during the previous steps to build the bitstream. The lambdaconcept_ecpix5.py script is the entry point. It contains a main function that instantiates the ECPIX-5 platform, which contains the pin configuration, and configures an example SoC with the possible peripherals. It also contains functions that implement how to build and optionally load or flash the bitstream. To me this is a somewhat surprising design decision: the user interface of the build tool, the build instructions, the board selection and the SoC configuration are all mixed into a single file. LiteX starts by compiling the BIOS, which is an early ROM bootloader that is integrated into the bitstream. This is also the reason why LiteX depends on a RISC-V toolchain. I think this is a bit difficult, because every time you change something in the bootloader, you have to synthesize your bitstream again, and this can become pretty cumbersome. LiteX then generates a large Verilog file for the SoC, copies the Verilog file for the CPU core into a build directory, and further creates a script that describes how to synthesize the Verilog into a bitstream. 
Having the generated Verilog at hand is very helpful, because you are able to compare and read the code before you pass it on to the synthesis toolchain, and you can also inspect it for any suspicious Verilog statements. LiteX then calls the build script, which runs Yosys and nextpnr. On the one hand, the separate script is nice, because this allows using different toolchains or testing the bitstream with different toolchain versions, but I'm not sure if this shouldn't have been even more separate and not buried inside the board file. And regarding the bitstream loading: maybe this shouldn't be part of LiteX at all, but remember my previous comments regarding this. So far we only followed steps that other people have done and documented before. However, originally we wanted to find out if we can use the tools to build FPGA bitstreams for at least small production systems. Let's address these questions again. What about collaboration between different colleagues? There are various host tools that have to be installed. You can get them from your package manager or build them yourself. As there is no guarantee that someone on a different host system uses the same tool versions, handing your work over to a colleague might or might not work. I've seen the latter. What about reproducibility of older system states? LiteX usually updates all sub-modules at once to the latest Git state, and LiteX will update itself whenever the script is run to update the sub-modules. Returning to an older known-good state is at least difficult and needs some extra scripts, given that you even know what your known-good state actually was. Is it possible to handle multiple projects with different versions? LiteX installs itself into the system when preparing the environment. This means there is a single version, at least for all projects of a user. If you have multiple projects, for example two different SoCs with a VexRiscv and a Rocket CPU, using a different LiteX version for each project is not trivial. 
What about continuous integration? As LiteX always pulls the latest version during initialization, the point in time when the job is running is relevant: it determines the versions that are used for the job, and rerunning a previously successful job without any local changes might now result in a failure, because external dependencies have changed. Now, none of these are new problems. What can we do about them? There is a very similar problem when creating Linux images for embedded systems. The images contain a lot of different software packages, which have dependencies on each other, might depend on certain versions of other packages, or might even break with updates of the host and build tools. On the other hand, you always want to have a reproducible system and want to be able to return to earlier states. Therefore, for creating a Linux image, you would use a build system like PTXdist or Yocto, which helps you with these tasks and ensures that you know what you are building or have built in the past. Yocto uses build recipes for host tools and target packages. A recipe defines exactly which version and configuration is used. Everything, including the toolchain, is tracked in a Git repository, giving you the opportunity to tell exactly what you are building and to rebuild it later again. If we could use Yocto to synthesize the FPGA bitstream, we would at least be able to record what we are building and pass it on to co-workers or a CI server. Of course, we are not the first with this idea. Nathan Rossi has already created a meta layer for Yocto, which provides the recipes for the various tools required for building FPGA bitstreams. The layer is called meta-hdl. We find recipes for Yosys, nextpnr, and Trellis. There is even already support for LiteX and Linux-on-LiteX VexRiscv. We include meta-hdl into our Yocto BSP and start hacking, and it works quite well. 
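To illustrate how a recipe pins a version, here is a minimal sketch of what a Yosys-style tool recipe could look like. This is not taken from meta-hdl; the revision hash and all details are made up for illustration.

```bitbake
SUMMARY = "Yosys open synthesis suite (host tool)"
LICENSE = "ISC"

# Pin the exact Git revision, so every build of this BSP
# uses the same sources, now and in the future
SRC_URI = "git://github.com/YosysHQ/yosys.git;branch=master;protocol=https"
SRCREV = "0123456789abcdef0123456789abcdef01234567"

S = "${WORKDIR}/git"
inherit native
```

The key point is the explicit SRCREV: the recipe, and with it the exact tool version, lives in the Git history of the BSP.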
And using the layer really feels like using standard Yocto, except there are some additional images for the bitstream. Something that I find a bit unfortunate is how meta-hdl handles versions. There is a lot of ongoing development on the FPGA tools, and you usually need really recent versions. Therefore, there is an update script in the repository, which updates all recipes to the respective latest Git master. This is only slightly better than what is going on in the litex_setup.py script. At least you now know exactly which versions are used, but updating all tools at once again leads to fairly arbitrary development states of the tools. I would rather see separate updates, for example of the toolchain and LiteX, to see which tool is responsible for possible regressions. In order to locally adjust the images, and also add a new image for Linux-on-LiteX Rocket, we added another meta layer on top of meta-hdl: meta-ptx-fpga. We have a recipe that translates the instructions from the linux-on-litex-rocket readme into Yocto build instructions. It builds the bitstream in the Yocto environment. We also have a few fixes for LiteX in this meta layer, but we are planning to bring the fixes to LiteX upstream, and have already started to do so. The meta-ptx-fpga layer adds a mechanism to use machine configurations to switch the images between ECPIX-5 VexRiscv and ECPIX-5 Rocket. Therefore, from a user point of view, you build the same image, but under the hood it will either build a VexRiscv or a Rocket CPU bitstream. I'm envisioning something like bitbake core-image-minimal with only the machine set to the ECPIX-5 VexRiscv or the ECPIX-5 Rocket machine. This is not completely working yet due to a few hiccups in the meta-ptx-fpga layer, which I simply didn't have time to address. Either command builds the bitstream, device tree, kernel and root filesystem for the respective machine. 
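The envisioned workflow would look something like this. The machine names here are placeholders I made up; the actual names are defined by the machine configurations in the meta-ptx-fpga layer.

```shell
# Build the bitstream, device tree, kernel and root filesystem
# for the VexRiscv variant of the ECPIX-5...
MACHINE=ecpix5-vexriscv bitbake core-image-minimal

# ...and the same image for the Rocket variant
MACHINE=ecpix5-rocket bitbake core-image-minimal
```

From the user's point of view both commands are identical apart from the machine; the CPU-specific details are hidden in the machine configuration.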
After building, you take the bitstream from the images directory and throw it at the hardware, then take the Linux image and throw it at the synthesized core. Now you can see how Linux boots up and works pretty similarly on both machines. And you can give it to a colleague who can do the same. As said, there are a few hiccups left, but currently I'm able to build a functional, booting bitstream for both machines, which is nice. We are already using Jenkins to build and test Yocto BSPs on a regular basis for software development. By integrating LiteX and the bitstream build process into the standard Yocto flow, we are able to build the Yocto BSP whenever changes are pushed into the repository. This allows us to detect any breaking changes early and remind developers when they have local changes that are not included and reproducible in the BSP. As we faced more serious issues, where the bitstream was not able to start on the hardware, we would also like to have runtime tests that load the bitstream onto the actual hardware that we already have in our remote labs and verify that the bitstream starts and can boot Linux. However, there are still a few things missing to actually make this work, but that's definitely something that we want to have. This brings me to further open issues that we want to address now that we have working bitstreams in the meta-ptx-fpga layer. We are currently using a relatively old version of LiteX in Yocto. If we switch to the latest LiteX to build the bitstreams, the bitstreams for the Rocket CPU and the VexRiscv do not boot. However, I can build a booting bitstream on my local machine. Works for me. I'm suspecting the RISC-V toolchain, but I need to debug this further. If I build the VexRiscv with multiple cores, the bitstream does not start. I guess I have to dig deeper into the bitstream for that, but I didn't have time to look at it yet. The Linux image for the Rocket CPU only works with a busybox-based root filesystem. 
As soon as I try to boot a proper root filesystem with an actual init system, like the one I'm using on the VexRiscv, Linux starts to boot, but hangs early in userspace. I was glad to have a console with busybox, so I did not investigate this further. Differences like this, however, are now explicitly visible and documented in the machine configs in the meta-ptx-fpga layer. There is a circular dependency in the Linux image for the Rocket CPU, because the full root filesystem is built into the kernel. I didn't care for now, because eventually I want to separate the root filesystem from the kernel anyway. And as a convenience, I would like to add a virtual package for the bitstream to be able to do something like bitbake bitstream. This shouldn't be hard, but I didn't have time yet. However, apart from these relatively simple issues, there are still some more fundamental questions. The CPU cores are generated from SpinalHDL or Chisel with SBT. The results are cached or checked in to the Linux-on-LiteX repositories for the respective cores. If a core is missing, LiteX will generate the core on demand. This requires Scala, SBT, Java, et cetera. There is no support for that in Yocto meta layers yet. That means that the CPU cores have to be generated outside of Yocto and checked in to some repository, which basically circumvents all the effort we put into having something reproducible with Yocto. Maybe having the generated code checked in to the repository is good enough, but I'm not sure about it. Maybe we want to always generate the cores during the build as well. Are the results using the different SoCs actually comparable? While the board is the same, LiteX might configure different SoCs and peripherals depending on the CPU core that is selected. For example, the baud rate of the serial console differs between the different configurations. And we are using a different Linux userspace for the different cores. 
Thus, while building and starting the different configurations looks the same from a user perspective, they might be significantly different under the hood, and not all of these differences will be obvious. This raises the question whether we are actually able to compare the results of the different CPU configurations, or whether all we gain is knowing that we are able to switch the underlying architecture without really changing the user interface. Finally, it seems that a lot of the issues that we are facing are related to LiteX and its mix-up of configuration and build system. This raises the question whether there are other build systems that help to build SoCs. Let's have a quick look at what we found and why we don't use it. The VexRiscv repository contains the Briey and Murax SoCs implemented in SpinalHDL. These SoCs do not have actual board support, but are just a more complete CPU. While the Verilog could be extended to add the board integration, we would also need a DRAM controller and everything else. There is an example for the iCE40, but not the ECP5. Furthermore, these SoCs are specific to the VexRiscv core, and integrating the Rocket CPU into SpinalHDL does not seem to be really reasonable. Chipyard looks interesting, because the configuration and the build tooling are relatively separated. However, the build tooling does not seem to support Yosys and the open source toolchain yet. Chipyard claims that new toolchains can be added with a few hundred lines of code, but I did not look into that further. Also, it seems that Chipyard only supports Chisel; therefore, integrating designs in other languages does not seem to be possible. There is FuseSoC, which is a package manager for IP cores and a build tool. It defines a package format for IP cores and the build instructions for each core. This looks interesting from a scope perspective, but it does not solve the problem with host tool versions. 
PULP is another platform to configure and build SoCs with RISC-V clusters. As far as I can tell, it is highly integrated with the Xilinx Vivado tools and cannot be used with the open source toolchain. We are using RISC-V cores running Linux in an FPGA to evaluate the tooling to synthesize bitstreams. I'm wondering if there are any real-world use cases for running Linux on a soft core CPU in an FPGA, except for bring-up, experimentation and testing. If we remember the use cases for FPGAs from the beginning, we had high data throughput, real-time requirements, and parallelism. By using a CPU in an FPGA to run Linux, we basically void all these advantages. In the end, if your use case wants Linux on a real CPU, why not just buy actual silicon? Having a CPU with Linux in an FPGA is cool, but if you have any real-world use cases for a CPU in an FPGA, I would really like to hear from you. In summary, we now have working bitstreams for the VexRiscv and the Rocket CPU, which are built by Yocto in a Jenkins CI environment. Now, what are the next steps that we want to tackle? First, we are still using a relatively old version of LiteX, from March 2021. We would really like to follow the most recent versions of LiteX, but we are not entirely sure why the latest version does not work. It seems to be related to the RISC-V toolchain that is used by Yocto, which is much more recent than the previous toolchains in the tutorials. But so far, we have not been able to track down the issue. Second, we would like to have runtime tests in our CI infrastructure to not only test the build, but actually verify that the built bitstreams and Linux systems are at least booting. As said, for that, a few things in our infrastructure and in the Yocto build are still missing. Third, the Linux-on-LiteX VexRiscv bitstream is only working if the SoC is configured with a single CPU. As soon as multiple CPUs are enabled, booting the bitstream does not work. 
We are looking into this, because we would really like to have the VexRiscv as a multi-core option. Fourth, and that's what we originally set out as a goal: we want to write and integrate custom IP cores into the SoC. Up until now, we are still reproducing stuff that is already supposed to work, using existing configurations. Since we chose the FPGA to be able to use custom hardware, it is not sufficient to just use what is already out there; we need to be able to customize it. However, we don't have any experience with how well this works yet. I'd like to thank you for your interest in my talk. If there are any questions left, or if you have any comments or want to share your experience with LiteX and FPGAs, feel free to discuss them in the chat or send me an email. I wish you all an engaging remaining conference and I'm looking forward to further interesting talks and discussions.