He is a researcher in the embedded security group of the Horst Görtz Institute at Ruhr University Bochum, is also associated with the Max Planck Institute, and will speak about HAL, the open source hardware analyzer. The talk dives into the foundations of hardware reverse engineering and the netlist analysis framework HAL. He will introduce us to the foundational levels and the challenges of hardware reverse engineering and discuss what research can be done with the whole framework that he and his team are developing. Please welcome him with a round of applause.
Yeah, welcome also from my side. Welcome to all the people that are here this early, and also welcome to everyone at home watching the stream. So how to start the final day of C3? I thought, let's start with a quote: "Trust me, I'm an engineer." It's a very famous quote by basically every engineer. And if we look at what's out there, at the real world, the quote is proven right. We see countless examples of great engineering in the wild, and these things make us wonder. They are great. All these examples of engineering cover so many areas, but for all of us here, technology is one of the main focuses of engineering that we are looking at. And if you think about it, our current technology level, all of the stuff we have experienced here over the past three days, and the whole world in its current state are driven by technology, and this is the result of years of engineering. So if I was asked to define engineering, I would say engineering means turning a high-level description into an actual thing. This is a very hand-wavy definition, but I think it captures quite well what an engineer does at the core. Humanity is driven by curiosity. We want to understand things, especially if they are fascinating. So how do we understand the complex works of engineers?
You don't become an engineer just by watching a few YouTube videos. Years of training go in there. With all this expertise involved, how can we understand this, how can we learn from it? The simplest answer is by talking. We ask engineers, we talk to the teams that make all these things. But what if no one is there to teach us, or no one wants to, or no one is allowed to teach us? Well, then we have to take a look at this ourselves. So if the way from the high-level description to the thing is engineering, we are interested in the way back, and this is called reverse engineering. To put this in terms of our definitions, reverse engineering is analyzing a thing to recreate a high-level description. And it's important that it says "a" high-level description, not "the" high-level description, as we are not always interested in the exact high-level representation that resulted in the thing we are analyzing, but maybe just an approximation or something equivalent. And if you think about it, we have all done some form of reverse engineering in our daily lives. Think about your car: if you have an engine failure, you have to look at the engine, and you most likely did not build it. So you have to understand how it works, and this is already reverse engineering. Or, even more abstract, if you eat a delicious cookie, you want to know how the cook made this cookie. What are the ingredients? What was the recipe? So you go from the thing to the high-level description. But this presentation is not about cookbooks, it's about hardware. More specifically, I want to talk about chips, all these microchips that you find in your smartphones, your gaming PCs, and so on. But a more fundamental question is: why would anyone even want to do this? Why would you want to look at chips and how they are made? Well, curiosity, of course. But you might also want to understand or locate errors.
Just like with the engine, if this is a chip that has already been produced and has passed all the tests, then you might have to take a look inside to find errors. Also, from a legal perspective, you might want to detect whether someone used a design that you developed without paying the license fees. So you want to detect patent infringements. These are all benign use cases. But you can also take a different approach. Of course, you might want to detect trojans that were inserted into the circuit. On the other hand, you might also want to insert trojans into the design. And on a much wider scale, you might want to just clone or steal intellectual property. For this, too, you have to do reverse engineering. This is also something that came up in the media over the past years. For example, you might remember "The Big Hack", a story that was published in Bloomberg Businessweek. Just as a disclaimer, these were allegations. Nothing of this has been proven true. There has been a lot of discussion, but no definite statement in the end. What they claimed to have found is that in server hardware of the company Supermicro that was used by Amazon and Apple, there was a hardware implant, an additional tiny chip on the server board that allowed for backdoor access or remote access. In their image, it's located in this white circle. This tiny thing was the hardware backdoor. And they claimed that it was inserted during fabrication, so while putting all the parts together on the board in China. The interesting thing is that they claimed this chip looked like a passive component. So it didn't look like a chip. It was camouflaged, you could say, to look like something passive, but it was actually actively interfering with communications. And then we also have the ban by the US government on Chinese telecommunication equipment, so the whole Huawei debate, which is still ongoing these days.
So, in this talk, I first want to give you an introduction to the basics of reverse engineering, but also to the basics of hardware engineering. To understand how we can go in the backward direction, we have to take a look at the forward direction as well. Then I want to talk about HAL, our hardware analyzer, and the research that we do with HAL at the institute and at the university. But as a disclaimer, this is a foundations talk. I do not, and I cannot, make you experts in such a short period of time, but I want to give you a high-level overview so that you get the big picture, so that you can talk to others, and so that you understand the underlying principles and difficulties of hardware reverse engineering. So, let's start with engineering, with hardware design. Since I guess that most of you know your way around software pretty well, but hardware is most of the time kind of difficult, I will try to make analogies to software as often as possible to make it more understandable for you. In this talk, I will focus on two kinds of chips, two kinds of integrated circuits. The first kind are ASICs, application-specific integrated circuits. The second kind are field-programmable gate arrays, FPGAs. An ASIC is basically a classical integrated circuit. This is what you think of when you hear "chip". It's optimized for a dedicated purpose, think about a sensor or something. The logic in it is implemented through Boolean gates and flip-flops. There are some other elements, of course, but these are the core elements that comprise an ASIC. An FPGA is reprogrammable hardware. The ASIC is fixed. Once it's built, it does what it does, and you cannot change it anymore. Of course, the hardware in the FPGA itself is also fixed, but you can reprogram the FPGA. It is optimized to offer you a high degree of freedom for reconfigurability. And how is this done?
Well, the main elements that are reprogrammable, so accessible by you, are lookup tables and routing elements. You can now think about how to implement logic on this hardware. You take your logical functions, you put them into the lookup tables, and then you connect the inputs and outputs of these lookup tables via the routing elements as you like. This way, you get reprogrammable hardware. So you have hardware that is kind of dynamic: you can change it, and you get all the benefits of programming together with the parallelism of hardware. For both of these architectures, we have to start somewhere when we want to design them. And this start is kind of comparable to a programming language. It's a hardware description language, an HDL. Just as an example, it looks like this, but there are many hardware description languages out there. The code that you write describes the hardware you want to have in the end. This short example here describes this circuit diagram, a full adder. Now, if you think about software again, the compiler that you use has to work with a specific instruction set. So the compiler has to break your code down to the instructions that your processor provides. In hardware, we have something similar, and these are called gate libraries. Such a gate library simply contains all the elements that you can use. For an FPGA, it contains all the programmable elements, so you know what you can work with. For ASICs, it describes whatever can be built from transistors, because transistors are the underlying elements that everything is built from. Another difference from software is that software is compiled, most of the time, but HDL gets synthesized. And this synthesis process can be broken down into three major steps.
The first one is architectural mapping, where you take your description and map it onto the elements of the gate library, onto what you actually have. The second step is placement: you put all these elements on the area that you have. And then there is routing: you connect all these elements in the most efficient way. The result of this synthesis process is called a netlist. This netlist is the central element that we will focus on today. So, to put this in a picture, you start with the HDL code, you run through synthesis, and in the end, you get your netlist. And what is this netlist? Well, think back to physics class. The circuit diagrams you all had to draw are basically a visual representation of a netlist. Now just replace the light bulb and the battery with digital design elements, and basically you are done. The netlist contains the whole design. It contains all elements that we use, all gates, flip-flops, and so on, the interconnections, and additional information, be it the position of the elements, the drive strengths of gates, and so on. So, again, to make the analogy to software, this netlist is somewhat comparable to assembly in software. Now, let's take a look at our big picture. We know how we get from HDL to the netlist through synthesis, but how do we get to FPGAs or ASICs? How do we end up there? Let's start with FPGAs. As I said, the FPGA is reconfigurable, and we now want to get our netlist into the FPGA. This is actually kind of straightforward. We take this netlist and translate it into a configuration, which is called a bitstream for FPGAs. This configuration simply maps all of the elements in our netlist to the FPGA elements. Then we put this bitstream onto the FPGA, it configures itself, and we are set. Perfect. So, now let's take a look at ASICs. Here, the story is quite different.
If we just take a look at this circuit diagram, we immediately encounter a problem. How do we manufacture this? We have a crossing of wires. If we would just lay them out and build it like this, we would have a short circuit. So the solution here is to work with multiple layers. This picture shows it quite well. You put all your logic, so all your computational elements and storage elements, on the lowest layer. All the layers above you only use for routing, to connect them. You can see the metal lines up here. You can think of these as the wires on a single layer. And those orange lines are called vias. Vias interconnect layers. This is what we call a layout. And this layout, which represents our netlist, the design that we want to have, is sent to a foundry for fabrication. So, in our big picture, we now know how we can get from HDL over the netlist to either an FPGA or an ASIC. After this brief overview of hardware design, of hardware engineering, we can now take a look at reverse engineering. But again, as always, we take a look at software first. In software, you usually start with a binary. Or if you don't have it, you have to extract it from some kind of memory. So you have an extraction phase that is sometimes very short. Then you start by going back to the assembly. Maybe this is enough. But maybe you want to go a step further and partially recover the high-level language this was written in. So you have an analysis step. We basically find these steps in hardware as well, but they look a little different. In our big picture, let's start with FPGAs first. As I said, the FPGA is configured using the bitstream. The bitstream is somehow put in there, but how do we get it out? Well, it turns out that in reality it looks more like this: an FPGA is configured through an external flash memory.
This is true for the majority of FPGAs. So we actually don't have to bother with the FPGA at all. We just have to get the contents of the flash memory out, and then we have the bitstream. But how do we get from the bitstream to the netlist? Well, as I said earlier, the bitstream is just a mapping of the netlist elements to the FPGA. So it's a different representation. If we understand the format of the bitstream, we can just translate between the representations. So we can go from the bitstream back to our netlist, and we are done. Nice. What remains is the ASIC: how do we get from the ASIC to the netlist? This is a different story. We know the forward engineering way: we have to get to the layout, we send it to a foundry, and the foundry then delivers us the ASIC somehow. But how do we get back? We don't go back to the foundry. If I say "chip", this is what comes to the mind of most people. But this is not what actually computes. If we take some ASIC, for example, and remove the so-called packaging that we have here, this is what we see. This is the so-called die, and the die is the thing that actually computes. The die is also the thing that gets fabricated by the foundry. Packaging is an extra step. You can see these tiny golden wires that reach out from the I/O pads of the die. They are connected to the silver pins that you see on the package and that you can use for soldering. So the actual die, the heart of the package, is what we are interested in reverse engineering. And if you think about it, this die is exactly what we specified with our layout files. So how do we get from this physical thing that we have in our hands to this layout, or maybe immediately a step further? The solution is quite simple. We take sandpaper and a camera. It's actually not that simple, but the process can be best explained that way.
You take your die and you take a picture of the top layer of the die. Then you combine a set of chemical or even mechanical etching processes to get rid of the first layer. You take another picture, now of the second layer. You also want to take a picture of the vias in between. Then you repeat this for all of the layers of your circuit, of your design. In the end, you have a set of images with one image for each layer. But it's not that simple, because at current technology sizes, optical microscopy is not sufficient anymore. So we make use of, for example, an SEM, a scanning electron microscope, and these can only scan a very, very small area at a time. So, actually, for just a single layer, you get thousands of pictures, and the SEM always has to move its focus to the next position. So, in the end, you do not end up with a few pictures of the layers. Modern chips have something like 10 or 14 layers, or even more. You end up with thousands of images that you now have to put together. You have to stitch them. You have to align them. And once you have a perfect image of each layer, you also have to align the layer images on top of each other. So, to put this in more formal terms than sandpaper and a camera, the first step is decapsulation: get the die out of its package. The second step is delayering, and this is interleaved with imaging. So it's always image, delayer, image, delayer, image, delayer. In the end, we put all of these images together in some kind of post-processing. And if we then analyze this final image stack, we finally get to our netlist. So, in our big picture, getting from the FPGA content, so from the bitstream, to the netlist is done through format translation. And getting from an ASIC to the netlist involves a series of quite complex steps. So you still might think now: okay, I think I've got this.
I mean, all the steps seem quite clear, although I summarized a lot. Where's the catch? I now want to highlight a few of the hard problems we have here. The first problem is that these steps from ASIC to netlist are very complex. They take a lot of time. They take a lot of expertise. They are costly: the equipment itself is very costly, and so are the engineers that have to work with it. So this is a major, major step. Second problem: the gate libraries for ASICs and the bitstream formats for FPGAs are proprietary most of the time. So we don't actually know what all of this stuff maps to or what it does. And the third problem: if I say netlist and, as I said, circuit diagram, you might think about something like this. But a real netlist looks more like this. And to be fair, this is a very, very small netlist. It just shows an AES circuit, roughly 6,000 elements. Also, the wires you see are aggregated. One of these wires might transfer 8 bits, but in reality, you would have 8 separate wires in this place. And this is a mess. How do you work with this? How do you cope with these problems? But let's take a look at the problems again. For the first problem, specialized companies exist. There are companies that do all of these steps for you. There is also ongoing research on how to do this better, faster, and more efficiently. So there is a lot of stuff going on here. Also, gate libraries of ASICs can be reverse engineered on the fly, because the gate library specifies the small things that we can use, an AND gate, for example. If you think about it, we just look at a few transistors, figure out what they do, and then use pattern matching to find duplicates in our images. Then we can get this quite efficiently on the fly.
And for bitstream formats, there is actually a lot of ongoing work that tackles this, that looks at how we can recover the bitstream format, for interoperability, for example. So the remaining open problem is how to reverse engineer a netlist. This is also the core question that we tackle at our university. We don't really tackle the questions that came before. We don't look at how to get from an ASIC to a netlist. We look at how to reverse engineer such a netlist, how to understand it. And if you remember, a netlist is kind of comparable to assembly. So let's again take a look at software, where we know how to handle assembly. We have powerful tools for binary analysis. We have IDA Pro, for example, or Ghidra. We also have open source tools and frameworks that can be extended by plugins and all that stuff. So we have a very rich environment. If I have to put this in a picture, it would look like this. We have a lot of options and a lot of people that are going there and trying things out. Well, this is the situation in hardware, and it looks quite bad. There is no such tool available. There's no IDA, there's no Ghidra, and also very few people are actually doing hardware reverse engineering, because they think that the entry barrier is very high. But luckily, there is hope. Roughly four years ago, at the beautiful Ruhr University Bochum, in the ID building, we started a project that is now continued at the Max Planck Institute for Cybersecurity and Privacy in Bochum. My former colleague Marc Fyrbiak took a look at the state of hardware reversing research, of netlist analysis research. And what he found was, well, that a lot of the stuff is not reproducible. There are no scripts given for the algorithms, there's no good evaluation. Or if there is something, it often immediately crashed when we put our own netlist in there. So we had to conclude that research in this area has taken quite a hit.
So what he wanted was an IDA Pro for netlist reverse engineering: an open framework with the ultimate goal to aid researchers and professionals in their everyday tasks. Everyone should benefit from this. And this is HAL. HAL is our open source framework, as I said, for netlist analysis. It allows for automated and manual inspection of netlists of all kinds. So both ASICs and FPGAs can be analyzed here. It's written in modern C++17 and it offers support for custom plugins. So everyone can extend this framework with their own ideas, new algorithms, and so on. It also offers a graphical user interface. So what's the idea behind HAL? Well, the main idea is that we can represent a netlist as a graph, as a multi-digraph to be specific. A graph, in the sense of graph theory, consists of a set of vertices and a set of edges. And if you think about it, these edges that connect vertices are just like the nets and wires that connect netlist elements. So, actually, it's not difficult to map a netlist to such a graph structure. This has the immediate benefit that we can use the algorithms from graph theory. They are well researched. There's a lot of thought in there. This is also not something that we came up with. Graph representation is basically the most common step in all of the works on netlist analysis. So how does HAL work? Well, at first, you have to, of course, present it with a netlist. HAL will parse this netlist into the graph representation. Then the user can interact with this graph either through the graphical user interface or via an included Python shell. You can also run HAL as a standalone command line tool for large-scale analysis. As I said, you also have the opportunity to load plugins and then execute these plugins on the graph. And, of course, all of this can generate output. You can get reports.
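The graph idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy full-adder netlist, its gate and net names, and the two helper functions are hypothetical and are not HAL's data model, file format, or API. Gates become vertices, and every net from a driving gate to a receiving gate becomes a directed edge; edges are kept in a list so that parallel edges between the same two gates would be preserved.

```python
from collections import defaultdict

# Hypothetical toy netlist of a full adder:
# gate name -> (gate type, input nets, output net).
NETLIST = {
    "x1": ("XOR", ["a", "b"], "n1"),
    "x2": ("XOR", ["n1", "cin"], "sum"),
    "a1": ("AND", ["a", "b"], "n2"),
    "a2": ("AND", ["n1", "cin"], "n3"),
    "o1": ("OR", ["n2", "n3"], "cout"),
}


def build_multidigraph(netlist):
    """Return a list of directed edges (driver gate, net, sink gate)."""
    # Map every net to the gate that drives it.
    driver = {out: gate for gate, (_, _, out) in netlist.items()}
    edges = []
    for gate, (_, inputs, _) in netlist.items():
        for net in inputs:
            if net in driver:  # primary inputs such as "a" have no driver gate
                edges.append((driver[net], net, gate))
    return edges


def successors(edges):
    """Collapse the edge list into gate -> set of directly driven gates."""
    succ = defaultdict(set)
    for src, _, dst in edges:
        succ[src].add(dst)
    return succ


edges = build_multidigraph(NETLIST)
succ = successors(edges)
for src, net, dst in edges:
    print(f"{src} --{net}--> {dst}")
```

Once the netlist is in this form, standard graph algorithms (reachability, strongly connected components, and so on) can be run on `succ` directly.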
You can also modify the graph structure and then write your modified netlist back out. And since you might not work alone on the project, you can always generate snapshots and share them with other researchers, and they can continue or join in on the work. So, let's first take a look at HAL. I will show you HAL in action later, but for the moment, let's just take a look at a picture so I can zoom in easily. This is the main view you get when you open HAL. Let's focus on the left side first. Here, the first thing you notice is the graph view. I have opened a very, very small netlist here, and you can see a few of the elements in there and their interconnections. Now let's suppose that we want to get more information on, for example, this element here. You click on it, and HAL will give you a detailed overview of what this actually is. It displays all the associated information. For example, you can see here that the type is a LUT6, so a lookup table with six inputs. You can see the Boolean function that it implements. You can see the nets that are connected to its inputs and output, and all kinds of associated information. This is also true for all the other kinds of elements that you have in there. You might also have noticed that there are colors, and these colors are not meaningless. HAL is made to help you modularize your netlist. If you think of the image that I showed you when I said what you think a netlist looks like, you think that you have separate modules that interact with each other. This is also how you write or design your hardware in HDL. But after synthesis, this is not visible anymore. So one of the overarching goals of many reverse engineering works is to try to recover some of that modularity. And so HAL allows you to create these modules. You see in purple a module that has been unfolded, and in green here, a module that has been folded.
And now, if I want to, I can simply inspect this module and the included parts in isolation. So I can get all the distracting stuff out of the way and focus on the part that I'm interested in. To help you navigate, we added a tree structure of the modules, because they are hierarchical. And, of course, you can have multiple views open on different parts of the netlist, in different isolations, and you can work your way around. On the right side of the interface, we have the Python environment, and this is a super powerful tool. You have a Python editor window right at the top, and below, you have a console. They both interact directly with the core of HAL, so with all the included gates and elements. So you can do an explorative analysis. You don't have to close all the stuff, think about it, write stuff down, and implement a new C++ plugin. You can do all of this live and exploratively through this Python interface. Our vision for HAL is basically that you take your netlist, you put it into HAL, you execute plugins that you or others wrote, and you invest your own critical thinking. Then you might actually get from this mess of gates to some kind of modular representation. And this is what we designed the graphical user interface for. So, to now actually take a look at HAL, I want to show you two examples where we used HAL to actually improve the state of research. The first example I want to show you is a paper on the difficulty of FSM-based hardware obfuscation. So what's obfuscation? Imagine you have something like this, and you turn it into this. This is obfuscation. It takes something that is comprehensible and turns it into something that has the same functionality, but is way less comprehensible.
To put this in our big picture: if we say, very hand-wavy, that the engineering way is easy and the way back is not that easy, then by putting obfuscation into the game, we want to make the way back, the reverse engineering, insanely difficult without making the engineering any worse. What we take a look at now, and what was analyzed in the paper, are finite state machines, FSMs. These are your central control elements in digital hardware. Most of you have probably seen a state machine. It basically works like this: you have a set of states, and upon special events or conditions, you transition from your current state to a different state that controls other parts. Think about an ATM that is in a state where it waits for your card and then transitions to the next state where it waits for your input or something. In this paper, among other schemes, the authors took a look at the Harpoon FSM obfuscation scheme. Harpoon had the idea that we take the original FSM, in blue, and we add dummy states and dummy transitions in front of this original FSM. They point to each other, and there are a lot of transitions that form loops. So if you do not know the one perfect series of transitions that gets you to the original starting state, you will be trapped in this dummy logic, and a reverse engineer will also be massively held back because they have to analyze all of these dummy states. We will now take a look at how we can reverse or break Harpoon. There are basically three steps. The first step is that we have to find the obfuscated FSM, because, think about it, we have this massive sea of gates. As soon as we have found the obfuscated FSM, the approach is to recover the state transition graph, so what you just saw, the circles with the arrows. And then maybe by just analyzing the state transition graph, we can already break Harpoon. So let's talk about the first step. How do we find the obfuscated FSMs?
Well, state machines have a self-updating structure, because the next state always depends on the current state. The storage elements that store the state therefore influence each other. So every storage element has an influence on all the other storage elements. If I were to draw this in a picture, it would look like this. You have your state memory, which has a feedback loop through some transition logic that may depend on external inputs. And then, of course, you have some output logic that controls the rest of your circuit. If you think about it from a math point of view, this looks like a strongly connected component. This is a term from graph theory, which basically says that from every element in this component, you can reach every other element. So a good idea might be to look for these strongly connected components and see what we find. Then, for the next step, recovering the state transition graph, we have to take a look at the candidates we have. Basically, we don't care about the output logic. We don't care about what it controls. We want to get the state transition graph. So we also don't care about the state memory. Instead, we take the state that we are currently looking at. The initial state we can get from the reset behavior of the memory, or on FPGAs from an initialization vector. Then we brute-force all the remaining inputs and compute the transition logic. This way, we get all the states that can be reached from our current state. All these states we put into our queue. And if we process the queue to the end, we will have all of the states that are reachable by this finite state machine. The final step is to analyze the state transition graph. We will now take a look at how this looks in HAL. All right. I've prepared a small netlist here for you. As you might be able to see at the bottom, it is quite small. It has 128 elements.
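The first two steps described above, finding strongly connected components and then recovering the state transition graph with a queue, can be sketched in plain Python. This is a hedged illustration, not the code from the paper or from the HAL plugin: the toy gate graph, the names `ff0`, `lut0`, and so on, and the transition table `TOY_TABLE` are all made-up assumptions. The SCC search uses Tarjan's algorithm as one standard choice.

```python
from collections import deque
from itertools import product


def strongly_connected_components(succ):
    """Tarjan's algorithm; succ maps a node to a list of successor nodes."""
    index, low, on_stack, stack, sccs = {}, {}, set(), [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in succ.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            sccs.append(component)

    for v in list(succ):
        if v not in index:
            visit(v)
    return sccs


def recover_stg(next_state, initial_state, num_inputs):
    """Explore all reachable states by brute-forcing every input combination."""
    transitions = {}
    queue = deque([initial_state])
    while queue:
        state = queue.popleft()
        if state in transitions:
            continue
        reached = {next_state(state, bits)
                   for bits in product((0, 1), repeat=num_inputs)}
        transitions[state] = reached
        queue.extend(s for s in reached if s not in transitions)
    return transitions


# Hypothetical gate graph: two flip-flops and one LUT form a feedback loop.
GATES = {
    "in_buf": ["lut0"],
    "ff0": ["lut0"],
    "lut0": ["ff0", "ff1"],
    "ff1": ["lut0", "out_buf"],
}
candidates = [c for c in strongly_connected_components(GATES) if len(c) > 1]

# Hypothetical transition logic: (state, input bit) -> next state.
TOY_TABLE = {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 2,
             (2, 0): 3, (2, 1): 3, (3, 0): 2, (3, 1): 2}
stg = recover_stg(lambda s, bits: TOY_TABLE[(s, bits[0])], 0, 1)
print(candidates, stg)
```

Brute-forcing all inputs is exponential in the number of external inputs, so this only works directly for small transition logic; the recursion in `visit` would also need to be made iterative for large netlists.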
If I now go into the netlist, you can see that you basically see nothing. You see a lot of elements, but you have no idea what they mean. How do they interact? How can I work with this? So I prepared a short Python script here. What this basically does is group all of the input buffers and output buffers into custom modules. We don't want to look at them. Then it runs the strongly connected component algorithm that we supply from an external graph algorithm plugin. If I run this code now, you can see that we get a much more cleaned-up view of our netlist. You see here at the top, we have a module that has a lot of outputs and a module that has a lot of inputs. These are the input and output buffers. We are not interested in them. And here at the bottom, we have two different modules. We have a large one. Let's look into this first. We see, okay, it's really large. It has a lot of elements. And if you've done hardware design before, you know that state machines are typically very small, as you can encode the states very densely. So let's go back and take a look at the other module. This actually looks kind of nice. It looks very small. So let's first verify that we might actually be looking at something that is of interest here. I will select this flip-flop I have here and isolate it in a new view. So this is just our flip-flop in isolation. I can now use the arrow keys to navigate to the output pin and basically follow the output pin to everything that is connected. In my list here, I immediately see a lookup table that is also in the same module, which is candidate 2 here. And if I follow this, I can immediately see a small feedback loop. I see that my storage element, my flip-flop on the left, reaches the lookup table, and the lookup table feeds back into the flip-flop. So let's take a look at what else this connects to in the same module.
And I find two more lookup tables here. If I now take a look at what these lookup tables connect to, I immediately find candidates that are in the same module: the remaining flip-flops. If I move this around now, we can see a perfect feedback structure for all of the elements that we have here. If I select all the outgoing edges, you see they go back to the storage elements. And if I select the elements in between, you see they all reach the logic in between. So, we actually have a strongly connected component here.
Now, let me run the script for the second part, the recovery of the state transition graph. This is actually the whole brute-force code that you see here. If I run this, we get this output. This is now the state transition graph that was recovered just from looking at the transition logic. You can see here at the top, this is our initial state, our starting state. We have a lot of transitions to other states that go back and forth. We have loops. But then here, we have a single transition. And as soon as we take this transition, we never go back to one of the prior states. We always stay below here. So, we can directly conclude, just from looking at this, that this upper part here will most likely be the HARPOON-obfuscated part, and this bottom part here is our original FSM. So, with this short analysis, we have basically broken HARPOON in this example.
Just as a disclaimer at this point: it's not that easy in reality. State machines are not necessarily much more complex or bigger, but they are not easily found just by looking for strongly connected components. In the paper, we have described a lot more techniques and metrics for finding them more reliably. And, yeah, if you're interested, you are welcome to look into our research results.
So, the second one I want to show you, and it's very short, is scan-based reverse engineering.
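Before moving on, an editorial sketch of the state machine analysis just shown may help: the queue-based brute-force exploration and the search for the transition with no way back. Everything here is a hypothetical toy: the six-state transition table and the single input bit are invented for illustration, and only the initial state 000 and the one-way transition from 110 to 001 mirror the example from the talk. It is not the script used in the demo.

```python
from collections import deque
from itertools import product

# Hypothetical transition logic: TABLE[state][input_bit] -> next state.
# States 000, 100, 110 play the role of the prepended obfuscation states,
# states 001, 010, 011 the role of the original FSM.
TABLE = {
    0b000: (0b100, 0b000),
    0b100: (0b000, 0b110),
    0b110: (0b000, 0b001),
    0b001: (0b001, 0b010),
    0b010: (0b011, 0b001),
    0b011: (0b001, 0b011),
}


def next_state(state, inputs):
    """Stand-in for evaluating the recovered transition logic."""
    return TABLE[state][inputs[0]]


def explore(initial, num_inputs, transition):
    """Brute-force all input values for every state taken from the queue."""
    graph = {}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if state in graph:
            continue
        graph[state] = {
            transition(state, bits)
            for bits in product((0, 1), repeat=num_inputs)
        }
        queue.extend(s for s in graph[state] if s not in graph)
    return graph


def reachable(graph, start):
    """All states reachable from start, including start itself."""
    seen, todo = set(), [start]
    while todo:
        state = todo.pop()
        if state not in seen:
            seen.add(state)
            todo.extend(graph[state])
    return seen


graph = explore(0b000, 1, next_state)

# Transitions u -> v after which u can never be reached again.
no_return = sorted(
    (u, v) for u in graph for v in graph[u] if u not in reachable(graph, v)
)
# Everything behind the first such transition is the suspected original FSM.
original_fsm = reachable(graph, no_return[0][1])

print([(format(u, "03b"), format(v, "03b")) for u, v in no_return])
print(sorted(format(s, "03b") for s in original_fsm))
```

The brute force is exponential in the number of external inputs, which is tolerable only because the transition logic of a single candidate usually depends on few signals.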
And this is research that was not done by us, but by researchers from the Technion in Haifa, Israel. They said, okay, these four steps of ASIC analysis to get to the netlist (so, we don't have a netlist yet), these four steps are expensive. They are time-consuming. And you have to do a lot of manual interaction. So, we don't want to do that. Maybe we can find a different path to get to a netlist. And so, they thought about scan chains.
What is a scan chain? Well, you might think that your hardware looks like this, that you have storage elements, logic, and then storage elements again. But in reality, almost all designs look like this. Or if I want to draw it a little bit better, like this. All of your storage elements basically form one or more chains. They are chained together. And depending on the mode you use them in, they either just transmit the data to the next storage element, the next flip-flop in the chain, or they run their data through the logic and compute normally. So, why would anyone want to do this? Well, think about it. You can take your chip and put a full internal state in there by using the chain. Then you can run it for, for example, one clock cycle. And then you can get the full internal state out again. So, you basically have cycle-accurate in-hardware debugging. This is perfect for testing, and this is why almost every chip has one or more scan chains. They should commonly be disabled after testing, but they aren't always disabled. And there's also work on how to re-enable the scan chains.
So, how can we reverse engineer using the scan chain? Well, the first challenge is that we have no idea how these flip-flops are connected through logic. So, what are the dependencies between these flip-flops? This is the first challenge. The researchers solved this by basically putting in an internal state and then changing single bits and looking at what changes in the output.
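To make the bit-flipping idea concrete, here is a naive editorial sketch. It assumes a hypothetical black box `scan_cycle` that stands for "shift a state in, run one clock cycle, shift the state out", backed by an invented four-flip-flop circuit. It tries every state exhaustively, which only works for toy sizes and is not the efficient algorithm from the research.

```python
from itertools import product

NUM_FLIP_FLOPS = 4


def scan_cycle(state):
    """Hypothetical chip model: load state, clock once, read state back.

    The logic below is invented; on real hardware this would be a query
    through the scan chain and the logic would be unknown.
    """
    s0, s1, s2, s3 = state
    return (s0 ^ s1, s1 & s2, s3, 1 - s0)


def recover_dependencies(step, num_bits):
    """For each flip-flop j, collect the flip-flops i whose value can
    change j's next value when flipped in isolation."""
    deps = {j: set() for j in range(num_bits)}
    for state in product((0, 1), repeat=num_bits):
        base = step(state)
        for i in range(num_bits):
            flipped = list(state)
            flipped[i] ^= 1  # change a single bit of the loaded state
            out = step(tuple(flipped))
            for j in range(num_bits):
                if out[j] != base[j]:
                    deps[j].add(i)
    return deps


dependencies = recover_dependencies(scan_cycle, NUM_FLIP_FLOPS)
print(dependencies)
```

Note that a single base state is not enough in general: the AND gate feeding flip-flop 1 hides the influence of flip-flop 1 whenever flip-flop 2 holds a 0, which is one reason the choice of states matters.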
And there are specialized algorithms to do this efficiently. This way we get the dependencies between the flip-flops. So, now we know which flip-flops are connected through logic, but we don't know the logic yet. For this, they take a specific state, also carefully crafted, observe the output, do this again with a different state, and so on. If you do this with enough states (again, there are algorithms that do this efficiently), you get the Boolean function that is implemented by the logic. The tool they developed, the technique, is called Scandit. And by combining an ASIC that you can query the scan chain on with the Scandit tool, you basically get an approximation of the netlist. So, you do not get the actual physical parts that are built in there, as you would get from opening it up and looking at the layers, but you get an approximation that is correct on the functional level. Scandit is currently under development as a HAL plugin that is able to operate with real hardware, with scan chain access, but also to use a simulation of the hardware. And this is what we want to briefly look at now.
So, I also prepared an instance for this. I have an empty netlist here. If I want to enter this module, it tells me this module is empty, you cannot enter it. I prepared our Scandit plugin with a simulation of a small addition circuit, so, a multi-bit adder. The first step, which I execute right now, just recovers the dependencies between the storage elements, the flip-flops. So, this is what our recovered netlist looks like. It is very small. But we immediately see several flip-flops in here, the storage elements, and they are connected to logic blocks, as we call them. These logic blocks do not contain any function yet. You can see this here in the details. So, now let me run the second part of the analysis. And this is the part where we want to recover the Boolean function.
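As a rough editorial illustration of this second step, the sketch below recovers a truth table for each flip-flop by enumerating only the inputs it depends on. The assumptions are loud ones: `scan_cycle` is an invented simulation of a 2-bit accumulator (a = a + b), the `DEPENDENCIES` table is written down by hand as if it were the result of the first step, and plain enumeration stands in for the carefully crafted states of the actual technique.

```python
from itertools import product


def scan_cycle(state):
    """Hypothetical simulated chip: a 2-bit register a is replaced by
    (a + b) mod 4, register b keeps its value.
    Bit order of the state: (a0, a1, b0, b1)."""
    a0, a1, b0, b1 = state
    total = ((a0 + 2 * a1) + (b0 + 2 * b1)) % 4
    return (total & 1, (total >> 1) & 1, b0, b1)


# Assumed output of the dependency-recovery step for this toy circuit:
# flip-flop index -> indices of the flip-flops it depends on.
DEPENDENCIES = {0: (0, 2), 1: (0, 1, 2, 3), 2: (2,), 3: (3,)}


def recover_truth_table(step, num_bits, target, inputs):
    """Enumerate all values of the relevant inputs (everything else is
    held at 0) and record the next value of the target flip-flop."""
    table = {}
    for values in product((0, 1), repeat=len(inputs)):
        state = [0] * num_bits
        for position, value in zip(inputs, values):
            state[position] = value
        table[values] = step(tuple(state))[target]
    return table


tables = {
    target: recover_truth_table(scan_cycle, 4, target, inputs)
    for target, inputs in DEPENDENCIES.items()
}

# The table for flip-flop 0 is the XOR of a0 and b0, the sum bit of a half adder.
print(tables[0])
```

Holding the unrelated flip-flops at 0 is only valid if the dependency sets are complete; a missed dependency would silently produce a wrong function, which fits the talk's point that the result is a functional approximation.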
And this is all done in simulation right now. If I now click again, you can see that there's a large Boolean function that is implemented by this six-input logic block. And if I take a look at, for example, another logic block, you can see down here there's another Boolean function. So, this algorithm has now recovered a netlist of this adder circuit for us that we can work with, that we can analyze. But again, it's just an approximation. It's not the exact netlist that was built into the ASIC.
Okay. So, I showed you around. I showed you the internals and the underlying principles of hardware design and hardware reverse engineering, and I showed you where the main problems are. The big problem still is netlist analysis, netlist reverse engineering. What we would like to see is basically that you throw your netlist into HAL and you're done. But this is not the case. You just saw that in the obfuscation example. The reality looks like this: we take HAL and we use it, and it helps us a lot, but we always invest manual thinking. So, reverse engineering, not only hardware reverse engineering but reverse engineering in basically every area, always involves human processes and human thinking.
This actually sparked a completely new interdisciplinary research direction at our university, which we call cognitive obfuscation. In this research, we work together with psychologists from our university with the goal of involving the human factors in research on obfuscation. Because, as you saw with the obfuscation example, it might have been difficult for a machine to break, but our brain is still the best thing at pattern matching and parsing that we have. So, just by looking at this, we broke it. The idea here is to maybe get some metrics that actually quantify the real-world obfuscation strength. So, strength against human and machine.
And this is then ultimately the goal: that we might be able to get novel techniques that are strong against both. So, this is ongoing research that was sparked in part by our work on HAL, and that is currently being worked on.
So, for the end: I know I'm standing here and talking to you, but this was not my work alone. The whole project has grown over the past four years, not in continuous development, but from time to time. And there's a team behind this that works on HAL. The core team is basically Mark, Sebastian and myself. Mark is not actively working on HAL anymore; he finished his PhD. But Sebastian and myself, we are working on HAL, actively discussing the directions we want to take, and we are coordinating the rest of our team, which consists of several PhD students and student assistants. I want to ask you for a round of applause, because without these people, this would not have been possible. While I'm thanking people, I also want to thank the amazing people who created the icons that you saw in this presentation. They never get thanked enough.
And I want to end with a call. We need you. We want HAL to grow, and we ourselves can only work on HAL part-time, because we are paid to do research. And sadly, the research community does not see tool development as research, even if the tool is used for research. So, you can help us with your expertise in software design. We are IT security researchers. We are not software engineers. So, we can greatly benefit from all the experience you have, and any contribution that you can make is greatly appreciated. HAL is open source under the MIT license on GitHub. You can contact us through Slack, through email, or even through Twitter. And, yeah, as I said, we are very thankful. We appreciate all of your help. And with that, I want to thank you for your attention and take questions.
Thank you, Max.
If you would like to ask a question, please line up at the microphones in the room. Use the chance to also ask questions via the internet through the Signal Angel. Signal Angel, do you have a question from the net?
Yes, I have. Are there any plans to add a debugger or emulator to HAL, so I can watch real-world data flow through the modules?
Yes, there actually are. Internally, we are working on a simulator right now, a cycle-accurate simulator. And it works quite well, but it's not yet in a state that we can release. Still, this is something that we ourselves would really like to see in HAL: the idea of debugging this not only from code, but also visually. So, we are definitely looking into that, and we want to implement this in HAL.
Thank you. Microphone number four, your question.
I have a question about how well this performs on larger circuits, like a modern FPGA with hundreds of thousands of LUTs or an ASIC with billions of transistors. How well does this perform?
Great question. So, HAL as the tool itself, the HAL core, is perfectly fine handling even larger netlists with several hundred thousand gates. But the graphical user interface is not, in a sense. As I said, the vision that we have for HAL with the user interface is to support modularization. Because even if you have a netlist with 100K gates, looking at those gates, at the sea of gates, doesn't help you. So, what we wanted to do is develop a graphical user interface that supports this modularization, and this is what it is made for. You can still run all your large-scale analysis using the command-line version of HAL. This is also what we did in our graph similarity paper, which you can find on the website of our group. But for visual inspection of all of the gates in a large netlist, HAL is currently not the tool to go to.
Microphone number one, your question.
First question: is it a common thing to obfuscate finite state machines in chips? It doesn't change the function of the state machine, but is it somehow measurable, in that it consumes more power or affects other performance metrics?
So, I think the next talk will actually talk quite a bit about this, so I don't want to take too much away from that. But the answer is, it depends. There are a lot of works on state machine obfuscation, but in the paper that I presented, basically all of them were broken, all of the underlying schemes. So, first take-home message: from my perspective, they are not strong. Second, from my point of view, you can always find some metric to identify method X, but the question is, does it work on method Y? It might be that the obfuscation technique you are thinking about results in a higher power consumption, but maybe another technique has all of its elements idle; it has a lot of elements, but not that high a power consumption. So, it's a difficult question to answer that broadly.
Microphone number nine, your question.
Hello, hi. How do you know how thick the top layer of the die is before removing it?
So, if you think about it, these layers, if you could look inside, are clearly separated, because you always have your metal layer with the wires in it, then you have a few interconnecting vias, and then wires again. And you can actually get this cross-section view. You can, for example, take your chip and not remove the top layer, but rotate it, remove the entire side, and look at it. Or you can do this via X-ray, for example. You can get a cross-section view of the chip.
Microphone number three, your question.
So, there's an old project, maybe 10 years old, called Degate. How does that connect to HAL? I see you already moving your head.
Yes, so Degate is, let's say, applied at a different point in the process.
Degate is actually used to analyze the images that you get, the layer images, and reverse engineer them. So, with Degate, the goal is to get from the images to the netlist. And maybe, of course, you can already incorporate some kind of analysis in there. But the thing is that it is not made for analyzing an existing netlist.
Signal Angel, do you have another question from the net?
Yes, I do. What do you think about FPGA protection with a key such as control lock to prevent netlist reverse engineering?
Interesting. So, there's one technique, which is bitstream encryption, so of course, also with a key. And there have been several academic works that basically broke all available bitstream encryption schemes. But also, from what I recall, the number of people who actually use bitstream encryption is very, very small, in the single-digit percentage range. Then there are, of course, schemes that will also, I think, be talked about in the next talk, like logic locking, where you use a key to change the circuitry in such a way that it only works correctly if you input the correct key. With the wrong key, it works incorrectly. It still works, but it's incorrect. And this is a topic that is widely discussed and where several design flaws have already been uncovered. So, from my point of view, this is not a good thing, but I think the next talk will be much more detailed on this.
Microphone number three, what is your question?
Regarding the state machine graph previously, how do you conclude that the upper part is the HARPOON part and the lower part is the actual state machine diagram?
So, I will just open it up one more time. This was the graph that I showed, and I know that this state up here, the 000 state, is the initial state. I know this from taking a look at the initialization values of these flip-flops. This was an FPGA netlist, and on FPGAs, you can initialize a flip-flop to a specific value.
On an ASIC, you would have to look at what the reset behavior of the flip-flop is. Then, from this initial state, I traverse all these states. The idea behind HARPOON was to prepend the original state machine with a net of states that are interconnected and that have loops, but once you leave the HARPOON part of the state machine, you never go back. And this is exactly what we see on this transition between the states 110 and 001. Once you take this transition, you never go back. You always stay in this lower part. So, this is how we concluded that this lower part is the original state machine, because if you take a look at the state before, it goes back to the 000 state, so it should not belong to the original state machine.
Signal Angel, another question, please.
You use brute force to explore the state machine. Have you tried machine learning, like some of your colleagues at Bochum, to explore state machines of internet protocols?
So, the brute-force approach that we took was just to recover the state transitions of an already discovered state machine candidate. We had to brute-force all the inputs that come from the remaining circuitry and that do not depend on the state memory itself. So, machine learning at that point does not help you at all. This is really just: compute and take a look at what comes out. Using machine learning on the whole design, on the whole netlist, to detect state machines is a different topic. This might be possible, but keep in mind that for most machine learning algorithms, you need a lot of training data, and this is really difficult to get in hardware design. Also, every synthesis suite optimizes and synthesizes differently. So, if you train your neural network with, for example, netlists that you generated yourself, you will actually just learn this one synthesizer, from my point of view. So, I don't think that machine learning is applicable here.
All right, thank you.
Signal Angel, one last question.
How is debugging done when obfuscation is used? What tools do the people have who are supposed to debug a chip?
Okay, so, I assume you are talking about the designer who obfuscates his own design and now wants to debug it. Well, it depends on the obfuscation technique used, of course, but as you saw in the diagram, the idea of obfuscation is that only the way back, the reverse engineering, is hard, and the forward direction, the engineering part, stays about as easy as before. Well, engineering is never easy. But as long as you know how the obfuscation works, and you know because you used it, you can always incorporate this in your tests. There are also works on this that say, okay, now I have a protected design, a design that is locked, for example. How can I still apply testing data without giving out the key to whatever entity performs the testing? Because this is also often outsourced. So, there are works on this, but it highly depends on what you actually use.
All right. Thank you, Max, for your talk and for answering all the questions.