Hi everyone, and welcome to my presentation. In this video, I'm going to introduce you to the field of physically unclonable functions and present our paper, "Splitting the Interpose PUF: A Novel Modeling Attack Strategy". This is joint work with Chris, Niklas, Ha, Marian, Jean-Pierre, Marten, and Ulrich.

Whenever I use my key ring, I like to think that I'm the only person who can use it, just because it's in my pocket. I like to think only a person in possession of my key can open the door to my home, and only someone who has my hardware token can access my servers. In reality, though, what enables anyone to open my door or log into my server is not the physical possession of the key ring, but the knowledge of secret information. In the case of my front door, anyone who knows the secret shape of the key can manufacture a copy that unlocks the door. In the case of my servers, anyone with knowledge of the large secret integer stored inside the token can access the remote shell. And this is actually true for anything we do cryptographically: in theory, the only difference between Alice, Bob, and Charlie is their knowledge of secrets. This means that all cryptographic protocols are vulnerable to attacks that retrieve this underlying secret information.

But what if the possession of a physical object, instead of a secret, could become the foundation for authentication? Let's say only a person in possession of this very microchip may be authenticated. Physically unclonable functions, or PUFs for short, try to do just that: the physical properties of an object become the basis for authentication. However, whether PUFs are practical enough for everyday life remains to be seen, and our work is a small contribution to this challenging question.

But what do PUFs look like? In our paper, we study Arbiter PUFs, one out of many approaches for building physically unclonable functions in microchips. The Arbiter PUF circuit consists of a number of stages, each having two input lines on the left and two output lines on the right. Each stage can connect its inputs and outputs either straight through, like this, or crossed, like that. Which way it is connected is determined by the configuration input at the bottom. We place a number n of these stages in a row and connect the configuration inputs to an external n-bit input, which we will call the challenge. At the end of the row, we insert the arbiter element, shown in gray: a circuit that determines whether a signal arrived first at its top or its bottom input, and outputs the corresponding response. After a challenge has been applied to the circuit, a signal is sent from the left all the way to the arbiter element on the right, which determines the PUF's response. We usually refer to a pair of a challenge and the corresponding response of a PUF as a challenge-response pair.

Now, how does that connect cryptography to physical possession? As you can see from the symmetry of the schematics, in an ideal implementation, the signals would always arrive at precisely the same time. In the real world, however, nanoscale imperfections of the material cause different Arbiter PUF instances to behave differently. While for a certain configuration one instance may produce a signal arriving at the top first, this tells us nothing about how the same configuration will behave on a different chip, as the delays there are caused by different imperfections. This means that, ideally, not even the manufacturer could produce a microchip with the exact same behavior.
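To make this concrete, here is a minimal sketch of the additive delay model that is commonly used to simulate Arbiter PUFs. It is a simplification for illustration only: I omit the customary bias weight and all noise, and this is not the exact simulation code used in our paper.

```python
import numpy as np

def features(challenges: np.ndarray) -> np.ndarray:
    # Standard parity feature map: feature i is the product of
    # challenge bits i..n-1 (challenges encoded as +1/-1).
    return np.cumprod(challenges[:, ::-1], axis=1)[:, ::-1]

class ArbiterPUF:
    # Additive delay model: each instance draws its own stage delay
    # differences ("manufacturing variation") as Gaussian weights;
    # the response is the sign of a linear function of the features.
    def __init__(self, n: int, seed: int):
        self.weights = np.random.default_rng(seed).normal(size=n)

    def eval(self, challenges: np.ndarray) -> np.ndarray:
        return np.sign(features(challenges) @ self.weights)

# Two instances of the same design answer the same challenges differently.
challenges = np.random.default_rng(0).choice([-1.0, 1.0], size=(5, 64))
print(ArbiterPUF(64, seed=1).eval(challenges))
print(ArbiterPUF(64, seed=2).eval(challenges))  # typically differs
```

The per-instance weights play the role of the nanoscale imperfections: the circuit design is identical, but the sampled delays, and hence the challenge-response behavior, are unique to each instance.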
This individual behavior can then be used in cryptographic protocols. For example, we can compare observed behavior with a list of prerecorded data; if it matches, we can authenticate the request. Of course, individual behavior alone won't make for a secure PUF. As one out of many security requirements on a PUF, it must resist any attempt to model its behavior, even if the attacker has direct access to the PUF for some time. This is the attack model we focus on in our paper.

In the PUF literature, and in the Arbiter PUF literature in particular, the attacker usually collects a large data set of challenge-response pairs that shows the behavior of the PUF under attack. This training data is then fed into a machine learning algorithm, which builds a model to predict future PUF responses. The modeling process typically takes minutes to weeks. If successful, the attacker can impersonate the PUF when communicating, for example, with a remote server, by simply generating responses from the machine learning model. This enables the attacker to pretend to be in physical possession of the PUF when they're actually not, hence breaking our security model.

Unfortunately, the Arbiter PUF circuit is vulnerable to such a modeling attack based on the logistic regression machine learning algorithm. This attack, let's call it the LR attack, takes advantage of how the signal delays accumulate through the Arbiter PUF stages, a process that is mostly linear. This works so well that the attack only requires a few thousand challenge-response pairs and a couple of seconds to yield a high-accuracy model.

To mitigate the threat, XOR Arbiter PUFs were introduced. To build an XOR Arbiter PUF, we take a regular Arbiter PUF, then add a second one, and possibly many more, up to, say, 12 or so. Then we connect the challenge inputs of all individual Arbiter PUFs to the same external input, and the signal inputs on the left all to the same signal. The twist is that the final, public response of the PUF is the XOR of the individual response bits. This breaks important properties of the mathematical model of the PUF and increases the computational cost of the LR attack significantly. But the XOR Arbiter PUF, too, was broken by a machine learning attack, this time based on an evolution strategy algorithm using covariance matrix adaptation. It is often called the CMA-ES attack and takes advantage of the evaluation noise of the Arbiter PUF. It only plays a minor role in our contribution, but I will come back to it later in this video.

With the Arbiter PUF and the XOR Arbiter PUF now both broken, it was time for a new PUF design. Last year at CHES, the Interpose PUF was presented. It is a stack of two XOR Arbiter PUFs, which I will call the upper and the lower layer. To evaluate the Interpose PUF, the challenge is first applied to the upper layer, giving us the upper layer response. Then the challenge for the lower layer is created: it has just one extra bit in the middle, which is set to the value of the upper layer response. All other bits are copied from the external input and are hence the same as in the upper layer challenge. After evaluating the lower layer on this challenge, the lower layer response becomes the final output of the Interpose PUF. The exact number of Arbiter PUFs in each layer can be parameterized, but like with XOR Arbiter PUFs, this parameter is limited and can only be scaled up until the evaluation noise becomes too large.
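Continuing the sketch from above, the XOR Arbiter PUF and the Interpose PUF can be simulated in a few lines. In the +1/-1 encoding, XOR becomes a product; the interpose position at the midpoint of the lower layer challenge follows the paper, while everything else is again a simplified, noise-free illustration.

```python
import numpy as np

def features(c: np.ndarray) -> np.ndarray:
    # Parity feature map of the additive delay model (see earlier sketch).
    return np.cumprod(c[:, ::-1], axis=1)[:, ::-1]

class XORArbiterPUF:
    # k Arbiter PUFs sharing one challenge; in the +1/-1 encoding,
    # the XOR of the individual responses is simply their product.
    def __init__(self, n: int, k: int, seed: int):
        self.weights = np.random.default_rng(seed).normal(size=(k, n))

    def eval(self, c: np.ndarray) -> np.ndarray:
        return np.prod(np.sign(features(c) @ self.weights.T), axis=1)

class InterposePUF:
    # Two XOR Arbiter PUF layers; the upper layer response is interposed
    # in the middle of the lower layer challenge.
    def __init__(self, n: int, k_up: int, k_down: int, seed: int):
        self.mid = n // 2
        self.upper = XORArbiterPUF(n, k_up, seed)
        self.lower = XORArbiterPUF(n + 1, k_down, seed + 1)

    def eval(self, c: np.ndarray) -> np.ndarray:
        r_up = self.upper.eval(c)                     # upper layer responses
        c_low = np.insert(c, self.mid, r_up, axis=1)  # interpose the extra bit
        return self.lower.eval(c_low)                 # final response

ipuf = InterposePUF(n=64, k_up=8, k_down=8, seed=42)
c = np.random.default_rng(0).choice([-1.0, 1.0], size=(3, 64))
print(ipuf.eval(c))
```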
The main design rationale for the Interpose PUF was to resist known modeling attacks, in particular the CMA-ES and the LR attacks. It was shown empirically that both attacks cannot model an Interpose PUF with high accuracy. If you were to attack an Interpose PUF using the CMA-ES attack, the achieved model accuracy would be around 50%, no better than random guessing. In the case of the LR attack, around 75% correct predictions can be achieved. That's a lot better than random guessing, you may think, and that is the foundation for our work.

Let's look at the details of the LR attack on the Interpose PUF. When applying the LR attack to the Interpose PUF, we choose to first attack the lower layer. But this means that our training data is missing a piece of information: we do not know the middle bit of the lower layer challenge. As a replacement for the true upper layer responses, we randomly guess their values and start the attack. I'll show a toy version of this experiment in a short code sketch in a moment. I ran this attack in a controlled environment, that is, using separate simulations for the upper and lower layers. This put me in a position to evaluate the accuracy of the modeling result with respect to the lower layer alone, instead of comparing it to the Interpose PUF as a whole. What I found was that the accuracy gets close to 100%. In this chart, you can see, for different parameterizations of the Interpose PUF, the resulting lower layer accuracy of our attack. Again, the overall accuracy of the attack is around 75%, but what we are interested in right now is the accuracy solely on the lower layer. Most trials result in an accuracy of around 90%, and that is only because of how I terminated the machine learning algorithm. In fact, the accuracy is only limited by the amount of available training data. This remains true for all Interpose PUF parameterizations that I have tested: regardless of how large the upper and lower layers are, given enough challenge-response pairs, the LR attack is able to recover a high-accuracy model of the lower layer.

But how does this result in a 75% overall accuracy on the Interpose PUF? As it turns out, for around 50% of challenges to the lower layer, the middle challenge bit just doesn't make a difference to the response. Of the remaining 50% of challenges, in half the cases we are just lucky and guess the upper layer response correctly. That's how, in total, a high-accuracy model for the lower layer alone gives us an overall prediction quality of 75%.

Before we continue our attack by creating an upper layer model, let's take some time to let this sink in and draw first conclusions. First, the LR attack is robust not only in the presence of classification noise, that is, when the training data responses are noisy, but also in the presence of some feature noise, when the challenge bits in our training data are noisy or unknown. I think this hasn't been discussed much in the literature, but it's an important piece of information, highly relevant for modeling attacks if we are to continue designing PUFs on the basis of Arbiter PUFs. Second, although the original LR attack is unable to yield a high-accuracy model of the Interpose PUF, it does model parts of it correctly. From a designer's point of view, this is most likely undesirable, and it should be included in future security analyses of physically unclonable functions, as it may pave the way for a full modeling attack.
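Here is the promised toy version of this first attack stage. To keep it self-contained and runnable with off-the-shelf tools, I use a (1, 1) Interpose PUF, so that scikit-learn's plain logistic regression can stand in for the paper's LR attack; the point is only to illustrate that the lower layer model converges despite roughly half of the middle bit guesses being wrong.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def features(c):
    # Parity feature map of the additive delay model.
    return np.cumprod(c[:, ::-1], axis=1)[:, ::-1]

rng = np.random.default_rng(1)
n, N, mid = 64, 20_000, 32

# Toy (1, 1) Interpose PUF: one Arbiter PUF per layer.
w_up, w_low = rng.normal(size=n), rng.normal(size=n + 1)

challenges = rng.choice([-1.0, 1.0], size=(N, n))
r_up = np.sign(features(challenges) @ w_up)
responses = np.sign(features(np.insert(challenges, mid, r_up, axis=1)) @ w_low)

# Attacker's view: the true middle bits are unknown and replaced by
# uniformly random guesses, wrong for about half the training examples.
guesses = rng.choice([-1.0, 1.0], size=N)
model = LogisticRegression(max_iter=2000).fit(
    features(np.insert(challenges, mid, guesses, axis=1)), responses)

# Accuracy with respect to the lower layer alone, on fresh challenges.
test = rng.choice([-1.0, 1.0], size=(5_000, n + 1))
acc = model.score(features(test), np.sign(features(test) @ w_low))
print(f"lower layer model accuracy: {acc:.3f}")  # typically well above 0.9
```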
For the Interpose PUF, this is indeed the case, so let's see how the attacker can obtain a model with overall high accuracy using the lower layer model that we've just discussed. Once in possession of an accurate model of the lower layer, the attacker can use it to recover upper layer responses. Remember that the attacker already has full knowledge of the challenges to the upper layer; the upper layer responses are all that is missing to launch an LR attack on the upper layer.

To recover the upper layer responses, we first take any challenge from our training data and derive the two possible challenges for the lower layer. Then we evaluate the lower layer model on both challenges. Now, if both are predicted to give the same response, then we can't learn anything about the upper layer response, and we discard the challenge. But if they give different results, then we record the middle bit value of the challenge that resulted in the response matching our training data. As the middle bit of this challenge was taken from the upper layer response, this is exactly the missing information we are looking for. I'll illustrate this recovery step with a short code sketch in a moment. This method works for about half the challenges in our data set; in the other half, the middle challenge bit just does not influence the lower layer response. For the half where we can recover the upper layer responses, most of our guesses are correct, depending of course on the quality of our lower layer model. But as we have already seen, the LR attack is very robust with respect to noise, so in any event, we are in a good position to obtain a high-accuracy model for the upper layer as well.

Let's see how this plays out in practice. The Interpose PUF has two parameters that adjust the security level. First, there's the challenge length: the longer the challenges, the more computational effort the attacker needs to invest. For the LR attack on XOR Arbiter PUFs, it is known that the attacker's effort scales polynomially in the challenge length. Second, we can choose how many Arbiter PUFs we include in the upper and lower layer, respectively: the more Arbiter PUFs are involved, the harder the machine learning problem becomes for the attacker. For the LR attack on XOR Arbiter PUFs, we know that the effort scales exponentially in the number of Arbiter PUFs.

These are our results on the required attack time as a function of the challenge length of the Interpose PUF. You can see the attack time on the y-axis for a selection of challenge lengths, denoted on the x-axis. Empirical results for the Interpose PUFs are shown with markers, and all Interpose PUFs here have the same number of Arbiter PUFs in the upper and lower layer. For each of those sizes, a fitted polynomial is shown. As both axes are scaled logarithmically, the polynomial appears as a straight line. Significant deviations from the fitted polynomial can mostly be explained by the use of different CPU types and slightly different code versions. The well-fitting polynomial means that the Interpose PUF behaves similarly to the Arbiter PUF and the XOR Arbiter PUF: the computational effort required for the LR attack is approximately polynomial in the challenge length.

Next, we studied the required computational effort depending on how many Arbiter PUFs are used in the upper and lower layers. There are two reasonable parameter choices here. First, we can use just one Arbiter PUF in the upper layer and many in the lower layer; this configuration is called (1, k). The other option is to use equal-size layers, here called (k, k).
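Before we look at the results, here is the promised sketch of the upper layer response recovery, again in the +1/-1 encoding; `lower_model` is a placeholder for whatever model the first attack stage produced.

```python
import numpy as np

def recover_upper_responses(challenges, responses, lower_model, mid):
    # challenges:  (N, n) array of +1/-1 training challenges
    # responses:   (N,) observed Interpose PUF responses
    # lower_model: callable predicting lower layer responses (+1/-1)
    # mid:         position at which the extra bit is interposed

    # Derive both candidate lower layer challenges per training challenge.
    pred_plus = lower_model(np.insert(challenges, mid, 1.0, axis=1))
    pred_minus = lower_model(np.insert(challenges, mid, -1.0, axis=1))

    # If both middle bit values yield the same prediction, the challenge
    # tells us nothing about the upper layer response: discard it.
    keep = pred_plus != pred_minus

    # Otherwise, the upper layer response is the middle bit value whose
    # prediction matches the response observed in the training data.
    r_up = np.where(pred_plus[keep] == responses[keep], 1.0, -1.0)
    return challenges[keep], r_up  # training data for the upper layer attack
```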
I show our results in these charts. On the y-axis, we again show the required computation time to model an Interpose PUF; on the x-axis, we show the number k for both parameter choices. All Interpose PUFs here have a 64-bit challenge length. As can be seen from the fitted exponential function, the time required by our modeling strategy grows exponentially in the number of Arbiter PUFs used. The exponential function here also appears as a straight line, as only the y-axis is scaled logarithmically; I'll briefly show below how such a fit is computed. This also matches previous experience with the LR attack on XOR Arbiter PUFs. As you can see by comparing the left and the right chart, our attack is also robust against evaluation noise of the Interpose PUF: on the left, we show values for a simulation with realistic levels of noise; on the right, you can see results for a noise-free simulation. When comparing the (1, k) and the (k, k) parameter choices, we notice that there's hardly any advantage in using the larger option.

What I haven't shown here is that the required amount of training data also grows exponentially in the parameter k. This, together with the dramatic increase in required computation time, limited the size of Interpose PUFs that we could attack using our implementation of the logistic regression algorithm. The largest instances we were able to model successfully were Interpose PUFs with 64-bit challenge length and, first, eight Arbiter PUFs in both the upper and the lower layer; this required around 300 million challenge-response pairs in our training data and took around two weeks. Second, we could also model the Interpose PUF with a single Arbiter PUF in the upper layer and nine Arbiter PUFs in the lower layer. As modeling the lower layer is the first step, this required more than double the training data we had used in the (8, 8) case and took around four times as long.

Let's summarize the key points of our attack. Using the noise robustness of the LR attack, we were able to split the modeling attack on the Interpose PUF into two stages, modeling one layer at a time. The attack time scales exponentially in the number of Arbiter PUFs used in the upper and lower layer, but the designer's choice of this parameter is limited by the level of evaluation noise in the implementation and cannot be increased without limit. Also, we found that the attack effort scales polynomially in the challenge length used in the Interpose PUF. Hence, increasing the challenge length could help make the attack more expensive, but as the relation here is polynomial, the designer won't be able to permanently outrun the attacker in this way. We conclude that there may be parameters such that the Interpose PUF is reasonably secure against our attack, but this crucially depends on the noise level of any given implementation and the chip area that is available.

In our paper, we state that the major contribution of the Interpose PUF is still standing: it cannot be modeled by the CMA-ES attack. While this statement of course remains true, we must note that there is now a claim that an advancement of this attack actually is able to model the Interpose PUF. This attack is said to be able to model a (1, 10) Interpose PUF with just 10 million challenge-response pairs, two orders of magnitude less than what our attack requires.
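One brief methodological aside on the scaling charts we just saw: on semi-logarithmic axes, an exponential appears as a straight line, so fitting one boils down to linear regression on the logarithm of the attack time. The numbers below are made up purely for illustration and are not our measurements.

```python
import numpy as np

# Hypothetical (k, attack time) pairs for illustration only.
k = np.array([4, 5, 6, 7, 8])
seconds = np.array([3e1, 2e2, 1.5e3, 1e4, 8e4])

# An exponential t(k) = a * b**k is a straight line on semi-log axes,
# so fit a degree-1 polynomial to log(t) and read off the base b.
slope, intercept = np.polyfit(k, np.log(seconds), deg=1)
a, b = np.exp(intercept), np.exp(slope)
print(f"fit: t(k) ≈ {a:.2g} * {b:.1f}**k")
print(f"extrapolation: t(10) ≈ {a * b ** 10:.2g} s")
```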
Finally, let's conclude this video with an outlook on what can be done in the field of Arbiter PUFs. Clearly, the Interpose PUF won't be the last attempt to build a PUF in a microchip; novel, more advanced designs will follow. This requires better tools for their security analysis. First, as a lesson learned from our paper, PUF designers should make sure that there is also no partial model of their design that can be retrieved by a known attack. This requires that modeling results are not only judged by their overall accuracy, but are studied in detail. Second, as an aid to the design process, basic deep learning attacks can be used. This can be done with minimal effort, and if a design can be modeled even by a generic machine learning algorithm, then a specialized attack will surely also succeed. An example of this methodology is in our paper. I'm sure there's much more that can be done to help in the analysis of PUF designs, and I believe that improving the tools available for security analysis is a crucial step toward finding a secure physically unclonable function.

To help in the process of designing a successor for the Interpose PUF, I have made all source code used in this paper available as a Python package called pypuf. It provides tools for the simulation and analysis of Arbiter PUF-based designs, and of course includes the attack presented in this video, as well as the mentioned generic deep learning modeling attack.

Last but not least, I would like to thank my co-authors and everyone who supported me in producing this video. Thanks for watching.
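As a postscript for anyone who wants to experiment: a minimal pypuf quick start might look as follows. The module and parameter names are as I recall them from pypuf 2.x and may differ between versions, so please check the documentation of your installed release.

```python
# Quick start with pypuf (names assumed from pypuf 2.x; verify against
# the documentation of the installed version).
import pypuf.simulation, pypuf.io

# Simulate a 64-bit (8, 8) Interpose PUF ...
puf = pypuf.simulation.InterposePUF(n=64, k_up=8, k_down=8, seed=1)

# ... and collect challenge-response pairs as attack training data.
challenges = pypuf.io.random_inputs(n=64, N=100_000, seed=2)
responses = puf.eval(challenges)
```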