Thanks Mark, and yeah, it's great to be here. As Mark already mentioned, this is part of a study we did at NXP. These are our colleagues at the time we did this work at NXP, looking at secure software solutions and, more specifically, at white-box solutions. If you have no clue what this exactly means, hold on and I will try to explain it. We will introduce a new attack vector on these secure software solutions, named differential computation analysis, which can be seen as more or less the software counterpart of the well-known DPA attacks we all know from the gray-box attack model.

When you start to investigate the security of any implementation, hardware or software, it's good to ask yourself: who is the attacker, and where is the attacker? This could be the user of the device, it could be a virus, it could be malware. It's good to establish these things if you want to show that something is secure.

We are all, of course, familiar with the black-box model, the model from the 80s. You have the two endpoints, you assume these are the trusted parties, and the adversary just observes the data being transferred. Then things advanced and there were hardware implementations. The user could have his smart card in his own hands, the smart card is running a cryptographic hardware implementation, and then this adversary could use either passive or active attacks, and use this meta-information from the cryptographic hardware implementation to extract the secret key. I mean, we all know this. That's why we're here at CHES.

But there's another model. After the black-box and the gray-box model, this was named the white-box model, and it's very similar to the gray-box model, but now we're talking about software implementations. So imagine you have a device.
Let's say a mobile device, and it runs a cryptographic software implementation. The user who owns this device should be able to use this implementation, but not be able to extract the secret key used by it. That is what it takes to be secure in this model, which is of course a much more powerful model for the adversary. It's called the white-box attack model because now you can assume the user is root on this platform. Again he can use passive or active attacks, but he could also just dump the memory of your software implementation, hook it in a debugger, change registers, change values, even change your code. So this is a very powerful attack model, and it looks really difficult to be secure in this model.

So where is this actually used in practice? In the beginning of the 2000s, in 2002, this model was introduced by Chow et al. in the setting of DRM applications, so digital rights management. Think about streaming content, a movie coming to your set-top box or to your phone. Of course, as a user you want to be able to decrypt this information because you want to see the movie, but the content provider doesn't want you to extract the key and give it to your friend such that he can watch the movie as well. So it got popular, people looked at this, and then there was not much research going on in white-box implementations until a new hot topic arose that sparked renewed interest in white-box implementations, and that's the setting of HCE, host card emulation.

So what's host card emulation? It's a technique you can use with NFC, for instance to pay for goods or for transit with your mobile phone. In a sense it replaces the secure element.
That's why it has the emulation, the E, in there: you don't store your cryptographic key in the secure element on your mobile device, but in your software implementation. So the question is how one would do this, and the solution which people have studied is white-box implementations. This really took off in 2014, when both Visa and Mastercard announced that they would support HCE. The prediction is that in the upcoming years more and more points of sale will support NFC, and more and more mobile phones, even the lower-end ones, will support NFC as well. So it's predicted that HCE solutions, and therefore white-box solutions, will get more and more popular, and then of course the question arises: is what is being used in practice actually secure?

So let's take a step back and look at these things from a more theoretical point of view. Hiding a secret key in a software implementation can be seen as a form of code obfuscation, and there's this well-known result that general-purpose obfuscation, so an obfuscator that works for any program, is impossible. But then of course the natural question arises: the functions we want to white-box, namely cryptographic functions, are not just any functions. They are a very specific subfamily of all the functions out there.
So it's currently an open question whether they can actually be obfuscated, or white-boxed, or not. And then the thing which got us into this business of looking at the security is the realization that if there were an implementation secure in this white-box attack model, then by definition it would be secure against all current and all future side-channel and fault attacks, which is of course a really powerful statement, because you assume the security is as strong as just having black-box access to this implementation.

That's more from a theoretical point of view. From a practical point of view, in academia there are only results known about symmetric cryptographic primitives. We don't know how to build white-box implementations for asymmetric primitives, although companies are selling this. All the academic designs for standardized crypto, which in practice means either AES or DES, have been broken. So we actually don't know how to get to secure solutions.

So how do people actually build white boxes in practice? That is all based on the initial papers, and all the follow-up papers in the early 2000s have used a similar approach. You take the operations you do in your cipher, let's take AES as an example, and you convert every operation to a lookup table. Where you use your secret key, you merge it into the lookup table. And then you try to obfuscate these lookup tables.
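
As a minimal sketch of the key-merging step just described (not the construction from any particular paper): the round-key addition and the AES S-box lookup collapse into a single 256-entry table, often called a T-box. The names `build_tbox` and `KEY_BYTE` and the key value are hypothetical, and the obfuscation step is left out on purpose.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a):
    """Multiplicative inverse in GF(2^8), computed as a^254 (0 maps to 0)."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result if a else 0


def rotl8(x, n):
    """Rotate a byte left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def sbox_entry(x):
    """AES S-box: field inversion followed by the affine transformation."""
    inv = gf_inv(x)
    return inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63


SBOX = [sbox_entry(x) for x in range(256)]


def build_tbox(key_byte):
    """Merge the key into the table: T[x] = SBOX[x XOR key_byte]."""
    return [SBOX[x ^ key_byte] for x in range(256)]


KEY_BYTE = 0x2B  # hypothetical key byte, for illustration only
TBOX = build_tbox(KEY_BYTE)

# Without further obfuscation the table gives the key away:
# T[0] = SBOX[key], so inverting the public S-box recovers the key byte.
recovered = SBOX.index(TBOX[0])
```

In this sketch `recovered` equals `KEY_BYTE`, which shows why merging the key into a table is not enough by itself and why the tables need to be obfuscated further.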
You do this by putting encodings in front of and after each lookup table, and then you merge these, and you hope that the tables are obfuscated enough that an attacker cannot extract the key. So this is more or less what an AES white-box implementation would look like, as is the idea from the original paper by Chow. The size of these white-box implementations grows significantly, because every step of the implementation is replaced by a lookup table. Here the T-tables are just the S-box in AES where the key is used, the A_i's and the MBs are linear encodings, the green steps are nonlinear encodings, and the squares surrounding them are the complete lookup tables you will use. That immediately explains why we know solutions for the symmetric case and not for the asymmetric case: in AES and DES everything works nicely on bytes, which we can convert nicely to lookup tables, but for asymmetric crypto, RSA or elliptic curve cryptography, we don't know how to do this with just small lookups.

It's good to remember that, in practice, the white box is just a small part when you build a secure software solution. In practice one would use strong code obfuscation on top of the white box, and you would somehow try to glue your binary to the environment. One particular attack angle is that you don't try to extract the key from the implementation: you just copy the entire binary, put it on another device, and run it there. Then you have not extracted the key, but you have extracted the functionality of decryption. This attack vector is called code lifting, and in practice people have techniques where they try to glue the binary executable to the platform or to the user. You might also want some support for traitor tracing. Actually, this year at Eurocrypt there was an invited talk, a really interesting one, about the cat-and-mouse game related to code obfuscation and this binding and all the
different techniques, but it didn't go into much detail on white-box implementations. In the remainder of the talk I would like to focus on extracting keys from white-box implementations, but it's good to remember that in practice people stack up many more countermeasures if they want to achieve secure software.

It's also good to keep in mind the effort and expertise required before, because, as I said, all approaches for standardized crypto were already broken, but the attacks were very white-box specific. You need to know exactly which encodings were used on the lookup tables, on which cipher operations, and exactly in which lookup table. So if you wanted to attack it, you would need to reverse the binary, so remove the code obfuscation, identify exactly which white-box scheme is implemented, target the correct lookup tables, and then apply your algebraic attack.

Our approach simplifies things significantly. Here you only need to know which algorithm is implemented, so is it AES or is it DES; in practice those are the only two choices. Then our attack works automatically. You don't need any further knowledge, and if there is code obfuscation, we just ignore it. So it doesn't matter how strong your code obfuscation is.

The idea to get here is related to obtaining software traces, similar to the hardware traces, or power traces, used in the hardware community. So what is a software trace? It's a trace which
records all the memory accesses made by the software implementation. The idea is to collect this information using dynamic binary instrumentation tools, which are already available. As examples, we wrote plugins for Pin, which is a DBI tool from Intel, and for Valgrind. Most of you who are software engineers have probably worked with Valgrind. It has very good debugging and memory-leak detection capabilities, but these tools can do much, much more, and we extended these frameworks to get these software traces.

And how are these software traces used? Besides the software tracing tools, we created a visualization tool, a GUI, and if you visualize a trace it will look something like this. This was inspired by an unreleased Quarkslab tool, which was presented at a hacker conference, SSTIC. Here on the y-axis you see the time of the execution of your software binary, and on the x-axis you see the memory being accessed. In black are the instructions being executed, and the memory reads and writes are in green and red.

If you execute a white box you would see something similar to this. Remember, for our attack to work we need to know which algorithm is being executed, AES or DES. So you obtain a trace, visualize it, and then just see which algorithm is being executed. Here you see a whole lot of things going on, but if you zoom in on the little blob over there, you nicely see a pattern of nine times four, which is a clear indication that this is probably AES-128, because the tenth round is slightly different and they probably merged it somewhere else.

But what if you have a good white-box implementation that really tries to hide the code being executed, so it's completely unrolled and it doesn't leak any information? Then you get a nice straight line, and here we cannot learn much. But then we just look at the data being accessed in memory.
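
To make the notion of a software trace concrete, here is a small sketch that records every read and write in execution order. It is only a stand-in: the talk's traces come from Pin and Valgrind plugins, whereas `TracedMemory`, `Access`, and the toy table lookup below are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Access:
    step: int     # position in execution order (the time axis of the plot)
    kind: str     # "R" for a read, "W" for a write
    address: int  # memory address touched (the other axis of the plot)
    value: int    # byte value that was read or written


class TracedMemory:
    """A byte array that logs every access made through read() and write()."""

    def __init__(self, size: int) -> None:
        self.cells = bytearray(size)
        self.trace: List[Access] = []

    def read(self, address: int) -> int:
        value = self.cells[address]
        self.trace.append(Access(len(self.trace), "R", address, value))
        return value

    def write(self, address: int, value: int) -> None:
        self.cells[address] = value & 0xFF
        self.trace.append(Access(len(self.trace), "W", address, value & 0xFF))


# Toy program: one table lookup, with made-up addresses for table, input, output.
TABLE, INPUT, OUTPUT = 0, 32, 48
mem = TracedMemory(64)
for i in range(16):
    mem.cells[TABLE + i] = (7 * i + 3) % 16  # set up directly, so not traced
mem.cells[INPUT] = 5

x = mem.read(INPUT)        # read the input byte
y = mem.read(TABLE + x)    # data-dependent table lookup
mem.write(OUTPUT, y)       # store the result

recorded = [(a.kind, a.address, a.value) for a in mem.trace]
```

The list `recorded` holds the three accesses in order. Plotting `step` against `address`, colored by `kind`, gives the kind of picture described in the talk.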
This is an example of another white-box implementation, where you can clearly see one plus 15 groups of lookups being done in memory. So this is a clear indication that this is probably DES.

But then you have an even better white-box implementation: the code is unrolled, and they try to make the memory accesses completely linear, so they try to hide what they're doing. But completely on the right you see a small bar. What is this? This is the stack being accessed by your software implementation, and this is just a zoom of that stack. It's really difficult to hide access patterns on the stack, and here you can immediately see, again, an access pattern of one plus 15. So this immediately reveals that we're working with DES.

All right, so now we know which white-box cipher has been implemented. How do we actually extract the key? The naive approach, of course, is to follow the approach from the gray-box model: you take your white-box implementation, port it to a smart card, measure the power consumption, and use all the attacks we all know here from CHES. But that's really quite naive, because we have these software traces available. So what we could do is take these software traces, which are just a sequence of memory accesses, and realize that every bit of these accessed memory bytes is equally important. So we serialize these memory accesses, and then we get a trace which is not very informative to look at: it just consists of zeros and ones. Here, for instance, trying to visualize the rounds is already quite difficult, but a hint: autocorrelation will reveal them immediately. So why is this useful? Because, if you think about the similar setting in the hardware world, this really corresponds to probing each individual bus line and getting the information without any error.
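
The serialization step and the autocorrelation hint can be sketched as follows. This is an idealized illustration under stated assumptions: the accessed bytes in `ROUND` are made up, each "round" touches exactly the same values, and `serialize`, `autocorrelation`, and `strongest_lag` are hypothetical helper names, not part of the released tools.

```python
def serialize(values, width=8):
    """Flatten accessed values into a 0/1 trace, least significant bit first."""
    return [(v >> i) & 1 for v in values for i in range(width)]


def autocorrelation(trace, lag):
    """Fraction of positions where the trace agrees with itself shifted by lag."""
    n = len(trace) - lag
    return sum(trace[i] == trace[i + lag] for i in range(n)) / n


def strongest_lag(trace, max_lag):
    """Smallest lag in 1..max_lag with the highest agreement score."""
    return max(range(1, max_lag + 1), key=lambda lag: autocorrelation(trace, lag))


# Idealized toy: every round performs the same four one-byte accesses.
ROUND = [0x10, 0x3C, 0xA5, 0x7E]   # made-up values for illustration
accesses = ROUND * 9               # nine identical rounds
trace = serialize(accesses)        # 9 * 4 * 8 = 288 zeros and ones

round_length_bits = strongest_lag(trace, 64)
```

Here `round_length_bits` is 32, that is, four accesses of eight bits each. Real traces are noisier because the accessed values depend on the data, so the peak would be lower than a perfect score, but the same idea applies.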
So we have zero error on our trace, which is of course a very good situation to be in for an attacker.

So what are the results? We tried to get hold of white-box implementations, of course, and there are not that many. Brecht Wyseur was the first one to publish a challenge online, in 2007, then there were various hacker conferences which published white-box challenges, and the last one is from Klinec, a master's student who published the latest white-box AES design. He made an implementation and we created a challenge out of it. As you can see, we need only a small number of traces. For Brecht's challenge we only needed 65 traces to immediately extract the key by just running DPA on the software traces. The two middle ones were much easier, but it's good to remember that this attack was not known: they removed the encodings on the lookup tables because they wanted other hackers to break this in one day at these hacker conferences. The white-box challenge we created with Klinec's implementation was actually slightly more difficult. First we needed 2,000 traces, but in the end we managed to do it with only 500.

And so what's the intuition why this works?
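
Running DPA on software traces can be sketched with a classic difference-of-means test. This is a toy under loud assumptions, not the released attack scripts: the "trace" is simulated as the bits of an unencoded first-round lookup `SBOX[p ^ key]`, only one key byte and one target bit are attacked, and `record_trace`, `dca_recover_byte`, and the key value are hypothetical.

```python
import random


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def sbox_entry(x):
    """AES S-box: inverse in GF(2^8) (as x^254), then the affine transformation."""
    inv = 1
    for _ in range(254):
        inv = gf_mul(inv, x)
    inv = inv if x else 0
    rot = lambda v, n: ((v << n) | (v >> (8 - n))) & 0xFF
    return inv ^ rot(inv, 1) ^ rot(inv, 2) ^ rot(inv, 3) ^ rot(inv, 4) ^ 0x63


SBOX = [sbox_entry(x) for x in range(256)]


def record_trace(plaintext_byte, key_byte):
    """Simulated software trace: the 8 bits of the value read from the table."""
    value = SBOX[plaintext_byte ^ key_byte]
    return [(value >> i) & 1 for i in range(8)]


def dca_recover_byte(plaintexts, traces, target_bit=0):
    """Return (best key guess, peak difference of means) for one key byte."""
    best_guess, best_peak = None, -1.0
    for guess in range(256):
        # Predicted bit under this key guess decides the group of each trace.
        sel = [(SBOX[p ^ guess] >> target_bit) & 1 for p in plaintexts]
        ones = sum(sel)
        zeros = len(sel) - ones
        if ones == 0 or zeros == 0:
            continue
        peak = 0.0
        for j in range(len(traces[0])):
            sum1 = sum(t[j] for t, s in zip(traces, sel) if s)
            sum0 = sum(t[j] for t, s in zip(traces, sel) if not s)
            peak = max(peak, abs(sum1 / ones - sum0 / zeros))
        if peak > best_peak:
            best_guess, best_peak = guess, peak
    return best_guess, best_peak


SECRET = 0x2B                      # hypothetical key byte hidden in the "binary"
rng = random.Random(1)
plaintexts = [rng.randrange(256) for _ in range(200)]
traces = [record_trace(p, SECRET) for p in plaintexts]
guess, peak = dca_recover_byte(plaintexts, traces)
```

Because the simulated trace is noise-free and unencoded, the correct guess separates the two groups perfectly, which mirrors why the challenges without encodings fell so quickly. With encodings in place the peak is smaller and more traces are needed.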
This was already studied in a follow-up paper at FSE this year: it is because the nonlinear encodings you put on these lookup tables do not sufficiently hide all the information, so you are still able to correlate when the correct key is used. And this really opens up a lot of potential for follow-up work, because we know all these countermeasures we study here at CHES. But it poses some challenges, because most of the countermeasures make use of some sort of entropy: they need random data. In the white-box model, by definition, the adversary is in control of the entire platform, so he could just disable this entropy. So maybe, for follow-up work, you could use some sort of static data within the implementation itself. Or the only really random data you could use is, of course, the input to the white box, so maybe initialize something based on that input. And, as we saw in the previous two talks, can we borrow ideas from threshold implementations to help secure these software implementations?
Another countermeasure, of course, would be very strong detection mechanisms that notice we're running within a DBI framework and then just output wrong results. That would of course make our attack fail, but the hacker community is quite active in circumventing such mechanisms. Still, it might be a more practical approach. And of course DCA is not a holy grail; it will not always work. When you use really large nonlinear encodings this attack will not work, because they really hide all the correlation information. A side effect is that the lookup tables get even larger in practice. That would help against DCA, but there are still other attacks, like the algebraic attacks people used before, and there everything would still work.

Then, not from a countermeasure but from an attack perspective, it would be interesting to go into more depth. We only tried a DPA attack on the software traces, but one can think about bringing over more of the attacks from the gray-box world to the white-box setting. One of the things is that Riscure has already shown that fault attacks work very effectively in the software setting as well; more information can be found in their presentation last year at Black Hat. And then of course the question is: what about higher-order attacks and all these things we all know? Will they reduce the number of traces needed even more, or will they counter some of the countermeasures which might be effective against this type of attack?

We released everything we created related to these attacks and put it online as open source. If you go to this URL, you will find scripts to run all these attacks, our software tracer plugins for Pin and for Valgrind, and the visualizer. We actually didn't find any good open-source CPA tools. Maybe we didn't look closely enough, but we wrote one ourselves.
So that's now open source as well, and for the fault attacks we created an implementation as well. If people know more about white-box challenges, we'd be happy to learn about them, and if people are interested in extending our tools, they're more than welcome to do so.

So, to conclude: software-only solutions become more and more popular, and especially white-box implementations. They're still used in the traditional setting, so for digital rights management, but now, with the upcoming new use cases for HCE, they spark a lot more interest. But the level of security, or the level of maturity, of all these white boxes is quite questionable. We didn't look at any industrial designs; these designs are often kept secret. But when these things get deployed so widely, it would be good to investigate this in more detail. What we did is present a new automated attack, DCA, which allows you to extract the secret keys from these secure white-box implementations in an easy manner, by using well-studied techniques from the gray-box community. And yeah, we really hope this sparks more interest from you guys to look into this topic in more detail. Thank you for your attention.