Thanks for the introduction. Hey guys, it's a pleasure to be here. Awesome, yeah. Let me tell you some stories from the trenches where we are fighting every day. This is joint work with Volodymyr Kuznetsov, László Szekeres, George Candea, R. Sekar, and Dawn Song. Most of the work has been done at UC Berkeley. Volodymyr and László did most of the heavy programming work and heavy lifting, while I only coded some of the support libraries and a little bit of the other stuff, so most of the programming praise goes to the two students. Also, blame me for all the bad jokes in the presentation; it's not their fault.

So let's start right away. We live on a buggy planet, and it's just terrible out there. There are so many bugs, it's just crazy, and we are overrun by them if you look at what happened in the last couple of years. Memory corruption bugs are abundant; they are literally everywhere, and there are so many CVEs out there where an attacker can gain some form of control-flow hijacking permissions or capabilities. We just picked a couple of programs, like Acrobat, Firefox, and IE, plus general OS X and Linux bugs, that allow an outside attacker to gain control-flow or code-execution capabilities on a system. The attacks are on the rise, and all of these attacks rely on some fundamental memory corruption vulnerability somewhere in the background. We are being overrun by all these bugs even though we are investing more and more time into fuzzing and tools that find bugs; there are just way too many out there for us to fix manually. So we have to come up with techniques that allow us to take proactive steps against the bugs and vulnerabilities in our software, to protect it against different forms of control-flow hijack attacks. Just to name two prominent examples from this year:
There was the Heartbleed vulnerability, which was a memory safety, a memory corruption, vulnerability, and Shellshock as well. Some of these vulnerabilities lie asleep and slumber in the software for many, many years, and suddenly they are found and can be exploited on a very, very wide scale. I argue that we have to come up with proactive defense mechanisms that protect us against these kinds of vulnerabilities.

So let me take a step back and explain what memory safety is all about and why we don't have it yet. At the most basic level, a memory safety violation relies on some form of invalid dereference. For example, there can be dangling pointers, a temporal issue: at one point in time, in a language like C or C++, we have a memory object and a pointer to that memory object, so we have a valid reference. But we free that object at some point, it becomes an invalid object, and there is a dangling pointer to it, which can lead to memory corruption when we dereference it. Or there are out-of-bounds pointers: imagine that we are iterating through an array, and as soon as we step outside of the array we have an out-of-bounds pointer, which is a spatial memory safety violation. Both of these pointers are still fine as long as we don't dereference them, but something bad will happen if the pointer is read, written, or freed again, and we end up with some form of corruption; otherwise, there's no violation.

The threat model that we're going to use for this talk looks as follows. We assume that our attacker can read and write arbitrary data in the whole process image. The attacker can read code as well, but the attacker cannot modify the code as it is being executed (we'll discuss some of these limitations later on), and the attacker cannot influence program loading. If the attacker has control of the whole process image before we can even set up our defense mechanism, there's nothing we can do. To put it in permission terms: the attacker has execute permissions on the code, but cannot change the code itself; there's read and write for the heap, and read and write for the stack.

Just so that we are all on the same page, I'll quickly walk you through a control-flow hijack attack as it is used nowadays to sidestep all the defense mechanisms that are active on current systems, so that we can understand how the defense mechanisms I'll present later on will work. We have this simple C program on the left-hand side. We define a function pointer, and there's some weird pointer arithmetic going on where we assign q to a buffer plus some attacker-controlled input, so the attacker can point q somewhere into memory by carefully crafting the input. After that, the function pointer is assigned to the function foo, but later on the attacker-controlled pointer q is assigned a value. Initially the programmer intended q to point into the buffer, but the attacker can redirect q to point to some function pointer somewhere in memory. As the second step of the attack, the attacker writes to that function pointer, and instead of pointing to the valid function we want to execute, the function pointer now points to an attacker-controlled gadget. This enables a control-flow hijack attack that the attacker can exploit as soon as the pointer is dereferenced, and as soon as that happens, all bets are off: the attacker runs code and can take control of your system. But you might say: we've got all these fancy defense mechanisms out there. What about these existing defenses?
So, yeah, that's actually true. We've got a bunch of defense mechanisms, like data execution prevention, which prohibits an attacker from injecting code; that's why I assumed in the beginning that the attacker cannot modify the running code. This is all fair and nice, but a remaining problem on current systems is that the attacker can stitch individual gadgets right next to each other and thereby execute arbitrary code: so-called return-oriented programming, or jump-oriented programming, if you want to look it up. There's ASLR, which is a nice probabilistic defense mechanism that shuffles memory around, but it's only probabilistic, and if we assume that the attacker can read arbitrary memory, the attacker can easily sidestep it: by reading out some pointers, the attacker reconstructs the randomization that was applied, leaks the correct addresses, and carefully crafts an attack that bypasses the defense. The same goes for stack canaries, which protect against overwriting the return instruction pointer on the stack: if the attacker can read out the stack frame beforehand, the attacker can carefully craft an attack that sidesteps this defense mechanism too.

So it looks like we're hosed, right? If you want to know more, there are a bunch of nice papers out there, or you can just watch the talk from last year, where we evaluated the eternal war in memory: all the different attack vectors, how they play together, and how an attacker can exploit them. So now that we know that we cannot just use C or C++ code, because that code is full of bugs, you might say: aha, memory safety to the rescue, we'll just move to a safe language instead of coding in C or C++. Programming language research has come up with a whole bunch of languages.
There's Python, there's Java, there's C#, there's Swift. These are all memory-safe languages, and the problems I just talked about would completely go away. Sounds good, right? So where do we stand? Let's look at the Dropbox uploader for your files. It's written in 3,000 lines of Python code, completely memory safe. Everything is good, right? There's no way you can exploit that code. But imagine what you need to run those 3,000 lines of Python code. Well, for one, there's the Python runtime: half a million lines of code, written in C, and again we have all the memory corruption vulnerabilities. In addition to that, we use the libc for all the system calls, another half million lines of C code. And on top of that, there's that thing called the Linux kernel, or the Windows kernel: another 16 million lines of C code. So C and C++ code, low-level languages, are not going away. There's no way we can fix all these bugs; it's just way too much attack surface, and we need to come up with strong defense mechanisms for it. If you look at the current state of defense mechanisms, a lot of the code that runs on a system, even if you program in a safe language, is actually unsafe, and only a tiny subset is written in safe code, if at all. So if we ask how close we are to safe languages: we are way off, there's a huge way to go, and we need to get much better. So you might say, okay, fair enough, we cannot rewrite everything in a safe language. So let's just retrofit memory safety on top of the old languages, right?
Sounds like a good idea, because with memory safety enforced, all these bugs would go away. Academia and industry came up with a bunch of defense mechanisms along these lines that you can just plug into your compiler: you more or less easily recompile your software, do a couple hundred lines of code changes, and you're good to go. There's SoftBound+CETS, which is a great defense mechanism that retrofits memory safety on top of C and C++, but it comes with an almost 120 percent price tag, so it's fairly expensive. Also, it's not exactly compatible with all code, though it runs most of it. There's CCured, which tries to retrofit memory safety on top of the C language by restricting and enforcing a stronger type system that then provides some form of memory safety, but again it comes with a 60 percent price tag, which is way too expensive in practice. Or there's AddressSanitizer, which is great for debugging and a great tool, but on one hand it's only probabilistic, and on the other it adds 73 percent overhead. So as the tools stand right now, they have way too much overhead. Still, that's what we want to do: retrofit memory safety on top of these existing compilers and enforce it at runtime, to ensure that we are protected at all times. Just to show you where this overhead comes from:
I'm going to give you an example of how SoftBound would enforce memory safety on this small program. We've got a couple of lines of C code: a buffer that is allocated, and our pointer q again from the motivating example, which is assigned the base address of the buffer plus some user-controlled input. SoftBound is a compiler-based transformation that adds additional checks in the background, which are then executed at runtime. First of all, it assigns metadata to all the pointers. As soon as we declare the buffer, two additional variables are declared that contain the lower and upper bounds of the buffer itself. So we have a lower-bound pointer and an upper-bound pointer, and those are carried along through all the accesses and so on. If we do have assignments from other pointers, we propagate the metadata: the two variables q_low and q_upper are assigned the lower and upper bounds of the buffer. This metadata is carried along and can be used for further protection. In addition to that, we have a dereference check whenever the pointer is used: before the statement *q = input2 is actually executed, we check whether the current value of q is inside the bounds, and abort otherwise. That protects us from any attack that would move q out of bounds, so the function pointer cannot be overwritten in this example.

What we get from this is the 116 percent performance overhead, because there are just way too many pointer assignments in low-level languages like C and C++, and the compiler has a very hard time optimizing them and getting rid of all the surplus checks. In reality, with a perfect compiler, we should be able to prove for many more accesses that they are actually safe, but it's very hard to reason about these things at the compiler level, and therefore we have this fairly high overhead even with a very sophisticated compiler analysis framework like LLVM. So it looks like this: we are walking towards that safe haven, and the safe haven exists, but we are facing a problem: we get either safety, or flexibility and performance. It's an either/or, which is really bad; ideally we want safety, flexibility, and performance all at once. If you want to know more about memory safety, feel free to either read the paper or watch last year's talk by Andreas, who presented SoftBound for FreeBSD.

So now that we know how memory safety works, can we adapt it somehow so that we protect only a small set of data? We no longer want to protect all the data, because that's way too expensive; otherwise we would run into this high overhead again. Instead of protecting all the data out there, let us focus on a small subset and protect that: just a couple of code pointers on the heap, and a couple of pointers and variables on the stack that we deem to be protection-worthy. So instead of enforcing a probabilistic defense for all the data, or a strong defense mechanism for all the data at high overhead, we take a different path.
We offer strong protection for a select subset of data, and in addition to that we have a very different attacker model compared to other defense mechanisms: we assume that the attacker may modify any unprotected data out there and can freely write to any of the data that we don't care about. So instead of protecting everything a little, we protect a small set of data completely. And just to give you a sneak preview: we bring the overhead down from complete memory safety, which faces 120 percent overhead, to as low as two to eight percent, if we protect only code pointers. We focus our protection on code pointers and enforce strong memory safety for them. Anything that is used in a control-flow decision at some point in time, through an indirect jump, an indirect call, or anything like that, will be protected by our defense mechanism. But we don't care about any other data on the heap or on the stack; we focus only on the small subset of code pointers that are actually used for control-flow decisions, and therefore we can protect against control-flow hijack attacks. We don't protect against data-only attacks, though.

To actually enforce this, we had to come up with a set of special techniques that transform your program into a protected program. One of the core ideas is something that has been used in networking for decades but hasn't been used in software engineering: we separate the control plane and the data plane. Instead of having one single view of memory, we separate the program memory into two different views. On one hand we have the regular memory with all the buffers, all the pointers, and so on; on the other hand we have safe memory, and our safe memory contains code pointers and code pointers only. The regular memory contains all other data; it's just that the memory locations for the function pointers are not used. The memory layout itself is unchanged: in the place where a code pointer was before, there will just be an unused block. In the control plane, any memory location is either a code pointer or null. We can impose this memory view using a compiler-based technique and enforce that the safe memory region only contains code pointers and nothing else, but more on that later. For now, just remember that we split the memory view: the safe memory view contains code pointers and null values, nothing else, and the regular memory holds the rest of the data.

On the stack we have a different kind of technique: we split the stack, similar to the heap memory, into a safe stack and a regular stack. We add an additional compiler instrumentation pass that looks at all the local variables in a stack frame, and every local variable that our instrumentation pass can prove is safely accessed is pushed onto the safe stack, while all the other variables, the ones we cannot prove safe, are pushed onto the regular stack. Things that can make a variable unsafe are, for example, weird pointer arithmetic, or a pointer that escapes the local stack frame. In those cases we keep the variable on the regular stack, and we assume that the attacker can corrupt anything on the regular stack. We ensure complete safety on the safe stack; we give no guarantees on the regular stack. If you look at our small code snippet, the variable r and the return address would be pushed onto the safe stack, and the buffer would be pushed onto the unsafe stack, where the attacker could corrupt it along with the other unsafe data. Using this principle, we ensure that the attacker can only corrupt data that we're not interested in protecting, and we can decide how much of the data we want to protect. In our case we protect anything that is a code pointer, contains code pointers, is used in control-flow decisions, or is proven to be safe in the stack frame.

So if we look at the memory layout from above, we have two areas of memory: safe memory for the code pointers, and regular memory for all the other data. For safe memory, we ensure and guarantee that all accesses are safe; for regular memory, accesses are fast, but we give no other guarantees. Between safe memory and regular memory we use hardware-based, instruction-level isolation, using a technique like segmentation or some other form of blinding. The regular memory contains the regular heap, stack frames, and the read-only code regions; the safe memory likewise contains the safe heap and the safe stacks of the individual threads.

Now that I've shown the basic overview of how we separate code pointers, and it's just like in the name, code pointer separation, all the code pointers are in a completely different memory space: how much protection does that give us? Let's look at how we can attack code pointer separation. We have this small C program here. It looks very similar to the motivating example I used in the beginning, with a slight difference: instead of assigning a function pointer directly, we assign a function pointer through a struct. So the function pointer sits somewhere in the struct, and we have a doubly indirect dereference to reach it. Again, the attacker has the opportunity to corrupt the buffer, and we know that the attacker cannot corrupt the function pointer itself, because the function pointer lives in a completely different memory space and is therefore protected. Our type-based analysis ensures that the function pointer is in the other memory space, so with this simple write, which has a different type, the attacker cannot overwrite the function pointer itself. But what the attacker can do is overwrite the struct pointer and let it point somewhere else in memory, where possibly some other function pointer is lying around.
Think back to that separate memory view: the code-pointer memory view contains only code pointers or null, and whenever a region is freed, we clear the code pointers in it as well. So this memory view contains only the code pointers that are currently active and in use. Using this kind of attack, the attacker can redirect the dereference to some other code pointer, which is either null or a pointer to another function, but we'll see in a moment what the defense mechanism looks like in the end. So, using the compiler-based technique, in our analysis phase we identify all code pointer accesses through a static, type-based analysis and redirect them to the separate memory view. The two memory views, again, are separated using instruction-level isolation, for example segmentation on x86 or blinding on x86-64, and we give a bunch of security guarantees. Using the separation, and you have to think about it for a while until you understand the guarantees, we ensure that the attacker cannot forge any new code pointers. Using a memory write through a non-code-pointer type, the attacker can never write into the safe memory view. We guarantee that any pointer written into that safe memory view is either an immediate, a fixed value, or assigned from another code pointer. Therefore an attacker can never construct a code pointer to an arbitrary location in memory.

What the attacker can do, going through that doubly indirect access, or through multiple indirections, is replace existing function pointers. So for example, foo's function pointer can be turned into bar's function pointer, but it must be a code pointer, it must exist, and the containing memory region must be valid at that point in time, so a successful attack is very unlikely. What we basically did is take all the code pointers that are used in a program, group them together, make them look out for each other, and protect them against modifications from the non-code pointers out there. Also, if you think about it, memory safety violations usually happen not in pointer arithmetic on code pointers but in pointer arithmetic on some buffer or other input, so it is very unlikely that this will be exploitable.

This is simple code pointer separation, but can we do better? Remember, we talked about memory safety before, and here I haven't talked about memory safety yet. Code pointer integrity takes code pointer separation as a baseline and, in addition, enforces memory safety for the code pointers. The sensitive pointers that we protect are the code pointers and, in addition to that, any pointers used to access these sensitive pointers: everything in a dereference chain automatically becomes protected as well. On top of that, we enforce bounds checks, as I've discussed before, for all the sensitive pointers that we identified. We could identify individual instances of sensitive pointers, but instead we do an over-approximation, which does not lower security but might increase overhead: instead of protecting individual instances, we protect the types that we identify and deem to be sensitive.
So it's an over-approximation, which is safe; it only affects performance. We measured on SPEC CPU2006, which is a standard benchmark, that under this full transformation roughly 6.5 percent of all memory accesses are accesses to sensitive data. So let's see how our example looks if we add bounds checks on top of the separation. Just like in the memory safety example before, we add the lower and upper bounds, and we execute an additional check that the bounds are still valid. Before the attacker could overwrite the function pointer or redirect the struct pointer, an exception would be triggered when the pointer is dereferenced.

So if we compare code pointer integrity and code pointer separation, what additional security guarantees do we get? Both defense mechanisms separate sensitive pointers from regular data, and both use a type-based static analysis. Where they differ is in what the sensitive pointers are. For code pointer separation, sensitive pointers are code pointers only. For code pointer integrity, we add pointers to sensitive pointers, which is a recursive definition that follows the dereference chains, and we therefore protect anything that points to code pointers, directly or indirectly, through any chain. We guarantee that accessing sensitive pointers is safe: we use instruction-level-granularity separation, and code pointer integrity additionally uses runtime bounds checks on top of that. On the other hand, accessing regular data is fast: we don't impose any additional instructions when accessing regular data that we don't want to protect, and this is what allows us to get very low overhead. So what kind of security guarantees do we have?
For code pointer integrity, we offer formally guaranteed protection, and if you are interested in the more theoretical aspects, we have a formal proof in our paper. All in all, if we enforce memory safety for the code pointers and the sensitive pointers we identify on top of them, we run into 8.4 to 10.5 percent overhead for roughly 6.5 percent of memory accesses, which is almost deployable, and definitely deployable if you are interested in protecting against specific attacks or in particular security contexts. For code pointer separation, we offer strong protection in practice: it will be very hard for an attacker to find an exploitable condition. We don't say it's impossible, but it will be very hard, and it's definitely much stronger than any of the defense mechanisms that exist currently. It offers complete protection against return-oriented programming attacks and strong separation for any code pointer on the heap, at almost negligible overhead: 0.5 to 1.9 percent, that is 0.5 percent for SPEC CPU and 1.9 percent average overhead for the Phoronix benchmarks, while protecting roughly 2.5 percent of memory accesses. What we can also do is use only the safe stack, which protects the return instruction pointers on the stack and offers full protection against return-oriented programming attacks at negligible overhead. So we've got different levels of defense that you can use, however you feel like and however big your budget is: if you want strong deterministic guarantees, use code pointer integrity; if you want strong practical protection, use code pointer separation; if you just want to be sure that return-oriented programming is not possible, use the safe stack. So, enough of the design of the system.
Let's talk about the implementation. We implemented it on top of clang, where we collect type information, and all the transformations are then done in an LLVM instrumentation pass, either CPI, CPS, or the safe stack. We have additional runtime support for the safe stack and the safe heap, plus all the management functions on top of that, and currently we support x86-64 and x86, although x86 is a bit shaky. For supported systems, we support Mac OS X, FreeBSD, and Linux. The current status of the implementation: this is a research prototype after all, so nothing is perfect. We have great support for code pointer integrity, code pointer separation, and the safe stack on Mac OS X and FreeBSD on x86-64, and fairly good support for the other architectures, but we are working on improving the code quality. We are currently in the process of upstreaming the patches, and as you might imagine, there's a whole bunch of changes, so we built a couple of chunks that we are upstreaming one after the other. Currently we are working on the safe stack, because that's the logical first choice: it has lower overheads than the stack canaries that are currently used, it offers full deterministic protection, and it's widely compatible with anything that's out there. It's easy to add and gives you very strong protection. We are hoping to finish integrating the safe stack soon and will then continue with the CPS and CPI patches. You can fork the current version on GitHub; it's all out there. Feel free to give it a try, and we're happy to hear back from you if you find bugs or anything else. Also, we're currently doing a code review for code pointer separation and code pointer integrity. There are still a couple of bugs to be fixed, but you can already play with the prototype. It's out there: download it, try it out.
We will definitely release more patches and updated sources soon. We obviously ran into some problems as well, and there were some changes to super-complex build systems; for example, when we worked on FreeBSD, we had to adapt some of the larger makefiles to actually get it to run. So you might ask: is it practical? Well, it is. We recompiled the complete FreeBSD user space with our protection, and in addition to that, more than a hundred packages with the strong protection guarantees, where we can now guarantee that control-flow hijack attacks are no longer possible, at the fairly low overheads you will run into. So now it's up to you to do your part: look at the code, try to port some of the more complex makefiles, try to find bugs in our implementation, help us get the word out there, and get distributions to actually use these patches. Let all attacker-accessible software be compiled with a deterministic and strong defense mechanism; if we compile our software that is reachable from the internet using these strong defense mechanisms, we can stop control-flow hijack attacks.

So let me conclude. Code pointer integrity and code pointer separation offer strong protection against control-flow hijack attacks, and the key insight is that we enforce memory safety for code pointers only, which limits the overhead that we actually incur, bringing it down from almost 120 percent to less than two to eight percent on average, depending on whether you use code pointer separation or code pointer integrity. That is easily deployable in practice and can be used on a very wide scale without impacting the runtime performance of current software. We have a working prototype which supports unmodified C and C++ at very low overhead in practice. Upstreaming of our patches is in progress; the safe stack should be available soon. You can fork it on GitHub, and you can read the paper on our home page. We'll be happy to hear from you if you have questions, if you find bugs, if you want to audit the code, or if you want to help in any way. If we continue like that, we'll be able to understand the bugs, and in the end we can get rid of them and protect against them. With that, I would like to close my talk, and I'm happy to answer any questions. Thank you.

Thank you very much, Mathias Payer. As you heard, he's open to questions now. We have four microphones in the room: microphone one, microphone two, three, and number four. You can go up to any of them, I'll just pick them one after another, and I will start with microphone two, please.

Thank you for a fascinating talk on the approach here. I've just got one question. You've discussed extensively the overhead in runtime; what's the overhead in memory footprint? Because normally the page size will be the limit for your hardware protection.

Yeah, so the memory overhead is not too bad. Naively, you would just shadow the whole memory space, right?
What we do: we implemented several versions where we store the code pointers and the metadata. You can use a hash map, a compacted hash map, some form of array, or some other data structure that is protected and moved to the side. I'm not 100 percent sure if we put numbers into the paper, but they are in the low single digits. There's not too much memory overhead because, for one, there are only very few code pointers, so we don't need to store an excessive amount of data. Also, we can store them in very compact representations, using some form of hash map or similar that gives us the illusion of the full memory space.

Okay, all right, thank you. Microphone one, please.

Yes, in your motivating example, the biggest chunk of potentially vulnerable C code was the Linux kernel, and I was wondering whether this kind of protection is practical in kernel space, with full permissions in hardware.

That's a very good question. It's also a very hard question to answer.
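The compact metadata store described a moment ago, keeping shadow entries only for code-pointer locations rather than shadowing all of memory, could be sketched roughly as follows. This is an illustrative toy (a small direct-mapped table; the names `cpi_set`/`cpi_check` are ours), not the actual CPI implementation:

```c
#include <stdint.h>

/* A tiny direct-mapped table that shadows only code-pointer slots.
 * Because code pointers are rare, a compact structure like this (or a
 * hash map) suffices instead of a full shadow of the address space. */
#define TABLE_BITS 10
#define TABLE_SIZE (1u << TABLE_BITS)

typedef struct {
    uintptr_t loc;   /* address of the code-pointer slot */
    uintptr_t val;   /* last legitimate value stored there */
} cpi_entry;

static cpi_entry table[TABLE_SIZE];

static unsigned cpi_hash(uintptr_t loc) {
    /* pointers are word-aligned, so drop the low bits before masking */
    return (unsigned)((loc >> 3) & (TABLE_SIZE - 1));
}

/* Record a legitimate code-pointer store (collisions simply overwrite
 * in this toy; a real implementation resolves them). */
static void cpi_set(uintptr_t loc, uintptr_t val) {
    cpi_entry *e = &table[cpi_hash(loc)];
    e->loc = loc;
    e->val = val;
}

/* Returns 1 iff the observed value at loc matches the metadata,
 * i.e. the code pointer was not corrupted in regular memory. */
static int cpi_check(uintptr_t loc, uintptr_t observed) {
    cpi_entry *e = &table[cpi_hash(loc)];
    return e->loc == loc && e->val == observed;
}
```

A corrupted code pointer then fails the check before any indirect call through it.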
So for the C code we'd be perfectly fine. Unfortunately, the Linux kernel contains a whole bunch of assembly code and inline assembly code, which is very hard to protect, so we cannot give any guarantees for inline assembly that is in the source code. We could imagine some form of annotation-based system where the programmer has to identify assembly or inline assembly sequences that modify code pointers; if so, we could give the same guarantees. Our instrumentation pass runs on top of LLVM, and LLVM does not have any type information for the inline assembly code that you put into the code. So if you as a programmer supplied additional annotations so that we could reason about the inline assembly code, we would be fine, but that is definitely ongoing work.

Thank you.

We have a question from the internet.

Yes, thanks. There's still an open question in the chat room, and the question is: how are the address spaces separated, the pointers from the code?

Yeah, that's a good question. On x86 we use segmentation, a segmentation register that we set up, so we can easily use hardware-enforced separation. On x86-64 we can use a set of different techniques, depending on how much overhead you're willing to pay. You can use ASLR, a randomization-based approach, and just allocate your safe region somewhere in memory that is safe from the attacker, because a 64-bit address space is big enough that you can hide it. In addition to that, we guarantee that in unsafe memory, that is, attacker-accessible memory,
there will never be a pointer to our safe memory; therefore we are safe against information leaks. Or, if you're willing to pay two or three percent overhead, you can blind out a memory region: imagine that for every memory access you execute, every memory read or memory write that goes to unsafe memory, you do an AND on the actual address that is used, and thereby blind out, with a mask, the bits that address the protected memory. That costs you around two or three percent overhead.

All right, microphone two, please.

At the beginning of the talk you mentioned the Heartbleed bug, and as far as I could tell, none of your security measures would help against it.

Yeah, so the Heartbleed bug is basically a data leak, right? You could protect against the Heartbleed bug by extending the protection that we currently have to other data types as well. You've seen that we basically run a type-based analysis, and we currently identify everything that is a code pointer, is used like a code pointer, or is anywhere in the chain when a code pointer is dereferenced. But there's nothing that stops us from adding additional data and increasing the protection for that additional data. So instead of just protecting code pointers, we can select other data as well, and sensitive data types like private keys would be very good candidates for additional scrutiny and inclusion into the set of sensitive pointers. That's definitely ongoing future work.

All right, microphone three.

All right, you've mentioned that you've evaluated the performance with the SPEC benchmarks. Have you also evaluated the functionality? You've mentioned that you've compiled the whole FreeBSD set and a couple of hundred packages, but have you also run them successfully at all?
Yeah, so SPEC CPU is a self-validating benchmark suite that checks that the code runs correctly, and so does Phoronix. We did run the Phoronix benchmarks on top of FreeBSD, which is a big package of benchmarks that self-verify their results and ensure that they are correct.

And then you've mentioned that you have runtime support. What does that entail, and how portable is that?

So we do need runtime support to set up all the data structures, to set up the safe memory and so on, which is basically like compiler-rt in LLVM: a library that is linked into, or included into, any executable that is compiled. GCC has its own such library, and so on. It just contains a set of startup functions that set up the process image. When you execute a program, it doesn't start at main; there's a whole bunch of other code that is executed beforehand, and we add to that stuff that is executed beforehand. It's a standard compiler technique to include some initialization functions and things like that.

All right, thanks.

Thanks. All right, there's another question from the internet.

Yes, thank you. Another question is: what about applications with runtime code generation, e.g. browsers? Can you protect these?
Yeah, so that's a good question. In the assumptions at the beginning I put in that there's no self-modifying code, and browsers basically contain just-in-time compilers and therefore recompile code all the time. There are two answers to this question. One of them is: if you lift the just-in-time compiler itself into the trusted computing base, and you instrument it and give the compiler access to all the support functions, the compiler can produce safe code as well. The drawback, obviously, is that the compiler itself is then inside the trusted computing base during the time the program is executed, and might be attackable if there are bugs in the compiler. The other answer is that most of the time, even if we don't give any strong guarantees, the just-in-time compilers that we use in browsers compile a memory-safe language, and inside this memory-safe language the attacker doesn't have access to code pointers directly. So it should be safe in most cases, but obviously it's not a perfect answer. That's why we excluded just-in-time compilers, or just-in-time-generated code, from the attack model. You can come up with some defense mechanism, but you'll have to verify the compiler; that's the short answer.

All right, thank you. There are still three questions left, at microphone two I guess.

So you said that this doesn't work in all cases; it works in almost every case, but not in some. You already explained that it wouldn't work for inline assembly and things like that. What are the other cases?
Well, it's not necessarily that it doesn't work, but for code-pointer integrity there are sometimes very weird casts. C code, if you look into what is actually compiled and written out there, can become very, very ugly. According to the C language specification you're allowed to cast everything into void* or char*, and if you have a bunch of these casts, from code pointers to void* and back and forth and all that, you end up protecting all the pointers, and then you end up with a fairly high amount of overhead. It actually happens a lot that pointers are cast into a char pointer and then back into a struct pointer that then contains a code pointer; if that happens, you might end up protecting all the pointers, which results in fairly high overhead. So you do have high overhead as a drawback. What we don't support is if something weird happens from the hardware, say if you would have to protect the VMM, or if you have access to page tables, let's say in the kernel. We don't protect data memory, so the attacker could write into the page table or something like that.
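To illustrate the cast pattern the speaker describes, a function pointer laundered through char* and back, here is a sketch in legal C (the names `handler`, `erase_type`, and `invoke` are ours, not from the talk). A type-based analysis that sees code pointers flow through char* like this must conservatively treat many char* values as sensitive, which is where the extra overhead comes from:

```c
/* A struct holding a function pointer is passed around as char*,
 * then cast back; the code pointer "disappears" from the type system
 * in between, forcing a conservative analysis to widen its set of
 * protected pointers. */
static int add_one(int x) { return x + 1; }

struct handler {
    int (*fn)(int);   /* the code pointer */
};

static char *erase_type(struct handler *h) {
    return (char *)h;                 /* code pointer now hides behind char* */
}

static int invoke(char *blob, int arg) {
    struct handler *h = (struct handler *)blob;   /* cast back */
    return h->fn(arg);                /* indirect call through it */
}
```

Both casts are permitted by the C standard (object pointers may round-trip through char*), so a sound analysis cannot simply reject such code.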
You could come up with very weird stuff that way, but in user space you should be safe.

Hi there. It looked from the talk like all the work here is being done in software. I was wondering, do you see any opportunities for hardware acceleration in, say, the CPU, to reduce some of the overhead you're talking about? MPX or anything else you can come up with?

Yeah, we actually looked at MPX. It's a bit of a bummer that it's not available in real hardware yet, but then again Intel kind of advertises MPX as a debugging feature; there are some hints that it'll have up to 40 percent overhead. We'll definitely look into MPX as soon as it is out there in real hardware. What we could really profit from is some faster implementation of the additional tables that we have, for the bounds information and so on, and Intel tried to address that using MPX. Something following this line of work could be interesting, or some other table lookup that you can speed up using some additional instructions might be interesting as well.

Thank you. Next question.

Hi. I wanted to ask, what is the interaction between this and dynamic linking? For example, what happens, or can you link unsafe plugins, blobs that aren't compiled with this, into your code and still have guarantees of safety, or the other way around, have a user program that's not compiled with this loading a safe library?

So let's start with the safe stack first. Let's assume we only use the safe stack. If you branch into unprotected code, we just don't give any guarantees while you're executing unprotected code; as soon as you return to protected code, it continues.
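The safe-stack split under discussion can be sketched conceptually: buffers that can be overflowed are moved off the native stack, so an overflow can never reach return addresses. The bump allocator below merely emulates the unsafe stack; in reality the compiler manages it, and all names here are illustrative:

```c
#include <stddef.h>
#include <string.h>

/* Emulated "unsafe stack": overflow-prone objects (arrays,
 * address-taken locals) go here; return addresses and scalars stay
 * on the regular (safe) stack, out of reach of buffer overflows. */
static char unsafe_stack[4096];
static size_t unsafe_top = sizeof(unsafe_stack);

static void *unsafe_alloc(size_t n) {
    unsafe_top -= n;                  /* stacks grow down */
    return &unsafe_stack[unsafe_top];
}

static void unsafe_free(size_t n) {
    unsafe_top += n;
}

/* Instead of `char buf[16]` on the native stack, the instrumented
 * version of this function places the buffer on the unsafe stack. */
static size_t copy_name(const char *src) {
    char *buf = unsafe_alloc(16);
    strncpy(buf, src, 15);
    buf[15] = '\0';
    size_t len = strlen(buf);         /* return address was never adjacent */
    unsafe_free(16);
    return len;
}
```

An overflow of `buf` could at worst clobber other unsafe-stack objects, never the saved return address, which is the guarantee the safe stack provides on its own.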
So unprotected code is perfectly supported for the safe stack; we don't give any guarantees while you're executing the unprotected code. For code-pointer separation or code-pointer integrity it looks a bit different. As long as you don't modify any of the sensitive pointers, you're fine. If you modify the sensitive pointers, then some of the pointers could be out of line: you would miss some of the updates, because only regular memory would be written, not the shadow location that is actually used. So there are no safety guarantees there and you could miss some of the updates, but the unsafe code, if you use segmentation for example, wouldn't be able to modify the code pointers even when executing unsafe code.

So which way around is safer: loading unsafe code into your safe code, or the other way around, loading a safe library?

Well, it just depends, right? While you're executing unsafe code, there are no guarantees. That's basically how it looks. You could run some form of binary instrumentation on top of it and pay a very high overhead for the unsafe code, but that's not what you would want.

All right, we have three questions and five minutes left, or four minutes. Go ahead.

Okay, not exactly a question, just a comment: casting function pointers to void pointers is not allowed. I don't have the C standard memorized, but I'm pretty sure of that, and ICC and XLC both warn about it, but GCC does not, for whatever reason. I don't know about char, but casting to char and back is allowed.

Yeah, POSIX requires it; with dlsym you get a void pointer.

All right, thanks. Microphone two, and after that the internet.

C was developed a few...

Say again?

C was developed a few centuries, no, a few decades ago. So why has it taken so long to find such a simple solution?

Well, the solution is not simple, right?
It's fairly complex. We run a whole bunch of type-based analyses, and these type-based analyses have only come up to speed in the last couple of years. You've seen that I talked about CCured, which was proposed in the early 2000s, I think 2002 or 2003 or so, and there's been a lot of research going on since. Only now do we have frameworks available that let us actually do these heavyweight transformations and then run additional optimization passes on top of them, to get the overhead low enough that it can actually be usable in practice. Also, people assumed that C programmers would write safe code, but apparently it's not like that.

All right, one last question from the internet.

Thank you. I have to read it out; I actually have no clue what it's about. In the context of C++, wouldn't all the pointers to instances of classes containing virtual methods be protected, and by extension all classes containing pointers to those classes as members?

Should I read it again?

Yes, please.

In the context of C++, wouldn't all the pointers to instances of classes containing virtual methods be protected, and by extension all classes containing pointers to those classes as members?

Yeah, so the answer is yes.

All right, thank you very much. If there are no further questions... all right, one last one, really quick, microphone three.

I don't know, is it on? Okay. Expanding on the last question: what's the performance hit on that?

The performance hit on which one?

On the last question, when you get,
because that sounds expensive to me.

It actually depends. We only look at the pointers, and luckily a lot of this stuff is pushed onto the stack, so we don't pay anything on the stack. We do get a performance hit on some of the stuff on the heap, and especially for C++, as was said, we might end up protecting a whole bunch of the pointers to objects with virtual functions. It depends on the program and on how frequent these pointer operations are. There's a full list of individual benchmarks in the paper. Most of the benchmarks are below 5 percent, some of them are around 10, very few around 20, and the highest overhead we've seen was roughly 80 percent.

Thank you.

And for the one with 80 percent, there's definitely room for future optimization, where you can reduce the number of checks by streamlining and grouping them and reducing the total amount.

Thank you.

All right. Thank you very much again, Mathias Payer. This was on code-pointer integrity.