So, C++ programmers kind of use typecasting like we just go for a beer at a bar. The types aren't really enforced very carefully, at least not by the compilers, and memory access is also a bit of a problem. This leads to all sorts of issues, specifically type confusion and other vulnerabilities. Mathias Payer has come to us all the way from Purdue University, via other places, to tell us all about a compiler-based extension that he's been working on that should be able to detect and avert some of these bugs. And with that, it's over to Mathias. ... and different forms of type safety vulnerabilities that can be exploited by attackers to gain some form of code execution. And we've worked for quite a bit on looking at the type hierarchy of C++ and the different forms of vulnerabilities that can arise and then be exploited in different ways. Just a quick show of hands: who here is a C++ programmer? Wow, pretty much every one of you. Who has written more than, let's say, 10,000 lines of C++ code? A lot of you. Okay, that's a good setting. As C++ programmers, you're likely used to the different forms of type casting that you have. You have static casts, you have dynamic casts, and you have a bunch of other casts. And amazingly, C++ has no form of type safety whatsoever. So this talk could also be described as cloudy with a chance of calculators, as we'll see in a bit. There's a lot of opportunity for an attacker to exploit these different forms of type confusion. And type confusion leads to remote code execution: an adversary can abuse the different forms of type confusion and the different settings in the type hierarchy to pretty much execute arbitrary code on your system.
And especially browsers and hypervisors and kernels are great targets to find different forms of type confusion and then exploit them, as was recently shown at the Pwn2Own competition. And we've seen a couple of these type confusions being used to spawn calculators all over the place, which is fun. A calculator pretty much shows you that you have arbitrary code execution, if you can spawn one. And if you think about it, the attack surface that we face on our systems is huge. We are no longer working with systems of a couple of thousand lines of code, but with millions and millions of lines of code. The abstraction is immense. If you look at Google Chrome, for example, we have about 76 million lines of code, which is an immense code base. And it's very hard to protect against vulnerabilities in such a large code base. Even though the folks at Google are doing an awesome job at code reviews, figuring out test cases, checking all the different conditions, there's still a large opportunity for different forms of type confusion in this large source base. And it's not just Google Chrome with its 76 million lines of code, but also a bunch of other systems on top of that. There's your window manager, there's your standard library, there's the Linux kernel, there's a hypervisor, and so on. This easily clocks in at more than 100 million lines of code, and there are a lot of opportunities for different forms of type confusion. We will explore these different opportunities for type confusion. We will see how we can find type confusion vulnerabilities, how we can automate the search for them, and we'll also discuss what kind of capabilities an adversary can gain through different forms of type confusion. So how can you exploit it? What is the underlying attack vector? What are the attack primitives? How can you build attack primitives?
And then, in the end, how you can automate it to figure out exploitability. Now, the attacker model is as follows. We start off with an external user, for example, without any form of code execution capabilities. You have to imagine that on one end you have a program that is answering different forms of requests. You're sending in a request, you're getting a reply. This defines some form of computational capability that you get on the other end. So you can send in a request, you get a reply, and it conforms to some form of computation that you're allowed to execute. You are severely limited in what kind of computation you can execute. For example, if you're interacting with a web server, you're sending in a request and you're supposed to get an HTML file as a response that you can then render and look at. An adversary tries to craft a request so that instead of an HTML document, you get a shell in return. And over several steps, you're extending your capabilities from an external user that is issuing requests to retrieve documents, to a local user, and then to an administrator account. These steps are followed to extend the capabilities of an adversary step by step. And the nice thing is that an external adversary can easily trigger these attacks through very simple means, by simple requests that are being sent in. On current software, we're mostly focusing on control-flow hijack attacks. And as a software security community, over the last 20-plus years we have worked on a large number of mitigations. These are mitigations that try to detect an exploit condition or some form of vulnerability that is being used. So given the attack surface that I've shown before, it is highly likely that the large amount of code that we have will contain vulnerabilities. So we are working on mechanisms that protect the integrity and availability of our systems even in the presence of vulnerabilities.
So as a very first step, we have to accept the fact that there will be bugs in our code. All the defenses and mitigations that we have focus on detecting this exploit condition and stopping an adversary from actually running a full exploit. And over the last 20-plus years, we've developed a set of different mitigations and defenses that make it harder and harder for adversaries to gain full code execution capabilities. So if you want to hijack the control flow on a current system, you need to jump through a set of hoops to actually get code execution, which then allows you to spawn a calculator or execute arbitrary other commands. In a control-flow hijack attack, what an adversary does is influence the address space of an application, of a process, to readjust the different code pointers, pointers, and data of the application so that it executes something different. Imagine the code is originally just the web server, but instead of serving a web document, you want it to open a shell for you that gives you full computational capabilities on that system, so that you can interact with the system and then further escalate your privileges to an administrator. Now, with all the different mitigations that we have in our systems, we've severely restricted the set of capabilities that an adversary can have, even if there are vulnerabilities in the code, memory safety vulnerabilities or type safety vulnerabilities. The slide here shows the address space of the program in an abstract form. We see that the code section is read-only and execute-only, and this is the only section that is executable. So an adversary can no longer inject new code; this is one of the defenses we came up with. In addition to that, there's the heap and the stack, which are readable and writable but not executable.
So the only way an adversary can influence the program is by modifying the data and then reusing the existing code. What we have here is a large number of code pointers on the heap and on the stack, which then point into the code region. To hijack the control flow and get code execution, an adversary can simply overwrite these code pointers and redirect them to some alternate location. And this is precisely what is being done in a control-flow hijack attack. This can either go through direct code pointers that point from the heap or the stack into the code region itself, or through vtables, which are the way C++ handles inheritance and virtual functions. If you have a virtual function, you are allowing it to be overridden depending on the class hierarchy. And when you dispatch on a virtual function, you're following a code pointer to a specific class-based implementation. As an attacker, we can influence all these different pointers and then redirect and stitch together the existing code parts in alternate ways. Now, as the show of hands showed before, all of you are C++ programmers, but I'll still quickly go through the different forms of casting behavior. Even if you've been using C++ for a while, you may not be aware of how the different casting operators actually boil down to the underlying code, how they are compiled down into an actual application. There are two main casting operations: static casts and dynamic casts. A static cast allows you to cast an object, or a pointer to an object, into a different type. The advantage is that a static cast is very, very fast. The disadvantage is that it doesn't do anything, except for a feasibility check at compile time. So what a static cast actually does is tell the compiler: please check if there is a path from the current type to the target type.
And if there's any path in the class hierarchy that goes from the source type to the target type, then the cast is allowed. It doesn't need any runtime information and it doesn't introduce any overhead, which is great for performance, but it doesn't give you any security guarantees. A dynamic cast, on the other hand, executes an actual runtime check. So a dynamic cast is somewhat comparable to a cast in another programming language, for example Java: when you're doing a type cast there, it is actually enforced that the object that you're casting into a different type actually is of that other type, so that the cast is allowed. And in C++, the dynamic cast leads to a runtime check. Now, to actually execute a runtime check, you need runtime type information to be able to decide: what is the actual type of the object? What is the type of the memory area that we are looking at? You need to identify the underlying type of the memory object in some form. And this is where we see some of the drawbacks of C++. C++ is pretty much an extension of C, and in C, everything boils down to untyped memory. Everything boils down to bytes in memory. You have some char array that can be interpreted in different ways, and without an actual identification of the underlying char array, you don't know what the type is. With a dynamic cast, you are using a unique identifier for an object to actually decide what type it is. And this is where the vtable pointer is being used. It gives you a unique identifier for the actual object that allows you to decide: this is the runtime type of this object. So it's a unique way to distinguish between the different object types. So let's look at the casting behavior in a little bit more detail. If we have a static cast, we cast an object b into a Greeter pointer.
And this is compiled down into a load of the pointer b into the RAX register and then a store into the target location. So there's no real type check happening there; the compiler only does a feasibility check at compile time, when it goes from B to A, to make sure the cast is plausible. Now, if you have a dynamic cast and you compile it with -O0, without optimization, you see that there's actually a lot of code being generated. Again, we load the pointer and we do a null check. In addition to that, we load the pointer to the Greeter class and the pointer to the base class, and then we execute a full dynamic cast. This allows us to do the actual check and make sure that the type of the runtime object conforms to the type that we expect. So we do a full runtime enforcement check. If we optimize this dynamic cast, we load the two base pointers, we check what the current base pointer of the current object is, and depending on the result of the cast, we either allow it or we terminate the program at runtime with a type safety violation. Now let's look at what a static cast becomes when optimized. It ends up as zero instructions, because we just reuse the other register. So a static cast does not incur any runtime check and no runtime overhead. Use this as a take-home message: static casts do not result in any instructions being executed at runtime. No performance overhead, and no security guarantees. So now, with this knowledge, what actually is type confusion? Type confusion arises through illegal downcasts. Assume we have the following type hierarchy: a parent class and two derived classes, Child1 and Child2. Now, if we allocate an object of type Child1 and store it in the c pointer, we can cast it to the parent type. So we cast from the Child1 object to the parent object, and as these two classes are related, this is a valid cast.
And we can store the pointer in the p pointer and use the fields of the parent object. Now, as a second step, we can cast the parent object back into a child object. If the underlying object had been allocated as a Child2 object, then this cast would be fine. But the static cast does not do any checks, so at runtime this leads to type confusion. And this is exactly where the exploitable behavior comes in: the static cast, which is not checked, can be abused to reinterpret the underlying memory as a different type. Let me give you a little bit more detail and background on the type confusion. We have the parent class and we have the child class, and I'll break it down to just Parent and Child to make it a little bit easier. The parent object only has a single variable inside it, of int type. The child class adds a second int and a virtual function called print. Now, if we allocate a P object, we only allocate the four bytes that are used for the integer. If we allocate a C object, we have the vtable pointer that points to the location containing all the code pointers, we have the x integer, and we have the y integer. Now, let's assume we allocate a P object, a parent object, and we have a pointer to it. If we do a static cast into a C pointer, the C pointer ends up describing more memory than the actual object. The data at that location would be reinterpreted as a vtable pointer, along with a y integer that could then be read and written, which would expose the underlying memory. So this leads to a memory safety violation and control flow hijacking after a type confusion. And if you look at the chain of violations, the type confusion is the first thing that happens that violates the integrity of the underlying application.
This is the initial entry vector for an attacker to abuse this underlying type confusion bug, which can then be turned into a memory safety violation or control flow hijacking. Now, how do we use this vulnerability type to build an exploit primitive? Imagine that when you're exploiting type confusion in a program, you're trying to control two pointers of different types that both point to the same memory area. The two pointers of different types allow you to reinterpret the fields of the object in two different ways. So you have a certain memory area that has been allocated as one original type, but you have two pointers of different types to that same memory area. For example, through the first type, the first field is interpreted as a vtable pointer, while through the second type it is interpreted as a long, right? And if you use a setter for this long value, you can use it to overwrite the vtable pointer in the other view. So imagine that you're using the first view to set the vtable pointer, and then you're using the second pointer that you also control, of a different type, to dispatch on that pointer. As a simple example, just to show you the power of this exploit primitive, imagine that we have a base class that just implements some basic functionality, and we have two subclasses of it, two descendant classes. We have a Greeter class that just says hello, and we have an Exec class, an executor service. Both are implemented with virtual functions, because we may want to build a fancy framework on top with additional functionality, so we want to be able to override these methods. The executor service implements one virtual function called exec that takes a string, which is then passed to system to be executed as a command, and the Greeter's function just prints the string to standard out.
So that sounds pretty reasonable, right? There's no way a programmer would confuse exec and sayHi, because the functions have different names, right? Now, if we allocate two base objects, b1 and b2, the first of type Greeter and the second of type Exec, we can dispatch on those objects. So we allocate these two objects, we cast the first object b1 into Greeter, we call sayHi, and the Greeter says hi. Then, with the second object, we again cast it into Greeter, from the base class to the Greeter class. The compiler does a compile-time check and sees: yes, the Greeter type is a descendant of the base type, so this static cast is allowed. And then we can call sayHi with this weird string, /usr/bin/xcalc, and it works perfectly fine; the compiler doesn't complain. And this is actually really fun. If you look into this, this is exactly the code that I've just shown: we've got the static cast into Greeter here, and the static cast into Greeter here, and we call sayHi twice. So we allocate two objects, one of type Greeter and one of type Exec, but we call the sayHi method both times. And if we execute it, the first call to sayHi prints the greeting, and the second call to sayHi opens a calculator, which is not what we want. If you look at how this is actually implemented, why does this bug happen? First off, the underlying bug is that the type hierarchy, or the compiler, cannot stop us from casting a base class pointer into a Greeter class even though the object is an Exec. We've got the two vtables of b1 and b2: the first vtable pointer points to the vtable of the Greeter type, and the second, the b2 pointer, points to the Exec type.
And we can easily cast between the two views without the type system in C++ complaining. If you look at the actual implementation, if you drill down into what the source code ends up as, we dereference the first field of the Greeter class, which is the vtable pointer, and then we dereference the first vtable entry. So even though we have written sayHi in the source code, it boils down to executing the exec function in the Exec class instead of the Greeter class, leading to the actual type confusion. Now, this is a fun way to exploit software. The hard question is: how do we find these types of vulnerabilities? How can we find such issues in our software? The classic approach that people have been using for the last couple of years, especially to find vulnerabilities in large browsers, has been fuzzing. And fuzzing is great, right? But what it ends up being is: you are fuzzing, trying to find these type confusion vulnerabilities, but as I've just shown, it's really hard to actually trigger them, because no check is being enforced. The only way you will discover that something is amiss is if you run into a memory corruption, a segmentation fault. If you don't run into a segmentation fault, there's no way for you to detect the type confusion, and you may be missing a large number of type confusions. So if you're running a fuzzer, you're only detecting the subset of type confusions that result in a direct memory corruption and a segmentation fault. There may be a large number of abusable type confusions that you're missing, and what we wanted to look at is: can we discover this missing set of type confusions? Can we bring type safety to C++, or at least some form of type system and typing, so that we are aware when an illegal cast is happening?
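As an aside, the Greeter/Exec demo described above can be reconstructed in a few lines. This is a sketch, not the talk's exact slide code: the methods return strings instead of calling system(), to keep it harmless, and the confused dispatch relies on undefined behavior that happens to work on common Itanium-ABI compilers (GCC, Clang on x86-64):

```cpp
#include <string>

struct Base { int id = 0; };             // note: Base itself is not polymorphic

struct Greeter : Base {
    virtual std::string sayHi(const std::string& s) {
        return "hello, " + s;            // the benign service
    }
};

struct Exec : Base {
    virtual std::string exec(const std::string& cmd) {
        // the talk's demo passes cmd to system() here, which is
        // what spawns /usr/bin/xcalc; we only build a string
        return "exec: " + cmd;
    }
};

// The illegal downcast: static_cast happily goes Base* -> Greeter*
// because Greeter descends from Base. If b really points to an Exec,
// the "sayHi" call dispatches through vtable slot 0, which for an
// Exec object is exec(). Undefined behavior, but exactly the confusion.
std::string confused_call(Base* b, const std::string& arg) {
    return static_cast<Greeter*>(b)->sayHi(arg);
}
```

On a typical build, confused_call(new Greeter(), "world") returns the greeting, while confused_call(new Exec(), "/usr/bin/xcalc") lands in exec(), the moral equivalent of the spawned calculator.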
So the underlying problem that we have here is that in C++ a static cast is checked only at compile time, which is fast but does not give us any runtime guarantees. On the other hand, we have dynamic casts, which are checked at runtime but result in high overhead and are limited to polymorphic classes. Polymorphic classes are the classes that have virtual functions in them. Why are dynamic casts limited to polymorphic classes? Well, we need some way to identify the type of an individual object, and the vtable pointer is such an identifying field. This goes back to the design of C++: in C++, a struct is a class and a class is a struct, and if you allocate a struct in C, you have no idea what the underlying type is, right? There's no way that C remembers that you have allocated a foo struct; it could be any arbitrary type. Only if you have an identifying field, a type ID, can you actually identify the underlying type. In safe, object-oriented languages like Java, C#, and so on, whenever you allocate an object, it has an object type ID that clearly identifies the underlying type. This is missing in C++, and this is why we cannot explicitly check casts between arbitrary objects, but only for polymorphic objects with virtual functions. So what we figured is: there's something missing here; we need to be able to do an actual type check for any of these objects. So, according to the motto of 34C3, we figured we would tuwat and bring type safety to C++. Our underlying idea is that we check every single type cast. We do a dynamic check for every single type cast, and then aggressively remove as many checks as we can as part of our design and implementation. So we are making type checks explicit: we enforce an explicit runtime check at all cast sites, for dynamic casts, static casts, reinterpret casts, and also C-style casts. Now, this sounds like a contradiction, right?
I've just told you that this is not possible in the existing framework that C++ has, because we have no way to identify the underlying type of an object. How do we solve this problem? Whenever you allocate an object, whenever you execute its constructor or simply go through the allocator, we remember that this memory area over here is of that particular type. We keep some form of metadata table in the background that allows us to query, for any byte in memory: what type does this piece of memory have? And we can then use this information in any of the casts. So we can replace a static cast with an actual runtime check and make sure that we detect a type confusion problem right when it happens, right at the cast site, and not much later when an actual memory corruption happens. We build a global type hierarchy during the compilation of the software, and we keep track of the allocation type of each object. So we instrument all forms of allocation and keep this in our disjoint metadata table. Then, in a second step, for every single type cast that happens at runtime, we can execute this check and make sure that the types actually match. So we've built this large system based on LLVM, where we instrument source code on top of Clang with additional explicit type checks during compilation. We do object tracing as an additional LLVM pass and track the type hierarchy. At runtime, the check tells us whether a cast fails or not, and we end up with a hardened binary that does all the explicit type checks. Compared to some prior work (you may know UBSan, which does this checking for polymorphic types only), this allows us to check every single type cast that is out there, for static casts and for dynamic casts, to do this fine-grained checking. We cover new object allocations, we cover placement new, we cover reinterpret casts, and a bunch of other things, right?
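The allocation-tracking idea can be sketched in plain C++. This is a toy stand-in, not HexType's implementation (HexType instruments allocations in the compiler, uses a fast disjoint metadata table, and its check accepts any valid path through the type hierarchy rather than requiring an exact match); the names traced_new and checked_cast are made up for illustration:

```cpp
#include <cstdio>
#include <map>
#include <typeinfo>
#include <utility>

// Toy stand-in for HexType's disjoint metadata table: maps each
// allocation to the type recorded when the object was created.
static std::map<const void*, const std::type_info*> g_alloc_type;

// Instrumented allocation: remember the allocated type.
template <typename T, typename... Args>
T* traced_new(Args&&... args) {
    T* obj = new T(std::forward<Args>(args)...);
    g_alloc_type[obj] = &typeid(T);
    return obj;
}

// Replacement for an unchecked static_cast: consult the metadata
// table and flag the cast if the allocated type does not match.
// (Exact-match is a simplification of the real hierarchy walk.)
template <typename To, typename From>
To* checked_cast(From* p) {
    auto it = g_alloc_type.find(p);
    if (it != g_alloc_type.end() && *it->second != typeid(To)) {
        std::printf("type confusion: allocated as %s, cast to %s\n",
                    it->second->name(), typeid(To).name());
        return nullptr;  // a hardened build could also abort() here
    }
    return static_cast<To*>(p);
}

// Note: no virtual functions anywhere, yet the cast is checkable,
// which is exactly what RTTI-based dynamic_cast cannot offer.
struct Base { int x = 0; };
struct Child1 : Base { int y = 1; };
struct Child2 : Base { int z = 2; };
```

With this in place, checked_cast&lt;Child2&gt; on a Base pointer that was allocated via traced_new&lt;Child1&gt; is flagged at the cast site, before any memory corruption can occur.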
So we've worked very hard to compile real software, including Chrome, Firefox, and other systems. Now the problem is, as soon as you enforce full type checks for every single cast, you run into impressive overheads, right? So our main task was to reduce the overhead to make the system more useful. On one hand, we limit tracing to unsafe objects. If an object is only used in a safe context, for example, if it is never used in a cast, we don't need to instrument it; we don't need to remember the type of the underlying object. So we can remove tracing for types that are never cast in the program. We also limit checking to unsafe casts. We do some static verification inside the scope of a function to figure out which parts of the code are used in a safe way, and this allows us to remove some of the checks. We also replace all the dynamic casts with our special form of cast. As it turns out, the cast that we have developed using our metadata information is much faster than a cast done through the RTTI information that the original C++ dynamic cast uses. As it turns out, dynamic cast has never been optimized: people don't really use dynamic casts due to the performance overhead, therefore it's not been optimized, therefore it's not being used. It's this endless circle. If you replace all the dynamic casts with our type casts, we can actually improve performance a little bit. Interestingly, just with this base system, we already found four new vulnerabilities in Apache Xerces, which is a large XML processing library. There were casts from a DOM text implementation node to a DOM element implementation node, which allowed us to reinterpret these types in different ways. And we've also found type confusion in the Qt base library, going from the map node base to the map node itself.
And those were easy, low-hanging fruit that we found by simply compiling software and running it in day-to-day use. So by simply compiling your C++ software with our type checker, you can already find vulnerabilities and bugs just by running it in your day-to-day settings. This was step one, and we found a bunch of different vulnerabilities, but we wanted to go further. So a couple of weeks before the Congress, we started to fuzz all the things. As it turns out, you can combine our type safety mechanism with AFL. You can compile any arbitrary C++ software with our HexType LLVM-based instrumentation, run the software under AFL, and fuzz it to find different forms of type confusion. You simply let AFL do its magic, you invest some time into triaging all the type confusion reports, and you'll figure out different forms of vulnerabilities. At this point, I would like to give a huge shout-out to the students who actually did all the work and invested a lot of time into developing these systems, building them, triaging the vulnerabilities, and playing with everything for such a long time. So we spent some time fuzzing on our ghetto fuzzing cluster that sits under the desk of one of the students. As you see, this is a very low-power setting; we only have five machines that were running different pieces of software, but nevertheless we found quite a couple of interesting cases. After two weeks of fuzzing, we found two new type confusion bugs in Qt Core, unfortunately not exploitable, but they've already been fixed and acknowledged by the developers. We found one more bug in Xerces, and we found seven issues, or reports, in LibSass, where we're still checking whether they are exploitable or not. As it turns out, pretty much every piece of software you throw at it will generate a couple of reports. And some of these reports are due to the underlying problems with C++, as there's no explicit type information.
Developers are abusing the type system in many odd ways, which leads to some spurious reports. So actually triaging and figuring out whether something is an actual bug adds additional overhead; you have to spend some time looking into it, as with LibSass, for example. Now, we focused most of our time on small software, to test the scalability of our approach and to find some reasonable bugs, but we also looked at Firefox for a while, also to test the performance overhead, for example. So these are the results for Firefox that we currently have, and they're fairly impressive, right? Based on a specific set of benchmarks, we found, let's just say, some type confusion reports. We are still figuring out how to handle this large number of type confusion reports. Many of them will be duplicates, and even more of them will be false positives, and we are working hard on triaging them and reducing them to a smaller set of actual bugs that we can then report to the Firefox people. The big problem that we are facing for Firefox, and also for Chrome, but much more so for Firefox, is that the code is really, really messy. A hard problem is that Firefox has several allocators: there are different locations in the code that handle different parts of the heap, there are different heaps that move data back and forth, that share data, and there are very odd allocators messing with different parts. So no, there are not seven billion type confusion bugs in Firefox, or at least we hope so; we guess that the number will be much lower. But it's a first step, and we are working on reducing the reports. So Firefox is ongoing work, and we'll see how we can get there and make it more useful. If you end up after five or six days of fuzzing with seven billion reports, that's clearly too many, so we have to figure out how to reduce them to see which ones are interesting. So, as concluding remarks: what did we do? What are the steps forward?
How can we improve from here? On one hand, we want to fuzz all the things. We want to go deeper, we want to cover more software, we want to find better test cases, better fuzzing inputs, and get deeper coverage of the overall systems. And especially looking at Firefox, one thing we want to do is selective fuzzing. So instead of just blindly fuzzing a large software system, which may result in a large number of false positives due to the way the software is architected, think about the Firefox example again. Let me take a step back. One of the problems we've seen with Firefox that led to a large number of reports is that you often allocate an object and return it to a pool. The developer knows that there is no more live reference to that object, but it is then reinterpreted and reused as a different type of object, which leads to a type confusion report. But this is not an actual exploitable bug; it's just a quirk of the lack of a type system in C++. So we want to move towards a more selective form of fuzzing where we can say: hey, we're only interested in this subset of the type hierarchy. We want to do explicit type checks for this subset of the type hierarchy, but we're not interested in anything else. Focusing, for example, on just the DOM, or just a JavaScript object, or something like that. In addition to that, we are also looking into an always-on check for polymorphic objects. Think back to the control-flow hijacking defense that I talked about at the beginning of the talk. One option is to check the type of the object whenever you do a virtual dispatch. This would protect against the type confusion from before, right? As an example, if you look at the code here: before, when I compiled it, it made two versions, as you may have observed.
The second version is built with the type safety mechanism. And if I run it with this type safety protection, instead of opening up a calculator, it actually reports the type confusion. So we want to extend this into a bigger and larger system so that we can run it partially; we can build it into a selective part of Firefox. But you can also use it for your own software to specifically protect against these dispatch vulnerabilities, as we just saw here: upon the invalid dispatch, we stopped the execution and terminated the program. So, to actually conclude: type confusion is fundamental to today's exploits. The set of existing solutions is incomplete, partial, and slow, which makes it very hard for us to protect these systems, and especially in large software systems like Chrome, Firefox, and other large code bases, we need to develop new ways to protect and enforce type safety at runtime. I presented HexType, which is an LLVM-based extension that allows you to trap upon type confusion. You can compile your software with this type confusion protection, which allows you to track the true type of every object that you allocate and then, upon type casts or dispatches, do an actual type check. So we can trap at the type confusion and not at the later memory safety violation. I showed you one application of this approach, where we combined our HexType mechanism that does the type checking with a fuzzing approach, and we found a nice set of bugs that are now being fixed or were already fixed. Overall, this has reasonable overhead: for Firefox, depending on the benchmark, we have between zero and 50% overhead, and you can integrate it with AFL for broad bug discovery. And as always with our research, it's all open source, so you can download the system and play with it. It takes about 15 minutes to build on your machine, and you can then compile your software with LLVM and full type checking.
And with that, I would like to thank you for your attention, and I'm happy to take any questions. Thanks. Awesome, awesome. So we have four microphones here, one, two, three, four, where you can ask questions at. And just to be clear, a question is one or two sentences with a question mark behind it. And with that, I'm going to go to microphone two. Thank you for the presentation. Oops, this is loud. Would it also be possible to have a compiler plugin that prevents you from misusing static cast at compile time? Like, could you build something that does not allow you to use static cast combined with dynamic dispatch? Let me think about your question. Would you want to detect the type confusion statically, or would you just forbid the programmer from using static cast for any object that has a virtual function? The second one; I just want to prevent bugs. Usually in C++, we try to move as much checking as possible to compile time, so we do not have the runtime overhead, so it would be nice to disallow static cast for anything that is virtual or something. Right, disallow static cast, yeah. So UBSan followed a similar approach: they convert all the static casts for polymorphic objects into dynamic casts and simply replace them. Unfortunately, as the Greeter example showed, the base class is not necessarily polymorphic, so you run into weird runtime behavior. If the base class is not polymorphic and you turn a static cast into a dynamic cast, you fail, right? C++ code is really, really messy, and it's very hard to simply replace them all. You can report it as a warning as part of the compile process, but in the end you need to support non-polymorphic base classes, which are surprisingly frequent, especially in browsers, as we found. So there are several base classes that are non-polymorphic. Thank you. Thanks. Microphone three. Hey, thank you for your great talk.
You mentioned that in Firefox you had the problem that some objects were freed and then reused, so I was wondering: could you build on top of a temporal memory safety analysis and take that information into account to make your analysis more precise? Sure; temporal memory safety usually clocks in at two to three x overhead, so it is actually more expensive than what we are doing. But it wouldn't be an obstacle to fuzzing, right? Like, if you only use it for fuzzing, not in production. Right. Sure, well, ideally you would combine it with additional sanitizers as well, so you would use our type sanitizer combined with a spatial and a temporal memory safety sanitizer; you can use it with ASan in addition to that. You asked whether the additional data that you have from the memory safety system would be useful in our analysis. I would answer that the temporal memory safety sanitizer will run into the same problem, right? This is C or C++: we have a lot of untyped memory, and Firefox simply reuses the memory, even though there are still references to it, and then just changes the type under it. And this is allowed according to C++ semantics, so they're not doing something illegal; it's just that it's really, really messy, and we have to work around these quirks that they have there. Thanks. Thanks. Question from four? Yeah, to be frank, I'm a bit puzzled by your terminology, because just because the type system is not checked dynamically, it doesn't mean that it doesn't exist. So wouldn't it be better to have some static solution, like forbidding downcasts that are static and forcing developers to use dynamic downcasts only? It would be much faster than having a fuzzer fuzzing the entire application, because the problem is spotted at compile time and not through some fuzzing step. Sure, this would be a great solution. Unfortunately, it doesn't scale.
Try to do this for 75 million lines of code where you have 200,000 violations. Right, so rewriting the whole software stack is always a solution; you may run into time constraints. This is just a source base that we have to live with. It's a great idea, and your approach would actually work really well to protect against these illegal or confused downcasts. But you may have a hard time rewriting all the software, and there are non-polymorphic objects where you cannot enforce dynamic casts, right? You can only do dynamic casts for polymorphic objects, and you may have an illegal downcast for non-polymorphic objects as well. We can take it offline; it sounds a little bit more complicated. Microphone two. Thank you for the talk. Well, I'd like to give a counterexample. At the very beginning you made a claim that static casts don't introduce any code. This isn't exactly correct, as everything is implementation defined at this point, but the cast can shift the pointer, especially when you're casting from a type that has multiple inheritance and it needs to shift the pointer around to get to the correct thunks. This introduces a very specific type of type confusion bug: if one casts to a void pointer and then from the void pointer to a different type in the chain, it won't do the shifting properly. It also is a very specific one that might be hard for you to catch. How do you attempt to catch those? You mean for our system? Yes. At one point in time you allocate a slab of memory; you allocate a piece of memory according to a specific type. If you do a new of type foo, we record that this memory is now of type foo. Then you can do anything you want with your pointer; the memory slab will still be tagged as type foo. Whenever you cast back into some other type, we look up what the base type of this object is. If it is foo and you're casting into foo, everything is fine; otherwise we report an error. And you do this when you're casting from a void pointer?
We do this when you're casting from anything. Cool. Thank you. Which actually brings me to a nice topic that I didn't really talk about; that was a nice observation. C-style casts are one of the ugliest features that C++ has, and I just want to call out this ugly feature. A C-style cast, where you write just the parentheses and the target type, is pretty much a hammer that hammers this object into the other type. It says: make this underlying memory area now be of this other type. This is pretty much the ugliest thing you can do. If you're programming in C++, never, ever, under any circumstances, use a C-style cast, because it really messes up the underlying type system. Thanks. Microphone three. I was wondering if you tried your tool with Chakra or Safari, and what happened? We did not. We ran it with Firefox mostly, and we tried a little bit with Chrome. Again, one of our future works is to port it to more and larger software; for this presentation, we focused mostly on smaller libraries. If anybody wants to offer more resources, feel free. If you want to run it on Safari, it's open source: download it, build it on your system, run it on Safari, and report the results. Yeah, just my thought here is that I think you might have even more of those Firefox-type issues with the way that they do casting. I don't know about Safari, but I can tell you an anecdote about the differences we found between Firefox and Chrome. Firefox has a very old code base, so there's a lot of ugliness hidden in there. We found things that do direct or indirect dispatches in assembly code, and they are doing weird stuff to vtable pointers in inline assembly, just due to the legacy nature of Firefox, while Chrome is a much more recent code base that uses much more recent C++ standards.
Chrome's code is much nicer, and it's much less likely that we find bugs in there. So the age of the code base is bad for Firefox and may lead to a large amount of potential vulnerabilities; some refactoring is needed. Microphone four. Yeah, what you did is super cool, but my question is: the concept of downcasting is a code smell, right? And isn't your tool kind of allowing people to keep that code smell and keep writing smelly code, basically? And I just don't like the reasoning of saying, well, we already have a code base. Wouldn't it be better to have something that would help people to write non-smelly code? Sure, let's rewrite everything in Rust; I'm all up for it, right? Yes. Sure, there are just these 100-plus million lines of code that are lying around, and we cannot easily port them. The job that we try to do is to make this code as secure as possible and to find potential vulnerabilities in the existing code base. If you have unlimited resources, let's just stop right now and rewrite everything we have in a safe language; sure, I'm all up for it. It's just the fact that we have this large amount of code that is out there and we are using it, right? So we have to do the best that we can to bump up the protection for this code as much as possible. Microphone two. Thanks for the great work. Do you have any similar idea for C code? Yes. Okay, is this work already done? Can I read something about it? Some of it is in progress; we can talk offline about it. Yeah, cool. Okay, microphone three. Thank you for the talk. I would like to know: why did you take the types during allocation and not inside the constructor, which would feel more natural to me and would also solve the problem of false positives? Not every type has a constructor. Okay.
So what we do, pretty much, is this: when you allocate a new object, we don't run anything in the allocator, but as part of Clang we know where the allocators are, and we can then tag the allocation and add the metadata as additional code. As part of our Clang pass, we detect wherever data is being allocated, or where the individual allocators are, and we tag it and instrument it in a later step. This allows us to tag all the allocation sites, not just the ones that have constructors, so it extends the coverage to not just classes with constructors but all object allocations. So imagine, and this is something we found in some older software as well, you call malloc on a struct and then use it as a class, right? You would never be able to detect that by instrumenting the constructor, and this actually happens in software like Firefox and other older code bases: you call malloc instead of new and use a struct instead of a class. As I said, the code is really, really ugly, and here you see the similarity between C and C++: in the end, a class is just a struct, and if you allocate objects as structs, you may end up missing a large number of objects. Shouldn't static cast then fail at compile time if you cast a struct to a class? A class can still be a struct, right? They are equivalent. You can use the struct as a base type and then have a class that is a descendant of that struct. Ah, right, I understand, thank you. All right, C++ is ugly. Tell me about it. With that, we're at the end of our questions. I'd like to thank our speaker Matias again for this wonderful talk and contribution to the C++ mess. We are trying to fix it. Yeah, okay, thank you very much. Thanks.