Hello and good morning. My name is Kees Cook. I'll be talking today about the work we're doing in the Linux kernel to add more meaningful bounds checking.

In 2021, I started looking at what data we had on security vulnerabilities that were classified as buffer overflows. I went back through three years' worth of these, found 25 of them, and broke them down by type. I'm not going to talk about the four smaller ones; they're handled in various other ways. The second largest is the array index overflow case, and I think it's worth covering that first because it's quite common and the mitigation for it already exists.

Array index overflow is seven out of those 25. Without any instrumentation, there's no checking of array indexes. The compiler will happily convert the index into a pointer and try to read or write. In this example, I've got a structure whose member salg_name is 64 bytes. In some use of it, you have a variable index, it just gets dereferenced, and nothing is checking it. So the index might be less than zero and read before the structure member, or it's 64 or greater and it starts reading or writing past the end of it. This is the common pattern for that type of flaw.

The good news is that all fixed-size array indexing flaws are already mitigated today using the Undefined Behavior Sanitizer in GCC and Clang. You just have to set up your kernel configs to use it, and any of these cases will be caught at runtime. It's really quite powerful and basically mitigates that entire class, so that's not an issue. I'll come back to this later, though. The example I use here is from AF_ALG. The bug was actually fixed by replacing a fixed-size array with a flexible array, one whose size can't be determined at compile time. The fix is correct, and the code does all of its own size checking itself.
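To make the fixed-size array case concrete, here is a minimal user-space sketch of the comparison a bounds sanitizer effectively performs before an indexed access. The struct is a cut-down stand-in that keeps only the 64-byte name member, and `read_name_byte` is a hypothetical helper written for illustration, not kernel code.

```c
#include <stddef.h>

/* Cut-down stand-in for the structure in the example: only the 64-byte name. */
struct alg_addr_sketch {
    unsigned char salg_name[64];
};

/*
 * Hypothetical helper: does by hand the comparison a bounds sanitizer
 * would insert. Returns 0 and stores the byte on success, or -1 if the
 * index falls outside the array.
 */
static int read_name_byte(const struct alg_addr_sketch *a, long index,
                          unsigned char *out)
{
    const long nelems = (long)(sizeof(a->salg_name) / sizeof(a->salg_name[0]));

    if (index < 0 || index >= nelems)
        return -1;
    *out = a->salg_name[index];
    return 0;
}
```

The check is only possible because `sizeof(a->salg_name)` is known at compile time, which is exactly what is lost once the member becomes a flexible array.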
But this means things like the Undefined Behavior Sanitizer can't actually see it anymore, because it no longer knows how large that array is. So it will ignore accesses against flexible arrays. Like I said, we'll come back to that.

That leaves us these 11 memcpy buffer overflow flaws: basically, writing past the end of a destination. The case that actually made me go look at the data, at what the historical weaknesses in the kernel were, was the BleedingTooth flaw. This was a zero-click remote code execution in Linux Bluetooth, and it's the classic problem. We've got this last_adv_data with a fixed size, but the length being copied into it was attacker controlled. There's no checking for this, and it happily writes right past the end of that array.

Going through the remaining 10 of that same class, it's basically always the same thing. Here's writing past the end of an internal structure. More internal structures, more arrays, an array of structures. You start seeing a pattern here with a lot of networking, especially wireless, and we get into further things like SSIDs. This has been a long-standing problem. And it's really kind of frustrating because the compiler has all the information it needs. It knows how big these destination buffers are, so it should be possible to check them at runtime.

In fact, some of you might be thinking, "Hey, wait, I thought CONFIG_FORTIFY_SOURCE solved this. Don't we do bounds checking on these kinds of memcpys?" The answer is yes, we do bounds checking. This is a simplified source code dump of the fortified memcpy.
But the simple answer is that the main workhorse of the fortify routines, both in the kernel and in user space, is the compiler's __builtin_object_size builtin, which analyzes the pointer you give it and gives you a size back. It says, "Hey, this is how many bytes you can write safely." So fortified memcpy will look at the length of the copy, and if that is a compile-time constant, it can fail at compile time and say, "Our destination size is less than the constant size we're trying to write or read." If it can't know the size at compile time, it can do a runtime check against the destination and source sizes and panic at runtime. And if everything checks out, we call the underlying memcpy and the actual copy occurs.

So again, the key point here is __builtin_object_size. The most important difference, the reason the prior security flaws are happening, is this mode setting for __builtin_object_size. Mode zero says to the compiler, "Tell me how many bytes I have from this pointer to the end of the entire structure." Mode one says, "Tell me how many bytes I have until the end of this structure member."

In this contrived example, an instance of the entire structure is 16 bytes total. It's got an integer, count, that's four bytes long. If you ask __builtin_object_size in mode zero, it's going to say, "You're going to write to count. To stay inside the entire structure, you've got 16 bytes before you run off the end of the structure." But if you say mode one, it'll tell you, "You've actually only got four bytes here; that member is only four bytes." This works for array members too: in mode zero it's 12 bytes to the end of the structure, and in mode one it's just the size of the array, eight bytes before you run off the end of that particular member.
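The difference between the two modes can be tried directly in user space. The following is a small sketch, assuming GCC or Clang; the struct layout is a hypothetical one chosen to match the sizes mentioned above (a 4-byte integer, an 8-byte array, and a 4-byte trailer), and the macro names are invented for illustration.

```c
#include <stddef.h>

/*
 * Hypothetical layout: 4-byte count, 8-byte array, 4-byte trailer,
 * which is 16 bytes in total on common ABIs where int is 4 bytes.
 */
struct osize_sketch {
    int count;
    char name[8];
    int flags;
};

static struct osize_sketch osize_instance;

/* Mode 0: bytes from the pointer to the end of the whole enclosing object. */
#define BYTES_TO_END_OF_OBJECT(p)  __builtin_object_size((p), 0)
/* Mode 1: bytes from the pointer to the end of the member it points into. */
#define BYTES_TO_END_OF_MEMBER(p)  __builtin_object_size((p), 1)
```

Applied to `osize_instance.name`, the mode 0 macro reports the distance to the end of the whole struct, while the mode 1 macro reports only `sizeof(osize_instance.name)`. The macros have to be applied where the compiler can still see which object the pointer refers to; once a pointer has passed through an opaque function boundary, both modes report "unknown".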
In the case of a flexible array, where it's unsized like this, the compiler doesn't know, so it returns minus one, which is actually SIZE_MAX, since __builtin_object_size returns a size_t. Effectively this is saying, "Minus one: I don't know. I have no way to check at compile time how big the contents of this are, because it's going to get allocated at runtime and I don't know where that allocation happens," and so on. So flexible arrays can't be validated at compile time.

Diving into BleedingTooth a little more, you can see exactly the problem, because what was being targeted was this last_adv_data, which was an array of HCI_MAX_AD_LENGTH bytes. As an attacker you could overflow that and keep writing until you hit this list pointer that was still inside the enclosing structure. Composite structures are considered part of the enclosing structure, and even outside of the internal discovery state struct there was a list head struct you could still write to, because in mode zero __builtin_object_size is looking at the entire size of struct hci_dev. Tightening that is what's needed.

So again, our mode zero limits copies to the end of the surrounding structure because of that zero argument. If we instead change this to a strict fortified memcpy with mode one, then we'll limit those copies to the end of the surrounding structure member, which would stop all of those.

I wanted to see what effect this would have: are we going to end up with false positives, and what does the data look like? So I instrumented an allmodconfig build, and it had about 35,000 memcpy calls. In about 22,000 of them the destination buffer size isn't known at all, so it's treated like a flexible array: __builtin_object_size says minus one, "I give up, I don't know how to bounds check this at all."
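A rough user-space sketch of what a mode one runtime check looks like is shown below. It is not the kernel's fortified memcpy: the struct, the helper `copy_if_fits`, and the macro `strict_memcpy_sketch` are all hypothetical names, the buffer size of 31 is an arbitrary illustrative value, and the sketch returns an error where the kernel would warn or panic.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: a fixed-size buffer followed by other state. */
struct adv_state_sketch {
    unsigned char last_adv_data[31];
    void *next;
};

/* Copy only if len fits in dst_size; (size_t)-1 means "size unknown". */
static inline int copy_if_fits(void *dst, size_t dst_size,
                               const void *src, size_t len)
{
    if (dst_size != (size_t)-1 && len > dst_size)
        return -1;  /* would write past the end of the destination member */
    memcpy(dst, src, len);
    return 0;
}

/* Mode 1 limits the copy to the member that dst points into. */
#define strict_memcpy_sketch(dst, src, len) \
    copy_if_fits((dst), __builtin_object_size((dst), 1), (src), (len))
```

With mode 0 in place of mode 1, the reported size would extend to the end of the whole struct, so an oversized copy that overwrites `next` could still pass the check. That is the gap described above.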
Then for about 12,000 of them, the destination buffer size and the copy size are both known at compile time, so you can already check them with 100% accuracy at compile time. And there are about 1,200 where the destination buffer size is known but the copy size is dynamic, which is exactly the situation in all of those examples I just showed.

Looking through the 11 memcpy flaws, they all would get mitigated by adding a dynamic copy size check to just those cases. That was really exciting to discover, because you end up risking a false positive runtime check in only about 4% of all memcpy calls, but you're potentially gaining two orders of magnitude of additional runtime coverage of unknown flaws. We knew about 11 flaws of this type, and in that class of problem there are potentially 1,200 other unknown flaws. So that's a big win: two orders of magnitude more coverage, with a very small percentage of potential false positives to find and fix up at runtime.

Of course, even that would be too easy. That middle case, the 12,000 cases of known buffer and copy size, actually needs to get fixed first because of pre-existing intentional overflows, which will warn at compile time and then always trigger at runtime. The small good news is that they are detectable at compile time, so they can all be fixed. There were about 200 or 300 of these to go through and deal with, and they mostly followed some common patterns.

So I'll give you an example of these intentional cross-member memcpy overflows, where we know the sizes ahead of time. In this memcpy, the destination member being pointed at is this key material. So __builtin_object_size inside the fortified memcpy is going to say, "You're the maximum encryption key length long."
Then we look at keymlen, which has a compile-time known size of the maximum encryption key length plus two times the MIC key length. So it's actually saying, "I want to copy across all three of these members," which also implies that they come in a specific order. But this is all determined at compile time. The compiler can see how big this is and will fail and say, "Hey, what are you doing? You're always going to exceed the end of key material."

To avoid breaking this up into three separate memcpys, which is a huge pain and a lot of churn, one option is to wrap all three members in a new named structure. Then your memcpy can target that structure, and the size of tkip is in fact correct: keymlen now matches that size. So the memcpy can continue, the compiler and fortify are happy, and off we go. Unfortunately, if you change the struct in this way, then everywhere you refer to those members (key material, the TKIP transmit and receive keys) you have to convert from the old style to the new name. That creates a huge amount of churn, might change line lengths and all sorts of other stuff, and is usually incredibly ugly and disruptive to the code base. And like I said, we've got 200 or 300 of these to fix. It's not going to fly.

So we ended up inventing a helper called struct_group, which gives us the option of referring to a collection of struct members by a group name or without the group name. It's a bit of a trick that basically says, "I'm going to create a union of an anonymous struct, in which case we don't need to use the name, and a named struct, and they have identical members." So we can refer to all of the members collectively by the new name, or we can leave the members identified without the group name, so none of the existing code has to change.
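The union trick can be sketched in a few lines of user-space C. This is a simplified illustration of the idea rather than the kernel's actual macro: `GROUP_SKETCH`, the struct, and the member sizes (16, 8, and 8 bytes) are assumptions chosen for the example, and it relies on C11 anonymous structs and unions.

```c
#include <stddef.h>
#include <string.h>

/*
 * Simplified sketch: the same member list is declared twice inside a
 * union, once in an anonymous struct (members stay reachable by their
 * old names) and once in a struct named NAME (the group can be
 * addressed and sized as a whole).
 */
#define GROUP_SKETCH(NAME, ...)            \
    union {                                \
        struct { __VA_ARGS__ };            \
        struct { __VA_ARGS__ } NAME;       \
    }

/* Hypothetical key layout, loosely modeled on the example above. */
struct key_cmd_sketch {
    unsigned short action;
    GROUP_SKETCH(tkip,
        unsigned char key_material[16];
        unsigned char tx_mic_key[8];
        unsigned char rx_mic_key[8];
    );
};
```

A copy such as `memcpy(&cmd.tkip, src, sizeof(cmd.tkip))` now targets a destination whose size covers all three members, while existing code that reads `cmd.tx_mic_key` keeps compiling unchanged.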
There are additional helpers for adding attributes and tags and some other stuff, but I left those out so as not to fill the page.

Okay, so now we've fixed however many intentional overflows, and we've got our __builtin_object_size set to mode one. All the compile-time stuff passes now, and we're left with the runtime checks that have been added. If that doesn't panic, we're good. But of course, no, we're not done. We have to look again at that list of all the memcpy calls. We also had those 22,000 cases where the destination buffer size isn't known at compile time, where __builtin_object_size inside memcpy says, "Minus one. I don't know what this is. I can't know it, for whatever reason." I hinted at that earlier when I was talking about the first vulnerability, whose fix changed a fixed-size array into a flexible array. That's the situation here, except of course that was just for UBSan.

So what are we going to do about this? Once again, a reminder: __builtin_object_size for the flexible array case returns SIZE_MAX. Because it returns a size_t, minus one is actually SIZE_MAX, the largest value a size_t can hold. So the destination size and source size are SIZE_MAX. In this case, the compile-time checks are never going to fail, and the runtime checks for destination and source are also never going to fail. So we're always going to call the underlying memcpy. That's why the minus one case just falls through.

A little bit of background on flexible arrays, to show more clearly what we've got. The traditional flexible array structure is a structure that has a flexible array of some kind and a count of how many elements were in that array when it got allocated.
It's important to remember that while many flexible array uses in the kernel are byte arrays, they can be arrays of anything. They can be arrays of a structure, or, in this case, of u32, so these are four-byte elements. And then you've got some counter to go with it. The way the internals work in the compiler, if you use __builtin_object_size in mode one on pixel_data, like I showed, you'll get minus one. And if you try to do a sizeof against it, the build will actually fail. sizeof will refuse to do anything with this and fail the compile, which is kind of frustrating when you're trying to deal with this. But let's move on to some other examples.

This is a true flexible array. The proper, real C way to specify this is to have those empty square brackets on pixel_data. Before that was an official C standard, there was a GNU extension that said you can have a zero-sized, zero-element array. It takes up no space in your structure and just trails your structure like a real flexible array, except it says zero. This was a GNU extension because having a zero-element array wasn't, at the time, I think, legal C. But notice there's a difference here: sizeof now actually works against this, because sizeof looks at it and says, "It's zero bytes. Cool, we're done." So that's a weird glitch between the two, because this is still treated as a flexible array by __builtin_object_size. It says minus one, but sizeof says, "I do know how big it is. It's zero."

And before the GNU extension, the really kind of awful way this got done was that you'd just have a one-element array. It would take up space in your structure, but you'd allocate more after it. This is really painful because all of your size calculations are off by one, because you're saying you want the sizeof the bitmap image struct.
Plus this many u32s, minus one, because we already had one in our structure anyway. These continue to be a huge pain to clean up in the kernel. And again, you end up with weird states where __builtin_object_size happily pretends that it's actually a flexible array, to deal with this kind of older code, so it says minus one. But if you say sizeof pixel_data, now you've got one instance of whatever the element is. Whereas before it would be zero, now you actually have a real size associated with it.

We go even further, and unfortunately it's so much worse than this: both Clang and GCC treat any trailing array as if it were a flexible array. So if you are unlucky enough to have some fixed-size array at the end of your structure, FORTIFY_SOURCE doesn't protect you in any mode. All writes will just succeed, unbounded. That's really frustrating, because most code has no need for trailing arrays to behave like this; code like that is incredibly rare. There were instances, I think sockaddr or something, that used to have, or still has, a 14-byte trailing array that could actually be up to 256, but it got changed along the way. It's really painful.

Luckily, we're now adding -fstrict-flex-arrays to Clang and GCC, which will clean up all of this. It won't treat trailing arrays of any size (0, 1, 64, whatever) as flexible arrays; they will actually be sized however they're specified. So only real flexible arrays will get the minus one return from __builtin_object_size.

So the question remains: how do we solve the dynamically sized destination overflows? memcpy doesn't know the destination size automatically, so everything has to be open coded. But what's important to realize is that almost always the bounds are being stored somewhere, and most often they are part of the flexible array structure itself. They're usually nearby.
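The off-by-one in the older one-element idiom is easiest to see side by side with a true flexible array member. The sketch below uses hypothetical struct and function names; it leaves out the zero-length GNU form to stay within standard C, and it deliberately omits overflow checking to keep the size arithmetic visible.

```c
#include <stddef.h>
#include <stdint.h>

/* C99 flexible array member: contributes nothing to sizeof(struct). */
struct bitmap_flex {
    uint32_t pixels;
    uint32_t pixel_data[];
};

/* Older one-element idiom: one element is always counted in sizeof(struct). */
struct bitmap_one {
    uint32_t pixels;
    uint32_t pixel_data[1];
};

/* Allocation size for count elements with a true flexible array. */
static size_t bitmap_flex_bytes(size_t count)
{
    return sizeof(struct bitmap_flex) + count * sizeof(uint32_t);
}

/*
 * Same payload with the one-element idiom: subtract the element that is
 * already inside the struct. Assumes count >= 1.
 */
static size_t bitmap_one_bytes(size_t count)
{
    return sizeof(struct bitmap_one) + (count - 1) * sizeof(uint32_t);
}
```

Both functions arrive at the same number of bytes for the same count, but the second only does so because of the easy-to-forget `- 1`.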
In my example here you've got pixel_data, your flexible array, and then you've got pixels, the count of how many of those pixel_data elements you've got. It would be nice if we could get this relationship between the element count and the element array specified in the language. There have been proposals like this, to say you can make a flexible array a bounded flexible array that says, "I am bounded by the size of this member name within the same structure." Now the compiler could reason about the size of pixel_data at runtime, depending on what other things got turned on. For example, UBSan bounds checking for array indexing overflow could now use this for flexible arrays. It'd be very handy. Other languages have this, and it's not too hard to imagine adding it to C. It would be very welcome, and I'm hoping we can get there.

But without that, we need some way to systematically enforce bounds checking. We need a new API. We need to disallow memcpy with destinations that aren't a constant size, because C was not designed to deal with having memcpy fail. There's no error checking around it. And most users, even in switching to a new API, will need some level of refactoring to take an error condition and pass it up, because there are so many different ways things can fail in this kind of code.

Just in this contrived example, we've got count as width times height. Is this a multiplication overflow? Who knows? Here's the allocation; at least it's using struct_size to do its calculation. But maybe this calculation saturated and went to SIZE_MAX, because the size of the pixels times count was huge, so maybe your image result here was NULL because it failed to allocate. How about this: you've calculated your count, but now it's going to get truncated, because the member type for holding this count can't contain the value.
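Two of the failure modes listed here, arithmetic overflow in the size calculation and truncation of the count, can be checked explicitly. The following is a user-space sketch using the GCC/Clang overflow builtins; `flex_bytes_saturating` and `fits_in_u16` are hypothetical helpers written in the spirit of a saturating size calculation, not the kernel's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: header_bytes + elem_bytes * count, or SIZE_MAX if
 * that cannot be represented. An allocator handed SIZE_MAX is expected
 * to fail, which turns an arithmetic overflow into an allocation failure.
 */
static size_t flex_bytes_saturating(size_t header_bytes, size_t elem_bytes,
                                    size_t count)
{
    size_t array_bytes, total;

    if (__builtin_mul_overflow(elem_bytes, count, &array_bytes))
        return SIZE_MAX;
    if (__builtin_add_overflow(header_bytes, array_bytes, &total))
        return SIZE_MAX;
    return total;
}

/* Returns nonzero if count survives being stored in a 16-bit counter. */
static int fits_in_u16(size_t count)
{
    return count <= UINT16_MAX;
}
```

Saturation only helps if the caller then checks the allocation result, and the truncation check has to happen before the count is stored, which is why open-coded versions of this logic are easy to get wrong.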
This is a u16, but it's being assigned from a size_t count, so it can get truncated. And then finally, the memcpy has no idea. You've given it some target, pixel_data, and who knows how big it is after all those other flaws weren't checked. Maybe again, with count times your u32 size, you get an out-of-bounds write, and so on. There's a whole series of different things here. In safe code, or accidentally safe code, all these checks are open coded. You have to remember to do each one of them and check everything. The NULL check is definitely one that usually gets caught, but not always.

So it'd be nice if we could just toss out all that and say, "Here, mem_to_flex_dup: we're deserializing this information from screen and sticking it into the pixel_data member of image, and we need to allocate that for count many of them," and so on. In this case, you can leave the multiplication overflow here, because it'll get caught by mem_to_flex_dup. It will look at the count and say, "Something's weird. I don't like it." We set a return value and everything will be safe. We won't end up with a mismatch between allocation and copy. You can still have bugs around it. And again, I'm leaving out-of-bounds reads of the source out for simplicity, especially since most of this is about write overflows.

In this example, I'm showing you a worst-case design of the API, where you actually have to name pixel_data and pixels: the flexible array elements member name and the flexible array element count member name. I'll show you, I think, a way we can avoid having to repeat that every time. You can also have helpers for the case where you already did the allocation.
Then we can test that we're not copying beyond the bounds of the allocation, because we can check the flexible array count member, pixels. We'll look at that before we do our copy, even if the helper wasn't responsible for doing the allocation itself. So this not only checks for more conditions, it also reduces the amount of code that's visible in each of these instances and makes things, I think, significantly more readable.

Here's a real-world example. A more recent remote code execution, in TIPC, is another heap overflow of a simple flexible array structure. We've got keylen, which is the flexible array element count member, and key, the flexible array itself. And it was just happily copying into key from data coming off the wire. Here's a drill-down and a bit of a simplification of what this looked like. One thing to call out is that originally this was a u16 size reading the message data size, which returns a u32, so that risks truncation. Then down here, keylen is coming in off the wire, but there is no checking that it's going to be within the allocation size that we just did above. And we happily copy keylen bytes into key.

So we can use that new API, like I was saying, to replace basically all of it. We've got mem_to_flex_dup: we want to allocate skey, we're going to copy this much from data, with these allocation flags. It becomes much shorter, and there's one return value that we check. We still have to do the fixed-size copy, but there are some proposed alternatives so we can do an entire deserialization at once.

The early design of this helper needed the callers to include the flexible array and flexible array count member names as arguments every single time. So here keylen and key had to be there, in line two. This just made it clunky. It was hard to use, and it was hard to do conversions.
It would be nicer if the compiler could figure this out, which it really ought to be able to, since realistically you can only have one meaningful flexible array in a structure. Most structures only have one flexible array. Things are different when you get into more complex stuff, but more complex stuff can use different helpers. Anyway, there's the key and keylen that we could just get rid of. That'd be nice.

So what's being added is this bounded flex array helper, much like the struct_group helper from earlier, where you specify your element count variable and type, and the flexible array type and name as well. Now we can drop the names here, because the helper is going to create named aliases for the count and the elements. Those are the same in every structure that uses the bounded flex array helper, so we can always find them, but the types can be whatever they want to be.

And this is done, again, surprise, with unions. In the flexible array struct, for the count member, you just have a union of two members with the same type: one has whatever name you want it to have, and the other has this fixed name, __flex_array_elements_count. For the flexible array itself, it's the same thing. We've got a union of the same type, where one is named whatever you want to name it and the other is called __flex_array_elements.

Now, the kind of ugly thing here is that making a union of a flexible array is, weirdly, not currently legal, but you can work around it by adding a named member to a structure that isn't a flexible array but also doesn't have to have any size. So you just have an empty struct with some unique name. It's really silly. I don't know why that limitation exists, because it's clearly not about the size of the structures. So this basically makes a union of two flexible arrays: one with the name specified by the user, and one that can be used by the helpers.
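The aliasing half of this trick is easy to sketch for the count member alone. The example below is an illustration under assumptions: `COUNT_ALIAS_SKETCH`, the alias name `flex_count_alias`, and the struct are invented for this sketch, it relies on C11 anonymous unions, and it leaves out the flexible array alias because that part needs the empty-struct workaround described above.

```c
#include <stdint.h>

/*
 * Sketch of the aliasing idea for the count member only: a union gives
 * the same storage two names, the one existing code uses and a fixed one
 * that generic helpers can rely on. The fixed name is illustrative.
 */
#define COUNT_ALIAS_SKETCH(TYPE, NAME)     \
    union {                                \
        TYPE NAME;                         \
        TYPE flex_count_alias;             \
    }

/* Hypothetical key structure with an aliased element count. */
struct key_blob_sketch {
    COUNT_ALIAS_SKETCH(uint32_t, keylen);
    unsigned char key[];
};
```

Code written against `keylen` keeps working, while a generic helper can read or write `flex_count_alias` and use `sizeof` on it to learn the counter's width without knowing the structure-specific name.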
In this case, we can also keep things forward compatible with future C language features where we can specify the element count bounds for a flexible array in the language itself. So here the count gets declared, and then we could say, "Here's the type of the flexible array, here's the name that we want, and by the way, it's runtime bounded by that prior member name." We can gain that automatically once the C language grows it. I really like the idea of not having to send hundreds of patches to the kernel to get some of this stuff landed.

Here I can give you a look at a really simplified version of the mem_to_flex_dup helper. These are all horrendous macros, but I'll step you through what's going on. I broke coding style a little bit just so I could fit this on the screen (you can see my lines have breaks in them), but I'll get into that.

First of all, this is a statement expression, so it's returning rc at the end. Whatever value rc has once all this code runs is what will be returned from this macro. Because we want to make sure we're always checking the return value, we really want to have the must-check attribute added to it, but you can't add function attributes to a macro. Since the statement expression returns an int, we can add a wrapper, must_check_errno, as a static inline function that has the must-check attribute, takes rc, and returns rc. So any callers of mem_to_flex_dup now actually have to check rc; they can't just leave it unused.

One complication of this being a statement expression rather than a function is that I can't just return in the middle of it. I actually have to reach the end of it.
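The wrapper pattern can be shown in isolation. This sketch assumes GCC or Clang, since it uses a GNU statement expression and the `warn_unused_result` attribute; `must_check_int` and `parse_len_sketch` are hypothetical names, and the macro body is a trivial stand-in for real work.

```c
/* Function attribute that makes ignoring the return value a warning. */
static inline __attribute__((warn_unused_result)) int must_check_int(int rc)
{
    return rc;
}

/*
 * Hypothetical macro: a GNU statement expression evaluates to its last
 * expression (rc_), and passing that through the wrapper attaches the
 * attribute to the macro's result.
 */
#define parse_len_sketch(len, max) must_check_int(({   \
    int rc_ = -1;                                      \
    if ((len) <= (max))                                \
        rc_ = 0;                                       \
    rc_;                                               \
}))
```

Writing `parse_len_sketch(4, 8);` as a bare statement now draws an unused-result warning, whereas `int rc = parse_len_sketch(4, 8);` compiles quietly.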
We can sort of simulate that by adding a do-while-zero in the middle, and we can break out of it to jump straight to the end. This is just a fun way to get the shortcutting to the error conditions. We start with rc equals negative EINVAL as a robustness measure: if there's somehow some code path in the larger macro that never sets rc, we will leave with an invalid rc and things should be discovered quickly.

Then we've got our failure cases, which I'll cover here. We've got local variables: this pointer to what we're going to be allocating, the flexible array structure, and then two size_t helpers for doing size calculations.

The first sanity check is: can the count that has been requested actually be represented by the flexible array count member of this structure? Since we're able to use the aliases that were generated, we can look at the type of the flexible array elements count member and use another helper that gives you the maximum value that type can store. If it's a u8, that's going to be 255, and so on. If count is greater than that, we fail, because it's going to be truncated. We don't want to truncate and warn; we want to fail, because otherwise we are going to end up in an inconsistent state.

Then we start calculating the actual size of what we're going to be copying, the flexible array elements. The size of a single flexible array element times however many of them we want, count, gets stored into copy bytes. If that multiplication overflows for whatever reason, we also fail. And then finally, we need the allocation size. We need everything ahead of the flexible array in the flexible array structure: the header, all the rest of it.
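The sequence of checks in this walkthrough, together with the allocation, copy, and count assignment it goes on to describe, can be sketched as an ordinary user-space function for one concrete structure. This is not the macro from the talk: `struct image_sketch` and `image_dup_sketch` are hypothetical, `malloc` stands in for the kernel allocator, plain early returns replace the do-while-zero trick, and the choice of `-E2BIG` for the "too big" cases is an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flexible array structure with a 16-bit element count. */
struct image_sketch {
    uint16_t pixels;          /* number of elements in pixel_data */
    uint32_t pixel_data[];
};

/*
 * Allocate an image_sketch and copy count elements from src into it,
 * failing instead of truncating or overflowing. On success *out owns a
 * malloc()ed object and 0 is returned; otherwise a negative errno value
 * is returned and *out is left untouched. src must be a valid pointer.
 */
static int image_dup_sketch(struct image_sketch **out,
                            const uint32_t *src, size_t count)
{
    struct image_sketch *img;
    size_t copy_bytes, alloc_bytes;

    /* Can the count member represent the requested count? */
    if (count > UINT16_MAX)
        return -E2BIG;
    /* Size of the array contents, checked for multiplication overflow. */
    if (__builtin_mul_overflow(sizeof(img->pixel_data[0]), count, &copy_bytes))
        return -E2BIG;
    /* Header plus array contents, checked for addition overflow. */
    if (__builtin_add_overflow(sizeof(*img), copy_bytes, &alloc_bytes))
        return -E2BIG;

    img = malloc(alloc_bytes);
    if (!img)
        return -ENOMEM;

    memset(img, 0, sizeof(*img));             /* zero the header */
    memcpy(img->pixel_data, src, copy_bytes); /* sizes validated above */
    img->pixels = (uint16_t)count;            /* known to fit */
    *out = img;
    return 0;
}
```

Because the count, the allocation size, and the copy size are all derived from the same validated value, the stored count can never disagree with the size of the allocation.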
So we add the size of that structure, because sizeof doesn't count the flexible array itself, plus the copy bytes we just calculated, and store it in alloc bytes. If that happens to overflow, we also fail. All of those lead to saying that something's too big here, and we break.

If all of that passes, we can actually do the allocation, and if the allocation fails, we fail as well. Then, in this version of the helper, we set the header, everything prior to the flexible array member in the structure, to zero. And then we actually do the memcpy. We've calculated all the sizes, everything's happy, so let's do it. Again, I'm skipping read size checking here; this is just write size checking. We know that this one isn't going to overflow, given the allocation and everything else that's gone on, so we do the unfortified memcpy. Then finally, we store the flexible array elements count from the count that we already validated can fit in it, assign our allocation to the first argument of mem_to_flex_dup, clear rc to zero, and exit. We're done. So we've done everything we need in the macro, and that's basically the shortened version of what we've been up to.

This has been a long road. There's a strong relationship between -Warray-bounds, array sizes, and flexible arrays, and the cleanup needed to get the compiler doing the right thing all across 25 million lines of code. Skipping past the roughly two and a half years of cleanup work that people have been doing: 5.16 was released this last January, and that introduced struct_group and related helpers and did the bulk of the remaining flexible array conversions that were needed.
That included the struct_group additions and conversions. In 5.17, in late March, we finished all the struct_group conversions and finally had almost all the rest of the array bounds warnings fixed. In 5.18, we were able to finally turn on array bounds warnings globally at compile time and add the compile-time memcpy enforcement: refusing to do a memcpy when the size of the destination and the size of the copy were known in advance, which really means all of the cases of intentional cross-member overflows were fixed.

Coming up in 5.19, probably in July, we've added a helper called unsafe_memcpy, which doesn't perform fortify checking even when fortify is enabled. That's mostly to deal with cases that have no clean solution right now. It's a bit of a chicken bit, and it's designed to be temporary. It takes an additional fourth argument that is meant to contain a comment describing why the memcpy is in fact considered safe and why it can't be converted to anything else: basically an enforced comment about why this area of code is believed to be safe, so that people fixing these in the future don't have to rediscover what was going on.

And of course, what happened right as 5.18 was released is that GCC 12 came out with some more intelligence in its internal diagnostics, so we ended up with 200 more array bounds warnings. Unfortunately, it seems like a somewhat large portion of these are false positives. I think we've got three open bugs now against GCC 12 related to getting things wrong. So right now we've had to turn off array bounds for GCC 12, because it's kind of overly broken.

Then in 5.20, coming maybe in October, I'm hoping to see mem_to_flex_dup and related helpers landing so that we can finally turn on the runtime warning for memcpy overflows.
All of that, BleedingTooth and everything else, we can then start turning into a strict failure, but we've got to start with warnings first and shake out any other intentional runtime overflows before we can turn that on. People could also set panic_on_warn if they've tested their workloads. So, hopefully by 5.20.

And that's everything. If you have any questions or feedback, feel free to send me email. The slides URL is here. Thanks for your time and attention. Take care.