Hello. I will talk about the StackLeak security feature and its long and complicated way to the Linux kernel mainline. My name is Alexander Popov. I'm a Linux kernel developer and security researcher at Positive Technologies. Here is the plan of the talk: first I will give an overview of the StackLeak security feature, then describe my role and give the technical details of how it works. Finally, I will describe the way to the mainline: the timeline, the current state, and the interactions with Linus and the subsystem maintainers.
StackLeak is an awesome security feature originally made by the PaX Team. It comes as PAX_MEMORY_STACKLEAK in the grsecurity/PaX patch. Unfortunately, this patch is not public anymore. The last public version we have is for kernel 4.9, from April 2017. So I took on the goal of introducing StackLeak into the Linux kernel mainline. I want to thank my employer, Positive Technologies, which allows me to spend part of my working time on it. And of course I thank my family, because I have spent a lot of my free time on it.
What did I do? First, I had to extract StackLeak from the grsecurity/PaX patch, which is really big, more than 200,000 lines. I carefully learned it bit by bit and went through the usual loop: send to the Linux kernel mailing list, get feedback, improve, repeat. I have been doing that for more than a year. I have sent 15 versions of the patch series, and it's still in progress.
Now about the StackLeak security features. What does it provide? First of all, I want to show you this map describing the Linux kernel security area. The only thing you should see on this slide is that you can't see anything, because the area is very complex. There are a lot of different kinds of vulnerabilities, exploit techniques, and Linux kernel security features that mitigate them, and there are a lot of interconnections. If you want the details, I give the link to the repository. And this is the part of the map about StackLeak.
So StackLeak is an out-of-tree defense that is on its way to the mainline. It is inspired by PAX_MEMORY_STACKLEAK, and it interacts with three kinds of kernel vulnerabilities: stack depth overflow, uninitialized variable attacks, and information exposure. These links, the arrows on the diagram, don't mean that StackLeak mitigates all bugs of those kinds. They mean that there is some interconnection, and I will describe it soon.
Security feature number one: StackLeak erases the kernel stack at the end of the system call. That reduces the information that can be leaked by kernel stack leak bugs. What does such a bug look like? On this diagram we have user space, kernel space, and two system calls. During the first system call, some security-sensitive data is placed on the kernel thread stack. The second system call has a bug: there is a copy_to_user() with some data not initialized. So this data contains the values that were previously put on the kernel thread stack. These values are copied to user space, and the attacker can now analyze them.
What does StackLeak do against that? At the end of the system call, the StackLeak erase function is called, and it erases the used part of the kernel stack. It writes -0xBEEF to the kernel thread stack and overwrites the security-sensitive data. Later, system call number two still has the bug, but the only thing that is copied to user space is this -0xBEEF. So the security-sensitive data is not copied anymore.
And we have a nice implication from that: StackLeak blocks some uninitialized kernel stack variable attacks. There are nice examples, and I really like the write-up by Kees Cook, which describes how to exploit this kind of flaw. This diagram shows it. We now have three system calls. In the first system call, the attacker prepares the payload in user space, this target address. It is copied from user space to the kernel thread stack.
The second system call has a bug in it. Payload number two is copied to an address that is not initialized, so it turns into a so-called arbitrary write primitive. The attacker controls the target address with the first system call, and now payload number two, which is prepared by the attacker, is copied to kernel space. The third system call can be used to trigger the payload and elevate privileges. So it is a local privilege escalation exploit.
What does StackLeak do to mitigate that? At the end of the first system call, after payload number one was copied to kernel space, it is overwritten with -0xBEEF by the StackLeak erase function. Later, the arbitrary write primitive we had previously turns into copying data to the address -0xBEEF, which points to an unused hole in virtual memory. So it gives a fault, and the user-space process that made the call is killed. The uninitialized stack variable attack is mitigated that way.
There is an important limitation. StackLeak doesn't mitigate such attacks when they happen within one single system call, because the StackLeak erase function is called at the end of the system call. If the attack is performed before that, we can do nothing about it.
Now the third security feature: StackLeak blocks the Stack Clash attack. It is one kind of stack depth overflow. If we want the mainline kernel to be defended against all the kinds of stack depth overflow we know right now, we need three config options all together. First is CONFIG_THREAD_INFO_IN_TASK, which moves thread_info out of the bottom of the kernel thread stack. Second, we need the CONFIG_VMAP_STACK option, which adds guard pages around the stack. And finally, we need StackLeak, which blocks the Stack Clash attack. How does it work? The idea of the Stack Clash attack is quite old. It was published for the first time in 2005.
Later it was revisited by the Qualys research team in 2017. It uses variable-length arrays, which were already covered by Kees today. The memory for a variable-length array is allocated on the stack with alloca(). If the attacker controls the size of the array, he can make the kernel allocate a lot of memory, and the end of the allocation will jump over the guard page. The attacker can now overwrite the neighboring memory next to the kernel thread stack. It can be another process's stack or some heap object, so it can again be used for privilege escalation.
What does StackLeak do about that? Before every alloca() call, this code runs: if the allocation size is greater than or equal to the stack space left, we call panic() or BUG(), depending on the config options. And this is hated by Linus; you may guess why. I will cover it a little bit later.
OK, and what is the price? What is the performance impact? This is the result of the first performance test, which is quite attractive. We see that building the Linux kernel on one core gives less than 1% of performance penalty. And there is another test, which is not so attractive. The hackbench synthetic test starts a lot of threads that send short messages to each other, so there are a lot of short system calls. The stack is erased at the end of each call, and we have more than 4% of performance penalty.
So the conclusion about performance: the StackLeak performance penalty varies for different workloads. Before deploying it in production, you should first evaluate the performance impact on your expected workload. I've also added the StackLeak metrics feature, which shows how much of the stack space is used by a particular process. After evaluating the performance penalty, you can decide whether it is fine for your case on your system.
Now, before I speak about the upstreaming process, I should say that StackLeak consists of two parts.
The first is the code that erases the used part of the kernel stack at the end of the syscall. The second part is the GCC plugin, which is responsible for compile-time instrumentation. It is needed for two tasks. First, it is needed for tracking the lowest border of the kernel stack, because we erase only the used part of the kernel stack, so we need to know how much of the stack was used during syscall handling. Second, it adds the alloca() check, which was hated by Linus and dropped.
And now the long, thrilling story of StackLeak upstreaming. It reminds me of the famous Russian painting "Hunters at Rest". You can see how they share the experiences they had in the forest. It is the same with kernel developers, who share their experiences of what they encountered on the Linux kernel mailing list.
So here is the timeline. It started in April 2017, when grsecurity decided to close their public patches. In May, I decided to work on StackLeak. I continued the work that was started by Tycho Andersen and sent the first version to the Linux kernel mailing list. I should say that I was learning StackLeak bit by bit, so there was a cover letter in which I wrote the to-do list. In the first iteration, I just learned how the stack erasing written in assembly language works, and I marked in the to-do that I should learn how the GCC plugin works and what it is used for.
But suddenly, in the middle of June, the Stack Clash report by Qualys was published. grsecurity published their blog post about Stack Clash and the Linux kernel, and they trolled me and my upstreaming efforts, saying that we just copy-paste without understanding. At the same time, I had marked in the to-do that I needed to learn the GCC plugin; I was only at the beginning. Anyway, I understood that I was in the middle of these events and that I should proceed. In the third version, I learned and documented the GCC plugin and found some bugs in it.
Then in the fourth version, I learned the assertions that StackLeak adds in stack tracking and alloca() checking. There were errors in them again, and I fixed them. Then I found all the points where the stack should be erased, because there are multiple ways from kernel space to user space at the end of a system call. And I found a point where stack erasing was missing.
Later, in December, there was a really short and interesting email, something like: "Did you see these patches called PTI, Page Table Isolation? Did you try to rebase on them?" So I rebased on PTI, which introduced the trampoline stack, an intermediate stack that is used before we go from kernel space to user space. And later, in January, Meltdown was published. Again, I felt that I was in the middle of a hurricane. It was really impressive.
Then for some months I was ignored, and I thought, OK, I'm ready for upstreaming. But version nine was suddenly burned by Linus. He appeared in the email thread and said a lot of angry words. But anyway, he stated that variable-length arrays are bad by default and that we should clean the kernel of them. Kees Cook started this VLA cleanup movement; there are more than 15 people participating, I guess. And it started with this email from Linus.
Anyway, I was emotionally dead for several weeks. But then my wife helped me, and I decided to extract the technical objections from these angry words. The main objection was that the stack erasing is written in assembly language. Maintainers don't like it because it is quite a lot of assembly. So I decided to rewrite it in C. It was not easy, because it is tricky to make the GCC compiler create a binary that looks similar to handwritten assembly. Anyway, I came up with the next version, which was called the "Stockholm Syndrome patch series" by Brad Spengler from grsecurity. There were more versions, and at version 14 I thought, again, that I was really ready for the mainline.
But the pull request for 4.19 was burned by Linus a second time, because of the BUG_ON in the alloca() check and stack erasing. Again, I extracted the objections and came up with the next version, which avoids it. It was called version 15, the Sisyphus edition. Quite funny again. Let's see what will happen. It has not been taken: the pull request was not merged for this release.
And now, what are the changes from the original version? How does my patch series differ from the original grsecurity patch? First of all, there are bugs fixed in the original StackLeak GCC plugin. The assertions in stack tracking and the alloca() checks were wrong, and I have fixed them. And as I said, there were points where stack erasing was missing, so the stack was not erased in some cases.
There is plenty of refactoring that was done. I extracted the common part for easy porting to new platforms. That includes rewriting the stack erasing in C, which was tricky, but it allowed easy porting to ARM64, which was done by Laura Abbott, thanks to her. I got rid of hard-coded magic numbers and documented the code. So I prepared the patch series for the mainline, because the initial version is far from the usual requirements, and the same goes for the code style.
And what new functionality was introduced? There is the trampoline stack support, which came with Page Table Isolation; the nice tests that we wrote together with Tycho Andersen; the ARM64 support, which I already mentioned; and GCC 8 support. GCC 8 was released during my StackLeak upstreaming efforts, so we added support for this new version. Then there is the new functionality requested by Ingo Molnar: the StackLeak metrics, which allow you to see how much stack space is used during the current and previous system call for some process, and the StackLeak runtime disable option, which I don't really like, because I don't like runtime disabling of security features.
But Ingo Molnar forced me, so I added this sysctl under a config option that is disabled by default. It is the kind of compromise we found.
And the dropped functionality. As I said, the assertions in stack tracking were wrong, and I dropped them. The first thing that triggered Linus was stack erasing after ptrace, seccomp, and auditing, which happens at the beginning of the system call. When Linus saw it, he felt something was bad and decided to burn it all. And alloca() checking is dropped because BUG_ON is now prohibited in patches that come from the security developers. But anyway, the variable-length arrays will be removed, and after they are completely removed, there will be a global -Wvla flag. At the same time, I think that the alloca() check would be good for code that is not upstreamed. But dropping it is the only way we can get to the mainline. So let's see what will happen with the 15th version.
As I said, when Stack Clash was published, Brad Spengler said that we just copy and paste the grsecurity code without understanding. But I'm sure that is not applicable to the StackLeak upstreaming efforts.
And a few words about what it is like to be burned by Linus. There is strong language, even swearing; there are examples that I won't quote. Technical objections are mixed in with it, so you should put your emotions aside and just try to extract what he means. He gives a NAK without looking at the patches, and it is difficult to handle that. Sometimes he simply ignores you. It makes me think that he is, by default, irritated by kernel hardening initiatives, maybe. But anyway, I love Linux. All of that kills my motivation, though.
Let's see what will happen with StackLeak. If Linus does not merge it, my work really will be the work of Sisyphus. But if StackLeak finally finds its way to the mainline, it will have survived like a phoenix through several flames. And the closing thoughts: we are the Linux kernel community.
We are responsible for all those machines that run our favorite operating system. And if we put more effort into Linux kernel security, we will definitely not be ignored. Thank you very much.
So I wanted to ask about writing stack erasing in C. More or less, how is it done? Because C doesn't give you direct access to registers like the stack pointer. So how do you do stack erasing without accessing the registers? Or how do you access registers directly from C?
First of all, there is a helper, current_top_of_stack(), in the Linux kernel, so that is not a problem. The main problem was connected with local variables, which should reside in registers and not on the stack, because we are erasing the stack. It is an assumption we make about the compiler: this function has, as I remember, four local variables, and they reside in registers. Another assumption was that the stack pointer doesn't change during the erasing. It is just another assumption that we have.
At the same time, the trampoline stack makes things easier. When we have switched to the trampoline stack, we can erase the thread stack not from the lowest stack point up to the stack pointer, but up to the top of the stack, because we are on a separate stack at that moment. So it makes things easier. But there are cases when we erase the stack while being on it at the same time. So now stack erasing supports both ways of working.
Another complicated thing was that GCC likes to optimize. When the erasing is performed, first we need to find the poison and then start from the poison. It looks like this. There is the lowest stack point. First of all, we need to find 17 poison values on the stack. After we have found them, we just go and write -0xBEEF between the found point and the top of the thread stack.
And as you see, we don't erase the whole stack. When I did some performance evaluation, erasing the whole stack gave a 40% performance penalty.
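The two-step erasing described in this answer can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel code: the stack is modeled as a plain array of words, and the names `erase_used_part`, `POISON`, and `POISON_RUN` are hypothetical. The value -0xBEEF and the run of 17 poison words are taken from the talk.

```c
#include <assert.h>
#include <stddef.h>

/* Poison value mentioned in the talk, widened to one stack word. */
#define POISON ((unsigned long)-0xBEEF)
/* Number of consecutive poison words to find (17 per the talk);
 * the macro name is made up for this sketch. */
#define POISON_RUN 17

/*
 * Model: stack[0] is the stack bottom (lowest address) and stack[top - 1]
 * is the highest word. The stack grows down, so deeper call chains dirty
 * lower indices. lowest_used is the tracked index of the lowest dirty
 * word and is assumed to satisfy lowest_used <= top.
 * Returns the number of words overwritten with poison.
 */
static size_t erase_used_part(unsigned long *stack, size_t lowest_used,
                              size_t top)
{
    size_t pos = lowest_used;
    size_t run = 0;

    /*
     * Step 1: walk down from the lowest used point until POISON_RUN
     * poison words in a row are seen or the bottom is reached.
     * The pos > 0 check keeps every read inside the array, so nothing
     * below the stack bottom is ever touched.
     */
    while (pos > 0 && run < POISON_RUN) {
        pos--;
        run = (stack[pos] == POISON) ? run + 1 : 0;
    }

    /* Step 2: write poison from the found point up to the stack top. */
    size_t written = 0;
    for (size_t i = pos; i < top; i++) {
        stack[i] = POISON;
        written++;
    }
    return written;
}
```

Because the search stops after a short run of poison, the cost of this sketch is proportional to the part of the stack that was actually used, not to the full stack size.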
We erase only the used part of the kernel stack, and that makes it so fast that when building the Linux kernel we have less than 1% of performance penalty. And I want to say that GCC wants to optimize this searching. When we came to the bottom of the thread stack on the first erasing, it touched the guard page below the stack, because the loop was optimized so that it read the next value below the stack bottom, and that crashed the kernel. So I had to play with GCC to avoid this optimization and make the binary similar to the initial assembly language version. Thanks for the question. Yes?
So some of us would actually like the BUG_ON version of the patch. Maybe we can have a discussion afterwards about how to convince Linus that some of us want that feature.
I think first we should convince him to have StackLeak even without BUG_ON. If we have it in the mainline, then we can discuss further work. There are several ideas about what can be done further; for example, there are more points where we can erase the kernel stack. But I think we can go in small steps. First we should have the success story of getting this security feature into the mainline, and then go on to further goals.
We might need to introduce a security BUG_ON variant, if BUG_ON itself is bad. I have that in the kernel. It's used in two places. I am completely terrified of pointing it out to Linus at this point. So I'd like maybe to do that. In the meantime, people can set panic_on_warn as a workaround.
And that's not the first time that somebody has changed the name of something to get it past Linus. I can't remember any of the others. No, don't ask. OK, any more questions?
Have you tried to compile your C routine using Clang?
No. First of all, this has the GCC plugin, which works only with GCC, and stack erasing without it would give a big performance penalty. So I just didn't try to do anything with Clang, because I depend on this GCC plugin.
But there is a plugin infrastructure in Clang as well. So if it works out for the GCC version, we can do the same with Clang.
OK, thanks for all your effort.
Thank you. Thanks for all your support.