Smile, there you go. I didn't expect the flash to go off; now I'm blind. So, unexpected talk, I just threw this stuff together. I was confronted yesterday by a couple of people about an issue between live kernel patching and BPF. They want them to play along, and I guess some people care about that. So I wrote up a problem statement, and basically my presentation is that I'll just read it. Live kernel patching uses ftrace to hijack a function call so that it calls an updated function. In doing so, it sets the IPMODIFY flag in its ftrace ops when it registers, because it changes what the ftrace trampoline does: when the ftrace trampoline returns, it's not going to return to whoever called it. It's going to return to another function. Obviously, you can only have one callback in ftrace that does that. It doesn't make sense to have two callbacks that each want to go someplace else, because there's only one place you can return to. BPF uses something called direct calls, which also kind of use IPMODIFY, because a direct call isn't going to return to the caller either. It's going to return someplace else. So here's the way the BPF trampoline works when it's the only thing registered. You have some kernel function that you're going to tap into, and you switch it to the trampoline. What ftrace says is, oh, there's only one user attached to this one. It keeps a count: every function that can be traced, everything in available_filter_functions, has an entry in a counting table that keeps track of how many callbacks are hooked to it. If there's only one, there's a trick, an optimization, to call the trampoline directly. The direct call is actually one of the fastest ways. It just says, OK, BPF registered some sort of trampoline, so the nop is just going to call that trampoline, and then that trampoline does whatever BPF does.
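As a rough illustration of the counting trick just described, here is a toy Python model (not kernel code). `FuncRecord`, `call_site_target`, and the string targets are hypothetical names invented for this sketch; the real bookkeeping inside ftrace is more involved, and the single-ordinary-callback case is simplified here to always use the generic trampoline.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FuncRecord:
    """Toy stand-in for ftrace's per-function record (hypothetical)."""
    name: str
    callbacks: List[str] = field(default_factory=list)  # ordinary ftrace callbacks
    direct_tramp: Optional[str] = None                   # e.g. a BPF trampoline

    def call_site_target(self) -> str:
        """What the nop at function entry would be patched to call."""
        users = len(self.callbacks) + (1 if self.direct_tramp else 0)
        if users == 0:
            return "nop"
        if users == 1 and self.direct_tramp:
            # Only the direct user is attached: call its trampoline directly.
            return self.direct_tramp
        # More than one user (or an ordinary callback): go through the
        # generic trampoline, which iterates over every registered user.
        return "ftrace_caller"


rec = FuncRecord("kernel_func")
rec.direct_tramp = "bpf_tramp"
print(rec.call_site_target())
rec.callbacks.append("function_tracer")
print(rec.call_site_target())
```

In this model the call site flips from the BPF trampoline to `ftrace_caller` as soon as a second user shows up, which is the transition the talk describes.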
If you add another ftrace callback to that same function, it won't call the direct trampoline directly anymore. It actually changes the call site to call the ftrace_caller trampoline, which is ftrace's generic trampoline, and that saves the registers. It records some registers in a structure. And then it calls this loop iterator. It's not really called loop iterator; I wasn't going to look at the source code to figure out what the real name is, so I just said loop iterator. That basically does a for-each over all the ftrace ops that have been registered with ftrace, and it goes and calls their functions. One of them is a direct func helper that says, hey, we have a direct function attached to this function. We've got to let the ftrace_caller trampoline know that there's a direct caller attached to it. So when you have something else attached alongside the direct call, it calls all the function tracing or whatever and comes back to ftrace_caller. ftrace_caller then checks, in some architecture-dependent way, for who we really want to call on the return; for x86, we use orig_ax, and that's where we put it. So if anything is in orig_ax on the return, it jumps directly to that. So when the direct trampoline from BPF gets called, it does whatever it does and doesn't even know; it looks just like it was called directly from the function. Now, how does live kernel patching do it? Same thing. We start off with a nop and we're going to call ftrace_caller. So ftrace_caller does the same thing. It calls the live kernel patch function, which will change the stack pointer. Actually, it doesn't change the stack pointer, but it updates the saved registers on the stack so that when ftrace_caller returns, it's not going to return back to kernel_func. It's going to return to the new, patched kernel_func2. And that's how live kernel patching works. Now you can see the issue of how we're going to get them to play along.
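The flow through the generic trampoline can be sketched as a toy Python model. Everything here is an assumption made for illustration: `Regs`, `direct_helper`, `livepatch_handler`, and this `ftrace_caller` are hypothetical stand-ins, and strings stand in for code addresses. It only mirrors the two behaviors described above: a direct helper parks its target in `orig_ax`, and a live patch handler redirects where the trampoline returns.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Regs:
    ip: str                         # where the trampoline will return to
    orig_ax: Optional[str] = None   # x86 slot used to carry a direct-call target


def tracer(regs: Regs) -> None:
    """An ordinary callback: observes the call but redirects nothing."""


def direct_helper(tramp: str) -> Callable[[Regs], None]:
    def cb(regs: Regs) -> None:
        regs.orig_ax = tramp        # tell the trampoline a direct caller exists
    return cb


def livepatch_handler(new_func: str) -> Callable[[Regs], None]:
    def cb(regs: Regs) -> None:
        regs.ip = new_func          # return into the patched function instead
    return cb


def ftrace_caller(func: str, ops: List[Callable[[Regs], None]]) -> str:
    regs = Regs(ip=func)            # save state for the traced function
    for op in ops:                  # the "loop iterator" over registered ops
        op(regs)
    # On the way out, a direct target in orig_ax takes precedence.
    return regs.orig_ax if regs.orig_ax is not None else regs.ip


print(ftrace_caller("kernel_func", [tracer, direct_helper("bpf_tramp")]))
print(ftrace_caller("kernel_func", [livepatch_handler("kernel_func2")]))
```

The model deliberately does not try to combine a direct helper and a live patch handler on the same function; how those two should cooperate is the open question of the talk.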
So one idea is to end up with something like this. So here's your live kernel patching. I know it's not the fastest because it has to go through a trampoline, but the trampoline really isn't that slow. It's not doing much with the new ftrace args: it only saves the six registers used for arguments and then calls the live kernel patching function, which updates and comes back. It's actually rather fast. What we could do is have a way for ftrace to detect when you try to do a direct call. If ftrace had a mechanism to talk to live kernel patching, live kernel patching could just say, hey, we're modifying this, we have an IP modifier, or have some sort of mechanism to say that this is a live kernel patch, and it could tell ftrace: by the way, this kernel_func is no longer kernel_func. It's kernel_func2. And here's the address for the trampoline call. So if someone says, hey, I need to put a direct trampoline on, then we can actually update kernel_func2 instead. When live kernel patching jumps to kernel_func2, you've got your direct caller, everything works, and no problem, the two can work together. Now let's say BPF is first. If live kernel patching were to come in and we say, hey, we have a direct caller on here, this is why live kernel patching would have to tell ftrace: hey, this is live kernel patching, here's the new function I'm going to be switching to, and here's the address to update if there's a direct caller. So when live kernel patching registers on something that has a direct caller, ftrace will be able to update the new function first, and then when it does the switch, you get the direct caller calling the new function, and everything works great. There is one problem. It's this "does whatever BPF does."
Because the issue is that BPF also does things differently than kprobes, kretprobes, and the function graph tracer in how it handles the return from a call. What it does is actually put the call to the original function into the trampoline, so that it can trace before and after the call. So if we were to do this and live kernel patching said, hey, switch, what would happen is the direct trampoline would be calling the old function and not the new function. This would be a bug. So this is where we need BPF to be involved. This is probably more of an issue if BPF was first. If live kernel patching was first, you'd probably get kernel_func2 anyway; ftrace could probably hide that from you. But if BPF is first, we have to have a way, when we apply live kernel patching, to tell BPF, hey, change it to this. And then this would still work. So that's my proposal on how to solve this issue. So let the games begin. If there's a live patch going on, the BPF trampoline could be upgraded to kernel_func2, right? We could make a call when this happens. Like, remove the old trampoline, right? Go there, and then create a new trampoline over there. There'll be a window of time where that... Oh, so you basically mean that we could have two direct trampolines. So instead of having that direct tramp, and I'll use this here, whoops. Instead of having just this guy, we actually create both of them. So you have two of those, and then all you do when we do the switch is make the direct call go to the second one, or have the update function. So basically, okay, actually let me go back.
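The bug described here, a trampoline that has the call to the original function baked in, can be shown with a small Python toy. This is only a sketch under stated assumptions: `make_bpf_trampoline` is a hypothetical helper, and the two functions and their return values are invented so that old and new behavior can be told apart.

```python
def kernel_func(x):
    return x + 1      # original implementation


def kernel_func2(x):
    return x + 2      # live-patched replacement


def make_bpf_trampoline(target, log):
    """Wrapper that traces before and after; `target` is fixed at build time."""
    def tramp(x):
        log.append(("entry", x))
        ret = target(x)             # hardcoded call to the function body
        log.append(("exit", ret))
        return ret
    return tramp


log = []
tramp = make_bpf_trampoline(kernel_func, log)

# A live patch now wants every caller to end up in kernel_func2, but the
# trampoline was built earlier and still calls kernel_func.
result = tramp(10)
print(result, log)
```

The trampoline still returns the old function's result after the patch, which is why something has to tell BPF about the switch.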
What would actually be happening is the kernel function would be calling the direct trampoline. So when live kernel patching wants to say, okay, we're ready to update, either we create another direct trampoline or we update the call inside the direct trampoline. I mean, I guess we could create another one, and then just have it so the new function doesn't point to the old direct trampoline, it points to the new one, so when they switch, it jumps to the new direct one. Okay, yeah, actually that would work. So first you jump from the old kernel function to the live patch logic, to the new live patch function. The live patch function also has a nop. In that nop, you do the ftrace thing, right? The ftrace thing then comes back to the kernel function. We execute the BPF trampoline logic. Yes, okay, so let me see. So ideally, we would start off, and let me try to flip things here. Get back. We'd be here; this is where we start off. And what we would do when we wanted live kernel patching is say, hey, this means that ftrace and live kernel patching will have to go talk to BPF as well. So when you have a direct caller set up like this, and now we're going to do live kernel patching on kernel_func, we're going to have to tell BPF that we're going to be applying the live kernel patch, so it will create a separate direct tramp that points to the new function. And then when we do the switch, the new function is going to point to that second tramp. So when ftrace does the switch, everything just goes smoothly. That's actually a good answer for that. Wait, the one all the way, this one? Or? Yeah, the direct trampoline? This one? Yeah, so when it calls the old function, it actually can call the stored return address, right? Wait, what was that?
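The two-trampoline idea discussed above can be sketched in Python as follows. This is a toy model, not a proposal for the actual interface: `make_tramp`, the `direct` table, and the `patched` flag are hypothetical, and the functions are invented so that old and new can be distinguished by their return values.

```python
def kernel_func(x):
    return x + 1      # original implementation


def kernel_func2(x):
    return x + 2      # live-patched replacement


def make_tramp(target, log, tag):
    """Build a tracing wrapper whose call target is fixed."""
    def tramp(x):
        log.append((tag, "entry"))
        ret = target(x)
        log.append((tag, "exit"))
        return ret
    return tramp


log = []
# One direct trampoline per function, both created before the switch.
direct = {
    "kernel_func": make_tramp(kernel_func, log, "old"),
    "kernel_func2": make_tramp(kernel_func2, log, "new"),
}


def call(x, patched):
    """Model of a caller: live patching decides which entry point is used."""
    entry = "kernel_func2" if patched else "kernel_func"
    return direct[entry](x)


print(call(10, patched=False), call(10, patched=True))
```

Because each entry point has its own trampoline, neither trampoline is rewritten at switch time; whichever function is entered gets traced and executed consistently.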
So when kernel_func2 is calling the direct trampoline, the trampoline can actually get the return address from the call instruction, which is on the stack. Yeah. And just call that, right? Well, and it would call kernel_func2. Is that what it does? Oh, does it call? No, I thought, does it call from the stack? Do you get the address from the stack and call that? No, no, no. It technically can, but we don't do it, to avoid always doing an indirect call. Yeah, I did it in my patches that were never pushed, for this trampoline rewrite, because I had a trampoline which was called from many other functions. So I used this solution. I read the return address from the stack and... So that won't get affected by this? And it's very easy to change. It was just... Yes, but only for this multi-attach trampoline. Right. But it's generic, right? I mean, at the moment we put the address in, like hardcoded, but we can read it from the stack. But you're going to lose five cycles, so we don't want that. That's slow, that's the point, that's slow. Oh, okay, okay, yeah. Indirect call versus direct call, there is this noticeable difference. Oh, okay, yeah. There is a difference with a direct call, especially when you have all the, you know, retpolines and all that in there too, and you have to handle that. Okay, so I think that we have the solution then. So basically the difference would be, before we do the switch, we'd actually have both of them. We'd have this and this. So when we do this, this will be attached to this guy, but once we inject the call to ftrace_caller and the live kernel patch function does the switch, it will switch to, yeah. So once it jumps to the new thing, it jumps to the new direct caller. I think actually what you were saying, the proposal earlier, is probably better: instead of allocating and generating a new trampoline, just rewrite the address in here.
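The alternative raised in this exchange, reading the return address from the stack instead of hardcoding the target, can be modeled in Python by passing the function body to a shared trampoline. All names are hypothetical and the model says nothing about cost; the speakers' objection is that the real version needs an indirect call, which is slower than a direct one.

```python
def kernel_func_body(x):
    return x + 1      # body of the original function


def kernel_func2_body(x):
    return x + 2      # body of the patched function


def shared_trampoline(return_addr, x, log):
    """One trampoline for many functions: the target is whoever called it."""
    log.append(("entry", x))
    ret = return_addr(x)            # models an indirect call via the stack
    log.append(("exit", ret))
    return ret


def kernel_func(x, log):
    # Function entry calls the trampoline; its own body is the return address.
    return shared_trampoline(kernel_func_body, x, log)


def kernel_func2(x, log):
    return shared_trampoline(kernel_func2_body, x, log)


log = []
print(kernel_func(10, log), kernel_func2(10, log))
```

With this shape the same trampoline serves both the old and the new function without being rewritten, at the price of the indirect call.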
Actually, the reason why I don't want to do that is because I think that's going to cause race conditions, because this update actually will jump here, and we'll actually hold off jumping to here for a little bit. So if we change, this has to be changed. There will be a race condition regardless. KLP is not atomic. KLP changes like 10 different functions, all executing in parallel. KLP has this multi-state stuff where it's patching, patching, patching, and then eventually converges. So it just calls kernel_func and kernel_func2 for some time. No, the reason why is because once we have the switch over, once this guy says switch from this guy to this guy, this guy's already calling the correct one. And this one will always be calling this, and of course this guy will be calling the old one until we do the switch. But this will be happening regardless on different CPUs; it's potentially possible. One CPU is using func2, another func; it's normal for KLP. No, we don't, with the switch over. No, once we switch it, it's atomic across CPUs. No, it's not. The CPU can start executing any of these functions and it's already there, right? Just because the text rewrite, text_poke_bp or whatever, is atomic, the CPU might already be executing different stuff. They already entered the function. Yeah, yeah, yeah. Across the CPUs it could, because you're in the direct func calling this. Imagine this func and func2 are both just long functions. Regardless of how atomically the text is modified, different CPUs will be executing them. So my point is, there's no reason to avoid races when the races have got to be there and KLP deals with that already. So don't worry about race conditions. Well, we'll see, like you said. It doesn't need to be solved, but. That's what text_poke_bp does. It's a whole logic to modify text safely. Yes.
So text_poke, yeah, because it does, you know, it adds the breakpoint, syncs all the CPUs, changes things, syncs all the CPUs again. So it is actually kind of atomic. I mean, when you hit the text, we've put a breakpoint in it. Once we put the breakpoint in, it's basically calling the new call. When they race, it could be calling the old or the new until it syncs right there. And the thing is, if it's already calling the old function, you'll have to see how live kernel patching does it. Because live kernel patching, I think, switches it once it does the switch of the regs. But then again, the question is. Yeah, it's either. Well, I'm saying, don't generate the second one. Just rewrite. Actually, well, here's the problem I have. Oh, you mean when we add in our own fentry stuff, like a second fentry program? Yeah. Yes. And actually, here's the thing too. The race conditions I'm worried about are more that live kernel patching has a lot of its own race conditions that I don't understand, and it determines things itself. So even once we do this connection here, this live kernel patching function doesn't actually set the regs pointer until it's ready to do so. That's basically what it does, I believe. So it's this little trampoline function. It may not do anything, and it'll just return back to the original direct call. So if it calls this guy, we want it to call the original one and come back to this thing. And once it says, okay, go call the new guy, we want this guy to call the new guy. So it does make sense to have two trampolines, depending on who calls it. This guy will always call the direct trampoline. This guy will always call the second one. What's the concern, say, if we do this, right?
Yeah, I mean, I think the only real race condition is between when live patch thinks it's flipped, after it checks every task's stack to make sure that nobody's using that function, and the assumption that nobody ever will again. So if there's a period where you could have a BPF trampoline that ends up calling the original function after every task stack has been checked, that would be a race. Otherwise, you can have tasks executing both for a long time. Yeah, it could be. So, I mean, live kernel patching does a bunch of things where it checks all the tasks, all applications, to know where it's called. It actually does a stack trace. That's where the ORC unwinder came from, because live kernel patching needed an accurate stack trace. Can the signature of kernel_func2 potentially change from kernel_func? Can it get an extra argument or one less argument? I don't think it can, because you have callers. Okay. And then what would be the mechanism for notifying BPF about the need to update an existing trampoline? So there will just be a function where you pass us the original function and the new function. Yeah, I think it'll pass you the new function. Can we look up the trampoline from the function? In fact, ftrace could probably even help you with that too, because that's exactly what live kernel patching needs to tell ftrace. Ideally, when live kernel patching is active and anyone wants to hook to this guy, ftrace is going to return this guy. So live kernel patching is going to have to tell us, where is that guy? Live kernel patching must tell us about this guy, so ftrace could tell BPF as well, saying here's the new function, here's the address. So you can actually set everything up with? As long as we don't miss any execution of either kernel_func or kernel_func2. Right, and that's why.
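The notification mechanism floated in this exchange, a call that passes the original and the new function so the existing trampoline can be found and updated, might look roughly like the following Python toy. No such interface exists in the source: `Trampoline`, `bpf_attach`, `livepatch_notify`, and the lookup table are all hypothetical, and the sketch ignores the race conditions debated earlier.

```python
class Trampoline:
    """Toy trampoline whose call target can be rewritten in place."""
    def __init__(self, target):
        self.target = target

    def __call__(self, x):
        return self.target(x)


trampolines = {}   # function name -> Trampoline, so it can be looked up later


def bpf_attach(func):
    tramp = Trampoline(func)
    trampolines[func.__name__] = tramp
    return tramp


def livepatch_notify(old_func, new_func):
    """Hypothetical hook: retarget the trampoline attached to old_func."""
    tramp = trampolines.get(old_func.__name__)
    if tramp is None:
        return False                # nothing attached, nothing to update
    tramp.target = new_func         # update in place
    trampolines[new_func.__name__] = tramp
    return True


def kernel_func(x):
    return x + 1


def kernel_func2(x):
    return x + 2


tramp = bpf_attach(kernel_func)
before = tramp(10)
updated = livepatch_notify(kernel_func, kernel_func2)
after = tramp(10)
print(before, updated, after)
```

This shows the update-in-place variant; the two-trampoline variant would create a second `Trampoline` for `kernel_func2` in the hook instead of mutating the first.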
So basically you said... It's better to update in place, and we also need to update the func IP that gets returned. So yeah, it's just like updating the trampoline. So it should work, right? Devil's in the details, right. Right. So yeah, fine, let's not get bogged down on whether we just patch the existing trampoline or get a new one; both options potentially work. There are pluses and minuses to both as far as I can see, but. Basically we need to get the live kernel patching folks involved, obviously, because a lot of the work will be there. It would be kind of trivial on the ftrace side. It's just a little bit of accounting that has to be done better. So Steven, since you're here. So this ftrace_caller, so kernel patching, once it patches the function, always does this. So it always goes through ftrace_caller doing this and then calling that function. Yes, because it could always switch to a new function. And it does it in a loop, right? So there is still a loop. No, what it does actually, okay. So you know, this is actually. So this ftrace_caller is a trampoline generated by ftrace. Yes, this is actually a direct call and this is a direct call. So it's a direct call, direct call, and that's. But is this live patch func only one, or is it specific? Like, will there be a different function for every kernel function that KLP is patching? I don't know, I don't actually know. They might do it per function. I don't know the implementation. Just curious. So there's a single function that everybody jumps directly to, to see if they should be using the new function or the old function. But that live patch func is different for every single function being patched. Yeah, there's just an indirect. So there's probably an if statement there, or maybe even a static, oh, maybe there's a static branch too.
So for each task you check to see if it should be using the new patched func or the old func. That's in that static ftrace_caller thing there. And then you jump to the live patch func if it should be, or the old function if it shouldn't be. But you do that for each function and obviously each task. But it's all basically, that's my. And then we do all this dance back when KLP gets unloaded. And what? We do this dance back, like we'll rewind all of this stuff. Yes, it's pretty similar. When it's unloaded, right? Because load and unload are, yeah, more or less the same operations. Yep, it'll be easier to actually get off, yep. But that's just the idea, so. Any other questions? Okay, thank you.