So my name is Adrián Moreno, I work for Red Hat on the networking team, and I'm here today to present a new tool that I have written with my two colleagues, Antoine Tenart and Paolo Valerio. We're going to talk about networking, debugging and tracing. This is a short agenda; I will try to go fast through these slides and run a live demo, although given the situation I'm not sure I should. Anyway, let's jump directly to the problem.

The problem we're trying to solve is network visibility and network tracing, and, at least for me, this is a three-dimensional problem. On one hand, there are many components in the Linux kernel: a packet can be in the TCP stack, the UDP stack, TC, Netfilter, or in OVS. OVS is especially complicated because the packet temporarily goes to user space and then gets re-injected into the kernel. So there are many places where a packet can be. But in each of those places we don't only see the packets we care about: our traffic is hidden among lots of other packets, so we need good filtering. And on top of that, packets mutate over time, so filters get stale. You can filter on the source IP address and then the source IP address might change because something like NAT is applied. So you have these three dimensions when trying to find your packets and know where they are and what happened to them in the kernel.

Looking at the existing tools, we have of course the venerable tcpdump. We all love pcap filtering: you can express any filter and it just works. It is, of course, the seed of eBPF, so, our respects. We have tools like dropwatch, which is focused on drops and gives you the stack trace of each drop; it's really nice. We have pwru; if someone hasn't tried this tool, I recommend it, it's awesome, it probes many different places. And we also have tools like bpftrace, perf and SystemTap, which are slightly complicated, to say the least, but very, very powerful.

With all this, what is Retis, the tool that we've written? We can give a definition: it's a tracing tool that gives contextual information from different places in the stack. But you can also think of it as tcpdump plus pwru plus perf, times Rust. Of course we haven't put all the features of these tools into ours, but we have taken a lot of inspiration: lots of nice things we like from these tools have been integrated, so if you know them, some things might sound familiar.

So let's jump into how to use Retis and try to understand this tool, and for that we can just look at an example of Retis and the output it prints. If we understand this output, we understand Retis and what it does. First of all, Retis has something we call collectors. Collectors tell Retis what data to extract. For instance, we have the skb collector there, and the skb collector extracted information from the sk_buff, from the packet; that line which resembles tcpdump output is what the skb collector generated. We also have the nft collector there; the nft collector generated that little line down there, which shows the nftables table, chain and verdict. So collectors extract information from the kernel. And on the other hand, we have something called probes.
Probes tell Retis where to look for packets. Some probes are explicit: in this case, we have an explicit probe over there, where we told Retis to attach a kprobe to ip_rcv, a well-known function in the IP stack. Basically, we attached an eBPF program there and the collectors collected the available information. So some probes are explicit, but some are automatic. For instance, the nft collector: you cannot collect nftables information from just anywhere in the kernel, so the collector automatically added a probe at a special place in the kernel where this information is available. So we have probes and collectors, and if we start combining them, we think we can achieve a fairly good low-level tool for network tracing.

We have many existing collectors. I'm going to go through these slides quickly because we don't have much time after the delay at the beginning. We have the skb collector, which collects packet information. We have the nft collector, which we also showed in the first example; it can do something really cool, which is filtering on the verdict of the Netfilter rule. We have skb-drop, which extracts drop reasons from a special function in the kernel. We have skb-tracking, because we do extensive tracking of packets; that is a very important feature inside Retis. And we have an OVS collector.

Just a small reminder of how OVS works, for those of you who might not be familiar. In OVS we have a kernel datapath, but it acts like a cache, so it's empty at the beginning. The first time we see a packet, we send it to a user-space daemon, where we process it and determine what to do with it. Then we put the packet back into the kernel alongside a flow that tells the kernel datapath what to do with similar packets, so the next packet that looks similar will be processed entirely in the kernel datapath. The OVS collector does exactly that: it adds automatic probes in the kernel datapath and in the user-space daemon in order to extract all this information. This is a short summary table of these collectors.

We found a problem, which is that as we keep adding probes, and since many of them are explicit, we end up with very long command lines, and we need kernel knowledge to actually know what to probe. This might be obvious for a kernel networking engineer, but maybe not for everyone else. So we developed something called profiles. A profile is just a YAML file where we list the probes and enable the collectors. It's very simple, easy to share, easy to ship in your distro packages, and easy to write for specific use cases so it's ready for your debugging sessions. You just enable the profile and that's it.

And one of my favorite features: we have pcap filtering, with the same syntax as tcpdump, exactly the same. If it works in tcpdump, it will most likely work in Retis; I think for something like 90% of the most common use cases it will just work. Retis takes the filter, translates it to classic BPF, translates that to eBPF and inserts it into the kernel for filtering. And, a little bit like perf, events can be stored in files for easy post-processing. Events in this case are just JSON, so you can do any kind of post-processing, write your own scripts in Python or whatever, and get more insight from your events.
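Just to make that concrete, here is a rough sketch of the kind of command line I mean, combining collectors, an explicit probe, a pcap filter and an output file. The exact flag spellings are from memory and may differ between Retis versions, so treat this as illustrative rather than copy-paste ready.

```shell
# Illustrative sketch (flag names may vary by Retis version):
# enable the skb and nft collectors, add an explicit kprobe on ip_rcv,
# filter with a tcpdump-style expression, and store the JSON events
# in a file for later post-processing.
retis collect \
    -c skb,nft \
    -p kprobe:ip_rcv \
    -f 'tcp port 80' \
    -o events.json
```

A profile is essentially the same thing written down once in a YAML file, so you don't have to retype that command line every time.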
One of the built-in post-processors is the sort command, which I will actually show you in a live demo. So fingers crossed, bear with me, this is the challenging part. Okay, sorry, I had to reboot the machine just seconds before starting, so I hope my script works.

I have a very simple setup here: two network devices attached to two network namespaces, a private one and a public one, and we're just NAT-ing, masquerading, between them. For instance, I can enter the private network namespace and ping, so I have connectivity with the public one. Let me also verify this: I'm pinging this IP address, and I have a capture running. If I capture the packet as it's received, you can see the source IP address is not the source IP address I have here, so there is NAT going on. So let's see what Retis can do; let me keep that ping flowing.

I have some profiles installed: a UDP profile, a generic profile, an nft profile. The generic profile is shipped with Retis, at least with v1, and it's pretty useful for starting a debugging session. So I'm going to use the generic profile and the nft one, and I'm going to collect, filtering on host 192.168.202. Okay, that's interesting — the live demo is of course not working. Why is it not... okay, sorry for that, I have a plan B. Oh gosh, but this is not visible, is it? Do we see something there? Can we zoom in? I just rebooted this machine. Well, these are events.

So I just ran the nft collector, and this printed a bunch of events. What I wanted to show you is that this puts the events into a file called events.json, so events are stored in a file. After that, I can run the sort command: when I type retis sort events.json, this is the output. As you can see, at least visually, most of the events are indented. This is because we detected the first packet and identified that the rest of the events belong to the same packet — and we identified that even through the NAT. At some point the source IP address of the events changes from here, which is 192.168.102, and becomes another one.

So basically, with this demo I wanted to demonstrate that we can get events from all around the network stack: IP receive, IP forwarding, the NAT manipulation functions. We see the NAT-ing going on and we see the packet being received. Later on in this demo I increased the rate of the ping, and I see that some packets are being dropped. Doing the same experiment — collecting the events and sorting them — I can look at the drop. Maybe you don't see it, but at some point there is an nft event dropping the packet, and that packet has a different source IP address than the one I put in the filter. So even though I was filtering on the source IP address and the source IP address changed, I was able to detect that that specific packet got dropped, because I had a Netfilter rule on egress which dropped it. And at the end I see the skb-drop event with the drop reason, which in this case is a Netfilter drop.
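For reference, since the live run failed, the whole demo boils down to a collect-then-sort workflow along these lines. The flags and the way two profiles are combined are how I remember them and may differ in your version; the IP address is just a placeholder for the public-side address used in the demo.

```shell
# Collect using the generic and nft profiles, filtering on the public-side
# address (placeholder), and write the JSON events to a file.
retis --profile generic --profile nft collect -f 'host 10.0.0.2' -o events.json

# Group the stored events so that everything belonging to the same packet
# is shown together, indented under its first sighting.
retis sort events.json
```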
There's also another way to see it. We can enable just the skb-drop and nft collectors and pass the --stack option. --stack will print the stack trace of each event, so we capture the packets as they get dropped and print the stack trace. This is similar to dropwatch, if you know that tool.

Okay. And here, again pretty small, sorry about that. In this other demo I have an Open vSwitch setup: just two namespaces communicating with each other through OVS. In this particular example I use a UDP DNS resolution, so I run a DNS server on one side and a DNS client, basically a dig, on the other side. This is what I was telling you about: this is the content of the flow cache. The first time we see a packet flowing, two entries, two flows, appear in the kernel cache. Any other UDP packet will hit that flow and be output directly to the right port, and after a while the cache entry expires and gets flushed.

In this particular example, we have a profile that helps with this use case. The profile defines some probes in the UDP stack and some probes in the OVS stack. Profiles can be combined with each other, so here we have two completely independent, self-contained profiles that, when combined, help us debug this particular case, in which maybe we have some latency in UDP resolutions. We start collecting, we execute the example, we stop collecting, and we run the sort command again. Here, the sort command gives us a very long trail of events that belong to the same packet.

It's not very visible here, but at some point we see packets in the veth port of the source network namespace. We see the skb going from one network namespace to another, because we also list the network namespace in which we see the packet. We see it being received by OVS, being processed by OVS, being upcalled by OVS to user space, and being received by OVS in user space — so we know it wasn't dropped in the middle, we didn't overflow the netlink socket that we use. We then see a flow being put, a flow being configured because of this packet, and we see the flow being executed, meaning the packet is re-injected into the kernel. Then we see events where the kernel receives the packet and executes an action, which in this case is an output action to a port. The next event is the veth transmit: it goes back to the veth and up the IP stack and the UDP stack of the DNS server. So we see the entire flow, even through user space.

And if we scroll down to the next packet, what we see is that it didn't go through user space: we see the OVS datapath executing the action right after the packet is received by OVS. That means we are able to see which packets go to user space and which packets stay in the kernel, and we can spot drops or any unexpected behavior, even in user space. Yeah, I'm really sorry I couldn't do this live, but you know. I can show it here a little bit bigger. This is the output of the sort command: the first line is the first time we see the packet, here in ip_rcv, and then we see nf_conntrack, nf_conntrack ICMP, the NAT-ing going on.
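Going back to the drop-tracing variant I mentioned before the Open vSwitch demo, the invocation looks roughly like this. --stack is the option I quoted from the tool; the collector names and the output flag are my best recollection and may differ slightly between versions.

```shell
# Sketch: capture only drops and nftables events, attaching a kernel
# stack trace to every event, dropwatch-style.
retis collect -c skb-drop,nft --stack -o events.json
```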
So we see all the events sorted. It's like running tcpdump everywhere in the kernel and being able to see the result in a nicely sorted manner.

And what's next? We have just released the first version. We have many more collectors planned: we want to add conntrack, TC, container integration; we want embedded Python integration, a terminal user interface, and whatever people suggest. Contributions are welcome — that's the GitHub repo, and you can just create an issue and suggest profiles or any other feature you would like us to work on. And that's it. Sorry the demo didn't work; last-minute issues, as always. That's it, if there are any questions.

Yes. So there are different techniques. Inside the kernel we track the sk_buff: the first time we see it we use the skb head, and we track whenever the data pointer changes, so we know when it mutates. Essentially we use pointers inside the skb to tell when an event belongs to that same skb. For OVS we don't have a very good infrastructure for tracing packets from the kernel to user space, so what we do is compute a hash over different parts of the packet, because in OVS user space we don't have an skb struct, we just have bytes. So we hash it, and we use other techniques to track the packet through OVS, because there are several places where OVS installs flows and things like that. We also hash the packet when OVS re-injects it into the kernel. And then we combine both pieces of tracking information into a single one, so we are able to sort all those events.

I don't think we can trace any arbitrary user-space application that changes the packet — we cannot do it in a generic way, that's what I mean. We can do it in OVS because we know how OVS works; I happen to work on the OVS team, so we use OVS internal knowledge to know what OVS does to the packet. In addition to that, OVS exposes user statically defined tracepoints, USDT — hooks in user space where eBPF programs can be run — and this allows us to extract that information at very specific key places. Of course, not all applications do that. For each datapath or control path that we want to monitor, we would need a specific collector that knows how to handle it. So I'm hoping we can add other user-space applications — other control planes, other programs that alter and modify the behavior of packets — but yeah, not in a generic way.

We don't have a user interface at the moment, we just have the CLI. We would like to add a terminal user interface similar to what perf has, where you can browse all the events, inspect them, expand and collapse these little tables, filter, and so on. So we want a terminal user interface, but it's still in the backlog.

The question was how we convert the pcap filter into eBPF. Yes, we use libpcap to convert it to legacy BPF, and then we basically convert the instructions from classic BPF to eBPF by hand. It's mostly a one-to-one relationship, but not exactly. So we create an eBPF program that implements that filter and attach it to the rest of our eBPF programs. That's pretty cool, actually. Okay, thank you.
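As a small aside on that last answer: if you want to see what the intermediate legacy BPF looks like, tcpdump can dump its own compiled filter. This only illustrates the classic BPF stage; it is not literally what Retis loads, since Retis then translates those instructions to eBPF and attaches the result to its own programs. The filter expression here is just an example.

```shell
# Print the classic (legacy) BPF program that libpcap compiles for a filter:
# -d gives a human-readable listing, -dd the same program as a C array.
tcpdump -d 'host 10.0.0.2 and udp port 53'
tcpdump -dd 'host 10.0.0.2 and udp port 53'
```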