Hello, I'm Jorge Salamero. I work for Sysdig. How many people know Sysdig here? Oh, that's good, that's quite a few; so you know the basics about Sysdig. But today we are going to talk about something that is kind of an experiment, a little project, called Sysdig tracers.

We came up with this because traditionally sysadmins have been doing troubleshooting, not tracing; they didn't do tracing, they used the classic tools, you know, all that. But then suddenly something changed, and those tools became less useful with Docker, containers and all that. On top of that, we got microservices deployed everywhere, and it was time for a change. Developers had new needs and started to use different tools. They started to use tracing heavily, because when you have distributed services you need to know how they talk to each other, where the problems are, how to troubleshoot in between. There are quite a few options: some of them open source, big ones compatible with OpenTracing, which is a standard API, and a few commercial ones you probably know, like New Relic and AppDynamics. And because many people didn't have heavy, very specific needs, they just had to monitor something, they started doing all sorts of hacks: using StatsD, JMX, print (who hasn't used print to see what their code is doing?), or misusing logs for that.

But we thought: what if we had something that was obviously open source, so everyone could use it, and that was very simple and easy to use? Who here knows, or uses more or less often, BCC? Okay, no one. eBPF is a great tool, it can do amazing things, way more than Sysdig, but it's difficult: you need to know the API, you need to know every system call you want to hook into. It's complex, and it has some trade-offs. We also wanted something very lightweight that could run everywhere, inside containers. So we had the Sysdig technology and we thought we could do something else with it. Not to get confused: there is the Sysdig open source tool, and then there is another commercial product that I'm not going to talk about today. We'll focus on the open source one.

For those that don't know how Sysdig works: basically, we instrument at the kernel level. We load a very small kernel module that allows us to capture every single system call and everything associated with it, and send it to a user-space agent that can run as a daemon or inside a container, and even save it all into a file to do all kinds of analysis.

Some of you are probably going to ask this later on, so I'm going to answer it already: why didn't we do this with BPF? Well, when Sysdig was created, BPF didn't exist in its current form. Even now, the way it works has been changing a lot lately. Who has kernel 4.11 deployed on their servers in production? That's what we would need to do everything we do with Sysdig. So that's why we went with a kernel module instead. Hopefully that changes in the future, but this is how things are now.

So in the end, everything Sysdig does is capture system events, which are system calls. We filter them, we can apply all sorts of filters, and run scripts to reshape and aggregate them. We can dump things into a file. In the end it's like tcpdump for syscalls. We have container support, and we have a command-line and an ncurses interface. Internally, there is an event stream into which all the system calls are introduced.
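To make that concrete, here is a minimal sketch of the kind of command-line usage this enables; the flags are standard sysdig ones, while the process name, file names and filter values are made-up examples:

```bash
# Live-filter the syscall event stream, tcpdump style:
sudo sysdig proc.name=nginx

# Dump everything into a capture file for later analysis:
sudo sysdig -w capture.scap

# Read the capture back, applying a filter:
sudo sysdig -r capture.scap fd.type=ipv4
```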
Opens, reads, connects: everything, every single call. We can dump them to disk, into something like a pcap file, filter them, do some analysis. And with this, we thought we could do some tagging on these events. So we created Sysdig tracers.

This is not distributed tracing; it's syscall tracing, which we think is enough for most cases. Obviously we are not trying to compete with New Relic or full distributed application tracing. As I said before, we wanted something simple, lightweight and easy to use, that works for daily use. The idea is to inject markers inside the Sysdig event stream. These markers basically scope things; they can mark anything. We can mark function calls in our code, we can mark network requests, any piece of code, and you can do it from any language.

When we started to design Sysdig tracers, we thought about creating a virtual device in /dev. But we wanted this to be container compatible, and we thought: well, /dev/null is always available everywhere. No matter how minimal your container is, it's going to be there, and we can capture every single syscall. So why not use /dev/null? It's a hack, but it works. And it's low overhead; as I said, simple as it is, it's fast.

So the idea is that in the event stream we always have an entry marker and an exit marker, and a span is everything that's executed between the enter trace and the exit trace. If you notice, the symbol just before the trace is there to tell whether it's an entry or an exit. And we can obviously have a tree of different spans, a nested structure, to measure different things.

When we write to /dev/null, we can do it in two different ways. We can use a very simple format, so parsing it is not very complex, where we have a direction, an ID to identify the span, tags for tagging, and arguments, in case we want to record function entry arguments and function exit arguments. We can also use a JSON format, but the other one is simpler, and I'm going to show you an example.

When I was deciding what kind of examples to showcase the tracers with, one of those I like to start with is this one. When I saw it, I thought: well, this is like APM for bash. So let's call this APM for sysadmins. It's not entirely accurate, but I think it's a good and easy way to describe the sort of things we can do. So yeah: this is a loop, there is an echo, which is a write of the string into /dev/null, with a tag; then we do whatever we want inside, which is loading a page; and then the exit. When we run this, we are going to mark all the system calls happening in between, and we will be able to do things with them.

What kind of things can we do? A few examples. We can measure latencies: in the example before, we can measure how long it takes to execute that command. We can save things to a file and analyze them later on. We can open it with the ncurses app and see what's happening inside, all the system calls that were executed. We can see logs, in case it was a daemon that wrote stuff. Or we can export to StatsD, because we can also misuse StatsD, and have it in our monitoring system.

So that's everything I have on the slides. Now we have 15 or 20 minutes, and I'm going to show you how this works live. I'm going to take a seat, because I don't know how to type properly standing up, and let's move here. Can you see this? Better now?
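For reference, the loop described above would look roughly like this; the tag name, the URL and the use of the shell PID as the span ID are assumptions based on the description, not the exact demo script:

```bash
#!/bin/bash
# Wrap the work between an enter (">") and an exit ("<") tracer written
# to /dev/null, using the simple "direction:id:tags:args" format.
while true; do
    echo ">:$$:load_page:" > /dev/null      # enter marker; $$ (the PID) is the span ID
    curl -s http://example.com > /dev/null  # the command whose span we measure
    echo "<:$$:load_page:" > /dev/null      # exit marker closes the span
    sleep 1
done
```

Sysdig sees the two writes to /dev/null in the syscall stream and treats everything between them as one span.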
Which one is the light? It's not these two... okay, yeah? Now? Is black on white better? Okay. Let me see if I know how to change this.

Okay, so that's the script I showed you before. I made some changes: the PID is used as the span ID. I'm going to run it and leave it running in the background. And I've got another bash script here which is loading Google; same thing, different tag. I'm going to run it as well. So I have those scripts running, triggering traces.

Now what I can do is open csysdig. This is my laptop. We have multiple views here; if you have used htop before, you will find this very similar. We have multiple views with different things, but I'll keep that for a different talk, because I don't want to go into all the details of what csysdig can do. I just want to focus on tracers. One of the nicest views we've got is the spectrogram. The spectrogram lets you see latencies. We are missing something... no, there we go. So we can see the latency of the network: some pages go fast, most of them are around 100 milliseconds, and some requests take longer.

And I can do all sorts of things. This is where I open the notes I have, because... well, this is just for me, you can ignore it. Okay. So, for example, I mentioned before that we have a command-line and an ncurses interface. On the command line, if I filter on evt.type tracer (I want to give you a second to have a look at this), it filters out everything but tracers. So I see the command, the ID, the different tags, and this is happening live.

But Sysdig has a lot of filters, so I can do different things. For example, if I want to see only the tracers from the script loading the Sysdig website, I can do evt.type tracer and span.tags sysdig. If you have used tcpdump before, this is going to feel very, very similar; the syntax is quite intuitive. And now we are filtering only the sysdig tag. But probably I want to do more interesting things: I want to know how fast we are loading. So what I'm going to do is format the sysdig output and include span.duration, because I want to see how long it takes to execute everything I have inside. Okay, so this is around 200 milliseconds. Cool.

I can write things into a file. Can you see this at the bottom? I'm just dumping everything into a file, and then I can use the ncurses interface. Something happened here... okay. So I'm going to capture a few seconds of my scripts, that will be enough, and now I can open it and show you the spectrogram I showed you before. This is a very small sample, but I can do interesting things. Most of the requests, as I said before, were around 100 milliseconds, but I'm wondering what this one was. I can click, and I see that for some reason there was a request to Google that took 83 seconds.

But we can do other things too. For example, I can say: okay, I want to send all this to StatsD. Sysdig has chisels, scripts written in Lua that allow us to manipulate all the events, all the syscalls. Basically everything is in Lua.
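To recap the command-line part of the demo, the invocations would look roughly like this; evt.type, span.tags and span.duration are the sysdig filter fields mentioned above, while the capture file name and tag value are illustrative:

```bash
# Show only tracer events, live:
sudo sysdig evt.type=tracer

# Only the spans tagged by the script loading the sysdig website:
sudo sysdig "evt.type=tracer and span.tags contains sysdig"

# Print each span's tags and duration, using a custom output format:
sudo sysdig -p"%span.tags %span.duration" evt.type=tracer

# Capture a few seconds to a file, then browse it with the ncurses UI:
sudo sysdig -w tracers.scap
csysdig -r tracers.scap
```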
The StatsD chisel basically takes all the tracers and converts them into StatsD metrics. And I've got it here somewhere... we'll see if it works very quickly, otherwise you will just have to believe me that we are exporting those latencies. No, my laptop is not connected. Well, you will have to believe me: this basically exports all the metrics to StatsD.

I want to show you more interesting things. So yeah, writing to /dev/null; we still have a few more minutes. In addition to doing things manually, we also have some libraries and wrappers for different languages. This one is for Python; we also have one for Node.js and one for Go. You can use them in different ways, and I can show you a more complex example later on. If we use the Python with statement, we can tag all the code executed inside with this hello_world tag and do things with it.

So let me run this script. You see there is one hello world from inside and another one from outside. What I can do is say: okay, show me all the events, all the system calls being executed inside the span. That was just a print of hello world, so if we do just one iteration, there we go: we can see it's a write to /dev/pts, then the tracer exit, and the data of the write system call. This is very simple, but we can again use chisels to filter and format this in a nicer way. I have the echo_fds chisel, and all it does is capture the system calls that are writing into any file descriptor and echo them in a nicer format. So: hello world. I only see the hello world from inside, and not the hello world from outside.

This is a very simple example, but since we have... how long do we have? Okay, I'm going to show you a more complex example. I wrote a Python web service that has three different endpoints. One is calculating a Fibonacci number, just to demonstrate how we can use decorators to tag everything inside a function with Sysdig tracers, including entry and exit arguments; or, as we saw before, we can use the with statement to scope different pieces of code; and, as I told you at the beginning of this talk, we can have a tree, a hierarchy of different tracers, with some tracers nested inside each other.

So I'm going to run this. Let me remember how I was running this... okay, looks good to me. I'm going to run everything: I've got the server I showed you, and I've got the client, which is doing random requests to the different URLs. They are different microservices running in different containers, we've got sysdig running on the host, and we are going to be able to see what's happening. I'm going to record a few seconds of all the tracers being executed, because it's more convenient to work on a file. Just a few seconds will be enough, and now I'm going to cancel this, and we can start working with that file.

For example, I can apply a filter that shows me all the spans that took more than five seconds. So I can see the different tags I showed you before, download handler, Fibonacci, the different pieces of code that were taking longer. As you can see here, in between, my scripts running in the background show very slow requests on the network from time to time. But we can do more interesting things with chisels.
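The language wrappers ultimately just emit the same /dev/null writes shown earlier. As a hand-rolled illustration (this is not the real Python/Node/Go library API, just the shell equivalent of what a with-statement-style scope does), you could write:

```bash
# span <tag> <command...>: run a command inside a tracer span.
span() {
    local tag=$1; shift
    echo ">:$$:${tag}:" > /dev/null   # enter marker
    "$@"                              # the scoped work
    echo "<:$$:${tag}:" > /dev/null   # exit marker
}

span hello_world echo "hello world"   # everything inside is tagged
```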
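The slow-span filter from the demo would look something like the following; the five-second threshold is written in nanoseconds here, on the assumption that span.duration uses the same nanosecond units as sysdig's other latency fields, and the capture file name is illustrative:

```bash
# Show spans longer than 5 seconds from the recorded capture:
sudo sysdig -r tracers.scap "evt.type=tracer and span.duration>5000000000"
```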
We can group the different system calls that were writing into any file descriptor and create a very simple report. So we know that the piece of code inside do_download_write wrote one gig to file descriptors; we know that download_write wrote 500 megabytes. Obviously, headers wrote almost nothing, empty headers, and the same for Fibonacci; we got fewer requests, et cetera. We can even see that hello world is again writing some other stuff in the background.

We can use filters. For example, this is showing writes on any file descriptor, but what if I want to see only network traffic? Well, I can filter on the file descriptor type, fd.type, and then ipv4, and I'm going to get exactly the same report, but only for sockets, IPv4 sockets. If I'm interested in seeing the file descriptors that are actual files on my file system, I just use fd.type file.

But I can do even fancier things. I'm wondering: okay, what kind of files are you writing on the file system? So I'm going to add some extra filters: fd.type file, then I want to see only the files written by that tracer, and I want to exclude my terminal, because I'm not interested in anything being written to standard output. So I'm going to run this, and I see that my code, which is a black box running inside a container (some developer created it, started running it on my infrastructure, and I have no clue what it's doing), is now visible. With tracers I didn't have to read the full code; I just went around the different functions adding decorators and with statements, and: okay, it's writing some files... and wait, one of them is quite big. So it's probably the one that's making my microservice slow.

We can use other chisels to format the output. For example, if I use the HTTP log chisel, and type it properly, I'm going to be able to see all the HTTP requests happening: method, URL, response code, latency, size. Or I can ask it to only show me the top URLs, so we can see which URL my random URL generator hit the most. There are plenty of different chisels. You can write your own, but we created a few that are shipped by default. I showed you the one that decodes HTTP requests; we can decode memcached requests as well, and writing one for, say, Redis should be very simple. We can visualize things in spectrograms. We can show top processes by CPU, errors, activity on file descriptors. We can spy on logs: if there is a daemon writing some logs and we don't know where they are being written, we will find it. And there are network filters, performance, security...

Well, I think this is everything I wanted to show you today. I think we have around 10 minutes for questions. Questions, anyone? Yeah.

Hello, okay. My question is: if I have this running on a couple of systems that I want to trace, is there an option to parse this into something where I can filter it in, I don't know, Kibana or something? Is there a front end where I can compare the traces of different machines, to see events and stuff?

We wrote a front end to aggregate tracers coming from different microservices. It's a very, very experimental project. It's open source, it's available in our GitHub.
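Summing up the chisel-based steps from the demo above, the invocations map roughly to these; echo_fds is the chisel named in the demo, fdbytes_by is a stock sysdig chisel that aggregates I/O bytes by a field, and the capture file name and the proc.name exclusion are illustrative:

```bash
# Echo everything written to any file descriptor, nicely formatted:
sudo sysdig -r tracers.scap -c echo_fds

# Same report, but only IPv4 sockets, or only real files minus the terminal:
sudo sysdig -r tracers.scap -c echo_fds "fd.type=ipv4"
sudo sysdig -r tracers.scap -c echo_fds "fd.type=file and proc.name!=bash"

# Aggregate written bytes by directory, as a simple report:
sudo sysdig -r tracers.scap -c fdbytes_by fd.directory

# Decode the HTTP requests seen in the capture:
sudo sysdig -r tracers.scap -c httplog
```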
Actually, I haven't linked it here, but if you go to github.com/draios and browse the repositories, there is a JavaScript application, and a Lua chisel that writes things into a format the JavaScript application can load, and you can see the requests from different microservices. But it's something experimental. We created this, as you saw; it's small, it does quite a few fun and interesting things, but it's open source, it's work in progress, and we are still figuring out where this is going. Okay, and the other question was... I had it, now it's gone. Okay, we have more time; you can grab me afterwards.

I wonder, for very large systems, those that have 10,000, maybe 100,000 requests per second, is there any possibility to sample while recording, just recording 10% or 1% of all events?

What you can do is filter everything you write into the capture file. I don't know how big my capture file was, but you saw it was just a few seconds and it was 100 megabytes. So imagine running this on a production system. We have customers that trigger this and get gigabytes in just one or two seconds, and then you need to open the file and you need a powerful machine. So ideally you want to apply capture filters, so you reduce the amount of information you are sending into the file.

Another question here. You said that to install this you insert a kernel module. So how do I install it on legacy stuff, on a Red Hat 4 or 5 or something like this? Does that work, or does it just work with recent kernels?

Sysdig is packaged for the most popular distributions: Debian, Ubuntu, Fedora, Red Hat, CentOS.

But also for old ones, like really old ones?

I don't know off the top of my head what the minimal kernel version we require is. We use DKMS, so we dynamically build the kernel module. I would have to check; give it a try. Thank you.

I think there are some questions over here. When you're using Docker, for instance, can you filter to certain containers? Can you repeat your question? When you use Docker, for instance, can you filter to just see the events for a certain container?

Yes. I showed you the different chisels before; for example, there is one that allows us to filter between the different containers I have running. But actually, one of the things I haven't mentioned in this talk, and I'm talking about it later in another talk on troubleshooting Kubernetes, is that the other cool thing Sysdig does is talk to your orchestration API. In this case Docker, but if you are running Kubernetes, OpenShift, Mesos or DC/OS, we talk to the API. We know what kind of infrastructure you are running, the pods, namespaces, services, and you can use that metadata in Sysdig filters. So you can say: okay, give me the syscalls from this pod, or give me everything happening in this namespace. So yeah.

Another question here. I didn't get the main idea behind Sysdig. Is it built on BPF? Because you said several times that it's similar to tcpdump, and we know that tcpdump was using BPF from early on. And second, what about the overhead that Sysdig creates? Does it take a lot of resources?

Well, the project started five years ago, something like that. One of the reasons not to use BPF is that BPF didn't exist as it is today back then. It has changed a lot; it existed, but in a different way.
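A capture filter of the kind described in that answer would look roughly like this; restricting the recording to tracer events is one example, and the file name is illustrative:

```bash
# Record only tracer events instead of the full syscall stream,
# which keeps capture files small on busy systems:
sudo sysdig -w tracers.scap evt.type=tracer
```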
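And the container and orchestrator filtering from the Docker question maps to invocations roughly like these; container.name, k8s.pod.name and k8s.ns.name are sysdig filter fields, -k points sysdig at a Kubernetes API server, and all the names and values here are illustrative:

```bash
# Only events from one container, with container context in the output:
sudo sysdig -pc container.name=webserver

# With Kubernetes metadata: everything happening in one namespace:
sudo sysdig -k http://127.0.0.1:8080 k8s.ns.name=production

# Or the syscalls from a single pod:
sudo sysdig -k http://127.0.0.1:8080 k8s.pod.name=frontend-1234
```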
And we had requirements that were not covered by BPF back then. For example, when we capture syscalls we want to see everything, all the buffers, and BPF could only give you a small chunk. We have been submitting patches to the kernel, and now we are in a position where Sysdig could probably start using BPF. We haven't done it yet; we have to look at it properly. But yeah, the reason was that originally it wasn't possible, and this is a multi-year project. The person who started Sysdig, Loris, was one of the creators of Wireshark, so you will see a lot of similarities between tcpdump and Wireshark in Sysdig. It's more or less the same concept, but instead of capturing and analyzing network packets, we do it with system calls and everything happening on the system.

Okay, well, if there are no other questions: I left some links, so please try all this stuff, and leave feedback, whether you found this talk interesting or you thought it was crap. Have a look, test these links, test the product. And as I said before, I'm doing two other talks, one on Sysdig applied to security on containers, and another one on using it for troubleshooting instead of tracing. Have a look at the schedule, and if you like it, you can join. Thank you.