Welcome. I hope you can hear me. I wasn't a party to [inaudible], so I didn't think cryptography [inaudible]. This is a problem that I started thinking about quite a while ago, and I've done a little bit of research to see if there was anything that could be done.

Let's put down what the problem is. Cryptography is mostly implemented through libraries, not just, you know, final programs. And this is pretty good, because we have very few well-managed pieces of code that we can easily upgrade on the system if there is a flaw, and almost everybody will use them. The bad part, though, is that it's very hard to get data out of these libraries, because the libraries don't have an environment where they can easily create log files or do anything like that. They are embedded into programs and they don't want to disturb the program; they want to get out of the way as much as possible.

So how do we learn? These are the questions I asked myself. How do we learn what the system is actually using in operation? Like, if I want to decide on what cryptographic policy to use in my company, how do I know what is actually going on, what the surface of it is? How do I know that, if I change the configuration, the system is actually doing what I tell it to do? Sometimes there will be bugs also, either in the configuration, how I design it, or maybe even in the library. And how do I gather statistics, so that in the future I can change my decision if I want to?

So I thought about another area of computing where they actually have a similar problem. They want to trace what's happening in programs. Generally that is for performance, and there are tools for that. So I started looking at those tools to see whether I could also use them for my purpose.

One kind of tool is tracing through debug statements. This has the same problem as logging: it's not really something you can do in a library, because you're going to interfere with the program.
Especially, you cannot use it in production. Then you can use ptrace-driven tools like gdb that actually intercept a program. But again, it's not okay for production, first, and then you would have to intercept every single program in the system and wrap it into gdb, and there's something strange there. And finally I encountered this thing called eBPF. It was interesting because it's low impact and it's potentially usable in production, because it mostly stays out of the way.

So what is eBPF? eBPF is the extended Berkeley Packet Filter. The name comes from the fact that it was initially used in the BSD OS to do packet filtering, but it has since been extended. It's basically a limited virtual machine, not in the sense of a virtual computer but in the language sense, and it can even be optimized in the kernel through a JIT, optionally. It can kind of run arbitrary code, although the limitation is that programs cannot do things like loops, because this virtual machine wants to verify that the program does not interfere with the kernel and cannot crash the kernel, stuff like that. But it can intercept a lot of other code running, even in the kernel or in user space. The other thing is that it requires root privileges in most cases, but that's not a big deal for our purposes.

So why eBPF exactly? Because you can intercept anything, including anything in any library in user space. So at that point it was like, okay, now I can really see everything I want to that's happening in the system, without having to do any hack on every single binary or, you know, intercept the stuff in user space, which is really heavy. The other thing I can do is gather the data in the kernel. So you don't have to stop the program, do some processing in user space, and come back. You can simply gather the data in the kernel and have another user space program extract this data from the kernel without affecting the program you're probing, and it generally has no performance impact
because of that. The other thing is that it does not require code changes, although if you do have changes in the code, it might be easier to use.

So what did I try to monitor? My idea was to monitor specifically TLS ciphers, because it was the easiest thing to look at. There are many ciphers, for example in TLS 1.2, and there have been attacks over the years. So it's a good thing to know what's going on and what's being used, to inform future decisions. So let's try to find out how to see that.

The first thing I tried was to use bpftrace with uprobes. Uprobes are basically ways to trace user space programs by just knowing a function name. So if you know the function that you want to look at, you can use a uretprobe, which is a probe that intercepts when the function returns, and you can extract, for example, the return value. And in OpenSSL there was this nice function, SSL choose cipher, that is invoked when the TLS session is established and you choose which cipher you're going to use for that TLS session. It is called once per session, so it was ideal for knowing how many times a specific cipher is used on the system. And this is the complete program, basically. If you run this thing, you will get out, with this printf, which program is using which cipher.

So what's good about using uprobes? It's easy to set up quickly. The program can be created very quickly; for some simple things you could even do a single-line bpftrace script. But there are some issues with it. First of all, it required me to install a whole load of debuginfo packages, because most of the functions you want to look at are not necessarily directly exposed in the ELF header, and the probe cannot find the function without knowing where it is. Technically you could figure out exactly where it is in memory, but it's very complicated at that point.
Also, it's somewhat hard to pull data from complex data structures. If the data is hidden behind many pointers and stuff like that, it's very annoying to try to pull it out on the kernel side, because you have to make a representation of the structure and then try to pull the data out. The other thing is that whenever the code changes, you have to make sure that the debuginfo packages are exactly the right version, and if the code is not only updated but also changes the way it is structured internally, you might have to change the probe over time. So it's very good for a one-shot, but for something that you want to use over and over, it's less so.

So then I tried this other thing, called USDT probes. This is really interesting. It does require a little more work, because it requires you to instrument the code. These are exactly the same probes that are used, for example, by SystemTap, and they're used for performance tracing and stuff like that. But it's really simple. It's really just one line that you have to add, plus a header, that gives a name to the place where you want to look at something, and you can pass the data you want to look at. In this case it is a somewhat complex structure, but I don't have to care about trying to undo it on the kernel side. I'm just telling it at compile time where the data is, and I give it a name that I can find without the debuginfo packages. So it's easy to get the data you want.
As you can see, you can put a probe in the middle of a function. You don't have to wait until the function ends or anything like that. There are no debuginfo packages involved. And you don't need to adjust the probing code over time, in the sense that if you insert this in the source code upstream and it gets maintained, then ideally you don't have to change anything, because the name will stay there with the right meaning. On the other hand, if these probes do not exist in the code already, you'll have to patch it. So this is the problem with this approach.

Okay, so how do we gather the data? There are a few ways to enable probes. One is the bpftrace tool that I used for the uprobe experiment. But you can also go a little bit deeper with BCC, which is the BPF Compiler Collection. There are a number of tools it already provides for some common tasks, but you can also create a custom C program or a Python program, and that's the way I went for my little experiment. So if you want to know how to use BCC from Python, this is almost all you need to do, if you already know BPF. You might want to read the BPF documentation to know what to look for here, but for finding out how to use it from Python,
that's mostly what I did.

So let's look at the actual BPF code. This is an extract from the program I wrote. Basically, I embedded a small C program, which then gets compiled into BPF bytecode, inside a variable in the Python code. I just created a small data structure where I hold the data that I want to later pass to user space, and I created, in this case, a hash map in the kernel so that I can hold a counter for each cipher I care about. Then I just created this function that counts ciphers. Basically, you get a cipher and you increment the hash map with the cipher number as the key, so every time you get that cipher, the counter for that specific cipher will be incremented.

The second part is that you have to load the BPF program into the kernel. You have to tell it what you want to inspect; in this case, this is the libssl library. This means that any program that ever opens this library and runs it will be intercepted from this point on, and this is the very powerful part of it: you can intercept basically any program using your library. Then I have to give it the name that I put into the source code for the USDT probe, and I need to tell it which function to use, which is the cipher-counting function that I showed a moment before. This basically hooks the code to the probe that we created in user space. And then we just load the program, and it's in.

Finally, I had a very stupid program that gets this hash map (sorry about the formatting) and then just goes through the hash map every five seconds and prints out what's in the kernel, so what the counters say.

So this is another example, which I extracted from my computer after it had been running overnight one weekend. I don't do anything with the computer on the weekend.
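To make the program walked through above more concrete, here is a minimal sketch of how such a BCC script might look. It is not the code from the slides. The library path, the probe name `cipher_chosen`, the map name, and the assumption that the probe passes a numeric cipher ID as its first argument are all hypothetical; the talk passes a more complex structure.

```python
import time

# Hypothetical values: the talk does not give the exact path or probe name.
LIB_PATH = "/usr/lib64/libssl.so.1.1"
PROBE_NAME = "cipher_chosen"

# Small C program that BCC compiles to BPF bytecode at load time.
BPF_TEXT = r"""
#include <uapi/linux/ptrace.h>

// One counter per cipher ID, kept in the kernel.
BPF_HASH(cipher_counts, u64, u64);

// Runs every time the USDT probe fires.
int count_cipher(struct pt_regs *ctx) {
    u64 cipher_id = 0;
    // Assumption: the probe's first argument is the numeric cipher ID.
    bpf_usdt_readarg(1, ctx, &cipher_id);
    cipher_counts.increment(cipher_id);
    return 0;
}
"""


def format_counts(counts):
    """Render {cipher_id: count} as text lines sorted by cipher ID."""
    return ["0x%04x %d" % (cid, n) for cid, n in sorted(counts.items())]


def run(lib_path=LIB_PATH, probe=PROBE_NAME, interval=5):
    """Attach to the USDT probe and print the counters periodically.

    Needs root privileges and the bcc Python package.
    """
    from bcc import BPF, USDT  # lazy import: format_counts works without bcc

    usdt = USDT(path=lib_path)
    usdt.enable_probe(probe=probe, fn_name="count_cipher")
    bpf = BPF(text=BPF_TEXT, usdt_contexts=[usdt])
    while True:
        time.sleep(interval)
        counts = {k.value: v.value for k, v in bpf["cipher_counts"].items()}
        print("\n".join(format_counts(counts)))
```

Calling `run()` as root would start the polling loop. All processing beyond incrementing a counter stays in user space, which matches the constraint that the BPF side has to remain very simple.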
And so I just let it run. The program was almost identical to the one I showed, but it was also instrumented to look at the NSS library, and I was just printing whether OpenSSL or NSS data was being recorded. And it came out with this interesting little table.

So a number of ciphers were being used. Just from the cipher I can see that some of them are TLS 1.3, because TLS 1.3 defined completely new ciphers. So just by knowing which cipher was being used, I knew whether it was TLS 1.3, or TLS 1.2 or probably also earlier. And it came out that about half of the connections, likely a little less than half, were using TLS 1.3 already, which was quite interesting because TLS 1.3 is recent. The other interesting thing is that about a quarter, maybe, of the connections used 256-bit ciphers, and most of them used 128-bit ciphers.

Another finding was that this is basically just Firefox, because I instrumented only the client side in the NSS library, and I didn't have any other client program that I know of running. So Firefox made a lot of connections while I was not using it. Most of the connections were done by Firefox; I don't know what it was doing.

The other interesting thing is something that I never figured out. There were a couple of tests I was doing with a simple OpenSSL server and a simple OpenSSL client, just to see if this was working, so I think one of these was done by me intentionally. But then, during the night, my hypothesis is that either there is a dnf update kind of thing, or maybe PackageKit trying to pull the updates for Fedora overnight. But I don't know; it's 79. But there is something running on my machine that is using OpenSSL to pull data from somewhere.

And this is pretty interesting, because I had no idea what's going on with crypto.
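The step of telling TLS 1.3 apart from earlier versions just by looking at the cipher can be sketched as follows. The five cipher suite IDs are the ones registered for TLS 1.3 in RFC 8446; the function names are made up for illustration, and no numbers from the table in the talk are reproduced here.

```python
# Cipher suite IDs defined for TLS 1.3 in RFC 8446.
TLS13_SUITES = {
    0x1301: "TLS_AES_128_GCM_SHA256",
    0x1302: "TLS_AES_256_GCM_SHA384",
    0x1303: "TLS_CHACHA20_POLY1305_SHA256",
    0x1304: "TLS_AES_128_CCM_SHA256",
    0x1305: "TLS_AES_128_CCM_8_SHA256",
}


def split_by_version(counts):
    """Split {cipher_id: count} into (TLS 1.3 total, TLS 1.2-or-earlier total)."""
    tls13 = sum(n for cid, n in counts.items() if cid in TLS13_SUITES)
    older = sum(counts.values()) - tls13
    return tls13, older


def tls13_share(counts):
    """Fraction of observed sessions that negotiated a TLS 1.3 suite."""
    tls13, older = split_by_version(counts)
    total = tls13 + older
    return tls13 / total if total else 0.0
```

Feeding it the counters read from the kernel hash map gives the kind of "about half is TLS 1.3" summary described above without having to inspect the protocol version separately.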
I didn't know what ciphers were being used at all over time, and I didn't know how many connections were going on overnight from Firefox, or that there were other things. I could have printed, for example, the program, if I wanted to explore exactly what was creating the stuff. I just didn't think of doing it, because I wasn't expecting this many connections. So this thing was running and I kind of almost forgot it, and then the next day I was looking at it. And yeah, that's basically it. I think I was quite fast. Any questions?

Yeah. So, why did I use a SystemTap probe for this? Basically there are only two kinds of probes that eBPF gives you for user space programs. For the kernel you have, I think, ten or more kinds of probes, but for user space you have only two: the uprobes and these USDT probes, the user statically defined tracing probes.

Uprobes you can use without any instrumentation of the code, and so they're appealing because you don't have to change anything. But as I said before, the problem is that if you want to use this in production, or really at any time, it's pretty annoying, because the debuginfo needs to be installed, and that's a ton of stuff, and it's really prone to error. The reason why I used the USDT probes is that, although it requires a change in the package (what I did is I created a patch and did an rpmbuild to rebuild the package), it really allows you to look exactly at what you want. You don't have to scavenge for the data in the program memory. You just tell the program to basically create this no-operation and put in the ELF header a pointer that says the data is exactly in this place. And so it makes it much easier to get exactly the data you want out of the program.

So yeah, you have to balance whether you can change a program, and in that case you can use this
method. If you have a sort of black box, then you will have to use uprobes, and you have to constantly update them whenever you update the program.

Oh, why did I use eBPF? The question morphed into why I used eBPF with these SystemTap probes instead of using SystemTap itself. Well, one very simple reason is that I wanted to experiment with BPF. Initially I thought that uprobes would be the way to go, because uprobes allow you to basically intercept without making any change. So at that point I was looking at uprobes mostly, and SystemTap cannot do anything without instrumentation. The other part that I was looking at, though, is that eBPF is much safer to run than SystemTap, because in this case I would have to introduce SystemTap in the kernel to look at this. What people do with SystemTap, to look at a user space program while it's running, is basically create a kernel module for this kind of use, which has basically unbounded access to what the kernel is doing. SystemTap is very powerful, but actually it's too powerful: if you make a mistake in the program you're running there, and that's easy, because it's so powerful that you want to do a lot more, you can easily destabilize the system. Whereas eBPF, because it's very constrained as a language, forces you to move most of the processing out. You cannot do loops, so you cannot cause the kernel to be stuck somewhere. So it's much safer. And so in the end I was like, yeah, even if I could use SystemTap, I'm going to continue experimenting with BPF, because it has these properties.

So the question is: I had to add code, so what are the prospects of getting this upstream, and did I talk with upstream? This is just an experiment at this point.
I didn't even, I think, show most of my team these changes I made. It's a package I built on my machine. From the point of view of acceptance, it remains to be seen. I don't think we will necessarily have to carry downstream patches for a long time; I would prefer it if upstream adds this kind of stuff. And because it basically doesn't cause any issue to have this code, I have good hopes that they will accept it. Yes, it is a code change, but this thing is not an instruction in your program. What it does to the binary is basically introduce no-operations, like the assembly no-operation. That's all it does. The other thing it does is that, when you compile it, it will add a small section in the ELF header that is only seen at link time, basically, or by the USDT tooling, and that tells eBPF how to hook in dynamically at run time. So basically you have absolutely zero overhead in doing this. From the point of view of the program itself, there is basically no impact. The only impact is fundamentally on the source code and in the ELF.

So I hope that most upstreams won't have a problem with this kind of thing. The only problem is if we want to introduce thousands of these, because then the maintenance may be perceived as a little bit too much. But I don't plan on doing a huge instrumentation for this. It's just key places where you want to audit some use of crypto, at least for now. So I hope that it will be reasonably easy to get this stuff accepted.

The next question: are there similar things in other open source operating systems?
So yeah, one point that your question makes me think of is that, yes, this is currently Linux specific, so some upstreams might be a little bit reluctant to have all this domain-specific stuff. However, it seems to me that all this stuff can really be hidden behind a more generic macro, where you have this specific Linux thing coming up. But then on Solaris there's also DTrace, so this is, I think, basically perfectly compatible for compiling on there. I don't know very well about BSD. Again, if you have a macro, you can simply enable it only when you're building with support. So from the point of view of the source code, it doesn't have a huge impact. In the worst case, on a system where you don't have support, you have a macro that does nothing, because the macro is just an empty define. So you don't even need to litter the code with this. You just need to create a macro that does nothing when you don't have support.

So the good thing is that these dynamic tracing probes are somewhat common, at least among a couple of systems, so it should be okay. And you can easily use this for other kinds of tracing, so you can kind of mix them with debug tracing.

The question is: could I use this system to check that, for example, the system-wide crypto policies are applied correctly? My answer would be: kind of. I'm not going to use this to go into the program and look at which ciphers have been enabled; you can do that in better ways.
However, what you can do is something like this. You could have a list of ciphers that you know you want to allow, or maybe a blacklist, depending on how you see it. And then you can have something alert you if you see anything else, like if you see one of the blacklisted ciphers or a cipher that is not in your whitelist. You could have these things sending events to the program controlling it, like the Python program, and the Python program could log: look, you have a crypto policy that says that only SHA-256 is allowed, but I'm seeing SHA-1 being used, because it's being invoked. You could intercept, for example, EVP_sha1, the initialization function for SHA-1, and then you have an alarm that SHA-1 is being used when, in theory, you told me your policy is that you don't want to use it. So it could be used to audit whether your system does stuff that you don't want it to do, or whether your system is doing the things you want it to do. You can have other policies that are about checking what is being used, but not about checking the configuration itself.

So the answer is kind of yes, if you want to basically audit what you're doing, which is the point of this research. The idea initially was: how can I audit what's going on? And then you can do it, for example, to alert yourself, or you can do it for future policy decisions. Maybe you don't have any whitelist or blacklist, but you want to know what is being used. If you see that there is a cipher that's never used, maybe you'd say, well, let's exclude it, because nothing is using it anyway and I don't want to risk having more attack surface. Or you want to know how hard it would be to disable SHA-1, which is a problem we'll have soon, and you can see whether it is being used. Maybe you find out that most of your systems don't use it, and then you can have a custom crypto policy that says no SHA-1, because you're fine.
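The alerting idea described above, comparing what the probes observe against a whitelist or blacklist, could be sketched in the controlling Python program roughly like this. The function names and the use of algorithm name strings as keys are assumptions for illustration; the code calls the two lists `allowed` and `denied`.

```python
def audit(observed, allowed=None, denied=None):
    """Return alert messages for observed algorithms that violate the policy.

    observed: {algorithm_name: use_count}, e.g. built from probe counters.
    allowed:  optional set of names; anything outside it raises an alert.
    denied:   optional set of names; anything inside it raises an alert.
    """
    alerts = []
    for name, count in sorted(observed.items()):
        if denied is not None and name in denied:
            alerts.append("%s is denied by policy but was used %d times" % (name, count))
        elif allowed is not None and name not in allowed:
            alerts.append("%s is not on the allow-list but was used %d times" % (name, count))
    return alerts


def unused(observed, allowed):
    """Allowed algorithms never seen in use: candidates for removal."""
    return sorted(name for name in allowed if observed.get(name, 0) == 0)
```

The `unused` helper covers the second use mentioned above: finding algorithms that nothing uses, so they can be dropped from the policy to reduce attack surface.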
It may also be that something like one percent of the connections use SHA-1. Then maybe you can instrument things to tell you which program uses it, and then maybe you say, yeah, this program can handle a failure there, so I'm going to disable it. So this is the idea that sparked this research a little bit, and I hope you can actually get to build something that does exactly that.

Okay, out of time. Thank you.