Hi, I'm Daniel. I'm a software engineer on the Linux kernel team at Meta, and today I'm going to be presenting blazesym, which is not really kernel related, but it has relevance for BPF and tracing in general, so I hope it's a good conclusion to the day.

What will we cover? First I'm going to motivate why we wrote blazesym, go a little bit more into the details of what it does, provide a library overview covering the high-level concepts, and look into the current status. There are some gaps that we know of, and I want to elaborate on those. Then we look ahead at what's coming up, and hopefully have a little bit of a discussion about whether you see gaps that we can close, or whether you have use cases that we could cater to.

So, for the motivation: blazesym is a library that helps with address symbolization, among other things. Address symbolization, in brief terms, is taking a raw address, a pointer, an integer, and inferring a name that is human readable. It helps with consumption by humans, which has relevance, for instance, if you capture stack traces and want to make sense of them.

Address symbolization is an intricate problem, for many reasons. First, you can use different sources for symbolization. There are ELF symbols, which are relatively easy to handle. There is DWARF, which is a rather complex format. There is Gsym. There are Breakpad files. All of these require additional logic in your program. Say you are a tool writer, just building on top of BPF, capturing addresses and wanting to evaluate those addresses and present them to the user in a consumable form: that is all something that you don't necessarily want to concentrate on as a tool developer, because it takes you away from your domain of concern.
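To make the core idea concrete, here is a minimal sketch of what ELF-symbol-based symbolization boils down to: a lookup of an address in a table of (start address, name) pairs, as you would extract from an ELF `.symtab`. This is an illustration of the concept, not blazesym's actual code; real symbolizers also have to deal with symbol sizes, overlapping symbols, and the other sources mentioned above.

```rust
// Sketch: map a raw address to the name of the containing symbol,
// given a symbol table sorted by start address.
fn symbolize<'a>(symbols: &[(u64, &'a str)], addr: u64) -> Option<&'a str> {
    // Find the first entry whose start address is greater than `addr`;
    // the entry just before it (if any) is the candidate symbol.
    let idx = symbols.partition_point(|&(start, _)| start <= addr);
    if idx == 0 {
        return None; // address lies before the first symbol
    }
    Some(symbols[idx - 1].1)
}

fn main() {
    // A made-up, sorted symbol table.
    let symbols = [
        (0x1000, "main"),
        (0x1080, "do_work"),
        (0x1200, "cleanup"),
    ];
    assert_eq!(symbolize(&symbols, 0x10a4), Some("do_work"));
    assert_eq!(symbolize(&symbols, 0x0fff), None);
    println!("0x10a4 -> {:?}", symbolize(&symbols, 0x10a4));
}
```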
There are also additional system-specific details and corner cases that you would have to deal with, and it gets cumbersome if everybody does that. Lastly, there are certain trade-offs to be made as part of the symbolization process itself. For instance, think about inlined functions: do you want to consider all the possible locations where a function got inlined? This information is contained in DWARF, for instance, but it may require additional memory to go through all the inline information. There's also an inherent trade-off between performance and memory usage: how much do you cache? It would be easier, and that's the proposal we are making, to just provide a couple of knobs that a developer, let's say a tooling developer, can adjust, as opposed to having to reimplement an entire caching scheme in the application of interest.

So, basically, that's our proposal: we want to have a default, go-to library that can be used for symbolization, and that would take the burden of these low-level details off tooling developers' shoulders.

Now, to give a brief overview: I already motivated that we want to tackle the problem of symbolization itself, which is the mapping from addresses to names. We are also interested in going the reverse direction, where you map names, function names, to addresses and potentially other information. That has relevance, for instance, if you consider attaching a breakpoint somewhere, or configuring a uprobe or something like that.

On top of that, one consideration the library has to make is that we want to handle both user-space and kernel-space addresses. These are handled slightly differently on the different operating systems, and we want to make this for the most part transparent to users, so that they don't have to worry about it and don't have to use different libraries. For both of these, we would still support ELF, DWARF, and Gsym backends as sources for the actual symbolization.

So these were two parts that we consider, and the third one is more use-case driven, if you will: what we call remote symbolization. Basically, it just means that we split the process of symbolizing addresses into two parts, and these two parts can happen on different systems. Normally you would symbolize an address on a single system, whereas with remote symbolization you split that, and you have two systems involved. The first system can be an embedded system; it may not even have debug information available. What it would do is just capture these addresses and then, as we refer to it, normalize them: if an address belongs to a process, it would look into /proc/PID/maps and then spit out an address that is, effectively, relative to the binary it is in. It would report this address along with the binary information, as well as potentially some other metadata such as the build ID. This information would then be conveyed to the beefier system, if you will, which would actually handle the task of symbolization itself. There, on what I labeled here the fast server, you could have the actual DWARF data, which, as we touched on before, may not fit onto the small embedded device at all; it can be huge, potentially.

Okay, so what is the current status? We have basic support for symbolization. We can use ELF as a source, meaning we can handle ELF symbols; that doesn't help you much if the binary is stripped, so we also support DWARF as well as Gsym as symbolization sources. DWARF comes with an asterisk, and I will explain a little later where the gaps currently are. For the reverse operation, the lookup of names to addresses, we currently support ELF and DWARF as well.

Other than that, this is still somewhat early in development. We are currently converging on the API surface, so we hope to have a more stable interface in the near future. We know there are still some things that we want to rename, for instance, and some parameters we want to rearrange, but in large terms, I believe we are almost there.

What I didn't mention so far is that the library is written in Rust, which has implications for the largely C-based ecosystem that surrounds BPF especially. We provide C bindings for that purpose. It basically works out of the box: we provide an auto-generated header that you can just include, and after you build the library, meaning the Rust project, what falls out of that is a static archive that you can link against, or a shared object that you can use.

We also started the integration with the Meta-internal profiler. The goal there is to get validation at large scale, meaning a lot of the profiling within Meta itself happens on large binaries, and we hope there's a fair amount of requests happening, right?
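The normalization step described earlier for remote symbolization boils down to simple arithmetic on the matching /proc/PID/maps entry. Here is a simplified sketch of that idea; it is not blazesym's actual code, and the struct and function names are illustrative. The real logic also handles anonymous mappings, records build IDs, and more.

```rust
// Sketch: given the /proc/PID/maps entry covering an absolute address,
// compute the file-relative offset that a remote symbolizer can resolve.
struct MapsEntry {
    start: u64,       // start of the mapping in the process's address space
    end: u64,         // end of the mapping
    file_offset: u64, // offset into the backing file where the mapping begins
    path: String,     // path of the backing binary
}

fn normalize(entries: &[MapsEntry], abs_addr: u64) -> Option<(u64, &str)> {
    let entry = entries
        .iter()
        .find(|e| e.start <= abs_addr && abs_addr < e.end)?;
    // The normalized address is the offset within the backing file.
    Some((abs_addr - entry.start + entry.file_offset, entry.path.as_str()))
}

fn main() {
    // A made-up mapping, as it might appear in /proc/PID/maps:
    // 7f0000001000-7f0000005000 r-xp 00002000 ... /usr/bin/example
    let entries = [MapsEntry {
        start: 0x7f00_0000_1000,
        end: 0x7f00_0000_5000,
        file_offset: 0x2000,
        path: String::from("/usr/bin/example"),
    }];
    let (off, path) = normalize(&entries, 0x7f00_0000_1abc).unwrap();
    assert_eq!(off, 0x2abc);
    assert_eq!(path, "/usr/bin/example");
    println!("normalized: {path} + {off:#x}");
}
```

The output pair (binary path plus file-relative offset), together with the build ID, is exactly the kind of information that gets shipped to the fast server.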
So we hope to evaluate the library and get some more confidence in it by running it at Meta scale. We also started using the remote symbolization process that I outlined earlier on our VR headsets: you have these small embedded devices, if you will, that capture the addresses of interest, and then you have a desktop system that symbolizes the addresses, and you can perform your analysis this way on the desktop system.

I also want to thank Andrii for basically having the idea of this library, and Kui-Feng, who was involved in the first version of it. I have links here to the repository as well as the documentation. The slides will be uploaded, or rather they are already uploaded, and I will show the links again later on.

Okay, so looking ahead. We know there are still a couple of gaps in the current version. Most notably, our DWARF support is still lacking. As was touched on in earlier talks, there are multiple DWARF versions out there, multiple DWARF standards, and currently we only support DWARF version 4. We definitely have plans to support other versions as well, for the sake of completeness, and because you actually find those in the wild. We also currently don't support split DWARF information, which is a problem in general that you will probably run into with kernels, which, as I understand it, often rely on split DWARF information.

Another thing we want to tackle is the symbolization of addresses in APKs; that mostly has relevance on Android, and perhaps, if you follow the BPF mailing list, you saw me also add some support for Android to BPF itself.

Second to last, we would like to have transparent support for name demangling, or symbol demangling. If you're dealing with Rust, say, or with C++ symbols, they are encoded in a special way, and it makes sense for the library, in this batteries-included fashion, to take care of demangling them in the process.

And now, truly last, we would like to support advanced use cases such as the usage of debuginfod. This will likely happen as a bolted-on solution, more or less, and not be part of the core library, but we would really like to support that moving forward.

Yeah, so that is a short overview of the library. I hope that makes sense. I would like to discuss with you whether this is something that you are interested in using. Did you identify anything that may be missing? Is there potentially an integration opportunity where you say: look, this is an existing tool that I have tried to work with, it only supports ELF for symbolization, say, and I have stripped binaries; it would be nice if there was an easy way to support other debug sources. Or is anybody working on something similar?

Lastly, I mentioned that it is a Rust project. We heard some concerns about the consumability in that space, so we would be interested, if you're interested in the project, whether that would pose a potential issue for you. Would it help, perhaps, if we provided pre-built binaries or something like that?

I wanted to ask a quick question. Obviously there are a lot of benefits to blazesym beyond just performance, but have you compared the performance of symbolizing binaries with llvm-symbolizer? Because in my experience it's, excuse me, quite slow. So even if you were just doing that, would there be a benefit to using blazesym just for performance?

That's a good question. We have not yet compared performance, but we can do that, for sure.

From production experience, a little bit:
llvm-symbolizer is a huge memory hog. Our profiling solution has an option to use local symbolization with llvm-symbolizer, and we basically can never use it, because it just crashes on production binaries. So yeah, I just want to echo: maybe you haven't benchmarked it against llvm-symbolizer, but it can't possibly be slower.

Could you elaborate on your plans with debuginfod? The reason I'm asking is because I'm very familiar with the internal Meta version of all of this, and when I read about debuginfod, I don't know, a year or two ago, I thought: this is a great place to integrate a remote symbolization service. Is that what you're thinking?

Yes, pretty much. So, our initial plan, and again, I'm neither super familiar with debuginfod, nor have we implemented most of this currently or fleshed it fully out: the plan was basically that in this process where you symbolize your addresses, you would get as input this build ID, because it is part of the normalization that happens on the device. We would provide a user callback where the user would be able to do something with this build ID, and potentially with the path and other metadata, and so you as a user would be able to make your, I don't know, HTTP API call to debuginfod. Now, again, when I say "you as a user", it could also mean that we provide a small shim layer on top of blazesym itself that already provides this functionality for you, and then you, as a true user of the overall ecosystem, could use this shim layer, and it would automatically just use the build ID, make a call to debuginfod, pull the respective binary, and do the symbolization for you.

Just to confirm: in what you're describing, is blazesym running on both the slow device, the client, and the symbolication service, the fast server? Yes, it is.
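As a rough illustration of what such a build-ID callback might do locally before falling back to a debuginfod fetch, here is a sketch of the on-disk lookup convention many distributions use, where detached debug info for a build ID lives under /usr/lib/debug/.build-id/. The function name and behavior are illustrative assumptions, not blazesym's API.

```rust
// Illustrative only: map a build ID to the conventional path where many
// distributions install detached debug info. A debuginfod-backed callback
// would instead fetch the file over HTTP when this local lookup fails.
fn debug_path_for_build_id(build_id: &[u8]) -> Option<String> {
    if build_id.len() < 2 {
        return None;
    }
    let hex: String = build_id.iter().map(|b| format!("{b:02x}")).collect();
    // The first byte becomes a directory, the rest the file name.
    Some(format!(
        "/usr/lib/debug/.build-id/{}/{}.debug",
        &hex[..2],
        &hex[2..]
    ))
}

fn main() {
    let build_id = [0xab, 0xcd, 0xef, 0x01];
    assert_eq!(
        debug_path_for_build_id(&build_id).as_deref(),
        Some("/usr/lib/debug/.build-id/ab/cdef01.debug")
    );
    println!("{}", debug_path_for_build_id(&build_id).unwrap());
}
```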
Yes, although, the way it is currently done, it could actually be only half the code: you can literally compile out half of it, the parts that you don't need for the slow system, and use only the normalization logic.

Great. Can you speak to the plans of integrating this into bpftrace? Are you intending to integrate it with bpftrace, I guess?

Yes, we do. I think the current plan is that we want to get a little bit more validation on Meta's infrastructure itself, and then, once we have confidence, we will speak with the bpftrace maintainers and hopefully get it integrated there. Again, I think one of the main questions will be: will it be a blocker for bpftrace to require a Rust toolchain, or how can we solve issues there?

Did you also have a question or comment? Can we have a Go language library?

Just a minor point: I think last time I checked, bpftrace was using the BCC symbolication library to symbolize, so you could just talk to myself and Yonghong, and we'd be in favor of it.

Sorry, what was the question? Oh, you were saying, in response to the question about replacing or improving bpftrace symbolication with blazesym: if I recall correctly, bpftrace relies on the BCC symbolication library stuff, so really you'd be improving the world for both bpftrace users and all the BCC tools as well. Yeah.

I'm not sure if you mentioned that it's already integrated into one of the BPF tools. So there is an example in BCC on how to use blazesym? Correct, yes. There is libbpf-bootstrap, I believe, is what you're referring to, right, and one of the examples already uses blazesym. Yes. Oh, okay.

Okay, yeah, so we have an example in libbpf-bootstrap, but one of the libbpf-based tools in BCC itself is also using it; I don't remember which one.
It's working with user-space stack traces, so that's not trivial to implement with custom code.

But if I may, to engage the other part of the room: what does the Isovalent side do? In Tetragon, do you capture stack traces? What do you use to symbolize?

kallsyms. We don't symbolize for now. Since we are a Go program, it's really difficult. We are a Go program and we also don't want to have cgo, so we've tried to remove all of the dependencies on libbpf, for example, just so that we can be a pure Go application using the cilium/ebpf library. So it would be great to have something like that, but unfortunately I'm not sure how exactly it would work for a Go program. Maybe we could shell out, but that's not something that we like to do in general.

I don't think we have any plans currently to write Go bindings, but it is a possibility. We could.

The thing is, I'm not sure if it's even possible to write Go bindings, because you would probably need to use the C bindings as the FFI for Go. If you have the C bindings, it's easy to use that, but it means that we would have to go back on our initial idea of being pure Go.

Maybe I don't fully understand, but would it be an easier task to build the Go code to talk to that fast server? Is that what you're suggesting? On our Go side, somebody else would be running the server, and we'd just have the API in Go. Like, the slow... basically, we're not a slow device, but... Well, you basically are the slow device in this context, right? I think that would be okay, maybe.

Right, but in your case, you would still want to run Go on the slow device. Yes.
Yeah. I mean, the slow device would basically be Tetragon in our case. So we'd be running there, collecting stack traces, and then, if we needed to, ship them to the fast server.

Yeah, so I think the point to take away there is that you would basically be on the hook for providing these normalized addresses to the fast server, as I dubbed it, which already is a little bit of logic, but it's obviously not the entire core of the symbolization.

And is there a path going back to the slow device? Like, can we get things out of the fast server? We might want to query the data or something.

So, this is just a schematic, right? blazesym itself would merely run here, and it would run here. This entire transfer of information is not taken care of by the library at all. We would provide the data; we assume it can be transported by whatever means, so you could also go the other direction, obviously.

So yeah, this slow/fast split, this remote symbolization scenario, is not necessarily what you need, right? We needed it for a weak device where we do not want to do symbolization. And we actually use this architecture in our production fleet-wide system as well, because we don't want to disrupt workloads: we try to do a minimal amount of work to capture the data, and maybe some additional metadata, and then do the heavyweight symbolization later on a dedicated fleet of hosts. But in your case, if you don't need to care about this and you just want to capture a stack trace and do symbolization on the same host, then you can combine those two in a remote setup, remote relative to your Go process, right?
And then just do RPC within the local host. As long as the server is Rust-based, it will be easy to consume this blazesym library, and from the Go side you just do gRPC, send the raw addresses, and get back the information.

I think in that model, I don't care what the fast server is doing, right? It's its own thing. I'm just talking to it over, like, a Unix domain socket or something, so it could probably be Rust. I'm not sure that would be a problem. Yeah.

Yeah, so I think that's what Andrii was getting at. This separation of concerns may not be relevant for you. You may be able to just do the entire symbolization on the device, because you mentioned it's not a slow device itself, and then you just spin up a gRPC server or whatever, and you talk to that to get the result of the symbolization.

Maybe, to make it a little bit easier for people to understand: what do you input into the slow device? Think about BPF, right? Because this is designed for the BPF use case. BPF gets a stack, so you get an array of raw addresses, and you know that it's coming from some PID, from some process, right? What the slow device does is: it takes this array and the PID, goes into maps, finds the ELF files, and then normalizes the addresses. It removes the absolute address at which the process was loaded and translates that into a file offset, I think, or maybe a virtual address offset, it doesn't matter, right? But it does all of that and nothing more. So on the output you just have the build ID, maybe the path to the binary, and those normalized addresses. It's still not human readable yet, right?
But it's enough to do the heavyweight post-processing, using DWARF potentially, later on. So that's the split between the slow and the fast, the local and the remote.

In typical use cases, like the BCC tools, they don't care about this. They capture a stack trace and do this kind of normalization and the DWARF-based or ELF-based processing on the same host. So you can combine those two, and that's what is typically done. This is kind of an advanced scenario. In your use case, it will depend on how fancy you want to be, basically: do you want to split or not? But in either case, you will need to hide this behind the RPC.

My only comment to that was that once you have gRPC, it doesn't matter if it's local or remote. It'll be a Unix domain socket, or it'll be wrapped in HTTP or something.

So I have a question. At what point... because, I mean, you need the DWARF, right, and you need the ELF. At what point do you get this? Is it at a previous point in time? For example, if you are on Kubernetes and a container starts, would I have to orchestrate the container system to pass you the container image, so that you can get the binaries? How would you do this in a cloud scenario, like a Kubernetes scenario?

You mean how the symbolization would work in a cloud scenario? So I guess there are two parts to this, right? One is getting the ELF and the DWARF image to the fast server. Oh, so you're talking about the remote symbolization specifically.
Maybe I shouldn't even have put up this example; it's a rather advanced use case. In a general setting, you would assume that you have the debug information available locally somewhere, and you can configure blazesym to know what this somewhere is. So you would, for instance, provide a path, say /usr/lib/debug, and it would look up your DWARF debug information below there. In the remote symbolization case, this logic would reside in the server, obviously. The assumption here is that this DWARF debug information, and the path to it, could be inferred based on the metadata: the path to the binary, say, or the build ID. Based on that, you could perform the symbolization itself. Does this answer your question?

Yeah, I think so. So, if you're running on a container system, at some point when the container starts, you would have to pass this information, the DWARF information, into the server, so that I can get the symbolization later whenever I need it, right?

You want to perform the symbolization in the containers, is that correct? So you want to do it outside? In Tetragon, for example, we monitor containers running on the system, right, and at some point you will get the stack trace. So for this to work, you would have to know the debug information for the binary inside the container, right?
No, you would not necessarily have to. That's basically where you could do only the capturing of addresses in the container. Along with this capturing of addresses, you would normalize the addresses, as I explained, and then you would capture the path and/or build ID of the binary. If you can make sense of this information outside of the container, then you're all golden, and you can still do the symbolization outside of the container itself.

Exactly. The build ID is supposed to be a unique ID, right, and given the build ID you should be able to fetch the original DWARF and ELF file from some other system. In production we actually have some sort of cache where, if we have the build ID, we can just say: give me the original package.

So you have some sort of service where you give it the build ID, and then... right. I don't know if that thing exists in... So that thing, in open source, is debuginfod, as far as I understand. It's a service, an RPC protocol, where you can say: give me the binary for this build ID. I actually don't know the details, but that's how it's supposed to work.

Thank you.

All right, thank you very much.