Next up, Georg is going to talk about OGRT: how you can keep an eye on what your users are doing on your cluster.

Exactly, so I'm going to give more of the system administration side. I'm Georg Rath, I'm a systems engineer, currently freelancing. So I'm the guy installing CentOS 5 on your machine and ruining your modules.

I want to start at the beginning: what is a job, from the administrative side? A user gives a script to your scheduler, whatever is inside the script gets executed, and you get basically start time, end time and resource usage. For me, that was not enough. So what else is happening? You have program executions, and the programs load shared libraries depending on the environment variables that are set and depending on the loaded modules, which is basically the same as the environment.

So the question is: why would you want to know what's going on inside your job, since you're basically just an infrastructure provider? One of the questions is: what software do my users run? Because, as you just heard, it's a wild west out there, everybody is installing their own software, so I would like to have a way to figure out what the users are actually doing. Why? Because maybe your users are using, I don't know, an unoptimized Python version from their home directory that they got from a colleague, or a binary build of a software that you would have optimized for your machine, maybe with InfiniBand support. Maybe some BLAST library had a bug, so maybe rounding was problematic, and you need to inform your users that there was a problem; optimally you would like to tell them which of their jobs had a problem and what they need to do to fix those. From a more administrative standpoint it's interesting, if you build software for your users, whether you can reduce the profile: if you have software that does not get used, you can basically throw it away and not rebuild it when you build a new machine. And it helps the users figure out whether their environment is the same: if you run a program and the library path is set wrong, or you build a program and the environment is not the same, your user is going to run into problems.

So how would you approach this problem? We tried several steps. The first one is: you ask your users. That's simple to do. You go to the user and ask: okay, what do you do on the cluster, how can we help you? And then you won't like the answer, because it's usually "we use this pipeline, and we got it from somebody", and you would have to check everything that's going on in there, which is very hard. Then you repeat this for every user, which obviously does not scale.

Then, if you're using a module system... like, how many people here know what a module system does? Alright, so I'm going to explain this briefly. A module system is a way to load and unload software in different versions. So module load, in this case EMBOSS for example, loads EMBOSS in the most current version; if you do an unload, it unloads it. It does that by setting environment variables. Now, what users like to do is load all the software they use often in their profile. So you have the problem that a module load, if you track that, is not actually the software the users use, it's just what gets loaded, and if that happens in the profile, well, that's not exact.

So what does OGRT do? It tracks every program execution in a job, and it tracks which shared objects a program loaded.
So you get a very exact representation of what happened at the time the program was run. It can embed a signature into programs and shared objects, which happens at compile time, and that part is completely optional: you don't need to embed signatures in your programs if you don't want to. And it currently outputs the data into Elasticsearch or Splunk, but it's pretty open, so you could use any backend you'd like; I just recommend something you can do analytics in.

What's unique about this: it works without a launcher. You don't need a wrapper around the programs, you don't need to do anything; you just set an environment variable and it tracks everything that's happening inside a job. It's very lightweight: everything, like reading the signatures, happens in memory, it doesn't hit the disk, so it's very fast. It's transparent, which in this case means that if anything fails while executing it (and it gets executed with every program execution), nothing happens: your user will never see it and the programs keep on running. It's resistant to outside influence, which means the environment has no impact on the functioning of OGRT; it has no runtime dependencies, so there is very little that can influence it. And, more for the system administrators: it's very easy to deploy.

So how does it work? That is the interesting question, so let's look at it in depth. To execute OGRT with every program that gets started, we use an environment variable of the loader called LD_PRELOAD, which forces your program to load a shared object. GCC and glibc have a constructor mechanism which gets called every time a shared object is loaded. That basically makes OGRT run inside the address space of every program that you execute, so you get every execution: not only the stuff that you built as a system administrator and that gets loaded as a module, but also the stuff that your users build in their home directories. You get everything, with one tiny exception: programs that don't use the loader, which would be statically linked programs. There are ways around that, you could compile it in, but I currently don't do that.

So we talked a bit about this watermarking. What it does, exactly: you have an ELF file that gets executed, and OGRT embeds a UUID into a section of that ELF, which is done with GCC and the binutils. And what's interesting about this: if you embed this section into the ELF, it gets loaded on execution, so it's in memory and you don't need to go to disk. Also an interesting question is: why would you have such a signature? Because the same path can hold a different executable, your user could copy a program to their home directory for whatever reason, stuff like this. So you can identify the software exactly, not only by path, because the path is not a unique identifier, and you can discern user-generated programs: if a user recompiles a program, and you're wrapping the linker, you get the recompile with its own unique identifier.

This would be an example of a signature; you can look at it with readelf. The first entry here is OGRT's, which for readelf is an unknown type, and then you have the ABI compatibility note and the build ID that's done by GCC, so that's a mechanism that's pretty well known and works very well. And down here, if you use OGRT to look into the signature, in this case of R, you see that you have a unique ID, "allocatable", which means it gets loaded into memory when it's run, and the version number.
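To make that concrete, here is a minimal sketch in C of the two mechanisms just described: a watermark section embedded at compile time, and a constructor that runs when the object is preloaded. This is illustrative only, not OGRT's actual code; the section name, symbol names and environment variable are invented.

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Watermark: a custom ELF section embedded at compile time. "used"
       keeps the compiler from discarding it; since it is initialized
       data, the section is allocatable and mapped into memory at load
       time, so a scanner never has to touch the disk to read it. */
    __attribute__((used, section(".note.sketch_signature")))
    static const char sketch_signature[] = "SKETCH-UUID-0000";

    /* Constructor: when this shared object is listed in LD_PRELOAD, the
       loader maps it into every dynamically linked program and calls
       this function before main(), inside the program's address space. */
    __attribute__((constructor))
    static void sketch_start(void)
    {
        if (getenv("SKETCH_ACTIVE") == NULL)  /* stay inert unless enabled */
            return;
        fprintf(stderr, "[sketch] started in pid %d\n", (int)getpid());
    }

Built with something like gcc -shared -fPIC -o libsketch.so sketch.c and exported via LD_PRELOAD, every dynamically linked program started in that shell would announce itself, and readelf -S would show the extra .note.sketch_signature section.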
So the functioning of OGRT is like this. The black box on the outside would be the address space of a program; in this case, since we're in the Big Data devroom, it's R. libogrt.so gets loaded into this executable and executes its code, which goes over all of the stuff that the loader loaded and checks whether it finds an OGRT signature. If it doesn't find one, that's okay, it just notes the path. It wraps all of this up into a package, including other information, for example the environment. OGRT is very configurable: at compile time, your sysadmin decides what they want to monitor. You can send certain environment variables where you know there tend to be problems, or you can send all environment variables. You can send the loaded modules. You can decide to blacklist some stuff, for example /usr/bin, because you don't want to track ls. So that can be done.

Now the question is: that's a lot of data, so how do you persist it, how do you make sure it gets somewhere where you can use it? So there is a transport and there is the server. You have libogrt.so, which runs inside your program. It packages up all the data (environment, shared objects, et cetera) into a UDP packet and sends it to the server; a rough sketch of what this client side could look like follows below. The server then sends it on to configurable outputs. At the moment, the data stores implemented are Elasticsearch and Splunk. File is also there, but that's for debugging only; don't use that in production.

The server itself is written in Go. It just takes the UDP packet, opens it up and decodes it. Internally it's a protobuf, so if you would like to write your own OGRT client, it would be compatible. It decodes the data and forwards it to the outputs. Because of the bursty nature of program runtimes (if you have, I don't know, 10 compiles running at the same time, that's a whole lot of data volume), the server buffers the data: it takes the data, puts it into a buffer, and if the backend is too slow, it just smooths over these spikes. There is a tiny embedded web server which displays metrics. In this case there is only one input at the moment, so you see how many packets it processed, what the rate of packets per second was, and how long it took. The thing to notice is that the input, because it only copies data to a buffer, is very quick. And here you have an Elasticsearch output configured at the bottom, and you see its stats there. In this case the input has 270,000 packets and the backend is a bit slow, which might be because it ran on my laptop.

I implemented some very, very basic lookup functions in this web interface. What you see here: there is a BLAST library that was broken, and I know the signature of the library because I built it. I press lookup and it displays the affected jobs. If I cross-reference these jobs with my accounting database, I can inform the users that this library was broken and tell them what steps they should take to fix that.

Having said that, because this uses an Elasticsearch backend (and I recommend using Elasticsearch or any other analytics platform of your choice), you get access to a whole ecosystem. You can use Kibana to do interactive analysis of the data. An interesting use case: if you have the resource utilization of a job in a Grafana dashboard, you can say "give me the data from Elasticsearch" and overlay it, so you would see the usage of the job, let's say memory usage, overlaid with the program executions. So, I don't know, you start a memory hog and then you see the memory usage go up.
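As a hedged sketch of that client side: roughly, the preloaded library walks the objects the loader has mapped and ships one UDP packet to the collector. This is not OGRT's real code; the real client serializes a protobuf message, while here the payload is just newline-separated paths, and the port number is invented.

    #define _GNU_SOURCE
    #include <link.h>        /* dl_iterate_phdr() */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    /* Invoked once per loaded object; appends its path to the payload.
       An empty name is the main executable itself. This is also where a
       real client would scan each object for the signature section. */
    static int note_object(struct dl_phdr_info *info, size_t size, void *data)
    {
        FILE *out = data;
        fprintf(out, "%s\n", info->dlpi_name[0] ? info->dlpi_name : "(main)");
        return 0;  /* 0 means keep iterating */
    }

    __attribute__((constructor))
    static void send_report(void)
    {
        char payload[1400] = {0};  /* stays inside a single UDP packet */

        FILE *mem = fmemopen(payload, sizeof payload - 1, "w");
        if (!mem)
            return;                /* fail silently, stay transparent */
        dl_iterate_phdr(note_object, mem);  /* in memory, no disk access */
        fclose(mem);

        int sock = socket(AF_INET, SOCK_DGRAM, 0);
        if (sock < 0)
            return;

        struct sockaddr_in srv = {0};
        srv.sin_family = AF_INET;
        srv.sin_port = htons(7971);          /* hypothetical collector port */
        inet_pton(AF_INET, "127.0.0.1", &srv.sin_addr);

        sendto(sock, payload, strlen(payload), 0,
               (struct sockaddr *)&srv, sizeof srv);
        close(sock);
    }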
Plus, stability and performance are quite good using Elasticsearch.

This slide is more for the administrators here: it's very easy to deploy. The server is Go, so it's a static binary; you drop it somewhere and look into the configuration if you want to, and that's it. The client is basically configure and make, and there is a vendoring script, so if you want to try this out, it downloads the dependencies automatically and compiles them with the flags that are necessary. You type make install, and then, down here, you preload libogrt and set it to active; if you don't set it to active, it doesn't do anything. You can also turn it silent so it doesn't accidentally produce output in the background. If you then open a Bash here, everything you do in there gets sent to the server and processed.

So, in conclusion, OGRT is more of a sensor that gives you deep insight into what's running on a machine, inside a job on a cluster. It's a versatile tool: it's very configurable to your needs, so if this is too much data and you want to blacklist or filter something, that's very possible. It's very easy to deploy; I mean, I might be biased, but it doesn't take more than 10 minutes. And look at what you can do with the data that it gathers. You get a sense of the software that your users use, which includes what they build themselves. You can troubleshoot problems: if you see that a job failed, you can check preemptively whether there is anything obviously wrong, without talking to the user; not that you don't want to talk to the user, but it saves the roundtrip. You can retroactively inform users about buggy libraries. And reproducibility seems to be a big thing: if you know how a program was run, including the whole environment it ran in, I would say that contributes to reproducibility.

What I want to do in the future: build some of the queries that are interesting into the web interface, very tiny stuff. Then maybe add another transport: infrastructures like this usually have syslog, so it would be interesting to be able to send the data into syslog instead of to the server, if you want to. Another thing that would be very interesting: there is a kernel facility called eBPF, where you could basically do the same thing that OGRT does, but in the kernel, which would make it faster; that would be an interesting thing to evaluate. And there is what you can do with the linker: track not only which shared objects your program loaded, but also whether it actually uses them, because the linker provides some hooks that make this possible (sketched right after this).

So with that, I'm going to go to questions, if there are any. Thank you.
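Those linker hooks are presumably the glibc rtld-audit interface; here is a minimal, hypothetical sketch of an audit library that reports every object the loader maps in. Again, nothing in it is OGRT code.

    #define _GNU_SOURCE
    #include <link.h>
    #include <stdint.h>
    #include <stdio.h>

    /* The loader asks the audit library which interface version it speaks. */
    unsigned int la_version(unsigned int version)
    {
        return LAV_CURRENT;
    }

    /* Called every time the loader maps a new object into the process. */
    unsigned int la_objopen(struct link_map *map, Lmid_t lmid, uintptr_t *cookie)
    {
        fprintf(stderr, "[audit] loaded: %s\n",
                map->l_name[0] ? map->l_name : "(main program)");
        /* These flags ask the loader to also report symbol bindings via
           la_symbind*(), i.e. which functions are actually used. */
        return LA_FLG_BINDTO | LA_FLG_BINDFROM;
    }

Compiled as a shared object and activated with LD_AUDIT (analogous to LD_PRELOAD), this would see every object as it is loaded.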
The question was: would this clash with anything else that uses LD_PRELOAD? As far as I'm aware, you can preload multiple things, and, as far as I'm aware, it has been running in production and does not clash with anything. It does not modify the binary in any way, so it's very self-contained.

The next question was: is there an alternative method, instead of using LD_PRELOAD? Yes, you could compile it into the program itself as an object file. I used LD_PRELOAD because it's very easy to do. You could also go through the GOT, but I did not want to do that. You could enforce that LD_PRELOAD is always set to the value you want it to be set to, because basically, once you're inside the program, you can overload and hook and make sure that it stays loaded. So you could do it like this. Usually you do it in a module file: you do module load ogrt, it sets LD_PRELOAD, and from then on you're set. From the audience: graduate students just walk in, papers get submitted, and suddenly the environment changes.

Could you use this software just to track when a version of some software hasn't been used for a long time? Sure. "We're going to delete this." "Don't delete it!" "Well, you haven't used it for six months." Okay, I'm going to repeat the question. The question was: can you use this to track if software hasn't been used? And the answer is yes, you can very well do that.

The question was: what did you find out, what did users do wrong? I would say it's more of a reactive thing: if you know that there is a problem, you start using this data. I mean, it would be interesting if you could use this data with some kind of method that automatically decides whether something is going wrong, but that would probably involve some machine learning; I don't know, that's not my area. Is there something with AWK? Oh yes. AWK was being used a lot; we found out that the most valuable bioinformatics tool is apparently AWK. It gets used a lot. And if you know, for example, that AWK gets used a lot, which may sound stupid, then you can say: okay, there are optimized versions of AWK, so by using an optimized version you could get your users a speedup, for example.

One more question: how much of a big data problem are you creating here, how much data comes in for every process or every job that gets run? In volume it's quite small: a packet that a program sends is about one and a half kilobytes, including everything. How much it takes once it's stored in Elasticsearch I don't know, but of course, over time, it's big data, it's gigabytes.

Alright, thank you very much, Georg.