So Kyle and John are going to talk about process emulation. They're all the way in from Boston, so this is heat for them. Luckily they haven't melted. Let's give these gentlemen a big round of applause.

Alright, thanks everyone for coming. Like you said, this is about process emulation, and in particular automating that process emulation at scale. Before we get started, I want to talk about how we think about analyzing all the malware and all the binaries that we get. We have a fairly large ingestion pipeline, and we need to be able to process all these samples quickly and efficiently.

When we think about a sample, every sample has some finite amount of total features or data that we can extract out of it, and each of those pieces of information has a different cost. As you can see in the graph, the two domains for extracting that information are static and dynamic. Static is obviously the cheapest and the fastest, primarily because you can do it at scale very easily: throw it in Docker, or a function as a service, or whatever cloud provider, and you can extract those features really quickly. Dynamic, of course, is the opposite of that. It runs very slowly. You have to spin up lots of machines, you have to configure them in particular ways, and it just increases our cost for getting information.

As we collect more samples, particularly when malware comes in, there are some problems with that. When doing this research, we had three problems that we wanted to solve, particularly with regard to malware. The first one is obfuscation: packers, encryptors and, to a lesser extent, different types of installers.
What these do is limit the amount of information we're able to extract from static analysis, so they reduce that total set. Because we're not getting all the information we want from those samples, we're forced to push them into the dynamic analysis domain. But the dynamic analysis domain is more expensive. It increases our cost, which is not something that we like.

The dynamic analysis domain also has some problems, right? There are anti-analysis techniques: no malware likes to be run in a debugger or on certain types of hypervisors. Effectively there's this infinite set of cat-and-mouse things that you need to add to the machine to trick the malware into running. In addition, not all malware is equal opportunity. It doesn't run on every machine equally. Some malware is targeted, so it won't run the same way on a Windows machine in English as it would on a French machine, or with some other keyboard, or with any number of other features.

With those three problems in mind, we have three goals. First, we want to reduce the cost. We want to take the static analysis features that we lost due to obfuscation, remove those from the dynamic domain, and bring them back into the static analysis domain. That's the key one: we want to bring back parity of static analysis, or at least the cost associated with it. Then there's the second thing we want to do, and here we would hope to bring over as much as we can.
But what we really want to do is take some of the features that you would find inside the dynamic analysis domain and emulate those, bringing them into the static analysis domain. That increases the total set that we can extract via static analysis, or at least gets us more than you would traditionally expect. And the last thing, of course: we want to do all this at scale. Static analysis kind of implies scale, but we want to make sure that we can run all these things either inside of Docker or inside of a function as a service.

With that, we explored several options, and what we decided on was emulation. If anyone's not aware, there's a really amazing emulator out there right now called Unicorn Engine. It's basically a CPU architecture emulator, but it also emulates memory, and this is key. What we want to do is load up malware and replicate the Windows process memory inside the emulator with as high fidelity as we can. What this allows us to do, with emulation in particular, is get full introspection into every single instruction on the system. We get full introspection into system calls, so we can hook everything. And because we can hook everything and see everything, we can also mock out more of the system. This was key, and this is where we feel like we're adding a lot of value. We took an idea from unit testing in software engineering, where you're mocking out functions and services, and we thought: why can't we just mock out large chunks of the OS to allow the malware to execute further than we would normally expect?

There were other PE emulators out there. The three at the top I found on the Unicorn website, and these are pretty nice, right?
This is where we got some inspiration. What they do is take the PE file, load up a bunch of DLLs, throw it all into the emulator space and execute it. They do it pretty well, and in the output you get the parameters for the function, where it's called from, what the function name is, and the return values. They work quite well. We wanted to see, using this as inspiration, how we could take this the next step further.

With that in mind, we had some requirements for what we wanted to build. There is one extra requirement: we wanted the code to be simple. We wanted to build this framework for executing, so we wanted the actual code to be very simple. When we originally POC'd this, we did it in C, and then, to make it even more accessible, we switched it to Golang. That's just a side note.

Then there are the four core requirements for Binee. One, we wanted to formalize the mechanism for loading a PE file and all of its DLLs. We link them all up and get the imports working, so when a call is made it actually jumps to the real function within the DLL that was called. Once we had that, we could build a good hooking framework. The hooking framework is really the core part of Binee, and it enables everything else. We wanted this to be as simple as possible too: when you want to define a hook, you basically fill in a structure that defines the name of the hook, the parameters, and then the implementation. The third thing, building on that, is that we want to extend the current research a little bit and add mocking of some of the OS.
A lot of malware touches the file system, touches the registry, has threads, things like that. What if we can mock that inside the emulator and make the malware think that it is actually running inside a real PC? What this really allows us to do is add more fidelity to some of the function calls and take the next step, we think, in the PE emulation space.

And lastly, we want this to be highly configurable. The problem with dynamic analysis is that you have an infinite array of cat and mouse: figuring out anti-analysis techniques and whether the malware will run on a given machine. So we want to define a single configuration file that represents the entire environment of the machine, the entire context. The idea is that we can enumerate a bunch of different configuration files and run Binee against a particular sample with a hundred configuration files much faster than we could by spinning up a hundred different variations of those machines. You can change the keyboard, change the registry, do this very rapidly, have the malware take different branches, and really explore what it actually does.

So now we're going to talk about the implementation details of Binee. A quick history first: Binee started as just a side project. We were looking at those three examples that I showed you from the Unicorn website, and we were just playing around, throwing a PE into the emulator and seeing what it would do. Eventually we got to the point where we were actually getting interesting, useful data.
Then we decided it was actually worth pursuing as a real project, and now it's living inside our ingestion pipeline and everything goes through it.

From here on, the talk will have some screenshots and some demo material that illustrate what we talked about in the prior slides, and we'll do real demos at the end. This will be in roughly chronological order of the different problems that we had while we were running different samples through it.

With that said, what's the first problem when you're building an emulator like this? You need a parser to parse the PE file and the DLLs, get all the information out of them, link it all up, and throw it in the emulator. So the first thing we did, of course, was write a PE parser that would parse our PE file and all the DLLs. The reason we wrote our own parser is that we wanted to do all of the linking, and the updating of the base address, the imports and various other features, outside of the emulator. The main reason is that once we get everything in the emulator, we only want to start the emulator once. We don't want to interact with the emulator a bunch; it's more time-consuming and kind of problematic. So with our PE parser you basically load the PE, walk its import tables recursively, load up all the PEs you find, update the base addresses of the DLLs, and once those are all linked up, you throw that into memory.
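The loading order described above (load the PE, walk the imports recursively, give each DLL its own base address) can be sketched roughly as follows. This is a minimal Python illustration, not the project's Go code: the `Image` class, the `load_with_dependencies` function, the image sizes, and the starting base address are all hypothetical, and a real loader would also apply relocations and patch the import address tables.

```python
from dataclasses import dataclass, field


@dataclass
class Image:
    name: str
    size: int                                     # size of the mapped image in bytes
    imports: list = field(default_factory=list)   # names of the DLLs this image imports
    base: int = 0                                 # load address, assigned below


def load_with_dependencies(root, lookup, first_base=0x10000000, align=0x10000):
    """Walk import tables depth-first and assign every image a distinct base."""
    loaded = {}
    next_base = first_base

    def visit(name):
        nonlocal next_base
        if name in loaded:            # already mapped; this also breaks import cycles
            return
        image = lookup[name]
        image.base = next_base
        loaded[name] = image
        # Round the next free address up to the alignment boundary.
        next_base += (image.size + align - 1) // align * align
        for dependency in image.imports:
            visit(dependency)

    visit(root)
    return loaded


# Hypothetical sample: an executable importing two DLLs that both need ntdll.
images = {
    "sample.exe": Image("sample.exe", 0x5000, ["kernel32.dll", "user32.dll"]),
    "kernel32.dll": Image("kernel32.dll", 0x12000, ["ntdll.dll"]),
    "user32.dll": Image("user32.dll", 0x8000, ["kernel32.dll", "ntdll.dll"]),
    "ntdll.dll": Image("ntdll.dll", 0x20000, []),
}
loaded = load_with_dependencies("sample.exe", images)
for image in loaded.values():
    print(f"{image.name:14s} mapped at {image.base:#010x}")
```

Doing this bookkeeping before the emulator starts matches the point made in the talk: everything is linked up outside the emulator, and the emulator is started only once.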
Another reason we wanted to do this outside of the emulator is that once we finish linking everything up, we go back through every DLL and grab the exports. Now we know the real addresses of particular functions in the memory space, and we build our own map outside of the emulator between the DLL, the function, and where the address actually is in memory. This is what enables the hooking: when the emulator jumps to that address, we know to stop the emulator, run our hook if there's a hook available, and then move on.

You would think that this would work right away, right? But on modern binaries, on modern Windows machines, there are these things called API sets. When you compile a binary on Windows, anything after maybe Windows 7, I think some service pack in there, and you #include a particular file and do something like LoadLibrary, as the screenshot in this case indicates, you're not going to get the real DLL. So we didn't actually know what the DLL was. Fortunately, Geoff Chappell has some amazing documentation. He's done a lot of RE work on this, and we were able to pull that implementation directly into Binee and add it to our PE parser. Now, when our PE parser gets this kind of file, it takes that full API set name and looks it up in apisetschema.dll, which is in every version of Windows beyond, I think, 7, and which is the lookup table for the actual real DLL on disk. So now we can do the real linking and load all this into the memory space.

That was the first problem that we had. Now we have the PE file in memory and all the DLLs linked up, but we haven't quite hit go yet.
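The two lookups described above, API set name to real DLL and absolute address to exported function, might look something like this in outline. It is a hedged Python sketch: the table contents, base addresses, and RVAs are made-up illustrative values, and `resolve_dll` and `build_address_map` are hypothetical names. A real implementation would parse the schema data shipped with Windows rather than hard-code a table.

```python
# Hypothetical API set table: virtual "api-ms-*" name -> real DLL on disk.
API_SETS = {
    "api-ms-win-core-libraryloader-l1-1-0.dll": "kernelbase.dll",
}


def resolve_dll(name, api_sets=API_SETS):
    """Translate an imported DLL name to the file that should actually be loaded."""
    name = name.lower()
    return api_sets.get(name, name)


def build_address_map(loaded):
    """Map absolute address -> (dll, function) from each image's base and export RVAs."""
    addresses = {}
    for dll, (base, exports) in loaded.items():
        for function, rva in exports.items():
            addresses[base + rva] = (dll, function)
    return addresses


# Illustrative values only: these bases and RVAs are invented for the example.
loaded = {
    "kernelbase.dll": (0x20000000, {"LoadLibraryA": 0x1A2B0, "Sleep": 0x0F340}),
    "ntdll.dll": (0x20200000, {"RtlAllocateHeap": 0x2C10}),
}
address_map = build_address_map(loaded)
target = resolve_dll("API-MS-Win-Core-LibraryLoader-L1-1-0.dll")
print(target, address_map[0x2001A2B0])
```

With a map like this kept outside the emulator, deciding whether the current instruction pointer is a hooked function is a single dictionary lookup.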
The next thing we want to be able to do is mock the system. We're effectively doing unit testing of the malware. We don't need to implement all of the Windows subsystems, of course; that would obviously be painful. But we can implement subsets of them. At the end of the day, what we really want is for the malware to think that a particular API executed successfully. We don't really care about all the other details of the API. We just need to know: does this function return a value, and what does that value need to be in order for the malware to continue executing? In the case of CreateFile, all we really need to do is pop some values off the stack and put a valid identifier into EAX, so basically just not negative one. Then the malware will continue, thinking that the call executed successfully.

That brings us to hooking. I mentioned that we wanted hooking to be incredibly simple, and we believe it is. As I said earlier, we have a mapping of where all the hook addresses actually are. When you want to implement a hook inside the framework, you just define a struct that describes your hook. All you really need is the name of the function you want to hook and the parameters, which are just an array of strings. They also accept format string variables for display, which I'll explain on the next slide. The last two fields are optional. If you define the function, this is the actual implementation that you would be overriding. And then there's the return value.
If you don't provide these, it'll just assume its own default behavior. With that, we ended up with two types of hooks. There's a full hook, where we actually override the implementation in the DLL. When Binee is executing inside the PE, it gets to a function call; in this case it hits Sleep. It executes the call instruction to Sleep in kernel32 or wherever and goes to that address in memory. When we hit that address, because we have the lookup table, we hook that address, stop execution, and run our implementation, which can either override it in the full hook case or do what I'll talk about in a second, a partial hook.

Looking at that, we think this is fairly simple. All you do to define this hook is define the name, which is Sleep, and then a pointer to the hook, which contains the parameters, which are strings, and then the function. One thing that we think is pretty cool here, particularly with malware: if it's doing a Sleep very early on, sometimes it's trying to evade something. In this case we can actually intercept the parameter. Our function implementation gets two parameters built in, and those pass in the context of how the function was called: basically the parameters it was called with and some other metadata. What we can do is just increment the tick count of the CPU and return immediately. The malware thinks, okay, that must have executed successfully, we slept for five minutes or whatever. It can then call GetTickCount later, and it'll actually get an incremented tick count. In real time this is just instant. So that's a full hook.
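As a rough illustration of the full hook idea, here is a small Python sketch of a hook table in which Sleep advances a mock tick counter instead of waiting. The real framework is written in Go and its struct and field names are not shown in the talk, so `Hook`, `MockEmulator`, `sleep_hook`, and `get_tick_count_hook` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Hook:
    parameters: List[str]            # parameter names, used for display and stack cleanup
    fn: Optional[Callable] = None    # optional override of the real implementation
    ret: int = 0                     # value returned when no override is supplied


class MockEmulator:
    def __init__(self):
        self.ticks = 0               # mock millisecond tick counter
        self.hooks = {}

    def add_hook(self, name, hook):
        self.hooks[name] = hook

    def call(self, name, args):
        """Dispatch a hooked API call the way an address hit would."""
        hook = self.hooks[name]
        if hook.fn is None:
            return hook.ret
        return hook.fn(self, args)


def sleep_hook(emu, args):
    emu.ticks += args[0]             # pretend the requested time has passed
    return 0                         # return immediately, no real waiting


def get_tick_count_hook(emu, args):
    return emu.ticks & 0xFFFFFFFF    # GetTickCount returns a 32-bit value


emu = MockEmulator()
emu.add_hook("Sleep", Hook(["dwMilliseconds"], fn=sleep_hook))
emu.add_hook("GetTickCount", Hook([], fn=get_tick_count_hook))

before = emu.call("GetTickCount", [])
emu.call("Sleep", [300_000])         # the sample asks to sleep for five minutes
after = emu.call("GetTickCount", [])
print(after - before)
```

A sample that compares tick counts around a long sleep to detect a fast-forwarding sandbox would see the expected elapsed time here, even though no wall-clock time was spent.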
Fortunately, we don't want to hook everything, because that would just take forever. We really only want to fully hook things that make a system call. For everything else, which doesn't make a system call and transition into kernel mode, we do what we call a partial hook. Effectively this just gives us the details of the parameters for that particular function in a human-readable form. When that function is called, execution jumps to the function within the DLL, literally emulates that function, and then returns back to the PE file as normal.

As I mentioned, there's the parameters field. This is a particularly important field, and it does three things. First, it defines the string values; it sets the parameter names. So in the output, as in the bottom part of the screenshot, it'll give you human-readable values on the screen. This is information that we capture in our data pipeline. Second, it dictates what gets popped off the stack. When you return from a function, we have some helper functions built in. If we go back a slide, there's the built-in skip function for standard call; it recognizes how many parameters are in this list and pops those values off the stack. It does all of this for you. And lastly, the parameters are what you use when you're called into that function and want access to the arguments for your implementation: you reference them by the offset they're at in this list.
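To make the role of the parameter list concrete, the sketch below shows how a list of names can drive both argument decoding and stack cleanup for a 32-bit standard-call function. It is an illustration in Python under simplifying assumptions: the stack is modelled as a byte buffer, `esp` is an offset into it, and `read_args` and `stdcall_return` are hypothetical helper names, not the framework's own.

```python
import struct


def read_args(stack, esp, names):
    """Read one 32-bit value per parameter name, starting just above the return address."""
    values = struct.unpack_from("<%dI" % len(names), stack, esp + 4)
    return dict(zip(names, values))


def stdcall_return(stack, esp, names):
    """Return (return address, new esp) as a callee-cleanup return would leave them."""
    (return_address,) = struct.unpack_from("<I", stack, esp)
    return return_address, esp + 4 + 4 * len(names)


# Stack at function entry: return address, then the arguments in order.
stack = struct.pack("<4I", 0x00401000, 0x1388, 0xAAAAAAAA, 0xBBBBBBBB)
names = ["dwMilliseconds"]

args = read_args(stack, 0, names)
return_address, new_esp = stdcall_return(stack, 0, names)
print(args, hex(return_address), new_esp)
```

Because the number of names determines how many slots are skipped on return, a hook with a wrong parameter count would leave the emulated stack misaligned, which is one reason the list has to match the real function signature.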
A quick example, and there'll be several of these screenshot demos throughout. If we look at the highlighted lines, the first one is where we're actually running Binee with various parameters. The -v flag is verbose mode: it spits out all the instructions leading up to a particular function call. The last highlighted line is the actual function call that we're getting. We've hooked it, and you can see it's at a different address, so we know that this is where the DLL was mapped and where that function is within that DLL. Just next to that address, which is slightly different from where the PE is, is an F. That indicates that it's a full hook rather than a partial hook. Knowing this is very useful for debugging when you're going through a piece of malware and trying to get it to take different branches. Then you get the actual name of the function and the parameters, which can be in a human-readable format since they take format strings, and then of course the return value, which is also pretty important.

Alright, so at this point we basically have a PE parser, and we're able to load in DLLs and get everything set up to a certain extent. We have some hooks implemented, and we have our hooking system implemented. That's all great, and this is where we hit the feature parity mark with some of the other emulators that we were describing earlier. So we asked ourselves, are we done here? Of course not. We're able to set an entry point and then start the emulator.
We set everything up as Kyle mentioned earlier, press go, continue forward, and hopefully get through most of the sample and have it terminate successfully. Well, some of what we were doing with partial hooking wasn't working properly for us. Take GetCurrentProcessId, as you can see on the screen here. We were partially hooking this, mainly because it doesn't make a system call, and typically we only want to fully hook functions that ultimately make system calls. GetCurrentProcessId was actually accessing Windows userland objects, and we did not have those set up at all. Before we could set up these userland objects, we had to implement segment registers, which was kind of a glaring oversight when we first set this all up. Fortunately, and thanks to Chris Eagle on that, there are a lot of examples of how to do that. We were able to take the code that he had written and suit it to our needs.

Now that we had the segment registers, we were able to start building out Windows userland structures. Most of what we've done is the TIB and the PEB, which are the thread information block and the process environment block respectively. We wanted to start filling those out because it turns out that malware, or really any program, uses them extensively. Typically, when the Windows loader starts up a process, it populates a lot of these structures so that whatever's running can reference them.
In many cases a program uses that information so it doesn't have to make, say, a context switch into kernel mode to get certain information. So we wanted to build these out, and much like the rest of the requirements of Binee, they needed to be configurable. With that, we started building out the TIB and the PEB. At first it was mainly just as we saw fit. We'd see samples that would hit a certain portion of these structures and fail from there, and we'd say, okay, we're missing this certain field. That's how we built up the structures.

These structures are actually quite large. For those of you who have seen them before, they are quite long, and with every version of Windows they keep adding more and more. As we went through the Windows versions making sure that we had all the information, these structures ballooned in size quite a bit. But the important takeaway is that we only populate the fields that we needed at that time. Most of what we filled in was usually lower in the structure, and the higher-level stuff we mostly ignored. If we had an issue, we would populate that field and then move on.

And again, much like we did with the PEs and the DLLs when we were parsing, we were setting up the TIB and the PEB ahead of time and having them in a semi-working state before we started the emulator itself. Typically that's done as the loader is going, but in this case we again just set it up ahead of time with mock values that would make the sample run, be happy, and go through what it needed to do.

So now that we had something going, we were actually getting through some samples. It was great to see.
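A minimal sketch of pre-populating these structures with mock values might look like the following. It assumes the 32-bit x86 layout, where the thread block holds a self pointer at offset 0x18, the process and thread IDs at 0x20 and 0x24, and the PEB pointer at 0x30, and where the PEB holds the BeingDebugged byte at 0x02 and the image base at 0x08. The function names, buffer sizes, and addresses are hypothetical, and only a handful of fields are filled in, in the same spirit as the "populate what the sample needs" approach described above.

```python
import struct

BLOCK_SIZE = 0x1000   # arbitrary buffer size for this sketch


def build_teb_peb(teb_addr, peb_addr, pid, tid, image_base, being_debugged=False):
    """Build mock thread and process blocks using 32-bit field offsets."""
    teb = bytearray(BLOCK_SIZE)
    peb = bytearray(BLOCK_SIZE)
    struct.pack_into("<I", teb, 0x18, teb_addr)     # self pointer
    struct.pack_into("<I", teb, 0x20, pid)          # ClientId.UniqueProcess
    struct.pack_into("<I", teb, 0x24, tid)          # ClientId.UniqueThread
    struct.pack_into("<I", teb, 0x30, peb_addr)     # pointer to the PEB
    struct.pack_into("<B", peb, 0x02, int(being_debugged))   # BeingDebugged
    struct.pack_into("<I", peb, 0x08, image_base)   # ImageBaseAddress
    return bytes(teb), bytes(peb)


def get_current_process_id(teb):
    """Mirror what the userland routine does: read the PID out of the thread block."""
    (pid,) = struct.unpack_from("<I", teb, 0x20)
    return pid


teb, peb = build_teb_peb(0x7FFDE000, 0x7FFDF000, pid=0x1234, tid=0x5678,
                         image_base=0x00400000)
print(hex(get_current_process_id(teb)))
```

In an emulator, the FS segment base would point at the address where the thread block is mapped, so an emulated read through FS lands in this buffer without any hook being involved.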
We got through a few example programs that we had written to test certain things out. We were going from start to finish, and we were seeing the TerminateProcess at the end, which was really exciting. Then we came to a crossroads where we wanted to continue adding more. As Kyle mentioned with the requirements, there are a number of subsystems that we want to mock out, especially at the OS level. But before we can do that, we have to set up some more of what Windows requires, and for that reason we started working with handles.

Windows uses handles for a number of things: basically anything you can think of on the lower-level side, especially file access, registry access, and anything to do with threading. That's all done via handles, which are effectively just a pointer to some sort of information in memory. What we did was create a superstructure of all the types of handles that we might use and put them in one place. Using that, we can abstract away most of what Windows would expect to have inside the handles. We can hand-wave and pretend that they're actually there, any sample that we put through can continue forward, and we can build subsystems on top of that. So it's a basic building block for building subsystems and mocking out the OS.

Of course, you can't do anything with handles until you've made a proper memory manager. So we started to build a memory manager inside Binee itself, separate from the emulator.
We took a normal heap implementation and built it up to suit our needs a little bit more. It does everything you would expect from a heap memory manager, and then we built on top of that. The way we handle the heap is this: for anything that references the heap, say malloc or free, any library calling some style of memory management, we partial hook. Ultimately those calls are going to hit NTDLL. When we hit NTDLL, we fully hook that memory management function, pass it over to the Binee memory manager, and do everything from there. From there we can start doing more than just heap management. Everything that's not on the stack we throw into the heap, and the memory manager takes care of everything for us. With that, we can start handing out atomic IDs for our handles, which point to our superstructures outside the emulator. From there we can have those structures filled out and build on top of them: file stuff, registry stuff, et cetera. So that's all now handled.

Okay. I feel like this was the point during development where it switched into being a real project, and we were confident that we were going to get useful information, whereas before it was touch and go. Basically what that means is we've mimicked the userland space quite well, right?
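One way to picture the handle scheme described above is a table, kept outside the emulator, that maps unique IDs obtained from the memory manager to a single handle record type. The sketch below is a simplified Python stand-in: `BumpAllocator` is a toy allocator rather than a real heap, and `Handle`, `HandleTable`, and the example path and registry key are hypothetical.

```python
from dataclasses import dataclass
from typing import Any


class BumpAllocator:
    """Toy stand-in for a heap: hands out unique, aligned addresses and never frees."""

    def __init__(self, base=0xA0000000, align=8):
        self.next = base
        self.align = align

    def malloc(self, size):
        address = self.next
        self.next += (size + self.align - 1) // self.align * self.align
        return address


@dataclass
class Handle:
    kind: str            # "file", "regkey", "thread", ...
    path: str = ""       # file path or registry key, when applicable
    obj: Any = None      # backing object kept outside the emulator


class HandleTable:
    def __init__(self, heap):
        self.heap = heap
        self.handles = {}

    def add(self, handle):
        handle_id = self.heap.malloc(4)      # the unique address doubles as the ID
        self.handles[handle_id] = handle
        return handle_id

    def get(self, handle_id, kind):
        handle = self.handles.get(handle_id)
        return handle if handle is not None and handle.kind == kind else None

    def close(self, handle_id):
        return self.handles.pop(handle_id, None) is not None


table = HandleTable(BumpAllocator())
file_handle = table.add(Handle("file", path="C:\\temp\\out.bin"))
key_handle = table.add(Handle("regkey", path="HKEY_CURRENT_USER\\Software"))
print(hex(file_handle), hex(key_handle))
```

The emulated program only ever sees the numeric ID; everything the ID refers to stays on the host side, which is what lets the file system and registry mocks be built on the same table.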
As John was mentioning, we have a heap, which is critical, and we have all of the userland processes and the hooking in place, and things like that. So the next thing we wanted to do was see if we could actually capture some files. If we have a piece of malware that writes a file, can we capture that, save it, and analyze it later? So the first thing we started implementing when we got into mocking the OS was the file system. Binee operates similarly to Windows when it searches for a DLL on load: Binee has several different paths where it looks for files. The first path is its own folder. When you start Binee, you can give it a file path, and that will be its root directory for the file system. Within there you can copy in your DLLs or any other files that you want. So when Binee does a CreateFile, it will grab whatever that file actually is.

Coming back to hooking again, what does the malware actually need? It just needs a valid handle. So we have a mock file system that can support the handles and the abstractions that go along with interacting with the system. First of all, all of the file functions, like CreateFile, WriteFile, all these things, are going to be fully hooked, because eventually these would make a system call and we want to intercept all of those.
When we fully hook that, we hit that address within the emulator, and the emulator knows we're at an address that's fully hooked. It pops out, similar to the transition from user to kernel space, to our own file system handler. It gets one of these new handles and passes in all the parameters from CreateFile, specifically the path and the permissions. With that, it instantiates a file, creates a handle, and puts that handle in Binee's lookup table. Since Binee is getting a valid handle from the heap, which is just going to be an atomic address, it passes that back in EAX as the return value of CreateFile. So now when you run CreateFile, it gets a real handle that the malware, or the file, can then use.

The other thing we wanted: if we're running multiple samples within this file system, we don't want to pollute the file system, the registry, or anything else, so any writes need to go to our sandbox location. If the malware is installing some persistence mechanism or whatever, we want to capture that so we can analyze it later. So if there's a CreateFile with any type of write bit set, Binee understands that they're trying to write here, and instead of going to the real path or the real file that they're opening, it redirects that into the sandbox location. Then WriteFile gets called and gets hooked, because we know the address. It goes into the Binee file system handler, grabs that handle, and whatever is being written to the file just gets redirected there.
And then it just returns however many bytes were written successfully, and the malware really doesn't know any different. Here's a little screenshot of that actually happening; look at the highlighted lines. The first one is CreateFile, and we're getting the actual file, the path where it's located, and the permissions. This is super helpful when we're doing data analysis later, because we're capturing all of this data. This is the more human-readable form if you're running in manual mode; there's also a -j option where we can get all of this in JSON, and that's part of our ingestion pipeline. But the important thing here is that we know the path, we know the permissions, and we know the return value. The return value is the handle. So the malware actually gets a valid handle here; it doesn't get a negative one. The malware thinks, okay, we're good, we're not in a sandbox, so it continues executing. And then the last highlighted line is WriteFile. The first parameter here is our valid handle. When we hit WriteFile, Binee intercepts that call and runs its own implementation, since it's a full hook. And of course we return a valid output for that, which is the number of bytes written. Again, the malware doesn't really know any different. In the console you can ls the file, and everything goes to this temp directory. We created a malfile.exe, which is not a valid exe, of course, and then we just cat the bytes. When we saw this, we thought, all right, we can definitely do something with this. Of course, this is a very trivial example, and we'll show some more demos with real malware.
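As a rough illustration of the file-system hooks described above, here is a minimal Python sketch. It is not Binee's code: the class name `MockFileSystem`, the method names, the handle numbering, and the path-mapping rule are all assumptions made for the example. It only shows the idea of handing back a valid handle from a lookup table and redirecting any open with a write bit set into a sandbox directory.

```python
import os
import tempfile
from itertools import count

# Mirrors the Win32 GENERIC_WRITE access bit; used here only as a stand-in
# for "any type of write bit".
GENERIC_WRITE = 0x40000000


class MockFileSystem:
    """Toy handle table with write redirection (illustrative only)."""

    def __init__(self, root, sandbox):
        self.root = root        # read-only tree holding DLLs and support files
        self.sandbox = sandbox  # every write lands here instead of the real path
        self.handles = {}       # handle value -> open file object
        self._next = count(0x1000, 4)  # hypothetical stand-in for a heap address

    def _map(self, base, win_path):
        # Turn "C:\\Temp\\a.bin" into "<base>/Temp/a.bin", dropping the drive.
        parts = win_path.replace("\\", "/").split("/")
        parts = [p for p in parts if p and not p.endswith(":") and p != ".."]
        return os.path.join(base, *parts)

    def create_file(self, win_path, access):
        """Hook body for CreateFile: returns a handle, or -1 on failure."""
        try:
            if access & GENERIC_WRITE:
                real = self._map(self.sandbox, win_path)  # redirect the write
                os.makedirs(os.path.dirname(real), exist_ok=True)
                fobj = open(real, "wb")
            else:
                fobj = open(self._map(self.root, win_path), "rb")
        except OSError:
            return -1  # what the caller would see as an invalid handle
        handle = next(self._next)
        self.handles[handle] = fobj
        return handle  # an emulator would place this in EAX

    def write_file(self, handle, data):
        """Hook body for WriteFile: returns the number of bytes written."""
        fobj = self.handles.get(handle)
        if fobj is None:
            return 0
        try:
            written = fobj.write(data)
            fobj.flush()
        except (OSError, ValueError):
            return 0
        return written


fs = MockFileSystem(tempfile.mkdtemp(), tempfile.mkdtemp())
h = fs.create_file("C:\\Users\\victim\\malfile.exe", GENERIC_WRITE)
n = fs.write_file(h, b"not a real exe")
print(hex(h), n)
```

The sample never learns where the bytes really went: it only sees a handle and a byte count, while the dropped file sits under the sandbox directory for later analysis.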
So we just kept continuing on. This whole process is very iterative. We get a new sample of malware, we execute it, and we see how far it goes. And it's like, oh, now we need to implement the file system. Oh, now we need to implement this. The next thing we needed to implement was the registry. Similar to the file system, every registry call is eventually going to drop into a system call, into kernel mode, so we want to full hook all of those. When we full hook those, we've built a registry within Binee that has a bunch of helper methods for accessing the different keys. One cool thing about this is that we wanted to be able to load any value from your Windows registry into the Binee registry, and we wanted to make that as seamless as possible. So if you export from RegEdit, you can import the file that it saves to disk: you just copy and paste those entries into our configuration file, and Binee will read them and load them up. This is really useful if malware is dependent on certain keys, or if you're trying to evaluate what it's actually storing in some keys. You can toggle these in more or less real time, and it's very fast. So all registry calls get hooked, they go into the Binee subsystem for handling the registry, it does whatever operation is required, and it returns valid data back to the malware. Again, the malware has no idea. The next thing is configuration files. As I mentioned, the registry keys here are directly copied from the RegEdit export.
And then inside of Binee's registry, there's a handler that converts these to bytes for whatever the proper type is. We've abstracted that away, so you don't really have to worry about it in the hook. It just copies the bytes back into the emulator and it works as you would expect. The other thing to note, and one of our requirements, was that this configuration file needs to be very easy to edit, very easy to understand, and all-inclusive. I've just cherry-picked a few values here, but there's obviously a lot more. The root file system is how you define where the actual files are located. This is typically where we put all the DLLs, NLS files, and other files that support the execution. You can also have code page identifiers, process IDs, pretty much anything that would go in the TEB or the PEB or that would help the execution of the sample; it all goes in this one file. And then of course the registry: you can just define it like this, and it will parse everything properly and build the tree, so access is pretty efficient. So, another quick sample, this time looking at the registry. We have three highlighted rows. With the registry keys, we get great insight into what's actually happening, and these are the kind of IOCs that we really want when we're doing some of our machine learning. With the query calls, we can look at the particular value name. And next we can see what's actually getting set into those registry keys. And because this is, again, all returning successfully, the malware doesn't really know.
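To make the registry idea above concrete, here is a small Python sketch of loading RegEdit-style export lines into a tree and converting each value to bytes by type. It is only an illustration under stated assumptions: the `MockRegistry` class, the `to_bytes` helper, and the sample key are hypothetical, and only three value types from the `.reg` text format (strings, `dword:`, and `hex:`) are handled.

```python
import struct

SAMPLE = r'''
[HKEY_LOCAL_MACHINE\SOFTWARE\Example]
"Name"="value"
"Count"=dword:0000000a
"Blob"=hex:de,ad,be,ef
'''


def to_bytes(raw):
    """Convert the right-hand side of a .reg line to the bytes a hook would copy back."""
    raw = raw.strip()
    if raw.startswith("dword:"):
        return struct.pack("<I", int(raw[6:], 16))       # 32-bit little endian
    if raw.startswith("hex:"):
        return bytes(int(b, 16) for b in raw[4:].split(",") if b.strip())
    text = raw[1:-1].replace("\\\\", "\\").replace('\\"', '"')
    return (text + "\0").encode("utf-16-le")             # NUL-terminated wide string


class MockRegistry:
    """Toy registry tree; key and value names are matched case-insensitively."""

    def __init__(self):
        self.root = {"subkeys": {}, "values": {}}

    def _node(self, key_path, create=False):
        node = self.root
        for part in key_path.split("\\"):
            children = node["subkeys"]
            name = part.lower()
            if name not in children:
                if not create:
                    return None
                children[name] = {"subkeys": {}, "values": {}}
            node = children[name]
        return node

    def load(self, text):
        current = None
        for line in text.splitlines():
            line = line.strip()
            if line.startswith("[") and line.endswith("]"):
                current = self._node(line[1:-1], create=True)
            elif line.startswith('"') and current is not None:
                name, raw = line[1:].split('"=', 1)
                current["values"][name.lower()] = to_bytes(raw)

    def query(self, key_path, value_name):
        """Hook body for a value query: bytes on success, None if missing."""
        node = self._node(key_path)
        if node is None:
            return None
        return node["values"].get(value_name.lower())


reg = MockRegistry()
reg.load(SAMPLE)
print(reg.query(r"HKEY_LOCAL_MACHINE\SOFTWARE\Example", "Count"))
```

Keeping the type conversion in one helper is what lets the individual hooks stay simple: they look up a key, get bytes, and copy them into emulator memory.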
So now that we have a few subsystems, a file subsystem and a registry subsystem, with configuration files to back all that up, we started running this against more and more samples, and we were finding some interesting results. In the case of a few samples, we were hitting some interesting threading stuff. In some cases we weren't doing anything with threading, and we wanted to. Binee, by nature, because it's an emulator, didn't have any sort of thread management. So we implemented a thread manager. When we implemented this, we treated it like a global interpreter lock. It's basically just a round-robin scheduler, and everything is single-threaded. We time-slice each of the threads, so a given thread runs for a certain instruction count, say n instructions, and then the scheduler moves on to the next thread. That lets us hand-wave away most of the threading issues we were running into, and from there we can allow malware to run threaded. When you think about what malware actually wants, it just cares that there are threads running and that it didn't get an error with a thread. As long as it got a successful return value out of CreateThread, it assumes the thread is running. It's probably looking for information to be updated at some point, and it doesn't really matter when. That's where this round-robin scheduler really shined: it was something so simple, but we were able to get samples to run through completely even though they were multi-threaded. So here's a quick example of that in action. Pay attention mostly to the left side, to all those highlighted numbers. The numbers between the brackets are the thread IDs.
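The round-robin time slicing described above can be sketched in a few lines of Python. This is a toy model, not Binee's thread manager: the `ThreadManager` class is hypothetical, each "thread" is just a sequence of named steps standing in for emulated instructions, and `quantum` plays the role of the n instructions per slice.

```python
from collections import deque
from itertools import islice


class ThreadManager:
    """Single-threaded round-robin scheduler over emulated 'threads'."""

    def __init__(self, quantum):
        self.quantum = quantum  # instructions each thread runs per time slice
        self.ready = deque()    # queue of (thread id, remaining steps)
        self.next_tid = 1
        self.trace = []         # (thread id, step) in execution order

    def create_thread(self, steps):
        """Hook body for CreateThread: always succeeds and returns a thread id."""
        tid = self.next_tid
        self.next_tid += 1
        self.ready.append((tid, iter(steps)))
        return tid

    def run(self):
        while self.ready:
            tid, steps = self.ready.popleft()
            executed = list(islice(steps, self.quantum))
            self.trace.extend((tid, step) for step in executed)
            # A full slice means the thread may have more work: requeue it.
            if len(executed) == self.quantum:
                self.ready.append((tid, steps))


tm = ThreadManager(quantum=2)
tm.create_thread(["a1", "a2", "a3"])
tm.create_thread(["b1", "b2", "b3"])
tm.run()
for tid, step in tm.trace:
    print(f"[{tid}] {step}")
```

Because only one thread ever executes at a time, no locking is needed inside the emulator, and the thread ID in each trace line changes only at slice boundaries.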
And this is the standard output of Binee with all the function calls. As you can see, as time goes on, the thread ID changes because we're context switching into a different thread. From there, it just continues forward until it completes. All these threads are really doing, in the highlighted lines, is creating threads, and each thread at some point prints something. You can actually see some of the printf calls with the format strings right there. On the first printf, you can see the %d, et cetera, so you can see what's going to be put into that line when it's actually written to the console.
So, another thing we ran into when we were running samples against Binee was that some malware started dropping files. The case that caused us to pursue DllMain was a sample that was dropping DLLs to disk and then loading them dynamically. We weren't doing anything with loading DLLs dynamically up until this point. In fact, all we were doing was parsing a DLL, as part of parsing the PEs, and from there we would just run the sample. We weren't actually running the DllMain of those DLLs themselves. So we went about implementing DllMain, because that's something that's required for us to do a LoadLibrary properly, and also for populating more of the TEB and the PEB. In some cases, especially for some system-level DLLs, DllMain will actually populate some values for us in the TEB and the PEB. So we can have those run, and they'll populate most of those fields for us, which is great.
We don't have to spend time with a fine-tooth comb going through and adding more values to those, which would probably drive us crazy at some point. So we actually tried two ways of implementing DllMain. This is for any DLL that we were loading statically, as if we're pretending to be the loader. What we originally tried was to set up the stack with the arguments that were required and set the entry point to the start of DllMain if it existed; if it didn't, we would skip over it. The return address from that would point to somewhere else in Binee's memory where we had put interrupts. The interrupts would pause the emulation, and then we would update the stack for the next DLL that needed to be run. If that one also had a DllMain, we'd run through that again, and so on and so forth. But it turns out that when we were pausing the emulator like that, it would leave the emulator itself in a somewhat unclean state. So we wanted to stick more to pressing the button once: load everything up ahead of time, and go. The way we accomplished that is through a ROP chain, essentially. The idea is that we set up the stack such that when we start the emulation, the first entry point is the first DLL that we need to load. Typically, this is ntdll and then kernel32. So we would set up the stack with the arguments needed for that, and then we would put a return address onto the stack. But in this case, the return address for the first DLL would be the entry point of the next DLL that needed to run.
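A sketch of this chained stack layout, in Python, may help. It assumes a 32-bit stdcall DllMain taking three arguments, so each DLL contributes four stack words: a return address followed by its arguments. In this sketch the last DLL returns into the entry point of the PE under test. The function names, the addresses, and the choice of 0 for the reserved argument are assumptions for illustration, not Binee's actual implementation.

```python
import struct

DLL_PROCESS_ATTACH = 1  # fdwReason value passed to DllMain at load time


def build_chain(dlls, pe_entry):
    """Build the initial stack words for a chain of DllMain calls.

    dlls is a list of (image_base, dllmain_address) in load order.
    Returns (first instruction pointer, list of 32-bit stack words).
    """
    words = []
    for i, (base, _entry) in enumerate(dlls):
        # Each DllMain "returns" into the next DllMain, or into the PE at the end.
        next_ip = dlls[i + 1][1] if i + 1 < len(dlls) else pe_entry
        words += [next_ip, base, DLL_PROCESS_ATTACH, 0]
    first_ip = dlls[0][1] if dlls else pe_entry
    return first_ip, words


def simulate(first_ip, words, pe_entry):
    """Walk the chain the way a stdcall `ret 0xC` would, recording each call."""
    ip, sp, visited = first_ip, 0, []
    while ip != pe_entry:
        ret, hinst, reason, _reserved = words[sp:sp + 4]
        visited.append((ip, hinst, reason))
        sp += 4   # pop the return address and the three arguments
        ip = ret
    return visited


# Hypothetical addresses for two system DLLs and the sample's entry point.
dlls = [(0x77000000, 0x77001000), (0x76000000, 0x76002000)]
pe_entry = 0x00401000
first_ip, words = build_chain(dlls, pe_entry)
stack_bytes = struct.pack("<%dI" % len(words), *words)  # what would be written to memory
for ip, hinst, reason in simulate(first_ip, words, pe_entry):
    print(hex(ip), hex(hinst), reason)
```

With the whole chain written to the stack ahead of time, the emulator is started once at `first_ip` and never has to be paused between DLLs.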
So we would just keep chaining these together on the stack, so that when we hit a return in a DLL, it would pop the address of the next DLL into the instruction pointer, and from there we can go through all the DLLs until they're done. Finally, when we hit the last DLL in line, the return address for that is just the entry point of the PE that we're actually testing. This turned out to be quite a great way to do it. We were able to hit the emulation start button once and it was able to just continue through, and we were getting some great results with a lot more fidelity, especially since a lot more of the setup of the mock OS is done by the DLLs themselves. From there, we could take the same idea and apply it to, say, LoadLibrary with a full hook. We could just give it a few different argument values, and then do more from that. So, I think that's about it.
So yeah, now we have some demos. The first half of these demos just go through how you'd run this at scale, or how it would look at scale, I guess, and then we'll show some real malware. I'll try to pause it. In the beginning, there's a bunch of parameters. Probably the most notable ones are -v and -vv. Oh yeah, so the first one is our API sets. We just thought this was cool, so we added it in there. It was super helpful when we were building this, because we have to be able to resolve those quickly. But there's a verbose mode. When you run in verbose mode, like I said earlier, you'll get the instructions, which is super helpful for reverse engineering things.
But then if you use -vv, you'll also get the registers and the stack, or at least a subset of the stack. And then there's another flag, -d, which gives you the actual DLL that the function is in, so you'll get the DLL, a colon, and then the function name. So just going through here, this is a very simple printf example, basically a console application. We start at the beginning, at the entry point, and we can go all the way through. We're getting nice, detailed information about the function parameters. So this is pretty cool: we're getting human-readable output from, as John was mentioning, printf, CreateFile, WriteFile. We're getting really good data here. Let's continue forward. So now we're just showing some of the verbose mode. Again, when you're reverse engineering a piece of malware, having the instructions is really helpful, especially when you're trying to manipulate the config file to maybe take a different branch. You can go through here and see what's actually happening after the function call, modify the configuration file, which changes the environment, and now the binary runs completely differently. And then -vv is sort of the same thing, except, as I mentioned, you now get the registers and a dump of the stack as well. It's useful, though not as useful as the other one, but it's interesting information nonetheless, particularly when you're debugging by hand. So now we'll get to the threads, I think, or no, one more application. Oh yeah, the -d flag. If you ever need to know what the actual API set was, this is the reason we implemented the API set flags, -a and capital -A, within Binee: we were getting a lot of this and we needed to know, all right, what API set is that? Okay, what DLL is that?
All right, what's the function we need to look at to see if it's a full hook or a partial hook or something like that? And again, just looking at the data, CreateFile, WriteFile, you're getting all these IOCs. You're seeing what the malware is doing without actually executing anything. Okay, so this is a key piece right here: the -j flag converts all of this to JSON. When we put this into our cluster and run all of our samples, this is what we get back. Then we can go through and parse these for all the different interesting functions we're looking for. In this case, we're looking at WriteFile. Pause that. The interesting part here is the parameters: you get the string value of the parameters, and you also get the values being passed to them, so you know exactly what's happening. This function is slightly less interesting, and the malware that's coming next is a little more interesting. But again, you can parse all that data in real time, and we can do this at scale, because this will run on Linux, it will run in Docker, and it will also run in whatever function-as-a-service your cloud provider offers. And again, threading: this was actually fairly simple to implement, but it was really cool that we got it to work. It was just one of those aha moments.
Alright, so now for some real malware, quickly. This first case is, I believe, from Orangeworm, one of their samples. It runs all the way through, and it's part of that family. I think there were maybe 30 samples in that family. The interesting thing here is that we get the IOCs, so we know what service it's actually trying to start, and we can do this without actually even running the sample.
So now we know where to look. If you're hunting in your environment for a particular service or a particular piece of malware, you can see some of these IOCs, hunt for them, and figure out where these things actually are. Additionally, there's this GetProcAddress call. This is useful for getting around some obfuscation and some packers. If they hide functions from the import table, you're not going to get them statically, but you can get them dynamically now, and this is a pretty useful feature for any of your statistics. This next one was an interesting sample because it has multiple layers of packing and unpacking. Let me highlight these. This is particularly interesting because this was previously hidden from static analysis, since it wasn't in the import table, but now we're actually getting it in real time. So we can add these to our machine learning model and say, hey, this thing actually does use these imports. These are the functions, these are the names, this is the DLL. And additionally, right below it, you can see we're getting registry values, or registry keys. But what's interesting about this sample is that it's completely packed, and what it also does is unpack a DLL in memory, write it to disk packed, load it, and then unpack it again. And that's this part where you see ReadFile. So we're getting all of this. We actually get the DLL that it's dumping. We get the file name, a bit of the naming scheme, and then, again, we can see the LoadLibrary calls and GetProcAddress. We're getting all of these IOCs. Now we're adding these.
We're tagging this binary in our ingestion pipeline with all of these new IOCs. And we're actually able to collect new samples, too. So now we have a new DLL. It was written to disk, but it was redirected into our temp location, our sandbox location, and now we can look at it later.
So we're open sourcing it. Sorry, quickly, we had a couple of things that we want to work on next. But yeah, we are open sourcing it, so we're more than happy to help anyone get involved and get started. These are some of the things we want to work on. So thank you.