Go, for reverse engineers, is kind of more annoying, right? Well, we'll see that it's not so much, but at least there are enough people who don't like reversing Go programs that I think anyone writing malware in the language is probably going to slip through the cracks. Anyway, what else is there to say here? Basically, the main takeaway from this slide is that Go binaries are not that much more difficult to reverse engineer than other programs, as we will see, but they do require a different approach that not everyone knows about. This is why I think a lot of people do not like reverse engineering Go binaries. But thanks to this tutorial, hopefully you will see that it's not that big of a deal at the end of the day.
As for the malware that we will be analyzing in the course of the tutorial: after the introduction, where I show you around the Go language a little bit, we will look at Sunshuttle. Sunshuttle is a malware that was disclosed in the context of the SolarWinds incident. I'm pretty sure that all of you know about it. It was disclosed by CISA and Mandiant in March 2021. Sunshuttle specifically is the full-featured backdoor that was discovered in this supply chain attack. I don't really want to go back over the SolarWinds incident in too much detail because I'm sure that most of you are familiar with it. Long story short, SolarWinds is a company that provides IT management software. It was breached, and then it was leveraged to infect very high-profile targets. If you'd like to know more, there are so many blog posts out there that I'm pretty sure you can find more information about it. But at the end of the day, after the victims were selected through the supply chain attack, a piece of malware was deployed to them. The final piece of malware that was deployed on the machine is called Sunshuttle. This is what we will be looking at, because it was written in Go.
So let's talk about how this tutorial is going to be structured. We will do a little bit of theory, and it will be very short, right? I don't think any of you is really interested in learning about Go in terms of the language itself. Or maybe you are, but if you want to learn how to write proper Golang programs, this probably isn't the right place. We will only figure out the most basic things that we need in order to find our way through programs. We will talk about tooling a little bit, especially when it comes to IDA and the useful plugins, because there are some of them. And finally, we'll dive into the practical work. We will first start with Hello World programs. Looking through those, we will get our first bearings in binaries compiled with Golang. Then, when we have figured out the basics, we can jump directly into an actual real-life APT malware and see how to tackle that. We definitely won't be able to look at the whole of the Sunshuttle malware in only two hours or two and a half hours. But I'm confident that we will see enough that you will be able to finish this on your own, and then maybe look at other Go malware on your own later on as well.
So anyway, let's talk a little bit about Go. The first thing that you have to know is that it generates executables that are statically built. What does this mean? Well, it means that the whole Go runtime and all the necessary library functions ship with every program you compile. We will see that a simple Hello World program turns out to weigh something like two megabytes. In this sense, this is really the reverse engineer's worst nightmare because the proportion of useful code written by the malware author, compared to the quantity of code present in the binary, is really very, very small. It's negligible.
It means that if there's one kilobyte of malware code, you will get 1.99 megabytes of library code that is probably useless to you and documented somewhere. So it's not great, but things used to be a lot worse. That's the silver lining. IDA Pro, since version 7.6 I think, or something like this, introduced a lot of improvements when it comes to Go binaries, because I'm pretty sure that their customers complained about it. IDA has become very efficient at recognizing library functions, and that's a really good thing because in the past you used to have to download weird projects from GitHub to maybe get this information. Maybe you would have to create type libraries out of the source code of the Go version that was used to compile the program, and so on. It was really a nightmare. So now things are kind of good.
So yeah, let's move on. When it comes to writing Go code, Go feels a little bit like a scripting language. It doesn't have semicolons. The syntax is very uncluttered, but it has very strong typing. The compiler turns out to be very, very strict about lots of things. If you have an unused variable somewhere, the code is not going to compile. If you have an import that isn't needed, the code won't compile either. So it's really a strict language in that sense. As far as reverse engineering is concerned, we don't care about that too much because we're not going to suffer through the compiler's whims. But it's kind of a good thing because it means that the authors of the program will be forced, at least to some extent, to write proper code. So at least the code quality that we will be facing should be, well, kind of okay, right?
Since the compiler forces people to write sort of clean code, it means that we will have to fight with the complexity of what the program is doing and so on, but we won't have to fight with the added complexity of crappy software developers, which is the case in so many malware strains that you see out there: memory not being released, variables that are not used for anything, et cetera. Much of that doesn't happen in Go, just because the compiler is not going to let you. So this is kind of a plus for us.
Another thing in Go is that there is no exception mechanism. It just doesn't exist, or at least not as far as I know. But one very important thing, one key component of the language, is that functions can return multiple return values. That's sort of a problem in the sense that when you look at IDA, the decompiler tries to decompile programs to C, or pseudocode that looks like C. And in C there's just no way to represent multiple return values, right? So the decompiler tends to be completely confused by everything that Go is doing, and we will not be able to use it. But anyway, one thing that is important is that a very common pattern for Go functions is that they will send out a result, but they will also return an error object, and this error object represents whether the function succeeded or not. In many languages, I don't know, in Java or Python, .NET, maybe even in C++ if you like, you would have a function that returns one value, and if something doesn't work as expected it throws an exception that you can then react to. In Go, the way that it works, even for library functions, is that the function returns both its return value, maybe multiple return values, and this error object that is hopefully null, or nil in Go.
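To make the multiple-return-value pattern just described concrete, here is a minimal sketch. The `divide` function is a hypothetical helper invented for illustration; it is not taken from the tutorial samples. It returns a result together with an `error` that is `nil` on success.

```go
package main

import (
	"errors"
	"fmt"
)

// divide is a hypothetical helper (not from the tutorial samples).
// It returns both a result and an error object; the error is nil on success.
func divide(a, b int) (int, error) {
	if b == 0 {
		return 0, errors.New("division by zero")
	}
	return a / b, nil
}

func main() {
	// The caller collects both return values and checks the error explicitly,
	// where other languages would typically catch an exception.
	q, err := divide(10, 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("quotient:", q)

	// The failing case: the result is ignored and only the error is inspected.
	if _, err := divide(1, 0); err != nil {
		fmt.Println("error:", err)
	}
}
```

This `value, err := f(...)` followed by `if err != nil` shape is what most calls into the standard library look like in source form, which is why it shows up so often when reading Go binaries.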
One silver lining is that the standard library provided by Go is really, really extensive. It means that every time you want to do something a bit complex, like opening files, establishing network connections, cryptography, whatever, you will not have to write the code on your own. You will just call some function from the Go standard library. That's pretty cool because it means that most of the code that we'll be encountering from malware authors is really some sort of Lego block assembly of function calls that come from the library. The complexity of the most complex operations will usually be offloaded to library functions. And if we are able to recognize those library functions, which we are because the tools are now kind of efficient, then we won't have to suffer so much, because we won't have to look into complex code and try to figure out that, oh, yeah, this is Base64, this is AES, et cetera. This is just going to be provided to us. That's actually pretty cool.
So the way that I usually approach Go programs is that I try to rewrite the original source manually. I look at all the calls to library functions, I look at all the arguments, and based on this, the whole structure of the program usually tends to surface. If you look at all the library calls, all the arguments that are passed to them, and the way that the return values are being used, it turns out that the meaning of the program tends to become apparent. So I gave you this image of Lego blocks that are assembled out of calls to library functions. That's really what we're going to be doing. We know what the Lego blocks are, and we're just going to look at the bigger shape that is constructed out of these blocks to figure out what the malware is doing.
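As an illustration of the "Lego block" idea, here is a small sketch of what author-written code often amounts to: a thin layer gluing standard library calls together. The `decodeAndHash` function and its input are hypothetical and chosen only for the example; they do not come from Sunshuttle.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// decodeAndHash is a hypothetical function, used only to illustrate the idea.
// All the "hard" work (Base64 decoding, hashing, hex encoding) is done by
// standard library calls; the author's code only connects them.
func decodeAndHash(encoded string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(encoded)
	if err != nil {
		return "", err
	}
	digest := sha256.Sum256(raw)
	return hex.EncodeToString(digest[:]), nil
}

func main() {
	// "aGVsbG8=" is the Base64 encoding of "hello".
	digest, err := decodeAndHash("aGVsbG8=")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(digest)
}
```

In a disassembler that recognizes the runtime, a function like this would mostly show up as a short sequence of calls into `encoding/base64`, `crypto/sha256` and `encoding/hex`, so reading the call names and their arguments is usually enough to reconstruct what it does.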
But we won't really be diving into each and every function. That might not be the most proper way to reverse engineer stuff, but it's a way that works very quickly, and in many cases this is what we are looking for. Anyway, when we are looking for explanations about the library code, it turns out that the Go library is pretty well documented. There is a website. It's there: golang.org/pkg. Oh, I think it has changed now, it's pkg.go.dev. But the old URL probably still works. In any case, this documentation is pretty well done. We'll be using it a lot. When we encounter a function we don't know, we can look at the documentation and find all the information about that function: what it does, but also what the arguments are, what the return values are, and so on. So it's pretty useful, and this is mostly what we need to know there.
Another point of interest I can mention is that all the strings and all the global constants that are used in the program get, interestingly, all mashed together and stored somewhere in the binary. I'll show you that with IDA later on. It's kind of confusing initially, but it is the way it is, right? We had best know about it because we can't change it anyway. I'll show you exactly how it translates, but be ready to be a bit surprised by this.
Another very important thing is how function calls are made. In Go there are actually two ABIs. One is the older one that was used before version 1.17 (and before 1.18 for other architectures). Back then, function calls looked a bit like C or C-like languages in the sense that every argument passed to a function would be placed on the stack. So you would see the first, second and third argument being written to the stack, whatever, and then a call to a given function. The return values work the same way: they would come back through the stack as well.
But since version 1.17, the arguments are passed via registers, in the order RAX, RBX, RCX, RDI, et cetera. I put the link there so you can see exactly where this is documented. All the documentation is on the Go website and it's very extensive. I maybe don't recommend that you read through all of it, but it's kind of useful to know, right? And this leads to a question, which is: how do we find the Go version that was used to compile a program? Now, I think that when you open any Go program and you look at a function call, it's going to be extremely obvious which ABI is being used, because if you see stuff being moved to registers then you know it's at least version 1.17, and if you see arguments written to the stack and then a function call, then it's earlier than this. Still, this is something that is sort of important to know. But I think that as time passes this old ABI is going to become more and more rare, because we've now moved on to this register-based thingy, and this is probably what they will use for the next 10 years or maybe forever. Interestingly, the Go team explains that just making this change allowed them to get something like a 5% performance increase throughout all the programs that they tested. Passing arguments through registers instead of through the stack, which is memory, is obviously going to be faster, but having some metrics is pretty cool.
Anyway, let's talk a little bit about the tooling. In the course of this tutorial, I will be talking about IDA Pro a lot. I apologize for this because I'm pretty sure that a lot of people out there like Ghidra better. I just haven't had the time to switch to Ghidra yet. Eventually I will, but I haven't been able to at the moment. So I will be talking about IDA.
If you want to follow this tutorial with Ghidra, I'm pretty sure you can, but I will not be able to provide a lot of support there, right? I do not really know how this tool works, so support on my side will be limited. But I am pretty sure that since Ghidra has a strong community, all the plugins and all the features that I will be mentioning have been ported to Ghidra one way or another, at least when it comes to plugins. When it comes to recognizing those functions and the libraries, et cetera, it feels to me like IDA is doing this manually. If you compile the program with the very latest version of Go and you don't have the very latest version of IDA, usually IDA is going to be a bit lost. So it's kind of a problem there, in my opinion. I don't know how they are doing this in Ghidra, whether they are more current or how they follow up on this, but maybe you can tell me.
Anyway, first of all there is this repository, which is called AlphaGolang. It was developed by my good friend Juan Andrés Guerrero-Saade. He's a researcher at SentinelOne, and he maintains this repository. I contributed one script to it. I will show you what some of these plugins do. The general idea is that some of the features that used to be provided like this, as IDA plugins, have now been integrated directly into IDA. So this repository is less crucial than it used to be, but it still contains pretty useful stuff from time to time. I will walk you through the various scripts that are contained in this repository later.
And now let's talk a little bit about the samples. If you have IDA Pro or Ghidra, if you have everything on your machine, you will be able to do everything locally. This is my recommended way of following this tutorial, because the other option is to use one of our online VMs. Only 30 of them are available.
I think we'll be fine based on the number of people in the room, but the thing is that those VMs are in a data center in Amsterdam, which is kind of far from here. Usually I tend to do trainings in Europe, so I don't know exactly how the latency is going to work out for you guys. The access to these VMs is through remote desktop or something similar, through the web browser, and you can do anything from there. But there might be a little bit of delay, and I don't know exactly how bearable it's going to be. So if you can use your local machine, if you have an IDA license and so on, I suggest you use that. If this is not the case, then you're welcome to try one of those VMs. I hope the connectivity will be comfortable, but it might just not be. If you do use those VMs, the samples are going to be in a folder on the desktop. There should be a folder on the desktop called samples or tracks or something like this, and then you go to the very last folder, which is Go or Sunshuttle, and you will find everything in there. Otherwise, if you want to work on everything locally, you can find the archive that contains all the samples that we will be using today on the server p.kwi.ski, which is shorthand for proxy.ketkowsky.fr. In there, if I remembered to clean up my files, there should be only one archive, and this is just a zip file containing the samples. The password is infected, as usual. And yeah, of course, it contains a sample of Sunshuttle. This sample is a live one, a real one. So if you unzip it on your local machine, it's very likely that your AV is going to delete it, so be careful about that.
All right, and this is it for the theory part of this tutorial. These are all the slides that we're going to have to suffer through together. Right now, I'm just going to switch to my desktop and open IDA, and then we're going to get in there.
Yeah. The password? Oh yeah, yeah. So there is a password. I think I put NorthSec 2023 with capital N, capital S. I'm not sure whether there is a space between NorthSec and 2023 or not, but it should be either NorthSec2023 or NorthSec space 2023. Either of those should work. If it doesn't, let me know and I will create a new URL or something like this, I'll figure it out. By the way, if you have any questions at any time (I don't know if people can ask questions over the internet as well), feel free to just stop me or raise your hand. This is a small enough class, and I think we can have this sort of friendly discussion where I don't have to speak the whole time. If there's something that you are wondering about, feel free to just stop me there, and I will try to explain.
All right, so let me just close down PowerPoint, and now I will switch to IDA. The very first program that we're going to look at together is this very small thing. You should have the source code as well. It's really a super simple program. Even if you don't speak Go, which is actually my case, you should be able to figure out what this does. This is the sort of program I like to create initially when I get into a new language because it shows one of the most important things, which is how exactly function calls work. In this case, I just have a main function in the main package, which is the way that you define the entry point in the Go language, as far as I know. Then you have a sum function that receives three integers and returns an integer. The only thing that this simple function does is return a plus b plus c. Our main function calls this sum function and then prints the result. And that's it. So, a super simple Go program. I'll point out that it has to be compiled with optimizations disabled, because otherwise the Go compiler is smart enough to just inline everything.
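For readers who do not have the sample source at hand, the program described above could look roughly like the sketch below. This is a reconstruction from the description, not the original file: the variable name `res`, the exact string literal and the build command in the comment are assumptions.

```go
package main

import "fmt"

// sum receives three integers and returns their sum, as described in the talk.
func sum(a int, b int, c int) int {
	return a + b + c
}

func main() {
	// Assumed build command, to keep the call to sum visible in the binary:
	//   go build -gcflags="-N -l" main.go
	// -N disables optimizations and -l disables inlining.
	res := sum(1, 10, 100)
	fmt.Println("result = ", res)
}
```

The arguments 1, 10 and 100 are convenient because they are easy to spot in the disassembly when looking for the call site.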
So you wouldn't see too much. If everything gets inlined, then of course you won't see your function call, and that's not great. But anyway, let me start with the very first thing I want to show you, which is how exactly we know which version of Go was used to compile the program. It turns out there is a simple trick to do this, and I'm going to show it to you right now. What I do is open the program with a hex editor. Is this big enough for you guys to see on the screen, or should I try to zoom a little bit? Okay, I'll try to zoom there. Okay, should be better now, right? So this is just a simple hex editor. The one I like is called 010 Editor. There are many, many different ones out there, so you really don't have to use the same one as me. The only thing that I want to show you there is this super advanced trick where all you have to look for in the program is "go1.", like this. If you do, then you will find a string. I think it's the only one there, yeah. And then you see go1.16.3. So it's just there. As far as I can tell, in all the Go programs that I've analyzed so far, if you just Ctrl+F "go1.", you will end up on a string that contains the version number of the Go toolchain. So it's sort of useful, first of all to figure out whether the program is written in Go or not, although this is very obvious when you open it with IDA, but also if you want to know in advance which ABI is going to be used. This is a quick trick that you can use.
Anyway, let me close this hex editor because we're done with it. I will just open IDA64. A very telling thing, I think, is that you see you have Go written there and, right next to it, "work on your own". So basically IDA is warning you that it isn't going to help you much there. Let me try and drop this here to open this Go program. It will take a bit of time because those binaries are kind of big.
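The hex editor trick described above can also be scripted. The sketch below simply searches a file for the first string that looks like `go1.X` or `go1.X.Y`; the function name `findGoVersion` and the regular expression are illustrative choices, and the first match is assumed, not guaranteed, to be the toolchain version.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
)

// goVersionRe matches strings such as "go1.18" or "go1.16.3".
var goVersionRe = regexp.MustCompile(`go1\.\d+(\.\d+)?`)

// findGoVersion returns the first version-looking string found in data,
// or an empty string if there is none. This mirrors a Ctrl+F for "go1."
// in a hex editor; it does not parse the binary format in any way.
func findGoVersion(data []byte) string {
	return string(goVersionRe.Find(data))
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: goversion <binary>")
		return
	}
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if v := findGoVersion(data); v != "" {
		fmt.Println("found:", v)
	} else {
		fmt.Println("no go1.x string found; possibly not a Go binary")
	}
}
```

If a Go toolchain is installed, running `go version` with the path of the binary as argument should report the same information for binaries that have not been tampered with.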
But as you see here in the function window already, IDA is able to figure out a lot of functions: fmt, ptr, whatever, it actually doesn't matter too much. As you can see, IDA recognizes the function names. As I mentioned, I have seen cases in the past where IDA was not able to recognize them very well. But usually, if you update IDA to the latest version, it tends to fix things. So if you open a program and you don't see any function recognized like this, just update IDA if possible and things will be better for you.
One of the things that I want to show is this. You see this very simple program that I had there. It's about 16 lines long, not that much. It gets compiled as a two-megabyte binary. So that's a lot. Let's see exactly what's in there. You have many, many functions. All these functions come from the Go runtime and we basically don't need them. One thing I've noticed is that the developer's functions tend to be put at the end of the list, at the end of the binary. So if you are opening a program at random and don't really know where to start from, just take the slider all the way down and usually they are at the very end. It tends to be the case. Also, you will notice the names that are given to the packages. For instance, we had this function sum from the main package, and if you go back to IDA, you will see the package in the function name. This is a way in IDA to see exactly where the functions are coming from. You do get the package name, and it sort of helps you figure out which package a function belongs to. Do I have to share again? All right, are we back?
So let us start directly from this main.main function. As far as I can tell, the entry point of a Go program is always this main.main, so that's helpful. Let's see exactly where in this function the code that we wrote is. We do recognize this "result =" string there. So that much we can recognize.
But already, it looks very different from the code that we had initially. Now, your intuition might be to just press F5 and try to see if the decompiler can help you. It's not too bad here: it turns out that for this simple example, the decompiler is actually not doing too badly. But as we will see in the next example, it's usually not that helpful, and often not really usable at all. So let's forget about the pseudocode for now and try to see exactly where the first line of our program is. The first line was a call to this sum function, right? And it turns out to be here. So these are the arguments being given to the function, but everything that comes above all that, and also this, is actually stuff that was added by the compiler, right? This construct that we see, mov rcx, gs:28h, et cetera, plus the branch going here, is something we see very often in Go programs. I think it's related to making sure that the stack has enough space for everything. Basically, we never ever have to worry about that. So one thing I sometimes do is collapse the blocks like this, just to materialize the fact that I don't care about them. We don't care about everything here too much, we don't even have to look at it. It's just stuff added by the compiler and it will never have any impact on whatever we are doing in the analysis.
So let us see our function call there. We have the placement of all our arguments: 1, 10 and 100. In the source we had the call to the sum function with arguments 1, 10 and 100, and here they are being passed on the stack. Now, IDA is already not doing super well, because you see that you have these offsets relative to RSP. You don't really like to see this, but okay.
One of the things I want to point out is that here you would be tempted to rename this var_88 to something like arg_1, right? Or arg_sum_1, because when you're reversing a program, what you want to do is rename things. This is the way that it's supposed to work: you recognize something, you rename it, and then you move on. When you have renamed everything, hopefully you understand what everything was and what the program does. Now, you will notice that unfortunately this variable is actually being reused somewhere else, and maybe somewhere over there as well. It turns out that these later uses have nothing to do with the initial one. That's another thing about the Go language that is super annoying for us: when there are regions on the stack that are not used anymore by the program, the Go compiler is happy to reuse them for other stuff. That means that if you rename a variable at some point in the program, the next use might not be the same at all, right? It might be used for something entirely different because the compiler is being smart about it. So renaming things is actually going to be mostly useless for you. It's very unfortunate, but it is the case. This is a simple example, but you will see later on, when the functions are bigger than this, that any position on the stack may be used for three, four, even ten different variables. Renaming them is just not going to lead you anywhere. So one of the things you can do, instead of having those names that are just confusing and useless, is right-click. I think there is a hotkey, K, or it's right there in the menu as well. You can just remove the variable names and use the offsets directly, like you see: RSP, RSP plus 8, RSP plus 10h.
You don't have a variable name anymore, but it makes following things around a bit easier, because renaming things is not going to be useful anyway. So why bother? I do not recommend that you go through every single variable and do this, because it's going to be very time consuming. But when you're beginning and trying to figure out where things are in the program, it turns out that RSP plus 60h or plus 70h is going to be more meaningful than any name that you can put there.
So anyway, let's see exactly what happens. We have argument number one placed at RSP, so on the stack. It's not a push there; they just move the values directly to the correct offsets: RSP, RSP plus 8, RSP plus 10h, and so on. Of course, this is a program compiled for x64, and this is the reason why all of our integers are eight bytes in size. So let's get into our sum function. Here it's sort of simple. In this case, IDA is able to recognize the arguments: you just put arg_0 into RAX, add arg_8, add arg_10, and then you notice that the return value is, well, in RAX, but it's not being returned through RAX. It is just written back to the stack, at RSP plus 20h. And then when you go back to the caller, you see that RSP plus 18h is being moved back to RAX. So maybe I should explain a little bit there. This is the return value, but when you return from the function, the ret instruction pops the return address from the stack. This is why you have a difference in offset between the two functions. Is that okay with everyone? It's not a super important detail, but I just want to remind you that the offsets that you see in one function are not going to be the same as the ones you see in the calling function, because between the two, the call pushes a return address onto the stack and the ret instruction pops it, so you will see a shift of eight bytes.
So even trying to recognize the offsets from one function to the next is not always going to work. But the good thing is that we're not going to bother with this too much. Anyway, what happens next is very difficult to decipher. You see that it starts using the XMM registers, and there is this type being loaded into a register. It's a structure. We won't bother about this one at the moment. We do see our "result =" string somewhere here. Yeah, there it is. And by the way, one thing I can mention here is that strings in Python, sorry, in the Go language, have this structure where you have two fields. The first one is a pointer to the actual byte data, and it's followed by a size. So you see here, you have a pointer to some characters somewhere in memory and the size of nine, which is one, two, three, four, five, six, seven, eight, nine: the size of the string. If you go there, you will see all the strings of the program grouped together, one after the other. It's sort of a detail, so let's not worry about this too much. But you will see that figuring out what is happening here is already kind of difficult, right? The line that we have in the source is this fmt.Println of "result =" and then res. It translates to this very complex series of calls, with the creation of a string type, a conversion to maybe an integer type. There you have this thing that was nowhere in our initial code. See this runtime write barrier? We did not write that, right? It's not in our initial program. This is more stuff that the compiler added. I think it's related to garbage collection. We don't have to worry about it too much because it doesn't have any influence on what the program means, but it's there and sort of getting in the way. And then you have all this very weird moving around of stuff, in and out of the stack.
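The two-field string layout mentioned above (a pointer to the bytes followed by a length) can be observed from Go itself. The sketch below is for illustration only: `stringHeader` and `headerOf` are hypothetical names, and the code relies on the in-memory layout used by the standard Go compiler, which is an implementation detail, not a language guarantee.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringHeader is a hypothetical mirror of how the standard Go compiler lays
// out a string value in memory: a data pointer followed by a length.
type stringHeader struct {
	data   unsafe.Pointer // address of the first byte; no NUL terminator follows
	length int            // number of bytes in the string
}

// headerOf reinterprets a string value as its two-word header.
func headerOf(s *string) stringHeader {
	return *(*stringHeader)(unsafe.Pointer(s))
}

func main() {
	s := "result = " // nine bytes, the length assumed for the string in the sample
	h := headerOf(&s)
	fmt.Printf("data pointer: %p, length: %d\n", h.data, h.length)
}
```

Because the length travels with the pointer, the compiler is free to pack string literals back to back without terminators, which is consistent with the block of concatenated strings seen in the binary.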
Already we can see that we are reaching some sort of wall of complexity, where trying to track every single position on the stack is going to be extremely difficult. So we will have to find another way than trying to understand every single instruction, because I don't think it's humanly possible. At least it's not possible for me to track everything. But at the end of the day we do see our fmt.Println call, which corresponds to this line here, right? So our first encounter with compiled Go programs was, I would say, a little bit frustrating in the sense that we were able to kind of see what we had written initially in the program, but there are also so many things taking place that it feels like, if we were looking at an unknown program, we would not be able to follow through.
So let us look at another one and see this intuition confirm itself. I'm going to show you example number two, and you will see that it is sort of the same thing, right? It starts with a main function, but this time our sum function doesn't just return an integer. It returns an integer, then a string, and then an error. Returning an error, as I mentioned, is something that most Go functions do. This is the very Go way of doing things. So the sum function returns a plus b plus c like earlier, it also returns "hello", and finally it returns a new error, which is whatever, a fake error. Our main function is sort of the same as before: we call this sum function, we collect all the return values, and then if the error is not nil we print the string and the result. That's it. So it's not that different from what we had earlier, but you will see that in the assembly, what was already pretty complex now becomes frankly impossible to read, as far as I can tell. So let's check it out. Again, I'm loading the program, which is going to take a few seconds for IDA to go through everything.
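As with the first example, the second program described above could look roughly like the following sketch. It is reconstructed from the description only: the variable names, the error text and the exact way the values are printed are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// sum now returns three values: the integer result, a string, and an error.
// The error is always non-nil here; it is a fake error used to show the pattern.
func sum(a int, b int, c int) (int, string, error) {
	return a + b + c, "hello", errors.New("fake error")
}

func main() {
	res, str, err := sum(1, 10, 100)
	// As described in the talk, the values are printed when the error is not nil.
	if err != nil {
		fmt.Println(str, res)
	}
}
```

The three values returned by `sum` are what a C-style decompiler struggles to represent, since a C function returns a single value.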
I guess I can already go to my main.main function. And here, if you press F5, you will see that you start having undefined values, undefined variables that do not appear to be created or initialized anywhere. This is because IDA doesn't really like having stuff coming from the stack like this, as far as I can tell. And of course there's always going to be this sort of hard limit where you cannot represent a Go program as pseudo-C code at all, because this concept of having multiple return values just does not work. It doesn't translate into C. So you see there, at this main.sum call, it's supposed to get three return values, and you see that IDA is really not able to see that there are actually return values being collected. In some cases I have even seen IDA miss entire lines and entire function calls. So honestly, if you are going to work on Go programs, I really recommend that you do not ever rely on the pseudocode, because it just doesn't work. It really doesn't. So we will have to make do with our disassembly there. We see at the beginning of our main function the same thing that we've seen before, which is first of all this sort of setup with this function that supposedly verifies that the stack is big enough. We do have the same thing that we had before, which was the passing of arguments to our function, but this time you see that they are being passed through registers. The reason for this is that this time I compiled the program with another Go version. Let's check out which one. I'm just going to open my trusty hex editor again. Control-F, "go1". All right, next. Yeah, and you see this one was compiled with Go 1.18. This is a more recent version. And so this is why this new program that we have here actually uses the register-based ABI. So this way you have seen both of them. So this time we move 1 into EAX, 10 into EBX and 100 into ECX.
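The hex-editor search for "go1" can be automated. The following is a minimal sketch, assuming the version string is stored as plain ASCII in the binary (as it was in the sample shown); the `findGoVersion` helper and the sample byte blob are made up for illustration. As a side note, the register-based calling convention was introduced on amd64 with Go 1.17, so the version found this way tells you which convention to expect.

```go
package main

import (
	"fmt"
	"regexp"
)

// goVersionRe matches strings such as "go1.18" or "go1.14.2".
var goVersionRe = regexp.MustCompile(`go1\.\d+(\.\d+)?`)

// findGoVersion returns the first Go version string found in data,
// or an empty string if there is none. Hypothetical helper.
func findGoVersion(data []byte) string {
	return string(goVersionRe.Find(data))
}

func main() {
	// Stand-in for the contents of a binary read with os.ReadFile.
	blob := []byte("\x00\x00runtime\x00go1.18\x00main.main")
	fmt.Println(findGoVersion(blob))
}
```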
So again, we are passing the arguments that were here, then we call our sum function, and we're supposed to get three return values collected from this function call. So there is the call to the main.sum function. We will get in there in a bit, but you will see that when we come back, we have RAX, RBX and RCX in which the return values are stored, and they get moved back onto the stack here. And again, you might be tempted to rename this to, what was it, res. But if we were to rename this to res like that, then you will see, with it highlighted, that it doesn't get reused later on this time. So in any case, this is not going to help us too much. Now, if you look here, already it feels like there's no way we're ever going to be able to follow that. We had a very simple main function that was four lines. I mean, four lines is not that big, right? And it ends up being this huge monster of values being moved around, being translated or converted into various things. Just following this is a nightmare. Personally, I gave up a long time ago. So let's think of a way that we could make this a little bit easier. And to do so, we are actually going to move on to our real-life malware sample, which is Sunshuttle. Let me just check if there's something in this sum function that is worth showing. I don't think it's the case. We do recognize some of the stuff that we have written, like this creation of a new error object and this creation of our hello string, one way or another. But overall, this is the reason why most people, when they encounter Go for the first time, tend to give up. I mean, it looks extremely complex, and in ways that are totally gratuitous, right? Why does it need to do so many operations? Why does it need to move so many things around on the stack just to do those simple things we're asking? It's a good question. And my solution to this complexity is that we're just going to ignore it entirely.
We will just not bother with all those operations and try to figure things out another way. So with that in mind, let's move on to our next sample, which is Sunshuttle. Here I already have an IDB for Sunshuttle. I hope I didn't leave any notes in there. I don't think so, because again, annotating IDBs for Go programs is not going to be super helpful. So we go to the very end of this functions window, and you see that there are a number of functions there in the main package, and all the rest appears to be stuff that comes out of the Go standard library. Let me actually show you something that is pretty cool. I mentioned earlier that there was this collection of plugins available on GitHub from the SentinelOne researcher Juan Andrés. One of the various scripts in this repository uses this feature of IDA where you can classify functions into folders. It was introduced some time ago now. It's not used that much as far as I can tell, but it's really a super useful feature. The way you enable it is you right-click there and click Show folders. But initially, of course, you have only a single folder, or everything is in the same folder, right? And if you want, you can then create folders. I don't remember where it is exactly. Oh yeah, Create folder with items, and so on. So you can create folders and move the functions in there. It allows you to sort things out and make them a bit easier to access. What this script does is create folders for all the packages and sort everything into the correct folder. I don't think the script is present on your VMs initially, so I'm just going to run this on my own. It is this categorize go folders script there. I'm just going to run it, and hopefully it still works. Okay. You see here it created all these folders and sorted everything according to the package.
So I personally think it's super helpful, because it allows you to take away all the various functions that come from the standard library, everything that comes from the os package or whatever, and put them away. So you don't really have to look at them anymore. And you see here that the only package I'm left with, other than this unnamed thing which maybe you can look at later, is main. Even in a real-life malware like Sunshuttle, you see that at the end of the day, the only functions that we care about are those ones, right? They appear to be the only ones that were written by the malware developer. So that's pretty cool. It means that even though we have this five megabyte binary, I think, yeah, exactly, we actually only have something like, I don't know, say 20 functions to go through. That's not that much. Now, of course, the functions that we've seen before with our hello-world examples hinted that it might still be a bit of work, but let's see how we can tackle this, right? So we are here in our main function. You see it's quite big. If you look at the window down there, the graph overview, you can see it's pretty long. So not cool, but let's see how we can work through this. First, we encounter the same thing that we had before, which is this construct and this call to runtime.morestack_noctxt. Let's just ignore this because we don't care. The first thing that happens in the main function is a call to net.Interfaces. So in the same way that our main function was main.main, or main of the main package, we have this function there that comes from the standard library, which is the Interfaces function from the net package. So what is it exactly? This is not something that the developers wrote. It's something that comes from the standard library. So let's go back here. You go to pkg.go.dev, or maybe the former URL still works. You look for the net package like this, all right?
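The sorting that the folder script performs boils down to extracting the package from each function name. Here is a minimal sketch of that idea, not the script's actual implementation (which runs inside IDA): it assumes names in Go's own notation, such as `net/http.(*Client).Do`, and the `packageOf` and `groupByPackage` helpers are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// packageOf returns the package part of a Go symbol name. Import paths
// may contain slashes and dots, so the package ends at the first dot
// found after the last slash.
func packageOf(symbol string) string {
	slash := strings.LastIndex(symbol, "/")
	dot := strings.Index(symbol[slash+1:], ".")
	if dot < 0 {
		return ""
	}
	return symbol[:slash+1+dot]
}

// groupByPackage sorts symbols into one "folder" per package.
func groupByPackage(symbols []string) map[string][]string {
	folders := make(map[string][]string)
	for _, sym := range symbols {
		pkg := packageOf(sym)
		folders[pkg] = append(folders[pkg], sym)
	}
	return folders
}

func main() {
	symbols := []string{
		"main.main", "net.Interfaces", "net.HardwareAddr.String",
		"os.Exit", "math/rand.(*Rand).Seed",
	}
	for pkg, fns := range groupByPackage(symbols) {
		fmt.Println(pkg, len(fns))
	}
}
```

Once everything outside `main` is folded away, the handful of author-written functions is all that is left to read.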
And then let's look for the Interfaces call. All right, so there it is. Lo and behold, you get the description for this function. It's not that complex really. It's a function that returns a list of the system's network interfaces. All right, so pretty easy. And let's look at the arguments. So no arguments for this function, as you can see there, but you do have the return values here. It returns an array of Interface objects and an error. So what we are going to do now is try to recreate the Go code that the developers might have written to get to this compiled binary. And this is actually easier than it looks. So let me just create a new program like this. I'm not a Go developer, so if you are, you will hopefully forgive my inevitable Go mistakes, but we'll do something that looks sort of like Go. Let's call this Go pseudo-code. So we write package main, and then func main, and then brackets. And we start with a call to net.Interfaces, like that. We know from the documentation that the function takes zero arguments. And we also know that in Go you are forced to collect all the return values. So it means that we have to collect those two, the interfaces and the error. So let's write something like interfaces and then err, and the assignment operator in Go is colon-equals. So when you see this call to net.Interfaces here, you can see that before it, whatever Go version was used, there don't seem to be any arguments moved to the stack or passed around. So this is probably the Go line that was written to get to this here. All right. So next, stuff is happening. By now we know that just after the function call, you see the return values being moved back from the stack to the registers. Okay. So what is going to be done with those? We don't really know, but in any case, what we see a bit later is this comparison between something and zero.
And then depending on what happens, we go into this block or we skip it. So here I could try to figure out exactly what is inside var_D0. And probably if I were to open a whiteboard somewhere and draw something to represent the stack, I would be able to figure things out. But I'm just going to do a bit of guessing here and think, as a Go developer, what would be the natural thing to write there to compare with zero? Well, based on what we get there, I think the very natural thing would be to do something like if err equals nil, then we do something, and otherwise we do that, right? This is what the Go language wants you to do. You call a function, you get an error object, and then you test this error object against nil, or against zero, right? And so let's see what happens if we go into this green branch. It means that if this thing we tested was indeed zero, then we just have this call to os.Exit with an argument which appears to be zero. Now, one of the things I've noticed with the Go language is that it really tends to group all the arguments just before the call instruction. So when you have a function call like this, immediately before it you will see: move argument to register, move argument to the other register, or to the stack, and whatever. This is something that I've seen to be consistent across most Go versions I've looked at. So it's pretty cool. If you take a language like C or C++, sometimes the arguments can be moved onto the stack way before the actual function call, and here it doesn't happen, at least as far as I can tell. So if you take any function call there, immediately above it you will see the arguments. So the argument of our os.Exit function call is going to be zero. I could pull up the documentation there and go see exactly what this os.Exit function is. I think it's going to be obvious to everyone.
Let's check how many arguments it takes anyway. Okay. So there it is. The function Exit takes one argument and then it exits the program. So let's go back here. So if our object, which I assume is err, is different from nil, let me check. Yeah, it's probably this. Then we go there and we call os.Exit with zero. So let me update my code there. So if we have an error, then os.Exit. And that's it. So as you see here, I'm not really looking at the assembly too much, but just with educated guesses, and just by looking at the various function calls that are performed by the program, the various library calls that are documented on the internet, I am able to sort of figure out what is going on in the program. And this is going to be my global approach. I don't want to have to track all the values as they are being moved around the stack, because this is going to take me forever and I'm not sure I'm ever going to succeed anyway. But just looking at the API calls, just looking at the arguments that are being used here and there, I'm fairly confident that I will be able to understand the global idea of the program. And if I encounter something that is a bit more complex, then I can just take out the debugger, which we will do later, put breakpoints wherever I want, and check out the actual values that are put on the stack or into the registers, depending on the ABI. And I will be able to see exactly what arguments are being used to call any given API function from the Go standard library. And it turns out that just by looking at the sequence of all these calls, I am able to reconstruct the flow of the program. So let us move on and see how far we can go like this. So if we call os.Exit, I suppose that it doesn't really matter too much what happens after this, because the program terminates. So let's just skip over everything that is not a function call, and the next one is here.
And you will notice there seems to be a loop: something goes on somewhere down here, and at the end, we go all the way back up. So we have some sort of, maybe it's a while, right? Let's try to figure out what this loop could be. Now, based on what we have in the code so far, I think there aren't too many options there, right? There's only one thing that we could loop on here. What would that be? It would be our interfaces. Does this make sense to you guys? Yeah. So let's try writing some code. I don't know exactly how it would be written, so let's assume it's like this: for i in interfaces, and then we do something. And this something is going to be whatever happens in there until we loop all the way back up. Now, as I promised, we will completely ignore all this horrible, horrible shuffling around of memory that is taking place there. If you want to track it, be my guest, but I'm just not going to bother, right? Let's just go directly to the next function call and check out what it is. This one is called net.HardwareAddr.String. So let's go back to the documentation and see what it is exactly. We go back to the net package. Okay, so HardwareAddr appears to be a type, a type that represents a physical hardware address. All right, why not? So maybe what is happening there is something like this, right? Something like i.HardwareAddr.String(). It's an assumption, and we'll see how it turns out. And by the way, if you look at the documentation, you will see that the type represents a physical hardware address, and the String function here just returns a string, some representation of our hardware address, I would imagine. I don't know if we have an example somewhere. It doesn't matter too much. Anyway, we go back to IDA here, and the next function call that we see is this runtime.memequal.
Now, I think it doesn't require too much guessing to figure out that this is going to be a string comparison. So let's see what we're comparing with. Just looking a little bit above, you see that we have this string in there, and you see that IDA is not super good at recognizing where the string ends. So let's just go in there and see what happens. Now, in the theory part, if you recall, I mentioned that all the strings end up being lumped together in the program, right? This is something that I hinted at, and you see here exactly what I meant. You see that all the strings of the program are just there, one after the other, and there is no real way of figuring out where a string starts and where it ends. This is kind of annoying for us. The way the program works is that whenever a string is needed, you get a direct offset to the string inside this big string blob, I would say. And you have a size somewhere, which is here. The size is 11h, so 17. And so here you count one, two, three, and so on up to 17. So you get the pointer to the string somewhere, you get the size of the string, and this would be the string that we are interested in. And it makes sense, because this "description fail" and so on that follows seems to be part of some other string there, right? So what's happening there is that we have this call to HardwareAddr.String and then we have the comparison with this. So we can keep reconstructing our source code and do something like this maybe, right? And then probably, if you have that, it means we are in an if construct. So what is this MAC address exactly? For this, we can just do a quick Google search. I'm pretty sure that we are going to end up with tweets about this incident. And you will see that this is actually the MAC address for the Hyper-V interface from Microsoft.
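The offset-plus-size lookup described above can be mimicked in a few lines. In this sketch the blob contents, the offsets and the `stringAt` helper are all invented for illustration; only the principle (strings packed back to back with no terminator, each use site carrying its own length such as 11h) comes from the talk.

```go
package main

import "fmt"

// blob stands in for the region where string data is packed back to
// back. Nothing in the data itself marks where one string ends.
const blob = "config.dat.tmpfake error00:11:22:33:44:55description failed"

// stringAt recovers a string the way the compiled code does: from a
// start offset and a length that are both known at the call site.
func stringAt(offset, size int) string {
	return blob[offset : offset+size]
}

func main() {
	// 0x11 is the size seen in the disassembly: 17 bytes, the length of
	// a MAC address written as six hex pairs separated by colons.
	fmt.Println(stringAt(24, 0x11))
}
```

When the disassembler mis-guesses a string boundary, reading the size constant next to the pointer and counting bytes by hand gives the right answer.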
So this is actually some anti-VM protection in there. The malware just checks if we have an adapter that has this MAC address. And if so, let's see. Yeah, we do have another call to os.Exit, all right? So let's go back and complete our source code. Like this, okay. So we are making progress. Let's go back there. And then, we don't have any more function calls, but just looking at the graph, it looks like we are just moving on to the next iteration of the loop. This is probably a check on the size, or the number of iterations we want to go through, et cetera. And then when this is over, we go to this block there. Okay, so let's keep going. We have a call to time.Now. So just a quick check, is it okay for everyone so far? Are you following the general approach? Yeah, okay, great. So here, we can get out of this for loop, I suppose. And we have this time.Now call. Let me look at my code. Okay, so here, you do have this weird value. As far as I remember, this is actually some timestamp conversion that is taking place. I think time.Now does not return a timestamp in the usual format, and this is just a conversion here. Usually, if you have this sort of constant in a program and you want to know what it is, it's a good idea to just Google it. It's a good way of figuring out exactly what it is. Let's see the results there. So here we do have a bit of code that comes up from the query. Let's try it in decimal form. Okay, so I apologize for the results in French, but overall, what we see here is pages related to calendars. So we can infer from there that this is related to a time conversion taking place. So let's ignore that, because we don't really need to worry about it. If we just move to the next function call, we see this math/rand (*Rand).Seed. If you have developed programs in the past, then it's going to look kind of familiar.
It's something you have probably done whenever you were using cryptography or whenever you wanted to use randomness. What they do there is initialize the seed of the RNG. So let me update the code there. Now we don't exactly see how the result of time.Now has been used, but again, with a bit of educated guesswork, we can figure out that it is probably being used to seed the RNG, because this is the type of code we see every time. math/rand Seed, right? Looks like something that makes sense. So let's keep going with this and see what happens. Now the next function call is sort of interesting for us, because this is another function that comes from the main package. So this means that this is not something we'll be able to look up online. It means instead that this is code written by the malware author. It's something we need to get into. Okay. So we get in there, okay? You know, I don't know if we take breaks in the workshop. I think having a 10 minute break would be nice, I suppose, because it's sort of a dry subject matter. So if you guys want to get up for five minutes and walk around, that's fine by me, right? So we pick up again at 2:30, is that okay with you? All right, so we'll see you in a few minutes. Sorry. Okay, come here. So I was wondering if I followed you. No problem, don't sweat it. So there is actually no VM, but there are two ways of, I mean, I will share this with you if you want. I started with some general, say, information about the language. When it comes to this tutorial, there are two ways of doing this. If you do have IDA Pro or Ghidra, whatever you can follow along with, you can just download the sample directly and work on your machine. If you don't, I do have some online VMs that you can log into through RDP. So in that case, you can just go to p.kawai.ski or the full name there. And there's going to be just an archive there.
And you can just download the sample and look at it. The password is infected. Sorry? The URL for the sample. It's here at p.kawai.ski. Oh, it's all the X's, that's fine. I don't know the X's on purpose. Yeah, but you don't need to run them. Oh, no, I just don't know. It's like that. Yeah, probably something without extension. Oh, no, like that one. Yeah, that's good. Yeah. The password is infected. So I think it's probably a dumb question, but how do you actually run these scripts? If I just go with Script file, it just doesn't seem to do anything. Yeah, so the script, I think, has very likely worked, but there is one more subtlety there, which is that by default, the folder view is not enabled. So you need to right-click there and pick Show folders, and there you are. That's the only thing. It works perfectly, but silently, and it does not switch the view for you. Yeah, I think that IDA should enable this by default, or at least if you use folders, they should be shown by default as well. In terms of UI, they have a few improvements to make, but in any case, it works. Are you planning, at the end of this, to actually compile the code you're writing and then do a side-by-side comparison? I won't do it, but this is what I did when I was beginning with this. I would take the same Go version, compile the code and then look side by side to see if I was close enough. If you take the same version of Go and you have the same code, it should generate the same binary, or at least the same functions. So when you start out, you can compile the code that you write on your own and then see if it's kind of the same. If it is, then it means you're on the right track. If not, then maybe there is some subtlety or something different that you didn't take into account. You're kind of waiting for the woman I've almost understood actually. Yeah, can we come and see?
Yeah, I think that's interesting. I think it's a bit nice. Oh, can we just go to the next question? Yeah. So if I want to build it, because that seems like it's Go 1.14? Let me check. It's an old one. Yeah, it looks like it was... Yeah, 1.14. It's very, very possible. Yeah, 1.14.2. Do I, like, go to the Debian package sites, get that version, and then write the code and compile it? Yeah. I think there's a simpler way, which is to get the official release, or to build Go yourself from the sources. Just check out the version. I think if you go to the Go website, you can probably download the old versions individually. If not, you can get the source code. I would not recommend installing the Debian packages on the system, because you're going to end up with broken dependencies, possibly, or with an old Go version as your default, and that's probably not what you want. I would just extract them to some directory and see if I can run them alongside the newer versions. Yeah. Maybe. But in any case, on the Go website here, you have all the archived versions like this, and you can take any of them. You get the source code, you go in there, you compile it, and you're good to go. Yeah. I think we're probably almost ready to resume. Are there any questions about what we've seen so far? You can feel free to ask questions at any time during the tutorial, but if you have something now, I'm happy to take it. So before this break, what we had done is we had just reached this new function that was also created by the malware developers. This one is called define. So in our script, let's just create it as well. I don't think it takes any arguments. There is one way to find out, which is to go back one step and see if there are any arguments being passed to it.
It doesn't seem to be the case, although here I will draw your attention to the fact that with Go, the way that values are returned is very tricky, in the sense that they are placed exactly where they need to be in order to be reused as arguments for the next function. So if you have this call there to this Seed function, whatever return value it gives is already in the right place to be used as an argument for the next function. Here, the Seed function does not return anything. This is, I think, a smart way that the Go creators devised so that function calls can be chained in a very efficient fashion. So here it doesn't seem like we have any arguments to this function. So all right, let's add the brackets. The first thing that we see here, again ignoring all the crazy memory operations, are strings that appear there, config.dat and then .tmp, and then this concatstring. So this concatstring is string concatenation. You cannot know, in the sense that the stuff is placed in the right position already, but then whether or not it will be used really depends on the next function, right? So the only way you can know is to look at the documentation. If you see that the second function, for instance, expects two arguments, then it means that probably whatever comes from the previous one is going to be reused, right? If it's a custom function, it's going to be a bit more tricky. Usually this type of chained function, something dot something dot something, comes from the same library. The developers don't do that too much as far as I've seen, but it may happen.
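At the source level, the chaining mentioned here corresponds to Go's rule that a call returning several values can be passed directly as the full argument list of another call. The sketch below only illustrates that language feature; the function names are made up, and whether the compiled code really avoids moving the values in between depends on the Go version and calling convention.

```go
package main

import "fmt"

// pair returns two values at once.
func pair() (int, int) {
	return 2, 3
}

// add takes exactly the two values that pair returns.
func add(a, b int) int {
	return a + b
}

func main() {
	// Both results of pair() feed add() with no intermediate variables
	// in the source.
	fmt.Println(add(pair()))
}
```

When reading disassembly, this is why the absence of visible argument setup before a call does not prove that the callee takes no arguments.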
And if it does, well, you're on your own. One thing you can do is, with that function selected, press X to find all the cross-references to it, find another use of that function where it's not chained like this, and then see in that instance if it's taking something from the registers or from the stack. That might be our best guess. Okay, so here the first meaningful thing that appears to be done in this function is a concatstring of config.dat and .tmp. So sure, why not? Let's do s equals something plus .tmp. Of course we keep this operation in mind, because we know what it does. And then we have this call, ioutil.ReadFile. Okay, so we can guess very easily what this does, but at the same time, let's check out the documentation to see what the arguments are and so on. So let's go to the io/ioutil package and then ReadFile. So this function, as the name implies, reads a file. You give it a file name, it returns to you all the bytes contained in this file, and then of course an error object, as is tradition with Go. Here I don't even have to open IDA again, I think. I can just write ioutil.ReadFile, and it doesn't take too much guessing to assume that the argument we are going to pass there is this string that was given to us. And also let's take the return values, data and err, equal this. All right. See, very traditionally, just after the function call there is a check between something and zero. So again, we can take a guess and assume that it's our error being tested. So let's do that again: if err is different from nil, then something, and otherwise we'll see. Now, you see that earlier, one branch was easy because it was just os.Exit. In this case, it's not really what happens.
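As a sketch of where the reconstruction of this function stands: a concatenation producing the file name, a call to `ioutil.ReadFile`, and the usual error check. The `readConfig` wrapper and its parameter are hypothetical (the sample's function takes no arguments), and what each branch does afterwards is not yet known at this point in the walkthrough.

```go
package main

import (
	"fmt"
	"io/ioutil"
)

// readConfig mirrors the reconstructed lines: build the name by
// concatenation, read the whole file, check the error. Hypothetical
// wrapper so the behaviour can be exercised in isolation.
func readConfig(base string) ([]byte, error) {
	s := base + ".tmp"
	data, err := ioutil.ReadFile(s)
	if err != nil {
		return nil, err
	}
	return data, nil
}

func main() {
	if _, err := readConfig("config.dat"); err != nil {
		// This is the branch followed when the file does not exist yet.
		fmt.Println("config not readable:", err)
	}
}
```

`ioutil.ReadFile` is deprecated in favour of `os.ReadFile` in current Go, but older binaries such as this one still reference the `io/ioutil` name.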
So what we can do is start by following the red arrow, which is going to be the one that is followed if we do not have the file on our system, because at the moment this file here, config.dat.tmp, we don't have it, right? We don't know what's in there. So let's assume it doesn't exist and see what happens with the program when that is the case. So we go into this block and then we have this os.OpenFile, and here we are going to look at the arguments. This function opens a file, of course, and it has three arguments: a string, an integer flag which says whether the file is to be created and that kind of stuff, and then a permission. So let's see here in IDA. The file name, I assume, is still going to be our config.dat.tmp being passed along from above. This is something we'll check in the debugger, actually, because we're doing a lot of guesswork, so it's a good idea to backtrack a little bit and check that we are on the right track. And here, if you change this to octal, I think, yeah, you see that this is the creation of a file with permissions 666. So read and write for everyone, without execute. And the middle one, the flag, I'm not sure what it should be. We can check the flag values and try to work out what it is exactly. Okay, so let us open this program in the debugger and check that we are still on the right track. So what I have there is a VM, and in this VM, I have my debugger, which is x64dbg. So I'm just going to drag and drop this program here into my VM and give it back its extension. And before I even do this, there are going to be a few tips and tricks I'm going to share with you. The first of those is that one thing I love when I'm debugging is being able to copy addresses from IDA and put breakpoints directly in the debugger, right? Because it simplifies things a lot. For instance, you have the addresses here in the margin.
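The octal trick is worth spelling out, since disassemblers show the permission argument in hexadecimal by default. The sketch below assumes the raw constant is 0x1B6 (the hex form of octal 666) and guesses at the flag argument, which the talk leaves undetermined; `permString` and `openForWrite` are names invented here.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// permString renders a raw permission constant the way "ls -l" would.
func permString(raw uint32) string {
	return os.FileMode(raw).String()
}

// openForWrite shows the three-argument shape of os.OpenFile. The flag
// combination is an assumption; only the 0666 permission is from the talk.
func openForWrite(name string) (*os.File, error) {
	return os.OpenFile(name, os.O_RDWR|os.O_CREATE, 0666)
}

func main() {
	// The same constant in hexadecimal, octal and symbolic form.
	fmt.Printf("%#x = %#o = %s\n", 0x1B6, 0x1B6, permString(0x1B6))

	f, err := openForWrite(filepath.Join(os.TempDir(), "config.dat.tmp"))
	if err != nil {
		fmt.Println(err)
		return
	}
	defer os.Remove(f.Name())
	defer f.Close()
}
```

Mode 666 grants read and write to owner, group and others, with no execute bit anywhere.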
So let's say I want to put a breakpoint there. I just copy this address, go to the debugger, paste the address and put the breakpoint there. And the issue is that, due to the ASLR protection that we have in binaries, my program is liable to be loaded anywhere, right? And that's super annoying for us. So there is actually a way to circumvent this issue entirely, and this is thanks to a small utility from a guy called Didier Stevens. It is called setdllcharacteristics. You can find it online for free. Let me show it to you. So this is a small binary that you can use to edit the header of a PE file. For instance, whether the program is compatible with ASLR is indicated inside the binary by a flag somewhere. And so if you just change this flag in the header, the program is reported as not compatible with ASLR, and it will be loaded exactly where you want it to be. So this is a pretty cool utility. I do recommend that you have it somewhere, because it's super useful. And the way it's used is extremely simple. In my VM, and if you are using my online VMs it will be in the same place, it's in the tools folder on the C: drive. And it's setdllcharacteristics, and then minus D, and then the program. Minus because you want to remove the flag. You can do plus D if you want to enable it as well. Like this. And there you go. So you see it updated the DLL characteristics and it disabled ASLR. So that's pretty neat. It's pretty useful. Thanks to ASLR being disabled, this will allow me to copy addresses from IDA Pro and use them directly in my debugger, and save me from having to convert them manually and do this translation. So that's something I suggest you do. It's not just for Go binaries, by the way. Whenever I have to debug a program, I go through this step because it's so useful. So let us open this program with x64dbg. And you will notice, maybe I should zoom in a lot there. Let me just increase the font a lot.
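Under the hood, the flag in question is one bit of the DllCharacteristics field in the PE optional header (the dynamic-base bit, 0x0040). The sketch below shows the bit manipulation and the address translation that disabling ASLR spares you. It works on plain integers rather than on a file; the helper names and the example values are made up for illustration.

```go
package main

import "fmt"

// dynamicBase is IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE from the PE
// format: when set, the loader may relocate the image (ASLR).
const dynamicBase uint16 = 0x0040

// aslrEnabled reports whether the dynamic-base bit is set.
func aslrEnabled(dllCharacteristics uint16) bool {
	return dllCharacteristics&dynamicBase != 0
}

// clearASLR returns the field with the dynamic-base bit removed, which
// is the effect of the "-d" option described in the talk.
func clearASLR(dllCharacteristics uint16) uint16 {
	return dllCharacteristics &^ dynamicBase
}

// rebase converts an address from the disassembler into the matching
// address in a relocated process: the manual step you avoid once ASLR
// is off and both bases are identical.
func rebase(addr, staticBase, runtimeBase uint64) uint64 {
	return addr - staticBase + runtimeBase
}

func main() {
	field := uint16(0x8160) // example value with the bit set
	fmt.Printf("%#x aslr=%v\n", field, aslrEnabled(field))
	field = clearASLR(field)
	fmt.Printf("%#x aslr=%v\n", field, aslrEnabled(field))
	fmt.Printf("%#x\n", rebase(0x4BD630, 0x400000, 0x7FF600000000))
}
```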
I will assume that all of you are familiar with debuggers. I don't know if you have used this one before; if you have questions about it, feel free to ask. Let me put a breakpoint exactly where I wanted to go, just before this OpenFile call. So let me copy this address. And you see here there's one thing that is actually a bit sad: this call is just a call to some address like 4BD630, which is the address of this function in my program. IDA was able, through Hex-Rays dark magic, to recognize that this was the Go function os.OpenFile or whatever. The debugger does not have this intelligence. It is not able to parse the Go program to recognize function names and so on. But as you can imagine, it would be super cool if we were able to import the names that come from IDA and put them inside our debugger. It turns out that this is actually possible. The way you do this is with a plugin, x64dbgida. You can find it on GitHub, and there it is.
Yes. Sure. So let me just go back to the finder. This is the tool that you need, and it's super useful. It's setdllcharacteristics with the file and an option: +d activates ASLR, -d deactivates ASLR, +n activates DEP, -n deactivates DEP, and so on. Exactly. Maybe it's already open in the debugger? That's possible. Is it on your own VM or on the one online? Normally it's supposed to be in a folder that the AV leaves alone. Oh, then you will have problems with the AV. No, that's not going to work; the AV is going to break things for you. So you have to either copy it into the same folder or just work in the original folder. Now, the VMs that I shared with you, you can actually revert them when working remotely. You have this toolbar at the top. You can just restore them to the original state.
So don't be afraid of breaking them. It takes some time to restore, but at least you cannot cause any lasting damage. Anyway, going back to our debugger, you notice that this call to some random address is not really helpful to me. I would like to import my names from IDA into my debugger, which is something I can do thanks to this repository. Again, if you're using IDA, this is the plugin to use. If you're using Ghidra or anything else, you're possibly on your own, sorry about that. I'm still going to show it to you with IDA. In your own VMs, you can download this script and copy it onto your machines; if you're using my VMs online, the plugin should already be set up.
In IDA, you go to Edit, then x64dbgida, then Export database. This creates a big file. Actually, let me launch it right now, because it takes super long. Let me put this on the desktop: sunshuttle. What it does is create a big JSON file that contains, for every address, a corresponding label. You'll see IDA hang for a while, and this is because the binary is so big; there are probably tens of thousands of things to export. It will probably take something like 30 seconds, and it will be exactly the same on the other end when we import it into x64dbg. So there we are, hopefully. All right, so I ended up with this sunshuttle.dd64. You see that the file is 85 megabytes, so all the names have been exported. It's super, super huge. Anyway, I'll just put it in the VM. There it is.
Another thing that is a little annoying is that, for this to work, the program in the VM needs to have exactly the same name as the program on the host. If the program that IDA is analyzing is called sunshuttle, then the program in the VM has to be named sunshuttle as well.
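Because the module name stored in the export has to match the name of the binary being debugged, the rename can also be scripted instead of edited by hand. The following is only a sketch: it assumes the export is a JSON document in which each entry carries its module name under a key called "module", and the sample document, field names and function names are illustrative, not taken from the plugin.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sample is a made-up, tiny stand-in for an exported label database.
const sample = `{"labels":[
 {"module":"sunshuttle","address":"0xBD630","text":"os.OpenFile"},
 {"module":"sunshuttle","address":"0x1000","text":"main.main"}]}`

// renameModule walks a decoded JSON value and rewrites every string stored
// under the key "module" that equals oldName. It returns the number of
// values changed. The key name is an assumption about the file layout.
func renameModule(v interface{}, oldName, newName string) int {
	changed := 0
	switch t := v.(type) {
	case map[string]interface{}:
		for k, child := range t {
			if s, ok := child.(string); ok && k == "module" && s == oldName {
				t[k] = newName // updating an existing key while ranging is safe
				changed++
				continue
			}
			changed += renameModule(child, oldName, newName)
		}
	case []interface{}:
		for _, child := range t {
			changed += renameModule(child, oldName, newName)
		}
	}
	return changed
}

// rewrite decodes the document, renames the module and encodes it again.
func rewrite(data []byte, oldName, newName string) ([]byte, int, error) {
	var doc interface{}
	if err := json.Unmarshal(data, &doc); err != nil {
		return nil, 0, err
	}
	n := renameModule(doc, oldName, newName)
	out, err := json.Marshal(doc)
	return out, n, err
}

func main() {
	out, n, err := rewrite([]byte(sample), "sunshuttle", "sunshuttle.exe")
	if err != nil {
		fmt.Println("rewrite failed:", err)
		return
	}
	fmt.Printf("renamed %d entries\n%s\n", n, out)
}
```

For a file of tens of megabytes, decoding everything into memory is wasteful; a plain textual search and replace, as done by hand in the talk, reaches the same result.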
And you see here I have this .exe extension, which is going to cause a problem. This is something that happens a lot. Usually, when you work on a program on the host machine, you take away the extension, because you don't want it to be there in case you double-click and infect yourself. And when you want to run the program in the VM, you rename it back to .exe. So how do you solve this problem? It's sort of easy. You just open the database with Notepad. Notepad doesn't really like such big files; 85 megabytes is a bit much for it. You can then change the module name to whatever you want. So I'm going to replace it with sunshuttle.exe, and replace all. It's going to take a bit of time as well. The reason we need to do this is that otherwise, when you import the database into x64dbg, it will import all the names properly but it won't recognize that they are supposed to be associated with the binary you gave it. It's sort of annoying, but we have to do it. So I'm going to wait a little for this to complete. This is a very big file, so the replacement takes some time. When this is over, I'll move over to x64dbg and do... where is it, actually? Oh, Database and then Import. Not yet. Let's wait a little bit.
One thing we can do in the meantime: we have a breakpoint on this OpenFile call, so we can look at the arguments and check exactly what they are. That was the whole point. So here this argument is 666 in octal. What we want is the file name. The file name goes through RCX, right here, for this OpenFile. This is still running; this is taking so long. So I put the breakpoint on my OpenFile call, and it's very easy for me to go back up and look at the arguments. Going back to the documentation, argument number one was the string.
And if you recall, I mentioned that a string is a structure: a pointer and a size. So here one value is the size, and here RCX points to config.dat.tmp. Okay, right. So you see that config.dat.tmp is put on the stack, then the size of the string is pushed on the stack as well, and then the other arguments. It's kind of difficult to track down, but overall you can see that when we reach this OpenFile call, config.dat.tmp is still there somewhere. So let's update our code: I suppose it was something like OpenFile with the file name, some flag value, and then 0666.
From now on we're going to be using the debugger a bit more. So I'm going to save that, and I'm going to import my database: again, File, Database, and then Import. It's also going to take some time because it has to parse this huge file. Eventually, there you go: you see that now in x64dbg I do have the correct names, such as OpenFile, which makes my debugging experience a bit more enjoyable.
So let us ignore the debugger for a second and go back here. OpenFile returns a file and an error, so let's update our code again: f, err equals that. And you see that this time they don't seem to be checking the error object, which is bad practice, but it happens. The next thing they do is call time.Now and then time.Time.String. This is one of those cases where functions are chained without the arguments having to be moved again: we have time.Time.String applied to the result of time.Now.
Moving on, we have this call to GetMD5Hash. The name is kind of transparent here; it's fairly easy to guess what is going on. Let's check it very quickly. Let us just look in the debugger at what is being passed as an argument, just to make sure. So let's put a breakpoint over there and move forward. Here you see that whatever is being passed as an argument does appear to be the timestamp of the current day. So we are still doing a lot of guessing, but we are still on the right track. We can also step over this function and check what comes out of it. Let me just follow this and copy everything. Since this GetMD5Hash function is supposedly going to calculate an MD5 hash, I'm going to verify that by calculating it myself.
Let's go to CyberChef. I don't know if you are aware of CyberChef, I assume you are. It's a very useful tool that you can use to convert strings, encode them, decode them, etc. It's really a sort of Swiss army knife for all the encodings and transformations you can think of, so it's super useful. I don't know why it takes so long to load, but I guess they want us to see their funny messages. Let's go back. This is where we are, and we know that we have the MD5 hash call like this. I'm just going to put this here, and then
here I'm just going to press F8 to step over, and you can see the return values right here. One of them is 20 and the other is the hash. 20 in hex, so 0x20, is 32 characters: exactly the size of a hex-encoded MD5 hash. Normally what I would do there is verify the value. I'm fairly sure it matches, so we're not going to wait until CyberChef loads, but this is a good way to avoid reverse engineering the function we are looking at. We don't have to go in there: we know what goes in and what comes out, and from that we can figure out what it does.
One thing I want to show you here: not very often, but sometimes in Go you have this runtime.newobject call taking place, whenever a type is being instantiated, and it's something that IDA does not handle too well. The way it works is that you pass a structure, which I won't say is opaque because it is documented, but which you cannot read by hand. This structure describes a type; you pass it to runtime.newobject, and in return you get an instance of that type. Now, I wrote a script that resolves those types, because all the type information is actually stored somewhere in the binary and you can find it. So I'm just going to run that script. It's in the repository I mentioned at the beginning of this presentation, the one called AlphaGolang. You go to Script file, it's number 5, extract types, and run it. Hopefully... yeah, there we are. Now you have a type, which is a pointer to an MD5 digest. So you get this type, pass it to runtime.newobject, and then I guess you use it here. I'm going to skip over things a little: we have this data pushed into the MD5 digest, you have digest.Write, Sum, and finally EncodeToString. So it really looks like a traditional computation of an MD5 hash. But the very cool thing about Go is that if you are
looking at a C program, you would have to recognize the constants and try to see how the hash is calculated and so on. Here in Go we just have to look at the calls to the Go standard library. You can see that we have first an instantiation of this MD5 digest type, then calls to digest.Write, digest.Sum, etc., then the result is encoded, and that's it. This is why I say very often that I think reverse engineering Go is quite a bit easier than other languages. You don't have to recognize anything, you don't have to work through the crypto, you don't have to get into complex functions. You just look at the calls, see how they are chained together and how they relate to one another, and from there you can reconstruct the meaning of the program. You don't have to invest any effort in recognizing what is happening inside any function, really.
Maybe one thing I can explain as well: this stringtoslicebyte is a way to convert a string to a byte array, the bytes of the string. And this one, makeslice, allocates a slice, which is Go's name for a view onto some array. I really haven't seen many cases where I needed to figure out exactly which slice was being created or which part of the array was being taken. So when you see those stringtoslicebyte and makeslice calls, it's usually just the compiler making sure that types get converted properly from one to another.
All right, so let's go back to CyberChef. Did it load? Yes, it did. So let me put this in there and make sure that everything works as intended. This is my input, let's take MD5... come on, how can it take that long? Oh well, I guess we'll be back in a bit. Anyway, we calculate this MD5 hash, then we have this runtime.gcWriteBarrier, something related to garbage collection; we can skip over it. And then we have more stuff: this string, which looks like a user agent, and then base64
EncodeToString. All right, so this is going to be encoded. Let's go back to our debugger, put a breakpoint there and see exactly what happens. Run until the next breakpoint; there we are. Here you see this user agent that we saw in the code, and then, very likely, if I press F8, what comes out is base64. Let me show you the documentation for base64 EncodeToString, because there is something interesting there. And look, in the meantime you see that the MD5 hash of this string matches, so our hypothesis was correct. Okay, let's forget about this for a second.
So, base64: the type is Encoding and the function is EncodeToString. Now, here is something you might be wondering about. We know we have this EncodeToString function; this is the type of its argument, this is the return value. So what is this extra thing we have there? It is the object on which the method is applied. In the case of base64 there are actually different encodings: you have base64.StdEncoding, and there is another one, URLEncoding, which uses a different alphabet. You have to be careful, because normally this would never be an issue, since everyone uses the standard base64 encoding, but this specific malware actually uses both. So you have to find the value that represents the encoding and work your way back up in the code. Anyway, this is a small detail, but if you take this result and try to decode it as base64, well, maybe for this specific string it works, but for some strings later on it would not, because they use the URL alphabet and not the standard one.
Then we have FormatInt, which is a way to convert an integer into a
string, and there you see that this string is the converted integer. So the integer is 5. Normally I said it is useless to rename things, but for global variables it is a bit different, because global variables do not get reused. At the moment we don't know what this is, so let's call it global int. It gets converted into a string, and you have a number of these conversions taking place. Then you have this one, which is 0xF, so 15. I could rename all of these; I'm not going to. You see that we have a number of them being built like this, so I'm just going to skip to the next function. You see there that whatever was going on in what we just skipped, all those strings that were being created end up being concatenated at the end. So we are just going to copy the address, and with the debugger we are going to look at the end result of this string concatenation.
Back in the debugger, we jump to that address, and here is the end result: this huge string. I'm going to copy everything. So one way or another, this huge string has been created. First we have the hash of the current date, which we know because we've seen it in the code. This 5-15 is something constructed out of those hardcoded integers. This is the user agent, encoded in base64. And that's it.
So let's go back here. The string is passed to an interesting function, main.encrypt. That's interesting; I'm going to skip over it for a second. And then WriteFile. So let's recap a little what we've seen so far. In the code that we were writing, I kind of gave up on the reconstruction; I'll say a few words about that later. Generally the idea was this: we have this function called define internal settings. There were two branches. In one of them, if the file exists, something happens, we don't know what yet. If it doesn't, then the file is created, then we generate
the string, then we have a call to main.encrypt. Let's assume we end up with s, something like that, and then main.encrypt of s, I assume. And then finally we have main.WriteFile, which I guess, with OpenFile, stringtoslicebyte, File.Write and Sync, is just a wrapper around writing data into a file. Here I'm going to go out on a limb and guess that the arguments include f, the file we have there. So if we end up in the branch where the file read earlier, our config.dat.tmp, does not exist, then some sort of default configuration is generated by the program. We know that this is the hash of the timestamp, these values we can maybe figure out later on, and this is a user agent, and that's it.
Now let's check what is going on in this encrypt function; I guess this is our next step. Let's go back a bit, not too much, and skip over this. You see that I'm now getting a bit loose with my reconstruction, taking shortcuts and not doing as rigorous a job as before. That's because when I actually work on a Go program, I'm not going to bother doing all this all the time. The first time I worked on a Go program I really reconstructed the whole thing like that, and I encourage you to do so when you are starting out. But of course, as you move on, you start figuring out shortcuts and you don't have to write everything down. I hope you will believe me when I say that when I started out, this is actually what I did; this is how I really got into Go reverse engineering, so it's a method that does work. When you watch me doing this, I suppose you will be tempted to think that it's easy for me because I've looked at this program before, so I know what is happening and what the source code is supposed to look like. But actually this is really the way that
I approached this program when I first encountered it. Right, so let's go into this main.encrypt function, and again we're going to move very quickly through things. You have crypto/aes NewCipher, the creation of an AES cipher. bytes.Repeat there, which I assume is maybe some padding. runtime.growslice, we don't care about that. Not much is happening there. io.ReadAtLeast, and now something interesting: cipher.NewCFBEncrypter. So the cipher is AES, like I said before with aes.NewCipher, and now we know it is in CFB mode. We skip over this, some encoding takes place, and... so what's going on there? We don't see exactly what is being called; something is missing. We have the call that creates our cipher in CFB mode, and then we have call rax. So for once we don't know exactly what is being called, and we have to use the debugger. I'm going to copy the address and see exactly what is in RAX when we reach this point. And you see, when I reach this point, we call the crypto/cipher XORKeyStream. So I suppose this is the actual encryption taking place. And then EncodeToString, etc.
One thing that is missing is the AES key. I don't see any initialization vector being passed around either, but if you look, the IV is put at the beginning of the encoded data, so the developer doesn't have to think about it too much. Go is in that sense a language that really does everything it can to prevent you from shooting yourself in the foot. So we have this main.encrypt function and we did not see anything that looked like key material in there. What we are going to do is look around this function call and see what could be the AES key being used. If we look around, the only thing we can see is this: what seems to be a global variable. Let's go there. One
of the values is 20, which to me looks a bit like a size, and the other looks like some offset. So let us again look through the debugger at exactly what is in there. Let me put a breakpoint here. I am going to relaunch the whole program. It looks like my layout got kind of messed up. It looks like my breakpoints do not work anymore... Okay, I know exactly what happened. The reason my breakpoint was not hit is that the file has now been created. Yeah, you see, on my desktop I do have this config.dat.tmp from my previous debugging session. So now I enter the other branch, and this is why I could not reach this point again. I'm just going to delete the file and try again. Okay, so I get to the OpenFile, I get to the hash, this is where my previous breakpoints were, the string, etc., and there we are.
So let's follow this in the dump. This looks like an address, and you see that in the dump I now have a region that contains random bytes. I think those may be my key bytes. I am not 100% sure. Maybe it's this one, actually; follow in dump... yeah, that really looks like random data. This could be the key. To double-check this, what I would have to do is store it somewhere, see what data comes out of this main.encrypt function, put that into CyberChef and try to decrypt everything with this key and the IV that we get at the beginning of the output data. Please believe me when I tell you it's going to work. I am not going to do it here, because it requires me to take these bytes manually and put them in as the IV in CyberChef; it's just annoying, but it does work. So just by doing this we were able to get the AES key used by the program, and we were also able to figure out what was going on there.
Let's go back here. One thing I want to point out at this juncture is that in terms of reverse engineering, we haven't been doing much. We just skip over every assembly
instruction, look at the function calls, and use our debugger to dump all the arguments being used, and from there we sort of guess, slash figure out, what is going on in the program. I do sometimes teach university classes in reverse engineering, and I'm pretty sure that first-year students who are very uncomfortable with assembly would actually be pretty okay with reverse engineering Go. One of the main takeaways I wanted to share with you is that if you try to follow every instruction in a Go program, you're probably going to have a very bad day. But if you take the quick and easy approach of looking at the various function calls, you don't have to do that much work to figure out that AES is being used, what the key is, and that files are being created; we can just use the debugger to see which files are used. It's really a sort of easy win where we get our answers without working too much for them. I think this is pretty cool.
Anyway, let me go back to this main.encrypt function; I think I was in there. So I'm going to do Debug and then Execute till return. Okay, so I'm getting out of main.encrypt, and I'm going to move on up to this main.WriteFile. At the moment, if I go to my desktop, I do have this config.dat.tmp file, here at the end. If I press F8 and step over, then going back to my desktop, sorry, wrong desktop, you can see that my file now contains data. It's now 1 kilobyte, and if I open it with my hex editor, you see it looks like base64, but with underscores and dashes, so this is the URL alphabet. At this point, what I would do is make sure that I didn't make any mistakes: I would decode everything with CyberChef and verify that I get back the string we had initially, which was this one. Okay, so here, trust me, but it's going to be the
case. And well, it looks like we're done, right? This branch is over. So let us go back up a little and look at what happens if the file already exists. I'm going to go all the way back up here. If you recall, this was the beginning of the function, where we started by checking whether this config.dat.tmp file exists. If it doesn't, well, now we know what happens: we go into this branch and we create it with some hardcoded values. Otherwise we go in there. But what we did all the way down here was still useful: we figured out where the AES key was, so we should probably rename it. Let's do that. We went into main.encrypt, and this here was the key, and this was the key size. So let me rename them, like this. Might as well do it, because I'm pretty sure that in the other branch we are going to have references to that key.
Okay, so here we are back in the other branch. If the file does exist then, as you might expect, the key gets referenced or loaded somewhere, and then we have this main.decrypt function, which I assume is applied to the contents we got from the initial read file. Then you have this genSplit function, which works on the pipe character, like this. So we just split whatever string was returned after decrypting the configuration, using the pipe as the separator. It makes sense, because this is the structure we've seen, so really nothing unexpected there. Once the strings are split, we move on to this call to Atoi: the tokens are converted back from strings to integers, and then they are stored back into the global variables that we've seen before. Of course I'm skipping over a lot here, since we've seen this already, but what happens in this branch is the opposite of the other one. By which I mean that instead of taking hardcoded values and storing them in the configuration, we take the configuration, decrypt the
configuration and read the values back into the memory of the program, into the global variables. So now we sort of know what is taking place in this big, sprawling define internal settings function. It's the function that processes the configuration of the program: either the configuration does not exist and we create it, or it exists and we load it and put the values into the global variables. I'm just going to go back here, and then we would be back there.
I think this is probably a good time to put a stop to this analysis, but of course there is a lot more going on in this program. What we saw is just the very beginning, the loading of the configuration. It took us something like two hours to get there, but we really took our time: we wanted to make sure that we had everything correct, we checked every argument under the debugger, and we also reconstructed part of the program, so it was a time-consuming effort. What happens next is that we enter the main loop of the program: now that it has its configuration, the program starts doing things on the victim machine. This is where it connects to the C2 server and starts interacting with it. In exactly the same way that we looked at how the configuration is loaded, we would be able to look at the protocol used to talk to the C2 server, at the different keywords that the C2 can send to trigger different actions, etc. By applying this exact same process to the rest of the program, we would be able to understand all of it. Overall, I think that with something like two to three days of work, you can be fairly confident of seeing everything in a Go program, even though it is 5 megabytes and even though you might not have written a single line of Go in your whole life. So I think we can probably stop somewhere around here.
If you do have questions, feel free to ask them. If there is something you want me to show again, now is the time; we have some time left. Otherwise, thank you so much for listening to this tutorial. I really hope it was helpful to you, and that the next time you receive malware written in Go, it won't seem as daunting as it maybe did in the past. So that is it. I will be around at least until the end of the conference, and staying on for a while, so if you just want to go have drinks, feel free. Thank you so much, and I suppose that is it.
If you want the slides, I am going to upload them to the internet; it will be the easiest way for everyone. The slides should be in the same folder as the one where I put the samples, so p.kwi.ski or proxy.kwi.kwi.fr, which is the long form of the domain name. So the PDF is there. It is just the slides that I showed at the beginning of the presentation; not much going on there, but at least you will have all the names that you might need. If you have any interest in the way the type information can be extracted from the binaries, with the script I showed you, I also wrote a blog post on the website. You have the code on GitHub, but if you want to know exactly how this information is stored in the binaries, you can find that there as well. It is technical, though, and these are details you don't really need to handle by hand if you have a script that does it. Anyway, that's it. The slides are online. Again, if you have other questions, come and ask. Same URL as the samples; you should have the URL on screen. Yeah, no problem.
So, do you see anti-debugging? Not really. For starters, I don't really see that much malware that uses anti-debugging. Usually they avoid those techniques because they might stand out somewhat, so they try not to do this. Besides, the language makes this hard:
actually, everything comes from the library you have to work with, and you don't really get to execute native stuff. So you cannot easily start working with the Windows structures, you cannot go into the PEB and that kind of stuff. This is... there are no such folders in the list, for instance. There are more than one scd? I don't think so. Meet me