Welcome to Malware Analysis for Hedgehogs. Today we explore three different techniques to deobfuscate JScript or JavaScript malware. To do that, we look at a GootLoader sample, because GootLoader is one of the most prevalent JScript malware families out there. GootLoader is an initial infector, so it's mainly used to distribute other payloads, and by now it has up to six layers of encrypted code. That's a pretty nasty JScript malware, to be honest.
I'm going to use Notepad++ here. We set the language to JavaScript, and the first thing you will notice is that this is an open-source library, in this case jQuery. We have about 10,000 lines of code, and this is supposed to be a GootLoader sample, so the malicious code has to be hiding somewhere in this mess. One way to go about this is to search for the original jQuery library and then use Meld to compare the original against this file. The way I found it, though, was by just scrolling through until I found a function that looked pretty weird. This function here, for instance, stands out from the others because the names are random words with some digits behind them.
So let's see where this function is actually called; we want to get to the beginning of the call chain. This is called by this function, which is called by this function, and that one by the next. Okay, we are kind of running in circles, so let's look at another one: node7. That's the start node, so node7 is the entry point to the malware code.
Now, here is what I did, and we are not going to do this in the video because that would be a bit too much: I figured that when I get a file like this to analyze, it would be nice to be able to extract all the functions that are being called, starting from the start node. So I created a script for that. You can download it from GitHub.
It's called extract called functions, and we are going to use it and provide it with this start node. For that you have to install Node.js and npm, which is the package manager. If you run the script and get this error, you still have to install some packages with npm, so let's do that. The packages we use are @babel/core and commander. We install them, and then the script should work. As you can see, I had some issues with the installation, but I think that has to do with my VM, where for some reason it didn't finish, so I started it again. You may notice that you now have some modules here and a package.json that were created by the installation. So now you have all the requirements for the extract called functions script.
Let's check which options we need. We have -f for the file to deal with and -s for the start node, and that's what we are going to use. Our file is this one, and the start node, as you may remember, is node7. The script extracts all of the functions that are called, and also the ones called from those called functions, so it finds them recursively. Not only that: any identifiers used in those functions are also put into the extracted file.
So this is the file we get. And look at this: we had over 10,000 lines of code, and now we have way fewer. We are at 138 lines, which is much easier to handle. Now we can actually analyze this. The very first thing you will notice is that there are a ton of strings in this world3 function. They all seem to be built up here, then combined again, and finally put into this final string, finish7. You can check: here is run, run is created here, and instrument is this assignment here. So how do you go about this? You could manually walk through it and copy and replace these variables.
But I do not recommend that. It would take ages, and you will be very frustrated if you make just one mistake: it's not going to work, and you will not find out why. One thing you could do instead is use regex. So let's copy and paste the code here. Actually, I'm just interested in this part. Let's try to capture this part here and then see if we can replace this part there. For that we need capturing groups.
The first thing is that we start with a word, which is this identifier. Then we have a space, an equals sign, another space, and then this string. By the way, from now on I'm using multi-line and global as options. We want to capture the identifier, but we also want to capture the string, because that's what we are going to replace the identifier with. Next, I need to find where the identifier is used. So when this is 23, which is the first capture group, I want to replace it where it's used later on, which means we need to find 23 again. For that I also need to enable single line, so that all of this is treated as one line, and then I look for the second 23. This is how I'm doing it, and it finds that part. Let's put this in another capture group for the replacement later in Notepad++.
This doesn't really work yet, though. You see from the coloring that we would now replace 23 with all of this. So what I will do is exclude the single quote here: instead of matching any characters for the string, we match only characters that are not a quote. Now we see match 1, group 2, and this is the replacement for 23 that we will use here. There's another thing we need to take care of: we want to keep this part in between, so it is not going to be replaced. That's why I put it in a capture group as well.
Now we press Ctrl+F and choose Replace. We select regular expression and make sure that ". matches newline" is checked, so that we can match across line boundaries.
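As a rough sketch of the pattern being built here, the same idea can be written in JavaScript. The strings and the name paper23 below are made up for illustration and are not taken from the sample, and the loop only stands in for pressing Replace repeatedly in Notepad++. It also assumes, like the pattern in the video, that each variable is used exactly once after its assignment.

```javascript
// Made-up miniature of the obfuscated string building (not from the sample).
const obfuscated = [
  "similar6 = 'WScr';",
  "paper23 = 'ipt.';",
  "finish7 = similar6 + paper23 + 'Shell';",
].join("\n");

// Group 1: the identifier being assigned.
// Group 2: a quoted string that contains no further single quotes.
// Group 3: everything between the assignment and the next use of group 1.
// Flags: m lets ^ match at line starts, s lets . match newlines.
const pattern = /^(\w+) = ('[^']*');\s*(.*?)\b\1\b/ms;

function inlineStrings(code) {
  let previous;
  do {
    previous = code;
    // Drop the assignment, keep the text in between, insert the string.
    code = code.replace(pattern, "$3$2");
  } while (code !== previous);
  return code;
}

const inlined = inlineStrings(obfuscated);
console.log(inlined);
```

Because group 2 excludes single quotes, an assignment whose string contains an escaped quote is never matched and stays behind untouched.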
So what is it that we want to do? We want to replace everything we find. Let's see, do we find something? Yes, we find this 23. We can get rid of the assignment here because we don't need it anymore. But we need the part in between, so here we put the match from this group; it's the third group, so we reference group 3. Then we want the second group, which is the string, so we add group 2. We click Replace and see that it has been replaced, and the next match is here. Replace again. Now we can basically keep pressing Replace until all of these are gone and nothing changes anymore.
Some of these are not replaced, so a few concatenations are still there. We could address those with regex as well, but I want to show you another way to deobfuscate samples, and we are going to proceed with this part here. So let's copy this and go to AST Explorer.
When you open astexplorer.net for the first time, there is probably this default snippet. If it isn't there, go to Snippet, New. You should choose JavaScript and the Babel parser. So what is this? Babel is a transpiler. In contrast to a compiler, it transforms JavaScript code into JavaScript code. As far as I know, developers mainly use it to convert newer JavaScript syntax into code that older environments can run; I'm not a developer, though, so I'm not entirely sure about the details. The good thing is that you can use transpiling for deobfuscation too, and for obfuscation, for that matter. So we can use it for our purposes as well, and I want to at least make you aware that this option exists.
An abstract syntax tree is something that compilers in general use. It is an intermediate, internal representation of how the program is built up: where are the variables, where are the functions? That means a tool like this gives you a lot of context information.
A regular expression, by contrast, is dumb in the sense that it cannot know what is a function, what is a variable, or whether something is a constant, an identifier, or something else. It doesn't know the context, but abstract syntax trees do. The drawback of this kind of deobfuscation is that it's probably slower, at least if you analyze just a single sample with it: you have to write more code, and you have to think a bit more up front about how to solve the problem. But you can reuse the transformations for similar code, so they are more applicable to other malware than regexes, which are probably very specific to one particular sample. So it's more robust.
If you click on the Transform button and choose Babel v7, you get an example transformation. The standard example simply reverses names: you see a function name that is reversed and a variable name that is also reversed. Let's see how this works. We have this abstract syntax tree, and we have the variable name tips. If you click here, it highlights where the variable is. You see a VariableDeclarator, which is this part; if you click on it, you see that it covers tips and the value assigned to it. We have the identifier, which is tips, and the variable is initialized with an ArrayExpression containing three string literals, which are these. So that is the structure. Every one of these things you can click on is a node, and every node has a type, which is stored in node.type. In this case, it's VariableDeclarator.
In the transformation code we see a visitor. What is that? The visitor traverses the whole abstract syntax tree. The path holds the information about where in the syntax tree we currently are, and at every point you can get from the path to a single node, like this particular ExpressionStatement, for instance. That's a node.
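To make the terms node, type, visitor, and path a bit more concrete, here is a dependency-free sketch. The visit function is a hypothetical stand-in for Babel's traversal, and the path is reduced to an object that only holds the node; a real Babel path carries much more. The tree is a hand-written, simplified version of what a parser would produce for let tips = ["a", "b"].

```javascript
// Hand-written, simplified AST in a Babel-like shape for: let tips = ["a", "b"];
const ast = {
  type: "VariableDeclaration",
  kind: "let",
  declarations: [
    {
      type: "VariableDeclarator",
      id: { type: "Identifier", name: "tips" },
      init: {
        type: "ArrayExpression",
        elements: [
          { type: "StringLiteral", value: "a" },
          { type: "StringLiteral", value: "b" },
        ],
      },
    },
  ],
};

// Hypothetical stand-in for a traversal (not Babel's API): walks every node
// and calls visitor[node.type] with a minimal path object, if a handler exists.
function visit(node, visitor) {
  if (Array.isArray(node)) {
    node.forEach((child) => visit(child, visitor));
    return;
  }
  if (node === null || typeof node !== "object") return;
  if (typeof node.type === "string" && visitor[node.type]) {
    visitor[node.type]({ node });
  }
  for (const key of Object.keys(node)) visit(node[key], visitor);
}

// Same idea as the default example: reverse the name of every Identifier.
const seen = [];
visit(ast, {
  Identifier(path) {
    seen.push(path.node.name);
    path.node.name = path.node.name.split("").reverse().join("");
  },
});

console.log(seen, ast.declarations[0].id.name);
```

Only the Identifier node triggers the handler; the declaration, the declarator, the array, and the string literals are walked but have no matching entry in the visitor.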
The visitor matches on a certain node type. In this case, it looks for nodes that are Identifiers, so every Identifier node will be visited and handled by this function. We see Identifier and then path.node.name, so it gets the node from the path and accesses its name. Do we have an identifier? We have one here; it's called printTips. And we can see that an Identifier has .name, which is printTips. That is what gets accessed here, and it is assigned a new name, which is just the reversed version of the old one. So this is how you deal with abstract syntax trees in general.
Let's now apply this to our sample. We put this function in here, and we are not really interested in the reversal. I'm choosing an easy example as the first thing we do. It would be easy to solve with regex as well, I know that, but let's start small: let's build a transformation that simply concatenates strings.
So what do we have to do? First, we look at our abstract syntax tree, and we see that these expressions are built up as nodes of the type BinaryExpression. On the left we have a StringLiteral, and on the right we have a StringLiteral. If you go up from these nodes, you see a BinaryExpression on the left and a StringLiteral on the right, because that left BinaryExpression is the previous concatenation. So this is a tree of binary expressions until you reach the leaves with two string literals. What we want to find is a BinaryExpression that has a StringLiteral on the left and a StringLiteral on the right, so that we can concatenate them. So we replace Identifier with BinaryExpression in the visitor. Now we match specifically those binary expressions that have two string literals, and we say we want to replace them.
For that we use path.replaceWith, and Babel has a way to create nodes. We want to create a StringLiteral node, and for that we use the types helper here: t.stringLiteral. Let's say we replace it with "A" for now, so that we see which nodes are going to be replaced. You can see that not all of them are replaced, so the concatenations are not resolved entirely; only those at the very bottom of the tree are replaced. That's because the replacement should happen after all of the nodes below have been visited. There is a function for that, and that is exit. So we change this to be executed only on exit.
At this point I realize I have probably left out some nodes that we need, so we are going to add those as well. The very reason these were not replaced by the regex is that they contain single quotes, which we explicitly excluded from the string in our pattern. That's one of the reasons why regexes are hard to get right. So we just need to add those nodes as well, this one and this one. And now we got them all, and those two are not necessary anymore.
Now we just need to put in the right value, which is the concatenation of the two nodes. How do we get to the value? You look at the StringLiteral and its value: from the BinaryExpression you take left.value and right.value. And now we have our concatenation transformation. As you can see, it's a bit more involved than regex, but it's also more robust. We don't have to deal with things like how JavaScript escapes strings, because the parser takes care of that implicitly.
So let's copy this here. There's one thing left to do: we want to get the whole string. finish7 is a huge string that's being built up here, so let's try to recreate it with our abstract syntax tree transformation.
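The concatenation transformation described above can be sketched without any dependencies. In this sketch, miniTraverse is a hypothetical stand-in that mimics only a small part of Babel's traversal (enter and exit handlers plus path.replaceWith), the t object imitates only t.stringLiteral, and plus and ident are helpers invented for the example. The strings are made up. It also shows why the handler has to run on exit rather than on enter.

```javascript
// Tiny node builders in a Babel-like shape. Only stringLiteral mirrors a real
// Babel helper name; plus and ident exist for this sketch only.
const t = { stringLiteral: (value) => ({ type: "StringLiteral", value }) };
const plus = (left, right) => ({ type: "BinaryExpression", operator: "+", left, right });
const ident = (name) => ({ type: "Identifier", name });

// Hypothetical stand-in for a traversal (not Babel's implementation).
function miniTraverse(root, visitor) {
  const holder = { root };
  function walk(container, key) {
    const node = container[key];
    if (node === null || typeof node !== "object" || typeof node.type !== "string") return;
    const path = {
      node,
      replaceWith(newNode) {
        container[key] = newNode;
        path.node = newNode;
      },
    };
    const handler = visitor[node.type] || {};
    if (handler.enter) handler.enter(path); // before the children are visited
    for (const childKey of Object.keys(path.node)) walk(path.node, childKey);
    if (handler.exit) handler.exit(path); // after the children are visited
  }
  walk(holder, "root");
  return holder.root;
}

// Fold 'a' + 'b' into 'ab' when both operands are string literals.
function fold(path) {
  const { operator, left, right } = path.node;
  if (operator === "+" && left.type === "StringLiteral" && right.type === "StringLiteral") {
    path.replaceWith(t.stringLiteral(left.value + right.value));
  }
}

// ('WScr' + 'ipt.') + 'Shell'
const makeTree = () =>
  plus(plus(t.stringLiteral("WScr"), t.stringLiteral("ipt.")), t.stringLiteral("Shell"));

// On enter, the outer node is checked before its left side has been folded.
const onEnter = miniTraverse(makeTree(), { BinaryExpression: { enter: fold } });
// On exit, the inner node is folded first, so the outer one can fold as well.
const onExit = miniTraverse(makeTree(), { BinaryExpression: { exit: fold } });
// (x + 'a') + 'b': no node has two string literal operands, so nothing folds.
const mixed = miniTraverse(plus(plus(ident("x"), t.stringLiteral("a")), t.stringLiteral("b")), {
  BinaryExpression: { exit: fold },
});

console.log(onEnter.type, onExit.value, mixed.type);
```

The last case is worth keeping in mind: as soon as the leftmost operand is not a string literal, the literals to its right never end up together in one BinaryExpression, so this simple visitor leaves them alone.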
To do that, I'm going to type in a template that is a bit more flexible. If you formulate it this way, you can do several traversals through the program, not just one. In this case I want two passes: in the first traversal I collect all of these nodes, and in the second traversal I replace all of these.
For that we define our visitor. Again, you check here how it's built up, and this in particular is an AssignmentExpression. On the left is the identifier similar6, and on the right is a StringLiteral. That's what we want: this kind of AssignmentExpression node. We don't want this other assignment expression, only those, so we say: if the right side is a StringLiteral. And what do we do then? We collect the nodes. We call this to traverse the whole abstract syntax tree, and it fills our array with all the assignment nodes that we want.
In the second traversal we want to replace the identifiers with the literals. So let's define this visitor. What do we want to replace? These. Look here: what is it? It's an Identifier. So if it's an Identifier, let's replace it. Now, it shouldn't replace all identifiers, because it would replace these here as well, and we don't want that. We only want to replace these. So we make sure they are part of a BinaryExpression. What exactly does this check do? It accesses the parent path. We are currently at the Identifier, and the parent is the BinaryExpression here, so we just make sure that this Identifier is part of a BinaryExpression. I could also search for BinaryExpression and then check whether there is an Identifier node. Both work fine; it's just a different way of expressing the same thing.
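The two passes described so far (collect the assignments, then substitute the identifiers) can be sketched like this. Again, miniTraverse is a hypothetical stand-in offering only node, parentPath, and replaceWith on its path object, and the tree is hand-written with made-up strings rather than parsed from the sample. The lookup via find on the left-hand name is one plausible way to locate the matching assignment.

```javascript
// Hypothetical stand-in for a traversal (not Babel's implementation).
function miniTraverse(root, visitor) {
  function walk(container, key, parentPath) {
    const node = container[key];
    if (Array.isArray(node)) {
      node.forEach((_, index) => walk(node, index, parentPath));
      return;
    }
    if (node === null || typeof node !== "object" || typeof node.type !== "string") return;
    const path = {
      node,
      parentPath,
      replaceWith(newNode) {
        container[key] = newNode;
        path.node = newNode;
      },
    };
    if (visitor[node.type]) visitor[node.type](path);
    for (const childKey of Object.keys(path.node)) walk(path.node, childKey, path);
  }
  walk({ root }, "root", null);
}

// Hand-written, simplified tree for:
//   similar6 = 'WScr';
//   finish7 = similar6 + 'ipt';
const str = (value) => ({ type: "StringLiteral", value });
const ident = (name) => ({ type: "Identifier", name });
const assign = (left, right) => ({
  type: "ExpressionStatement",
  expression: { type: "AssignmentExpression", operator: "=", left, right },
});
const ast = {
  type: "Program",
  body: [
    assign(ident("similar6"), str("WScr")),
    assign(ident("finish7"), {
      type: "BinaryExpression",
      operator: "+",
      left: ident("similar6"),
      right: str("ipt"),
    }),
  ],
};

// Pass 1: collect assignments whose right side is a string literal.
const assignments = [];
miniTraverse(ast, {
  AssignmentExpression(path) {
    if (path.node.right.type === "StringLiteral") assignments.push(path.node);
  },
});

// Pass 2: replace identifiers that sit directly inside a binary expression.
miniTraverse(ast, {
  Identifier(path) {
    if (!path.parentPath || path.parentPath.node.type !== "BinaryExpression") return;
    const node = assignments.find((n) => n.left.name === path.node.name);
    if (node) path.replaceWith({ ...node.right }); // copy, so no node is shared
  },
});

const rebuilt = ast.body[1].expression.right;
console.log(rebuilt.left, rebuilt.right);
```

The parent check is what keeps the left-hand sides of the assignments intact: similar6 and finish7 on the left of an equals sign have an AssignmentExpression as parent and are skipped.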
Now let's find our node among the collected ones. We call find and provide a function that identifies our node. Which node do we want? The one where n.left.name matches. Why? Let's look again: this is an AssignmentExpression node, and on the left is the Identifier. We want to compare that name to the name of our current Identifier node. We save the result in the node variable, and if it is set, we replace the current node. We can use the collected node as is: its right side is the StringLiteral, and that's what we replace the identifier with.
There's still some problem that I need to check first. These braces are not necessary. So where are we? It didn't work yet. No, it works; I just have the assignments still there. There is one place where you can delete them, and that's here: you can add path.remove(), and now all of the assignment expressions with a StringLiteral on the right side are removed. All of them, no matter whether you use them later for the replacement. That can be a problem if you copy a bigger part of the code, where you may delete some of those nodes for good. Anyway, we have our final string here, which is what we wanted.
It is still in need of concatenation. You see that these also need to be concatenated, and you already know how to solve that. But there's another way to deal with this: use some online JavaScript compiler and just print the string. So let's add our code here, which is not concatenated yet, console.log the variable, press Run, and we get the whole string.
How are we going to move forward with this? How is this variable used, by the way? Let's analyze this a bit. We see that finish7 is passed into a function named use45.
You also see another function here that the result is passed into, together with this string, which looks suspiciously like a key. use45 is here, and it's doing something 2,704 times. There are pound and strong; let's search for these. This one is just calling another function, and that's a substring of something. And pound and strong call another one, which is a mod of something. The interesting part about these calls is that they actually use fewer arguments than they pretend to have. Only three of them are really used.
If you want to analyze this properly, I highly recommend renaming the variables. That's the only way you can really get along with understanding it. But just as a quick overview: this seems to decrypt something. If you check how the result is used, it's put into jet7. The question is: if this is being decrypted, where is it being used? Let's see where jet7 is used. You see that it's an array, and it's being called here. This is a call on one of the functions of jet7, so I expect to get an array from this part.
So let's copy the whole code and just run it with our decryption function. We need the key as well, because I didn't copy the initialization of music. Let's run this. It's complaining that enemy is not defined. Where do we have enemy? The reason is that we didn't initialize it; we initialized finish7 but not enemy, so we are going to do this as well. Run. young1 is not defined. This is another one of these. That's one, and here's another one. We just take these as they are. We are not calling this. Try again. We print the decoded jet7, not finish7. That's way better. So we have an array: the first element is constructor, and the second is this, which is another script.
If we put this script into AST Explorer, we can make a prettier version of it, but we should probably remove the transformation before we do that. So this is our script, the one that we unpacked. It's doing something with registry keys here, and here's another encrypted script. Let's check what year1 is and find it in the context here: this is use45. So we can simply use this again, and you also see that there's a call here. You can put this into our online decryptor again; I think it was this. Let's run. And here we are. But this isn't entirely decoded.
All right, I needed some time to find the problem and why this part of the code was not working as expected. You see it's still in the wrong order, and some parts are still reversed. The reason is the escapes: the double escapes are supposed to be just one escape. I guess that during the normal conversion process the double escapes turn into single escapes. So all we have to do is set the search mode to Normal, find the double backslashes, and replace them with one backslash. Replace All. We copy and paste the new code in here, run it, and it works fine.
Let's beautify this. I remove the traversals here, and we get an error: 'return' outside of function. The way GootLoader deals with this is that it wraps a function around the code, so we are going to do the same. We grab the script here, and we can also traverse it with the concat visitor to get a slightly nicer output. In case you're wondering: yes, I was experimenting a bit with this. It's the same concat visitor we used before, just put into a separate constant, so the code might look a little different. But here we have our final layer, and we can see the C2 URLs and the values they apply to these.
This part didn't get concatenated. Why? Because our concatenation is based on two string literals in the binary expression, and this here is not a string literal; it's something else, an array. So it is not replaced, because this part here actually belongs to that part: it's not two string literals like these, but these are seen together as another binary expression combined with a string literal. So they are not replaced by our function.
We have now demonstrated three different techniques to deobfuscate GootLoader and unpack it. The easiest are regex and dynamic execution. However, if you are looking to build robust deobfuscation and unpacking tools for JScript or JavaScript malware, I think abstract syntax tree manipulation is the most robust and the best way to go. Also, if you want to develop an arsenal of little scripts that help you deobfuscate samples in general, abstract syntax tree manipulation is probably better and more robust than trying to do the same with regex. So it depends on your goals, but for fast and easy manual analysis you are probably best off with dynamic execution and regex.
Just as a side note: I created a GootLoader unpacking tool. You will find the link in the description below the video. It also uses abstract syntax tree manipulation with Babel, and you can see how I use it to unpack up to six layers of GootLoader, depending on the sample. I considered using a six-layer sample for this video, but then it would have taken way too long. You see, it's already more than 40 minutes for three layers, and the other layers are not that much more interesting; they just repeat what the previous layers did. So I chose this sample instead. If you want to look at some examples, check out the GootLoader unpacking script on GitHub.
If you want to learn malware analysis from the ground up, then check the link in the description below. There is a link to my Udemy course for beginners. It contains 11 hours of video content, and the link is a coupon link, which makes it a bit cheaper for you than buying it from Udemy directly. So check it out, and maybe I'll see you there.