Hello, I'm Didier Stevens, a Senior Handler with the Internet Storm Center. In this video, we are looking at a malicious document. A question was asked in a comment on the diary entry I made last week, "Maldoc Analysis of the Weekend", where I analyzed a malicious document with VBA code that contains PowerShell. I used the strings method, and this reader did the same, but he also wanted to know how you could do this without the strings method: really going to look for what you are missing, which is the alternative text in this case. So this weekend I wrote a diary entry about this, "Finding Property Values in Office Documents", where I explained this. Here I will make a video of this method. I will do about exactly the same as I did in the diary entry, but I will also show an alternative method, so that you know more than one method.
So I have my malicious sample here. With oledump, we can see that stream 8 contains the macros. So I'll select stream 8. And since VBA macros are compressed, I use option -v to decompress them. And here you have the decompressed VBA code. Now I know that there is a shell statement somewhere in here, so I'm going to grep for shell. You can never be sure of the case that malware authors will use, so I use option -i to make the search case insensitive. And here I have my line with the shell command. You can see it's a cmd /c command. And if we look a bit further, you have your alternative text. That is what we want to recover: the alternative text of an object. That object is in this variable. So I'm going to grep for this variable in the source code, because it's a VBA variable, like this, also case insensitive because you cannot be sure of the case that they used. Okay. And here you can see that an object is assigned to the variable. It's a shape.
And this is the name of the shape. You can see it is between double quotes, so it is a string. This string is something we can look for in the Word document, and we can do that with a YARA rule. So first of all, I'm going to copy this string, and we're going to use a YARA rule to search for it. You can do that in oledump with option -y, which stands for YARA. Then you have to type a file name, like string.yara, and you have to create a file that contains the rule to search for that particular string. Now, because I do this often, it's a bit cumbersome to always have to create a small YARA rule like this. With oledump and my other tools, you can also create ad hoc YARA rules. You indicate this with #s#, like this. The s stands for string. And then you just paste the string you want to search for. Doing this creates an ad hoc YARA rule that just searches for this string: in ASCII, in Unicode, and regardless of case. That is what the generated ad hoc rule does. Now the hash has a special meaning here on macOS, so I'm going to escape it like this. And then I can run this on the sample.
Okay. And now you see in the output that for stream 4 and stream 8 the string YARA rule matched. So we know that this string is in streams 4 and 8. Stream 8 is no surprise, because that's our VBA macro code; it's normal that the string is in there. And then it is also in stream 4. So the shape that we are looking for, to find the alternative text, is in stream 4. Where in stream 4? Well, you can use another option, the YARA strings option. If you use it together with the YARA option, you also get a list of all the strings that were found, like this. Okay. So the string was found at this hexadecimal position, 16C7. And this is the byte sequence and the string that we matched. As you can see, it is actually Unicode.
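The search that the ad hoc rule performs (ASCII, Unicode, regardless of case) can be approximated in a few lines of Python. This is only a sketch of the idea, not how oledump or YARA implement it: the function name `find_string`, the made-up shape name, and the sample bytes are all hypothetical, and the text is assumed to contain ASCII characters only.

```python
def find_string(data: bytes, text: str):
    """Return sorted (offset, encoding) pairs for every case-insensitive match."""
    hits = []
    lowered = data.lower()  # bytes.lower() only changes ASCII letters
    for encoding in ("ascii", "utf-16-le"):
        # Encode the lowercased search text as plain ASCII or as UTF-16LE
        needle = text.lower().encode(encoding)
        start = lowered.find(needle)
        while start != -1:
            hits.append((start, encoding))
            start = lowered.find(needle, start + 1)
    return sorted(hits)


# Hypothetical stream content: a made-up shape name stored as UTF-16LE
stream = b"\x00" * 5 + "ShapeName1".encode("utf-16-le") + b"\xff\xff"
print(find_string(stream, "shapename1"))
```

The reported offset plays the same role as the hexadecimal position shown in the YARA strings output: it tells you where in the stream to start looking.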
You have the Q, the J, the X, and so on (sorry, the 1 here), and you have all those 0 bytes in between. So this is actually Unicode. So 16C7 is where we have the name of our object, our shape. I can select this stream and then do an ASCII dump of it. When I do this, I get the complete stream. There is also an option, cut. It's an uppercase C. In most of my tools the cut option is a lowercase c, but in oledump lowercase c was already used, so I had to use the uppercase C. The cut option allows you to specify a range of bytes that you want to select. So you don't get the complete stream dumped, just the small sequence that you specify. The way it works is as follows. Let's say that I want everything from byte position 10 up to and including byte position 20. That's 10:20, like this. And then you get this output: these are the bytes that you find starting at position 10.
So what I'm going to do here is give the position of the name of the shape, 16C7, and I prefix it with 0x to indicate that this is a hexadecimal number and not a decimal number, like this. I get no output, because I still said up to 20, and 20 is smaller than 0x16C7. So instead I'm going to say that I want 100 bytes, and that is done with the suffix l; the l stands for length. So here I'm selecting 100 bytes, like this. And here you can see the name of our shape, and then something that looks like PowerShell, but it's obfuscated and it's in Unicode. So this is actually the command. I can make this longer; I can say 200 bytes, for example. And then you can see that this indeed looks like an obfuscated PowerShell command. So we are going to continue trying to extract this. But first, I'm going to show you another method. Say that you don't know the position.
Well, then you can also write a cut expression that searches for the string that you want to find and starts cutting from the position of the first occurrence of that string. You do this with square brackets, like this, and single quotes, and this is the string that we want to look for. Now if you do that, you are looking for an ASCII string, and we know that it's not an ASCII string but a Unicode string. So prefix it with u. Also make sure that you use the latest version of oledump, because that's where the Unicode argument for a cut expression is supported. It's only from version 0.0.41, if I'm not mistaken, that the cut expression also supports Unicode. Like this. So here we don't have to specify where the string starts. We just search for the string, and the first location where it is found is selected as the beginning position for the cut expression. From that position on, bytes are selected.
Now we want to know how long our expression, our PowerShell statement, is. So I'm just going to leave out the second part of the cut expression. I'm not going to tell it where it has to stop cutting, and that means that it will select everything until the end. So let's do this, and then let's scroll back. Here we have our PowerShell expression. It's a long expression. It starts here, and you can see that it looks like base64. And as is often the case with base64 expressions, it ends with an equal sign. That's not always the case: it can be without an equal sign, or with two equal signs. Here it's one equal sign. So I can use that too, to specify where the cut has to stop. I'm looking for the string equal sign, and I want the selection of bytes for the cut expression to run up to this string. And this is Unicode, so u, like this. And now I have an ASCII dump with the name of the shape and then the complete PowerShell statement.
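To make the cut expressions described so far more concrete, here is a minimal Python sketch of the selection logic: a begin term that is either a position or a string to search for, and an end given as a position, a length, a string, or nothing at all. The helper `cut`, the shape name, and the sample stream are hypothetical, and the details (the end position is inclusive, the end string is included in the selection) are assumptions based on the walkthrough, not a copy of oledump's own code.

```python
def cut(data: bytes, begin, end=None, length=None):
    """Select part of data; begin/end are an int position or bytes to search for."""
    # A bytes begin term means: start at the first occurrence of that sequence
    start = data.find(begin) if isinstance(begin, bytes) else begin
    if start == -1:
        return b""
    if length is not None:          # like the length suffix: N bytes from start
        return data[start:start + length]
    if end is None:                 # no second term: select until the end
        return data[start:]
    if isinstance(end, bytes):      # stop after the first occurrence of end
        stop = data.find(end, start)
        return b"" if stop == -1 else data[start:stop + len(end)]
    return data[start:end + 1]      # numeric end, assumed inclusive


# Hypothetical stream: made-up shape name followed by a made-up command
name = "Shape1".encode("utf-16-le")
command = "PowerShell -e QUJD=".encode("utf-16-le")
stream = b"\x01\x02" + name + b"\x00\x00" + command + b"\xff\xff"

print(cut(stream, 0x10, end=5))                          # end before begin: empty
print(cut(stream, name, end="=".encode("utf-16-le")))    # name through the equal sign
```

Searching for the encoded string instead of typing a position is the design choice that matters here: the same expression keeps working on another sample where the shape sits at a different offset.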
Now if I just want the PowerShell statement, then instead of the name of the shape, I can search for the beginning of the PowerShell expression. So that's an uppercase P, then O like this, W like this. And here you see that I start exactly where the statement starts. So this is Unicode, and with oledump you can also extract Unicode. Instead of an ASCII dump, we are going to do a text translation, and we know that this is UTF-16 Unicode, like this. And then you get your complete PowerShell statement. Now, as explained in the diary entry of last week, this square bracket has to be replaced with an uppercase A. When you do this, you have the real base64, Unicode encoded statement for the PowerShell interpreter.
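The last two steps, translating the UTF-16 bytes to text and replacing the square bracket with an uppercase A, can be sketched in Python, with one extra step added for illustration: decoding the base64 to see the command. The function `recover_command` and the harmless sample command are hypothetical; the sketch assumes the opening square bracket is the only substitution and that the base64 payload decodes to UTF-16LE text, as is usual for encoded PowerShell commands.

```python
import base64


def recover_command(obfuscated_b64: str, placeholder: str = "[") -> str:
    """Undo the character substitution, then decode base64 as UTF-16LE text."""
    restored = obfuscated_b64.replace(placeholder, "A")
    return base64.b64decode(restored).decode("utf-16-le")


# Build a harmless, made-up sample the same way: encode, then swap A for [
plain = "Write-Host hi"
encoded = base64.b64encode(plain.encode("utf-16-le")).decode("ascii")
obfuscated = encoded.replace("A", "[")

# In the document stream the text itself is stored as UTF-16
raw = obfuscated.encode("utf-16-le")
text = raw.decode("utf-16-le")      # the "text translation" step

print(recover_command(text))
```

Because the opening square bracket is not part of the base64 alphabet, replacing every occurrence is safe in this sketch; a sample that uses a different placeholder would need a different `placeholder` argument.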