I guess so. Well, there are still more seats up front if anybody wants to grab them rather than stand in the back. Oh, is this a good mic placement and everything? I just tossed it on myself. Excellent.

All right, so my name's Sean O'Toole and I'm going to be speaking to you about metamorphic viruses. I know this whole conference is for educational purposes only, but I want to stress that this one definitely is, because I don't want to see anybody's grandmother not be able to access their email because I gave a speech. You guys are the computer guys of your family, so if they have a problem, you're going to hear about it.

So for my speech on metamorphic viruses, I'm going to start off by going through some pseudocode on how the various modules work, and after that I'll talk about some of the AV approaches to it. Also, on the CD, I'll mention, I did have some misspellings, but it was a late-night presentation that I was putting together after a friend's birthday, so you know what happens.

As we go on, I just wanted to give you a couple of the definitions, one from Peter Szor, who's a well-known antivirus person, down to your virus people. They go from body-polymorphics, where they swear they've always been around in simpler techniques, down to the artist's extreme mutation. So we've got that going.

Now I'm going to give a brief description of the main sections that you'll see or that are theorized. The first one is the disassembler, which usually has some way of putting the code into a structure, like a lot of reverse engineering tools do. The second is the de-permutator, which takes code that's been shuffled and connected by jumps and puts it back into the straightest code that the algorithm can put together.
And also the shrinker, which takes clusters of opcodes and recombines them: say, if you have a push ebx, pop ecx, it becomes a mov ecx, ebx type of thing. So numbers two and three are mainly the ones that get you back down to the skeleton of your virus, which can be expanded on later by these next ones: the expander, which does exactly the reverse of the shrinker, and the permutator, which shuffles the code and links it with jumps and so forth. Then we get down to the reassembler, or assembler, which puts the virus back together, fixes jumps and so forth. These six are the main ones that you'll see or that are theorized. A shrinker is very rare; the expander and permutator are the ones that are more used.

Here's one thing I wanted to go over really quick: when you're programming these viruses or looking at them, you've got to take a very software engineering type of thinking towards it. You want to think in modules: everything's separate, we just pass this information to this one, so you don't end up getting into this huge project that never finishes.

So we get down to the disassembler, and we have two choices. One, which I haven't really come across but which was heavily theorized, is to make a pseudo language, so that, say, if you want to go from ELF to x86, you can change a few things through your pseudo language before you put it back down into assembly language. The other is reverse engineering tools, and Z0mbie is the main person that puts those together. The LDE was an earlier version of the ADE, which gives you a linked list of structures. The structure is actually documented with that module when you get it, so you can look at kind of the manual and it'll tell you basically what you've got to deal with after you run it.
Oh, and one of the things about the disassembler is that it's not always able to reverse engineer all files, just some. So it cuts out some of the selection process.

Here's one from The Mental Driller's article, "Metamorphism in Practice," that's a little example of how you can put together your pseudo language. You have the opcode; the instruction tells you, of course, which registers and whatnot; and LM, which is the label marker, which tells you later if there's a label pointing to that instruction that may be jumped to. This is very important for shrinkers and so forth, because if you shrink a cluster of instructions and there's a label in the middle of that cluster, then you've basically screwed up whatever is jumping to it, because you've merged too much code into one area.

We also have the idea of how to deal with the opcode. You can say that mov is the 40, and if you're doing memory to register, it's plus two, so mov, memory to register, would become 42. There's a full list of pseudo-opcodes like this in the same article.

So now we get down to the de-permutator, because the other choice for the disassembler can be found in its manual pages. The de-permutator, as I said earlier, just de-shuffles the code, if you will. This can be put in its own module or included in the disassembly process, depending on which way you'd like to go.

Here's the pseudocode at the beginning. You've got the variables: you start out with the entry point of the code you're disassembling or de-permutating. You have a path marker, which is your buffer for the de-permutated code. A label, which is two dwords: one is the old address of the code and one is the new address. Later on some of these won't have an old address. This is very important for jump fixing later on, as well as other things.
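As a rough illustration of the pseudo-language idea above, here is a minimal Python sketch of an instruction record with a label marker and the "mov is the 40, memory to register is plus two" arithmetic. The names `PseudoInstr`, `pseudo_opcode` and `cluster_is_mergeable` are hypothetical, and reading "40" and "42" as hexadecimal is an assumption; only the base value and the plus-two offset come from the talk.

```python
from dataclasses import dataclass

MOV_BASE = 0x40    # the talk's "mov is the 40"; reading it as hex is an assumption
MEM_TO_REG = 2     # the talk's "plus two" for the memory-to-register form


@dataclass
class PseudoInstr:
    opcode: int                 # pseudo-opcode: base operation + operand form
    operands: tuple             # registers, memory references, immediates
    label_marker: bool = False  # True if some jump may land on this instruction


def pseudo_opcode(base: int, form: int) -> int:
    """Combine a base operation with an operand-form offset."""
    return base + form


def cluster_is_mergeable(cluster: list) -> bool:
    """A cluster may be shrunk only if no jump lands inside it.

    A label on the first instruction is fine, because the merged
    instruction starts at the same place.
    """
    return not any(instr.label_marker for instr in cluster[1:])


mov_reg_mem = PseudoInstr(pseudo_opcode(MOV_BASE, MEM_TO_REG), ("ecx", "[ebx]"))
print(hex(mov_reg_mem.opcode))  # 0x42
```

Keeping the label marker on the instruction record is what lets a later pass refuse to merge a cluster that a jump lands in the middle of.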
And the future jump label table, which holds labels that you don't want to deal with yet in the algorithm. You'll of course zero out the path markers so you don't end up hitting random opcodes that were left in that buffer and thinking they could be a portion of what you're de-permutating. And you load the current instruction pointer into ESI.

Here's what you do for a jump. If you have a jump to code you've already de-permutated, of course you just jump back to that. That's one of the things about this algorithm: you cannot count on getting the same de-permutated virus every time. You may have a jump from a couple of generations back that the algorithm never really took out. If the target has not been de-permutated yet, you put in a NOP, just in case the jump is being pointed to, and then you go on from where the jump leads you.

Then you have the conditional jumps, which a lot of times are just kept in there, and their targets are put in the future label table. You can see everything here. Calls are the same as conditional jumps. Returns just go to the future label table, so you move one leaf up the code tree and get back to it.

And here's a little note I wanted to make sure we had on there: when getting a new EIP from the future label table, we check if the labels stored there have already been de-permutated. If they have, we insert the corresponding labels into the label table and eliminate the entry in the future label table. If not, we take the new EIP and continue.

I also had a little note here for if you're making it standalone: you're going to have to do a lot more list handling, because you're working in two buffers. That's one of the reasons it's usually put in the disassembler, where it will take up less room.
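The walk described above is, in effect, a control-flow normalization pass, and it can be sketched on a toy instruction model. This is a minimal sketch under stated assumptions, not the algorithm from the slides: instructions are `(mnemonic, argument)` tuples, every instruction is one address unit long, all branch targets lie inside the code, and the function name `depermute` is hypothetical. Unlike the talk's version, it drops followed jumps outright instead of leaving a NOP placeholder.

```python
BRANCH_OPS = {"jmp", "jcc", "call"}


def depermute(code, entry):
    """Straighten shuffled code.

    code:  dict mapping old address -> (mnemonic, argument)
    entry: old address where execution starts
    Returns (instructions, labels); labels maps old address -> new index.
    """
    out = []          # straightened instruction stream
    labels = {}       # label table: old address -> position in `out`
    future = [entry]  # future label table: branch targets still to visit

    while future:
        ip = future.pop()
        if ip in labels:
            continue  # this target was already straightened
        while ip in code:
            if ip in labels:
                out.append(("jmp", ip))  # rejoin code emitted earlier
                break
            labels[ip] = len(out)
            op, arg = code[ip]
            if op == "jmp":
                ip = arg  # follow the jump; the jump itself is dropped
                continue
            out.append((op, arg))
            if op == "ret":
                break  # path ends; resume from the future label table
            if op in ("jcc", "call"):
                future.append(arg)  # keep the branch, visit its target later
            ip += 1

    # Rewrite branch targets from old addresses to new positions.
    fixed = [(op, labels[arg]) if op in BRANCH_OPS else (op, arg)
             for op, arg in out]
    return fixed, labels


shuffled = {
    0: ("op", "xxx1"), 1: ("jmp", 5),
    2: ("op", "yyy"),                  # unreachable filler
    3: ("op", "xxx3"), 4: ("ret", None),
    5: ("op", "xxx2"), 6: ("jmp", 3),
    7: ("op", "yyy"),                  # unreachable filler
}
straight, _ = depermute(shuffled, entry=0)
print(straight)
# [('op', 'xxx1'), ('op', 'xxx2'), ('op', 'xxx3'), ('ret', None)]
```

Because only reachable addresses are ever visited, unreachable filler falls out of the result without any special handling.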
And here's a little example. The XXXs are the code that is meaningful and the YYYs are the ones that are unreachable. So this is another thing that your de-permutator will help you do: it allows you to put in worthless code that will never be reached, so it's just in there and makes it look different. We see this beforehand: we've got the conditional jump, we've got some of the other jumps. After it runs, we've got this: no more of the Ys, our useless instructions. Once you get to a jump, it links you straight to where that code was, and it goes through there until you've hit it all.

Then we get to the shrinker, which is extremely rare to find and will reappear in our AV section. This is basically, like I was saying, the inverse of the expander. I haven't run across it, so that's why I don't have any examples of it anywhere. We have the pseudocode for it right here, which is basically looking through to see if we can find code clusters and switching them back to their simplest form. I wrote it as if there is a list of these conversions, but many times it's just a whole function that does one thing.

Then we get to the expander, which is the inverse of the shrinker. That's usually a standalone thing, and usually those don't use a list to see what the conversions are. Mainly they're just functions already written out; the Win9x Ramones virus will have some of these, so you can take it as a reference point to see how they're done.

And here's the pseudocode for that. We have "is expandable." I forgot to switch it to an actual number rather than a Boolean, but basically the expandable question is: how can we expand it? Three opcodes, two opcodes, or just switch opcodes. It goes through and says, okay, if we got this random number and we've got this option, we're going to do it.
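The list-of-conversions form of the shrinker mentioned above is essentially a peephole pass, the same kind of normalization the AV section later proposes. Here is a minimal Python sketch of that idea; the tuple instruction format, the `CONVERSIONS` list, and the function names are all hypothetical, and only the push/pop-to-mov rule comes from the talk.

```python
def push_pop_to_mov(first, second):
    """push X ; pop Y  ->  mov Y, X (the example given in the talk)."""
    if first[0] == "push" and second[0] == "pop":
        return ("mov", second[1], first[1])
    return None


# The "list of conversions"; each rule returns a replacement or None.
CONVERSIONS = [push_pop_to_mov]


def shrink(instrs, labelled=frozenset()):
    """Collapse two-instruction clusters using CONVERSIONS.

    instrs:   list of tuples such as ("push", "ebx")
    labelled: indices that are jump targets; a cluster is left alone
              when a jump lands on its second instruction
    """
    out = []
    i = 0
    while i < len(instrs):
        merged = None
        if i + 1 < len(instrs) and (i + 1) not in labelled:
            for rule in CONVERSIONS:
                merged = rule(instrs[i], instrs[i + 1])
                if merged:
                    break
        if merged:
            out.append(merged)
            i += 2  # both instructions of the cluster are consumed
        else:
            out.append(instrs[i])
            i += 1
    return out


program = [("push", "ebx"), ("pop", "ecx"), ("add", "eax", "1")]
print(shrink(program))
# [('mov', 'ecx', 'ebx'), ('add', 'eax', '1')]
print(shrink(program, labelled={1}))
# [('push', 'ebx'), ('pop', 'ecx'), ('add', 'eax', '1')]
```

A table of rules keeps each conversion small and testable; the "one function that does one thing" style the talk mentions would fold the rule and the loop together.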
That's basically that pseudocode, and this is just telling you how you can make an "is expandable" type of thing. If it shows zero, there's no way to expand it. Seven, you can expand it all three different ways, and up and down the line with the various combinations.

So we get to the permutator, which you can see in the ghost virus C perm as well as in a lot of other Z0mbie viruses. Here's the pseudocode for that. This one is when it's actually shuffling beforehand, so it just puts the code groups together and puts a jump in after each code group. Then we come to this one, which goes to each jump and inserts the proper address to jump to the next code group in the sequence.

Oh, I also forgot to mention that I changed my presentation a little bit, so it's different from the one on the CD. If you're following along on a computer, just so you don't get lost: I've made a lot of changes.

In the assembler section, the main thing you have to worry about, rather than converting everything back, is the jump relocation fixes, so that your old jump goes to the proper target once it's reassembled in the new program. Here I usually initialize a couple of tables. One has entries that are eight bytes long: four bytes for where the new instruction pointer is, four bytes for where the old instruction was located. Then we have a jump table, which basically lists the offsets of our jumps and the instructions they jump to. Those are the old targets, so you can check: these old addresses match, now we should change the jump to this new one.
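The two-table fix-up described above is the same bookkeeping any assembler or linker does when code moves. A minimal Python sketch follows, assuming a hypothetical layout: `eip_table` holds `(new_address, old_address)` pairs and `jump_table` holds `(jump_address, jump_length, old_target)` entries. The talk describes a nested loop that compares old addresses; this sketch swaps in a dictionary lookup for the same match.

```python
def fix_jumps(eip_table, jump_table):
    """Retarget jumps after instructions have moved.

    eip_table:  list of (new_addr, old_addr) pairs, one per instruction
    jump_table: list of (jump_addr, jump_length, old_target), where
                jump_addr is the jump's new location
    Returns a list of (jump_addr, new_target, displacement).
    """
    old_to_new = {old: new for new, old in eip_table}
    fixed = []
    for jump_addr, length, old_target in jump_table:
        new_target = old_to_new[old_target]  # match on the old address
        # x86 relative jumps are encoded relative to the next instruction.
        displacement = new_target - (jump_addr + length)
        fixed.append((jump_addr, new_target, displacement))
    return fixed


eip_table = [(0x1000, 0x400), (0x1005, 0x40A), (0x1010, 0x420)]
jump_table = [(0x1005, 5, 0x420)]  # a 5-byte near jump that used to target 0x420
print(fix_jumps(eip_table, jump_table))
# [(4101, 4112, 6)]
```

Here the jump at new address 0x1005 now targets 0x1010, so its displacement is 0x1010 - (0x1005 + 5) = 6.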
That's basically what this for loop with the if in it does. While there are still jumps to be processed, it goes through the jump table until it hits the end, checks whether the old address in the jump table is the same as the old address in the EIP table, and assigns the new one if the old ones match.

Other ideas have come up about this. One of them is an extremely old idea that's been done a lot; I can't think of any of the DOS viruses by name, but there have been ones that have done it, and it's register change. One generation you have EAX doing all these operations for you, the next generation you may have EBX doing it, and if you're doing a signature scan, that changes the opcodes.

Then we have entry point obscuring techniques, which have been fairly popular. Basically that is when you put a jump or a call at the host's entrance in the file, and it jumps straight to the virus code, runs it, and jumps back. And then there is one that's only been seen in the Zmist virus, which is unknown entry point. What that does is inject the virus straight into the stream of code, which can be done with your reassembler after disassembling both the host and the previous generation of the virus.
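To see from the scanner's side why a register change defeats a byte signature, here is a minimal Python sketch of a naive signature scan. The `scan` function and the signature name are hypothetical; the byte strings are just the standard x86 encodings of two harmless instructions, first using EAX and then using EBX.

```python
def scan(data: bytes, signatures: dict) -> list:
    """Naive signature scan: report every signature found as a substring."""
    return [name for name, sig in signatures.items() if sig in data]


# mov eax, 1 ; add eax, 2
gen1 = bytes.fromhex("b801000000" "83c002")
# mov ebx, 1 ; add ebx, 2  (same logic, different register)
gen2 = bytes.fromhex("bb01000000" "83c302")

signatures = {"demo-sample": gen1}  # signature taken from the first variant

print(scan(b"\x90" + gen1 + b"\x90", signatures))  # ['demo-sample']
print(scan(b"\x90" + gen2 + b"\x90", signatures))  # []
```

The two variants do the same work, but the register number is folded into the opcode and ModRM bytes, so a signature taken from one no longer matches the other.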
And then we have other modules that you see a lot, like trash code generation. That was big during the email-spreading viruses and worms. That one could be added, but there's a drawback: your code will keep on growing with every generation, which is what always happens with a garbage code generator. But this can also be used to change signatures, and we'll see why this might become important with a new type of AV technique that they're thinking of making.

Then we have encryption, be it polymorphic or just regular encryption. You can include these too, but you have to be careful how you do it, because sometimes a flag will be thrown up for using a certain technique to polymorphically encrypt and decrypt your virus. A properly done one would be Zmist, which is a great example of that too, and one of the greater viruses out there. When it uses its polymorphic engine, it acts exactly like a file that decompresses at runtime.

So here comes the start of our AV section. I just wanted to put up this quote from an article about Zmist that says metamorphism is something that's going to be pretty crazy and we need to find something to do about it. And since Zmist and the unexpected entry point were written by Z0mbie, I found an article by Z0mbie that shows what the implications of doing this would be. You have your variables, and this is basically saying that if you're doing a signature scan, you have to run through the code until you know you're definitely on the virus, and you don't know how long that will take.

I've been mentioning signature scans, so in case people don't know, I had a little slide to tell you about them. It's basically the oldest technique for finding viruses: you check a string of bytes to see if it matches a signature in a database, and if those match, then most likely you have a virus. And there are many other things,
say, like, Zmist is my major example, because that's the one that uses the most metamorphic techniques to date that I've seen. In his, he puts a Z in the DOS header, and that is his mark for "I've infected that file." You need these marks so you don't constantly infect the same file. But it's still a constant struggle between: will the virus recognize it, will the AVs recognize it? This one's somewhat ambiguous, because a Z is allowed in that portion of the header. So sometimes the virus may not infect every file it can, skipping ones that are not actually infected, and at the same time the AV will look at it and say, we're really not sure; this could be a legitimate Z or it could be a Z placed by the virus.

Next we come to geometric scanning, which is one of the downfalls of polymorphism. But we see that Zmist was able to beat it by acting like a runtime self-decrypting or self-decompressing file. Basically what a geometric scan does is check the size changes in the file, so it can say, all right, this is acting like this, because the initialized data section has grown this much.

Then we have the possible answers, because we saw that Zmist had beaten the previous ones. One of them is to combine various techniques and use kind of a heuristic idea. But this fails because it takes a lot of time, and nobody's going to run anything if it takes a lot of time. Also, there are always anti-emulation, anti-debugging, all those types of techniques coming out, so the AV techniques will fail if they're tried against viruses with these anti-star, whatever you want to call them, techniques included.

The second is a fairly new one. I was looking around the internet and saw that there's somebody who might be talking about this at the Virus Bulletin conference, and it's to use a shrinker, like we were talking about, one of
the parts of the virus, the shrinker and the de-permutator, so that you can have this virus de-permutated and shrunk to its skeleton. That's where the executable trash generator being left in there could help the virus out, because at its skeleton you'll still have opcodes that weren't there a generation beforehand.

And we see the flaws in the answers: they take too long, or you're not able to use some techniques if the virus includes various ways of programming. But the shrinker is the closest one we have, so that's what they have to go with.

I just wanted to include a section on why this might be stronger compared to other techniques. With trash generation, of course, you just have worthless code, and the virus grows loads and loads every time it passes a generation, so after a while a file will get big enough that they'll know something's happened. And in polymorphism, you can usually take a static picture of the body, or at least portions of the body, at times throughout the running of the virus. Whether it's decrypted now or encrypted now, it always has to decrypt itself, so you have the same code once it's decrypted.

I'm sorry, I'm kind of new to speaking to large crowds, so I might have flown through that too fast, but I just wanted to open the floor to questions and answers. Yes?

Would most of these viruses be small enough like another program apart?

Are we talking about with the polymorphism? It's possible that this could happen, but I haven't really looked too much into polymorphism, to tell you the truth. I just looked at a few techniques and saw what they did, but I haven't really come across any software for doing the decryption methods, like brute forcing and detection. I'm sorry, yes?

The standard part which is not with the same... Can it be used to detect that is your virus, your virus?

Which section are we talking about?
I mean the part that files the process of decrypting the virus during the patient, that's the kind of function of your virus.

I'm sorry, I can't really hear you too well up here, so I kind of miss pieces, and I think they're important.

It's common to, it's not encrypted, it's not polymorphism, which is the start of the virus. Is that the signatory that can be easily recognized from the virus software to the other one, start?

I will, I'm going to... With the eyes, with the eyes, we have to be eyes. It's on. Yeah.

Well, this question was, if I understood it correctly: is there a static part in metamorphic viruses that can always be recognized? The thing is, since it can always be randomly expanded or permutated, you could step through it, but of course you'll still have different opcodes because of the expander or the register exchange type of thing. So that's where the strength is with this: you can change the opcodes, and signatures are mainly strings of opcodes.

I'd recognize changes...

I haven't looked too much into that, but that sounds like an interesting idea that could be tried, because mainly what I've heard is the shrinker idea, which will give you the code skeleton.

This shows how ridiculous viruses have gotten. Say you have not even all of the major six I was talking about, but maybe four of them, like this Zmist virus: I remember that being, I want to say, around 70, 80 pages or so, and some of it's also done in C, so that makes it even bigger. I'm sorry? Oh no, no, good.

Well, the thing is, with these viruses the expander is basically an unoptimizer, which is really funny, because not too long ago everything had to be optimized for a virus, and nowadays we're writing things to unoptimize them afterwards. It shows that you really don't need optimization, because we have so much room to deal with.
The techniques... there's no specific technique. Most of them just look through and compare opcodes, because most of them are functions now, and they'll say, is this the opcode? If yes, change it to this. Yeah, well, the compression is basically, for an easy example, push ebx, pop ecx becomes mov ecx, ebx. Basically that's it. Yes, sir?

There's 10. There's 10? There's 10. Yeah, I haven't opened up the CD, so... Wow. Are there any other... yes, sir? I'm sorry, sir, can you stand up? What was it, behavior blockers?

Well, the thing with behavior scanners is, if the behavior is found, of course, once you see what it will do you'll know what's happening. But a lot of times that takes a lot of time, and then for emulation, there are ways that people have found to block emulation. So it would be more like an integrity scanner: basically you're already infected, and that's why it knows you've been infected.

Anti-emulation... I don't have my notes on it right now, but there are a few ways. If I remember correctly, it's placement of variables below the call, and it's one of the hard-coded ones. I think it's where you usually put the address for your call or jump, and you put in the instruction to do like dollar minus four, so it throws it back up into the instruction. If I remember correctly, that was one of them, but it's been a while since I read up much on that.

A good reference, let's see... I'd basically say just reading through code is the best reference. Usually you can find some comment where they say this is this technique being done, so that'd be the best start and the best reference. Currently I know some places where you can get
viruses that have the comments in them and so forth that use anti-emulation. They're still fairly new. Usually you can go to the same websites I got some of my data from for metamorphism, and they'll talk about that as well. So see Z0mbie's website, which is easy your own and the idea dot host dot sk; if you go to some of his links, you'll probably find some on his page and some on his link pages that you can look at.

Any more questions? Thank you for coming out. Also, I'd like to thank my school, because they're most likely going to help me pay for some of the trip over here, so I wanted to thank this university. Thank you.