Okay, I'm going to introduce the topic here: metamorphic software, what it's good for, potential good uses and potential evil uses. I'll also say a little about worms, viruses, and malware in general, a very quick overview of the history. Then Wing is going to take over and talk about her research work, which I can say was very impressive because I didn't do it. She looked at some specific metamorphic engines you can find online. There are hundreds of these out there; she looked at four that seemed to be the best according to a number of sources, and she tried to answer two questions about these engines. First, how good are they, in the sense of how well they morph the code: how metamorphic are the virus variants they create? She did a lot of work on that, and she'll present some of her results. But there's a separate, yet related, question: how hard are these things to detect? Intuitively, you would expect that the more metamorphic these viruses and worms are, the harder they are to detect. But I think the conclusion is a little surprising, in that there's at least a little more to it than that. And then some conclusions.

Okay, metamorphic software. Our definition is that software is metamorphic if all copies of the software do the same thing, the functionality is the same, but the internal structure is different. So somehow you've gone in and changed the internal structure of the code, but you've been careful to do it in a way that does not change the function of the code at all. Now, today, this is not the way software is produced; you don't get metamorphic software. There are a lot of reasons for that: it's potentially harder to develop, harder to maintain, and so on. But it's not impossible. Actually, at my failed startup company we did some of this, so it is possible; you can do this sort of thing. Today, though, the software you get is cloned: you create one instance of the software and give everybody the same thing, so everybody has a clone of this one piece of software. That has a lot of implications for security. You can think of potentially good uses for metamorphism and potentially evil uses. To give one example of each, try this thought experiment. You write a program and it has a buffer overflow in it. You didn't intentionally put it there; it's just an accident. Maybe that's exploitable, so someone can write an attack against your code. If you clone your original software and send the exact copy to everybody, then when somebody creates an attack, that same attack obviously works on every clone copy; they're identical. But if you use metamorphic software, you potentially get some advantage there, and I'll talk a little more about that on the next slide. So that's a potential benefit, a good use for metamorphism.
The potential bad use, which Wing is going to talk about, and this is the main point of the talk, is using metamorphism to try to avoid signature detection for worms and viruses. So let's look at this supposedly good use of metamorphism a little closer. Again, you write a program and it has a buffer overflow in it. If you clone it and send it to everybody, the same attack works everywhere: one attack breaks every copy. This is sometimes called the break-once, break-everywhere phenomenon. You create one attack and it works universally. On the other hand, suppose you created metamorphic copies of the software. Every copy is functionally the same, but the internal structure is a little different. Every copy still has the buffer overflow in it; you didn't get rid of it just by creating metamorphic copies. So if I get hold of one copy, I can create an attack that will probably work against that particular copy. But what happens if I try the same attack on all the other metamorphic copies out there? Chances are it's not going to work. If you know how buffer overflow attacks work, it's a pretty delicate affair: things have to align in just the right way for the attack to actually work. In fact, I had a student do a project on this; he implemented it, got the numbers, and measured how often a buffer overflow attack works on the metamorphic copies. It almost never works, one percent or something like that. So the point here is that just a little bit of metamorphism, nothing really clever, does a lot of good in this buffer overflow example. It's comparable to genetic diversity in biological systems. If a disease comes along, even smallpox doesn't kill everybody; it kills 40% of the population, because genetic diversity means it doesn't affect everybody the same way. You're trying to get that same effect here to prevent these buffer overflow attacks. On the other hand, if you want to use this kind of metamorphism to make viruses and worms that are hard to detect, the point is that you're trying to avoid having a common signature that can be discovered. If you have enough metamorphism in your virus or worm, presumably you could avoid this kind of detection. In fact, some people talk about undetectable metamorphic viruses. Is that possible? It seems conceivable. Every copy has a different signature, and again, you're trying to get this genetic diversity into your population of viruses, so the defender trying to discover them has a harder time picking them out, because they have different signatures. But the conclusion I think you'll take away from Wing's part of the talk is that this is trickier than it sounds. You really need a lot of metamorphism, and you have to use it in the right sort of way. That's in contrast to the buffer overflow case, so keep that in the back of your mind: there, you need just a little bit of metamorphism and you get a lot of bang for the buck, but if you're trying to use metamorphism to hide a virus, it's tricky.
And that's the main theme of the talk. Okay, just a few quick words about the evolution of viruses, and then I'll turn it over to Wing and she'll get to the real substance of the talk. Viruses and worms are not new; they've been around since the 1980s, which is ancient history in the computing world. Once they got started, the game quickly became this: the defenders try to discover a signature, some common feature of the virus or worm that they can latch on to and use to identify it. The writers of these viruses and worms realized that too, so they try to hide the signature. What can they do to hide it? Three techniques have been used extensively: encryption, polymorphism, and metamorphism. Now, these terms are not used consistently in the literature, so here's what we mean by them. If you encrypt a virus, that's certainly a way to hide the signature: use a different key each time, the body is just a different random string of bits, and there's no common signature. The problem, of course, is that you have to decrypt the virus before it can actually do anything. So there's a little piece of decryptor code, and that code can be detected by signature means. You could then encrypt the decryptor, but then you need a decryptor for the decryptor, and you get an infinite regress. So you have this basic problem: you need some piece of code that's not encrypted, and that can potentially be detected. Polymorphism, by our definition, takes that to the next level: you encrypt the code, and then you take the little piece of decryptor code and obfuscate it, to make it harder to detect. That sort of thing has been done; there have been viruses of this sort. They can still be detected, using emulation techniques and such, but it certainly ups the pain; it's harder to detect these things. Finally, the ultimate here would be metamorphism. Can you create metamorphic worms and viruses, so that every time the worm mutates, it changes its internal structure in some meaningful way? If you could do this, potentially you would have viruses and worms that are not detectable; they would have no common signature at all. In a sense, you're trying to change the shape of the virus or worm each time it mutates. What sort of techniques would you use? You can think of lots of different things; I just list a couple here that are commonly used (a toy illustration appears right after this paragraph). So that's just a hand-waving introduction. Now I want to turn it over to Wing, and she's going to talk about these various virus construction kits: how metamorphic they are, and how you can go about detecting the viruses they produce.

Yeah, so we see that a lot of viruses are written in assembly code, and writing sophisticated, well-functioning assembly code is not easy. But luckily, if we go online, we can easily find a lot of virus construction kits, some with very sophisticated, fancy, interactive interfaces that let you just click, click, click and generate a lot of viruses.
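To make those morphing techniques concrete, here is a toy sketch in Python of two elementary transformations mentioned in this talk (and which, per the Q&A later, are among the techniques NGVCK uses): garbage/NOP insertion and equivalent-instruction substitution. The instruction list and the equivalence table are invented for illustration; a real engine works on actual assembly, not strings.

```python
import random

# Hypothetical table of semantically equivalent x86 rewrites.
EQUIVALENTS = {
    "mov eax, 0": ["xor eax, eax", "sub eax, eax"],
    "add eax, 1": ["inc eax"],
}
GARBAGE = ["nop", "xchg eax, eax"]  # instructions with no net effect

def morph(code, seed):
    rng = random.Random(seed)  # each seed yields a distinct variant
    out = []
    for insn in code:
        # Randomly substitute an equivalent form of the instruction.
        out.append(rng.choice([insn] + EQUIVALENTS.get(insn, [])))
        # Randomly insert do-nothing garbage after it.
        if rng.random() < 0.3:
            out.append(rng.choice(GARBAGE))
    return out

payload = ["mov eax, 0", "add eax, 1", "ret"]
print(morph(payload, seed=1))  # every seed gives the same behavior,
print(morph(payload, seed=2))  # but a different opcode sequence
```

Every seed produces a functionally identical variant with a different opcode sequence, which is exactly the property the similarity metric described next is designed to measure.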
Some of these kits claim to have very sophisticated features. For example, they claim to produce metamorphic viruses, meaning all the variants they generate look different, even when generated from an identical configuration or the same input. So we were interested in how effective these generators are at producing metamorphic code. We looked at the literature and found the ones that are highly discussed or well reviewed. The first one is PS-MPC, the Phalcon/Skism Mass-Produced Code generator (MPCGEN for short). According to a virus research book, it contains a code-morphing engine, and the viruses it produces have decryption routines and structures that change between variants. The next one, G2, we also downloaded from the web. Its documentation says that different viruses may be generated from identical configuration files, and we tested this: identical configuration files did generate different viruses. The best one we looked at was NGVCK, the Next Generation Virus Creation Kit. According to its documentation, all the viruses it creates are completely different in structure and opcodes, so that it is impossible to catch all variants with one or more scan strings; it claims 100% variability of the entire code of the viruses. So we were really interested in seeing how effective these engines are at producing different-looking code.

Are they effective? How do we measure that? First, let me go through the metric we use to compare two pieces of code, since we are comparing whether two virus variants look different. We take two assembly programs and extract all the opcodes, excluding blank lines, comments, and labels; we keep only the opcodes, and we also exclude NOPs. That gives us two sequences of opcodes: one of length n, numbered 0 to n-1, and a second of length m, numbered 0 to m-1. For the comparison, we compare three opcodes at a time, doing an all-against-all comparison. For programs X and Y, if the three opcodes are the same in any order, we count it as a match, and we plot the point (x, y), where x is the opcode number in program X and y is the opcode number in program Y. That generates a graph of all the three-opcode matches. Since there can be random matches and noise, we keep in the graph only match lines that consist of five points or more; five is our cutoff, and we call those the real match lines. That gives a second graph consisting of all the real matches. After that, we walk along the x-axis and the y-axis, and for each program we count the number of opcodes covered by one or more of these real match lines, that is, the number of opcodes that are identical to opcodes in the other program, and we compute that percentage for each program. The similarity score for the two programs is the average of these two percentages; a minimal sketch of this computation appears below. If two programs are identical, we expect a score of one, and the less similar two programs are, the closer the score is to zero. Also, because we are doing sequential matching, identical segments of opcodes form straight lines parallel to the diagonal, and if they appear at identical positions in the two programs, these lines fall right on the diagonal.
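Here is a minimal sketch of that similarity score in Python, under stated assumptions: programs are already reduced to opcode mnemonic lists (the extraction step is not shown), matches are length-3 subsequences compared as unordered multisets, and only diagonal runs of five or more consecutive match points count as real match lines. This illustrates the metric as described in the talk, not the authors' actual implementation.

```python
def similarity(ops_x, ops_y, window=3, min_run=5):
    """Similarity score in [0, 1]: the average fraction of opcodes in
    each program covered by a "real match line" (a diagonal run of at
    least min_run consecutive window-sized matches)."""
    n, m = len(ops_x), len(ops_y)
    if n < window or m < window:
        return 0.0
    # All-against-all comparison of length-3 opcode subsequences,
    # matched in any order (i.e., as unordered multisets).
    rows, cols = n - window + 1, m - window + 1
    match = [[sorted(ops_x[i:i + window]) == sorted(ops_y[j:j + window])
              for j in range(cols)] for i in range(rows)]
    covered_x, covered_y = set(), set()
    for i in range(rows):
        for j in range(cols):
            # Only start counting at the head of a diagonal run.
            if match[i][j] and not (i > 0 and j > 0 and match[i - 1][j - 1]):
                k = 0
                while i + k < rows and j + k < cols and match[i + k][j + k]:
                    k += 1
                if k >= min_run:  # keep only the real match lines
                    covered_x.update(range(i, i + k + window - 1))
                    covered_y.update(range(j, j + k + window - 1))
    # Average the coverage fractions of the two programs.
    return (len(covered_x) / n + len(covered_y) / m) / 2
```

On identical inputs the single diagonal run covers every opcode and the score is 1.0; for unrelated programs almost nothing survives the five-point cutoff and the score is near zero.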
If there is some translation, meaning the matching code appears at different locations in the two programs, we still see these match lines, but they lie off the diagonal. All right. As I said, we tested four generators. We consider each generator a single family, meaning the viruses generated by one generator belong to one family. We generated different numbers of viruses from the four generators, and then we measured the similarity among pairs of viruses from the same generator; this is how we measure the degree of metamorphism produced by each generator. For comparison, we also used 20 utility programs from Cygwin. We chose programs of similar size to the virus executables, and we believe the viruses do the same kind of low-level activity as the Cygwin utilities; that's why we picked those programs for comparison.

Here, we are comparing similarity within a family, meaning we compare virus variants. All 190 pairwise comparisons among the NGVCK viruses have scores close to 10%: the scores range from 0.1% to 21%, and the average is 10%. A great contrast is the comparison between normal programs: their similarity ranges from 13% to 90%, with an average around 35%. So in terms of metamorphism, NGVCK did a very good job, because the virus variants it produces are very dissimilar according to our metric, while normal files are much more similar. How about the other generators? Here we use a bubble graph to represent the outcome. We plot the minimum similarity score obtained among all the comparisons along the x-axis and the maximum score along the y-axis, and the bubble size represents the average similarity score for that generator. So if a generator is effective, meaning the variants it generates really do look different, we expect a small bubble very close to the origin, because the minimum, maximum, and average should all be small for effective metamorphism. Again, the NGVCK generator outperformed the other three at generating different-looking viruses. And if we look at the normal programs, they actually have a lower minimum, lower maximum, and lower average than the other three generators. So G2, in this sense, may not be that effective, because its average similarity is higher and it also has a higher minimum similarity. If we look at some of the graphs generated during the pairwise comparisons, we see similar results. The first pair is a pair of NGVCK viruses whose similarity, about 12%, is close to the family average: the matches are very short and all well off the diagonal, meaning the matching code is at different locations. Compare that to G2, which we said is not as effective: in the graph produced for a G2 pair we see a long diagonal line, and that pair is 75% similar. VCL32 and MPCGEN also have a very low degree of metamorphism according to our measure, and their graphs show this too. So now we see that NGVCK viruses look very different from one another, so it seems they may be able to avoid signature-based detection because of their high degree of metamorphism. Beyond that, we were interested in how similar they are to other viruses and to normal programs.
Are they in any way similar to other viruses or to normal programs? We do the same kind of similarity comparison, using all the same viruses as before, but now doing cross-family comparisons. Very surprisingly, NGVCK viruses had 0% similarity to both the G2 and MPCGEN viruses, and they were only 0 to 5.5% similar to the VCL32 viruses. More surprising is the comparison to normal files: out of 400 pairs of comparisons, we found only eight pairs with any score at all, and those scores were as low as 0 to 1.2%. So what does this tell us? NGVCK has by far the highest degree of metamorphism of the kits we tested, and it has essentially no similarity to other viruses or to normal programs. Would that make it undetectable? That's our next question: can we detect them?

We tested three virus scanners, eTrust, avast, and AVG, scanning 37 files placed in one folder; these 37 files include viruses from all four families. Here's our result: eTrust and avast detected all the G2 and MPCGEN viruses, and AVG detected 27 files, namely the G2, MPCGEN, and VCL32 viruses. But none of the NGVCK viruses were detected by these commercial scanners. So what do we do? They seem to be quite successful at avoiding detection by these scanners, but we want to detect them. So we turned to hidden Markov models. Hidden Markov models have been used in speech recognition since the 1970s, and they have been applied in many other areas because of their ability to do pattern recognition. An HMM is a very powerful tool for statistical pattern analysis, and it has been used in biological sequence analysis as well. Our technique is actually inspired by protein modeling. In protein modeling, one model is used to model a whole protein family. What is a protein family? It is a set of proteins that have similar three-dimensional shape and similar or common chemical and physical properties. The thing about a family is that even though its members behave the same way, the underlying amino acid sequences may not be identical at all. Proteins are made up of amino acid sequences, and proteins belonging to the same family may have sequence similarity as low as 10% when we examine the amino acids. But given enough training data, we can train a hidden Markov model on one family, and afterwards the model can distinguish whether any new protein belongs to that family. Now look at our viruses: they have the same behavior, they do the same thing, but their structures are different. So there must be some statistical similarity among all these assembly programs that makes them behave the same way. We believe a hidden Markov model can capture this similarity, and we can use it the same way as in protein modeling: to detect whether a given program belongs to the same family as the training viruses. So this is what we do. We use a hidden Markov model to represent the statistical properties of a set of metamorphic viruses: we train the model on one family of viruses and then use the trained model to determine whether a given program is similar to the training viruses. The good thing about a hidden Markov model is that a trained model maximizes the probability of the training sequences, and it also assigns high probabilities to sequences similar to the training sequences.
One model represents the average behavior of multiple sequences, meaning we don't need many models to represent one family: we need just one model per family, and we can later determine whether any given program is similar to that family. So this is what we did. We generated 200 NGVCK viruses. To train a model, we used 160 of them, and we kept 40, not used in training at all, to test whether the trained model classifies them correctly. We also wanted to know whether the model can distinguish a normal program from the NGVCK viruses, so we used 40 other executables from Cygwin, and we also tested 25 non-family viruses, which are the viruses we used in the similarity test. In our tests, we generated 25 models using different numbers of hidden states, ranging from 2 to 6, and we chose different sets of 160 training sequences to generate the different models and ran the same tests on each. This slide is a diagrammatic view of the whole process (a minimal sketch of the train-score-threshold pipeline appears below). From all this, we expect that after training, the model assigns a high probability to the 40 test files compared to normal programs and to viruses not in the family. From that, we should be able to determine a threshold, and later on, given any program A, we can score it with the hidden Markov model, compare the score to the predetermined threshold, and classify whether the new program belongs to the family.

This is a typical result across the 25 models. As you can see, the model does assign high values to family viruses, and the probabilities assigned to normal files are much lower. So if we can select a threshold that cleanly separates them, we achieve a 100% detection rate when distinguishing a normal file from an NGVCK virus. If we include the other viruses in the test, we see that some non-family viruses score very close to the family NGVCK viruses. We took a look at those: they are the VCL32 viruses, which already showed some similarity in the similarity test. So you could say the hidden Markov model detects some of these other viruses for free. The bottom line is that we never misclassify a normal program; we would never say a normal program is an NGVCK virus. But some viruses that have similarity to the NGVCK programs do show a higher score, which is exactly what a hidden Markov model should do. So, to summarize the experimental results: all normal programs can be distinguished by the score alone, and the VCL32 viruses score close to the NGVCK viruses. With the threshold set properly, 17 of our models had a 100% detection rate, and 10 models had a 0% false positive rate. As I mentioned, we tried different numbers of hidden states; for three or more hidden states, we didn't find any significant difference in performance. Finally, with a hidden Markov model we can look at the converged final matrices to get some insight into the features a virus family exhibits. We trained our models on the NGVCK viruses, so if the models are good enough, we can look at their probability distributions and determine what kind of features these viruses have.
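As a rough illustration of that pipeline, here is a sketch in Python, assuming the third-party hmmlearn package for the Baum-Welch training and forward-algorithm scoring. The file names, the one-opcode-per-line input format, and the threshold value are placeholder assumptions; scores are normalized per opcode so files of different lengths are comparable. This is a sketch of the general approach, not the authors' actual code.

```python
from pathlib import Path

import numpy as np
from hmmlearn import hmm  # assumed dependency: pip install hmmlearn

def load_opcodes(path):
    # Placeholder input format: one opcode mnemonic per line.
    return [ln.strip() for ln in Path(path).read_text().splitlines() if ln.strip()]

# 160 training files from one family (hypothetical file names).
train_seqs = [load_opcodes(f"ngvck_{i:03d}.ops") for i in range(160)]

# Map each distinct opcode seen in training to an integer symbol.
alphabet = {op: k for k, op in enumerate(sorted({op for s in train_seqs for op in s}))}

def encode(seq):
    # Opcodes unseen in training are dropped; a real detector needs a policy for them.
    return np.array([[alphabet[op]] for op in seq if op in alphabet])

# Train one model for the whole family (the talk tried 2 to 6 hidden states).
encoded = [encode(s) for s in train_seqs]
model = hmm.CategoricalHMM(n_components=3, n_iter=100, random_state=0)
model.fit(np.concatenate(encoded), [len(e) for e in encoded])

def score(path):
    # Log-likelihood per opcode, so long and short programs are comparable.
    x = encode(load_opcodes(path))
    return model.score(x) / max(len(x), 1)

# Threshold chosen from the score gap between held-out family viruses and
# normal (Cygwin) programs; the value here is purely illustrative.
THRESHOLD = -4.0
print("family virus?", score("suspect.ops") > THRESHOLD)
```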
Looking at those converged matrices, we observe that the opcodes are effectively divided among the hidden states, meaning most opcodes appear in only one state. So if any of you are good at assembly programming, you can look at the assembly code alongside this classification and try to understand what kind of features this virus family has. So we have seen that even though the NGVCK viruses were not detected at all by the three scanners we tested, we can use a hidden Markov model to classify them quite successfully.

Besides that, because these viruses are so different from normal programs, we tried a completely different approach: a straightforward similarity index to determine whether a program belongs to the NGVCK family. The method works this way (a sketch appears below). If you want to see whether program A is an NGVCK mutated virus, you pair it with any NGVCK virus and compute the similarity score. Most of the time you would expect to find zero similarity, because NGVCK variants are so different from everything, and most of the time we do find zero. If program A has zero similarity, we immediately conclude it is not an NGVCK virus. But if program A has some similarity to the chosen NGVCK virus, what do we do? We take that NGVCK virus and compare it to 20 other NGVCK viruses to get its average similarity to the rest of its own family. If the similarity within the family is lower than the similarity we just obtained between program A and that virus, we say that program A is most likely an NGVCK virus. So this is how we use a similarity index to classify a program, and we actually tested it. We compared 105 programs, comprising normal programs, NGVCK viruses, and other viruses, using the method I just described, based only on the similarity index. We classified all 105 programs correctly as virus or not, with no false positives. We also tried this with different choices of the initial virus, and the result does not depend on the specific virus selected in the first place.

So what does this tell us? First, from the comparison tests, we know that metamorphic generators vary a lot. Even though we can find many on the internet, they are not all effective at generating metamorphic code. NGVCK is the most effective, because the viruses it generates are only about 10% similar; the other generators are less effective, with variants 60% or even 70% similar, while normal files are only about 35% similar. So we know how to rate them. But NGVCK viruses can also be detected, precisely because they are too different from other viruses and from normal programs. Even though NGVCK viruses are not detected by the commercial scanners, they are detected by our hidden Markov model approach with very high accuracy, and also by our straightforward similarity index approach. So all the metamorphic viruses, across all four families, were detectable: either because they have high similarity within the family, meaning a low degree of metamorphism, or because they are too different from normal programs. We conclude that the effective use of metamorphism by a virus or worm requires both a high degree of metamorphism and some similarity to normal programs, and this is not trivial; it is not as easy as it seems.
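Here is a sketch of that similarity-index decision rule in Python, reusing the similarity() function from the earlier sketch; the 20-virus reference set follows the description above, and the argument names are illustrative.

```python
def is_ngvck(candidate, reference, family):
    """Decide whether `candidate` is an NGVCK variant by comparing it
    against one chosen family virus (`reference`) and that virus's own
    average similarity to 20 other family members (`family`).
    All arguments are opcode lists; similarity() is defined above."""
    s = similarity(candidate, reference)
    if s == 0.0:
        return False  # zero similarity: immediately ruled out
    family_avg = sum(similarity(reference, v) for v in family) / len(family)
    # Family pairs average only ~10% similarity, so a candidate at least
    # that similar to the reference is most likely a family member.
    return s >= family_avg
```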
I think that explains why we don't see many metamorphic virus outbreaks: they are actually not as easy as most people believe. We have seen that metamorphism for good purposes is easy; as Dr. Stamp said, for the buffer overflow problem, just a little bit of metamorphism does a lot of good. But for evil purposes, say in a virus, it is possible to get a high degree of metamorphism, but being similar to normal programs at the same time is not so easy. I think that's the major point of our result. So if you have any questions, please go to the microphone there.

I have two questions, I guess. First one: your training set was way bigger than your testing set. Often people do a 50-50 split. Why did you do that? What was your thinking?

This is the classical five-fold cross-validation approach. We divide the data set into five equal-sized subsets, train on four subsets, and test on one. It's a classical approach; we do it the same way in protein modeling.

Okay, and then, as far as getting a lot of metamorphism while still having similarity to other programs: you might be able to do something like having tropes, common sequences of opcodes that perform some higher-level functionality, having a family of those and picking one. Alvar is probably going to say that's one of the problems he's going to talk about. Anyway, did you give any thought to that? I mean, you don't work for NGVCK, so maybe that's not your problem, but...

Yeah, that's a good point, and certainly a possibility. Her research was somewhat limited in scope, and this is what she did. I actually have two more students looking at other aspects of this problem, not specifically what you mentioned, but that would certainly be a good project as well.

I have a question concerning how this actually well-working metamorphic engine operates: in what manner does it obfuscate the control flow of the executable? Does it just replace one instruction with a set of other instructions, or does it introduce false conditionals which would be non-trivial to resolve?

I checked the NGVCK documentation. The major techniques it uses are the three we showed: it does a lot of garbage and NOP insertion, it does opcode replacement, and it replaces whole sequences of opcodes with equivalent sequences that perform the same function. Those three are the major techniques.

Would it be possible to get the samples that you used for your research?

Yes.

Cool, thanks. My question is: were all of these viruses from the toolkit identical in their functionality? Is the family you're detecting NGVCK viruses in general, or a specific set of variants that do a specific thing from a particular creation?

When we generated them, we used identical settings. It's an interactive interface; you just check the options you want. So we used the same identical options to generate the 200 different viruses, and they are supposed to have the same behavior.

My question kind of piggybacks on his.
When you were looking at the similarities of normal files, did you think about using normal files as your seed, training your Markov model on them, and then seeing where regular viruses fall relative to that? Would that be a valid means of detecting viruses across the board, rather than just one specific virus generator?

Yeah, I think that's a good idea, actually. We hadn't thought of that. So your idea is to train on normal code and then look for anything that appears abnormal; you would need just one model to test against. Seems reasonable; that would be a good project. Actually, I've seen a paper that works this way: they collect normal files and train models to represent normal behavior; then they match any given program against the normal hidden Markov models to identify it. They were successful, but they still needed a lot of models, because behavior varies; they used different models for different kinds of behavior. So there's no free lunch here.

I understand the need for smaller control sets, but your antivirus collection seems kind of small for your claims. You mentioned three, but you also stated that no commercial products detected it. You didn't test all commercial products; your claim was that none do.

So the claim is that none of the ones we tested did.

None of those you tested, but that was only three; it seems like an awfully small subset.

Right, right. Those are the three we had handy at the time, but they are pretty well-regarded commercial scanners.

They are, all three, but there are a lot of others out there too. I'd like to know how Kaspersky did on it. Also, have you tried infecting normal files and testing them with your HMM model?

No, we haven't tried that. That's a good point. That would certainly be more difficult, right? Because you would have to zero in on the part of the code that you wanted to test against, which would up the work somewhat.

It looks like you are doing something like fingerprinting the virus, but did you try to fingerprint the different compilers used to generate the normal software on the operating system? There are different C compilers in use. I think you could use your technique to detect which compiler was used for each of the different files, and then say: if it was an evil compiler, it's a virus; if it was something normal, it's good stuff. As long as the virus program does not use GCC or some other standard compiler, you could at least say this is foreign.

I think that's a reasonable point, but all the viruses we looked at were compiled with the same compiler, so we weren't really looking at that situation.

I was going to say something very similar. How would you describe your hidden Markov models? How did you come up with them? How many states, and what size alphabets in those states? What is it you're detecting? Maybe, in fact, you are detecting the fact that it's just not standard compiled code.

Well, that's a possibility. She can go into the details about the number of states and what the states were, but we should probably take that offline.

Actually, is this going to be published?

Yes. All the details will be in the thesis, and there will be a paper as well.
My question is: if I made a virus and applied NGVCK to it, but I wrote my virus in a non-standard way that other virus writers and other programs don't usually use, would your model still work?

We haven't tried that. It's a good question, but I would say probably, and the more non-standard it is, the better for our detection. So the conclusion, in a sense, is that if you want to evade this, you have to make your code look like normal code.

Okay, but it wouldn't look like an NGVCK virus when you compare it to other NGVCK viruses.

Oh, I see. So yes, we might have to train specifically on that case. That's a good point.

My question is: suppose you took a real application with real functionality and built a metamorphic virus into it, so that the application itself actually changed. You would still have the similarity profile of a normal application, while having, tucked away in some tiny corner, or maybe spread out through the entire code base, a polymorphic engine. That seems like it would present real problems, and not just for this; there's a lot of talk about pattern matching going on at DEF CON, and that seems to be the real problem: how do you deal with something like that?

Yeah, I think that's a good point too. You could certainly mask it, hide the part we're grabbing hold of inside a larger mass of code. That would be difficult to deal with; I would agree.

I have two questions. First, have you considered updating your presentation with some newer construction kits? Some of the kits you mentioned go back to the early or mid 90s; I haven't seen those in almost ten years. Second, in your tests, when you created the different generations with these construction kits, did you actually test each of the executables afterwards to see whether they still work? I remember that after a couple of generations the files don't work anymore, so for a lot of vendors there's no point in detecting them; they're already corrupted, damaged files.

To the first point, I'd be interested to know which generators out there are considered good today. We just looked through the literature, and these are what we came up with as looking to be the best. On the second point, I don't know. Maybe not.

Okay, thank you. How closely do antivirus and trojan researchers work with industry? How closely do academics like yourself work with Symantec or McAfee, and how do you share information?

We don't; we haven't worked with them at all. She's looking for a job, if Symantec's hiring. Generally, AV folks don't share much information, because they have this cartel where they say: we'll take your samples, but we won't give you ours for experimentation. So they're not that friendly in that way.

Okay, if there are no more questions, thank you.