Welcome, everybody, to day 7. This session is dedicated to two things, malware and intrusion detection, together with two demos. This first part is a short session on malware, and we start with the basic definitions: worm versus virus versus bot versus Trojan, and so on.
Worms and viruses both refer to malware that replicates itself. What separates a worm or a virus on the one hand from a Trojan on the other is that worms and viruses have code inside them to replicate. The difference between a worm and a virus is that a virus latches itself onto an executable file or program. It cannot have its own independent existence, only within the context of some host program, so it is like a parasite, while a worm is typically a standalone program. It is like the difference between a virus and a bacterium in biology: a virus is basically a parasite, and a bacterium can have its own independent existence. A virus infects a file and uses it as a host from which to infect other files, while a worm typically spreads from one computer to another.
A Trojan horse, or simply Trojan, is a program with a malicious component masquerading as a useful piece of software. Unlike viruses or worms, Trojans do not replicate. A Trojan is typically activated by some action on the part of the victim. So there are different kinds of malware: some are self-activated, and some are activated by the action of a human being. Trojans may enter a system in several ways: through email attachments, through file-sharing software, from a website, or through cell phone downloads. You are probably familiar with many of these. You can get a Trojan or a worm onto your system through a cell phone download, or through email by clicking on a malicious attachment.
You can also get it by clicking on malicious links on a malicious website, and so on.
As far as worms are concerned, there are several very interesting classes. One is the Internet scanning worm; good examples are Code Red and Slammer, and we will consider a case study of Code Red. Then there are email worms like Melissa and Sobig, P2P worms, web worms like Samy and Santy (the Samy worm, for example, exploited an XSS vulnerability), and mobile worms like Cabir and Commwarrior.
Whenever you talk about malware, there are some very pertinent questions to ask. First and foremost, as always when we talk about any attack: what is the vulnerability behind it? In the context of worms: What vulnerabilities does the worm exploit? How does it select its targets? How is it carried, for example through SMS messages or through email? How does it infect other hosts? How is it activated, by a human being or automatically? How fast does it spread, and is that question relevant in every case or only in some? What harm does it cause; does it erase your disk, for example? Are there any preventive or detective measures against such attacks, and if so, what are they and how effective are they?
Over the years attackers have come up with some very clever and creative techniques in designing worms, and one such development is polymorphic and metamorphic worms. Most worms and viruses have unique and distinct signatures. A signature, at least by the traditional definition, is a pattern of bits, usually a fragment of the worm's assembly language code, that appears in all instances of the worm, that is, in the worm payload. Worm and virus signatures are the key to detecting them.
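The traditional signature idea can be sketched in a few lines of Python. This is only an illustration, not how any particular antivirus product works: the signature names and byte patterns below are made up for the example, and `scan` is a hypothetical helper that simply checks whether each pattern occurs anywhere in a payload.

```python
from typing import Dict, List

# Hypothetical signature database: name -> byte pattern.
# These patterns are invented for illustration; they are not real worm signatures.
SIGNATURES: Dict[str, bytes] = {
    "demo-worm-a": bytes.fromhex("deadbeef0badf00d"),
    "demo-worm-b": b"\x90\x90\x90\x90\xcc\xcc",
}


def scan(payload: bytes, signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the names of all signatures whose byte pattern occurs in payload."""
    return [name for name, pattern in signatures.items() if pattern in payload]


if __name__ == "__main__":
    clean = b"GET /index.html HTTP/1.0\r\n\r\n"
    suspicious = b"GET /x?" + bytes.fromhex("deadbeef0badf00d") + b" HTTP/1.0"
    print(scan(clean))       # no pattern present
    print(scan(suspicious))  # contains the demo-worm-a pattern
```

The point of the sketch is that detection depends entirely on the same bytes showing up in every instance of the worm.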
However, there are sophisticated obfuscation techniques (obfuscation means hiding) that a worm can use to evade detection. One such technique is the use of encryption to disguise the worm code. When you look at the body of an encrypted worm you do not see any regular pattern of bits, because each instance of the worm might use a different key. Worm detection software that relies on a fixed signature will not be able to detect these worms, because the bit pattern varies from one instance to another and may fail to match any existing worm signature. Such worms are said to be polymorphic, and detecting them is a big challenge.
Our case study is an Internet scanning worm, a very interesting one, which did a lot of damage as it propagated around the world. Code Red was an Internet scanning worm carried in an HTTP request message and targeted at a particular web server, the IIS web server, which is a Microsoft product. Several million of these were in active deployment at that time, around the year 2001. A couple of years later, in January 2003, there was another worm similar to this called Slammer, and we will look at the differences between Code Red and Slammer in the next few slides.
Again the same question: what is the vulnerability? In this case it was a buffer overflow vulnerability, which we studied in great detail a few days ago, discovered in Microsoft's IIS web server. The vulnerability was found around June the 10th or 12th, and a few days later, on June the 18th, Microsoft released a patch for it.
System administrators were supposed to apply the patch so that their servers were no longer vulnerable. From that point onwards, from somewhere around the middle of June until July the 12th, a group of worm writers were busy creating a worm to exploit this vulnerability in this particular server. The first version of the worm was unleashed about a month later, around July the 12th.
It used a random number generator to generate the addresses of new machines to infect. Recall the question posed on one of the earlier slides: how does the worm figure out which IP addresses to target? A worm sitting on one infected machine does not know who its targets are, so it generates IP addresses at random. However, the code writers made a mistake: the same seed was used for the random number generator in every instance of the worm, so every instance generated the same sequence of addresses and the same machines were probed and infected over and over again.
Whoever those people were, they realized that this was not very successful, analyzed what was wrong, and found that the problem was the fixed seed. Within the next few days a second variant of Code Red was launched, in which a random seed was generated in each worm instance. This had a dramatic effect on propagation: about 360,000 machines were infected within just 14 hours of the launch of this variant on July the 19th. So the first version came out on July the 12th, a small bug in the worm code was fixed, and the new variant was launched on July the 19th. The infection phase continued; by infection phase I mean that the worms are frantically trying to find new targets and attack them.
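The effect of the fixed seed described above is easy to reproduce with any pseudo-random number generator. The sketch below uses Python's standard `random` module as a stand-in; it says nothing about the generator Code Red actually used, and the helper names `random_ipv4` and `target_list` are hypothetical. Two generators started from the same seed produce the same address list, so they cover no more of the address space than one generator would.

```python
import random
from typing import List, Optional


def random_ipv4(rng: random.Random) -> str:
    """Draw one IPv4 address uniformly from the 32-bit address space."""
    value = rng.getrandbits(32)
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def target_list(seed: Optional[int], count: int) -> List[str]:
    """Addresses one generator produces for a given seed.

    seed=None lets Python seed the generator from operating-system entropy.
    """
    rng = random.Random(seed)
    return [random_ipv4(rng) for _ in range(count)]


if __name__ == "__main__":
    same_a = target_list(seed=1234, count=5)
    same_b = target_list(seed=1234, count=5)
    print(same_a == same_b)  # identical lists: the fixed-seed bug

    fresh_a = target_list(seed=None, count=5)
    fresh_b = target_list(seed=None, count=5)
    print(len(set(fresh_a) | set(fresh_b)))  # distinct addresses covered
```

With a fresh seed per instance the lists differ, which is the one-line fix the second variant made.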
The infection phase continued until July the 20th, at which point the worms moved to the attack phase. What is this attack? All this time they had just been propagating, and now that they had infected hundreds of thousands of machines, what did they do? From July the 20th they launched a denial of service attack on www.whitehouse.gov. They also defaced web pages with the phrase "Hacked by Chinese".
So this is the case of the Code Red worm, which attacked the Microsoft IIS server in the year 2001: two variants of the worm, and the vulnerability was a buffer overflow in the code of this web server program. As I said, a couple of years later, in January 2003, the other worm, Slammer, was unleashed, and it is interesting to see the differences between Code Red and Slammer.
Before going to the differences, let us very quickly see exactly how Code Red spreads. Imagine you have one infected machine here. It tries to infect other machines. It simply sends out a TCP connection request to a random IP address, the address produced by the random number generator. So it sends to some potential victim V1. It does not know what operating system or what application that machine is running; it just keeps trying. It generates another random IP address and sends to some random machine V2, in the hope that this is an IIS server that is vulnerable and has not been patched, then to V3, and so on. It keeps sending, one after the other, as fast as it possibly can, and out of the thousands of attempts some could actually reach vulnerable machines that were not patched.
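To get a feel for why a scanning worm needs thousands of attempts per success, here is a back-of-the-envelope calculation in Python. It assumes, purely for illustration, that addresses are drawn uniformly from the whole 32-bit IPv4 space and that the vulnerable population is the 360,000 figure quoted in the lecture; the function names are hypothetical.

```python
ADDRESS_SPACE = 2 ** 32  # number of possible IPv4 addresses


def hit_probability(vulnerable_hosts: int, address_space: int = ADDRESS_SPACE) -> float:
    """Chance that one uniformly random address belongs to a vulnerable host."""
    return vulnerable_hosts / address_space


def expected_probes_per_hit(vulnerable_hosts: int, address_space: int = ADDRESS_SPACE) -> float:
    """Mean number of random probes until the first hit (geometric distribution)."""
    return 1.0 / hit_probability(vulnerable_hosts, address_space)


if __name__ == "__main__":
    population = 360_000  # assumed vulnerable population, taken from the lecture's figure
    print(f"hit probability per probe: {hit_probability(population):.2e}")
    print(f"expected probes per hit:   {expected_probes_per_hit(population):.0f}")
```

Under these assumptions the overwhelming majority of probes are wasted, which is why the worm sends them as fast as it can.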
The worm code now enters such a machine through the buffer overflow vulnerability and starts to execute, and once it executes it does exactly the same thing: it tries to find other machines. One of the most important things a worm does is figure out how to spread, how to replicate. So from here it looks for other victims. In most cases it will not succeed, but that is OK. In some cases it does succeed, and once it succeeds the worm code is firmly planted on that machine, which in turn looks for others to infect; some of those can be infected, and they go ahead and spread the infection further.
So you see how it spreads: by setting up a TCP connection to each of these machines. It tries port 80, the HTTP port, on one address after another, and some of those machines are actually running the IIS server. If they are not patched, they get infected, and as soon as they are infected they simply start trying to infect other machines. Each sends its worm code to this machine and that machine and so on, and this is the way the worm propagates.
Now the differences between Code Red and Slammer. First, both exploited the same kind of vulnerability, a buffer overflow. In the first case the target was the IIS web server; the Slammer worm targeted SQL Server 2000. The interesting thing is that the first worm propagated via TCP, by setting up a TCP connection, and as we very well know it takes a certain amount of time to complete the three-way handshake.
The next set of worm writers, a couple of years later, were cleverer. They said: let us use something much faster, namely UDP, because it is connectionless. So Slammer used UDP, which gives a much faster rate of propagation. The payload, which was about 4 kilobytes for Code Red, was reduced dramatically to only 384 bytes in the case of Slammer, and that again helps it move very fast.
It is said that Code Red was latency limited; latency is delay. Because of the delay in setting up a connection it could spread only so fast: it had to complete the TCP handshake before it could deliver the infection. With UDP there is no handshake, so Slammer spread very fast, and what limited its spread was bandwidth, the megabits per second available on each communication link. To see how fast these things spread: with Code Red the number of infected machines doubled every 37 minutes, whereas with Slammer the number of infected machines doubled every 8.5 seconds. So Slammer was a much more successful worm. Even though the population of SQL servers might have been smaller than the population of IIS servers, it infected almost all of the unpatched machines in a very short time.
It is useful to have a worm propagation model to see how fast these things spread; can we model them mathematically? A lot of researchers around the world got interested in this question. They had empirical results, and they tried to see whether they could match them with a model. Let N be the size of the total population; not a human population, but a population of machines.
What sort of machines? Machines that are running, for example, the IIS server. Let us suppose there are several hundred thousand of them across the world. Let I(t) be the number of machines that have been infected up to time t, so I is a function of t. The susceptibles are the machines that can be infected but have not yet been infected, so the number of susceptibles at time t is N - I(t): all the machines that could possibly be infected minus those that already have been.
Let beta be the infection rate. We assume here that it is a constant, but as we will see, we will question that assumption. By definition, each infected machine attempts to pass on the infection to beta other machines in one time unit. I am not saying it will actually infect them, only that it will attempt to.
We use a model borrowed from the study of human epidemics. There are different models of human epidemics, and one of them is the simple epidemic model, or SEM, which is what has been used here in its simplest version. The following differential equation captures the number of infected machines at time t. In a short interval of time dt, dI machines are newly infected; what is the relationship between dI and dt? In the interval dt, the number of infection attempts is the existing number of infected machines I multiplied by the infection rate beta and by dt.
That product, beta I dt, is the number of targets that are attempted. You try to infect all of these machines, but not all of the attempts produce new infections, because some of the targets are already infected. We are interested in the number of new infections, which is the number of attempts multiplied by the probability that the targeted machine has not yet been infected, (N - I)/N: the total number of machines minus the number already infected, divided by N. So the differential equation is dI/dt = beta I (N - I)/N, and it captures the number of infectives at any time t.
Now you integrate both sides. The result is that the number of infected machines at time t is I(t) = I0 N / (I0 + (N - I0) e^(-beta t)), where I0 is the initial number of infected machines at time t = 0; there must have been a small number of them, let us say 10.
This is a pretty standard equation, used as I said to capture the spread of human diseases, and when you plot it you find an interesting curve, with the number of infected machines on the y axis and time on the x axis. Initially almost nothing is infected, just those few machines. Then the number starts to increase exponentially, the exponential build-up of infected machines, and then it begins to saturate. The time scale here is roughly 14 hours; as mentioned before, Code Red took about 14 hours to spread to as many machines as it could.
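The simple epidemic model can be checked numerically with a short Python sketch. The parameter values are assumptions chosen for illustration, not fitted values from the lecture: N = 360,000 and I0 = 10 are borrowed from the figures mentioned, and beta is set to ln 2 / 37 per minute on the assumption that the 37-minute doubling time describes the early phase, when I is much smaller than N and growth is close to e^(beta t). The function names are hypothetical. The rate is passed to the Euler integrator as a function of time so that other rate assumptions can be tried.

```python
import math
from typing import Callable, List


def sem_closed_form(t: float, n: float, i0: float, beta: float) -> float:
    """Closed-form SEM solution I(t) = I0*N / (I0 + (N - I0)*exp(-beta*t))."""
    return i0 * n / (i0 + (n - i0) * math.exp(-beta * t))


def sem_euler(n: float, i0: float, beta: Callable[[float], float],
              t_end: float, dt: float) -> List[float]:
    """Integrate dI/dt = beta(t) * I * (N - I) / N with forward Euler steps."""
    steps = int(round(t_end / dt))
    infected = [i0]
    for k in range(steps):
        i = infected[-1]
        infected.append(i + beta(k * dt) * i * (n - i) / n * dt)
    return infected


# Assumed illustration parameters (time unit: minutes).
N = 360_000.0
I0 = 10.0
BETA = math.log(2) / 37.0

if __name__ == "__main__":
    for hours in (0, 4, 8, 10, 12, 14):
        value = sem_closed_form(hours * 60.0, N, I0, BETA)
        print(f"t = {hours:2d} h  infected = {value:10.0f}")
```

Printing the table shows the slow start, the steep exponential middle, and the saturation near N that give the curve its S shape.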
Now, that is the model value. The experimental value is a little different: the two tally quite well up to a point and then they diverge. The number of machines actually infected is not quite as high as the model predicts. Take a few seconds to guess why the two do not tally.
As the infection spreads, a lot of information is being disseminated about the spread of this infection on IIS machines. Some of the system administrators got alerted to the problem and started to patch their machines. When they did, the value of beta from the previous slide, the infection rate, started to fall. So, to be correct, beta is not a constant; it is a decreasing function of time. With that change you can solve the differential equation numerically, and what you get is something lower than the value from the constant-beta model, closer to the actual curve. The curve itself is a standard one encountered in many disciplines, AI included: it is called the logistic curve, or S-shaped curve.
So one category of worms is the Internet scanning worms, like the one I just described. Another category is what are called topological worms, which, as the name suggests, have something to do with topology. They are so called because the vulnerable machines can be represented as a graph, with the nodes representing the vulnerable machines. An edge from machine A to machine B exists in this graph if A knows or stores the address of B: an email address, a phone number, or some other way of contacting B from A.
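The graph view of a topological worm can be sketched with a plain adjacency list and a breadth-first search. The contact graph below is entirely made up for illustration, and `reachable_in_rounds` is a hypothetical helper: it reports, for each node, the earliest round in which an infection starting at a given node could reach it if every infected node contacted all of its stored addresses in each round.

```python
from collections import deque
from typing import Dict, List

# Hypothetical contact graph: an edge A -> B means A stores an address for B.
CONTACTS: Dict[str, List[str]] = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave", "erin"],
    "dave": [],
    "erin": ["alice"],
    "frank": ["alice"],  # frank knows alice, but nobody stores frank's address
}


def reachable_in_rounds(graph: Dict[str, List[str]], start: str) -> Dict[str, int]:
    """Breadth-first search: earliest round in which each node is reached from start."""
    rounds = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in rounds:
                rounds[neighbor] = rounds[node] + 1
                queue.append(neighbor)
    return rounds


if __name__ == "__main__":
    print(reachable_in_rounds(CONTACTS, "alice"))
```

In this toy graph "frank" is never reached from "alice", because edges are directed: what matters is whose address book a node appears in.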
An edge from machine A to machine B exists if A knows or stores the address of B and is capable of directly infecting B by sending it a malicious payload, through email, SMS, MMS or whatever. Topological worms have focused targets. Their immediate targets are their neighbors: the people whose email addresses or phone numbers are in their address book. Once those immediate neighbors are infected, they in turn start to target the machines in their own address books, and this is how the worm spreads.
One of the questions raised earlier about worms was: how do they know whom to target next, and where do they get that information? In the case of Code Red the IP address came from a random number generator. Many email worms look at the email addresses stored on the infected machine, and a mobile worm looks at the address book on the cell phone. The best examples of topological worms are email worms, P2P worms, and also mobile worms.
There is a very interesting case study, and you can read a lot about it if you do a Google search on Samy Kamkar, the originator of the Samy worm. Even his interviews have been recorded and exist on various sites. You already know about the XSS vulnerability, and this is directly related to it. The XSS worm Samy was unleashed in October 2005 by Samy Kamkar, and it infected the social networking site MySpace. Social networking sites typically allow users to create, edit, and save their profiles and make them accessible to other members of the social networking group. This is a normal thing for such sites to do, so that you can present your profile and things about yourself to other people.
Here is what Samy was doing. He was basically a hobbyist interested in programming languages. He got interested in JavaScript and decided to put some JavaScript into his profile. So he added a bunch of carefully crafted JavaScript to his profile, trying things out over a couple of days. When a visitor, call him V1, viewed Samy's profile, the profile was downloaded to V1's browser and the JavaScript in it was executed by that browser. This caused Samy to be added as a friend in V1's profile, along with the message "but most of all, samy is my hero".
So V1 downloaded Samy's profile, which contained JavaScript that was malicious by design, and what that code did, using Ajax APIs and so on, was to add Samy as a friend and also to infect V1's own profile. Then V1's profile is visited by somebody else, say V2, and V2's profile also gets infected, and so on. This spreads more or less exponentially, like a tree. Within 20 hours of the first visit to Samy's profile, Samy had been added as a friend to more than a million user profiles. This rate of spread was even faster than that of Code Red.
How does the worm spread? Any MySpace member can update his profile after logging in. After V1 logged in and viewed Samy's profile, the malicious script embedded in it began to execute; the script uploaded itself onto V1's profile on the MySpace server and infected it.
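The propagation logic just described can be modelled abstractly in Python, without any JavaScript or real site involved. Everything here is a toy: `Profile`, `view`, and the `carries_script` flag are hypothetical names standing in for "this stored profile contains the self-copying script", and the sketch ignores how the real script got past MySpace's filters.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Profile:
    owner: str
    friends: Set[str] = field(default_factory=set)
    carries_script: bool = False  # True if the stored profile contains the script


def view(profiles: Dict[str, Profile], visitor: str, owner: str,
         author: str = "samy") -> None:
    """Model a logged-in visitor viewing owner's profile.

    If the viewed profile carries the script, it runs with the visitor's
    session: it adds the author as a friend and copies itself into the
    visitor's own stored profile.
    """
    if profiles[owner].carries_script:
        profiles[visitor].friends.add(author)
        profiles[visitor].carries_script = True


if __name__ == "__main__":
    people = ["samy", "v1", "v2", "v3"]
    profiles = {name: Profile(owner=name) for name in people}
    profiles["samy"].carries_script = True

    view(profiles, "v1", "samy")  # v1 views Samy's profile and is infected
    view(profiles, "v2", "v1")    # v2 never visits Samy, only v1
    view(profiles, "v3", "v3")    # v3 views a clean profile: nothing happens
    for name in people:
        print(name, profiles[name].carries_script, sorted(profiles[name].friends))
```

The second call is the important one: V2 is infected without ever visiting the original profile, which is what makes the spread tree-like.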
That is a summary of what I just said. So that is one example, and you can read a lot about it on different sites on the Internet. It is extremely fascinating, and even the code is available, so you can see all the little tricks and the different vulnerabilities that were exploited, both on the server side and on the browser side.
The next category is mobile malware. Mobile malware exploits a number of vulnerabilities. Some of these are features of the Bluetooth protocol; others are software vulnerabilities, including buffer overflows, in implementations of the Bluetooth protocol stack. But most of the vulnerabilities, as far as mobile malware is concerned, are of the social engineering type: by mistake you click on some link, or you accept a download that contains a Trojan, and so on. Some of the vulnerabilities are related to the Symbian operating system, which is no longer very popular. The popular platform now is Android, and that is why we have a presentation tomorrow on Android security and the Android operating system on the cell phone.
Typically what happens is this: you have a particular application at a certain version, and you get a message that says "download the newest version". You download it, and that version contains a Trojan which does all sorts of things, like key logging.
Another common vulnerability is leaving the cell phone in discoverable mode. You can read up a little on what the Bluetooth protocol is about; one of its features is pairing. You can leave Bluetooth in discoverable mode either by accident or because a default setting keeps it that way.
What discoverable mode does is enable an attacker to obtain the Bluetooth device address of the victim's cell phone. What is the big deal if somebody knows your Bluetooth device address? It is basically like a MAC address, a 48-bit address. Knowing the other party's Bluetooth device address, the attacker can attempt to exchange files with it using the OBEX protocol, the Object Exchange protocol, which is used for exchanging images, business cards, etcetera. Along with that exchange the attacker could send a malicious file containing a Trojan, which then starts doing all sorts of nasty things like stealing your secrets or sending SMS messages to people in your phone book.
Then there are vulnerabilities that arise from configuration. Besides software vulnerabilities and networking vulnerabilities, there is a category of configuration vulnerabilities, which I did not talk about before. You can well imagine that there are different kinds of settings, and if you have set something wrongly by mistake, that can be a vulnerability. For example, user authorization is usually required before a file can be accepted by a smartphone. The smartphone usually prompts the user to enter his PIN to confirm whether an external file should be accepted, and if the PIN is entered correctly the system accepts the file. However, some operating system versions accept file transfers without user authorization, and some smartphones allow users to disable the authorization-required option for file transfers. All these little features, which make things more convenient, unfortunately also increase insecurity. This is the tradeoff between convenience and security that I talked about a long time ago.
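The configuration weakness can be reduced to a two-line decision rule. The sketch below is a hypothetical model, not the logic of any real phone operating system: `TransferSettings` and `accept_incoming_file` are invented names, and `user_confirmed` stands for the user having entered the PIN correctly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferSettings:
    """Hypothetical phone settings relevant to incoming file transfers."""
    authorization_required: bool = True


def accept_incoming_file(settings: TransferSettings, user_confirmed: bool) -> bool:
    """Decide whether an incoming file is accepted.

    With authorization disabled, every file is accepted regardless of the
    user; with it enabled, the file is accepted only on explicit confirmation.
    """
    if not settings.authorization_required:
        return True
    return user_confirmed


if __name__ == "__main__":
    strict = TransferSettings(authorization_required=True)
    convenient = TransferSettings(authorization_required=False)
    print(accept_incoming_file(strict, user_confirmed=False))      # rejected
    print(accept_incoming_file(convenient, user_confirmed=False))  # accepted anyway
```

The convenient setting removes the only point at which the user could have said no.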
You do not want to keep entering your PIN again and again; you just want the download to go through, so a file is accepted even though you have not given it explicit permission, and that file could contain a Trojan. It is estimated that between 7 and 25 percent of users are very careless: they indiscriminately accept files or MMS attachments.
Two well-known examples follow. The first is the Cabir worm, which attempts to discover other Bluetooth-enabled phones set in discoverable mode. When it finds such a phone it sends the worm payload in a SIS file, a Symbian installation file. The receiver needs to accept and install the file. It was basically a proof-of-concept worm, so the payload was mostly benign, nothing very malicious, typically displaying something like "Caribe" on the screen. The negative thing about it is that it continuously scanned for new victims. Do not forget, one of the big things about a worm or a virus is that it is continuously trying to figure out whom else to infect. This continuous scanning by an infected phone depleted its battery; otherwise the worm did nothing very nasty, and it did not delete any files.
The next worm, Commwarrior, had two vectors of propagation, Bluetooth and MMS. It used MMS to spread to the contacts in the smartphone's address book. It required user interaction to be installed, and it enticed the user with catchy subjects such as "happy birthday". Once installed via MMS it also used Bluetooth: after infecting a smartphone it attempted to discover other Bluetooth-enabled smartphones and pass on the infection to them as a SIS file. So Commwarrior had two vectors of propagation, Bluetooth and MMS, while the previous worm, Cabir, had only one, Bluetooth. You see how things become more and more creative in worm design.
Finally, one of the important kinds of malware these days is the botnet.
We saw a little of this when we talked about denial of service attacks: you have a single controller who controls many zombies, and all of the zombies carry code to attack a particular website. Botnets are a generalization of that. A botnet is an army of compromised computers, or bots, connected to the Internet and remotely controlled by a botmaster. The earliest botnets were collections of zombies that participated in a DDoS attack. Things have moved a great deal since then: today's botnets may comprise tens of thousands or even millions of bots. Things have also gone from simple hacking as a hobby to activity for financial gain; the emergence of botnets is closely linked to the motive of financial gain.
What sorts of things can these bots do? First of all, your machine could be infected by several kinds of bots. One thing that is very different between bots and a Code Red kind of worm is that bots typically lie very low. They do not make a big noise or try to attract attention. Why? So that they will not be detected. They do not spread very fast, for example, and in general it is very difficult to detect them.
Among the terrible things they do is sending spam mail on behalf of third parties. You could be someone who wants to advertise by spam, and you contact a botmaster and say: I will pay you so much, can you send this spam on my behalf to hundreds of thousands of users? The botmaster already has bots under his control, and he will send them fresh commands to start distributing this spam for you. Or he may do something else.
Bot programs may contain key loggers and other forms of spyware that capture sensitive personal information, such as passwords and credit card numbers, and send it back to the botmaster. All these activities can be carried out by the bots on behalf of the bot controller. Bots may also be used as an extortion tool: pay up, or your website will be bombarded by a DDoS attack. The botmaster goes to a particular company and says: I control these hundred thousand bots, and they will bombard, slow down, and cripple your website unless you pay me so much.
An important difference between a bot and a computer infected by a traditional worm, virus, or even Trojan is that a bot needs to communicate with specific nodes in the botnet to receive fresh commands. It may communicate directly with the botmaster or bot controller, but more likely it communicates with intermediaries that are hard to detect, using a P2P network as we will see. The earliest botnets used IRC (Internet Relay Chat) servers as command and control servers.
The important thing is that once a machine has been infected with a bot, the bot has to communicate periodically with its botmaster. What does it need from the botmaster? Commands. For example, one command might be: bombard this particular website on the 1st of August 2014. For that purpose it needs to be in regular touch, not every hour, but maybe once every few days, with the botmaster or somebody delegated by the botmaster, to get new commands.
In the earliest botnets, an IRC server was used as the command and control (C&C) server. A more recent trend is distributed and decentralized botnet architectures, which leverage existing, highly scalable and robust P2P networks. Here a bot can lie very low: it communicates with its peers over the peer-to-peer network and receives messages and commands from them. The connectivity of P2P networks ensures that even if a large number of bots are disabled, the rest continue to stay connected. So the botnet takes advantage of the P2P network's protocols to spread the infection and to disseminate the botmaster's commands.
It also becomes very difficult to detect that this is a botnet. Why? There are no fixed C&C servers, unlike in the IRC case, so it is hard to detect and incapacitate a P2P-based botnet. The main thing for a botnet is not to be detected, and for that you want a highly distributed and decentralized system. The second thing is that you want it to be highly fault tolerant. A P2P network gives you both: it is highly decentralized and also highly fault tolerant, so some of the nodes can go down and it will still work.
The picture is something like this. You have a botmaster, who has recruited a whole bunch of bots by compromising all these machines. He needs to disseminate his commands, or a URL saying "this is where you will find the latest command", to all the bots. These selected bots here are the ones communicating with the botmaster at a particular point in time, and they then pass the commands on to the rest of the bots.
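The fault-tolerance argument, seen from the defender's side, can be illustrated with a small connectivity experiment in Python. Both topologies are toy assumptions: a star stands in for the IRC arrangement with one central server, and a ring with extra chords stands in for a P2P overlay. The function names are hypothetical. The question asked is how large the biggest connected group is after some nodes are taken down.

```python
from typing import Dict, Iterable, List, Set

Graph = Dict[str, Set[str]]


def add_edge(graph: Graph, a: str, b: str) -> None:
    """Add an undirected edge between a and b."""
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)


def star_topology(center: str, leaves: List[str]) -> Graph:
    """Every leaf talks only to the central server (IRC-style)."""
    graph: Graph = {center: set()}
    for leaf in leaves:
        add_edge(graph, center, leaf)
    return graph


def p2p_topology(nodes: List[str], hops: Iterable[int] = (1, 2)) -> Graph:
    """Toy overlay: each node links to the nodes 1 and 2 positions ahead in a ring."""
    graph: Graph = {node: set() for node in nodes}
    for index, node in enumerate(nodes):
        for hop in hops:
            add_edge(graph, node, nodes[(index + hop) % len(nodes)])
    return graph


def largest_component(graph: Graph, removed: Set[str]) -> int:
    """Size of the largest connected group once the removed nodes are gone."""
    alive = set(graph) - removed
    seen: Set[str] = set()
    best = 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbor in graph[node]:
                if neighbor in alive and neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        best = max(best, size)
    return best


if __name__ == "__main__":
    nodes = [f"node{i}" for i in range(10)]
    print(largest_component(star_topology("server", nodes), {"server"}))
    print(largest_component(p2p_topology(nodes), {"node0", "node5"}))
```

Taking down the single server leaves the star with only isolated nodes, while the overlay stays in one piece after losing two nodes, which is why a takedown that works against an IRC-style botnet does not carry over to a P2P one.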
The commands are of the sort: do this at such and such a time, or go to this particular website, download the code there, and execute it. These are the kinds of things that have to be disseminated to all the bots, the ones you can see with the cross lines, and this is the entire P2P network. At one particular instant the fresh bot code might be at one location; at some other point in time the botmaster might move it to another website to evade detection. So the location from which to get the latest code or the latest commands keeps changing.
That is the outline: one example of a botnet, a P2P botnet as compared with the IRC botnet. This has been a basic introduction to malware. The next part of this talk is on IDS, but before that we have a small demo of Metasploit by Vibor.