Okay, so let's get going. Today we're going to continue the discussion of security that we started last time with Anthony. To review, there were four main concepts from last time. Authentication, which is guaranteeing that the user is who they present themselves to be. Integrity, which is ensuring that data reaches a destination from a source, or is written on a storage device, without being changed or corrupted. Confidentiality, which is that the data is only accessible to the users it's intended for. And non-repudiation, which is that a commitment made by someone can't easily be undone without leaving some trace. That's typically part of forming digital contracts between people, and it involves somebody making a signature on something which they basically can't disown later.

All right, so part of the core infrastructure for several of those operations is encryption and then digital signatures. If we want to show that a particular public key, which anyone can examine, really belongs to a user called Alice, the idea is to take a trusted authority, such as VeriSign, who already has a published public key and whose infrastructure is trusted by lots of people, and have them sign both Alice's identity (her name) and the key. So we construct a tuple of Alice and the key, which can be done simply by concatenating them, and then make a signature out of that by encrypting it with VeriSign's private key. This whole thing is essentially like VeriSign signing a document that has Alice's name and her published key on it.

All right, and then anyone else who wants to check can examine that. They use the complementary public decryption method with VeriSign's public key, which is always going to decrypt messages that were encrypted with VeriSign's private key.
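As a rough sketch of that sign-and-check step, here is a toy version in Python. Everything in it is an assumption made for illustration: the primes are far too small to be secure, the tuple is hashed before signing (a common practice that the lecture doesn't spell out), and the names sign, verify, and digest are hypothetical, not VeriSign's actual procedure.

```python
import hashlib

# Toy RSA key pair for the hypothetical authority (illustrative only, not secure).
P, Q = 61, 53
N = P * Q                              # public modulus
E = 17                                 # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)

def digest(name: str, key: str) -> int:
    """Hash the concatenated (name, key) tuple and reduce it into the toy modulus."""
    data = (name + "|" + key).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(name: str, key: str) -> int:
    """Authority side: apply the private exponent to the digest of the tuple."""
    return pow(digest(name, key), D, N)

def verify(name: str, key: str, signature: int) -> bool:
    """Anyone: apply the public exponent and compare with the recomputed digest."""
    return pow(signature, E, N) == digest(name, key)

signature = sign("Alice", "alice-public-key")
print(verify("Alice", "alice-public-key", signature))
```

Bob only needs N and E, the authority's public values, to run verify; the private exponent D never leaves the authority.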
That decryption is going to return the tuple that we originally encrypted, Alice and her key. So now Bob can tell that this key really is associated with Alice, at least according to VeriSign: VeriSign has committed that those two go together. And so if Bob trusts VeriSign, he can trust that messages encrypted with the private key corresponding to that public key really came from Alice. That's one of many permutations of using digital signatures to make proofs of connection or commitment by different entities.

So today we're going to look at the, I suppose, offensive side of security, which is how various actors can try to attack and compromise a target, sometimes to extract information and sometimes simply to cause damage. Attacks include compromising a host somehow, which means gaining control of, or at least resources from, that host; and denial of service, which typically means flooding a particular host with traffic and interfering with its normal operation. Attacks often involve both steps. In order to perform a denial of service you often need a very large volume of traffic, and an easy way to get that is to compromise a lot of hosts, form a network out of them, and then target a small number, or even a large number, of other hosts to slow them down simply by the volume of traffic you're creating. All right, so those are the main topics for today.

So first of all, host compromise. The earliest large-scale example of this was the Morris worm, or the Internet worm, from 1988. It was created by Robert Tappan Morris, who was at the time a graduate student at Cornell, and it successfully broke into most of the BSD-based hosts. It used code that was shared by most of them, and it spread very effectively across the Internet. Luckily, at the time that was only about 6,000 machines. Now there are many hundreds of millions of machines.
It's estimated that a single worm could now compromise about 10 million hosts in less than five minutes. So it's a massive scale, and at that scale the economic damage is quite astronomical.

Robert Tappan Morris was convicted and got a $10,000 fine and 400 hours of community service, and eventually a job at MIT, although not as part of the same package. It was very much a slap on the wrist; people really had no parameters for that kind of offense. These days this kind of damage is absolutely massive, so obviously a lot of effort goes into both preparing these attacks and preventing them.

All right, so an attacker wants to gain control of a host, sometimes just to get information. Some part of this is professional: it's corporate espionage and sometimes government espionage. In some cases it's a denial-of-service attack, which is often perpetrated by casual hackers and sometimes by security agencies. The Stuxnet worm, or virus, was launched deliberately to interfere with machines in the Iranian nuclear program, and it was a concerted attack at fairly large scale to affect those devices. And in some cases, if you have malicious intent, rather than capturing data you can erase data to interfere with the operation of something. So a variety of bad things can be done.

A worm is something that replicates itself. First of all it has to get into a host, then it has to find a way to execute some code, and then it reproduces itself. Most often that's done by exploiting careless programming, especially text string buffers in C, which aren't protected in length and allow the code to write over legitimate data on the stack and interfere with other data.
A virus is instead a piece of code that's attached to a legitimate document or email message and is opened, sometimes inadvertently, by a user who believes they're doing a legitimate operation or accessing some legitimate object by clicking on a link. So a virus does a similar thing: the clicking operation executes the code and allows the virus to run and do more or less arbitrary things to the host.

Finally, a Trojan horse is a malicious piece of sleeping code that somehow gets executed on the host. It can come through a virus or through some kind of compromised or insecure port. It lodges on the host and then provides basically a remote socket through which the attacker can launch other kinds of attack. So it essentially opens a back door in the computer for subsequent execution of code.

All right, and when hosts are collected together, especially hosts that can run Trojan horse programs, a collection of them is called a botnet and the hosts themselves are called zombies. Those can be used to spread worms and also to mount distributed denial-of-service attacks.

Trojan horses and viruses are very effective because they often rely on fairly simple, everyday operations that users do, and they masquerade as benign pieces of email. It's very easy for users to mistake what the message is, click on some link, and get into trouble. And the infection can be obvious, or it can appear benign as well. Sometimes you see immediately that the computer is behaving strangely, but it can also be a sleeper virus that's going to run a keystroke logger and collect your information and passwords, or it may simply open up a port on your machine and allow the machine to be used in a botnet. So, bad things either way.

You can see this one: it's masquerading as a greeting coming from a friend. Sometimes there'll be a name in there that looks like a common
name, which makes it very easy to assume that this is a benign thing. And the link itself: it's easy enough, in the text of an HTML link, to put a legitimate-looking link that's different from the actual link in the underlying HTML. So you aren't necessarily clicking on what you appear to be clicking on. It's difficult to spot these, and unsophisticated users are especially prone to being caught. The easiest thing is to assume the worst: this could be a virus, because you can't see what the real link is. When you mouse over it you'll typically see the actual link, which is safer. But if it could be a virus, the safe thing is not to click on it.

So let's look a bit more closely at how these buffer overflows happen. The idea is that the hacker somehow figures out that there's some buffer in a program that's accessible remotely or through email. They figure out that the buffer has a fixed size programmed into the code, usually C code, and that they can overwrite it if they use a larger message. This is especially true of cookies, which normally do have a fixed length; but if the code is not carefully written, it will allow the string of the cookie to be written even if it's of arbitrary length.

So here's a get-cookie routine, sort of a strawman get-cookie routine. It receives a packet as a string and calls a subroutine to process the packet, which doesn't do much except eventually locate the cookie part of the packet here.
So this code here is parsing the packet contents, looking for some header that says "cookie". It retrieves the location of the cookie, so this address here should be the address of a string, which is the beginning of the cookie. In normal execution it should be safe to copy that string into the cookie variable, which is just a fixed-length buffer here on the stack.

Okay, so the normal execution. Most machines have a stack that grows down. The MIPS machines that you program, and also Intel machines, normally grow their stack downwards. On one of those machines the automatic variables of main will sit in an area of memory like this on the stack, and it's going to progress down. When you do a function call here, it will push the return address on the stack as well. So that's going to come just below the local variables in the frame, and that return address is going to point back to the point just after the function call.

All right, so the stack is going downwards, and then the stack frame of the function being called is added below that on the stack. That contains this variable n, which is just the address in the packet of the cookie, and the cookie itself, which should be bounded by 512 bytes.

So somewhere in this processing routine, first of all you figure out what n actually is, so it saves the location of where the cookie should be in n, and then it calls this notoriously dangerous C routine, strcpy. As most of you probably know, for C strings strcpy takes the sequence of bytes until it finds a zero as the end-of-string marker, and it copies all of that into cookie. So if the string is actually bounded by 512 bytes, it should copy those up into here. Oh, yeah, and we add the strcpy return address just as before, and then there's a stack frame associated with strcpy itself, which has just a couple of variables in it.

All right, so for a
legitimate cookie, we'll execute this code down to here, and strcpy is just going to move the contents of the packet, probably from the heap somewhere, into the local cookie variable, which is some space on the stack. Here it's somewhat less than 512 bytes long, and it's effectively growing up. It's an array, so its addresses are increasing, which means the first address is the lowest address numerically. So it's going up from that address on the stack.

The rest of the functions will execute and they'll return. Whatever munch was going to do, it will finish off, and it'll return back to get-cookie using that address. After you've finished executing a function, you drop back to its return address. So here we are: we've just finished executing this routine munch, so we grab the address that was pushed on the stack in order to return from it, and finally we're back in the original get-cookie. So everything went according to the programmer's plan.

But let's look at the buffer overflow case. Again, we're going to do the strcpy, but this time, if the string that's inside the packet is longer than 512 bytes, it's going to write all the way up here. And the hacker who did this knows, either by knowledge of the code they're trying to hack or by trial and error, that the return address here is just above the buffer. Well, in this case, it's just slightly above the length of the cookie.
So it's 512 bytes plus four bytes, because there's the n there. By trial and error the hacker can put in that position the address of some other code, which could itself live on the stack, and when the routine tries to return it will have this bogus address that it tries to follow to get back. So in that way the hacker can both insert code and create a link to it in one step.

Yeah? Non-executable pages? Well, you can do that. That's one of the techniques for trying to prevent this particular version, where the code is on the stack. But it's only one technique, and it's not always reliable. In the case of HTML there's often executable JavaScript or Java code somewhere else in the HTTP message, so it may be that the attack tries to link to that code as though it were a legitimate executable part of the HTML. But anyway, all of those steps will help. Here we're just showing the simplest case, where there actually is executable code right on the stack, but there are other ways of doing it.

All right, and you can see that by the time we return from the second routine, what we're actually doing is following that little link into the executable code that we wrote. So there are lots of variations on that, and the links can be to binary code or Java code, depending on the environment in which this is being interpreted.

All right, so one question. If you noticed, part of the trouble we got into was that the stack grew downwards, which means the buffer, when it's being written, goes upwards. Would it fix things if we grew the stack upwards?
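The overwrite described above can be modeled without touching real memory. This is only a toy sketch in Python: the frame is a bytearray laid out from low to high address as cookie buffer, the 4-byte n, then the return address, and the names make_frame, unchecked_copy, and saved_return, as well as the addresses, are made up for illustration. The exact layout on a real machine depends on the compiler.

```python
BUF_SIZE = 512   # fixed size of the cookie buffer
GAP = 4          # assumed 4-byte variable n sitting above the buffer
RET_SIZE = 4     # 32-bit return address

def make_frame(return_address: int) -> bytearray:
    """Model the frame, lowest address first: cookie, n, return address."""
    frame = bytearray(BUF_SIZE + GAP + RET_SIZE)
    frame[BUF_SIZE + GAP:] = return_address.to_bytes(RET_SIZE, "little")
    return frame

def unchecked_copy(frame: bytearray, src: bytes) -> None:
    """Mimic strcpy: copy up to the first zero byte, ignoring the buffer size."""
    end = src.find(b"\x00")
    data = src if end < 0 else src[:end]
    for i, byte in enumerate(data):
        if i >= len(frame):      # the model only covers this one frame
            break
        frame[i] = byte

def saved_return(frame: bytearray) -> int:
    """Read back the return address the function would jump to."""
    start = BUF_SIZE + GAP
    return int.from_bytes(frame[start:start + RET_SIZE], "little")

# A short cookie leaves the return address alone.
frame = make_frame(0x08048123)
unchecked_copy(frame, b"session=abc\x00")
print(hex(saved_return(frame)))

# An oversized cookie runs past the buffer and replaces the return address.
payload = b"A" * (BUF_SIZE + GAP) + (0xBFFFF110).to_bytes(RET_SIZE, "little")
frame = make_frame(0x08048123)
unchecked_copy(frame, payload)
print(hex(saved_return(frame)))
```

Note that the planted address has no zero bytes in it; a zero would end the string copy early, which is one reason the offsets and values matter so much to the attacker.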
So in other words, the stack would be growing up instead of down, so that the buffers would be written mostly into empty space. Would that work?

Not necessarily. Very good answer. Yeah, so that's one situation: there could be other programs in that space. But the attack becomes a lot less likely to succeed if you don't know the offsets. There's another reason that this still doesn't work, though. In fact, it doesn't even work in the example. It probably makes it harder for the hack to succeed, but the trouble is, if you noticed, there was another return address that was inserted right after the strcpy call, and strcpy is the function that's actually doing the dangerous write. So if we did reverse the stack, so now the stack is going up instead of down, everything is flipped upside down, and the return address for strcpy is on the top now. In this case the write is going to go up as well. So although it didn't clobber this return address, it clobbered that one.

So this would be safe if we hadn't called a function there. If the code itself had done the copy with a for loop or something, it should have been all right, because we might have overwritten some data but we wouldn't have overwritten a return address. So probably, if designers had for whatever reason designed stacks to go the other way, things would be a little bit harder for hackers. You'd need this combination of code patterns to be in place in order for it to work. Whereas with the downward-growing stack it's potentially always dangerous, because there are always going to be stack frames sitting above the current frame. So if there is a buffer overrun problem, it's almost always going to be exploitable.

All right, so when a host is compromised, the attacker gets to do whatever they want. Usually that
involves finding more hosts to compromise and then having the virus copy itself. That often involves picking random IP addresses, which can be generated as random 32-bit words. There's something interesting here that we'll get to later when we discuss particular viruses and worms, but fortunately this has actually not been executed very well in some large-scale worms.

So anyway, the idea of the worm is that because it's randomly generating new hosts to attack, it's normally going to spread exponentially at the beginning. Each generation of the worm is going to attack some large number of other hosts, let's say K, and so each generation multiplies its size by roughly K. That can cause extremely fast infection across the network. It's a really nice, simple parallel algorithm. It's practically breadth-first search.

Okay, and empirically this is the kind of growth that you see. If you ignore this term for a minute, at least initially the growth is exponential. You can see it here, so it's e to some Kt. Over time that causes this fast, explosive growth, and then another phenomenon kicks in, which is that the network starts to saturate and the virus or worm can't find new hosts to infect. Well, okay, we'll get to that.
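To see the two regimes of that curve, here is a minimal sketch assuming a discrete logistic model: each step, every infected host produces new infections at some rate, but only the still-uninfected fraction of vulnerable hosts can be newly infected. The function name simulate and the parameter values are hypothetical, chosen only to make the shape visible.

```python
def simulate(n_vulnerable: float, rate: float, steps: int, initial: float = 1.0):
    """Return the infected count per step under a simple logistic update."""
    infected = [initial]
    for _ in range(steps):
        current = infected[-1]
        # New infections shrink as the uninfected fraction shrinks (saturation).
        new = rate * current * (1 - current / n_vulnerable)
        infected.append(min(n_vulnerable, current + new))
    return infected

curve = simulate(n_vulnerable=1_000_000, rate=1.0, steps=40)
for step in (0, 5, 10, 15, 20, 25, 30):
    print(step, round(curve[step]))
```

Early on the count roughly doubles each step (the exponential part); once a significant fraction of the vulnerable hosts is infected, the saturation factor takes over and the curve flattens.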
We'll repeat this in a second, because this was also famously misunderstood by one of the virus authors. All right, so you can see the exponential growth initially; once a significant fraction of the Internet is infected, it slows down quite a lot.

So, some famous examples of worms. The first famous and successful worm was the Internet worm, or the Morris worm, from 1988. As we saw, it was relatively small scale simply because the network was fairly small, and it's thought to have infected about 6,000 hosts. Code Red was a newer generation of worm, a fairly efficiently coded one, from 2001, that was able to infect nearly half a million hosts in about 10 hours.

But then the viruses became more sophisticated. In particular, they switched from using TCP connections, as in Code Red, to using UDP. If you recall, when TCP establishes a connection it sends a SYN request and waits for an acknowledgement, so there's a round trip involved in causing a new host to be infected. Slammer instead used UDP, which is a connectionless protocol. So it was able to just send out packets in parallel, not waiting for a response.
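A back-of-the-envelope comparison makes the difference concrete. This is only a sketch with made-up numbers: the round-trip time, the number of simultaneous connections, the link speed, and the packet size are all assumptions, and the two function names are hypothetical.

```python
def tcp_attempts_per_second(rtt_seconds: float, concurrent_connections: int) -> float:
    """Each connection attempt ties up a slot for at least one round trip."""
    return concurrent_connections / rtt_seconds

def udp_packets_per_second(link_bits_per_second: float, packet_bytes: int) -> float:
    """A sender that never waits is limited only by how fast it can emit packets."""
    return link_bits_per_second / (packet_bytes * 8)

# Assumed figures: 100 ms round trip, 100 connections in flight,
# a 100 Mb/s link, and 400-byte packets.
print(tcp_attempts_per_second(0.1, 100))
print(udp_packets_per_second(100e6, 400))
```

With these assumed numbers the connectionless sender is bandwidth-bound rather than latency-bound, which is the property the lecture is pointing at.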
Slammer was able to cause damage much faster, using a UDP service in Microsoft SQL Server. This is sort of the state of the art, and current viruses are thought to go much faster.

So, zero-day exploits. "Zero day" refers to the number of days between the recognition of a vulnerability by the software company and the time at which it's exploited by a virus writer or worm author. Zero days is the worst case, which means that it's an unknown exploit. In other words, the hackers discover the buffer overrun problem before Microsoft does, or before the Linux authors do. So there'll be a lag before that virus can be recognized and before a patch can be distributed, during which it's going to spread as fast as the theoretical limits allow. The theoretical limits now are of the order of a million hosts in about 1.3 seconds, and the estimated damage of that kind of worm is in the tens of billions of dollars. So we're probably lucky that we haven't had worse infections. We'll see a lot of these viruses in a second.
We'll just talk a little bit more about them: they have had some fairly serious flaws in their design, so they were not as bad as they could have been.

So, Robert Morris's worm. Back in the 80s when this was written, most of the Internet was Unix hosts, and a lot of them used parts of the BSD distribution. Morris's claim was that he was trying to measure the size of the Internet with breadth-first search. There were a couple of things wrong with his claim. One is that he carefully disguised the source of the infection. He was working at Cornell as a graduate student, but he actually launched it from MIT, which oddly enough eventually became his employer. The other is that he used a number of security exploits to do this rather than simply asking people if he could run code on their machines, and he didn't tell anyone he was doing it either.

The exploits he used were actually very common ones, which have been used over and over again. He also relied on the academic networks. Academic networks were a big part of the Internet at the time, and people often created chains of automatic login from one machine to another, and enabled remote shell (rsh) access, which is an insecure version of SSH access, and remote login access. So he did marshal a lot of loopholes that were present in the network at the time in order to get this to work. Yeah, and, well, he claimed it was benign.
Obviously the courts didn't believe that. But it did have a bug. He was afraid that system administrators would try to kill the worm by recognizing that there was an instance of it running and then masquerading as a running instance. He thought: if they pretend there's already an infection, and my infection stops when it sees that it's already running, that's bad. So what he did instead was have it reinfect at random with some probability, and it was one in seven. If the worm encountered a host that showed no sign of it already being installed, it would run immediately. Otherwise it would die with six chances out of seven, but on the seventh chance it would run again. And that was critical.

He mistakenly thought that by running fewer instances of the worm, he would slow it down enough to not cause an exponential explosion. The trouble is that, because there was no bound on the out-degree of the worm, all he actually did was slow things down. Actually, let me go back to the curve. Sorry. So yeah, this curve here. By simply reducing the probability of infection, for this type of worm, which is just going to continue attacking hosts, what he actually did was just reduce the constant K. But the shape of this curve isn't affected by the constant K, only the scale. So if I modify K, all I'm doing is effectively scaling time here. This curve elongates, but it has the same asymptotic behavior.

So that was a bad mistake, which actually made the worm a lot more dangerous than it could have been. Instead of being a benign breadth-first search that would basically have run on each machine once, it became this exponentially growing worm, which also caused systems to lock up once they overloaded the number of connections they could support.
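The point that changing K only rescales time can be checked numerically. This sketch assumes the standard logistic growth law dI/dt = K I (1 - I/N) as the curve being discussed, and it uses a factor of seven purely for illustration; how the one-in-seven rule actually mapped onto K is not something the lecture specifies.

```python
import math

def logistic(t: float, k: float, n: float = 1.0, i0: float = 1e-6) -> float:
    """Closed-form solution of dI/dt = k * I * (1 - I/n) with I(0) = i0."""
    return n / (1 + (n / i0 - 1) * math.exp(-k * t))

def time_to_half(k: float, n: float = 1.0, i0: float = 1e-6) -> float:
    """Time at which half of the vulnerable population is infected."""
    return math.log(n / i0 - 1) / k

fast_k = 1.0
slow_k = fast_k / 7      # illustrative slowdown factor

# The slow curve at time 7t matches the fast curve at time t.
for t in (1.0, 5.0, 10.0, 20.0):
    print(t, logistic(t, fast_k), logistic(7 * t, slow_k))

print(time_to_half(fast_k), time_to_half(slow_k))
```

Dividing K by seven multiplies the time to reach any given infection level by seven, but the curve still climbs to the same plateau.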
All right, so, Code Red. Code Red is a really cool-sounding name for a virus, but the name actually came from the people who discovered it, who were apparently looking at cans of Code Red Mountain Dew while they were doing this work late at night. So it sounds like some formal name from a spy agency or a national security agency, but it's just a soda drink name.

So, Code Red. In 2001 people had started to install firewalls, and things were a bit safer. Nevertheless, this virus successfully targeted port 80, which is much more accessible and much more freely admitted through firewalls. So that was a smart choice. Unfortunately there was a buffer overflow vulnerability in the HTTP server code, and the worm was able to copy itself using that exploit, similar to what we described earlier, and then execute a random scan and infect more hosts.

The first version of the worm didn't work very well, though. It turned out that the random number generation, which was hand coded, had some problems in it. In this case the random number generator used a fixed seed, which meant that whenever the worm ran, it would try to infect the same hosts in the same order. So you can see you really defeat the value of parallelism in that case, and it was a lot slower than it could have been. But anyway, the authors did fix it.
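The fixed-seed problem is easy to demonstrate in general terms. This sketch uses Python's own random module rather than the worm's hand-coded generator, and the function names scan_order and as_dotted and the seed values are made up for illustration: the same seed always yields the same sequence of 32-bit addresses, while different seeds yield different sequences.

```python
import random

def scan_order(seed: int, count: int = 5):
    """Return the first few 32-bit addresses produced from a given seed."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(count)]

def as_dotted(address: int) -> str:
    """Format a 32-bit word as a dotted-quad IP address."""
    return ".".join(str((address >> shift) & 0xFF) for shift in (24, 16, 8, 0))

# Two "instances" sharing one hard-coded seed walk the same list in the same order.
instance_a = scan_order(seed=1234)
instance_b = scan_order(seed=1234)
print(instance_a == instance_b)
print([as_dotted(a) for a in instance_a])

# Instances seeded differently cover different parts of the address space.
print(scan_order(seed=1) == scan_order(seed=2))
```

With a shared seed, every additional instance just repeats work that an earlier instance already did, which is why the parallelism buys so little.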
So the second generation of Code Red used the time or some other varying information from the host, and it was able to go a lot faster. The authors weren't conclusively identified, but they're believed to have come from the Philippines.

All right, so Slammer, just a few years later, was a very scary virus. SQL Slammer exploited a vulnerability in Microsoft SQL Server, and it was a very fast-moving virus, because this particular vulnerability was in a service that ran over UDP. So it didn't require connections to be established before the offending code could be sent across the network, and that made it vastly faster to deploy and infect. You had this massive number of hosts hit in a matter of minutes and seconds.

The good thing about this particular malicious code was that it didn't affect all machines on the Internet. Because it was using a SQL Server vulnerability, it only affected server machines and a handful of user machines that had SQL Server. Sometimes it's attached to Microsoft's developer suite as well, so some developer machines too, but mostly servers. So luckily there was a limited number of those, but they did include name servers for the Internet, and a large fraction of those went down as a result of this.

In a sense there was another design weakness in this virus, which is that it spread very fast but, just like natural viruses, it also killed off its hosts fairly fast. Once a host was hit, it would become overloaded and typically shut down. With natural viruses, too, nature has figured out that very fast-spreading viruses are typically not optimal. The best virus is one that is slow enough to impair the host and spread a lot, but continue to allow the host to keep running, and the more sophisticated viruses these days follow that pattern.
So they will establish a certain number of connections and then stop, or maintain that number of connections, so that the infecting host corrupts a number of target machines and doesn't kill itself, but continues to complete the operations on the connections it already has. It can corrupt more hosts that way.

What's shown in this plot is the actual observed impact of this virus. Even though it was affecting servers and some name servers, it caused significant packet loss both directly and indirectly, some through network traffic and some through the fact that these network servers were going offline. It was actually quenched fairly quickly when it first established itself on Saturday, but then two days later there's this huge spike, which happened because people turned their server machines on on Monday morning, first in, I guess, Europe and then in the US. So some of these hosts had been taken offline, some of them had simply crashed, and an equilibrium was reached fairly quickly. But then the infection re-initialized, almost to the same degree, when people brought all of the other machines online that had been idle over the weekend. You can read about this on the F-Secure site; that's the security monitoring company in Finland.

All right, so. Okay. Yeah, so we're a little bit out of sequence here.
Sorry. So Slammer uses this UDP port as an exploit. There's a listening process on that particular port that has a buffer overrun problem, and it allows code to execute and forward these requests on to other sites randomly. It was able to bring down a large number of root name servers, which also used SQL Server.

A very interesting design choice here, I guess, if you're a virus writer, is that it only used in-memory access; it didn't write itself out to disk. It seems like that was a clever design decision, in that it made it very hard for people to figure out who wrote this, and people never did. There were no timestamps, and they couldn't even figure out, as we'll see in a second, which machine was hit first. It did make dealing with the bug fairly easy, at least. Once you closed that port on a firewall you could simply restart the machine, because there was no persistent state. It was only code running live for some time, so you could reboot and the problem would go away.

Interestingly enough, the first infected host was never found, even ten years later, and the author was never discovered either. People really don't have much idea. The programmers who originally discovered this virus wrote about it extensively, but no one ever figured out who was the cause.

Some things they could figure out, though, from the code itself: whoever did this wasn't a very effective programmer. This one also had more subtle bugs in its random number generation. In this case they basically copied a random number generator from somewhere else, but flipped an XOR for an OR and forgot to add one in the two's-complement negation of a number. So it was apparently coded in assembly language with some bugs, and fortunately that meant there was a big chunk of address space that it didn't target, because the random number generator simply didn't generate a fraction of the IP
addresses. So yeah, you can see a pattern that has been consistent across a lot of these big viruses: they could have been a lot worse, but fortunately the people who have the most skill at writing this kind of software seem not to be the same people who are writing it, so far. There are some exceptions recently, because this is becoming both a corporate process and a government process.

All right, so just to summarize some of the techniques that people have used. Yeah, question? I mean, Macs certainly seem to have been doing better, but they are amenable to similar viruses to Windows. And there are viruses targeted at iPads and Android devices as well. It's just that those have been smaller markets, but they're starting to grow a lot, because those devices are much more common and much more online. So we'll probably see a lot more generalized viruses than traditional Windows viruses.

Yeah, well, the earliest ones seem to have been more exploratory. Now there's more of an industry, or, what's the word? I guess it's a government industry. A lot of governments run formal programs to disrupt the operations of, or extract information from, other agencies and other countries.

And, you know, that's very true. Cybercrime is much more prevalent now, this large-scale sale of passwords. I think Vern Paxson was telling us you can buy Facebook IDs. People actually capture and acquire Facebook IDs so they can use them as credentials to obtain other kinds of credentials. So there's a market in identities as well that's growing a lot.
Anyway, so the types of exploit are diversifying and growing. The exploits have traditionally come from a few sources, though. There's a variety of Unix tools, which are the ones that Robert Morris used originally. A number of vulnerabilities have shown up in IIS, the Windows web server. And SNMP, the protocol used by various devices to manage the network, has also been a source of many problems.

All right, so let's discuss some of the solutions, and there was a suggestion for one of them that is actually used. The first one is not to write buggy software. What is perhaps surprising about these kinds of vulnerabilities is that they have occurred in code that was written after people were aware of these problems, and really shouldn't have happened. Some of the older code that was grandfathered in from Berkeley Unix you can perhaps understand, but some of the other large vendors, whose name I've said many times and won't say again, should have known better, and there should have been more discipline to prevent that kind of problem recurring.

Obviously strcpy is a frequent offender. There's strncpy, a standard library routine that limits the number of bytes you can write, which is much better. You can also try to run code checking, which will spot some of the most likely offenders. People don't seem to like to do that.
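As a rough illustration of the kind of code checking just mentioned, here is a minimal sketch in Python that scans C source text for calls that copy without a length bound. The table of calls and suggested replacements is an assumption made for this example, not the rule set of any real analysis tool, and a regular expression like this will miss anything less obvious than a direct call.

```python
import re

# Hypothetical table: C library calls with no length bound, each paired with
# a bounded alternative. Illustrative only, not taken from a real checker.
UNBOUNDED_CALLS = {
    "strcpy": "strncpy",
    "strcat": "strncat",
    "sprintf": "snprintf",
    "gets": "fgets",
}

# \b stops "strcpy(" from matching inside names such as "my_strcpy(".
PATTERN = re.compile(r"\b(" + "|".join(UNBOUNDED_CALLS) + r")\s*\(")


def find_unbounded_calls(c_source):
    """Return (line_number, call, suggestion) for each flagged call."""
    findings = []
    for number, line in enumerate(c_source.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            call = match.group(1)
            findings.append((number, call, UNBOUNDED_CALLS[call]))
    return findings


SAMPLE = """\
void greet(const char *name) {
    char buf[16];
    strcpy(buf, name);              /* overruns buf if name is long */
    strncpy(buf, name, sizeof buf); /* bounded copy, not flagged */
}
"""

if __name__ == "__main__":
    for number, call, better in find_unbounded_calls(SAMPLE):
        print(f"line {number}: {call} has no length bound; consider {better}")
```

Note that strncpy only bounds the write; it does not add a terminating null byte when the source fills the buffer, so the bounded version still needs care.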
Although, not surprisingly, some of the exploits aren't as obvious as that either, so it's harder to track them by automated means.

There are many languages that lack most of these vulnerabilities, like Java, Perl, Python, etc. The problem is that there are then other kinds of vulnerabilities under the hood, in the virtual machine itself, that you can try to exploit instead. But anyway, direct access to arbitrary blocks of memory, as in C, is a bad idea. Even going to C++ with encapsulated strings is a lot better, because the strings have a length and so on encapsulated in the string object, which normally prevents you from doing bad things by overrunning.

Hardware support for controlling which regions of memory can be executed is a very good idea. We discussed the tools for that earlier in the course. This is what segmentation was for, right? Before pageable memory, we discussed the notion of abstract blocks of memory that only had certain types of access permitted. So yes, making the stack and heap non-executable is a good idea. Well, the stack, yes; the heap is trickier, because you may have chunks of code in there that you want to execute. But anyway, that is one of the techniques you can use.

People have explored more exotic techniques like address space randomization, which basically tries to cause memory to be mapped in a random pattern, so that when something overruns, the bytes are scattered across memory and are unlikely to reproducibly hit a particular address. But apart from the technical challenges of getting that to work, there are serious performance penalties when you're trying to access memory sequentially and it's been randomized in that way.

And finally, you can try to prevent processes from accessing other processes' data and address space. That should in principle confine the effects of one of these pieces of malicious code away from other things that are running.

Yeah? Well, I looked this stuff up just
briefly, so I don't know; Anthony might know more about this. As I understood it, it's a physical address thing, so that writing to an erroneous address goes to a random location in physical memory and has a low probability of hitting something that contains a return address or something that's going to be executed. Aside from circumvention, though, it does cause a really nasty fragmentation of memory, so that when you're trying to stream through memory it's really defeating you.

All right, and firewalls are obviously an important part of the solution as well. We'll talk about those in the second half of the lecture.

All right, so we're drawing to the end of the semester, and the second midterm is coming up next week on Wednesday. We have two rooms again, so we're splitting up alphabetically. For the review session we don't have a room yet as far as I know, but it's probably going to be on Sunday evening; that seemed to be the best compromise time we could find. It'll be a similar format, with a two-sided cheat sheet of notes, and it's going to cover everything since the last midterm: lectures, projects, and readings. Any questions? And of course there's no final.

Project four is coming up. I know people are probably going to be doing other things this weekend, so please plan to have your design doc ready by midnight on Monday. That's the last one, and project four itself will stretch out for a few more weeks after that. All right, so let's break, and then we'll wrap up and discuss firewalls at the end.

Okay, so let's finish up. Let's review some of the things that we talked about today. A digital certificate provides a binding between a host's identity and its public key: true or false? True. Anyone else? Yeah, okay, that's right.
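To make the binding in that first review question concrete, here is a minimal sketch in Python of an authority signing a (name, public key) pair and a third party checking it. It uses unpadded textbook RSA with two small Mersenne primes purely so the example needs no external library; the key sizes, the `|` separator, and all function names are assumptions for illustration and are nowhere near safe for real use.

```python
import hashlib

# Toy key pair for the signing authority. These primes are far too small
# and too structured for real use; they only keep the sketch self-contained.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest(name, public_key):
    """Hash the (name, key) pair down to an integer smaller than N."""
    data = name.encode() + b"|" + public_key.encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def issue_certificate(name, public_key):
    """The authority signs the pair with its private exponent D."""
    return pow(digest(name, public_key), D, N)


def check_certificate(name, public_key, signature):
    """Anyone can verify using only the authority's public values (N, E)."""
    return pow(signature, E, N) == digest(name, public_key)


if __name__ == "__main__":
    cert = issue_certificate("alice", "alice-public-key")
    print(check_certificate("alice", "alice-public-key", cert))
    print(check_certificate("mallory", "alice-public-key", cert))
```

The check only succeeds for the exact name and key that were signed, which is the sense in which the certificate binds the two together; whether that binding means anything still depends on trusting the authority.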
Okay. A server must store a user's password in plain text so that it can be checked against the submitted password: true or false? Worms require human intervention to propagate? Right, some viruses do, though. And what about this one: type-safe languages eliminate the risk of buffer overflows? Well, we kind of implied it was true. They don't completely eliminate it, but they significantly reduce the frequency of the simple ones we described by preventing overruns on writes to arrays. In Java you can't access beyond the end of an array.

Okay, so a key concept in security these days is the firewall, which is a machine inserted on the boundary of a network, typically around a protected domain like a campus or a business. It stands between the machines inside the boundary and the rest of the internet. In some cases it may instead reside as software on a machine; Microsoft has a firewall that you adjust locally on your machine to prevent certain types of access. The idea of the firewall is to restrict the type of access that can come through the network into the domain.

The restrictions can be applied in a variety of ways. They can be based on the originating address, meaning source address and port number. They can be based on the type of payload in the packets, which is a stateless test: it may simply say, I won't accept TCP packets, or UDP packets, or FTP packets. Or they can be based on a stateful analysis, which is what you need if you want to allow outbound connections but prevent inbound connections.
You may want the hosts inside the firewall to be able to establish connections and then receive data from the hosts they are trying to reach, and that requires some stateful analysis for the firewall to understand that a packet is part of a connection that was established from inside.

Some of the rules the firewall applies are very common. One is just blocking anything that is not HTTP. Almost everybody wants a web presence, so they have to admit inbound HTTP requests through their firewall, but they may screen out everything else. That's probably the most common setup. They may reject emails that have attachments, because those carry the risk of containing viruses. And they may block external packets that carry an internal address; in other words, they prevent direct access to an interior host.

So obviously a firewall gives you the opportunity to secure a whole bunch of machines which may not themselves be very well maintained or secure. It doesn't prevent the attachment of malicious software to email messages that people then open. It's still going to admit email traffic, which can potentially get people into trouble. Social media, which rely primarily on HTTP as a protocol, are also going to be back doors that can still bypass the firewall.

With the firewall you're trying to balance the accessibility of the services that you do want to support against the protection of everything else. A second thing is that if the firewall is too restrictive, legitimate users will sometimes find ways of getting around the security in order to do the things they want to do, and that can cause a lot of trouble.
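The stateless and stateful rules described above can be sketched as a small packet filter in Python. The internal network range, the web server address, and the `Packet` fields are all assumptions chosen for the example, and "external packets that carry an internal address" is read here as an outside packet whose source field claims to be inside. A real stateful firewall would also track TCP flags and expire old entries.

```python
from dataclasses import dataclass
import ipaddress

INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")  # assumed protected domain
WEB_SERVER = ipaddress.ip_address("10.0.0.80")     # assumed public web host


@dataclass(frozen=True)
class Packet:
    src: str
    src_port: int
    dst: str
    dst_port: int
    from_outside: bool  # True if it arrived on the external interface


class Firewall:
    def __init__(self):
        # Stateful part: connections that inside hosts have opened.
        self.outbound = set()

    def allow(self, pkt):
        src = ipaddress.ip_address(pkt.src)
        dst = ipaddress.ip_address(pkt.dst)
        if not pkt.from_outside:
            # Outbound traffic is allowed and remembered so replies can return.
            self.outbound.add((pkt.src, pkt.src_port, pkt.dst, pkt.dst_port))
            return True
        if src in INTERNAL_NET:
            return False  # outside packet claiming an internal source address
        if dst == WEB_SERVER and pkt.dst_port == 80:
            return True   # stateless rule: admit inbound HTTP to the web host
        # Stateful rule: admit only replies to a remembered outbound connection.
        return (pkt.dst, pkt.dst_port, pkt.src, pkt.src_port) in self.outbound
```

For instance, an inbound packet from 198.51.100.2 port 443 to 10.0.0.7 port 50000 is rejected until 10.0.0.7 has first sent a packet the other way, which is exactly the "allow outbound, prevent inbound" behavior that needs state.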
Users may, for instance, set up tunnels through port 80 using a VPN or something else, which is generally secure. But if they have something insecure, say SSH connections without keys between a bunch of machines, it means that an attacker who compromises that one user's machine can in effect get inside the business network and then gather data or do other malicious things. So you have to be careful. There's a very strong human force, which is that people want to get work done, and if the firewall makes that too difficult then the legitimate users themselves are going to be the ones digging under the firewall to get through it.

Anyway, we saw that port 80 is most commonly left open so that you can have HTTP services. That implies that other services, often legitimately, will try to use port 80 to get in, using a tunneling process. That basically means changing the port number to port 80 and then running a pattern-based filter on the other side, which distributes the traffic based on stateful information or on the type of packet. In other words, you demultiplex all of the things going through port 80 out to different services.

One of the most common threats now is denial of service, which usually means traffic sent to a machine, or cycles run on it, in order to interfere with the operation of that machine or of other machines. In 2001 there was a series of denial of service attacks on many of the large internet portals. Some 12,000 different hosts were targeted, in about 2,000 different domains, 2,000 different companies, in just one week. Those attacks came from compromised hosts, and the compromised hosts were large enough in number to significantly interfere with the operations of these large entities.

An interesting attack was this one. Spamhaus is a filter provider. This is an
international agency, based I think in Britain and in Geneva, Switzerland, and they provide a service to other companies which is basically a blacklist of sites that they believe to be bad or have proved to be bad. Cyberbunker is a hosting company that started in Holland and whose location is now unknown. They provide hosting of almost any service on the web, except for child pornography and declared terrorists; they will host anything else. So they will host pirated things like Pirate Bay. What else? A lot of the WikiLeaks data was hosted by them, and allegedly so were a lot of hacker groups that exchange information about viruses and host malicious software. So reputedly these are bad guys. They seem to actually be bad guys; they've gone underground, so that's a pretty good sign.

Anyway, Spamhaus was filtering their content and saying anything from there is bad, and these guys didn't like that very much. Given that they have a whole community of hackers and malicious code, they fairly successfully launched a denial of service attack against the filtering company. In spring this year they launched an attack with more than half a million packets a second at this agency, and really effectively shut them down. The perpetrator seems to have been just one person, who was identified and arrested in April. He had a little moving van with antennas, I guess mobile satellite internet, or maybe a variety of different wireless channels from his van; I'm not really sure.

So it tells you a few things. The attacks can be very, very powerful. It may take only one person to carry them out, but there are also really good resources out there: code, IP addresses of compromised machines, all kinds of things. Botnets are often for sale; not sometimes, often.
So you can also buy a botnet fairly easily and use it to launch various other attacks.

All right, so the general form of a denial of service attack is that somehow we're preventing legitimate users from gaining a service on a machine, either by overloading it from within or by overloading it from outside. One of the common attacks is a SYN attack, through TCP. It's not only buggy implementations; even normal implementations of TCP are susceptible to this, and we'll see in a second how it operates. Basically the attacker launches a lot of connect requests from different hosts, none of which is actually going to complete; it just makes the request. The host starts executing the protocol to accept the connection and then just sits and waits. But it has committed some resources to each of those requests, and that will eventually cause it to crash.

This shouldn't really happen, and more recent implementations of TCP limit the number of connections that they will accept an attempt to open, so that the host doesn't run out of resources; new requests are then just dropped. But if you don't do this defensive coding, then once the set of resources that can accept a connect request is saturated, you can't make any more connections.
So all of the legitimate connections that are coming into this host will be denied, and in fact the machine may also crash because it won't be able to allocate memory for other things.

Just to quickly review how TCP connections get established: you have a client that's initiating the connection and a server that's going to receive it. If you recall, there are two sets of sequence numbers. The client has a sequence number that it starts with, which allows it to keep track of its traffic. The SYN-ACK response introduces a new sequence number, y, which belongs to the server, and that's how they keep things in sequence. The client is then supposed to send an acknowledgement of the acknowledgement, which carries the next sequence number after the server's number, y plus one. That's what is normally supposed to happen.

In a denial of service attack, though, a client can just send a SYN packet, wait for or ignore the response, and then never send this ACK. Or send it; it doesn't matter too much. In either case the server has to wait in a state where, for instance, it has to remember y, so that it can check the y plus one that comes back and make sure it's a legitimate acknowledgement of its acknowledgement. This is normally implemented as a state machine, so the server is going to remember what state it's in based on having done this first round of messages. You might continue on from here, but you would still be in a particular state of the state machine and you would still want to remember your sequence numbers.

So the simple version of the attack is just to send lots of these SYN messages. Each one of them creates a bit of state on the host, those accumulate, and eventually the host runs out of memory.
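The state the server has to hold between the SYN and the final ACK, and what happens when that state is exhausted, can be sketched with a toy model in Python. The class name, the `backlog_limit` parameter, and the sequential choice of y are assumptions made for the example; real TCP stacks pick initial sequence numbers unpredictably and time out stale entries.

```python
class ListeningSocket:
    """Toy model of a server that keeps state for each half-open connection."""

    def __init__(self, backlog_limit):
        self.backlog_limit = backlog_limit  # assumed cap on half-open entries
        self.half_open = {}                 # (client, port) -> server number y
        self.next_y = 1000
        self.established = set()

    def on_syn(self, client, port, x):
        """Handle SYN(x). Returns (y, x + 1) for the SYN-ACK, or None if dropped."""
        if len(self.half_open) >= self.backlog_limit:
            return None  # table full: every new request is dropped
        y = self.next_y
        self.next_y += 1
        self.half_open[(client, port)] = y  # state the server must remember
        return y, x + 1

    def on_ack(self, client, port, ack):
        """Handle the final ACK; it must acknowledge y + 1."""
        y = self.half_open.get((client, port))
        if y is None or ack != y + 1:
            return False
        del self.half_open[(client, port)]
        self.established.add((client, port))
        return True
```

With `backlog_limit=3`, three SYNs that are never acknowledged fill the table, and a fourth SYN from a legitimate client gets `None` back. The cap keeps the host from running out of memory, but the legitimate client is still denied service.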
A common thing the attacker wants to overcome is that if a server gets lots of connect requests from the same client, it can easily ignore them. So usually the attacker is going to spoof its own address. All it has to do is insert a source address different from the real one into the packets themselves. The IP packets have the source host address in them, so you insert a false address there and the receiving server will assume these are coming from different hosts. It has no way to check, so it will have to respond to all of them; some of them may be legitimate. From the server's point of view it looks like it's receiving these requests from different clients.

Yeah, go ahead. Oh, that's an interesting question. But the MAC address is going to be local to the physical network. So what you'd normally see is the MAC address of the router that sent you the packet, the entity that's on the same network you're on. Remember, the MAC address is an address on a particular physical network for the frame that is being formed. The originating network is, let's say, some physical Ethernet, and the destination may be on a different type of network. So the physical addresses are just local, and the server wouldn't have that information. All right.
So one solution to this, which borrows somewhat from cookies in web browsers, is to offload the state of the connection onto the client and make it in effect a stateless connection. The server proceeds as normal: when it receives the SYN connection request it will send back an acknowledgement packet, but it will use a sequence number y which is a hash of the client IP address, the client port, and a server key. And then it forgets about the connection attempt. So unlike the previous picture, where the server is remembering y, this is a stateless version of the previous protocol in which the server forgets about the connection at the first step, so it doesn't have to save any state. It does send the acknowledgement. The client responds normally, sending its acknowledgement, and the server waits to receive that second-round packet from the client, which should carry y plus one. At that stage the server can check whether it's in sequence, because it can regenerate y from the same information.

Yeah, it's something like this. It's like the idea of cookies normally, where you're offloading from the server's database to the client's file system. That does a number of things: it usually makes it a lot quicker to access the data, it saves a lot of storage on the host, and it makes the whole protocol simpler by making the connections stateless.

So is that clear? There are some limitations on this, especially the way it's presented, because you can see this y is really only allowing one connection from that client at a time. But anyway, with that limitation... yeah, go ahead.
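The stateless handshake just described can be sketched in a few lines of Python. The lecture only says that y is a hash of the client IP address, client port, and a server key; using HMAC-SHA256 truncated to 32 bits is an assumption made here, as are the key value and the function names. Production SYN cookie schemes also fold in a slowly changing time counter so that old cookies expire, which this sketch leaves out.

```python
import hashlib
import hmac

SERVER_KEY = b"example-secret"  # hypothetical secret known only to the server


def cookie(client_ip, client_port):
    """Derive the server sequence number y from the client address and the key."""
    msg = f"{client_ip}:{client_port}".encode()
    mac = hmac.new(SERVER_KEY, msg, hashlib.sha256).digest()
    return int.from_bytes(mac[:4], "big")  # TCP sequence numbers are 32 bits


def on_syn(client_ip, client_port, x):
    """Reply with SYN-ACK(y, x + 1) and store nothing about the client."""
    return cookie(client_ip, client_port), (x + 1) % 2**32


def on_ack(client_ip, client_port, ack):
    """Recompute y from the packet's own fields and accept only y + 1."""
    return ack == (cookie(client_ip, client_port) + 1) % 2**32
```

Nothing is written to any table in `on_syn`, so a flood of SYNs costs the server one hash each and no memory. An ACK from a different address fails the check because it hashes to a different y, and a client cannot forge y without the server key.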
All right. That was a lot of changes to this, and I'm trying to understand; can you explain what you were trying to do with that? Yeah. Sorry, the client saves it, right? Okay, and then what happens? All right, so you're suggesting an attack where the client does a lot of these, is that right? So you're trying to create a denial of service attack downstream, right?

Yeah, so I think this is a sketch of the first step of this protocol. You want to do something stateless at each step of the protocol. Remember that this protocol is still responding to all of the SYN requests that it receives; it's just not saving state. But generally that's going to allow it to deal with a lot more connections. This operation is pretty fast. In fact this sort of thing is generally fast enough that it could receive these requests at more or less the saturation rate of the network, or of that order. So in a sense it can't really get overloaded, and as long as you continue to do stateless steps, which you will presumably want to do at every step of this process, you should be fine. I don't know if that makes sense.

So I think you're worried about the same problem happening downstream. You want to use the same idea at every step of the protocol: remove the need to store state information on the server. I don't know exactly how you'd do that, but the idea should work, which is to push the state onto the client, perhaps adding extra state information to the hash. As soon as you put the server's key in here.
You should be able to verify.

You can also often improve an attack. If an attacker has been identified or flagged as such and screened out by a firewall, a common way around that is to use reflection. Instead of attacking directly, the attacker sends to a compromised machine, which launches its own packet at the victim. The source address will be the reflector's address, and the victim will be the target.

Okay, so let's look at some ways of stopping attacks. Actually, I'm running a little bit behind, so let's try to go quickly through this. Egress filtering is stopping outbound packets which don't have a valid source address. In other words, it's preventing people inside a domain from spoofing their addresses, and if that were done everywhere it would actually prevent spoofing. You can also include traceback information in packets to allow routers to trace malicious packets.

Distributed denial of service is the big problem these days; that was what was used in the Cyberbunker attack. The idea is that you marshal a huge number of machines. Some of them, I think, were probably bought, and some of them may have been directly compromised.

The last thing I'll talk about is two-factor authentication, which is becoming increasingly important. Most organizations and most companies now use some form of it. It typically involves two out of three things. First, something the user knows: usually a password, and people are exploring visual authentication, like recognizing friends' faces. Second, something the user has, which can be a SecurID fob or dongle, or an ATM card for authentication with a bank. Third, there are a lot of experiments now with various kinds of biometric authentication. Face recognition is being used commercially, though not for very secure things. Voice recognition, meaning recognition of the user from their voice, is another, and fingerprints are the
traditional ones, and even heart rate: the heart rate waveform is a fairly good identifier. So two-factor authentication involves two of these factors, typically a password and an authenticating card. In the future we'd like to use more biometric authentication, because people are good at losing those things.

There was an interesting attack on RSA, who make one of the most widely used security fobs. Basically, hackers, through some people that they knew inside the company, successfully infiltrated the machines and were able to grab a large number of passwords. They basically had the private keys and the data with which to simulate these devices for a large number of fobs, and to effectively penetrate many, many companies. So that was pretty serious. In spite of the fact that these devices provide a very high level of security when correctly used and when the private key data is fully protected, they're amenable to the same kind of human frailty that all systems are amenable to. So in fact that system has been very successfully hacked at scale, but nevertheless people still use this.
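Why stolen key data lets an attacker simulate a fob can be seen from how one-time codes are derived. The sketch below uses the open HOTP and TOTP constructions (RFC 4226 and RFC 6238) as a stand-in; it is not the algorithm inside RSA's fobs, and the `login` helper and its parameters are hypothetical names for this example.

```python
import hashlib
import hmac
import struct
import time


def hotp(seed, counter, digits=6):
    """HMAC-based one-time password with RFC 4226 dynamic truncation."""
    mac = hmac.new(seed, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10**digits).zfill(digits)


def fob_code(seed, now=None, step=30):
    """Time-based variant: the counter is the number of elapsed time steps."""
    now = time.time() if now is None else now
    return hotp(seed, int(now // step))


def login(password_ok, submitted_code, seed, now=None):
    """Two factors: something the user knows and something the user has."""
    return password_ok and hmac.compare_digest(submitted_code, fob_code(seed, now))
```

The displayed code is a pure function of the secret seed and the clock, so the fob is only "something the user has" for as long as the seed stays secret. Anyone holding a copy of the seed can call `fob_code` and produce the same digits without the device.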
The fob still seems to be the preferred validation technology for many companies. All right, so there's the attack.

I won't talk about advanced persistent threats. This is basically a corporate espionage technique which involves long-term placement of malicious code and gradually escalating access and capabilities, so that you can pretty much do whatever you want and exfiltrate a lot of information. Anyway, there's a link there that you can follow to look in more detail at how that's done.

All right, so in summary: security is one of the biggest problems today. Problems are typically caused, most of the time, by very poor design that allows things like buffer overruns. We described a number of solutions, in particular the use of firewalls, which in practice is the main solution these days. We talked about denial of service attacks, which are pretty difficult to defeat, and there's a lot of research in this area right now. And finally we looked at some of the common methods of authentication, including two-factor authentication, which potentially is very strong but still has the same human weaknesses as all of these methods do.

All right, so we'll see you on Wednesday. Have a good vacation.