Oops, yes, I don't know, it's not Halloween anymore, so everything feels fine. Alright, let's get started. We've got a lot to cover today, so we're going to jump right into it. We looked before at local routing: one machine is on the same local network as another machine and wants to send an IP packet to it. So, first, how does it know that it's on the same local network? It checks the subnet. The destination is on the local network, so then it does what? Does it send an ARP request? It sends an ARP request to do what? It sends an ARP request to all machines asking who this IP belongs to, and then whoever that IP belongs to sends back an ARP reply with its MAC address? Yep, and then what does it do? It writes it into the ARP table? Yes, and then what? Yeah, in the back. Wait, what? Do we do it again? No. Do you want to answer? Alright, so then what happens? We've got an ARP reply back, so we now have a mapping between the IP address and the physical address. Now what does it do? We just start sending packets. You have the MAC address and you just send it to the recipient through that address. Yep, so at what layer? You need to be more specific on that. Use the transport layer, and either TCP or UDP? Yeah, we haven't talked about that yet, so let's ignore that. This is everything below that. Right, so now I know where I want to go; how do I actually make this packet ready to go? Make an Ethernet frame and transfer it. With what? What do you put in that Ethernet frame? You won't forget it, it's going to be burned into your memory forever. The machine has the IP layer of the packet, so it already knows the source IP and the destination IP. It takes that packet and encapsulates it in an Ethernet frame, which has the source MAC address and the destination MAC address. The destination MAC address is the one it just got from that ARP reply. So this is if the node is on our local network, right?
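The encapsulation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the MAC addresses, the ARP table contents, and the 20-byte placeholder IP packet are all made up, and the helper names (mac_to_bytes, encapsulate) are hypothetical. The one fixed fact it relies on is the Ethernet II header layout: destination MAC, then source MAC, then a two-byte EtherType (0x0800 for IPv4).

```python
import struct

ETHERTYPE_IPV4 = 0x0800  # EtherType value that marks the payload as IPv4


def mac_to_bytes(mac: str) -> bytes:
    """Turn 'aa:bb:cc:00:00:01' into its 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def encapsulate(ip_packet: bytes, src_mac: str, dst_mac: str) -> bytes:
    """Prepend an Ethernet II header (dst MAC, src MAC, EtherType) to an IP packet."""
    header = (
        mac_to_bytes(dst_mac)
        + mac_to_bytes(src_mac)
        + struct.pack("!H", ETHERTYPE_IPV4)
    )
    return header + ip_packet


# Hypothetical ARP table, filled in by an earlier ARP reply.
arp_table = {"192.168.1.24": "aa:bb:cc:00:00:18"}

# Placeholder for a real IP packet: 20 bytes standing in for an IPv4 header.
ip_packet = b"\x45" + b"\x00" * 19

frame = encapsulate(ip_packet, "aa:bb:cc:00:00:01", arp_table["192.168.1.24"])
print(len(frame), frame[:14].hex())
```

Note that the IP packet is carried through untouched; only the 14-byte link-layer header is added in front of it.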
And we know how to answer that question of is it on the local network? We can easily answer that. The question then becomes, well, what happens if it is not on our local network? Which is going to happen constantly, right? When you talk to Google, Google is not in your local network. You can easily check this, you can look at, open up a terminal and run ifconfig and you can see your IP address and your subnet mask. So you'll know what subnet you're on and you can do a DNS request to figure out what's the IP address of Google. You won't very easily be able to tell that that is not in your local network. So how does a packet then get from our computer to Google? What does that do? A domain name server? What does that do? It takes a, I guess, URL, text, whatever you want and transforms it into an IP address by going through a bunch of other bigger domain name servers than just smaller domain name servers. Yes, so on its core, the important thing is that it has domain names to IP addresses. So it doesn't, URLs have a lot of other junk in them. But one part of that is the domain name. So let's ignore that. So just like in our local network case, we know we want to talk to a specific IP address. We just know we want to send an IP packet from us to this other address. So, hey, how do we tell that it's not on our local network? No. How do you send a ping packet out? This is just a guess, but you would send an ARP request, but if nobody replies, then you know it's not on your local network? No. Does that just mean it's down? If it's not responding to ARP requests? Yeah. Just a second. You would want to check your subnet, and see if it's not in the other network. You would need the opposite check that you check to see if it's on your network, right? So you do the opposite check. You check your subnet. It's not in your subnet. So you know this machine must be in some other network. Do you know exactly what that network is? No. No, right? 
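The subnet check itself is small enough to show directly. Here is a minimal sketch using Python's standard ipaddress module; the addresses and the function name is_local are illustrative assumptions, not values from the lecture slides.

```python
import ipaddress


def is_local(my_ip: str, netmask: str, dest_ip: str) -> bool:
    """Return True if dest_ip falls inside the subnet defined by my_ip and netmask."""
    # strict=False lets us pass a host address; the host bits are masked off.
    network = ipaddress.ip_network(f"{my_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(dest_ip) in network


# Hypothetical host 192.168.1.121 with a 255.255.255.0 mask.
print(is_local("192.168.1.121", "255.255.255.0", "192.168.1.10"))
print(is_local("192.168.1.121", "255.255.255.0", "8.8.8.8"))
```

The same function answers both questions: a True result means direct delivery, and a False result means the destination is in some other network, without telling us anything about which one.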
That would be an insane amount of information for one computer to have, right? To have to know about every single network out there. All you need to know is what IP address do you want to talk to? Well, that's part A is what you need to know. But the question then is, where do you send this packet? You can't just throw it into the air and hope that Google catches it at some point, right? Kind of. So where does this packet actually go? So think about your home network. Where do your packets actually go? Send the router. To the router. Why is the router? Which packets do you have to go to the network? Right, remember we're talking about networks of networks, right? If you have a completely closed network, then there's no way a packet can get from us to Google, right? Try this at home. Unplug from your router the external internet cable. See how well your connection goes. You can still talk to all the hosts on your local network. You can still transfer files between machines. You can do all that kind of fancy stuff. But you will not be able to talk to any external hosts, right? And so the idea is in every network you have one machine. It's usually called, we'll use the term gateway rather than router because gateway is more general. There's one machine that's the gateway that knows how to talk to the rest of the networks. So actually we brought my drawing tablet so this should go a little bit easier. We think about that. So you have, let's say in your home network, you have your computer and you have your mobile phone. And maybe you have some storage server, right? They're all basically on the same subnet connected to some gateway, right? Then this has a connection where? Yeah, it has some kind of connection to some ISP network, which is part of some vast internet that I'll draw on the town of Nebulous Cloud. And somewhere there will be a connection to a gateway at Google, which will event and there could be other networks here. There could be another gateway here. 
And then finally get to some Google server that can actually reply to your request. So an important thing to always keep in mind is thinking about like how much information should I need to know? So should your mobile phone have to worry about the infrastructure of everything in here? No, that would be crazy, right? That would be nuts. Like think about every time AT&T decides to move switches around, they have to update everyone on the internet all, I don't even know, billion devices is probably a safe guess. It's probably an understatement. So, but what we need, so think about this from our perspective. So as the, we'll do the client here, as this client, so we know we want to talk to some IP address, we'll call it, G is confusing, we'll call it B. We're going to talk to Bing now, switch that up. We're going to have a client, we want to talk to Bing, we are, we're having a machine, so we're A, we want to talk to, so we are IP address A, we want to talk to IP address B. What, so we know from before, we know our subnet, right? And our subnet allows us to say whether the IP address B is on our local network or not. So it tells us it's not. What's the next thing we need to know? We need to know who is our gateway or whom do we give traffic to, right? So we need extra information. There's gateway, or the other way to think about this, as we'll see, is a routing table. So we need to know, okay, I know this isn't on my local network, so who do I give this packet to? Who can actually take this packet and send it where it needs to go so it will hopefully get closer to its home, right? Again, at this level, IP, there are no guarantees that it's ever actually going to get there. Just like you can send packets to your router, but if it has nowhere to put them, it just drops them, right? Cut this link, and your packets aren't going anywhere. Do you only have one gateway? Your local network, yeah. Always. 
I mean, you could pay for more, but you have to have some complicated system set up to actually use the bandwidth for each one. That's not that complicated. It depends on how you want to use it. You want to double your bandwidth, or you want redundancy, in case AT&T has a problem, you can switch over to Verizon. So actually, a lot of companies will do this, as they will have two internet connections just in case, right? So you could have basically a gateway that is aware and decides between the two, or as we'll see, and this is what I wanted to move away from just thinking about a single gateway, is a routing table can specify exactly where to route packets to what hosts, depending on the IP address. So you may want to split your traffic and say, I don't know if you can say this, well, that would be silly, I think. It's tech. It doesn't matter. But you could say something like, let's say you have a more complicated network, and you say, hey, if you want to talk to Google, it's actually cheaper for us to talk to Google through Verizon. So use this gateway to talk to Google, but if you want to talk to Bing, use, I mean, use AT&T's network, because that's actually going to be cheaper and faster for us. So what does the gateway do? So let's say we ask, basically, okay, we know it's not on our local network. We know it's not on our local network. So we take our packet, we're going to essentially, as we'll see, we do exactly what we did before, but the Ethernet layer is going to be from us to the gateway. The IP will still be from A to B. We send that to the gateway. The gateway gets the packet, and then what does it do? How does it know to give it to the ISP? It doesn't give it back to the mobile phone or back to the server. It has its own routing. So the point is these are all the same components, right? From the gateway's perspective, it just has a packet. It knows that it knows, A, it's not destined to it. Why does it know that? It's not its IP. 
The IP packet says it's destined for B. B is not the same as this gateway's IP. So it says, oh, here, I've got a packet. I need to send it out to the next person. Do I send it to the mobile phone? No, I don't send it to any of these. Just like everything else is configured with its IP and its own subnet and its own gateway or routing table. So that way this gateway knows who do I send packets to, right? So as long as every gateway knows how to route packets based on the IP address, eventually this packet, as long as each of those hops is getting closer to its destination, it will eventually reach this gateway, which will say, oh, pass it to this gateway, and this gateway will take it, pass it to Bing where it can be processed by that machine. So it's actually a conceptually a simple process because once you get down this how routing works thing, everything is the same. The clients, the gateways, everything in between. There's no difference. They just get a packet and they either decide if they want to pass it on, then yes, then they look up their routing table and decide where it goes next. And that's a super straight, repetitive process that doesn't change depending on where you are. Cool. And so at each level, right, so the other thing to think about is that each local network, the Ethernet frame is going to change, right? But the upper level, the IP doesn't change because the IP packet still wants to get from A to B. So based on what we just said, so what do we know must be true about the gateway and A? What are things that must be true? There are one, I will lead you to, but there are other things that can be true. They're on the same local network and they know each other's IPs? Well, the gateway at least knows that, A's IPs. So the thing about the first part you said, we have to be on the same local network, why? Well, I guess that's just where we start off with our ARC requests as broadcasting to our local network. 
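To make the "Ethernet frame changes, IP packet doesn't" point concrete, here is a small sketch. The names IPPacket, Frame, frames_along_path, and the MAC labels are all hypothetical; it simply lists the frame that would exist on each link of a path from A through two gateways to B.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IPPacket:
    src_ip: str
    dst_ip: str


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    payload: IPPacket


def frames_along_path(packet: IPPacket, macs: list[str]) -> list[Frame]:
    """Build one frame per link: each hop re-wraps the same IP packet
    with its own MAC as source and the next machine's MAC as destination."""
    return [Frame(src, dst, packet) for src, dst in zip(macs, macs[1:])]


# Hypothetical path: host A, two gateways, host B.
path_macs = ["mac-A", "mac-G1", "mac-G2", "mac-B"]
packet = IPPacket(src_ip="A", dst_ip="B")

frames = frames_along_path(packet, path_macs)
for frame in frames:
    print(frame.src_mac, "->", frame.dst_mac, "carrying", frame.payload)
```

Every frame carries the identical source and destination IP, while the MAC pair is different on every link.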
But I'm not sure of an answer other than that you can't get out without going to the gateway. Yeah, we only know how to move packets one hop, which means we need to know the physical address of the next machine to give the packet to. That machine must be in our local subnetwork, right? Let's say we wanted to talk from A to M. We know because of our subnet that it is not a multi-hop, it is a direct delivery. So we'll do an ARP request, figure out M's MAC address, make a packet with a source MAC of A and a destination MAC of M, and send it directly to M. But in this case, we know that the IP address B is not in our local network, so we need to send this packet to our gateway. In order to send it to the gateway, we need to figure out the MAC address of the gateway, and to do that we need an ARP request. So fundamentally, you've got to think of every hop as a direct delivery, right? From A's perspective, G has to be on its subnet in order for A to actually send the traffic there. If you've ever configured this manually on the command line in Linux, you'll get an error if you don't quite know what you're doing: without thinking through all these cases, you could easily configure a subnet that does not contain your gateway, and you'll have massive problems. What else? What else did you say? I just focused on that one. Well, I said that they needed to know each other's IP, but I mean, you don't need to know each other's address; that's what ARP does when we're broadcasting that out. ARP does what? Well, ARP maps the MAC addresses to IPs. It maps the MAC address to an IP; you broadcast and ask, who has this IP address? So that would mean that you know the MAC address and you're getting the IP address. That's what I mean by "maps to", like a function, right?
So if you think about ARP as a mathematical function, it takes in what and returns what? It takes in an IP and returns a MAC. Yes, so it maps IP addresses to MAC addresses, right? It also keeps a backward mapping, but for purposes of thinking about it, it does this mapping. Thinking about it that way, A must know the IP address of G, right? So it needs to know the gateway. Yes, it needs to know the gateway's IP address; it does not need to know the MAC address, because it can find that out through an ARP request. What was the third thing you said? I think that's all I was trying to say: they need to be on the same network and they need to know their IPs. Great. So the way to think about this is: we know it's indirect delivery because the destination IP address is not in our local network. So we figure out, where's the first hop of this packet going? You consult your routing table, which will tell you the gateway to use. You then do a direct delivery between you and the gateway. The gateway gets this and says, okay, this isn't destined for me, and it's not in my local network, so it's an indirect delivery and I know it needs to go to my gateway. It sends that along, and that process keeps happening until finally the last gateway receives that packet. Then what does that gateway say? No, not yet, we're going to talk about that. Yes, so first it checks: is this packet for me? They always check, is this destined for my IP address? It will be no, because we don't usually want to talk to a router, unless we're trying to do something fun and malicious. So it says, okay, it's not destined for me; is it for my local network? And then it says, yes it is. So then it does a direct delivery from the gateway to B, and now it will just do the same process.
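The three-way check that every host and gateway performs can be written as one small function. This is a sketch under assumptions: the function name forwarding_decision, the return labels, and all addresses are invented for illustration, and a single default gateway stands in for a full routing table. The second element of the result is the IP address whose MAC the machine would need to resolve with ARP.

```python
import ipaddress


def forwarding_decision(my_ip, my_network, gateway_ip, dest_ip):
    """Decide what to do with a packet addressed to dest_ip.

    Returns (action, next_hop_ip), where next_hop_ip is the address
    to resolve with ARP before building the Ethernet frame.
    """
    dest = ipaddress.ip_address(dest_ip)
    if dest == ipaddress.ip_address(my_ip):
        return ("accept", None)            # the packet is for this machine
    if dest in ipaddress.ip_network(my_network):
        return ("direct", dest_ip)         # same subnet: ARP for the destination
    return ("indirect", gateway_ip)        # other network: ARP for the gateway


# Hypothetical last gateway on the path, sitting on the destination's subnet.
print(forwarding_decision("203.0.113.1", "203.0.113.0/24", "198.51.100.1", "203.0.113.10"))
# Hypothetical home client sending to the same destination.
print(forwarding_decision("192.168.1.121", "192.168.1.0/24", "192.168.1.1", "203.0.113.10"))
```

The client and the gateway run exactly the same logic; only their own address, subnet, and gateway differ.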
If it doesn't know the MAC address, it will do an ARP request, get back the MAC address of B, and send the packet directly to B. It's actually really easy. Cool. So we're going to have an example where we are at this machine, 121, and we want to talk to .10. One important thing that came up in this process: nobody actually knows whether this packet is getting closer to its final destination, right? There's absolutely no guarantee about any of those routes and hops along the way. Anybody could have made a mistake and either sent the packet backwards to the previous gateway, or to a third gateway that sent it back to the first one, so you end up in a loop. You can end up in all kinds of routing loops. Clearly we don't want packets to live forever and keep infinitely cycling around the network, although that would be a really cool way to store data. I've heard of some weird cases of people exploring using pings to store data in the network: you ping some servers with, say, a thousand bytes, and then you don't actually have to store those bytes. When you get the ping back, you just send it out again, and you can do this with enough servers and redundancy that you're essentially using the network as your storage device, which is kind of cool but super weird, and probably mean for everyone else who's trying to watch Netflix. Oh yes, so there is a field in the IP packet called the time to live. Each gateway, each hop along the route, doesn't leave the IP packet completely unchanged. It will decrement the time to live value, and once it hits zero it drops the packet. Depending on the router, it will sometimes try to send a message back to the source to say, hey, this message couldn't be delivered, and that's an ICMP message.
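Here is a toy simulation of how the time to live field stops a routing loop. Everything in it is an assumption made for illustration: the router names, the next_hop dictionaries, and the function name route are hypothetical, and real routers do far more than this. It only models the rule just described: each router decrements TTL, and a router that brings it to zero drops the packet and reports back.

```python
def route(ttl, first_router, next_hop, destination):
    """Follow next_hop entries from first_router until the packet is
    delivered or its TTL runs out. Returns (outcome, routers_visited)."""
    node = first_router
    visited = []
    while node != destination:
        visited.append(node)
        ttl -= 1                      # every router decrements the TTL
        if ttl <= 0:
            # The router drops the packet and may notify the source via ICMP.
            return (f"dropped at {node}: ICMP time exceeded", visited)
        node = next_hop[node]
    return ("delivered", visited)


# A correctly configured path: R1 -> R2 -> B.
healthy = {"R1": "R2", "R2": "B"}
# A misconfigured network where three routers hand the packet around in a circle.
looping = {"R1": "R2", "R2": "R3", "R3": "R1"}

print(route(8, "R1", healthy, "B"))
print(route(8, "R1", looping, "B"))
```

With the looping table the packet circulates until the eighth decrement and is then discarded, instead of cycling forever.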
So that happens at every hop along the way. TTL usually starts at, I think, 255, I want to say; the exact default depends on the operating system. And this actually has security impacts. I don't know if they do this anymore, but AT&T, and I think also Verizon, used to (or still do) charge extra for tethering on your phone, that is, having your phone turn into a Wi-Fi hotspot so that you can connect your computer to it. People of course didn't want to pay for that, so they would just install applications that turned on the Wi-Fi and acted as routers. But AT&T could detect this because your packets would have their TTL decremented: your packet would go from your laptop to your phone, the phone would decrement the TTL value and then send it to AT&T. So AT&T could see that every packet you send has a TTL value of, whatever, 254 when it's coming from the phone, but when you're using tethering it's 253. They actually used that to either throttle you or send you a nasty notice saying you're not paying for tethering, don't do this. So it has important and interesting security applications, because once you know about this, the alternative is to not decrement the TTL on the phone, or to change other things. Cool. So the important thing here, and this is the thing to remember, applies at every step of this process. If we have host A as we talked about here, host A will be 121 and B will be .10. Do they know each other's MAC address? Let me put it a different way: do they need to know each other's MAC address?
No. You only need the MAC addresses of things in your local network, because those are the machines you need to directly deliver packets to. For every other machine, you only need to know the IP address to talk to it. So that's essentially what happens here. For every packet along this way, the yellow layer, the link layer, will be changed: here the source will be this machine and the destination will be this machine, then the source will be this MAC address and the destination will be this MAC address, and so on. Finally, the last one will have the source MAC address of this router, and the destination MAC address will be B's. So fundamentally, A and B don't have to know or care about each other's MAC address, ever. Questions on this? I don't think we'll talk about it in detail now, but I'll briefly mention that you can also use TTL to try to map networks and figure out the path that a packet is taking. The idea is that some machines, like a router, when they drop a packet, will send you back an ICMP message, which is at the IP layer. That's what ping is: ping is an ICMP echo request message, so it has no TCP or UDP layer, it's just an IP-layer packet. One of those types of ICMP messages says, I believe, TTL exceeded. So if anybody has ever used traceroute, a fun little network tool that maps the network, what it does is send a packet out with a TTL of 1 and see who responded with that ICMP dropped message. Then it sends out a packet with a TTL of 2, which should get to the next hop, and it sees who dropped it there, and then 3, and it keeps sending these until the packet actually gets to the destination. So you can estimate roughly the number of hops that it's taking, and you can figure out what machines are in between. That's actually a super fun thing to do if you want to play around and learn more about networks, because, I believe, by default traceroute does a DNS lookup which tries to map the IP address to the domain name, so usually that will tell you
something about where that machine is physically located. If you go through routers at, say, AT&T or Comcast in LA, it'll be called something like lax.comcast.net, with random identifiers, so you can actually get a nice indication of the way that your packets are flowing through the network. So the main way that packets are routed, as we talked about, is hop-by-hop: every hop knows where the packet needs to go next. When they first created the initial ARPANET, the source could actually specify what route the packet should take. It seems ridiculous, but back then they had an extremely unreliable network. If we wanted to talk between two machines and we knew that one path was bad, because some machine wasn't forwarding or the route was down, I could specify exactly where I wanted packets to go and we could still talk. There's actually an option in IP packets called source routing. I think it would be interesting research to see if this is ever actually used, because there are a lot of security problems here: if I could force packets to go through certain paths, I could try to take down a physical link by forcing a lot of packets through it, and all kinds of other bad stuff. So a very key aspect of this indirect delivery is the routing table. For those of you that end up dealing with networks, the routing table is what you live and die by, so it's something you should always check to make sure it's set up correctly. On most Unixes you can run route -n and that will show you the routing table. If you're on a Mac, it's really dumb, but you have to use netstat; I think it's netstat -r for the routing table. Oh, and -n is always a nice handy option: -n usually says do not map IP addresses to domain names. That mapping
usually actually makes the process take a lot longer because it has to do these DNS queries. A lot of tools have this option: route does this, and tcpdump with the -n flag also means don't resolve DNS names. So if we ran this on a machine, you would see something like this table: if the destination is here, go here. The way to read this is that the destinations are in descending order of specificity, and we can see that from the genmask. This mask is all ones, 32 ones. You AND the address you're trying to send to with the mask and compare the result with the destination, and if it matches, then this is where you send it. So this says, hey, if you want to talk to 192.168.1.24, send it out on physical interface eth0. If you want to talk to anybody in 192.168.1.0/24 (we know it's a /24 because of this genmask, which means the last octet doesn't matter), then send it out on eth0. So you can actually read information about the subnet here: it essentially says this is your subnet. It also says, hey, if you want to send to anything in 127, which is what IP address range? Local, it's home, it's this machine. 127.0.0.1 is what's mostly used, but you can actually use any address in the 127 range and all of those will go back to that same machine. You can see it's a different interface. This is why, if you're sending traffic to yourself on 127.0.0.1, or on localhost, which is a name that's mapped to 127.0.0.1, and you tried to run tcpdump on your network interface, you would never see any packets there, because it uses this dummy device that's a local device. Finally, if it doesn't match any of those, there's the zero entry; this means the default. This is like a catch-all: it says if the IP address we're trying to send to doesn't match any of these, then send it to 192.168.1.1, so that's that
gateway column and so then how do we know so all these flags basically will say up that these are up h if it's a route to a specific host so you can do super cool things with this you can set up routing tables where if a packet what if I had to do kinds of crazy stuff yeah you can all kinds of cool stuff in here while doing this and you can do this on any machine you can look at your routing table and see what's going on an interesting thing to do would be to connect wirelessly to a network and then connect physically to another network and then see what your routing table like because that will actually tell you where your packets are actually going to go if they're going to go on the wireless or the ethernet alright we're going to talk about this so yeah basically the way this table works is you search for matching host address most specific to least specific does the address I want to talk to match any of this so I have questions on route indirect delivery aka routing aka how packets actually work and get to where they're supposed to get there's another wrinkle that we're not going to talk about but I want you to be aware of this seems very straightforward but if you think about the perspective of somebody like an ISP right who gets a packet and says there's this destination address what of the gateways of all the different things I'm connected to where does this actually go right and does anybody I believe you can can you actually do that move IP addresses between providers you can or used to be able to anyways the short version is that there's another protocol called BGP which is the border gateway protocol which is how big ISPs talk to each other and say hey if you want to route like I control all of these networks so if you want to talk to any of these subnet send packets to me and so it's this pretty complicated protocol and it's all based on trust and there's only something like 100 BGP nodes in the world but this actually leads to problems of when people 
got so I can't remember what country it was what countries wanted to ban Twitter inside their country what the way they chose to do this was they announced the BGP route that says if anybody wants to go to Twitter's network send it to us and then they would just drop all those packets the problem is they announced that to the entire world and so every other gateway and every other ISP was like oh great Twitter's going to this ISP so they sent all the traffic there and dropped it so Twitter was out literally for everyone and there's nothing Twitter could do because it's not wasn't their fault or their problem so they had to take like a coordinated effort from people to realize that and get the government to take that off because it's a very trusting protocol where if you say you can route somewhere then they'll let you do it so so back to UDP so now we've built up all of our physical layer which we're ignoring the link layer the internet layer now which I guess it is 3rd from the bottom maybe you're correct 3rd from the bottom right non-specific way to describe things so now we have looked at UDP and what we saw and talked about with UDP is UDP is simply a very thin layer I think on top of IP so it does not provide any additional delivery guarantees any other types of security mechanisms so we've already discussed this and so now we can actually go back to our spoofing example and we can understand that if we send a spoofed UDP request and we are not on the local network of either the server or the trusted client here will we get the reply no because none of those the destination IP address is client, trusted client none of these hops along the way will ever get the packet closer to us so if we want to do this we have to actually be able to spy on any of those links and any of those hops to actually see that reply which unless you're the NSA or a large government organization is highly unlikely or you could compromise one of these switches along the way that would 
be another avenue. Sorry, more cool stuff. Let's think about this from a security perspective. We saw that if we're on the same local network as a machine, we can basically trick it into sending all its traffic through us. If we're on the same network, we can see all the traffic that's happening there, but we can't see the traffic that's happening here, or here, or here, or in the target network, unless we somehow get visibility there. What I was going to say, which is super interesting, is that the internet between countries is basically carried through undersea cables across the ocean. I think there are real reports of submarines from some countries going down there and putting tap devices on these underwater cables so that they could see all the traffic going across, which is pretty cool. Okay, so we talked about IP spoofing: we cannot get the reply packet back. And we talked about how UDP has source and destination ports. When you reply to a UDP request, you set the source port to the request's destination port, and you set the destination port to the source port of the request that was sent to you. That means that if we want to spoof a reply to a UDP request, and we cannot get a copy of that request packet because we are not in the local network of either of the two entities, then we need to create the spoofed UDP reply by guessing the destination port. That's actually not that difficult, because there are only about 65,000 possibilities. There are two ways to think about it. If we want to guarantee that we are successful, we send 65,000 replies. A problem there is that the server's real reply may get there in the middle of all of those, but maybe we can DoS the server or something to guarantee that it is not going to send its packet out. Or we just try 65,000 times and one time we will be right; if we keep doing that, eventually we will hit it, so we can still do this in a stealthy way. So now we think about attacking. Let's say we want to attack
some remote system so there is some server let's say we are hired to do some penetration testing and I am going to throw this away so don't be scared about me we are hired to do a penetration test and we have a target remote server so how would we go about breaking into that server find where the server is physically located why? let's say it is a remote penetration test although there are physical ones as well but let's say it is a remote one where all we need to do is in the comfort of our home how would we break into that server trust so what information do they give to us the company who is hiring us to do this penetration test so it is accurate what do they do break into the server X that is our job so then what do they have to tell us do we just start breaking into random computers and hope that we find X yeah we need somebody to identify that machine so we have been thinking about IP addresses we will just say IP address so they tell us the IP address we need to break into they give us nothing else but we want this information why? 
The information you talked about, what it does, what it trusts: why would we want that information? We need to know what it does so we can figure out what types of ways there are to break into it. You are going to approach it differently if you find out that server is a bank versus a McDonald's versus, I don't know, a 7-Eleven. So you need to actually know what that thing on the other end is and what it is supposed to do. So how do we do that? How would we know from the outside, given just the IP address, what it does? Some kind of recon, so maybe scan it and see what it is doing, what ports it is listening on. Why do you want to know what ports it is listening on? Because those are going to be the ones that are open for us to have some kind of access to, to force something to execute on the server. The only way you can break into a system is if you can somehow influence its execution with data that you control. So that can be you sending data directly to a network service. So you want to figure out, since we talked about UDP, what types of services are listening on UDP ports. Is it running a DNS server? That would be interesting. Is it running a Network Time Protocol server? We want to know all these things to figure out what it is doing, to then potentially launch an attack. Without this information, we can't know. This is called port scanning. The idea is that we want to scan a remote machine to see what services are running on that machine. So how do we do that? Nmap? No. That is a tool, and it accomplishes the goal, but how does it work? In this class we're more interested in how things work rather than what tools to use, because I can teach you how to use tools like a monkey. I can teach a monkey to do that, but they are not going to be able to build their own tools and the next generation of tools, or know the flaws of current tools given the situation that they're in, where maybe they need to tweak and change the tools to accomplish their mission. So how would you do that?
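As a rough preview of how a single UDP port probe can work, here is a minimal sketch in Python. It is not how Nmap is implemented. The function names `classify_udp_probe` and `probe_udp_port` are made up for illustration, and the sketch assumes a Linux-like system where an ICMP port unreachable message for a connected UDP socket surfaces as `ConnectionRefusedError`. It sends one zero-length datagram and interprets what comes back: an ICMP port unreachable suggests the port is closed, a UDP reply suggests something is listening, and silence is ambiguous because either the probe or the error may have been lost or filtered.

```python
import socket


def classify_udp_probe(got_reply: bool, got_port_unreachable: bool) -> str:
    """Interpret the outcome of one UDP probe (hypothetical helper)."""
    if got_port_unreachable:
        return "closed"          # the OS told us nobody is listening
    if got_reply:
        return "open"            # a service answered the datagram
    return "open|filtered"       # silence: listening, filtered, or lost


def probe_udp_port(host: str, port: int, timeout: float = 1.0) -> str:
    """Send one zero-length UDP datagram and classify the result.

    Assumption: on a connected UDP socket, an ICMP port unreachable
    is reported by the OS as ConnectionRefusedError.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.connect((host, port))   # no packet is sent by connect() for UDP
        try:
            sock.send(b"")           # zero-length probe, no payload needed
            sock.recv(1024)          # wait for a reply or an error
        except ConnectionRefusedError:
            return classify_udp_probe(False, True)
        except socket.timeout:
            return classify_udp_probe(False, False)
        return classify_udp_probe(True, False)
```

A call such as `probe_udp_port("127.0.0.1", 53)` would probe one port on the local machine. Probing many ports is a loop over this, paced slowly enough that the target's limit on ICMP error messages is not hit.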
Say you're writing it, and you want to do a UDP port scan of a remote machine. How do you do that? I'd send UDP packets to ports 1 through 65,535 or whatever. What's the outcome that you expect from that? I'm looking for some kind of response. Okay, is that enough? Maybe the type of service, or the type of operating system that's running. The key question is: what do operating systems or different machines do when you make a UDP request to a port where there's no service listening? Maybe it sends you back a reply that says "no service." But then if you just look for replies, you'll send 65,000 requests and you'll get back 65,000 replies, so that tells you zero bits of information. So this is the basic idea: you send a zero-length UDP packet. You don't need to send any data, because you're not trying to talk to these services, you just want to know whether they are there. The physical analogy would be going through an apartment complex and knocking on 65,000 doors to see who opens. Or, if you want to be stealthier, you wait until they're about to open the door and then you run away, because you don't want to actually talk to anybody. Or, I guess, the other way would be writing a letter to all 65,000 apartment numbers in a given apartment complex and seeing which ones come back as undelivered from the post office. That would be really annoying, don't do that, but it would be funny. Cool. So if you get back a port unreachable message, then you can infer that that port is closed, that nobody is listening on that port. But the other thing is you need to understand what the operating systems actually do. So why does it actually matter what operating systems do? When you're writing a server, you're writing an application that wants to listen on a specific port. How does it do that? If you have root privileges, you can actually just listen to the physical Ethernet device and watch every
single packet, and when you saw a correct UDP packet for that specific port, you would respond back to it. But obviously you don't want every single application to be able to listen to all the traffic on your machine. How does a process open a file? Does the port abstraction use system calls and interrupts to alert the server that a packet has been received? I mean, obviously there's going to be some kind of interrupt, but is there an interrupt for a specific port? I don't believe it's interrupt-based. I believe it's poll-based, in that you need to poll the operating system. So it's the OS that provides this abstraction. You ask the OS, I would like to listen for all packets, and you specify the protocol, so you say, I want to listen to all UDP packets on port 25. Then the operating system will collect those for you, and then you ask the OS for them. I think there are different models. I think the main model is: every time you get a packet, send this packet to this function, and it invokes one of the functions in your code. So the port, like the file, to answer the question, is just an abstraction where you pick a number. Like, the number 6 represents this file, and any time you talk about the number 6 in your program, you're always talking about this file. It is provided by the OS, that's the important part. When you want to open a file, you have to ask the operating system, open this file for me, which means your program giving up execution and the operating system starting to execute that system call. So you have no control over what happens. The same thing happens in the networking stack as in file systems. So because of this, it's important to understand how the operating system actually responds. How will the operating system respond if you send 65,000 UDP packets to it? Part of what happens is that there's actually a limit on, let's say, these ICMP port unreachable messages, so you have to configure your
scan. So, say, 80 messages every 4 seconds. So you have to configure your scan to send fewer than this, otherwise you may hit this limit and you'll start thinking that ports are open when they're not. We can use a tool called Nmap to do this. This is actually an incredibly useful network tool. It's a little bit more on the offensive side, but it's still useful for figuring out what's going on. I actually use this when I forget, like, what port the RabbitMQ guest interface runs on, so I'll just run Nmap against it to figure out which port is open. Or against a printer: a lot of times printers will have configuration pages on a specific port, so I just Nmap it to figure out what it is. So, let's see, the -s is a connect scan, I believe, and the capital U is the UDP port scan. It's an open source tool, and we're always much happier to use open source tools that we can look into and figure out how they work. So it will do this scan, and it'll say, "1445 ports scanned but not shown below are in state closed," so it shows me only the open ones. This scan found that UDP port 137 and UDP port 138 are in an open state. So now what do we know? We just ran the scan, and let's say this is the IP address of the machine that we want to try to break into. What do we know? Those are two ports that we can possibly influence. So we know that these are two ports that we can influence? Well, it's kind of hard to say. We know that we sent out 1445 UDP messages, to ports probably 1 to 1445, and we got back the error message for all but two of them. It could be that those packets just got lost: either the UDP request got lost or the ICMP response got lost. This is an unreliable communication mechanism, so we actually don't know whether it was successful or not. So where does this third column come from? Certain services typically use certain ports. In theory you could configure any service to use any of them. Because it's common
knowledge that NetBIOS goes on 137? Exactly. So there is a list. I want to say it's the Internet Engineering Task Force that maintains the list, I can't remember exactly the organization, but if you look up "port list" there will be UDP and TCP port numbers from 1 to 65,000 and the specific service that typically uses each one. And why this exists: if you want to talk to an IP address and you want to make an HTTP request, you need to know what port to talk to. If for every single system you also had to communicate both the IP address and the port number you wanted to talk to, that's a lot of information. So the idea is that by standardizing, by saying most HTTP servers will be running on TCP port 80, you know which port to try first. And it's actually a great technique on some networks. Port 22 on TCP is SSH traffic, SSH to a server. Some places like airlines or airports or paid Wi-Fi will block all ports except for port 80 or port 443, and you can get around this by configuring your server to accept TCP connections on port 80. Then they just think it's web traffic, but it's clearly not. So what I would do here next is validate: are these really those services? I would look at what the heck a netbios-ns is. I would assume maybe it's like NetBIOS network storage or something, and dgm, I have no idea what that stands for. You would then dig into this to find out what these things are. Have we found all the open ports that potentially are on this server? You shake your head. Why not? Because I asked the question, yeah. There are 65,000 ports, and this Nmap output says it only scanned 1,445 ports. So why did it do that? Pretty sure that from 1 to that number, or maybe a little less, I can't remember, are the dedicated service ports, the specified ports for things like SSH and Telnet and FTP. Yeah. So, well, you could look this up for more detail, but essentially most of the more common services usually run on 1024 or less, and those are all the standard services that you mainly care
about. However, we actually don't know; maybe there are others that are being used that we could use as an attack vector. So this would actually be a very poor penetration test, because we're not actually knocking on every door in the apartment complex. We want to make sure we configure our Nmap to do this full scan. I actually have an example of doing this, where we found a, what's the network management protocol on servers? It's like IPMP, I think, which allows you to connect to a server's admin page and, like, reboot servers or re-image them. We found out that their credit card processing machine had this available to any developer or anyone on their local network, and they were shocked that we found this. And literally it was because I used Nmap with the scan-all-ports option. They had a dedicated machine that would scan internally in their networks, but their scans only scanned the first 1000 ports or whatever, and so because they had misconfigured their scan, they missed this vulnerability in their network. IPMI, that's the name of it, IPMI. And I could have, I didn't do it, because this was literally their actual credit card processing machine, and so if I messed it up it could have taken down their business for the day, which for a company that processes credit cards is a very big deal. So we just showed them what we found, and they said, do not do anything there, we will fix that. But it was a good finding, and it's a good way of showing that defaults are just defaults, so you should think about what is actually happening here and whether it is what you want to have happen. TCP: the other awesome protocol at the transport layer, the one that is most often used. TCP is where things actually get more interesting in terms of guarantees. TCP provides a connection-oriented, reliable stream delivery service. So you actually have no loss, no duplication, no transmission errors, and the correct ordering, so that you know, when you send foobar and the OS has told you yes,
that's confirmed, that the other side actually saw foobar. This is super cool, super important. Just like UDP, TCP has the port abstraction. We'll see how this is actually guaranteed, but it's important to understand that TCP offers these guarantees. One thing that's interesting to think about is: does every single network application need all of this functionality? Does every single network application need no loss, no duplication, no transmission errors, and correct ordering? You could maybe think of some services where loss is okay but correct ordering is really good. Streaming may be one: you want the frames to come one after another, but if one of them drops, it's fine, just keep going, let the movie play, right? And how you actually do that is you build those mechanisms on top of UDP. So TCP is great because it provides all of these nice guarantees, but for some uses other approaches can definitely be better. So a super important concept of TCP is that it is connection-oriented. So how do we know that connections actually exist in TCP? The idea is that a connection is identified by four things, an important four-tuple: the source IP, the destination IP, the source port, and the destination port. This is what forms a TCP connection: source IP, destination IP, source port, destination port. And you've got to think that this identifies one flow, right, from source to destination, and the reverse tuple is the reverse flow. So TCP is set up to allow full duplex, where both sides can talk to each other. And then there's the socket abstraction: the combination of IP and port is sometimes called a socket. So, TCP. Looking at the header, we first have our 16-bit source port and 16-bit destination port, so we can see we have exactly the same number of ports in UDP and TCP. After that we have some other fields that we'll get into. We have header length, flags, checksum, urgent pointer, options, padding, data. So that's the TCP segment. If we think about this adversarially again, an attacker can completely
control the source port, the destination port, the sequence number, the acknowledgement number, all of these flags, the window, the urgent pointer. I guess I should have colored all of this. So an adversary can control and spoof TCP packets that have any of these values, just like we saw at every other layer. And just like before, TCP is encapsulated inside IP, which is encapsulated inside an Ethernet frame. So how do we actually get a connection-oriented, reliable stream delivery service with no loss, no duplication, no transmission errors, and the correct ordering? Do any of the other layers underneath us guarantee this? That would be super easy, right, to say, yeah, I do this because I'm using IP, which provides all this stuff. And the nice thing is that this works going up: when you do HTTP or something else that uses TCP, you know you have all of these properties and you don't have to build them yourself. So how does TCP do it? First we need to actually establish a connection with another server. What we want to see is: is that machine up and alive? We want to establish a way to communicate with that other system. So we are the client, and we have a server we want to talk to. We know the server's IP address, we know the port. We are some computers, we each have our local networks, and then we have basically the internet in here. So we're the client, and we can talk to the server: we saw at the IP layer that in general we can send packets to them. So we can send essentially what we can think of as a start packet, some packet that says, hey, I want to talk to you on this port. So we send that packet to the server, and the server gets it. What should they do? In general, when I talk about the protocol, we're just thinking it through. Do they say great, and just internally go, yes, this is awesome, everything is connected? Send a response back. Why do they need that response? The client has no way of knowing: did this packet actually get
to the server or not? Because IP does not guarantee that. They may have gotten 10 copies of this packet, they may have gotten zero, they may have gotten half of this packet. So the server then needs to send a response back. Now think about this from the client's perspective. How does the client know that this response is in response to the original request that they sent? Or think about it a different way: a client gets a packet back and sees that it's a response packet. What do they know? Let's say, just from the layers we've talked about so far, what information is contained in this packet? IP addresses. So this will have the IP of the server and obviously the IP of the client. So the client knows that it got a packet from this server's IP address, and it knows that it's a, what's it called, a response to the setup message, so it can know that it got a response back. But how does it actually know that it's in response to this initial request? How does it know that this isn't a response to a request that it made yesterday, or an hour ago, or 30 minutes ago, or 15 minutes ago? We need some other type of information, right? So basically you can think about it as: we're sending some random value to the server and saying, hey, send me that random value back, and if you send that back to me, then I know that this is actually a response to my request. We'll call this the sequence number for right now. So we send some sequence number, and then the server essentially sends it back. Well, the other interesting thing is that it will actually send back the sequence number plus one. The reason for this is the difference between little endian and big endian. Remember that from an architecture class way back when? Now it's coming back. So the question is, if you have these bytes, is this a large number or a small number? It depends. It depends on which byte is the most significant byte, right? It depends if this is memory address, I don't know, one. If this
is memory address one, and this is memory address two, and let's go 02, 03, so this is memory address three and this is memory address four. If we read four bytes starting at memory address one, what number is this? Is it 0xFF010203 or is it 0x030201FF? That's the difference between little endian and big endian. What's super annoying is that most processors are little endian, so it's actually reverse order. So if you saw physical memory in increasing order as FF 01 02 03, when that number is interpreted all those bytes are flipped around, so the number that it represents is 0x030201FF. However, the network is different. The network, I believe, is big endian. Because of this, in the design of TCP, I sent this start message request to the server and I put some sequence number there, and I want to know: can that machine actually speak the TCP protocol correctly? Can it add one correctly? Because if it adds one in little endian, in the wrong byte order, it'll increment the most significant byte and completely change the number, so it's not an increment by one. I really like this, because it seems like a minor detail that's silly, but it was actually incredibly important to building this protocol back in, like, the 80s when they were first doing this, or the 60s, 70s, whenever they were first designing it. So the client gets back the sequence number plus one. So from the client's perspective, they know that the server received their request. They know that the server got this start packet, because they saw a response with sequence plus one from that IP address. This is good. What about the server? What does the server know? From its perspective, it has received a packet and sent a packet, but it doesn't know if the packet it sent was received. It does not know if the client received this packet or not. So we need a third verification step. We need a step for the client to say, all right, we're good to
go, let's talk. Right, so we'll call it an acknowledgement packet for now, sent back to the server. But think about it in a similar way to how we thought about it from the client's perspective: when the client got this response, how did it actually know that it was in response to its initial packet? Similarly, think about it from the server's perspective. I sent some response back that says, yeah, I'm ready to talk. But how do I know that this acknowledgement from the client is actually for that response, and not for one from ten minutes ago or five minutes ago? How can I link it to that specific response? A sequence number? Yeah, so we use the same idea again. But we already have a sequence number, so essentially you can think about it as both sides having a different sequence number. The server in its response will create some new random number of its own, which is its sequence number, and when the client responds back, it puts that number plus one in as its acknowledgement number. So both sides use this, and the key point is that they both send each other some information so that they can link all of these messages together. At this point, once the server gets that, it knows that this connection is established. There is now a communication link between the server and the client, they are both good to talk to each other, and now they can actually start transmitting data. One of those things I would burn into your brain, because you will probably be asked this by people, is how this actually works. It works with flags. The TCP flags are how the client asks the server, hey, I would like to start a connection, and the way it does that is by setting the SYN flag to one. So these are all mapped to the header fields. We have 13987: just like UDP, the client picks a random source port and says, okay, source port 13987, I want to talk
to destination port 22, which is SSH. And it randomly generates some sequence number. It says, okay, 6574. It actually doesn't matter what it is, but it should be randomly generated, as we will find out. At this point there is no acknowledgement number, so it's usually just 0. And it sets the SYN flag to 1. You can think of SYN as "synchronize": you're trying to synchronize your connection. So the SYN flag is set, and we call this a SYN packet. So the first step of the TCP three-way handshake is a SYN packet. The server gets this, and first, obviously, it has to check: is there a process running on this system that is listening for packets on port 22? If there's not, then go away and don't talk to me. But if there is, then great, let's continue this conversation. So it flips the source and destination ports, because it's now sending from 22 to 13987. And it sets its acknowledgement number to the client's sequence number plus 1, so the sequence number here was 6574 and the acknowledgement is 6575. And now it generates a new random number for its own sequence number, which says, great, when you reply back to me, use this number plus 1 as your acknowledgement number and I'll know that it's actually you. And the way it specifies that this is a response is a very important thing: if just the SYN flag were set, this would indicate an initiate-connection message from the server to the client, which is not what we want. So this is called a synchronize and acknowledge, a SYN-ACK packet. Then finally the client gets this and sends back the final ACK packet. So no SYN, just an ACK, and it sends back a sequence number of 6575 and an acknowledgement of 76112, so this is adding one to the server's sequence number. And now we are good to go and good to communicate data. We have established a connection with a three-way handshake here. So again, super, super important: the three-way handshake is SYN, SYN-ACK, and then ACK, and that sets up the TCP connection. Questions? We'll pick up with the data exchange on Tuesday, and then we are
actually close to finishing this out.
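The three-way handshake walked through above can be sketched in a few lines of Python. This is only a model of the bookkeeping, not a real TCP stack: the `Segment` class and the `syn`, `syn_ack`, and `final_ack` helpers are hypothetical names, and the server's initial sequence number of 76111 is inferred from the final acknowledgement of 76112 in the example. The sketch also packs the first four header fields in network byte order (big endian) with `struct`, to show the 16-bit ports and 32-bit sequence and acknowledgement numbers as they would sit on the wire.

```python
import struct
from dataclasses import dataclass

SYN = 0x02  # TCP SYN flag bit
ACK = 0x10  # TCP ACK flag bit


@dataclass
class Segment:
    """Only the header fields the handshake discussion uses."""
    src_port: int
    dst_port: int
    seq: int
    ack: int
    flags: int

    def pack_start_of_header(self) -> bytes:
        # "!" means network byte order (big endian).
        # H = 16-bit unsigned, I = 32-bit unsigned.
        return struct.pack("!HHII", self.src_port, self.dst_port,
                           self.seq, self.ack)


def syn(src_port: int, dst_port: int, client_isn: int) -> Segment:
    # Step 1: client picks a sequence number, no acknowledgement yet.
    return Segment(src_port, dst_port, client_isn, 0, SYN)


def syn_ack(received: Segment, server_isn: int) -> Segment:
    # Step 2: server flips the ports, acknowledges seq + 1,
    # and supplies its own sequence number.
    return Segment(received.dst_port, received.src_port, server_isn,
                   (received.seq + 1) % 2**32, SYN | ACK)


def final_ack(received: Segment) -> Segment:
    # Step 3: client flips the ports again and acknowledges
    # the server's sequence number + 1.
    return Segment(received.dst_port, received.src_port, received.ack,
                   (received.seq + 1) % 2**32, ACK)


first = syn(13987, 22, 6574)
second = syn_ack(first, 76111)   # 76111 is an inferred value, see above
third = final_ack(second)

for name, seg in (("SYN", first), ("SYN-ACK", second), ("ACK", third)):
    print(name, seg, seg.pack_start_of_header().hex())
```

Each side can match a reply to its own earlier message because the acknowledgement number it receives is the sequence number it sent plus one.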