All right, so now we're going to talk about network security. We've built up a good background: we looked at policies and mechanisms, and we looked at authentication, authorization, and crypto. Now we're going to get a little more technical and go into the various technologies we use and the security problems and issues we should be considering there. We're going to start off with networking and network security, and then go on from there depending on how much time allows.

To do this, we have to go over the basics of networking. Since we're moving down to a more technical level, if you don't know how the network actually works, how packets get moved around and how things get routed, then you can't talk about the security of those protocols or what types of attacks are actually available in different scenarios. So we're going to go over essentially a primer of the networking stack. How many people have already taken networking? So some of you are going to get that class next semester, awesome.

Okay, I guess I should also mention: in my grad-level software security course, we go over network insecurity, binary insecurity, and web insecurity. I've taken some content from there, and we're going through it at a much higher level. We're not going to go as in depth as we would there, but some of these ideas and concepts of how networks work and how packets get moved around are important.

So, IP. IP stands for Internet Protocol. And the idea behind the internet is what? What does "internet" mean? Yeah, but can you connect computers together and not have an internet? You have a what? An intranet. An intranet, which means what? What's the difference between the two?
Distance. Can you have an intranet? Well, I guess you can have a WAN that spans a large area, but the internet spans the globe, I guess. But why? Is it the physical distance that actually makes it an internet? An internet would be a network of networks? Yeah, so I think about it more in those terms, or even in organizational terms.

Organizational terms, right? If you think about the ASU network, we are physically spread out. ASU has five different campuses in Arizona. That whole system would be considered the ASU network, and I would consider that an intranet, because ASU controls everything. If they want to change how things work, they can change any part of the stack and it doesn't matter, as long as they control that entire network. So to me, it's really about organizational control.

The idea is, when ASU wants to send a data packet to Google or Amazon or a completely different network, both organizations need to agree on what protocols to use, what they mean, and how data gets from one point to the other, to make sure they're talking the same language. That's really the genius of the internet: you have these interconnected organizations and different networks, and nobody actually cares how you internally route and deal with traffic, as long as externally you speak the same protocols and languages that everyone else does.

The standard that we have landed on is the Internet protocol suite. The shorthand is TCP/IP, which is usually the way we think about it because those are the main protocols. This is based on the idea of abstraction and encapsulation: the different layers of protocols do different things and are in charge of different things.
And the various layers don't need to know the details of everything below them, which is abstraction, right? Sending a packet from here to Google requires different information than sending a packet from this laptop to the wireless router that I'm connected to. That has to do with all the crazy Wi-Fi details and whatever 802.11 spec I'm using. And then you think about how it physically gets there: what is the modulation of the radio frequency? What channel is it on? All that stuff. But my computer actually shouldn't care about any of that, because the data could just as well be transmitted over a wire if I'm plugged in on Ethernet. The higher-level layers of TCP/IP work the same no matter what physical layer you're working on.

So the way we think about this: we have link layer protocols, which I think of as one machine talking to another machine, or to another node. Internet protocols get us to talk between networks. Transport protocols actually transmit data: once you can talk from one point to another, you actually want to transmit some data. And finally, application protocols define, well, now that you have a mechanism to talk between two nodes, what do you want to say? What language do you want to talk? Using a human analogy, if you call somebody and they start speaking French to you and you don't know French, you can't communicate, right? So when you talk to a web server, you need to speak HTTP to that web server so it can understand your request and give you back the web page that you want. If you can't speak that language, you're not going to be able to talk to that service.

So that's how I think of the layering here. Those of you who have taken networking are familiar with the OSI network model. What are the layers in the OSI network model?
A story: when I took networking as an undergrad, I could not remember the last layer. I knew it started with a P, so I put the pizza layer. It turned out that was not the correct answer, but I got a smiley face with zero points, so that was good.

So the bottom is the physical layer, which really is how, physically, one component talks to another component. This is Ethernet. You can even think of a Bluetooth connection here. What's another one of the physical layer protocols? 802.11? 802.11, wireless, yeah.

Above that we have the link layer, which is slightly more abstracted. This layer doesn't care if you're on wireless or Ethernet; you're using the link layer, at a little higher level than your hardware, to talk between two different local nodes. But all of this is happening locally. This layer has no concept that there are other nodes out there or other places you want to go.

Next is what we talked about as the network level, or you can think of it as the internet level. It's the IP level, so what we call it doesn't really matter. The way I think of it, the link layer is how one node can talk locally to another node. The IP layer is how you can talk to a node that's in a completely different network from you, one that you can't directly talk to.

Above that is the transport layer, TCP and UDP. These protocols have different properties and actually allow you to send traffic. And on top of that, you have the application layer, which is the language you're trying to speak. So HTTP is the web. What's SMTP? Email: Simple Mail Transfer Protocol. Or DNS, the Domain Name System. NFS, what's it used for? Is it a file thing? Yes. Is it Network File System? Yes. So it's used for sharing files. If you have a Samba server set up, SMB is another form of network file sharing. Cool. That was good.
It's always nice to think of these things in these abstract layers, and we have this OSI model with five or seven layers, depending on how you want to count them. But it's especially important for security to understand that these aren't actually the beautiful, completely abstract layers we like to think they are. ARP, as we'll see at the link layer, is used to translate an IP address to a hardware interface address, your MAC address. So it actually bridges a little gap between those two layers. And then you have crazy things like DNS, all the way at the top. What's the point of DNS? Resolve a domain name to an IP address. Yeah, resolve a domain name to an IP address, right? A domain name is a different concept, up at the application level, and you resolve that name back down to an IP address. So there is intermixing between these levels, and that's why we need to study the details, so we actually understand what the security implications of these things are. Questions on this?

So, IP addresses. What would be a physical analogy for an IP address? Yes, your street address or your PO box, so your address. It's how other computers can talk to you. The IP address basically says, this is my address. Each host can have one or more IP addresses. This is an important thing: you can actually have more than one IP address, one for each network interface. For instance, when I go to my office and plug directly into the network, my computer is still on Wi-Fi, so I have two different network interface cards, a wired and a wireless. Each of them will have a different IP address when talking to a server.

We're going to focus on IPv4 because the adoption rate of IPv6 is still not anywhere close, and you can go learn about that on your own time. IPv4 uses a 32-bit address, which means how many addresses do we have?
That's not very precise. Two to the 32. What was it? Two to the 32. Two to the 32, yes. Which seems like a very large number, right? As we'll get into, we're actually running out of IPv4 addresses, but it's also not hurting us as badly as we thought it would.

Most of the time you see an IP address, it's represented in this dotted decimal notation. So what is an IP address? When you really break it down, what is it? There's like a port number somewhere in there. Not yet. That's up to the higher levels, the TCP and UDP level. What is an IP address? It is an address. A sequence of bits. How many? 32 bits, yes. So it is a sequence of 32 bits, right? It is just a 32-bit number. That's literally what it is.

So how do you read this dotted decimal notation? Yes, but how, and in what order do you put it? 8 bits, 8 bits, 8 bits, 8 bits. So what's the highest number you can have in any of these octets? 255, yeah. So each of these will be zero to 255. It's just helping you visualize: the first octet defines the first 8 bits of the IP address, the second octet defines the second 8 bits, the third octet the third 8 bits, and the fourth the last 8 bits. That's how you build the 32-bit IP address from this dotted decimal notation. It's an important point to rethink: an IP address is not this dotted decimal thing. The notation just represents what that 32-bit number actually is.

Back when IP addresses were first created, the address was specified in terms of a class, a net ID, and a host ID. The host ID would identify you, your system, and remember, this IP address is an internet-level address. The class would specify what type of network it is, and a unique net ID within a class would basically identify an organization.
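The octet-by-octet reading of dotted decimal notation described above can be sketched in a few lines of Python. This is only an illustration: the function names `dotted_to_int` and `int_to_dotted` and the sample address are assumptions made for this example, not something from the lecture.

```python
def dotted_to_int(dotted: str) -> int:
    """Pack four decimal octets into one 32-bit number, first octet in the high bits."""
    octets = [int(part) for part in dotted.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a dotted-decimal IPv4 address: {dotted!r}")
    value = 0
    for octet in octets:
        # Shift what we have so far left by 8 bits and append the next octet.
        value = (value << 8) | octet
    return value


def int_to_dotted(value: int) -> str:
    """Split a 32-bit number back into four 8-bit chunks, highest bits first."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    number = dotted_to_int("192.168.1.10")
    print(number)                  # the plain 32-bit number
    print(format(number, "032b"))  # the same number written as 32 bits
    print(int_to_dotted(number))   # and back to dotted decimal
```

The point of the round trip is that the dotted form and the integer are the same address, just written two different ways.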
So organizations would have different classes of IP addresses. Class A, for instance: I believe these all start with a zero bit, yes. Is that the same as the prefix? Do we need to talk about prefixes? Not yet, no, because the classes were very fixed; this system is very fixed. We'll talk in a second about moving away from this system and talking about prefixes.

The important point here is that if you as an organization had a class A IP address range, you could have, I should put commas here, 16 million hosts on your network. And those 16 million hosts are all essentially publicly routable. For class A, there are only 128 of those, or 126, because some of them are reserved. Does this make sense?

The 10-dot IP addresses are in this space. Actually, is anybody on the ASU network? Anybody want to tell me what your IP address is? 10 dot something? Yeah, 10 dot something. So there are actually three special address ranges, in different classes. The 10 is one; anyone know the other ones? 172. 172, and what's the third one? 192. 192, yeah. Those are super interesting because they are not publicly routable and are meant for internal usage. This means that ASU, by using the 10-dot IP address space, has room for 16 million hosts on its internal network. 172.16, that's smaller? Yes. 192.168? Yes. I don't know exactly how they map onto the classes, but yes, they're smaller. I think those two aren't class A. Correct, right. The 10 is definitely class A.

So here you go, there's an RFC. How do these things come to be? How do we get to this point where we have this 32-bit IP address range with these ideas of classes? RFCs? RFC, what's an RFC? Request for comments.
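For a concrete look at the three private ranges mentioned above, here is a minimal sketch using Python's standard `ipaddress` module. The ranges are the ones defined in RFC 1918 (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16); the helper name `is_rfc1918` and the sample addresses are assumptions made for this example.

```python
import ipaddress

# The three private (not publicly routable) IPv4 ranges from RFC 1918.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),      # one class A sized block
    ipaddress.ip_network("172.16.0.0/12"),   # 172.16.0.0 through 172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),  # 192.168.0.0 through 192.168.255.255
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address falls inside one of the three private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_RANGES)


if __name__ == "__main__":
    for network in PRIVATE_RANGES:
        print(network, "holds", network.num_addresses, "addresses")
    for sample in ("10.1.2.3", "172.31.255.255", "172.32.0.1", "8.8.8.8"):
        print(sample, "private" if is_rfc1918(sample) else "public")
```

Note that the 172 range is a /12, so the boundary does not fall on an octet: 172.31.x.x is private but 172.32.x.x is not. A check that compares the wrong number of bits gets exactly these edge cases wrong.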
Request for comments. Where does that go? Who are you requesting comments from? It's like the overall governing body. I think it was IBM or Microsoft that kicked this off, so it started with IP addressing, and then they built this consortium that takes requests for comments and adds them to the overall standard. Yes, close: it's the Internet Engineering Task Force, the IETF, and they have a series of Requests for Comments, RFCs, that define standards and protocols. You can look up the RFC for IP addresses that describes exactly how these things work. There's an RFC for private IP address ranges.

Actually, does anyone remember the Mirai botnet? Somebody tell us what it was, what it did. It was printers, network cameras, and I think there was something else. Routers, yeah. It would take a list of default usernames and passwords and scan the internet looking for these devices. And it's interesting because, I can't remember exactly how, the source code of the botnet was released. So you can actually go look at it. I was talking to some students, I want to say they were high school or undergrad students, who were trying to study and analyze this botnet and its software. Of course, you have to tell them you don't want to actually run this, because it will start scanning the public internet and you're affecting real people, right? So you want to run it in a private IP address range that you create. The problem is, the software would actually check, am I looking at a 192.168-whatever private IP address, and not scan those, because it only wanted to scan the public internet. But it turns out it did this incorrectly. The number of bits that it looked at was not actually correct.
So you could create a local network with an IP range that was private, but the software thought it was public, and so it would actually go and scan it. So that was pretty interesting. These things are hard to get right, even for malware authors.

And so the idea here should be apparent: as you go further down the classes, you can have more and more networks. At the class C level, there are two million networks, each with 256 addresses, or 254 usable hosts. What's the downside here? Networks need to be administered. Well, each one of these networks needs somebody to administer it, right? And these are associated with organizations, so we need to know who owns each IP range so we know where packets go.

Which of these ranges would ASU need? ASU would need about half a million. Yeah, so how would you get that number? How'd you get to half a million? Just by guessing, right? There's an organization that controls these networks, so how would ASU justify needing half a million IP addresses? Make the very coarse assumption that on campus there's a computer, a laptop, and a cell phone for every student and faculty member. Take 92,000 people, and that gets you roughly the number of IP addresses you would need for all of those devices to connect, even intermittently. And that's just the people on campus. Then you have the labs: there are lab computers, research computers, staff computers, plus staff laptops. Yeah, you could probably easily get to half a million if you started adding up all the people involved in research and that kind of stuff. So ASU would probably want a class A network. Are there any class A networks left to give? How many are there? 128 networks, right, or 126. Those are all gone, like 100% gone, right?
And a lot of organizations thought, hey, we need this, right? You're definitely going to need more than 65,000, because it's clear that any large organization will get past that. But the granularity is the problem: there's a huge gap between 65,000 and 16 million. So this was way too coarse-grained. Can somebody look up what schools have a class A IP address range? Okay, anyone?

So let's say ASU got a class A network range. Even at half a million hosts, even a million, are we using all of that IP address range? No, which is super wasteful, right? Because there are 15 million IP addresses that are being held and that nobody's using. So if anybody looks that up, tell us who has a class A.

So the idea was, okay, this class system is much, much too coarse-grained. They came up with a new scheme called classless inter-domain routing. Rather than having just these fixed classes tell you exactly which part of the IP address is the prefix you route on, you can put the boundary wherever you need it.

It was also very clear that there are a ton of machines that need to be on the network, or want to be on the network, and we want to provide IP addresses to them. IPv6 uses how big an address? 128-bit IP addresses, which is huge. I don't even know what to say, it's huge. What are some of the phrases to describe it? You could give every grain of sand on the earth its own IPv6 address and you would not run out. So it's a huge, huge address space, but adoption is actually very slow.

Yeah? The University of Southern California. Oh, so USC has a class A network, awesome. And I have IBM and Apple. IBM, Apple, yeah, those are good. We'll talk later about the security implications of having all of your organization's computers publicly accessible from the internet.
So that's a whole separate issue, but it definitely is a problem. So, CIDR. Let's look at this. So, USC? USC actually has two of them. Ah, nice, and Stanford. All right.

So the idea is, okay, I'm going to use you as a random number generator. Somebody give me a random IP address. 222? Actually, I already dislike this idea. Why? Because we want to actually write this number out in bits. So let's use 1.1.1.1 instead; that is super easy to do. Okay, cool.

So the question is, where does the network part end? Before, in our class system, we'd say, okay, for class A the network ID is the first seven bits after the class bit. It's very fixed. This defines the network it's going to be in, and the rest of the bits define where it goes after that. For class B the network ID was 14 bits, I don't want to count out where 14 bits is, and the rest is the host. But as we talked about, this is not very granular.

So the idea behind CIDR is to allow this net/host boundary on any bit between 13 and 27. Actually, we can work with anything up to 32. So say the boundary is here. This whole part would tell you the network, and this part would tell you the host. And you can do this at any level. The way this is usually written is 1.1.1.1 slash, let's say, 24. The slash 24 means place the boundary after the 24th bit, which would be right here. So this means the network that I want to talk to is 1.1.1, and the host that I want to talk to in that network is 0.0.0.1.

Are you going to give us any how-many-hosts-are-in-this-address-range questions on the exam? We've been talking about it, but. It's just powers of two, I'd expect.
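To make the slash-24 boundary concrete, here is a small sketch that splits an address into its network part and host part with a bit mask, and also reports how many addresses the block holds (the power of two being asked about). The function names `split_cidr` and `to_dotted` are hypothetical and exist only for this illustration.

```python
def to_dotted(value: int) -> str:
    """Write a 32-bit number in dotted decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def split_cidr(cidr: str) -> tuple:
    """Return (network part, host part, number of addresses) for 'a.b.c.d/n'."""
    address, prefix_text = cidr.split("/")
    prefix = int(prefix_text)
    if not 0 <= prefix <= 32:
        raise ValueError("prefix length must be between 0 and 32")

    value = 0
    for part in address.split("."):
        value = (value << 8) | int(part)

    # The mask has `prefix` one-bits on the left and zero-bits on the right.
    mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
    network = value & mask               # bits left of the boundary
    host = value & ~mask & 0xFFFFFFFF    # bits right of the boundary
    return to_dotted(network), to_dotted(host), 2 ** (32 - prefix)


if __name__ == "__main__":
    for example in ("1.1.1.1/24", "1.1.1.1/23", "1.1.1.1/16"):
        print(example, split_cidr(example))
```

With a /24 the boundary lines up with the third dot. With a /23 it does not, which is why the host part of 1.1.1.1/23 comes out as 0.0.1.1 and the block size is two to the ninth.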
No, it's not that easy, because the boundary doesn't necessarily have to fall on an 8-bit multiple. Correct, but if it was here, it would be like two to the ninth or something, right? Yeah. If I were to ask you this, that would be an acceptable answer. Otherwise, I don't talk about what will or will not be on the exam, but I'll ask you security-relevant questions. Is that helpful?

So where this comes into play is, let's say I know the network is 1.1.1.1 slash 24. You would be able to answer the question, are these IP addresses in the same network? For the second one, no: the network ID is different, and you know that because of the slash 24. If I change this to a slash 16, then yes, because the boundary is here; you compare the first parts and ask, are they in the same network? The other ones aren't. We'll see how that comes into play in a bit, but we needed to talk about how to represent and think and talk about IP addresses.

So, the IP protocol is really the core of everything. It's how data gets from one network to another, as we said. And it's important to understand what it actually provides. We'll see a little bit, technically, why it does or does not provide these things, and you can have conversations about whether this is a good idea or a bad idea. It's important, though, to understand what the guarantees are at each level. The IP protocol provides a connectionless, unreliable, best-effort datagram delivery service. A datagram is just one packet, one piece of information; that's the way to think about it for now. This means that delivery, integrity, ordering, non-duplication, and bandwidth are not guaranteed. If you want to send a packet, some chunk of data, to a certain IP address, there's actually no guarantee it'll get there. There's no guarantee that what arrives on the other side is actually what you sent.
There's no guarantee that the data won't be sent to them five times; the same information could be delivered five times. And you're not even guaranteed when it will get there or how long it will take. So, is this insane? As engineers, does this seem like a good thing to you? As long as you can watch Netflix. Yeah, but then when it starts buffering you get upset at it, and as a scientist you should understand why you're upset at it. When you're watching Netflix, do you care if a frame of data is dropped? One, three, or five? No, you'd never actually know, right? Maybe you'd see a very slight stutter, but one frame would probably not be enough for you to notice. Same thing with audio calls. Audio calls are one of the classic examples where you don't really care whether every packet gets there, because if you drop a tiny bit of audio you can still understand things.

So the way to think about this, in the stack that we talked about, is: suppose the IP layer guaranteed all these things. Suppose it was reliable, did everything, guaranteed integrity. Not every application actually needs that, right? And all these features come at a cost. If we guarantee that things are delivered, as we'll see, that adds a lot of latency overhead. Some network applications don't need to pay that latency cost, so they shouldn't have to.

What does connectionless mean, besides the obvious? Somebody might not be listening. That's good, okay. So, yeah, one aspect of what it means is there's no guarantee that there's actually somebody on the other end to listen. What else does it mean? There's no reply confirming that you've actually created a connection or who you're really talking to. Yeah, there's no way, necessarily, for them to just send you something back as part of IP.
And if you get a packet or a piece of data from them, you don't actually know: is this a reply to the thing that I sent, or is it a brand new packet coming back? So that's what connectionless means here. When you're talking to somebody on the phone, you know that they're replying to what you said, right? This would be like calling somebody and leaving a voicemail, and then they call you back and leave a voicemail on your phone. You don't actually know if they're calling about something new or about the thing you originally called them about. All you see is the voicemail.

So, IP datagrams. This is important, and it's why we talked about IP addresses: IP datagrams can be exchanged between any two nodes, provided they both have IP addresses, and we'll add another caveat, that they are publicly addressable IP addresses. The other good analogy here is the mail system. If I know your address, I can send you a letter, right? Everybody agree? I know your address and your name, and I think the name is optional. And as we talked about, the link layer, which we'll briefly touch on, has a lot of different protocols for how to actually do the link layer part, but at the IP level we don't care about that at all.

So in our good friend RFC 791, which you can go to and see a diagram exactly like this, an IP datagram is broken down into its various chunks. The idea is just like a letter, right? Can you just write a letter to somebody, like "Dear so-and-so, you're the boss, sincerely, Adam," and just throw it in the mailbox? You need to write an address. What else do you need? Stamp it. You need to pay for it, so you need a stamp. What else? A return address. You need a return address in case anything goes wrong. Do you need an envelope? What makes a postcard different?
So an envelope is optional, right? What does the envelope provide? Yeah? It protects whatever is inside. Optionally, also, if you're sending a larger object, you wouldn't want just an envelope; you might want a cardboard box or something. Yeah, well, that's getting a little off topic now, but yes, definitely, you can pay more money, and sometimes you can even ship the object itself. I should try that. There's a really cool website from a person who sent a disposable camera through the mail system, and on the back were instructions like, "Dear postal worker, please take a picture of yourself as you're processing this." So they would just send it various places, and it came back with pictures of mail carriers and people working at postal facilities, and random stuff taken as it was getting processed. They literally put the postage and the address on the camera itself and shipped it like that. Yeah, the post office is pretty cool.

And just like that, you can't just throw data at another machine, right? You need to provide metadata: where is it going? Who did it come from? Any other options and flags? One important thing is what version of the IP protocol you are speaking. This is actually very important, and a good design decision that shows up in all types of APIs and protocols, because you want a way to say that you are talking IPv4. If the node on the other side gets your packet and doesn't know whether you're talking IPv4, IPv5, IPv6, or IPv100 because it's the future, then it doesn't know how to interpret the rest of the packet. So it's important that the version is the first four bits that come across. Then the header length: how long is the header going to be? Then some other fields: service type, total length, and an identifier.
The identifier uniquely identifies this packet; we'll talk about that very briefly. Then flags: there are various IP-level flags that you can set on the packet. One thing we didn't talk about: in the postal system, you can pay more money to have them ship bigger and bigger objects for you. Here, different physical links have different restrictions on how big a chunk of data they can carry. So what IP provides is this: if you try to send a chunk of information that's, let's say, 50K, so a huge IP datagram, you can do that, but it will get chopped up into fragments, and those fragments get sent. It would be kind of like a transporter, where the post office takes your large package, cuts it up into pieces that actually fit, puts an identifier on each of them to say these are all from the same package, and then adds a fragment offset that says this is fragment one, fragment two, fragment three. That way, when the pieces get to the destination, by any number of routes, they can be reassembled back into that one thing.

Time to live is important; we'll talk about this in a second. Then the protocol field, which identifies the protocol carried inside. And then there's a checksum for the header. This is actually very interesting. It is not a cryptographic hash of the header information, but it is a checksum that verifies that the rest of the header arrived more or less correctly. I guess we should say, since we're doing security, that it verifies that the header matches the checksum. So what does a checksum do? This is not how the IP one is implemented, and I can't remember the exact checksum algorithm here, but for a simple example, you can build a one-bit checksum: XOR together all the data bits and send the result along as one extra bit. On the other end, you XOR every bit, including that last one.
If the result is a zero, you know the checksum passed. If it's a one, that means some bit somewhere flipped. The problem is, if you flip two bits, the checksum will be the same, so the check will pass. And it's the same thing even with these 16-bit checksums: you can easily manipulate bits in there maliciously to keep it valid. So what this provides is some assurance that the header data was not corrupted by a physical process, but it does not give you any security or cryptographic guarantee that the header has not been changed or altered.

Then we have the source IP address. This is the IP address of whoever is sending the packet. Why do we put this in here? When the other node gets the packet, it wants to know, who do I reply to? Who do I actually talk to about this? Then the destination IP address: very clearly, you need to say where you want this packet to go. There may be some options, then padding, and then the actual data.

What's important for us here is thinking about which pieces of this header information an attacker can control. We're not just learning about these protocols; we always want to think like the attacker and ask, what can the attacker change? Here, an attacker sending packets to somebody can alter any of the flags. They can flip any of those bits. They can mess with the fragment offsets and the identifier. This was actually a type of denial of service. Going back to our post office analogy, which doesn't quite work here: the post office normally cuts a package into pieces, but suppose someone sent a bunch of pieces that were already pre-cut, and they all said, I'm number five. The receiver tries to stack them all on top of each other, and it ends up crashing and killing their computer.
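The one-bit XOR checksum above is the lecture's simplified stand-in. For comparison, the header checksum that IPv4 actually uses (RFC 791, with the computation described in RFC 1071) is the 16-bit ones' complement of the ones' complement sum of the header's 16-bit words. The sketch below implements both. The sample header bytes and the names `internet_checksum`, `parity_bit`, `HEADER`, and `SWAPPED` are assumptions chosen for this illustration.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # Fold any carry out of the low 16 bits back in (end-around carry).
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parity_bit(data: bytes) -> int:
    """The lecture's toy checksum: XOR of every bit in the data."""
    bit = 0
    for byte in data:
        for shift in range(8):
            bit ^= (byte >> shift) & 1
    return bit


# A sample 20-byte IPv4 header with the checksum field (bytes 10-11) zeroed.
HEADER = bytes.fromhex("4500 0073 0000 4000 4011 0000 c0a8 0001 c0a8 00c7")
# The same header with the source and destination addresses swapped.
SWAPPED = HEADER[:12] + HEADER[16:20] + HEADER[12:16]

if __name__ == "__main__":
    print(hex(internet_checksum(HEADER)))   # 0xb861
    print(hex(internet_checksum(SWAPPED)))  # 0xb861, a different header with the same checksum
    # 0x0f and 0x0c differ in two bits and have the same parity; 0x0e differs in one bit.
    print(parity_bit(b"\x0f"), parity_bit(b"\x0c"), parity_bit(b"\x0e"))  # 0 0 1
```

Swapping two 16-bit words changes the header but not the sum, so the checksum still matches. That is fine for catching random corruption on a wire, and useless against someone changing the header on purpose, who can also simply recompute the checksum.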
I mean, that actually is what happened: it broke the reassembly process, caused a buffer overflow in the kernel of the other system, and caused it to crash. I think it took only something like 10 packets, and you could shut down any computer on the internet, which is not good; hopefully you agree.
Another super interesting thing, and this is something we're going to hammer on over and over again: the attacker controls the source IP address. They can put whatever return address they want in that source IP field. We'll look at the why later, but it's important to note, and we'll definitely have homework on this, that you can create any packets you want. You don't need to use the operating system to create your packets; if you have root access, you can craft your own packets and send them wherever you want them to go.
So the idea of encapsulation is that you have your IP packet, which is the IP data you're trying to send plus the IP header, and that's encapsulated at the frame level. At the link layer, Ethernet has its own headers, and the data part of the frame is the IP packet, which has its header and its data. And as we'll see, the data part of the IP packet starts with the header of the TCP packet, followed by the actual data. So it's this layering approach where you look and there's header, header, header, and then the actual data.
So how do packets get around? If two hosts are in the same physical network, the IP datagram is just encapsulated in the link layer protocol and delivered directly. How does a node know whether the node it wants to talk to is in the same network? Yes, it checks the IP address against the network that it's on.
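To make the header checksum and the spoofable source address concrete, here's a minimal sketch in Python. It assumes the standard IPv4 header checksum, a 16-bit ones' complement sum over the header, which the lecture doesn't spell out. The helper names internet_checksum and build_ipv4_header and the source address 203.0.113.7 are made up for illustration, and the code only builds bytes in memory; it doesn't send anything.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(src: str, dst: str, payload_len: int = 0) -> bytes:
    """Hypothetical helper: a 20-byte IPv4 header with no options."""
    def pack(checksum: int) -> bytes:
        return struct.pack(
            "!BBHHHBBH4s4s",
            (4 << 4) | 5,           # version 4, header length 5 words
            0,                      # type of service
            20 + payload_len,       # total length
            0x1234,                 # identification (arbitrary value)
            0,                      # flags and fragment offset
            64,                     # time to live
            6,                      # protocol (6 = TCP)
            checksum,
            socket.inet_aton(src),  # source: whatever the sender claims
            socket.inet_aton(dst),
        )
    # Compute the checksum over the header with the checksum field zeroed.
    return pack(internet_checksum(pack(0)))


header = build_ipv4_header("203.0.113.7", "111.10.20.14")
print(internet_checksum(header) == 0)  # a valid header checks out to zero

# Swap source and destination: a different header, but it still verifies.
swapped = header[:12] + header[16:20] + header[12:16]
print(swapped != header, internet_checksum(swapped) == 0)
```

Swapping the two addresses leaves the checksum valid because the sum doesn't care about word order, and anyone who changes a field can simply recompute the checksum. That's the sense in which it guards against line noise and not against an attacker.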
So we looked at CIDR. The host says: I know what network I'm on, I know the IP address of the machine I want to talk to; is this machine in my same subnet? It basically checks: does the prefix of my network match the prefix of the other system's address? This is written in a couple of different ways. One way is the slash notation: 111.10.20.1 slash 24 says that the subnetwork is 111.10.20. So you have two machines here: 111.10.20.121 and 111.10.20.14.
What are these other numbers? An identifier of your hardware, right, of your physical machine. Yes, and where is it used in our layers? It's called a MAC address. The switch uses it to send the signal to the correct machine. Yes. So the IP address is clearly at the IP layer; what layer is the MAC address at? The link layer. It's not the physical layer, and the way to prove this to yourself is to look at your wireless card and your Ethernet card: both have a MAC address, because at the MAC address level it doesn't matter whether you're on Ethernet or Wi-Fi. So all this does is uniquely identify your NIC, your network card hardware, and it is also spoofable by an attacker. Cool.
So we are .121, and we want to send a packet to .14. How do we know we want to send a packet to .14? Yeah, exactly: we might have gotten it from DNS resolution, or if it's a ping, then we typed in exactly that IP address. But set that aside, because the point is you know you want to talk to this machine, right? You are this machine, and you want to talk to .14, 111.10.20.14. Maybe it's a website you want to get information from; however that happens, it's part of the given: you want to talk to this machine.
And so, what does our .121 host ask first? It wants to talk to this machine; what does it ask first? A message to say, can I download this? Nope, before that. We're not even talking at that level yet. The hardware address? Do we know yet that we need that specific hardware address? Its IP? So we're at the IP level, and we know that we want to send a packet to 111.10.20.14. Is it in our network? Is it a local host or not? Are we going to be talking to a machine on our local network or a foreign machine on a different network? So how do we know? How does the host actually do it? You check the first three numbers, 111, 10, 20. Yeah, and why check the first three? That's the subnet, right? The subnet is 111.10.20, so this specifies the prefix. Any IP address we want to talk to whose first 24 bits are equal to 111.10.20 is in our network. If the subnet were 111.10, we would only check the first 16 bits. If the prefix length were something other than slash 24, we'd check a different number of bits. But this is how we know and how we check, and it's computationally very easy to check.
So yeah, okay, cool. So what we do is create our IP packet, and then we encapsulate that packet inside a frame. We're going to skip the ARP part for now, about how we know the MAC address, and we basically take our IP packet and say, okay, this is my MAC address, this is your MAC address, and send it out, and then the physical layer decides how that actually goes out to the machines on the network. But the point is that here we know we can talk directly to that host on our local network. We don't have to go through any other hops.
And so we need to briefly go into how a link layer frame works. A link layer frame has a destination, a source, a type, and then the data, with a CRC at the end.
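The same-subnet check described above can be sketched in a few lines of Python. This is an illustration and not how any particular operating system implements it; the function name same_subnet is hypothetical, and the address 111.10.21.14 is made up to show a host outside the /24.

```python
import ipaddress


def same_subnet(my_ip: str, prefix_len: int, other_ip: str) -> bool:
    """Compare the first prefix_len bits of two IPv4 addresses."""
    # Build a 32-bit mask with prefix_len leading one bits.
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    mine = int(ipaddress.IPv4Address(my_ip))
    other = int(ipaddress.IPv4Address(other_ip))
    return (mine & mask) == (other & mask)


# .121 asking about .14 on the 111.10.20 network: local, deliver directly.
print(same_subnet("111.10.20.121", 24, "111.10.20.14"))   # True
# A host in 111.10.21 is outside the /24 ...
print(same_subnet("111.10.20.121", 24, "111.10.21.14"))   # False
# ... but inside if the prefix were only 16 bits.
print(same_subnet("111.10.20.121", 16, "111.10.21.14"))   # True
```

The whole test is two AND operations and a comparison, which is why it's so cheap to do for every outgoing packet.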
So for IP, the type field specifies that. We have other types for ARP and reverse ARP. Ethernet is a super widely used link layer protocol. It has the fields we just saw: the destination and source addresses are each 48 bits, so you actually have a lot more address space here than with IP's 32 bits. The type, as we saw, is IP, ARP, or reverse ARP. And the important thing here is that the max it can send is 1,500 bytes of data. So this is where you get the fragmentation problem: if you try to send an IP packet that's larger than 1,500 bytes, it will be chopped up for you. Cool.
Okay, coming back to our example, how does .121 know how to create that Ethernet packet? Or Ethernet frame, sorry. What do we start with? What do we know here? That we're in the same network? Well, we know what network we're in, let's say that. What else do we know? We know the destination, let's go with that. The things we absolutely know: we know our network, our subnet. We know the destination we want to talk to. We also know our own IP address. And using that information, we can say, yes, the destination is absolutely in our subnetwork. What's the next piece of information I need to be able to create an Ethernet frame? What else do I know? I know my MAC address; it's in my machine. Do I know .14's MAC address? Possibly. But when you talk to an IP address, do you specify it? If you type in google.com, do you then provide the MAC address of the Google server you want to talk to? No (and for a remote server it wouldn't matter anyway). But the idea is we need some mechanism, some way to map and say: I know I want to talk to 111.10.20.14, and I know that host is on my local network. What is its MAC address?
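Going back to the 1,500-byte limit and fragmentation for a moment, here's a rough sketch of how a large datagram's data would be split. It assumes a 20-byte IP header with no options and uses the fact that IPv4 counts fragment offsets in units of 8 bytes; the function name plan_fragments is made up, and real stacks do considerably more bookkeeping than this.

```python
def plan_fragments(payload_len: int, mtu: int = 1500, header_len: int = 20):
    """Return (offset_in_8_byte_units, data_bytes, more_fragments) tuples."""
    # Each fragment carries its own IP header, and every fragment except
    # the last must carry a multiple of 8 data bytes.
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        more = offset + size < payload_len  # "more fragments" flag
        fragments.append((offset // 8, size, more))
        offset += size
    return fragments


# 4,000 bytes of data over Ethernet: two full fragments and a remainder.
for frag in plan_fragments(4000):
    print(frag)
# (0, 1480, True)
# (185, 1480, True)
# (370, 1040, False)
```

The receiver trusts these offsets when it reassembles, which is why offsets that overlap or don't line up were enough to crash systems that didn't validate them.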
So that is where ARP comes in, because you need a way to ask the computers on your local network: what's the MAC address of the computer with this IP address? Because the IP address is what you have. That's the key idea of ARP. So here we have a network where we're at 192.168.1.100 and we want to talk to 192.168.1.10, with these different Ethernet addresses, and we can assume we're in the 192.168.1 network, so it's a slash 24. If I'm host A, and you can do this on your own computer if you're running Linux, you can run arp -a, which shows the list of MAC address and IP address pairs the machine knows about. It's empty right now, so it doesn't show anything.
So if we want to ping 192.168.1.10, which generates an IP packet, before we can do that, host A has to send an Ethernet frame with an ARP request saying, who has 192.168.1.10? This is captured with a tool called tcpdump, which looks at traffic. You can go and play with this: you can use tcpdump or Wireshark to sniff the traffic that's going by. So here we can see it says: arp who-has 192.168.1.10 tell 192.168.1.100. This is quite literally the ARP packet here, an ARP-type packet that says, hey, who has this? And we can see that the source MAC address is 804674A3, which is host A's MAC address.
But the key problem is we don't know the MAC address we want to talk to, right? We're sending out an Ethernet frame. We have our source MAC address, but we don't know the destination. So for the destination we use all F's, which means broadcast to every single MAC address. This goes to the entire subnet: everyone on the subnet should get this message. Who has this? So if you have 16 million nodes on here, it's going to be a very large broadcast. It's going to go to a lot of nodes.
And then host B, which is listening for ARP traffic, gets this request and responds. The reply goes from 0131D98B8, right here, to 804674A3, so it's destined for just one machine, and it says: hey, I'm 192.168.1.10 and I'm at 0131D98B8. Then the ping actually happens, where you can see the ICMP echo messages. And if we run arp -a on host A, we can see that it now has 192.168.1.10 at this address. So it does cache this, as we talked about; it will cache it for a while. And the interesting thing is that host B also learns from this. If you run arp -a on host B, you see that it now knows about host A and has the mapping for it, because the ARP request it received contains all that information. Cool. So it's really important to understand this, because ARP bridges the gap between IP addresses and local MAC addresses. We haven't got to you, I'll just start at the time.
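The request and reply exchange can be sketched as bytes in Python. This is a simplified illustration: it uses the standard Ethernet and ARP field layout, but the MAC addresses 02:00:00:00:00:0a and 02:00:00:00:00:0b are invented stand-ins for hosts A and B (not the ones on the slide), the helper names are hypothetical, and nothing is put on the wire.

```python
import socket
import struct

BROADCAST = bytes.fromhex("ffffffffffff")  # all F's: every host on the link
ETHERTYPE_ARP = 0x0806
ARP_REQUEST, ARP_REPLY = 1, 2


def build_arp(op, sender_mac, sender_ip, target_mac, target_ip, eth_dst):
    """Ethernet header (dst, src, type) followed by a 28-byte ARP body."""
    eth = struct.pack("!6s6sH", eth_dst, sender_mac, ETHERTYPE_ARP)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,       # hardware type: Ethernet
        0x0800,  # protocol type: IPv4
        6, 4,    # MAC length, IP length
        op,
        sender_mac, socket.inet_aton(sender_ip),
        target_mac, socket.inet_aton(target_ip),
    )
    return eth + arp


def learn(cache: dict, frame: bytes) -> None:
    """Record the sender's IP-to-MAC mapping from an ARP frame."""
    sender_mac, sender_ip = struct.unpack("!6s4s", frame[22:32])
    mac_text = ":".join(f"{b:02x}" for b in sender_mac)
    cache[socket.inet_ntoa(sender_ip)] = mac_text


mac_a = bytes.fromhex("02000000000a")  # made-up MAC for host A
mac_b = bytes.fromhex("02000000000b")  # made-up MAC for host B

# Host A broadcasts: who has 192.168.1.10? Tell 192.168.1.100.
request = build_arp(ARP_REQUEST, mac_a, "192.168.1.100",
                    bytes(6), "192.168.1.10", BROADCAST)
# Host B answers host A directly.
reply = build_arp(ARP_REPLY, mac_b, "192.168.1.10",
                  mac_a, "192.168.1.100", mac_a)

cache_a, cache_b = {}, {}
learn(cache_b, request)  # B learns about A from the request itself
learn(cache_a, reply)    # A learns about B from the reply
print(cache_a)
print(cache_b)
```

Notice that learn accepts whatever the sender claims about itself. Nothing in the frame proves that the sender really owns that IP address.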