All right everybody. So let's get started. So quiet please. Our goals today are to continue the discussion of IP networking. Quiet please. We're going to look at some high-level concepts around networking and the functionality of the network. Then we'll have a tour of the different layers of the network, specifically the physical, data link, network, transport, and application layers. And we'll continue the discussion we started last time about the successes and drawbacks of layering.

All right. So why is networking important? Basically, we've seen an evolution of computing from desktop systems and mainframe systems to fully networked systems that intercommunicate. Increasingly, applications are moving from being installed on a single machine to living in the cloud as web services. Many of the largest companies these days are basically cloud providers. The network is the entity that makes all of this possible, so the connectivity provided by the network is critical. It's also the case that, in terms of user experience, difficulties with the network are among the most commonly cited problems that people bring up. For the iPad 2, the Wi-Fi connection is actually the largest complaint that people make. For the Kindle Fire it's the second largest, the largest being that you can't turn off one-click. So it's 35 percent and 25 percent. Those are very significant numbers.

Okay, and what is a facation? Anybody? Yeah, it's a vacation. Well, it is a vacation spelled with an F, but there are actually several interpretations of it. The one that's interesting is taking a break from Facebook. In other words, people are so used to being connected all the time that taking a break from being networked is considered a lifestyle change. All right, so most of the apps that you'll use communicate with the network. That allows the app to provide a rich experience, to use personalization and so on. Provide better services to you.
Provide access to a vast set of resources that are living on the network somewhere. So search engines, social media tools, shopping, and so on all use networking.

All right, so let's start working through some of the elements of networking. Basically the idea is to have processes running on different machines, working with the operating system, communicate with each other in a way that's as transparent as possible. That requires the two processes to be known to their operating systems, to communicate through appropriate system calls with the operating system, and then through some network hardware, either wireless or wired network cards. Many machines these days have multiple network interfaces. Multiple wired network cards are fairly common, and laptops will normally have both wired and wireless interfaces. So in fact a device can have multiple presences on the network, and it will normally have a different address for each one of those.

The addressing scheme for TCP/IP is based on two different notions of address for the physical machines. There's a physical address, the media access control (MAC) address, which is supposed to be fixed for each piece of hardware and each interface on the hardware. And then there's an IP address, which is a more dynamic entity that, while the device is connected, is associated with that specific MAC address. This is the address that's globally presented, typically through network directory services, so that other machines can potentially find this host, and so that even if this host initiates contact, return messages can come back to it. So the first stage in networking is to make sure that we can make this association between physical addresses, which are unique and attached to physical interface devices, and their network presence, which is the IP address. So for each MAC address, once the device registers itself with the network, typically with a DHCP server, it will receive an IP address as well.
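The association just described, a fixed hardware address that is handed a network address for the duration of a session, can be sketched roughly as follows. This is only a toy illustration: the `AddressPool` class and its methods are hypothetical names, the MAC strings are made up, and `192.0.2.0/29` is a documentation-only address range. A real address-assignment service (typically DHCP) involves lease timers and a message exchange that are not modeled here.

```python
import ipaddress


class AddressPool:
    """Hypothetical stand-in for an address-assignment service (DHCP-like)."""

    def __init__(self, cidr):
        # Usable host addresses in the (assumed) local network range.
        self._free = list(ipaddress.ip_network(cidr).hosts())
        self._leases = {}  # MAC address string -> currently assigned IP

    def connect(self, mac):
        """Return the IP mapped to this MAC, assigning one if needed."""
        mac = mac.lower()
        if mac not in self._leases:
            self._leases[mac] = self._free.pop(0)
        return self._leases[mac]

    def disconnect(self, mac):
        """End the session; the address goes back to the end of the pool."""
        ip = self._leases.pop(mac.lower(), None)
        if ip is not None:
            self._free.append(ip)


pool = AddressPool("192.0.2.0/29")
laptop = "00:1A:2B:3C:4D:5E"   # made-up MAC addresses
phone = "00:1A:2B:3C:4D:5F"

first = pool.connect(laptop)
same_session = pool.connect(laptop)   # mapping is stable while connected

pool.disconnect(laptop)
pool.connect(phone)                   # another device joins in the meantime
second = pool.connect(laptop)         # a new session may get a new address

print(first, same_session, second)
```

The MAC address stays the same throughout, while the IP address is stable within a session and may differ between sessions.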
That IP address will often change from session to session, but for as long as the device is connected and active it'll have a fixed mapping. And every other device will have a similar mapping from MAC to IP.

All right, so the MAC address is a 48-bit address that once upon a time was unique and associated with a particular instance of the hardware, like a serial number. It's a very large address space, enough for basically an arbitrary number of devices. Modern devices typically have settable MAC addresses. People discovered that it was very useful to be able to set the MAC address of a device to copy that of an older device, so that you save having to re-register the device on the network. But in theory it's supposed to be unique for each different physical device. The IP address, by contrast, on IPv4 is a 32-bit address, four groups of 8 bits, each written as a decimal value from 0 to 255, and for IPv6 it's a 128-bit address, which is usually written as a long hex string. It's either assigned statically by your network administrator, which is typical for a server, or, for normal clients, it's set dynamically by an address-assignment service (DHCP) when the computer connects to the network.

Okay, so a connection is a communication channel between two processes, through the OS and through the networking layer. In addition to having addresses associated with the interfaces of the physical machines, the services themselves typically have a port number associated with them. That's how a process here can connect, say, with an HTTP server or an FTP server at the other end. The packets will carry both the address of the machine and the port of the service that process A is seeking. All right, so in order to access the service at the other end, the sending process will also have a port number, which is how process B can route traffic back in response.
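To make the idea of a connection as a pair of (address, port) endpoints concrete, here is a minimal sketch using Python's standard `socket` module on the loopback interface. The choice of `127.0.0.1` and of letting the OS pick the ports (port 0) are assumptions made so the example is self-contained; a real service would listen on a well-known port.

```python
import socket

# "Process B": a server socket. Binding to port 0 asks the OS for any free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_addr = server.getsockname()      # (IP address, port) of the service

# "Process A": the client. The OS assigns it a source port on connect.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server_addr)
conn, peer = server.accept()            # peer is the client's (IP, port)

client_addr = client.getsockname()
print("client", client_addr, "-> server", server_addr)

# The server can reply because it knows the client's address and port.
conn.sendall(b"reply")
data = client.recv(5)
print("client received", data)

conn.close()
client.close()
server.close()
```

The address the server sees for its peer is exactly the client's own (IP, port) pair, which is what lets responses find their way back to the right process.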
All right, and here are some common port numbers for well-known protocols. They start with very low-level protocols, like Wake-on-LAN for starting up machines remotely on port 9, and then a lot of familiar protocols, FTP and so on, from 20 through 23. DNS is at 53 and HTTP is at port 80.

Yeah, right, that's coming up on a slide: why can't we use MAC instead of IP? MAC doesn't encode any topology of the network. It's an arbitrary address, so two machines with similar MAC addresses could be in different parts of the world. The IP address structure is designed to encode the topology of the network, sort of from the highest domain down to the lowest, so that machines can find each other without having to have a directory of everything, which is what would be required with MAC addresses.

The port numbers are quite significant, because a lot of security these days is about protecting and screening out particular port numbers, very often only letting through HTTP traffic. For every service here there's potentially a vulnerability at the other end that you might be able to exploit if you send the wrong kind of information, the wrong kind of packets. So port filtering is an important part of this too.

Okay, so the most important thing about the network is to deliver the packets, hopefully reasonably quickly. If you want to deliver a packet, say, from Berkeley to somewhere in Tokyo, you have a number of issues. The first one is reliability. There are many routers along that path, and as you go higher in the network topology there's typically more traffic and more congestion, so there's a good chance of packet loss somewhere, collisions between packets, and so you want to have a reliability layer as well.
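A reliability layer of the kind mentioned above can be sketched as a stop-and-wait scheme: send one packet, wait for an acknowledgement, and retransmit if none arrives. The simulation below is a toy under stated assumptions: the function name `send_reliably`, the loss rate, and the use of a seeded random generator to model a lossy path are all illustrative choices, and real protocols such as TCP use timers and windows that are not modeled here.

```python
import random


def send_reliably(messages, loss_rate=0.3, seed=0, max_tries=50):
    """Simulate stop-and-wait delivery over a path that drops packets.

    Returns (messages as delivered to the receiver, total transmissions).
    """
    rng = random.Random(seed)       # seeded so the run is repeatable
    delivered = []
    transmissions = 0
    for seq, msg in enumerate(messages):
        for _ in range(max_tries):
            transmissions += 1
            if rng.random() < loss_rate:
                continue            # data packet lost: no ack, so resend
            if len(delivered) == seq:
                delivered.append(msg)   # receiver accepts a new sequence number
            # else: duplicate of an already-delivered packet, ignored
            if rng.random() < loss_rate:
                continue            # ack lost: sender cannot tell, so resend
            break                   # ack received: move to the next message
        else:
            raise RuntimeError("gave up after too many retransmissions")
    return delivered, transmissions


msgs = ["a", "b", "c", "d", "e"]
received, sent_count = send_reliably(msgs)
print(received, "delivered using", sent_count, "transmissions")
```

The receiver ends up with every message exactly once and in order, at the cost of extra transmissions whenever a packet or its acknowledgement is lost.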
Flow control is important too. Part of congestion control, especially at higher levels, is keeping track of how many packets are being sent on the network and how many collisions are happening. Protocols these days, TCP in particular, have reliability, flow control, and congestion control built in. Flow control is actually about dealing simply with the receive buffer and making sure you're not sending packets before the receiver is ready to receive them. So clearly, if you have a mismatch between the endpoints, say a server trying to send to a phone, the flow control, the acknowledgements that the sender is going to wait for, guarantees that you don't get too much information flowing too fast.

All right, congestion control is about avoiding collisions generally. There the clues to congestion include a lack of acknowledgements coming back, which implies collisions either with the outbound packets or with the acknowledgement packets coming back. Think about what happens without congestion control. The network is not exactly a tree, but it does involve a higher throughput of packets as you go higher up in the topology, so the volume of traffic gets larger and larger as you go up, and if you don't manage it somehow you'll just have a complete overload situation. We'll talk a little bit later on about how that's managed at the lowest layers. Clearly we don't want to let the network grind to a halt, so we've got to do that.

All right, so we talked last time about layering. Layering involves breaking down this complex protocol, between a large number of services on the network and a large number of potential physical transports, so that you have a manageable number of interoperating functions.
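The flow-control idea described above, where the sender never puts more data in flight than the receiver's buffer can hold, can be sketched with a small simulation. Everything here is an illustrative assumption: the function name `transfer`, the round-based timing, and the fixed drain rate of the receiving application. It is not how TCP is implemented, only the shape of the idea.

```python
def transfer(total_packets, buffer_size, drain_per_round):
    """Simulate a sender limited by the receiver's advertised window.

    Each round the receiver advertises its free buffer space, the sender
    sends at most that many packets, and the receiving application then
    consumes up to drain_per_round packets from the buffer.
    Returns (rounds needed, peak buffer occupancy).
    """
    buffered = sent = consumed = rounds = peak = 0
    while consumed < total_packets:
        window = buffer_size - buffered              # advertised by the receiver
        burst = min(window, total_packets - sent)    # sender respects the window
        sent += burst
        buffered += burst
        peak = max(peak, buffered)
        drained = min(drain_per_round, buffered)     # slow application reads data
        buffered -= drained
        consumed += drained
        rounds += 1
    return rounds, peak


# A fast server sending 20 packets to a slow phone with a 4-packet buffer.
rounds, peak = transfer(total_packets=20, buffer_size=4, drain_per_round=1)
print("rounds:", rounds, "peak buffer occupancy:", peak)
```

However fast the sender could go, the buffer never holds more than it has room for, and the overall pace is set by how quickly the receiver drains it.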
The layering of IP guarantees that each layer only has to use the services exposed by the next layer down, and only has to export services to the layer above. There's also a simple encapsulation that goes along with that, a kind of recursive encapsulation of packets, which we'll see. So this layering constrains and simplifies the kinds of interaction that you need, hides the details of implementation at the layers below, and allows you to change various parts of the stack without affecting other parts.

All right, so a service is a functionality exposed by a layer. The service interface is a set of calls that use that functionality, and that's presented to the layer above. The protocol specifies how particular peers communicate in order to achieve that service. So the protocol, as we discussed last time, specifies a syntax and a semantics of the communication: syntax being the formal structure of the messages that are going back and forth, and semantics being the meaning of particular messages and encodings, what actions should be taken, and so on. So an FTP request is both a message of a particular format and a request that a packet should be coming back. The protocol doesn't, though, specify exactly how those operations should be implemented, especially the details of how the messages are put together and decoded.

All right, so the layering model that was originally designed to support complex systems, including the internet, was the OSI layering model, which was developed way back in 1984, and it has these seven layers going from the physical layer up to the application layer. Although it's an idealization that's worth shooting for, it implies a really large amount of nesting of communication and wrapping of packets, and a lot of overhead in implementing these layers. So IP simplifies the stack: there are only five layers in the IP stack.
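The recursive encapsulation mentioned above can be sketched in a few lines: on the way down the stack each layer prepends its own header to whatever the layer above handed it, and on the way up each layer strips only its own header. The bracketed text headers and the function names below are purely illustrative assumptions; real headers are binary structures with addresses, lengths, and checksums.

```python
# Order in which layers wrap the application data on the sending side.
LAYERS = ["transport", "network", "datalink"]


def encapsulate(payload: bytes) -> bytes:
    """Wrap application data in one (toy) header per layer, top to bottom."""
    for layer in LAYERS:
        payload = f"[{layer}]".encode() + payload
    return payload


def decapsulate(frame: bytes) -> bytes:
    """Strip headers in the reverse order, as the receiving stack would."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"expected {layer} header")
        frame = frame[len(header):]   # this layer looks only at its own header
    return frame


frame = encapsulate(b"GET /index.html")
print(frame)
print(decapsulate(frame))
```

The outermost header belongs to the lowest layer, which is why a frame on the local network carries the network and transport headers, untouched, inside its payload.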
The first four follow the OSI model, and the application layer just simply sits on top of the transport layer. It makes it a lot more efficient, and as it turns out it's about as simple as the seven layer model.

All right, so let's look at the physical layer first. This is the wire or the wireless link that moves the bits around between devices. The service here is simply to move bits between two hosts, which have to be known and identified by their MAC addresses, their physical addresses. The interface has to specify, at the physical layer, what the meaning of the bits is: how the signals have to change to encode the bits that are moving across the wire. The protocol is a coding scheme. There are various coding schemes that are used for physical transmission, especially over wireless. These can involve coding theory and exactly how you map logical bits to levels on the interface, and they can involve error checking, error recovery, and forward error correction, so a variety of functions go into the protocol at this low layer. The end result, though, is simply some block of bits going in at one end and coming out the other end.

So, examples of the physical layer: originally Ethernet was implemented using coaxial cable, which you could daisy chain between hosts. For higher speed networking, optical fiber can move bits faster than wires can, and we've seen many generations of wireless that are now working into the tens of gigabits a second.

All right, so moving up a layer to the data link layer. The idea here is that we're starting to encode communication as messages over that physical link, which are logically simpler to think about. So now we have some additional services that we have to provide, including arbitration of access to the physical medium and possibly layers of reliability and flow control.
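As one concrete instance of mapping logical bits to signal levels, here is a minimal sketch of Manchester encoding, a line code used by early 10 Mbit/s Ethernet. The lecture does not name a specific code, so this choice is an assumption made for illustration, as is the polarity used (0 as high then low, 1 as low then high, which follows the IEEE 802.3 convention; the opposite convention also exists).

```python
def manchester_encode(bits):
    """Map each bit to two half-bit signal levels.

    Assumed convention: 0 -> (high, low), 1 -> (low, high).
    """
    levels = []
    for bit in bits:
        levels.extend((0, 1) if bit else (1, 0))
    return levels


def manchester_decode(levels):
    """Recover bits from level pairs; a pair with no transition is an error."""
    if len(levels) % 2:
        raise ValueError("expected an even number of half-bit levels")
    bits = []
    for i in range(0, len(levels), 2):
        pair = (levels[i], levels[i + 1])
        if pair == (0, 1):
            bits.append(1)
        elif pair == (1, 0):
            bits.append(0)
        else:
            raise ValueError(f"invalid signal (no mid-bit transition) at {i}")
    return bits


data = [1, 0, 1, 1, 0]
signal = manchester_encode(data)
print(signal)
print(manchester_decode(signal))
```

Because every bit has a transition in the middle, the receiver can recover the sender's clock from the signal itself, and a pair with no transition is detectable as corruption, at the cost of using twice as many signal changes per bit.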
So once we commit to packets, we're basically going to expose a service that can receive requests from different processes trying to communicate through the network, so these arbitration and flow control primitives are necessary. The interface that's exposed should be to send those messages, or frames, to other hosts and receive frames back. The protocol will typically include an address, the MAC address of the device that should be the target, and a media access control protocol. There are a couple of examples here; carrier sense multiple access with collision detection (CSMA/CD) is one of them.

So we've moved up these two layers, and now we've gone from the physical layer to a packet representation. We have frames, first of all, and we'll have addresses encoded in those frames. This is the first of many layers of framing that happen in IP. At this lowest level we get the outer containing headers, so these frames are going to enclose a payload and simply specify how that payload moves around on a local area network.

MAC addresses always exist; they're just hidden, perhaps, under the IP address of your host, but they're easily exposed. The command on Linux and Mac is ifconfig, and on Windows it's ipconfig. Running ipconfig on Windows will give you a printout like this. It usually lists multiple devices. The wireless LAN is here; its physical address, the MAC address, is here, a 48-bit address. The IP address is not shown here, but the Ethernet adapter has both the MAC address and then the IP address here; in fact it's got two of them. The IPv4 address is here, that group of four bytes, basically a 32-bit address. This is the IPv6 address, which is 128 bits. It's sort of in a shorthand here, and you can see some breaks between the blocks of four hex characters, but this is the full address, which allows a much more extended address space than IPv4.
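The address sizes and the IPv6 shorthand just described can be checked with Python's standard `ipaddress` module. The two addresses below are documentation-only examples chosen for illustration, not the ones on the slide.

```python
import ipaddress

v4 = ipaddress.ip_address("192.0.2.1")      # example IPv4 address
v6 = ipaddress.ip_address("2001:db8::1")    # example IPv6 address, shorthand form

# IPv4: 32 bits, conventionally written as four bytes in decimal.
octets = list(v4.packed)
print(v4.max_prefixlen, "bits:", octets)

# IPv6: 128 bits, written as eight blocks of four hex characters.
print(v6.max_prefixlen, "bits")
print("full form:     ", v6.exploded)
print("shorthand form:", v6.compressed)
```

The shorthand drops leading zeros in each block and replaces one run of all-zero blocks with `::`, which is why the address printed by `ifconfig` or `ipconfig` often looks shorter than 32 hex characters.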
Okay, so the lowest level of the network is the local area network. This is a bunch of machines that are communicating with each other and are physically close to each other. Logically they have a broadcast medium between them, and they use the same physical communication technology, so it could be a particular wireless network or a particular wired network at home, or an Ethernet in an office. Those machines all know about each other and can communicate point to point.

All right, so LANs share the same physical communication medium, which is a broadcast channel: every frame is normally forwarded to every host on the LAN. Hubs are basically devices that have a separate physical cable to each machine but which forward all inbound messages to all of the other hosts. So when a machine sends out a packet, the lowest level packet will have an encoding of the MAC address of the intended target. All of the machines receive the packet and will discard packets which don't match their MAC address. So the addressing at the lowest level, at the data link layer, is based on MAC addresses. So the packet is actually going to be sent to all of the hosts, and if the intended target is B, it's only going to be processed by B and discarded by C.

Okay, so one level of complexity higher than hubs is switches, which route traffic selectively based on the intended address and therefore minimize the bandwidth used. In contrast to a hub, which is going to forward to every wire that is connected to it, a switch is going to selectively forward based on the data link address. So for a packet to B, the switch is aware of the addresses of all of the devices connected to it, and it's only going to forward to the device with MAC address B.

All right, let's see. So there are a few issues with MAC protocols. One of them is access to the broadcast medium, and the second one is avoiding collisions. So
there are a few solutions. Most of these solutions are fairly old in the history of networking. The one that's currently used almost universally is random access, but let's look at some of the others first.

All right, so in channel partitioning protocols the idea is that this LAN, which is really carrying all of the traffic between the machines, is going to be partitioned so that each machine gets an equal share of the bandwidth. The realization of this that's most common is in wireless protocols, including FDMA, which basically just means dividing the spectrum evenly between the hosts, and TDMA, another kind of wireless protocol, which basically defines time slices that are shared between the hosts. So it's evenly dividing a fixed wireless channel among different hosts. So basically all of the hosts can communicate at once. Most likely in wireless networking they'll be communicating with a single access point and then receiving a message back. But this protocol simplifies everything by not having to adapt the bandwidth of each node according to the traffic from the other nodes. Part of the history of this, too, is that it was also developed for cellular networks, where the idea is to provide each customer with a consistent amount of service.

All right, so another protocol that was used in early LANs was token ring. Here, in order to determine which host could talk, there was an electronic token that moved around the network. It was held by a node for a while, then that node would forward the token packet to the next node, and only the node that had the token could transmit to the network. So that again avoids collisions: you have a unique sender. It's quite similar to time division multiplexing in the wireless spectrum, but it's a bit more adaptive, because a host that wants to talk can send a long message, and then, if no one else is sending, the token is going to spin around and quickly
come back to the same node. So it allows a node that has some data to send to use a lot more than 1/N of the share. The disadvantage, though, and the reason this is not used much, is that it has a considerable overhead, especially compared to modern network speeds. Each machine receiving the token, deciding whether to transmit, and then forwarding the token can involve a large amount of overhead relative to the amount of data going over the network. Also, if a node fails, then the token passing stops, or a node could cheat and perhaps resend the token and cause all kinds of havoc and congestion. All right, so these are simple protocols to understand, though, where you somehow explicitly give control of some part of the network to a particular node.

All right, the protocol that is most commonly used, though, is random access, and here the idea is to make a best effort to not collide with someone else. So the idea is to listen to the medium first, see if anyone else is currently sending, wait until there's nobody sending, and then try to send. There's a small probability that between your last sensing operation and your transmit operation somebody else also starts. So you still keep listening to the network, to make sure that if somebody else is talking while you're talking you can detect that and then stop. Basically both of you have sent garbage, because you've sent two messages over the same physical medium and most likely they're both corrupted. Both nodes, in theory, should stop at that point, and then both nodes should choose a random delay time, such that there's a very low probability of them colliding again. Whichever node picked the shorter random time will start, and the other node should be able to recognize that it's speaking and then wait until its message has gone through. So that's a simple scheme that works quite well, and it's the one used in Ethernet,
which has proved to be extremely reliable. All right, so congestion control, at least at this layer, is not too bad.

Okay, so let's pause for a moment. This is a good point to take a pause and just review where things are. Project two code is due Thursday, tomorrow night, Halloween. Try to get it in on time. The group evaluations are due on Friday. We really want you to try to get this one in on time. There are only four slip days, and I know several groups used some of them already on the first project. We do have automatic deductions, and you do have a couple of challenging projects coming up, so please try to get these ones in on time.

So we have a bit of a problem with some of the projects, so I'm just going to review our collaboration policy. The policy is stated up front in the course. When you're working on projects, it is okay to discuss your design with other groups and to make suggestions about code from another group, but you obviously shouldn't use, copy, or share code from other groups. We discourage you from carefully reading other groups' code, so that you generate your own ideas and what you submit is obviously distinct from what other people are submitting. In particular, we want you to not be copying or substantially reading code that's online, or test cases from prior years, because unfortunately we've discovered that a number of previous projects are online, and there is a problem with some groups using that information and submitting it. So unfortunately we've had to review the project submissions from project one, and we'll have to take some action against some of the groups. So please don't get yourself into that situation. The project deadline is coming up, but there's a help session today. Make sure that what you submit for this project is your own work, and let's please not see any more of this. Okay, so are there any questions, any clarifications needed? All right, well
let's take a five-minute break and we'll continue with a quiz right after.

All right, let's continue. So let's review some of the ideas about layering and networking. First of all, do protocols specify an implementation, yes or no? No. Okay, so syntax and semantics, but not implementation. Congestion control is about making sure the sender doesn't overwhelm the receiver, yes or no? All right, false. So what is it that's taking care of not overflowing the receiver? That's flow control, yeah. A random access protocol is efficient at low utilization. True, yeah. There's minimal delay, because it's only listening for existing traffic on the network. All right, and at the data link layer, hosts are identified by IP addresses. Good, false. The data link layer is actually below that layer, so hosts are identified just by MAC addresses. All right, the physical layer is concerned with sending and receiving bits. Yeah. All right, good.

All right, so let's start moving up to more of the interesting stuff. At the network layer we're really starting to talk now about IP traffic. Packets now include an IP address, a global address that's the intended recipient, and they'll generally be traveling over multiple networks to get there. At this layer we can also have services associated with scheduling and priority, and even buffer management. The interface is pretty simple: it's about sending these packets to specified network addresses and receiving packets that are addressed to you. Okay, the protocols define globally unique network addresses, and they include the construction of tables and the routing process on the network. So there's really a lot of complexity here, which we'll see on a slide later on.

Okay, so at the network layer we already have two levels. Actually, it'll be easier to see it from here. At the network layer we have network headers that include the
addresses of where the packets are intended to go, and normally the sender addresses are there as well. These packets are nested inside of the data link layer packets, which include the MAC layer information, the MAC address, once they're being sent on a physical network. Okay, and IP addresses, as we said before, are either statically assigned by somebody or managed by an address-assignment service. So both the destination address and the source address are included in that header.

Okay, so a wide area network is really a network of networks. It's a set of different networks, glued together by routers, that covers a large area, and these days the entire planet. So the internet itself is a single wide area network. It will include multiple LANs and provide access for, these days, millions, and actually it would be billions of hosts now, because there are billions of cell phones that have access. The hosts are connected by routers, which use different technologies to communicate.

All right, so routers are responsible for forwarding. Depending on the type of router, sometimes they store the header only and then basically flow the rest of the message through as they forward it, or they capture the entire packet and then forward it, in order to provide, perhaps, error recovery before forwarding. The forwarding table is a mapping between destination addresses and output ports, so it determines how a given node is going to forward packets that it receives. The router is basically a computer, typically with its own operating system and with services that are running to route to hosts that it already knows about. It's normally also spending a lot of its resources communicating with other routers to determine dynamically which are the best ways to move packets around. All right, so on receiving a
packet, a router is going to read the destination address, look at the forwarding table, and send to the output port. All right, and there it goes.

So, as we already said, the IP addresses are logical addresses. They encode the entire internet, and they're topologically ordered in a certain way. They're essentially like a physical postal address. In a postal address there's a decreasing amount of locality: there's local information at the beginning and global information at the end. IP addresses are just reversed: the global information is at the beginning and the local information is at the end. And we already talked about why MAC addresses wouldn't work. They're not scalable. You'd have to record every potential MAC address in some massive table and pass that around, so it's just not practical for billions of addresses. You could think about the MAC address as being something like a social security number for people: it has no geographic organization, whereas IP addresses are organized in a coarse-to-fine manner, like home addresses simply flipped. The MAC address is supposed to be globally unique, though, so it does allow a unique mapping from IP address to physical address. The IP address, though, can and does change. For instance, if you take your laptop from one geographic location to somewhere else on a plane, it's going to need a geographically appropriate address at the other place, so that's going to be different. So it's a little bit like this: as you move around, your social security number is going to go with you, but your address is going to change.

Well, we already said that IP addresses have this geographic locality. So for instance at Berkeley the IPv6 prefix is a9e5; the old one was 128. Those first four digits are actually unique, so any address that begins with that sequence is going to be a Berkeley address. So you can tell that the
router in New York, if it sees this packet, only needs to know how to forward to Berkeley. There are a fair number of prefixes of that length, that's, what, 32, no, 16 bits, so there are thousands of them, but still it's a lot less than the full address space of IPv6. So having a table of these prefixes and knowledge of where to send them simplifies the router's task enormously. It only needs to know that Berkeley prefix and not the address of every node. That's analogous to knowing, let's say, the state in an address if you're the postal service. That tells you how to forward the letter to an appropriate state forwarding facility, which is then going to forward to the actual address, because that forwarding facility has the local routing information that you need.

All right, so here's a high-level map of the internet these days. Here we are, down here, and businesses are down here. The network that sits above us is typically partitioned into three tiers of networking provider. Once upon a time there used to be a global backbone of the internet, which was originally managed by DARPA and then taken over by the National Science Foundation. As the net grew and was commercialized, largely private providers took over responsibility for this backbone. Nevertheless, there's still a concept of a backbone level, which is the level through which most traffic goes and the level for which the lower-level services have to pay for transit. So tier two ISPs normally have to rent transit from the tier one providers. In turn, tier one providers don't rent from anyone. They just possibly peer with other tier one services for efficiency and performance, but they don't have to pay for it. There are relatively few of these, but they include companies like AT&T, Sprint, Verizon, and Tata Communications, perhaps a few dozen at that scale. All right, so at the next level down, they're
often regional entities, or, like Comcast, national, but they're relying on some of the other providers here for their tier one service. They don't implement end-to-end networking of sufficient bandwidth to be a tier one provider.

Okay, so even with a relatively small number of these tier one and tier two providers, there's still a lot of complexity in routing. We just saw that Berkeley is one of potentially about 65,000 addresses at that first prefix, so it's still necessary for routers here to be able to find efficient paths between, let's say, lower-level nodes that might be transiting them. So a lot of the complexity of the network now is in discovery and routing protocols. IS-IS is the protocol that's most commonly used by these high-level providers, and basically it's a discovery protocol. Routers here are communicating with each other using this protocol and telling each other about routes that they know to reach lower-level routers. Basically each router gathers enough information to make its own map of part of the internet at a high level, and given that, it then runs a simple algorithm, which is basically Dijkstra's algorithm, to find shortest routes to certain nodes. That's what's going to go into its routing table. Routers at the next level down do exactly the same thing. The only difference is that they're likely to have these tier one networks predominating in many of their routes: they'll forward to a tier one address before forwarding somewhere else. In other words, the addresses that they'll route directly will be local to their jurisdiction, and everything else will go through a tier one.

All right, so now an issue here is that these different domains can actually be running different protocols. Any given commercial network like this one will run a particular protocol, and most of them are running this one, but, say, Comcast potentially can be running a different protocol from
AT&T. So in order to mediate between the different zones there's another protocol called BGP, which is the Border Gateway Protocol, and that's specifically designed to arbitrate across the boundary between these domains. All right. And you start to get an idea of the complexity. As we go down there's a different discovery protocol. It does something very similar to this IS-IS protocol at a high level, in terms of talking to other routers, discovering short routes, and then populating the route table. The main difference is this is better adapted to large-scale backbone networks and this is better adapted to local networks, say within a business or within a local ISP. Okay. So there's a nice quote about standards that this slide is suggesting: the nice thing about standards is that there are so many to choose from. That's a quote from Andrew Tanenbaum. So it is rather remarkable that the internet really works so well. But it does, and it does because, although there's a diversity of protocols, there are standards that they're all following, and it's enough to allow them to communicate. And certain common properties, such as the idea of routing tables themselves, which are populated in a lot of different ways, and the basic IP protocol allow this all to work. Okay. So still at the network layer, IP is the basic transport of the network. The basic communication packet is the IP packet, and it's an unreliable packet delivery protocol. IP packet delivery is best effort, which means there's a destination address but packets may be lost or corrupted or delivered out of order. So you send and hope for the best, and the higher level services are responsible for repairing things if packets are lost. All right. So the interesting services, though, typically use higher level packet services, and the idea is to provide an abstraction that includes in particular a process address or port address. So the
transport layer is the level at which services are exposed, and therefore the port numbers that I showed you earlier are encoded in packets at this level. That allows you to demultiplex services, meaning packets associated with many different services can go through the same communication link and then be decoded at the other end based on their port address. The transport layer is where TCP lives, and so it can be responsible for reliability, sometimes timing properties, in protocols like RTP, the Real-time Transport Protocol, and things like rate adaptation. So TCP does include flow control and congestion control. All right. And so the interface is both a network address and also a port address. The main idea is to provide this service address or port number, and possibly the other services that we listed up here. TCP and UDP both live here. TCP is the reliable transport, and UDP is a simple connectionless datagram transport. So the port numbers, which encode services, basically specify an address. There's a convention about what certain addresses mean, but especially the higher addresses can be arbitrated by the two endpoints of the connection. For the standard addresses, port 80 is HTTP, certainly the most widely used port number, and a host that exposes port 80 is saying, okay, I'm going to implement the HTTP protocol by listening on this port. All right. So at the transport layer we now have this transport header, which is going to include the port numbers, both sending and receiving port numbers, and then some data. You can see the nesting happening. The highest levels of the protocol have the innermost nested headers as the packets are assembled going down the stack. So we have the port number up here, then the IP address down here, and then finally the MAC address at the bottom. So UDP is one workhorse of the internet. It's the simplest protocol for packet
communication, and it's similar to IP. It's just a best effort protocol. You send a packet and hope for the best. There's no ordering and there's no reliability guarantee. TCP is the reliable protocol that's widely used. It includes sequence numbers on packets, which allows packets to be reordered at the receiver if they're received out of order. It includes acknowledgments, so packet loss can be detected and repaired. And it's a streaming protocol, a packet based streaming protocol, which means that the idea is that there's a stable connection between two endpoints. So there's state that's shared at both the sender and the receiver that persists between messages. Unlike UDP, where there's no change in state before and after a message is sent, with TCP you've got to establish certain state at both the sender and the receiver. Basically they both run state machines that can tell which part of the protocol they've executed and where they're at. So establishing that state and maintaining it is essentially what's involved in creating a connection, or creating a socket connection, at the two ends of the TCP link. TCP can also determine through checksums that packets have been corrupted in transit, and it will discard those and request retransmission. And it has both flow control and congestion control, which are based on observing the acknowledgments coming back. Okay. So TCP doesn't provide performance or bandwidth or throughput guarantees. It can have very large delays because of the latency involved, and normally it will keep retrying to transmit a packet that's not being acknowledged for a long time. Other protocols, though, provide this kind of service, but not TCP/IP. And normally there's a tradeoff between reliability and delay or bandwidth. Okay. It is a statically addressed connection, though, so it won't survive a change of IP address if you did move your
machine to a different domain. All right. So at the highest level, in OSI numbering this would be level seven, and in TCP/IP it's the fifth level or top level. It's a service layer where services that are directly used by end users are exposed. Okay. And there's a very wide variety of interfaces associated with the application layer, and similarly the protocols are quite diverse. They range all the way from real-time communication used by Skype to asynchronous, relatively slow store-and-forward communication for email, web services, and custom protocols in Halo, BitTorrent, and so on. In OSI there are also session and presentation layers. Those improve modularity, but they come at a high performance cost, and so normally they're not implemented in the internet architecture. From the packet header wrapping that we saw earlier, you can see that if we did have these two layers, this would imply two additional wrapping steps to encode packets at these two layers, for not a lot of added functionality or performance. So we just skip those two. All right. So finally the application layer. Because of the diversity of services that might be involved, we're just showing these packets as data packets. In practice each protocol is inevitably going to have its own headers, so HTTP will have custom headers in here. But in order to make sense of this service at the other end, the application layer will have to understand what kind of protocol is being sent. All right. So to summarize, we have the lower three layers being implemented everywhere and the top two layers implemented at hosts. Here we have routers taking care of these network, data link, and physical layers. The hosts will be exposing endpoints, and application functionality on top of the endpoints. All right. So there's a horizontal sort of compatibility here. For every function that's initiated, say at
a client, there's a matching activity or service at the other end. So a socket connection here is always going to talk to a socket connection at the other end. That socket layer will be interacting through an API with application code at the high level, and there should be a matching application here. So HTTP requests here have to be interpreted by an application HTTP service at the other end. All right. The communication, though, is going up and down. There'll be a nested encapsulation of packets as they go down through the layers of the network, adding the layer specific information: first of all the port address, then the IP address, the MAC address, and so on. Then the packets will get sent over the physical network, and then potentially, depending on the technology, have new addresses added here because of a router. In order to get to this router, the IP address would have been the address of the router, so the forwarding address is going to have to be changed. So the IP address will have to be changed, and the MAC address will also have to be changed to get across the next link in the network. Finally you'll get a fully encapsulated packet here, and then you can strip off the layers, which provide you with other information, especially the source IP address and the source port, which the application can then use, for instance, if it needs to send back replies, which it usually will. All right. Any questions? Yeah. I mean, I'm not sure you can do much with the MAC address. There are definitely protocols currently for high performance networking that try to bypass these layers in order to get higher performance. One of them is something called RDMA, which is remote DMA, which is like establishing a TCP connection, but once the connection's established it strips away all of these layers and basically writes large blocks of bits into a buffer that is then
sent out at the lower layers. So there are ways of making this more efficient, but I think it doesn't normally extend as far as MAC addresses. This is normally understood by hardware anyway, so normally the bypassing, if it happens, is happening at these higher layers. All right. So the internet has roughly the structure of an hourglass. We talked about layering as being an approach to minimize the potential n squared blow up of interactions between different services and the physical networking layers down here. You can see it's like a tree, and in fact everything goes through IP. It wouldn't be sufficient just to have a layering structure, because unless the layers are somehow shrinking as you go up or down, you'll still have an n squared blow up. So the idea is that as you go down you're actually decreasing the number of protocols that you'll be using, down to the IP level, and then branching out again, with less interaction between random protocols at different layers. That kind of double tree structure, or hourglass structure, minimizes the number of potentially different types of layer interaction that you need to worry about. Application layer protocols will often work with both TCP and UDP, but they're normally using one by preference to provide an appropriate quality of service. 802.11 is a radio protocol, so it's really only used with wireless. Ethernet now has a couple of popular physical transports. There's copper Ethernet, which is the Ethernet you probably have in your machines. Copper really is struggling above a gigabit per second, and so many of the newer technologies, from 10 gigabits and above, will probably be fiber. They're currently only in data center level communications, but they should be hitting the end user application space pretty soon. It's just physically much harder to move bits at
that speed. But anyway, they'll still be running Ethernet, which means it'll be simple to interoperate those devices with the Ethernet infrastructure that people already have. All right. So IP, the narrow waist, allows a kind of funneling and a simplification of the potentially quadratic interoperability problem. It allows a nice multiplexing of the high level functions with the low level technologies, and allows applications that use IP to run on any physical network. Again, it has the usual advantages of abstraction, which allows innovations in any layer of the network without changing the core protocol. Once in a lifetime, though, when you have a basic core protocol like IP, it has to change in order to deal with the increase in size of the internet. So the transition from IPv4 to IPv6 has been pretty massively disruptive. But anyway, that's an inevitable glitch. The good properties of this hourglass model are still going to persist, and eventually things will settle down on IPv6 and be simple again. All right. So, some disadvantages of layering. We end up doing a lot of work, and a lot of copying actually, in implementing these layers and adding headers progressively, and often fragmenting packets as well. The ideal size of packets is often decreasing as you go down in the layers. For Ethernet it's about 1500 bytes. Higher layers typically have larger packets. So all of that hurts performance. And the headers can get quite large because you're nesting many layers of them. Headers can easily become larger than the content if there are small messages; you already have dozens of bytes of header. Yeah? No, the packets are smaller typically. I mean, Ethernet's limited to 1500 byte packets. IP has a 16 bit length field, so it's limited to 65K bytes. And higher levels, well, I'm not sure, but the higher
levels, I think, can support larger packets. But anyway, definitely by the time you get to the physical layer the packet size is quite small. The reason is that if packets are too large at the very lowest layers, you introduce latency by having to send a long packet before anyone else can send. The idea, though, is that the lowest level is supposed to multiplex a lot of communication, so you want packet sizes to be small. Yeah? Right. So the headers are going to add size, especially as you fragment packets, so the overhead of headers is getting bigger. It's still typically a few percent, though. If you have a full size Ethernet packet, the header size is still a small fraction of the size. The difficulty is if you're sending very small packets. They're going to be small all the way down, and then the headers will dominate. Acknowledgment packets have this problem, and TCP has a lot of those. All right. So we also have the problem that you may have duplication of layer functionality. Error recovery is something that you might want to implement at several layers. Most wireless networks have either forward error correction or their own error recovery protocol. It's also going to be built into TCP, and sometimes higher level protocols have it as well, potentially to deal with the high latencies of TCP. Some layers are going to want the same information. You may want higher level layers to understand what the MTU is, the maximum transmission unit, so that's the maximum packet size. For Ethernet that's currently roughly 1500 bytes. There's a push to increase the size of the MTU for Ethernet and for the intermediate network layers. So 1500 is the default that's implemented almost everywhere. You can get higher throughput by using longer MTUs, but that has to be possible at every layer of the network, because once you hit a layer that can't transmit a large sized packet at the physical layer
you're going to just get a block. You basically get the packet being stalled there. So the router has to determine end to end what the guaranteed MTU size is. So there's negotiation. There are custom traceroute tools which will tell you what the sustained transmission unit size is, so higher level protocols can make use of that and choose their packetization adaptively to best use the physical layer. Okay. All right, let's review some of this. So for layering: does layering improve application performance? Right, we saw that before. Okay, what about: routers forward packets based on destination address? Yeah. Best effort packet delivery, IP or UDP, guarantees packets are delivered in order? Right, in fact those packets don't even have sequence numbers, so it's not even really feasible. Okay. Port numbers belong to the network layer? That's the transport layer. And hosts on Berkeley campus share the same IP address prefix? True. All right. So to summarize, we talked about the layered architecture and how it provides a way of managing a very large and complex set of protocols. We went through the five layers of the internet, from physical, through data link, which is designed for physical media and includes MAC addresses, the network layer, which includes the IP addresses of hosts, and the transport layer, which is starting to expose and support services that include a port address in addition to the IP address, and finally the applications that sit on top of those, but which have a relatively simple interface in that they're talking to just one of the two major transport protocols typically. All right. So this somewhat complicated process is made transparent, or rather made invisible, to the application author, who only has to worry about their communication with the high-level networking interfaces. All right, we'll stop there.
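A supplementary sketch, not part of the lecture itself: the nested encapsulation described above (port numbers innermost, then IP addresses, then MAC addresses) and the point about header overhead on small packets can be illustrated in a few lines of Python. Everything here is an assumption made for illustration: the function names are hypothetical, the IP and MAC addresses are placeholder values, UDP is used instead of TCP because its header is the simplest, checksums are left at zero, and Ethernet details such as the preamble, frame check sequence, and minimum frame padding are ignored.

```python
import socket
import struct

# Header sizes in bytes for the three layers used in this sketch.
ETH_HEADER_LEN = 14   # destination MAC, source MAC, EtherType
IPV4_HEADER_LEN = 20  # minimal IPv4 header with no options
UDP_HEADER_LEN = 8    # source port, destination port, length, checksum


def udp_segment(src_port, dst_port, payload):
    """Transport layer: prepend the port numbers (checksum left as 0)."""
    header = struct.pack("!HHHH", src_port, dst_port,
                         UDP_HEADER_LEN + len(payload), 0)
    return header + payload


def ipv4_packet(src_ip, dst_ip, segment):
    """Network layer: prepend a minimal IPv4 header (checksum left as 0)."""
    version_ihl = (4 << 4) | 5          # version 4, header length 5 words
    total_len = IPV4_HEADER_LEN + len(segment)
    header = struct.pack("!BBHHHBBH4s4s",
                         version_ihl, 0, total_len,
                         0, 0,          # identification, flags/fragment offset
                         64, 17, 0,     # TTL, protocol 17 = UDP, checksum
                         socket.inet_aton(src_ip),
                         socket.inet_aton(dst_ip))
    return header + segment


def ethernet_frame(src_mac, dst_mac, packet):
    """Data link layer: prepend MAC addresses and the IPv4 EtherType."""
    header = struct.pack("!6s6sH", dst_mac, src_mac, 0x0800)
    return header + packet


def encapsulate(payload, src_port, dst_port,
                src_ip="192.0.2.1", dst_ip="198.51.100.7",
                src_mac=bytes.fromhex("020000000001"),
                dst_mac=bytes.fromhex("020000000002")):
    """Wrap application data going down the stack: ports, then IP, then MAC."""
    segment = udp_segment(src_port, dst_port, payload)
    packet = ipv4_packet(src_ip, dst_ip, segment)
    return ethernet_frame(src_mac, dst_mac, packet)


def header_fraction(payload_len):
    """Fraction of the frame taken up by the three nested headers."""
    overhead = ETH_HEADER_LEN + IPV4_HEADER_LEN + UDP_HEADER_LEN
    return overhead / (overhead + payload_len)


if __name__ == "__main__":
    frame = encapsulate(b"hello", src_port=5000, dst_port=80)
    print(len(frame), "bytes on the wire for a 5 byte payload")
    for size in (1, 64, 1472):
        print(size, "byte payload:", round(100 * header_fraction(size), 1),
              "percent headers")
```

Under these assumptions the three headers add 42 bytes. For a full size 1472 byte payload that is under 3 percent of the frame, while for a 1 byte payload the headers are about 98 percent of what is sent, which is the small-packet problem mentioned for acknowledgments.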