To explain how the internet works, we first need to establish a few basic terms. The term server, when applied to software, refers to a program that listens for incoming network requests and responds to those requests. Somewhat confusingly, the term server is also applied to hardware to refer to a machine that is primarily used to run server programs. The term client, when applied to software, refers to a program that makes requests to a server program. And again, somewhat confusingly, the term client is also sometimes applied to hardware to refer to a machine that is primarily operated by a human user. Straddling the line between servers and clients are peers. A peer program both listens for requests from other peers and makes requests of its own to other peers.
A network connects together multiple computers, allowing any computer on the network to send data to any other computer on the network. In the jargon of networks, each computer is often called a host. A network is controlled by one single entity, whether an individual person or some kind of organization. In contrast, what we call an internet is a network of networks. These networks are connected together through routers, which are simply systems connected simultaneously to more than one network.
Here we have three networks, each connected to the other two via routers. If, say, a host on the red network wishes to send data to a host on the green network, the data is sent either through the router connecting the red and green networks or through the router connecting the red and blue networks and then through the router connecting the blue and green networks.
What we call the internet is simply the internet connecting virtually all computers worldwide. The networks that make up the internet can be broadly split into three categories: private networks, service provider networks and backbone carriers.
Your own network at home or at your company office is a private network. These networks are connected to your internet service provider's network, and that network in turn is connected to backbone networks that tie the service providers together.
So, for example, when you visit amazon.com, your computer is contacting the web server operated by Amazon. Amazon's server is connected to a service provider and your own system is connected to your own service provider. These two service provider networks probably aren't connected directly and so instead exchange packets through one or more intermediary backbone networks. My service provider's network may be confined to just the local Southern California region and likewise Amazon's service provider may be confined to just the local Washington region. So the backbone networks in between, operated by telecommunication companies, probably carry the packets over the longer haul between Southern California and Washington. In truth, though, very large companies like Amazon typically host their servers in many places throughout the world. So my visits to amazon.com probably talk to a server in LA or maybe Silicon Valley. Also, the larger internet service providers may operate networks that span large regions and may even connect directly to the networks of larger companies like Amazon or Google. So it's actually possible that packets sent between my computer and Amazon avoid backbone networks entirely.
The standard internet protocol, IP for short, sends data in discrete chunks called packets. In IP version four, the version used for most of today's internet traffic, each host and router is known by a unique 32-bit address. By convention, the four bytes of these addresses are each expressed as an integer between 0 and 255 and the four integers are separated by dots. For example, 233.75.19.198 is an IP version four address.
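To make the dotted notation concrete, here is a minimal Python sketch of how the four integers pack into a single 32-bit value and back again. The function names dotted_quad_to_int and int_to_dotted_quad are made up for illustration and aren't part of any library; real code would more likely use Python's built-in ipaddress module, which the sketch uses only as a cross-check.

```python
import ipaddress


def dotted_quad_to_int(address: str) -> int:
    """Pack a dotted IPv4 address such as '233.75.19.198' into one integer."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers: {address!r}")
    value = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError(f"each number must be between 0 and 255: {part!r}")
        # Shift the bytes collected so far left by 8 bits and append this one.
        value = (value << 8) | octet
    return value


def int_to_dotted_quad(value: int) -> str:
    """Unpack a 32-bit integer back into dotted notation."""
    if not 0 <= value < 2**32:
        raise ValueError("value does not fit in 32 bits")
    # Pull out each byte, most significant first.
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


example = "233.75.19.198"
packed = dotted_quad_to_int(example)

# The round trip should give back the original text.
assert int_to_dotted_quad(packed) == example
# Cross-check against the standard library's own parser.
assert packed == int(ipaddress.IPv4Address(example))
```

Because every address is just a 32-bit number, there can be at most 2**32 of them, which is where the limit of roughly 4.3 billion addresses comes from.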
The major problem with version four is that 32-bit addresses only allow for about 4.3 billion different addresses. This was more than sufficient back when the internet was only used by a handful of universities and government entities, but now the world has more than 4.3 billion devices which all want to connect to the internet. A trick called NAT, network address translation, allows multiple devices to share one IP address, but NAT is not an ideal solution. It generally works fine for simple applications like web browsing, but causes configuration headaches for other applications. The real solution to the address shortage is the next version of the internet protocol, IP version six. There was an IP version five, but it was never formalized and adopted, so we're skipping straight from four to six. IP version six uses 128-bit addresses, allowing for many, many more unique addresses. In fact, 128 bits allow for so many addresses that every person on the planet could have many trillions of their own IP addresses. Currently, here in 2014, only about 5% of internet traffic is sent with IP version six, and it will probably be another 10 years before IP version six usage overtakes IP version four. In the meantime, we'll have to live with NAT, and because version four still predominates, that's the version we'll cover in this video.
Looking back at our example of a user visiting amazon.com, let's say that the user's IP version four address is 128.3.0.11, and the amazon.com server's IP version four address is 71.200.44.5. Packets sent between these two hosts could travel two possible routes, either directly through the green network or by taking an extra hop through the blue network as well. The question then is how does each host and router decide where to send each packet? Well, the logic is very simple. Each host simply sends every packet out on its network such that every other device on the network can read it.
Every other host on the network looks at the destination address of every packet and ignores those packets not addressed to it. The routers on the network also look at the destination address, and if the destination address falls within a preconfigured range of addresses, the router will send the packet onwards to the neighboring network. For example, the destination address of a packet sent by the user here on the red network is checked by every other host and router on that network. Assuming the address is in the preconfigured range of addresses expected by router A, the packet will be sent out on the green network, where it is in turn read by router B, which checks the packet's destination address against its own routing tables. So for the packet sent from 128.3.0.11 to make its way to 71.200.44.5 via the green network, the tables of router A and router B must be configured to pass in that direction any packets with destination address 71.200.44.5. When routing tables are not properly configured, packets might not find their way to their destination or might take less optimal routes. How exactly routing tables get configured is a complex topic we won't go into here. If you're interested in these details, you should read up on CIDR (Classless Inter-Domain Routing) and BGP (the Border Gateway Protocol).
A modem, short for modulator-demodulator, converts digital data into analog signals and vice versa. In a typical home setup, a cable or DSL modem is the device directly connected to your service provider's network, and your provider's network assigns the modem an IP address. You can directly connect your computer to the modem, but it's generally preferable to connect the modem to what we confusingly call an access router, or even more confusingly, commonly just call a router. An access router provides three functions.
First, a router allows multiple devices to share a single IP address using network address translation. Second, virtually all routers today provide wireless (Wi-Fi) connectivity. And third, routers usually provide firewall functionality, meaning the router can help filter unwanted traffic that may maliciously try to exploit security holes in your hardware and software. So all packets sent between your devices and your service provider typically travel through both an access router and a modem. And quite often, the modem and access router are combined into one device. Packets sent between your modem and your service provider typically don't get seen by other modems on the provider's network, as that would be both insecure and wasteful of bandwidth. Instead, your packets typically get handled by a router internal to your provider's network that passes packets directly to and from your modem without sending the packets to the modems of any other users. Except, of course, in the cases when your packets are addressed to other users on the network.
Because numbers are difficult for people to remember, the domain name system was introduced. The domain name system is a global registry of names mapped to addresses. These domain names are organized into top-level domains, such as com, net, org, edu, mil, gov, among others. Each top-level domain is controlled by a designated authority, and most of these authorities allow others to lease subdomains. So for example, Google leases the subdomain google.com and Wikipedia leases the subdomain wikipedia.org. Whoever leases a subdomain gets to decide what IP address the subdomain maps to in the global registry, and so for example, google.com resolves to the IP address of a Google server. When your computer wants to know the IP address mapped to a domain name, it asks a DNS server. Most home users' computers are configured to use a DNS server run by their internet service provider.
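As a rough illustration of that lookup, the Python sketch below asks the operating system to resolve a name. The wrapper name resolve_ipv4 is made up for this example; socket.gethostbyname is the standard library call doing the work, and which DNS server ends up answering depends on how the machine is configured.

```python
import socket


def resolve_ipv4(name: str) -> str:
    """Return an IPv4 address for a host name, as dotted text."""
    # The operating system's resolver does the actual work here. It typically
    # checks local settings first and then queries the configured DNS server.
    # A string that is already an IPv4 address is returned unchanged.
    return socket.gethostbyname(name)
```

On a connected machine, calling resolve_ipv4("google.com") would return the address of a Google server. The exact value varies by location and over time, so none is shown here.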
Each DNS server keeps a local cache of the global registry and periodically checks for updates from the top-level authorities. In the old days, it could take days for changes to the authoritative domain registries to filter out to the DNS servers of the world, but these days it usually takes only a few hours.
A URL, a uniform resource locator, is a string of text representing the location of a resource on the internet. A URL has three components. The scheme denotes the nature of the thing, which most commonly means the protocol by which the thing is accessed. Web page URLs, for example, have the scheme http, which stands for the Hypertext Transfer Protocol used to retrieve web pages. The host specifies the IP address or domain name of the system from which the resource can be retrieved, and the path specifies a particular resource on that system. The format of the path is specified for certain schemes, but for many schemes, the path could be any sequence of characters without whitespace. In web page URLs, the path often resembles a file path, but the format and the significance of the path are really left entirely up to the server receiving the request for the resource. This example URL denotes the location of a web page at nytimes.com. When your web browser sends a request to the web server at nytimes.com, the path is sent as part of the request, and what resource the path refers to is entirely up to the web server's interpretation.
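To see the three components pulled apart, here is a short Python sketch using the standard library's urllib.parse.urlsplit, which names the first component scheme. The helper split_url and the URL on example.com are both invented for illustration. The URL is not the one shown in the video.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> tuple:
    """Return the (scheme, host, path) components of a URL."""
    parts = urlsplit(url)
    # hostname is None when the URL has no host, so fall back to "".
    return parts.scheme, parts.hostname or "", parts.path


# A made-up URL used purely for illustration.
EXAMPLE_URL = "http://example.com/articles/2014/networking.html"

scheme, host, path = split_url(EXAMPLE_URL)
print(scheme)  # the protocol used to access the resource
print(host)    # the system the resource is retrieved from
print(path)    # what the browser sends to that system in its request
```

The parser only separates the pieces. It makes no attempt to decide what the path means, which matches the point above that interpreting the path is left to the server.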