All right, bonjour. Welcome back to, was it week 10, week 11, something like that? I forget what week it is. Anyways, we'll talk about sockets today, which are the foundation of your web browser and the internet. All networking stuff is going to use sockets, so if you ever want to do networking, you will use sockets.

Luckily enough for us, sockets are just another form of IPC. There's really nothing different from what we've seen before. If you don't remember what IPC is, that's inter-process communication, or just sending bytes between two different processes. This is how your web server works; even using your web browser is technically IPC. Your web browser runs as a process, or each individual tab runs as a process, and you're communicating with another process on another machine, hosted by Google, hosted by Amazon, something like that. But it's just another process running on another machine, and that other process is a web server. So we've seen pipes. We've seen signals.
We've also talked about shared memory and how you can use that to share between processes. But these forms of IPC assume you are on the same physical machine. Obviously shared memory is not going to work between machines, because they don't have access to the same memory. So sockets are the first form of IPC we'll see that works between physical machines. Typically this will be over a network, using your network card, Ethernet, Wi-Fi, all that fun stuff.

It's a bit more involved to set up communication between sockets. Typically there's a server and a client, so first we'll go over the steps for the server. If the server wants to use a socket, there are four steps. They're basically system calls with all our usual C wrappers: they return negative one if there's an error, set errno, all that fun stuff.

The first system call is called socket, and it creates a socket. We'll see the options you have for that; basically, you only have two. That gives you a generic socket that just sets up kind of a protocol. Step two is bind, and that is giving a name to the socket, attaching it to some location. You can bind the socket to a file, so it looks like a file you can access, mostly if you're just testing, because that has to be on the same machine. Or it could be an IP and a port, which is what you're more used to if you've used anything online.

The next step, if you are a server, is the listen system call. That indicates you're accepting connections, and it creates a queue internally in the kernel where incoming connections are held until they're handled. You pretty much always just use the default value for that, but it's another step you have to go through: because the queue is managed by the kernel, you have to tell it how big the queue needs to be. Everything's a queue.
Yep. So it depends. You might bind a socket to a file, which would be a location on your machine, or to an IP and a port, which would be designated by your network and then handled by your network. You can even just listen to all incoming connections on port 80 or something like that, which is what a web server would do. Sorry? Yeah, yeah, so we'll have an example. First you create a socket, and it unsurprisingly, hopefully, returns us a file descriptor. Then we can do whatever we want with it in terms of a socket; you just have to do a few extra steps. Yep?

Yeah, so making a socket a file would be used within software on one machine, because essentially it sets up a pipe, but the processes don't have to be related. All you need to know is the location of something, and then you can connect to it and communicate with it. If anyone has used Docker, which you do when using the dev container, it sets up a socket to communicate. Docker actually uses a socket to communicate on your local machine, so any time you run a client, it talks to that socket, because it knows the location of the Docker socket. And yeah, we'll see an example of this too.

Okay, so the last step is to connect, or sorry, the accept system call. That returns whenever there's a new connection. You call this multiple times on a socket, and each time accept returns, it returns you a new file descriptor that represents that connection between you and the client. So if you're the server, you'd call accept multiple times, and each time it returns means you have a new connection. And it's a file descriptor, so I can just read and write to it as we have been doing all along.

So that's the server. It's a bit more involved. The client's a lot easier. For the client, one socket is also one connection.
So if you're the client, you create a socket using the same socket system call, so they start off the same way, where they essentially agree on a protocol. After that the step for the client is simple: it's just connect. I'm connecting to some location, and that would be whatever location was set by bind in the server. So if I'm the client, I have to connect to whatever that address is: if it's a file, I have to give it the path; if it's an IP, I have to give it that IP and port. Connect returns zero once it has successfully made that connection, and from then on the socket's file descriptor is the connection. After that I don't really care: I can read and write bytes on that file descriptor, and it would use a network connection if it's an IP and port, or just the kernel if it's a local file.

Like I said, the socket system call sets the protocol and type of socket. There are three arguments here. Basically the last one, protocol, isn't really used (you normally just pass zero); people use that term for other things, and typically people just say the type is the protocol. Domain is like a general protocol family, and there are three main options you'll see here. AF_UNIX is for local communication, on the same physical machine; it just looks like a file, and that's how it communicates. Then there are two IP options. There's AF_INET, which is IPv4, just the normal number dot number dot number dot number that most people use. And then there's AF_INET6 for IPv6, because we've run out of IP addresses. If you do the math, an IPv4 address is 32 bits, so that's 2 to the power of 32, about 4.3 billion addresses, which is not enough for all the devices in the world. So there's IPv6, which is a 128-bit IP address.

Type is more what people mean when they say the protocol, and there are basically two options for type: stream and datagram sockets. Has anyone ever heard of TCP and UDP before? So we've got a few networking people, more than I expected.
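As a rough sketch of the domain and type arguments just described (this is not the lecture's own code): the helper name `try_socket` is made up for illustration, while `AF_UNIX`, `AF_INET`, `AF_INET6`, `SOCK_STREAM` and `SOCK_DGRAM` are the standard constants from `<sys/socket.h>`. It opens a socket for one combination, reports what happened, and closes it again.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: try one (domain, type) combination.
 * Returns 0 if the kernel gave us a socket, -1 otherwise. */
int try_socket(int domain, int type, const char *label) {
    /* The third argument (protocol) is 0: let the kernel pick the
     * usual protocol for this domain/type pair. */
    int fd = socket(domain, type, 0);
    if (fd == -1) {
        perror(label);  /* socket() set errno for us */
        return -1;
    }
    printf("%s: got file descriptor %d\n", label, fd);
    close(fd);
    return 0;
}
```

You could call it as `try_socket(AF_UNIX, SOCK_STREAM, "unix stream")`, `try_socket(AF_INET, SOCK_DGRAM, "ipv4 datagram")` or `try_socket(AF_INET6, SOCK_STREAM, "ipv6 stream")`; the last one may fail on a machine without IPv6 support.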
Thankfully. Okay, a few. So these are closely related to TCP and UDP. The underlying thing stream sockets represent is just TCP, and with that you get a few niceties. This is basically the default option, which gives you what you expect. All the data sent by the client is received in the same order on the server, which seems like a no-duh, but as some networking people might argue, that is actually very difficult and not a given once you're dealing with networking.

This also forms a persistent connection between the client and server. If one of them disconnects, someone unplugs a cable, the Wi-Fi goes out, whatever, one side will know the other has gone offline, because the connection will be broken. Essentially, you'd get an error from any system call on that file descriptor once the connection is broken. So this is reliable, but maybe slow, because as you might figure out in a networking course, doing this is actually quite difficult.

The other option is datagram, which uses UDP, which you might have also heard of before. That just sends a message between the client and server. It's not persistent whatsoever. The server doesn't know how many clients are currently connected to it. It just receives some information and sends out some information. It has no way of knowing whether anything got received, and no way of knowing if that client is now offline. Nothing at all. It's really fast, but those messages aren't guaranteed to arrive in any particular order, and they're also not guaranteed to make it to the recipient, and you will not know if one has been dropped. Typically people use this because it's faster and simpler, and for some applications you might not care. If, by the time you detect that an error happened, the data is already old, then perhaps you don't care and you wouldn't bother resending it anyway. So why even bother having TCP?
So an example of this would be a real-time game or something like that. If you press a button, you send some packets. Well, if you disconnect in the middle of it, it doesn't matter. You're not going to try to reconnect and resend that, because it's already happened; everyone else has already seen what happened. So there's no point introducing all that overhead. You might use it for video streaming, might not; it depends how much you care about the stream. Some video streaming does use TCP, though.

So, the bind system call. That was step two for the server. It takes as the first argument the socket, which is the file descriptor that's returned from the socket system call, and then it wants an address. Because it's C, you give it a pointer and you say how big the thing it points to is, because it's C and it can't really tell. There's basically that struct sockaddr, and there are three versions, one for each kind of location. There's sockaddr_un for the UNIX one. Why they shortened it from UNIX to just UN, I don't know, but that's what you use to specify the location of a UNIX socket. And it's just a path on your machine, so it just looks like any other file.
It's just a little bit more special. Then there is sockaddr_in. Again, I don't know why they shortened it from INET to just IN; don't ask me why. That's for specifying an IPv4 address, like 8.8.8.8, which, fun fact, is a DNS server. If you're doing networking you'll figure out what a DNS server is; basically it translates names you can type, like google.com, into those numbers. And then finally there's sockaddr_in6 for an IPv6 address. The format's a little bit different, but it's essentially just a 128-bit address instead of a 32-bit address.

The next step, after we have bound to an address, is listen: we have to tell the kernel how big a connection queue we want. It just takes the socket as the first argument and then an int for the backlog, which essentially says how big that queue should be. That's however many connections can queue up, before I've actually accepted them, before new ones just get dropped. The kernel manages this. Yep?

Yeah, yeah, so it'd be new incoming connections that get cut off. And this is for connections that you haven't accepted yet. As soon as I accept, it creates a connection and that entry is removed from this queue, and you don't have to worry about it. But this would be like if 10 million people try to connect while you're busy accepting connections: instead of overloading your machine, it'll just start dropping them and say nope, server's not available anymore, sorry. Yeah?

So the question is, is there an interface to tell if you have dropped any incoming connections?
And I'm actually not sure. I imagine so, but I'd have to double check. Yeah, so basically with this you'll pretty much always just give it zero, like our example does, and let the kernel worry about it. (Strictly, a backlog of zero asks for the implementation's minimum queue length rather than a named default.) Kernel developers are pretty smart, so we can trust them.

Okay, so the last one, finally step four of the server. That is accept. It takes the socket as the first argument and then an address, which is optional. Again, the socket is still the file descriptor returned from the first socket system call. The address and address length have the same form as in the bind call, except here they're outputs: accept fills them in with the address of the client that connected. It's an optional thing, so you can set them both to NULL if you want to ignore them, and for our stream sockets we don't need them, so we don't have to worry about it. After accept returns, like I said before, you get a new file descriptor and can read and write as usual.

Okay, the next one is for the client. The connect system call looks an awful lot like bind, except it's on the client. Again, the first argument is the socket file descriptor that's returned from socket, and then you give it the address you want to connect to, in exactly the same form as for bind. The address has to match between the client and server in order to actually make a connection. Connect returns zero if it succeeds, and the only difference afterwards is that the socket file descriptor is now connected, so it's just used as a normal file descriptor. You can read and write to it however you want.

All right, so let's see this example. So we will create a, well, yeah, it's not really a web server.
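The address structures that bind and connect take can be sketched roughly as follows. The helper names `make_unix_addr` and `make_ipv4_addr` are invented for this illustration; the struct fields and `inet_pton` are the standard ones. Treat it as one plausible way to fill the structs, not as the lecture's code.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Fill a UNIX-domain address with a filesystem path.
 * Returns 0 on success, -1 if the path does not fit in sun_path. */
int make_unix_addr(const char *path, struct sockaddr_un *out) {
    if (strlen(path) >= sizeof(out->sun_path)) {
        return -1;
    }
    memset(out, 0, sizeof(*out));
    out->sun_family = AF_UNIX;
    strcpy(out->sun_path, path);  /* length checked above */
    return 0;
}

/* Fill an IPv4 address from dotted text and a port number.
 * Returns 0 on success, -1 if the text is not a valid IPv4 address. */
int make_ipv4_addr(const char *ip, uint16_t port, struct sockaddr_in *out) {
    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_port = htons(port);  /* ports travel in network byte order */
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}
```

Either struct is then passed as `(struct sockaddr *)&addr` together with `sizeof(addr)`. For comparison, `struct in_addr` holds 4 bytes (32 bits) while `struct in6_addr` holds 16 bytes (128 bits).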
So we'll just create a kind of server that sends hello there to every client and then just disconnects. Whenever someone connects to it, it just sends a message and then says yeah, okay, bye, and closes the connection. It's in your examples directory, so let's look at it and explore, and we'll essentially play with this for the rest of the lecture, or do any fun questions.

So here is my server code. In main, I'm registering some signals, and we'll see why I'm doing that. Essentially this program will be an infinite loop, and I create some resources, so I need a way to clean them up. I clean them up in my signal handler and use that to exit my program. We learned signals for a reason: they actually come in handy when we have processes like this that essentially don't die until you tell them to die.

Here we use that socket system call. We say we want a Unix socket, which will just be represented as a file on our machine, and we say we want a stream socket, because we want things in order, because that makes the most sense to us. This protocol argument just isn't really used. Of course, we check for errors: we see if it returns negative one, which means it set errno and all that, so there's this errno_exit function.

The next step is to set up the socket address. This is a Unix socket, so it has a field called sun_family, which has to be that same AF_UNIX value. It's just a macro, so it's just some number, and that's how the kernel knows what this structure represents. The only other field it has is the sun_path field, which is just an array of characters, and that's where the path goes. So we use a string copy and just copy the path in. And for the path, we're just going to say it's example.sock.
So we're going to create a file called example.sock, which looks like a regular file, but it actually represents a Unix socket. After we fill in this sockaddr struct, we bind. We just bind using the file descriptor returned from socket, give it the address of that struct, and tell it how big that struct is, because it's C. And of course we again check for errors. Then we do the listen system call. Zero is the backlog argument we use, so we just kind of have to call it, and we check for an error.

And then this is the meat of our server. This is what pretty much every web server looks like, if you tear it down enough. It's just a while true, an infinite loop around accept, and accept will accept an incoming connection. So if nine people connect to this, the loop will execute nine times and there will be nine accept calls, so we get a new file descriptor for each of them.

Let's see what happens in one iteration of the loop. We accept, and when this returns, it gives us a file descriptor representing the connection. Accept is like a normal blocking system call, so it won't return until there's an actual connection. That, again, might be a good reason to use threads for this, kernel threads, so we don't block everything. Here we don't even use threads, but a real web server would.

After accept returns, we have a connection. So we just create a message, get the length of the message, and do our write system call to that connection's file descriptor. We're just writing hello world there. That's it. Then of course we check for errors, and then we close the file descriptor, because it's like any old file descriptor. That's what happens every time through the loop. So I can only handle one connection at a time, the way this is written, because I didn't make any threads.
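The server walked through above can be sketched like this. It is a reconstruction from the description, not the file in the examples directory: the function names `make_listener` and `serve_one`, the exact greeting text, and the backlog of 16 (the lecture's code passes 0) are all assumptions, and the signal handling is left out.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Assumption: a small fixed backlog; the lecture's code passes 0. */
#define BACKLOG 16

/* Steps 1 to 3: socket, bind, listen. Returns the listening fd or -1. */
int make_listener(const char *path) {
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1) return -1;

    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(fd, BACKLOG) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Step 4, one loop iteration: accept, greet, close. Returns 0 or -1. */
int serve_one(int listen_fd) {
    /* Blocks until a client connects; we ignore the client's address. */
    int conn = accept(listen_fd, NULL, NULL);
    if (conn == -1) return -1;

    const char *msg = "Hello there!\n";  /* assumed greeting text */
    size_t len = strlen(msg);
    ssize_t written = write(conn, msg, len);
    close(conn);
    return written == (ssize_t)len ? 0 : -1;
}
```

A main function would call `make_listener("example.sock")` once and then run `serve_one` in an infinite loop. Note that bind fails if a stale socket file with that name is still lying around from an earlier run.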
Nothing's running in different processes or anything like that. In a real web server, you would have essentially one thread that's just doing accept over and over again, and it would create a new thread to handle each connection and go do that. That's more what a real web server would do, but this one's really, really fast: it just sends a message and closes.

And then our client. Well, it's pretty much the same thing as the server. It creates the socket and checks for errors; I literally copied and pasted this. It's the same socket address, which goes to that same path, example.sock. Then it tries to connect instead of doing bind or anything, and after connect it can use that file descriptor. Here we just create a big old buffer that's the size of a page and do a read system call. We do it in a while loop just to read all the data we can, but we know it's only going to read one line and then return zero, indicating that the connection is closed. We check for an error, then we just print whatever string we got, and then close the file descriptor.

So let's see this. In here, if I run build/server, it looks like it's not doing anything, because it's sitting in that accept system call, waiting. If I go to my other shell here, I'll see that once I started it, there's this example.sock, and it just looks like a file. Since we did the inode lecture yesterday, we can actually check it out, because oddly enough it also has an inode. The socket is represented by an inode, but it has zero blocks, so it doesn't take up any space on the device. It kind of looks like a file, but it isn't really a file. It does get an inode number, because it has a name, so there needs to be a way to refer to it. But if you connect to it, the kernel will do stuff.
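For reference, the client described a moment ago might look roughly like this. Again this is a sketch under assumptions rather than the lecture's file: the function name `fetch_greeting` and its buffer-based interface are invented here so the logic can be reused and checked.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Connect to the UNIX socket at `path`, read until the server closes,
 * and store the bytes in buf as a NUL-terminated string.
 * Returns the number of bytes read, or -1 with errno set. */
ssize_t fetch_greeting(const char *path, char *buf, size_t cap) {
    if (cap == 0) {
        errno = EINVAL;
        return -1;
    }
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1) return -1;

    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    /* connect() returns 0 on success; fd itself becomes the connection. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        int saved = errno;  /* close() could overwrite errno */
        close(fd);
        errno = saved;
        return -1;
    }

    size_t total = 0;
    ssize_t n = 0;
    /* read() returns 0 once the server has closed its end. */
    while (total < cap - 1 &&
           (n = read(fd, buf + total, cap - 1 - total)) > 0) {
        total += (size_t)n;
    }
    int saved = errno;
    close(fd);
    if (n == -1) {
        errno = saved;
        return -1;
    }
    buf[total] = '\0';
    return (ssize_t)total;
}
```

With the server running, `fetch_greeting("example.sock", buf, sizeof buf)` fills `buf` with whatever the server sent. If the socket file does not exist, connect fails and errno is ENOENT, which perror reports as no such file or directory.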
Yep? Yeah, yeah, so .sock is just a convention; I could have called it whatever I want. All right, any other questions about that?

Okay, well, now if I'm the client, I can run the client, and I see I received hello world. Nothing in that client program had hello world in it; it comes from the server. Whenever I run it, it connects to the socket, the server sends it data, it reads the data, and then it closes itself. That happens no matter how many times I run it, and it just receives every time, because the server is just sitting there in that big old loop.

Now if I want to stop the server, well, I can hit Ctrl-C, because I registered my signals, and we'll see why we did that. So let's go look at our signal handling. We registered a signal handler to call this handle_signal function. handle_signal just calls close_socket, just because I wanted to be really intentional with the name, and close_socket is what cleans up. It closes the file descriptor representing the socket, and then it does this unlink system call. Remember, unlink just removes that name. Whenever I don't have the server running, I don't want that file on my system, so it does the unlink while it's exiting, to clean up its resources. If I didn't have a signal handler, that example.sock file would still exist in that directory, but nothing would actually be listening on it. It would just be a dead file, so there's no point keeping it; you should clean it up.

So now the server is not running, and example.sock is not there. If I go to my client now, oh, well, now my connect system call gives an error, because I can't connect to that socket. Two things: that path doesn't exist anymore, and there is also no program listening on it. Yep? Yeah, so this sets errno, but let's look at the client.

So yeah, I got an error from my connect. Connect returned negative one and then it went to this errno_exit, because, yeah, it sets errno. All errno_exit does is call perror with whatever I gave it, which was just connect, and so it prints the message for that errno. So it would show the error no such file or directory. Yep?

Yeah, so the question is what would happen if I killed the server while the client was doing something, if they went back and forth a few more times. The answer is, because it's a stream connection, you wouldn't get a signal at the moment it died, but you would find out the next time you tried to read or write: whatever call you make next on that file descriptor would fail or report end of file, and then you can go ahead and handle it.

All right, any other questions? And this is the kernel interface. The only difference between this and your web browser or your web server is that those use IP addresses, but everything else is the same. A server, Chrome, they're all going to use the socket interface, because that's the kernel interface for networking and you can't do anything about it. So everything will use this. Yep?

Yeah, so the question is, if this is actually going over the network, do I have to do any more work? And the answer to that is no. The only thing you have to do differently if you want to connect over a network is use the sockaddr for INET, or whatever they call it, or for IPv6. That's it. That's the only change you have to make. Otherwise, it's exactly the same. Yeah?

Yeah, so the question is, if I just start a server that listens on anything, can other people connect to it?
And then the answer to that is, it depends who's running the network. The Toronto network will not let you do that. You won't be able to just host a web server and have it communicate with everything; you have to ask nicely. All the networking people can block ports between machines, because they're in charge of getting data from A to B, so they can block everything if they want. So yeah, it depends on your network setup. If you're at home on the same local network, you can do that all you want, and you can connect to it, because there are probably no restrictions. All the other stuff, making sure the number you put in actually connects to a machine, that's all a networking problem, which is another course. Yeah? Yep.

So basically, you would use stream sockets whenever you want to actually know that someone has disconnected, it's really important to get the messages in order, and you really want to be reliable. UDP doesn't have any of that; it just sends a message and hopes for the best. So generally, if you want something to be persistent, in order, and everything, you just use TCP. Otherwise you use UDP, if you don't really care about the data. And then there are in-between things: sometimes people build their own stuff on top of UDP, like hey, you should send me a message if you're still alive, things like that. But eventually you add so much on top of it that it becomes TCP, but crappier.

So, concrete examples. Everyone's used SSH, probably, and SSH uses TCP. Everything you type, you send in order and you get back in order; you probably want everything in order. An example of UDP is more like if you were listening to this lecture as a live stream; a lot of those use UDP. If you lose the connection, well, it's live anyway, so when you come back you just get the latest thing and you miss that minute or whatever you were disconnected for. It wouldn't even bother trying to retry.
Yep? So database connections typically are TCP. They want to be persistent and in order: you send a message, you get something back, and you probably want to know when the other side dies. But yeah, most things now generally use TCP. Generally the only things that use UDP are games and live video streaming, and some video streaming uses TCP, but those are the two main ones.

All right, any other questions? Because this is, yeah, this isn't really anything new except a few new system calls and addresses. It's just another form of IPC, one that can connect different machines.

There are a few other calls. Instead of read and write, there are also specialized socket ones called send and recv. They're basically the same thing, except they take a few more flags if you actually care about networking. One example is MSG_OOB, to send and receive out-of-band data. I don't specifically know what that means, but it probably means something to networking people. There's a MSG_PEEK flag that essentially won't advance the read position. So it's like, I just want to check what's at the beginning of this packet, or the beginning of whatever I got sent over the connection, and I don't want to advance the read; I just want to look at it and then maybe forward it on to something else. That's peek. And then the last one is MSG_DONTROUTE, which is a weird one, because typically whenever you connect, there needs to be a route between the two machines. That's mostly for if you are super in the weeds of networking and you're implementing your own router, or you're doing some insane network debugging. These are things you'll probably never see, except for MSG_PEEK.
There's probably no way you'll ever use these or need to know these. And then there are also sendto and recvfrom, which take an additional address. If it's a UDP socket, you don't have a persistent connection, so you don't need to connect; you can just sendto and recvfrom, and those take the address as one of the arguments. So you can send directly to that address or receive directly from that address. The address arguments are ignored for stream sockets, because you already have a connection; they're only for datagram sockets.

So that's pretty much it for sockets. They're just IPC, but across machines, and the basics are that they just require an address. If it's local, it just looks like a file; if it's on the internet, it probably has an IP address, which just looks like a number. And you know enough now that if you wanted to connect to any server, you can put in the IP address, connect, do a read system call, and just see what you get. You could do that for any server, as long as you figure out the IP address: just connect to it, do a read, and see what you get, because you'll probably get something.

There are two types of sockets, stream and datagram. Stream sockets are the ones that are always in order, with a persistent connection, where you know whether or not that connection is broken. Datagram is just like throwing messages out into the wind. You don't know if they'll get there, but you hope they will, and they might not arrive in any particular order. That's just how they work.

Then, if you're a server, all you have to do is bind to an address, say you're listening on it, and accept some incoming connections. If you're a client, all you need to do is connect to an address. Then on both sides you just have a file descriptor, with the same read and write system calls we all know and love. So, any other questions about this, or the lab, or anything at all?
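Since MSG_PEEK was singled out as the one recv flag you might actually meet, here is a small sketch of what it does. The function name `demo_peek` is made up, and using `socketpair` to get two already-connected stream sockets in one process is an assumption for the demo; it was not part of the lecture.

```c
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 0 if a peek followed by a normal read both see the same
 * five bytes, -1 on any failure. */
int demo_peek(void) {
    int fds[2];
    /* Two connected AF_UNIX stream sockets, no bind/listen/accept needed. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1) return -1;

    int ok = -1;
    char peeked[8] = {0};
    char consumed[8] = {0};

    if (write(fds[0], "hello", 5) == 5 &&
        /* MSG_PEEK copies the data but leaves it in the receive queue. */
        recv(fds[1], peeked, 5, MSG_PEEK) == 5 &&
        /* So a plain read afterwards gets the very same bytes. */
        read(fds[1], consumed, 5) == 5 &&
        memcmp(peeked, consumed, 5) == 0) {
        ok = 0;
    }
    close(fds[0]);
    close(fds[1]);
    return ok;
}
```

Without the flag, the first recv would have consumed the bytes and the following read would have had nothing left to return.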
Again, like I said, we're in the chill third of the course, so we're cruising. All the hard stuff's already done. Ish. All right, so just remember: pulling for you.