All right, good morning. Today we're going to talk about sockets. Luckily, they're not anything terribly new, other than that they may actually use a network. Fundamentally, you'll see that they're just file descriptors, but we should know what they look like since lab five is a web server, so it's going to use sockets, and it's probably good that you understand the code you'll be modifying. Like I said, they're just another form of IPC. We've already seen pipes, and we've seen signals, which are also a form of IPC. We also talked about shared memory: you can represent shared memory using a file descriptor and read and write to it, or you can let the kernel set up the page tables so that two processes each access their own virtual address and those addresses map to the same physical page, and therefore they share memory. But all the forms of IPC we've seen so far assume that the processes are on the same physical machine. Today we're going to assume that they may not be on the same physical machine, because that's what networking gives us. So sockets enable IPC between different physical machines, and that will typically be done over the network via Wi-Fi, Ethernet, or whatever. If you are a server, you have four simple steps to use a socket. They're all system calls and have the usual C wrappers, so if there's an error, they return negative one and set errno, and then you can figure out what the actual error is. You have to do these in order. The first one is you create a socket using a system call called socket, and that will give you back a file descriptor. Then there are a few additional steps if you are the server. Step two is bind, so you attach the socket to some known location that other processes can use to communicate with you.
We're going to use a file today, but it could also be an IP address and a port. It could be an IPv4 address or an IPv6 address, and you'll learn more about that in networking courses. For this, we'll just use a file. The third step, which is kind of explicit, is listen. You have to indicate that you're actually accepting connections, so if someone tries to use that socket to connect to you, the kernel goes ahead and makes that connection for you. As part of it, you set a queue limit: how many connections are allowed to sit there and wait for you to finally accept them. When you want to handle a connection, there is the fourth step, which is called accept. It returns the next incoming connection for you to handle. If you want to handle a lot of connections, you can accept as many as you would like, and you're bound by that queue depth for how many can pile up at a single time. Clients are a bit simpler. They only follow two steps. First, they create a socket, same as you would in a server, and we'll see the details of that, but you have to specify some things about it. Then in step two, you just connect, and that's where you give it a location. So the server has bind, and then it has to listen and accept a lot of connections. If you're the client, you just call connect with the location, and the socket can send and receive data once the connection has been accepted. You can do read system calls and write system calls on it. The socket system call sets the protocol and the type of socket, and you'll learn more about all these different protocols in the networking course. In general, there are three arguments: a domain, a type, and a protocol.
A domain is like a general protocol family, and it's further specified with the last parameter, the protocol one, although that one is generally unused right now. The domain can be AF_UNIX, which is for local communication on the same physical machine and is represented by a file. Then you have AF_INET, which is normal internet connections over IPv4, and AF_INET6, which is the newer IP range that just increases the size of an address so you can support more IP addresses. The type is what you'll learn more about in your networking course. If you've heard of TCP and UDP before, that is what the type is, but the kernel gives a more abstract interface to them, so it calls them stream or datagram sockets. Stream sockets, AKA TCP, give you some nice guarantees. All data sent by the client appears in the same order on the server, so if you sent one, two, three, four, five, you would get one, two, three, four, five on the server. It also forms a persistent connection between the client and the server, so if that connection breaks, you'll be notified, and you'll know that the connection is no longer valid. So this is a nice reliable way, but it also might be slow because you're keeping a connection open and there's some communication back and forth, and you're letting the kernel take care of some synchronization issues so that you get things back in the same order you sent them. If you dive into the networking course, you'll learn that the underlying network doesn't actually guarantee that ordering, so this gives you a bit of a flavor of that. Datagram sockets essentially use UDP, and they send messages between the client and server. They don't form a persistent connection at all.
It's really, really fast, but here you essentially send packets, little bundles of information between client and server, and they're not guaranteed to even arrive, or to arrive in any particular order. If you sent five separate messages in a row, like one, two, three, four, five, you have no idea what order they may appear in on the server. It might be five first or four first, and when you go into the networking course, you might even realize that those packets might not all take the same route, so one might, I don't know, route through Vancouver while another goes through Los Angeles or something like that. There's no way to guarantee anything about them. It might be fast, it might be more applicable, and there's very little overhead, but you also don't know if anyone's still connected, and packets could just be completely lost and you're not going to know the difference. The second step on the server was the bind system call, which basically sets a socket to an address. The first argument you give it is a socket, which is a file descriptor, and for the next one they try to be more general, but it's C, so it kind of looks ugly. You just have a pointer to a socket address, and then you have to give it the length of that structure. The domain you passed to socket sets the format of that socket address, so there's one socket address structure for each domain. If you use the Unix domain, your socket address would be sockaddr_un. I don't know why they had to shorten it to two letters when it could just be unix, but that's for local communication and it's just a path to a file. If you are using IP, or AF_INET, it would be an IP address, for example 8.8.8.8, which fun fact is a valid DNS server. If it was IPv6, you would give it an IPv6 address, which is just a longer address, and this is what an IPv6 address would look like.
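The address structures described above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the helper names `make_unix_addr` and `make_ipv4_addr` are hypothetical and do not come from the lecture's code, and port 53 in the usage note is just the usual DNS port, chosen for the example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: fill a Unix-domain address with a filesystem path. */
struct sockaddr_un make_unix_addr(const char *path) {
    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    /* Leave room for the terminating null byte; longer paths are truncated. */
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
    return addr;
}

/* Hypothetical helper: fill an IPv4 address from dotted text and a port.
 * Returns 0 on success, -1 if the text is not a valid IPv4 address. */
int make_ipv4_addr(const char *ip, uint16_t port, struct sockaddr_in *out) {
    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_port = htons(port);   /* ports are stored in network byte order */
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}
```

Either structure would then be handed to bind or connect as `(struct sockaddr *)&addr` together with `sizeof(addr)`, for example after `make_unix_addr("example.sock")` or `make_ipv4_addr("8.8.8.8", 53, &addr)`.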
The listen system call is the one that sets the queue limit for incoming connections. Again, it takes the socket file descriptor, and then an integer for how many connections you want the kernel to queue up, ready for you to accept. If more connections come in than the queue can hold, the kernel is just going to drop them, and to the client it looks like your server's not responding anymore. So it's just a number of connections. If you set it to zero, the system falls back to its own minimum queue size, so we'll just set it to zero for our example today. Then there's the accept call. At this point, we have a socket file descriptor, which says what type of socket it is and what the address should look like. Then we bound it to an address, so anything that connects to that address will reach us. Then we listened, which set up our queue, and now we can call accept, and that will block until there's actually a connection. When it returns, it returns a new file descriptor representing that connection. If we back up to the accept function, the first argument is the file descriptor of our socket, and the other two are optional. They're optional because we already bound to something, so we know what address we're listening on. They are output parameters: if you pass a socket address there, the kernel fills in the address of the client that connected. For our case, we don't need that, so we can pass null. So accept returns a new file descriptor, and then we can do read and write system calls as usual. That was the server side. Now the client side. The client still has to call socket, and we know how that works, so it's not any different. Then the client has to do a connect system call to connect to an address. You give it the socket file descriptor and then the socket address that we saw before.
The socket fd is just the file descriptor returned by the socket call, and you have to make sure that the location you're connecting to uses the same protocol as what the socket expects, otherwise you're going to get an error. Then it's the same kind of address and address length you give to bind. If this call succeeds, it returns zero, and then you can read and write on that socket file descriptor, and that forms communication between you and the server. Our example server will send hello there to every client and then simply disconnect them. It's under lecture 26 in your directory. There's client.c and server.c, and we'll see that we can program these nicely, and we'll actually use signals and see what signals are useful for. We're going to use a local socket for demonstration, so we're just going to give it some file that we know. You could go ahead and use IP if you want, but it's not going to make a difference for this example. You'd just change the socket address. We're going to use a file called example.sock and let the kernel manage that for us. So what does that look like? Here's the client. We define a socket path, which is just example.sock, and then in main, we create a socket. We use AF_UNIX because it's just a simple Unix socket represented by a file, and we use a stream, so everything is nice and reliable. We don't really care about how fast it is right now. The last argument is just zero because that's the default. Then we check our errors because we're good like that. If it returns negative one, it sets errno. This check is a bit of a different flavor, but it basically does the same thing as check error did before: it prints the error message along with what errno means and then exits.
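As a rough illustration of the socket call plus error check just described, here is a minimal sketch. The wrapper name `checked_socket` is hypothetical, and using `perror` for the message is an assumption; the lecture's own error-check helper is not shown in the transcript.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>

/* Hypothetical wrapper: create a socket, or print the errno message and exit. */
int checked_socket(int domain, int type) {
    int fd = socket(domain, type, 0);   /* protocol 0 = default for this domain and type */
    if (fd == -1) {
        perror("socket");               /* prints "socket: " followed by the meaning of errno */
        exit(EXIT_FAILURE);
    }
    return fd;
}
```

A client or server following the lecture would call it as `checked_socket(AF_UNIX, SOCK_STREAM)`.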
Next, we have to specify the socket address. We create a sockaddr_un, the Unix socket address, because it matches the domain argument we gave to socket. As part of the contract and conventions you have to follow, you have to put that AF_UNIX value in the sun_family field of the structure so it can figure out what kind of address it is. Then we can use strncpy to copy the socket path into sun_path, which is just the name of the field it needs, and that's it. After that, we have an address set up in that structure, so we can use our connect system call. We give it the file descriptor of the socket, the socket address structure that we just populated, and the size of the socket address so it doesn't overrun a buffer or do anything silly. Note that connect doesn't return a new file descriptor. It just makes that socket file descriptor valid to read and write to. So we check if connect had an error, and if it didn't, we can read and write to that file descriptor. We declare a buffer like we've been doing, and in this case, we're not going to send any data to the server at all. All we do is read over and over again until we've read all the data we can, and eventually read returns zero, indicating that we're at end of file and there's no more data left. Then we make sure we didn't get an error, print out the message we received, close the file descriptor, and return. Any questions about the client? Pretty straightforward. Okay, so let's see that it works.
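The client steps walked through above might look roughly like this. It is a sketch rather than the actual client.c from the lecture: the function names `connect_unix` and `read_until_eof` are hypothetical, and errors are returned to the caller instead of printed.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: connect a fresh stream socket to a Unix socket path.
 * Returns the connected fd, or -1 with errno set. */
int connect_unix(const char *path) {
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1) return -1;

    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    /* connect returns 0 on success; fd itself becomes readable and writable. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        int saved = errno;   /* close may overwrite errno, so keep a copy */
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Read until read() returns 0 (end of file) or the buffer is full.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_until_eof(int fd, char *buf, size_t cap) {
    size_t total = 0;
    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);
        if (n == -1) return -1;
        if (n == 0) break;   /* the peer closed the connection */
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

A main function in this style would call `connect_unix("example.sock")`, pass the result to `read_until_eof`, print the buffer, and close the descriptor.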
First I'll run the server in another terminal, and then when we come here we can execute the client, and the client receives the string hello there. We can run as many clients as we want. They can connect over and over again, and they all receive the same string. Any questions on that? Okay, if there are no questions, we can take a look at the server. The server starts off doing the same thing. Let's ignore this part right now. It creates a socket, which looks exactly the same as in the client: it's a Unix socket and it's a stream. They both use the same protocol, so they agree, and we're not going to get any random errors or anything like that. Then we do the exact same thing to set up the socket address structure. This is exactly the same as in the client, which it should be, because the client and server need to at least agree on some address that they use to connect to each other. In this case, they're agreeing on a file. The next thing the server has to do is bind that socket file descriptor to the address, so it calls bind. After that, the file descriptor is set up as a socket that can accept connections, but only after we also listen on it, so it's a two-step process. After bind, the socket is bound to an address, and that's the address it will listen on when you call listen. When you call listen, you set up the queue limit, and in this case I set the queue limit to zero, basically just accepting whatever the system uses by default. After that, I have an infinite loop, and every time through the loop, it calls accept on that socket file descriptor and just gives null for the other arguments, because the socket is already bound to an address and we don't need anything else. Accept returns a new file descriptor that represents that connection.
We make sure that it doesn't have an error, and if it doesn't, we can use read and write on it as normal. We have our string, hello there, and we store its length. We do strlen and then plus one for the null byte, because this is C, and we'll be nice and also send the null byte to the client, which is why the client doesn't have to do anything weird. What it receives is already a C string. Then we do a write system call. We write to the file descriptor representing the connection, which sends bytes to the client. We pass the message and the length we calculated, and we make sure that it sends the whole message, because write always returns the number of bytes that it actually wrote to the file descriptor. We're just assuming that it writes every single byte to the client. Otherwise, we exit with an error. After that, we assume this is the only thing we're going to do, so we close the file descriptor, which closes the connection, and return to the beginning of the loop so we can accept another connection. Any questions about that? Nope? Yeah? The question is what I did to kill the server. So yeah, that's part of it. Let's see what happens when we kill the server. Right now, while the server is running, we can see there's that example.sock file there. That represents our socket, and it shows up like a normal file in the directory, although on Linux you can't actually open it and read or write it like a regular file. So there's our socket. Then if we close our server with control C, our server is done, and that example.sock file is no longer there. If we try to run our client, there's nothing to connect to.
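The server steps described above (socket, bind, listen, then accept and write in a loop) can be sketched like this. It is not the lecture's server.c: the function names `listen_unix` and `serve_one` are hypothetical, the exact greeting text "Hello there" is assumed from the spoken description, and errors are returned rather than printed.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: socket + bind + listen on a Unix socket path.
 * Returns the listening fd, or -1 on failure. */
int listen_unix(const char *path, int backlog) {
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1) return -1;

    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    /* bind creates the socket file; it fails if the path already exists. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(fd, backlog) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Accept one connection, send the greeting (including its null byte), hang up.
 * Returns 0 on success, -1 on error or a short write. */
int serve_one(int listen_fd) {
    int conn = accept(listen_fd, NULL, NULL);   /* blocks until a client connects */
    if (conn == -1) return -1;

    const char *msg = "Hello there";
    size_t len = strlen(msg) + 1;               /* +1 so the client gets a C string */
    ssize_t written = write(conn, msg, len);
    close(conn);                                /* closing ends the connection */
    return written == (ssize_t)len ? 0 : -1;
}
```

A server in the lecture's style would call `listen_unix("example.sock", 0)` once and then call `serve_one` inside an infinite loop.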
It would say connect: no such file or directory, because it doesn't exist anymore. And to your point, the nice thing is that the server actually cleans up its resources. Whenever the server runs, as part of setting up the socket, it creates a file. We can see this file again, so there's example.sock. To shut down the server nicely, it has to remove this file, and to remove it, that's where we use our signals. In our server, we register a SIGINT handler and a SIGTERM handler. SIGINT is for control C, and SIGTERM is for when anyone nicely asks us to terminate. This is where we register what function to run, and we register a function called handle signal. In handle signal, we don't care about which signal actually got sent. All we do is close the socket, which closes that file descriptor. So now that file descriptor is closed and no longer valid, but that was going to be closed anyway when the process exits. The important part here is the unlink system call. The unlink system call removes the example.sock entry in that directory. We'll see when we get into file systems that, at least according to the kernel, there's no such thing as deleting a file. You just unlink it and remove a reference to it, and if there are no references left, which we'll see later, then the kernel can go ahead and delete it. So this is how we shut down nicely and get out of our infinite loop: a signal comes in, and we shut down gracefully. Now we can see where signals are actually somewhat useful. Okay, any questions about that? Okay, so fairly straightforward then. Well, in that case, we can wrap up early again.
This might be a pattern for the rest of the lectures because we have about three weeks left and I think we've pretty much covered all the topics except for this and file systems. So we'll start doing that. Next week we can go over quiz three review or anything else, so just let me know what you want to see, because I think I can fill about four more lectures until I'm done, maybe five. That leaves us another four or so. So we can do quiz three review next week, so everyone's ready for that, and go from there. Otherwise just let me know what you want to see, because we're kind of running out of content. Okay, let's wrap up today's lecture. A few other things: instead of read and write, there are also system calls specific to sockets called send and recv. They're basically the same thing as read and write except they take some flags that are only applicable to sockets. Some examples: there's MSG_OOB, so you can receive out-of-band data, which you won't know the meaning of until a networking course. There's MSG_PEEK, where you look at the data without advancing the read position. You just take a peek at the data and you don't modify where anything is in the buffer, so if anyone reads again, they start at the beginning and reread the data that you peeked at. Then there's MSG_DONTROUTE, so you can send data without routing packets. Again, you won't really know what that means until a networking course. You might use peek at some point, but generally you don't need to know these. There are also sendto and recvfrom, which take an additional address if you want to just quickly send something to a location. That's for datagram sockets. For stream sockets, those address arguments are ignored because there's already a connection formed.
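To make the peek flag concrete, here is a small sketch of recv with and without MSG_PEEK. The function names `peek_bytes` and `take_bytes` are hypothetical wrappers added only for illustration.

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Look at up to cap bytes without consuming them (MSG_PEEK). */
ssize_t peek_bytes(int fd, char *buf, size_t cap) {
    return recv(fd, buf, cap, MSG_PEEK);
}

/* Ordinary receive: with flags set to 0, recv behaves like read on a socket. */
ssize_t take_bytes(int fd, char *buf, size_t cap) {
    return recv(fd, buf, cap, 0);
}
```

Calling `peek_bytes` and then `take_bytes` on the same connected socket hands back the same bytes twice, because the peek leaves the data in the kernel's receive buffer.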
So yeah, you perform networking through sockets. Surprisingly, it's not actually anything new. It's just another form of IPC, and sockets use file descriptors, so it's the same abstraction we already know. All they are is IPC across possibly different physical machines, though they don't have to be. The basics are that they require an address: if it's local, it's just a file, or it could be an IP address, either v4 or v6. There are two types of sockets, stream and datagram. Stream sockets are the reliable ones: they form a persistent connection and you get things in order. Datagram sockets essentially bundle up a single write call and send it to the other side, and if you do multiple write calls, they might arrive in any order, or not at all. You're not given any guarantees, and if the other side drops its network connection for some reason, you're not going to be informed about it, because you don't form a persistent connection. If you're a server, you need to bind to an address, listen, and accept connections. If you're a client, you just need to connect to an address. After that point, it's just a file descriptor and you can use it however you like. And yeah, YouTube streaming uses TCP. So I'll wrap up this lecture, and I'll stick around in class if you have any questions or lab stuff or anything else, or just tell me what you want to see in the next lectures. All right, just remember, pulling for you, we're all in this together.