All righty, welcome back to good old operating systems. So today is a fairly chill lecture. Hopefully we can be over with it fast and get back to lab questions, if anyone has any lab questions. Hopefully you've started the lab; otherwise, yikes. Yeah. No, so this won't be tested at all for anything. This is just for you, because if you have ever used the internet or a web browser, or want to communicate over the network, that's what sockets are for. It ties into this course because it doesn't really add that much more on top of what we already know. And what we get out of it is, yay, we can communicate across the network and across computers, which is real fun. So let's just dive into it.

So sockets are just another form of inter-process communication. So far, we've seen pipes. We've seen signals. We've seen shared memory, even. But those forms of IPC can only communicate between processes that are on the same machine. What if we want to communicate between different machines? That's where sockets come into play. Sockets enable inter-process communication between physical machines, and typically this will be done over a network. So whenever you visit any web page on the planet, you're using sockets.

So there are two steps, or sorry, there are two kinds of sockets, server and client. There are sockets that are supposed to represent a server. Like if you go to Google.com, you connect to a server that supplies your web browser some information, then your browser renders that, and then you can see Google.com. The server side is the more complicated side. There are four steps if you want to use a socket as a server. First, there is a system call, socket, that will create a socket. And spoiler alert, it just gives you a file descriptor. That file descriptor represents a socket, and then everything you know about file descriptors is exactly the same. You can read and write to it or do whatever you want. The other steps are a bit weird.
So there is the second step, which is bind, which means we attach the socket to some location. That could be a file, which we'll see in the example today, or an IP address with a port, and so on and so forth: some way to actually form a connection with this server through some known name. The third one is the listen step, and that indicates you will accept connections. Basically, the kernel will take care of queuing all the connections for you if, like, a million people try to connect or something like that. It has an internal queue, and you can set a queue limit. Then for the fourth and final step, you do an accept system call, and that will return the next incoming connection for you to handle. It will also be a file descriptor. So whatever file descriptor you get from accept represents a connection between you and another person or another program. You can read and write information to them, and it goes across the network.

Clients have a much simpler job. They only use one socket per connection. You create a socket with the same socket system call, and then you use a connect system call, which connects to some location. Now the socket is active: it can send and receive data. And the server end can communicate with it using the file descriptor that got returned from accept. The information just goes the other way.

So this is what the socket system call looks like. It takes three arguments: a domain, a type, and a protocol. The domain is kind of just the general protocol family to use. It's further specified by that third protocol argument, which typically no one ever uses anymore. So basically the domain actually represents the protocol. There are three main ones you'll see. There's AF_UNIX, which is for local communication, on the same physical machine, and generally the name for that is just a file name. The other ones are AF_INET, so that's IPv4. So like 8.8.8.8.
If you've heard of an address like that before, that's an IPv4 address. And then there's AF_INET6 for IPv6, because we ran out of IPv4 addresses. That's the naming scheme that has colons and hex characters, and it's lots of fun.

The type is usually one of two options: stream or datagram sockets. Has anyone ever heard the term TCP or UDP before? Yeah, we got a few of those. So have you taken the networking course? Oh, okay, then this'll be easy. Have you actually used sockets then? Yeah, well, there you go. Sockets should actually be taught here, but here we go. Stream sockets are like TCP connections, datagram sockets are like UDP, and we'll go into that in a little bit.

So like I said, stream sockets use TCP. That means all data sent by the client appears in the same order on the server, and so on and so forth, and you form a persistent connection between them. As soon as that connection gets interrupted, because someone pulls a network cable or something like that, you are notified of it, and it says lost connection or something like that. So this is reliable but maybe slow. Especially when you get into the networking course, you'll realize that, hey, things arriving in order over a network isn't always the case. Networks can do weird things. So the TCP protocol makes sure everything's ordered, and there's going to be some overhead associated with that.

Datagram sockets, on the other hand, use UDP, and you just send messages or packets between the client and the server. There's no persistent connection between them. It's really fast because you don't get any acknowledgement or anything like that, but any messages you send to the server could arrive in a different order depending on how the packets get routed, or they might not even make it there at all, and you will not know the difference. So generally you pick TCP because it does a lot of the work for you.
If you really get into performance, like where if you send something and they don't get it, it's too late anyways, like a game or something that needs really, really fast response times, you use UDP. Resending doesn't help because that time has already passed and the state of the world is different, so why would I bother to send it again?

All right, so the bind system call takes a socket file descriptor as the first argument and then an address, and the address is spread over two arguments. There's a struct sockaddr, which is different depending on what kind of name you're trying to bind to, so there's one for each of the three different domains. The last argument is a length, because it's C and it doesn't know how long the struct is, so you have to tell it. The three structs are: sockaddr, so short for socket address, and then underscore un, so sockaddr_un. I don't know why they can't just type unix, why it has to be un, but don't ask me. That one is for local communication, and it's just a path that looks like a file. For sockaddr_in, or IPv4, it's an address like 8.8.8.8, and for sockaddr_in6, or IPv6, it's some ugly hex thing that's separated by colons.

The listen system call sets the queue length limit for incoming connections. You just give it that same socket file descriptor and then an integer with the maximum queue length that you want it to maintain. The kernel itself will queue up to that many connections, as specified by that backlog argument, and after that, whenever someone tries to connect and your queue is full, the kernel would just drop the connection and they would get a connection failed message. If we just set it to 0, we let the kernel figure it out and pick a size for us; the kernel's pretty smart.

All right, the third, or rather the last, part of the server is the accept system call, which blocks until someone tries to connect to it.
So again, it takes the socket file descriptor as the first argument, and then an address and an address length. These are optional: they get filled in with the address of whoever connected, and if you don't care about that, you set them to NULL and just ignore them. At the end of this, like I said before, accept returns a new file descriptor, and then we can do read and write system calls as before, and now they get sent over the network.

So this is the client. The client just has a connect system call that takes that socket file descriptor and then the address, similar to bind for the server. You would have to connect to the same thing that the server is listening on. So all of that is exactly the same as bind, and if this call succeeds, that socket becomes a normal file descriptor that you can read and write, and that's how you communicate with the server.

So let us just go into the example, and then we can wrap up quickly and go back to the lab and, I guess, your other midterms if you are so inclined. Okay, so here is our main. I will make that slightly bigger. Here's our main. We're going to register some signals like we've seen before. This is just so we can exit properly, and we'll go over what our signal handler does in a little bit. We're trying to be a server in this code, so we are creating a socket with the AF_UNIX domain. We're not going to try and connect using an IP or anything; we're just going to connect using a file name. And we will say SOCK_STREAM, which gives us that stream connection, like TCP, so everything's in the same order and we don't have to worry about it. And then the protocol parameter is just unused.
So of course we do what I have been harping about this entire course. You may have debugged for an extra 10 hours on lab two because you didn't listen to me. So here I check for an error, and if there is an error, I exit immediately so I know about it.

Our next step is we want to bind, so we need to set up a name to bind to. Here we create a socket address for that Unix name. This bracket just initializes that whole structure to zero. Then we also have to specify in that structure what the domain is, or the protocol, so it should match exactly what you have in the socket system call. And then here I fill in the sun_path field, and that is just the name of the socket to connect to. In this case, my path is a define, and it expands to just example.sock, so this should create a file called example.sock, or something that looks like a file, and that represents our socket. Otherwise, I just say how much of the string to copy; I don't need to copy the null byte or anything like that. Then I bind to that socket address, check for errors of course. Then I listen. I give it a zero for that queue depth because I don't care, I'm just going to let the kernel pick. Check for errors.

And then I have a while true. If you ever use a web server or host a web server, the core of your web server looks exactly like this. It's just while true, and it does an accept system call, so it's just waiting for people to connect to it. The accept system call will block until there's a connection, and then it will finally return, and whenever it returns I get a new file descriptor that represents that connection. Because I'm a server, I will send the client some information. So here I create a message called hello there, I calculate its length with the null byte, or with the, yeah, with the null byte, and then I just do a write system call. So we've all done write system calls, hopefully, and
this will send information, possibly across a network. In this case it's just on my machine, but it could be a network depending on what you listen on. So here I check for errors, check that it writes the whole message, otherwise I print out some error messages. Then I just close that connection, and the loop restarts and waits on accept again for a new client, and so on and so forth.

So, my register signal: that's the only way I can end this process cleanly. I created a file called example.sock that represents a socket, so I should close it, but because I'm in an infinite loop, I don't really know when to close it. The answer to that is, well, I should register a signal handler and close it in my signal handler. So here I registered my handle signal function, and in there all I do is call close socket. In close socket, I close the file descriptor that represents my socket, and then I do this unlink system call, which we haven't seen before. This basically just removes that file so that it doesn't stick around afterwards, because if we didn't remove it, it would just be there and we couldn't really use it.

So let us run this. If we run it, it looks like it's not doing anything because, well, it doesn't printf or do anything anyway. If we straced it, we would see that it is blocked on that accept system call because no one has connected to it yet.

So we can switch over to our client. Our client does the same first step: the same socket system call with AF_UNIX and SOCK_STREAM, so it should match what you're connecting to, otherwise you'll get an error. Then here I create that socket address. It's a Unix socket and it goes to the same path, so it should be example.sock. And then, because I am now a client, I just call connect. If connect is successful, it doesn't return a new file descriptor; it just makes that socket file descriptor actually usable, so I can read and write to it. So in this case
I'm just going to create a buffer and then do a read system call, and just keep reading over and over again while I get some data, all the data we can get, until we get end of file. We'll get that because the server closes the connection on us, so we know it's not possible to get any more data. Then at the end of it I will just printf whatever I received across the socket. So if I go to a new terminal, I can see that there is this example.sock, so that is what the server is bound to and listening on. And then if I run the client, yay, I received hello world, or not hello world, hello there. Wow, formal.

All right, any questions about that? That is basically a web server. So not too much different, right? If we know one form of IPC, the only difference is how you set it up, but after that it's read and write system calls, and now they go over the network, yay. Any questions? Pretty cool, awesome stuff. You will use this in other courses, or if you ever have to do anything involving the web, it uses this. And if you want to verify it, go ahead and strace your web browser if you're brave enough. If you strace your web browser, you'll see that, well, it has to communicate with the system and it connects to the internet, so it has to use sockets.

So, any questions, or want me to do anything silly with this example to see what would happen? Yeah. Yeah, so I can run it again, because all the server does is, each time we connect to it, this accept returns, we send a message, and then we just close that connection and wait for the next one. So I can run this client as many times as I want, and I get a message from the server.

Yep, so the question is how many clients can the server hold at a time. The way I wrote this server, I only serve one client at a time right now. I just accept a connection, I write a message, I close the connection, so I only handle one at a time. And how many clients can try to connect to it at the same time, it's determined by the queue
depth of that listen system call, until it starts dropping connections. So if I wanted to, I could launch a bunch of shells that launch a bunch of processes that try to connect to it and see when it starts dying. That's probably a really big number.

Yeah? Yeah, let's run the client after closing the server then. So if I press Ctrl-C, it should shut down normally, because I close that socket and I remove that entry. Here I can see that there's no example.sock there. So if I try to use the client now, it says connect: No such file or directory. It doesn't exist anymore.

If we want to, we could try a different one. Let's say I messed up. Let's just say I forgot to do this in the server, so I forgot to unlink, which basically just removes that example.sock entry. So if I go ahead, build this, run the server, and then close the server, the server's not running but that file is still there. So if I try to run the client again, I shouldn't get no such file or directory, because it's clearly there. So let's see what happens. Connection refused. So nothing's listening to it; it's just a socket floating off somewhere, I don't know, pick your favorite place. This is why it's always good to check for error messages too. And connection refused, who has seen that whenever they try to connect to a website? Yeah, guess what, it's this.

All right, any other questions? I think that's the most we can break it without writing another script that tries to see the maximum number of clients I can throw at it, but because it's on my machine, it's probably really big.

Yeah? Yeah, the question is, is this the most common form of IPC? Yeah. Why would someone ever want to use pipes when they can do this? So pipes are a bit more private, because you share them directly with your children, and anything that is unrelated cannot connect to it and cannot get access to your pipe. While in this, if I know the file name and I have permissions to get to it, any process could
connect to that, right? Which sometimes you might not want. So that's about the only thing that's different, but this is a lot easier. Typically you write programs that want to communicate with anything, and in that case you just use a socket because you want to throw it on the internet or something like that. All right, any other questions? Cool, let's just polish off the slides.

So instead of read or write, there are also system calls that essentially do the same thing except they have some special flags. There's a send and a recv system call that do network-specific things, and you will not understand what these things actually are until you take a networking course. There's out-of-band data, MSG_OOB, which is really weird. There's MSG_PEEK, so you can look at data without actually consuming it: you just take a peek at it, and then you can forward it to somewhere else in your program that actually reads all of the data. Then there's this MSG_DONTROUTE flag, which sounds really weird. That sends the data without routing the packets anywhere. You may think, hey, if this is a network thing and I need to route packets between places, why the hell is there something called don't route? That's only used if you are actually implementing a router yourself, which you probably don't do in your networking course. But if you need to implement a router, a switch, or something, that's when you would use don't route, when you really care about the path and you don't want to go around infinitely because you are the router.

So I guess that's it. You perform networking through sockets. Everything's a system call. You can strace everything if you don't believe me. I recommend you never believe me; you can always strace things yourself if you really want. So strace Chrome, see what happens. But sockets are the actual IPC that communicates across machines. The basics are: sockets require an address, like the local one I used, which is just easier and local to your system, and
then IPv6 and IPv4. There are two types of sockets, stream and datagram. Servers need to bind to an address, listen, and accept connections, so they have a whole bunch of steps. And then clients just need to connect to the address, and then you get a file descriptor with your two-way communication there. All right, we can use the rest of this time for fun, or ask me things about lab three, or study for your midterms, or do whatever. So just remember, I'm pulling for you. We're all in this together.