All righty, welcome back to operating systems, everyone's favorite course. So today we're talking about sockets. Chill lecture today: I will not test you on this, but it will be useful for you. Have you ever connected to the internet? Guess what, you've used sockets. Your web browser uses sockets. If you want to be a cool hacker person, you will use sockets, because the internet, it turns out, has a lot of juicy things on it. So we will talk about sockets and their APIs. Because operating systems are great, you already know how to use them, aside from which system calls to set them up with.

So sockets are just another form of IPC. We've done pipes. We've done signals. We briefly, kind of sort of, talked about shared memory; we didn't use it because, well, we'll get our fair share of it once we start talking about threads. So you know three types of IPC; here is a fourth. All of those were fairly limiting because they all had to be on the same physical machine. With shared memory, you need to access the same memory between processes. Pipes and signals, same thing: they operate on a single machine. Sockets enable IPC between physical machines, so the processes you're communicating between do not need to be on the same physical machine. They can be on different physical machines, and this will typically be through a network: Wi-Fi, all that fun stuff.

We will not get into the details of networking; that is the networking course's job, not mine. This is just how to use them, because for all practical purposes this is the only lecture you need. You don't actually need to take networking unless you're going to design a router or something like that. This one lecture is actually all you need.

So there are four system calls. They all have the usual C wrappers that return negative one if there's a problem, set errno, all that fun stuff. You definitely listened to me and we definitely check for those now, right?
Hopefully, because that would have saved you like ten hours' worth of debugging on lab two if you had listened to me. At least a fairly significant number of students probably would have. So, four steps if you're a server, because whenever we have sockets communicating across the network, there are usually two sides to it. There's a server, the one that you connect to, that supplies all the information. Example: google.com, any web server, anything like that. You have to connect to it and then ask it for stuff.

So if you are a server and you want to use sockets, there are four steps you need to take. One is a socket system call to create a socket; we'll go into the details of that with an example. Two is a bind system call, so that socket gets associated with a location somewhere. It could be a file if we want to use sockets on our local machine, but it could also be an IP address with a port, or a different style of IP address with a port. There are some other options, but generally you'll never use them. The third step is listen, which says: hey, my server is listening on this location and I am now accepting connections. And then the fourth system call is accept, and that will return the next incoming connection for you to handle. And guess what, it is a file descriptor, so you know how to use that. If you get a file descriptor you can read and write, and that will communicate with it. So if I accept, I get a file descriptor back representing that connection. If I write information to that file descriptor, I am sending information across the network. If I'm reading from it, I am receiving information from another process running on another machine over the network.

Clients have a much easier job. They don't have to listen for multiple things. They just form one connection.
So a client starts with that same socket system call, and then it just has a connect system call. That needs to connect to the same location that the server is listening on in order to establish a connection; otherwise, you'll get an error. But if this call is successful, you can now use the socket, the file descriptor returned from socket, to send and receive data using your read and write system calls as we have done before. So you can kind of see how file descriptors are fairly powerful, because I can just represent an internet connection with a socket, which is a file descriptor, so I can send and receive information across it. And I don't have to learn anything new aside from how to set it up, which I look up once and then I copy it, and that's it.

So the socket system call has three arguments. Only two of them are actually used for anything. The first one is a domain, which essentially says: what is the type of thing I need to be connecting to? There are three main ones. Like I said before, there's AF_UNIX, which is for local communication on the same physical machine. Typically you use this for debugging, or if you are running servers on the same machine and you don't actually have access to open ports or anything like that. Other than that, there is AF_INET for IPv4, which is like number dot number dot number dot number. I don't know how your guys' age bracket is, but maybe you've seen that if you connect to Minecraft servers or something. I don't know what the hell you guys actually do. And there's AF_INET6 for IPv6, because a v4 address is only a 32-bit number, which is like four billion addresses, and we ran out of IP addresses once people started connecting their damn toasters to the internet. So this is a much larger number, but typically no one writes this number out because it's really ugly. It's hex characters.
It's gigantic. And then the second argument that's actually used is type, and it is one of two things: there are stream sockets and datagram sockets, which we'll go into next. And this protocol argument is just not used; it's probably going to be zero for everything. It was there when people thought this was going to be fancier, but it turns out no one used it.

So you might have heard the term TCP. If you are going to take a networking course, you will hear of that. Stream sockets use TCP, and basically all that means is that the data you send across a socket will appear on the server in the same order as you sent it. If you get into networking, which, despite how I talk about it, is a fairly interesting topic, you might realize that this is not guaranteed, because you send information, I don't know, through radios or something. I don't know, I'm a software person. So packets can actually arrive in different orders. One might get routed to Japan or something like that and take a long time while another packet doesn't. Sometimes that does happen; your ISP screws up a lot. So if your connection has ever been weirdly slow on certain sites, it's probably because your ISP screwed up the routing and it's routing the packets to Japan instead of, you know, to Arizona, wherever the hell the server farms are. So it's not always guaranteed, but with TCP you actually get that guarantee: whatever you send will be received in the same order on the server. It has some fancy algorithms that reconstruct stuff; they'll figure out what was sent when and make sure that looks the same on both sides. It also creates a persistent connection between them, so you can tell whether or not you have disconnected from the server or something like that has happened. You'll get some acknowledgement back and forth.
You'll know whether or not your information actually made it there. It's reliable, but it might be really slow.

So the other alternative is something called a datagram socket, which uses something called UDP. UDP just sends packets, or messages, just big blocks, between the client and the server, and it doesn't have the same guarantees. So the packets you receive on the server may not be in the same order as you sent them from the client. You have no acknowledgement that they actually made it to the server. They could have gotten lost. The server could have died, the client could have died, they could have gotten vaporized by you heating up your dinner, turning on a microwave that's on the same frequency, and something bad happened. So it's really, really fast because it doesn't have all those guarantees. Those guarantees are difficult: they need algorithms to run, they waste space, so on and so forth. But for some types of communication, well, you don't really care if it made it there or not, because if you had to retry, maybe the state of the world is different now and there's no point in retrying because that packet is no longer valid or something different has happened. Or you just don't care, or you'll just skip a frame or something. So something like video streaming would use UDP because it just sends you information. It doesn't care if you actually get it, or if you need to buffer, or if your computer is too slow or whatever. Doesn't care. So it's fast, but you might get messages in different orders, and some may get dropped.

Some people think they're clever and they use this because they think it's faster, and then they try to guarantee that things arrive in the same order, and then they basically implement a bad version of TCP, but way worse, with bugs. So if you need order, use TCP, a stream socket. If you don't need order, use datagram. Don't try to make your own, for the love of Pete.

So let's get into the system calls. So there is the bind system call,
which takes that socket, which is the file descriptor, as the first argument, and then two arguments for the address. One is the address of a struct called sockaddr, the socket address, and the other is an argument that says how big that structure is, because we are dealing with C and C is not terribly clever, so you have to say what the length is explicitly in the function call. After this, it will bind that socket to that location and then you can do the next system calls.

The structure is going to depend on what that domain parameter was. If it is a Unix socket, it's sockaddr_un. Why didn't they type unix here? I have no idea, but that's what it is called. If it is that type, it's just a file name; it just looks like a path. If it is sockaddr_in, in being short for inet (again, why they needed to save two characters, I do not know), it's going to be an IPv4 address like 8.8.8.8, or whatever google.com is today. And then for sockaddr_in6 it's an IPv6 address, which looks something ugly like this.

So the listen system call takes that socket file descriptor and then an integer representing a backlog, or a maximum queue length. That is a queue that the kernel maintains: the kernel will maintain a queue of incoming connections, up to the length given by that parameter, and then if there are too many incoming connections it will just start dropping them. So if you want a specific amount, you can specify it here. Otherwise you can set it to zero and just let the kernel figure out what its queue limit should be. Typically, just let the kernel figure it out.
The kernel developers are better than me, and probably much better than you, so they know what they're doing. Just set it to the default and hope it works out.

For the accept system call, well, this takes the socket, still the socket file descriptor returned from the socket system call, and then it optionally also takes an address and a length, which the kernel fills in with the address of the client that connected. We don't need that here, so in this case we will just set them both to NULL and ignore them. After this, accept returns you a new file descriptor that represents a connection to a client, and then you can do the read and write system calls as usual. So that was a lot of setup. Don't worry, we'll see an example.

The next one is the system call for the client. The client just has to do the socket system call and then the connect call. It takes that socket file descriptor as an argument, and then the sockaddr structure and the length of it, just like the bind system call. This needs to match whatever the server bound to in order to form a connection with it. If this system call is successful, now the client has a file descriptor representing that connection, and it can read and write to it, so you can communicate with that server.

So that is it. We can just see an example. Okay, that's in connect again, cool. Oops. Oh my. Okay, we're good. All right, I played with it in the last lecture, so we'll just make sure we compile it again.
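As a rough sketch of the per-domain address structures described above, the two helpers below fill in a `sockaddr_un` and a `sockaddr_in`. The helper names `fill_unix_addr` and `fill_inet_addr` are made up for illustration and are not from the lecture's code; `inet_pton` and `htons` are the standard calls for converting the dotted text and the port, which the lecture does not go into.

```c
#include <arpa/inet.h>   /* inet_pton, htons, ntohs */
#include <assert.h>
#include <netinet/in.h>  /* struct sockaddr_in */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>      /* struct sockaddr_un */

/* Hypothetical helper: fill a Unix-domain address with a file path.
 * Returns 0 on success, -1 if the path does not fit in sun_path. */
int fill_unix_addr(struct sockaddr_un *addr, const char *path) {
    assert(addr != NULL && path != NULL);
    if (strlen(path) >= sizeof(addr->sun_path)) {
        return -1;
    }
    memset(addr, 0, sizeof(*addr));   /* zero the whole structure first */
    addr->sun_family = AF_UNIX;       /* must match the socket() domain */
    strcpy(addr->sun_path, path);     /* fits: length was checked above */
    return 0;
}

/* Hypothetical helper: fill an IPv4 address from dotted text and a port.
 * Returns 0 on success, -1 if the text is not a valid IPv4 address. */
int fill_inet_addr(struct sockaddr_in *addr, const char *ip, uint16_t port) {
    assert(addr != NULL && ip != NULL);
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;       /* must match the socket() domain */
    addr->sin_port = htons(port);     /* port goes in network byte order */
    if (inet_pton(AF_INET, ip, &addr->sin_addr) != 1) {
        return -1;
    }
    return 0;
}
```

Either structure would then be passed to bind or connect as `(struct sockaddr *)&addr` together with `sizeof(addr)`, which is the explicit length argument mentioned above.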
Okay, so in this case let us read what happens. Excuse my typing. So here is the server code. In the server we are going to register two signals, and we'll see why in a little bit. I'm going to register SIGINT, which is what you get when you press Control-C, and SIGTERM, which is what you get when something asks the process to terminate nicely. So I'm going to set these up and we'll explain why later. But I register those two signals, and it's not just to ignore them to be a jerk. They will actually have a use.

First we do the socket system call. In this case we're just using a Unix socket. So I say I'm using AF_UNIX, and then I will use SOCK_STREAM so I get everything in the same order, nice and reliable, and I know whether or not the connection gets dropped. And then that third parameter is the protocol. I set it to zero; I never use it. Then I will check for errors, because if there's one thing you've learned from the labs in this course, it's that you should always check for errors. So here I check for an error, and if I see an error, which means it returned negative one, I just exit the process and I will deal with it later.

After that, the socket call has been successful, so I set up this socket address for a Unix socket. I do this curly bracket zero to zero-initialize the whole structure, just because it's not guaranteed otherwise and it is a good thing to do. Then I have two fields to set. Every socket address needs a family field, here sun_family, and it just needs to match the domain from the socket call, because this is C and it has to know what the structure actually is, and that's how it knows. So this is basically C++ classes, but in C. Then it has a field called sun_path. I do a string copy because I'm copying this string into it. So I'm just trying to use a socket just called example.sock.
That is it. And in the string copy, I just make sure that I don't copy the null byte. I just go up until the last character, and that is my whole path. Then I try to bind to that address. I check for an error; if there's an error, I bail out. Otherwise I am successful, and then I have to do that listen system call. Here I give it the argument zero because, remember, that's the queue depth; I'll let the kernel figure it out for me.

And then here is the meat of the code. Any web server you connect to is going to look like this. It's going to say while (true), and then it's going to have an accept system call. That accept system call is blocking: it will only return whenever someone has connected to it. So this will block, listening on that socket. I don't need the address or anything. Whenever this returns, as long as there's no error and I don't get negative one, I get a new file descriptor, which I called connection fd, and that represents a connection to a client. I can read and write information to it if I want to. In this case, I will pretend I'm a server, so I will send information. I create a string, "Hello there" with an exclamation mark, calculate the string length including the null terminating character, and then I do a write system call. So I just write to that file descriptor with the message and how many bytes I want to write, and then check for errors and check that we have actually written the whole thing. And here, this is just a short way of printing to standard error: instead of using printf, I can use dprintf, which takes a file descriptor as the first argument.

After I'm done writing my message, my life as a server in this case is done. So I just close the connection, check for errors, and then I go back to the beginning of the loop and just wait for the next connection.
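The server walkthrough above can be sketched roughly as follows. This is not the lecture's actual file: the function names `make_listener` and `serve_one` are hypothetical, the backlog is passed in as a parameter instead of being fixed at zero, and the signal handling is left out to keep the sketch short.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: socket + bind + listen on a Unix stream socket.
 * Returns the listening file descriptor, or -1 on any error. */
int make_listener(const char *path, int backlog) {
    assert(path != NULL);
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);  /* protocol is always 0 */
    if (fd == -1) {
        return -1;
    }

    struct sockaddr_un addr = {0};             /* zero-initialize */
    addr.sun_family = AF_UNIX;                 /* matches the domain */
    /* Copy at most size-1 bytes so the zeroed last byte stays as '\0'. */
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(fd, backlog) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: one iteration of the server loop. Blocks in
 * accept, sends the greeting, closes the connection.
 * Returns 0 on success, -1 on error. */
int serve_one(int listen_fd) {
    int connection_fd = accept(listen_fd, NULL, NULL); /* peer address not needed */
    if (connection_fd == -1) {
        return -1;
    }
    const char msg[] = "Hello there!";
    /* sizeof(msg) includes the terminating '\0', as in the lecture. */
    ssize_t written = write(connection_fd, msg, sizeof(msg));
    int ok = (written == (ssize_t)sizeof(msg));
    if (close(connection_fd) == -1) {
        ok = 0;
    }
    return ok ? 0 : -1;
}
```

A real server would call `serve_one` inside the while (true) loop. Note that binding creates the socket file at `path`, and a second bind to the same path fails until that name is removed.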
So this server will just wait for connections and serve as many people as it can accept: go through, send them a message, wait for the next person, send a message, wait for the next person, send a message. So obviously, if I have a while (true) loop, I don't know when to exit. My program never exits until someone requests it to exit. That's why I have a signal handler, because I have the socket file descriptor I want to close, and it's associated with a file name that I want to get rid of.

In the signal handler, all it does is make sure that we got one of the signals we expect (getting anything else should never happen), and then it closes the socket. To close the socket, it closes the file descriptor, which is just a global variable so the handler can access it. And then it does this unlink system call, which we haven't seen yet. That basically just deletes a file, although once we get into file systems we'll realize that deleting a file is not really a thing. This just gets rid of that name; that's why it's called unlink and not delete. And, funny story, if you strace rm and try to remove something, there's no rm system call. It's unlink. So unlink just gets rid of a name; in this case,
I'm getting rid of the name example.sock. So now if I run the server, I don't get any output because it doesn't print anything. But it also doesn't finish; it's running. If we straced it, we would see it stuck on that accept system call in the loop. No one has connected to it yet. If I switch to another terminal and look in here, well, there's that example.sock file that represents the socket. So the server looks like it's running.

Oh, let's actually see the client code before I run it. In the client code, all it does is the same socket system call. It says we want a Unix socket, SOCK_STREAM, protocol zero. It's exactly the same as before. The socket address is going to be exactly the same as in the server, so I want to connect to something called example.sock. Because I am the client, I just call connect: give it the file descriptor I got from the socket system call, give it the address of the structure, give it the length of the structure, and check for errors. Otherwise, that file descriptor returned from socket now represents an active connection. In here, I create a buffer the size of a page, because the kernel loves pages, as we now know. Then I just do a read system call in a while loop, over and over again, until I'm all out of data. When I exit the loop, I check whether the number of bytes read indicates an error, in which case I bail out. Otherwise, I just print whatever I received from the server, close the file descriptor, return zero, and exit the program.

So now if I run this, my client says "Hello there!" because I got that from the server. I can run this over and over again, and the server is just going to accept the connection, send me some information, close it, and go on to the next one, so on and so forth. Any questions about that? That's the internet in a nutshell. So, cool. Anyone want me to do anything to break this or do anything silly?
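A minimal sketch of the client side described above, again with made-up names (`connect_unix`, `read_all`) and not the lecture's actual file. It assumes the caller supplies the buffer, for example a page-sized `char buf[4096]`, and it retries on `EINTR`, which the lecture does not mention.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: socket + connect to a Unix stream socket.
 * Returns the connected file descriptor, or -1 on error. */
int connect_unix(const char *path) {
    assert(path != NULL);
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd == -1) {
        return -1;
    }
    struct sockaddr_un addr = {0};
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
    /* The address must match what the server passed to bind. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: keep reading until the peer closes the
 * connection (read returns 0) or the buffer is full.
 * Returns the total number of bytes read, or -1 on error. */
ssize_t read_all(int fd, char *buf, size_t cap) {
    assert(buf != NULL);
    size_t total = 0;
    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);
        if (n == -1) {
            if (errno == EINTR) {
                continue;           /* interrupted by a signal: retry */
            }
            return -1;
        }
        if (n == 0) {
            break;                  /* end of data: server closed */
        }
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

With the server from the demo running, `connect_unix("example.sock")` followed by `read_all` on the returned descriptor should yield the greeting, after which the client prints it and closes the descriptor.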
Yeah, the question is whether I can send something to the server to cause it to send a signal to itself. No. In this case, if I write anything to the server, the server never reads anything; it just closes the connection immediately. So the kernel would copy that data in somewhere, and then after the connection gets closed it would just delete it. I can't trigger a signal or something unless I do a kill system call in the server process. But sending data is not going to trigger a signal or anything.

All right, anything else fun I should do with it? Yeah, can I cause a denial-of-service attack? Sure. Exponential? Yeah, whatever. Oh, this is gonna spam. It shouldn't be that bad, so my server can put up with that amount. So, denial of service. This also seems silly, to deny service to your own machine, but let's roll with it. Yeah, that looked pretty bad. So you can denial-of-service your own machine if you really want, but the kernel's fine; other programs can probably still run. Let's see if other stuff still runs. You know, other things are fine. Yeah, everything else is fine. The kernel's probably just dropping all those connections, or blocking them and doing something bad to them. I probably should have straced the whole thing to see what was happening. But yeah, you can denial-of-service yourself. That's something you get into in the networking course: something will figure out that all of this is trying to connect to your server and then say, hey, I'll change the IP address, or I'll make the route different, or just start dropping connections, or do something. So it's probably actually dropping connections here.

All right, any other fun stuff? Okay, I should probably get rid of that code because it is dangerous. But I could have gotten into the situation where it forks... oh, actually, that's probably what was failing. Fork probably started failing because that was a lot of processes and I probably ran out. So I probably didn't actually deny myself service there.
I just ran out of processes first, because local sockets are really, really fast. So what's my limit here? I probably can't denial-of-service myself over a Unix socket, but I tried. I ran out of processes first.

Yeah, so that's a question: can I make two servers? The server's already running, right? So can I just run it again? "Address already in use." So only one process can be bound to a socket address at a given time, and the kernel knows about that, so I get an error message.

But if I wanted to keep some connections open: depending on how the server is built, it might only close the connection once it gets sent specific information or whatever. If I just never send it information, it'll never close the connection. You would have to program your server to drop a connection after a timeout or something like that. So that's why people use off-the-shelf web servers instead of writing their own. But if you really wanted to, guess what, this is a web server. You just have to speak HTTP. Like, you can strace Chrome if you want: it'll create a connection, then it'll write some bytes to the server that say, hey, give me your web page, please, and then it gets sent the web page. There are protocols for how to format that information, and if you read a thousand-page boring document, you could figure out how to do that if you really wanted to.

All right, any other questions? Cool. Networking: one lecture, easy course, right? Still take networking. Don't have them come after me, please.

All right, so let's wrap it up. There are some extra things because it's a socket. So, instead of read and write:
There are also send and recv system calls that have special flags that are specific to sockets. Some examples: there's MSG_OOB, to receive out-of-band data, which doesn't mean anything to me but means something to networking people. There's MSG_PEEK, so you can look at the data without actually consuming it, so whenever you do read it later you still get the whole message. So you can just peek at the first few bytes, then in your program you can figure out what you want to do with it, and then another part of it can read the whole message. And then there is this MSG_DONTROUTE thing, which says send data without routing packets. That sounds really weird, since that's what happens in a network: if you send a packet, it gets routed. Why would I tell it not to route? Well, guess what, your router is written in a programming language and it uses a kernel, and if you are a router you want to set don't-route because you are the router and you want to make sure you're forming direct connections. Otherwise, if you didn't have this flag, you could make an infinite routing loop, which isn't good; you would just crash your own router and something bad would happen. So that's why some non-obvious flags are there, because guess what?
Your router probably runs Linux or something like Linux, so it will use something like that. But you'll probably never use these unless you take a networking course. And then there's sendto and recvfrom, and they just allow you to specify addresses with them.

So, too long, didn't read for today: you perform networking through sockets. They're just IPC across physical machines. Aside from the setup, it's nothing we haven't seen before, but it might be more useful to you, which is why we took this brief detour. We also took this brief detour because you have midterms and stuff, so it's nice to have a lecture where you don't actually have to write stuff down or be stressed about it or anything. So the basics are: sockets need a type and an address. They'll be local, IPv4, or IPv6. You can pick whether they're stream or datagram sockets, which are basically TCP and UDP. If you're a server, you bind to an address, you listen on it, and then accept connections. If you're a client, you just connect to it. Then you can send and receive data.

That's all, so we can ask questions or do whatever, or work on lab 3, or leave early; up to you. All right, so just remember, pulling for you. We're on this to