Welcome back to CS 162, everybody. I almost said 262. We are out of Mars and the Upside Down, it appears, because there's actually some non-orange light today, but it's still bad air quality, so that's not great. Let's see what we can do today. Continuing our topics, we're going to be talking a little bit more about the user's view of the system, so that when we really dive into details inside the operating system, you'll have a good clue why we're doing what we're doing. So today we're going to talk about communication between processes. We were talking about how to create processes and how to create threads, but now we're going to talk about communicating between them. We're going to introduce pipes and sockets and TCP/IP connection setup, for web servers for instance. The thing to think about here is that our mental model is going to be: process A on one side of the network talks to process B on the other side, and they use read and write just like the file interface. Okay, the other thing I just wanted everybody to keep in mind here: we talked about creating processes with fork. Fork basically copies the current process, all of its address space; the state of the original process is duplicated in the parent and the child. That's the address space, the file descriptors, etc. What I'm showing here is a brief way to look at this. When fork returns, once the two processes have been created, fork returns in each of them. In one of them it returns something bigger than zero, and that's the parent; in the other one it returns zero, and that's the child. I show you this on this side here: once we've forked, this is the parent, the "if (cpid > 0)" branch, and it's actually executing a wait, which says it's going to pause, go to sleep, until the child exits. The child is this other piece of code, which exits with a 42, which is in this case an error, since in most cases with Unix a return code of zero is what happens when there are no errors.
I saw an interesting question up on Piazza I thought I would say something about. The question is: why fork? I mean, if it's really creating two identical processes, what's the point? And the point is that there are two processes where there was one before. Okay, so fork is basically how you create new processes. This is mostly true, because as I mentioned, Linux has something called clone, which gives you more options than regular fork. But fork was the original mechanism way back in the first versions of Unix, and so its semantics are partially historical. The answer to "why fork" is really that that's the way you get new processes. Last time we talked a lot about the fact that in Unix pretty much everything's a file. Obviously you can talk to files with read and write; you can talk to devices; you can do interprocess communication, which we're going to show today. But that interface is pretty constant. And among other things, it's going to allow this simple composition, find piping into grep piping into word count, etc., that you're getting used to with your programming at user level and that you're going to actually implement when we get around to project two. This particular modality of communication with the kernel is: you open everything before you use it. All of the access control checking is done on open, and if you get something returned, then you know that you were successful in opening. The other important thing is that in Unix the kernel is extraordinarily agnostic. It's agnostic to what the underlying structure of the data is. That means that everything is essentially byte oriented, regardless of whether it's coming off of a disk 4K at a time or off of a keyboard one byte at a time. Now, the question of, if processes are composed of threads, does forking a process fork all the threads? We answered that last time: the answer is no.
So you've got to be very careful: only the thread that actually executed fork is recreated in the child process. The other thing we briefly talked about was the fact that the kernel buffers reads and writes to give you that byte-oriented behavior. It basically takes from the disk, maybe 4K or 8K or 16K at a time, and buffers it internal to the kernel. So then you can read 13 bytes, and then 12 bytes, and then 196 bytes, without having to go to the disk all the time, because that would be extraordinarily inefficient. Writes are also buffered, so when you write, you don't have to wait until it gets pushed out to the disk before it returns back to the user. And then, because we have open before use, we also have an explicit close operation that you typically use when you want to close something out and clean everything up, although the kernel will do that if your process just ends and you haven't closed things. So I wanted to put together kind of a walking pattern for thinking about today's lecture. This is going to be one process, so we're not talking about interprocess yet, but it's a web server, which you've all used a lot. Here we have the standard three layers. We've got the user level, and notice that even a server runs at user level. We have the kernel, which is all of the kernel code providing the glue; the virtual machine and so on is all done in the kernel. And then the hardware, of course, has things like networking and disk. So we could imagine that the server process starts up, and the first thing it does is open some sockets to get ready to listen for incoming requests. We'll get to that in a moment, but notice that the first thing it does is a read, and that read goes to the socket, and it has to take a system call to do that. And the first thing that happens is: wait. Why? Because there's no data yet.
So that server gets put to sleep, or the thread that did this gets put to sleep; the server could be multithreaded, as we'll talk about. It sleeps because there's no data. And notice we've used read, so we're actually going to be communicating with the network in the same way that we did with the file system. Sometime later, data is going to come in remotely over the network; for instance, this might be a request to the web server for reading a certain URL. It'll generate an interrupt, which we haven't talked about yet; it'll copy things into the socket buffer, and then, poof, the wait condition is no longer true, and we're going to be able to wake up, remove ourselves from the kernel, and basically return from read. So we went into the kernel with read, but we stayed there for a while, and then eventually we returned from read with data. So there's a request, and that request, since we're talking about a web server, is likely to need to get something off the disk. So it executes a read on a file descriptor for the disk file system, and now it's going to wait a little bit, because potentially the disk has to be accessed through the device driver, and that may take some time to pull things off the disk. Then the disk interface will eventually hand back the requested data, which again removes the wait condition and returns from the read system call with data. At which point we format the reply, like an HTTP reply, and we go back to our network socket with a write. That again is a syscall boundary, which will send the packet outgoing. Notice that we don't have a wait condition here, because I'm assuming the buffers aren't full and the data just goes out. And of course, after step 12 we're going to just repeat and do another read. We're going to see that a lot in a little bit of the lecture. Today we're going to talk more about this network communication thing here, and how it works. But before we get there.
I did want to point out one thing, which is this: if you recall, we were stalled on our read, both for the network and for the disk, for a little while, and the kernel took all the responses in from the disk and from the network, saved them up, and buffered them, so that we only got returned what we asked for. So the boxes here inside the kernel are slots for bytes or whatever; think of them as a generic queue of some sort. The case of write means essentially that when we write our data, it goes into the kernel and is buffered by the kernel, and we can return immediately back to the server to do another read if we want to. Okay, now, again to remember: we talked about both high- and low-level APIs for file data and also for I/O. Here's an example of the high-level streams, which almost all have an f in front of them, like fopen and fclose and fread and fwrite. When they return, they return a pointer to a FILE data structure. That FILE data structure has inside of it the fact that this was successfully opened and, if it was successfully opened, also the information required to do the reads or the writes, depending on what you asked for. An error is returned from the operating system, and from the library in this case, as a NULL FILE*. So if your FILE* is zero, NULL, then you know that it failed. And you use this pointer that was returned for all subsequent operations. Data in the high-level streams is buffered in user space in addition to the kernel. Okay, so now here's a question: does the kernel buffer network traffic indefinitely before any data gets returned at all to read? If you don't execute a read and you open a socket and a bunch of data arrives, then what will happen is it'll start filling up in the socket, and eventually that'll fill up, and it'll back up to the sender until the sender stops sending data. And then, as you start reading, it'll pull data out of the network and empty the
socket buffer, and then things will get started again. We'll talk about that later in the term. To contrast this high-level streaming infrastructure, where there's actually buffering at user level, we have the low-level raw interface, which is basically using system calls directly: that's like open, creat, and close. And notice that what returns from open on success is a file descriptor. That file descriptor says which file was opened, but the way it says that is not something you can figure out; it's something the kernel does. It has a table inside mapping file descriptor to file description data structures. So you're going to get back an integer here that you're not going to know what to do with. The one integer value that does matter, for instance, is less than zero: a minus one says this was a failure, and then you've got to check errno. And finally, since the streams are the high level and the system calls open, creat, and close are the low level, they're tightly related to each other: if you take a stream and you run fileno on it, you'll actually get back the internal file descriptor that's part of that stream. Okay, so the flags here say whether you're doing reading or writing to the file; that's what you want to do with it. The permission bits are what other people can do to it.
So the flags are kind of what you want to do locally, and the permissions are what other people can do with it. The question is: does this lead to a vulnerability where other processes could try a random number to access a file they shouldn't? So I'm assuming that what you mean is you randomly choose an integer and then you try to use it in read or write. The point is that all of the access control is done on open, and then the kernel, for your process, puts a pointer in there, a mapping between the file descriptor number and the actual internals of the open file. The best you would get by randomly selecting something is that maybe you'd pick one that was open, but then you already have permission to use it, because it's your process. If you pick something that's not there, there's no way you'll get another person's file, because that mapping between numbers and open file descriptions is actually unique to the process. So random descriptor numbers don't help you. Now, we also talked about the representation of a process inside the kernel. If you look here, the process of course has its address space, which we're going to do a lot with in a couple of weeks. It's got registers for at least one thread; there's always one primary thread in a process, and there could be more. It's got this file descriptor table, which maps numbers to open file descriptions. And notice, by the way, there are always zero, one, and two that are started up when you start a process. We didn't include them here, but we did talk about them last time.
These are standard in, standard out, and standard error, okay. So this descriptor table gives you a redirection, and each open file has a description that lives in internal kernel data structures. So file descriptors are per process; file descriptions are not necessarily. We talked about that last time. For instance, here are processes one and two; perhaps process one is the parent and number two is the child. After fork, you copy the address space, the registers of the thread, and the file descriptor table, which now happens to point to a shared file description. And if you take a look at the end of last lecture, we talked about some of the good and bad consequences of this. And then of course zero, one, and two are typically attached to the terminal, but on the other hand you can redirect them, which is where piping comes into play. Okay, the position variable is how many bytes you've read so far in the file, except you've got to be careful, because this is the position that the kernel knows of. If you're using the streaming interfaces with f in front of them, there's a different buffer inside user space that also keeps track of the position for your reading through fread and fwrite, and these two positions are not necessarily the same. You should take a look at the very end of one of the recent lectures; there was a discussion of that. And actually, the position variable is not how many bytes you've read so far; it's the position of the next thing that's going to be read, so that you can change the position with the various seek operations. So if you were to seek back to 100, read, seek back to 100, read, you could do that over and over again and keep reading the same thing, and the position wouldn't change in that case. Okay, so that's a very quick reminder of things. I just wanted to talk to you about some brief administration.
Of course, homework one is almost due, so hopefully you're making great progress on that. Project one should be in full swing; that's been released. Your groups have been set, and your discussion sections hopefully have been set, so you're all set here. It's time to get moving. Make sure that you figure out how to have your partners meet regularly, okay, because that's important. You should be attending your permanent discussion section, and remember to turn on your camera in Zoom. Discussion attendance is mandatory so your TAs can get to know you, so that's important. The other thing I'm sure you're well aware of is that our first midterm is coming up October 1st, roughly two weeks from Thursday. Sorry, this slide says three weeks from tomorrow; I didn't change that, but it's two weeks from Thursday. Be prepared. The last thing is, again, plan how your group collaborates. We're going to be giving you guys credit for showing us some selfies of all four of you talking in Zoom with your cameras on, but even apart from that, you should consider doing it. Try to meet multiple times a week, because even in real space, not virtual space, when people don't meet regularly, the projects end up failing at the end of the term, and you don't want that. So try to keep your groups moving. Now, we had a couple of questions in the chat here. The question was: since syscalls are expensive, is it possible to pre-request threads and then schedule them at user level? The answer is yes, and we'll talk more about that. I want to give you a brief example toward the end of the lecture where we talk about thread pools, for instance for web services.
So that's a good idea. Now, the selfie, by the way, that I was talking about is a video screen capture from Zoom, because you're supposed to be using your cameras when meeting with your partners as well. You don't have to have video; a screenshot is fine. And then the other question is: will the descriptor have the same value across processes? Only if the file descriptor is shared because you had a parent that executed fork; then the child will have all the same file descriptors. If you open the same file in different processes independently, there's absolutely nothing that says the file descriptors have to be the same, unless you're doing some tricks with dup or dup2, which is something you're going to learn how to use. Okay, so today we're going to talk about communicating between processes. So what if a process, and there are multiple of them, wants to communicate with another one? Why might they want to do that? Well, perhaps they're sharing a task, so both of them are doing something, or perhaps there's a cooperative venture with some security implications. What do I mean by that? Well, clearly, if you have a bunch of threads in a single process, it's easy for them to communicate. But perhaps you don't trust everything that that other code is doing, and so you'd like to have separate processes, but then you want to have them communicate. And this is not uncommon. So the process abstraction is designed to discourage this, right? It's set up to make it hard for one process to mess with another one or with the operating system. That's by design; that's a feature. So we've got to do something special that's agreed upon by both processes. Think of this as punching a hole in the security, but doing so in a way that's okay with the two processes. So we start off with no communication, and then we've got to communicate.
Okay, and we call this interprocess communication, not surprisingly. Now, I just wanted to re-emphasize this, and we're going to talk a lot more about page table mappings and so on in a week or so, but if you remember, there's a page table that does these translation maps for you, and it basically says that process one's code goes through the table and maps to some part of physical space that's different from process two's code. So notice they're using completely different parts of the physical DRAM, and the same for data, heap, and stack, and as a result they can't alter each other's data, right? That's by design; that's part of our protection. So we've got to figure out something else for communication. And if you think about it, we've already talked a lot about something that works, right? We've talked about how you could have a producer, which is a writer, and a consumer, which is a reader, separated in time, communicate. How do we do that? With a file. We already talked a lot about how, when a parent process creates a child process, they share the file descriptor table, and so if you have a file that's been opened for reading and writing and then you produce a child process, the two of you can exchange data through the disk. So that's easy. Can anybody say why this might not be desirable? Yeah: slow. Why slow? Well, you're not really thrashing the disk per se, but it is slow, because what you're saying is that for communication of data that's already in memory, you've got to go out to disk and back. So this doesn't seem particularly desirable, for that reason. But I do want to point out that this idea of writing to some file descriptor and then reading from a file descriptor is our standard Unix I/O mechanism. So whatever we come up with isn't going to be very different from this. Okay. Now, I did see an interesting question in the chat, and this is going to be the first time I tell you this today.
So here's your fact for the day. Does anybody have any idea how many instructions you lose by waiting for a disk to pull data in? Well, it's not a hundred billion, but it is a million. So a million is a good rule of thumb, especially when you have multi-issue processors that are running more than one thing at once. So think at least a million. Going out to disk and back is not good; it's very slow. Now, of course, what we haven't talked about yet is caching inside the kernel, so in reality you could write and read without ever going out to disk. But this interface, by its very nature, tries to push data out to disk, and so I'm basically taking something that ought to be a quick communication through memory and, you know, adding a disk onto it for some goofy reason. So this might not always be desirable, and you may want something else when you don't care about keeping your data persistent. Is there a faster way? Yes, there is. Now, one thing we also won't talk about today is this: you see what I did here, see the red? What I did was, yes, initially it was impossible for process one to talk to process two through memory, because we mapped it that way. But we can also choose to map certain parts of memory so that both of them share it. That's what's red here: both mapped to the same page in memory. And then you can do things like have data structures that are shared; you can have linked lists that are shared, all sorts of cool stuff. So this is pretty uncontrolled, but it is fast, and we'll talk about how to make it work after we've gone through how we can communicate and set this up. So we're not going to get there yet. We're going to need locks; we're going to need a lot of stuff. So before we go to this shared memory model, let's understand a few things. Today's interprocess communication is going to be a little different from this. What else can we do? All right, so disks aren't great.
Well, what if we ask the kernel to help us in other ways, like an in-memory queue? The producer puts stuff on the queue and the consumer takes stuff off, and we'll use system calls for security reasons, so we're not opening up a security hole. And by the way, if you do this shared memory thing, you've got to make sure that you're okay with the other process completely reading and writing the data that you're reading and writing. So you have to do it carefully. But what else could we do? Well, here we go; here's a queue. Notice this is not a disk anymore: process A executes a write system call, which puts stuff in the queue, and process B executes a read system call, which removes things from the queue, and now suddenly we've got communication. Wouldn't that be great? So, some details of what we might want before we figure out how to do this. For instance: when data is written by A, it's held in memory until B reads it. Okay, that sounds good; it's the same interface we use for files. And it's much more efficient, because nothing goes to disk. But we have some questions here, like: how do we set it up? What if A generates data faster than B can consume it, and the queue gets full? Or what if B consumes data faster than A can produce it? Well, then the queue is going to be empty. So what do we want to do for those cases? Well, what if A is generating data too fast? What do we need to do? Anybody have any ideas? How do we tell it to slow down; what might be the simplest thing? Well, not a lock. Yeah, wait, very good. Wait is the key. Okay, so as I'm going to teach you, and you're going to hear over and over again... not a semaphore.
We haven't gotten there yet. What you're going to hear over and over from me in the next couple of weeks is that the way you solve synchronization problems is by waiting. So in this particular instance, what we want is: when process A executes a write system call but the queue's full, we want A to go to sleep. And if B tries to execute a read system call and the queue is empty, we want B to go to sleep. Right, and the important part is that once memory space becomes available, if A is asleep, we want to wake it up and finish the write system call; and furthermore, once there's data in the queue, if B is asleep, we want to wake it up and return from read. Now, the question here is: why wait rather than a lock? Well, the answer is that locking is all about waiting. So this is a type of locking, but it's a type of locking that's particularly convenient when we're doing writes and reads through an API into the kernel, because the kernel can put those threads to sleep and wake them up again when it's time. And deadlock here? Well, there's no deadlock here, because there are no cycles. You might be saying: is there a livelock issue here, where B gets put to sleep and is never woken up? That's a bug, because process A has refused to put any data in there. And in fact, what you can do is set up reads and writes to time out after a certain amount of time if they're not satisfied. So it's not possible for process A to screw process B up by simply never writing anything. Yeah, if there's a cycle, that's a different problem; let's hold on to that for now. All right, so here's the first thing that looks exactly like the queue I wanted to talk about, which is the Unix pipe. It's also part of POSIX, and it's essentially just a queue; we call it a pipe. Process A writes to the pipe and process B reads from it.
They use the same read and write interface we've talked about before, and now we've got communication across process boundaries. The memory buffer here is going to be finite. Why? Because memory is finite. And if producer A tries to write when the buffer is full, it blocks; it's put to sleep until there's space. And if consumer B tries to read when the buffer is empty, it blocks, which means it's put to sleep until there's data. So this has exactly the semantics we wanted. And there's a system call called pipe, which you will become very familiar with soon, which looks like this: you call pipe and you give it a pointer to a two-element array that can store two integers. And why is that? Well, we need a file descriptor for both ends of the pipe, the input end and the output end. So what this pipe call does is create a pipe, open up the two ends, and return two file descriptors. When you write on the input end, it goes into the pipe, and when you read from the output end, it comes out of the pipe. Okay. Now, the question about how we know if there's data in the pipe: can somebody answer that? Do we have to monitor it, or do we have to poll it every now and then to check? Hamming codes? Nope, no Hamming codes. So how do we know? There you go, perfect; we had a great answer there. If process B is reading and there's no data, it goes to sleep. The kernel knows when process A writes, because process A wrote. The kernel knows this. The pipe is not a separate process; the pipe is just memory in kernel space. And so when process A goes to write, the kernel, as part of putting the data into the pipe, checks and sees that, well, there's a read waiting, so it just wakes it up.
So because this is all running inside of the kernel, the kernel knows. The kernel knows when process A writes whether B needs to be woken up, and it knows when B reads whether A needs to be woken up. And that's purely an advantage of being an internal kernel interface. Questions? So the pipe is not a process; the pipe is just a queue inside of kernel memory whose interfaces are the system calls read and write. This is not necessarily standard in and standard out in general; you can do anything you want. So you yourself could create a pipe with new file descriptors that aren't zero, one, or two. Are there other examples of processes besides read and write? I'm not sure I understand the question. Processes do all sorts of stuff, but reads and writes are the way we do communication, either with other processes or with the file system. So you get two new file descriptors, exactly: this is an array of two file descriptors. And I'm going to show you an example here. So here's an example where we make an array of integers that's got two slots in it.
That's this int pipe_fd[2]. And then I call the pipe system call: I say pipe, I give it the pointer to that array, and if what comes back is a minus one, then that's a failure. That's a pretty standard idea in Unix, and we report the failure and return. Otherwise we succeeded, and now we have two file descriptors for the two ends of the pipe: pipe_fd[1] is the write end and pipe_fd[0] is the read end. And you should do a man on pipe, by the way, to see the interface. So all we have to do, for instance, is: if I have a message, which is "message in a pipe", I write that into the pipe, and I have an extra plus one here after strlen; that lets me make sure I write not only the message but also the null at the end. So it writes to the pipe, and then I immediately read from the pipe and I just get the data back. Now, why are there two close calls? Because there are two file descriptors open, a write end and a read end. Oh, by the way, why does it say pipe_fd[0] there? Yes, that's a bug. Hold on. Sorry, my mistake. So, sorry about that. Now, as we're continuing, let's take a look at what else we can do: let's do pipes between processes. The question here about where the data is: it's buffered in kernel space, yes. Because we're using system calls like write and read, we're going from user space into the kernel to access that pipe, and so the buffering is entirely in the kernel. All right, now, so far this is only one process: it creates the pipe and then it uses it. So this code example is a little goofy, because the process writes into the pipe and then immediately reads from the same pipe. There are not two processes yet. Hold on just a sec; we're getting to that example. And how do we get to that example? Well, we execute pipe, which gives us two file descriptors. There it is.
So the first file descriptor is the read end and the second one is the write end. And then when we do fork, poof: now we've got a parent process and a child process that are sharing a pipe. So if you notice, what I did earlier, which I said was a little goofy, was writing into the write end and reading from the read end in the same process. I actually have options here now: both processes can use the pipe; the parent can read and write it, and the child can read and write it. But that's a little goofy, right? So what we typically do is the following: we create the pipe and then we fork, which I already kind of showed you in this picture, but now, depending on what we're doing, we close one file descriptor in one process and the other one in the other process. So for instance here, if pid is not zero, then we are the parent (it really should say pid greater than zero, sorry about that); we are the parent, and in that case we write to the write end, which is number one, and we close the read end. Whereas in the child, we read from the read end but close the write end. Now, the question here of whether we can use the heap for the pipe: the answer is that the kernel's got the pipe, so you don't have any control over where it is. Now, maybe you're asking the question of where this is,
array with two file descriptors in it? Certainly you could use the heap for that if you wanted, although it's probably not necessary, because you're probably going to create a pipe in some place and then use it right away. But you could certainly put the two file descriptors in the heap if you like. Okay, and if you wanted two-way communication, you don't have to have two pipes, but then the communication would get interleaved; you could certainly create two pipes, one for each direction. And we'll get to what happens with closure in a moment here. So the answer to the question on the chat is: if you have a file descriptor table entry and there's anybody still pointing to it, then it stays open. So writing to the read end and reading from the write end is not guaranteed to do anything useful. Okay, so here in graphics I wanted to show you: we're making a channel from parent to child. We've already done fork, as you can see here; we did pipe and fork. So what we're going to do is close three on the parent side, because we're not going to read at the parent side, and we're going to close four over here on the child side. And now that we're done, we have the ability for the parent to write into the pipe, and then it gets read by the child, and so now we can send a stream of data from parent to child. But we could do the opposite: here we could close four on the parent side and close three on the child side, and now the child can send data to the parent process. And as was asked earlier, could we make two pipes? Certainly: we could make a pipe to go from parent to child and one from child to parent, and they would be separate from each other because they'd have separate queues. Okay, how do you get end of file in a pipe? So, you know, think about this for a moment: a pipe is just a queue in memory.
So what does end of file mean? What it means is that there's going to be no more data coming. And so what happens is: after the last write descriptor is closed, the pipe is effectively closed on that side, and a read returns EOF. After the last read descriptor is closed, if a write tries to write, it'll get the so-called SIGPIPE signal (we talked about signals last time), and if the process ignores that SIGPIPE signal, then the write will fail with an EPIPE error. Okay, so you could either capture the SIGPIPE signal or you could get an error back from write; those are a couple of options. And so in this instance here, we close file descriptor four. So now that pipe is hanging; we're not going to garbage collect the pipe yet, because there's still a file descriptor pointing at it, but what you can see here is that the only thing process two is going to get out of reading that pipe is EOF, end of file. All right, now, once we have communication, we need a protocol. A protocol is an agreement on how to communicate. So in the case of that pipe: yeah, we can send a stream of bytes from parent to child, but how does the child interpret that? Well, we may need to decide to put them into packets. There are some system calls like sendmsg and recvmsg you could use for that, or you could packetize it yourself and say, well, I'm going to send you a stream of bytes where the first one says how many bytes are in the data structure, and then I put that number of bytes. But that's starting to become a protocol, where there's an agreement for how the bytes are formatted in the channel. Okay, and we're not going to go into this much today, but just to get you thinking here: you've got a syntax to that protocol, which says how the bytes are structured together (you know, this byte is always followed by those bytes, whatever), and then semantics of what that means. Oftentimes you can describe this by a state machine. So protocols can get pretty sophisticated.
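The pipe-then-fork pattern, plus the EOF and SIGPIPE rules above, can be sketched like this; both function names are mine, and this is an illustration rather than the lecture's actual code:

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* The typical pattern: pipe, then fork, then each side closes the end
 * it does not use. The child loops reading until read() returns 0:
 * EOF, which only happens once the last write descriptor is closed. */
int parent_to_child(const char *msg) {
    int fds[2];
    if (pipe(fds) == -1) return -1;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {                  /* child: the reader */
        close(fds[1]);               /* close the unused write end */
        char buf[64];
        ssize_t n;
        while ((n = read(fds[0], buf, sizeof(buf) - 1)) > 0) {
            buf[n] = '\0';
            printf("child got: %s\n", buf);
        }                            /* n == 0 means EOF: writer closed */
        close(fds[0]);
        _exit(0);
    }
    close(fds[0]);                   /* parent: close the unused read end */
    write(fds[1], msg, strlen(msg));
    close(fds[1]);                   /* last write end closed -> child sees EOF */
    waitpid(pid, NULL, 0);           /* wait for the child to exit */
    return 0;
}

/* If the last read descriptor is closed instead, a write raises
 * SIGPIPE; with the signal ignored, write() fails with EPIPE. */
int write_after_reader_gone(void) {
    int fds[2];
    if (pipe(fds) == -1) return -1;
    close(fds[0]);                   /* no readers left */
    signal(SIGPIPE, SIG_IGN);        /* so write returns an error instead of killing us */
    int got_epipe = (write(fds[1], "x", 1) == -1 && errno == EPIPE);
    close(fds[1]);
    return got_epipe;
}
```

Note that if the parent forgot to close its copy of the read end, the child would never see EOF: the kernel only reports end of file once every write descriptor is gone.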
We're going to talk about the TCP/IP protocol later in the term. And another thing which we're not going to talk about today, but also later in the term, is the fact that across the network you may need to even translate from one machine representation to another. So if you remember the big-endian, little-endian discussion from 61C: it could be that when you send a message from one process to another, that other process looks at integers a different way, and you need to reformat the messages to match what they are at the other end. Now, the question here about whether you can use higher-level constructs like fopen and fread on pipes: what you do in that instance is you create the pipe and then you can wrap a FILE* around it. There are calls to do that, called fdopen for instance. This is not quite the same thing, but it's similar maybe to carriage-return line endings being replaced by line-feed endings under some circumstances, perhaps. Okay, and yes, this is decoding and encoding, but it needs to be agreed upon via a standard, and that whole idea of what the standard for encoding and decoding is gets pretty interesting. But later; okay, we'll talk about it. And by the way, another word you might be aware of is serialization; you probably talked about that in some of your classes, like 61B. Okay, so some examples. Yes, people are mentioning things like UTF-8 and UTF-16 and so on; that's also part of it. Here's a simple protocol: the telephone. You pick up the phone, you listen for a dial tone or see that you have service. Not too many of you probably even know what a dial tone is anymore; maybe you do. But then you dial the number, you hear ringing, and then all of a sudden on the other side you hear "hello." And then you say "hi, it's John," or my favorite is "hi, it's me." It's like, well, how do you know who it is?
But you might say something like, "how do you think, blah, blah, blah," and the other side says, "yeah, blah, blah, blah." Maybe you wait a little bit to think, and you say goodbye; they say goodbye, and you hang up. And this is actually a protocol, where the expectation after the ringing is that somebody at the other side says hello. It's always a little crazy when you get a spam call and they don't. And then that hello leads to the initiator of the call saying what it's about, which then gets a response back, which eventually causes a closing of the channel: saying goodbye, and then you hang up. And these round trips here are very similar to what happens with TCP/IP, with the FIN messages and so on. So the protocol we're going to talk about for the rest of today's lecture is this web server request-reply protocol. There's a communication channel of some sort in the middle here that we need to figure out how to discuss, but the client might send a request, over the network say, and then the web server would give you a reply. And there's a very carefully constructed protocol here. This communication from the client to the web server is certainly going to be running TCP/IP, but there's more to it, because you've got to satisfy HTTP. So there's actually some standard protocol, with the headers and so on. All right, so this idea of cross-network IPC is an interesting one, because potentially you could have one server serving a whole large number of clients. And many clients accessing a common server starts yielding some interesting questions, like: how does the server keep track of the clients? So how would the server keep track of the clients? Anybody have any idea there? Okay, so maybe every client has a different IP address.
Well, if you're anything like myself, when you use a web browser (Firefox, Chrome, whatever your favorite is), notice that there may be a bunch of tabs, or there may be a bunch of pieces inside, and in that case there may be many clients that the server is interacting with that are all at the same IP address. So then what? Okay, I see a lot of "sockets" and "cookies"; "IP address plus MAC address," no, that's not going to help you; "sequence numbers." Okay, so I'm going to try to answer this question. Oh, I saw somebody say "port." That's exactly right. So each unique communication, which we're going to talk about here, has both an IP address and a port on each side, and a protocol, and as a result each communication channel is unique. And so the unique ID is going to be a five-tuple that we're going to talk about in a moment. Okay, so let's make sure we understand, first of all, what we mean by client and server. A client is somebody that asks for service from a remote server, and clients are sometimes on: you turn your computers off, you turn your cell phones off sometimes. But it's the thing that typically initiates a contact, like "here's a GET over HTTP for an index.html." A server, on the other hand, is typically always on, up on some well-known address that can be accessed by a client. It doesn't typically initiate contact with clients, but it needs a fixed, well-known address and port in order to be findable by clients. And of course, you make your request and you get some response back. Okay. Now, what's a network connection? Let's be really basic. For this lecture, it's a bidirectional stream of bytes between two processes that might actually be on different machines. For now, we're going to be discussing TCP/IP, which is basically the transmission control protocol that's used across the network and does error recovery. And so it's a unique stream of information.
Okay. Abstractly, a connection between two endpoints A and B has a queue going in both directions: there's a queue for data sent from A to B and one from B to A, which is just like we were talking about with pipes, except that this is potentially across the network. It could be on the same machine, might be in the same building, could be on different continents. It could even be, I suppose, between here and the moon and back, if there's somebody up there. So we need something to help us with this, and the socket abstraction is this idea of an endpoint for communication. And the key idea here is that communication across the world, once again, is going to look like file I/O, with reads and writes. Okay, so here we go. One process is going to, for instance, do a write, and that's going to go into a data structure we'll get to, called the socket here, which is going to cause the communication to go across the network to another queue, at which point process B can read from that other end of the socket. And we get communication, and because we're going to be using TCP/IP, we don't have to worry about errors in the middle here or anything. Okay, the difference between a port and a socket: a port is describing a unique communication; the socket is a data structure including a queue. Hopefully you'll see the difference in a moment; if you don't by the end of the lecture, make sure to ask again. Now, just as we were talking about with pipes, if we go to read on one side and there's no data, that process gets put to sleep until the data shows up. Okay, so sockets are endpoints for communication; they're queues to hold results. Two sockets connecting over the network give you interprocess communication over the network. And this sounds great, but now you've got to start asking questions like: how do you open this? What's the namespace? How are they connected? Okay? So, there are lots of different types of sockets.
It's true, but not all pipes are sockets. There are ways to get things like pipes that don't have sockets internally, and there are also ways of connecting sockets internally that act like pipes. For now: the native pipe implementation is actually not the socket implementation on a lot of Unix distributions. Okay, so we need to figure out how to connect all of this. All right, so what are some more details? The first detail is that sockets are pretty ubiquitous. What I said about POSIX not being ubiquitous everywhere is not true about sockets: sockets are pretty much implemented on almost any operating system that wants to communicate over the network. You pick it, it's got it. It was standardized by POSIX, but this is a part of the standard that is always there. Okay, the thing you ought to know about, which is fun, is that sockets came from Berkeley. The Berkeley Software Distribution Unix version 4.2 was the one that first introduced sockets. All right, definitely go bears on this. This release had a whole bunch of benefits to it, and a lot of excitement from potential users. In fact, people that were there at the time it was released have told me stories about how there were runners waiting to get the tapes that had the latest release on them, so that they could quickly take them to where they were going to be uploaded to their computers and run. So Berkeley 4.2 BSD had a lot of buzz. Hashtag go bears. And the same abstraction works for any type of network, so you can be local within the same machine: as I mentioned before, you could imagine two sockets being connected inside a machine using the socket libraries in the kernel, and it would look like a pipe. But what I said earlier is that not all pipe implementations use sockets, because a pipe is a simpler interface. For the internet, you can go across with TCP/IP and UDP/IP, and at the time of 4.2 BSD,
there were a whole bunch of other networking protocols. So TCP/IP and UDP/IP were not the only ones: there was AppleTalk and IPX and a whole bunch of native ones, some of which still live in deep recesses of the network. Okay, now; yeah, there are 162 participants in our class right now, that is pretty funny. So, more details on sockets. A socket looks like a file, with a file descriptor. So once again, our standard idea that all I/O looks like reads and writes to files is going to be true with sockets: write adds output, read removes input. Okay, now, since this is I/O, there's no notion of lseek. So there's an example of a part of the standard file interface that just doesn't make any sense for sockets. It also doesn't make any sense for pipes. Now, how can we use sockets to support real applications? Well, a byte stream by itself is not necessarily useful. A bidirectional byte stream has no boundaries to messages; it doesn't necessarily have any interpretation. So we already talked about this: you need to add syntax and semantics, and you possibly need to have a serialization mechanism. Okay, and we will talk at another time, later, about RPC facilities and so on. Now, there was a question about Kafka, which is a different thing; we'll talk about that toward the end of the term. Okay, so there is no notion of append here, by the way, because there's no notion of seek. When you write, it just goes to the end of the socket. So sockets keep things in order, just as TCP/IP keeps the stream in order. There's no append in this instance because you can't seek; or, the other way to say that is, every write is an append. Okay. So let's dive right in with a simple example here. I'm going to build what we're going to call a web server, but it isn't really doing HTTP.
So this is a little bit of a misnomer, but let's suppose that the client sends a message and the server echoes it. That's it: the client sends it, the server echoes it. Okay, so it's an echo server. And what might that look like? Well, here I have an example of the network: you could say the left side is, I don't know, Berkeley, and the right side is Beijing, and we've set up a socket between the two. Now, what that means is that the green boxes, the two of them on the left, are part of the same socket; they're just the two queues going in either direction. And the two green boxes on the right are part of the socket on the server side. Okay, so we have two boxes for the client and two for the server, and that's because it's bidirectional when you set something up. Okay, now, the first thing, which I kind of indicated already: the server is going to set up these sockets, which we don't have any idea how to do quite yet, but what it's going to do is immediately do a read. And of course the socket is empty on the read side, and so all that's going to happen is it's going to enter the kernel, and it's going to wait. Okay, and we saw that earlier, when I showed you the web server example at the very beginning of the lecture: what happened is you did a read, and if there wasn't any data, you went to wait. Okay, meanwhile a client comes along, and it's going to set up an echo. So one of the things that we need to do is figure out from the user what they want to send, and so maybe we do an fgets from standard in. Okay, so this is a streaming input, which will wait until you hit a carriage return, and then it's going to send the data over the socket by writing it.
Okay, so it's a write system call to the socket file descriptor; here's our buffer, and notice, because I say strlen of the send buffer plus one, I'm sending the null character at the end of the string in addition to the string. These are things to start thinking about as you get comfortable with C. And meanwhile, that write can go on right away, without the data actually going out, because, as you remember, writes are buffered in the kernel. And so, yes, the socket's going to try to send it, but we return from the write almost immediately, at which point we go and try to read, to wait for the response. And of course we go to sleep, because there's nothing in the read side on the client of the socket, just like there's nothing on the read side for the server. Okay, meanwhile, back at the fort, the write gets sent out across the network to the other side, at which point the data wakes up the reading process: the server process wakes up. It might print the thing on the local console, okay, and then it also writes the echo back, at which point it gets sent across the network. It wakes up the client, which maybe prints something on the screen, and then of course we can loop back and do it over and over again. Okay, and now we have an echo server. So the fgets, just to be clear here, is only asking the user to type in the string that they want to send; it's the write that actually sends it across the network. Okay. So what this means here: the green boxes are the socket pieces inside the kernel; these white boxes are representing code, places where you're interacting with the kernel. So mostly, on either side, the client and the server are user level; the green boxes are in the kernel, and occasionally, when I do a write or a read, I enter one of the white boxes.
Okay, and potentially I have to wait. All right, now, you can try to force the kernel to send the data, and there are ways to do that with flush, but by and large it'll just send it right away, so you're not too worried about that. Okay. Now, this is not four sockets; this is only two sockets. A socket is a double-sided endpoint for communication. So the two greens on the left are the client's socket, and the two greens on the right are the socket on the server. Okay, so this is only two sockets, and each side has two queues; that's why there are four green boxes. Okay. Now let's look at this in code a little bit. So, the client code: what you see here is that we have to get ourselves a buffer which has some maximum input size. A socket is bidirectional because there are two queues inside of it; a pipe only has one queue, so it's unidirectional, yes. If you look here, we had to get ourselves some character buffers, and the max-in and max-out sizes are defined somewhere else in this file, and then we loop over and over again. We basically grab the send buffer; oh, I guess I temporarily broke this code, I apologize, but forget the while condition, assume this is while-true. We basically grab the send buffer from the user, and then we write it out. Okay, and then we clear out the receive buffer, and we read into it, and then we just keep looping. And the same on the server side: we read the data, we write it to standard out, and we echo it back. Okay, and so what happens is: our write goes across the network and wakes up a read, and then the write on the other side goes across the network and wakes up a read, and this repeats over and over again. Yes, it looks like DNA. It's true. Now, what assumptions are we making here? So one of the things is we have no error-correction code for what happens if the packet's lost.
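As a rough sketch of this echo exchange, here is the same idea using socketpair(), a connected pair of sockets on one machine standing in for the network; the function name echo_roundtrip is mine, and a real version would use separate client and server programs:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/* Echo exchange over a connected pair of sockets: the child plays the
 * server (read, then write back); the parent plays the client (write,
 * then read the echo into out). The interface is the same reads and
 * writes we would use across a real network. */
int echo_roundtrip(const char *msg, char *out, size_t outsize) {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) return -1;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {                        /* server side */
        close(sv[0]);
        char buf[128];
        ssize_t n = read(sv[1], buf, sizeof(buf));   /* sleeps until data arrives */
        if (n > 0) write(sv[1], buf, n);             /* echo it back */
        close(sv[1]);
        _exit(0);
    }
    close(sv[1]);                          /* client side */
    write(sv[0], msg, strlen(msg) + 1);    /* +1: send the NUL too */
    ssize_t n = read(sv[0], out, outsize); /* sleeps until the echo arrives */
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return (int)n;
}
```

Each read here blocks exactly as described in the lecture: the server's read sleeps until the client's write arrives, and the client's second read sleeps until the echo comes back.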
Okay, because we're assuming that if you write, the data gets read back. So with a file, unless your disk is full, the assumption is always that when you write to it, you can read it back. When you write to a TCP socket, the assumption is that the read on the other side happens. Okay, it's like pipes: if you put it in, it'll come out on the other side. Okay, let's hold off on the chatter on the chat for now. And the other thing that's important is the assumption that we have an in-order, sequential stream. So when I put data from multiple writes into the input side of a socket, on the opposite side it'll come out in exactly the same order, not a different order. That's a property of the TCP/IP protocol: every byte that goes in comes out in the same order, and comes out only once. Okay, so that's a really nice semantic, and it's why everybody loves TCP/IP. There are some disadvantages to TCP/IP, but this is a pretty big advantage. Okay, and so when the data is ready on the other side, what happens? Well, the read gets whatever's there at the time. Okay, so this is why, to do a real version of this, you need to come up with a protocol that says: maybe the first thing I write is the number of bytes to expect, and then I write those bytes; and then on the other side, I read the number of bytes I'm expecting, and I keep looping with read until I get that number of bytes. So to really do this correctly, you need to have a protocol that you've defined that lets you do things like message boundaries. Okay, but for now, we're not worrying so much about this. We're also assuming that we block if nothing's arrived yet, just like pipes. Okay, so TCP/IP plus sockets is very much like a bidirectional pipe that goes across the globe.
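The length-prefix idea just described (write a byte count first, then the bytes, and loop on read until you have them all) might be sketched like this; the function names are mine, and sending the length in network byte order with htonl/ntohl also illustrates the endianness translation mentioned earlier:

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <unistd.h>

/* read() may return fewer bytes than asked for, so loop until we have
 * exactly n bytes (or EOF/error). */
static int read_exact(int fd, void *buf, size_t n) {
    char *p = buf;
    while (n > 0) {
        ssize_t got = read(fd, p, n);
        if (got <= 0) return -1;       /* EOF or error mid-message */
        p += got;
        n -= (size_t)got;
    }
    return 0;
}

/* A minimal framing protocol: first a 4-byte length in network byte
 * order, then that many bytes of payload. */
int send_msg(int fd, const void *data, uint32_t len) {
    uint32_t wire_len = htonl(len);    /* big-endian on the wire */
    if (write(fd, &wire_len, sizeof(wire_len)) != sizeof(wire_len)) return -1;
    return write(fd, data, len) == (ssize_t)len ? 0 : -1;
}

int recv_msg(int fd, void *buf, uint32_t maxlen, uint32_t *len_out) {
    uint32_t wire_len;
    if (read_exact(fd, &wire_len, sizeof(wire_len)) == -1) return -1;
    uint32_t len = ntohl(wire_len);
    if (len > maxlen) return -1;       /* message too big for caller's buffer */
    if (read_exact(fd, buf, len) == -1) return -1;
    *len_out = len;
    return 0;
}
```

Because TCP delivers bytes in order and exactly once, this framing is enough to recover message boundaries; the same functions work unchanged over a pipe, a socketpair, or a network socket.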
Okay, it's a very simple pipe to two ends of the planet, which is pretty nice; or a pipe on the same machine, or a pipe to different machines in the same building. They all act with exactly the same interface. Okay. All right, now: socket creation. Here's what we might be interested in. So, for instance, file systems provide a collection of permanent objects in a structured namespace. All right, so if you think about it, the whole point of the file system is that I can name a file so that I can open it; you know, /home/kubi/classes/cs162, whatever. There is a namespace. The problem with sockets is: what's the namespace? Okay, so files exist independently of processes, and it's very easy to name a file with open. But when you start talking about sockets, sockets are kind of, by their very nature, transient, and really only functional when they're connected. Okay, so pipes partially get us there, right? A pipe is one-way communication between processes on the same physical machine; it's got a single queue; it's created transiently by pipe, and it's passed from parent to child in a way that allows us to share between two processes. And notice that in that instance there isn't any namespace per se, but rather we called pipe, and the fact that the file descriptors are now shared is how we end up with the connection between the two processes. Okay. The reason a pipe is unidirectional is that, although the two processes each have a pointer to the write end and the read end, if they both try to write, the data will get interleaved, and so I don't consider that bidirectional, because you can't have two clean communications; you get one garbled, combined communication. So that's why you always end up closing one channel or the other, and if you really want bidirectional communication with pipes, as I said earlier in the lecture, you create two pipes.
Okay, now, sockets have this problem that we're not on the same kernel; so, you know, that's a little bit of a problem, and we need to somehow address something all the way across the planet. How do we do that? Well, a socket does have the two queues, for communication in each direction, and the processes can be on separate machines, so there's no common ancestor to pass something from one to the other. In fact, we could be here in Berkeley and in Beijing, or pick your favorite other place, and, well, how do we name it? There's certainly no common ancestor of those processes, right? So what are we going to do? Well, the namespace, of course, is IP. You're all very familiar with this namespace. So, for instance, the host name, like www.eecs.berkeley.edu, is an example of a name that can be uniquely identified across the network and used to route traffic to it. Now, of course, we're going to have to talk about things like DNS and so on later in the term, but that host name really translates directly to IP. Okay, and so what is IP? Well, IP addresses, depending on whether you have IPv4 or IPv6, are 32- or 128-bit integers. And so, for instance, www.eecs.berkeley.edu would translate into some IP address, which now would allow us to actually communicate across the network. But as I mentioned earlier, the IP address is not enough: if you have a browser with a bunch of tabs in it, each one of those tabs has the same IP address, because there's only one machine. And so you need a way to uniquely name a connection, and that's where ports come into play. Ports are part of the TCP/IP and UDP/IP specs. They're 16 bits,
so there are only really 65,536 of them, and the first 1024 are called well-known. And the well-known ports are ones that are much harder for you to bind anything to; in fact, you're going to need to be superuser to use them. There are some ports between 1024 and 49,151 which are typically registered ports. For instance, 25565 happens to be the port for a Minecraft server; that's an important one for you all to remember. And then there's a bunch of dynamic or private ports, and you'll see in a moment what they're about. Okay. So if we look at a connection setup over TCP/IP, we're going to need something that's special here. The server needs to set up the process of waiting for a client to connect, and that's called a server socket. So the server basically produces a server socket, okay, and that server socket listens, typically on well-known ports that have been registered with a standardization agency. And you can register them, but it's very hard to get the ports in that lower 1024 registered; typically people have ports that are just well known in the higher portions. Okay, but now, once the server socket is set up, the client will be able to communicate, because the thing the server does after creating this socket is it says listen, which says: go to sleep, waiting for an incoming connection. Okay; listening, see, that's an ear, by the way. So this client creates its end of the socket and sends a request to the other end by using the IP address and the standard port. And at that point the server executes an accept, which says: well, take this connection and let's make it real enough that we can communicate. So in that instance the server says, "oh, I accept," and the kernel then takes this connection, creates another endpoint, and notice these are both green. At either end of them there's a final connection phase, and now, when you're done, those two ends represent the two ends of a bidirectional socket, and this is TCP/IP.
Yes. Okay, and when you do ping? Ping is different than this. Ping does not set up a connection; ping is the ICMP protocol, which is just a datagram protocol. Okay, so we haven't fully gotten to ports; we've talked about ports a couple of times, but ports are really what's going to make this connection unique. Okay, so let me talk about ports again. Both sides of the socket (let's just look at the green ones) have associated with them a five-tuple, which is the source IP address, the destination IP address, the source port, the destination port, and the protocol, like TCP/IP. Together, those five things mean that this yellow connection is unique from all other connections that we might make between those two IP addresses. Okay, so how does this work? Well, I already mentioned that the client side of this connection is typically in that upper range, above 49,151, of randomly or dynamically assigned port numbers. So when a client first makes this connection, it assigns itself a random port; so now it has its IP address and a random new port for that connection. The server side has its own IP address, which is what I'm remotely connecting to, and its well-known port. So here's a good one: 80 is a port you all ought to know. Okay, that's the typical web server port. And so what this connection did is it went from the client up to the server socket and said, "hi, I'd like to make a connection on port 80," and the server says, "okay, I will make that connection for you," and when you're done, you have two sides. Those are the two green sides of a connected set of sockets; each green thing is a socket on either side. And why is this yellow connection different from any other yellow ones? Well, because at least one of those five things on the left is different. Okay, so what is a port again? A port
is a 16-bit integer that helps define a unique connection. Okay, so each server socket has a particular port that it's bound to; I'll show you this in a moment, but if this server socket were a web server, it would be bound to port 80, and so the incoming connection is asking for that IP address at that port 80, and that's the connection that's being requested. And so all yellow connections for this server socket are going to have the same destination IP address and destination port number, but they're going to have different IP addresses or ports for the client. This is not ping; ping is something completely different. It's kind of like our echo server, but it's a datagram protocol. This is TCP/IP. Okay. Now, the client tells the server what its IP address and port are; the server knows what its own IP address and port are; and so when you're done, you have a unique connection. If the same client, excuse me, wants to make another one, then it needs to come up with a unique port for its side, because otherwise there wouldn't be a unique yellow connection. So in that example of the web browser we talked about, with all the tabs: every tab would have a different local port associated with it, even though we're talking to the same, let's say, remote server, which has the same IP address and is all port 80. All right, I'm going to move on from this, but just keep in mind that every yellow connection has this unique property of a unique five-tuple. Okay, and 80 is a common one: that's web browsing without any security; 443 is the HTTPS protocol; 25 is sendmail; etc. So, the question about localhost:500: yes, that is correct, it says port 500 on the local machine. Good. In fact, you'll sometimes see people that have local servers that they're using for IoT devices, for instance; it'll often be an IP address followed by :8080 or :8000.
That's pretty common. Okay, now, all the server sockets are not operating out of the same port. There's one server socket operating on port 80, and it spawns all the new sockets that communicate on port 80. If you have port 443, or some other port like 25, that will be a different server socket, listening on that port. So there's one server socket for all the port-80 connections, one server socket for all the port-443 connections, and so on. Conceptually, what happens here is that the server creates the server socket — that's the blue one — and part of that creation is binding it to an address, which is its current host IP address plus the port it's going to serve on, like 80. Then it executes listen, which means that at that point we are listening for incoming connections, and then we try to execute accept, which puts us to sleep until somebody actually connects. Later, the client creates a socket and does a connect operation, which says: I want to connect to a remote server with this host IP address and port. Assuming we did this correctly, the request goes across to the server, which is busy listening; the accept system call takes it, the three-way handshake happens, and when we're done we have a connection socket on either side, with a unique five-tuple defining this as a unique connection. Every subsequent client that tries to do this will get a different unique five-tuple, hence a unique connection. Now, once the server is ready, it says, oh, I have a socket.
Okay, I'm going to do a read on the socket. Of course, that's going to go to sleep right away, until the request from the client comes in saying, "I want to look at some HTTP address." Meanwhile, the other side wakes up, writes a response, and that gets sent back. So we do this combined pattern: one side writes the request and waits for the response; the other side reads the request and writes the response. And yes — each connection socket the server owns has a different five-tuple; that is correct. Now, when we're done, we close the client socket, and then the server goes back and does another accept — that's how we serve multiple requests, for now. By the way, there's no race condition here, because incoming connection requests go into the server and are put into a queue using synchronization that we haven't talked about yet. No race conditions. So the client protocol you see here is pretty simple. First we get an addrinfo structure defining the host we're trying to connect to, with a host name and port name — this is the host lookup; I'm not going to show you the details until the end of the lecture, if we get there — but it returns a host/port combination for whoever I'm trying to communicate with. Then I create the socket's file descriptor, so now I have a socket. The client has this file descriptor, which is an integer; the socket structure itself lives inside the kernel. Then I do a connect, which waits until the connection finishes, and when it returns, this socket file descriptor is no longer a disconnected socket — it's a connected socket. Now we can go ahead and do our client operations, whatever they might be — possibly lots of reads and writes, over and over — and then close.
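Here's a minimal sketch of that client-side sequence in C — getaddrinfo, socket, connect. This is my condensation, not the slide code; I use "localhost" so the lookup works without a network, and the connect step will simply fail unless something is actually listening on port 80:

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    struct addrinfo hints, *server;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family   = AF_UNSPEC;    /* IPv4 or IPv6, whatever comes back */
    hints.ai_socktype = SOCK_STREAM;  /* a stream, i.e. TCP */

    /* Look up the host. Note the port is passed as a string ("80"),
       even though a port is really a 16-bit integer. */
    int rv = getaddrinfo("localhost", "80", &hints, &server);
    if (rv != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rv));
        return 1;
    }

    /* The client's handle is just an integer file descriptor;
       the socket structure itself lives inside the kernel. */
    int sockfd = socket(server->ai_family, server->ai_socktype,
                        server->ai_protocol);
    printf("have a socket fd: %d\n", sockfd >= 0);

    /* connect() waits until the three-way handshake completes;
       after that, sockfd is a *connected* socket. */
    if (connect(sockfd, server->ai_addr, server->ai_addrlen) == 0)
        printf("connected\n");        /* now do reads/writes, then close */
    else
        printf("no server listening on port 80 here\n");

    close(sockfd);
    freeaddrinfo(server);
    return 0;
}
```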
Okay. The server side is a similar idea, but it's got this server socket. If you look here, we set up which address family we want, and we bind it, which basically means we say we want to listen on a certain address and port. Binding attaches an address to a socket — creating the socket just makes the queue; binding gives it an address — and then we do listen instead of connect. Now, in this while loop, over and over again, we accept the next connection, process it, close it, and go back to accept another one. And we're good to go. Can anyone see what's wrong with this protocol? What seems unfortunate about this particular server implementation? Yeah — one connection at a time. Right, so this can't be good. So what can we do? Well, first of all, how might we protect ourselves? Because if you notice, right here we're running in the same process over and over again, and we might want to protect ourselves. What we can do in that case is take what I just showed you and add a fork: let the child communicate with the client, do a wait until the child is done, close the connection, and go back. And notice that when we fork, the listen socket ends up on both sides. Of course, the child doesn't need the listen socket, because it's not a server, so it closes the listen socket. On the other hand, the parent doesn't need the connection socket, because it's not serving the client, so it closes the connection socket. This is just like the pipe example I gave you earlier, where we create a pipe, fork, and then each side closes one of the two file descriptors. Okay. We're not serving multiple clients yet — we're just putting protection in here, so that every child runs in a protected environment. We haven't gotten to the multiple yet.
Okay, but you can see we're headed that way. The only thing I did that's a little different here is that once I accepted the incoming connection, I fork. If I'm the child — pid equal to zero — I close the server socket, I serve the client, and then I close the connection socket and exit. Meanwhile, the parent closes the connection socket, because it doesn't communicate with the clients, and it waits. So that's all we changed. But of course we want concurrency — or parallelism, if that's available; at least concurrency would be better — because if we could have multiple client requests going on simultaneously, then when one of them is sleeping because it's doing a disk access, another one could be served. So even if we don't get parallelism, we still want more than one request going on at once, and we've kind of broken that so far. And why do we need protection? Well, in that other process we can limit access to only those things that let it touch a small part of the file system, or whatever, to make sure we're safe. I know we're running low on time — hold on for just a second; we're almost done. The question of why we're closing sockets here: when we fork, we duplicate all of the file descriptors, and on the child side we don't need the server socket that's being listened on, while the parent doesn't need the connection socket, so each side closes what it doesn't use. So here's the simple example where we're not waiting anymore: after we fork, the parent closes the connection socket and goes back and accepts another one immediately. And yes, the child could listen.
All right. So in the child code, you make sure to do all that closing, and then you set up your environment before you start doing the processing. But if you notice here, we close the connection socket and immediately go accept another one. All I did was remove that wait — right here, do you see? I commented out the wait — and now suddenly we have concurrency: multiple requests at once. Now, there's a comment in the chat saying this seems heavyweight. Well, it is, because we're creating a brand-new process every time. Okay, and let's be careful about one thing — I see some chatter in the chat, and I want to make sure we've got it: the server socket is the same the whole time, but we don't use it for communication; we use it for listening. The parent has the server socket, and it's the one doing accept, over and over again, and each time accept comes back, it comes back with a new connection socket. Each child gets a different connection socket. So the parent keeps accepting new connections, and every one of these connections is unique, because it's got a different five-tuple — a different remote IP address and port combination, or maybe just a different port, but something there is unique — and it's got a unique process. Every trip through the loop, every accept, gets a new process for a new child connection. Now, just before we finish up: the server address. One of the ways we set up the address on the server side is to say which port we're interested in, and you may ask, if the port is a 16-bit integer, why is this a char*?
Well, many of these interfaces — you can do man on them — take a char*, which is a string representation of a number. Anyway, what we do here is set up things like what family we're communicating with on this socket. The way to think about families is that when sockets first came out, IP wasn't the only thing out there; there were many other options. What we're basically saying is that we're going to be a stream, which means TCP/IP; we're not going to say which family, because we're going to take whatever comes in; and we're going to bind to a particular server address and port. There's a flip side on the client, which is probably more interesting — you should look at this after the lecture — but if what comes in is a particular host name and port that we're interested in, we can look it up using something called getaddrinfo, which returns a structure — the server structure, an addrinfo — that has all the information about the IP address and port, so that we can then bind for the server socket. Okay. So finally, if we're willing to not have protection on every connection, but instead want something lightweight, we can use threads. Here's an example where, instead of fork, all we do is create a thread: the spawned thread handles the request, and the main thread just goes back and accepts again. So that's a thread per connection, which sounds great — unless you get slashdotted. You could easily have a situation where so much incoming traffic spawns so many threads that you crash your kernel. That's bad. So what should you do? How do you prevent it? Well, it's true you can't fork — but we're not forking here anyway, we're just doing threads — so: limit the number of threads. Great. And the way we do that —
I'm only going to start talking about this briefly today, but the way we limit threads is to create something called a thread pool. The basic idea is that we create a bunch of threads at the beginning — a fixed number — and then every time an incoming request comes in, we put the connection on an incoming queue. When a thread becomes free, it goes back, dequeues the next connection, and handles it. So the thread pool is a way of bounding the number of threads. All right, we're done for today. In conclusion, we've been talking about interprocess communication — how to get communication facilities between different environments, namely different processes. Pipes are an abstraction of a single queue: you can create one in a parent, pass it off to children, and decide which direction you want it to go. Sockets are an abstraction of two queues, potentially across the network: you have two ends, with a read end and a write end on both sides, so you can have two streams that are not interleaved with each other. The socket gives you a single file descriptor that you can both read and write — the same file descriptor. This is different from a pipe: it's one file descriptor that handles both reads and writes, and the direction things go depends on whether you're reading or writing. And you can inherit file descriptors with fork, which is why, in the example we did, when we forked we ended up with all of the sockets on both the child side and the parent side, which meant the child and the parent each had to close off the sockets they weren't using. All right, I think we're good for now. I'm going to call it a night. Thanks for hanging with me, everybody, and we'll see you on Wednesday. Have a good night.