Hello and welcome to the next lecture in the course on introduction to computer and network performance analysis using queuing systems. I am Professor Varsha Apte, a faculty member in the Department of Computer Science and Engineering at IIT Bombay. We will continue our study of closed queuing networks, working towards a very important method called mean value analysis. To recall quickly what we did in the last few lectures: we studied closed queuing systems with just a single server and did a fair amount of analysis on them. We figured out how to relate the throughput and the response time, we did some asymptotic analysis, and we also saw some examples. The main difference from open queuing networks was in how the load on the system is specified: by the number of clients and the mean think time, rather than by an arrival rate. Remember also that we are not bothered about limited buffer sizes; we always assume that the buffer at each server is larger than the number of requests or clients in the system, so we do not consider buffer overflow. Continuing this theme: just as we went from open queuing systems to open queuing networks, the same motivation takes us from closed systems with a single node to closed systems with multiple stations or nodes. For example, just as in a simple closed queuing system, we have a set of users interacting with a server system. But this time the server system may not have just one server station; we may have something like a two-tier web application, with a web server and a DB server. A request that comes to this two-tier web application may have a flow like this: it does some work at the web server, and then there is a database call.
So there is some processing at the database server, and after each database processing step the request always comes back to the web server. This may happen a few times, and at some point, with some probability p, the web server processing of the request finishes and the response goes back to the user in the form of an HTML page. The user then thinks and issues the next request, and the cycle continues, except that now there are multiple servers at the server end rather than just one; we call each of them a server station or node. What this implies is that the number of requests flowing through the system is always fixed. There could be N1 requests at the web server, waiting in the queue or being processed, and N2 requests at the DB server, in the queue or being processed. If we know N1 and N2, then we know that M - N1 - N2 requests are thinking: they are back at the users, who are reading the response, and at the end of the think time they will issue the next request. This is the crucial point: the number of requests circulating in the system remains fixed, and that is precisely why we call these queuing networks closed queuing networks. So this is one example in which closed queuing networks arise and make sense. As I have said before, such a model makes sense when M is small. If there is a lab with 100 students, and those 100 students are interacting with some course management system, that is one example of a small set of users. We go with open queuing networks when M is large and the think time is also large; one can then show that modelling the system as an open queuing system with, say, Poisson arrivals ends up being a fair model.
Let us see the second example. Again imagine the same web server kind of system, but for the moment suppose there is just a web server at the server end, and that this whole thing is the server machine. In this picture we are not modeling the end users who send requests to the machine. The scenario is that there is a fixed number of server threads. Say the web server is configured with 128 threads, a very typical number. Those threads run on this machine, using a little bit of CPU in a burst, and then very commonly needing some IO; they might read something from a file. They issue the IO call, go into IO-wait mode, and queue for their IO request to be fulfilled. This picture represents the queuing system of the threads together with the web requests. All of this is inside the machine: here is the thread pool, and here is the buffer of web requests. Whatever web requests arrive externally from the users queue here; a thread from the pool picks up a request, does a CPU-IO cycle, finishes with probability p, and then goes back to the thread pool to see whether any other request is waiting to be picked up. If there is a request in the queue, it picks it up, comes back, and the cycle continues. This is a typical server cycle. Now let us focus on what happens at very high load. What do I mean by high load? There are a lot of users and they are issuing requests very frequently, so the rate of arrivals into this system is high.
So high, in fact, that we assume this request queue is never going to be empty. There is always going to be some request here waiting to be served, and the 128 threads are always busy: as soon as a thread finishes fulfilling one request, there is something in the queue for it to pick up, and the thread immediately comes back into the system. Let me change the colour to illustrate what happens at high load. A thread picks up a request and immediately enters the system; the first thing a request always needs is the CPU. It goes to the CPU, then to IO, circulates many times as usual, and when it is done there is already a next request sitting in the queue. Picking up a request takes hardly any time, so the thread just picks up the next one, enters the system again, and repeats the same CPU-IO cycle before exiting here. A key assumption to note: if all these requests are statistically similar, that is, they have the same mean service time, the same branching probability p of finishing, the same variance, the same service time distribution, and the same probability of wanting IO versus finishing after getting the CPU, then it is as if the same request has come back into the system.
So this is what we assume: we remove the representation of the thread pool node and its queuing system, and we think of this as a closed queuing network in which the 128 threads circulate forever between the CPU and IO. Even when the finishing branch with probability p is taken, the thread just comes directly back into the CPU queue. This branch actually represents the completion of a request, but instead of putting the thread back into the thread pool and having it pick up the next request, we ignore all of that: we say that time is negligible, and it is as if the same thread has just come back to the CPU. This makes sense because the time to pick up a request is negligible, and the next request the thread works on is statistically similar: it has the same tau 1 here and the same tau 2 here, so it does not really matter. These, then, are the two scenarios in which closed queuing networks are used as models. Let us list the typical parameters of closed queuing networks. M is the number of clients, if there is a client node. A client node is not always present: if you are modeling the users explicitly then it is, otherwise you have the kind of closed queuing network where requests just circulate without the users being modeled explicitly. If there is a client node, then there is also a think time. As usual we have the number of queuing stations n; for example, here n is equal to 3. We have the average service time per visit tau i, with mu i being nothing but 1 over tau i. Closed queuing networks can also be defined with multiple servers at each node.
So ci represents the number of servers at station i. Then we have vi, which is similar to the visit count in open queuing networks, with one very big difference: in closed queuing networks, visit counts are relative. I will explain this a little more in the next few slides, but what vi denotes in a closed queuing network is the average number of visits to station i relative to the visits made to some station j. Once we have such a relative average visit count, we can define the service demand in the same way as for open queuing networks. So what are these relative visit counts? There is no notion of exit from a closed queuing network: requests just circulate inside it. If you draw a boundary around the queuing network, they never leave; it is a closed system, and things keep revolving within it, so there is no exit. But if there is no exit, how can we define an expected number of visits until exit? That is what the visit count was in an open queuing network: the expected number of visits a request makes to station i from the point it enters to the point it leaves. It is difficult to define this for a closed queuing network. What we do instead, as I said, is talk about the number of visits made to server Si relative to the number of visits made to some server Sj. Now which server do we take as Sj? In some queuing networks the choice is very obvious; for example, in queuing networks with an explicit client node it is very intuitive. Here is the one with the explicit client node, and here is the CPU-IO kind of situation. Let us take the explicit client node first.
In this example it is quite intuitive; "relative visit counts" almost seems like an unnecessary term here. This is the entry and this is the exit, at least from the server part of the network. We call the whole thing closed because we include the clients, but there is still a very clear notion, a very clear boundary in some sense, of entering the server side and leaving the server side. So, keeping the notion of relative visit counts, we can simply say that V1 is the number of visits made to S1 relative to one visit to the client node. We make everything relative to one visit to the client node, and then it all falls into place: V2 is likewise the number of visits made to server S2 per visit to the client node. When you exit from here and enter here, you are making one visit to the client node, so it all becomes intuitive for the explicit client node case. For the CPU-IO fixed-threads example it is not so straightforward, because there is no client node. Strictly speaking, the number of visits a request makes to the CPU is actually infinite: we have forgotten the notion of a request finishing, and we are making the request circulate through the CPU and IO forever. There is no explicit point at which the request is done and goes back to a client node. However, we know that this branch is actually an exit: there was a thread pool and all of that system here which we chose not to model, but we know this is the exit. Knowing this, we can calculate the average number of visits here, treating this as the exit and this as the entry, and this can be done with some mathematical setup that we will see when we do some examples.
Just as a peek into that: based on the structure of the queuing network here, we can calculate that V CPU is actually going to be 1/p. We can use this 1/p as a given and then calculate V IO relative to V CPU. On the other hand, we can also just start with some given V CPU and some given V IO, assuming these have been measured in the system directly; in fact, that is what we are going to do in the rest of the analysis: we will assume that V CPU and V IO are simply given to us. The only thing I want to point out is that neither V CPU nor V IO is going to be 1. There is no node here relative to which the other visits are calculated; both are going to be more than 1. Relative visit counts here are to be interpreted as the average number of visits a thread makes to the CPU or to IO from starting a fresh request up to finishing it, and we expect to simply get these numbers as parameters of our closed queuing network. With this definition, let us define some metrics. The metrics are again not very different from those of open queuing networks and of the closed queuing system with a single server; if you keep those two in mind, the metrics will come together in your mind very easily. We start with system throughput. For the explicit client node it is easy: it is the throughput across this arc. In the fixed-thread CPU-IO kind of model, we again use the fact that this was actually the exit, so it is the throughput across this arc. System response time, in the case of the explicit client node, is straightforward: it is the response time at server 1 plus the response time at server 2. For the fixed-IO-threads case it is similar: the response time here plus the response time at IO. But it is a slightly peculiar notion; again we have to remember that this was actually an exit.
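The claim that V CPU equals 1/p can be checked directly: each CPU visit ends the request with probability p, so the number of CPU visits per request is geometric with mean 1/p, and V IO is then (1 - p)/p. The sketch below simulates this and forms service demands D_i = V_i * tau_i; the numbers p = 0.25, tau_CPU = 10 ms, tau_IO = 30 ms are hypothetical, chosen purely for illustration.

```python
import random

def simulate_visits(p, trials=200_000, seed=1):
    """Estimate relative visit counts for the CPU-IO loop.

    Each CPU visit ends the request with probability p;
    otherwise the request makes one IO visit and returns to the CPU.
    """
    random.seed(seed)
    cpu_total = io_total = 0
    for _ in range(trials):
        while True:
            cpu_total += 1                 # one CPU visit
            if random.random() < p:        # request finishes (exit branch)
                break
            io_total += 1                  # otherwise one IO visit
    return cpu_total / trials, io_total / trials

p = 0.25
v_cpu, v_io = simulate_visits(p)
print(v_cpu, 1 / p)        # simulated vs analytical V_CPU = 1/p = 4
print(v_io, (1 - p) / p)   # simulated vs analytical V_IO = (1-p)/p = 3

# Service demand per request, D_i = V_i * tau_i (hypothetical times, seconds)
tau_cpu, tau_io = 0.010, 0.030
d_cpu = (1 / p) * tau_cpu        # total CPU work per request
d_io = ((1 - p) / p) * tau_io    # total IO work per request
```

Note that, as the lecture says, both visit counts come out greater than 1: there is no station here whose visit count is normalized to one.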
So the system response time is the time from when you join here, possibly circulating here, until you leave here. Now let us come to cycle time. Cycle time again has a clearer definition when there is an explicit client node: it is the system response time plus the think time. Where there is no explicit client node, think time does not exist, and we can just take it to be 0; there, system response time and cycle time are the same thing. Then we have the individual throughputs of the stations: lambda 1 here, lambda 2 here, and similarly here, lambda CPU and lambda IO. Then we have the server utilizations as usual, the waiting time and response time at each node, and the number of customers, the queue length, at each node; these are the same as in open queuing networks. We also have metrics similar to those of the closed single-server system. Remember the saturation number, which told us how many users the system can support? That exists here too; we just have to take into account the capacities of both servers, and, just as in open queuing networks, we have the notion of a bottleneck server. The saturation number M star here is the maximum number of requests before one of the nodes reaches full utilization, and M star is clearly defined for queuing networks with an explicit client node. So these are the two kinds of closed queuing networks, and we always have to remember that one of them does not have a client node: any metric or calculation that involves think time does not apply to that queuing network.
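For the explicit-client-node case, the saturation number can be computed from the service demands, in the same spirit as for the single-server closed system. A minimal sketch, using the standard asymptotic-bound formula with made-up demand and think-time values (D1 = 0.04 s, D2 = 0.09 s, Z = 2 s are hypothetical):

```python
def saturation_number(demands, Z):
    """M* = (sum of service demands + think time) / bottleneck demand.

    With D = sum(D_i) and D_max = max(D_i), throughput is bounded by
    X(M) <= min(M / (D + Z), 1 / D_max); M* is the load level where
    the two bounds cross, i.e. where the bottleneck saturates.
    """
    D = sum(demands)
    D_max = max(demands)
    return (D + Z) / D_max

# Hypothetical two-station example
m_star = saturation_number([0.04, 0.09], 2.0)
print(m_star)   # (0.13 + 2.0) / 0.09, roughly 23.7 users
```

Below M* the system behaves as if delay-bound (throughput grows with M); beyond it, the bottleneck station limits throughput to 1/D_max.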
Further, there is a kind of queuing network called the Jackson queuing network; this is very similar to the open Jackson queuing networks, and the assumptions required are almost the same. A closed queuing network is a Jackson queuing network if the branching is memoryless: the branching of going back here versus here, or any other branching that might be present in the network. Memoryless means that no matter how many times you have come out of the CPU, or out of server 1's service, the probability that you finish versus the probability that you go to server 2 remains the same. Remember, that is what memoryless means: you cannot say that after 3 visits to server S1 the probability of finishing is higher; that would violate the memoryless assumption. Service times have to be exponentially distributed, the think time has to be exponentially distributed, and the scheduling discipline has to be FCFS. In that case it is a closed Jackson queuing network, and we can apply some theorems and laws that make the analysis much easier; it is for closed Jackson queuing networks that analysis is tractable. So let us start going over some results that are applicable only to Jackson networks; these are the results and theorems that make the analysis very elegant. To some extent, the Sevcik-Mitrani arrival theorem serves the same purpose for us as PASTA did. Remember PASTA in open queuing systems: because of PASTA and some other properties of Poisson arrivals, analysis became easy; you should go back and refer to those lectures. The same purpose is served here by the Sevcik-Mitrani arrival theorem. Let me state what it says for a closed Jackson queuing network with a given load level.
By the way, we sometimes say "load level" for this number of clients or number of requests, because it is an indicator of the load on the system. So: with a load level of m, a request arriving at node i sees the probability distribution of the queue length at node i as if the queuing network had one less customer in it. In other words, it sees the state of the system as if it itself were not in the system but just observing it. Let me make this a little more formal and then give an example. Let n i of m be the unconditional steady-state time average of the number of customers at node i. I think everybody remembers these terms: time average, steady state, unconditional; we have talked about them often. Unconditional means the average is not taken at any particular instant but over any period of time; a time average takes into account how long each number of customers was present in the system; and steady state means the warm-up effects in the system are ignored, that is, we look at the system when it is at statistical equilibrium. All of these things we have discussed before, so please go back and refer to those lectures. So in n i of m, i is the index of the node and m is the load level; this is how we express every metric in a closed queuing network: at load level m, the steady-state time average of the number of customers at node i is denoted n i of m. As opposed to that, let n with superscript a and subscript i, at load level m, be the average number of customers at node i as seen by an arriving request. Again, please look back at the lectures discussing how unconditional averages differ from averages seen at arrivals; averages seen at departures, or conditioned on any other event, can also be different. Those averages are not always the same; we talked about this in the context of PASTA.
So you can go back and look at the lectures on PASTA; these two averages need not always be the same. Now, what does the Sevcik-Mitrani theorem say? Remember: the arriving request sees the system as if there were one less customer. Mathematically, what a request sees at node i at its arrival instants, in a queuing network with m customers, is the same as the unconditional average with one less customer in the system, that is, at the m - 1 level. Notice that I am not putting the superscript a on the right-hand side: n a i of m equals n i of m - 1. It is as if the request arriving at node i were not in the system at all, just an observer; and if it were an observer recording the unconditional averages at each node, that is exactly what it sees at the instants it arrives, even though it actually is in the network. We are operating at load level m, but the arriving request sees the system as if it were not there. It is an absolutely beautiful thought; please think about it, it might take some time to digest. Let me give an example with some numbers. Imagine m is equal to 10, and say 6 customers are thinking right now, so 4 must be somewhere at the servers: say 3 requests queued here and 1 request in service here. The steady-state averages are denoted n1(10), at server 1, and n2(10), at server 2; these are the averages currently in the system. When one of the thinking customers issues a request and it arrives at this node, the average that it sees is actually n1(9): it is as if this request were not in the system. Similarly, when it arrives here, the average it sees is n2(9), as if there were 9 customers in total in the system.
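As a peek at what the arrival theorem buys us: the mean value analysis of the next lecture builds exactly on the relation that an arrival at load m sees the network at load m - 1. Below is a minimal sketch of that standard recursion for single-server FCFS stations, using hypothetical service demands and think time; it is offered as a preview of the method, not as something derived in this lecture.

```python
def mva(demands, Z, M):
    """Exact MVA sketch for a closed network of single-server FCFS stations.

    demands[i] = D_i = V_i * tau_i, Z = think time, M = number of clients.
    The arrival theorem lets us write: an arrival at load m finds the
    steady-state queue lengths of the network at load m - 1.
    """
    n = [0.0] * len(demands)      # queue lengths at load level 0
    X, R_total = 0.0, 0.0
    for m in range(1, M + 1):
        # residence time per station: service for the arrival itself plus
        # for the n_i(m-1) customers it finds there (arrival theorem)
        R = [D * (1.0 + n_i) for D, n_i in zip(demands, n)]
        R_total = sum(R)
        X = m / (Z + R_total)     # throughput via the response time law
        n = [X * r for r in R]    # Little's law at each station
    return X, R_total, n

# Hypothetical demands: CPU 0.04 s, IO 0.09 s; think time 2 s; 10 clients
X, R_total, n = mva([0.04, 0.09], 2.0, 10)
```

One sanity check on the recursion: at every load level, the queue lengths plus the thinking customers (X times Z, by Little's law) add back up to M, so customers are conserved exactly as a closed network requires.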
So this is something that gives us a very elegant way to analyze closed queuing networks, and that is what we will look at in the next lecture. We will study a method called mean value analysis, which uses the Sevcik-Mitrani theorem to obtain all the steady-state metrics of closed Jackson queuing networks. Thank you.