Hello and welcome to the next lecture in the course on Introduction to Computer and Network System Performance. My name is Varsha Apte and I am a faculty member in the Department of Computer Science and Engineering. In today's lecture we will continue what we were doing in the last lecture, namely discussing high- and low-load asymptotes for various metrics of the standard queuing systems that we are learning.

Here again is the slide I show to remind you of all the parameters of our queuing systems. All of our analysis is in terms of these parameters: the arrival rate, the service time, the service rate (which is 1 over the service time), the number of servers, and the buffer size. The metrics are always these: the number of jobs in the system, the number of jobs in the queue, the utilization (which is also the fraction of time the server is busy), the throughput, the time spent in the queue, the time spent in queue and service, and the blocking probability. Recall that we call this one utilization and this one throughput. Also, we use "jobs", "requests" and "customers" more or less interchangeably for a general queuing system.

This is what we have learnt so far about the metrics and some laws, and this slide clearly lists the queuing systems we have been learning: G/G/1, G/G/1/K, G/G/c and G/G/c/K, the four queuing systems we are studying. Actually, the first two are just special cases of the latter two: G/G/1/K is G/G/c/K with c equal to 1, G/G/c is G/G/c/K with K equal to infinity, and G/G/1 has both c equal to 1 and K equal to infinity. So these are really special cases of the general queue, but it is useful to list them separately, which is why we have done that.
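As a minimal sketch of the parameters listed above (the names `QueueParams`, `service_rate` and so on are my own labels, not from the course), the relation mu = 1/tau can be written as:

```python
from dataclasses import dataclass

@dataclass
class QueueParams:
    """Parameters of a general G/G/c/K queue, as listed on the slide."""
    arrival_rate: float   # lambda, jobs per unit time
    service_time: float   # tau, time one job needs at one server
    num_servers: int      # c
    buffer_size: float    # K; float('inf') means an infinite buffer

    @property
    def service_rate(self) -> float:
        # mu = 1 / tau, jobs per unit time per server
        return 1.0 / self.service_time

# Example: one server with a 10 ms service time
q = QueueParams(arrival_rate=0.05, service_time=10.0,
                num_servers=1, buffer_size=float('inf'))
print(q.service_rate)  # 0.1 jobs per millisecond
```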
You can go back to the previous lectures to see how we derived all of this; the utilization formulae, for example, come from the utilization law. So this is just a recap. The last recap is of the previous lecture, where we derived asymptotic values. By high and low load we mean: high load is when the arrival rate goes to infinity, low load is when the arrival rate is almost 0. This slide lists what metrics we can expect in those limits. Remember that under high load the throughput goes to the maximum capacity in both cases, and the server of course becomes busy. The main difference is in the finite-buffer queue: there the number in system and the response time converge to finite values, whereas for the infinite-buffer system the number in system, the queue length, the response time and the waiting time all, as expected, go to infinity.

Another thing to always remember is that, even with very little load, response time is only about customers that actually get service. If you are admitted into the system and get service, then you will certainly need a minimum response time of tau. And of course, for the infinite buffer there is no loss probability, whereas for the finite buffer at high load the loss probability goes to 1. So this is just to recall what happened so far.

Let us move on now to straightforward generalizations of what we did earlier, to G/G/c and G/G/c/K. Of these, G/G/c is actually quite simple. Let us make the same table as before: throughput, utilization, number in queue, number in system, waiting time and response time, for lambda tending to 0 and lambda tending to infinity.
So, again, the same thing here; really, by now you should be able to do this yourself. I would encourage you to pause the video and try to fill in this table yourself. You can look at the previous slides for how the reasoning went, and you should be able to fill it in easily. Of course, when there is very little load coming to the system, the throughput goes to 0, the utilization is 0, and all of these metrics are 0, except that, as always, the response time is tau. When the arrival rate into the system increases to infinity, remember that there are c servers, each going at rate mu, where mu equals 1/tau; so the throughput goes to c mu and the utilization goes to 1. All the other metrics, since we again have an infinite buffer (remember, when nothing is written there it is an infinite buffer), just go to infinity. So really the only difference from G/G/1 is that as lambda tends to infinity the maximum throughput is c mu instead of mu.

What is more interesting is G/G/c/K. Let us first do the throughput, the utilization, the queue length and the number in system. Let us draw the same table: throughput, utilization, number in queue, number in system, waiting time and response time, for lambda going to 0 and lambda going to infinity. This should follow reasoning similar to G/G/1/K, so if you want to look at the slides, pause the video here and try to fill it in yourself; it will be a good exercise.
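The G/G/c conclusion just stated, that the saturation throughput is c mu rather than mu, can be checked with a few lines. The numbers here are illustrative, not from the lecture:

```python
# Saturation (high-load) throughput of a G/G/c queue: c servers, each
# completing 1/tau jobs per unit time, so throughput tends to c * (1/tau).
# Illustrative numbers: c = 4 servers, tau = 20 ms service time.
c, tau_ms = 4, 20.0

mu_per_sec = 1000.0 / tau_ms     # 50 jobs per second per server
max_throughput = c * mu_per_sec  # saturation throughput, jobs per second
print(max_throughput)            # 200.0
```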
As usual, let us do the easy part first, lambda tending to 0. The throughput, of course, will be 0; we do not expect the server utilization to be much, so it also goes to 0 as lambda goes to 0, and so do all of these. There is really no difference here: the response time is always tau, no matter that there are c servers. Remember, once a customer starts service it will in any case need tau amount of time for the work it has to do, so this remains tau.

Now let us look at what happens as the arrival rate increases. The throughput goes to c mu, and the servers become fully busy. For the queue length, remember we have a buffer of size K and c servers. As lambda tends to infinity you expect the queue to become full, so clearly the number in queue goes to K, and the number in system is that K plus the c in service, which is K plus c. So these three, and even the number in system, are all fairly straightforward; the waiting time and response time asymptotes are a little more interesting. Let us take another page for that.

So what are we trying to find? As lambda tends to infinity, what happens to the waiting time and what happens to the response time? Let us draw the queue again: here is the buffer of size K, and here are the c servers. To understand this it is really much better to take an example. So let us take a queuing system with 3 slots in the buffer and 2 servers; that is, K equals 3 and c equals 2, and let us take tau as 10 milliseconds. Now, the question is, as lambda tends to infinity, what happens? Let us first do the waiting time. As lambda tends to infinity, remember that the queue will pretty much always be full.
So we actually expect the servers to be fully busy and the queue to be full. What happens just when a departure occurs? Say a departure happens at this server; then the request that is here moves up, the request that is here moves up, and so on, and we expect the new arrival to always be the last one in the queue. It is very low probability that a new arrival, even if it gets space in the queuing system, will be able to queue anywhere else, first or second in line; that probability is almost 0. With probability 1, a new arrival will be last. I hope that is intuitive: the load is so high that as soon as even one space opens at the end of the queue, a new arrival takes it. So we basically start from the assumption that any new arrival is last in the queue.

Since it is last in the queue, how long will it take to move from the end of the queue into service? That is the question we are trying to answer. I move one position forward when a request in service finishes, and I would have joined just when a departure happened; so at first you might think I wait a full service time, an entire tau, for each move, but actually no. At what rate am I going to move through the system? This is something very important to remember. Let me draw this again. At what rate are requests leaving? When the load is very high, all these servers are continuously busy, and we expect a departure to appear on this side once every 10/2 milliseconds. Why is that?
Because from this server you can expect requests to leave every 10 milliseconds, and from this server also every 10 milliseconds. So if you look at the merged departure stream, you can expect, on average, one departure every 5 milliseconds. That is intuitive: if I see one request finishing from this server every 10 milliseconds and from that server every 10 milliseconds, then the aggregate departure rate is one every 5 milliseconds. Why am I doing this calculation? Because the queue is full, and you can move through the queue only at the same rate at which departures happen from the servers. So you need the first 5 milliseconds to move one position, which is when the customer ahead starts service; then another 5 milliseconds to move to the next position; and remember, when you are at the head of the queue you still need one more move to enter service. So you needed 3 times 10/2 milliseconds to move through the queue and start service, and that was your waiting time.

In general, can we generalize this? A customer that stands at the end of the queue has to move through K positions; it has to move K times, and each movement takes, in general, tau/c milliseconds (here tau was 10 milliseconds and we divided by 2 because there were 2 servers). So the waiting time goes to K multiplied by tau/c as lambda tends to infinity. Once we have the waiting time, the response time is trivial: response time equals waiting time plus tau, so it goes to K tau/c plus tau.
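The reasoning above can be collected into a small sketch; the function name and dictionary keys here are my own labels, but the formulas are exactly the ones just derived:

```python
import math

def ggck_high_load(K: float, c: int, tau: float) -> dict:
    """High-load (lambda -> infinity) asymptotes of a G/G/c/K queue.
    Passing K = math.inf recovers the infinite-buffer G/G/c case."""
    mu = 1.0 / tau
    return {
        "throughput": c * mu,               # saturation at c * mu
        "utilization": 1.0,                 # all servers busy
        "number_in_queue": K,               # buffer fills up
        "number_in_system": K + c,          # queue plus servers
        "waiting_time": K * tau / c,        # K moves of tau/c each
        "response_time": K * tau / c + tau, # plus one full service
    }

# The worked example: K = 3 buffer slots, c = 2 servers, tau = 10 ms
print(ggck_high_load(K=3, c=2, tau=10.0))
# waiting_time = 3 * 10 / 2 = 15 ms, response_time = 25 ms
```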
So that is it for the G/G/c/K asymptotes. Let us look at everything together for G/G/c/K. Of course, at low load everything is 0 except the response time. As for the throughputs, the high-load asymptotes for G/G/c and G/G/c/K are the same, and so are the high-load utilization asymptotes. For the number in queue, the waiting time and so on, the infinite-buffer queue goes to infinity. For the finite-buffer queue it is easy to see that the number in queue goes to K, the total number in system goes to K plus c, the waiting time, as we just reasoned, goes to K tau/c, and the response time is nothing but waiting time plus service time. Loss probability is one more very important metric: for the infinite buffer it remains 0, there is no loss; for the finite buffer it is nearly 0 at low load and nearly 1 at high load. This and the fact that the queue is full are related: nearly all of the requests that come in are going to be dropped. But for those that are not dropped and somehow make it into the system, these are the limiting, asymptotic values of the waiting time and the response time.

With that, let us now just do some examples. We have done a lot of formulae and theory, so we are just going to go through some examples. The first example is a web server; it is a very common example. Let us say there are two threads running on two cores of a CPU in a server machine. For all our examples let us assume that the threads only use the CPU, and to begin with let us assume an infinite buffer. So what we have is a G/G/2 queuing system. Let us look at the asymptotic values: if nothing is coming to the system, then throughput, utilization and all of these are 0, and the response time will equal the service time, which is 10 milliseconds. What is the high-load throughput?
For that we have to calculate c mu. We have c equal to 2 here, and mu equal to 1/10 in requests per millisecond, which is 1000/10 = 100 requests per second. We could leave it per millisecond, which is also correct; it is just a convention to always state request rates in requests per second, and 100 requests per second is a little easier to read. So that is our mu, and multiplying by c gives c mu equal to 200 requests per second. That is how we get this, and the utilization goes to 1 in the high-load asymptote. Again, this is an infinite buffer, so all the other metrics become infinite.

Now let us look at the same example but with a finite buffer. Note that it is the same example: 2 threads running on 2 cores, request processing time 10 milliseconds, but now there is a queue of size 10 in front of the 2 cores; c equals 2, tau equals 10 milliseconds. Our mu does not change; remember we calculated mu as 100 requests per second, and c mu is also the same. The low-load asymptotes really do not change either: the response time of 10 milliseconds is the only one with a non-zero value; everything else is 0. At infinite load, however, a lot of things change. Throughput and utilization are the same as for the infinite-buffer G/G/2 queue, but the number in queue is clearly going to be 10, because that is the buffer size, and the number in system is the 10 in the buffer plus the 2 in service, so 12. For the waiting time, remember the exit rate: from each server we get one departure every 10 milliseconds, so the aggregate rate is one every 5 milliseconds. A request steps up through this queue at the rate of one position every 5 milliseconds, and it has to move 10 positions.
So, by the same reasoning as earlier, we have 10 multiplied by 5, which equals 50 milliseconds; that is where the waiting time comes from, and the response time is just 50 plus 10.

The next example is interesting: it is the cellular-channels example. Suppose a cell of a particular operator has 10 channels, and the average call holding time, that is, the average time somebody talks on the phone in one call, is 2 minutes. Remember, the way cellular calls operate, the buffer size is 0: there is no such thing as queuing a call. If all the channels are busy when a call arrives, you just get the network-busy signal and the call is dropped. So here we have a G/G/10/0 system.

Now, how does the rest of the calculation proceed? The call holding time is 2 minutes, and, by another convention, throughput for calls is usually given per hour: if one call takes 2 minutes, then in 60 minutes, which is 1 hour, one channel can carry 30 calls. So our tau is 2 minutes, our mu is 30 calls per hour, and our c is 10, which gives c mu equal to 300 calls per hour; that is how we get 300 as the high-load asymptote. The utilization, of course, goes to 1: all the channels will be busy if there is a lot of load. The number in system goes to 10; remember, there is no buffer, so as the call arrival rate goes to infinity there will always be 10 calls going on, because there are 10 channels in the system. The waiting time is 0, because if a call is admitted into the system it immediately gets a channel; there is nowhere to wait, so being in the system means your call has started. The response time, of course, is the service time, 2 minutes.
So here it is interesting: the low-load and high-load asymptotes of the response time are the same, and that is because there is no buffer. You can visualize this system as having no queue at all; K equals 0. And the loss probability, of course, is 0 at low load, while at high load almost all calls are getting dropped.

Now a few non-asymptotic metrics; we have only done asymptotes in these first three examples, so let us do some usual metrics along with the asymptotes. Let us go back to a single-threaded server running on a one-core CPU, starting with an infinite buffer. The request processing time is 5 milliseconds and the request inter-arrival time is 6 milliseconds. That means mu will be 1000/5, which is 200 requests per second, and lambda is 1000/6, which is around 167 requests per second. The asymptotes you can now work out yourself: for lambda going to 0, everything other than the response time, which equals the service time, is 0. This is an infinite buffer, so the loss probability remains 0; the numbers in system and queue and the waiting times go to infinity; the throughput goes to c mu; and the CPU utilization, given as a percent, goes to 100 percent.

What is interesting now is to compute these at the given lambda, 1000/6; this is not lambda going to infinity. The throughput is going to be 167. Why? Because 167 is less than 200; the arrival rate is less than the service rate. For the CPU utilization, remember it is lambda tau, which equals lambda/mu, which you can also write as (1/mu)/(1/lambda), that is, tau divided by 1/lambda. In our case tau is 5 milliseconds and 1/lambda is 6 milliseconds, so the utilization is 5/6, and expressed as a percent that gives about 83 percent.
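The utilization-law arithmetic just done can be checked in a few lines, using the same numbers as the example:

```python
# Single-threaded server from the example: tau = 5 ms service time,
# 1/lambda = 6 ms inter-arrival time.
tau = 5.0            # service time, ms
inter_arrival = 6.0  # 1 / lambda, ms

lam = 1.0 / inter_arrival  # arrival rate, requests per ms (~167 req/s)
mu = 1.0 / tau             # service rate, requests per ms (200 req/s)

# Utilization law: U = lambda * tau = lambda / mu = tau / (1/lambda)
utilization = lam * tau
print(round(100 * utilization, 1))  # 83.3 percent
```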
What we do not know here is: at a lambda that is going neither to 0 nor to infinity, what is the queue length, the number in system, the waiting time and the response time? We do not know those.

Now, the same example but with a limited buffer. Again, I encourage you to work out the asymptotes yourself; we did a very similar example earlier with two cores, and this one has one core and a 5 millisecond service time. These remain the same: 10 is the buffer size, buffer size plus 1 is the number in system, 50 milliseconds is the waiting time, because we have to move 10 positions, and 50 plus 5 milliseconds gives the response time. What is interesting here is the throughput: we know it is the arrival rate multiplied by the probability of a request being accepted into the system, which is 1 minus the loss probability. Now, the loss probability is something we do not know. At the low-load asymptote we can assume it to be 0, and at the high-load asymptote we can assume it goes to 1, but at an intermediate load we do not yet know how to calculate it. So we have to leave it as this equation, and all we can say about the throughput is that it is something less than 167 requests per second, that is, something less than the arrival rate: there will be some nonzero drops, so not all the incoming requests are accepted, and therefore the throughput will be slightly less than the arrival rate. Similarly, all we can say about the utilization is that it has to be less than the 83 percent we calculated earlier, because the throughput is not exactly 167 but something less.
So we have to leave it at that; we also do not know how to calculate the queue length, the number in system, the waiting time and the response time at this lambda, which is neither 0 nor infinity.

Now, let us do the same example for a two-threaded server on a dual-core CPU. The asymptotes change: mu was 200 requests per second, but now we have c equal to 2, so accordingly the maximum throughput is 400. The utilization, of course, still goes to 100 percent, and the queue-length asymptotes here are the same: the number in queue is 10, the number in system is 12. The waiting time changes, however, because now we have two CPUs, and we move through the 10 positions at 5/2 milliseconds per position. That gives 10 multiplied by 5/2, which is 25 milliseconds, as the waiting time, to which we add 5 milliseconds of service time; that is how we get a response time of 30 milliseconds. And the loss probability, as usual, goes to 1 as lambda tends to infinity.

What about the intermediate values, at lambda equal to 1000/6? The throughput remains below 400, but it is not going to be exactly 167; we know it is a limited buffer, so there will be some loss. Some requests are dropped with probability P_L, and only lambda multiplied by (1 minus P_L) make it into the buffer; those are the ones that leave the system, so that is the throughput. The formula remains the same as before: since 167 remains below 400, the same expression applies. For the utilization, everything is as before; the only difference is that there are two servers, so it gets divided by 2: in percentage, earlier it was around 83, now it is around 41 percent.

This slide summarizes everything; you can go through these formulae, all of which we derived in the previous lectures.
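The two-core numbers just stated can be verified with the same formulas as before (the variable names are mine):

```python
# Two threads on two cores (c = 2), buffer K = 10, tau = 5 ms,
# inter-arrival time 6 ms, as in the example above.
K, c, tau = 10, 2, 5.0
lam = 1.0 / 6.0  # arrival rate, requests per ms (~167 req/s)

waiting_time = K * tau / c          # 10 positions at tau/c = 2.5 ms each
response_time = waiting_time + tau  # add one full service time

# When throughput is ~lambda (still below c*mu), per-server utilization
# is lambda * tau / c: the 5/6 from before, divided over two servers.
utilization = lam * tau / c
print(waiting_time, response_time, round(100 * utilization, 1))
# 25.0 30.0 41.7
```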
In fact, what I would like to point out is that G/G/1, G/G/c and G/G/1/K are all special cases of G/G/c/K: you can set c equal to 1 or K equal to infinity and you get each of these special cases. To show you how this works for an infinite buffer: with an infinite buffer, P_L equals 0, so 1 minus P_L becomes 1, and the throughput expression becomes just lambda. So the formulae get specialized for each of these cases.

You can see here that all of the asymptotic values are filled in, but away from the asymptotes only the first two quantities have some expression, and even those we do not fully know: for the finite-buffer case we do not actually know the throughput, because to know the throughput and the utilization we also need to know the loss probability. And in the infinite-buffer case we do not know these values at all for any lambda that is not 0 or infinity. So how do we find them? For that we actually need a different and more advanced theory: if we really want to find all of these values, we need something called stochastic processes. But luckily there is another, simpler law, called Little's law, which helps us find, say, the number in system given the response time, or vice versa; or the waiting time given the queue length, or the queue length given the waiting time. That is what we are going to do in the next class.
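As a minimal sketch of how the throughput formula specializes when P_L = 0 (the numbers here are illustrative, not from the lecture):

```python
def throughput(lam: float, p_loss: float) -> float:
    """Throughput of a queue with loss: only the accepted fraction,
    lambda * (1 - P_L), ever leaves the system."""
    return lam * (1.0 - p_loss)

# Infinite buffer: P_L = 0, so the formula specializes to lambda itself.
print(throughput(lam=167.0, p_loss=0.0))  # 167.0
# Finite buffer: any nonzero loss pushes throughput below the arrival rate.
print(throughput(lam=167.0, p_loss=0.05) < 167.0)  # True
```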