Hello, and welcome to the next lecture in the course on Introduction to Computer and Network Performance Analysis using Queuing Systems. I am Professor Varsha Apte, a faculty member in the Department of Computer Science and Engineering at IIT Bombay, and today we will talk about observational, or operational, laws in queuing systems. As usual, let us start by recalling what we did in the previous lecture. You should remember this picture: this is how we describe and draw queuing systems. There are a bunch of parameters. There is a number of servers, C; servers are described by an average service time and a service rate; customers are described by an arrival rate and an inter-arrival time distribution; and then there is a buffer of size K. The arrival rate is denoted by lambda, the service rate of one server by mu, the average service time by tau (so tau = 1/mu), and the overall service rate of the system is C mu. So these are all the parameters. This slide just summarizes that: again, C is the number of servers, K is the buffer size, lambda is the arrival rate, tau is the average service time, mu is the service rate of one server, and C mu is the service rate of the whole system. Then we have the metrics of the open queuing system. We start with throughput, denoted by capital Lambda; that is the completion rate. We have utilization, rho: how busy the servers are, that is, the fraction of time the servers are busy. The waiting time is w. The queue length q counts just the customers waiting in the queue, not the one in service; if you want to count everything in the system, that is n. And the entire time a request spends in the whole system, waiting and also getting service, is the response time r. For finite buffer systems we also have a blocking probability, for which we will use the symbol P_L, with L for loss, okay.
And again, this is a table reminding you of all of this: n, q, rho, capital Lambda, waiting time w, response time r, and the blocking probability P_L. When you look at the slides apart from the lecture, you can refer to this table for the metrics of the system. Now let us move on to how we derive these. What can we calculate easily about throughput and utilization? For doing this there is a method called operational laws; there are various operational laws that govern the behavior of queuing systems, and we will derive one today, okay. So let us start with our very basic, standard single server with an infinite buffer, and let us assume that we observe it for some time t, where t is large, so that boundary conditions have less effect. We observe this queue, and what do we observe? I will always use the hash symbol as something that denotes a number. Let a be the number of arrivals in time t, let c be the number of completions in time t, and let b be the amount of time the server was busy during the observation time t. So suppose we were actually monitoring this queuing system and somebody measured all of these things: how many arrivals came and how many departures happened (a and c are counts, not rates), and b, which is a time, okay. Now of course, by our notation, the arrival rate lambda is trivially a/t. Then we have the throughput, which is nothing but c/t, because c is the number of completions that happened in time t; throughput is completions per unit time, so obviously it is c/t. What about server utilization? Since somebody has measured how much time the server was busy, b, the utilization is simply the fraction of time it was busy.
So rho is b/t. Now, if you just look at this a little more, we can actually derive one more parameter. Given all of these numbers, can we write down the average service time? Just think about it for a minute. The total amount of time the server was busy was b, and in that time it did c completions, right. And remember that t is large, so even if we started observing the system when one request was in the middle of its service, and part of that service time fell outside the window, over a long observation this boundary effect washes out. The average service time, given these measurements, is the total busy time divided by the number of completions, b/c; this is the busy time per completed request, so this is tau, okay. Now if you look at rho = b/t a little carefully, you can realize it can be rewritten: just multiply and divide by c, and you get rho = (c/t) multiplied by (b/c). And c/t is capital Lambda, while b/c is tau. So this is what is called the utilization law; we have just derived the utilization law, and for single servers it is given by rho = Lambda tau, okay. This is one of the main operational laws. Now let us continue thinking about all this. Notice that whatever we wrote here used very few assumptions: this could be a G/G/1 queue, and we could even have a finite buffer, since nothing here requires that no arrivals are dropped; we are measuring the arrivals and the completions separately.
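The bookkeeping above can be sketched in a few lines of Python. This is a minimal illustration, not from the lecture slides; the names follow the observed quantities just defined (a arrivals, c completions, b busy time over window t), and the sample measurements are made up.

```python
# Operational quantities from observing a single-server queue for t seconds:
#   a = number of arrivals, c = number of completions,
#   b = total time the server was busy during the window.
def operational_quantities(a, c, b, t):
    lam = a / t          # arrival rate, lambda = a/t
    X = c / t            # throughput (capital Lambda), completions per unit time
    rho = b / t          # utilization, fraction of time busy
    tau = b / c          # average service time, busy time per completion
    return lam, X, rho, tau

# Made-up measurements: 1000 arrivals, 990 completions, 45 s busy, in 100 s.
lam, X, rho, tau = operational_quantities(a=1000, c=990, b=45.0, t=100.0)

# Utilization law: rho = X * tau, because (c/t) * (b/c) = b/t.
assert abs(rho - X * tau) < 1e-12
print(lam, X, rho)  # 10.0 9.9 0.45
```

The final assertion is exactly the multiply-and-divide-by-c step of the derivation.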
But suppose now we assume that the buffer is infinite. So let us talk about a G/G/1 queue; remember, when the buffer is infinite we drop the K from the notation. If every incoming request can queue and is never dropped, can we think about what the throughput will be? So let lambda be the arrival rate, and we know that mu is the service rate. For a moment, assume that lambda is less than mu. If the arrival rate is less than the service rate, what do you think the throughput is going to be? I will give you a non-computer-science example. Suppose you are like a server, and your capacity is to watch 4 lectures per day; this is like your mu. But what is coming to you? You go to a website and see how many lectures are posted there per day: only 2 lectures per day. This is like the lambda. So if you can watch 4 lectures per day but only 2 lectures are coming per day, what throughput can you achieve, how many lectures can you watch every day? Obviously you can watch 2 lectures every day, right. So when the arrival rate is less than the service rate and nothing drops requests (there is no finite buffer for dropping them), the throughput is really just whatever comes to the system: throughput equals arrival rate. This is a major "rate in equals rate out" principle that applies to infinite buffer systems. When the buffer is infinite we also call the system lossless; since there is no loss, this can be assumed. Now let us take the same example, but now it is not the case that only 2 are coming: what is coming to you is actually 6 lectures per day, while you can still watch only 4 lectures per day.
Suddenly the website is posting 6 lectures per day. Then what will your throughput be? How many lectures can you complete watching per day? It has to be just 4 lectures per day, right. How can you watch more? 6 may be coming, but you can only do 4. So the rule here is: if lambda is greater than or equal to mu, if anything more comes, then the throughput has to be equal to mu. So in summary, the throughput equals the minimum of the arrival rate and the service rate: when the arrival rate is less than the service rate it is lambda, and when the arrival rate is greater it is mu; that is, min(lambda, mu). Thinking about this a little more, it is very easy to extrapolate to a G/G/C system where we have C servers, so the total service rate possible is C mu. You can take the same example with 2 of you: say 2 students should sit and watch the lectures, and each of you has the capacity to watch 4 lectures per day, so your total capacity is now 2 times 4, that is, 8 lectures per day. Now if 6 lectures per day are coming, you can split them: he can watch 3, she can watch 3, and together you can watch all 6 lectures per day. Not everybody has to watch everything; it is enough that between you and your partner, each lecture is seen by one of you, and that way you finish watching all the lectures. So your throughput can be 6 lectures per day. The same logic applies: for G/G/C, the throughput is the arrival rate if the arrival rate is less than C mu, but if it is greater than or equal to C mu, then it has to be C mu.
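The throughput rule for lossless queues fits in a one-line function; a small sketch, using the lecture-watching numbers from the example (the function name is illustrative):

```python
def throughput(lam, mu, c=1):
    """Throughput of a lossless G/G/c queue: min(arrival rate, total capacity c*mu)."""
    return min(lam, c * mu)

# One viewer who can watch 4 lectures/day (mu = 4):
print(throughput(2, 4))        # only 2 arrive per day -> throughput 2
print(throughput(6, 4))        # 6 arrive, capacity 4  -> capped at 4
# Two viewers (total capacity 2*4 = 8 lectures/day):
print(throughput(6, 4, c=2))   # all 6 arriving lectures get watched -> 6
print(throughput(10, 4, c=2))  # 10 arrive, capacity 8 -> capped at 8
```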
So here 6 was less than 8, so your throughput was 6; but if suddenly lambda is 10, with 10 lectures per day coming, then even both of you together will not be able to keep up, and you can only watch 8 lectures per day, right. So this throughput law is very intuitive: you can do only what you can do, no matter what work comes to you; everybody has a certain capacity, and you cannot exceed it. And if less is coming to you, you do not produce anything extra; you can only do whatever work comes to you. That is the main assumption here: the server is not creating work for itself, the server only does what comes to it, and in that case it can only finish what comes to it, okay. So now we have talked about G/G/1 and G/G/C throughput; how about utilization? For a single server, for G/G/1, we said the utilization law is rho = throughput multiplied by tau, that is, rho = Lambda tau. But now, can we discuss further what happens with an infinite buffer? We know that Lambda equals lambda if lambda is less than mu, and Lambda equals mu if lambda is greater than or equal to mu. So we can substitute and say the same thing here: rho equals lambda tau if lambda is less than mu, and rho equals mu tau if lambda is greater than or equal to mu; but mu tau is nothing but 1. And this makes sense intuitively; we do not even need the formula plugging. If less work is coming to me than I can handle, then I am not fully busy, so this value is going to be something less than 1. In the previous example, you were getting 2 lectures per day but could watch 4 lectures per day, so obviously you are only going to be 50% busy. We are just making a formula out of that. So here it is lambda tau; another way of saying lambda tau is lambda by mu, right.
These are both rates: if what is coming in is half the rate of what I can do, then I am going to be 50% busy. But if it is more, as in the example where 6 lectures per day are coming and you can only do 4, then you are going to be fully busy just watching those lectures. That is what these formulas are saying. The only thing we changed from the utilization law we derived earlier is that capital Lambda is replaced by small lambda if lambda is less than mu, and by mu if lambda is greater than or equal to mu. Now again, following the same style as for throughput, a simple way of writing this is rho = min(1, lambda tau). If lambda tau is less than 1, then the utilization is lambda tau. But if lambda tau is greater than or equal to 1, like here, what happens if I blindly use the formula? lambda tau would be 6 multiplied by 1/4, which is greater than 1, and you cannot say the utilization is greater than 1. Remember, utilization is the fraction of time a server is busy, and a fraction can only range from 0 to 100%; we cannot say a server is 120% busy. If more work is coming, you can only become 100% busy; nobody can become 150% or 500% busy. So if lambda tau comes out greater than 1, we just have to say 100%. That is what this formula is saying: it says 1 if you want to talk in fractions, or you can convert it to a percentage and say 100%. So this is the formula for G/G/1. Similarly, you can try G/G/C: what happens there? Let us write the throughput first: the throughput is lambda if lambda is less than C mu, and it is C mu if lambda is greater than or equal to C mu, okay.
Now, the utilization law rho = throughput multiplied by tau was for just one server, so we are going to go from one server to multiple servers by simple intuition. Let us go back to your example of the lectures per day. Your combined capacity is now 8 lectures per day: you and your friend are watching lectures, so together you can watch 8 lectures per day. And now, if 6 lectures per day are coming to you, obviously each of you watches 3 lectures per day, so each of you is going to be 3/4 busy, right. So what did you do here? You took 6 and divided it by 2, because there are 2 of you, and then divided by 4, your individual capacity; that is how you got 3/4. Your capacity is still 4 lectures per day, a total of 6 are coming, but each of you is doing 3, so 3/4 is your busyness. That is what we need to convert into a formula. The utilization was lambda tau, but it is not going to be lambda tau anymore: the total work is divided among the C servers here, so it gets divided by C. If 1 server would have been 50% busy, each of 2 servers will be 25% busy; it is very easy. This holds when lambda tau / C is less than 1; and of course, when more work is coming, all of the servers are overloaded and the utilization is going to be 1. That happens when lambda tau / C is greater than or equal to 1, which means lambda tau >= C, that is, lambda/mu >= C, which is simply lambda >= C mu; that was the earlier condition. So now we have G/G/1 and G/G/C. The next slide just summarizes, a little more formally, everything we have discussed informally; it repeats the derivation of the operational laws, so I will not go over it again.
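The utilization rule we just reasoned out can be sketched the same way; this is an illustrative snippet (not from the slides), using the two-viewer numbers from the example:

```python
def utilization(lam, tau, c=1):
    """Utilization of a lossless G/G/c queue: min(1, lambda*tau/c)."""
    return min(1.0, lam * tau / c)

# Two viewers, each taking tau = 1/4 day per lecture, 6 lectures/day arriving:
print(utilization(6, 1/4, c=2))   # each viewer is 3/4 busy -> 0.75
# Overload: 10 lectures/day arriving exceeds the capacity of 8,
# so utilization saturates at 1.0 (100% busy), never more.
print(utilization(10, 1/4, c=2))  # -> 1.0
```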
It is for you to refer to once you have seen the lecture. So this was the utilization law for a single server. Then for G/G/1, remember we derived that the throughput equals min(lambda, mu). And remember that some of these things, the "rate in equals rate out" relation in particular, hold under the condition lambda less than mu. When a queuing system satisfies this condition, it is called stable. It is a very natural word: if more work keeps coming to us, or to a queue, or to anything, it all becomes unstable; when lambda is less than mu, it is stable. Then, repeating what we derived for G/G/1: rho = min(1, lambda tau). This is the utilization law for a single-server lossless system, because when the system is lossless we can assume that the throughput equals the arrival rate. We also derived that the throughput is min(lambda, C mu), and for utilization with C servers: the utilization is lambda tau / C if lambda is less than C mu, and otherwise the utilization is 1, that is, if lambda is greater than or equal to C mu. We will continue from this now. The next slide just states rho = min(1.0, lambda tau / C) for a multiple-server lossless system; we are still in the lossless setting. We can use small lambda in place of the throughput only when we can assume, for a stable system with no loss, that capital Lambda equals small lambda. Otherwise, even for a stable system, if the buffer is finite we cannot really assume that whatever comes in goes through, because some requests can still get lost. That is what we are going to talk about next. So let us now talk about the throughput of a multiple-server, finite-buffer queuing system: C is the number of servers, K is the finite buffer size. Let me just draw this here to remind you of all of this.
The buffer has size K, these are the C servers, the arrival rate is lambda, and we want to derive the throughput. The combined service rate of the C servers is going to be C mu. Now remember, when I talked about the metrics earlier I had shown a table: P_L is the probability that an arriving request is lost, where L stands for "lost" due to the finite buffer being full. Remember that in a queuing system this can happen even when lambda is less than C mu: P_L can still be greater than 0. That is because we are assuming some randomness. Even if the overall arrival rate is less than the service rate, you can suddenly have a burst of arrivals, the buffer becomes full, and requests drop. So we have P_L as this probability. For finite-buffer systems, how do we define the throughput? We define it with a very simple definition: it is the arrival rate multiplied by the probability that a request can actually get into the system. Because once a request comes into the system, it is going to leave; there is nowhere else for it to go. This is a finite-buffer system: requests come, there is some probability of loss, but once they come in, they get service and they leave. So if the arrival rate is lambda, then the rate through the system is nothing but lambda multiplied by (1 - P_L). Because if P_L is the probability of loss, 1 - P_L has to be the probability of acceptance, so the throughput is just the arrival rate multiplied by the acceptance probability, and that is what this expression denotes. And this is applicable for any lambda greater than 0; we do not need to distinguish here between lambda less than C mu and lambda greater than or equal to C mu.
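The finite-buffer throughput relation in sketch form. Note the loss probability used here is a made-up number purely for illustration; actually computing P_L requires the queuing models of later lectures.

```python
def throughput_finite_buffer(lam, p_loss):
    """Throughput of a finite-buffer queue: accepted rate lambda * (1 - P_L).
    Valid for any lambda > 0, whether or not lambda < c*mu; the loss
    probability is what keeps the queue stable."""
    return lam * (1.0 - p_loss)

# Illustration: 500 req/s offered, 20% dropped because the buffer fills up.
print(throughput_finite_buffer(500.0, 0.2))  # -> 400.0
```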
Because even if lambda is greater than or equal to C mu, the loss probability will increase, and ultimately that is what limits the throughput through the system. And because of this natural loss, finite-buffer queues are considered always stable; there is no stability condition as such. We do not have to require lambda less than C mu. They have a built-in mechanism to keep themselves stable, which is the dropping. Now what is the utilization? It is very similar; we are just following the utilization law. If there were one server, rho = Lambda tau would apply; remember, when we derived the utilization law we did not assume an infinite buffer, so this is fine for one server even with losses. But if there are multiple servers, the work gets divided. We can assume that all the servers are similar and each server handles Lambda/C of the completion rate; and if that is the completion rate of one server, then (Lambda/C) multiplied by tau is obviously how busy each of the C servers is going to be. So the utilization is, very intuitively, Lambda tau / C, where Lambda is given by the finite-buffer throughput formula from the previous slide. Now I am just going to go through some examples quickly. Let us consider a single-threaded server on a one-core CPU, again with an infinite buffer; let us assume infinite buffers for these examples. The request processing time, our tau, is 5 milliseconds, and the request inter-arrival time, our 1/lambda, is 10 milliseconds. So lambda is 1/(10 ms), which is 1/10 requests per millisecond, which equals 1000/10 = 100 requests per second. So lambda is 100 requests per second. And what is mu? tau is 5 milliseconds, so mu = 1/tau in requests per millisecond.
To convert to seconds, mu = 1000/5 = 200 requests per second. So we have lambda = 100 requests per second and mu = 200 requests per second, and now we can just apply what we saw in the previous slides. So let us see what we get. The throughput will be 100. Why? Because 100 is less than mu, so the throughput is lambda = 100. The utilization will be 50%. Why? Because the utilization is lambda tau, or equivalently lambda/mu, which in our case is 100/200 = 0.5; if we want a percentage, we multiply by 100 and get 50%. I have purposely shown this here to remind you that so far we are not calculating any of the other metrics: I have not yet told you how to calculate queue length, number in system, waiting time, or response time. Another example: the same system, but now let us assume 2 cores. We still have the same tau, mu = 200 requests per second, and lambda = 100 requests per second, but now C = 2, so C mu = 400 requests per second. Again, maybe think about it before I show the answers. 100 is less than 400, so the throughput remains 100. For the utilization, remember everything about the system is the same; only instead of a one-core CPU we have 2 cores. With 2 cores the work gets divided: earlier we got 50%, now it will be 25%. And if you want to apply the formula, it is 100/400 = 0.25. That is what we get. A couple more examples. What I have done now is keep everything the same with the 2-core CPU: the request processing time is 5 milliseconds, but the request inter-arrival time is now much smaller, 2 milliseconds. So what is happening here? We still have C mu = 400 requests per second.
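The two lossless CPU examples can be checked with a short script; a sketch, simply plugging the lecture's numbers into the min-formulas derived above:

```python
tau = 0.005            # request processing time: 5 ms
lam = 1 / 0.010        # inter-arrival time 10 ms -> 100 req/s
mu = 1 / tau           # service rate per core: 200 req/s

for c in (1, 2):
    X = min(lam, c * mu)           # throughput of the lossless system
    rho = min(1.0, lam * tau / c)  # utilization of each core
    print(f"{c} core(s): throughput {X:.0f} req/s, utilization {rho:.0%}")
# 1 core(s): throughput 100 req/s, utilization 50%
# 2 core(s): throughput 100 req/s, utilization 25%
```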
But lambda is now 1000/2 = 500 requests per second. So the throughput will now be 400, because lambda is greater than 400. The system cannot do 500; it will have to just stop at 400 requests per second, because that is all it can do. And both CPU cores will be 100% busy. One last example before we wrap up for today. Let us consider a network link of bandwidth 1 megabit per second, and suppose packets are arriving at this link at 75 packets per second, with an average packet size of 1500 bytes. We have to do a little calculation here to get all the units consistent. One way is to compute how many bits per second these arrivals amount to, which would be 75 multiplied by 1500 multiplied by 8. Instead, what we can do is calculate the average packet transmission time, and from that the service rate in packets per second. So first I calculate the average packet transmission time, which is the packet size in bits divided by the bandwidth in bits per second; 1 Mbps is 10^6 bits per second. This packet transmission time becomes the service time, so the service rate is mu = 1/tau, the reciprocal of that, and this calculates to 83.3 packets per second. Now 75 is less than 83.3, so the throughput is 75: this link really can carry the 75 packets per second offered to it. And what is the utilization? Compared to its capacity of 83.3 packets per second, it is getting 75 packets per second; that ratio is the fraction of time the link is going to be busy, which is 0.9, in other words 90%. So that is what this slide is saying. These were just some examples to show how all of today's calculations proceed, and what we are going to do next is look at the high- and low-load asymptotes of various metrics.
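The overloaded 2-core case and the link example, again in sketch form with the same numbers:

```python
# Overloaded 2-core CPU: tau = 5 ms per request, inter-arrival time 2 ms.
lam_cpu = 1 / 0.002                   # 500 req/s offered
cap_cpu = 2 * (1 / 0.005)             # capacity C*mu = 400 req/s
print(min(lam_cpu, cap_cpu))          # throughput capped at 400.0 req/s

# Network link: 1 Mbps bandwidth, 75 packets/s arriving, 1500-byte packets.
packet_bits = 1500 * 8                # 12,000 bits per packet
tau = packet_bits / 1e6               # transmission (service) time: 12 ms
mu = 1 / tau                          # service rate: ~83.3 packets/s
lam = 75.0

print(min(lam, mu))                   # throughput: 75.0 packets/s
print(min(1.0, lam * tau))            # utilization: 0.9, i.e. 90%
```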
This is because, as you saw on some of the slides, the four other metrics, queue length, number of customers in the system, waiting time, and response time, we are still not calculating. We will first start by calculating their asymptotes, and we will discuss this in the next lecture. Thank you.