Hello and welcome to this lecture on introduction to computer and network performance analysis using queuing systems. I am Professor Varsha Apte, a faculty member in the Department of Computer Science and Engineering, IIT Bombay. In the previous lecture, we discussed shared resources and queues at length. We saw that delays are essentially a result of contention for shared resources, and that resources in computers and networks include things like network links, web server threads, and CPUs. We also made a list of pairs of resources and their user or customer entities. Remember that the user here is the entity to which the resource is directly given. We humans are of course the original users of everything, but when we say "user" we mean the direct entity that is scheduled, that is, the schedulable entity. For the CPU it is threads, for cellular network channels it is calls, for the wireless medium it is frames, for a link it is packets, and so on; we went through all of this in the previous lecture. We left off discussing performance, which we defined in the last lecture as how well something functions, given that it is functioning. We ended by asking: can we be a little more specific? Can we say something more about these resources and users than just "it is slow" or "it is fast"? So again, this is a reminder to have pen and paper ready, because it will be good if you also write some thoughts down as I ask questions. Before we get into more parameters and metrics, let us define these terms very clearly. A metric is a quantitative attribute by which we describe some quality of the system: a quantity for describing a quality. For example, you might say that a person is very fast; that is just a qualitative attribute. What is the metric?
The metric would be that this person runs 1 kilometer in, say, 5 minutes or 3 minutes. Or if you say a person is very strong, that is a qualitative attribute; the metric would be how many kilograms they can lift: 50 kg, 100 kg, whatever bodybuilders lift. So a metric is a quantitative measure for describing some quality. Metrics can be for anything; I just gave you a metric for a person being fast and for a person being strong. Recall from the previous lecture that a metric describing an efficient car could be how many liters of petrol it consumes, or how many kilometers it gives per liter of petrol. That is a very concrete quantitative metric for an efficient car. The metrics we will talk about are of course for computer and network performance, which is why we call them performance metrics. There can be reliability metrics too; recall reliability or dependability from the previous lecture: how many times a car broke down in the last 5 years is a reliability metric. But in this class we will talk only about performance metrics. Now, what is a parameter? A parameter is something about the system, about the thing whose metric you are talking about, that affects the metric. Recall that in the previous lecture we talked about the speed of the resource, the number of resources, and the workload intensity; the quantitative versions of these are parameters. So a parameter quantifies an influencing factor, a metric-influencing factor. The metric is going to be a function of the parameter. The graph on the slide is just a general illustration; it is not any particular metric or any particular parameter.
But it shows the relationship we will generally plot: a metric as a function of a parameter. In this case, whatever the metric is, it increases as the parameter increases. So let us talk about performance metrics in computer systems. In general, there are two types of metrics: system metrics and user-perceived metrics. What are user-perceived metrics? These are easy to think about: they are metrics that are visible to, felt by, and of interest to the user of the resource. For example, the waiting time is something felt by the user of the resource. In all the earlier examples outside of computer systems and networks, how much time you wait at an escalator or in a queue is something a person actually experiences. Similarly, in networks and systems, the waiting time is experienced by a packet. Now, a packet is not a person; the waiting time is something a network administrator will have to measure. But it is a metric that belongs to the packet, and it is indirectly experienced by the human who sent the packet. Similarly, "the waiting room is full" means the packets arrived at the network buffer but were dropped; they did not even have space to queue. If the packet were a person, it would know that it was dropped; at the very least it is something the packet has experienced, and the sender of the packet may note that it was dropped. One way or the other, it concerns the user. A system metric, on the other hand, is something that is visible to and of interest to the owner or operator of the resource, or to the resource herself or himself, if the resource is a person.
Take the example of the poll booth worker, who is a person: how busy they were, whether they even got a break for lunch, is something experienced by the resource itself. If the resource is a computer or a network link, then how busy it is, which is called the resource utilization, is a system metric, as are throughput measures: how many threads per second a CPU is processing, how many packets per second a link is transferring. All of this is something the owner of the link or the owner of the CPU will know. The person who submits a job to the CPU, or who sends packets over the network link, will not know; they are done once their packet goes over the link. Again with the poll booth example: the voter who queues at the poll booth will not know how many persons per hour came through that voting station today. They only know that they had to wait for 2 hours. The poll booth operator will know how many persons per hour came through. So these are system and user metrics, and in fact they are often in conflict with each other. System owners want their resources to be well utilized. Imagine, for example, that somebody installs 2 or 3 escalators and they are not being used: it means you did not properly estimate the number of people coming to that station or mall, you put in too many escalators, and you spent a lot of money. That is not good. So system operators want their resources to be highly utilized, whereas users would like the resources to be idle so that they do not have much waiting. There is a bit of a conflict between what a system owner wants and what a user wants. Now coming to parameters, there are also 2 types: system and workload. Here the distinction is much more straightforward.
As we discussed earlier, a resource has a speed: how many packets per second can this link carry, what is the bits-per-second bandwidth of the link? And there is a count: how many links are there, how many CPUs, how many poll workers? All of these are system parameters, and we have already discussed that they will impact the delay, and performance in general. The sharing policy, which we also discussed previously, is definitely a factor. It is actually a qualitative factor, hard to quantify, but it is a parameter nonetheless. These are all characteristics of the resource or of the system. Independently of these, the performance will also be influenced by the workload parameters. As discussed earlier, workload parameters include the arrival rate: how many packets per second, how many jobs per second are coming to this resource. And there is the amount of work each user brings to the system. At a ticket vendor, one person may just want to buy one ticket quickly and go, while the next person may spend a lot of time choosing seats and so on. Those are different amounts of work brought to the ticket vendor. So how many units of work each customer brings to the resource will also determine the metric. But it is a workload parameter: it is a characteristic of the user, not of the system. So for the instances we saw, for example a network link with the packet as the user, can we name some metrics? Go ahead and write some in your notebook. For the link there can be the packet waiting time, which is in fact called the queuing delay: how much time did the packet spend in the queue? There can be the packet drop probability. These are user metrics. Then there is the link utilization: what fraction of time was the link busy?
That is a system metric. For the web server system, there is the request response time: how long did it take for the request to finish? That is one metric. Then we can have the average number of busy threads as another metric. What could be a parameter? A parameter could be the rate of arrival of requests. So let us look at this whole table of various metrics and parameters; I hope you took some time to think about these things yourself. Start with the CPU and its metrics. What are we interested in? The resource owner will think in terms of jobs per second done on the CPU. Say it is a high-performance computing system; in those kinds of systems, large jobs are submitted for processing. If you are the owner of that kind of system, you will want to be processing lots of jobs, because maybe you have rented it out to users and you make more profit if more jobs per second are done by the CPU. As a system owner, you want these things to be high. You also want the CPU utilization, meaning the percentage of time it is busy, to be high; otherwise it is a wasted resource. From the user's point of view, of course, you would love it if nobody else were using the CPU and it were only for you, because you are interested in the job completion time. What will affect these metrics? The speed of the CPU and the number of cores, which are system parameters: the more cores and the higher the speed, the more jobs per second you can do. There are other rather technical things like cache size, and of course the scheduling policy, which is very important for some of these metrics. We know, for example, that some scheduling policies are better for job completion times and others are worse.
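As a small aside, the utilization and throughput metrics just mentioned are connected by a simple back-of-the-envelope relation that we can already sketch in a few lines of code. This is not from the slides; it is a standard rule of thumb (sometimes called the utilization law), and the numbers below are invented purely for illustration.

```python
# Back-of-the-envelope sketch (illustrative, not from the lecture):
# if jobs arrive at some rate (jobs/second) and each job keeps the
# resource busy for some average service time (seconds), the long-run
# utilization is roughly their product, provided it stays below 1
# (otherwise the queue grows without bound).

def utilization(arrival_rate, mean_service_time):
    """Fraction of time the resource is busy."""
    u = arrival_rate * mean_service_time
    if u >= 1.0:
        raise ValueError("arrival rate exceeds capacity: queue is unstable")
    return u

# Hypothetical example: 40 requests/second, each needing 20 ms of CPU.
print(utilization(40, 0.020))  # the CPU would be busy about 80% of the time
```

Notice how a workload parameter (arrival rate) and a system-dependent quantity (service time) together determine a system metric (utilization), exactly the metric-as-a-function-of-parameters idea from earlier.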
You can recall your undergraduate operating systems material; we will discuss this in later lectures. Then there are the workload parameters: how many instructions each job brings to the CPU, and of course the job arrival rate itself. For cellular channels, the system operator, the cellular network owner, will be interested in, say, the average number of busy channels in a day between 11 am and 12 noon. Then there is the call volume: the calls carried per second, how many calls am I successfully completing every hour, or every second. The user will be interested in the blocking probability; blocking here means "network busy". What is the probability that I dialed and got "network busy"? The parameters will of course include the number of channels: if there are fewer channels, you cannot carry that many calls, because many people will get "network busy". That is a system parameter. From the user's side, how long users go on talking on the phone will directly impact how many calls per hour you can carry and complete successfully. So much for cellular channels. Web server threads and the wireless medium are all very similar. For the web server, you will be interested in requests completed per second, the average number of busy threads, the request response time, and the connection refusal rate; those are the metrics. The parameters will be the number of threads, the scheduling policy, the arrival rate, and how much time each request takes. Similarly for the wireless medium, there is the throughput, which we loosely call the bandwidth: how many bits per second is this Wi-Fi channel actually carrying successfully? That is a very important metric from the system owner's point of view.
Then, what fraction of the time was the medium actually being used? Otherwise, like I said, is it a waste? There is also the collision rate: recall from your wireless networking material that collisions can happen on Wi-Fi. From the user's point of view, again, it is delays, and sometimes also how variable the delay is; delay variability is also a metric of interest, and maybe the collision probability matters for the user as well. The parameters will be the data rate, which we call the bandwidth: if the bandwidth is higher, the frame transmission delay will be smaller. Then, very importantly, there are the protocol rules. Like I said, whenever there is a shared resource, there is a sharing policy, and in networking these policies are determined by protocols. For the wireless medium, there is an access protocol: how are you going to access the medium? If many laptops are using the same medium, there needs to be a way to share it. That is itself a qualitative parameter. And then again we have the frame arrival rate and the frame size. In this table, by the way, the hash sign (#) stands for "number of": number of instructions, number of cores, number of threads, and so on. Now, how do we calculate all of these metrics? It seems fairly dizzying, with all these kinds of metrics. But before we get into how, we may want to ask why we should calculate all of these metrics. Everything should have a purpose, and you should never believe that something should be done just because I say so in this lecture. So: how do we calculate? Why should we calculate? Why is it important that we know all of these metrics? And finally, what is all of this that we are doing? Before we answer how and why, let me define what this is.
This material, just like the name of the course, is called performance analysis. What is performance analysis? It is a process by which we methodically analyze the performance of a computing or networking system. And what do I mean by "analyze the performance"? Remember I said the metric is a function of some parameter; what we want to do, in essence, is determine this function. We want to determine the relationship of the performance metric to a system or workload parameter. Now that we know what we are talking about, the next question has to be: why? Why do we need to do this? There are many reasons. One is sizing. What is sizing? Sizing means determining the number and speed of the resources that you need. A cellular operator, for example, may want to know how many channels to provision in a particular cell. For that, they may need to estimate how many calls per second they need to carry successfully, or they may need to estimate: if I provide only, say, 10 channels in this cell, what will be my blocking probability? How frequently will I be giving "network busy" to my users? If you calculate that and it is too high, you can conclude that 10 channels are not enough. Secondly, setting configuration parameters. What is a configuration parameter? One example is the number of threads a web server should have; this is typically set in a configuration file on the server. The same reasoning applies: I should not have too few threads, so that requests do not queue up.
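Staying with that cellular sizing question for a moment: the lecture has not yet derived any formula for the blocking probability, but for the curious, the standard closed-form answer (under assumptions we will meet later in the course, such as Poisson call arrivals and blocked calls being lost rather than queued) is the Erlang B formula. Here is a minimal sketch of it; the traffic numbers in the example are invented.

```python
# Erlang B sketch (assumes Poisson arrivals, blocked calls lost):
# blocking probability for a cell with c channels and offered load
# a = (call arrival rate) * (mean call holding time), in erlangs.
# Uses the standard numerically stable recurrence:
#   B(0) = 1,  B(k) = a*B(k-1) / (k + a*B(k-1))

def erlang_b(offered_load, channels):
    b = 1.0
    for k in range(1, channels + 1):
        b = offered_load * b / (k + offered_load * b)
    return b

# Hypothetical sizing question: with 10 channels and an offered load of
# 5 erlangs (say, 5 calls/minute with a 1-minute mean holding time),
# what fraction of callers hear "network busy"?
print(erlang_b(5.0, 10))  # a little under 2% of calls are blocked
```

If 2% blocking is too high for the operator's target, the same function tells you how many channels to add until the blocking probability drops below the target. This is exactly the metric-as-a-function-of-parameter relationship being put to work for sizing.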
I should also not have too many: I should not have, say, 50,000 threads on a 4-core machine, because as you all know from your undergraduate operating systems, too many threads can cause what is called thrashing. So we need to set the configuration parameter carefully. Third, choosing architectural alternatives. You might have to design a network, say a campus network; whoever designs it needs to know what packets per second or bits per second the network should carry, and what the switch and link architecture should be (remember, we saw a picture of a network and link architecture) so that the performance is good. Then, comparing resource allocation or scheduling algorithms. This is very important. Say I have a CPU core and many threads and processes using it. Again from your undergraduate operating systems, you learned that there are many different scheduling policies: shortest job first, round robin, and so on. These will result in different delays, and sometimes in different rates at which jobs are completed. I cannot just propose some new policy without doing a performance analysis of it. If I have a new idea for a scheduling policy, I cannot say "I think this is going to be good, please use it"; I have to do a very formal performance analysis. Determining bottlenecks is another very interesting reason. When multiple types of resources are used by a user to complete a job, different resources will see different levels of use.
So, in the network example, say I have many switches and links, but there is one switch to which many links connect, or one link on which a lot of the packets arrive, and that switch or link becomes the bottleneck. Then I need to know that it is the bottleneck, so that maybe I can add extra links there, or put in another link with higher bandwidth. If I am the owner of this whole network, I need to know what my bottleneck is so that I can fix it. Finally, I may need performance analysis to give service guarantees. What are service guarantees? Suppose I am creating a website and I promise my customers that the loading time will be less than a second, or less than half a second. Suppose my customer is another business that needs things to be very fast, and I have a contract with that business saying their web page should load within a second. I need to analyze my system before I can be sure that I am delivering that quality. So determining whether service guarantees will be met is another reason. Now, how do we do this? There are many ways. One is to simply measure: if the resource you are analyzing is already operating, you can use monitoring logs, or you can have probes and programs that measure the performance of the actual service, or you can set it up in a testbed.
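Before moving on, the bottleneck idea above can be made concrete with a tiny sketch. The rule of thumb (not derived in this lecture, and the resource names and numbers below are entirely made up) is: if you know how many seconds of busy time one user request demands from each resource, the resource with the largest demand per request saturates first, and it caps the whole system's throughput.

```python
# Hypothetical per-request service demands, in seconds of busy time
# that one user request imposes on each resource. The resource with
# the highest demand is the bottleneck: no matter how many requests
# arrive, throughput cannot exceed 1 / max(demand) requests/second.

demands = {
    "CPU": 0.010,      # 10 ms of CPU time per request (invented)
    "disk": 0.025,     # 25 ms of disk time per request (invented)
    "uplink": 0.040,   # 40 ms of link time per request (invented)
}

bottleneck = max(demands, key=demands.get)
max_throughput = 1.0 / demands[bottleneck]
print(bottleneck, max_throughput)
```

In this made-up example the uplink caps the system at 25 requests per second, so upgrading the CPU or disk would not help; the network owner should fix the uplink first, which is exactly the "find the bottleneck, then fix it" reasoning from the lecture.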
Say you are a network operator: you could set up a test network on your own premises, with laptops and devices on it that generate a lot of artificial load, lots of downloads and uploads, and while they do that, you measure the performance using some software. You do a lot of measurement and analysis, and that is one method. The other two methods fall in the category of modeling. What is modeling? Modeling is what you do when you do not have the actual system to measure, or when it is not possible to measure it. You create an artificial representation of the actual system. There are two ways to model anything: simulation and analysis. In simulation, you write a computer program that represents the entities within the real system. Switches, links, and packets are represented by data structures, and the program logic behaves like packets going from one switch to another, and so on. Then you collect statistics, and you are able to estimate the performance; when you model, you can only estimate. That is one way. The other way is to use mathematical reasoning: probabilistic and mathematical calculations with pen and paper, to derive results about the system. Now, there is an advantage-disadvantage relationship between these methods. One of the biggest advantages of analytical modeling is that it can give you insight, because it can result in formulae, and there is nothing like a little formula to give you deep insight into the fundamental behavior of a metric with respect to a parameter. So analysis gives more insight, but it needs more expertise: you will have to do a course like this one before you can analyze these things mathematically.
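To make the simulation idea from a moment ago concrete, here is a minimal sketch (not any particular simulator from the lecture, just an illustration): one resource with one queue, where the program plays out random packet arrivals and service times and collects the waiting-time statistic. The rates used in the example at the bottom are invented.

```python
import random

# Minimal single-queue simulation sketch (illustrative): packets arrive
# at random times, each needs a random amount of service from one
# resource (say, a link), and a packet must wait until the resource has
# finished serving everything that arrived before it.

def simulate_queue(n_packets, arrival_rate, service_rate, seed=1):
    random.seed(seed)
    clock = 0.0        # arrival time of the current packet
    free_at = 0.0      # time at which the resource next becomes free
    total_wait = 0.0
    for _ in range(n_packets):
        clock += random.expovariate(arrival_rate)   # next arrival
        wait = max(0.0, free_at - clock)            # time spent queuing
        total_wait += wait
        free_at = clock + wait + random.expovariate(service_rate)
    return total_wait / n_packets                   # average queuing delay

# A busy link (9 arrivals/s against 10 services/s, utilization 0.9)
# shows much longer average waits than a lightly loaded one (0.5):
print(simulate_queue(100_000, 9.0, 10.0), simulate_queue(100_000, 5.0, 10.0))
```

Note that the simulation only gives a statistical estimate, and we had to run 100,000 packets to get a stable number; this is precisely the trade-off against the pen-and-paper approach discussed next.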
But once the expertise is there, you can get very quick answers: sometimes it is just a formula, you plug in some numbers, and you get the answer. Measurement, in contrast, will rarely give you deep insight directly; you have to plot a lot of graphs, and only then will some insight emerge. You do get insight from measurement, of course, but it is harder to derive than from a formula. Simulation is somewhere in the middle. On the other hand, measurement gives you maximum flexibility, in a way, because you can set up a very complicated infrastructure and measure it, if cost is not a problem; you get flexibility, but there can be a lot of cost too. And why is there less flexibility with analytical models? Because analytical models are mathematical, they need a lot of assumptions, and because mathematical models need to be simple, they will be less realistic, whereas measurement is more realistic. Those are the trade-offs of these methods. In this course, our focus is on the last method: analysis with pen and paper, with mathematics and reasoning. Just a couple of closing thoughts. Suppose we have to use mathematical models to estimate these metrics in terms of these parameters. You may wonder: do we have to do this separately for each of these things, and for the hundreds of other resources that are also there in each system? There are different scheduling policies and different metrics; do I have to go through the mathematics for each of these systems? That seems very daunting. How do we carry out this analysis? As it turns out, you do not have to do this.
If you stare at this table and these metrics long enough, you will realize that there is a lot of similarity, in essence, between all of these things. Looking at the metrics: I have jobs per second, calls carried per second, requests completed per second, successful bits per second. In terms of user metrics, I have job completion time, request response time, waiting time, transmission delay; here I have a collision probability, there a blocking probability. In terms of parameters, I have a job arrival rate and a call arrival rate; I have holding time and processing time; this arrival rate plays the same role as that arrival rate; this number of threads corresponds to this number of channels and to this number of cores. So there is something similar here; it is not that every system is different. The insight that allows an analytical approach is that, in essence, all these metrics and parameters are similar, and we can generalize this view. Which brings us to the last slide: this generalized view is what is called a queuing system. It is a universal mathematical model for representing queuing for shared resources and for analyzing the performance of such systems. I will explain what these diagrams mean in the upcoming lectures. From the next lecture, I will introduce queuing systems, and the rest of the course is largely about using queuing systems to understand and carry out performance analysis. Thank you.