Today we're going to see how to generate random variates in a simulation using Monte Carlo methods. As you know, in simulation we have to have a way to generate values from a particular random variable's distribution, and the way we do that is with Monte Carlo sampling. This numerical technique goes back pretty far. The name Monte Carlo was the code name applied to a top-secret piece of the Manhattan Project that involved these techniques. In fact, the increasing popularity of Monte Carlo methods is what spurred the development of the pseudo-random numbers and pseudo-random number generators that we talked about in the last lecture. The use of Monte Carlo sampling exploded so quickly that real random numbers, stored on magnetic tape or entered into the program in some other way, just weren't going to cut it anymore. That approach was too inefficient, so it was actually the popularity of Monte Carlo sampling that drove all the work that's been going on in pseudo-random number generators. We know that the word simulation is sometimes used interchangeably with Monte Carlo sampling, but in this course, and in this lecture in particular, I want to maintain the distinction between simulation and Monte Carlo methods. Simulation is experimenting with a model of some version of reality. Monte Carlo methods are a particular branch of mathematics: a numerical, algorithmic technique used to solve mathematical or statistical problems. The term Monte Carlo simulation really means distribution sampling. It uses repeated sampling from the distribution of interest to determine properties of some phenomenon, usually by summing or averaging the samples to get estimates of the distribution's parameters.
And as we know, simulation is a broader technique. Discrete event simulation in particular, which is what we're studying in this course, involves a time element, and that definitely distinguishes it from Monte Carlo sampling, which is a moment in time: a cross-section that doesn't move over time. A distribution is being sampled from repeatedly, but the distribution doesn't change. Sometimes the term Monte Carlo is used in the sense of employing randomness, or probabilistic activity, to solve problems that might be deterministic in principle. We might know the distribution, and therefore we can sample from it, but solving for the mean, the parameter of the distribution, using expected value may not be possible, or may not be easy. If the mean is intractable analytically, it's still very easy to get an estimate of it: sample from the distribution of interest many, many times, sum the values, and divide by the sample size, just like you would in any field test. We could call this a probabilistic, numerical approach to solving something that should be solvable analytically. In fact, one of the earliest uses of Monte Carlo methods was to empirically approximate the value of pi. Of course, since the advent of large computers, Monte Carlo methods have been tied to computer programs, unlike the historical Monte Carlo methods of, say, the 1700s and 1800s. Here are some examples that try to show the distinction between simulation, Monte Carlo methods, and Monte Carlo simulation. Simulation is said to be drawing one pseudo-random uniform variate on (0, 1) to simulate the tossing of a single coin. The value you get is obviously between zero and one; if it's less than or equal to 0.5, let's say the coin is heads, and if it's greater than 0.5, let's say it's tails.
This is definitely a simulation, and we've done things like this in class, but it's not a Monte Carlo simulation. An example of a Monte Carlo method might be to take a whole box of coins, pour them out, and figure out the ratio of heads to tails. That might be a Monte Carlo method of determining what happens in repeated coin tosses, but it's not a simulation; it's repeated sampling. A Monte Carlo simulation might be drawing a large number of pseudo-random variates from the uniform (0, 1) distribution, assigning values less than or equal to 0.5 as heads and values greater than 0.5 as tails, which is like repeatedly tossing the coin. Personally, as you know, I wouldn't call that a Monte Carlo simulation but rather Monte Carlo sampling, which we use in simulation all the time, but I got this from Wiki Wand and I wanted to be true to the source. All right, so let's learn about Monte Carlo methods. These are techniques we use when we want to generate data. What do we need for that? We need a random number generator, and we need the cumulative probability distribution for the distribution we want to generate values from. In this endeavor we assume that we have a source of random numbers that come to us uniform between zero and one. We've got this stream of uniform (0, 1) random numbers; it exists, it's usable, it's not going to break down. That's the assumption all of this is based on. Where do we get the random numbers from? A table of random digits, a roulette wheel, some kind of probabilistic activity, but more than likely today from a computer program, an app that generates these random numbers, which, as we know because we've already studied them, are really pseudo-random numbers. To illustrate Monte Carlo methods we're going to use the inverse transform technique. It's easy to understand and easy to demonstrate.
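The coin-tossing distinction above can be sketched in a few lines of Python. This is a minimal sketch, not from the lecture slides: a single call to the generator simulates one toss, and repeating it many times and looking at the proportion of heads is the Monte Carlo sampling part.

```python
import random

random.seed(42)  # fix the pseudo-random stream so the run is repeatable

def toss_coin():
    """Simulate one toss: u <= 0.5 is heads, otherwise tails."""
    u = random.random()  # a uniform (0, 1) pseudo-random number
    return "heads" if u <= 0.5 else "tails"

# Monte Carlo sampling: repeat the toss many times and estimate
# the probability of heads from the proportion observed.
n = 100_000
heads = sum(1 for _ in range(n) if toss_coin() == "heads")
print(heads / n)  # close to 0.5
```

A single call to `toss_coin()` is the simulation of one coin; the loop over 100,000 tosses is the repeated sampling.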
It may not be the best technique around, especially in terms of efficiency, but it definitely works, and it's easy to understand and easy to explain. First, you need the cumulative probability distribution function, like the graph you see on the slide in front of you. You use your random number generator to get a random number between zero and one; decide, for whatever reason suits your purposes, how many decimal places you want. Use a random number generator and apply a uniform (0, 1) distribution. You can see the blue lines on that graph. A random number between zero and one is a value on the y-axis. Wherever that lands, we draw a horizontal line over to the curve and then drop a vertical line down so we can pick out the value of x that corresponds to that uniform (0, 1) random number. A very simple, easy-to-understand process. The next step: that x is the sample value. Then you repeat over and over again until you have the sample size you're interested in. So here's how we would use the inverse transform method to sample values from the exponential distribution with a particular parameter. The cumulative exponential distribution is on the graph in front of you. Again, the vertical axis is F(x), in other words a number between zero and one; it's a cumulative probability distribution, so it's going to be a number between zero and one. The x-axis holds values from this exponential distribution. Maybe they're times; maybe they're interarrival times in your simulation. The formula for the cumulative distribution function of the exponential distribution is listed there: F(x) = 1 - e^(-λx). So how do we figure out x? Obviously, we don't want to do it with graph paper. First of all, we wouldn't be as accurate as we could be.
And secondly, well, we just don't want to use graph paper. We want to manipulate the formula to be able to solve for x; that's the objective. We enter this algorithm with a uniform (0, 1) random number, the output of the algorithm is a value of x, and then we go and do it again. The first step is the cumulative distribution for the exponential: u = 1 - e^(-λx), where u is the random number. The second step is to move the one to the other side. If F(x) is a number between zero and one, then 1 - F(x) is also a number between zero and one, so that's where the random number goes in. Now the random number is on the left side of the equation: 1 - u = e^(-λx), with the right-hand side pared down to just e raised to the power of negative lambda times x. Steps four and five are intermediate algebraic steps, taking the natural log of both sides and turning everything around, and in step six we solve for x: x = -ln(1 - u)/λ. Now we have something we can use repeatedly. We come in with a stream of random numbers between zero and one, and we want to generate a stream of exponential random values; each random number that comes in goes into that formula, and out comes, voilà, an x. This is to remind you of the relationship between the Poisson and the exponential distributions. The Poisson is a discrete distribution; the exponential is a continuous distribution. We run into these a lot in discrete event simulations. Suppose we know that on average four customers arrive at our system, whatever it is, per hour. That almost exactly fits the definition of a Poisson process: discrete events in a continuous interval. So this four must be lambda, the parameter of the Poisson process, which is not a function of time and doesn't change; it's a stationary Poisson process. Look at the units: if the units on lambda are customers per hour, that works for the Poisson process. And if lambda is four, then one over lambda is one quarter.
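The six algebraic steps above boil down to a one-line formula. Here is a minimal sketch of the resulting exponential sampler; the function name and the λ = 4 example value are my own choices, not from the slides.

```python
import math
import random

def exponential_variate(lam, u=None):
    """Inverse transform for the exponential distribution:
    solve u = 1 - exp(-lam * x) for x, i.e. x = -ln(1 - u) / lam."""
    if u is None:
        u = random.random()  # a uniform (0, 1) random number
    return -math.log(1.0 - u) / lam

random.seed(1)
lam = 4.0  # e.g. four arrivals per hour
sample = [exponential_variate(lam) for _ in range(100_000)]
print(sum(sample) / len(sample))  # sample mean, close to 1/lam = 0.25
```

Each incoming uniform random number produces one exponential value, and the sample mean comes out near 1/λ, the quarter-hour average interarrival time from the example.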
And the units on that are hours per arrival. So the average interarrival time is a quarter of an hour, 15 minutes per arrival. That would have to be the mean of the exponential distribution. That's why, especially in simulation where we use the Poisson and the exponential so much, we'll often use different notation for the two: we'll call lambda the mean of the Poisson distribution and mu the mean of the exponential distribution. But the notation is not uniform across sources, so you really have to look at the units and listen to the words; you have to know what's being talked about. In this next example, we're using the inverse transform method to sample from a distribution that's discrete: a customer arrival distribution. And it's an empirical distribution; in other words, we're just using the data as it is and calling that the distribution. We've collected some data and found out how it's distributed. We're not giving it a name, and we're not trying to fit it to any theoretical distribution; we're using it as is. In this case, customers arrive in 10-minute intervals. We have a probability of .40, a 40% chance, of getting zero customers in any 10-minute interval, a 25% chance of one customer, a 20% chance of two customers, and a 15% chance of three customers. That's the distribution: a discrete probability distribution where we enumerate the possible outcomes and the probability of each. And in this case, because we need it, we're also indicating the cumulative probability, where we just add each probability to the running total. The cumulative probability of zero is .40. The cumulative probability of one includes the probability of one and the probability of zero, so it's .40 plus .25, which is .65, and so on. The cumulative probability of three includes all of the other probabilities, and that's the totality: one.
With this discrete empirical distribution, the cumulative distribution looks like steps, and that's typically what you'll see. When you plot it out, the x-axis contains the x-value, the number of customers arriving, and the y-axis is the cumulative probability, so it's a number between zero and one. And we can do the same thing with the inverse transform method: generate a uniform (0, 1) random number, find it on the graph, draw a horizontal line until it bumps into the curve, then drop a vertical line to get the corresponding x-value. So for instance, suppose the random numbers we generated are .09, .54, and so on. For .09, anything between zero and .40 gives a zero, so that's zero customers in the first time period. In the second time period, .54 gives you a one. In the third, .42, slightly above .40, also gives you a one. In the fourth time period we get .80, and that matches an x-value of two: two customers arriving in that time period. And in the fifth time period we go back down to zero. Where the exponential looked rather complicated, this distribution, since it's discrete and fairly simple, doesn't present any intellectual challenges for us at all, at least I hope not. But we're going to look at something soon that will give us a challenge. All right, it's one of our favorite distributions: the normal distribution. We know so much about it; it should be easy as pie to sample from the normal distribution and generate normal variates. Let's see. First of all, before going any further, we're going to forget about the general normal distribution and stick to the standard normal, because it's been studied thoroughly: we don't need to derive any formulas, and we don't need to redo everything for every single pair of mu and sigma. The standard normal is usually called Z; Z is distributed as normal with a mean of zero and a standard deviation of one. Great.
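The step-function lookup just described can be sketched as a walk down the cumulative table. The probabilities and the test random numbers are the ones from the lecture; the function name is my own.

```python
def customers_arriving(u):
    """Inverse transform on the discrete empirical distribution:
    P(0)=.40, P(1)=.25, P(2)=.20, P(3)=.15
    -> cumulative .40, .65, .85, 1.00."""
    cdf = [(0.40, 0), (0.65, 1), (0.85, 2), (1.00, 3)]
    for cum_prob, x in cdf:
        if u <= cum_prob:  # first step the horizontal line bumps into
            return x

# The random numbers from the lecture example:
for u in (0.09, 0.54, 0.42, 0.80):
    print(u, "->", customers_arriving(u))
# 0.09 -> 0, 0.54 -> 1, 0.42 -> 1, 0.80 -> 2
```

Each uniform draw is compared against the cumulative column until it fits under a step, and the x-value of that step is the number of customers for that 10-minute interval.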
But of course, that's not really what we want to generate values from. Still, if we can generate the Z's, that's great, because we can take any Z and transform it back into an X, again solving for X with algebra. That's a side note, just to say that when we're talking about sampling from a normal distribution, we're really talking about the standard normal, because it's a very easy transformation to go back and forth between what we'll call X and Z. The probability density function for the standard normal distribution is given in the middle of the slide. It's rather complicated, but not that terrible: 1 over the square root of 2π, times e raised to the power of negative one half x squared. The formula for the cumulative probability distribution of the standard normal is more complicated, and it's not something we would really want to work with algebraically to solve for X; in fact, it's pretty frightening, with no closed form to invert. The graphical solution, I suppose, would be an answer, because the graph is a regular S curve, typical of symmetric distributions like the normal where the mean equals the median equals the mode. But again, the graphical approach is very inefficient and time-consuming, and it's hard to write a computer program to do it. So do we really want to use the inverse transform method here? Probably not. Thankfully there are a lot of other alternatives, and I'm going to show you one on the next couple of slides. There are different methods used for sampling from a normal distribution. Here I'm going to present one that's usually called, very simply, the sampling method, and you'll see why pretty soon. It's an easy method to explain and an interesting one to look at, though probably not one that's actually going to be used in practice, because it's not terribly efficient. We'll look at some of its deficiencies after I show you the technique. This method is based on the central limit theorem.
So recall that the central limit theorem says that sums or averages of distributions that are not normal, especially if they're symmetric, very quickly come to look like a normal distribution as the number of variables you're summing together gets larger and larger. So look at this example. We're looking at a very, very simple probability distribution: tossing a die. When you toss a die, the outcome, the number on the face of the die, is either a one, two, three, four, five, or six, all equally likely, so the probability of any one of those outcomes is one sixth. That's a typical rectangular distribution; it's a uniform distribution. But what happens if we toss the die twice and the outcome we're looking at is the sum or the average of the two tosses? Or if we toss the die three times, or four times, and so on? Well, in the first case, with two tosses, you end up with a triangular distribution. And a triangular distribution is not that much different from the normal: it needs some smoothing out, but it's certainly symmetric, with the mean equal to the median equal to the mode, and so on. And as you toss the die more and more times and accumulate the sums over that number of tosses, the distribution does in fact start to look very much like a bell-shaped curve. Now, as you know, it's not enough just to know that you can create an approximately normal distribution by summing samples from, let's say, a uniform distribution, nice as that is to know. You also have to know what the mean and the standard deviation are, because what we really want is the standard normal distribution. So we at least have to be able to figure out the parameters of the normal distribution we're counting on having created by summing values generated from a uniform (0, 1) distribution.
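The die-tossing illustration can be sketched as follows. This is a rough demonstration with sample sizes chosen arbitrarily: the mean of the sum grows as 3.5 per toss, and a histogram of the sums for 10 tosses would already show the bell shape emerging.

```python
import random

random.seed(0)

def sum_of_dice(n_tosses):
    """Sum the faces of n_tosses independent tosses of a fair die."""
    return sum(random.randint(1, 6) for _ in range(n_tosses))

# One toss gives the flat (rectangular) distribution; two tosses give
# a triangular one; more tosses pile probability up into a bell shape.
n = 50_000
means = {}
for tosses in (1, 2, 10):
    sums = [sum_of_dice(tosses) for _ in range(n)]
    means[tosses] = sum(sums) / n
print(means)  # each mean is close to 3.5 * tosses
```

Tabulating the `sums` lists into frequency counts would reproduce the shapes described on the slide.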
Well, as far as the mean goes, for any uniform distribution from a to b, we know that the mean is one half of (a + b). So for the specific case of the uniform (0, 1), the expected value, the mean, is one half of (0 + 1), or 0.5. We know that the expected value of a sum of two random variables is the sum of the expected values: E(X + Y) = E(X) + E(Y). And by extension, if we have more than two random variables, we can add their expected values as well. So the method goes like this: what happens if we have 12? For one random number from a uniform (0, 1) distribution, the mean is 0.5. With 12 of them, it's one half plus one half plus one half, 12 times; in other words, the mean will be six. If we sample 12 times from a uniform (0, 1) distribution and sum, we'll have something approaching a normal distribution, an approximation of a normal distribution, with a mean of six. Now what about the variance? Using the same tools we used for the mean, the variance of a uniform (0, 1) distribution reduces to one twelfth. When you want the variance of a sum of independent random variables, you can sum the variances, just like we did with the mean, and that's a nice feature. So for two it would be two times the variance; for 12 it would be 12 times the variance. That's one good reason for using 12: 12 times one twelfth is one, and if the variance is one, the standard deviation is one, and we're halfway to a standard normal distribution. Finally, here's where we draw it all together: the actual process for using this sampling method to generate values from a normal distribution. The first thing we do is start with our typical uniform (0, 1) generator. We get 12 of those, add them up, and then subtract the number six. Why? Because we need the mean to be zero.
The variance is already one and the standard deviation is already one. So once we're finished with step two, we have a value generated from the Z distribution, the standard normal, with a mean of zero and a standard deviation of one. Step three is then simply using the transformation to go from a Z to an X, using whatever parameters of the normal distribution we're actually interested in, the one we started out with. And that's it; it's done. It's very, very simple. It's not great, but at least it's simple, easy to do, easy to understand. Okay, so what's not so great about it? Well, the fact that we're summing exactly 12 random numbers to get one value from a normal distribution. Even if it's a decent approximation, we're limiting the values it can take on, and sometimes we do want to sample from the very low probability values in the extremes; this method won't work very well for that. And in addition, I mean, really, 12 random numbers for every normal random variate? We're going to run through our random number generator's period very, very quickly, so we'd also need a pseudo-random number generator that doesn't start cycling too soon. What do we use all of this for in simulation? We use it to generate the values that are used as input to the simulation. If we have customers coming into a bank, we need to generate interarrival times; if we have a machine shop, we need to generate the time until a machine goes out of service, for whatever reason. These are the inputs to the simulation. This all assumes that we know the distribution we want to sample from, and that's a whole other lecture, and it's what's coming up next.
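Putting the steps together, here is a minimal sketch of the sampling method; the function names are my own, and the mean and variance checks at the end just confirm the arithmetic from the slides.

```python
import random

random.seed(7)

def standard_normal_variate():
    """Sum 12 uniform (0, 1) draws and subtract 6:
    mean 12 * 0.5 - 6 = 0, variance 12 * (1/12) = 1."""
    return sum(random.random() for _ in range(12)) - 6.0

def normal_variate(mu, sigma):
    """Step three: transform the Z back into an X for the
    normal distribution we actually want."""
    return mu + sigma * standard_normal_variate()

zs = [standard_normal_variate() for _ in range(100_000)]
mean = sum(zs) / len(zs)
var = sum(z * z for z in zs) / len(zs) - mean * mean
print(round(mean, 3), round(var, 3))  # close to 0 and 1
```

Note how the deficiency discussed above shows up directly in the code: every `standard_normal_variate()` can only land between -6 and +6, so the extreme tails are simply unreachable, and each call burns 12 numbers from the generator's stream.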