We have been looking at preliminaries for pretty much the last three classes, and the last thing we saw was discrete-time signal processing; that is what we will do this class. My notation for discrete-time signals will be x[n], square brackets, with n an integer index running from minus infinity to infinity, so x[n] is a sequence of possibly complex numbers. As in the continuous-time case, we will mostly be concerned with finite-energy sequences. What does finite energy mean in discrete time? The sum over all n of |x[n]|^2 should be finite. Occasionally we will look at bounded sequences, but typically only finite-energy ones. You should already be familiar with much of what follows, so we will go through it a little fast.

The first thing to recall is discrete-time convolution. With a slight abuse of notation, very similar to the continuous-time case, I will call y[n] the convolution of x[n] and h[n], defined as y[n] = sum over all m of x[m] h[n-m]; you can also write it the other way, sum over m of h[m] x[n-m]. Where is convolution relevant? For a linear time-invariant (LTI) system in discrete time, you can again show that the output is the input sequence convolved with the impulse response sequence h[n].

There are two transforms we will deal with in the discrete-time case: one is the DTFT, the discrete-time Fourier transform, and the other is the Z-transform, which I will denote ZT. The definitions are quite easy. The DTFT is denoted X(e^{j omega}), and even before going further, notice that this notation means something. Why not just write X(omega)? Writing X as a function of e^{j omega} implies that everything depends on omega only through e^{j omega}, and e^{j omega} is a very specific kind of function with a lot of properties; in particular, it is periodic in omega with period 2 pi. So whatever we build out of it will also be periodic with period 2 pi. I could just as well write X(omega) and think of it as 2 pi-periodic, but writing it this way reminds me every time that the function has to be periodic with period 2 pi. That is one of the main features of the DTFT, and it is why people typically worry only about a single 2 pi interval in omega, say 0 to 2 pi or minus pi to pi; any interval of length 2 pi is fine, as long as you keep track of where your high and low frequencies are. The definition is

    X(e^{j omega}) = sum over all n of x[n] e^{-j n omega}.

Once again it is just a series summation, nothing more than that; there are convergence questions, and based on those you can show a lot of useful properties. A couple of things to note. This omega is a dimensionless frequency parameter: we think of it as a frequency, but it is just a variable. For physical systems, you think of it as frequency times the sampling interval, omega = 2 pi f T, where capital T is the sampling time used on the continuous-time signal. That is how you go back to the physical nature of these signals; but purely as an operation on sequences, the sum is perfectly well defined: you get a function of omega out, and you can study its convergence. Assuming convergence holds in a suitable sense, there is also an inverse relationship:

    x[n] = (1/(2 pi)) * integral from -pi to pi of X(e^{j omega}) e^{j n omega} d omega,

where you may substitute any interval of length 2 pi. This is quite a fancy integral, and if you are not familiar with complex-valued integrals it can be misleading; but you learn enough in DSP, enough transform pairs and properties, to do most conversions quickly without really knowing the theory of complex integrals. Once in a while that will fail you, but as long as you deal with finite-energy signals it is good enough, and it is good enough for us as well.

So much for the DTFT. The Z-transform has a similar definition, but it is a little more general and more powerful:

    X(z) = sum over n from minus infinity to infinity of x[n] z^{-n},

where z is a variable denoting any complex number. You can see this is a generalization of the DTFT: as was pointed out, restrict the Z-transform to the set of complex numbers with magnitude 1, z = e^{j omega}, the unit circle as you call it, and you get the DTFT. The periodicity now also makes sense: as you keep changing omega, you obviously come back over and over to the same points on the circle. I am sure you have seen plenty of physical motivation for what the complex frequency z stands for, what the real part means, what the imaginary part means, and so on; I will assume you have that intuition, and we will use it once in a while.

The object we will look at most frequently in discrete time is the LTI system, or linear filter; a filter, for us, will always be linear. It is characterized typically by its impulse response h[n], or I might replace that with one of two quantities: the discrete-time Fourier transform H(e^{j omega}) whenever it exists, or H(z) whenever it exists. One thing to be careful about with H(z): there is one more thing you have to specify. X(z) will converge, in fact it always converges, but only for a certain range of values of z; in many cases that range is a ring: outside a circle, inside a circle, or between two circles. That is how the region of convergence (ROC) typically looks, so it will be specified along with X(z); once you specify both, you can go back to x[n] as well. There is a whole bunch of Z-transform pairs, and you can invert any relevant one. I am not going to say much about the ROC here; once in a while I will use those notions, so if you have forgotten what the ROC is, go back, read its definition, and figure out how to find ROCs; at least for the rational case it is quite easy. A couple of names are also important: I will call H(e^{j omega}) the frequency response and H(z) the transfer function; those are names from system theory that we will use as well.

From an implementation point of view, you would typically want h[n] to be causal and stable. What does causal mean? h[n] = 0 for n negative. But that is too restrictive, and in digital implementations it is really, really restrictive, because you can store samples; it does not make sense to restrict yourself to causal sequences. So we will restrict ourselves to what are called right-sided sequences: h[n] is zero before some point in time, for all sufficiently negative indices. How will you implement such filters in practice? Simply delay: shift to the right as far as you need, and implement a causal version. What then happens to your output? Can it change drastically?
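(Hold that question for a moment. The convolution and DTFT definitions above are easy to sanity-check numerically; here is a minimal sketch, assuming numpy is available, with made-up example sequences. It verifies both the convolution theorem, that the DTFT of x * h is the product of the DTFTs, and the 2 pi periodicity we just discussed.)

```python
import numpy as np

def dtft(x, n0, omega):
    """DTFT of a finite sequence x starting at index n0, evaluated at
    the frequencies in omega: X(e^{jw}) = sum_n x[n] e^{-j n w}."""
    n = np.arange(n0, n0 + len(x))
    return np.array([np.sum(x * np.exp(-1j * n * w)) for w in omega])

x = np.array([1.0, 2.0, 3.0])   # x[0..2], made-up example
h = np.array([1.0, -1.0])       # h[0..1], made-up example
y = np.convolve(x, h)           # y[n] = sum_m x[m] h[n-m]

omega = np.linspace(-np.pi, np.pi, 64)
X, H, Y = dtft(x, 0, omega), dtft(h, 0, omega), dtft(y, 0, omega)

# Convolution in time is multiplication of DTFTs ...
assert np.allclose(Y, X * H)
# ... and the DTFT is periodic in omega with period 2*pi.
assert np.allclose(dtft(x, 0, omega + 2 * np.pi), X)
```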
Right: nothing, it is just a delay. You delay your output as well, you wait a little bit longer, and you get exactly the response you want; it is an LTI system, so a time shift costs you nothing but waiting. With that cost in mind, we will in general take h[n] to be a right-sided sequence. What is the rigorous definition? h[n] = 0 for n less than or equal to M, for some integer M; typically we think of M as negative, but this is the general statement. There are corresponding restrictions on the region of convergence: for causal sequences, you can show the ROC is the region outside a certain circle, all the way out to and including infinity; for right-sided sequences it is the same, except that you cannot include infinity. There is no other difference: you take the largest pole, and everything outside it is the region of convergence. In particular, if all your poles are inside the unit circle, what can you say?

Your system is going to be stable as well: bounded-input bounded-output stability; your transfer function is stable, your impulse response is stable. So we will mostly be interested in causal and stable filters, and rather than stretch "causal" to mean right-sided and cause more unnecessary confusion, let me say it plainly: right-sided and stable filters are what we will be interested in. What do I mean by stable? This is bounded-input bounded-output (BIBO) stability, and there are various ways of defining it. In the time domain the condition is that the impulse response is absolutely summable: sum over n of |h[n]| should be less than infinity. How do you define it in the Z-transform domain? All poles should be inside the unit circle, or equivalently the region of convergence should contain the unit circle, so that H(e^{j omega}) is defined; once the frequency response is defined, you know the filter is stable. That is how you go about defining it, and all those things are important. Once in a while we will also be worried about minimum phase. What is minimum phase? All poles and zeros of H(z) are inside the unit circle. There are various interpretations of minimum phase; one of them is that the inverse is also stable. Is that fine? When you invert, poles and zeros trade places, so the inverse again has all its poles and zeros inside the unit circle, and the inverse is also stable. There is also a notion of monic, which I just want to write down here rather than start a new page: monic and causal. What do monic and causal mean here?
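(As a quick numerical aside before defining those two terms: the stability and minimum-phase conditions above are easy to check for a rational H(z), since they are just statements about root locations. A minimal sketch, assuming numpy; the filter coefficients are a made-up example, not anything from the course.)

```python
import numpy as np

def is_stable(a):
    """Stability of a causal rational H(z) = B(z)/A(z): all poles
    (roots of the denominator) strictly inside the unit circle."""
    return bool(np.all(np.abs(np.roots(a)) < 1))

def is_minimum_phase(b, a):
    """Minimum phase: all poles AND all zeros inside the unit circle."""
    return is_stable(a) and bool(np.all(np.abs(np.roots(b)) < 1))

# Hypothetical example: H(z) = (1 - 0.5 z^-1) / (1 - 0.9 z^-1)
b = [1.0, -0.5]   # numerator coefficients (zero at z = 0.5)
a = [1.0, -0.9]   # denominator coefficients (pole at z = 0.9)
print(is_stable(a))            # True: the pole is inside the unit circle
print(is_minimum_phase(b, a))  # True: the zero is inside as well
```

Since inverting H(z) swaps numerator and denominator, the same check with b and a exchanged confirms the stability of the inverse, which is exactly the minimum-phase interpretation above.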
Causal here means h[0] is where the sequence starts, so the sequence is h[0], h[1], and so on; and h[0] = 1 is what makes it monic. That is the definition of monic: the leading coefficient is one. Why do we care? Usually, when you talk about the existence of certain filters, existence is guaranteed only up to scaling: you can multiply by some number and still get another valid filter. So when you want to talk about a unique object, you say monic, just to make it unique; it helps in specifying things. For instance, if I have a filter with a certain frequency response and I scale it by any number, I still get a filter with the same shape of frequency response; if you want to specify one filter, you say you take the monic version. It is not very important, but sometimes it makes a difference.

The next thing we will be worried about is the folded spectrum. This is quite important: very often we will be sampling continuous-time signals, and you should have a very clear understanding of how the spectrum translates. I have a continuous-time signal with a certain spectrum; what will be the DTFT of the sampled version? That is where the folded-spectrum notion comes in. Here is where we start: the signal q(t) is a Fourier transform pair with Q(f). Then I go ahead and sample it: I define a sequence q[n] = q(nT), the values at the time instants nT, where T is my sampling interval, so the sampling frequency is 1/T samples per second. One can show that the DTFT of q[n], which I will call Q~(e^{j omega}), works out to

    Q~(e^{j omega}) = (1/T) * sum over m from minus infinity to infinity of Q(f - m/T),   with omega = 2 pi f T.

Be very careful about where this omega comes from; I will relate it to the other quantities in a moment. This is called folding, and aliasing is another word people use to describe it: you take Q(f), shift it by all multiples of 1/T, add everything together, and scale by 1/T. The final 1/T scaling is not so crucial and does not change much; the shape of the spectrum is decided by the summation, and that aliasing sum is very, very important. To go back to the continuous-time case and relate everything, you have to think of omega as 2 pi f T, where f is the continuous frequency; if you are really worried about mapping back to the continuous-time domain, that is how the continuous frequency and the discrete-time frequency are related. Equivalently you can write omega = 2 pi f / f_s, so f / f_s is a kind of normalized frequency in the discrete-time domain, and we will usually deal with that. Notice one thing: the right-hand side is periodic in f with period 1/T, and that is exactly what produces the periodicity of Q~(e^{j omega}); that is something we wanted, and it is happening.

Now, if you do not want any aliasing, if you want Q~(e^{j omega}) to contain an accurate copy of Q(f), what should happen necessarily? Q(f) should vanish outside a band of width 1/T, that is, outside |f| < 1/(2T). So you need 1/T to be large enough; this is called the no-aliasing condition: you have to sample fast enough, with 1/T greater than 2 times the bandwidth of q(t). Here you always think of q(t) as a baseband signal, and you define the bandwidth as the largest positive frequency it contains; as long as you sample at more than twice that, you get it. There is also a notion of bandpass sampling: q(t) need not be a baseband signal. As long as it occupies a small enough band somewhere in frequency, you can sample at a carefully chosen rate so that when it aliases, one of the shifted copies lands cleanly at baseband, exactly preserving what was there. You have to filter carefully to that band first, and then if the band itself is narrow enough, the sampled version comes out right automatically. I am hoping you have read about this; it is not too relevant for us, but it is an interesting thing to think about, and you should know it exists. Another thing I wanted to talk about here is spectral factorization, but I do not think this is a good point to introduce it; I will introduce it later when we need it, so that it stays fresher in your mind. If you want to read about it now, go ahead; it is not terribly difficult.

That is pretty much what I wanted to do for discrete-time signals, and as I said, we will mostly be dealing with discrete-time signals in this course. So if you are not sure about the DTFT, what it means, some transform pairs, now is a good time to go and read about it. I will ask questions in quizzes and exams assuming you know how to find the DTFT, the Z-transform, the Fourier transform, and so on; that is the reason I am giving you all this notation. You cannot come to a quiz and tell me you do not know how to find the DTFT of something, so be prepared.

The last bit of preliminaries is what you must have seen very recently, in 356: random signals, that is, probability, random variables, and random processes. I will go through it a little fast, pausing only for the very important things; let us see how it works. A couple of words about the way I lecture: I have been told several times that I go very, very fast. Is that true? True or false?

Not true. But there is one surefire way of slowing me down, which is to keep asking questions. If you do not interrupt me and ask questions, I will just keep going, and once the momentum gathers it is tough to stop. So whenever you think it is going too fast and you are not able to catch something, it is perfectly okay to put your hand up and ask me to repeat; I will not be distressed or upset. (I might penalize you later, but that is okay.) All right. The first thing to sort out before studying random signals is the notion of probability and random variables. Many students tend to retain their high-school notion of probability and random variables and never really learn the graduate-school version; we try to do some kind of a job in 356, but it is still confusing. If you ever learn probability and random variables properly, in the rigorous way, you will realize how much more there is to it than what I am going to describe now. What we are doing is a simple version, just to enable calculation, not the real rigorous theory. The notions I will expect you to be very comfortable with initially: sample space, events, how events are defined, how to calculate probabilities of events given a probability measure on your sample space. I am not even going to write those down; you have been reading about them for a long time, so I will expect you to know them. The next thing we need, for technical reasons, is the notion of a random variable. How do you define a random variable? What is a random variable?
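(Before answering that, one quick numerical aside on the folded-spectrum relation from the DSP part above. A minimal sketch, assuming numpy; the Gaussian pulse and the sampling interval are made-up examples chosen because the Fourier pair is known in closed form. It compares the DTFT of the samples against the truncated folding sum.)

```python
import numpy as np

T = 0.5                            # made-up sampling interval
f = np.linspace(-1.0, 1.0, 41)     # continuous frequencies to test

# Known Fourier pair: q(t) = exp(-pi t^2)  <->  Q(f) = exp(-pi f^2)
q = lambda t: np.exp(-np.pi * t**2)
Q = lambda f: np.exp(-np.pi * f**2)

# Left side: DTFT of the samples q[n] = q(nT), evaluated at omega = 2*pi*f*T
n = np.arange(-50, 51)
lhs = np.array([np.sum(q(n * T) * np.exp(-2j * np.pi * fk * n * T)) for fk in f])

# Right side: folded spectrum (1/T) * sum_m Q(f - m/T), truncated
m = np.arange(-50, 51)
rhs = np.array([np.sum(Q(fk - m / T)) / T for fk in f])

assert np.allclose(lhs, rhs)   # the folding (aliasing) relation holds
```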
A function from the sample space to the real line. It cannot be any old function; there are some very rigorous restrictions on what that function can be, but for us a random variable is a function from the sample space to the real line. In most cases we will deal with a sample space which is itself the real line, so we will just talk about the random variable as it is, without worrying about the sample space; in that case you have to think of the sample space as the real line, with events defined on it. So I will say things like: my event is that the random variable lies between 3 and 4. That is an event, and that is how I will define events; you should be comfortable with that.

To properly do computations with random variables there are two associated functions. The first, which always exists, guaranteed for any random variable by the way it is defined, is the cumulative distribution function (CDF). It always exists, there is no problem, one can always talk about it, and it has some properties: it starts at 0, it is right-continuous, and it ends at 1; you know that for sure. To further facilitate calculations, one can in several cases define what is called a probability density function, and it will exist for most cases we will deal with; in cases where it does not, we will use delta functions once in a while, so that we can study all our random variables together in one big bunch with a nicely defined PDF. There is also a PMF for the discrete case. What is the PMF? The probability mass function; and PDF is the probability density function. There are several calculations you have to be able to do around these. For instance, if I give you the event that the random variable lies between 3 and 4, how will you find its probability using the PDF?

You integrate the PDF from 3 to 4. How will you find it with the CDF? CDF evaluated at 4 minus CDF evaluated at 3. There are some intricacies if there is a jump at 3 or at 4 or somewhere in the middle; we will not worry too much about those cases, and we will deal with them when we come to them; it is not a big deal. Other things you need to be very comfortable with: mean and expectation. The expectation operator is the more general one: you should know how to compute the expected value of any function of the random variable. You put the function inside the integral, multiply by the PDF, and figure it out. The mean is the specific case where the function is the random variable itself. (Typically I will use letters like X and Y to denote random variables.) Variance is another thing we will need. What is the variance? The expected value of the square minus the mean squared, E[X^2] - (E[X])^2; you can show the variance is always nonnegative, and its square root is the standard deviation. So all those things are there, and that is as far as one random variable is concerned. When you go to multiple random variables, I will expect you to know what a joint PDF is, and a joint PMF, how to find the marginals from them, and how to evaluate probabilities with a joint PDF, which is always more complicated than with a single PDF. Once you have more than one it is complicated, and when you go to more than two or three, it is useful to think of vectors of random variables, or random vectors. In this case a lot of things become vectors and matrices: for instance, the mean becomes a vector of means. And what about the variance?
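(Hold that thought for a moment. The single-variable expectation computations above, put the function inside the integral against the PDF, are easy to sanity-check numerically. A minimal sketch, assuming scipy is available; the exponential density is a made-up example whose moments are known in closed form.)

```python
import numpy as np
from scipy.integrate import quad

# Made-up example PDF: exponential, f(x) = lam * exp(-lam * x) for x >= 0
lam = 2.0
f = lambda x: lam * np.exp(-lam * x)

# E[g(X)] = integral of g(x) * f(x) dx: the function goes inside the integral.
expect = lambda g: quad(lambda x: g(x) * f(x), 0, np.inf)[0]

total = expect(lambda x: 1.0)            # a valid PDF integrates to 1
mean = expect(lambda x: x)               # E[X] = 1/lam
var = expect(lambda x: x**2) - mean**2   # Var(X) = E[X^2] - (E[X])^2

print(total, mean, var)   # approximately 1.0, 0.5, 0.25
```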
Yes, it becomes a covariance matrix; that is something you should be able to deal with when it comes to vectors. Another thing we will worry about is conditional probability. This will play a very important role, so I may spend some time on it, and the basic rules relating joint, marginal, and conditional probabilities will play a very, very important role. There are several things to pay attention to with conditional probability, depending on whether you are dealing with a discrete random variable or a continuous one; we define conditional probability quite differently in the two cases. When you are conditioning on a continuous random variable, what happens? You have to go to the PDF and work with its values. We do all this because we are limited by the theory at our level; if you ever learn the theory properly, there is a unified way to define everything that takes care of the whole thing, but for us, we will deal with the discrete and continuous cases separately. There are also two different notions: conditioning on an event, the probability of something given an event, and conditioning on a random variable, that is, conditioning on a random variable being equal to some value. The two are sometimes related and sometimes not, depending on whether you are working with discrete or continuous variables. I will expect you to be comfortable doing all these computations, and we will do a lot of them: conditional probability calculations involving both discrete and continuous variables, conditioning on continuous variables, conditioning on discrete variables, all kinds of things. I will keep writing down the formulas, and in most cases you will agree; but sometimes in an exam, when you have to recreate them, all the basic confusions come back. So make sure you know how to do these computations.

Another thing we will define is the conditional PDF, which is again reasonably complicated. I will do it for the continuous case; the discrete case is not too bad. Here is my notation, and all of the notation is nicely captured in this one formula, which is why I am writing it down. Let X and Y be jointly distributed random variables, meaning they have a joint PDF. The conditional PDF of X given Y is defined as

    f_{X|Y}(x|y) = f_{X,Y}(x, y) / f_Y(y),   wherever f_Y(y) is nonzero;

you only have to look at points where the marginal of Y is nonzero. You see how all the notation is captured in this formula. How will I denote the PDF of a single random variable? I write f with the random variable as a subscript and a dummy variable as the argument: f_X(x). If I see any other fancy notation, I will not consider it valid; please stick to this notation, particularly in this course, because it can save you. There is a lot we will be dealing with, and if you are not very careful, you can be totally misled by notation. For the joint PDF, I put all the variables I have to keep track of in the subscript, with a dummy variable for each: f_{X,Y}(x, y). For the conditional PDF: X given Y in the subscript, and again a dummy variable for each, f_{X|Y}(x|y); it means, conditioned on Y taking the value y, this is the PDF of X. You can show all these things are consistent if you define them carefully. If one of the variables becomes discrete, what should you do, as a general rule? You simply replace the PDF by the corresponding probability wherever it appears. That works in general, but sometimes you have to pay attention to how the notion is actually being defined.

All right, that is about the basics of random variables. One random variable we will deal with all the time is the Gaussian random variable; sometimes I will call it normal. We say X is normally distributed with mean mu and variance sigma^2, written N(mu, sigma^2), which means the PDF is

    f_X(x) = (1/(sqrt(2 pi) sigma)) e^{-(x - mu)^2 / (2 sigma^2)},   for all x;

this is a random variable that is allowed to take values over the entire real line. You should know this formula even at midnight: if somebody wakes you up from sleep, you should know it. This is one formula you cannot do without, whichever field you go to; it does not matter if you want to get out of electrical engineering as soon as you graduate, if you are anywhere near computation, this will play a big role in your life. In several cases we will deal with the standard normal distribution, which is just N(0, 1); there the formula is really simple, (1/sqrt(2 pi)) e^{-x^2 / 2}. I remember a student, I do not know if he is actually in this class right now, who came and told me he did not have the prerequisite but wanted to do the course. What is the only question I asked him? What is the Gaussian random variable? He kept saying things like "bell shape"; he did not get through, or at least I did not tell him he could join. Is he here? I do not think so; I cannot recognize him. Anyway, the point is this: the Gaussian random variable has a precise definition. It is that random variable which has this PDF; there is no other Gaussian random variable. Do not be under any such illusions.
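(The claims packed into that definition, that this is a genuine PDF with the stated mean and variance, can be checked numerically before doing the integrals by hand. A minimal sketch, assuming scipy; mu and sigma are arbitrary example values.)

```python
import numpy as np
from scipy.integrate import quad

mu, sigma = 1.5, 2.0   # arbitrary example parameters

def gauss_pdf(x):
    """N(mu, sigma^2) density: (1/(sqrt(2 pi) sigma)) exp(-(x-mu)^2/(2 sigma^2))."""
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

total = quad(gauss_pdf, -np.inf, np.inf)[0]                       # should be 1
mean = quad(lambda x: x * gauss_pdf(x), -np.inf, np.inf)[0]       # should be mu
second = quad(lambda x: x**2 * gauss_pdf(x), -np.inf, np.inf)[0]  # should be mu^2 + sigma^2

print(total, mean, second)
```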
so useful exercise if you've never done this before is to show first of all that this is a valid pdf, how will you show that it's a valid pdf, okay, how many of you know how to do this, okay, okay, some ways of doing it, if you don't know it's a good thing to get to know, okay, so there's a way of converting into a 2D integral which will nicely simplify into a doable one, okay, so otherwise it becomes difficult, okay, so you can show for the Gaussian random variable x defined this way, expected value of x is what, mu, okay, so this doesn't seem like a terribly difficult integral to do, an integral that might be more difficult is to show expected value of x squared, what will this work out to, mu squared plus sigma squared, okay, so this might be slightly more difficult if you're not used to it, okay, so there are standard tricks to do these kind of integrals with the Gaussian, okay, I'm assuming you're familiar with it, if you're not better pick up a book and be familiar with it, because I might expect you to do such integrals, okay, so it should be very basic and simple, but you should know how to do it, okay, so one integral you cannot do very easily is the CDF for this, okay, so it doesn't have a closed form, so we'll define a function, it's possible to define what's called error function, it's complement to define the CDF, that's what statisticians do usually, but in this course, we'll define what's called a Q function, okay, so the Q function evaluated at x, I'll define as integral from x to infinity 1 by root 2 pi e power minus t squared by 2 dt, okay, this will be my Q function, you can write the CDF very easily in terms of this, okay, so it's not a big deal, so we'll use the Q function, it calculates what's called the tail probability, probability that capital X is greater than or equal to small x, okay, so this is what calculates, okay, so that's the Q function and, okay, so this is this x, okay, sorry, okay, so you can go back and forth between mean mu 
and variance sigma squared on the one hand, and mean 0 and variance 1 on the other. How do you do that? If you take (X minus mu) by sigma, the Gaussian with mean mu and variance sigma squared becomes the standard Gaussian; so you first standardize, do all the calculations with Q, and then go back to whatever you want. That's the trick. This Q function also falls very rapidly. In fact, a good upper bound, valid for all x greater than or equal to 0, is half e power minus x squared by 2, and there is also an approximation valid for slightly larger values of x which is once again very useful. These two are actually good approximations for Q itself, so in cases where you have to evaluate an integral involving Q and want to get to an answer quickly without worrying about the exact nature of Q, you can plug them in and get some pretty good results. There is also a closely related lower bound; I'll let you look it up. So that's about Q. The next thing, which I'm sure none of you remembers by heart, is the joint Gaussian distribution; this will once again be very useful to us. We'll say two random variables X and Y are jointly Gaussian if the joint PDF is, well, a crazy formula; I don't expect you to really remember it, but it's good to know that there is one: 1 by (2 pi sigma squared root of (1 minus rho squared)) times e power minus (x squared minus 2 rho x y plus y squared) by (2 sigma squared (1 minus rho squared)), for all x and all y. This is actually a specific case; in the more general case X and Y could have different means and different variances, and I've taken the specific case where X and Y both have mean 0 and each has variance sigma squared.
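Coming back to the Q function for a moment before we dig into the joint distribution: since Q has no closed form, in code one typically falls back on the complementary error function, using the standard identity Q(x) = half erfc(x by root 2). A sketch (function name mine) that also spot-checks the half e power minus x squared by 2 upper bound:

```python
import math

def Q(x):
    """Tail probability P(X >= x) for standard normal X, via Q(x) = 0.5*erfc(x/sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

print(Q(0.0))  # 0.5: half of the standard normal's mass lies above 0

# Spot-check the upper bound Q(x) <= 0.5 * exp(-x**2 / 2) for x >= 0
for x in [0.0, 0.5, 1.0, 2.0, 4.0]:
    assert Q(x) <= 0.5 * math.exp(-x * x / 2.0)

# Standardization: for X ~ N(mu, sigma^2), P(X >= a) = Q((a - mu) / sigma)
mu, sigma, a = 1.0, 2.0, 3.0
print(Q((a - mu) / sigma))  # Q(1.0) ~ 0.1587
```

The bound holds with equality at x = 0 and gets tighter as x grows, which is why it is handy for quick tail estimates.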
You can show that if this is the joint PDF, then each marginal is Gaussian: X is normal with mean 0 and variance sigma squared, and Y is normal with mean 0 and variance sigma squared, and together they are related by this joint PDF. They are not independent in general; if they were independent the formula would look different. The coupling is characterized by what's called the correlation coefficient rho, which can be defined as the expected value of XY divided by sigma squared. This is pretty much the most general version we'll need, and any change you can make very easily: if you want a different mean for X, you put x minus mu and so on. In fact you can even set sigma squared to 1, define the simple case, and then do transformations to handle other cases. So this is the jointly Gaussian PDF, and rho is the correlation coefficient. If rho is 0, strictly speaking you should only say X and Y are uncorrelated, but in the Gaussian case it turns out they are independent as well; those are the nice things about the Gaussian. All right, you can also have a whole vector of random variables which are jointly Gaussian: here you had 2, but in general you might have n random variables X1 through Xn, and I'll put a transpose to make it a column vector; each Xi is normal. The general joint PDF is fairly complex, so we'll first take a simple case; I'll introduce the way the PDF is written down and then write the more general case. So first take X1 through Xn jointly Gaussian in the iid case. What is iid? Independent and identically distributed: each Xi is normal with mean 0 and variance sigma squared, and they are independent. In this case, what can you do for the joint PDF? Multiply all the marginals.
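Before moving on to the vector case, the two-variable density above (zero means, equal variances sigma squared, correlation rho) can be written out directly, including its normalizing constant 1 by (2 pi sigma squared root of (1 minus rho squared)). A sketch with my own function names:

```python
import math

def joint_gaussian_pdf(x, y, sigma2=1.0, rho=0.0):
    """Zero-mean jointly Gaussian (X, Y): each has variance sigma2,
    correlation coefficient rho = E[XY]/sigma2, with |rho| < 1."""
    c = 2 * math.pi * sigma2 * math.sqrt(1 - rho * rho)
    q = (x * x - 2 * rho * x * y + y * y) / (2 * sigma2 * (1 - rho * rho))
    return math.exp(-q) / c

def marginal(t):
    """Standard normal density, the marginal when sigma2 = 1."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

# With rho = 0 the joint PDF factors into the product of the two marginals,
# which is exactly the "uncorrelated implies independent" point for Gaussians.
print(joint_gaussian_pdf(0.3, -0.7) - marginal(0.3) * marginal(-0.7))  # ~0
```

With a positive rho, points where x and y share a sign get more density than points where they disagree, which matches the intuition of positive correlation.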
So if you want to write down the joint PDF, you can nicely write f sub X of x bar by just multiplying the individual ones, and it turns out to be a very simple formula: 1 by (2 pi sigma squared) raised to the power n by 2, times e power minus norm of x bar squared by 2 sigma squared, where norm x squared is the squared 2-norm, that is, the summation of the xi squared. So that's the joint PDF in the iid case: simply the product of the individual normal PDFs, no big deal. The more general case is a little more twisted and needs a few more definitions. You need the mean first: the mean m would be the vector of the expected values of the Xi's. Then you also need what's called the covariance matrix C, the expected value of (x minus m) times (x minus m) transpose. Remember x minus m is a column vector and I'm multiplying it by its transpose, which is a row vector; a column vector times a row vector gives a square matrix, and the expected value operator acts on each element of that matrix. So the ijth entry of C will be the expected value of (Xi minus mi) times (Xj minus mj). Once you have all this, I'm going to further assume that C is an invertible matrix. You might say that seems like a very artificial assumption; why assume it? It just simplifies things. If C is invertible, you can write down the joint PDF very nicely: 1 by (2 pi) raised to the power n by 2, times 1 by determinant of C raised to the power half (the modulus signs on the matrix denote the determinant), times e power minus half (x minus m) transpose C inverse (x minus m). The exponent will be what's called a quadratic form in general, so it will have terms like xi times xj, xi squared, some constant times xi, and all those cross terms will appear.
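The general formula can be sketched in code; here I restrict to n = 2 so the determinant and inverse of C can be written by hand with the standard library only (function name mine):

```python
import math

def mvn_pdf_2d(x, m, C):
    """Jointly Gaussian density for n = 2:
    (2*pi)**(-n/2) * |C|**(-1/2) * exp(-0.5 * (x-m)^T C^{-1} (x-m)),
    assuming the covariance matrix C is invertible (det > 0)."""
    (a, b), (c, d) = C
    det = a * d - b * c                                # |C|
    ci = ((d / det, -b / det), (-c / det, a / det))    # C^{-1} for a 2x2 matrix
    u = (x[0] - m[0], x[1] - m[1])                     # x - m
    # quadratic form (x - m)^T C^{-1} (x - m): the xi*xj cross terms live here
    q = (u[0] * (ci[0][0] * u[0] + ci[0][1] * u[1])
         + u[1] * (ci[1][0] * u[0] + ci[1][1] * u[1]))
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(det))

# With C = I and m = 0 this reduces to the iid case: product of two N(0,1) densities
p = mvn_pdf_2d((0.0, 0.0), (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0)))
print(p)  # 1/(2*pi) ~ 0.1592
```

The identity-covariance check is a useful sanity test: the general formula must collapse to the product-of-marginals formula when C is diagonal with equal entries.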
You'll also have constant terms; those kinds of terms appear in the exponential. So that's the jointly Gaussian distribution. A standard interview question is: do you know the joint Gaussian distribution? People write down this C inverse formula, and the next question is: what if C is not invertible, can you define jointly Gaussian in that case? It turns out the proper way of defining jointly Gaussian is something else. What is it? You define a set of random variables to be jointly Gaussian if every linear combination of them is Gaussian. That's the condition, and it is the general definition that works whether or not the covariance matrix is invertible; from there, for a set of random variables whose covariance matrix is invertible, you can write the density down and everything is nicely defined. That property I want to highlight, because it is very, very important for us: linear combinations of jointly Gaussian random variables are once again jointly Gaussian. This is a crucial property that will simplify all kinds of computations for us. The keywords are linear combinations and jointly Gaussian: take a jointly Gaussian set of random variables, take whatever number of linear combinations you want, and you once again get a jointly Gaussian set. That's very, very important; if you didn't have this property, so many computations that you do routinely would get really ugly.
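A quick numerical illustration of the property (a sketch; the particular combinations are my choice, not from the lecture): take X1, X2 iid N(0, 1), so the pair is jointly Gaussian with covariance I, and form Y1 = X1 + X2 and Y2 = X1 minus X2. The property says (Y1, Y2) is again jointly Gaussian, and the covariance works out to A A transpose = 2I for A = [[1, 1], [1, -1]], so each Yi has variance 2 and they are uncorrelated:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

N = 200_000
v1 = v2 = cov = 0.0
for _ in range(N):
    x1, x2 = random.gauss(0, 1), random.gauss(0, 1)  # X jointly Gaussian, C = I
    y1, y2 = x1 + x2, x1 - x2                        # linear combinations Y = A X
    v1 += y1 * y1
    v2 += y2 * y2
    cov += y1 * y2

# Sample moments should sit near the predicted covariance A A^T = [[2, 0], [0, 2]]
print(v1 / N, v2 / N, cov / N)  # close to 2, 2, 0
```

Since Y1 and Y2 are jointly Gaussian and uncorrelated, they are in fact independent, which is the point made earlier about rho = 0 in the Gaussian case.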
So you see why people really like jointly Gaussian random variables: we like linearity, we would like to take linear combinations of inputs in every filter that we have, and whenever we have uncertainty we want to assume Gaussian, because then I know my LTI system will do nothing to the distribution; if the input is jointly Gaussian, the output is once again a set of jointly Gaussian random variables. That's very useful. There are also several other equivalent ways to set up these definitions, but that's pretty much all I want to do about jointly Gaussian random variables. The next thing I'll quickly define, in the next 5 minutes, and then characterize further in the next class, is the random process. We'll distinguish between two cases here: random processes in discrete time and random processes in continuous time, but I'm going to do both of them simultaneously just to speed up the review; I'm assuming you've seen all of this before in 3, 5, 6 at least. In discrete time, a random process is nothing but a sequence of random variables, which I'll denote Xk. Each Xk can be a continuous random variable, I don't care about that; when I say discrete time, I mean time is discrete: I only have X1, X2 and so on, I don't have X1, X1.001, nothing like that. Likewise in the continuous case, I'll denote by X of t a continuous time random process, which means that for each t, X of t is a random variable; once again it can be discrete or continuous, I don't care. If you talk about random processes with mathematicians and statisticians, they'll be worried about a lot of things before they define something like this; we will not
worry about such things; we'll just quickly define it this way. So how do we specify a random process? The proper way is to specify all finite-dimensional distributions: if I give you a finite set of random variables from this infinite collection, you should give me a joint PDF for it. So these finite-dimensional distributions, which are used to specify the process, are the joint PDFs or PMFs of finite collections of random variables from the process. In the discrete time case, how does this work out? I have to give the joint PDF of X sub k1, X sub k2, up to X sub kn, for all n and all indices ki. In the continuous case, I have to give the joint PDF of X of t1, X of t2, up to X of tn, for all n and all ti. If I do that, I've specified my random process completely. Usually, though, this is not how random processes are specified in practice; you specify them slightly differently, through another route called the sample function route. What you assume is that the entire random process is controlled by, say, one or two random variables: based on the values those random variables take, the entire random process is defined. I'll only give you an example of how that is done, since it's usually done with examples. Here's an example of a sample function definition: I'll say my random process is X of t equals ak cos(2 pi fk t plus theta). The fk's are constants, or random variables if you want; theta is a constant, or a random variable if you want; ak is a constant, or a random variable if you want. Depending on what I make a random variable, I have actually defined a random variable for each t. Have I specified the joint PDF? I have, indirectly, because
based on this definition and the joint distributions of ak, fk and theta, for any set of ti's you can compute the joint PDF. I'm not saying it's easy, but it's possible to compute. So that's another way of doing it, and the sample function route is what we will take usually. You can also mix up discrete and continuous and so on and come up with definitions, but this is the route we will take most often in defining random processes; it's an indirect way of specifying the distributions. All right, I'll stop here and continue with more definitions on random processes in the next class. You have to sign your attendance.
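As a footnote to the sample-function example above, here is a small sketch in code. I've taken a and f as constants and theta as the single underlying random variable, uniform on [0, 2 pi); the uniform choice and the names are my assumptions for illustration, not fixed by the lecture:

```python
import math
import random

random.seed(1)  # fixed seed for reproducibility

a, f = 2.0, 5.0  # amplitude and frequency taken as constants here

def draw_sample_function():
    """One realization of X(t) = a*cos(2*pi*f*t + Theta), Theta ~ Uniform[0, 2*pi).
    Drawing Theta fixes the entire waveform at once; for each fixed t, X(t) is a
    random variable whose randomness comes only from Theta."""
    theta = random.uniform(0.0, 2.0 * math.pi)
    return lambda t: a * math.cos(2.0 * math.pi * f * t + theta)

x = draw_sample_function()                    # one sample path of the process
samples = [x(k / 100.0) for k in range(5)]    # the same path observed at several times
print(samples)                                # every value lies in [-a, a]
```

Each call to draw_sample_function gives a different realization, and in principle the joint distribution of X(t1), ..., X(tn) for any set of ti's is determined by the distribution of theta, which is exactly the indirect specification described above.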