So, the last class was quite crucial in several respects, and we want to quickly go through the main things we did there. The way to picture it is that you have a vector x and a vector y, and some joint PDF by which they are related. In our case y equals x plus n, that is the ideal AWGN-type assumption, but in general you can view it through some joint PDF. The joint PDF could be specified, say, by giving the conditional PDF of y given x for all possible y and x; in many cases this might be possible, and assuming you also know the PDF of x itself, this completely defines the joint PDF. So, the transmitter has x and the receiver has y. From y you have to come up with a function which I called x hat. x hat is what the receiver produces, which it claims is an estimate of x; but what is it exactly? It is some function of y. So one can come up with a joint PDF for x and x hat if you want, or for x, y and x hat; all of these things together will have a joint probability distribution. What do I mean by the joint PDF of x and x hat? If you transmit x, the receiver is likely to decide x hat is one of several possibilities from the constellation, each with some probability. Basically, the probability we are concerned about, the one you want to maximize, is the probability that x hat equals x. So if you plot the joint PDF of x and x hat, you want all the mass to lie close to the straight line x hat equals x. That is how you want to design your function of y. It is a very abstract way of viewing the problem, but in most cases it will simplify to a very easy situation which is not all that difficult to handle.
But this itself is called the detection problem. It is a very well studied problem, and one can argue without major contradiction that the whole of digital communications is essentially a solution to the detection problem; you would not be wrong if somebody said that. Detection theory is also vast in so many ways. We studied three detectors. The first was the MAP detector, which was the optimal detector. What did the MAP detector do? You can succinctly write it down as: x hat equals the argument of the maximum, over x in the alphabet, of the probability that x equals x given y equals y. So this was my MAP rule. You see, you do not have to worry too much about the distribution of x hat and all that; you can just use the joint PDF of x and y to define this rule. This was optimal in what sense, that is, which objective function did it maximize? The probability of correct decision, and that was maximized by this choice of x hat as the function. Remember this will be a function of y; depending on y, x hat will change, so it is a random variable, all those things are clear. Then there is another detector, the ML detector, which can be written down as the argument of the maximization, over x in the alphabet, of, in this case, not a probability but the conditional PDF of y given x, because I am expecting y to be continuous. So this is another detector, and irrespective of anything else I can choose to run it; nothing wrong in that. But when is this detector optimal?
When all the different inputs are equally likely, that is, when the PDF of x is uniform, ML is also optimal in the sense that it maximizes my objective function, the probability of correct decision. I derived it based on that, so please see that derivation. The last detector I called something else, but the best name for it is the minimum distance detector. What did that work out to? It is much simpler: x hat equals the argument of the minimum, over x in the alphabet, of the norm of y minus x squared. Remember my alphabet is a subset of some real space R^m, so all these quantities are nicely defined. That is my minimum distance detector, and I can choose to use it as well if I want to. But when will it be optimal? Several things have to be true: first of all ML has to be optimal, and then you have to have AWGN. All the AWGN-specific structure comes in only at the minimum distance level. So these are general principles, and it is important to learn them as general principles and know that you are applying them to the specific digital communication problem. They are used in so many other areas of communications and signal processing that I do not want you to think of them as digital communication ideas; they come from a related area and are used in digital communication as well. Depending on the problem and the joint PDF, minimum distance will not be optimal in several cases. But it is useful to simplify the ML condition to as simple a version as possible, and depending on how the noise is added that version might change. For the AWGN case it became the very simple minimum distance criterion. I know it is still abstract, but it is important to appreciate it a little, and to ask why we were fundamentally able to do this in digital communication.
The crucial thing was how we transformed the waveform channel into the vector channel and how we modelled the noise, how we did the correlation and everything, so that we got a vector, then a joint PDF, and then you could directly apply these detection principles without worrying about anything else. Otherwise you would get confused with so many other details. So hopefully the power of this is clear to you; the assumptions and the models are very crucial here. Any questions or comments? Alright. So ultimately x hat is a function of y; that is another thing I want to stress, so I wrote it down once again. It is a function of y, and the receiver is assumed to have different types of knowledge. For instance, if the receiver has to implement either MAP or ML accurately in the general case, what should the receiver know? It should know the joint PDF; if you do not know the joint PDF or the conditional PDFs you cannot do it. But look at the minimum distance detector: it seems more general, you do not have to know much of anything. You may not know whether it is optimal unless you know the joint PDF, but you can still run it. Those kinds of things are also valuable in practice: even if you do not know whether it is optimal, you should be able to do something, so you should have all these detectors at hand. Getting something to work is also important. The next thing I did, pretty quickly without going into too many details, is the idea of decision regions. I think it is very clear and I do not have to define these things formally, but I thought I would quickly run through it, formally define it, and then talk about it.
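To make the three detectors concrete, here is a small illustrative sketch in Python. The Gaussian likelihood and the uniform prior are assumptions chosen for this example (matching the AWGN case above), not part of the general definitions:

```python
import math

def map_detect(y, alphabet, prior, f_cond):
    """MAP rule: argmax over x of P(x) * f(y | x)."""
    return max(alphabet, key=lambda x: prior[x] * f_cond(y, x))

def ml_detect(y, alphabet, f_cond):
    """ML rule: argmax over x of f(y | x); optimal when the prior is uniform."""
    return max(alphabet, key=lambda x: f_cond(y, x))

def md_detect(y, alphabet):
    """Minimum distance rule: argmin over x of |y - x|^2; needs no PDF at all."""
    return min(alphabet, key=lambda x: (y - x) ** 2)

# AWGN example, scalar BPSK: f(y|x) is Gaussian with mean x and variance N0/2
N0 = 1.0
f_awgn = lambda y, x: math.exp(-(y - x) ** 2 / N0) / math.sqrt(math.pi * N0)

alphabet = [-1.0, 1.0]
uniform = {-1.0: 0.5, 1.0: 0.5}
y = 0.4
print(map_detect(y, alphabet, uniform, f_awgn),
      ml_detect(y, alphabet, f_awgn),
      md_detect(y, alphabet))  # -> 1.0 1.0 1.0
```

With a uniform prior the MAP and ML rules coincide, and with AWGN both reduce to minimum distance, which is why all three agree here.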
So, in my signal space: y belongs to my signal space, but typically it is not a point on the constellation itself. Is this clear? If you do BPSK, for instance, your constellation is minus 1 and plus 1; theoretically y can be any real number (in practice it will be bounded by something), but it will be in the signal space, it cannot be anywhere else. In an actual reception situation one can think of something called a received constellation. What will the received constellation be? All the points y that you received: if you put a dot at each of them you get a received constellation, and since you know the PDF of y it will be bunched up around the transmit constellation, not on top of it but around each point. Keep that notion in mind. Now I want to split the signal space into different regions so that if y falls in a particular region, my x hat is going to be a particular transmit point. That is the whole notion of decision regions. Take BPSK again, with plus 1 and minus 1, maybe mapped to bit 0 and bit 1. I want to find the subset of the signal space, which in this case is the real line, on which my decision will always be plus 1: the set of all y in R such that x hat of y equals plus 1. That will be my decision region for the constellation point plus 1. What will be the decision region for the constellation point minus 1? The same thing: the set of y such that x hat of y equals minus 1. If you use the minimum distance detector, for instance, the decision regions are clear and obvious: every point on the positive part of the real line will be in the decision region corresponding to plus 1.
So that is one decision region, and the other decision region will be the negative part of the real line. I can generalize this definition. In general my signal space is R^m, so the decision region for some x in my constellation is the set of all y in R^m such that x hat of y equals x. That is the formal definition, and you can see this gives a partition of the signal space. What do I mean by a partition? All these sets will not intersect, and together they cover the entire space. One can write down proofs of these facts, but I guess it is very clear: if you decide on one thing you do not decide on another, so it has to partition the signal space. I will call the decision region for x, D_x. It is good to write down the probability of correct decision in a very succinct form; do you see why? The decision region for a particular x lets you write the probability of correct decision in a very clear way. For instance, what is the probability that x hat equals x given x equals x? Given that you transmitted x, it is the probability that y lies in D_x. That is where the definition comes from; the motivation is to write down the probability of correct decision in a compact way. Once you identify the region, you integrate over all y in that region and you get your probability: it can be nicely written as the integral over y belonging to D_x of f(y given x) dy. So this is the probability of correct decision for a particular x.
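The probability that y lands in the decision region, the integral above, can be sanity-checked by simulation; a minimal Monte Carlo sketch for BPSK in AWGN (the noise variance 0.25 is just an illustrative choice):

```python
import math
import random

random.seed(1)

def p_correct_given_x(x, decide, noise_var, trials=200_000):
    """Monte Carlo estimate of P(y in D_x | x), i.e. the integral of f(y|x)
    over the decision region, for the scalar channel y = x + n."""
    sigma = math.sqrt(noise_var)
    hits = sum(decide(x + random.gauss(0.0, sigma)) == x for _ in range(trials))
    return hits / trials

# BPSK with minimum-distance decisions: D_{+1} is the positive half-line
decide = lambda y: 1.0 if y > 0 else -1.0
est = p_correct_given_x(1.0, decide, noise_var=0.25)
# The exact value is the Gaussian CDF at 1/sigma = 2, about 0.9772
print(abs(est - 0.9772) < 0.005)  # -> True
```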
So, if you change x this probability might change, or maybe it does not, I do not know; depending on the decision regions it will work out in this form. Any questions? What will be the probability of error given x equals x? 1 minus this, that is, the same integral taken over y not belonging to D_x. It is useful to write that down too. So let us do more decision regions for a while, just to get used to them; for minimum distance it is very easy, you just look at the Euclidean space in two dimensions and quickly mark things out. We will do that exercise for a while, and then come back and see how some of these probability of error calculations are done, how they simplify, and what the real parameters of interest are. So far I did just BPSK, and the next example I want to look at is 4-PAM; these are examples of decision regions. Wait, did I make a mistake back there? Oh yes, the integral should be over the conditional PDF; you are right, in all my eagerness I forgot about the conditioning. Everything is conditioned on x, thanks for pointing it out. So, back to the examples of decision regions: I did BPSK, let us do 4-PAM next, it is the next easiest example. Around 0 I have plus 1, plus 3, minus 1, minus 3; that is my constellation, and I will put an X mark at each constellation point. It is almost trivial: what is the decision region for plus 3? It starts at 2 and goes off to the right.
So you see that the decision region D_3 is going to be [2, infinity). Is that clear? What would D_1 be? Exactly (0, 2]. You might argue about the square bracket versus the round bracket, that is, about whether to include the point 2 itself. Why is that not relevant, why should I not care which bracket I put? Exactly: the probability that y lands exactly on 2 is 0, since the PDF is continuous, so it will not change any of my probability calculations; you can put either bracket, it does not matter. What is D_{-1}? It is going to be [-2, 0), and then D_{-3} is going to be (-infinity, -2]. So it is quite simple. The next example is 4-QAM. What is the 4-QAM constellation? It has 4 points: (plus or minus 1, plus or minus 1). So this is my 4-QAM, and what is the set of all points that will decode to each of them? The four quadrants. So it is easy to write down: D_{1,1} is the first quadrant, and then D_{-1,1}, D_{-1,-1} and D_{1,-1} are the others; the four quadrants are the decision regions for 4-QAM. If I did QPSK the way I defined it, which is a rotated version of this (remember I did a QPSK where my points were plus or minus 1 and plus or minus j), what would happen? You would draw the lines x equals y and x equals minus y and take the four regions you get from them; it is quite easy to see. It is possible to define it geometrically.
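The 4-PAM decision regions work out to a simple threshold rule; a sketch, using the midpoints plus or minus 2 and 0 as the boundaries:

```python
def pam4_decide(y):
    """Minimum-distance decisions for 4-PAM {-3, -1, +1, +3}: the thresholds
    are the midpoints -2, 0, +2, matching the decision regions above."""
    if y > 2:
        return 3
    if y > 0:
        return 1
    if y > -2:
        return -1
    return -3

print([pam4_decide(v) for v in (2.7, 0.4, -1.9, -5.0)])  # -> [3, 1, -1, -3]
```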
So, for the general case also it is possible to do this, but I am not going to spend time on it; we will just do specific cases. The next one, which is maybe marginally more interesting, is 8-PSK. For the 8-PSK constellation it is good to draw a dotted circle. 8-PSK has 8 points, and I think it is reasonably clear that the angle between adjacent points is pi by 4. If you want to write down all the points that would be mapped to, say, plus 1, what would you do? D_1 goes from minus pi by 8 to pi by 8: everything that falls within that cone, so to speak, a pi by 4 angle around the positive x axis with pi by 8 on either side. That is D_1. If you want to write it down for the next point, you again draw a line at pi by 8 from its axis, and that region would be D for e to the power j pi by 4. Likewise you can do the rest; they are just different cones, angular wedges if you will. Like I said, if somebody gives you an arbitrary set of points you should be able to do this. Can you imagine a systematic way of going about it? What are these lines that I drew? They are the perpendicular bisectors of the segments joining pairs of points. You draw a whole bunch of them, and from there you can easily figure out the actual regions that are of interest to you; some of them will nicely coincide.
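A nearest-neighbour rule computes exactly the regions that the perpendicular bisectors carve out, so you rarely need the bisectors explicitly in code; a small sketch for 8-PSK, with the constellation represented as complex numbers:

```python
import cmath

# 8-PSK: points e^{j k pi/4}. Nearest-neighbour decisions realize exactly the
# perpendicular-bisector regions: y decodes to the point whose pi/4 wedge
# (pi/8 on either side of the point) contains it.
pts = [cmath.exp(1j * k * cmath.pi / 4) for k in range(8)]

def nearest(y):
    """Return the closest 8-PSK point to the received complex sample y."""
    return min(pts, key=lambda p: abs(p - y))

# A received point at angle 0.3 rad, inside the pi/8 (about 0.3927) wedge
# around the point +1:
print(nearest(0.9 * cmath.exp(0.3j)))  # -> (1+0j)
```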
So, given any arbitrary set of points as a constellation, you draw the perpendicular bisectors; they will intersect in some places and not in others, and they will define pretty much all the regions that are of interest to you as far as decision regions are concerned. I am not going to do any such general construction; we will just do examples and be happy with that. The next example, which is a little more interesting, is 16-QAM. So far every decision region we have looked at looked essentially the same, except for the 4-PAM case where you had two different types; for 16-QAM you will in fact get three different types of decision regions. You can get different types depending on how the constellation looks, and this example illustrates how you can get three. I have 16 points, 4 in each quadrant. If I go ahead and draw the perpendicular bisectors, you see that groups of neighbouring points share bisectors, and the axes themselves are among the remaining perpendicular bisectors; once I have drawn all of them I just pick out the regions they define as the decision regions. You can make various arguments to simplify the process, but I think it is clear enough. You see there are essentially three types of decision regions: one for the interior points, one for the corner points, and one for the side points on the outside. Three different types, all looking different. So this part is reasonably clear. And a very typical standard tutorial and exam question is what?
You take a constellation that is not quite square, something that looks a little different, and it tests your skill in drawing perpendicular bisectors carefully, defining decision regions carefully, and then doing computations with them. It is a very standard question; it does not test much, but it is still useful. Alright, let us go through a probability of error computation now, but for that I want to stick to M-PAM; that will be our final example, and a general M-PAM is not really a big deal. The constellation is plus 1, plus 3, and so on up to plus (M minus 1), and then minus 1, minus 3, down to minus (M minus 1). When you do the decision regions you will have two types: one type for the interior points, whose regions are the intervals (0, 2), (2, 4), (4, 6), and so on up to (M minus 4, M minus 2), together with their mirror images on the negative side; and then two others for the outermost points, namely (M minus 2, infinity) and (minus infinity, minus (M minus 2)). So if you want to write it down succinctly: D_{M-1} is (M minus 2, infinity), D_{-(M-1)} is (minus infinity, minus (M minus 2)), and D_I is (I minus 1, I plus 1) for every other I. That is M-PAM. So we did the decision regions by themselves; like I said, they do not mean much on their own, but they give you a nice handle on quickly writing down the probability of error expression, which is the first step in computing it. The real attention should be on computing the probability of error; decision regions are just a tool.
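For the unit-spaced M-PAM levels above, the decision regions reduce to rounding to the nearest odd integer and clipping at the two ends; a sketch (exact behaviour on the region boundaries is irrelevant, since they carry probability 0):

```python
def mpam_decide(y, M):
    """Minimum-distance decision for M-PAM with points {±1, ±3, ..., ±(M-1)}:
    round to the nearest odd integer, then clip to the outermost level.
    Ties exactly on a region boundary are a measure-zero event, so the tie
    behaviour of round() does not matter."""
    level = 2 * round((y - 1) / 2) + 1          # nearest odd integer
    return max(-(M - 1), min(M - 1, level))     # clip to the end regions

print([mpam_decide(v, 8) for v in (0.2, 4.9, -3.1, 12.0)])  # -> [1, 5, -3, 7]
```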
One can say decision regions can also be used in your decoder, so you can use them in the implementation itself, but the main use, like I said, is for probability of error. So let us do probability of error for M-PAM. I am not doing M-squared-QAM in general because it is a simple generalization of 16-QAM: you just do the perpendicular bisectors and look at the squares they define, and one can in fact write down general expressions for M-squared-QAM as well; it is not too difficult. The thing I really want to concentrate on, at least for the rest of this lecture, is the probability of error for M-PAM under a minimum distance decoder. Remember I always have to specify what decoder I am using, otherwise it makes no sense to define anything. You can argue about when it will be optimal and all that, and that is fine for us; this is the decoder we are going to use. I want to step back a little and point out some of the initial comments I made about the main parameters of interest in a digital communication system. What are the various parameters? The power of the signal, the bandwidth of the signal, the rate at which you are sending data, the noise power, and then the probability of error. So far we have been dealing with the first four on and off; we have been trying to model and understand them, though maybe not all the trade-offs are clear yet. The thing that brings out all the trade-offs in one go is this probability of error computation. The only one that may not be apparent is bandwidth; we will see soon enough how bandwidth is brought into the picture as well.
So, the one quantity which ties everything together is the probability of error; that is the most important thing. The probability of error has to go to 0, or at least be small enough, and this calculation is what ties everything together. So we want to pay attention to it. It is not a very difficult computation, but it is a crucial one, because it will tell us the quantity that is important to us and it will also introduce us to the signal to noise ratio, which becomes so crucial in the entire subject. You might say I am doing this computation for M-PAM with minimum distance and it seems like an abstract calculation, but hopefully you agree that this constellation actually represents actual signals, real passband or baseband signals, and I know how to go from the constellation to the signal; the energy is the same. In the same way, the noise power is also something real that I have modelled, and the decoder is something I can actually implement. So all these are very real computations; do not imagine them to be some abstract probability exercise. One argument I made when I did PAM was that this root E_s, or whatever factor multiplies the minus 1 and plus 1, does not matter; but I want to show that it does not matter for the probability of error. So I am going to change my PAM to a general PAM where I introduce an arbitrary distance between the two innermost points: I am going to say the distance between them is D. If that distance has to be D, the innermost points are minus D by 2 and D by 2; my next point would be 3 D by 2, and the point on the other side would be minus 3 D by 2.
And so on, up to (M minus 1) D by 2 on one side and minus (M minus 1) D by 2 on the other. You will see eventually that what really matters is the ratio of D squared to the noise variance; only the ratio matters, so I can fix D arbitrarily and vary the whole thing through the denominator alone. That is why, in that sense, the scaling does not matter, but to bring out the point we will introduce this D. This D is the minimum distance separating any two constellation points, the minimum distance of this constellation; that is a useful notion. My transmitted constellation point x is uniform in this alphabet. Is there a question? No? So this is x; what about my noise? Before we proceed we should first compute the energy in the signal. I will denote it E_s, and it is the expected value of x squared. If you do that computation it will work out to (M squared minus 1) D squared by 12. You can also turn it around and write D as the square root of 12 E_s by (M squared minus 1). My noise N is normal with mean 0 and variance N naught by 2. If I want to think of the energy in my noise: remember, E_s came from the expected value of the integral of x(t) squared, and that is how I got to the expected value of x squared. Likewise the energy in the noise has to come from the expected value of N(t) squared. If you go back and look at the way I did the derivation, I only used N_1(t), the projection of N(t) on my signal space, I only worry about the expected value of that, and we showed it was equal to N naught by 2.
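The formula E_s equals (M squared minus 1) D squared by 12 is easy to verify numerically; a quick sketch:

```python
def pam_energy(M, D):
    """E_s = E[x^2] for M-PAM with minimum distance D and a uniform prior.
    The levels are ±D/2, ±3D/2, ..., ±(M-1)D/2."""
    pts = [(2 * k - (M - 1)) * D / 2 for k in range(M)]
    return sum(p * p for p in pts) / M

M, D = 8, 2.0
# Both should give (M^2 - 1) * D^2 / 12 = 21.0
print(pam_energy(M, D), (M * M - 1) * D * D / 12)  # -> 21.0 21.0
```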
We will take it to be N naught by 2. There might be a constant factor to double-check there; you can check the relationship between the noise energy and the noise variance and make sure it works out. Even if that detail is not important to us, we will define E_N to be N naught by 2; it turns out D squared by N naught is the important parameter, so just take E_N equals N naught by 2 as a definition. The received value is y equals x plus N, and my detector is the minimum distance decoder. So now I have to compute the probability of error, which I will denote P_e: the probability that x hat is not equal to x. Remember how I defined x hat: x hat is the output of the minimum distance decoder, and we know all about this. Alright, let us go ahead and do this computation. I am going to condition on x first.
Since everything here is a scalar I do not have to write down vectors, but I will write it in general just so you get a feel: P_e is the sum over x of the probability that x hat is not equal to x given x equals x, times the probability that x equals x. This I can do. The first factor is not too bad to compute; it brings me back to the decision regions, and that is what I have to compute. What about the second factor? I am assuming x is uniform, as I said, so it will be 1 by M; that is fine. That is how I am breaking down my probability of error. Go back and look at what I did when deriving the optimal detector: there I did something different with the probability of error or probability of correct decision, I conditioned on y. So this is a slightly different way of looking at the probability of error. Once you have derived the optimal detector and fixed your detector, conditioning on x is a much easier way of computing the probability of error; when you want to derive the detector you have to condition on y. It may seem like a strange tool, but it is useful. Now each of these terms can be computed easily. How do I compute the probability that x hat is not equal to x given x equals x? There are two cases. The first case I will consider is x equals plus or minus (M minus 1) D by 2, the outermost points. For instance, let me fix x to be plus (M minus 1) D by 2, the rightmost point. What should I do? An error happens when y falls outside the decision region, so I integrate from minus infinity to (M minus 2) times D by 2; there is always a factor of D by 2 floating around. Integrate what? What is my f of y?
Ok, give me some numbers: is the limit correct? Yes, it is correct. And what should I put inside? The normal PDF. What will the mean be? (M minus 1) D by 2, and the variance will be N naught by 2. So that is the integral; you have to put in the density carefully and write it out. What about the minus case, x equals minus (M minus 1) D by 2? The same quantity will be the integral from minus (M minus 2) D by 2 to infinity of the normal PDF with mean minus (M minus 1) D by 2 and variance N naught by 2. Now I want you to give me these two expressions in terms of Q functions. What is Q? I will write it with t so that there are not too many x's floating around: Q(t) is 1 by root 2 pi times the integral from t to infinity of e to the power minus u squared by 2, du. So that is Q. I want you to simplify these two expressions and give me an answer in terms of Q; I claim the two are the same. Do you believe me? They have to be equal, by symmetry. So what will it be in terms of Q? And just give me Q of a positive number, do not give me Q of a minus. Why do I get scared if you say Q of minus something? Q of minus anything will be greater than 0.5, and that is not a very nice thing to have. So let me write this down completely: the probability that x hat is not equal to x, given x equals x, is Q of (D by 2) divided by root of (N naught by 2). How many of you think you can get to this expression? It is very simple, not very difficult, but you have to be able to get to it.
So, this expression is probably a much easier one to work with than the other; the other one you have to turn around a little bit to understand. So for the two outermost points this probability is Q of (D by 2) divided by root of (N naught by 2). What about the interior points? 2 times that; do you believe me? For the interior points, x equals plus or minus I D by 2 with I equals 1, 3, and so on up to M minus 3, the probability that x hat is not equal to x given x equals x is 2 times Q of (D by 2) divided by root of (N naught by 2), since you can now fall out of the decision region on either side. So we will stop here, I am running out of time; we will pick up from here on Monday and try to finish it.
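These expressions are easy to sanity-check with a few lines of Python. The `pe_average` function below goes one hedged step further than the lecture: it combines the two cases with the uniform 1 by M prior, which is the standard textbook M-PAM result, not something derived above:

```python
import math

def Q(t):
    """Gaussian tail: Q(t) = (1/sqrt(2*pi)) * integral from t to infinity of
    exp(-u**2 / 2) du, computed via the complementary error function."""
    return 0.5 * math.erfc(t / math.sqrt(2))

def pe_given_x(D, N0, interior):
    """P(xhat != x | x) for M-PAM under minimum distance in AWGN:
    Q((D/2)/sqrt(N0/2)) for the two outermost points, twice that inside."""
    p = Q((D / 2) / math.sqrt(N0 / 2))
    return 2 * p if interior else p

def pe_average(M, D, N0):
    """Average over the uniform 1/M prior: 2 edge points and M-2 interior
    points give (2*(M-1)/M) * Q((D/2)/sqrt(N0/2)), the standard result."""
    return (2 * pe_given_x(D, N0, False)
            + (M - 2) * pe_given_x(D, N0, True)) / M

# With D = 2 and N0 = 2 the Q argument is 1, and Q(1) is about 0.1587
print(round(pe_given_x(2.0, 2.0, interior=False), 4))  # -> 0.1587
```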