This is lecture 43, so maybe we are getting close to the final. Okay, so we've been looking at some very simple coding schemes: how to decode them, the hard decision decoder and the soft decision decoder, which one is better, how you quantify how much better it is, etc. I gave you a simple example of a repetition code and we saw that it didn't really work in terms of coding gain. So what we'll see next is a way of doing coding which makes things very easy for the encoder. It gives you a structured way of doing coding; it's called linear block codes. So when you want to do a linear block code, the block diagram is the same as before: you first accumulate k message bits and then the encoder has to produce n codeword bits. That's what we've been looking at so far, and I made the comment that if you're planning to do a lookup-table-like implementation for going from m to c, then if k becomes 500 or 1000, any implementation of the encoder becomes really impossible; the lookup table becomes very large, and 2 power 500 is a very large number. We also saw that for k equals 1 we weren't getting any coding gain. In fact, it's true that for low k you won't get much coding gain; you have to go to larger and larger block lengths. So for all that, you want to put some encoder here which is a very simple operation. If you were to do a linear encoder, you do the following. Suppose your message vector is m1, m2, and so on till mk; remember, each of these is a bit. Your codeword vector is c1, c2, and so on till cn, with n larger than k. In a linear encoder, each ci will be an XOR of bits of m. So this is a way in which you can define what a linear encoder does.
Each codeword bit is an XOR of selected bits from the message. That's the way a linear encoder operates. In general, ci would look like m_i1 XOR m_i2 XOR ... XOR m_id. So you pick indices i1 through id in some fashion for each i, and then you XOR those bits. That is the definition of a linear encoder, and that is how a linear encoder works. This simplifies matters, at least from an implementation point of view. The only thing the encoder has to remember now is what? For each i, the indices i1 through id and the value of d. Once you remember that, you just pull out those message bits, XOR them, and send the result out. That's an easier implementation than having a 2 power 500 lookup table. So that much is clear; this is clearly possible, and it's one way of doing encoding. Now if you do such an encoding and look at the collection of codewords you've produced, that collection gives you a code, and it's called a linear code. It's linear because if you take two such codewords and XOR one with the other position-wise, you get another codeword. All these things you can prove; we won't spend too much time over them in this class, I'll just quickly brush over what you get. In practice, this XOR structure is simplified further. What people do is, for i between 1 and k, you simply set ci equal to mi. What I described above is a general linear encoder; what I'm going to write down next is what's called a systematic encoding, the systematic linear encoder.
The first k bits that you put out will be equal to the k message bits. And then for i from k plus 1 to n, you do the same thing as before: ci is m_i1 XOR ... XOR m_id, and depending on i, you pick a different set of message bits to XOR. But the first k codeword bits will be equal to the message bits. This is called a systematic encoder. Now, at first glance the general linear encoder might look much more general than the systematic one, so you might say you can get many more codes from general linear encoding than from systematic encoding. But the strange thing is, that's not true: systematic is good enough. The reason is that both of them give you linear codes, and you can do Gaussian elimination and go from one to the other. They are different encoders, but it turns out every linear code has a systematic encoding. So you don't have to worry about non-systematic codes; doing systematic encoding covers all the linear codes you might want to cover. So we will look at a version of systematic encoding only. I'm going to give a couple of examples: the first will be the repetition code, and then other examples of simple linear codes and how to describe them. So here are the examples. The first and simplest example is the repetition code. If you look at k equals one and n equals three, what does the repetition code do? It sets c1 equals m1, and c2 equals what? m1; there's only one message bit, so you can't do anything else with it. You either choose to send a constant zero or one, or you send the message, and it's better to send the message. So you have c2 equals m1 and c3 equals m1. That is the linear encoder for the repetition code, so you see the repetition code is a linear code. And if you look at the code itself, what is it?
It's 0, 0, 0 and 1, 1, 1. If you XOR any two of these vectors, you get a vector in the code, so that works out; it's a linear code. And (c1, c2, c3) is (m1, m1, m1), so is it systematic? Yes, you can see that it's systematic. Here's another example; I'll call it the even parity code. I'll take k equals seven, and then n equals eight. So I have seven message bits and have to produce eight codeword bits. I'm going to set ci equals mi for i between one and seven, and then c8 is going to be m1 XOR m2 XOR ... XOR m7; all the message bits XORed together. Why did I call it an even parity code? If you do this, the number of ones in any codeword will be even. Do you see that? The seven message bits can have an even number of ones or an odd number of ones, but the eighth bit you add is the parity of that. If you had an even number of ones, it will be zero; if you had an odd number of ones, the eighth bit will be one, making the total number of ones in c even. That's why it's called an even parity code: c has an even number of ones. One more thing you can check is that this code is also linear: take any two codewords, each with an even number of ones, and add them; you still get an even number of ones. You can show this easily for binary vectors. So that's the even parity code. Now, there are a lot of questions that can be asked. In the notation for a code, people usually denote a linear code as an (n, k) code; that's the notation. In this case, the even parity code would be an (8, 7) code. And any (n, k) code, how many codewords does it have?
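The two encoders just described, the repetition code and the even parity code, are both instances of one systematic encoding routine. Here is a minimal sketch of that idea; the function name and the index-set representation are mine, not from the lecture:

```python
def systematic_encode(m, parity_sets):
    # Systematic linear encoder: the first k codeword bits are the
    # message bits; each extra bit is an XOR of selected message bits.
    # parity_sets[j] lists the (0-based) message indices XORed into
    # codeword bit k + j.
    c = list(m)
    for idx in parity_sets:
        bit = 0
        for i in idx:
            bit ^= m[i]
        c.append(bit)
    return c

# (3, 1) repetition code: c2 = c3 = m1
rep = systematic_encode([1], [[0], [0]])          # -> [1, 1, 1]

# (8, 7) even parity code: c8 = m1 XOR ... XOR m7
par = systematic_encode([1, 0, 1, 1, 0, 0, 0], [list(range(7))])
```

Note that the only thing the encoder stores is `parity_sets`, exactly the "remember i1 through id for each i" observation made earlier.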
Let's call that the size of the code: the number of codewords, which is 2 power k. These are all simple terminologies. And then the rate, of course, is k by n. All these things are standard notation, and the (n, k) label captures that information. But of course, n and k alone do not fix the exact code; you can't know exactly what code it is. For (8, 7), how many different codes can I have? Fix the systematic encoder in your head: how many different codes can I have? How do you answer that question? The only freedom you have is in c8, so it's a question of which bits of m you choose to XOR to get c8. In how many ways can you choose? What are the different subsets? 2 power 7, right. So you have that many possibilities, and all of them give you seemingly different codes; maybe some of them will be the same, you have to think carefully about what will be the same and what will be different. All right, so that's the definition of a code. I'm going to give you one more example just to round it out, and then we'll go back and look at the definitions. Let me make sure I get this example right: the (7, 4) Hamming code, named after Hamming, who invented this code long, long back. So (7, 4) immediately tells you the number of message bits is 4 and the number of codeword bits is 7. Since I'm doing systematic encoding, I'm going to set ci equals mi for i between 1 and 4; no problem there. I need three parity bits. Then c5 I'm going to set as m1 XOR m2 XOR m3, c6 is going to be m1 XOR m2 XOR m4, and c7 is going to be m2 XOR m3 XOR m4. Is this working out correctly? Is it okay? All right, I think this should work out.
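As a quick sanity check on those parity equations, they can be written out directly in code. This is a sketch; 0-based Python lists stand in for the lecture's 1-based bit labels:

```python
def hamming74_encode(m):
    # Systematic (7, 4) Hamming encoder using the parity equations
    # c5 = m1^m2^m3, c6 = m1^m2^m4, c7 = m2^m3^m4 (1-based, as in the lecture).
    m1, m2, m3, m4 = m
    return [m1, m2, m3, m4,
            m1 ^ m2 ^ m3,
            m1 ^ m2 ^ m4,
            m2 ^ m3 ^ m4]
```

Because every output bit is an XOR of message bits, encoding the XOR of two messages gives the XOR of their codewords, which is exactly the linearity property claimed earlier.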
So let me do a careful check here. Hopefully I got that right; in case there is some trouble, I'll come back and fix it later. All right, so this is the (7, 4) Hamming code. How many codewords will there be? 16 codewords, right? So 16 codewords; it's not much of a problem. This is one way of specifying a linear code: you specify the systematic encoder, you say what the combinations are, and you get the codewords. There are several other equivalent ways which just manipulate this specification and present it differently. One such way is called the generator matrix description. Specifying the systematic encoder, like I said, is one way of specifying the linear code completely. What do I mean by specifying the linear code? I have to tell you how to generate all the codewords; once I've told you that, I've specified the linear code. All these things are different ways of doing the same thing, and the generator matrix is another one. I'll show you how to do it for the Hamming code; you'll see it quickly generalizes, it's very easy. I want to come up with a matrix G such that c equals m times G. That is my objective; if I do that, then I have a generator matrix. Before we go any further, what should the dimensions of G be? m is 1 by k and c is 1 by n, so G should be k by n. That's the first thing. The next thing to look at is that one can come up with G row by row; there are several ways of doing it, and I'll just give you one.
How will I find the first row of G? The first k columns will be the identity I_k: k rows, k columns. People are quickly seeing what I'm going to say; you can see it's very easy to construct G. But look at the question I asked: the first row of G corresponds to what message? If your message is one followed by k minus one zeros, then what will c be? It will be the first row of G. Do you see that? How do you pick out rows of a matrix? Multiply on the left with the vector which is one at the corresponding place. So if we pick the message to be one followed by k minus one zeros, the codeword corresponding to that is the first row of G. That's one way of thinking about it. Another way of thinking about it is the way he said it: just look at the equations and figure out what the matrix has to be. Let me do this example for the (7, 4) Hamming code; it's complicated enough that you should be able to follow it. c1 equals m1, c2 equals m2, c3 equals m3, c4 equals m4. Just based on that, he was able to say that the first four rows and first four columns together make an identity matrix. And then what about c5? It's right there in the equations. So I'm going to put it together in a matrix: c equals m times a matrix whose first four columns are 1, 0, 0, 0; 0, 1, 0, 0; 0, 0, 1, 0; 0, 0, 0, 1. Then I have to say what else goes there. And remember, there is one possible confusion here. Usually when you multiply, you might get m1 plus m2 or something; now we want m1 XOR m2 XOR m3, so for c5 I'm going to put the column 1, 1, 1, 0. So what exactly is my plus in this multiplication? I'm dealing with bits; I can't add one plus one plus one and say it's three. How do I interpret plus? Modulo two.
So all your additions are modulo two. That bridges the gap between addition in matrix algebra and the binary XOR that you have; there's no confusion here, actually. If you go back to linear algebra, people define what are called fields, then vector spaces over fields, and then transformations as matrices. Matrices with entries from a field are very well understood, and it turns out {0, 1} is the smallest field you can have; it's called the binary field. Addition is modulo two; multiplication is trivial, but you can still think of it as modulo two. So it's all very clear from theory: all the linear algebra you learn still applies. Anyway, we can interpret addition modulo two and write out this matrix very easily. What will the next column be, for c6? 1, 1, 0, 1. And the last column, for c7? 0, 1, 1, 1. So that is the generator matrix. And remember, if you go from these equations to the generator matrix, it's nice to work column by column; just fill it out column by column, because each column corresponds to one ci. That's how matrix multiplication works: you take the row vector m, multiply it with a particular column, and you get that entry. It's a very easy thing to do, and I can see people nodding, but it can get confusing when you see questions in the exam, so make sure you understand how this matrix was obtained. The next step is understanding the rows of G. What is the first row of G? You obtain it as the codeword when your message is what? One followed by three zeros. If the message is 0, 1, 0, 0, you get the second row. So that's another interpretation. Now, what about all the codewords of the code? Once I write this down, how would you succinctly describe all the codewords of the code in terms of the generator matrix?
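Before answering that, the product c = mG with additions taken modulo two is easy to render in code. A sketch, using the (7, 4) Hamming generator matrix just derived:

```python
# Generator matrix of the (7, 4) Hamming code: [ I_4 | P ],
# with columns 5, 6, 7 read off the parity equations.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 1, 1],
    [0, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 1, 1],
]

def encode(m, G):
    # c = m G, with every addition taken modulo 2 (i.e. XOR)
    n = len(G[0])
    return [sum(m[i] * G[i][j] for i in range(len(m))) % 2
            for j in range(n)]
```

Multiplying by the message (1, 0, 0, 0) picks out the first row of G, exactly the row interpretation described above.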
What do you call linear combinations of rows of a matrix? The row space. So the code is the row space of G; the generator matrix is a complete specification of the code. Those are all various ways of thinking about it. And once you think about it that way, you see the distinction between the non-systematic and systematic versions disappears: it's just the row space of a matrix, and you can always do Gaussian elimination on a matrix to get an identity block on the left-hand side. So whether the encoder is systematic or non-systematic, it works out. All right, so that is the generator matrix description. Let's spend a couple of minutes and write down generator matrices for the repetition code: n equals three, k equals one, the (3, 1) repetition code, so maybe I should write it that way. What's the generator matrix? What are the dimensions? One by three, and what is it? 1, 1, 1. Another way of thinking about it is that the rows of the generator matrix are a basis for my codeword space. Dimension, basis, all these linear algebra terms will come and help you if you learn the linear algebra well. What about the last example, the (8, 7) even parity code? It's also called the even weight code. What's the generator matrix? You have I_7, a seven by seven identity, followed by an all-ones column; maybe the all-ones column gets a shorthand notation. So that is the generator matrix; it's easy to come up with. The next matrix that people use to describe a linear block code is what's called the parity check matrix.
It's once again a simple manipulation of the equations, except that it can be a little bit non-trivial; we'll again go back to linear algebra finally and justify why it makes sense, but initially I'll present it as a manipulation of the equations we had. So let's look at the parity check matrix: it's just arranging the equations in a different way. Instead of looking at c equals m times G, I want to bring everything in my equations to one side and have zeros on the other side. If you do that, you get what's called a parity check matrix. It's very simple. Remember, in my systematic encoding I have ci equals mi for i between one and k, and for the other i's, from k plus one to n, ci is an XOR of mi's with i from one to k. But since ci equals mi for i from one to k, instead of writing m_i1, m_i2, and so on, I can just as well write c_i1, c_i2, up to c_id; why not? So now, how do I get zero on the right-hand side of each equation? You bring everything to the left side. And how do you bring XORs to the left? When you do modulo two addition, it doesn't matter whether you do plus one or minus one: minus one mod two is the same as plus one. So this equation becomes ci XOR c_i1 XOR c_i2 XOR ... XOR c_id equals zero. How many such equations will I get? n minus k equations. These equations, collected together in matrix form, are what's called a parity check matrix. It's very easy to come up with the structure of the parity check matrix as well; I'm going to do it for the example of the Hamming code, and then you'll see the generalization.
It's quite easy to come up with the parity check matrix. So once again, the example of the Hamming code. What are the equations we had? We had c5 equals c1 XOR c2 XOR c3, am I right? Then c6 equals c1 XOR c2 XOR c4; this is the (7, 4) Hamming code. Then c7 equals c2 XOR c3 XOR c4. Is that okay? Now I'm going to collect everything on the left side. The set of equations I get will be: c1 plus c2 plus c3 plus c5 equals zero; c1 plus c2 plus c4 plus c6 equals zero; c2 plus c3 plus c4 plus c7 equals zero. I want to collect them together and write them as one matrix multiplying c transpose, equal to the zero vector; a zero vector of length three in this case. So what should this matrix be? Here you'll find filling it out row by row is much easier; you can also do it column by column, there's nothing wrong with that, but row by row is much, much easier. The first row will be what? 1, 1, 1, 0, 1, 0, 0. Am I right? The next one will be 1, 1, 0, 1, 0, 1, 0, and the last one will be 0, 1, 1, 1, 0, 0, 1. This matrix is typically denoted H, and it's called the parity check matrix; this is the one for the Hamming code. What will its dimensions be? For an (n, k) code, it will be (n minus k) cross n. And what property will it have? Remember the property the generator matrix had: any codeword can be written as m times G, so the row space of G specifies the code. It turns out the parity check matrix also completely specifies the code. It has to, right? I just took all the equations and played around with them and gave them to you in another form.
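Numerically, the defining property that H times c transpose is zero for every codeword is easy to check. A sketch, using the H just written down; the function name is mine:

```python
# Parity check matrix of the (7, 4) Hamming code: [ P^T | I_3 ]
H = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def syndrome(H, c):
    # H c^T, with additions modulo 2; the all-zero syndrome means c
    # satisfies every parity check equation, i.e. c is a codeword.
    return [sum(h * b for h, b in zip(row, c)) % 2 for row in H]
```

Flipping any single bit of a codeword makes the syndrome nonzero, which is what the decoder will eventually exploit.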
So obviously the parity check matrix also completely specifies the code, but to see exactly how, you have to know a little bit of linear algebra. What is the terminology for that? Yes, the null space: the code is the null space of H, the set of all c such that H times c transpose is zero. Remember, once again, H times c transpose is computed modulo two; don't come up with twos and fours and sixes, it's all modulo two. So the null space of H equals the row space of G equals the code. And the rows of H are what? They are vectors orthogonal to my code space: every codeword in my code is orthogonal to the rows of H. For instance, take a 2D plane in three dimensions, a plane through the origin so that it is a linear subspace. There are two ways of specifying it. How do you specify it? Either you specify two linearly independent vectors lying in the plane, or you specify one vector which is normal to the plane. Both are equivalent ways of specifying the subspace: either give a basis, or give a basis for the orthogonal space, what's called a dual basis. Null space and row space are like that. So you saw G was a k by n matrix, and H naturally became an (n minus k) by n matrix. These are all different ways of specifying the same code, so you'll see these terms thrown around here and there. Why did we do all this? There are several reasons. First of all, these matrices give a compact description of the code.
And they are used at the encoder and decoder. Instead of listing the equations used at the encoder, you specify the matrix, and that gives you the equations. One more observation I want to make about H. You generally think of G in systematic form: it will have an identity block I_k here, and then some matrix P. What are the dimensions of P? k by (n minus k). So the first k columns are the identity and the rest are some arbitrary matrix P. If you take this G, look at the equations, push everything to the left side, and convert it into a parity check matrix, it turns out the parity check matrix has a very, very simple form which you can quickly figure out: H equals P transpose followed by I_(n minus k). You can check that this was true for the Hamming case, and it has to be true in general; it's just a simple rearrangement of the equations, nothing really deep here. It's also very standard in linear algebra: this is how you find a basis for the null space of a matrix; you do Gaussian elimination, and the null space basis comes out as the rows of [P transpose, I]. So it's a very easy way of doing things. This is the parity check matrix and that is the generator matrix. Any questions on any of this? It's all very basic stuff, but it's still good to spend some time thinking about it. What we're going to do next is work out the generator matrix and parity check matrix for the two other examples that we saw. For the (3, 1) repetition code, what is G? We already saw it: 1, 1, 1. And what is the parity check matrix? It's the two by three matrix with rows 1, 1, 0 and 1, 0, 1. Anyway, you can do it row-wise or column-wise, anything you want.
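The recipe [I_k | P] to [P^T | I_(n-k)] is mechanical enough to code up. A sketch; the helper name is mine:

```python
def parity_check_from_generator(G, k):
    # G assumed systematic: G = [ I_k | P ].  Returns H = [ P^T | I_{n-k} ].
    n = len(G[0])
    r = n - k
    H = []
    for j in range(r):
        row = [G[i][k + j] for i in range(k)]          # j-th column of P
        row += [1 if t == j else 0 for t in range(r)]  # j-th row of I_r
        H.append(row)
    return H

# (3, 1) repetition code: G = [1 1 1]  ->  H has rows (1,1,0) and (1,0,1)
H31 = parity_check_from_generator([[1, 1, 1]], 1)
```

Applied to the (7, 4) Hamming generator matrix, the same routine reproduces the parity check matrix written out earlier.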
Let me do one more example before I go to the next thing; I'll do the (8, 1) repetition code. There's a good reason why I'm doing it, and you'll see it quickly enough. What will G be? Eight ones, right? 1, 1, 1, 1, 1, 1, 1, 1. What will H be? It will be the all-ones column 1_7 followed by I_7; it's seven by eight. Is that okay? Now let's see the last example, the (8, 7) even parity code. The generator matrix, we already saw, is I_7 followed by the all-ones column 1_7. And what will the parity check matrix be? You might expect a long, big parity check matrix, but it is simply the single row 1, 1, 1, 1, 1, 1, 1, 1. Does anyone want to make any observations about all these things I've written down? What do you think the last two examples are? It turns out the (8, 1) repetition code is the dual of the (8, 7) even parity code. It's an interesting way of looking at these things. You might say the generator matrix of one and the parity check matrix of the other have a relationship; yes, they do, they're actually the same thing. So how I got this H should be clear; it's very easy to see. There are several ways of figuring these things out. One more way of thinking about it: every codeword in my even parity code has even weight, and I know my dual space is only one dimensional, so I need only one vector for the dual space, and the all-ones vector has to be orthogonal to every even weight vector. Why? If you take the dot product of the all-ones vector with any even weight vector, what happens? You get an even number, which is zero mod two. So it all makes sense.
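That orthogonality argument can be verified exhaustively at length eight. A small sketch:

```python
from itertools import product

def dot_mod2(u, v):
    # inner product over the binary field
    return sum(a * b for a, b in zip(u, v)) % 2

ones = [1] * 8
# The all-ones vector is orthogonal (mod 2) to exactly the even-weight
# vectors -- which is why it is both the single parity check of the
# (8, 7) even parity code and the generator of its dual, the (8, 1)
# repetition code.
even_checks = all(
    (dot_mod2(ones, v) == 0) == (sum(v) % 2 == 0)
    for v in product([0, 1], repeat=8)
)
```

Here `even_checks` comes out true over all 256 binary vectors of length eight.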
Okay, so I think that's enough talk about matrices. Let's proceed; the thing I want to retain is the (7, 4) Hamming code. The generator matrix worked out as the identity I_4 followed by the parity columns 1, 1, 1, 0, then 1, 1, 0, 1, and then 0, 1, 1, 1; that is, the rows are 1, 0, 0, 0, 1, 1, 0; then 0, 1, 0, 0, 1, 1, 1; then 0, 0, 1, 0, 1, 0, 1; and then 0, 0, 0, 1, 0, 1, 1. Did I get that right? This was my G, and the corresponding H turned out as the rows 1, 1, 1, 0, 1, 0, 0; 1, 1, 0, 1, 0, 1, 0; and 0, 1, 1, 1, 0, 0, 1. Is that okay? All right, this is what I want to retain, and I want to give you a feeling for what it is. The first thing you should be able to do, at least for small examples, is: given the generator and parity check matrices, easily come up with the list of codewords. How many codewords do you have here? 16. You can easily come up with the list of codewords; there's no problem. So the code itself can be computed; I'm not going to go through the list, but it can be computed. The next thing is you can design simple encoders; when I say design, basically you can write a program to implement this encoder, there's really no difficulty. So these two things you should be able to do, in software or hardware or whatever your interest is. But when k and n become larger and larger, one of these things is definitely much easier than the other. Which one is easy, and which one is difficult? If k becomes 500 and n becomes 1000, I have a (1000, 500) code, and my generator matrix becomes a 500 by 1000 matrix.
If I have to list out all the codewords of my code, it's pretty much impossible, but I can still do encoding with reasonable complexity, right? Maybe you have to store a huge matrix and remember it, but that's okay; you can do that and implement your encoder. Still, when k and n become larger and larger, even this matrix-based encoding becomes a problem. So think about the complexity of these two things and you'll get a feel for what's possible and what's not. Typically people simplify the encoder much further; we'll come back and address the problem of how to simplify encoders, and we'll see codes with very, very simple encoders as we go along. Those are things to think about. But one thing we have not addressed so far is what? The decoder. It's not enough if I give you a fancy generator matrix or parity check matrix; how do you use them at the decoder? Once again, you have all those methods available to you. There are basically two different ways of decoding. What are the two ways? One is hard decision, the other is soft decision. These hard decision and soft decision decoders are very easy to describe, and I'll describe them, but when it comes to implementation, some of them become more difficult. That's what we're going to see next: how to go about decoding these simple codes, mostly using the (7, 4) Hamming code as an example. Any questions? All right. One notion that will come up when you decode and study the decoder's properties is what's called minimum distance. So far I have not defined it; I'm going to define it next, before we jump into the decoder, because while I can describe the decoder to you without it, if you want to analyze the decoder you will need to know the minimum distance.
If you go back to the way we analyzed our ML decoder, remember I was searching for the error paths which are the smallest distance away from each other. It turns out the same idea shows up in the coding scenario, and to understand it you need minimum distance; there are many different ways of thinking about it. Finally, when you want to do decoding, you have to do detection on a high-dimensional constellation; I gave you that description also. When you do coding, you're not doing symbol-by-symbol detection, so your constellation is not very simple: for the (7, 4) Hamming code, your constellation lives in seven dimensions. Given a received point, you have to find the nearest codeword, and the separation between codewords fixes your probability of error. How do you do an approximate probability of error analysis? Find the distance to the closest neighbor, compute Q of d by two sigma, and multiply by the number of nearest neighbors you have; that's an approximate estimate of the probability of error. So I have to know, given my constellation point, which is the closest neighboring constellation point. In the coded scenario, minimum distance and closest neighbors are very closely related, and for all that you need the minimum distance notion. That's what I'm going to introduce next; at least for BPSK and the like, it's very, very useful to know this minimum distance separation. So I'll assume we have an (n, k) linear code C. If I enumerate all the codewords, I'll have c1, c2, and so on until c_(2 power k); the 2 power k I'll call M, which is the size of the code. Remember, when you think of the expanded signal constellation for the codewords, it's going to be seven-dimensional here. And how do you go from codeword bits to symbols? You map zero to plus one and one to minus one. So it's a high-dimensional constellation; it's tough to imagine what it looks like.
So in such a picture, I want to find the closest point; that's my aim, okay? But I will do it in binary. I won't do it after I map zero and one to plus one and minus one; I'll do it in binary. You'll see it's closely related; it's not all that different, okay? So for that, minimum distance is useful. Here's the definition of minimum distance: d_min is the minimum, over i and j between 1 and M with i not equal to j, of what's called the Hamming distance between c_i and c_j, okay? So I have to define what Hamming distance is. What is Hamming distance? How many of you know? Nobody knows, okay? It's the number of places in which two vectors differ. Let me define that. d_H(u, v) — I can give a mathematical definition, but I'll just simply write it down — is the number of positions where u and v differ, okay? If you want a mathematical definition: suppose u and v are length-n vectors, u = (u1, ..., un) and v = (v1, ..., vn); then d_H(u, v) is the size of the set of i between 1 and n such that u_i is not equal to v_i, okay? So there's a nice relationship between Hamming distance and what's called Hamming weight. The Hamming weight w_H(u) is the number of ones in u; that's the definition, okay? And the nice relationship between these two is that the Hamming distance between u and v can be written as the Hamming weight of some vector. What is that vector? u XOR v. Do you see that? See, what does XOR do? Where the bits are equal it gives a zero, and where they are not equal it gives a one, okay? So to count the number of places in which they differ, you can XOR the vectors and then count the number of ones in the result, which is the Hamming weight, okay? So all these are definitions. Now if you go back to the minimum distance of a code, if you go back to this definition, what does this definition ask you to do?
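The relation d_H(u, v) = w_H(u XOR v) just stated is easy to check on small vectors; a minimal sketch with made-up example vectors:

```python
def hamming_weight(u):
    """Number of ones in the vector u."""
    return sum(u)

def hamming_distance(u, v):
    """Number of positions where u and v differ."""
    return sum(1 for ui, vi in zip(u, v) if ui != vi)

def xor(u, v):
    """Position-wise XOR of two equal-length bit vectors."""
    return [ui ^ vi for ui, vi in zip(u, v)]

u = [1, 0, 1, 1, 0, 0, 1]
v = [1, 1, 0, 1, 0, 1, 1]
# XOR puts a 1 exactly where u and v differ, so the two numbers agree.
print(hamming_distance(u, v), hamming_weight(xor(u, v)))
```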
Given a code, if you have to find its minimum distance, what should you do? Look at all possible pairs of code words. So how many possibilities are there? Roughly M squared, right? And M squared is 2 power 2k, okay? So it's definitely exponential in k; it's a large number. If you want the exact number, what is it? M choose 2, right? M times (M minus 1) by 2, if you're careful about it. Roughly M squared, so 2 power 2k. So it's clearly exponential in k; it's a large computation. It's very difficult to do this, okay? Now, it's possible to keep it exponential but make the complexity slightly smaller: from M squared you can come down to M, okay? How do you do that? You use the relationship between Hamming distance and Hamming weight, and the linearity of the code; then you can go from M squared to M, okay? But still, it turns out you can't go much below M. This problem has been proven to be very, very difficult. It seems like a silly problem, right? I give you a generator matrix or a parity check matrix; find the minimum distance of the code. It turns out it's difficult. Obviously it's not difficult for something small like the (7,4) code, but when k is 2000 and n is 3000, it turns out it's difficult to find the minimum distance. But we can go from M squared to M by using the simple relationship. So how do you go from M squared to M? You can quickly argue as follows. I gave you the definition: the minimum over i not equal to j of the Hamming distance between c_i and c_j. But you know this is the same as the Hamming weight of c_i XOR c_j, okay? And what do I know about c_i XOR c_j? By linearity, it has to be equal to c_l for some l between 1 and M, okay? But it cannot be the all-zero code word, okay? One thing I never stressed: if i equals j, then c_i XOR c_j becomes the all-zero word, okay? And that cannot happen here, since i is not equal to j.
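The pairwise definition and the weight-based shortcut can be compared on a toy code. As an example I use the (3, 2) even-parity code (a hypothetical small instance, small enough to enumerate by hand); both computations should agree:

```python
from itertools import combinations

def xor(u, v):
    return [a ^ b for a, b in zip(u, v)]

def weight(u):
    return sum(u)

def dist(u, v):
    # d_H(u, v) = w_H(u XOR v)
    return weight(xor(u, v))

# The (3, 2) even-parity code: all length-3 words with an even number of ones.
code = [[0, 0, 0], [1, 0, 1], [0, 1, 1], [1, 1, 0]]

# Pairwise definition: M-choose-2 comparisons.
d_pairs = min(dist(u, v) for u, v in combinations(code, 2))
# Linearity shortcut: minimum weight over the M - 1 non-zero code words.
d_weight = min(weight(c) for c in code if any(c))
print(d_pairs, d_weight)
```

The shortcut does M − 1 weight computations instead of roughly M²/2 distance computations, which is exactly the reduction described above.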
So it has to be a non-zero code word in the code, okay? That's the only constraint. Once you see that, you can write this as the minimum, over non-zero c in the code, of the Hamming weight of c, okay? So we have gone from M squared to M, from 2 power 2k to 2 power k. It seems like no reduction at all; nobody would consider that a reduction in complexity. But anyway, it's still some comfort; at least for proofs and simple problems, this is very, very useful, okay? Is that clear? So the minimum distance of a linear code, which is technically defined as the minimum separation, if you will, between any two code words, is the same as what? The minimum number of ones in any non-zero code word, okay? It's like saying this: if I give you a finite number of points and they form a linear subspace, then to find the minimum separation between any two of them, it's enough to go to the origin. Just sit at the origin and look for the closest point, and that will be the same as from any other point also. The reason is that the whole thing is linear, and you can shift the origin to any code word and the picture looks exactly the same, okay? It's tough to visualize this if you're used to only real spaces; this is not a real space, it's a Hamming space, so it becomes a little bit twisted, but that's the principle, okay? So we have five minutes left, so we'll see two simple examples, and then I'll pick up slightly more complicated examples in the next class. The first example I want is the (n, 1) repetition code. What will be the minimum distance? d will be n; you see that? It's very easy to see: there are just two code words. Just look for the non-zero code word with minimum weight; there's only one non-zero code word, the all-ones vector, and its weight is n, okay? A slightly more twisted example is the (n, n−1) even weight code, also called the even parity code. What will d be for this? I think a lot of people who know the answer are giving me the answer: it has to be two, okay?
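Both examples can be verified by brute force. The sketch below enumerates each code explicitly and applies the minimum-weight rule; this is fine for small n, though the lecture's point is that such enumeration blows up for large k:

```python
from itertools import product

def codewords_even_parity(n):
    """All length-n binary vectors with an even number of ones."""
    return [list(v) for v in product([0, 1], repeat=n) if sum(v) % 2 == 0]

def min_distance(code):
    """Minimum distance of a linear code = least weight of a non-zero word."""
    return min(sum(c) for c in code if any(c))

n = 5
# (n, 1) repetition code: the only non-zero code word is all ones, so d = n.
repetition = [[0] * n, [1] * n]
print(min_distance(repetition))

# (n, n-1) even-weight code: the lightest non-zero word has two ones, so d = 2.
print(min_distance(codewords_even_parity(n)))
```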
So convince yourself that you can prove this. How do you prove it? I think people who are doing the self-study course should not answer this question. How do you prove this? Yeah, that's all: look at the generator matrix and the parity check matrix, okay? This code has 2 power (n minus 1) vectors, so it seems complicated. But look at the parity check matrix. What is the parity check matrix? It's a single row of all ones — n ones, right? And any code word of this even weight code has to be such that H times c transpose is zero. So you have to look at that closely and try to figure out the smallest number of ones that c can have so that H times c transpose is zero. And the best way of doing it is to work exhaustively, starting with the least weight possible, okay? You might say you'll come up with a clever way, but that can be very, very deceiving; in general this is a very complicated problem. So the best way of solving it is to simply start small. You're not allowed to start with the all-zero word — obviously H times zero will be zero, but you want the minimum weight of a non-zero code word. Then what is the least weight possible for a non-zero code word? One, okay? But if you pick a weight-one vector, what will happen? H times c transpose will be one, not zero. So the next step is to go to weight two, and at weight two you can find a code word, so you know definitely d will be two, okay? That is essentially the only way of proving this result. Unless you are very confident and can write it down precisely, you might fool yourself with some very clever shortcut. In this particular case there might be some special easy arguments, but in a general situation, if I give you a parity check matrix and you have to find the least weight non-zero code word, this is the only way of doing it.
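The exhaustive search by increasing weight described here can be written down directly. A minimal sketch: check H c^T = 0 over GF(2), trying weight 1, then weight 2, and stopping at the first weight that yields a code word.

```python
from itertools import combinations

def is_codeword(H, c):
    """True if H c^T = 0 (mod 2), i.e. c satisfies every parity check."""
    return all(sum(h * b for h, b in zip(row, c)) % 2 == 0 for row in H)

def min_distance_exhaustive(H, n):
    """Try vectors of weight 1, 2, ...; return the first weight that works."""
    for w in range(1, n + 1):
        for ones in combinations(range(n), w):
            c = [1 if i in ones else 0 for i in range(n)]
            if is_codeword(H, c):
                return w
    return None  # only the all-zero word satisfies H (degenerate code)

# Even-parity code on n bits: H is a single row of n ones.
n = 6
H = [[1] * n]
print(min_distance_exhaustive(H, n))
```

Every weight-1 vector fails the single parity check, and the first weight-2 vector passes, which is exactly the argument made in the lecture.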
Of course you don't start with the all-zero vector; start with vectors of weight one, then vectors of weight two, vectors of weight three, and slowly, exhaustively finish everything, stopping at the point where you find the first code word, okay? All right, so we'll stop here.