So, to recap the last lecture: we saw the Viterbi decoding algorithm, and then I was trying to introduce recursive convolutional encoders, that is, systematic convolutional encoders which have feedback. The example I took was very simple: G(D) = [1, (1+D²)/(1+D+D²)]. If you look at an implementation, you will notice it is one of those canonical linear system implementations. If you remember how this is done, this is how the encoder looks. If you have a message sequence u, then v0 is in fact equal to u, and v1 is given by the recursive equation v1(n) + v1(n-1) + v1(n-2) = u(n) + u(n-2). The circuit satisfies exactly that equation. So, this is the example we saw, and if you draw the trellis, you will see it has different labels compared to the trellis you had previously, because the state transitions are different. Previously we had a simple feedforward encoder, and you could write down the state transitions very easily. I think it is instructive to see one example of this, so I will do the trellis for this encoder and just leave it at that. I am going to use s0 and s1 as my state. What is my output in terms of the state? v0(n) = u(n), there is no problem there. What about v1(n)? It is useful to give a name, say w, to the quantity entering the first flip-flop: w(n) = u(n) + s0 + s1. Then v1(n) = w(n) + s1, which works out to u(n) + s0.
So, is this fine or not? How many of you like this, how many of you do not? Let me see a show of hands. At least one person does not accept it; okay, we will go with this, and if there is an error we will fix it later. The next-state update works like this: s0 becomes w(n), and s1 becomes the old s0. Be very careful about how you do the next-state update. Remember this when we do the trellis. I will do one complete stage of the trellis, not the full evolution. I have four states; call them 0,0; 0,1; 1,0; 1,1, just so we do not get confused, where the state is written as (s0, s1). And what are my two outputs? v0(n) = u(n), and v1(n) = u(n) + s0. And my next state? To write it down I need w: w(n) = u(n) + s0 + s1. So the next s0 is u(n) + s0 + s1, and the next s1 is the old s0 itself. So, when you draw the trellis for these recursive encoders, it is good to write down all these equations first.
Do not think in terms of which value of s1 to use, before or after the shift. Just write down, at a particular time instant n, when the nth input bit is being clocked in, what the different values are. Assume all the XOR gates have no delay; they act immediately. The moment the nth input bit is in, all the XORs are evaluated. Only after that do the D flip-flops get clocked; the shift happens and the state settles. Then the next input is clocked in, and again the XORs act immediately. So, you should not imagine that the D flip-flops operate while the input is being clocked in; that is how you get into all kinds of confusion. The state is all set and fixed when the input is clocked in. You compute the output, and only after that do you clock the D flip-flops so the shift happens. Otherwise you will be worried about whether to use the previous value of s1 or the next value, all those things. This is what makes the most sense, and it is what I have written down. So, you see the recursive computation works in a slightly involved way when you write it in terms of states. Try to go ahead and do it on your own. I am going to do it on the board; I might go wrong, so check against what you get. Always compute the next state and the output in terms of the current state. So: if you are at 0,0 and the input is 0, clearly the next state is also 0,0 and the output is 0,0. If you are at 0,0 and the input is 1, the next state is 1,0 and the output is 1,1. If you are at 0,1 and the input is 0, the next state is 1,0 and the output is 0,0. So, you see, this requires a little bit of practice.
So, if you do not practice this even once on your own, and you see it for the first time in the exam, you will have a tough time. There are a lot of small intricacies with s0 and s1, and writing these equations down first is the best way of doing it. You can also do it without writing the equations down; I believe you will get lost if you do that, but if you are comfortable, go ahead. At the end of the day it is just a simple digital system, so you should be able to do it quite fast. From 0,1 with input 1, you go back to 0,0 and the output is 1,1. Even at this point, notice the difference between this trellis and the previous one. Out of a particular state, the two branches still correspond to two different input values, same as in the previous trellis. But look at the right-hand side: the two branches coming into 0,0, for instance, correspond to inputs 0 and 1. That is very different from the feedforward case, where that can never happen. Only in the feedback case can two different inputs come into the same state. Go ahead and complete the trellis; it is a little bit interesting. From 1,0 with input 0, you go to 1,1 and the output is 0,1; with input 1, you go to 0,1 and the output is 1,0. From 1,1 with input 0, you go to 0,1 and the output is 0,1; with input 1, you stay at 1,1 and the output is 1,0. So, that is how the trellis looks. But once you have this trellis, the Viterbi algorithm is exactly as before. It is nothing different.
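To check the trellis labels worked out above, here is a small sketch in Python (the function name is mine, not from the lecture) that enumerates one stage of the trellis for this recursive encoder:

```python
# One stage of the trellis for the recursive systematic encoder
# G(D) = [1, (1+D^2)/(1+D+D^2)]; the state is (s0, s1).

def step(s0, s1, u):
    """Clock in input bit u while the state is held fixed; return (outputs, next state)."""
    v0 = u                    # systematic output: v0(n) = u(n)
    v1 = u ^ s0               # v1(n) = u(n) + s0
    w = u ^ s0 ^ s1           # w(n) = u(n) + s0 + s1 enters the first flip-flop
    return (v0, v1), (w, s0)  # next state: s0 <- w(n), s1 <- old s0

for s0, s1 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    for u in (0, 1):
        out, nxt = step(s0, s1, u)
        print(f"({s0},{s1}) --u={u}/v={out}--> {nxt}")
```

Running this reproduces the branches worked out on the board, including the transitions out of 1,0 and 1,1 that we corrected along the way.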
Once you have the input and output labels, you can go from output bits to output symbols, and given the corresponding received values you can compute branch metrics, put them on the branches, then compute state metrics and survivor paths. The only thing that changes is at the end: when you store a survivor path as a sequence of states, you cannot easily go back from there to the message bits, so you need one more function. Every sequence of states still corresponds to a unique message sequence; there is no problem with that, except that you cannot just extract one bit out of each state to get the message that was actually sent. You have to do one more computation, a simple one, but it needs to be done. So, that is the trellis for this recursive code. Once again, when you write down equations like this, remember that while you are clocking in the input, the states are supposed to be fixed. Clock in the input, do all the XORs, get the output out; then clock the D flip-flops so the state shifts, and then take the next input. That is the sequence in which things happen, and I believe it is the same thing you assume for any finite state machine when you do digital systems. So, previously one of the descriptions we had for the codewords was using the generator matrix: using G(D), every codeword sequence could be written as U(D)·G0(D) and U(D)·G1(D). Even here we will try to do the same thing, but there will be some confusion. So, let us look at that very closely and see how to get around it.
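That one extra computation can be sketched as follows (a hypothetical helper, assuming the (s0, s1) state convention used above): since the next s0 equals u + s0 + s1, the input bit on each branch is u = s0_next + s0 + s1 over GF(2).

```python
# Recover message bits from a survivor path (a sequence of states) of the
# recursive encoder with denominator 1+D+D^2: u(n) = s0_next + s0 + s1 over GF(2).

def states_to_message(states):
    """states: list of (s0, s1) pairs along the survivor path."""
    bits = []
    for (s0, s1), (s0_next, _) in zip(states, states[1:]):
        bits.append(s0_next ^ s0 ^ s1)
    return bits

# The path (0,0) -> (1,0) -> (0,1) -> (0,0) decodes to the input bits 1, 1, 1.
print(states_to_message([(0, 0), (1, 0), (0, 1), (0, 0)]))
```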
So, let me write down G(D) once again; the generator matrix is what we are going to look at next. G(D) = [1, (1+D²)/(1+D+D²)]. What am I interested in when I encode? I know how many message bits I have, say k message bits. I start at the all-zero state and clock in my k message bits. After that, what should I do? I should return to the all-zero state again. For that, what did we do before? We just sent in zeros. Now it is not enough to send in zeros. Notice: if you are at the 0,1 state, how will you go back to 0,0? You have to send in a 1, not a 0. So, depending on which state you are in, you have to clock in suitable bits to take you back to the all-zero state. Your termination changes slightly when you produce codewords. Until you have clocked in all your message bits there is no problem; after that, to take you back to the all-zero state, it is not enough to clock in zeros. You have to clock in a suitable set of bits, and that changes the way you produce your codewords. So, what I am saying is: you send m termination bits, not necessarily all zero, and then you are back at the all-zero state. It is very easy to work this out for this example. If, after clocking in my k message bits, I am at state 1,0, what are the two bits I have to clock in? 1 and 1: the first 1 takes me to 0,1, and the next 1 takes me to 0,0. If I am at 1,1, what will I do? 0 and then 1.
If I am at 0,1, what will I do? 1 and then 0. So, you have to clock in two bits whether you like it or not, even if you reach the all-zero state early; it keeps the length consistent. If I am at 0,0 itself, I clock in 0, 0. So, that is the description of the termination. If you remember the terminated trellis, it narrows down to two states and then to one state at the end; already there you had to clock in specific bits. So, when you clock in k message bits, you actually get k + m stages in the trellis, and in this case 2(k + m) codeword bits. All of that remains the same. Let me write it down once again: a k-bit message u, when it goes through a rate-1/2 convolutional encoder with termination, gives you 2(k + m) codeword bits, same as before, except that in the termination stage you are not clocking in zeros; you are clocking in some other set of bits. Now, there is a very nice way to describe this in terms of the generator matrix. Remember, the codeword in terms of the generator matrix is V(D) = U(D)·G(D), which in my case is U(D)·[1, (1+D²)/(1+D+D²)], and remember m = 2 in this example. I am going through fully with one example, but even in the general case it is exactly the same; maybe we will see one more example for a more general situation. Typically, in the feedforward case, you think of U(D) as containing only the message bits. The reason is that the termination bits are zero, so even if you add them, U(D) does not change.
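The termination bits for each state can be found mechanically. Here is a brute-force sketch (my own helper names) for the m = 2 case of this encoder:

```python
from itertools import product

def next_state(s0, s1, u):
    # State update of the recursive encoder with denominator 1+D+D^2.
    return (u ^ s0 ^ s1, s0)

def termination_bits(state, m=2):
    """Search for the m input bits that drive the encoder back to the all-zero state."""
    for bits in product((0, 1), repeat=m):
        s = state
        for u in bits:
            s = next_state(s[0], s[1], u)
        if s == (0, 0):
            return bits
    raise ValueError("no terminating sequence of length m")

for st in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(st, "->", termination_bits(st))
```

This reproduces the values worked out in class: 1,1 from state 1,0; 0,1 from state 1,1; 1,0 from state 0,1.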
In the feedback case, it is useful to include the actual termination bits in U(D). When you add them, the degree of U(D) increases slightly, depending on what the previous values were. A very convenient and nice way of describing that is to say: I will take U(D) = (u0 + u1·D + ... + u_{k-1}·D^{k-1})·(1+D+D²). The first factor is purely the message; the multiplication by 1+D+D² takes care of the actual termination that is required. Yes or no? What happens if I set U(D) to be this? What will V(D) be? One component is U(D)·1 = message·(1+D+D²), and the other is U(D)·(1+D²)/(1+D+D²) = message·(1+D²). What encoding is this? Exactly the same as the feedforward case. So, if I set U(D) to be my message times 1+D+D² and feed it into this feedback encoder, I get the exact same mapping as I got with my feedforward encoder, and the exact same list of codewords. Let me write that down: if you set U(D) this way, it is exactly the same as the feedforward case. That is good to know. But what is the difference? What do you lose when you do this? One motivation I gave for this recursive encoder was the systematic nature, and I am losing exactly that: my message bits will not appear in the codeword at all. I am completely losing the systematic nature; I am undoing everything that I did.
So, what you usually do is set U(D) to be something else. To retain the systematic nature, you set U(D) = u0 + u1·D + ... + u_{k-1}·D^{k-1} + u_k·D^k + u_{k+1}·D^{k+1}, where the first k coefficients are your message bits, and the last two are termination bits chosen such that 1+D+D² divides U(D). This whole business of returning to the all-zero state can be described exactly like that: you add those two bits so that 1+D+D² divides the entire U(D). It is very easy to do as well: you divide the message part by 1+D+D², which is actually what that feedback circuit is doing, and feed the remainder back in. There is a lot of theory behind how you can accomplish division with shift register circuits; I think I mentioned this briefly during the encoding of cyclic codes also. Another way of visualizing it: look at the state you are in after you shift in the kth bit, then send in whatever is necessary to get back to the all-zero state. Even with this kind of U(D), you only get the same set of codewords as before, but what changes is the mapping from message bits to codeword bits. Even if you form U(D) this way, you only get multiples of 1+D+D², except that they appear in a different form, and you get a systematic encoding. So, these are several ways of thinking about it. The whole idea is to get a codeword that stops after some time; after a while the response should die out, since it is a finite length.
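Choosing the two termination bits so that 1+D+D² divides U(D) can be sketched with GF(2) polynomial arithmetic. In this sketch (helper names are mine), polynomials are held as integer bit masks, with bit i the coefficient of D^i:

```python
def gf2_mod(a, g):
    """Remainder of GF(2) polynomial division of a by g (bit-mask representation)."""
    while a.bit_length() >= g.bit_length():
        a ^= g << (a.bit_length() - g.bit_length())
    return a

def termination_pair(msg, k, g=0b111):
    """Find (u_k, u_{k+1}) so that g = 1+D+D^2 divides msg + u_k D^k + u_{k+1} D^{k+1}."""
    for uk in (0, 1):
        for uk1 in (0, 1):
            U = msg ^ (uk << k) ^ (uk1 << (k + 1))
            if gf2_mod(U, g) == 0:
                return uk, uk1

# One message bit u0 = 1 (k = 1): the terminated U(D) works out to 1 + D + D^2.
print(termination_pair(0b1, 1))
```

For the single message bit 1, the encoder lands in state 1,0, whose termination bits were 1,1, and indeed the search returns (1, 1), consistent with the trellis picture.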
You don't want the codeword to keep going forever; that is the whole point of coming back to the all-zero state. That is why we force U(D) to be a multiple of 1+D+D²; then everything happens very easily. That is also why, when people define convolutional codes, they would say: I will only consider codewords that terminate after a finite time. Once you do that, you will see the convolutional code is nicely defined, and you can have whatever encoder you want: a feedforward non-systematic encoder, a feedback systematic encoder, different encoders, all of them targeting the same code but with different mappings from message to codeword. If you are not happy with this, try a few examples. Take a very small value, say k = 4, try it for this case, and do both the systematic and non-systematic encodings. You will see the actual set of codeword sequences you get is exactly the same, but the mapping from message to codeword is different. All of that happens because I am terminating at the all-zero state; if I didn't do that, all kinds of crazy things could happen. Terminating at the all-zero state, only nice things happen in my description of the code. Any questions? Why are two termination bits enough? It comes from the division algorithm: if you take this whole thing and divide by 1+D+D², the remainder can have degree at most 1, so two bits are enough to cancel it. So, that is a brief description of systematic encoding and why you need feedback for it.
And so slowly, as I said, we are moving towards turbo codes, and systematic feedback encoders play a very crucial role in the construction and motivation of the turbo code; they are a big part of the reason why turbo codes really work. For that, I am going to give a slightly different example, so that we see one more case and can ask the kind of questions necessary to move towards turbo codes. This is the example; it is quite simple once again, but I want you to look at it: G(D) = [1, (1+D⁴)/(1+D+D²+D³+D⁴)]. That is my generator matrix. The first thing you can do is draw a shift register circuit to implement it. How many flip-flops will I need? Four. You write down the four flip-flops, feed everything back into the first XOR gate, then pull out the first and the last taps for the numerator, and you get the encoder. So, it is very easy to come up with an encoder for this one. Now I am going to ask a slightly different question, motivated, as I said, by what is going to come later; for now, let us just look at this and ask questions to understand the encoding better. Suppose I ask: what is the least-weight codeword I can get with such an encoder? How will you answer that? The way to think about it is: think of a U(D) which, when multiplied out, gives you a very short codeword, you know, not too many ones. So, what is the general form of a codeword? When you multiply U(D) by G(D), what do you get?
You have U(D) showing up as the systematic part, and then U(D) multiplying (1+D⁴)/(1+D+D²+D³+D⁴). So, if I take U(D) to have a lot of weight, what happens? The codeword will definitely also have a lot of weight, because U(D) shows up by itself in the codeword. So, if I want a low-weight V(D), I should take U(D) to be very low weight. But if I take it to be very low weight, what happens to the parity branch? The way I am thinking of my U(D), it has to cancel with the denominator; if it does not cancel, there is a lot of problem: it goes into IIR-type filtering and gives you a long response before it dies down. So, U(D) has to be a multiple of what is in the denominator, and U(D) should have very little weight. Based on what you have seen before, what would you pick U(D) to be? I want a nonzero codeword; of course the all-zero codeword has weight zero, not good, I am not going to dispute that. A nonzero codeword of least weight: what is a good choice? What would cancel 1+D+D²+D³+D⁴? It is very simple polynomial factoring: 1+D⁵ = (1+D)(1+D+D²+D³+D⁴). So, can I take U(D) = 1+D⁵? What happens if I do? You get 1+D on the parity side after cancellation, so V(D) is 1+D⁵ on the systematic side and (1+D)(1+D⁴) = 1+D+D⁴+D⁵ on the parity side. What is the total weight? 2 + 4 = 6. Can you get anything less than 6? Maybe it requires proof, but one can see that it is difficult to get anything less than 6, right?
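This least-weight codeword can be verified with a few lines of GF(2) polynomial arithmetic (bit-mask representation, bit i = coefficient of D^i; the helper name is mine):

```python
def gf2_mul(a, b):
    """Multiply two GF(2) polynomials given as bit masks (bit i = coeff of D^i)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

den = 0b11111   # 1 + D + D^2 + D^3 + D^4
num = 0b10001   # 1 + D^4
u   = 0b100001  # U(D) = 1 + D^5

assert gf2_mul(0b11, den) == u     # (1+D)(1+D+D^2+D^3+D^4) = 1+D^5, so den divides U(D)
v0 = u                             # systematic branch: weight 2
v1 = gf2_mul(0b11, num)            # (U/den)*num = (1+D)(1+D^4) = 1+D+D^4+D^5: weight 4
print(bin(v1), bin(v0).count("1") + bin(v1).count("1"))   # total codeword weight 6
```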
All right, it is also possible to study these things on the trellis; one can come up with a very easy definition there. What you look for in the trellis is: start at the all-zero state, and ask how quickly you can get back to the all-zero state, and what the weight of that returning path is; such a path is called a minimum-distance error event, for instance. It is the same as what I did here: I want to return to the all-zero state very quickly, so I pick my U(D) accordingly. You can also find the Hamming weight of a codeword in the trellis. For instance, I have to depart from the all-zero state, go to some state, and quickly return. In another code this gave weight 5: weight 2 on the first branch, then weight 1, then weight 2 on the return, a codeword of weight 5 from an input of weight 3. Something like that happens here also, so you can interpret this on the trellis if you want. But a better interpretation is in terms of the polynomials: 1+D⁵ is the smallest-degree polynomial that cancels out 1+D+D²+D³+D⁴, so you put that in and get your least weight. It is not a rigorous proof, but in this case it is possible to prove it as well. So, this is a nice way of getting small-weight codewords from small-weight inputs. Can you give me any other U(D) that achieves the same distance? What else can I do? Multiply by D, right. That is a very standard way of getting more codewords of minimum weight in a convolutional
code. What do I mean by multiplying by D? I am simply delaying my input by one position. If I keep doing that, I simply keep getting the same thing again: it is a linear time-invariant system, this filter that we have, so if you keep delaying the input you get the same output with a delay, and the weight does not change. So, I could set U(D) = D^l·(1+D⁵) and still get a very low-weight codeword at the output. So, what is the moral of this story? The moral is: if you have a weight-2 input with the ones separated by five positions, your codeword will have very low weight. That 5 is crucial; if it is not 5, the denominator will not cancel, and lots of other things can happen: you will get long codewords. And what is the role of the delay? Suppose my message sequence is very long, say 1000 bits, and I ask how many codewords of weight 6 there are. What is the answer? I can take 1+D⁵ and keep delaying it up to position 995 or so, so the number of low-weight codewords grows linearly with my message length. That is maybe not a very nice thing: particularly for very long message lengths, 1000 or 2000 bits, the number of very-low-weight codewords also increases linearly. That is one of the drawbacks of convolutional codes, and it comes from the system being linear and time-invariant: if you keep delaying by one position, all your codewords also get delayed. I showed this example for weight 6, but if I find a codeword of weight 7 I can do the same thing; I can keep delaying the whole thing and keep getting more and more codewords. So, the number of code
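The time-invariance argument is easy to see numerically: delaying the weight-2 input just shifts the codeword, so every delay gives another weight-6 codeword (same bit-mask convention as before; helper name mine):

```python
def gf2_mul(a, b):
    # GF(2) polynomial product, bit-mask representation (bit i = coeff of D^i).
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

v1_base = gf2_mul(0b11, 0b10001)   # parity for U(D) = 1+D^5: (1+D)(1+D^4)
for l in range(6):                 # U(D) = D^l (1 + D^5) for several delays l
    u = 0b100001 << l
    v1 = v1_base << l              # LTI: the output is just the delayed output
    print(l, bin(u).count("1") + bin(v1).count("1"))   # weight stays 6 for every l
```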
words of low weight is large. What happens because of that is: when you try to decode convolutional codes, you will see it works only when you have reasonably good SNR. At very low SNRs, even though you are doing ML decoding, there are so many competing low-weight codewords nearby that you will invariably make a lot of errors. That is an observation people have known for a long time: because of this linear time-invariance property, convolutional codes have very many codewords of low weight, and that causes a problem at very low SNRs. So, when you want to try to achieve capacity with convolutional codes, you will get stuck somewhere; you cannot really get arbitrarily close, mainly because of this problem. Yes, I agree with you, this calculation has to be done very carefully, but it can be done, and you will see this linear increase is not nice. You do not want even a linear increase; you really want the number of low-weight codewords to go down drastically. One or two can be there, but it should go down; the lower the better. Another thing to keep in mind is that the minimum distance here is not all that high; 6 is not very high. So, that is another problem: you do not want too many codewords very close by, because then you can go wrong in so many ways. I know it is not a very rigorous treatment, but that is the best I can do given the situation; hopefully it is intuitive and you can see why it works. So, what is needed? Several things. If you have to take convolutional codes and move them into capacity-achieving scenarios, you need to somehow break this linear time-invariance; as long as you have it, things will not work very well. That is one lesson you
can get right out of this, by very simple logic. So, let me write down a few more things that are required if you want to move towards turbo codes. Moving towards turbo codes requires a few other observations. The first observation: LTI is bad. It hurts. You have to break this linear time-invariance; not just shifting, something else has to happen, something that makes the system not time-invariant. Linearity maybe is not a bad thing, but time-invariance is the more hurtful one. That is one. The second observation: you might remember that when I introduced LDPC codes, I said that as far as Reed-Solomon codes were concerned, we looked at very solid, deterministic constructions, but once we started with LDPC codes, I gave a bit of an argument for why you need a random element in the construction. What is clearly missing in convolutional codes is some random element; you need one. And the third notion: it is great that we have ML Viterbi decoders for convolutional codes, but when you do all these things, when you introduce random elements and kill the time-invariance, you are not going to be able to do the ML decoder. So, go back and remember: for LDPC codes, what did we do?
We had a bitwise MAP type decoder that was reasonably efficient to implement in an iterative fashion; you need some such suboptimal iterative decoder. All these things put together will take you from convolutional codes towards turbo codes. There is one more thing that is required: when you read about convolutional codes, it is important to know about puncturing. So far I have described only rate-1/2 codes. You might say, why only rate 1/2; maybe I need a rate-4/5 code, or a rate-6/7 code. Then what will happen? If you need a rate-6/7 code, what would you do? You cannot have 6 different inputs coming in; it just becomes ugly. So, what is typically done with convolutional codes is something else: people puncture a lot. You take just one rate-1/2 code and then you puncture it. Do you remember what puncturing is? You drop some parity bits. It is a very common idea, and it is one missing piece we will need when we talk about turbo codes as well. The way convolutional codes are punctured is the following: you usually start with a rate-1/2 code. So, when you run a rate-1/2 encoder, what happens?
You have one input U and you have two outputs. I will take the systematic case; usually people do only systematic encoding here, and though it is possible to puncture non-systematic codes as well, that is a little more confusing, so I will do just the systematic version. So U is my message sequence and V is my parity sequence: the same message bits come out on one branch, and for every message bit I also get a parity bit, since I am doing a rate half encoding. What people would do is take, say, U1, U2, and so on up to U8, so 8 message bits; correspondingly you would have 8 parity bits V1 through V8, and you would not send 4 of those parity bits. For instance, a very simple thing you can do is drop V2, V4, V6 and V8; you decide to puncture those 4 bits. Then what does your rate become? 8 by 12, which is 2 by 3. Suppose instead you want rate 3 by 4 or 4 by 5; then maybe you take more bits, say 20 message bits with their 20 parity bits, and puncture suitably to get the rate you want. It is very common to puncture alternate parity bits like this, and there are also optimal puncturing patterns: people have done a lot of research, and if you search enough you will find optimal puncturing patterns to go from a particular rate half code to some other rate. It is possible to get rates like 7 by 8, for instance. So this is a very common way of going to higher rates with convolutional codes; people typically never design higher rate convolutional codes directly, you do puncturing. Now the important question to ask is: what do you do at the decoder? Fine, you did not send this bit, but at the decoder, how will I compute the branch metric? I need a received value for this bit to compute the branch metric. What can you do? Think of a received value that is equally probable under both hypotheses. For BPSK, that value is 0. So you simply set the received values for the punctured positions to 0. That works in the BPSK case; if you have some other modulation, maybe you will have to do something else, but typically 0 works in most cases. You see, when you do this, you have to be careful about how you puncture: in the puncturing pattern, people will typically never puncture consecutive bits. I am setting my received value to 0, which means I want a lot of gap between my punctured positions so that I get good branch metrics in the surrounding stages; in any two consecutive stages, at least one stage should have a proper, complete branch metric. If I keep losing branch metric information because of puncturing, let me at least lose it sufficiently far apart so that I can adjust for the losses as I go along. Those are some intuitions you can use when you pick the puncturing pattern, but as I said, people have done the research and these patterns are available out there, so you do not have to recreate them on your own. Most standards, for instance the LTE standard or any of the 3GPP standards, will give you the puncturing pattern, so you do not need to do any more work. One question that came up: does the receiver also know the puncturing pattern? Yes, it is decided beforehand which bits are punctured; otherwise the receiver could not decode. It is all decided ahead of time, so you cannot change it anyway. Another thing which is very commonly done is to get rate half from a rate 1 by 3 encoder. See, rate 1 by 3 is not too bad: you have only one bit stream coming in, you just encode more, so that is easy. If you have a rate 1 by 3 systematic encoder, the message sequence U1 through UN shows up on its own, the first parity sequence is V1,1 through V1,N, and the second parity sequence is V2,1 through V2,N. A very common thing to do is to puncture alternate positions in V1 and V2: in V1 you puncture positions 1, 3, 5, and so on, while in V2 you puncture positions 2, 4, 6, and so on. So from every stage you keep exactly one of the two parity bits, and after puncturing the overall rate is half. While moving towards turbo codes this is useful; it is typically what is done with turbo codes, you puncture a little bit to get a better rate. That is one thing.
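The two puncturing mechanics just described, dropping parity bits at the encoder and re-inserting neutral received values at the decoder, can be sketched as follows. The function names and the alternating keep/drop mask are my own illustrative choices, not from any standard:

```python
def puncture(parity, pattern):
    """Keep parity[i] only where the repeating keep/drop mask has a 1.

    With pattern [1, 0], every second parity bit is dropped: 8 message
    bits then travel with only 4 parity bits, raising the rate from
    1/2 to 8/12 = 2/3.
    """
    return [p for i, p in enumerate(parity) if pattern[i % len(pattern)]]

def depuncture(received, pattern, n):
    """Re-insert neutral values at punctured positions before decoding.

    For BPSK over AWGN, a received value of 0.0 is equidistant from the
    +1 and -1 signal points, so it contributes nothing to the branch
    metric at the stages that were punctured. n is the full parity length.
    """
    it = iter(received)
    return [next(it) if pattern[i % len(pattern)] else 0.0 for i in range(n)]
```

A mask whose ones are spread far apart follows the intuition above: the decoder then never loses the branch metric at consecutive stages.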
And the other missing piece before we go towards turbo codes: we saw an ML decoder for convolutional codes, but there is also a bitwise MAP decoder for convolutional codes which can be implemented very efficiently. It is based on what is called the BCJR algorithm, or the forward-backward recursion. If you have done digital modulation, or are doing it now, I think it has been covered for you already, maybe in the context of equalization; even there you just have a trellis, so the same thing can be done here as well. I am going to give a very brief description of what it is and simply tell you its input and output: what goes into that block and what comes out of it, nothing more beyond that. So: the bitwise MAP decoder for convolutional codes. After we do this we can move towards the definition of turbo codes and then go on to talk about the turbo decoder; I think we should be able to do this by the end of this class. What is the overall big picture? You have, say, a rate half encoder, which I will take for simplicity, and I will take it to be systematic; no reason why it should not be. So U is encoded into U and V. This goes through BPSK over an AWGN channel, and you receive R0, the vector received corresponding to the systematic part, and R1, the vector received corresponding to the parity part. Now, what did we do in the Viterbi decoder? We maximized the likelihood of receiving R over all codewords: an arg min of distance, which is the same as an arg max of probability. That is what we did in the Viterbi case. What should you do in the bitwise MAP case? Take the vectors to be U = (U1, U2, ..., UN) and V = (V1, V2, ..., VN), and similarly for R0 and R1. A bitwise MAP decoder tries to compute, for each i, the probability P(Ui = 0 | R0, R1). This is what you have to compute, and the bitwise MAP algorithm for the convolutional code will compute exactly this: it takes in R0, R1 and a description of the convolutional code, say its trellis, and it outputs all these values. Typically we do not work with this probability directly; the value we usually use is the likelihood ratio, in fact the log of the likelihood ratio, which I will denote by capital Li to stay consistent with before (capital L for the a posteriori ratio, small l for the a priori ratio): Li = log [ P(Ui = 0 | R0, R1) / P(Ui = 1 | R0, R1) ]. If you do not like carrying R0 and R1 around, collect them into one received vector and call it R; it is the same thing. So this is what the bitwise MAP decoder will compute. I am going to make a few remarks about it without going into great detail. The way it works is very similar to the Viterbi decoder. Remember, the Viterbi decoder starts at the all-zero state at stage 1, then proceeds stage by stage all the way to the last stage, and stops
there, and it is able to output the ML path. To get the bitwise probabilities, that is not enough: it turns out you also have to do a backward recursion, starting from the final state and coming back stage by stage, computing quantities very similar to what you computed in the forward recursion. What happens is you compute what are typically denoted alphas and betas: for each state you compute an alpha and a beta at each stage, there are branch metrics in the middle usually called gammas, and then each of these probabilities becomes simply a sum over products alpha times gamma times beta. That is how it works; I am not going to explain anything more beyond that, and I do not think it is necessary in this class. If you are interested you can work it out, and most of you might be familiar with it already. What I am going to do instead is show you how this a posteriori LLR splits into an intrinsic part, a term that depends only on the received value at position i, and an extrinsic part, a term that depends on everything else, like I did for the block code case. To simplify notation, write r for the whole received vector (r0, r1), write ri for the pair (r0i, r1i), and write "r not equal to i" for everything except position i: r1, r2, up to r(i-1), then r(i+1) and onwards to the end. Maybe it is clumsy notation, but it will help. Now use Bayes' rule: P(ui = 0 | r) = f(r | ui = 0) P(ui = 0) / f(r), where f is the appropriate pdf. The key step is to split off just the systematic sample r0i. Since the channel is memoryless, given ui the sample r0i is independent of all the other received values, so f(r | ui = 0) = f(r0i | ui = 0) times f(everything else | ui = 0). Taking the ratio of the ui = 0 and ui = 1 cases, the f(r) in the denominator cancels and you get Li = log [ f(r0i | ui = 0) / f(r0i | ui = 1) ] + log [ P(ui = 0 | all of r except r0i) / P(ui = 1 | all of r except r0i) ]. For BPSK on AWGN with noise variance sigma squared, write out the two Gaussian densities in the first term: the quadratic terms cancel and you are left with (2 / sigma^2) r0i. So at the end of the day you get this very nice simplification: Li equals (2 / sigma^2) r0i, which is your intrinsic LLR, plus this other term, the extrinsic LLR, the log ratio of the probabilities of ui = 0 and ui = 1 given every received value other than r0i. Make sure you can write that down carefully and get to this; it is very important in the turbo decoder, and we will use it over and over again. Note which received values go where: the intrinsic part involves only the systematic sample r0i; all the parity samples, including r1i, enter through the extrinsic part. So if you have any bitwise MAP decoder which gives you Li, that Li can be split into the intrinsic part, which corresponds to the received systematic value at that stage, plus the rest. The input to the bitwise MAP decoder is r0 and r1, and the output is the vector L = (L1, L2, ..., LN), where each Li can be written as (2 / sigma^2) r0i plus the extrinsic LLR. That is the high-level picture of what the bitwise MAP decoder does for you: it takes the received values corresponding to the systematic part and the parity part and computes the APP log likelihood ratio, which with some work can be shown to split into (2 / sigma^2) r0i plus the extrinsic contribution from all the remaining received values. That is the only thing I want to do about the bitwise MAP decoder; if you are interested I can give you other references. That should be enough to jump into the turbo code description, so I am going to describe the definition of turbo codes now. It is actually a very simple definition at the end of the day. The version I am going to describe is what is called parallel concatenated turbo codes; there is also serial concatenation and all kinds of modifications, you can do many other things, but the version I am going to describe
is parallel concatenation. I am going to draw the diagram and leave you with that; I think that is all we will have time for today. This is how the encoder looks, and you will see both that the random element comes in and that the time invariance is destroyed by the way the encoder is built. You have a message U of, say, k bits. One branch of the encoder simply produces the systematic part: the k message bits go through as they are. Then there is encoder 1, a rate half recursive systematic encoder, which gives you the first parity sequence v1; since it is rate half and systematic, the number of parity bits it produces is k again. Then you do an interleaving, which I will represent as a pi: you take the k message bits and permute them in a different order, and run the permuted sequence through another rate half systematic encoder, encoder 2, whose parity output v2 is again k bits. The interleaver should be a random interleaver, and yes, all of this, including the interleaver, is known at the decoder. As I have written it down, this is overall a rate 1 by 3 encoder. How do you get rate half, for instance? You puncture: typically you take something like this and puncture, say, the alternate bits of v1 and v2 to get your rate half. The first thing you can see clearly is that a random element has been introduced. And why is the time invariance lost? The whole thing has become a block encoder: if you shift the input, there is a huge interleaver in the middle, and you do not know what is going to happen at the other output; the shift is not maintained nicely through the interleaver. So you should design the interleaver to be genuinely random, not something regular. Those two properties have been killed. Someone asked: do turbo codes achieve capacity? They come pretty close for some rates. And what is the interleaver exactly? It is random: just pick some random permutation, not a regular interleaver like a delay by 2; it can be any permutation. So actually this is not one code, it is an ensemble of codes: keep the encoders fixed, change the permutation, and you get different codes; that is the way it is defined. We will stop here, I have to run somewhere else, and we will pick up from here in the next class.
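The parallel concatenated encoder just drawn can be sketched in a few lines of code. The RSC component is the G(D) = [1, (1+D^2)/(1+D+D^2)] example from earlier in the lecture; the function names and the fixed random seed for drawing the interleaver are my own illustrative choices:

```python
import random

def rsc_parity(u):
    """Parity stream of the rate-1/2 RSC encoder G(D) = [1, (1+D^2)/(1+D+D^2)]."""
    s0 = s1 = 0
    v = []
    for bit in u:
        w = bit ^ s0 ^ s1      # feedback node
        v.append(w ^ s1)       # parity output v_n = u_n + s0
        s0, s1 = w, s0         # state update
    return v

def turbo_encode(u, perm):
    """Parallel concatenation: systematic bits, parity of u, parity of interleaved u.

    perm is the interleaver, a permutation of range(len(u)), fixed ahead
    of time and known to the decoder. Overall rate is 1/3; puncturing
    alternate bits of v1 and v2 would bring it to rate 1/2.
    """
    v1 = rsc_parity(u)
    v2 = rsc_parity([u[i] for i in perm])
    return u, v1, v2

# one member of the ensemble: a randomly drawn interleaver for k = 8
k = 8
perm = list(range(k))
random.Random(0).shuffle(perm)
systematic, v1, v2 = turbo_encode([1, 0, 1, 1, 0, 0, 1, 0], perm)
```

Changing the permutation while keeping the two encoders fixed gives a different member of the ensemble, which is exactly the sense in which this is a family of codes rather than a single code.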