Let's try this once again. This is lecture 22. The last thing we saw was neighborhoods of depth L of a bit node in the Tanner graph of an LDPC code, and how a repetition in the neighborhood, as you go along and construct it, corresponds to a closed loop, a cycle, in the original Tanner graph. I want to spend some time on this neighborhood, because we will use it quite often when we describe decoders, analysis, and so on.

Recall how we drew the neighborhood: like a tree. I wrote the node first, then its neighbors, then their neighbors, and so on, never backtracking. We allowed repetitions in small examples, but for large graphs I stated a result: as n becomes very large and the graph stays very sparse, the probability that you get a short cycle becomes very small. I may have said it decreases exponentially in n; I am not sure that is quite right, it may only be something like 1/n, so let me check that, but it does decrease and become very small.

You can also think of the neighborhood on the Tanner graph itself. Suppose I have a Tanner graph: a whole set of bit nodes, a set of check nodes, and the connections between them. A few basic, elementary observations first. I can reorder the check nodes and the bit nodes any way I want; that does not change the code or the matrix in any essential way. But I cannot change the placement of the edges: if I move an edge, a 1 in the matrix moves, and the code itself can change. Keep that in mind. So when talking about neighborhoods, it is convenient to reorder the nodes as they would appear in the neighborhood.

Suppose I am interested in the neighborhood of a particular bit node in, say, a (3,6)-regular graph. For convenience I can place its three immediate check-node neighbors as the first three check nodes; maybe they actually appear somewhere else in the original matrix, but reordering is harmless. Then look at the first of those check nodes: it has degree 6, so it has five other bit-node neighbors, and I can draw those as the next five bit nodes; likewise for the second and third check nodes. Then the neighbors of those bit nodes come next, and those neighbors' neighbors after that, going further down, left and right. It is the same picture I drew last class as a tree, but now visualized on the bipartite graph itself, going right, left, right, left. What happens when you have a repetition? You come back to a node you have already visited, and that immediately gives a closed path, a cycle.

It is also possible, though a little more difficult, to visualize neighborhoods on the matrix itself. How would you do it? Start with a particular bit, that is, a column. Its neighbors are the rows in which that column has 1s: the check nodes it is connected to. The neighbors of those check nodes are the columns in which those rows have 1s. For further neighbors you go to each of those columns and repeat: column to row, row to column, column to row, the same jumps as in the graph. Any time you have a repetition, you get a closed loop in the Tanner graph. The corresponding structure in the matrix exists, you come back to one of the same rows or columns, but it does not have as clean a description as a cycle in the graph; it is harder to visualize there.
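The column-to-row, row-to-column walk just described can be sketched in code. This is not from the lecture, just a minimal illustration under my own conventions: H is a list of 0/1 rows (rows are check nodes, columns are bit nodes), and the function name `neighborhood` is my own.

```python
def neighborhood(H, bit, depth):
    """Walk the depth-`depth` neighborhood of bit node `bit` on a
    parity-check matrix H (a list of 0/1 rows: rows are check nodes,
    columns are bit nodes).  We alternate column -> rows -> columns,
    never going straight back along the edge we just arrived on; seeing
    a node a second time means the Tanner graph has a closed loop."""
    m, n = len(H), len(H[0])
    seen_bits, seen_checks = {bit}, set()
    frontier = [("bit", bit, None)]   # (side, index, edge we arrived on)
    has_cycle = False
    for _ in range(depth):
        nxt = []
        for side, idx, came in frontier:
            if side == "bit":
                for chk in range(m):
                    if H[chk][idx] and (chk, idx) != came:
                        if chk in seen_checks:
                            has_cycle = True      # repetition => cycle
                        seen_checks.add(chk)
                        nxt.append(("check", chk, (chk, idx)))
            else:
                for b in range(n):
                    if H[idx][b] and (idx, b) != came:
                        if b in seen_bits:
                            has_cycle = True
                        seen_bits.add(b)
                        nxt.append(("bit", b, (idx, b)))
        frontier = nxt
    return seen_bits, seen_checks, has_cycle
```

On H = [[1,1],[1,1]], which contains a length-4 cycle, the depth-2 neighborhood of bit 0 already reports a repetition; on a tree-shaped H it never does.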
So it is important to think about these neighborhoods and have a clear picture of them, both the way we write them down and the result I gave you: if you hold wr and wc constant and let n grow, you will not get short cycles; in fact, I showed that for any L, the depth-L neighborhood is cycle-free with very high probability as n becomes very large. We will use all of this as we go along. Any questions? Okay.

So let us move on to describe the Gallager A decoder, the first decoder we will see. It is a hard-decision decoder. What does that mean? You first quantize the channel output to one bit. So even though you have a Gaussian channel, you are actually converting it into a BSC, a very simple conversion, and then decoding on that. So you can think of the Gallager A decoder as a decoder that works on the BSC.

It has several properties, and many of them will become clear as we describe it. First of all, it is an iterative decoder; almost all decoders for LDPC codes are iterative. Iterative in the following sense: take the Reed-Solomon decoder, for instance. It had several steps, but you never went back and revisited the same step over and over; you did each step, moved on, decoded, and finished. Here, you increasingly refine your decoding by repeating the same steps over and over again; that is why it is called iterative.

You will see that in our analysis we will want to do a large number of iterations, and even in simulations that tends to be good. The decoder is easiest to describe on the Tanner graph. One can write down a perfectly clear description in terms of the matrix, and it is not too difficult, but it is not intuitive and it does not extend to other decoders as easily; on the Tanner graph it becomes clear, easy, and simple to visualize what is going on. Later on I will show you a visualization of the decoding on the matrix and explain the connection, but the description will be on the Tanner graph.

To be specific, let us look at a (wc, wr)-regular LDPC code. I have not fixed the block length; think of n as sufficiently large that the matrix is sparse and everything works out well. Suppose I already have a parity-check matrix, and so a Tanner graph. What is my channel model? I am transmitting a codeword. How did I get a codeword? I have to encode. I have not talked much about encoding, but it is a linear code: there is a generator matrix, there is a parity-check matrix, so one can encode. Maybe it is complex, maybe it is not simple, but it can be done. So suppose the codeword has somehow been encoded and is transmitted; it goes through a binary symmetric channel with crossover probability p, and I get a received word r = (r1, r2, ..., rn),
where n is the block length; this is the input to the decoder. It is run through the Gallager A decoder to produce an estimate of the transmitted codeword. That is the picture.

Now, I said I would describe the decoder on the Tanner graph. Each bit node of the Tanner graph corresponds to a column of the parity-check matrix, and each column corresponds to a particular code bit: r1 is received when c1 is transmitted, r2 when c2 is transmitted, and so on. Note that r1, ..., rn are not independent in general; you remember, given a particular transmitted codeword they become independent, but unconditionally they are not. Still, it is very natural to associate r1 with the first bit node of the Tanner graph. Why, exactly? If I do no processing at all, r1 is the best estimate I have for c1. It is a very simple observation: the probability that r1 equals c1 is 1 - p, and the probability that it is not is p. If you are not allowed to do any processing, what is your best estimate for c1?

Assuming p is less than 0.5, it is r1; you cannot do any better. Likewise r2 is the best estimate of c2, and so on. So it is very natural to associate r1 with bit node 1, r2 with bit node 2, and so on: that is the initial starting information you have about those bits. When I draw the Tanner graph, I will put r1, r2, r3, ... next to the corresponding bit nodes, and you can think of r_i as the best estimate you have for bit node i at time 0, before any decoding has started. So the picture is: the bit nodes with r1 through rn attached, the complicated set of edge connections, and the check nodes.

Like I said, the procedure is iterative, and iteration 1 is slightly different from the others, so we will describe iteration 1 separately and then repeat the same steps over and over. Almost all iterations can be summarized in one simple statement: each node sends the best estimate or knowledge it has about a particular bit, locally, to its neighbors. Initially, only the bit nodes have any information; the other nodes have none. What does bit node 1 have? It has r1, its best estimate of c1. In any iteration, a bit node sends some information to its neighbors; indeed any node sends some information to its neighbors. That is how the
general philosophy goes. So, for instance, how many check nodes is bit node 1 connected to? wc of them, and let me be careful with the notation, correct me if I slip: the bit-node degree is wc and the check-node degree is wr, so a (3,6)-regular code has wc = 3 and wr = 6. In the first iteration, bit node 1 gives its best estimate of c1, which is r1, to all its neighbors. Let me state it for a general bit node i, which makes it easy: in step A of iteration 1, bit node i sends r_i to its neighboring check nodes.

Iteration 1 also has a step B. What is step B? In step A, information was sent from bit nodes to check nodes; in step B, information is sent from check nodes to their neighboring bit nodes. It is a very simple procedure: every iteration has two steps, first bit nodes to neighboring check nodes, then check nodes to neighboring bit nodes. All information passes only along the edges; no information ever jumps from a bit node to a check node it is not immediately connected to. In every iteration and every step, information travels along the edges. So when bit node i sends r_i to all its neighbors, you can visualize the same r_i traveling on each of those edges.

So picture a single bit node i, with input r_i, connected to wc check nodes, with r_i on each of those edges: that is step A of iteration 1. And this is happening at all the bit nodes simultaneously; you will see these decoders are parallel in nature, all the nodes can operate at the same time.

Now, step B of iteration 1: the check nodes have to do something. Let us look at this process a little more closely. Take check node j. How many bit nodes is it connected to? wr of them, thanks. I do not know which ones they are, they can be all kinds of indices, so let me call them bit nodes j1, j2, ..., j_wr, with received values r_{j1}, r_{j2}, ..., r_{j_wr}. Too many subscripts, but I need all of them to be precise. Now, in step A, what would have happened? This check node would have received r_{j1}, r_{j2}, ..., r_{j_wr}. So when it wants to send something back to bit node j1, what is the principle, what should it send back?
It should send back its estimate of what bit j1 could be, based on all the information the check node has. Well, not always the best estimate; sometimes the best estimate is too difficult to calculate, and we give up and compute what can be computed quickly. In the first iteration, in fact, you can do the best estimate. What is it? The check node knows r_{j1}, and r_{j1} is an estimate, but is there any point in sending r_{j1} back to bit j1? No. Why? The check node got it from bit j1, so bit j1 already knows it; there is no need to send it back. So I have to rephrase "best estimate" slightly: it is the best estimate of extrinsic information, information that is extrinsic to bit j1, not information bit j1 would already know. The check node should estimate bit j1 using the information obtained from all the other neighbors, not including j1; if you include j1, it is not extrinsic anymore, and you are doing something repetitive which is probably not useful. So, if you are not allowed to use r_{j1}, what would you do to get an estimate for bit j1 from the other values?
What does the check node know? It knows that c_{j1} + c_{j2} + ... + c_{j_wr} = 0: the check was satisfied at the transmitter, and the r_j's were received when those c_j's were transmitted. So the best estimate now is the XOR of all the remaining r's: on the edge to bit j1, the check node sends r_{j2} xor r_{j3} xor ... xor r_{j_wr}. Notice the direction of flow: in step A, information flowed from bit nodes to check nodes; in step B, it flows the opposite way, from check node to bit node. Likewise, to bit j2 it sends r_{j1} xor r_{j3} xor ... xor r_{j_wr}, and so on; the last one is r_{j1} xor r_{j2} xor ... xor r_{j_{wr-1}}. Is that clear? It is quite simple.

Watch out, because I am going to ask a slightly non-trivial question next. If you want to implement this step efficiently, minimizing the number of XORs, how many do you need to compute all these messages? You can be slightly smart about it. The brute-force way costs wr - 1 XORs for each of the wr messages, about wr(wr - 1) in all, which wastes a lot of computation. But you can do it in about 2wr - 1 XORs: use wr - 1 XORs to compute the overall parity of all the inputs, then XOR out each input individually, which is another wr XORs. So you are not doing on the order of wr squared operations, only on the order of wr; it is
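Both pieces of this step, the extrinsic XOR messages computed with the overall-parity trick and the odd-error probability formula, can be checked with a short sketch. This is my own code, not the lecture's; the function names are my choices.

```python
from functools import reduce
from math import comb
from operator import xor

def check_node_messages(incoming):
    """Extrinsic check-to-bit messages: the k-th outgoing bit is the XOR
    of all incoming bits except the k-th.  Computing the total parity
    once (wr - 1 XORs) and then XOR-ing each input back out (wr more)
    costs about 2*wr XORs instead of the brute-force wr*(wr - 1)."""
    total = reduce(xor, incoming)
    return [total ^ u for u in incoming]

def q_check(p, wr):
    """Probability that a step-B message is wrong on a BSC(p): an odd
    number of errors among the other wr - 1 inputs."""
    return (1 - (1 - 2 * p) ** (wr - 1)) / 2

def q_check_bruteforce(p, wr):
    """The same probability as an explicit sum over odd error counts."""
    n = wr - 1
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(1, n + 1, 2))
```

`check_node_messages([1, 0, 1, 1, 0, 0])` returns `[0, 1, 0, 0, 1, 1]`, each entry the XOR of the other five inputs, and `q_check` agrees with the brute-force odd-count sum for any p and wr.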
A sum of binomial terms over odd indices can be written in a closed form: ((a+b)^n - (a-b)^n)/2 keeps only the odd terms. Here, with a = 1 - p and b = p, we get (a+b)^n = (p + 1 - p)^n = 1, and (a - b)^n = (1 - 2p)^n survives; with n = wr - 1 that gives exactly (1 - (1-2p)^(wr-1))/2. So that is the probability of error for a step-B message, and that completes step B; the first iteration is quite easy to describe.

Now, from the second iteration onwards: iteration l, step A. You will see that we begin introducing suboptimal things here: quick decisions based on locally available information, as opposed to trying to find optimal estimates. But first let me write down step B of the previous iteration, iteration l - 1. Consider bit node i; it has wc neighbors, say check nodes i1, i2, ..., i_wc, and in step B of iteration l - 1 it received estimates from them. Whatever processing I do depends on those messages, so I need notation for them: call them v1^(l-1), v2^(l-1), ..., v_wc^(l-1), with the superscript marking the iteration. Remember, these are all estimates of what bit i could be. (A remark from the class: unless wr and wc are very small, all these computations become fairly intensive. That is right; you want to keep wr and wc very small compared to n.)

Now, step A of iteration l. The messages passed from bit node i will once again be estimates of what bit i could be; call them u1^(l), u2^(l), ..., u_wc^(l), where u_k^(l) goes to check node i_k. There is some logic to the rule I am going to write down; let me write it first and then ask whether it makes sense. For u1^(l), the condition first:

if v2^(l-1) = v3^(l-1) = ... = v_wc^(l-1) = b, where b is 0 or 1 (these messages are all binary), then set u1^(l) = b; else set u1^(l) = r_i.

So how do I find the message that goes from bit node i to check node i1? I look at all the other messages received in the previous iteration. If all of them agree, all 0s or all 1s, I send that common value as my best estimate. If there is any disagreement, I give up on all of them, say I do not trust any of these, trust only what came out of the channel, and send r_i once again. That is the logic. It is clear that I should not use v1^(l-1) here: check node i1 sent it, it is what i1 already knows, so there is no point using it. I use all the other values, and I trust them only when all of them agree; if they do not agree, I send r_i itself. Does this make sense, and why does it make sense?
Why can't we use a majority vote, is the question. Yes, you could use a majority, or a threshold: instead of requiring all of them to agree, require that at least so many of them agree. That is actually a better rule, and it is called Gallager's B algorithm; we will not study it now, though maybe we will do something with it later. Gallager A makes this specific choice: if the others do not all agree, send r_i. Why trust r_i more? Is there any reason? Dependency is not the only issue; intrinsically you should trust r_i more. Exactly: the probability of error for r_i is lower than for any of the incoming messages. Those have error probability (1 - (1-2p)^(wr-1))/2, which is worse than p. How did you get those messages? You XORed together a number of values, each of which could itself be in error, so there are many more ways to go wrong; r_i, on the other hand, came straight out of the channel with error probability p. So you can trust r_i more, and that is why the rule makes sense. But as Mukundan was pointing out, you could put a threshold here instead of requiring unanimity, and you can optimize over that threshold to get better performance; that is Gallager B, which we may or may not see in detail later. Gallager A is a very simple algorithm; you can see there is a lot of suboptimal, ad hoc decision-making here, nothing optimal over all the connections. But what is the advantage? It is not complex. If you want to build a gate or write a program to implement this, it is very, very easy.
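The bit-node rule can be written out directly. This is a sketch under my own naming: `incoming` holds the wc check-to-bit messages from the previous iteration, indexed like the neighbors, and `target` is the index of the check node the message is being sent to.

```python
def bit_node_message(r_i, incoming, target):
    """Gallager A bit-to-check message (step A, iteration l >= 2): look
    at every check-to-bit message from the previous iteration except the
    one from the `target` check node; if they all agree on a value b,
    send b, otherwise fall back to the channel bit r_i."""
    others = [v for j, v in enumerate(incoming) if j != target]
    if all(v == others[0] for v in others):
        return others[0]       # unanimous: trust the common value
    return r_i                 # disagreement: trust only the channel
```

For example, with r_i = 0 and incoming messages [1, 1, 1], the message to check node 0 is 1 (the others agree), while with incoming [1, 0, 1] it falls back to r_i = 0.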
(A question from the class.) No, not necessarily; the majority logic is not much more complex than this. This is just the starting point. Gallager B is, as you can imagine, more involved than Gallager A, and analyzing it is a little tougher; this one is much easier to analyze, though both are quite simple decoders. (Another question.) Yes, that is again one of my principles: I will only send extrinsic information, never intrinsic information back to the check node. The check node already knew v1^(l-1), so I will not use it in the calculation. It is just a principle; you can violate it, and you can design a decoder that violates it, but it will be more difficult to analyze, and it may or may not converge. The principle I have chosen is: only pure extrinsic information goes into computing a message. We will look at convergence, success, and failure slowly; we have not come to that yet, and right now I am just describing the decoder, which is hard enough to understand clearly.

Is this step clear? The best test of clarity is to do a computation with it. What about u2^(l), how do I compute it? Right: if v1^(l-1) = v3^(l-1) = ... = v_wc^(l-1) = b, then set u2^(l) = b, else set u2^(l) = r_i; the same rule, just leaving v2 out of the condition. And one more thing I have not mentioned explicitly, though it is clear: every node has to do this, not just one, and they can all do it in parallel; theoretically, with parallel processing available, all of them happen at the same time.

(A question about combining r_i with the incoming messages.) Yes, overall the probability is better if you include both r_i and the incoming messages together. After the iterations I also have to make a final decision on the estimate of c_i; we have not come to that yet, but there you will see that you include r_i and everything else, and you improve your probability. The degradation from XOR is easy to imagine: if you compute a XOR b, where a and b can each be in error with the same probability, the result is poorer, since there are several ways for things to go wrong; using all the information together is what gives you the gain.

So here is the computation I want you to do. Assume the all-zero codeword was transmitted, for simplicity. Define q_l as the probability that any given v^(l-1), a message at the end of iteration l - 1, is in error. What do I mean by error? With the all-zero codeword, an error in v^(l-1) means what?
VL-1 equals 1, okay, if it is equal to 1 then there is an error, otherwise it is not an error, okay, so if you go back and look at the first iteration at the end of the first iteration, right, we had the same probability for any of the VI is to be in error, so I can drop the subscript there and simply say VL-1 and QL is the probability that VL-1 is an error, okay, I want you to compute the probability that UL will be in error, okay, so I will call that as PL probability that UL equals 1, okay, this is the probability that I want you to compute, okay, remember again all 0 code word has been assumed, okay, so when I say error it is UL equals 1, okay, I want you to compute PL in terms of QL and P, okay, you will need both, try to compute that, it is a simple probability calculation but, okay, assumes all, assume that all those VIL-1s are all independent, assume they are all independent and identically distributed, 1 with probability QL, 0 with probability 1-QL, okay, now we have to find the probability that UL equals 1, yeah, assume independence of VLs, assume all the VI's are independent and identically distributed, please assume for now and calculate, we will show you when they will be independent, then we will argue it out, okay, assume all the VIL-1 are independent and identically distributed, 0 with probability 1-QL, 1 with probability QL, it is actually a very simple calculation, but you have to get your head around it, okay, what is PL, okay, so it will be, okay, so there are two cases, right, RI could have been in error and RI, okay, so suppose RI is received correctly, okay, so what is that probability 1-P, if RI is received correctly, what is the only way in which UL, which RI is received correctly means what, RI is 0, what is the only way in which can I get, UL to be equal to 1, all of them are in, are equal to 1, okay, WC-1 of them should be equal to 1, is that right, so what should I write here, QL power WC-1, is that fine, everybody is happy with that, 
okay, on the other hand, if r_i was equal to 1, then the incoming messages should not all be 0, that is the only condition; what is that probability? 1 - (1 - Q_L)^(w_c - 1), do you see that, do you agree with me? This is my formula for P_L in terms of Q_L and p. Okay, what were the various assumptions? List down all the assumptions that were made when I wrote down this nice and simple formula; without those assumptions, it is not possible. The first assumption, by which it became so easy, was the all-zero codeword, okay, it is so crucial. If I did not assume the all-zero codeword, what would happen? I would have to keep saying u^L is in error, and u^L being in error could mean r_i was 0 or r_i was 1; there are just too many complications, right. Maybe you can write it all down, it is not a big deal, but it was simplified very nicely by assuming the all-zero codeword, okay, that was a crucial thing. The next assumption was that all these v_i^(L-1) are independent and identically distributed; that is also a very, very crucial assumption, without which you cannot do this. I have to come back and justify all those assumptions; I will do that later on, I will say why they are reasonable and why for large n they would not matter, okay, but for now at least these computations we can believe, okay, are we clear? So now let us move on to step B of iteration L. Yes, so you are saying this 1 - p should not come; see, this question always comes up when I write this down, okay, maybe we can talk about it outside of class. I have seen an expression written like this, so it has to work like this, I think it is correct, you will see; anyway, maybe there is some bug here, and if there is a bug I will come back and correct it, I know I have the right version somewhere, I can dig it out. Okay, I think this is fine, we will see, but the exact form and the correctness of the expression is
okay, I mean we can check it later, but the philosophy is this: when you make all these assumptions, you can compute the probability of error in terms of Q_L, okay, that is very important; when we analyze later on, these expressions will be very, very crucial. Alright, so now let me write down step B. By the way, all these expressions not being exactly right is okay in class, but it is not okay when you write a quiz or an exam; there the answer has to be exact, you cannot say all this is irrelevant, okay, it is unfortunately true, keep that in mind — that is the advantage of teaching as opposed to writing exams. Okay, so step B. Step B is actually the same as step B of iteration one, it is not very different, but I will write it in the most general case. In the most general case, if I look at check node j, it is connected to, say, bit nodes j_1, j_2 and so on till j_{w_r}, okay. Again I actually have to write down step A first; in step A the check node would have received some messages, so for that I need some notation. Let us suppose that what was received is u_1^L, u_2^L, and so on; notice there will be a clash in notation here, but this is the best way I can denote it, otherwise if you keep changing the notation, being very careful, it will just get too messy and unnecessary. So I will say the messages that came from the different bit nodes to a particular check node j in step A of iteration L were u_1^L through u_{w_r}^L. Okay, so what happens in step B, what would you do? It is very easy: for each neighbor, I take the XOR of the remaining bits, okay, that is what you do. Let me write that down formally: what goes back here would be u_2^L XOR u_3^L XOR ... XOR u_{w_r}^L, and likewise for the others — I will write the last one alone: u_1^L XOR u_2^L XOR ... XOR u_{w_r - 1}^L, okay, so that is what goes back to bit nodes j_1, j_2 and so on from check node j; it
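As a quick sketch, the step B rule just described — each check node sending back, to every neighbor, the XOR of all the other incoming messages — can be written in a few lines. This is an illustrative fragment, not the lecture's full decoder; the function name and 0/1 message representation are my own choices:

```python
def check_node_step_b(u):
    """Step B at one check node: to each neighboring bit node,
    send the XOR of all incoming messages except that node's own.
    u is the list [u_1, ..., u_wr] of 0/1 messages received in step A."""
    total = 0
    for bit in u:
        total ^= bit  # XOR of all w_r incoming messages
    # Excluding u[i] from an XOR is the same as XORing it back in once more.
    return [total ^ bit for bit in u]

# Example: a degree-4 check node; the message back to each neighbor is the
# XOR of the other three incoming bits.
print(check_node_step_b([1, 0, 1, 1]))  # [0, 1, 0, 0]
```

The trick of computing the total XOR once and then "removing" each message by XORing it back in keeps the work per check node linear in its degree, which matters when w_r is large.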
is the same as before, yes. Okay, so that question was about the validity of the previous expression; I think people are still obsessing about it in spite of my explanation of why it should be inconsequential — anyway, if you are thinking about it, that is fine. By the way, did you understand what she said? She is saying that p times Q_L^(w_c - 1) actually occurs on the right-hand side as well, and accounting for it will eventually make it work out; anyway, it does not matter, let us not obsess over it, let us go to the things that really matter. Okay, now I want to redo the calculation for this part. Again, assume that each of these u_i will be in error with probability P_L; that we already know, the probability of error for each of the incoming u_i is P_L. Now, each of the outgoing messages is what? v_1^L — this is what is going back here — v_2^L, exactly, up to v_{w_r}^L; these are the messages going back. Now we have to compute the probability that v_i will be in error, that is, v_i equal to 1, assuming the all-zero codeword. You will get a very similar expression to what I got before, and it can be easily done. Okay, so I think I made a slight goof earlier: I should actually have called the earlier quantity Q_{L-1}, only then do I get something reasonable here; I apologize profusely, because it concerns v^(L-1), I do not know why I called it Q_L, so that should all be L-1, okay, it makes sense to put the minus 1 there, is that fine? Sorry about that. Now I can write down Q_L: it will be equal to (1 - (1 - 2 P_L)^(w_r - 1)) / 2, okay, this is the probability that v^L equals 1. So now not only have I given you a
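Putting the two steps together with the corrected index, one full iteration of the analysis reads (this is just a compact restatement of the formulas derived above, with the initialization the lecture gives a little later):

```latex
% One iteration of the Gallager A analysis on a BSC(p),
% for a (w_c, w_r)-regular LDPC code:
P_L \;=\; (1-p)\,Q_{L-1}^{\,w_c-1} \;+\; p\Bigl[\,1-\bigl(1-Q_{L-1}\bigr)^{w_c-1}\Bigr],
\qquad
Q_L \;=\; \frac{1-\bigl(1-2P_L\bigr)^{w_r-1}}{2},
```

with P_0 = p, since in the first iteration the bit-to-check messages are simply the received bits.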
description of the algorithm; based on this description, given a parity-check matrix, you can go back and write a simple program to implement this decoding algorithm, right, it is very easy, just keep doing this over and over again. You should try writing it; it is a very interesting algorithm to code. All you have to do is: have a way to generate a parity-check matrix — how will you generate one? You can use Gallager's construction: pick a suitable n, do Gallager's construction, you will get a parity-check matrix — then take the all-zero codeword, introduce errors according to some probability p, and implement this decoding algorithm. Not only can you do that, you can also analyze it, up to some assumptions. What are the assumptions? The big assumption of the messages being independent and identically distributed; up to that assumption, you can find P_L and Q_L as L increases. So I can in fact write down one iteration for P_L as a function of P_{L-1}, for L = 1, 2 and so on, with P_0 set to p. Well, it actually goes through Q, but that is okay, I can write it as one big function if I want. So the probability of error can also be analyzed in a very nice way. Go back and contrast this with the bitwise MAP decoder and the ML decoder: we had optimal, wonderful decoders, but what did we have to do to even think about analyzing them? We had complicated expressions that were essentially intractable. Here, making some IID approximations — which I will justify soon enough — we are able to write down the probability-of-error analysis as a very nice recursion. So now we have to prove some properties of this recursion; for instance, what would be a desirable property? It should decrease, right,
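The "one big function" just mentioned — P_L as a function of P_{L-1}, going through Q — is easy to iterate numerically. Below is a minimal sketch under the lecture's IID assumption; the function name and defaults are mine:

```python
def density_evolution(p, wc=3, wr=6, iters=10):
    """Iterate the Gallager A analysis recursion for a (wc, wr)-regular
    LDPC code on a BSC with crossover probability p, under the IID
    assumption.  Returns the successive bit-to-check error probabilities."""
    P = p  # P_0 = p: the first bit-to-check messages are the received bits
    history = []
    for _ in range(iters):
        # Check-node step: probability a check-to-bit message is in error.
        Q = (1 - (1 - 2 * P) ** (wr - 1)) / 2
        # Bit-node step: r_i correct but all others wrong, or r_i wrong
        # and not all others correct.
        P = (1 - p) * Q ** (wc - 1) + p * (1 - (1 - Q) ** (wc - 1))
        history.append(P)
    return history

# For p = 0.02, below the (3,6) threshold of about 0.04, the
# error probability shrinks from one iteration to the next:
print(density_evolution(0.02, iters=5))
```

Running this for p above versus below the threshold mentioned later in the lecture makes the two convergence behaviors (to zero, or to a non-zero fixed point) immediately visible.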
so P_L should be a monotonically decreasing sequence, okay, try to prove it; it can be proved from this recursion. It is a somewhat involved proof, you will have to be careful, but at the end of the day it is only real analysis, only nice smooth functions, nothing very nasty going on, it is polynomials; you can show that P_L will be a monotonically decreasing sequence under this recursion, you should get that. Okay, what can happen to a monotonically decreasing sequence as L tends to infinity? It is also positive, right, it is bounded below by 0, so when it monotonically decreases, what can happen? Only one thing can happen: it has to converge, right; it cannot escape convergence, because it is bounded below by 0 and decreasing. The only question is that it can converge away from 0, to a non-zero value. So those two cases are of interest to us: will this sequence converge to 0, or will it converge to some non-zero value? Why is this of any importance to us? Well, that is the probability of error, and you want it to converge to 0. Okay, so what is this the probability of error for? The probability of error for the message that goes from bit node to check node at iteration L. Is it enough if that goes to 0?
Yeah, that is enough, right; if that is 0, then what will become 0? I have not told you how to make decisions yet, okay, you have to ask that question: how do you make a decision after iteration L? That is also important. Decision after iteration L: I do not want to write down something very explicit here, you can use any reasonable rule; maybe you use majority logic. What is majority logic? After iteration L, for bit node i, all the information you have is r_i and then the incoming messages v_1 through v_{w_c} — there was some back-and-forth in class about whether the superscript should be L or L-1, but since the decision is made after iteration L, clearly it should be L. So maybe you take a majority vote, or maybe you give more preference to r_i; I can come up with some reasonable rule, okay, so say a majority decision: c_i hat equals the majority of r_i and v_1^L through v_{w_c}^L. If P_L equals 0, then you are guaranteed that the majority decision will be correct, okay, so it is enough to force P_L to 0; your decision automatically becomes accurate. So this is a decision you can make after iteration L. Now, the question was how do you make sure that c hat — maybe I put an L here, c hat^L, which is (c_1 hat^L, c_2 hat^L, ..., c_n hat^L) — you can make all these decisions fine, but will it be a codeword or not? That is a question you have to ask; you would not know. How will you find out whether it is a codeword? Multiply with H: compute H c hat transpose and see if it goes to 0. So what people do in practice, if the number of iterations is not a problem, is keep iterating until you get a codeword, and
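The majority decision and the H c^T = 0 stopping test sketched above fit in a few lines. This is an illustrative fragment with hypothetical function names and a toy parity-check matrix, not a real LDPC matrix from Gallager's construction:

```python
import numpy as np

def majority_decision(r, v):
    """Decide each bit by majority vote over r[i] and the wc incoming
    check-to-bit messages v[i, :] after the final iteration.
    Ties (possible since 1 + wc votes may be even) are resolved to 0 here."""
    votes = r + v.sum(axis=1)                     # ones among 1 + wc votes
    return (2 * votes > 1 + v.shape[1]).astype(int)

def is_codeword(H, c_hat):
    """The stopping test used in practice: check H c^T = 0 over GF(2)."""
    return not np.any(H.dot(c_hat) % 2)

# Toy example with a small illustrative H (three checks on four bits):
H = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1]])
r = np.array([0, 1, 0, 0])                        # received word, one bit flipped
v = np.zeros((4, 3), dtype=int)                   # all check messages vote 0
print(majority_decision(r, v))                    # the votes override the flip
print(is_codeword(H, np.zeros(4, dtype=int)))     # all-zero word passes H c^T = 0
```

In a full decoder one would loop: run an iteration, form c hat, test `is_codeword`, and stop either on success or at the absolute maximum iteration count mentioned next.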
you have an absolute maximum number of iterations, say 100, at which you give up and stop, saying okay, I have not got a codeword, maybe I will never converge to one; so in practice you have to use all those things too when you write a program and let it decode. Okay, so the first thing I want to tell you is that this can get very complicated: you have thousands of bit nodes and all these messages flowing back and forth, and the only handle you have for analyzing it is this P_L, and the calculation of P_L involves an approximation which says I have assumed all these messages are IID. If there is any repetition in my neighborhood, you will start getting dependence — why? Because the same bit nodes are coming in from different places. But if up to depth L there are no repetitions in my neighborhood, what does it mean? Up to iteration L, my calculation of P_L is valid, it is not wrong. So this is the connection, this is where everything comes together: if in my neighborhood up to depth L I do not have any repetition — or, in terms of cycles, if there are no cycles of length 2L or lower in my Tanner graph — my decoding will be accurately analyzed up to iteration L by this IID assumption. It is a very simple statement to make at the end of the day, but hopefully you saw all the directions it took before we arrived at this one very nice statement. So that is the last thing I want to write down: the length of the minimum cycle becomes important. If there are no cycles of length 2L or lower in the Tanner graph, then — and one has to be very careful here — the analysis by P_L, by the IID assumption, is accurate. In fact, if you look at the literature, most people will say the decoding is accurate; you cannot say whether the decoding is accurate or not, only that, up to iteration L, the analysis through this
IID assumption is accurate. The decoding might still be in error after iteration L; there is no guarantee of any such thing, so watch out for that. So I will say: the IID assumption for the analysis holds up to iteration L. Okay, now what did I say about sparse graphs as n tends to infinity? For any L, as long as you keep increasing n, the probability that the random code you selected from this ensemble of all possible (w_c, w_r)-regular codes has cycles of length 2L or lower is going to be very, very small. Out of the many codes in this ensemble, the one you randomly pick will have very low probability of having a cycle of length 2L or lower, for any L, as long as you make n large enough; the probability falls, I think, quite rapidly with n. So what does that mean? The IID assumption is going to hold with very high probability for large n. So that is the next statement: the IID assumption holds with high probability for any finite L as n tends to infinity, that is, as long as n is large enough. What a typical block length n is depends on what you mean by typical; for instance, if you look at the WiMAX standard, which uses LDPC codes, the value of n ranges from 24 times 24, which is 576, up to 24 times 96, which is 2304. That is what they use in WiMAX, because in wireless you cannot use very large block lengths, right, you have to use short block lengths, there are all kinds of problems in wireless. So if by typical you mean what is done in practice, that is it; but if you have another channel, for instance deep-space communication, you really do not care about block length, you can keep going on and on, push your block length to a very large number, and hope to do better and better. Alright, so in the five minutes remaining I am going to make one more
statement. So, I have made this statement about the recursion P_L = f(P_{L-1}): it is a monotonically decreasing sequence, and I said we have to distinguish between two cases, the case where the sequence converges to 0 and the case where it does not. It turns out there is a very sharp cutoff for p which determines that, and that is what is called the threshold. The threshold is a transition probability p*: for p less than p*, P_L tends to 0, and for p greater than p*, P_L tends to — not infinity, I am sorry, that was quite funny — a finite non-zero value; it will be less than 1 but bounded away from 0, it will converge to, say, 0.001 and never go below that. It turns out this is valid for the iteration we had, P_L = f(P_{L-1}). So how will you test this? Fix some w_r and w_c, say fix them to be 3 and 6; start with different P_0 values — P_0 is p, right — then run the iteration, maybe in MATLAB or something, and see whether P_L tends to 0. Okay, it turns out this sharp-cutoff behavior can also be shown for that iteration; think about how you would prove something like this. Can you prove it? I do not know, it is possible; by simulation, at least, one can easily compute the p*, right. So what will this p* be a function of? w_r and w_c — do you see that? There is nothing else involved. Why? Because the iteration itself changes with w_r and w_c: you raise to the power w_r - 1, raise to the power w_c - 1 and so on. But all these properties remain true: irrespective of w_r and w_c, the iteration is monotonically decreasing, and it is also true
that irrespective of w_r and w_c, you always have a threshold probability below which your P_L will tend to 0, and you can compute this p* by running the iteration repeatedly in simulation; it is easily computed. Now, notice: for this p* to have any meaning in practice — let me be careful and say it that way, it is not itself a probability of error — what all should tend to infinity? If I want to make a statement like "this code will work with vanishing probability of error for p less than p* and with non-vanishing probability of error for p greater than p*", then I should first let L tend to infinity, and because I am letting L grow large, n should also go to infinity. So this threshold is asymptotic in both senses: asymptotic in block length and in the number of iterations. As you go to larger and larger block lengths and do more and more iterations, this p* will control the behavior of your LDPC code: for p less than p* the probability of error will drop to 0, for p greater than p* you will get a non-zero probability of error, and this becomes more and more sharply visible as block length and the number of iterations increase. Let me illustrate that with one plot. So this is the plot, hopefully you can see it — I have not imported it into Windows Journal, maybe I can add it later. This is purely simulation, there is no analysis here; I have done simulation for
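Computing p* "by running the iteration repeatedly" can be sketched as a bisection over p, classifying each trial p by whether the recursion from earlier drives P_L to (numerically) zero. The function names, iteration budget, and tolerance below are my own choices:

```python
def converges_to_zero(p, wc=3, wr=6, iters=2000, tol=1e-9):
    """Run the Gallager A analysis recursion and report whether P_L dies out."""
    P = p
    for _ in range(iters):
        Q = (1 - (1 - 2 * P) ** (wr - 1)) / 2
        P = (1 - p) * Q ** (wc - 1) + p * (1 - (1 - Q) ** (wc - 1))
        if P < tol:
            return True
    return False  # stuck at a non-zero fixed point: p is above threshold

def threshold(wc=3, wr=6, steps=40):
    """Bisect for p*: the largest p for which P_L still tends to 0."""
    lo, hi = 0.0, 0.5
    for _ in range(steps):
        mid = (lo + hi) / 2
        if converges_to_zero(mid, wc, wr):
            lo = mid   # still converges: threshold is higher
        else:
            hi = mid   # stuck: threshold is lower
    return lo

# For the (3,6)-regular ensemble this lands near the 0.04 quoted in the lecture.
print(round(threshold(), 4))
```

Note that near p* the recursion lingers for many iterations before deciding, which is why the iteration budget has to be generous; this numerical p* is exactly the quantity the plot discussed next is measured against.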
BER, bit error rate, versus transition probability for a binary symmetric channel — can you read it? — and can you read the title? (3,6)-regular LDPC codes with Gallager A decoding. Maybe you cannot read the legend very clearly; the different curves are for increasing block length: I start with block length 30, then go to 100, then 1000, then 10,000. And the threshold — which is purely a function of 3 and 6, and which I calculated by running the iteration and seeing when it converges to 0 and when it does not — the point at which the transition happens is 0.04 for (3,6)-regular codes. You can stare at it for a while and convince yourself that the threshold, while it theoretically results from letting block length and the number of iterations tend to infinity, is very practical in predicting the performance. In fact, can you guess how many iterations I did for this? I did only 10 iterations, and even for block length 1000 you can see the behavior: to the right of the threshold you get a non-zero probability of error, and as soon as you cross the threshold there is a big cascade, a waterfall-type behavior — it drops down, and for 10,000 it becomes more pronounced: it is flat for a while and then, around 0.04, it takes a sharp dive. Can you see that? Why is it behaving this way, with the dive towards the left-hand side? You might be used to SNR, where the channel improves as you move to the right; this is transition probability, where the channel gets worse as you move to the right, so the plot is a little inverted. And L is only 10; my number of iterations is only 10 in this plot. So you see, even for a small number of iterations and reasonable block lengths —
okay, 1000 or 10,000 one might say is not too unreasonable — the threshold is quite an accurate predictor of the actual behavior over the channel. So what does this mean? It means all the probabilities behind the assumptions you made are falling very fast with n; that is why even for small n all these things tend to hold, and even for small L — P_L converges very fast, so when it wants to go to 0, it goes to 0 very fast. That is why this threshold, even though we made all those fantastic assumptions in the analysis, is very valid in practice. Even for block length 30 — okay, for block length 30 it is really bad, I admit, it is just a straight line falling down, and at block length 30 maybe you do not care too much — but even at block length 30 notice how remarkable it is: I am doing 10 iterations, and at this point here — can you see my cursor moving around that point? — the channel transition probability is 2 times 10^-3, i.e. 0.002, and the output bit error probability is below 10^-4, so it is doing fine; it is not too terrible in terms of raw behavior, and maybe you cannot expect much more from a rate-half code with block length 30 — I am sure there are better codes, this is just to give you an idea. For larger and larger block lengths, imagine how remarkable it is: for the 10,000 code, it is a 5000-by-10,000 parity-check matrix. I want you to step back and imagine how non-trivial this decoding is — can you imagine doing any other decoding for that code? It is not possible, right; it is just a random code that I constructed as an LDPC code, and I am able to decode it and show you a plot of bit error rate going down to 10^-5, which means how many blocks would I have simulated? In fact, for this I
simulated 100,000 blocks; in MATLAB it takes maybe a minute or two, even less. So that is the remarkable nature of all the things that have gone into the design of LDPC codes, into the decoding, into the analysis; it all comes together in this plot as far as I am concerned. We did a lot of things, we made a lot of assumptions, but at the end of the day, for reasonably long block lengths — not even that very long — LDPC codes can be made to work; it is low-complexity, it is implementable, it is wonderful, and it is no wonder that they are out there. And now, if you remember capacity: what was the capacity-achieving transition probability for a rate-half code on the BSC? 0.11, right; 0.11 is the limit from capacity, while the threshold here is 0.04. Even if I let n tend to infinity, as long as I keep my (3, 6) degrees, I can never do better than the threshold, and P_L will never tend to 0 beyond it; this 0.04 limits me. So what should I try to do now? I should try to design LDPC codes for which the threshold moves, in this picture, to the right — maybe rate-half LDPC codes whose threshold is very, very close to 0.11. Can we do that? That is one question, and you will see that you naturally have to go to codes that are called irregular codes for that, and that means more careful study; that is the direction in which we will move. I thought I would show you an animation, but we do not have time; I will return your answer papers and show you the animation in the next class, okay, I am sorry.