It has been quite a long time now, I think last Tuesday when we met last. So, let me quickly remind you of the kind of things we were looking at last week. We were looking basically at constrained-complexity equalizers. If you remember, we initially started with the situation where there was no ISI, and then we went to a situation where you expect ISI; then how do you cancel it? If, after you do a matched filter, you finally have a finite impulse response, then you can presumably do optimum detection, which is with the Viterbi algorithm. In cases when you do not have an FIR response, you cannot really do Viterbi, so that is out of the question. Everything you can do is something suboptimal, and that is where these linear equalizers and decision feedback equalizers come in; they play an important role there. Your channel can have a pole; it is stable, the pole is inside the unit circle, but still it might have a pole. In situations like that we were looking at what to do, and we ended up with linear equalizers and the DFE. So, initially we did not put any constraints on the length or the order of the filters used in the linear equalizer and the DFE, and we derived a whole bunch of filters with two different criteria: the zero-forcing criterion and the mean-square-error criterion. For both criteria we derived what the optimal filters are, we found what the minimum mean square error is, and we found that the DFE-MMSE seemed like a pretty good choice in most cases. But still there were problems with the optimal linear equalizers and the DFE.
Several times the matched-filter part of it, the precursor equalizer so to speak, ended up containing the matched filter, and the matched filter in some cases can end up being IIR and anti-causal, particularly if you have a pole inside the unit circle in your channel response, which can happen; then your filter becomes IIR anti-causal, which can only be approximated. So, in several cases the optimal filters may not be implementable; that is one problem. Another problem is that the order can become really, really large, and if the order is very large you cannot possibly implement it accurately; you do not know how to approximate it with a finite number of taps. The third and more significant problem is: what if you do not know the channel? All these optimal filters are in terms of the channel response, and if you do not know the channel response, what do you do? So, all these are open questions, and once again let me remind you of the overall system. We took a general system model where this H(z) is not necessarily minimum phase, then we had noise, and then we have z_k. In constrained-complexity equalizers we first looked at linear equalizers; I will come to the DFE also as we go along. So, we will have a filter here with N = 2L + 1 taps, as in it is a (2L + 1)-tap filter, c_{-L} to c_{+L}. You have just 2L + 1 coefficients in your filter, and whatever you get out you are going to slice. At this constraint, what is the best possible C(z) I can pick? That was the next question to ask. Now, "best possible" I have to define; one could use zero forcing, but we did not do that; we saw the MSE constrained-complexity linear equalizer.
So, the minimum-mean-square-error filter is what we saw, and we saw it becomes a linear algebra problem: just vectors and matrices, nothing more in it. To be very specific, you form the vector c, which is the actual filter; it has coefficients from c_{-L} to c_{L}, so 2L + 1 coefficients. Then I form the other vector, which I call z_k. What is this? It is a kind of shift register which holds the z_k's; it goes from z_{k+L} to z_{k-L}, once again of length 2L + 1. If you do this, the output, which I called x_k, becomes simply c^T z_k, and then the error becomes e_k = s_k - c^T z_k, and then you write a simple expression for the expected value of |e_k|^2. Remember all these things are complex, and all these things are also random variables. So, you assume a certain distribution for them and compute the expectation over that; this is the definition of the mean square error. Clearly, you need to know the statistics of s_k and z_k. How do you know the statistics of z_k? z_k is nothing but s_k convolved with h_k, plus n_k. So, it means you should know h_k exactly, and you should also know the statistics of s_k and n_k. Statistics in the sense that it is enough if you know the mean and autocorrelation: this is all linear filtering, just additions, so the mean and autocorrelation will give you everything you want. So, once you know those statistics, and h_k exactly (it is not random), the statistics of z_k can be computed: the cross-correlation between z_k and s_k, and the autocorrelation of z_k.
So, once you compute those things, you see that the mean square error is nicely expressed as E[|s_k|^2] - 2 Re{c^H alpha} + c^H Phi c. This alpha ended up being the cross-correlation between s_k and z_k*, that is, alpha = E[s_k z_k*]. And Phi ended up being E[z_k* z_k^T]; it is basically the autocorrelation matrix. So, once you know alpha and Phi, the mean square error is completely expressed as a quadratic form in what variable? In c, which is what you want. So, it becomes a simple problem of minimizing a quadratic form. That is a very easy problem: one straightforward way is to differentiate with respect to the variables and set the result equal to 0; you will only get linear equations. And we saw that linear equation ended up being Phi c_opt = alpha. If you solve this linear equation, you get the best possible filter, the one that minimizes the mean square error at the input of the slicer. It is really simple; even the constrained-complexity equalizer is a simple, straightforward situation; there is no real complication in getting the answer. So, this solved quite a few of the problems we had so far, except for the problem that you need to know Phi and alpha, which in turn requires you to know the channel H(z). What if you do not know the channel? That is the next question. For that we have to build what is called an adaptive equalizer: one that will adapt according to the mean square error calculated at the receiver. So, it is an adaptive filter; basically, c will not be a constant filter over time. If you know all the statistics, you compute it and keep it constant.
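Since the whole thing is just linear algebra, the solution of Phi c_opt = alpha can be sketched numerically. This is only an illustrative sketch, not something from the lecture: the channel h, the order L, the noise variance, and the use of BPSK symbols are all assumptions, and Phi and alpha are formed by time-averaging rather than from closed-form expressions.

```python
import numpy as np

# Illustrative sketch of the constrained-complexity MMSE linear equalizer:
# solve Phi c_opt = alpha for a (2L + 1)-tap filter. Channel, L, noise level,
# and BPSK symbols are assumptions for the demo.
rng = np.random.default_rng(0)
h = np.array([1.0, 0.5])                  # assumed channel H(z) = 1 + 0.5 z^-1
L = 3                                     # filter has 2L + 1 = 7 taps, c_-L ... c_L
sigma2 = 0.01                             # noise variance
N = 50_000

s = rng.choice([-1.0, 1.0], size=N)       # i.i.d. unit-power symbols s_k
z = np.convolve(s, h)[:N] + np.sqrt(sigma2) * rng.standard_normal(N)

# z_k vector = [z_{k+L}, ..., z_{k-L}] so that x_k = c^T z_k = sum_l c_l z_{k-l}
ks = np.arange(L, N - L)
Z = np.stack([z[k - L : k + L + 1][::-1] for k in ks])

Phi = Z.conj().T @ Z / len(ks)            # estimate of E[z_k* z_k^T]
alpha = Z.conj().T @ s[ks] / len(ks)      # estimate of E[s_k z_k*]
c_opt = np.linalg.solve(Phi, alpha)       # Phi c_opt = alpha

mse = np.mean(np.abs(s[ks] - Z @ c_opt) ** 2)
```

With these numbers the residual MSE comes out close to the noise floor, far below the unit symbol power.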
But in the adaptive case, c will keep changing with time; that is the basic idea. You can see why an adaptive filter is useful in several scenarios. You might know the channel at some time and the channel might change on you; at that point it is good to have an adaptive filter anyway, so that you keep changing according to the mean square error. The other thing is that this mean square error can also be computed at the receiver; that is the very nice thing about the mean square error. How do you compute the mean square error at the receiver? Instead of s_k you put s-hat_k: basically, subtract the input and output of the slicer and square it. You hope that s-hat_k is roughly s_k, so you get the mean square error. That is how you adapt: you adapt according to the error at the slicer, using the mean square error. To come up with an adaptive filter, we first saw an iterative algorithm for doing what? For finding the solution of this linear equation. A linear equation can be solved using several methods; one of the methods I talked about was this iterative algorithm. We start with the iterative algorithm for solving the linear equation, and that will nicely suggest an adaptive algorithm, one which will adapt c from time to time. So, in the last class we saw an iterative algorithm, and the upshot of it was the following update at the j-th iteration (from j to j + 1): C_{j+1} = C_j - (beta/2) times the gradient with respect to C_j of E[|e_k|^2]. You know this MSE is a quadratic form in C_j: after the j-th iteration the mean square error is in terms of C_j, and then I find its gradient. How do I adjust C_j? I move in the direction opposite to the gradient.
The gradient is going to take me towards the maximum, so the direction opposite to the gradient will take me towards the minimum. Remember this is a quadratic function, so it is well behaved; you do not have to worry about all kinds of maxima and minima here and there. You have only one minimum and you can get there; a well-behaved quadratic is no problem. So, you write this out and you get a very simple update equation. Remember, this is after the j-th iteration, so this is like the MSE at step j. You take the gradient, and the gradient also had a very simple formula, right? What was that very simple formula? You put it in and the update becomes C_{j+1} = C_j + beta (alpha - Phi C_j). So, the gradient had a very simple form, and you see C_{j+1} gets updated in a very simple way based on C_j. Then we analyzed how this algorithm converges as a function of j. We were able to write C_j easily in terms of C_0 itself: the error, the difference between C_opt and C_j, could be written very easily as the initial error times some function of the eigenvalues of Phi. Phi is a Hermitian symmetric matrix, so its eigenvalues are going to be real; we also assumed it is positive definite, so the eigenvalues are going to be positive. So you know a lot of things about Phi, and using all those properties you can write very easily what the error is going to be. If you define Q_j = C_j - C_opt, we showed that Q_j = sum over i = 1 to n of (1 - beta lambda_i)^j v_i v_i^H Q_0, where the v_i are the orthonormal eigenvectors of Phi. Phi is Hermitian symmetric, so you know it has a complete set of orthonormal eigenvectors.
So, you take all that and expand, and you can very easily show that this formula works, with the factor (1 - beta lambda_i)^j. Notice that this error term after the j-th iteration is a sum of n quantities, and each of those quantities decays differently as a function of j. All of them decay exponentially, but you can choose only one beta; you cannot make beta a function of i. Depending on your one choice of beta, each of these terms will have a different magnitude. And you can make sure that all of them decay by choosing beta how? If you choose beta between 0 and 2/lambda_max. Where did I get lambda_max? Phi is a positive definite Hermitian symmetric matrix, so its eigenvalues are real and positive, and you can arrange them in increasing order, from lambda_min to lambda_max; these are the eigenvalues of Phi. If you choose beta between 0 and 2/lambda_max, you know for sure that each of the factors (1 - beta lambda_i) will be less than 1 in magnitude, so all the terms will decay. But even if you choose it like this, each of the terms will decay at a differing rate. If you want to maximize the rate at which it converges, you have to make some careful choices; all these things can be done, but I am not going to go into it. The only thing I will talk about is when the convergence will be fastest. If all the terms decay at the same rate, then you can pick a beta so that the convergence is fastest. If the eigenvalues are all equal, say a repeated eigenvalue at 1 or 2 or some such thing, then you can pick one beta and expect the same convergence rate for all your terms. But on the other hand, suppose your eigenvalues are well spread out, from 1 to 100 for instance.
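As a sanity check on the step-size condition, here is a tiny made-up example of the iteration C_{j+1} = C_j + beta (alpha - Phi C_j) with beta chosen inside (0, 2/lambda_max); the particular Phi and alpha are inventions for illustration, not from the lecture.

```python
import numpy as np

# Gradient-descent iteration C_{j+1} = C_j + beta (alpha - Phi C_j).
# Phi and alpha are made-up values for the demonstration.
Phi = np.array([[2.0, 0.5],
                [0.5, 1.0]])
alpha = np.array([1.0, 0.3])
c_opt = np.linalg.solve(Phi, alpha)       # the fixed point Phi c = alpha

lam = np.linalg.eigvalsh(Phi)             # real and positive (Phi is PD Hermitian)
beta = 1.0 / lam.max()                    # any beta in (0, 2/lambda_max) converges

c = np.zeros(2)                           # start from C_0 = 0
for _ in range(200):
    c = c + beta * (alpha - Phi @ c)      # step opposite to the MSE gradient

err = np.linalg.norm(c - c_opt)           # Q_j = C_j - C_opt shrinks like (1 - beta lambda_i)^j
```

The error norm decays geometrically, dominated by the slowest factor |1 - beta lambda_min|, which is exactly the eigenvalue-spread effect discussed next.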
Then wherever you put your beta, there will be one term which is going to go much slower than the others, and that term determines how fast your convergence is. So, in general the statement you can make is: the speed of convergence depends on the spread of the eigenvalues; it is directly related to the spread of the eigenvalues. That is the first result that we see. The next thing is: how do you relate the spread of eigenvalues to an actual physical channel quantity, as opposed to just looking at Phi as a matrix? Phi, after all, is the autocorrelation matrix of z, and the power spectral density of z can definitely be related to the channel. I want to be able to look at the channel response and decide how spread out my eigenvalues are going to be. For this we will use some results which I will not prove. One can show that the eigenvalues of this Phi are bounded as min over theta of S_z(e^{j theta}) <= lambda_i <= max over theta of S_z(e^{j theta}), where S_z is the power spectral density. This will always be true, and you can show it fairly easily: after all, the autocorrelation matrix comes from the autocorrelation function, and the Fourier transform of the autocorrelation function is the power spectral density. It looks difficult, but it is actually very easy to show; you can use some DFT-type arguments. So, this is the first thing. When will the spread of eigenvalues be small? Exactly: when the channel is flat. This S_z is actually related to two things; what are the two things that S_z is related to?
The channel magnitude squared, |H(e^{j theta})|^2, plus the noise power spectral density. Let us assume the noise power spectral density is flat; it is not changing with frequency. Then the variation of S_z is completely controlled by the variation in |H|^2, your channel response. If the channel varies a lot, then your eigenvalues are also likely to be spread out a lot; if the channel is flat, then your eigenvalues will also be close together. That makes total sense, right? So, if the channel is varying a lot, your equalizer is going to take a long time to converge; if the channel is flat, your equalizer is going to converge fast. That is a nice way of relating everything together and bringing everything together at the end. Looking at the channel: if it is varying a lot, if its minimum and maximum are well separated, then you can expect the convergence to be slow; otherwise, the convergence will be fast. The reason why convergence is important is that it shows how quickly your filter will respond to a changing channel situation. In today's wireless communications, people are moving about a lot and the channel changes very rapidly, and you want to build an equalizer which will adapt very quickly to changing channel conditions. That is why these kinds of things are very important: how quickly my convergence happens matters a great deal. That is one point. There are several other points one can make; for instance, more is known about the lambda_i as n tends to infinity. Remember, what is n? n is not the number of iterations; n has to be the dimension of Phi, which is n by n. As n becomes really, really large, one can show that lambda_min will tend to the minimum over theta of S_z(e^{j theta}).
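The eigenvalue bound can be checked numerically. In this sketch the channel, the noise level, and the matrix size are assumptions; with unit-power white symbols, the autocorrelation of z_k = (h * s)_k + n_k is r[m] = sum_k h[k] h[k+m] + sigma^2 delta[m], and the eigenvalues of the resulting Toeplitz Phi should sit between the minimum and maximum of S_z(e^{j theta}) = |H(e^{j theta})|^2 + sigma^2.

```python
import numpy as np

# Check min_theta S_z <= lambda_i <= max_theta S_z for a Toeplitz
# autocorrelation matrix. Channel, noise, and size are illustrative.
h = np.array([1.0, 0.8])                  # deliberately non-flat channel
sigma2 = 0.1
n = 16                                    # dimension of Phi (2L + 1 in the lecture)

# autocorrelation r[m] = sum_k h[k] h[k+m] + sigma2 * delta[m]
acf = np.correlate(h, h, mode="full")[len(h) - 1 :]   # r[0], r[1], ...
r = np.zeros(n)
r[: len(acf)] = acf
r[0] += sigma2
Phi = np.array([[r[abs(i - j)] for j in range(n)] for i in range(n)])

lams = np.linalg.eigvalsh(Phi)
theta = np.linspace(0.0, 2.0 * np.pi, 4096, endpoint=False)
H = np.exp(-1j * np.outer(theta, np.arange(len(h)))) @ h   # H(e^{j theta})
Sz = np.abs(H) ** 2 + sigma2                               # PSD of z

spread = lams.max() / lams.min()          # governs convergence speed
```

A flatter channel (try h = [1.0, 0.1]) shrinks the spread, matching the claim that flat channels converge fast.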
And lambda_max will tend to the maximum over theta of S_z(e^{j theta}); one can reasonably make this assumption that n is becoming very large if you are using a long filter. The reason this is important is the following: suppose you expect a channel null, as in a zero in your frequency response. What happens when your filter order becomes really, really large is that your minimum eigenvalue goes to 0, which means your Phi is not invertible anymore, so your optimal MMSE solution cannot be guaranteed. Those are problems with the way this thing adapts: once your Phi is not invertible, lambda_min is 0, your MSE can blow up, and a lot of bad things can happen. So, these give some intuitive feeling for how the eigenvalues work and how fast you can expect the algorithm to converge. Any questions on any of this? People are fairly OK with this so far? It seems simple. All right. The next thing we are going to do is consider a situation where you do not know Phi and alpha: you do not know h, so you cannot compute Phi and alpha. How do you start off your equalizer, how do you adapt, is the question. This algorithm is going to be called the stochastic gradient algorithm; one can also call it LMS, the least mean square algorithm, if you want. The motivation comes very clearly from the gradient descent. What did we have so far? We had C_{j+1} = C_j - (beta/2) times the gradient with respect to C_j of E[|e_k|^2]. If you look at a situation where you do not know h, what are the things that are computable in this expression? If you look at the gradient expression, |e_k|^2 one can compute; you do not really need to know h. You know z_k: you put it through the filter, you look at the slicer input and slicer output, and you can subtract. For each k you can easily compute |e_k|^2. What is it that you cannot do?
You cannot take the expectation. You can only do time averaging if you want; you cannot take the ensemble-average expectation, because you do not know the distribution; you do not know h, so you cannot do it. So, what do we do when we do not know something in an expression? One convenient thing to do is simply erase it. What will happen if I erase that expectation? Nothing really goes bad: you can still take the gradient of |e_k|^2 with respect to c; you know how to compute |e_k|^2. That is exactly what is done in the least mean square algorithm, or stochastic gradient algorithm: since you cannot compute the expectation, you simply remove it. Another reason why removing it makes sense is this: suppose somebody gives you one observation of a random variable. Some x is a random variable with a certain pdf which you do not know, and somebody tells you x took a value small x. If you were asked what the best estimate for the mean of x is, what do you say? The mean is x itself; that will be the best estimate for the mean, and one can show it is the least-mean-square estimate for the mean. If you are given only one observation of a random variable and you have to find its mean, your best guess is that the observation itself is the mean. That is what you are doing here: you are given only one k, you compute |e_k|^2, but you have to find its mean. Instead of finding the mean, you say your best estimate is what you computed, and you simply erase the expected value. So, that is what we will do. We will also make a small, subtle change. Right now, we were looking at just solving an equation, and j was an iteration number. In practice, you are trying to adapt your equalizer to symbols that are coming in, and for every symbol that comes in, you want to adapt your equalizer once.
So, that makes sense: every time you get a new symbol, you can compute a new e_k, and you want to use that to adapt your filter. So, this j I will replace by k: for every symbol arrival, I have a filter update. I am going to write c_{k+1} = c_k - (beta/2) times the gradient with respect to c_k of |e_k|^2 (my k became a little bit corrupted on the board). This is going to be my stochastic gradient algorithm: I take what I can compute as the best estimate of the mean and simply replace the mean with that estimate. And the other thing I do is this: every time I compute a new e_k, I update my filter coefficients, so I think of my iteration number as the symbol arrivals themselves. Every time a new symbol comes, I update my filter; my iterations are driven by the symbols. That is how the whole thing works. Now, this quantity is easy to compute. Look at e_k: |e_k|^2 can be written as |s_k - c^T z_k|^2. You multiply this quantity by its conjugate and do the simplification, and you get |s_k|^2 - 2 Re{c^H z_k* s_k} + c^H z_k* z_k^T c. This is just s_k - c^T z_k times its conjugate; do the expansion and identify the parts carefully. Ultimately everything has to be real here, real and nonnegative; the left-hand side is real and nonnegative, so you do that and you get that. Now we take the gradient with respect to c, and what you end up getting is the following; you can show that the gradient is -2 e_k z_k*.
It is instructive to do this carefully. You can do it in several ways, but the most basic way is to simply write the thing out in terms of all the coefficients and then differentiate. Remember c is complex, so you will have to write it in terms of real and imaginary parts; you will have a lot of variables and then you have to put them together carefully. You do all that and you will get the gradient to be -2 e_k z_k*. This makes a lot of sense: the gradient is the error multiplied by something else, just as when you differentiate x^2 you get x in the result. It is good that e_k shows up here; that is the only point I want to make. So, you go back and substitute it in, and you get the final update c_{k+1} = c_k + beta e_k z_k*. This is my ultimate update for each k. The whole thing works out quite easily. Even if you do not know h, you can start off with an arbitrary c_0 as your initial filter response, something that makes sense to you; then, as z_k comes in, you let it through your filter, slice it, look at the output, subtract the slicer input from the output to get your error, compute beta e_k z_k*, and add it to your previous filter to get a new filter. How do you choose beta? You have to choose it blindly, based on some experience; if you do some simulations with a few channels you will know how to choose beta. Pick it small enough; do not pick it very large. Going small is not a serious problem; eventually you will converge. If you pick it very large you can end up in some complicated situations; instability can result. So, pick it small enough and you will converge. And remember all these things are vectors: this is one update equation, but it is a vector equation.
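One quick way to convince yourself of the -2 e_k z_k* formula is a finite-difference check. This little sketch uses made-up real-valued data (so the conjugate is a no-op) and compares the analytic gradient of |e_k|^2 with a numerical one:

```python
import numpy as np

# Finite-difference check that the gradient of |e_k|^2 w.r.t. c is -2 e_k z_k
# for real data. c, z_k, and s_k are arbitrary made-up values.
rng = np.random.default_rng(3)
n = 5
c = rng.standard_normal(n)
zk = rng.standard_normal(n)
sk = 1.0

def sq_err(cv):
    return (sk - cv @ zk) ** 2            # |e_k|^2 for this one sample

ek = sk - c @ zk
analytic = -2.0 * ek * zk                 # the gradient formula from the lecture

eps = 1e-6
numeric = np.array([
    (sq_err(c + eps * np.eye(n)[i]) - sq_err(c - eps * np.eye(n)[i])) / (2 * eps)
    for i in range(n)
])
```

For complex data the same check works per real and imaginary part, which is exactly the careful bookkeeping mentioned above.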
So, let me write it down carefully for you. The vector c_{k+1} has entries c_{k+1,-L}, ..., c_{k+1,0}, ..., c_{k+1,L}, and similarly c_k has entries from -L to +L; the update is that vector plus beta times e_k z_k*. Remember e_k is simply a scalar, the error you got after the k-th symbol arrived, but z_k* is the shift-register vector z*_{k+L} down through z*_k down to z*_{k-L}. That is the actual update equation. If you number the rows from 0 to 2L, the first row is 0 and the last row is 2L; if you take some j-th row, the j-th element of c_{k+1} is updated as c_{k+1,j} = c_{k,j} + beta e_k z*_{k+L-j}; if you do the computation, I believe it works out to the index k + L - j. So, this is how you compute it. One can draw a picture, so I am going to carefully draw one and show you how this update works. You have s_k going through a channel, which you do not know, and then noise gets added to it. Now you have to have a shift register: the z_k's should populate a shift register of length 2L + 1. How do you make a shift register? You just put a series of D flip-flops if you want; in DSP they are denoted as z^{-1}, basically delay elements. You put one-sample delay elements; for 2L + 1 values you need 2L delay elements, if you use the present input along with the delayed versions of it. Each of these gets multiplied by what? The first one gets multiplied by c_{k,-L}, the next by c_{k,-L+1}, and so on; the last one will be c_{k,L}, and the one before it c_{k,L-1}. Those are the multiplications, and then what do you do? You add up all these things: I will put a big circle here and write summation, and then what do you get?
Here you get c_k^T z_k, which is the filtered version of z_k; this tapped-delay-line structure is the standard way of doing the filtering. Then what do you do here; what is the next block? You slice it, that is all: you just slice it to get the decision, and you get your s-hat_k. Is that OK? Now, how do you adapt it? You compute the error term. How do you compute the error term? Subtracting the slicer output and input gives you e_k, and this can be used in the adaptation, using those equations. I will draw another picture to show you how the adaptation works; you will see it more clearly. This is how your equalizer structure will look: you have a shift register with z_k coming in, you multiply all the values in the shift register by the taps and add them up to get the filtered version, you slice it to get the estimate of the transmitted symbol, then you compute the error and use the error to adapt your coefficients. You can do this in a very easy way. But before I draw the picture for the adaptation, I want you to look at this very closely. When I dropped the expectation operator in the gradient, I said that if I am given only one e_k, the best estimate for the expectation is e_k itself. But actually, as k goes along, I am getting more and more e_k's. So, if I get e_k and e_{k+1}, then the best estimate for the mean is what? The sum of those two divided by 2. The scaling can be adjusted somehow, but at least I have to keep adding all the error terms I am getting; some accumulation should happen for the error. But notice that in that update equation, accumulation is implicitly happening. How is the accumulation happening? Notice this is an iterative thing: if I write c_{k+1} in terms of c_0, what will happen? You get c_0 plus beta times a summation; the summation automatically comes.
So, in the estimate of my gradient I am actually doing an accumulation as well, which is what I should do, and beta is taking care of the scaling, if you will. In a way, this adaptive equalizer can be seen as replacing the ensemble average by the time average that you actually get, and you are hoping that both will be the same. That is another viewpoint you should keep in mind: there is actually an accumulator here. This is only an iterative step; if you write c_{k+1} in terms of c_0, you will have a summation, and it comes automatically. So, keep that in mind. Now I am going to draw a picture to show how the adaptation works out. You have this shift register here; I will take the j-th tap, so after this part of the shift register I have z_{k+L-j}. All these indices do not really matter; it is just a shift register, it holds 2L + 1 values, that is all. This value I am going to multiply by c_{k,j}, and then it goes to the summation unit; remember the summation unit gets inputs from all the other taps as well. Then you produce the filtered output, which I am going to slice; on one side I will draw the slicer here, to get my s-hat_k, which I am going to imagine is s_k for computing the error. I compute the error, and then what do I do? I have to conjugate this shift-register value and multiply it by what? Beta times e_k, a very simple expression. And then what should I have here? An accumulator, and out of it I get c_{k+1}. This is an accumulator, probably a very badly drawn rough picture, but hopefully you see the idea. This is how the adaptive loop works.
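The whole adaptive loop, shift register, taps, summation unit, slicer, error, and update, can be sketched in a few lines. Everything here (channel, beta, L, noise level, and training on the true symbols) is an illustrative assumption, not the lecture's exact setup:

```python
import numpy as np

# LMS adaptive linear equalizer: c_{k+1} = c_k + beta * e_k * conj(z_k).
# Channel, step size, and signal model are assumptions for the demo.
rng = np.random.default_rng(1)
h = np.array([1.0, 0.4])                  # unknown to the receiver in practice
L, beta, N = 3, 0.01, 20_000
s = rng.choice([-1.0, 1.0], size=N)
z = np.convolve(s, h)[:N] + 0.03 * rng.standard_normal(N)

c = np.zeros(2 * L + 1)                   # arbitrary starting filter c_0
for k in range(L, N - L):
    zk = z[k - L : k + L + 1][::-1]       # shift register: z_{k+L} ... z_{k-L}
    xk = c @ zk                           # summation-unit output (slicer input)
    ek = s[k] - xk                        # error (here the true s_k is known)
    c = c + beta * ek * np.conj(zk)       # the accumulator: updates pile up over k

# residual MSE over the last 2000 symbols, after convergence
tail = range(N - 2000, N - L)
tail_mse = np.mean([(s[k] - c @ z[k - L : k + L + 1][::-1]) ** 2 for k in tail])
```

Note the loop variable k is doing double duty as symbol index and iteration number, exactly the replacement of j by k described above.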
All these things are digital, implemented in a computer and so on, but they all have loops. Hopefully you see that the loops are there, and every time there is a loop, what should you be worried about? Yes, we should always be worried about stability: how quickly it will react to things versus how stable it will be. There is always a compromise between those two every time you have a loop, and you have to have loops; that is the only way things work in practice. And every time you have a loop you have to be very careful. It is nice and easy to write one equation saying c_{k+1} = c_k plus something, but it is actually creating a loop in your system, and you have to be very careful about how quickly and how stably it updates. So, all these things you have to worry about. What people might do in cases where they expect instability is put some clipping on these coefficients, so they do not let the coefficients go off to wherever they want. In fact, they might even clip the error; maybe there is not too much point in clipping outside of the accumulator, maybe you want to clip inside. You do not want to let your coefficients blow up too much. All these are additional things you might want to put in, in practice, in situations where you expect instability, but these loops are unavoidable in any receiver system. In fact, you have a lot more loops, which we will see as we go along; this is just the first loop we are seeing. The error is feeding back into the adaptation of the equalizer coefficients. This picture is clear, right? It is a simple enough picture, but hopefully it is clear. Yes? Oh, you want to see the convergence shown. I was not planning to do it; you can see the corresponding section in Barry, Lee, and Messerschmitt.
They have some study of convergence, but I did not think it was nice enough to present in class. You can study convergence here also, like you did before, and you can show that it actually works. The best demonstration of convergence, for me, is if you go to the lab we are doing and actually implement this; it will work for most channels. I have actually done this; I know it works. You have some trouble once in a while when the channel is changing, and in some cases your coefficients might blow up if things do not work properly, but eventually it works. You can play around with it and it works; that is the best kind of proof. So, I think this is a good point to do a summary of all that we have done and give you a brief look into what more is there. Oh, actually, I should do the DFE first, I am sorry. So, how do you adapt the DFE is the next question; so far I have only done the linear equalizer. Now, one question that comes up: we motivated this by solving the linear equation, and there it seemed like a complete thing with all the data taken into account. That is because you had an ensemble average to find phi and alpha. When you do not have that, it seems like for each k you are doing only one iteration, but the iterations keep building up. So, one might say that for the initial part, maybe the first 100 or 1000 symbols, it is not that reliable. So, what you do is you usually train this filter: you actually send some 100 symbols which are known to the receiver.
In some other cases, actually, in some advanced wireless systems, you do things differently: you send something called pilots, and there your equalization is slightly different, you do something else. So, some training or pilot sequence you have to transmit to make sure you train the equalizer first, so that it has some place to start, and after that you let it evolve in decision-directed mode. Otherwise, for the first symbols it will be really, really bad: you do not know what the equalizer is, so it will be very bad. So, you either have a training sequence or you have a pilot somewhere for the equalizer to know what the taps should approximate. Pilots do not really work in this situation, but there is another situation where pilots work. More iterations for the same k? What do you mean by more iterations for the same k? Since we know phi, you can do the full solution; if you do not know phi, how do you repeat the iterations? You have to be very careful when you do these repetitions. If you do it based on one k and there is an error, when you repeat it, that same error will build up. If you can repeat it with some new information, so that the error is going to decrease, that is fine. So, whenever you loop back, you have to be careful: you cannot keep looping back with the same error, or it will just blow up, and you do not want that. At the next time instant you are getting a new error, which hopefully statistically evens out, so you do not get the blow-up. These are just high-level comments, but this is how actual receiver circuits are built; you have to think along those lines and build something that will work. So, whenever you have a loop back, you have to be worried about whether you are doing positive feedback or some average-case proper feedback.
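The train-first, then decision-directed scheme can be sketched as a single loop with a switch-over: for the first symbols the known training symbols drive the error, and afterwards the slicer output stands in for s_k. Real-valued BPSK is assumed (so the conjugate in the update is a no-op), and all names and the switch-over rule are illustrative choices, not from the lecture.

```python
import numpy as np

def slicer(y):
    # BPSK slicer (assumed constellation for this sketch)
    return 1.0 if y >= 0 else -1.0

def train_then_track(z, s_train, L, beta, n_total):
    """Train on known symbols, then switch to decision-directed mode,
    where s_hat_k replaces s_k in the error computation."""
    zp = np.concatenate([np.zeros(L), np.asarray(z, float), np.zeros(L)])
    c = np.zeros(2 * L + 1)
    decisions = np.empty(n_total)
    for k in range(n_total):
        zk = zp[k : k + 2 * L + 1][::-1]   # z_{k+L}, ..., z_{k-L}
        y = c @ zk                          # input to the slicer
        s_hat = slicer(y)
        # training symbols while we have them, decisions afterwards
        ref = s_train[k] if k < len(s_train) else s_hat
        c = c + beta * (ref - y) * zk       # same LMS update in both modes
        decisions[k] = s_hat
    return c, decisions
```

The point of the training phase is exactly what the lecture says: the equalizer needs some place to start before its own decisions are reliable enough to feed back.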
If you are just doing positive feedback over and over again, your coefficients will only build up. Alright. So, how do you adapt the DFE is the question. I am not going to go into too much detail here, but I will simply say the following: it is actually not very different. You can define a suitable z_k and a suitable c_k. Do you agree with me that it is all a question of defining a suitable z_k and a suitable c_k which multiplies that z_k? But remember, the DFE has two different inputs: one input is the channel output, and what is the other input? It is actually from the slicer. So, your z_k will now have both the channel outputs and the slicer outputs in it. And what about the c_k? It will have the C coefficients and the post-cursor coefficients; it will have both. You then define the error so that it still becomes s_k minus c_k-transpose z_k; you define everything so that this expression is satisfied. Once you do this, the adaptation is exactly the same, you cannot do anything else: you simply add beta times e_k times z*_k. So, you adapt c_{k+1} as c_k plus beta e_k z*_k and you get your adaptive version, but remember that z_k has to be written very carefully. One way of doing it is the following. What will the c_k in this case become? Remember, in our DFE we picked the precursor filter to be anti-causal, with taps at index 0 and negative indices only, and the post-cursor filter to be strictly causal. So, if you write your c_k, you will get c_{-n}, say, down to c_0, and then d_1 all the way to d_K, whatever I used for the order. And remember what the DFE is: it is a C(z), then the slicer, and then a D(z) fed back with a plus/minus at the summer.
So, this C(z) we chose as the summation of c_m z^{-m}, with m going from minus n to 0, and D(z) is the summation of d_m z^{-m}, with m going from 1 to capital K. That is how we picked them. So, the coefficient vector c_k is going to be [c_{-n}, ..., c_0, d_1, ..., d_K]. Now, what will your z_k be if you want the slicer input to be c_k-transpose z_k? It is very easy to write down: it will be [z_{k+n}, ..., z_k, s-hat_{k-1}, ..., s-hat_{k-K}]. Is that clear? You simply write it in whatever way you want so that the error becomes s_k minus c_k-transpose z_k; the input to the slicer should always be c_k-transpose z_k. So, you choose suitable vectors c_k and z_k such that the input to the slicer becomes c_k-transpose z_k. Once you write it that way, there is no problem, and you can adapt it in exactly the same way. So, DFE versus linear equalizer makes no difference in the adaptation; well, actually it does make a difference in performance: you expect the DFE to perform better than the MMSE linear equalizer, so it will be somewhat better in several cases, and one might want to use a DFE rather than a linear equalizer in practice. But the adaptation method is basically the same: you have a loop which adapts your coefficients, and the input to the loop is always the error. How you compute the error varies depending on the structure you have for the equalizer; other than that, it is exactly the same. So, that is the DFE. One thing we did not do is derive the DFE for the constrained-complexity case; maybe I will give you an assignment for that, and that will take care of it. Ok. So, I think the quiz is coming up on Monday, and I want to spend the rest of the week solving problems from the tutorial sheets; tomorrow we have a class, and Wednesday we have a class, right?
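With the augmented vectors defined as above, one adaptation step of the DFE has exactly the same form as the linear case. A sketch under those definitions; the function and argument names are mine, and the minus sign at the feedback summer is taken as absorbed into the d-taps so that the slicer input is simply c_k-transpose z_k.

```python
import numpy as np

def dfe_step(c, z_ff, s_hat_past, s_k, beta):
    """One DFE adaptation step with the augmented vectors
        c_k = [c_{-n}, ..., c_0, d_1, ..., d_K]
        z_k = [z_{k+n}, ..., z_k, s_hat_{k-1}, ..., s_hat_{k-K}].
    z_ff       : feedforward samples z_{k+n}, ..., z_k
    s_hat_past : past decisions s_hat_{k-1}, ..., s_hat_{k-K}"""
    zk = np.concatenate([z_ff, s_hat_past])
    y = c @ zk                         # slicer input: c_k^T z_k
    e = s_k - y                        # same error definition as before
    c_next = c + beta * e * np.conj(zk)
    return c_next, y
```

As a sanity check, for a toy channel z_k = s_k + 0.5 s_{k-1} with one feedforward tap and one feedback tap (and correct past decisions), the taps should head toward c_0 = 1 and d_1 = -0.5, i.e. the feedback tap learns to cancel the post-cursor.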
So, we have two classes, and then Thursday is a holiday. I would ideally like to meet on Thursday as well; on Wednesday we will see how we are doing and, based on that, we might meet on Thursday for an hour or an hour and a half and do some more problems. So, the rest of the week will be problem solving, and on Wednesday afternoon we will have an exam on the sixth tutorial sheet; those of you who want to write it can show up on Wednesday afternoon. The sixth tutorial sheet I will not solve in class before Wednesday, so I will do it only on Thursday or so. The fourth and fifth sheets are what we will solve tomorrow and the day after, and if you really want to get anything useful out of it, please try to solve them ahead of time, or at least look at the problems and make sure you know the ideas behind them before you come to class tomorrow.