A warm welcome to the 20th lecture on the subject of wavelets and multirate digital signal processing. Recall what we did in the previous lecture: we built up the uncertainty principle to completion and found that nature imposes a fundamental limit. If we look at the time bandwidth product, we cannot go below a certain number; that is what we finally inferred. In fact, we also inferred which function gives us that minimum product. Let us therefore put the theme of today's lecture, and some of the important conclusions drawn last time, before ourselves, to set our discussion in perspective. What we intend to do today is to talk about what is called the time frequency plane and the idea of tiling the time frequency plane. Just as you would tile a floor or a surface, you can tile the time frequency plane. To recall the previous lecture, we had drawn the following conclusions. Question number 1: what is the minimum time bandwidth product that you can get? Recall that we defined the time bandwidth product to be the time variance multiplied by the frequency variance, and we said that this quantity, sigma t squared times sigma omega squared, cannot fall short of 0.25. That is, the time bandwidth product sigma t squared sigma omega squared for any function x belonging to L2(R) is always greater than or equal to 0.25. We proved this last time, and we also identified which function x(t) in L2(R) achieves this bound: x(t) equal to the Gaussian, namely e raised to the power minus t squared by 2, for example, is an example of a so-called optimal function, optimal in the sense of the time bandwidth product. Other optimal functions can be obtained by modulating this with a term of the form e raised to the power j alpha t squared. So we saw the more general optimal function.
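These conclusions are easy to confirm numerically. The following sketch (the grid limits and density are arbitrary choices of mine, not from the lecture) samples the Gaussian e raised to the power minus t squared by 2 on a fine grid, computes the time variance directly, and computes the frequency variance through the derivative of the function, a trick we shall use again later in this lecture:

```python
import numpy as np

# Sample the Gaussian x(t) = exp(-t^2/2) on a fine, wide grid.
t = np.linspace(-10.0, 10.0, 200_001)
dt = t[1] - t[0]
x = np.exp(-t**2 / 2)

energy = np.sum(x**2) * dt                   # ||x||^2 in L2(R)
sigma_t2 = np.sum((t * x)**2) * dt / energy  # time variance (centre is at 0)
dx = np.gradient(x, t)
sigma_w2 = np.sum(dx**2) * dt / energy       # frequency variance = ||x'||^2 / ||x||^2

print(sigma_t2 * sigma_w2)                   # ~0.25, the minimum possible
```

Both variances come out close to 0.5, so the product sits at the 0.25 bound, as the lecture asserts.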
The more general optimal function is of the form e raised to the power minus gamma 0 t squared by 2, or you could say plus gamma 0 t squared by 2 with the real part of gamma 0 negative; if you write minus gamma 0, of course, the real part is positive, whichever way you wish to write it. So gamma 0 could be complex in general, that is what I mean; essentially it is a Gaussian that is optimal. In a way that is good news, and in a way it is bad news. The good news is that we know what the optimal function is: the Gaussian. The bad news is that the Gaussian is unrealizable, in the exact sense, in physical systems. This may seem like a puzzling statement. One of the favourite probability density functions of most scientists and engineers is the so-called normal or Gaussian density, and in fact we go to the extent of saying that when we add together a number of independent, identically distributed random variables, in most situations the sum tends towards the Gaussian. That is what is called the central limit theorem, and it justifies the use of the Gaussian density in most statistical situations. So much so that we are almost obsessed with the normal distribution, and we use the variance to describe the spread around the mean: in a Gaussian, the spread around the mean is indicated by the variance, which tells us more or less in what range the variable lies, roughly from the mean minus a standard deviation to the mean plus a standard deviation. Well, so that is the Gaussian for you; then why are we saying that it is physically unrealizable? I am talking about a Gaussian time waveform. Take, for example, the exponential time waveform, or the exponential time waveform modulated by a sinusoid. These are easily realized by circuits comprising resistances, inductances and capacitances when excited, say, with a step, or for that matter with a sinusoid.
Such circuits give us either exponentially decaying sinusoids or exponentially decaying transients, and therefore those are easy to generate with physical systems. Unfortunately, there is no meaningful physical system which can generate a Gaussian in the same way. That is one of the fundamental reasons why the Gaussian is good news as a statistical density but bad news as far as waveforms go. I must mention that people talk about what is called Gaussian minimum shift keying (GMSK) in the context of digital communication. The word Gaussian there refers to a Gaussian pattern in the impulse response, whether in phase or in amplitude, but there again people really fight hard to realize a Gaussian filter. So you see that the Gaussian is difficult to realize in physical systems; it can only be approximated. Although it is good news that we know what the optimal function is, it is bad news that we cannot easily realize this optimal function with physical systems. Well then, that is bad news; let us bring some good news. If not the Gaussian, can we use a reasonable function, one we could perhaps realize with a cascade of two simple systems or something of the kind, and go close to the Gaussian? In other words, when we started with the Haar function we had a terrible time bandwidth product: infinite. Now, can we do a little better? Suppose we took a cascade of two systems, each of whose impulse response is essentially a pulse. What I mean by that is: instead of taking just one pulse, take a cascade of them. So suppose we have two systems, each of whose impulse response is essentially a pulse, say of the same width. One is a linear shift invariant system, and the other is another linear shift invariant system.
The impulse response here is essentially a pulse, and the impulse response there too is a pulse, both pulses of the same width, let us say T. We cascade them and note, of course, that together they form a composite LSI system, whose impulse response is the convolution of the two pulses, and we know that convolution very well. It looks something like this; I hardly need to work it out, it is familiar from any basic course on signals and systems: it is what is called a triangular pulse. Now, we can give this a physical interpretation. When you have an LSI system with an impulse response equal to a pulse, what you are essentially doing is a sample and hold operation: if an impulse results in a pulse, you are essentially sampling a function at a point and holding it for the duration of that pulse. That is the physical meaning of an impulse response given by a pulse. So if you have two such sample and hold operations in cascade, you are effectively talking about a triangular impulse response; there is some underlying physical meaning. The natural question to ask is: what can we say about the time bandwidth product of this triangular pulse? How good or bad is it compared to the Gaussian? That is the next question we shall answer. Now, you will remember that we do not need to worry where this triangular pulse lies, so we may as well centre it at 0. We do not need to worry how wide it is, as long as we keep it symmetric, so we can put it from minus 1 to plus 1; and we do not need to worry what the height is, so we may as well make the height equal to 1. Good.
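The cascade of two sample and hold systems can be sketched numerically. Assuming unit-width, unit-height pulses (my choice; the lecture leaves the width as T), the discrete convolution below reproduces the triangular impulse response:

```python
import numpy as np

N = 1000                              # samples per unit of width (arbitrary choice)
pulse = np.ones(N)                    # unit pulse: width 1, height 1
tri = np.convolve(pulse, pulse) / N   # composite impulse response (scaled by dt = 1/N)

print(len(tri))                       # 2N - 1 samples: the triangle has width 2
print(tri.max())                      # peak of the triangle, ~1.0
print(np.allclose(tri, tri[::-1]))    # symmetric about its centre
```

The result rises linearly to a peak and falls linearly back to zero, exactly the triangular pulse described above.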
All this is because of the invariance properties of the time bandwidth product: it is invariant to scaling of the dependent variable, invariant to scaling of the independent variable, and invariant to translation. So we shall find the time bandwidth product of this triangular pulse. Let us call the function x(t) and describe it: it is essentially 1 minus mod t for mod t between 0 and 1, and zero elsewhere. How would we find the time bandwidth product? We first obtain the time variance, and you will recall that, since the function is centred, that is, the centre of x(t) is at 0, the time variance is given by the norm of t x(t), the whole squared, in L2(R), divided by the norm of x in L2(R), the whole squared. Now, we shall require this norm of x in L2(R) squared again and again, so let us begin by calculating it. The norm of x(t) in L2(R), the whole squared, is easily seen to be 2 times the integral from 0 to 1 of (1 minus t), the whole squared, dt; the factor of 2 comes from the symmetry around t equal to 0, the areas on the negative and positive sides being the same. This is an easy integral to evaluate: make the substitution lambda equal to 1 minus t. Then d lambda is minus dt, so we could write minus d lambda, but the limits also change from 1 to 0, and therefore this is the same as 2 times the integral from 0 to 1 of lambda squared d lambda. This is easy to evaluate: lambda cubed by 3 from 0 to 1, giving 2/3. So much for the L2(R) norm. Now let us take the norm of t x(t); let us evaluate the norm of t x(t), the whole squared. Here again we will use symmetry: it is 2 times the integral from 0 to 1 of (t times (1 minus t)), the whole squared, dt.
Now, here it is not going to help very much to substitute a variable, because even if we substitute lambda equal to 1 minus t we get a 1 minus lambda factor, which is not so convenient. So let us keep it as an integral in t and evaluate it bravely, so to speak: that is 2 times the integral from 0 to 1 of t squared times (1 minus 2t plus t squared) dt, which is 2 times the integral from 0 to 1 of (t squared minus 2 t cubed plus t to the power 4) dt. These are easy integrals to evaluate: t cubed by 3 there, 2 times t to the power 4 by 4 there, and t to the power 5 by 5 here, evaluated from 0 to 1, which gives 2 times (1/3 minus 2/4 plus 1/5). Simple enough; let us simplify a little. 1/3 minus 1/2 plus 1/5 is (10 minus 15 plus 6)/30, that is 1/30, and twice that is 1/15. So 1/15 is what we have. Now, this is the L2 norm of t x(t), the whole squared, in L2(R); that is what I mean. So we have the time variance ready: the time variance is therefore 1/15 divided by 2/3, which is 1/15 times 3/2, that is 1/10. Now let us look at the frequency domain. In fact, the frequency domain will be a little easier, because we are going to use the principle of bringing the frequency-domain calculation into the time domain when computing the variance; that gives a very easy integral to evaluate. The frequency variance is given by the L2 norm of dx(t)/dt, the whole squared, divided by the L2 norm of x(t), the whole squared, and dx(t)/dt is a very simple function. In fact, dx(t)/dt is interesting: it has the appearance of a wavelet, being plus 1 on (minus 1, 0) and minus 1 on (0, 1), and it is very easy to calculate the energy in it. The L2 norm of dx(t)/dt, the whole squared, is simply 1 squared times 1 plus 1 squared times 1, looking at the areas of the rectangles; that is 2. And we already know the L2 norm squared of the function.
So we know the L2 norm squared of x: it is 2/3. Therefore the frequency variance turns out to be 2 divided by 2/3, which is 3. Now we can calculate the time bandwidth product: 0.1, the time variance, multiplied by 3, the frequency variance, and lo and behold, it is 0.3. That is very good news, actually, if you think about it. We know the minimum we can reach is 0.25, and we have come all the way down to 0.3; pretty good, considering we started from infinity, and just by cascading the system with itself once. Not bad at all. So although there was bad news in the uncertainty principle, that you cannot reduce the simultaneous localization in time and frequency below 0.25 in the sense of the time bandwidth product, there was good news in that we knew the optimal function, namely the Gaussian; then there was bad news again, that the Gaussian is physically unrealizable as a waveform; but now we have some good news, namely that we can come down to 0.3 with a very meaningful function. Now, we seem to have an alternation of bad news and good news, and yes, that alternation takes one more step. The bad news is that to go from 0.3 towards 0.25 you are going to have to work really, really hard. That is what nature does most of the time; I have mentioned this before: nature brings you tantalizingly close to an ideal and makes you work very hard to go any closer. In many fields, whether filter design or system design, to get an acceptable level of performance you may have to work only a little, but to get that final finish in performance one has to work really hard, and the difference between acceptable and fine performance may not be all that much. That is true here as well: to come close to the uncertainty bound is not too difficult; to go any closer is very difficult. In fact, one way of going closer is to repeatedly convolve the pulse with itself.
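The hand computations above are easy to confirm on a grid. The sketch below (grid density is an arbitrary choice) evaluates the triangle's norms numerically and recovers the L2 norm squared 2/3, the time variance 1/10, the frequency variance 3, and the product 0.3:

```python
import numpy as np

t = np.linspace(-1.0, 1.0, 400_001)
dt = t[1] - t[0]
x = 1 - np.abs(t)                    # the triangular pulse on [-1, 1]

energy = np.sum(x**2) * dt           # ||x||^2      -> 2/3
num_t = np.sum((t * x)**2) * dt      # ||t x||^2    -> 1/15
sigma_t2 = num_t / energy            # time variance -> 1/10
sigma_w2 = 2.0 / energy              # ||dx/dt||^2 is exactly 2, so -> 3
print(sigma_t2 * sigma_w2)           # -> 0.3
```

Note that the derivative's energy is taken as exactly 2, since dx/dt is just the two unit rectangles described in the lecture.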
So, if you took this sample and hold system and made a cascade of one more subsystem with it, you got the triangular pulse. Now take a cascade of three subsystems, that is, put one more in cascade: I have a cascade of three LSI systems, linear shift invariant systems, each of whose impulse response is essentially a pulse. I shall leave a little exercise for you to do here. Exercise number one: evaluate the overall impulse response. Exercise number two: obtain the time bandwidth product of that impulse response. Naturally, when you do that, you will be inclined to compare it with the number 0.3; you hope it should be less, but I leave it to you to see what it actually is. In fact, it will be interesting to carry this further, and even more interesting to see if you can come up with an inductive argument; that is not easy, by the way. Each time you put one more system in the cascade, do you do better in terms of the time bandwidth product? Do you go closer to 0.25? I leave it to you to take a couple of steps; it should be an interesting thing to do. Anyway, let us make one more remark. It is not just a compactly supported function like this one which can have a time bandwidth product of 0.3. I shall now use a very simple argument to show that the same 0.3 can come from a non compactly supported function. We use the principle of Fourier duality. We know that the Fourier transform of this triangular pulse in time is of the form sin(a f) by (b f), the whole squared, where a and b are constants you can suitably evaluate; the constants are not important, the form is. In fact, we can even sketch it: the Fourier transform looks like a squared sinc as a function of f.
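Without giving away the analytic exercise, the trend can at least be checked numerically. The sketch below (the discretization parameters are arbitrary choices) convolves the unit pulse with itself repeatedly and estimates the time bandwidth product of a cascade of n such sample and hold systems:

```python
import numpy as np

def pulse_cascade_tbp(n, samples_per_unit=2000):
    """Estimate the time bandwidth product of n cascaded unit pulses."""
    dt = 1.0 / samples_per_unit
    h = np.ones(samples_per_unit)          # one unit pulse
    g = h.copy()
    for _ in range(n - 1):                 # convolve (n-1) more times
        g = np.convolve(g, h)
    t = (np.arange(len(g)) - (len(g) - 1) / 2) * dt   # centre the support at 0
    energy = np.sum(g**2) * dt
    sigma_t2 = np.sum((t * g)**2) * dt / energy
    dg = np.diff(g) / dt
    sigma_w2 = np.sum(dg**2) * dt / energy
    return sigma_t2 * sigma_w2

print(pulse_cascade_tbp(2))   # the triangle: ~0.3
print(pulse_cascade_tbp(3))   # three pulses in cascade: below 0.3, closer to 0.25
```

The amplitude of the discrete convolution grows with each stage, but that does not matter: both variances are ratios, and the time bandwidth product is invariant to scaling of the dependent variable, as noted earlier.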
The important question to ask is: what is the Fourier transform if this squared sinc is treated as a time function? That is what Fourier duality gives us. It tells us that if we consider a time function of the form sin(a t) by (b t), the whole squared, its Fourier transform is a triangular pulse, zero outside a finite band; that is a consequence of duality. So what we were calling the time variance for the triangular pulse becomes the frequency variance for this sin(a t) by (b t) squared kind of function, and what we were calling the frequency variance for the triangular pulse becomes its time variance. In other words, when you take the Fourier transform of a function and ask for the time bandwidth product of the transform, the time bandwidth product is the same. For the function sin(a t) by (b t), the whole squared, the time variance equals the frequency variance of the triangular pulse, and the frequency variance equals the time variance of the same triangular pulse; therefore the time bandwidth product is very easy to calculate, and it is simply 0.3. Now we have a partial answer to the question that we raised last time: can you change the shape and maintain the same time bandwidth product? Yes, you can. In fact, what we have just done answers many questions, or brings up many different conclusions. One is that we have discovered one more kind of invariance of the time bandwidth product, and let us write it down: the time bandwidth product is invariant to Fourier transformation, and this is a very deep kind of invariance. Another conclusion is that we can have both compactly and non compactly supported functions, namely functions which are nonzero on a finite interval and functions which are nonzero over an infinite interval, with the same time bandwidth product.
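This duality claim can also be tested on a grid. Assuming the normalization in which the triangle on [-1, 1] pairs with sin(omega/2) by (omega/2), the whole squared (my choice of a and b), the dual time function can be written with np.sinc, which computes sin(pi u)/(pi u); the long grid is needed because the squared sinc decays slowly:

```python
import numpy as np

# y(t) = (sin(t/2) / (t/2))^2, the Fourier dual of the triangle on [-1, 1]
t = np.linspace(-1000.0, 1000.0, 2_000_001)
dt = t[1] - t[0]
y = np.sinc(t / (2 * np.pi))**2       # np.sinc(u) = sin(pi*u)/(pi*u)

energy = np.sum(y**2) * dt
sigma_t2 = np.sum((t * y)**2) * dt / energy   # ~3: the triangle's frequency variance
dy = np.gradient(y, t)
sigma_w2 = np.sum(dy**2) * dt / energy        # ~0.1: the triangle's time variance
print(sigma_t2 * sigma_w2)                    # ~0.3 again
```

The two variances swap roles compared with the triangular pulse, and the product stays at 0.3, illustrating the Fourier invariance of the time bandwidth product.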
So it is possible, as this example shows, to have two functions, one compactly supported and one not, with the same time bandwidth product. Now, with this remark, we would like to take the idea of the time bandwidth product further. Having identified two domains, let us put them together and bring out a new, two-variable domain. We shall henceforth talk of what is called the time frequency plane: essentially a plane in which one axis, say the horizontal, represents time and the other axis, say the vertical, represents frequency. What the uncertainty principle says is this: for each function x(t) in L2(R), you can think of the occupancy of x(t) in this time frequency plane; the occupancy is notional. On the horizontal axis, this occupancy can be thought of as extending from the time centre t0 to t0 plus sigma t on one side and t0 minus sigma t on the other. On the vertical axis, we centre it at capital omega 0, the frequency centre, and spread it down to omega 0 minus sigma omega and up to omega 0 plus sigma omega. So this, in some notional sense, is the spread. You can think of the function x(t) as located in a rectangle centred at (t0, omega 0), with a width, a horizontal spread, of 2 times sigma t and a vertical spread of 2 times sigma omega. So let me show the whole time frequency plane: here you have the time frequency plane, notionally, and a function x(t) occupies a rectangle in it, with centre (t0, omega 0) and spread 2 sigma t horizontally and 2 sigma omega vertically. A very interesting concept: a function in L2(R) lies in a certain region of the time frequency plane; it occupies a certain area there.
And what the uncertainty principle says is that this rectangle cannot have an area smaller than a certain number. How much? The area is 2 sigma t into 2 sigma omega, that is 4 sigma t sigma omega, which is greater than or equal to 4 times the square root of 0.25, that is 4 into 0.5, which is 2. So the area of the rectangle cannot be smaller than 2 units. Now, within that limitation you can trade the width against the height; that is the positive side of the uncertainty principle. And in fact you may wish to cover the time frequency plane with functions. What do we mean by covering the time frequency plane with functions? It means using functions which occupy different such rectangles, in such a way that they give you different pieces of information, in time and in frequency, about another function. So now we can talk of what is called tiling the time frequency plane. It essentially means covering the plane with such rectangles, in fact with rectangular tiles corresponding to such functions. Let us be specific about what we are saying. Take any other function to be analyzed, say y(t), and let the tool function, so to speak, be x(t). Parseval's theorem says that if I take the dot product of y(t) with x(t), essentially the integral of y(t) times x bar (t) dt, then the same thing happens in the frequency domain: it is equal to 1 by 2 pi (do not worry about the factor of 1 by 2 pi, it is just a normalizing constant; the important thing is what is inside) times the integral over omega of the Fourier transform of y times the complex conjugate of the Fourier transform of x. What this means physically is that when I take the dot product, the projection, of a function y(t) on such a tool function x(t) in time, I am doing the same projection in frequency.
So, by projecting y(t) on x(t) in time, I am essentially extracting information about y(t) in the time region between t0 minus sigma t and t0 plus sigma t; Parseval's theorem tells me that simultaneously I am also extracting information about the Fourier transform of y in the region between capital omega 0 minus sigma omega and capital omega 0 plus sigma omega. Simultaneously, I am extracting information about y(t) in that rectangle of the time frequency plane, using a tool function. So when I take the dot product of y(t) with such a tool function, I am immediately extracting information about y(t) in that rectangle of the time frequency plane, and now we have an interpretation: the rectangular region over which you extract information about a function y(t) cannot be smaller in area than 2 units. Well, the number 2 is not the point; you can always change that number by changing the units. We have used angular frequency here; if you used frequency in Hertz you would get a different number. That is not the point. The point is that there is a minimum rectangular area over which you can view y(t), a minimum joint resolution, in the sense that you cannot go finer than that resolution when you look at the two domains together. But the good news is that there are many different ways in which you can look at a small region, as long as it is within the uncertainty limit. In fact, tiling now has a further interpretation: if I wish to analyze a function, I think of the function in the time domain and the frequency domain together. Essentially, I am viewing the function in a joint domain, and I wish to see how the function looks in that joint domain. Let me give an example.
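The discrete analogue of this Parseval statement is easy to verify: for the DFT, the 1 by 2 pi factor becomes 1 by N. The sketch below (random test vectors are an arbitrary choice) checks that the time-domain dot product equals the frequency-domain dot product up to that normalizing factor:

```python
import numpy as np

rng = np.random.default_rng(42)
N = 1024
y = rng.standard_normal(N)        # the function to be analyzed
x = rng.standard_normal(N)        # the "tool" function

lhs = np.sum(y * np.conj(x))      # dot product in time
Y = np.fft.fft(y)
X = np.fft.fft(x)
rhs = np.sum(Y * np.conj(X)) / N  # dot product in frequency, normalized by 1/N

print(np.allclose(lhs, rhs))      # True: the discrete Parseval relation
```

So projecting in time and projecting in frequency really are the same operation, exactly as the lecture argues.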
Suppose I have what is called a chirp function. A chirp is named after the sound birds make when they chirp: broadly, or crudely, the chirp waveform has an instantaneous frequency that changes continuously in time. So it is something of the form sin of (omega(t) times t); the instantaneous frequency, so to speak, of the sine wave is a function of time. Now, an important question in analyzing chirp functions, which one encounters sometimes in radar or in sonar, is to trace this variation of the instantaneous frequency with time, and there the uncertainty principle hits hard. In the time frequency plane, if omega(t) is a constant function of time, we are talking about a frequency independent of time. Suppose instead it is a linear function of time, which is often true, something like omega(t) equal to a plus b t; this is called a linear chirp. What would you try to do with a tool function? You would try to trace this pattern, and that is where the uncertainty principle hits you. It says that you can only place rectangles of a minimum area along the line, and you can never really trace what happens within a rectangle. So, crudely, suppose you think of putting many rectangles, many tiles, all over this time frequency plane. You would see that certain rectangles are lighted up, so to speak: if I take the dot product of this function y(t), which has this linear chirp nature, with the set of tool functions sitting on the different tiles of the time frequency plane, then the tiles on which the function is prominent would be lighted up, meaning the magnitude, the intensity, of the dot product would be large there.
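The lighted-tiles picture can be sketched with a crude short-time Fourier analysis. The chirp parameters, sampling rate and window length below are all arbitrary choices of mine; each windowed FFT is one column of tiles, and the peak bin is the tile that lights up:

```python
import numpy as np

fs = 1000.0                               # sampling rate in Hz (assumed)
t = np.arange(0.0, 2.0, 1.0 / fs)
# Linear chirp with instantaneous frequency f(t) = 50 + 100 t Hz,
# i.e. phase = 2*pi*(50 t + 50 t^2).
y = np.sin(2 * np.pi * (50 * t + 50 * t**2))

win = 128                                 # tile width ~ win/fs s, tile height ~ fs/win Hz
for start in range(0, len(y) - win, 512):
    seg = y[start:start + win] * np.hanning(win)
    spectrum = np.abs(np.fft.rfft(seg))
    f_peak = np.argmax(spectrum) * fs / win       # centre of the brightest tile
    t_mid = (start + win / 2) / fs
    print(f"t ~ {t_mid:.2f} s : peak ~ {f_peak:5.1f} Hz (true {50 + 100 * t_mid:5.1f} Hz)")
```

The peak follows the line a + b t, but only to within the fs/win bin height of roughly 8 Hz; making the bins finer by lengthening the window widens each tile in time. That is exactly the trade-off the uncertainty principle imposes.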
Now, all that this would indicate is a set of discrete points. If you look back at the time frequency plane, each of these rectangles would correspond to a single point, so the points that lie on the line would be lighted up, but you cannot resolve anything finer than a point. Of course, these points would be seen to lie on a straight line, but you would not know what happened between the points; that is what the uncertainty principle says. You cannot get the instantaneous frequency as a function of time exactly, but you can get it as closely as you desire by taking smaller rectangles: the smaller the area of the rectangles you take, within the uncertainty principle of course, the better you can make this estimate. That is one of the meanings of the time frequency plane and its tiling; we shall see more in the next lecture. Thank you.