So we have been looking at the prediction error method, and let me take a quick review of where we stand. We have this data set, y and u, and we want to develop a model which is in general of this form: G is the model with respect to the known inputs, typically the manipulated inputs, and H(q) is the model with respect to unmeasured, unknown disturbances. Now, u here can include measured disturbances if you make some simplifying assumptions; that is why I keep saying "known inputs". The manipulated inputs that go out of a computer are piecewise constant, whereas the disturbances that you measure are not. For example, suppose feed water is used for cooling some system and you are measuring the feed water temperature. That temperature keeps changing throughout the day because the atmospheric temperature keeps changing, so strictly speaking it is not piecewise constant. But if you are measuring fast enough, you can assume that it is piecewise constant and model it that way. So u here could be known inputs in general, or it could be just the manipulated variables, which are piecewise constant. This part, the H component, is everything that is not explained by the known inputs; that is the only correct interpretation of it.
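For reference, the model form being pointed to on the slide is presumably the standard one below. The symbol e(k), a zero-mean white noise sequence driving the disturbance model, is the usual convention and is an assumption here rather than something stated at this point in the lecture.

```latex
y(k) = G(q)\,u(k) + H(q)\,e(k)
```

Here G(q) maps the known inputs u(k) to the output, and H(q)e(k) stands for everything in y(k) that the known inputs do not explain.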
So this component captures unknown disturbances, it captures measurement errors, and it captures errors because of approximations: you are using a linear model, while the true system is in general nonlinear; very rarely is a real system perfectly linear. So this captures everything that is unknown, everything that is not captured by the G component. The way we do modeling is to use one-step-ahead predictions. We develop this one-step-ahead predictor, and developing such predictors for different simple model forms is part of the exercise that we are going to do tomorrow. Then we estimate this prediction error. The method is called the prediction error method because we minimize the sum of the squares of the prediction errors. Here y(k) is the measurement at instant k, and y hat (k given k-1) is the prediction of y based on measurements available up to k-1. We are going to use this notation throughout the course: "k given k-1" means a prediction of y using measurements available up to time k-1. Theta is the vector of model parameters that you need to estimate. I was saying that we minimize the sum of squared errors, that is, we minimize the variance. Nobody stops you from minimizing the sum of absolute errors, or minimizing the maximum error, which is the infinity norm, or some other function. But the two-norm has some special properties, which I am going to discuss today: why this two-norm is so important, and why we can get some insights into parameter estimation if we use it. That is why we want to use the two-norm. So this method is called the prediction error method. What are the other approaches to system identification? There is one more method, which
is based on projections, and it is known as the subspace identification method. It has become very popular in the last 10 to 15 years. The nice thing is that you can just use simple matrix projections to come up with the model, whereas here, in the prediction error method, you have to use nonlinear optimization. Since you are using nonlinear optimization, it is very important that you give a good initial guess. How do you give a good guess? Right now Ljung's toolbox is doing it for us: we give it data and it gives us a model. There is never any problem for two models, the ARX model and the FIR model, which we have been looking at. These two models are very easy to identify from data, and that is why they are very popular in industry. The trouble with these models is that you need a large number of parameters, so you need a large data set, so you need to conduct the experiment for a longer time, which is a loss of production. So the models which are easy to develop have one kind of trouble, and the models which are difficult to develop have another: the trouble is shifted from long experiments to difficulty in solving the optimization problem. But difficulty in solving is probably easier to deal with than a longer experiment. A longer experiment means loss of production, which means money. Difficulty in solving a problem is handled offline: you can collect the data and then use some tricks, such as giving a good guess, and so on.
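A minimal sketch of why ARX models are easy to identify: the one-step-ahead predictor is linear in the parameters, so minimizing the sum of squared prediction errors reduces to ordinary least squares and needs no initial guess. The model orders, the "true" coefficients, the noise level and the helper names `simulate_arx` and `fit_arx` are all hypothetical, chosen only for illustration; this is not how the toolbox is implemented.

```python
import numpy as np

def simulate_arx(a, b, u, e):
    """Simulate y(k) = -a1*y(k-1) - ... + b1*u(k-1) + ... + e(k)."""
    y = np.zeros(len(u))
    for k in range(len(u)):
        for i, ai in enumerate(a, start=1):
            if k - i >= 0:
                y[k] -= ai * y[k - i]
        for j, bj in enumerate(b, start=1):
            if k - j >= 0:
                y[k] += bj * u[k - j]
        y[k] += e[k]
    return y

def fit_arx(y, u, na, nb):
    """Least-squares ARX fit; returns [a1..a_na, b1..b_nb] and the residuals."""
    n0 = max(na, nb)
    rows = []
    for k in range(n0, len(y)):
        past_y = [-y[k - i] for i in range(1, na + 1)]
        past_u = [u[k - j] for j in range(1, nb + 1)]
        rows.append(past_y + past_u)
    Phi = np.array(rows)                      # regressor matrix
    theta, *_ = np.linalg.lstsq(Phi, y[n0:], rcond=None)
    residuals = y[n0:] - Phi @ theta          # one-step-ahead prediction errors
    return theta, residuals

# Hypothetical experiment: random binary input, small white equation noise.
rng = np.random.default_rng(0)
N = 2000
u = np.sign(rng.standard_normal(N))
e = 0.05 * rng.standard_normal(N)
a_true, b_true = [-1.5, 0.7], [1.0, 0.5]
y = simulate_arx(a_true, b_true, u, e)

theta_hat, eps = fit_arx(y, u, na=2, nb=2)
print("estimated [a1, a2, b1, b2]:", np.round(theta_hat, 3))
print("residual standard deviation:", round(float(eps.std()), 4))
```

For ARMAX or Box-Jenkins structures the predictor is no longer linear in the parameters, which is where the nonlinear optimization and the need for a good starting point come in.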
So if you ask me what you should do, whether you should go for an ARMAX or Box-Jenkins model or an ARX model, I would say: rather than stopping at ARX, go for an ARMAX or Box-Jenkins model and try to give it a good guess. For example, even if you have a small data set, first identify an ARX model. It will be bad, because the data size is small, but you can use it to create a good initial guess for your ARMAX or Box-Jenkins model and then proceed. If you use your knowledge intelligently, you can plan your experiments very well and save money; that is important. Now let us get into the properties of the model. I talked briefly about these steps in model development, and we also said something about model structure selection. One particular thing here: what I have introduced to you is just the tip of the iceberg, just the beginning of what system identification is. You should not stop at my notes. I have uploaded two more documents on Moodle; I do not know how many of you have seen them. One is a set of slides by Professor Ljung. Why Professor Ljung in particular? He is a well-known authority in this area, and there is a very nicely written book by him, "System Identification", which I have mentioned here. Personally, I do not feel it is a book for beginners. The other book I have mentioned here, by Söderström and Stoica, is a better book for beginners; the way it introduces the material is much easier to digest. But Ljung's book is one of the standard references. Any time you have doubts about anything in system identification, you can go back to Ljung and you will see that he has discussed it. You may only realize that such a question or such a problem exists after one or two years, but he has already thought about it. The MATLAB toolbox was actually written by Professor Ljung. There is also a compressed version of his lecture notes, about 30-40 pages, which
also I have uploaded there. So there are two things: one is the presentation slides of a workshop he conducted at the University of Alberta in 2004, and the second is his condensed lecture notes on system identification; I think there are some 50-60 pages. These are for those of you who are going to use these techniques in future, and these techniques are used everywhere. Suppose somebody decides to go into finance and wants to do share price modeling. You can do that: it is a time series, and you can model it as a stationary or a non-stationary process. If you view it as stationary, you can find a transfer function driven by white noise, and that gives you a model for share price fluctuation. What is k here? It is day by day: today's average price, tomorrow's average price, yesterday's average price. The question is, can I predict? Why do we develop models? So that we can do predictions, so that we can forecast. That part will come a little later, when we start using these models for forecasting. Now, you have this y minus y hat; epsilon is y minus y hat. You may want to suppress certain frequencies that are present in y minus y hat. Let us say y has measurement errors which are at very high frequency, and you know these are not going to be useful in modeling. You can filter those errors using a low-pass filter or a band-pass filter, and so you can have an objective function in which you minimize a filtered error, and not the innovations or the error itself. This signal y minus y hat, this epsilon, is often called the residuals, or model residuals; it is also called the innovations, or the modeling error, whatever you want to call it. "Residuals" is the very commonly used term. Yes, epsilon f will not be white noise. We stop when epsilon is white noise, not when epsilon f is white noise. Epsilon f is filtered; how can a filtered signal driven by a white
noise be white noise? Epsilon f will not be white noise; you have to stop when epsilon is white noise. Yes, so the question is whether, by minimizing epsilon f, you will be able to find an epsilon which is white. By properly choosing the model order you should be able to; that is my first-cut answer. Whether it is always guaranteed, I have to go back and check, but I think it should be possible. Just because you are minimizing the filtered value does not mean that you cannot go back to epsilon. See, you are minimizing the filtered value, but that filter does not enter your model anywhere. The filter is only used to knock off certain frequencies, so as to emphasize particular frequencies in your model. The way it would probably enter afterwards, if you want to use the model, is that you have to use filtered y and filtered u. Now, epsilon f cannot be white noise. By definition epsilon f is a correlated noise, because there is a transfer function multiplying epsilon; epsilon can be white noise, epsilon f cannot. What I mistook in your question last time was this: if you minimize this objective function, is it possible to get an epsilon which is white noise? I think it should be possible, but what the guarantees are, I have to check. As for the filter, it should correspond to the system bandwidth, and the system bandwidth is something which you have to know as an engineer: which bandwidth is relevant for my control, and which bandwidth I should cut off. That is where the choice of this filter can play a very critical role in identification. How do you choose this filter? So this modeling is, in some sense, not completely black box. When you choose this
filter, or when you choose the model order, you have to have some idea about the system, so in some sense it is a gray box model. You have to go and do some preliminary experiments with the system, look at the data, and do some spectral analysis; you cannot do it just like that. As I mentioned, for multiple-input multiple-output systems the ARX modeling scheme can be adapted very easily. The trouble with ARX is the large number of parameters. The other possibility is output error, ARMAX, or Box-Jenkins models; typically they are developed as multiple-input single-output models. So if you have a system which has two inputs and two outputs, you develop two models, one for output one with the two inputs and one for output two with the two inputs, and then you club them together. I am going to talk about how to club them together today. Now, even if I wanted to do this next part very briefly, I would need at least three or four lectures, so I am just going to talk about it very briefly and not go into details. If you want to know more, I have notes in the appendix at the end, where I have explained the basis for this frequency-domain analysis. The analysis is meant to give insight into two things: the bias error and the variance error. I will talk about what the bias error and the variance error are, and about the final expressions which are derived after all this analysis. For the time being, take an engineer's approach and say: there is a derivation, which is true; let me concentrate on the final result and see how I can use it to analyze my model or to plan my experiment. If I can do that, that is enough. Because ultimately, even if you understand all the derivations, how to use that particular result to plan an experiment is more important,
not how you arrive at the derivation. As an engineer I am more interested in the final result, which I can use for analysis, so I am going to talk directly about the final result. This is power spectrum analysis, and it is based on the Fourier transform of the autocorrelation and cross-correlation, a very powerful tool. You are able to do this because you are using the two-norm: you can take the Fourier transform and talk about an interpretation in the frequency domain because you are working in a Hilbert space, where the two-norm is available to you, and you can move back and forth between different viewpoints, the time domain and the frequency domain. To cut a long story short, we just want to look at what kinds of errors can occur when you identify a model from data. Fundamentally there are two types of errors. Actually, I would say there are three types, but the third one comes from approximating a nonlinear system with a linear system; let us ignore that right now and assume that the true plant is perfectly linear. Under that situation, what are the errors that can occur? The first is this: the plant is perfectly linear and I have data, but I do not know the order of the plant. So I choose some guess, a second-order transfer function or a third-order transfer function, and ultimately I am going to use some criterion, like the Akaike information criterion, to make a call on which model is good. So I do not know what the truth is. What I have done here is this: the G with theta N hat represents the model transfer function that you have estimated from data, and G(q) is the truth: this is the true one, this is the estimated one. So I am putting this theta star and theta
star in between. This difference is what I would call a structural bias. Structural bias comes because the true model is different: the true model has, let us say, seven parameters, and I am using a model with two parameters. Yesterday we saw one such problem: an FIR model with two parameters, which we were trying to identify using one parameter. It is a similar situation, and it arises because you do not know the true order; you guess. So the first type of error is a structural error: the true differential equation is, say, 10th order, and you are modeling it as third order. Whatever you do, a third-order differential equation cannot imitate the behavior of a 10th-order equation. You can bring them close, but 10th order is 10th order and third order is third order; a third-order model cannot imitate a 10th-order system beyond a point. So this is the first thing that you have to know: there is a structural error when you are identifying a model. Now suppose you know the structure. What is the other type of error? The other type is the variance error, which arises because of the finite data length. You know that if you take infinite data you will get the perfect model, but you cannot run the perturbation test for infinite time; you have to stop and take finite data. Finite data gives rise to errors which are called variance errors. So the total error in the estimation is a combination of bias error and variance error: bias error comes because of structural mismatch, and variance error comes because of finite data length. Variance errors also come, of course, because of unmeasured disturbances and noise, but the two things are tightly related: if you want to reduce the influence of noise on the estimation, you had better
take a larger data length. We will see what the relationship is just by looking at the final expressions. Let me go over this idea of bias error again. In chemical process control there are two model forms which are very often used. I will write them in the Laplace domain; you can convert them into the time domain or into discrete time. The first model is Kp divided by (tau s + 1), times e to the power minus theta s. This is called FOPTD: FO is first order, P stands for plus, TD is time delay, so first order plus time delay. You have one time constant, a gain, and a time delay; a very simple model, and many times useful to approximate high-order systems. You have some distillation column which is a 100th-order system; you do not want to model this 100th-order system, so you model it like this. The other model which is very popular is called SOPTD: Kp upon (tau squared s squared + 2 tau zeta s + 1), times e to the power minus theta s. This is second order plus time delay. So these models are very popularly used. They are low-order models, first order or second order, using one pole or two poles to approximate a system which is of high order. Of course, when you convert the SOPTD model into discrete form, y(k) is equal to (beta 1 q to the power minus 1 plus beta 2 q to the power minus 2) upon (1 plus alpha 1 q to the power minus 1 plus alpha 2 q to the power minus 2), times q to the power minus d, acting on u(k). That is, we get a second-order difference equation. This one is a second-order differential equation, and this one is a second
order difference equation, so they are inter-convertible. But again, here you are trying to capture everything using beta 1, beta 2, alpha 1, alpha 2 and this delay d. Even though the true system might be of very high order, you tend to use a low-order model. Suppose you have a multiple-input multiple-output system: between each input-output pair you tend to assume a model of this form or this form, because otherwise the overall order of the multiple-input multiple-output model starts blowing up; we will see how that happens. So often this smaller-dimensional form is convenient. In the books on process control you will find special methods to identify these models, this Kp, tau and theta, from a step change and so on, and this has been a very popular method of modeling. So where is the trouble? Let us look at a scenario where you have an 8th-order transfer function, 1 upon (10s + 1) to the power 8. I did this identification exercise and decided to model it as first order plus time delay: this is my time constant and this is the time delay, and this combination approximates the transfer function. How do you typically check in process control? You compare the step responses. The step responses seem to match pretty well, and the gain is correct. The blue line is the approximation and the green line is the true plant, and you will say, well, this is not a bad approximation; it is matching quite well in the step response. The moment you compare the frequency responses, you see what the problem is: the frequency responses do not match. They match in this region, which is the low-frequency region, and because there is a good match in the low-frequency region you see a good match in the step response. But if you inject a signal which is at high frequency, there will be significant mismatch between the model behavior and the plant behavior. Now suppose
it happens that when you are operating the plant, the relevant frequencies lie in this region. Then there is a big mismatch; the model is wrong there. And you use the model for all kinds of things, you use it for control, so you are using a wrong model in this frequency region. Now there are two solutions. One solution is not to use a first-order model but an 8th-order model. But in this case I know that the order is 8. In a real plant, suppose the order is 100; am I going to use a 100th-order model? Identifying 100 time constants of a 100th-order model is not an easy task. So I do not want to use a 100th-order model; I still want to use a first-order, second-order, or maybe third-order model, not a very high-order one. Then the question is this: for this particular approximation the good match is in this region, but I know the relevant frequencies are in that region. Can I shift the match from here to there? I do not mind if there is a mismatch here, but I want a good match there. So, what I am trying to say is, let us be practical: the real system is high order, and I am always going to develop a low-order model, so there is always going to be this structural mismatch between the truth and the model.
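The low-frequency match and higher-frequency mismatch can be checked numerically. The sketch below assumes the plant is 1/(10s + 1)^8 and, since the time constant and delay used on the slide are not given in the transcript, picks hypothetical FOPTD parameters with Skogestad's half rule (tau = 15, delay = 65, gain = 1). The function names `g_true` and `g_foptd` are illustrative only.

```python
import numpy as np

def g_true(w):
    """Frequency response of the assumed 8th-order plant 1/(10s+1)^8 at s = jw."""
    return 1.0 / (10j * w + 1.0) ** 8

def g_foptd(w, gain, tau, delay):
    """Frequency response of gain * exp(-delay*s) / (tau*s + 1) at s = jw."""
    return gain * np.exp(-1j * w * delay) / (tau * 1j * w + 1.0)

# Hypothetical FOPTD parameters (half rule applied to eight lags of 10 time units).
gain, tau, delay = 1.0, 15.0, 65.0

w = np.logspace(-3, 0, 7)                      # frequency grid, rad per time unit
err = np.abs(g_true(w) - g_foptd(w, gain, tau, delay))
for wi, ei in zip(w, err):
    print(f"w = {wi:8.4f}   |G_true - G_model| = {ei:.4f}")
```

With these assumed parameters the two responses are nearly indistinguishable around w = 0.001, while around w = 0.1 the model magnitude is several times the plant magnitude, which is the kind of mismatch a step-response comparison hides.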
Now, in which frequency band you want the mismatch to be low and in which band you can let it be high is your choice. But how do you decide, how do you analyze that? That is where this frequency analysis comes into the picture. The derivation which I am presenting on this one page would require a lot of time to go through; there are a lot of things you have to accept, and finally we just look at the result. You can go back to the appendix and try to understand the derivations. Is everyone with me on this expression? How do you get the error? We have this expression for epsilon. Now, substituting for the truth and for the estimate, and doing some algebra, which I am not going to do here, you can show this. What are the terms here? What is the convention when we use a hat? G hat is the estimate of the truth, and when I say G, it is the truth. So I have expressed the prediction error in terms of four things: true G, true H, estimated G, and estimated H. This I can do through some algebra; just believe this right now. You can see why: y is a function of the true G and true H, y hat is a function of the estimated G and estimated H, and epsilon is the difference between the two. So just believe me that you can do some algebra and get this expression. How am I going to use this? How do you estimate the parameters? By minimizing the variance. Is everyone with me on this? I am minimizing the variance and estimating the parameters. Now, when I actually estimate the parameters I am taking a finite data length, but doing analysis with finite data length is
very difficult; you have to do the analysis by taking the limiting case. What is my limiting case? N tends to infinity. As N tends to infinity, what is this quantity? It is R(0), and R(0) is the variance of the signal, here the variance of epsilon. Any doubt up to this point? Now comes the trick. Using Parseval's theorem you can interpret this limiting quantity, the limit as N tends to infinity, in the frequency domain. I am going to use this expression for converting into the frequency domain, and if I convert, I finally get this particular term. I have taken the spectrum of epsilon, which is given by this quantity here. When you have a transfer function, you can obtain the spectrum by putting q equal to e to the power j omega; the details are given in the appendix, so go back and check. So I can relate this variance to the spectral domain. What is this phi? It is the spectral density of epsilon, and it is given by this quantity, which comes from here. It looks very complex when you see it for the first time, so let me explain why it is going to be useful. Look at the things here. What is this quantity, G of e to the power i omega? It is the frequency response of the true system. And this one is the frequency response of the estimate. So I am saying that the difference between this frequency response and that frequency response is weighted by the spectrum of the inputs. What is the consequence? This difference is weighted by
the input spectrum. See, if you are doing optimization and the objective is a sum of squares of certain terms, some with higher weightage and some with lower weightage, what is the tendency of the optimizer? Wherever there is higher weightage, it will try to reduce that term more and more; wherever there is lower weightage, it will not bother much, because if it changes that variable the objective function hardly changes. The optimizer is more sensitive to the components with higher weightage and less sensitive to the components with lower weightage. So how this difference actually behaves in the estimated model depends upon how you choose the input frequency spectrum. Shaping this spectrum can be used to shift the difference to different frequency zones. Let us go back: I have these two input signals, and these are their corresponding power spectra. What is the meaning of this? This signal has low power at low frequencies, high power at middle frequencies, and low power at high frequencies, whereas this other signal has high power at low frequencies and almost no power at middle and high frequencies. Now, if I perturb the plant using the signal which has high content at low frequencies, then the optimizer will work in such a way that this difference is small at low frequencies. So the model is good, in the sense that the frequency responses match, in the low-frequency region, and it
does not bother about matching the frequency responses in the high-frequency region. Why? Because this integral is weighted by the input spectrum, and this spectrum is low at middle and high frequencies. So minimizing this objective function in the time domain implicitly reduces the frequency-domain mismatch at low frequencies. This is the insight which this equation gives. It is not possible for the one-norm or the infinity norm; it is possible with the two-norm, because with the two-norm you can use Parseval's theorem and the Fourier transform, get into the frequency domain, and do this analysis. Even if I derive this equation, finally I am going to say that only this part is important; do not bother about how you arrive at the equation. This frequency-domain expression tells you how to plan your experiments, which is very critical. If I excite the plant using this other signal, it emphasizes the middle frequencies, and the high and low frequencies are not so important. What would happen is that the good match, which right now is here, will shift from here to there; there will be mismatch at low frequencies and mismatch at high frequencies. If I use white noise, all frequencies are weighted equally. But as I told you, using white noise for perturbation is possible only in computer simulations; perturbing a real plant with white noise is not practical. So even though it is ideal, it is not practical, and you are forced to make a frequency choice. What frequency choice did you make, what is the frequency range of your interest, and how does it influence the parameter estimates and the frequency response? That is given by this expression. This expression tells you that I can shape
this difference by shaping the input spectrum. That is the only message I want you to take; there is nothing more. Even if you understand the derivation, finally you have to understand that I can shape this difference. This interpretation is possible only because of Parseval's theorem, only because you are using the two-norm: with the two-norm and the Fourier transform you can interpret this in the frequency domain. So how I should plan my input excitations can be understood through this analysis. That is why Ljung's book is filled with frequency-domain analysis along with time-domain analysis. From this entire complex expression the take-home message is only this: the input spectrum can be chosen intelligently to minimize the difference between the frequency response of the truth and that of the model in a certain frequency band. Now, what is the effect of adding that filter I talked about? If you add the filter and do all the calculations, the filter spectrum will appear in this equation too; I have not shown it here. It turns out that this filter is another way of shaping: one way is to shape the inputs, the other way is to choose the shaping filter. So if you have not chosen the inputs correctly, you can choose the shaping filter correctly and try to mend the error that you made in choosing the signal.
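The shaping idea can be illustrated with a small deterministic sketch. It fits a first-order model to a hypothetical third-order plant by minimizing the input-spectrum-weighted squared difference between the frequency responses, which is the part of the limiting criterion that depends on G when the noise model is held fixed at 1 (an assumption made here for simplicity). The plant, the two spectra `phi_low` and `phi_mid`, and the function names are all hypothetical.

```python
import numpy as np

w = np.linspace(1e-3, np.pi, 2000)       # frequency grid, rad/sample
z_inv = np.exp(-1j * w)                  # q^-1 evaluated on the unit circle

# Hypothetical "true" plant: three identical unit-gain first-order lags in series.
p = 0.8
G0 = ((1 - p) * z_inv / (1 - p * z_inv)) ** 3

def weighted_error(a, b, weight):
    """Sum over frequency of weight * |G0 - b*q^-1/(1 - a*q^-1)|^2."""
    G = b * (z_inv / (1 - a * z_inv))
    return np.sum(weight * np.abs(G0 - G) ** 2)

def fit_first_order(weight):
    """Grid search over the pole a; the best gain b for each a is closed-form."""
    best = None
    for a in np.linspace(0.0, 0.995, 200):
        F = z_inv / (1 - a * z_inv)
        b = np.sum(weight * np.real(np.conj(F) * G0)) / np.sum(weight * np.abs(F) ** 2)
        cost = weighted_error(a, b, weight)
        if best is None or cost < best[0]:
            best = (cost, a, b)
    return best[1], best[2]

# Two hypothetical input spectra: power at low frequencies vs. at middle frequencies.
phi_low = 1.0 / (1.0 + (w / 0.05) ** 2)
phi_mid = np.exp(-((w - 0.5) / 0.1) ** 2)

a_low, b_low = fit_first_order(phi_low)
a_mid, b_mid = fit_first_order(phi_mid)

for name, (a, b) in {"low-frequency input": (a_low, b_low),
                     "mid-frequency input": (a_mid, b_mid)}.items():
    print(name, "-> a =", round(a, 3), "b =", round(b, 3),
          "| error weighted by phi_low:", round(float(weighted_error(a, b, phi_low)), 4),
          "| by phi_mid:", round(float(weighted_error(a, b, phi_mid)), 4))
```

Each fitted model does at least as well as the other one in the band where its own input had power, and gives up accuracy elsewhere. A prefilter on the prediction error would enter the same weighting as an extra factor, which is why it acts as a second shaping handle.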
So there are tricks which you can see only when you go to the frequency domain; it is not possible to see them in the time domain. The other thing that is done in Ljung's book in quite some detail is the variance error. What you can show is that the variance of the estimated frequency response, which you can think of as a possible error band around the frequency response, is related to these two terms. Small n here is the number of model parameters, capital N is the data length, this is the noise spectrum, and this is the input spectrum. This ratio of noise spectrum to input spectrum is called the noise-to-signal ratio. Just look at this expression and tell me how you will make the variance error small. One way is to choose small n, that is, use fewer parameters. If you are using a large number of parameters, you had better choose a large capital N. Apart from this, you have one more quantity that you can manipulate. What is phi v here? v is the noise; the noise spectrum is there in the plant. What is in your hands is the input spectrum. So I can choose the input spectrum in such a way that the noise-to-signal ratio becomes insignificant. If the signal dominates over the noise, this ratio is small and the variance error is low. When you do this modeling and perturbation, people will talk about the signal-to-noise ratio. The signal-to-noise ratio is the other way round: phi u by phi v is called the signal-to-noise ratio, and phi v by phi u is called the noise-to-signal ratio. Sometimes people use one and sometimes the other. So you have to make the noise-to-signal ratio as small as possible, or the signal-to-noise ratio as large as possible: the signal should dominate,
noise should be small. And this insight does not come from looking at the sum of squares of errors; it comes only when you look at frequency-domain expressions. That is why frequency-domain analysis is quite important when it comes to identification. So how do you reduce variance errors? You reduce them by choosing a large data length and by choosing the signal-to-noise ratio to be large: if Φu/Φv is large, then Φv/Φu is small. All this analysis is extremely important in getting a good model. Though I am just giving you the final bits of what is useful, you cannot blindly use the toolboxes that are available now unless you understand all this theory. These one and a half months of lectures are just to sensitize you to this; there is a lot more to it than what I am doing.

Yes, you can get the match wherever you want, so you can shift the match. The idea is that you have to live with bias, because real-world problems are of very high order. The take-home message from this is that real-world problems are actually of high order and you are always going to develop a model which is of low order, so the bias is going to be there. Bias is part of identification. All that you can do is shift the emphasis: in which frequency band do you want a good match, and in which frequency band can you live with mismatch? How do you do that? By choosing the input spectrum properly, or by choosing the shaping filter properly. So there are different ways of handling this. Actually, if you look carefully, this Ĥ is itself a shaping filter for this mismatch: the term Φu/|Ĥ|² appears here. This is the signal spectrum and this is the noise spectrum, so the signal-to-noise ratio also shapes the mismatch. That is why people often say that you do noise modeling not because you want to identify the disturbance model; you do noise modeling because you want to
get a good deterministic model. So noise modeling is a way of shaping this term. That is the idea.

Anyway, let us move on. This brings to an end the lectures on system identification. Hopefully, through the combined effect of the mid-sem, where you solve problems, and the actual simulations when you start doing them, you will learn much more about this than from these lectures. Now we want to move to control. I have spent almost 40% of my time talking about modeling. Well, in reality, when you go to a plant to implement advanced control, 70% of your time will go into modeling and 30% into control. Once you have a good model, you are there. What we will be doing now is mostly algebra, and that is much easier, because once you translate reality into a model which is a nice linear difference equation, you are in the world of linear algebra and you can do all kinds of things. Now, as I said, there are two prominent viewpoints in control: one is the transfer function viewpoint, the other is the state-space viewpoint. I do not want to profess one or the other, but I belong to the state-space viewpoint. I like to work with state space; it is all linear algebra, simple linear algebra of matrices. So I am going to convert my model back into state-space form. I identified this transfer function. We will talk later about how to deal with e(k) and all that; right now let us look at a single-input single-output transfer function. How do I convert this into the standard form? That is my question. Afterwards I am not going to bother about how you got the standard form; we have it. It is quite likely that I had a mechanistic model, then I did linearization, then discretization, and got this model. Or I had this system to play with: these are the inputs, these are
the outputs; I introduced some fluctuations, recorded the output as a function of time, and then, using this input and output data and some system identification tool, I came up with this. I do not care, finally, how I came up with this model. I could have identified an ARMAX model, a Box-Jenkins model, an output error model, whatever you chose; finally I am going to convert it into this form and work with it. So which route you took to this form is not important afterwards; we could come to this form any which way, and it is not going to matter. I am going to talk about one possible way of doing realizations. The possible ways are given in the notes; I will explain one, and there is one which is a little complex to understand. We will try to see whether we can cover the first two today. Doing this is not so difficult; it is pretty easy. What is the aim? The aim is to choose Φ, Γ and C in such a way that C (qI − Φ)⁻¹ Γ is the same as G(q); not z, it should be q. I want to choose C, Φ and Γ in such a way that this G(q) is the transfer function. There are infinitely many ways of doing it, not a finite number; I will prove that also. But I am going to talk about some popular ways, and why they are popular will become clear after some time, when we start doing controller development. The first form is called the controllable canonical form. What I have done here is introduce one pseudo variable, written ξ here. I have rewritten this equation by introducing an intermediate variable, which is this ξ(k). It is the same equation; I am saying that this operator operating on ξ
(k) is the effect of u(k) entering the system, and y(k) is given by the same transfer function; the trick is just that I have introduced one more term in between. Now this first equation is equivalent to this difference equation. Just check: this is q³, so I will get ξ(k+3); this is q², so ξ(k+2); then q gives ξ(k+1); and then ξ(k). Is this okay? Yes, all three are negative here. They would be positive on the left-hand side; I have taken them to the right-hand side, and that is why they are negative. Is everyone with me on this? So I am going to define three state variables: x1(k) is ξ(k+2), x2(k) is ξ(k+1) and x3(k) is ξ(k). Those of you who have done numerical methods have done something similar: if you remember, converting a high-order differential equation into n first-order differential equations. This is the equivalent thing for a difference equation. This term I am calling x1, this term x2, and this term x3; I have defined three states. Now x1(k+1) will be ξ(k+3), and I am going to use that next. See what is done here: x1(k+1) is nothing but ξ(k+3), so this equation is written in terms of x1, x2 and x3. So do you see, I have converted a third-order difference equation into three first-order equations. In general, an nth-order difference equation can be converted into n first-order equations. Is this transformation okay? There is a relationship between these variables, and that is captured through these equations. x2(k+1) will be ξ(k+2), which is the same as x1(k). Have I written it correctly? No, there is an error here. Now just check: x2(k+1) will be ξ
(k+2), so that is x1(k); x3(k+1) will be ξ(k+1), so that is x2(k). So I have three first-order difference equations in place of one third-order equation. It is not coming up here. Is this okay? Then I just convert this into what is called the controllable canonical form: I rearrange these bottom three equations into matrix form and get this particular form. Now I want to get y = Cx. That is very easy, because y(k) = (b1 q² + b2 q + b3) ξ(k); if you multiply it out you get b1 ξ(k+2) + b2 ξ(k+1) + b3 ξ(k), where ξ(k+2) is x1(k), ξ(k+1) is x2(k) and ξ(k) is x3(k). So I have y = Cx. The next one I want you to go back and read, and tell me if there is an error. There is another form called the observable canonical form. The derivation is a little more complex; just go back and read it. There is nothing fundamentally difficult to understand; it is just algebra to get another state realization. I have explained the steps here and I am going through them very quickly. I get this form, which is called the observable canonical form. Here you will see that the matrices have changed: one is the transpose of the other, and C and B seem to have changed places. There are different ways of getting a state realization; there is no unique way of coming up with it. That is an important message. All of them will have the same transfer function. In fact, you can do any transformation of this state by multiplying by an invertible matrix, and you can show that that is also a realization. So if I get this and if I multiply both
sides by an invertible matrix, I will get another state-space form, and even that new state-space form will have the same transfer function. The transfer function is invariant; the state realizations can be many. So this simple thing tells you that you can transform into some other form, where the intermediate state variable is something different, and the transfer function will still be the same; you will not get a different transfer function. So the realization of a transfer function in state space is not unique; there are infinitely many possible ways to get it. Normally we get it using these two forms, the controllable canonical form and the observable canonical form, because they are very easy to construct. But you can multiply that form by an invertible matrix and you will get another realization. Take one realization C, Φ, Γ and another realization with different C, Φ, Γ: all of them will have the same transfer function. That you can show, because these different matrices are related through invertible transformations; if they are, you can be sure that it is a realization of the same transfer function. So what is the meaning of these states x which you get here? There is no physical meaning that you can attach to this x. I got this state vector here: x(k+1) is equal to this matrix times x(k) plus this vector times u(k). What is the real physical signal here? u(k). What is the other physical signal? y(k). This x(k) has no physical meaning; it is a mathematical construct which helps you to put everything into one standard form. So we are going to use this x because it is convenient to put the model into this form and then work with it; linear difference equation models are very well understood. So these realizations are non-unique: there can be infinitely many ways of realizing the same transfer
function into different Φ, Γ, C matrices, and all of them will give you an identical transfer function. That is the message. Yes, good question; I will come to that. To answer it, let me come to multiple-input multiple-output systems. What I will show you is that if I work with state-space models, the final form is the same irrespective of whether the system is multiple input, multiple output or single input. For a SISO system the transfer function is a scalar; a MIMO transfer function matrix is a very complex business. Here everything is the same, whether it is SISO or MIMO or MISO or SIMO or whatever; the state-space model will finally look just the same. How? Let us consider a situation where you have three outputs and one input. I can construct a state realization for this which looks like this. There are three outputs, and the coefficients B1, B2, B3 for each one of them will appear in this C matrix. Go back and think about it. There are three parallel lines, all of them with the same denominator; only here you have b11, b12, b13, here you have b21, b22, b23, and so on. So all that happens is that this matrix, instead of giving a single output, gives multiple outputs; all three are modeled here. The final form mathematically looks the same: Φ, Γ and C. MFD is a completely different description; I do not belong to the MFD school and I do not like MFD descriptions. Now what about 2 × 2? If you have two inputs and two outputs, what do you do? I can develop one state realization with respect to u1, another state realization with respect to u2, and I can put them together and create one big state realization. Is this okay? I have two things, and then I am going to stack x1 and x2. Go back here and see what we have done: we have split this; there are two additive parts, one is the effect of
u1 on y, the other is the effect of u2 on y. For this component we have developed one state realization: this is y_u1, the contribution to y due to u1, which is given by this state-space model. The contribution of u2 to y is given by this state-space model, and this plus this is nothing but y(k). So I am going to use this to define a combined state vector of x1 and x2, x1 coming from the u1 model and x2 coming from the u2 model. I can just stack them: the mini state-space models can be stacked into one big state-space model. The form is again the same, whether it is multiple input, multiple output, single input, whatever; the final form is not different. So my algebra can be done only on this particular form. If I start working with matrix fraction descriptions or polynomial matrix descriptions, the algebra is very messy and controller design becomes very messy. Well, there you have the advantage that you can work in the frequency domain, but here you have a computational advantage: you can do everything in the time domain. So finally I get the same form; the mathematical form remains the same. I have given the same thing for MISO models and observable canonical forms: if you have MISO models, you can develop observable canonical forms which will look like this. And then the same thing: you consider a 2 × 2 system, you develop two models, you combine them into one big model, and the big model finally looks the same. So the take-home message is that the state-space realization will look the same irrespective of the number of inputs and the number of outputs; finally it is the same mathematical form. I just have to deal with one equation afterwards, and life is simple. So for a full model such as the ARMAX model, we can look at it as a model with one output and two inputs. What are the two inputs? u and e. I can look at it as a model with two inputs and convert it into state-space form.
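The realization ideas above can be checked numerically. What follows is a minimal sketch in Python with NumPy, not code from the lecture: the coefficient values, the matrix T and all the function names are made up for illustration. It builds the controllable canonical form for G(q) = (b1 q² + b2 q + b3)/(q³ + a1 q² + a2 q + a3), compares its impulse response with that of the difference equation, applies an invertible transformation to the state, and stacks two single-input realizations into one two-input model.

```python
import numpy as np

def controllable_canonical(a, b):
    """Phi, Gamma, C for G(q) = (b1 q^(n-1)+...+bn) / (q^n + a1 q^(n-1)+...+an)."""
    n = len(a)
    Phi = np.zeros((n, n))
    Phi[0, :] = -np.asarray(a, dtype=float)   # x1(k+1) = -a1 x1 - ... - an xn + u
    Phi[1:, :-1] = np.eye(n - 1)              # x2(k+1) = x1(k), x3(k+1) = x2(k), ...
    Gamma = np.zeros((n, 1))
    Gamma[0, 0] = 1.0
    C = np.asarray(b, dtype=float).reshape(1, n)
    return Phi, Gamma, C

def markov_parameters(Phi, Gamma, C, count):
    """Impulse-response coefficients C Phi^(k-1) Gamma for k = 1..count."""
    out, x = [], Gamma.copy()
    for _ in range(count):
        out.append(C @ x)
        x = Phi @ x
    return np.array(out)                      # shape (count, outputs, inputs)

def impulse_from_difference_equation(a, b, count):
    """Impulse response of y(k) = -sum a_i y(k-i) + sum b_i u(k-i)."""
    n = len(a)
    u = np.zeros(count + 1)
    u[0] = 1.0
    y = np.zeros(count + 1)
    for k in range(1, count + 1):
        for i in range(1, n + 1):
            if k - i >= 0:
                y[k] += -a[i - 1] * y[k - i] + b[i - 1] * u[k - i]
    return y[1:]

def stack_two_input(m1, m2):
    """Combine realizations of y_u1 and y_u2 into one model with input [u1, u2]."""
    (P1, G1, C1), (P2, G2, C2) = m1, m2
    n1, n2 = P1.shape[0], P2.shape[0]
    Phi = np.block([[P1, np.zeros((n1, n2))], [np.zeros((n2, n1)), P2]])
    Gamma = np.block([[G1, np.zeros((n1, 1))], [np.zeros((n2, 1)), G2]])
    return Phi, Gamma, np.hstack([C1, C2])

# Illustrative third-order model (poles at 0.4, 0.5, 0.6); values are arbitrary.
a, b = [-1.5, 0.74, -0.12], [1.0, 0.5, 0.2]
Phi, Gamma, C = controllable_canonical(a, b)
h_ss = markov_parameters(Phi, Gamma, C, 10).ravel()
h_de = impulse_from_difference_equation(a, b, 10)

# Any invertible T gives another realization: T Phi T^-1, T Gamma, C T^-1.
T = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 3.0], [1.0, 0.0, 1.0]])
Tinv = np.linalg.inv(T)
h_T = markov_parameters(T @ Phi @ Tinv, T @ Gamma, C @ Tinv, 10).ravel()

# Second, hypothetical channel (u2 -> y), stacked with the first one.
m2 = controllable_canonical([-0.5, 0.06], [0.3, 0.1])
h2 = markov_parameters(*m2, 10).ravel()
Phi_s, Gamma_s, C_s = stack_two_input((Phi, Gamma, C), m2)
h_s = markov_parameters(Phi_s, Gamma_s, C_s, 10)

print("canonical form matches difference equation:", np.allclose(h_ss, h_de))
print("transformed realization matches:", np.allclose(h_ss, h_T))
print("stacked model reproduces both channels:",
      np.allclose(h_s[:, 0, 0], h_ss) and np.allclose(h_s[:, 0, 1], h2))
```

Comparing impulse-response coefficients is used here as a simple stand-in for comparing transfer functions, since G(q) = C (qI − Φ)⁻¹ Γ expands as the sum of C Φ^(k−1) Γ q^(−k). The stacked model has the same Φ, Γ, C form as the single-input one, only with bigger matrices.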
The MATLAB toolbox of course will give you the state-space form if you give it the model and ask for SS; I think it is called ID to SS or something, there is a function, and it will give you the state-space form. So you will get the state-space matrices. See, you have the states... oh, there is a mistake here. So you have two inputs: one is u, and the other is e, the innovations. There is only a single output. Suppose you take a single ARMAX model, with one input, one output and one innovation sequence; then you can convert it into this standard form. This has a deeper relationship with something we are going to study later called Kalman filtering; we will talk about it later. With this I come to the end of these lectures. We looked at different kinds of models: grey-box models and black-box models. Of course, if you have a mechanistic model, nothing like it; that is the best model. If you do not have one, you can still develop a model completely from data; that is a black-box model. If you can merge the two, that is a grey-box model. And this is the most crucial part of any control project: developing a good model. So we have discussed all kinds of issues related to model development. These are some of the references that I have been using for my notes; they are not from one place. A good book for beginners is this one, Söderström and Stoica, System Identification, or Shumway and Stoffer; the first and the fourth are very good books for beginners. For advanced users there is Ljung's book. Actually I would say one, three and four are very good books for beginners. Even these two books, Åström, and Franklin and Powell, so the last book and the third book, are very good, but they are a little advanced, so