So, we have been developing a one-step-ahead predictor for our ARMAX model. In my last class I showed you one way of developing this predictor. We distinguish between the prediction of y and y measured, and you should clearly understand the difference: the prediction of y based on past information is different from y measured. y measured is the value of the measured variable collected from the data acquisition system. The prediction is an estimate. There might be different ways of predicting, so there is no unique prediction, but the measurement, once it is obtained, is unique; it is in some sense invariant, and you cannot change the measurement.

So I showed you one possible way of doing predictions: the idea was to bring everything, that is, y and u, into my prediction equation. Here is an alternate way of doing predictions; we will just go over it once more. The other way is to use predicted residuals: we use predicted y, we can also use predicted residuals, and then do predictions.
So here, to begin with, I am showing you two predictions. Suppose I have constructed the estimate ŷ(k-1 | k-2); this is the prediction of y at instant k-1 using information up to k-2. In the same way, ŷ(k-2 | k-3) is the prediction of y at instant k-2 using information up to k-3. Let us go back to our equation. We were looking at the second-order ARMAX model, which we had rearranged like this: ŷ(k | k-1) is a weighted sum of past measurements, past inputs (manipulated inputs, or known inputs to the system), and now I am retaining the past two errors.

Notice that earlier I had this term e(k). e(k) is the white noise which we have hypothesized drives the model. I do not have a measurement of e(k), so I am going to use an estimate of this error. The difference between y measured and y predicted is what I am going to use as an impostor, somebody who imitates the so-called true innovations, or the so-called true error. So I am constructing an estimate of e; it is constructed using the difference between y measured and y predicted at each instant, and that is used in my model.

Now I want to start doing predictions. I can make a simplifying assumption that at the beginning of the modeling exercise, at time equal to 0, the initial errors are equal to 0: the expected value of e(0), e(1), and e(2) is 0. I want to do predictions using y, u, and the error. If I have the first three errors, then subsequent errors can be generated using this equation. When I use this equation from time 0 to time N, my only problem is the first three instants, and I need some guess for them. So I am going to make an assumption: our error e is a zero-mean signal, and the best estimate for such a signal is 0. So it is justified to use 0, 0, 0 in the beginning. So now what I am going to do is the following.
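The recursion just described can be sketched in a few lines of Python. This is a minimal sketch, not the toolbox implementation. The function name `armax2_one_step`, the sign convention on the a coefficients, and the input lags u(k-1), u(k-2) are assumptions made for illustration. What is taken from the lecture is the structure: a weighted sum of past y, past u, and the past two residuals, the residual computed as y measured minus y predicted, and the first three residuals set to 0.

```python
def armax2_one_step(y, u, a, b, c, start=3):
    """One-step-ahead predictions for an assumed second-order ARMAX model.

    Assumed model form (sign convention and input lags are illustrative):
        y(k) = -a1*y(k-1) - a2*y(k-2) + b1*u(k-1) + b2*u(k-2)
               + e(k) + c1*e(k-1) + c2*e(k-2)
    Returns the predictions y_hat(k | k-1) and the residual estimates eps(k).
    """
    a1, a2 = a
    b1, b2 = b
    c1, c2 = c
    n = len(y)
    y_hat = [None] * n   # no prediction is formed before index `start`
    eps = [0.0] * n      # eps[0], eps[1], eps[2] stay at 0 by assumption
    for k in range(start, n):
        y_hat[k] = (-a1 * y[k - 1] - a2 * y[k - 2]
                    + b1 * u[k - 1] + b2 * u[k - 2]
                    + c1 * eps[k - 1] + c2 * eps[k - 2])
        # The residual stands in for the unmeasured innovation e(k).
        eps[k] = y[k] - y_hat[k]
    return y_hat, eps


if __name__ == "__main__":
    # Tiny made-up data set, only to show the call.
    y = [0.0, 0.0, 0.0, 1.0, 0.0]
    u = [0.0, 0.0, 1.0, 0.0, 0.0]
    y_hat, eps = armax2_one_step(y, u, a=(0.0, 0.0), b=(0.5, 0.0), c=(0.2, 0.0))
    print(y_hat[3:], eps[3:])
```

Each residual is computed only from measured y, known u, and earlier residuals, so the loop can run forward through the whole data set for any given set of parameter values.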
Given the model parameters: now, in optimization, what do you do? You guess the unknown parameters, then using the guessed values you construct the prediction errors, and then you minimize the sum of the squares of the prediction errors. That is the idea. The first three prediction errors we are arbitrarily choosing to be 0; only the first three. See what happens if you make that simplifying assumption. What is ŷ(3)? y(2) is the measurement at instant 2 and y(1) is the measurement at instant 1; they are available. u(1) and u(0) are available, and e(2) and e(1) we have assumed to be 0. So I get ŷ(3 | 2), and with this I can estimate e(3). Then I go to ŷ(4), which needs e(3) and e(2). e(2) we assumed to be 0, e(3) got calculated here, and then I get an estimate for e(4). Is everyone with me on this? Just check what I am doing.

Once you give me a guess for the parameters a1, a2, b1, b2, c1, c2, and make the simplifying assumption e(0) = 0, e(1) = 0, e(2) = 0, then if you have a large data set, assuming these first three elements to be 0 does not make a difference. Typically we will have a large data set, and this does not really cause much of a problem. In particular, if you assume that the unknown disturbance model is stable and inversely stable, setting these three to 0 is a perfectly valid assumption, so it is not going to cause a problem in the identification.

I can just go on doing this. Once I have e(4) and e(5), I can go on using them recursively in my prediction. This is just writing a for loop: when I am writing a program, I write a for loop going from, say, time 3 to time N, and I go on doing this calculation for every guess of the parameters. So what is done in optimization? You give a guess for a1, a2, b1, b2, c1, c2, and for every
guess, since we have knowledge of y and u, we construct these predictions, then we estimate e(3), e(4), e(5), up to e(N), and then we minimize the sum of the squares of the errors with respect to these parameters. The theta here is nothing but the parameter set a1, a2, b1, b2, c1, c2.

And what does a parameter optimization algorithm do? It evaluates the objective function and, based on the gradient of the objective function (or, in some methods, the gradient and the Hessian), it constructs a new guess. You keep iteratively trying different values of the parameters till you reach some local minimum, that is, till some minimization criterion is satisfied. So this problem has to be solved iteratively. For some simple systems, as I showed you in the demonstration of the MATLAB toolbox last time, this can be done very quickly; these optimization calculations now take probably a few seconds.

So this is a constrained optimization problem, where epsilon is computed using this particular equation. Of course, you do not have to use a constrained formulation here; you can eliminate epsilon and use simple unconstrained methods, unless you want to constrain a1, a2, b1, b2, and so on. Whether to constrain them or not is a slightly advanced topic. Ideally we should constrain them, because we want the noise model to be stable and inversely stable, but if you try to pose a constrained problem there are some difficulties. There are methods of handling them, and that is partly a subject of current research.

So it is very important to give a good initial guess, because it is an optimization problem. It is very important to design the inputs correctly, and very important to choose the structure correctly; all kinds of things matter.

So let me just summarize the prediction error method. Let us go back and see what we have done. We have, for every guess
of the parameters that are to be estimated, we actually compute ŷ(k) and the error, and then we minimize the sum of the squares of the errors with respect to the parameters to be estimated.

What if there is a time delay or dead time in the system? That is typically estimated a priori by some other method and is just used while constructing the model. Estimating the time delay with this method becomes difficult because the time delay is an integer variable: you have a time delay of three samples or four samples. The others, a1, a2, b1, b2, c1, c2, take continuous values; they can be real numbers. So we do not really get into solving a mixed-integer nonlinear programming problem; that becomes a little difficult.

So let me summarize this. I have done some experiments and collected this data for some N. N should typically be large. How large? You, as a practicing engineer who knows the plant, the domain, and the dynamics of the system, have to decide. You can probably get some help from statistics and linear algebra, but it is a joint call. It is not completely automated; your expertise as a domain expert is very important here.

Then we choose some model structure. How do you parameterize G? How do you parameterize H? Whether you choose an ARMA model, an AR model, or an ARX model, all that is up to you. Again, you have to work out what is feasible and what the data length is; it is very complex, though it looks very simple here.

Then I construct the optimal predictor. I just showed you how to construct the optimal predictor for some simple cases. The nice thing about this optimal predictor is that its right-hand side involves only known signals, u and y. It does not involve the unknown signal e. The innovations are actually constructed; e is a fabricated signal, and
you can see that this signal is nothing but y - ŷ. We expect it to be a white noise: a zero-mean white noise signal, typically zero-mean Gaussian white noise, a stationary white noise process. And when do we stop? How do you know whether the identified model is good or bad? The epsilon sequence which you get after estimating the model should be a white noise. This is very critical.

Then, typically, you minimize the sum of the squares of the errors. Somebody might ask: why the sum of squares? Why not the sum of absolute errors? Why not the infinity norm, that is, minimize the maximum of the absolute error? You can do that; it is perfectly fine. We use the 2-norm because it comes with a lot of rich properties. Those of you who have done the numerical methods course know that with the 2-norm you are in a Hilbert space, so you have an inner product, you have a lot of rich theory associated with the 2-norm, and you can interpret things through projections. In this case you can also use Parseval's theorem and do the analysis in the frequency domain. A lot of things come with the 2-norm that are very helpful for analysis. Analysis becomes difficult with the 1-norm or the infinity norm. So if I use the sum of absolute errors, it is perfectly fine, but analysis becomes difficult. Analysis is much easier with the 2-norm, and that is why least squares estimation is so popular: estimating the covariance of the parameter estimates, establishing theoretical properties, and so on.

I am just going to hint at those properties right now. I do not have time to get into derivations or talk about them fully, because in this course modeling is one aspect; we have to proceed and go to control and state estimation. So with this lecture and the next lecture, I am going to wind
up the control-relevant modeling. I am going to go over some things very quickly, which will probably cause some discomfort because there will be some complex expressions, but right now the idea is to sensitize you to the fact that there are many more complexities here. I would probably need an entire 40 lectures to really go into the depth of these things. We often run a separate course on system identification and state estimation. This year it is not running, but if it runs next year, maybe you can think of crediting or auditing that course; it will give you a much broader perspective of what is happening.

So let us go back to the two-tank data that we have. If I do this parameter estimation using the MATLAB toolbox, this is the model that I get. As I told you, the model consists of these A, B, C polynomials, and it also consists of the statistical properties of the noise e, which have been estimated from data. So the model has two components: a deterministic component, which comes from B and A, and a stochastic component, which comes through C and A, together with the noise properties. We wanted e(k) to be zero mean; its estimated mean is not exactly 0, but it is almost 0.

How is the estimate of the variance obtained? Using the sum of the squares of the errors multiplied by 1/N. I have given this formula: the objective function itself, multiplied by 1/N, gives you the estimate of the variance. That is evaluated after you have optimized, at the optimum parameter values. The estimate of lambda squared, the variance of e, is given here.

The MATLAB toolbox will give you this information, and you should know how to use it; that is very important. The same holds if you are using the Scilab toolbox, and there are many open-source statistical software packages; I am sure some of them have time series modeling. They will give you all this information. You should know the meaning of
all this. So the second-order ARMAX model is pretty good. I had shown you these results earlier: the innovations are almost a white noise, and the innovations and the inputs are almost uncorrelated. So it is a good model. We have separated the deterministic component and the stochastic component; we have a model for the unknown disturbances and a model for the known component, and we can go ahead and use this for the design of controllers.

Yes, we have to go back and ask whether my model order is correct. Should I go back and change from second order to third order? Then both will change. If you want to change only the deterministic order and not the noise model order, then you should not use ARMAX; you should use the Box-Jenkins structure. We have these different structures. I introduced the ARMAX model because it is simple to teach and makes the concept easy to understand, but the trouble with the ARMAX model is that it has a common denominator polynomial for the noise and for the deterministic part, which sometimes is not a great idea. What you are saying is correct: having the same A for the noise and the deterministic part means the poles of the deterministic and stochastic components are clubbed together, and you may have some difficulties because of that. If you want to separate them, you can parameterize them separately, so this can be third order and this can be first order. Only the algebra becomes a little more complex. Everything else is the same: you form H inverse G times u, plus (1 minus H inverse) times y. Nothing changes, so the prediction error method can very much be used; it is not an issue. You can separately parameterize the noise model; in fact, that is a better idea.

You can, of course, always convert this model into this model. Suppose I do cross multiplication and form a common denominator. I can have a common denominator which is A(q^-1) times D(q^-1). Do you see that? You can do
polynomial algebra just like fraction algebra: you can just multiply. So I can define a model where this will be D times B divided by AD, and C times A divided by AD, and that converts it into this form.

By the way, please understand that these are just representative forms. When I write this and this here, it does not mean that this B is equal to that B; these are just representations. When I write the ARX model or the ARMAX model, I keep using A and B. For the same system, the ARMAX model should probably use A tilde, and the Box-Jenkins model should use A hat or some other notation, because they are different; they will not be identical. But that would make the notation too complex. So these are just representations, and you can convert this form into that form, and so on.

A better understanding of this will come, of course, when we solve problems; we will have one or two more problem-solving sessions before the mid-sem. But the real understanding will come when you start doing the project. I have been waiting to finish this component, so post mid-sem we will start a project. We have collected five different systems, and we are going to give you good projects. You simulate the system, introduce noise, get data, and then identify different models. Then you will get a feel for these numbers, because all these statistical things do not make sense if you do them using three or four values; you should do it in MATLAB, and we will have some sessions where you start doing these projects. There are two or three models from chemical engineering, among them reactor models, and one or two models from the bio domain, which I think anyone can understand. There is a model for monitoring blood glucose concentration using insulin infusion: a three-equation model, a simple model,
very, very simple model. So those models will cut across any branch. The system is known to you; it is the human body, very close to you. You can appreciate what happens if glucose goes high, and you do not need any special engineering domain knowledge to appreciate it. What it will also tell you is that these methods of modeling and control cut across the boundaries of disciplines. Saying that a method belongs to this discipline or that one does not make sense; these are modeling tools that can be used for any system.

So let us go to this model. I showed you various ways of assessing the model quality. What we do is compare the model predictions with data. You take the data and divide it into two parts: one part is for model identification and the other part is for model validation. A model identified using part A of the data should be able to predict the part B data. I am going to show you this in a case study. This is what is done: you conduct an experiment, you take this part of the data for model identification, and this part of the data you keep for model validation. If I identify a model using this part of the data, it should predict that part; only then is your model good. Why? When you minimize the sum of the squares of the errors, the fit on the identification data will of course look very good. The model should predict an unseen situation; then the model is good. Maybe I will rearrange things and present this case study first, before I move to some of the issues that I want to talk about.

Yes. Actually, there is no rule like that; you can have both of them the same size. You can also conduct two different experiments with similar characteristics, and then use one experiment for model identification and the other experiment for validation.
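The split described above can be sketched as follows. This is only an illustration: the helper names `split_data` and `mean_squared_error` and the 75/25 default split are assumptions, not values from the lecture, and the predicted values could come from any one-step predictor.

```python
def split_data(y, u, identification_fraction=0.75):
    """Split one experiment, in time order, into identification and validation parts."""
    n_id = int(len(y) * identification_fraction)
    identification = (y[:n_id], u[:n_id])
    validation = (y[n_id:], u[n_id:])
    return identification, validation


def mean_squared_error(y_measured, y_predicted):
    """Average squared difference between measured and predicted outputs."""
    errors = [ym - yp for ym, yp in zip(y_measured, y_predicted)]
    return sum(e * e for e in errors) / len(errors)


if __name__ == "__main__":
    # Made-up numbers, only to show the calls.
    y = [0.0, 0.1, 0.4, 0.7, 0.9, 1.0, 1.0, 1.1]
    u = [1.0] * 8
    (y_id, u_id), (y_val, u_val) = split_data(y, u)
    print(len(y_id), len(y_val))
```

The parameters would be estimated on the identification part only; the mean squared error is then evaluated on the validation part, which the estimation never saw.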
Nothing like that. Basically, if I conduct a single experiment, I take the larger part of the data for model identification, because to keep the variance of the parameter estimates low I need N to be large. That is why, if you conduct only one experiment, you tend to take a larger data set for identification and a smaller one for validation. If you conduct multiple experiments, it is not an issue: you can have two different experiments, one dedicated only to model identification and the next one to model validation. That is absolutely fine.

So before I move to some of the issues in identification, let me present a case study. Let me first selectively present two slides, then we will move on to the case study, and then we will come back to the issues.

So what are the steps in modeling? One of the most important is model structure selection. First of all, you have to know a priori what is available in your basket: what are the different model structures available to you? I have talked about a limited set, linear stochastic difference equations. They are stochastic difference equations, mind you, because they are driven by known inputs as well as by the innovations, a zero-mean white noise signal. Within that set you have ARX, ARMAX, ARMA, OE, all kinds of things, and you have to make a judgment about what you want. Maybe in some situations you are not bothered about disturbance modeling; you just want the deterministic component correctly, and you should go for the output error model structure. But theory shows that if you want a good estimate of the deterministic component, you may have to model the noise. I am not going to cover that aspect in depth; I will just give pointers. As I said, if I do justice to all those things, probably I
need to spend an entire semester, which I do not want to do. So model structure selection is very important. It is important even before you start experiments, because if you are going to build ARX models or finite impulse response models, the number of coefficients is going to be large, and you had better conduct experiments for a longer time. So all these decisions are coupled; they are not independent. You have to understand this, particularly against the backdrop that all these modeling tools are now available at your fingertips. When I was a graduate student, some 20 years back, we had to write our own programs for this, and writing a program for the PEM method would itself be a project. Now it is not even a course tutorial. But you should know what you are using, and you should know what you are doing.

I have just talked about linear difference equation models. There is a whole lot of other things: nonlinear ARX models, NARMAX models, neural nets, wavenets, and many other options by which you can construct nonlinear difference equation models. We have talked about just the simplest set. Why linear difference equation models? Because they are mathematically tractable: we can treat linear difference equations very easily, and you can do the design of controllers and all that very easily. But in some situations linear difference equation models will not be useful, and in those cases you have to take a call on whether you want the model to be linear or nonlinear. Once you go for a nonlinear model, you of course have to use nonlinear controller design methods. My prescription would be: simple things first. First try linear difference equation models. If they do not seem to work, or if you know from physics that linearity is not
going to work in this particular situation, only then should you go for a more complex structure.

The next thing is that you have to plan a suitable set of experiments. You have to design the input perturbation sequence, and you have to worry about whether to do the experiment in open loop or closed loop. Now I will tell you why this matters when you are using linear models. A linear model developed for a plant at one operating condition is not valid if the operating conditions change tomorrow, which can very well happen in a chemical plant. Let us say you have a refinery which is working with Gulf crude and you decide to shift to Bombay High crude. The characteristics of Bombay High crude are different, the operating conditions change completely, and the linear model which is valid for Gulf crude will not be valid for Bombay High crude. So these are ultimately perturbation models.

Or take the diabetes model. If the model is nonlinear and you linearize it at one operating point, the predictions will not match the behavior at some other operating point. If the glucose level changes from, say, 100 to 30, the behavior at 30 will be different, and you cannot use the same linear model. So you may have to inject the perturbations at 30. Now the question is: when you are injecting the perturbations, should it be in open loop or in closed loop? Open loop means that no insulin is being given. That may be dangerous: if the glucose level is too low, no control is there, and you are perturbing the plant, you are in dangerous territory. You had better control the system and then inject the perturbations. Similarly, in a reactor you may be in a region of very high pressure and temperature, where it is dangerous to inject perturbations without the controller being present. So there is some controller designed
earlier, and it is already active. We found that the controller is not performing well, and we want to update the model and retune the controller. This is the scenario I am talking about. In such a situation, should I take the controller off, or should the controller stay on? The operator is more comfortable if the controller is on; he does not want to take it off. You, as a modeller, will say no, open loop is better, but open loop is sometimes not practical.

Now, if you decide to do perturbations in closed loop, there are huge mathematical implications. The noise in the measurement becomes correlated with the input u, and then the maths becomes very messy. You cannot use the method which I described in a straightforward manner; you have to do a lot of maths to clean the data of the noise. The input and the output in some sense become correlated, and that causes problems in estimation. I have hinted at this in my tutorial: I have given one small problem with proportional feedback, showing that the omega matrix becomes rank deficient, and so on. When we do that exercise, you will realize this. So whether to do identification in closed loop or open loop is a tricky question. When you are re-identifying a plant while it is running, the operator would prefer that you do it in closed loop, but closed-loop identification is a tricky business. There are very few commercial products right now; people are working on developing products which can do model identification online while the controller is on. It is a hot research topic and very much valued by industry.

There are so many issues. Yes, that is also one of the arguments advanced: that the plant might drift somewhere. The other argument advanced is that when the controller is on, you actually excite the plant in the frequency range which is relevant to the closed loop, which is not
possible in open loop, and there are so many other issues. So there are people who are for it and people who are against it, and the debate is still inconclusive. You have to take a call depending on the particular situation; there are no global answers as to what should be done.

Yes, in some sense what you are saying is right. How do you know a priori if the plant is not running? So there are chicken-and-egg problems, what comes first, and how you design the input itself becomes a very critical question. Typically you use optimization techniques.

Model evaluation you have to perform in multiple ways. You have to see whether the steady-state properties of the model are good or not, and whether the gain is correct or not. If you expect the gain to be positive and it turns out to be negative from estimation, something is wrong with your model. So there are all kinds of things you have to check.

There are multiple other considerations, for example whether the process is batch or continuous. I do not know whether many of you follow what I am saying here; for chemical engineering students, batch and continuous are not difficult to understand. In a chemical plant, the crude oil comes in continuously from some source, say an oil well, it is continuously processed, and the products are continuously pumped to different destinations. Say you have crude oil and you separate it into aviation turbine fuel, diesel, and gasoline. This is happening all the time: the feed keeps coming, and it gets separated and sent out. These are called continuous processes. The other kind are batch processes, where something starts, there is some operation, and then it stops. The best example of a batch process in our homes is cooking. You take rice in the cooker, you raise the temperature to a certain point, and then, from a control viewpoint, you operate it at a constant temperature for
some 9 minutes or 11 minutes, or whatever; we wait for some whistles, and then you switch off. Then the temperature should come down before you can open it. So it is a control problem which we do by hand. A real-life situation in chemical engineering is specialty chemicals and pharmaceuticals: you produce specialty chemicals or drugs not in a cooker but in a reactor, called a batch reactor, in which you put things and let them cook, let them react. There is a batch recipe: you heat up to a certain point, then the reaction starts; the reaction might be exothermic, so you may have to cool it; and so on. Another batch process is an aircraft flight: you take off, climb to a certain level, cruise, and then come down, and it stops. The analog of a continuous chemical engineering process would be a satellite. Once it takes off, it is there in orbit; if you take it that it never stops, it is a continuous system. So it also depends upon what kind of application you have in mind.

So what kind of model you want to develop depends upon the availability of data and how much time you have at your disposal. If you ask me what the most important part of a control project is, it is the modeling and not the control. Once you have a good model, doing control is relatively much easier, because once you put the system behavior into nice difference equations, we know how to go about it; we are masters at working in this ideal world of equations. But translating reality into equations which mimic the plant or the system reasonably well, whether it is a mechanistic model, a grey-box model, or a black-box model,
whatever it is, doing that is the trickiest part. In complex control schemes, the people who can do modeling are probably paid much more than those who can only operate them, because if you know how to model, you have done 70% of the work in a control project.

You had some question? Distillation can be done in both batch and continuous mode. The crude oil distillation I was talking about is a continuous process. If you take a very long viewpoint of 100 years, it is a batch process, but we call it continuous because once you start it you will not stop it for a year or so. At a larger time scale everything is a batch process, because at some time you do a startup and at some time you do a shutdown. We call it continuous because once started it is going to run for a year. The other batches I am talking about are different. Let us say I am preparing some specialty drug. I react it for, let us say, one day, and I get a mixture which has product and reactants together. Now you need to separate the products. That has to be done by distillation, but it is called batch distillation, because you are only going to process some 100 kg of material, and it may take some four or five hours. So that is called a batch process. So batch and continuous are relative; if the crude oil supply dries up, well, the plant is shut. So when you say continuous, it is pseudo-continuous.

So what I would say is that the model granularity, what kind of model you want to develop (first-principles model, linear model, black-box model, grey-box model, nonlinear grey-box model, whatever), all depends upon what the application is, what the system is, and what data is available. You have to take a
call. So please remember, even though we are dealing with data-driven modeling, your role as an engineer who knows physics is not less important; it is equally important. You cannot develop good models unless you know the physics of the system well. So this is a tool which has to be used in conjunction with your knowledge of physics. That's why as a control engineer I am better suited to develop models in the chemical industry, because my background is in chemical engineering, and a mechanical engineer is better suited to develop models for a robotic problem or for a car, because he knows those issues. The statistics part anyone can pick up; the domain knowledge is important.

How do you solve this problem? Typically there are two methods. One is nonlinear optimization; there is also the Gauss-Newton method, which is quite popular, and those of you who have not seen it should probably look it up. There are a number of issues. Do you want to emphasize certain frequency content in your model? I know that the plant actually operates in this frequency range; this is known to you from your physical understanding of the system, and you can use it to shape the objective function. So instead of minimizing the sum of the squares of the errors, I can choose to minimize a filtered version of it. I can construct this epsilon and pass it through a shaping filter, a band-pass filter, and that filter will allow you to cut off some high frequencies and focus on low frequencies. How to choose this filter? Which one? No, no, we still want epsilon to be white noise, but what we are minimizing is the filtered value of it; the objective function is different. Whether you will always get white noise or not is a good question; I too have to check. If you
do this, are you guaranteed to get white noise? That question is quite critical; yes, I do have to go back and check whether you are guaranteed to get white noise.

Then there are issues like model order selection. One thing you will find when you are doing any data fitting: if you have a model with four parameters and a model with ten parameters, the ten-parameter model will always seem to fit better. Now which one should I use? Visually it looks like the ten-parameter model is better. As they sometimes say about parameter estimation, if you give me ten parameters I can fit an elephant to the data, and if you give me twelve I can make the elephant walk. So you can make a seemingly better and better model if you have more parameters. But then there are many issues: more parameters means larger data length, which means longer experiment time, which means loss of production. When you are identifying a model for a system, you are guessing the order; you do not know it. In this simple system of two tanks you can say it is second order or third order. But go to a real plant, say a power plant with some complex boiler system: what is the order? A very difficult decision to make. So you start trying different orders, second order, third order, fourth order; the smaller the better. To take a call on what order of model I should fit, we use this criterion called the Akaike information criterion. It uses two things. You see here, it uses the sum of the squares of the errors; but just because you have a small sum of squared errors does not mean the model is better. This information criterion weighs the sum of the squares of the errors, and it also weighs the number of model parameters. So we compare a model
with five parameters and one with ten parameters using this criterion. You see here, a ten-parameter model may have this component smaller, but it will have this component larger. So this criterion tries to balance the number of model parameters against the sum of the squares of the errors, which reduces as the number of model parameters grows. So there is one term which is the prediction term; this tells you how well the model fits the data. And n is in some sense a measure of the complexity of your model: the more the parameters, the more complex the model. Model order selection is something you have to gain experience in; no book will ever tell you how to choose the model order. You have to see a lot of data, fit models to data, use those models, and then you will get it. So in this type of information criterion you penalize the model complexity by including this term, a penalty on n, the number of model parameters. And this is not the only thing you have to bother about; you have to bother about variance errors and about something called bias errors, and I am coming to that. So typically you use the Akaike information criterion. I have given you one information criterion here; there are more in the literature, the Bayesian information criterion and some other methods of comparing models. But this is a crucial question: if you are developing models of different orders from data, how do you compare them? You might be developing a simple correlation model: whether to use a first-order, second-order, or third-order polynomial, which is better? You can use the Akaike information criterion not just for time series modeling; you can use it anywhere. So that is how you can compare models.

Well, real systems are multiple input, multiple output. No real system is
truly single input, single output. We study single input, single output systems in the first course because the concepts are easy to understand and digest, and in many cases, for some simple systems, approximating as a SISO system works; not every time do you require very complex things. But real complex systems in today's world are tightly integrated, and it is difficult to use single-loop controllers; you have to use multivariable controllers. This is more and more so because of tight energy integration, mass integration, all kinds of complex designs that are coming up. So you have to develop multivariable models: multiple inputs affect multiple outputs simultaneously, and you have to model that. In a power plant or in a boiler you can appreciate that if I change the fuel flow, it is going to change a variety of things, not just one parameter. It is going to change the pressure, it is going to change the level inside the boiler, it is going to change the oxygen content in the flue gas. And this holds for everything: you change one thing and it has an effect on multiple outputs. So in reality we have to develop MIMO models, multiple input multiple output models.

The problem is much more complex now. There are issues: when I have multiple inputs, do I perturb them simultaneously or do I perturb them one at a time? One at a time looks nice, you can segregate the effect of one input, but it is not practical. If you have a boiler which has five inputs and you perturb one input for one day, that is five experiments over five days; you are wasting the steam for five days. Not affordable. You had better perturb all five simultaneously, do the experiment for one day, collect your data and do the modeling. Nobody is going to allow you to do experiments for five days. So these are the issues in real life that
you have to worry about.

Well, an ARX model can be very easily converted into a MIMO model; not an issue. The trouble is that the number of parameters to be estimated just blows up. In one data set where I was trying to use ARX models, we had five outputs and seven inputs, and for each input-output pair I required 40 parameters, so you get a huge number of parameters. Of course the data was there for 23 days, I had a lot of data, so I could use a model with a large number of parameters. But you do need a large number of parameters if you have an ARX model. You can do simultaneous modeling of multiple inputs and multiple outputs, and you can get solutions very easily: least squares estimation just scales up. It doesn't matter whether you are working with scalars or vectors, it works very well, but there are issues in terms of the number of data points. For the other models, such as the output error model, what you do is develop what are called MISO models, multiple input single output models, and then you fuse them to create a MIMO model. That is typically what is done.

In this MISO modeling, how do you excite the inputs? Well, it's a tricky business. As I said, you can do it sequentially or simultaneously, and you will still find people writing papers on what is the best way of doing it; it is not a settled issue. There are so many issues when you design an input signal. What should be my choice of input signal? Should I use this signal or this signal? Both are pseudo-random binary signals, but they have different frequency content. You can see that this one has very low frequency content and this one has very high frequency content. How do I know this? I look at the power spectrum. This signal here has this power spectrum: it has a lot of power at low frequencies but almost no power at high frequencies. Whereas this signal has almost no power at low
frequencies but significant power at middle frequencies. Which excitation should I introduce? Golden question. You should know the system; you should know at what time scales or what frequency scales the system operates. Unless you know that, you cannot make this call, and this is a very, very crucial call. If you do not introduce the right frequency perturbations, your data will not be useful for modeling, because it will give you a model in the wrong frequency region. Why will it give a model in the wrong frequency region? I will talk about that a little later.

One issue that is very important is what is called persistency of excitation. I have tried to teach this concept through an exercise problem on estimating the parameters of a two-parameter model in which a step input is given, and I have tried to show that a step input will not be useful: a step input will make this matrix rank deficient, and then you will have trouble estimating the model parameters. So how you perturb the plant actually becomes an art, guided by both the maths and the physics. Persistency of excitation is technically defined through the rank of this matrix, the correlation matrix of the inputs. If you want to identify a model with n parameters with reference to the input, this rank should be equal to n. Critical. If it is not, the result that you get from your identification exercise, even if you use the best package, is garbage. That you have to understand. You now have good, powerful packages; you are giving data and not making any mistake on the numerical side, and still you can get a wrong model. Even a good package can give you garbage because you have fed in garbage: you did not excite the plant properly, you did not make sure that the rank of this matrix is consistent with the choice of model order.
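The rank condition described above can be checked numerically. The following is only a minimal sketch under stated assumptions: the helper names `input_correlation_matrix` and `excitation_order` are hypothetical (they are not from any toolbox), the correlation matrix is built as a Toeplitz matrix of sample autocorrelations without removing the mean, and a random ±1 sequence stands in for a true pseudo-random binary signal.

```python
import numpy as np

def input_correlation_matrix(u, n):
    """Return the n x n Toeplitz matrix whose (i, j) entry is the sample
    autocorrelation r_u(|i - j|) of the input sequence u (mean not removed)."""
    u = np.asarray(u, dtype=float)
    N = len(u)
    # r[k] is the average of u[t] * u[t - k] over the N - k available products.
    r = np.array([np.mean(u[k:] * u[:N - k]) for k in range(n)])
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return r[lags]

def excitation_order(u, n, tol=1e-6):
    """Numerical rank of the n x n input correlation matrix."""
    R = input_correlation_matrix(u, n)
    return int(np.linalg.matrix_rank(R, tol=tol))

N = 2000
step = np.ones(N)                          # step input applied at time 0
rng = np.random.default_rng(0)             # fixed seed, illustration only
binary = rng.choice([-1.0, 1.0], size=N)   # random binary stand-in for a PRBS

print("step input, 3x3 matrix rank:  ", excitation_order(step, 3))
print("binary input, 3x3 matrix rank:", excitation_order(binary, 3))
```

For the step input every autocorrelation lag equals one, so the matrix is all ones and its rank is 1, whatever size you ask for. For the random binary input the matrix is close to a scaled identity and has full rank, which is the situation you want before fitting a model with that many input parameters.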
See, if I want to identify a model of third order, then this matrix should have at least rank 3; if I want to identify a model of fifth order, this matrix should have at least rank 5, and so on. So it is not that, given any data, I can develop a model of any order. If your input is not persistently exciting of order n, you cannot develop an order n model. Very, very important. Where do these things appear in the equations? See here: this r_u, these r(1), r(0), are appearing here. The rank of this sub-matrix is very critical: it should be two. If it is not two, you have trouble, and it will cause trouble even here. So there are issues which are deeply related to matrix rank if you make a wrong choice of the input excitation.

Now I want to sensitize you about something which is reasonably advanced. I am not going to cover this part in detail, though I have given some exposure to it towards the end of my notes; after the last slide I have put some extra slides on this issue. There are two kinds of errors that can occur when you identify a model: one is called bias error, the other is called variance error. I will sensitize you about this after I talk about the real example.

So let's depart from here and go to this problem. This is a system which I have in my lab, and I want to develop a black box model. If you want to play with the data, I can put it on Moodle; you can download the data, put it into the ident toolbox, and develop all kinds of models yourself. There are two inputs and there are two measurements. It is the four-tank problem which we have been looking at for a long time, the same problem. Sometime, maybe after the mid-sem, we will actually visit the lab and I will show you the setup. I have the setup; you
can actually see it, and if some of you are interested you can do a project on that. Instead of doing a project on a toy simulation example, you can choose to do a project on this; maybe I will give a different weightage if you do that. It is much more difficult, but you will learn far more if you do a project on real data than on a toy problem. Real problems are much more complex.

So I have simultaneously perturbed both the inputs; two valves are simultaneously perturbed. The frequency content I had to choose based on a rough idea of the time constants. I know that this particular plant has a time constant of the order of two minutes, and based on that I have made a call on how to do this. These inputs can be generated using the MATLAB System Identification Toolbox; there is a function called idinput. You can specify what type of input you want to generate and what the frequency content should be with reference to the Nyquist frequency, pi by T, where T is the sampling interval: what fraction of the Nyquist frequency you want. What that fraction should be you decide knowing the plant frequency. That is critical.

It is also very important that when you are exciting two inputs, they should not be identical. If they are identical, the rank of this matrix will not be n. They should not be scalar multiples of each other. If you are simultaneously perturbing a plant, the inputs should be uncorrelated, and idinput will make sure of this: if it simultaneously generates two inputs, it will generate uncorrelated inputs. It is very critical that these two inputs should not have any correlation; that is actually the meaning of persistency of excitation.

So these are the two outputs. I collected both the levels. Remember, each level is changing because of both the inputs simultaneously; it
is not just a function of one. Now I am going to take part of the data for modeling and the remainder for validation. The model which is developed using this part of the data should predict this behavior; only then is the model good. I have developed a variety of models. I developed a second-order output error model, and the prediction seems to be okay: 87 percent fit. The MATLAB System Identification Toolbox defines something called percentage fit using the sum of the squares of the errors, so it gives you a good measure of how good the fit is. And mind you, this model validation has been carried out on this part of the data. I have shown just one graph here, but actually the data up to 1000 I have used for identification, and from 1000 to about 1300 for validation. Both the inputs from 1000 to 1300 have been injected into the identified model, and I want to check how well it predicts level one. You can see that it's a good model: unknown data is being predicted reasonably well using a second-order model. In this case the noise is uncorrelated; great, no problem. You can go on developing different models and compare them on the AIC criterion. Here I am just showing the autocorrelation of the residuals. I developed a third-order Box-Jenkins model, I developed a fourth-order one, and I am comparing different models. Box-Jenkins gives 89 percent, a little better than the other, and so on. So you can compare model outputs, and these are the model parameters which are obtained. This case study you can do yourself; I'll upload the data, you can take it into the MATLAB toolbox and play with it. You can fit all kinds of models and see what model parameters you get, and then you can
take it to the LTI Viewer and compare the model frequency responses, step responses, impulse responses. Like this you'll get better insight into what is happening. So you can develop a variety of models at your fingertips. Of course, when you go for controller design you take a call on which one of them you are going to use and how.

So the next part, logically, is going to be using them for controller design, and my next lecture is going to be on converting these models from this transfer function form. My entire course is based on discrete-time state space models. Whether one should work with transfer function models or state space models is a matter of choice. In control there are two communities: ones who swear by transfer functions, and others who like only state space. I belong to the state space category. I like to work with state space because you can use linear algebra. Not that you cannot do the algebra when you are using transfer functions, but with matrices containing polynomials in q the manipulations become very messy if you go to higher-order systems. For state space, in my opinion, whether you work with single input single output or multiple input multiple output, the maths remains the same. Industry does now prefer state space; a lot of controllers are written using state space realizations. So what I am going to do in my next lecture is what is called state space realization of a time series model: I want to convert my model into a state space model. Basically I will start with this and systematically explain how I come up with a state realization which will look like this, in the standard form. In this particular case x may not have any physical meaning. When you start from physics and develop a state space model, the state x has a physical meaning: temperature, pressure, level, whatever; it comes from physics. In this case, when I start from this transfer function model and
go to this thing called state realization, x has no physical meaning. What is physically meaningful here is u, which is the input, and y, which is the measurement. All that you are saying is that this difference equation has the same transfer function as this. So the next part is going to be connecting up with this standard form. If you remember, we developed this standard form starting from physics: starting from first-principles models, we linearized and developed the standard form. Now I am going to develop the same standard form starting from this data-driven black box model. Finally, when I start working on controller development, I am just going to use this standard form. I am not worried where it comes from, whether from physics, completely from data, or from some merger of data and physics; that is up to the model developer. I am just going to work with this standard model form for controller design, synthesis, whatever. So the game plan is this: you could start with physics if you have a physics model, you could start with data if you have a data-driven model; finally you come up with this one standard form and then do your maths on it. Once you have translated the reality, the system, into a mathematical description, you are in the world of mathematical models and you can do all kinds of manipulations. So I am going to stick to the standard form and do all the manipulations on it; that is my game plan.

So how do you go from here to here? Is there a unique way of doing this? There are all kinds of questions, and we will answer them. There are many ways of doing it; there is no unique way. You can show that there are infinitely many ways of doing it. We are going to do it in two or three different ways which are popular. But let's reserve that for the next
class. So that is where the story ends: you start with data, you come up with these time series models, you convert them into a state space model, and then I will use the state space model for control.

The inputs should not be correlated. They may or may not have different frequency content; that depends on the system, since some inputs have a very fast influence on the outputs and some have a slow influence. In the validation data that is not so critical: validation data can be correlated. In identification you had better make sure that the inputs are uncorrelated, particularly if you are exciting them simultaneously. So the statistical properties of the input signal and the frequency-domain properties of the input signal are equally important when you design inputs for identification; for validation they are not.

What if the input is white noise? Well, on paper the ideal input is white noise. If you have white noise, then the persistency of excitation is as much as you want; you can develop any model. But white noise is not a great signal to inject, so we typically do not inject white noise; we inject colored noise. You can see here that if I am injecting this signal, it is colored noise. What is white noise? It has all the frequencies, so that would be the ideal signal; white noise is the perfect persistently exciting signal, nothing better than that. But white noise cannot be injected in reality. In computer simulations you can inject white noise and you will get great models, but it is not practical. This matrix will be perfectly conditioned if you introduce white noise, because this matrix will be diagonal; nothing like it. But you cannot do that. I will just briefly talk about this now.
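The remark that the input correlation matrix becomes diagonal, and hence perfectly conditioned, for white noise can be illustrated in simulation. This is a rough sketch, not a statement about any real plant: the helper `input_correlation_matrix` is hypothetical, the colored signal is white noise passed through an assumed first-order low-pass filter with pole 0.9, and the matrix size 5 is arbitrary.

```python
import numpy as np

def input_correlation_matrix(u, n):
    """n x n Toeplitz matrix of sample autocorrelations r_u(|i - j|)."""
    u = np.asarray(u, dtype=float)
    N = len(u)
    r = np.array([np.mean(u[k:] * u[:N - k]) for k in range(n)])
    lags = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    return r[lags]

rng = np.random.default_rng(1)        # fixed seed, illustration only
white = rng.standard_normal(20000)    # white noise, possible only in simulation

# Colored noise: the same sequence through x[k] = a * x[k-1] + e[k].
a = 0.9                               # assumed filter pole, not from the lecture
colored = np.zeros_like(white)
for k in range(1, len(white)):
    colored[k] = a * colored[k - 1] + white[k]

R_white = input_correlation_matrix(white, 5)
R_colored = input_correlation_matrix(colored, 5)

cond_white = np.linalg.cond(R_white)
cond_colored = np.linalg.cond(R_colored)
print("condition number, white noise:  ", round(cond_white, 2))
print("condition number, colored noise:", round(cond_colored, 2))
```

With white noise the off-diagonal entries are close to zero and the condition number stays near one. With the low-pass colored signal the matrix is still full rank, so the input is persistently exciting, but it is noticeably worse conditioned; that is the price of injecting a signal the plant can actually tolerate.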
Maybe I will continue in the next lecture. There are two fundamental issues. To understand this you actually have to change gear, and it becomes much more complex than whatever we have been doing till now; if you consider what we have been doing till now simple, then this is relatively more complex. The analysis has to be done in the frequency domain, because in the time domain you can do identification and all those things, but the insights come only from the frequency domain. To go from time-domain to frequency-domain insights, using the 2-norm is very critical, because certain relationships between the time domain and the frequency domain hold only because we define things in the 2-norm. The best books for this are the system identification books by Ljung and by Söderström and Stoica; I have listed them. They are control engineers who have written excellent books which give a lot of insight into these aspects.

Now let me explain what this animal is. You have G(q). Let us assume that for a given system there exists something called a true transfer function. Don't ask me whether it exists for a real car or the like; it doesn't. It's a hypothetical scenario. Suppose that it exists; then the question is, will you reach it when you identify a model from data? This question you can definitely ask for a computer simulation. When I am simulating a plant inside the computer, I am God; I am creating the world inside my computer program. So I know that there is a perfect system G(q), and I give a modeller only data and ask him to identify it. The question is, will he be able to reach the truth only from data?

Now there are two problems. Say I give you data for a boiler. You know that a real boiler is an interconnected system of multiple subsystems. So what should be the order of this system? It should be high order; it
will typically be a high-order system. But as a modeller I don't want to use a large number of differential or difference equations; I want to keep them few, so I tend to choose a lower-order model. A classic example: in process control we use this kind of model, a first-order time constant plus dead time model. This is used to approximate a high-order model. Why? Because it is a simple model, one time constant and one time delay, as compared to an 8th-order model, which means an 8th-order differential equation or eight first-order differential equations. Instead of that I would like to work with one differential equation. But what has happened? If, for a real plant which is like this, I approximate using this model, there is a structural error between the true plant and the model, and this gives rise to something called bias error. If you look at the step responses, they look similar. This blue one is the true step response, and this is the step response of the FOPTD model; FOPTD is first order plus time delay. These models are available in the System Identification Toolbox: you can tell it which kind of model you want to identify, you can say I don't want ARX or OE, I want an FOPTD model, and it will give you a model of this form. If you just compare the step responses of these two, they look very similar, and you might say, good model. If I say that this is the true plant and this is the estimated model, and I just show you this validation, you will say, good model, I will use this for control. Now look at the frequency responses. In this comparison, this is the model frequency response and this green one is the plant, the 8th
order model. This is my approximation, which is the first order plus time delay approximation. It is very good only in the low-frequency region; the match is very good there. But in the high-frequency region the match is not good. Suppose your controller in closed loop is operating in this frequency band, where there is a large mismatch between the model and the plant; that will give you a big problem when you control. This is called bias error, and it is going to occur in most real situations. Real problems are of very high order, and we always approximate them using low-order models. When you approximate a high-order system using a low-order model there is always a bias error, because your model is not able to capture the reality; it has limitations. What you can do, of course, is shift this error. You can say, instead of a good match at low frequency I want a good match in this frequency band. How do I do that? I go back and say that I will use this signal for input excitation rather than this signal. Will using this signal instead of that one ensure that you get a good model in that frequency range? Yes. That I will talk about in my next lecture; for now I will just hint at it. You can show that you can shape the difference between the true and the estimated transfer function using the power spectrum of the inputs. So how you shape the power spectrum of the inputs has a lot of bearing on which frequency band the model is good in and which frequency band the model is bad in. All this analysis comes only with the 2-norm; that's why we use the 2-norm. So I am just going to sensitize you to this, not really get into it deeply, because this might need two or three weeks of lectures if I want to really get into