In the last lecture we talked about the holistic view of data assimilation: the role of models and the role of observations. Within the context of observations we started talking about observation noise. Observation noise depends on the instruments used to make the measurement: satellite observations have one kind of error structure, radar observations have a different kind. All the physical quantities we care about — pressure, temperature, humidity, whatever we want to measure — are measured with instruments, and the measurements are therefore inherently associated with observation noise. This observation noise is modeled as white Gaussian noise with a known covariance structure; that was the theme towards the end of the last lecture. However, there is a fundamental difference between observations in other branches as opposed to engineering and the physical sciences. For example, in economics we talk about stock prices, interest rates, foreign exchange rates. These are intrinsically random, but one distinguishing feature is that there are no observation errors. We cannot say IBM is priced at $22 per share plus or minus 5%; when we quote the IBM price it is precise, there is no error. Observations of economic quantities are without error, whereas observations of pressure and temperature are erroneous. That is a fundamental difference between the variables in one domain and the variables in another. The foreign exchange rate changes from day to day; there is natural variation, and many factors go into the value of the rupee versus the dollar or the dollar versus the euro, so there is intrinsically random behavior, but there is no observation error. Observation error, whenever it is present, is random. So there is randomness associated with observations, but the randomness comes from different sources: one is the error itself, the other is natural variability. In the case of temperature, as we already discussed, the temperature varies seasonally, so there is intrinsic variation, and on top of it we superimpose the observation errors. Understanding the kinds of instruments used for observation, understanding the properties of the errors those instruments carry, and understanding the natural variations of the processes — these are all fundamental to how one uses observations in data assimilation systems. I would now like to give a brief review of some models of observations and of how to relate observations to the model. In order to use the observations and fit the model to them there must be a bridge between the observations and the model, and that is the bridge we are now going to build. The bridge could be a linear relation or a non-linear relation; these relations are functions of the state. For the linear case, let us take an example. x is the state of the system — the state vector we already talked about — and z is the observation. The relation is z = Hx + v, where v is the observation noise vector. If the observations are perfect, v = 0; if v is not 0 the observations are imperfect, in the sense that they are noisy.
Here H is a matrix, x is an n-vector, and z is an m-vector; by our definition H is of size m by n, and v is a vector of size m representing the observation noise. In the non-linear case the same z is given by z = h(x) + v, where h is a function and v is again a noise vector; h is a map from R^n to R^m. We will discuss these definitions further in the mathematical preliminaries. h(x) is called a vector-valued function of a vector: h(x) is a vector consisting of (h_1(x), h_2(x), ..., h_m(x))^T — this must be h_m(x), with m rather than n. v is normally distributed — a Gaussian or normal distribution — with zero mean and known covariance R. The mathematical expression that describes the Gaussian density is p(v) = (2π)^(−m/2) |R|^(−1/2) exp(−(1/2) v^T R^(−1) v), where |R| denotes the determinant of the matrix R, and v^T R^(−1) v is called a quadratic form; this expression is what produces the bell-shaped curve. H is a matrix in the linear case and h is a map in the non-linear case, and in the context of meteorology as well as the geosciences both are called forward operators. It is these forward operators that describe the relation between the model state and the observations. To put it a little pictorially, in terms of examples: the state could be the sea surface temperature — the actual thing I want — while the observations are satellite radiances in the infrared; the relation between radiance and temperature is Planck's law, and that relation is non-linear. For radar reflectivity, where the actual state is the amount of rain, the law is empirical and non-linear. Or the observation could be a voltage and the state the speed of a car; the physical relation is Faraday's law, which is linear. So we could have non-linear models or linear models; we have the state of the system, and we also have the nature of the observables. What, then, is the statement of a typical data assimilation problem? I would like to describe it either as a static problem or as a dynamic problem. Given a set of noisy observations z in R^m and the forward operator — the matrix H or the function h — find x such that z = Hx or z = h(x); the first is the linear problem, the second the non-linear problem. Now you can see the following: you are given z and you know H, and you have to find x; or you are given z and you know the function h, and you have to find x. That is the problem we have to solve, and these kinds of problems, as we will soon see, are also called inverse problems; why they are called inverse we will discuss in a minute. Within the dynamic setup I am now going to have observations taken at different times. Let k_1, k_2, ..., k_N be the different time epochs: z_{k_1} is the observation taken at time k_1, z_{k_2} the observation at time k_2, and z_{k_N} the observation at time k_N. So k_1, k_2, ..., k_N could be noon on successive days; z_k could be the maximum temperature at noon at a given physical location — downtown London, downtown New York — or it could be the price of a stock, or the foreign exchange rate between the euro and the Japanese yen at midday on Monday. These are the epochs and these are the observations, and the observations are noisy; in the physical sciences almost all observations come from instruments, so they are inherently noisy.
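To make the linear observation model concrete, here is a minimal sketch in Python (numpy assumed). The dimensions, the operator H, and the covariance R are illustrative choices, not values from the lecture; the sketch simply generates z = Hx + v with v ~ N(0, R) and evaluates the Gaussian density given above.

```python
import numpy as np

rng = np.random.default_rng(0)

n, m = 4, 3                      # state dimension n, observation dimension m
x = rng.standard_normal(n)       # a "true" state vector (illustrative)
H = rng.standard_normal((m, n))  # linear forward operator, size m x n
R = 0.1 * np.eye(m)              # known observation-error covariance

# z = Hx + v, with v ~ N(0, R): a noisy observation of the state
v = rng.multivariate_normal(np.zeros(m), R)
z = H @ x + v

# Gaussian density of the noise: (2*pi)^(-m/2) |R|^(-1/2) exp(-v^T R^{-1} v / 2)
quad = v @ np.linalg.solve(R, v)
p_v = (2 * np.pi) ** (-m / 2) * np.linalg.det(R) ** (-0.5) * np.exp(-0.5 * quad)
print(z, p_v)
```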
The model in this case is a discrete-time model. x_k is the state of the system at time k and x_{k+1} the state at time k+1; M is the transformation that carries the state at time k to time k+1. In the linear case the dynamics are x_{k+1} = M x_k; in the non-linear case x_{k+1} = M(x_k, α), where M is a non-linear operator, x_k is the state at time k, and α is a parameter. If I know α, I know x_k, and I know the map M, I can compute what is going to happen at time k+1. M could represent the price of a stock: x_k is the price today, x_{k+1} the price tomorrow. M could be a model that predicts temperature: x_k the maximum temperature today, x_{k+1} the maximum temperature tomorrow. So time is discrete, the model can be linear or non-linear, the state of the system evolves in time, and I have N observations of the state, given by the z's. Now, the model states can be computed if I know the initial condition, the initial state: start the model at the initial state x_0; given x_0, x_1 can be found, then x_2, then x_3, using either the linear or the non-linear model. This sequence starting from x_0 is called the model evolution. I have observations at times k_1, k_2, ..., k_N, and the model has a state at time k_1, at time k_2, and so on. Our job is to make sure that my x_{k_i}, when operated on by the forward operator, gives z_{k_i}: H x_{k_i} = z_{k_i}, or h(x_{k_i}) = z_{k_i}. The problem stated above is called an inverse problem. Why is it an inverse problem? Look at this: I have an initial time and an initial state, and I have observations at times k_1, k_2, k_3 — let us assume N = 3 — and the observations are noisy. If I start at x_0, my model is going to generate a trajectory. At time k_1 there is a model-predicted value and there is an observation; likewise model-predicted value and observation at the other times. The differences between the model-predicted values and the observations are called errors, and I want to use these errors to estimate my initial condition. Knowing the solution at a few times, I want to find the best initial condition; that is the inverse problem here, just as knowing z and H and finding x is the inverse problem in the static case. So the static problem gives rise to an inverse problem and the dynamic problem gives rise to an inverse problem, and these inverse problems have different flavors with respect to the solution process. To understand these inverse problems, I am now going to talk about an underlying, fundamental classification of problems themselves. At a very high level, problems divide into the forward problem and the inverse problem. What is a forward problem? Given a matrix A and a vector x, I can multiply the matrix by the vector to get b; so find b given A and x — that is a forward problem. Another example of a forward problem: given a function h and a value x, evaluate h(x). For example, I have a polynomial and I would like to evaluate it at the point 2, or at the point 75; the polynomial is given, the point x is given, and I simply want the value of the polynomial. Another example of a forward problem: I am given a differential equation and an initial condition x_0; given the differential equation I can find the general solution, and given the initial condition I can find the specific solution that matches it.
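Here is a minimal sketch of the dynamic setup just described: a toy linear model x_{k+1} = M x_k run forward from a trial initial condition x_0, with the differences between the observations and the model-predicted values — the errors the lecture refers to. The matrices M and H and the observation values are hypothetical, chosen only for illustration.

```python
import numpy as np

# A toy linear dynamic model x_{k+1} = M x_k (M, H, and the data are illustrative)
M = np.array([[0.9, 0.1],
              [-0.1, 0.9]])       # state-transition matrix
H = np.array([[1.0, 0.0]])        # observe only the first state component

def trajectory(x0, n_steps):
    """Model evolution x_0, x_1, ..., x_{n_steps} starting from x0."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(M @ xs[-1])
    return np.array(xs)

x0 = np.array([1.0, 0.0])         # a trial initial condition
xs = trajectory(x0, 3)

# Hypothetical noisy observations at times k = 1, 2, 3
z = {1: np.array([0.95]), 2: np.array([0.78]), 3: np.array([0.62])}

# The "errors": observation minus model-predicted value at each epoch
errors = {k: z[k] - H @ xs[k] for k in z}
print(errors)
```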
These are examples of forward problems. In mathematics we are generally taught to solve forward problems: matrix multiplication, evaluation of polynomials, solution of differential equations. Much of our mathematical training — in calculus, probability theory, and other things — is spent learning how to solve forward problems. Why? Because using the methods for solving forward problems, we are called upon to solve inverse problems; in nature inverse problems are abundant, and in order to build expertise in solving inverse problems we first need to know how to solve forward problems. So what is an example of an inverse problem in this context? Given A and a vector b, find x such that Ax = b. Now you can see the difference: given A and x, finding b is the forward problem — I am given all the quantities on the right-hand side and simply need to evaluate the expression to get the left-hand side. In the inverse problem only part of the equation is known; something on one side or the other is missing, and I have to find the part that is missing. Finding x given A and b is x = A^(−1) b, where A^(−1) is the inverse of A; that is an example of an inverse problem. What is another example? Given a function h(x) and a number α, find the roots of the equation h(x) − α = 0. For example, here is the graph of h and here is the level α; the function may have this shape, and I would like to find the value of x here, the value of x here, and the value of x here — the values of x at which the function takes the value α. That is called finding the roots of the equation; if α = 0 I am essentially finding the roots of h(x) = 0. So root finding is an inverse problem, while evaluating a function is a direct problem. I want you to look at the difference between direct and inverse problems with respect to an ordinary differential equation. If I know the initial condition I can compute the solution. But what is the inverse problem associated with an ordinary differential equation? Here it is: I know the differential equation, and I would like to find the initial condition such that the solution of the differential equation matches given values at, say, three different times. In other words, the solution may come like this, and I would like to estimate this — the initial condition. So: given an ordinary differential equation and the values of the solution at a finite number of points, find the initial condition which, when the equation is solved forward, will match the values at those given instants in time. What is another inverse problem? Estimate, or retrieve, the sea surface temperature from satellite radiance. I already gave this example, where the equatorial Pacific is warmer by 3 to 4 degrees. Somebody has made these measurements, and the measurements are given in terms of the energy received by the satellite; the energy depends on temperature, so knowing the energy I have to recompute the temperature that led to the radiance. That is an inverse problem.
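The forward/inverse contrast above fits in a few lines. In this sketch (illustrative numbers of mine, assuming numpy and scipy) the forward problems are evaluating b = Ax and evaluating h(x); the inverse problems are solving Ax = b for x and finding a root of h(x) − α = 0.

```python
import numpy as np
from scipy.optimize import brentq

# Forward problem: given A and x, evaluate b = A x
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
x_true = np.array([1.0, -1.0])
b = A @ x_true

# Inverse problem: given A and b, find x with A x = b
x_recovered = np.linalg.solve(A, b)   # preferable to forming A^{-1} explicitly

# Forward problem: evaluate h at a point; inverse problem: find x with h(x) = alpha
h = lambda t: t**3 - 2.0 * t          # an illustrative function
alpha = 1.0
root = brentq(lambda t: h(t) - alpha, 0.0, 2.0)  # a root of h(x) - alpha = 0
print(x_recovered, root)
```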
So in many applications we have to solve the inverse problem, but before being able to solve the inverse problem I need to be able to solve the forward problem. Forward problem, inverse problem: data assimilation is intrinsically an inverse problem. Let me now summarize the discussion so far. Data assimilation has been called by different names: estimation theory, curve fitting, regression analysis, system identification; it is related to adaptive optimization; it is also called the inverse or retrieval problem. You can see that all of these are intrinsically related by the underlying mathematics of what is called data assimilation. This is what I meant by five blind men examining an elephant: you may be working in different areas, you may be looking at the elephant in different ways, but from an applied-mathematical perspective a person in machine learning, a person in estimation theory, a person in numerical analysis, a person in system identification, and a person in geology doing data assimilation are all talking about the same underlying process, and that is the essence of module 1.1. Now I would like to continue the discussion by summarizing it in a block diagram. What is the ultimate goal of doing all these things? The ultimate goal of data assimilation is prediction: data assimilation is part of the predictive process, part of predictive science if you wish to call it that, and prediction is fundamental to everything we do in life. In the process-based model setup, I have a model based on the process — causality based — and I have data, or observations; I bring the model and the observations together in a data assimilation scheme, and once I assimilate, I get what is called an assimilated model. This course is largely concerned with the data assimilation part of it. Once I generate an assimilated model, what do I do with it? I run the model forward in time, and the model run forward in time is the one that creates predictions. What is the unemployment going to be a month from now? What is the inflation going to be a year from now? What is the probability that there will be 13 inches of rain within 24 hours during the summer monsoon in Bombay? What is the probability that the city of Chennai will again see 15 inches of rain within a week? All these kinds of predictions are needed, and to be able to predict we need this process. That is the line of argument underlying data assimilation in the process-based scientific and engineering domains. In the empirical domain, by contrast, we have data, or observations; I do data mining, and the data mining provides an empirical model. Once the model and the observations are in hand, I do data assimilation, I get an assimilated model, and I make predictions. So that is the commonality. Empirical model building via data mining is very popular in today's world; data mining is a large and vast area of science. The interest in data assimilation and the interest in data mining arise largely from the need to predict; prediction is fundamental to all the things we do in life. That is the end of the first part of the module. In 1.1 we talked about data assimilation in the widest possible perspective. The aim was that even though each of us may be interested in our own sub-domain, we should be able to see what kind of an animal it is, how broad and how deep it is, and who the other people doing it are.
That way, if people from different disciplines come together they can exchange ideas, methods in one discipline can be transferred to another, and all the disciplines can benefit and grow from the interaction. I would like to promote that possibility; that is why I provided the broadest possible background — a canvas, if you wish to call it that — and painted as broad a picture of data assimilation as possible. Now I would like to go a little deeper into the relations between data mining, data assimilation, and prediction. Please understand: data assimilation is the crux of what we do; prediction is the reason why we do what we do; data mining is fundamental to everything we do in science. I am going to relate all three from my perspective: data assimilation, data mining, and prediction are parts of a continuum. To understand this continuum, I would like to re-emphasize that data mining is fundamental to model development. Please realize that data assimilation needs models and data. Where do models come from? Models always come from data mining. So I would like to talk about the fundamental role that data mining plays in the advancement of science. The development of data-specific empirical models relies heavily on data mining; recall what we did earlier — we computed correlograms and similarities of data. So data is used to develop models within the empirical context, and this involves analysis of correlations and narrowing down the choices of models. It turns out, historically, that early developments even in the causality-based models were driven primarily by data mining; data mining is not only used in empirical model building, it is also fundamental to how scientists over the ages have built models. I am now going to provide some instances to bring out the fundamental role of data mining in all of model development. Consider data mining in its early beginnings. Much of what we know in the physical sciences today had its origins in astronomy, with observations made by humans using simple celestial telescopes. These telescopes were used to observe various celestial objects at various times and record their bearings, and based on these astronomical observations the early pioneers developed various kinds of models. To sample a few of those efforts: Copernicus took us from the geocentric to the heliocentric system; Galileo gave us the notion of gravity; Kepler gave us the motion of the planets around the sun in elliptical orbits, which led to Kepler's laws. In my view Kepler was probably the master data miner, a model data miner. What did he have? He had some fifty years of observations. He very meticulously analyzed all these observations by hand, and he had the great ingenuity and imaginative power to condense all the observations into the simple basic laws that we have come to call Kepler's laws: every planet revolves around the sun in an elliptical orbit with the sun at its focus; in equal intervals of time the line from the sun to the planet sweeps equal areas, which gives the ability to compute the velocity; and so on. Simply based on observations he came up with basic, fundamental laws which, in my view, led to the birth of modern physics as we have come to know it. Newton then took those laws further and developed the grand theory of gravitation: the three laws of Newton and the Newtonian gravitational field. Now please look at this: all of these are empirical laws; there is no way to prove Newton's laws.
But Kepler had a beautiful vision; he was able to fit a model to the data. So Kepler's laws were empirically derived, and Newton's laws are also empirically derived, but they embodied the beautiful intuition that is considered the basis for establishing the foundations of modern physics as we have come to know it. This is only a small sampling of a long list of pioneers in physics; I am not even talking about biology, chemistry, or the earth sciences. In any and every branch of science there have been Keplers whose fundamental contribution began with a very careful analysis of observations. So now you can see that data mining plays an absolutely fundamental role even in the development of causality-based models; the relation of causality came after our understanding of Newton's laws, which we apply to everything — if there is a force there is an equal and opposite force; if there is a force and a mass there is an acceleration; and so on. These are some realistic examples of data mining within the context of the physical sciences. I have already talked about this: large volumes of astronomical observations were collected over decades and very meticulously analyzed by hand to formulate the following laws of nature — the heliocentric system, the laws of Kepler, gravitation. So within the physical sciences data mining has played a very fundamental role and still continues to do so. Almost all the laws in chemistry, all the laws in biology — the law of evolution that Charles Darwin enunciated is an empirical law. Data mining is fundamental to the advancement of any discipline, not only in the development of empirical models but in causality-based models as well. So what is data mining in its broader sense? Data mining is the process of extracting the structure or patterns that underlie, or are inherent to, the observed data. For example, Kepler did not know at the outset that the planets revolved around the sun in elliptical orbits, but the planets had been doing it all along; by careful observation and analysis he was able to see how the planets behaved. The ability to extract and explain the phenomenon that underlies the astronomical observations, and to summarize it in very simple laws, is one of the most fundamental examples of data mining. The patterns were already there; these patterns are essentially how mother nature generates the data. The fundamental importance of data mining is to understand the data generation process: if I can reproduce the process in the laboratory, that means I have understood the system. So the ability to extract the structure that underlies the observations, and to summarize it in the form of simple laws, is the ultimate goal of data mining — to understand, to quantify, the data generation process. I would like to emphasize that the data generation process is fundamental. Since the motion of the celestial objects inherently followed certain physical laws, with hard work and ingenuity the pioneers could discover many of the underlying laws that constitute the fundamentals of the physical sciences and engineering as we know them today. This is very fundamental to advancement in science; we have all been doing data mining without calling it data mining — it is analysis of data. Now I would like to fast-forward to modern times. Thanks to technology, we are living in a world where data is abundant. But let me go back for a moment.
If you look at the technology of Kepler's time, he had none of what we have: not the pencil as we have it, not the paper as we have it, not the pen as we have it. But observations were recorded in a particular way, and he was very good at analyzing them meticulously. The amount of data may look small from today's perspective, but given the technology it was a large data set for him at that time. Today the volume of data doubles every three years, thanks to technology: computers, large-scale storage devices, communication and sensor technologies. For example, many parts of the United States are prone to earthquakes, and tall buildings have been built in the Los Angeles basin and the San Francisco basin on top of faults; they know there is a fault, yet they build. So what is the challenge? They would like these buildings to withstand magnitude 8.5, or 7.5, on the Richter scale. To understand how a building behaves, what have they done? They have wired the bridges and the tall buildings with sensor devices, because if the earth shakes the buildings respond, and they would like to record continuously the response of the buildings and bridges to tremors. Tremors are happening all the time — some magnitude 2.5, some 3.2, some 4.5 — and by studying these observations for many different types of tremor, engineers can learn where cracks develop. So the ability to observe has vastly improved, thanks to sensor technologies, thanks to the ability to communicate, thanks to the ability to store. In fact, in the United States there is a data bank that holds the observations of all the radars ever since they were deployed — a single database system where you can go back and ask: what was that particular storm that affected Miami in 1972? what was the hurricane that hit the Philippines in 1985? and so on. So data is available in abundance, and today's interest in data mining spans a very vast array of topics: the physical, biological, and medical sciences. What is one data mining project of great interest in the medical sciences? We would like to relate the structure of the genes to the occurrence of diseases. It is pretty well understood that defects in the genes cause diseases, so they would like to identify what kind of defect causes pancreatic cancer, what kind of defect causes some other type of cancer — associating the defect with the disease — and once I know the defect, I can try to compensate for it. So in medicine there is tremendous interest in data mining, as there is in space exploration and all branches of engineering. I would like to give an engineering example of great interest in civil engineering. As you know, the United States has a highway system which is the envy of the world: east-west it spans about 3000 miles, north-south about 3000 miles — think of a square, 3000 by 3000 miles, with roads everywhere. One mile of a two-way highway costs about 5 million dollars; there are some ten parallel systems running east-west and ten running north-south, each well over 3000 miles, so you can see the amount of money invested in developing that infrastructure at 3 to 5 million dollars per mile. So what would they like to do? They would like to optimize the performance of the concrete. In the northern fringes of the United States the low temperature
can be minus 30 and the high 85, while in the southern fringes the high can be 110 and the low minus 10. When you subject road concrete to such wide variation, you need to understand its expansion and contraction properties. So in a laboratory they build concrete blocks of various compositions, subject them to different weather conditions, and run an artificial wheel over them — they can adjust the pressure, the speed, and the surrounding temperature — and they measure the wear and tear. They collect tons of empirical data, and once they have it, they would like to model the wear and tear of the pavement with respect to speed, temperature, and all the other quantities of interest. That is a developing science, and it has tremendous implications for the transportation budget, because maintaining these roads is very expensive. Then there are the environmental sciences, ecology, economics, the social sciences, finance, banking, sports and recreation. In sports and recreation you always want to predict which team will win the Super Bowl, or which basketball team will be the national champion; there is a lot of betting going on, so there is a lot of interest in prediction in sports. And of course government and private companies. What is an example of data mining with respect to government? A government wants to develop budget priorities; it would like to know what the net amount of revenue available will be, so that it can allocate this much for education, that much for transportation, that much for crime fighting, and so on. Somebody has to predict what budget will be available to spend on June 1st of 2016, and the prediction has to be made in December 2015, so that the legislators can take six months to decide how to divide the incoming budget among the various priorities. Every government agency, large and small — local, state-level, and national — has to have this mechanism for prediction, and that is a data mining project: you look at the revenue stream over the past 15 years, you model the revenue stream, and then you predict. So there are a lot of data mining projects involved; we will talk more about data mining later. Now, back to some of the early discoveries in astronomy; I would like to connect data mining with other developments — the development of calculus and the discovery of dynamic models by Newton. Newton and Leibniz are credited with the discovery of calculus, and once calculus was understood, the notion of a dynamic model came into existence: the notion of a differential equation. The first set of differential equations developed was for the motion of the earth around the sun, the motion of the planets around the sun; so the dynamics of planetary motion were well understood from the early days of Newton himself. So look at this progression: Copernicus, Kepler, Newton, the discovery of calculus, the discovery of dynamic models. Once the dynamic models are known, if I know the initial condition I can make predictions. So the chain — dynamic models, model building, data assimilation, data mining, prediction — was as clear in the days of Gauss and Newton as it is today, and you can see the interdependency between data mining, data assimilation, and prediction right from the early days. So once we had models, the potential for prediction, or forecast — forecast and
prediction are essentially the same, two synonymous words — and the ability to predict became a reality; we were able to predict some aspects of geophysical systems as early as the days of Kepler, Gauss, and Newton. We have talked about models, about data mining, about process-based and empirical models, and so on; now I would like to talk about data assimilation itself — the birth of data assimilation. The discovery of least squares heralds the beginning of data assimilation. Historically, Gauss was only 24 years old when he was given a very challenging problem. Astronomers had been observing an object called Ceres — an asteroid — and it went out of sight. They wanted to know where and when it would reappear: at what bearings, at what time. All Gauss had was a set of telescope observations of the past track of Ceres. Think of the problem: he was given only data relating to the various observed positions. What did he have to do? He had to use that data to build a model. He built a static model with several parameters, he combined that model with the observations, and he invented least squares as we know it today. So Gauss, using the model he created from the data, and using the data to estimate the unknown parameters, invented the notion of least squares, estimated the parameters of the model, and made a prediction of when and at what bearings the object would reappear — and sure enough, it appeared there. So Gauss is considered to be the father of modern data assimilation. The very first assimilated model was created by Gauss; not only did he create the prediction, the method he used to make it is the least squares method, and the least squares method is the workhorse of the data assimilation industry. He used this model to predict the location and time of the reappearance of the lost astronomical object, and that was the first success story in which data mining, data assimilation, and prediction were all rolled into one. It was in this context, while trying to fit a model to the observations, that he found the observations were not consistent: he knew the observations had errors. It was at that time that he also, simultaneously, discovered the Gaussian distribution as we now know it — the bell curve. So you can see: at that one moment he discovered the distribution of observation errors via the bell curve, he invented the notion of least squares, he developed the first assimilated model, he solved one of the first inverse problems, and by solving the inverse problem he made a prediction which was verified. That is a glorious moment in the annals of the data assimilation literature. This work by Gauss helped lay the foundation for data assimilation; it led to the development of the method of least squares as we know it today, and the method of least squares still continues to dominate the theory and practice of estimation of unknown parameters in everything we have talked about, ranging from curve fitting to static regression to the other problems we already discussed. By this time Gauss had also co-invented the notion of statistical analysis of observation errors — the bell-shaped curve that introduced the notion of the Gaussian, or normal, distribution. Now, what is the goal of data assimilation? Why do we do all these things? Why trouble ourselves with all this? Ultimately, we want to be able to predict: the ultimate goal of data assimilation is to generate predictions.
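Before continuing, here is a minimal sketch of the least squares idea just attributed to Gauss, in Python. It fits the parameters of a simple static model to noisy observations by minimizing the sum of squared residuals, then uses the fitted model to predict a future value. The quadratic model and all the numbers are hypothetical stand-ins, not Gauss's actual orbit computation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setting: fit a quadratic "track" y = a + b*t + c*t^2
# to noisy position observations, in the spirit of Gauss's parameter estimation.
t = np.linspace(0.0, 10.0, 25)
true_params = np.array([2.0, -0.5, 0.1])
y_obs = true_params[0] + true_params[1] * t + true_params[2] * t**2 \
        + 0.05 * rng.standard_normal(t.size)   # observation noise

# Design matrix of the static model; least squares minimizes ||G p - y||^2
G = np.column_stack([np.ones_like(t), t, t**2])
p_hat, *_ = np.linalg.lstsq(G, y_obs, rcond=None)

# Prediction: evaluate the fitted model at a future time ("where will it reappear?")
t_future = 12.0
y_pred = p_hat @ np.array([1.0, t_future, t_future**2])
print(p_hat, y_pred)
```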
Remember Gauss: he was interested in predicting where the object went and when it would reappear. Right from Gauss's days, the need for prediction has become all-pervasive. Each of us, at one time or another, has looked into a crystal ball to see what our own future looks like; some people go to astrologers, some do not believe in astrology but use other means to ask how their life will evolve. We are all interested in our own future, and this need to predict how things will evolve is of fundamental interest to all of mankind. Here are some scientific examples. Predict the path of a hurricane: I have seen a hurricane first appear as a low-pressure system on the western fringes of Africa; the easterlies carry it towards the Americas, and we would like to predict whether it will hit Cuba and get into the Gulf, or go to Bermuda and then turn right back out to sea, or cut into the United States. These kinds of predictions are very much needed, especially from August through December, which is a very active hurricane season; the hurricane research center in Miami is extremely active in predicting the paths and intensities of hurricanes. Governments at all levels want to generate revenue projections to develop budget allocations for the next fiscal year; we already talked about that. Another example of prediction comes from the National Transportation Safety Board. Suppose a plane has crashed — planes unfortunately do crash. After a crash happens they bring in specialists, they collect the debris from the debris field, and they try to reconstruct and judge what must have brought the plane down: was it a crack, was it a bomb? It is a kind of re-analysis they have to do, from the observations back to the cause, and any such analysis is an inverse problem. If I had known beforehand I would have been able to predict; but everything seemed fine, the plane took off, and then something happened — there was a failure — so from the failure data I have to estimate the cause of the failure. I want to tell you who else is doing this kind of analysis; we are not the only ones. The NTSB, the National Transportation Safety Board, every time it is called upon — whether for a train accident, a bus accident, or a plane accident — brings in experts in dynamics of various types who look at the crash structure and reconstruct the cause of the failure. Crime scene investigation is similar: the police always arrive at the scene after the crime was committed — if they had been there, the crime might not have been committed. So once the crime has been committed, they meticulously take all the observations and analyze them: there was the firing of a gun; which direction could it have come from? what kind of bullet was used? You may remember the TV franchise CSI: Crime Scene Investigation — CSI: Miami, CSI: New York — a very popular show in the United States, in which a team of experts analyzes a case to retrieve who committed the crime, and by what means. Medical diagnosis runs from symptoms to cure: I have symptoms, and diagnosis is an inverse problem, because from the observations I have to diagnose what the problem is; once I know the problem I can then cure it. So symptoms to diagnosis to cure is at once an example of a data assimilation problem, a prediction problem, and a modeling problem. I will now give some historical perspective, because the course
is going to be largely for a scientific audience: prediction within the context of meteorology, a historical view. I am going to provide a quick review of some things specific to the atmospheric sciences. Vilhelm Bjerknes, in 1904, was the first to propose the weather prediction problem as an initial value problem in physics: the weather evolution can be described by the primitive equations, so forecasting is, mathematically, essentially an initial value problem. Richardson, in 1922, made the first attempt to numerically generate a forecast. Many of us know the story: he used humans as the computers. He arranged people in an array — a 6 by 6 array — each representing a point on the grid; each performed certain local calculations, exchanged the results with their neighbours, and the computation evolved in time. Even though his experiment did not meet with success — it failed due to numerical instability — almost all the work we do today can be related back to what Richardson did; he laid the foundations of modern predictive science as it is used in the geophysical sciences, and his effort continues to be very inspirational. In 1950, Jule Charney and his team, using the very first electronic computer (the ENIAC), made the first 24-hour forecast of the transient features of the large-scale flow using a simple barotropic model, and with that began the modern era of numerical weather prediction. So this is a very short history of prediction within the atmospheric sciences. You can see that it all began around 1801 with Gauss, and within a span of 150-plus years it spread to the other sciences; this is a very short account of what happened within the meteorological sciences. Starting from the late 1950s, operational forecasting centres became very interested in predicting large-scale waves; Sweden, the United States, and the UK led those efforts globally.
In 1963 the great meteorologist and mathematician Edward Lorenz discovered the phenomenon of chaos. Chaos relates to the sensitivity of the model to its initial conditions: if you are trying to do data assimilation and estimate the initial conditions, and the model solution is very sensitive to the initial condition, you will have many more challenges — troubles, headaches. If the model is chaotic, data assimilation is very difficult; that became very apparent with Lorenz's fundamental work and the discovery of the phenomenon of chaos in 1963. With that also came the notion of what is called the predictability limit. What does it mean? Certain phenomena we can predict a hundred years ahead — lunar and solar eclipses. In fact there is a famous saying that goes somewhat like this: an astronomer can predict where the moons of Jupiter will be at midnight tonight, but he has no way of knowing where his teenage daughter will be at midnight tonight. The behavior of the teenage daughter is very different from the behavior of the moons of Jupiter: one phenomenon is perfectly predictable, the other totally unpredictable. So that brings in the notion of a predictability limit. If there is a predictability limit, we need to be able to create good forecasts and we also need to be able to say over what time span the forecast will hold. If you look at TV weather predictions in the United States, some stations will give you a week's prediction, some ten days. If you track them carefully — the 10-day prediction from today, the 9-day prediction tomorrow, the 8-day prediction the day after — you can see the variance of the predictions for a given day made 10 days, 8 days, 5 days, 3 days, 2 days, and 1 day ahead. This variance is very large, and that goes to show that the weather at 10 days is not predictable: the practical predictability limit for weather is hardly 3 to 5 days, and we cannot predict better than that. So the notion of a predictability limit became very fundamental. I talked about the importance of inverse problems in earlier parts of today's lecture; now I would like to give a brief history of inverse problems. Remember, I have been giving historical background for prediction, for inverse problems, for data mining, and so on, because I think it is better to understand the history in a fundamental way, so that we can appreciate the connections between what we do in our own trenches and the vastness of the field. The inverse problem has a rich and deep history. The first example began in 1823, when Abel formulated and solved what is often called the very first inverse problem: determine the shape of a hill from data related to travel times. What is the idea? Pretend there is a hill, and a ball sliding down the hill, and that I have been able to observe the positions of the ball at various times. Looking at the positions of the object at those times, my job is to reconstruct the shape of the hill down which the ball has been rolling. That is an inverse problem — the very first one formally stated and solved.
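Returning for a moment to Lorenz: his 1963 system makes the sensitivity to initial conditions easy to see numerically. The following minimal sketch integrates two initial conditions differing by 10^-8 with a standard fourth-order Runge-Kutta scheme and prints their separation, which grows roughly exponentially until the trajectories decorrelate; the parameters and step size are the standard textbook values, not from the lecture.

```python
import numpy as np

def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of Lorenz's 1963 system."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_step(f, state, dt):
    """One fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

dt, n_steps = 0.01, 2500
a = np.array([1.0, 1.0, 1.0])
b = a + np.array([1e-8, 0.0, 0.0])   # a tiny perturbation of the initial condition

for k in range(n_steps):
    a, b = rk4_step(lorenz63, a, dt), rk4_step(lorenz63, b, dt)
    if k % 500 == 0:
        print(k * dt, np.linalg.norm(a - b))  # separation grows with time
```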
Another fundamental work in the annals of inverse problems is by the mathematical physicist Viktor Ambartsumian, in 1929. He was trying to relate the energy levels in the atom to the eigenvalues of certain differential operators that relate to the states and energies, and he said the following: if I know the operator, I can compute the eigenvalues — that is the easy problem, just as given a matrix I can compute its eigenvalues. But, for the first time, he asked the converse question: if I specify a set of eigenvalues, what kind of operator will endow itself with that eigenvalue structure? Given a matrix I can compute the eigenvalues; but what is the matrix for which 3 and 5 are the eigenvalues? Going from the matrix to the eigenvalues is the forward problem; from the eigenvalues to the matrix is the inverse problem. These are considered to be the very first instances of mathematicians formulating and solving important inverse problems. Inverse problems occur routinely; here are some examples. In medical imaging: how do doctors find a tumor? You go through the CAT scan; they take pictures, they analyze the pictures, and from the pictures they have to tell whether you have a particular disease or not. That is an inverse problem. Another: an aircraft has been flying for 30 years, and I would like to verify that the aircraft body has no cracks. I have to test it, because a crack is not visible from the outside and could be very dangerous to the functioning of the aircraft. So what do they do? They send a very fine acoustic signal through the body, receive it at the other end, and based on the properties of the received acoustic signal they can say whether there is a crack or not. That is what is called acoustic tomography. Another place where inverse problems occur is in physics. In solving the usual mechanics problems we assume the acceleration due to gravity is 980 centimeters per second squared; that is the standard value, but it is an average. If you take a gravimeter to different places — take a city like London and carry the gravimeter around — the gravity changes from place to place. There are fine anomalies in the value of gravity, and beautiful instruments have been designed to measure them. What is the use of these anomaly measurements? The local anomaly depends on what is below the surface of the earth: if there is hard rock the gravity is more, if there is loose sand the gravity is less. So geologists in their exploration run the gravity anomaly meter and map the gravity field, and once the gravity field is well mapped, they solve an inverse problem to discover oil, water, hard rock, granite, and so forth. That is an inverse problem. Another inverse problem we have already seen: the retrieval of the sea surface temperature from satellite measurements. These are called inverse problems; they are also called retrieval problems. So, in trying to summarize our discussion of the development of predictive science, there are basically three steps. Step 1 is the data mining step; that is an inverse problem at the conceptual level, at the highest level. The aim at this level is to discover the basic laws; these are encapsulated as models, which are the fundamental basis for analysis, and they are discovered by analyzing data. Examples include Kepler's laws and the models of the atom in the early 1900s. Do we know the atom in full? No — we can do lots of things in physics and in materials science, but we still do not understand the atom in full; nobody has seen an atom yet, and the atomic model still evolves.
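The forward/inverse eigenvalue pair Ambartsumian asked about has a simple matrix analogue, as the lecture notes. In this sketch the forward problem computes the eigenvalues of a given matrix, and the inverse problem constructs a matrix with prescribed eigenvalues 3 and 5 via a similarity transform — which also shows that the inverse solution is not unique. The particular matrices are arbitrary illustrative choices.

```python
import numpy as np

# Forward problem: given a matrix, compute its eigenvalues.
A = np.array([[4.0, 1.0],
              [1.0, 4.0]])
print(np.linalg.eigvals(A))        # -> 5 and 3

# Inverse problem: given eigenvalues {3, 5}, construct a matrix having them.
# The answer is not unique: any similarity transform S D S^{-1} of the
# diagonal matrix D = diag(3, 5) works.
D = np.diag([3.0, 5.0])
S = np.array([[1.0, 1.0],
              [0.0, 1.0]])         # an arbitrary invertible matrix
B = S @ D @ np.linalg.inv(S)
print(np.linalg.eigvals(B))        # -> again 3 and 5
```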
You all know that in 2012 there was a big buzz about the discovery of a new particle, the Higgs boson, which has been called the God particle. As one puzzle is solved, another comes into play. At the accelerator centre in Geneva they continue to do the experiments; almost all the particle physicists in the world have joined together in trying to uncover what is there. What do they do? They smash atoms together at very high speed; at the moment the collision happens, a lot of energy is splashed out; they take pictures of these energy splashes, and from the pictures of the tracks of the objects they identify: this track must have been caused by this particle, that track by that kind of particle. Looking at the tracks produced at the time of the collision is fundamental to uncovering what must have collided — that is an inverse problem. Even today, after nearly 200 years of modern physics, we are still trying to complete the picture of the atom. Darwin's theory of evolution is another example of an inverse problem. And what are the models of interest today? I would like to identify credit card fraud: what kind of profile does a person have who will commit fraud by not paying their credit card? That is an example of a data mining project. You can see that the credit card information is the only thing we have; there is no law, there is nothing else. I would like to mine this data to develop a profile of such a person, and these profiles, once developed, are given to the bank; when you apply, your indicators are compared with the profile and the bank predicts: yes, his history tells me he does not fit the fraud profile, we can grant him a loan of 100 lakhs to buy a house, and so on. Step 2, data assimilation, is an inverse problem, and it is a computational problem; data assimilation is largely an applied-mathematical and computational discipline. The goal is to estimate the unknown initial conditions, boundary conditions, and parameters, and to create an assimilated model. Step 3: once the assimilated model is created, we run the model forward in time to generate the forecast. Developing a forecast is a direct problem, because you simply need to run the model forward in time, which everybody knows how to do. So, to be able to solve the direct, or forward, problem of prediction, I had to solve two inverse problems: one to develop the model, another to do the data assimilation. Inverse problems at two levels and a forward problem at the third level are intertwined, and together they complete the picture of what underlies predictive science. Here are some references. Tarantola (1987), Inverse Problem Theory, is written largely for a geophysical audience; the language is couched largely for geophysicists. Menke, Geophysical Data Analysis: Discrete Inverse Theory, is a very nice little book. Liebelt (1967), An Introduction to Optimal Estimation. There is a book on inverse problems published by the Mathematical Association of America that is written for undergraduate students — a very beautiful, delightful book on simple inverse problems that you can introduce even to undergraduates. Narendra and Annaswamy have written a very beautiful book, Stable Adaptive Systems — you remember system identification and adaptive systems. Bishop, Pattern Recognition and Machine Learning. Tan, Steinbach, and Kumar, Introduction to Data Mining. Hamilton, Time Series Analysis. These are some references taken from different disciplines; they all have a common theme — the examples are different, but the mathematics that
underlies them is always one and the same. So I hope I have given you a broad overview of data assimilation, data mining, and predictive science. This completes our lecture 2. We will talk about the various mathematical prerequisites over the next two or three classes; once the mathematical prerequisites are covered, I will delve into the different types of data assimilation problems in the coming lectures. Thank you for your patience.