Let us go to the other techniques, namely genetic programming and model trees. You may be aware of genetic algorithms. Genetic programming shares the concept of the genetic algorithm as far as the Darwinian principle of survival of the fittest is concerned. It is similar to genetic algorithms, but in the end it gives you a computer program or an equation. When we say computer program, we mean an algorithm involving various operations, if-then-else and so on, as in any computer algorithm. Because of that, genetic programming is more suitable for determining the input-output dependency structure, that is, for carrying out regression, rather than for carrying out optimization as in the case of the genetic algorithm.
Basically, in genetic programming you have what are called functions and terminals. By terminals we typically mean the causative parameters, say the radius of the ski-jump bucket, the lip angle, the discharge, the head, etc. By functions we mean mathematical or logical operators like plus, minus, multiplied by, divided by, logarithm, if-then-else, do, and things like that. Innumerable combinations of these function sets and terminal sets are made. Every time, for a new combination, the strength of that combination is determined by applying a fitness criterion, usually in the form of a root mean square error between the outcome that the particular combination produces and the actual output. In that way you go on finding new combinations and ultimately end up with the fittest combination, which is the answer at the end of the genetic programming. Every time, we formulate the new combination by operations like reproduction, mutation and crossover.
Now let us see how these operations are done in genetic programming. Suppose you consider an equation or expression: minus q plus root pi divided by 3p.
Now we represent this equation in the form of a tree structure, which is the basic idea in genetic programming that differentiates it from the genetic algorithm. You have this equation, minus q plus root pi divided by 3 into p, and it is represented in the form of a tree structure. There are three operations: reproduction, crossover and mutation. By reproduction we mean just allowing the same expression to continue into further processing. Crossover means that a random node is selected in each of two parents and the subtrees below those nodes are swapped. For example, this is one node of this parent and this is another node of that parent. You put this subtree here and that subtree there, and you get another expression. Find out its fitness, and if it satisfies your fitness criterion, go ahead. Similarly, in mutation one subtree is replaced at random by another one. For example, this is one subtree; you randomly replace it by another subtree, 2 into p, and you get another expression. These are all mathematical operators, but you can also play with logical operators and develop computer programs accordingly.
So the procedure followed in GP is like this. We select a random population of individuals, which could be expressions, equations or computer programs. Then we evaluate the fitness of each individual, and based on that fitness we select parents for further processing. These parents are then made to yield offspring, that is, other individuals, by the processes of reproduction, mutation and crossover which we have seen.
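As a rough illustration of the tree representation, crossover, mutation and the RMSE fitness described above, here is a minimal Python sketch. The nested-tuple encoding, the operator names and every function name are assumptions made for illustration, not part of the lecture or of any GP library. The grouping of the example expression as (-q + sqrt(pi)) / (3 * p) is also an assumption, since the spoken form is ambiguous.

```python
import math
import random

# Expression trees as nested tuples: (operator, child, ...).
# Leaves are variable names (str) or numeric constants.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": lambda a, b: a / b,
    "neg": lambda a: -a,
    "sqrt": lambda a: math.sqrt(a),
}

def evaluate(tree, env):
    """Recursively evaluate a tree using the variable values in env."""
    if isinstance(tree, tuple):
        op, *children = tree
        return OPS[op](*(evaluate(c, env) for c in children))
    if isinstance(tree, str):
        return env[tree]
    return tree  # numeric constant

def paths(tree, prefix=()):
    """List the position of every node as a tuple of child indices."""
    found = [prefix]
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            found.extend(paths(child, prefix + (i,)))
    return found

def get_subtree(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace_subtree(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_subtree(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(parent_a, parent_b, rng):
    """Pick one random node in each parent and swap the two subtrees."""
    pa = rng.choice(paths(parent_a))
    pb = rng.choice(paths(parent_b))
    sub_a, sub_b = get_subtree(parent_a, pa), get_subtree(parent_b, pb)
    return (replace_subtree(parent_a, pa, sub_b),
            replace_subtree(parent_b, pb, sub_a))

def mutate(tree, new_subtree, rng):
    """Replace one randomly chosen subtree with new_subtree."""
    return replace_subtree(tree, rng.choice(paths(tree)), new_subtree)

def rmse(tree, samples):
    """Fitness: root mean square error over (env, observed) pairs."""
    errors = [(evaluate(tree, env) - y) ** 2 for env, y in samples]
    return math.sqrt(sum(errors) / len(errors))

# Assumed grouping of the lecture's expression: (-q + sqrt(pi)) / (3 * p)
expr = ("div", ("add", ("neg", "q"), ("sqrt", "pi")), ("mul", 3, "p"))
# The mutation mentioned in the lecture: swap in the subtree "2 into p"
mutated = replace_subtree(expr, (2,), ("mul", 2, "p"))
```

A real GP run would also need random tree generation, parent selection and protection against invalid operations such as division by zero, which this sketch leaves out.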
In the GP procedure, you continue creating offspring until a specified number of offspring in a population has been produced and until a certain number of generations of such populations has been created. The best offspring in the entire generation process is the solution of the problem; it need not be the last one.
Genetic programming actually started after its presentation by Koza around the year 1992, but its application in water resources started only in the year 1999. It has been used for various purposes like classification, regression and pattern recognition, which means, given partial information, finding the full information, and so on. Typical applications are like this. People have applied it to rainfall-runoff modeling, to evaluate runoff from the given input of rainfall at various rain gauges. People have also applied GP to estimate suspended solids in water treatment plants. They have used GP to estimate groundwater levels and to determine the settling of faecal pellets in wastewater management. They have also used GP to obtain the resistance to open channel flow because of vegetation, that is, vegetation-induced resistance. In ocean science, people have applied genetic programming to evaluate certain ocean components like phytoplankton from the given information of sunlight, luminance, reflectance and so on. So these are some of the works involving the application of GP which people have done since 1999. This was a brief exposure to GP; there are many finer points which we have to learn by going through the appropriate literature.
Now, similarly, let me give you the concept of the model tree. The model tree represents a computation process in which we build piecewise linear models. The input domain is divided into subdomains, and a multiple linear regression model is fitted within each subdomain. For example, consider this case.
Let us say you have two input variables, X1 and X2, and Y is the output variable. The input space of X1 and X2 is to be divided into a number of sub-regions using some kind of similarity criterion, or, to say the same thing in other words, some kind of elimination criterion. Having collected similar examples, you fit linear models in each sub-region and use the whole set together as a global model in the end, to answer any new query or to get the output for any new input. Typically, for example, see this: if X2 is greater than 2, you follow this path. If it is not greater than 2, you follow the other path. If X2 is less than 2, you further see whether X1 is less than 4. If yes, you fit one model, this model number 4. So for model number 4 you have X2 less than 2 and X1 less than 4, and you fit one linear model. If X1 is not less than 4, you further see whether X2 is more than 1. If X2 is more than 1 you fit model 6, and if X2 is less than 1 you fit model 5. Similarly you follow the other paths, identify the similar patterns and fit individual models to them.
Now, how do we select these split values? There are different algorithms which use different criteria, but there is one algorithm called the M5 model tree. The name is just a label, rather like a model number; it was something like the fifth trial made by these people. It was given by Quinlan, and as the criterion it uses the standard deviation of the class values that reach a node, or decision box, in relation to the standard deviation of the total data set. There may be other elimination criteria, but standard deviation reduction is a criterion which past investigators have found acceptable and which yields satisfactory results.
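The example tree just described can be sketched as nested conditions that route an input to one linear model. This is a minimal Python sketch under stated assumptions: the coefficients are hypothetical placeholders, because the lecture gives only the split structure; the X2 > 2 branch is left unspecified because it is not spelled out; and sending the boundary cases X2 = 2 and X2 = 1 to the lower side is an arbitrary choice.

```python
# Hypothetical coefficients (a0, a1, a2) for y = a0 + a1*x1 + a2*x2.
# The lecture gives only the split structure, not the fitted models.
LEAF_MODELS = {
    "model 4": (1.0, 0.5, 0.2),
    "model 5": (0.3, 1.2, -0.4),
    "model 6": (2.0, -0.1, 0.8),
}

def route(x1, x2):
    """Follow the splits described for the X2 <= 2 side of the tree."""
    if x2 > 2:
        return None  # this branch is not detailed in the lecture
    if x1 < 4:
        return "model 4"
    if x2 > 1:
        return "model 6"
    return "model 5"

def predict(x1, x2):
    """Apply the linear model of whichever leaf the input falls into."""
    leaf = route(x1, x2)
    if leaf is None:
        raise ValueError("branch X2 > 2 is not specified in this sketch")
    a0, a1, a2 = LEAF_MODELS[leaf]
    return a0 + a1 * x1 + a2 * x2
```

Each prediction is linear within its sub-region, so the overall model is piecewise linear.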
In M5 you go on dividing in this way until the points that end up in each bin have a standard deviation which is only, say, 5% of the standard deviation of the total data. The algorithm has been developed on that basis.
There are two further points to be noted with respect to MT algorithms. Many times it is found that the splitting becomes so extensive that overfitting occurs, which means the individual patterns are learned but no generalization is carried out. In that case you have to prune the large tree, using techniques which involve merging a few subdomains or sub-regions together. Another difficulty that people noted earlier is that when there is a very small number of training examples, there are large discontinuities between neighboring linear models. Therefore, in order to have a smooth variation from one model to the other, they have used certain mathematical functions called smoothing functions. Application of the MT procedure will usually involve both pruning and smoothing.
The advantage of MT over other approaches like ANN or GP is that you can easily understand its outcome, and it can be replicated very easily; the results are highly portable compared to ANN. But one thing to be noted is that ANN and GP are purely non-linear models. They do not involve any kind of assumption beforehand, whereas MT is piecewise linear: it assumes that it is possible to divide the input space into subdomains such that a linear model can be applied within each subdomain. So in that way it is not regarded as a purely non-linear fitting technique.
As regards the applications of model trees, they have been applied mostly to the same type of problems as GP.
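The standard deviation reduction criterion and the 5% stopping rule mentioned above can be sketched as follows. This is a hedged Python sketch: the function names, the use of the population standard deviation and the single-input midpoint search are illustrative assumptions, not a reproduction of Quinlan's implementation.

```python
import statistics

def sdr(parent, parts):
    """Standard deviation reduction: sd(parent) minus the size-weighted
    standard deviations of the subsets produced by a split."""
    n = len(parent)
    weighted = sum(len(p) / n * statistics.pstdev(p) for p in parts)
    return statistics.pstdev(parent) - weighted

def best_split(xs, ys):
    """Try each midpoint threshold on one input variable and return
    (threshold, reduction) for the split with the largest reduction."""
    pairs = sorted(zip(xs, ys))
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal input values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = sdr(ys, [left, right])
        if gain > best[1]:
            best = (threshold, gain)
    return best

def should_stop(node_ys, total_sd, fraction=0.05):
    """Stop splitting once the node's standard deviation is only a small
    fraction (say 5%) of the standard deviation of the whole data set."""
    return len(node_ys) < 2 or statistics.pstdev(node_ys) <= fraction * total_sd
```

With several input variables, the same search would be repeated for each variable and the split with the largest reduction kept; pruning and smoothing are not shown here.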
Applications of model trees have been reported only for the last three years or so. People have applied the technique to evaluate the rainfall-runoff relationship. They have used it to come up with the river stage and discharge relationship, or rating curves. They have also come up with studies which estimate sediment transport in open channel flows, and they have predicted river discharges during floods. People have also compared the model tree technique with artificial neural networks, and they find that the results of model trees are either similar to those of ANN or sometimes a shade better. We also found similar results in our application pertaining to wave analysis. See, for example, the work of Bhattacharya and Solomatine; in fact there is one investigator, Dimitri Solomatine in Delft, who has been very actively pursuing this technique. Many of his students have used it, and they report that it is either similar to ANN or produces results which are marginally better in some cases. But where it will work and where it will not is very difficult to say beforehand. Sometimes they find that high river flows or low river flows are better handled by model trees, whereas middle-range values are better predicted by other techniques like ANN.