Good morning, everybody. Welcome to our continuing colloquia recognizing our senior faculty. This is a program that was developed a few years ago to give our senior faculty, namely full professors who have been in rank for more than seven years, a chance to present their work, talk about their experiences, and talk about the way they got to where they are. Once they present this colloquium, they get a chance to talk to the department head, or in this case of course the dean, about the next seven years. Today we have the great pleasure of having Professor Rao Govindaraju, our department head in civil engineering. He got his PhD in 1989 from the University of California, Davis, and before coming to Purdue he worked as an assistant and associate professor at Kansas State. He joined Purdue in 1997, and he is currently the Bowen Engineering Head of Civil Engineering and the Christopher B. and Susan S. Burke Professor of Civil Engineering. His primary areas of research include surface and subsurface hydrology, contaminant transport, watershed hydrology, and statistical hydrology. I will stop right there before I go any further. We're really excited to hear what he has to share with us today.

All right. Thank you very much, Claude, and thank you all for being here. I have attended all of these symposia where civil engineering faculty were presenting, including some from outside the department, so it has been a good learning experience. As Claude points out, typically after this presentation the person has a conversation with the department head; so, Claude, I promise I'll have a very stern conversation with myself. This talk actually gives me a chance to marshal my thoughts, as Claude mentioned: to see what I have been doing, take stock of where I am, and perhaps also try some forecasting as to what I think I'll be doing in the future. I should mention that even though as faculty we do many things that fall in the categories of discovery, learning, and engagement, or research, teaching, and service, I have focused this talk on the research part, primarily because that is where almost all my scholarship is; that is where I have done research, published papers, and so on. I should also put the word "experiments" in quotes, to indicate that I'll be talking not just about physical experiments but also about numerical experiments, theoretical experiments, and so on. As Claude mentioned, my broad areas of research interest are surface and subsurface hydrology, contaminant transport, and related topics, but I'll be focusing mostly on surface and subsurface hydrology in this talk. My research drivers are typically stochastic processes; I look at scaling behavior, and I am very interested in spatial heterogeneity, uncertainties, and risk. The topics I will talk about borrow from these areas, and I'll present a mix of experimental, theoretical, and numerical work. As I do this, I'll take examples from some of the graduate students who have worked with me and try to recognize them along the way. I first want to acknowledge some of the funding agencies, though not all. I have been fortunate to have had my research supported by a diverse set of agencies, including some international funding agencies.
Even though I was instrumental in preparing proposals for the international work, most of that funding ultimately went toward travel and toward conducting experiments there; it did not per se support graduate students or my summer salary, but it provided me with many opportunities to do very interesting work, so I wanted to recognize those agencies as well.

Let me start with something on pore-scale mechanics; those of you who work in porous media will recognize this. This is Jack Chan, one of my master's and PhD students from way back, a very bright chap. What he was doing was essentially using mathematical morphological operations and image analysis techniques; those of you in the geomatics area will be familiar with these. When we look at images, we can apply operations like erosion and dilation to extract features, and he was using them to study pore-scale properties of porous media. This is an example with a single pore: a 3D voxel image of one pore, where he is looking at what happens at different pressures, how water enters the pore and how water leaves it. The red portion you see is the air phase. If you have air and water in a pore, then at different water pressures air will invade the pore space. The pore is fully saturated first and then drains, meaning air slowly enters; the value of D indicates pressure, and you can see how air enters that single pore at different pressures. In the reverse process, wetting, as the water pressure increases, water comes back to invade the pore and eventually drives the air out. Even with a single pore, you can demonstrate hysteresis effects, sometimes called the ink-bottle effect in hydrology. So that is a single pore at the pore scale. What Jack did next was construct a porous medium in the computer: a 3D image where all the pore space is conceptualized as intersecting spheres. Any pore shape can be approximated as closely as you want using spheres, and this is one image of that; the blue portion is the pore space. Then, again using image analysis techniques, we were able to show how a soil that is completely saturated is slowly invaded by air and becomes unsaturated. This allows us to account not only for ink-bottle effects but also for how well the pores are connected, which is something previous methods could not do. What you are seeing is how air invades the pore space at different pressures, and the reverse during wetting, as water enters and drives the air out. These are fairly fundamental things that we need for understanding subsurface hydrology. So this was all done on images first.
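For readers who want to experiment with the idea, here is a minimal sketch of how morphological operations can mimic drainage on a binary pore image. This is not Jack's actual algorithm; it uses a simple morphological opening (erosion followed by dilation) with a spherical structuring element whose radius stands in for capillary pressure via the Young-Laplace relation, and it ignores the connectivity and accessibility of the invading air phase, which a real analysis must handle.

```python
# Hedged sketch: morphological opening as a proxy for air invasion
# (drainage) of a 3D pore image at a given capillary pressure.
import numpy as np
from scipy import ndimage

def ball(radius):
    """Boolean structuring element: a voxelized sphere."""
    r = int(radius)
    z, y, x = np.ogrid[-r:r + 1, -r:r + 1, -r:r + 1]
    return x**2 + y**2 + z**2 <= r**2

def drained_air_phase(pore, radius):
    """Opening of the pore space: the region a sphere of this radius
    can occupy, a stand-in for the air phase after drainage at the
    corresponding capillary pressure (connectivity ignored)."""
    se = ball(radius)
    eroded = ndimage.binary_erosion(pore, structure=se)
    return ndimage.binary_dilation(eroded, structure=se)

# Toy porous medium: pore space as a union of fully penetrable spheres,
# in the spirit of the model described in the talk.
rng = np.random.default_rng(0)
pore = np.zeros((64, 64, 64), dtype=bool)
zz, yy, xx = np.indices(pore.shape)
for _ in range(40):
    c = rng.integers(8, 56, size=3)
    pore |= (xx - c[0])**2 + (yy - c[1])**2 + (zz - c[2])**2 <= 6**2

# Increasing capillary pressure -> smaller invading radius -> more drains.
for r in (6, 4, 2):
    air = drained_air_phase(pore, r)
    print(f"invading radius {r}: air fills {air.sum() / pore.sum():.2%} of pore space")
```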
The next question was how to go from these images to actual soil properties. As an example, this is an image of a loam soil, an SEM image (you can also use MRI images). Using the actual image and the theories we developed, we were able to construct what is called a soil water retention curve, which describes the water content of the soil at different water pressures; it is a very fundamental property. The symbols you see are actual experimental measurements, and the other curves are older, competing theories; but those theories' predictions are based on fitting the data you already have, whereas the theory we developed with Jack uses the image itself and predicts what the curve should be. These experiments are long and painstaking to do: a soil water retention curve measured with a pressure plate apparatus can take as much as six months to a year to produce one single graph like this. From there we can also predict the hydraulic properties of the soil at different levels of saturation; again, the symbols are measurements, this curve is our theory, and the other curves show what the existing theories would give you. Now, you could argue, "Well, GS, you chose one example where your theory worked very well," so I want to point out that Jack did this for 120 different soils. For 120 different soils, using stochastic theories of either impenetrable spheres or fully penetrable spheres, we can predict what the soil properties should be from bulk properties like porosity and interfacial surface area, which are easily measurable. [Question from the audience.] That's right, they are not much different; they are different models, but they address the same problem in different ways, with different conceptualizations of the pore scale, so they should give similar results. Fully penetrable spheres, however, give a much more realistic representation of the pore space; the soil model I showed you was a fully penetrable spheres model. So that is how we go from the pore scale to bulk soil properties.

From the pore scale, let me move to another set of experiments, which I have been doing with my colleagues in Italy; this is perhaps my longest-standing collaboration. These are what you would call lab-scale, or lab-bench-scale, experiments: about one and a half meters by 75 centimeters by 75 centimeters. We create sand boxes, very well controlled and as homogeneous as possible, to understand what happens to the water when rain falls on soil and how we can quantify it. We have various places where we can collect the surface water; the wires you see running down go to different depths, and these are TDR probes that measure soil water content. We also did experiments where we grew grass on the surface to examine different effects. This is Emily Anderson; she was a master's student, and in fact she was here last week to revisit campus and talk about some of her experiences. In her master's work, and with some other students, we did a lot of work here. These are my colleagues in Italy, faculty members and staff, and I have spent a lot of time over there doing these experiments. Some of the results we look at are: when a rainfall event occurs, how much surface water we get, and when the rain stops, how the surface water vanishes; what happens in the deep flow, that is, as water enters the soil we collect it from below and observe its behavior; and how the water content changes within the sandbox. All of this is to understand rainfall and runoff processes. If we grow grass on the surface, we get a very strong deep-flow component and a smaller surface-flow component, with the water content again being measured in the soil. We did over a hundred experiments like this and analyzed them, and one of the things we found was that our existing theories of infiltration, which we think we understand very well, are not adequate to explain some simple behavior over slopes. Some fundamental questions remain, such as the mechanism of unexpectedly long recession: for clayey soils, once rainfall stops we still collected quite a lot of surface water, whereas all the theories say that within a couple of seconds after the rain stops we should not be collecting any surface water. We also found that our existing theories do not explain very well what happens when water moves over a sloping surface. None of our celebrated theories actually explained the data very well, so this is something we are still working on and still trying to explain.

Some other experiments that we conducted, and continue to conduct, in Italy through my collaborations move from the lab scale to what we would call a small plot, or field, scale. This is a nine-meter by nine-meter area where we run simulated rainfall experiments as well as natural rainfall experiments. We collect the surface flows, we have soil moisture probes that tell us the water content beneath the surface, and we are also able to catch the deep drainage and work with that; so, a much larger scale. Richa was a PhD student, and one of the things she was interested in, where she used some of this information, is the problem of scaling. Now, scaling, I must tell you, means different things to different people. In this context, suppose this is the plan view of the soil plot we just saw, nine meters by nine meters. We know that soil hydraulic properties tend to be highly variable in space; this is just a conceptualization, and the different colors show the amount of variability in this supposedly homogeneous soil. The saturated hydraulic conductivity typically varies a lot, and determining water movement over such a heterogeneous surface for a rainfall event is fairly complex. If we want to run a numerical model, it takes a lot of effort, a couple of days on a very powerful computer, to simulate one rainfall experiment, because you have to model the heterogeneous soil. In this scaling problem, the question she is looking at, one we are frequently interested in, is how the moisture content at the soil surface changes with time. Suppose we know how the saturated hydraulic conductivity varies, and let m1, m2, and m3 be three locations with soil moisture probes measuring water content; for a rainfall event we have data showing how the water content changes with time at these three locations, and because the properties are spatially variable, the time evolution at each location is very different. The idea of scaling is this: given that kind of information, can we use the physics we know to collapse all of these curves into one reference curve, and, having the reference curve, can we then determine how the water content will change with time at some unmeasured location knowing just the saturated conductivity there, without having to solve the full flow equations, which are extremely complex? That is scaling, and I have made the problem sound a little simpler than it is.
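As a toy illustration of this collapse idea (my own construction, not Richa's actual model), suppose wetting at each probe follows a simple exponential curve whose rate is proportional to the local saturated conductivity K. Rescaling time by K collapses all the measured curves onto one reference, which can then be read off to predict an unmeasured location where only K is known:

```python
# Hedged sketch of time-scaling soil-moisture curves onto a reference.
# The exponential wetting model is illustrative only.
import numpy as np

theta_r, theta_s = 0.05, 0.40  # residual and saturated water content

def wetting(t, K):
    # toy wetting curve: soils with larger K wet up faster
    return theta_r + (theta_s - theta_r) * (1.0 - np.exp(-K * t))

t = np.linspace(0.0, 5.0, 200)
K_probes = {"m1": 0.5, "m2": 1.0, "m3": 2.0}  # measured probe locations

# Collapse: plotting theta against scaled time tau = K * t makes all
# three measured curves coincide; that common curve is the reference.
tau = np.linspace(0.0, 10.0, 400)
theta_ref = theta_r + (theta_s - theta_r) * (1.0 - np.exp(-tau))

# Predict an unmeasured point where only K is known:
K_new = 1.5
theta_pred = np.interp(K_new * t, tau, theta_ref)  # read off the reference
theta_true = wetting(t, K_new)
print("max abs prediction error:", np.max(np.abs(theta_pred - theta_true)))
```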
She did exactly that. She went back to that field, and this is what her scaling results look like. At three locations in the field we have measurements of surface soil moisture content, and she uses the measurements at, say, two locations to predict what the third location should be; the symbols are the measurements and the green line is the scaling model. Similarly, when predicting over here, she uses the measurements obtained at these two locations to predict the third. If we can do that, it is a huge savings, because, as I said, doing this numerically is a huge challenge. When she makes these predictions, the scatter plot shows, over several different events, how well we were able to predict this quantity purely through scaling relationships. So this is more like a plot-scale analysis. She then worked further on the problem of aggregation and disaggregation, which is also fairly important for us. The aggregation problem says: if I have these three measurements and a scaling theory, can I predict the field-scale average soil moisture, because that is the scale of interest to us? What this shows is, for different kinds of cases, using numerical results, how well the average is predicted by our scaling theory. The reverse problem is: given the field-scale average, can I predict what is happening to soil moisture at a given location? This is the problem we face in remote sensing: we sense a very large area and get one average value for it, but for ground truth we go and measure at a point, so how do we reconcile these two scales? When she does the aggregation with her theory, you can see that we do fine, but once the rain stops it is very difficult to get the two to agree; and disaggregation, predicting individual behavior from the average, is a much harder problem.

So those were lab- and plot-scale results. Let me move to watershed scales. Here I'll start with Latif Kalin's work; Latif was also a PhD student here, and he was looking at a very interesting problem. This is a map view of a watershed; a watershed is an area where rain that falls is funneled through the stream network and moved downstream. When we develop management strategies, we look at watershed scales. Watersheds are divided into sub-watersheds; within each sub-watershed we assume the properties are homogeneous, and then we try to model the behavior. What we typically have to contend with is that our measurements are usually made only at the watershed outlet: we measure flow, and we measure how much sediment is coming out. A very important problem for us is, given that sediment measurement, where is the sediment originating in the watershed? Can we do that back-calculation? This inverse problem tends to be very difficult to do. Consider one rainfall event: this is an event where we had rainfall and then sediment came out; the solid line shows the model results and the circles are the experimental observations. Then this is another event, this is another event, and so on.
If I use each event's data to figure out where the sediment would have originated, each different event gives a different answer as to which area was contributing sediment. That is in the nature of the experimental data we have, the fact that our models are not perfectly accurate, and of course the fact that the inverse problem tends to be ill-posed. So what we conceptualized with Latif was to treat the sediment-generating potential of each of these sub-areas as a random variable, because that was the only way we could get our minds around the problem. With each rainfall event, the set of erosion-potential values we back-calculate for the sub-areas is one value that this random variable takes, one realization. Once we frame it that way, if we have many events we have many samples of these random variables, and we can use statistical methods to compare how significantly different one area is from another in terms of its sediment-generating potential. Basically, you have to rethink the problem, think outside the box, to be able to address it; but we do need many rainfall events, a lot of data, to do this well, and that is usually a challenge for us.

Mazdak Arabi, another PhD student also working on watershed-scale problems, was doing optimization studies. When we deal with water quality in stream networks, we are very much concerned with the status and health of the watershed: are our water quality standards being met? For sediment, say 30 milligrams per liter is the concentration that the EPA or some other body deems acceptable; if you go beyond that, you are violating a standard. One way to address these problems is with best management practices (BMPs) that we place in the watershed, either in the upland areas or in the stream, which help reduce the load coming out of the watershed. We have various options for these: grass waterways, wetlands, parallel terraces, and so on. What Mazdak did was use optimization methods to decide how BMPs should be placed in a watershed to obtain the best results: either, for a given cost, distributing them to obtain the lowest concentrations, or, to meet a concentration standard, placing the BMPs to achieve that standard. These are very large and fairly challenging optimization problems. Some of the questions we have been interested in are listed here: what role do best management practices play, how do we use water quality data to assess the overall health of a watershed, and so on. This is another sub-watershed, showing the land uses, and he used fairly advanced techniques: generalized likelihood uncertainty estimation, regionalized sensitivity analysis, tree-structured density estimation. These were advanced concepts for the time he was working on these problems, but they allow you to take this very large optimization problem and give managers an indication of what kind of practice should be placed where in the watershed to achieve the best results. Still a fairly complicated problem.
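Returning to the sediment-source idea for a moment: here is a hedged sketch of how that random-variable framing can be exercised. Each rainfall event contributes one back-calculated erosion-potential value per sub-area (one realization), and with enough events, standard non-parametric tests can say whether the sub-areas differ significantly. The numbers below are synthetic stand-ins, not data from Latif's watershed.

```python
# Hedged sketch: per-event erosion potentials as realizations of a
# random variable, compared across sub-areas with non-parametric tests.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_events = 50
# synthetic "inverse problem" outputs: one value per sub-area per event
potentials = {
    "subarea_A": rng.normal(20, 6, n_events),  # high sediment generator
    "subarea_B": rng.normal(5, 4, n_events),   # low sediment generator
    "subarea_C": rng.normal(18, 7, n_events),  # similar to A
}

# Do the sub-areas differ at all in sediment-generating potential?
H, p = stats.kruskal(*potentials.values())
print(f"Kruskal-Wallis H = {H:.1f}, p = {p:.3g}")

# Pairwise: is A distinguishable from C despite event-to-event scatter?
_, p_ac = stats.mannwhitneyu(potentials["subarea_A"], potentials["subarea_C"])
print(f"A vs C: Mann-Whitney p = {p_ac:.3g}")
```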
Next I want to talk about a larger scale than watersheds: regional scale, state level, country level. Here Shivam Tripathi, another of our PhD students, focused on how to engage the uncertainty that we have in measurements, a very important problem for us. He worked with a latent-variable approach in a Bayesian framework, using graphical models such as hidden Markov models, which we will talk about. The idea is encapsulated here: a measurement is always an approximation, an estimate of the thing being measured. If we know our instrument well enough, a measurement error comes along with every measurement; but much of the time we leave the measurement error alone, and that is a particular problem in hydrology. I'll give a couple of examples. First, sea surface temperature: those of you who follow global circulation models and forecasting know that one of the primary inputs to all these large models is sea surface temperature; El Nino and La Nina are defined from sea surface temperature values. Sea surface temperature is a very important boundary condition, it influences atmospheric variability, and it is used for long-range climatic forecasting in general circulation models and in climate change studies. What this picture shows is not the sea surface temperature itself but the uncertainty associated with it. Over time, sea surface temperature has been measured or estimated using remote sensing platforms, ships passing through that measure temperature, and buoys placed in the water that gather temperature data. These are four snapshots in time: May 1850, 1900, 1950, and 2000. You can see how the density of measurements has changed with time, but the picture also shows the variability, the standard deviation, how much error is associated with each of these sea surface temperature estimates. We have this information, but currently none of the GCMs use it; it is considered too complex a problem. Similarly with other data sets: we did quite a bit of work over India, and this is rainfall data. When those data are generated and made available, the error, or signal-to-noise ratio, is also provided, but nobody uses that information; people feel it is too complicated to deal with the uncertainty information. So this was, and continues to be, one of the foci of our work: we developed models that explicitly account for this uncertainty. These are the graphical models: in very simple terms, we have an x variable, we construct a model, and we have the model error. We used Bayesian nonlinear principal component analysis and noisy principal component analysis to reduce dimensionality; RVM and VN-RVM, relevance vector machines and variational noisy relevance vector machines, for regression; and BNC, Bayesian noisy correlation, for correlation studies. The important thing in each of these is that with every variable we measure we associate an error, we assume we know that error, and we then work out how to incorporate it. These are actually fairly standard things that we do in statistical hydrology; in fact, all of us do it. When you fit a correlation of y versus x, how many of us actually use the error information in x or in y when it is available? If you did use that information, your strategy for correlation would change a lot.
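To see concretely how the strategy changes, here is a minimal stand-in (not the VN-RVM itself): ordinary kernel ridge regression versus the same regression weighted by the inverse error variance of each measurement. Only the weighting, that is, the use of the known error bars, differs between the two fits.

```python
# Hedged sketch: error-aware fitting on the noisy sinc benchmark.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(2)
x = np.linspace(-10, 10, 100)
true = np.sinc(x / np.pi)                 # sin(x)/x, the benchmark signal
sigma = rng.uniform(0.05, 0.5, x.size)    # known, point-specific error
y = true + rng.normal(0.0, sigma)         # noisy measurements

X = x[:, None]
plain = KernelRidge(kernel="rbf", gamma=0.5, alpha=0.1).fit(X, y)
weighted = KernelRidge(kernel="rbf", gamma=0.5, alpha=0.1).fit(
    X, y, sample_weight=1.0 / sigma**2)   # trust precise points more

for name, model in [("ignore errors", plain), ("use errors", weighted)]:
    rmse = np.sqrt(np.mean((model.predict(X) - true) ** 2))
    print(f"{name}: RMSE vs true signal = {rmse:.3f}")
```

On most draws the weighted fit tracks the true sinc curve noticeably better, which is the whole point: precise points should pull harder on the fit than noisy ones.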
I'll show you some examples of how this works. In machine learning, if you propose a new method, there are benchmark data sets, and you have to show how well your algorithm performs on them. For example, here is the sinc function, the solid blue line; that is what we are trying to reconstruct. What is provided to us are these symbols, the measurements, which come with a lot of error; we know the measurement values and the errors associated with them. If I use the relevance vector machine, which was state of the art and is a very good technique, this black line is what I would reconstruct from these noisy measurements as the true signal. But if I use the variational noisy relevance vector machine, which incorporates the error in the data explicitly, then this green line is what I would reconstruct. The fact that I am given the error information helps me greatly in reconstructing the series, so if I have missing values and so on, I can do very well. This is another benchmark, of a kind we have to worry about when we deal with real data: this is the actual image I would be trying to reconstruct, and what I am given is noisy and incomplete data; not only are there errors, there are gaps, and I need to fill them to be able to do my analysis. Probabilistic principal components, DINEOF, and regularized EM were the state-of-the-art models, and this shows how well I can reconstruct the image if I apply those methods; but if I incorporate the error information that is provided, the method we came up with reconstructs much, much better. Another benchmark concerns dimensionality reduction: this data set was created by giving 100 examples of 20-dimensional data that contain only five independent vectors plus noise, and when you do data reduction you want to extract exactly those. If I use standard principal components or probabilistic principal components, this is what I extract; if I use Bayesian noisy principal components, we get back exactly those five vectors, and only those five. That is because the other methods do not ingest the uncertainty information; we basically leave it behind, and I think we can do so much better.

So let's look at some actual data sets. This is the all-India summer monsoon region; GCMs will provide you data at all these grid points, and GCMs also do ensemble averaging, which means many GCMs are run and their average is somehow taken, which is computationally very intensive; that is our state of the art right now. This figure shows how well that works: the horizontal axis is time in years, the vertical axis is the rainfall anomaly, the box plots show the spread of the ensemble, and the observed values are the crosses. Even the GCM ensembles do not do all that well. If we use relevance vector machines, we do not do great either, but we are better than the GCMs, and that is reasonably well known; GCMs are still very complicated and difficult to do prediction with.
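For the gap-filling benchmark mentioned a moment ago, the family of methods in question (regularized EM, DINEOF) can be caricatured in a few lines: alternately fit a low-rank approximation and overwrite only the missing entries. This sketch is my own simplification under that assumption, not either published algorithm, and it does not yet use per-datum error bars:

```python
# Hedged sketch: iterative low-rank (PCA-style) gap filling in the
# spirit of the regularized-EM / DINEOF family, not those algorithms.
import numpy as np

def lowrank_fill(data, mask, rank=3, n_iter=50):
    """data: 2D array (e.g., time x grid cells); mask: True where observed."""
    filled = np.where(mask, data, 0.0)
    col_means = filled.sum(0) / np.maximum(mask.sum(0), 1)
    filled = np.where(mask, data, col_means)       # initial guess
    for _ in range(n_iter):
        mu = filled.mean(0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank] + mu
        filled = np.where(mask, data, approx)      # keep observed values
    return filled

rng = np.random.default_rng(3)
t = np.arange(200)[:, None]
truth = np.sin(t / 10.0) @ rng.normal(size=(1, 30)) + 0.1 * rng.normal(size=(200, 30))
mask = rng.random(truth.shape) > 0.3               # ~30% of values missing
est = lowrank_fill(truth, mask, rank=1)
err = np.sqrt(np.mean((est[~mask] - truth[~mask]) ** 2))
print(f"RMSE on the missing entries: {err:.3f}")
```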
Some other examples: if I am trying to forecast, say, the all-India summer monsoon for the month of May, existing methods would give this as a forecast, where the red line is the observation, the blue is the mean of our prediction, and this band gives you an idea of the spread. With more advanced methods you get perhaps a slight improvement; the table below shows the error statistics. I should also point out that when we go to the testing phase, our performance is actually not great, it is pretty weak; but that really is our prediction skill with the best methods available. Ganesh, sitting right here, works on hidden Markov models. When we talk to our phones, to Siri, the speech recognition software is actually a hidden Markov model, or it used to be; now there are more advanced deep learning techniques, such as long short-term memory units, but HMMs were used. The way we use them is this: we observe, say, rainfall as a time series, and we want to predict or characterize droughts. We treat droughts as hidden states, not observed; what we observe is rainfall, the hidden states are the droughts, and we use the hidden Markov model to characterize these drought states and perform a probabilistic classification. What probabilistic classification means is this: if I look at my phone, it says a 20% chance of rain tomorrow, or 50%, and I make a decision: should I take an umbrella, should I wear a coat? For droughts, that is not the case. You go to the US Drought Monitor and it says you will have a D2 drought; a D2 drought is a drought of a certain level of severity, and D4 is a very severe drought, but it tells you nothing about the percentage chance, it just says D2. They could be off by a wide margin, and you have no way of knowing that, whereas a 20% chance of rain at least gives you an idea of what to do with it; if someone just says "it may rain tomorrow," what are you going to do with that information? Probabilistic classification helps us with that. This slide shows a little about the model, and this is an example of how it differs from the standard techniques. What you are seeing here is the rainfall series in both cases; this graph uses a probability scale, and it shows that for each year the standard method gives you a single drought classification, so over here, for this year, it says "moderate," with probability one. If we use probabilistic classification instead, then at each time step the heights of the bars tell you the probability of belonging to each class: your statement may be that we are in a moderate drought with this probability, a severe drought with this probability, or a mild drought with this probability. That is much more graded information, which watershed managers can use to divert resources more confidently. It also shows the differences you get between the two methods, because they come from different ideas: if the precipitation is very low we should really be thinking of extreme drought, which the standard method may not capture, so there are nuances to deal with. Similarly, suppose we are trying to identify extreme droughts across India: the standard method, by definition, must give a uniform value everywhere; it does not give you a chance to say that one area is more prone to droughts than another, and those comparisons are simply not available because the standard method was not designed for them.
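Here is a small sketch of the probabilistic drought classification idea, using the off-the-shelf hmmlearn package on synthetic rainfall (so the number of states and the data are illustrative assumptions, not Ganesh's actual model): the posterior state probabilities at each time step are exactly the graded bar heights described above.

```python
# Hedged sketch: rainfall as the observed emission, drought state as the
# hidden state, posterior probabilities instead of a single class label.
# Requires the hmmlearn package.
import numpy as np
from hmmlearn.hmm import GaussianHMM

rng = np.random.default_rng(4)
# synthetic annual rainfall anomalies with wet / normal / dry regimes
rain = np.concatenate([rng.normal(m, 0.5, 40) for m in (1.5, 0.0, -1.5)])
X = rain[:, None]                       # shape (n_years, 1)

model = GaussianHMM(n_components=3, covariance_type="diag",
                    n_iter=200, random_state=0)
model.fit(X)

posteriors = model.predict_proba(X)     # shape (n_years, 3)
# probabilistic classification for the most recent "year":
for state, p in enumerate(posteriors[-1]):
    print(f"state {state} (mean rainfall anomaly {model.means_[state, 0]:+.2f}): "
          f"probability {p:.2f}")
```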
With some of these more advanced models, however, we can show that some parts of the country, the northwest of India, are more prone to droughts. Another example in monsoon-affected regions is the study of monsoon breaks, active spells and breaks in the monsoon, where we have online and offline methods. The online method, which is what we proposed, keeps updating as new data become available once the model has been set up, so it is useful for continuous prediction; with the standard offline method you have to give the model all the data at once and let it decide, so you do not know how well it performs on unseen data.

Coming back to Indiana: Shih-Chieh Kao was another very bright student, and we worked together on droughts in Indiana. This is an example where we again used fairly advanced statistical techniques, copulas for joint behavior and so on. Many of you will perhaps remember the 1988 drought, which was very severe. Some of the results we were able to obtain for users: with the state of Indiana in such a deep drought, how much rainfall would be needed to return to normal conditions? Most of the state would have required 7 inches of rain, which is very difficult to get. We were also able to say what the probability of getting 7 inches of rain was: basically between 0.1 and 0.3. So there was very little chance of getting out of that drought; we needed a lot of rain, and our probability of recovery was very small. We could make such forecasts for one month, six months, and so on, which is very useful for water planners.

Meenu Ramadas is another PhD student who also worked on droughts, looking at drought precursors: how can we use what we know right now, in terms of variables like soil moisture, precipitation, and runoff, to say what kind of drought we will get next month? What these graphs show is that for different variables, say for the month of March, these three variables offer at least some gradation between a very severe drought and a mild drought, whereas other variables, like evaporation, wind speed, and sea-level temperatures, do not have enough resolution to tell you what kind of drought you will get, because they do not contain that much information about droughts. This was part of Meenu's work. One of the other things we look at, when we have these variables and are trying to forecast, is: here is the calibration data, now look at the validation data for each variable. What this scatter shows is that our predictive ability with each single variable is actually very weak, maybe 10 percent, so it is very difficult to make predictions. However, if we have multiple variables, each with 10 percent confidence, and all of them are pointing in the same direction, we can combine their effects to reach a much higher confidence level. Usually that is what we have to rely on, because the processes are so complex that unless a single variable is an extremely strong predictor, you really do not have much to work with; you have to start pulling in a lot of other knowledge to make reliable predictions.
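The arithmetic behind combining weak precursors is worth making explicit. Under an independence assumption (mine, for illustration), Bayes' rule multiplies the prior odds by one likelihood ratio per concurring signal, so even modestly informative variables stack up quickly:

```python
# Minimal numerical illustration: several weak, roughly independent
# precursors combine into a much more confident forecast. The prior
# and likelihood ratio values are illustrative, not fitted.
prior = 0.2              # climatological chance of drought next month
lr_per_signal = 2.0      # each weak precursor roughly doubles the odds

for k in (1, 3, 6):      # number of precursors pointing toward drought
    odds = (prior / (1 - prior)) * lr_per_signal ** k
    print(f"{k} concurring signals -> P(drought) = {odds / (1 + odds):.2f}")
# -> 0.33 with one signal, 0.67 with three, 0.94 with six
```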
Now let me take a small diversion and spend some time acknowledging the students. I talked about the work of some of my students; these are many of the students who have worked with me over time, along with my current graduate students, who are very important to my work and have contributed a lot to my learning, and some postdocs and visiting scholars as well. As you can see, the students have all been doing well; several of them are in academic positions, some are professors, and you will notice that one is a professor and a department head, elsewhere. So the students are doing well, and that's great. I also want to acknowledge the students by listing some of the awards we have received together. Three of my students have won best dissertation awards; let me point to some of the more prestigious awards. Shivam Tripathi: KDD, Knowledge Discovery and Data Mining, is one of the prestigious machine learning conferences that computer science people attend, and they had a challenge problem; Shivam, with some of the algorithms we talked about, was awarded a best challenge paper award for that problem. Shivam was also a recipient of the Alfred Noble Prize, a joint society award among ASCE, AIME, ASME, IEEE, and WSE: all the societies get together and pick one recipient, and this is one of the ASCE awards. Shih-Chieh received a best paper award decided by the European Geophysical Union, which looks at hydrology papers across all journals and picks one, typically based on how well it has been cited. So I have been very, very fortunate to have worked with many students who have done very well.

Current topics. Some of the things we are doing now: this is the upper Mississippi River basin, Ohio River basin, and so on. We have all these stations with flow and water quality data, but water quality is very sparsely sampled. The symbols in this graph are the water quality measurements; using the advanced methods we talked about, we can reconstruct the full series along with the error information about it. From that we can make scatter plots of how well we predict versus observations. We can also use this water quality data to figure out the resilience of the watershed, in other words, how soon it recovers when a violation occurs; and because we have uncertainty associated with it, we get a histogram of resilience values. This is something Yamen Hoque was working on. If you have a watershed, we measure water quality at different stations, we measure different water quality parameters, and they all have different standards; you may have alachlor, ammonia, atrazine, total suspended solids, different measurements, all very sparse. You can reconstruct the series and come up with a composite water quality index, with the error around it, to describe watershed health. So we are doing a lot of work on the reliability, resilience, and vulnerability of watersheds based on these concepts. Also among current topics, two students, Abhishek and Anubhav, are now looking at how we predict at ungaged basins, where we have no measurements. We use machine learning techniques trained on measured locations and see how well we can predict what is happening at unmeasured locations in terms of watershed health. When we test this method, we take areas where we do have measurements but withhold them, using them only to check how well we do, and then we make a scatter plot as an estimate of how close we come in making these kinds of predictions at ungaged locations.
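Here is a hedged sketch of that regionalization test: train on gaged sites, hold each site out in turn as if it were ungaged, and score the resulting scatter. The catchment attributes, the "health" target, and the random-forest choice are all illustrative stand-ins, not the actual method or data from the talk.

```python
# Hedged sketch: leave-one-out evaluation of prediction at ungaged sites.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(5)
n_sites = 60
# synthetic attributes: drainage area, mean slope, % agriculture, rainfall
attrs = rng.random((n_sites, 4))
health = 0.5 * attrs[:, 3] - 0.3 * attrs[:, 2] + 0.1 * rng.normal(size=n_sites)

preds = np.empty(n_sites)
for train_idx, test_idx in LeaveOneOut().split(attrs):
    rf = RandomForestRegressor(n_estimators=200, random_state=0)
    rf.fit(attrs[train_idx], health[train_idx])   # "gaged" sites only
    preds[test_idx] = rf.predict(attrs[test_idx]) # the held-out "ungaged" site

r = np.corrcoef(preds, health)[0, 1]
print(f"leave-one-out correlation (the scatter-plot skill): {r:.2f}")
```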
Among the current problems I am interested in: when we make point measurements of infiltration and soil properties, different instruments give conflicting estimates; they differ from each other, and I am trying to understand why, because we use these instruments a lot but are not fully able to explain the discrepancies. Another topic I have been working on, with a large team, and want to get back to, is how droughts affect urban growth: how can we design our cities to be more resilient to water shortages, now that we have some capability to predict them? These are topics of interest to me. Some of the open questions I would like to address going forward: Should the analysis of uncertainty depend on the objectives of the study? How do we deal with prediction versus explanation, since the strategies for the two should be different? Something we do a lot of in hydrology is worry about reducing uncertainty and improving predictions; I want to get away from our standard way of doing things and see how we can use predictions on test data during training, which is a slightly different concept and would require changing how we think about these things. We use latent variables in all our statistical models, and we need to be able to assign physical interpretations to them. One thing I am really interested in, because it is a very standard problem for us: how do we design models and parameter estimation methods when hydrologic data tend to be highly multi-dimensional and scarce? Deep learning and machine learning algorithms are designed for settings with extensive amounts of data; we have the reverse problem, not enough data, so we have to rethink, adapt, and usually modify these algorithms to work for us. I find that a very interesting topic.

Let me briefly touch on some teaching interests. Before I came to Purdue I was at Kansas State; these are the courses I taught there, undergraduate to graduate level. At Purdue, over time, apart from special topics courses, I have taught courses all the way from the 100 to the 600 level. The 200-level courses I have not taught, but I don't need that to change; it's fine the way it is. Teaching is something I really enjoy, and I thought I would take a very quick diversion and share some of my teaching evaluations with you. I should point out that these evaluations are not meant to be flattering, just interesting. One of the earliest evaluations I got was: "Dr. Rao seems to care about students. I hope this doesn't affect his tenure." So there was a feeling that if you're a good teacher, you're not spending enough time on research; that is not true, I should let you know. This one was interesting, and I'm not sure it's true: "Dr. Rao's shirts and pants have the sharpest creases. He seems to put some effort into his clothes, but what is it with the tie selection?" I think this is from my earlier days: sometimes as I was leaving for work my daughter would come and say, "Nana, please wear this tie." It would have nothing to do with what I had on, but I would still wear it.
And this one is actually from here: "I figured out Dr. G's limitation as a teacher: he cannot have a class go by without at least one mathematical equation on the board." Well, just to make a point, I didn't have any equations in this talk. Let's see, this one says: "I started to get differential equations in the groundwater class." Our students generally don't like differential equations; I still can't fathom why. "But anyhow, he likes mathematics. Nice handwriting and blackboard technique." That was good. Then: "I paid good money for this course and thus deserve a commensurate grade." I don't think it quite works that way. [From the audience: what does that mean?] What he's saying is, "I paid good money, so I should get a good grade." And here is a student who wrote several things: "I have several points to make about Professor Govindaraju. He's knowledgeable. Appears confident and relaxed. Cares about student learning. He's a handsome guy." But no, there are five points, right? That's only four; the fifth point: "All I will accept is a D." So, like I said, interesting. Student comments are, really, fine; I just thought these were interesting and wanted to share some of them with you.

Engagement: right now I am the president of the American Institute of Hydrology, our big licensing organization; I am in my two-year term. I am on the editorial boards of several journals, and I am the editor-in-chief of the Journal of Hydrologic Engineering, which Kumar edited for some 20-odd years before me. I have been active in many technical committees, held leadership positions, chaired many other committees, and I continue to do so. I have also been involved in consulting work and industry engagement.

Looking forward. Discovery: in terms of research, I am happy to collaborate on problems where my skills would be useful; if it makes sense, I would be really happy to do that. Learning: I learn a lot from graduate students; usually my estimate of how good a graduate student is, is based on how much I learned working with that student, and as I said, I have been very fortunate with students. I would like to teach some advanced courses on infiltration and runoff processes and on engaging uncertainty in hydrology, because eventually I think I can write books on these topics; we have done enough work that I should start putting it together. Moment analysis is something I have been interested in and may continue to pursue. Engagement: I would like to continue seeking leadership roles in influential national committees going forward, and I am always looking for the right graduate students, always. So these were some of my thoughts on what I have done; I showed you examples of some of my students' work to give you a flavor of the kinds of things I do, and a little about what I think I'll be doing in the future. With these 50 minutes, this is how I am reading my tea leaves going forward. Thank you very much, and I'll see if there are any questions I can answer.

[Question: Many of these models you have, the data they rest on are from the past, right? But we are seeing more extreme weather, the hundred-year event taking place every year.] My simple answer is that it is not a simple question; it is actually a very complex question.
All the strategies we have had so far for hydrologic design have assumed that the past is a good representation of what will happen in the future: we use past records to figure out what a hundred-year event would be, based on probabilities of exceedance. If we assume that we have climate change and things are going to change, we are not able to make that determination, and that is an extremely complex problem; there is really no good answer, nobody has that answer. All these GCMs can do model predictions for a hundred years, but they are doing scenario analysis: they say, if carbon dioxide doubles, if land use changes, if this happens, then this model thinks this is what the future will be. These models do not do great at reconciling the past to begin with, and then we are asking what they will do in the future. That is why, when you see the IPCC reports, they use many, many models and try to average them, hoping the errors cancel out; that is their strategy. It is a very deep question, and we don't have the answer yet. We do not fully know how to do hydrologic design under change, but we have some ideas. Fundamentally, our design is risk-based: when we say a hundred-year event, we are saying society is willing to accept this level of risk, and we design for the hundred-year event knowing, in principle, that if a flood of a magnitude greater than the hundred-year event comes, the structure will fail; but we knew that was a risk we were willing to take. How we address this question of risk in a changing climate is a very difficult question. Not simple; I don't have a good answer. Yes, Mark? [Question about resilience and ecological states.] Our definitions of resilience, I should say, being a hydrologist, are not really ecological definitions; we would want ecologists to tell us the water quality standard that must be met, and then we figure it out, using that standard to work backwards. The way we define resilience in this case is: if there is a violation, what is the probability that the watershed will recover, and how fast can it recover? That is how resilient it is, and we use how the water quality changes in time to assess that. So the watershed is either in a failed state, because the standards are not being met, or in a non-failed state, because they are; there are no in-between states. This is where we would shake hands with the ecologists: they would have to tell us how to work with that information. I am more on the hydrology side, not the ecology side. Any other questions? We have five more minutes, but it's up to you. [Question about the sand, silt, and clay experiments and translating results to the field.] Yes, the lab-scale experiments, with three kinds of soils, were done with the idea of understanding rainfall, runoff, and infiltration on sloping surfaces. When we go to the field, we are not able to translate that information easily; we have to make field measurements to figure out what is happening in the field, because the lab results do not translate. The scaling problem I was talking about is that, in the field, my measurements are still made at the point scale, the volume I am sampling is very small, and I may sample at multiple locations; how do we use these small-scale measurements to talk about field-scale behavior? That is the scaling problem we look at. Otherwise, to take a soil sample, bring it to the lab, do a standard permeameter test, and call that the conductivity: I am simply not able to apply
that conductivity to the field. So that kind of scaling, I am not sure it can be done; you have to be in the field to make field-scale predictions. [Question about a figure.] I don't know which one you are referring to; this one? What the gradation scale is showing is the erosion potential of each shaded area. The values basically go from 0 to 100: sub-watersheds from 0 to 5 have a low potential for generating sediment, while values of 20 to 25 or higher indicate a higher potential for generating sediment, and those areas are therefore the likely sources of the sediment that comes down the streams. No, what I said is that we are not able to estimate or measure that directly; we are only able to back-calculate it by looking at data at the outlet of the watershed and solving the inverse problem to say which region could have generated how much sediment. And when we do that, if I go to a different rainfall event, I get different estimates of which region could have generated the sediment. Hence, as I said, one way to deal with that is to treat it as a random variable, with each rainfall event giving us one realization of that random variable; if I have 50 or 60 realizations, then I have some way of characterizing the behavior of that random variable. Thank you all for coming. Thank you very much.