Welcome to the NPTEL course on remote sensing and GIS for rural development. This is week 12, lecture 3. We are concluding our lecture series by showing some applications and links to data that you can use quickly for specific concepts. The concept we will look at in today's lecture is the use of remote sensing in rural development case studies on water quality. I have always said that water quality is not given as much importance as water quantity, which is a concern because you cannot use bad water: water may be available, but it is unusable. For example, while I was working along the borders and rivers of the Ganges basin in the Nepal region, I could see a lot of water that was black in color, which then joined streams and rivers and finally came into the Ganges. The Ganges is huge, so a lot of water comes in and the pollution is not seen. But upstream, where the water comes from, there is a lot of pollution, and people there are not using that water; they use spring water for drinking. So it is as important to monitor water quality as it is to monitor and measure water quantity. However, that could be a course by itself, so I will just introduce where and how you can get data, and how remote sensing can fill the gap when data is limited. First of all, we will be using the WRIS website of the Government of India. As I said, when I was doing my PhD these data were very difficult to get, because this dashboard was not available. Now the government has made it publicly available and is happy to share this data with everyone. So let us open it and see how we can get the data online. You can see that I have clicked India WRIS home, and from the home page you can go to WRIS data. Come down to groundwater; first let us look at groundwater. There are not many issues in getting the groundwater level behavior, which we have seen already.
But let us do surface water first, because that is what we are going to work with here. What do we mean by surface water? Water bodies like lakes, ponds and rivers are called surface water. Let this populate. It takes a little time, depending on the internet and the amount of data, and you can see the data points slowly populating for the whole of India. The numbers are coming up now: 3649 stations have been monitored, and 3610 have acquired data in the last 10 years. For a country as big as India, we should have more. But as I said, it is expensive to monitor, to maintain and to collect data: installation is expensive, collection is expensive, and maintaining the data is also expensive. Unlike water quantity, water quality involves chemical labs and setups to test the samples, and that is where it becomes a little harder on the government's pocket. So here we have the whole of India with 3649 stations. You can come down and see the major parameters that have been taken: pH, electrical conductivity, sodium adsorption ratio, nitrates and total dissolved solids. These are rudimentary as a starting point. When you go and look at individual stations, you will see multiple other parameters, so it is not that only these parameters are measured. The dashboard puts the main physical quality measurements up front, and we will see how the data comes up. The background map is creating a lot of internet issues because of speed and loading. So go down here to the base map, click on it and choose Streets. Streets is just a normal map image and does not take that much bandwidth. Now you can see that the map moves freely and does not get stuck. It is not my internet; the slowdown happens on the site itself.
You can come here to the layer list. There are a lot of layers you might like to see: the boundaries and the surface water monitoring stations, which are all switched on. Then you can go to unit-wise selection, so you can look at a particular state. If you click on all the sources, you can see whose data they have. Normally, water quality is monitored by two kinds of agencies. One is the central agency, the CPCB (Central Pollution Control Board), and then there are the state agencies. The Central Water Commission (CWC) may also have some stations, but it is the CPCB that has the mandate to monitor. Some state agencies, as listed here for Andhra Pradesh, Madhya Pradesh, Maharashtra, Telangana and Uttarakhand, have their own boards that are monitoring. This is concerning, because it is not that much data. Not many states are represented: given the number of states India has, we can see only five here. Big or small, the situation is the same: either the others are not putting their data on the dashboard or they are not fully connecting it. So let us take all agencies and then look at Gujarat, because I am going to show you a paper from Gujarat. You can see it loading. Gujarat and Rajasthan are very important to monitor for water quality. Why? Because there are a lot of water quality issues in these two regions. Arsenic you see in the Ganges belt, but here you see a lot of iron, chloride and factory pollutants leaching into the water, so it is very important to monitor. And for sure, this cannot be true: you cannot have a station outside India. This is sometimes an issue with the data set. We can quickly see which data set this is; if you click it, the details come up here. Meanwhile, the Gujarat data set is coming up.
While Gujarat is getting populated, we can choose which district we want. Let us keep all districts for now. I am going to go to Dahod anyway, so you can click Dahod. Data from this dashboard was actually used for the paper, which is why I am showing it to you. You can then choose yearly and set the date range. If you click on this button, it steps through the dates. You will see that the dates go back to 1939 and just keep going. No, we do not have data that long; it is just not there. I think the 1980s or 1990s is a good starting point. For the end, I keep 2022 when using yearly, because 2023 is still going on. So we start from 1990. But the number of stations is 0, 0, 0, 0, 0, and then some stations start populating. The number of stations monitored in Dahod is 0. Please note this point: in Dahod, where I said a lot of lift irrigation projects are done, there are 0 stations monitored. We can go back to Gujarat by clicking on Gujarat, or just set the selection to all districts. All the districts are coming up. I am going to do the same thing here, 1990 to 2022, and it will auto-populate. There is no submit button. You can see 0, 0, 0 stations, then one station, then 15 stations, and the number keeps growing. We have a good number of stations from around 2017 onwards, so in the last four or five years they have been monitoring a lot. The map shows which districts have more stations. For sure, we do not have stations in Dahod, but there are 117 stations overall, which is pretty good. You can click this to see the full chart and other things. What are the parameters? The parameters are not listed in full here; you need to click on a particular data set to look at them. We know that this region, the Kutch region, has a lot of salt content, and a lot of salt precipitation happens.
So we can leave Kutch, where not much agriculture happens, and just pick something in the center at random. We will come back here when we do the paper analysis. You can see here, these are the stations. We can zoom in a bit and click on a particular one. Here we have Munsar Lake at Viramgam, and it is lake data. If you want to see the station data, you can click on the station itself and it will start to populate. In each station you can look at the number of data products. What I will do now is pick stations with the paper in mind, to see how many data sets we have. Let us say Vadodara. We click on Vadodara, and eight stations come up. Which of these eight stations do we want to look at? There is River Mahi, and Doka or Chandwada. Let us say Chandwada. Just click on it and wait a bit. Vadodara is a very industrial area. Every time you click and update, this breadcrumb changes: India, Gujarat, Vadodara, Chandwada. Now you have all these parameters mapped, which is pretty good; it is a lot of parameters. It is always important to learn the permissible ranges under the WHO standard and the BIS Indian standard. Please look at these ranges before you take up a research question, because that is how you defend your research. Alkalinity, for example, is pretty important. Among the others, we can see aluminum, magnesium, ammonia, total coliforms, and fecal coliform, which is due to leakage of sewage into drinking water and water bodies. You can see the data only exists from 2014. You have January 2014, then March, then June, so it is roughly bimonthly. And then it becomes irregular: July, then November, then June, July.
So the data goes up and down, and no long-term record is available. Not all places are mapped either. For example, this is a water body, but it is not marked. We will come back and search for the study areas that I mention in the paper. You can download this data freely. Just click here: you can download the chart as an image, PDF or vector image, print the chart or view it full screen, or log in and download the data itself. The data comes year-wise and month-wise, so there are many entries. Then you can go back to the location whose data you want to read. In our paper, we have an analysis of a particular region in Gujarat. We have taken two lakes: one is Sursagar Lake and the other is Nal Sarovar Lake. These two lakes are very important because they are widely used by people. One is in Vadodara and the other is near Ahmedabad: if you go to Vadodara you will find Sursagar Lake, and Nal Sarovar Lake is almost on the border of Ahmedabad and Surendranagar. We will look at Vadodara now. As I said, Vadodara has multiple stations, and one of them should be our study area. Sursagar Lake is there, so let us click on it and the data will pop up. This is Sursagar Lake. If you want to see the lake, let us put the imagery base map back in and see how the satellite imagery helps. It is from a particular time period; it is a static image that does not change with time. The internet does take a hit when you do these things. Look at the population around the lake. A lot of people depend on it. It is the same scenario for a lake in a rural region: because there are very few pumps, people have to go to the lake for everything. In previous years these lakes were well maintained. Our forefathers maintained them well.
But nowadays the newer generations are not taking care of them, and a lot of pollution enters these lakes. So it is very important to monitor and take care of these lakes. You can download the report, and a user manual is there if you want to see how to use the data. The available parameters for the lake are listed here, and you can click on any one of them. Fluoride is important, and you can see that you have data only from 2020. pH may be better covered: we have 2018 to 2021, which is good. Ammonia we have from 2018. So there are a lot of gaps. Again, these measurements are expensive, so you will not see a value for every month. You see January, April, March, April, March, April, May, June, and then suddenly there is a gap. The gap disrupts the continuity, and that is the issue: maybe a big spike of pollution happened during that time. So it is very important to monitor these. Let us see how we can get these data. As I said, you can get the chart as an image, or you can go up here and download the data. As usual, you need to have an account with WRIS and state the purpose, such as research or student use if you are using it for your studies. It looks like my screen will come up now. What I have done is change my base map layer: instead of Streets, I put Imagery, and now the imagery does work. First it was Streets; when I zoomed in, the internet took some time, so please excuse me for that. This is the Vadodara region and Sursagar Lake, and I am just going to click on Imagery. Imagery takes more bandwidth. So what you should do is first keep the Streets layer, which is derived from satellite imagery but is not itself a satellite image, and switch to Imagery only when needed. Then you can come down here and see which parameters are available. All of these are available.
But if you click on fluoride, for example, as I said, the data is only from 2020 to 2021. The same goes for fecal coliform. Nitrates are very harmful and need to be monitored, yet very little data is there. Fecal coliform data runs from 2018 to 2021. If you need to download the data, just click here and you will get all of it. Or you can click this menu and take a full image, or download a PDF document, an SVG vector image, and so on. Now I am going to show you how we downloaded this data, such as this fecal coliform series, and used remote sensing to increase its spatial and temporal resolution, especially the temporal resolution. pH, as I was saying, is okay, because pH is easier to monitor and maintain. But nitrates and other parameters have issues: there are a lot of gaps. Fluoride is very important and has a lot of gaps, and the same goes for dissolved oxygen and other water quality parameters. Please note that everything has to be monitored over the long term to understand the impacts. Otherwise, a big pollution event could have happened in a gap, and if you were not looking at it, you missed it. The common parameters are there, but with a lot of gaps, so it is very important to monitor these. Now let us go back to see how we did it in our study. I am going back to my lecture slides. We were here; let me bring it up on the screen. More spatial and temporal resolution is needed, as I showed from the data set. More parameters are also needed, because new pollutants keep emerging; you should always read up and update yourself on these pollutants and put them into your reports and documents. Take COVID, for example: no one knew how this virus behaved when it came. It was an emerging, new phenomenon. Similarly, other issues can arise suddenly.
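The gaps seen on the dashboard can be listed programmatically once a station's sampling dates have been downloaded. The following is only a minimal sketch, not something from the lecture or the portal: the names `month_index` and `find_gaps` and the sample dates are hypothetical, and it assumes that a gap means more than a chosen number of months between two consecutive samples.

```python
from datetime import date

def month_index(d):
    """Map a date to a running month count so months can be subtracted."""
    return d.year * 12 + (d.month - 1)

def find_gaps(sample_dates, max_step_months=1):
    """Return (previous, next, missing_months) for every pair of consecutive
    samples that are more than max_step_months apart."""
    ordered = sorted(sample_dates)
    gaps = []
    for prev, curr in zip(ordered, ordered[1:]):
        step = month_index(curr) - month_index(prev)
        if step > max_step_months:
            gaps.append((prev, curr, step - 1))
    return gaps

# Hypothetical sampling dates for one parameter at one station.
samples = [date(2014, 1, 15), date(2014, 3, 10), date(2014, 6, 5), date(2014, 11, 20)]
gaps = find_gaps(samples)
for prev, curr, missing in gaps:
    print(f"{prev} -> {curr}: {missing} month(s) without a sample")
```

A list like this makes it easier to see where a satellite-based estimate would add the most information.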
So I strongly urge you to take a very careful look at these data sets and use them wisely for assessing important pollutants. Before we go on, I will also show the groundwater part. Let me share my screen. The surface water data is here, but you can also go back to WRIS data, then groundwater, then groundwater quality. You can open it in a new tab to keep this one loaded. You will see more or less the same kind of data for groundwater. It is a concern, because the missions for supplying water to these areas may be relying more on groundwater: in summer the surface waters are depleting, so what other resources can you use there? Here you have the number of stations: 15,800 stations, and 14,492. This is much bigger than the number declared for surface water. You can see which agencies are monitoring by clicking here. It is still loading, which is why it is not coming up. It would mostly be the CPCB, the Central Ground Water Board and the state agencies. For the sake of the internet, let me set the base layer to the street map and see if it gets faster. Okay, now you can see the agencies: the Central Pollution Control Board and the Central Ground Water Board, as I said. Among the states, only the Telangana government has put its data up; the others have not. You can still keep all of them selected. If you select Gujarat again, you can see that a lot more wells come up. Then select districts; almost every district is there, which is fine. If you come down, you can see what the parameters are and how long they have been measured. There are multiple sliders, so make sure the yellow and gray ones are set differently. Then you can see how they are monitoring, and so on. So you can do it the same way.
But as I said, surface water is more important to assess, because people mostly use it for drinking and agriculture. If the water is bad, they will just not use it and will ask the government to supply water, and most of the government supply may itself come from surface water bodies. So it is important to go back to surface water. I have already shown you how to download the data, and you can download groundwater data in the same way. So let us go back to my slides. This is the paper that we will be discussing for the next 10 minutes: The potential of open source remote sensing data for improved spatiotemporal mapping and monitoring of inland water quality in India, a case study of Gujarat. We chose Gujarat because we first checked how many data sets we could quickly access, and the students also went there to do some field work. The work was led by the PhD student Neetu Singh (Neetu is the first name, Singh the surname), along with my team members, including Sivan and Amitah, and me. This paper uses open source data, which means it is free and anyone can use it. As promised in this course, I have only used open source software and open source data. I have given links to data that is paid and really expensive only for those who really want that data for a particular use. But to date I have been happy with open source data. It has done the work I needed, and the results have come out very successfully, so we do not have to spend more money on proprietary or costly data. The basic idea here is that we have these two lakes and we have limited observation data. How can we use satellite data to capture the impact of pollutants on the water? Visually, we know that when we go to the rivers in the southern parts of Mumbai or near the airport, the water is pitch black in color.
Even in Chennai, take the Cooum and Adyar rivers. The Adyar is sometimes okay, but the Cooum is pitch black. So you know for sure that it is not drinkable; it is not potable. But once it was. These were really beautiful waterways. People used them for travel and for drinking water. Now they are heavily polluted. So the color can give you an inference about water quality. The same idea has been used by multiple studies that use satellites to assess water quality. Just as satellites assess plant growth and plant health using the green color, the colors of water, across multiple bands, can be associated with particular water quality issues. However, this is a correlation kind of work: how is the color correlated with the water quality? There is a causality behind it. It may be sewage dumping, industries dumping, or medical waste from another state being put into Tamil Nadu's water bodies. There are a lot of these water quality impacts. Just pick up the news and see how many issues are happening, and how many illegal dumpings happen on state borders because the rules are very strict in a particular state, so people go across the border and dump there. This is very sad. Either they are not given an option to treat the water, or they just do not want to do it and let others suffer. Satellite data can actually pick this up. So this is what we did: we did a correlation analysis between the color of the water and the pollutant level, and then trained a model with high accuracy to predict the water quality over a longer period. Let us see how we did it. First, these are the study areas in Gujarat state: Nal Sarovar Lake, which is right next to Ahmedabad, and Sursagar Lake in Vadodara. These are the two lakes we have taken.
You can see how this lake is surrounded by both rural and urban entities, whereas this one is more of an urban entity. Then comes the flow chart. Flow charts are always good for studies. My recommendation for all students is that whenever you work with satellite or remote sensing data, you draw the work as a flow chart. When I introduced GIS, I mentioned the schematic of the work, which is very important to understand. You can do the same thing here by picturing the flow of the work. The study starts by acquiring Landsat images for the study area. These images can be collected from the NASA or ESA Sentinel portals; I have shown you how to do it. Next is the identification of the pixels that coincide with the water body, that is, masking out the water pixels, and then the estimation of some quality-related values, so that you get corrected Landsat values. Leave aside how they are corrected; there may be better corrections for cloud cover, reflectance, and so on. So some processing is needed before running the analysis, and nowadays you do get corrected Landsat images from different portals. Then you have the in situ water quality parameters. What is in situ? In situ means observed, physically monitored: you take a sample, take it to the lab and analyze the water quality. Next comes the correlation analysis between the remote sensing data and the in situ surface water quality parameters. You put the two on two axes and ask which colors can capture these pollutant levels. Then linear regression modeling is done between the in situ surface water quality parameter concentrations and the Landsat 7 SR bands. Then come calibration of the model, adjustment and correction, and automation. The steps at the bottom are the steps for evaluating the model.
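The correlation step in this workflow can be sketched in a few lines. This is only an illustration under stated assumptions: the reflectance values and pH readings below are made up, the band names are used loosely, and `pearson` is a hypothetical helper written here, not a function from the paper or from any remote sensing library.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up mean reflectance over the masked water pixels on five sampling dates.
bands = {
    "B1": [0.05, 0.06, 0.07, 0.08, 0.09],
    "B4": [0.12, 0.10, 0.13, 0.11, 0.12],
    "B5": [0.20, 0.18, 0.15, 0.14, 0.11],
}
# Made-up in situ pH measured on the same dates.
ph = [7.0, 7.4, 7.9, 8.3, 8.6]

# Rank bands by the strength (absolute value) of their correlation with pH.
ranking = sorted(bands, key=lambda b: abs(pearson(bands[b], ph)), reverse=True)
for name in ranking:
    print(name, round(pearson(bands[name], ph), 3))
```

Bands that correlate strongly with the in situ parameter, whether positively or negatively, are the natural candidates to carry into the regression step.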
The model is basically a linear regression model, a regression or correlation model between the bands and the water quality. These are the descriptive statistics of the surface water quality for the period 2006 to 2019. The Gujarat Pollution Control Board is the source of the data set we used, and they have very good data for biochemical oxygen demand (BOD), dissolved oxygen (DO), EC and pH. Then you have the same thing for Nal Sarovar Lake. You can see the mean and standard deviation, which are very basic statistics. Then come the linear regression models. On the left side we have the parameter, which can be in absolute or log values, depending on which model fits better. Here you can see that pH is a function of B5, B1 and B7, and the other parameters use combinations such as B1, B4 and B5. A first set of candidate band combinations can be obtained quickly from a literature review: see which models have worked elsewhere, and try the same models for your area. If you are happy with the error estimates, probability distributions, R-squared and so on, you can continue with the model. Otherwise you should use different bands. This is the idea of using remote sensing data. For example, for pH, a study in the US might have used B3 and B1, and it did not work well for our study. So we had to keep looking at different colors, because the pollutant color, or the impact of the pollutant's color on the water body, could be slightly different in India. That is where we have to mix and match and experiment, and Google Earth Engine helps you with it. Once you get it right after a couple of iterations, look at the results: something as simple as this can get into a good journal. We published this in a very good journal, and the novelty is that the bands are used to predict water quality when observation data is scarce.
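A regression of a water quality parameter on a few bands can be sketched as follows. This is a minimal illustration that assumes ordinary least squares solved through the normal equations; the lecture does not say how the paper's models were actually fitted. The function names `fit_linear` and `predict`, the choice of two bands and all numbers are hypothetical, and the pH values are synthetic, generated from known coefficients so that the fit can be checked.

```python
def fit_linear(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.
    X is a list of predictor rows without the intercept column.
    Returns [intercept, b1, b2, ...]."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

def predict(coefs, x):
    """Apply fitted coefficients to one row of band values."""
    return coefs[0] + sum(b * v for b, v in zip(coefs[1:], x))

# Hypothetical (B1, B5) reflectance pairs on six dates.
bands = [(0.05, 0.20), (0.06, 0.18), (0.07, 0.15),
         (0.08, 0.14), (0.09, 0.11), (0.10, 0.13)]
# Synthetic pH built from known coefficients, only to test the fit.
ph = [6.0 + 20.0 * b1 - 5.0 * b5 for b1, b5 in bands]

coefs = fit_linear(bands, ph)
estimate = predict(coefs, (0.075, 0.16))
print([round(c, 4) for c in coefs], round(estimate, 3))
```

For a parameter that fits better in log form, the same function can be called with the logarithm of the measured values in place of `ph`, and the prediction exponentiated afterwards.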
So you have these two models, one for Nal Sarovar Lake and the other for Sursagar Lake. What you have here is the adjusted R-squared, which indicates the goodness of fit of the model between the observed and the estimated values. It is also necessary to look at the scatter plots between the observed and the modeled values. You can see that we capture most of the points within a particular confidence interval band, and some outliers go beyond the confidence interval of the empirical model. These are empirical models. What is an empirical model? An empirical model is based on statistics. Here the band color may be driven by a physical parameter, but what we measure is a reflection, so we are using it as proxy data. That makes it an empirical model. We showed that the Landsat bands, blue, green, red, near infrared, shortwave infrared and thermal, which are the B1, B2 and so on that we saw in the earlier slide, contributed significantly to developing accurate models for estimating surface water quality parameters, as seen in the R-squared values, for example for Nal Sarovar Lake. So our conclusion was that between the observed and the modeled values we had good correlation and good accuracy. Once the accuracy is established, what do you do? You use the model where you do not have any observation data. The idea is that you find the correlation between an observed water quality parameter and its effect on the bands. It is not a straight A plus B plus C; it is a fairly complex linear regression model. But we are using computers and data sets, so we can definitely do this quickly. Then you run the model and plot your observed data on it to see the accuracy. Let me zoom in on some of them. You can see that the observed values are the dots.
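For reference, R-squared and adjusted R-squared can be computed from the observed and estimated values as shown below. This is a generic sketch of the standard formulas, with made-up numbers; it does not reproduce the values reported in the paper, and the function names are chosen here for illustration.

```python
def r_squared(observed, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n, p):
    """Adjust R-squared for n samples and p predictors (bands)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Made-up observed (in situ) and estimated (satellite model) pH values.
observed = [7.0, 7.4, 7.9, 8.3, 8.6]
estimated = [7.1, 7.3, 7.8, 8.4, 8.5]

r2 = r_squared(observed, estimated)
adj = adjusted_r_squared(r2, n=len(observed), p=2)
print(round(r2, 4), round(adj, 4))
```

The adjustment penalizes models that use more bands, which matters when only a handful of in situ samples are available.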
The estimated values, obtained using satellites, are the line. Let us take pH: you can see the estimated pH going up and down. Why does the line have so many more points? Because satellite data has higher spatial and temporal resolution. This particular data set would be, let us say, bi-weekly or monthly, whereas the observation data, as you can see, comes once in two months or once in three months, with a lot of data gaps. The satellite data did not have such gaps and is more or less continuous, because the data was coming in every 15 days. You can see that the estimates follow the observations well; depending on the conditions, the pH goes down and the water becomes more acidic. What we are more concerned with is whether the model captures these changes well. That is one part. Then, for the two lakes, you can also see dissolved oxygen, BOD and nitrates. Here the model predicts the ups as well as the downs, whereas the observed data captured only the low points. Maybe the observations happened to be taken only at the low points. But the model captures this aspect beautifully. The pattern is sinusoidal because rainfall comes, water comes in and then stops, and then rainfall comes again. So there is a sinusoidal movement, and you can see that the peaks are also being caught. Here, one peak is not caught but the other peaks are. This could be an outlier, or our data did not catch it. Again, you cannot expect a perfect fit. This one is a good fit; you can see the ups and downs being captured. Here also you can see it. But the point is that this is not observed data. It is model data, so you should always use it very carefully. However, if I find such high correlations and such high estimation power using remote sensing, we should use it at least as a warning.
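Using the satellite estimates as a warning can be as simple as flagging the dates on which the estimated value leaves an acceptable range, so that someone can be sent to sample. The sketch below is only an assumption of how such a trigger might look: `flag_warnings` is a hypothetical helper, the estimates are made up, and the 6.5 to 8.5 pH range is used purely as an illustrative threshold that should be replaced by the applicable standard.

```python
def flag_warnings(estimates, lower, upper):
    """Return the dates whose estimated value falls outside [lower, upper]."""
    return [day for day, value in estimates if value < lower or value > upper]

# Made-up satellite-estimated pH, roughly one value per satellite pass.
estimated_ph = [
    ("2019-01-01", 7.4),
    ("2019-01-17", 7.9),
    ("2019-02-02", 9.1),
    ("2019-02-18", 8.2),
]

# Illustrative acceptable range for pH; an assumption, not a value from the lecture.
to_sample = flag_warnings(estimated_ph, lower=6.5, upper=8.5)
print("Send a field team on:", to_sample)
```

The flagged dates are prompts for in situ sampling, not measurements in their own right, since the estimates come from an empirical model.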
Suppose my remote sensing satellite suddenly captures the water turning slightly brown in a way that is not visible to the eye; some of these bands are not visible to the eye. Once the satellite data shows a warning, we can send a person to collect a sample and measure the water quality, rather than ignoring those warning signs. If you just say, I collect every three months on a fixed schedule, whether or not rainfall or pollution happens, that is not going to help. With these kinds of episodic triggers, where remote sensing captures a sudden blackening of the water, you send a person in and take the sample. I will also show you some experience from the field, from Singapore. You would see that suddenly, in the night, the water level started to increase even though there was no rainfall. This also happens in some cities in the South where I did field work. In the night there is a lot of discharge: some sewage treatment plant release or some illegal dumping happens at night because people will not see it. But in the morning, when they walk around the area, they say, oh no, it smells really bad, the water quality is bad. For example, not now but about 10 years ago, the cloth dyeing industries in the Tiruppur region, which use water dyes and very strong chemicals, would dump all of these into the rivers. Because of that, the rural people had a lot of breathing trouble. There are a lot of papers and studies on it. The government cracked down and said stop, so there is a ban on these kinds of illegal dyeing units in Tiruppur. It was known as the Manchester of India because a lot of clothes were dyed there and sent abroad. Now these clothes are being dyed in Bangladesh. I do not know what the environmental pollution is like there, but in the South it was really bad.
All the rural regions around these industries are still facing the effects, because the water quality is bad, the land has turned infertile, and so on. These things can be captured by satellites. Even if the dumping is done at night, some data is still collected as soon as day breaks and there is light, and if an image is acquired then, it will capture it. And as I said, some colors are not visible to the eye, but the bands will catch them. So we should use these with observed data: whatever limited observed data we have, we should combine the two and use them together for predicting water quality parameters. RS is going to help with limited observation data. I use the plus sign: not just RS and not just observation data. We should use them together to make models, and the models need to be calibrated and validated, not simply used as they are, and this should be done periodically. If I calibrate the model now and have been using it for two years, I should think about recalibrating it, because some new water quality parameter may have come up, the satellite may have added bands, and so on. Models, once calibrated and validated, can be used for prediction. You can then predict water quality issues automatically, and this enables long-term monitoring at higher spatial and temporal resolutions, which is what is needed for effective, sustainable rural development and mapping. With this, I would like to conclude today's lecture. I will see you in week 12, lecture four. Thank you.