Hello, good afternoon, good evening. My name is Jose Rolfe-Valles, and today I am going to present the application of multi-layer perceptron artificial neural networks to hydrological forecasting in El Salvador.

First, a little bit about myself. I am a civil engineer from El Salvador, which is located in the central part of America, near Honduras, Guatemala, and also Mexico. I am proudly an IHE Delft alumnus, where I learned many things. I previously worked in hydrological forecasting in my own country, and now I am working as manager of the situation and forecasting room. I define myself as a water systems modeler and a hydrological forecasting enthusiast.

I also want to mention the country profile of El Salvador. El Salvador is frequently affected by different natural hazards: not only floods, but also seismic activity, landslides, droughts, tsunamis, and volcanic eruptions. Floods represent almost 20 percent of the economic losses, but the major losses in El Salvador come from earthquakes. The table below shows the economic losses produced by different hydrometeorological events, such as Hurricane Mitch in 1998, among others. Regarding the flood profile, I would say that most of the coastal part of the country is a flood-prone area, and some areas flood every year; for instance, the Grande de San Miguel basin, which is located in the eastern part of the country. Two things that are very relevant: the rainy season goes from May to October, and the average annual rainfall is about 1,800 millimeters.

In order to mitigate these flood problems, an early warning system has been in place since 1998, after Hurricane Mitch. There are four key elements that an early warning system needs: the first is disaster risk knowledge; the second is detection, monitoring, analysis, and forecasting; the third is warning dissemination and communication; and the fourth is preparedness and response capability. What has worked in El Salvador's early warning system is that we have a good understanding of the risk and very good warning dissemination and communication. But what is missing? The lack of operational forecast models that are able to predict future conditions, especially in the Grande de San Miguel, where each year you can have at least one flood event.

Here is the Grande de San Miguel basin, located in the eastern part of the country. The drainage area is about 2,000 square kilometers, and it includes two lakes, Lake Olomega and Lake El Jocotal. Flood impacts have increased due to the following factors: increased urbanization, land-use change, low infiltration capacity, erosion, and poor water management. So our goal was to build a rainfall-runoff model to predict the discharge at La Canoa, which is located in this part of the catchment. But first, some questions need to be asked. Do we have enough data to calibrate a model? How do we deal with convective rainfall events? It is difficult to predict rainfall in this part of the world because it is a tropical region, and convective rainfall is the main driver of flood conditions. So we decided to use observed rainfall to predict the discharge. This is not the most efficient approach, because rain gauges are not always sufficient to capture convective rainfall events.
The other questions are how long the forecast horizon should be, which rainfall-runoff modeling technique we want to use, and how we are going to put our model into operational forecasting. We decided to use a data-driven model, one of the things I learned at IHE Delft. But what are data-driven models? Data-driven models are built from the collected data to solve prediction problems, perform classification and clustering, and reconstruct highly non-linear functions. They are highly dependent on the amount of data, and you can find many applications of artificial neural networks in flood forecasting. And what, specifically, is an artificial neural network? An artificial neural network consists of several layers of mutually interconnected neurons, which transform the input using a multi-parameter non-linear transformation, so the resulting model is capable of approximating complex input-output relationships. This graph tries to explain that: we have an input layer, then a hidden layer that transforms the input through an activation function, and then an output layer that provides the output.

In this study, we wanted to predict the discharge in order to forecast flood conditions for different lead times. We also wanted to evaluate how much the artificial neural network modeling technique improves the results compared to a simpler model. We used a naive reference model to benchmark the predictive performance. Its very simple formulation is presented in the equation: you only have to assume that the predicted discharge for any lead time L equals the current discharge, Q(t + L) = Q(t). So it is a basic, parameterless model that is able to predict the discharge.

The methodology that we followed is presented in this table, and the main things to highlight are the input variable selection, where you want to select the variables that provide meaningful information about the predicted discharge; the training and validation process, where you need to split your data into three different data sets; and the results and discussion part, where you use visual inspection and error functions to see how well your model reproduces the observed discharge.

The first part is the input variable selection, where you want to choose the right variables for your data-driven model. You evaluate which past rainfall, evapotranspiration, and discharge values provide meaningful information about the predicted discharge, and we used the average mutual information (AMI) and the correlation coefficient to detect which variables, and which lags of those variables, provide the most information about the output. The model formulation is presented in this equation: the predicted discharge is a function of the current and past rainfall, discharge, and evapotranspiration, Q(t + L) = f(R(t), R(t − 1), …, Q(t), Q(t − 1), …, E(t), E(t − 1), …). Based on this analysis, we created the experimental setups, and the results are presented in this graph. In the figure you can see the rainfall, the potential evapotranspiration, and the discharge, together with the average mutual information and the correlation coefficient between each of these variables and the predicted discharge. One thing you can see is that the maximum information is provided by the past discharge at lags from one to four hours; after that, it starts decreasing. The rainfall provides its maximum average mutual information at lags between 10 and 14 hours.
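As a note for readers of this transcript: below is a minimal sketch of how such a lag analysis can be computed, assuming hourly rainfall and discharge series stored as NumPy arrays. The histogram-based mutual information estimator, the bin count, the maximum lag, and the synthetic test data are illustrative assumptions of mine, not the actual code behind this study.

```python
import numpy as np

def average_mutual_information(x, y, bins=32):
    """Estimate average mutual information (in nats) between two series
    from a 2-D histogram of their joint distribution."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y
    nz = pxy > 0                          # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def lag_analysis(candidate, target, max_lag=24):
    """AMI and linear correlation between target(t) and candidate(t - lag),
    for each lag, to rank candidate input variables and their lags."""
    results = []
    for lag in range(1, max_lag + 1):
        x, y = candidate[:-lag], target[lag:]
        results.append((lag,
                        average_mutual_information(x, y),
                        np.corrcoef(x, y)[0, 1]))
    return results

if __name__ == "__main__":
    # Synthetic hourly series standing in for basin data.
    rng = np.random.default_rng(0)
    rainfall = rng.gamma(0.5, 2.0, size=2000)
    discharge = np.convolve(rainfall, np.exp(-np.arange(48) / 12.0))[:2000]
    for lag, ami, corr in lag_analysis(rainfall, discharge, max_lag=14):
        print(f"lag {lag:2d} h  AMI = {ami:.3f}  r = {corr:.3f}")
```

In this setup, the lag at which AMI peaks suggests which past values of each candidate variable to feed the model, which is how the 10-to-14-hour rainfall lags and 1-to-4-hour discharge lags mentioned above would be identified.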
You can also see that the potential evapotranspiration does not provide meaningful information about the predicted discharge, so we decided to remove that variable from the formulation for the one-hour-ahead prediction. The overall experimental setup for the different forecast lead times is presented in this table: you can see the input variables that we selected based on the previous analysis, and the output variables, which are the predicted discharges for the different lead times.

Another important step is the data partitioning of the full data set. The hourly mean areal rainfall and the discharge were partitioned into three data sets. The first is the training set, used to reduce the model error between the observed and the simulated discharge; most of the data is in this set, including the maximum and minimum discharges, in order to reduce extrapolation. The second is the cross-validation set, which allows us to stop the training process and avoid overfitting. The third is the verification set, which evaluates the model performance on data that the model has not seen before.

The results for the one-hour-ahead prediction are presented here. We see an excellent performance for both high and low flows in the verification period. Compared to the naive model, which is the model that we used as a reference, the naive model also gives good results; however, it shows a delay in the timing of the peak, as you can see in one of the figures on this slide. For the remaining forecast lead times, the scatter plots are presented in this slide. You can see that the model performance decreases as we increase the forecast lead time, which was expected. This slide also presents the performance metrics, as you can see in this table, where, for instance, the Nash-Sutcliffe efficiency is about 80 percent for the forecast models with one, two, three, and four hours of lead time. In the comparison, the naive model is the one in yellow and the artificial neural network is in green. The performances are similar for lead times of one, two, and three hours, but then each model's performance starts to decrease, and for lead times higher than four hours the artificial neural network performs much, much better than the naive reference model. So, taking the naive model as the benchmark, we can say that our artificial neural network provides better results than a simple model.

After you calibrate your model, you want to put it into operational mode. This means that you want to feed your model with real-time data. This is a tricky process, since you have to pre-process the data and post-process the results, which means you should be able to remove erroneous data and fill gaps in real time. You can use different modular workflow systems, such as Delft-FEWS, or you can use a model-centric approach. We used a model-centric approach: we designed the workflows and all the routines that run the model and export the results, and we used MATLAB routines to do that. The diagram on this slide shows the full chain, starting from the data we obtain from the hydrometeorological gauges.
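Before walking through the operational chain, here is a minimal sketch of the modeling setup just described: lagged rainfall and discharge inputs chosen along the lines of the mutual-information analysis, a chronological split with early stopping, and a Nash-Sutcliffe comparison against the naive benchmark. The library, network size, split fractions, lag choices, and synthetic data are all my illustrative assumptions; the actual system was implemented with MATLAB routines.

```python
import numpy as np
from sklearn.metrics import r2_score
from sklearn.neural_network import MLPRegressor

def make_lagged_inputs(rain, q, rain_lags, q_lags, lead):
    """Input matrix of lagged rainfall/discharge; target is Q(t + lead)."""
    start = max(rain_lags + q_lags)
    rows = range(start, len(q) - lead)
    X = np.array([[rain[t - k] for k in rain_lags] +
                  [q[t - k] for k in q_lags] for t in rows])
    y = np.array([q[t + lead] for t in rows])
    return X, y

# Synthetic hourly series standing in for the basin data.
rng = np.random.default_rng(1)
rain = rng.gamma(0.5, 2.0, size=5000)
q = np.convolve(rain, np.exp(-np.arange(48) / 12.0))[:5000] + 1.0

# Hypothetical lag choices for the 1-hour-ahead model: rainfall around its
# most informative lags (10-14 h), discharge at lags 0-4 h.
X, y = make_lagged_inputs(rain, q, rain_lags=[10, 11, 12, 13, 14],
                          q_lags=[0, 1, 2, 3, 4], lead=1)

# Chronological split: first 80 % for training (scikit-learn carves its own
# early-stopping validation subset out of it, randomly rather than
# chronologically as in the talk); last 20 % is the unseen verification set.
split = int(0.8 * len(y))
model = MLPRegressor(hidden_layer_sizes=(10,), early_stopping=True,
                     validation_fraction=0.25, max_iter=5000, random_state=0)
model.fit(X[:split], y[:split])

# Nash-Sutcliffe efficiency equals the R^2 score against the observed mean.
q_now = X[split:, 5]  # the Q(t) input column doubles as the naive forecast
print("NSE, ANN  :", r2_score(y[split:], model.predict(X[split:])))
print("NSE, naive:", r2_score(y[split:], q_now))
```

The last two lines mirror the benchmarking idea in the talk: the naive forecast Q(t + 1) = Q(t) is already in the input matrix, so its skill can be scored directly against the same verification data as the network.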
The data is imported into our database, and a computer imports and extracts the rainfall and water levels, pre-processes the data, runs the artificial neural network model, and then post-processes the results. These results are sent back to the database, and then we communicate them to the general public and to the stakeholders through a hydrometeorological forecast application. This routine runs every hour, and we use the Windows Task Scheduler to do that. The operational forecasting results are shown on this screen; this is for the Grande de San Miguel, and you can see that it provides a forecast prediction every hour. You can see the results at this link. It is also worth mentioning that we did this not only for the Grande de San Miguel, but also for four other catchments in El Salvador, among them the Jiboa catchment, the Paz River, and the Torola River.

After the first operational year, we wanted to verify the operational results, which is good practice. Why did we do this? Because we wanted to identify the strengths and weaknesses of the model and the forecast system. There are many reasons to verify, such as economic and administrative reasons, but we wanted to know if something else was required. So, how do you evaluate a deterministic model? Usually you use contingency tables that show the frequencies of hits, false alarms, and missed events. It is also good practice to quantify your verification using different metrics, such as the false alarm ratio, the probability of detection, and so on, and to support your metrics with graphical verification. For this, we graphically verified the results using scatter plots and also the relative operating characteristic curve, which helps you to know how much better your forecast is than a random one, such as flipping a coin.

This table presents the verification metrics for the Grande de San Miguel basin. You can see that the number of hits for a lead time of one hour is high: 85 times the model correctly predicted a flood condition. If we increase the lead time, this frequency decreases; for a lead time of 12 hours, the number of hits is 43, while the false alarms and the missed events are 45 and 35, respectively. The graphical verification with the scatter plots is presented as well, and you can see that there is large variability for a forecast lead time of 12 hours. Another graphical verification uses the relative operating characteristic curve, presented in this graph for the different lead times. What does it mean? If the curve falls below the 45-degree black line, then your model is no better than flipping a coin; in other words, we should then decide whether to evacuate or not by flipping a coin, because that would be better than our forecast model. In our case, all the trained models were better than flipping a coin, and you can see that the area under the curve decreases as we increase the lead time. In the graph above, we present how the frequencies of hits, misses, and false alarms change over the different lead times: the number of hits decreases as we increase the lead time, while the numbers of false alarms and missed events increase.
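For reference, the contingency-table metrics mentioned above are simple to compute; below is a minimal sketch, using the 12-hour lead-time counts quoted from the table as a worked example. The function names and the event-threshold idea are my own illustration, not code from this study.

```python
import numpy as np

def contingency(obs_event, fcst_event):
    """2x2 contingency counts from boolean flood-event series, e.g.
    obs_event = observed_q > flood_threshold (threshold is hypothetical)."""
    hits = int(np.sum(obs_event & fcst_event))
    misses = int(np.sum(obs_event & ~fcst_event))
    false_alarms = int(np.sum(~obs_event & fcst_event))
    correct_negatives = int(np.sum(~obs_event & ~fcst_event))
    return hits, misses, false_alarms, correct_negatives

def pod(hits, misses):
    """Probability of detection: fraction of observed events forecast."""
    return hits / (hits + misses)

def far(hits, false_alarms):
    """False-alarm ratio: fraction of forecast events that did not occur."""
    return false_alarms / (hits + false_alarms)

# Worked example with the 12-hour lead-time counts from the table:
hits, misses, false_alarms = 43, 35, 45
print(f"POD = {pod(hits, misses):.2f}")        # ~0.55
print(f"FAR = {far(hits, false_alarms):.2f}")  # ~0.51
```

With roughly half the 12-hour forecasts being false alarms, these numbers make concrete why, as noted next, longer lead times call for a probabilistic treatment.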
In that graph there is a point where the two lines meet, the so-called limit of the deterministic forecast, which implies that for lead times higher than six hours you probably need a probabilistic view of the forecast conditions.

Some final remarks that I want to share with you. Artificial neural networks allowed us to solve the short-term hydrological forecasting problem in some basins in El Salvador. We obtained the best results for a forecast lead time of one hour; however, a decision maker would like to have a longer lead time to decide whether to evacuate or not. For a 12-hour lead time, there is a high number of false alarms and missed events, which makes it difficult for decision makers to use that lead time. So we should focus our research on reducing the model error for lead times of nine to 12 hours to improve our results. It is important to mention that it is difficult to predict the discharge using only rainfall, which is why we decided to also use past discharge, which serves as a proxy for the state of the soil moisture. The model is now fully operational: it runs every hour at the Ministry of Environment and Natural Resources of El Salvador and provides flood guidance for decision makers, stakeholders, and local experts.

So thank you so much. Here is my personal contact if you would like more information.