Good afternoon, everyone. Hello, and welcome to our presentation. Allow me to introduce my colleague, Sara Rodriguez, and myself, Santiago Moreno. We work at Minsait as data scientists, and over the last few years we have had the opportunity to work in many fields related to data science. Today we want to share with you a project we have been working on, in a particular field: satellite images, and the detection of water using satellite images. As I was saying, we work at Minsait. Minsait is a new company that accompanies our clients through their digital transformation process. And as I'm sure most of you know, Indra is one of the biggest consulting groups in Spain, so we have many colleagues who are real experts in their fields, with deep knowledge in many different areas: energy, industry, security, health care, and, in particular here, Earth observation. That is one of the reasons we have had the opportunity, over the last few years, to work in so many different fields; almost any field you can imagine has data science applications. Inside Minsait, we belong to the Data Science and Artificial Intelligence unit. It is the result of combining three different competencies, depicted here in the slide: artificial intelligence; data engineering and visualization; and data science, where Sara and I belong. We try to join all of them in order to help our clients and give them the best possible answers. As you can see, our unit is composed of very different profiles. In fact, you could say we are a very heterogeneous group of people, because we are. We really are: we have engineers, marketing people, statisticians, linguists.
But we all share many things in common, particularly that we always try to extract as much information as possible from what the data is trying to tell us, and to go beyond the merely obvious. After this brief introduction of ourselves, our company, and our unit, let's focus on the project we want to share with you: the Land Analytics Earth Observation Platform. As you can see here, it is a project for the European Space Agency. We worked on it during 2017, last year, and the platform is now finished. In fact, last July, July 2018, we held the final review with people from the European Space Agency, and they acknowledged that all the requirements we had to meet were effectively met. We will go deeper into this, but I want you to have, right from the start, an idea of what all of this is about. The idea is to take the information coming from the satellites as raw data, process it, and, at the end, deliver our results: a classification, at the pixel level, of water versus non-water. And we go even a step further: we group these pixels into water bodies and study their evolution over time. The first thing we are going to show you is the tools we have used in this project. Here we have an image of the architecture of our IoT and big data platform, called Sofia2. On this platform we developed all the analytical part of our project. You can see the different modules that together make up the platform. Among the modules we used during this project is, in particular, the DataFlow module, which we used to capture and ingest into our staging area, a Hadoop cluster, all the information provided by the satellites. We'll see this later.
We will also go deeper, later, into another subsystem, the processing subsystem. Finally, all the information was analyzed using the Sofia2 notebooks: we used Apache Zeppelin working with Spark. Most of the code was written in Scala, and some of it in Python with PySpark. Sofia2 is a platform that was born in the context of the Internet of Things, and it has been evolving ever since. In fact, in recent months there has been a rebranding of Sofia2: a new platform called Onesait is the evolution of Sofia2. And why is that? Well, the reason is that experience has shown us (it's pretty obvious, but still important) that we live in a world of constant change, so you have to adapt to it as fast as possible. The idea to keep in mind is to be agile and flexible. And how do you work in a world like this? You have to move to collaborative models, shift towards service models based mostly on cloud strategies, and deal with a high penetration of open-source models. So, as you can see, this is a very complex world with very complex scenarios. With today's hyperconnectivity you will very often find yourself in a big data scenario, and of course new security requirements have to be taken into account. This is the philosophy Onesait applies in order to answer all the needs and problems we have seen in the previous slide. Our philosophy is based on three pillars. First, a "think big, start small" strategy, which gives us the ability to be agile and flexible. Second, the platform is entirely composed of open-source technology; all the tools it is built from are open source, and in fact, in a few months (I don't have the exact date), the platform itself is going to be open-sourced.
And finally, we combine it with our vast experience, because Minsait is a reference in the cybersecurity world. And we must not forget that the Onesait platform is an evolution of Sofia2, which has been at the top of the big data game since the beginning. After talking about the tools, let's put our project into context. We developed our platform, the Land Analytics Earth Observation Platform, in collaboration with our colleagues from Indra's space division, and the client was the European Space Agency. This is all contextualized within what is called the Copernicus programme. Copernicus is probably, no, not even probably, it is in fact the most ambitious Earth observation programme to date. It provides timely, accurate, and easily accessible information with several objectives: to improve the management of the environment, to provide civil security, and to reduce the effects of climate change. So where does Copernicus come from? It is the evolution of what you see written there, GMES, which stands for Global Monitoring for Environment and Security. The programme is headed by the European Commission in partnership with the European Space Agency. The European Commission, acting on behalf of the European Union, is responsible for the overall initiative, for setting the requirements, and for the management of the different services, whereas the European Space Agency is responsible not only for developing and providing the data, but also for the different satellite missions related to the programme. In the end, Copernicus is a unified system through which a vast amount of data is fed into a range of thematic information services designed for the benefit of the environment and even of our daily lives. We could summarize it in a single phrase: to help us towards a more sustainable future.
All of these services are grouped into the six main categories you have here in the slide: ocean, land, atmosphere, emergency, security, and climate change. Our project is one of the cases focused on land observation. In this programme, the European Space Agency is using and exploiting more than 30 years of expertise in space missions. And how is Copernicus possible? Well, the European Space Agency has designed a family of satellites specifically for the purposes of the Copernicus programme: the Sentinel missions. All the Sentinel missions share some things in common. For example, each of them is a constellation of two satellites, to shorten the revisit time and obtain really robust data sets for the purposes of the Copernicus programme. But every Sentinel mission carries a different kind of technology. Here we are only showing Sentinel-1 and Sentinel-2, because these have been the main sources we used in our project, but there are a total of six Sentinel missions. You can see that Sentinel-1, for example, is a radar imaging mission, based on synthetic aperture radar, whereas Sentinel-2 is a high-resolution multispectral imaging mission. So Sentinel-1 can image regardless of weather, whereas Sentinel-2 is more focused on land cover, water cover, and so on. Some of the other Sentinel missions are focused on the atmosphere and pollution; others on the ocean, its colour and its surface temperature. So, as you can imagine, all of this produces a huge amount of data that you not only need to generate and store: you also have to process it and extract information that is useful for industry and, ultimately, for people.
Within this context is where the Land Analytics Earth Observation Platform was born, because it is able to answer all of these requirements, from the image taken by the satellite all the way to the final product being offered. We have taken some of these images from the European Space Agency, because if any of you want to go deeper into the Sentinel missions or the Copernicus programme, you can see that our platform is also referenced on this web page. So, what is our platform about? I have already said it briefly; now I'm going to go a little deeper. The idea is: we capture the information from the satellite, we process it, and as a result we obtain a pixel-level classification of water versus non-water. We go a step farther: as I mentioned before, we aggregate these pixels into water bodies. And even more, we try not just to obtain these water bodies, but also to learn their behaviour. And how are we going to learn this behaviour? We defined two different approaches. The first one is based only on the information coming from the images themselves. The idea is: if I am able (and we have been able, of course) to identify a mass of water, then I will have, my colleague will show you how in the second part of the presentation, all the time steps of this water area, so I will know its area evolution. With this, perhaps we can train a model based on time series, or even a deep learning model with recurrent neural networks. That is one of the approaches to learning the behaviour. The other is to enrich our data set by crossing it with external information, coming, for example, from geological, vegetation, and topographic sources.
With this, you can build a data set and use, for example, the machine learning libraries of Spark to train a machine learning model on it. Here is a summary of the information we processed on the platform. All the images were square, with almost 11,000 rows and 11,000 columns; every image has a total of about 120 million pixels. All the information was concentrated on the geographical area of Catalonia. We processed 244 images covering the period from April to July 2017. If you do the math, you can easily see that we processed more than 29,000 million (29 billion) pixels. Here is a very concise slide of the architecture of the platform. As I said before, it is divided into two parts. The first is what we call the processing subsystem. It is in charge of taking the information coming from the satellite and producing, as a result, pixels classified as water or non-water. This information (my colleague will explain how later) is ingested into our analytics subsystem, built on Sofia2, using the DataFlow module I mentioned before, and that is where all the analytics take place. And now I'm going to hand over to my colleague, Sara, who is going to tell you about the methodology.

Thank you, Santiago. Now I'm going to explain the methodology we employed to extract useful information from satellite images. Just to be sure that all of us know how a satellite works, let me tell you a few things. A satellite orbits around the Earth, taking different photos over time. These images have two main associated parameters, called orbit and tile. If two images have the same values of orbit and tile, they contain information about the same geographic area. You see? I think that with this, and with what Santiago told you before about Sentinel, you are now an expert on satellites.
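As a toy illustration of the orbit-and-tile keying just described (the metadata values below are made up for the example; this is not the platform's actual code), images covering the same geographic area can be grouped into one time series per area like this:

```python
from collections import defaultdict

# Hypothetical image metadata: (image_id, orbit, tile, acquisition_date).
images = [
    ("img_001", 8, "T31TCF", "2017-04-02"),
    ("img_002", 8, "T31TCF", "2017-04-14"),
    ("img_003", 110, "T31TDG", "2017-04-03"),
]

# Images sharing the same (orbit, tile) pair cover the same geographic
# area, so grouping by that key yields one time series per area.
by_area = defaultdict(list)
for image_id, orbit, tile, date in images:
    by_area[(orbit, tile)].append((date, image_id))

for key, series in sorted(by_area.items()):
    print(key, sorted(series))
```

Everything downstream (water quantities, persistence, water-body tracking) is computed within one of these per-area series.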
So now we can talk about the workflow of the project. The project was divided into two main steps. In this presentation we are going to focus on the second step, because, from our (very objective!) point of view, it is the most interesting part of the project, and because it is the step we actually worked on. In the first step, the satellite images were transformed into sparse matrices: matrices with only one and zero values, where the ones indicate the presence of water. In the second step, we receive these sparse matrices and transform them into tabular data. Over these tabular data we performed different kinds of studies with a common goal: to extract the water bodies from these images and study their behaviour over time. We performed two kinds of study, at two different levels of precision: a more general, overall study, and a more specific, pixel-level study. I'm going to explain both in more detail, but before that, I want to remind you of the slide Santiago used a few seconds ago. Please, every time I talk about a table or an image, remember that we are dealing with 120 million pixels; keep in mind that this is a big, big figure. What was the first study the European Space Agency asked for? They wanted to know the quantity of water per geographical area and per moment of time, and also the temporal evolution of that quantity: how the quantity of water varies over time. Here, for example, we have eight images related to the same geographical area; as you remember, same orbit and same tile. We can see that there was a strong increase of water between April and May. It's important to say that in this first analysis we didn't distinguish between different sources of water.
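The first study, quantity of water per area and per moment of time, amounts to counting the ones in each binary matrix of a given area. A minimal sketch, with toy 4x4 masks instead of the real ~11,000 x 11,000 matrices:

```python
# Toy binary water masks (1 = water) for one (orbit, tile) area at
# three dates. Real masks are ~11,000 x 11,000; these are 4x4.
masks = {
    "2017-04-10": [[0, 0, 0, 0],
                   [0, 1, 1, 0],
                   [0, 1, 0, 0],
                   [0, 0, 0, 0]],
    "2017-05-12": [[0, 1, 1, 0],
                   [1, 1, 1, 0],
                   [0, 1, 1, 0],
                   [0, 0, 0, 0]],
    "2017-06-08": [[0, 1, 0, 0],
                   [0, 1, 1, 0],
                   [0, 1, 0, 0],
                   [0, 0, 0, 0]],
}

# Quantity of water per moment of time: just count the 1-pixels.
evolution = {date: sum(sum(row) for row in mask)
             for date, mask in masks.items()}

for date in sorted(evolution):
    print(date, evolution[date])
```

Plotting these counts over the sorted dates gives the temporal evolution curve shown in the talk; in the toy data, as in the real one, there is a jump between April and May.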
I mean, we aggregated information from rivers, lakes, seas, oceans, whatever pixel with water we found, no matter how far apart they were, okay? The second kind of study we performed was a pixel-level, more specific analysis. The European Space Agency asked us to compute the persistence of each pixel. What does that mean, the persistence of a pixel? It's super easy: it is just quantifying how many times a pixel presented water and how many times it didn't. So, for example, here you can see that this pixel presents water three out of seven times, you see? I think that, for now, everything is easy to understand, so let's complicate it a little. In the previous step, we studied each pixel independently, but now we seek to group the pixels, to cluster them according to their position with respect to the other pixels in the images. And just to be sure that all of us understand the methodology we followed, here we have a small sample. We have three different images, associated with the same geographical area at three different moments of time. If we analyze the first image, we identify three different water bodies: the green one, the blue one, and the pink one. If we analyze the second image, we also identify three independent water bodies. But if we compare both images, we realize that this one, the blue water body in the first moment of time, now, in the second moment of time, belongs to the green water body, so we can relabel it like this. If we analyze the third image, we identify two independent masses of water, and if we compare these two images, as we did before, we realize that this water body, the orange-yellow one, now, in the third moment of time, belongs to the pink water body, so we can relabel it like this. Cool, but we can go farther. Focus on this pixel, pixel number one.
This pixel doesn't exist in the third moment of time, but it does exist in the second, and when it exists, it is close to pixel number two. Pixel number two exists in the third moment of time, and there it belongs to the pink water body. So we can be sure that any time pixel number one appears, it is going to be close to pixel number two and it is going to belong to the pink water body; therefore we can add pixel number one to the pink water body. If we focus now on this other pixel, we can check that we have exactly the same situation: this pixel number one doesn't appear in the third moment of time, but when it does appear, it is close to its pixel number two, so we can add it to the final green water body. That's the general idea of the methodology we followed to label the pixels in the images. And what was the algorithm? Well, just before talking about the algorithm, I want to show you the final result of our analysis. Here we have an example of a final table; we have one table like this per geographic area. The table has as many rows as pixels with water identified over time, and as many columns as moments of time analyzed. Furthermore, it has an extra column with the final, consolidated label: the ID of the water body the pixel belongs to. Now, what was the algorithm we used to implement this methodology? We employed two different approaches: DBSCAN clustering and graphs. First, I'm going to talk about the graphs. We built one graph per satellite image: each pixel of the image was a vertex of the graph, and the Chebyshev distances between pixels defined the edges. Over these graphs, we extracted the connected components, and each connected component corresponds to an independent mass of water. Cool. And what was the second approach?
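The graph approach just described can be sketched as follows: water pixels are vertices, edges connect pixels at Chebyshev distance 1 (i.e. touching horizontally, vertically, or diagonally), and a breadth-first search extracts the connected components. This is a toy single-machine reimplementation, not the platform's Scala/Spark code:

```python
from collections import deque

def water_bodies(mask):
    """Label each water pixel (value 1) with a connected-component id.

    Two water pixels belong to the same body when their Chebyshev
    distance is 1, i.e. 8-connectivity on the grid.
    """
    rows, cols = len(mask), len(mask[0])
    labels = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 1 and (r, c) not in labels:
                # BFS over the connected component starting at (r, c).
                queue = deque([(r, c)])
                labels[(r, c)] = next_label
                while queue:
                    cr, cc = queue.popleft()
                    for dr in (-1, 0, 1):
                        for dc in (-1, 0, 1):
                            nr, nc = cr + dr, cc + dc
                            if (0 <= nr < rows and 0 <= nc < cols
                                    and mask[nr][nc] == 1
                                    and (nr, nc) not in labels):
                                labels[(nr, nc)] = next_label
                                queue.append((nr, nc))
                next_label += 1
    return labels

mask = [[1, 1, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 1]]
labels = water_bodies(mask)
print(len(set(labels.values())))  # prints 2: two independent masses of water
```

Note that DBSCAN run with a Chebyshev metric and epsilon = 1 groups these same pixels into the same clusters, which is why, as explained next, both approaches agreed on the results.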
We employed the DBSCAN clustering algorithm. It is a density-based approach. The idea behind it is: if a point belongs to a cluster, that point is going to be close to other points of the same cluster; in other words, all the points belonging to a cluster are reachable from one another. The algorithm receives two main parameters, called epsilon and min points. Epsilon is the maximum distance between two points for them to be considered neighbours, and min points is the minimum number of points required to form a dense region. In our case study, we identified each cluster as an independent mass of water, obviously. We employed a Chebyshev distance measure and we set epsilon to one. Both approaches, the graphs and the DBSCAN algorithm, return exactly the same results; however, the computing time of the DBSCAN algorithm was lower, so in our final delivery to the European Space Agency we used the DBSCAN clustering algorithm. And now we have prepared a demo in Sofia2, well, now the Onesait platform. So this demo... Oh, great. Sorry, just a minute. Okay, this demo has been built on Onesait, and we used a JavaScript library called Cesium, okay? Here we have a water reservoir in the north of Spain. The image you are seeing here is the one available in Google Earth; it is not one of the images from Sentinel. We have simulated five different moments of time, and we can check them one by one. Here, for example, you can see that the water reservoir is quite empty, so we have identified four different water bodies. If we check the second image, we identify four independent water bodies. We can check the next one, the third one: here we have six different water bodies. As you can see, the water bodies have increased in size.
In the fourth moment of time, we have two water bodies, and finally we have a single water body. We can compare two different moments of time, for example the third one and the fourth one, and you realize that these one, two, three, four, five masses of water in the third moment of time all belong, in the fourth moment of time, to the same water body: the big pink-grey one. Furthermore, you can check the area of each water body, like this, in square meters. So if we combine the information from all the moments of time, we realize that we have a single mass of water that has increased in size over time. Here we can see the temporal evolution: a single water body that has grown over time. And here you can see the five moments of time running. Okay, so now Santiago is going to conclude the presentation and tell you about some achievements and the future applications of our work.

All right, thank you, Sara. So, just to end the presentation, these are the achievements that were reported to the European Space Agency in the final review, and the agency confirmed them. We built a full industrial implementation of the platform in a cloud environment; remember the architecture slide we presented before, with the Earth observation and processing subsystem connected to the analytical subsystem. With this, we have been able to extract the time-series information related to water bodies and to identify individual water bodies. And of course, this has future applications. The methodology can easily be extended, not just to the Sentinel-1 and Sentinel-2 missions, but to any Sentinel mission, okay? And you don't have to focus only on the detection of water: you can also target forest or vegetation cover, or whatever.
And you can also cross this information with external sources, with which you could even build a model to predict or anticipate, for example, forest fires and things like that, okay? So I think that's all. Thank you very much, everyone, for your attendance. Next, in the 25th Theatre, we have the closing keynote. So, thank you. If anyone has any questions... Thank you. Thank you.