All right. Okay, thank you. So, hello everyone. My name is Markus Neteler, coming from mundialis in Bonn. We are a startup that has now existed for almost five years. I originally come from research in Italy; I am the project coordinator of the GRASS GIS project and also a co-founder of the OSGeo Foundation. We thought to bring GRASS GIS to the next level in the last years, and we have been working on that. The main author is Sören Gebbert, who spent most of the initial time on it, and we are working on this software together. This is our very outdated company picture — let's say an approximation; we have more people now — but maybe it gives you an idea: the river Rhine in the background, so we are near Cologne. Interestingly — and we are in an open-source context here — you can make a living from open-source software, in case you didn't know. The entire company, along with our sister company terrestris, which has existed much longer, since 2002, is a fully open-source company, and this is something I still try to repeat everywhere I can, because it's an interesting way of developing software and offering services.

So, what am I talking about? We had an idea. It's a bit small on the slide — I hope you can read it; I wasn't really prepared for a low resolution — but I can tell you what's written there: bring the algorithms to the data. We heard in the previous talk that data volumes can and do increase non-linearly. In our case we are dealing with geospatial data, including Copernicus data, so there are petabytes of data everywhere, and those have to be dealt with somehow. Everybody is dealing with I/O problems, disk storage and so forth — so why not go where the data are? But this also implies, in a way, bringing the user to the data.
So this paradigm you have probably heard of several times already, and it's still valid. We wanted to check how to exploit the GRASS GIS software in particular — but not only GRASS, the whole related ecosystem, with GDAL, with ESA SNAP included as well, and whatever you want to deploy yourself — and how to get this into some kind of cloud context. The original name was "GRASS as a service", GRaaS, which is probably not so intuitive to pronounce; for marketing reasons we then called it actinia. An actinia is a sea creature, something with tentacles that filters the water. So now consider something like a data lake, or the flood of information, whatever you want — up to you — and with our analysis software we can go there, fish out the relevant information and process it.

Of course, the core element here is GRASS GIS, and this software, in case you are not aware of it, has been under development since 1982 — way before I left school. I joined in 1993, let's say as a shy user, and then moved on to more or less coordinating it. You know, it's a do-ocracy, which means whoever is working can move things, and I thought to contribute to that.

If you are not familiar with GRASS itself: we have something called the GRASS database, which is more or less a file-based system, with a SQL database in the background as well. But there are a few particular things. One thing is called a location, and inside there are mapsets; that's more or less for the organization of the data. You could also consider this as a workspace, or as a project with subfolders — so nothing dramatic. But there is something related to that, because this brings the possibility to offer user management. Especially in a cloud context, you probably do not want to share all data with everybody; you want to have a restrictive user model there, and this comes kind of implicitly here. Then we have lots of algorithms.
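The location/mapset hierarchy described above can be sketched as a plain directory layout. This is a minimal illustration only — the location and mapset names below are hypothetical examples, not from the talk:

```python
from pathlib import Path
import tempfile

# Illustrative sketch of a GRASS database layout: a "location" groups data
# sharing one coordinate reference system, and each "mapset" inside it acts
# like a per-user workspace. All names here are made-up examples.
def make_grass_layout(root: Path) -> list:
    layout = {
        "nc_spm_08": ["PERMANENT", "user1"],  # a location with two mapsets
        "world_latlong": ["PERMANENT"],       # a second location
    }
    created = []
    for location, mapsets in layout.items():
        for mapset in mapsets:
            p = root / "grassdata" / location / mapset
            p.mkdir(parents=True, exist_ok=True)
            created.append(str(p.relative_to(root)))
    return created

with tempfile.TemporaryDirectory() as tmp:
    paths = make_grass_layout(Path(tmp))
    print(paths)
```

The per-mapset split is what makes the restrictive user model mentioned above cheap to enforce: access control can simply follow the directory boundaries.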
We are talking about 500-plus methods available, the majority in the core: vector analysis, raster analysis, volumetric data analysis, time series — which, in terms of GRASS's age, is new, but has already existed for seven years or so. So we have space-time cubes, and you can go and analyze things with temporal algebra as well; all this is already there. You have image processing, which we use for the Copernicus data processing, or meteorological data interpretation, and so forth. And what you can do here, since we are in a GIS context: you have the full integration between image processing and GIS in one shell, so it's not two distinct worlds — I'm not interested in that. I'm a geographer myself, so I like to get things together, and here you can do that; you can just smoothly go from one to the next.

So now the question is how to get this into the cloud, and cloud means we want to have a RESTful API on top. Maybe to start with, to list what data are there and what belongs to whom — spatio-temporal datasets are offered as resources, so you can go there and then naturally do computation on top of that; enable usage of GRASS GIS modules; and, as already mentioned, user management, so you define different roles. But in a cloud context, where you pay as you go for the resources, you also want to have some control over what you offer to the user. For example:
As a provider you offer the user a kind of flat rate. Flat rate doesn't mean unlimited, of course, but a flat rate in the context of what they want. So you can go there and say: you are restricted — it's like geofencing — to a particular area of the world where you can compute things, or to a certain amount of data volume, and so on; there are different possibilities. You can also expose the methods you have — or "modules", in GRASS language — selectively to the users and say: okay, we offer you this set of functionality, and if you want level 2, then you can also access the other one. Interestingly, you also want to avoid one user overriding things of the others, so you have to have a kind of data locking — natural, but you have to implement it. This also comes already with GRASS GIS itself: if you do apt-get install grass, or dnf install grass, or whatever you do, docker pull grass — then you already have the possibility to use a network drive, using the Unix or Windows user management to grant access or not. All this is now exposed through actinia itself as well.

We have two kinds of storage. We have the persistent, read-only storage where you offer base cartography, for example — the original data you already provide to your users, whatever it is:
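The quota-and-geofencing idea described above can be sketched as a small check against a role definition. This is purely illustrative — the field names and the role schema here are invented for the sketch, not actinia's actual configuration format:

```python
# Hypothetical sketch of per-user limits: a module whitelist, a raster
# cell quota, and a lon/lat geofence. Schema and values are made up.
USER_ROLE = {
    "role": "basic",
    "allowed_modules": ["r.slope.aspect", "r.univar", "v.buffer"],
    "cell_limit": 10_000_000,          # max raster cells per job
    "bbox": (5.0, 47.0, 15.0, 55.0),   # geofence: (west, south, east, north)
}

def job_allowed(role, module, cells, lon, lat):
    """Accept a job only if module, size and location fit the role."""
    w, s, e, n = role["bbox"]
    return (
        module in role["allowed_modules"]
        and cells <= role["cell_limit"]
        and w <= lon <= e and s <= lat <= n
    )

print(job_allowed(USER_ROLE, "r.slope.aspect", 1_000_000, 7.1, 50.7))  # inside fence -> True
print(job_allowed(USER_ROLE, "i.segment", 1_000_000, 7.1, 50.7))       # not whitelisted -> False
```

The "level 2" upgrade mentioned in the talk would then just be a second role with a larger whitelist and higher limits.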
Elevation models, Copernicus data, land-use maps — whatever you already provide to your users goes there, because you do not want anyone to modify it. But the users, through the computation — there are different workers or nodes — want to write their own stuff, and that goes into the user space. This is also connected to a kind of garbage collection: in ephemeral processing, for example, you say the results are available for whatever you configure — 24 hours, say — and then they are deleted automatically; just housekeeping, in order to avoid too much storage being used.

So in the end you have this GRASS database over there, which is the data storage — it can be whatever, I'll come to it later — and you have the different workers, equipped with GRASS GIS, also GDAL, and PDAL as well, which I forgot to mention before; whatever you put there, basically. The user management is done in Redis — we have a Redis instance there — and the systems communicate with each other, and so forth. Everything can then be deployed on different cloud infrastructures; this is all Docker-based. We have running instances on OpenShift, Kubernetes and OpenStack, and also others. We are using Terraform in order to deploy machines, so if actinia wants to scale up, we can say: okay, you can order new machines by yourself, and after consumption — meaning the finishing of the process — the machines are destroyed, in order not to generate further cost.

Then we need a load balancer. The incoming requests come in here through the API, but you want the cloud resources to be optimally used, so there is a load balancer sending stuff to the different workers. Ideally — well, the data are visible anyway — but in case you have heterogeneous cloud resources, like instances with different flavors, you want to send each job to the right
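The flavor-aware dispatch just described can be sketched in a few lines: pick the least-loaded worker whose flavor is big enough for the job. The worker names, fields and numbers below are invented for illustration; they are not how actinia's load balancer is actually implemented:

```python
# Hypothetical worker pool with different instance flavors (RAM sizes)
# and current job counts. All values are made up for the sketch.
WORKERS = [
    {"name": "worker-small", "ram_gb": 8,  "jobs": 1},
    {"name": "worker-big",   "ram_gb": 64, "jobs": 3},
    {"name": "worker-mid",   "ram_gb": 16, "jobs": 0},
]

def pick_worker(workers, ram_needed_gb):
    """Send the job to the least-loaded worker whose flavor can run it."""
    fitting = [w for w in workers if w["ram_gb"] >= ram_needed_gb]
    if not fitting:
        return None  # no available flavor can handle this job
    return min(fitting, key=lambda w: w["jobs"])

print(pick_worker(WORKERS, 12)["name"])  # -> worker-mid
print(pick_worker(WORKERS, 32)["name"])  # -> worker-big
```

Returning None for an oversized job is where the Terraform-driven scale-up described above would kick in: order a bigger machine, run the job, destroy the machine.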
worker, in order to be able to compute the job. Okay, now, how to control all this? We have JSON files here; we have the REST API. There are requests like "get locations", so you can use curl, or some other interface, or the web-based system, or maybe in the future also a QGIS-based one; you can call it from the GRASS command line — so there are different ways of retrieving information. You can query the system and ask: okay, what datasets are already there? For example, in our system there is the global SRTM elevation model — that is a 300-something-gigabyte GeoTIFF file. In case you are working with an elevation model, I think most of you will only be interested in a subset, but each of you in a different subset. So the idea of the cloud is: we offer it once, and then you can just operate on your area of interest, which can be changed dynamically.

Then, as it is REST style, you chain more stuff there. Say you zoom into North Carolina — that is our famous geodata sample dataset. What is inside? So we're in North Carolina here; then you see what datasets are there, and you can go on, look further into the maps. You see there is also already a render endpoint, which means you can query the system — and by the way, this is online, reachable under actinia.mundialis.de; you can go and play, there's a demo user available. Then you can dive into these data and also use them for computation.

Now, user-defined processing: in this case you don't retrieve, but you send something to the system and say "please do this and that". That is a POST request — you see the POST request over there — and you say: I want to compute the slope of some map, and I want the result as a GeoTIFF, please. What's also possible, by the way: you can also give it a URL. It's not shown here because it's too long, but you can specify a URL, and in this case
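The slope example above can be written as a JSON process chain. The structure below follows actinia's process-chain style (a version plus a list of module calls), but treat the exact field details as illustrative — check the actinia documentation for the precise schema of your deployment:

```python
import json

# Sketch of a process chain for the slope example: run r.slope.aspect on
# an elevation raster and export the result as a GeoTIFF. Map names
# ("elevation", "slope") are placeholders.
chain = {
    "version": "1",
    "list": [
        {
            "id": "compute_slope",
            "module": "r.slope.aspect",
            "inputs": [{"param": "elevation", "value": "elevation"}],
            "outputs": [
                {
                    "param": "slope",
                    "value": "slope",
                    "export": {"type": "raster", "format": "GTiff"},
                }
            ],
        }
    ],
}

payload = json.dumps(chain, indent=2)
print(payload)
```

This payload would then be POSTed to a processing endpoint of a location; the URL-import variant mentioned in the talk would simply add an importer step before the module call.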
the dataset will be retrieved first and then the computation is done on top of it. Or you intersect with data already there, or you fetch from different data sources and compute stuff, and eventually you retrieve either a vector file or a raster file, or you dump it into a PostGIS database — whatever you prefer. Through this JSON style you can write custom process chains. As already mentioned, the GRASS modules are there, importer and exporter are there, and then you can also bring in your own Python scripts. Those can be whatever — and if you say "but Python, no idea, I still have my good old 1990s shell scripts, they work so nicely": no need to rewrite them. You just wrap them into a Python script, hang them in, and you are done. So it's not that you have to rewrite everything; you can just make it appear as a Python script and the system is happy with that.

We have also wrapped SNAP — you find this on GitHub and Docker Hub. We made a Docker image out of it; by the way, it's a fraction of the original size. There are some funny things in the original, like the full Java runtime and so forth — this can be heavily reduced. And with that we built up the entire stack.

So how does a curl request look? There is of course curl, then the demo user — please, steal the password, it's public — and you send a POST with a process chain.
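The "just wrap your old shell script in Python" idea above is genuinely this small. A minimal sketch, where the tiny generated script is a stand-in for a real legacy script (the wrapper itself is the point):

```python
import os
import subprocess
import tempfile

# Stand-in for a legacy 1990s shell script: it just echoes its argument.
script = tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False)
script.write('#!/bin/sh\necho "processing $1"\n')
script.close()

def run_legacy(script_path, *args):
    """Python wrapper around a legacy shell script: pass arguments
    through, capture stdout, fail loudly on a non-zero exit code."""
    result = subprocess.run(
        ["sh", script_path, *args], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()

out = run_legacy(script.name, "dem.tif")
print(out)  # -> processing dem.tif
os.unlink(script.name)
```

From the system's point of view this wrapper is just another Python script, which is exactly what makes the approach work.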
It's only written as a variable there; this is essentially a file, a JSON file, or maybe you put it into a variable — up to you. This is then sent to the endpoint, and in this case it will do asynchronous processing. With synchronous processing you say "okay, do that", and you wait until it is done and it comes back to you. But in case the job is something complex and would run for several hours, you don't want to sit there and block your terminal with it, so you use the asynchronous endpoint. The job is sent there, you get a URL with a resource status, and you just ping it from time to time — this you can automate, of course; if you have a web interface, it would notify you once it is done. So both options are available. That means polling in this case: you get the status, and once it is done you get the resource URLs back — the GeoTIFF or whatever it is — and then you can retrieve the map and you are done.

What else is there? We have been implementing processing chains for Sentinel-1 and Sentinel-2 data, also for Landsat — not written here. There are endpoints like NDVI. For example, you are interested in the Normalized Difference Vegetation Index — a very common index, used in agriculture and elsewhere, also to find green areas in urban areas. You just say: okay, I want to analyze this area, maybe for the year 2018, from the 1st of April to the end of June; search for a scene with less than 1% cloud cover and do the NDVI. So this is more or less one endpoint: you send just these few metadata to the system and you get stuff back.

We have connectors to the ESA API Hub — that means the Sentinel data are retrieved from there, one way. We are also in discussion — because we are involved in the openEO project, which was mentioned earlier — to connect to the DIAS platforms, the Copernicus platforms for Sentinel processing, and to Amazon AWS and Google Cloud Storage.
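The polling pattern for asynchronous jobs described above looks roughly like this. The status fetcher is injected so the loop can be shown without a live server — in practice it would be an HTTP GET on the resource's status URL; the status values and response shape here mirror the talk's description but should be checked against the actual API:

```python
import time

def poll_until_done(fetch_status, interval_s=0.0, max_polls=100):
    """Ping the resource status until the job reaches a terminal state."""
    for _ in range(max_polls):
        status = fetch_status()
        if status["status"] in ("finished", "error", "terminated"):
            return status
        time.sleep(interval_s)
    raise TimeoutError("job did not finish within the polling budget")

# Fake status sequence standing in for repeated GETs on the status URL.
responses = iter([
    {"status": "accepted"},
    {"status": "running"},
    {"status": "finished", "urls": {"resources": ["/result/slope.tif"]}},
])
final = poll_until_done(lambda: next(responses))
print(final["status"])  # -> finished
```

Once the terminal state arrives, the resource URLs in the final response are what you fetch to retrieve the GeoTIFF.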
We also have some deployments there. The advantage of those is that the Sentinel data are already unpacked there; you do not have to retrieve the entire zip file of one gigabyte in size if you are only interested in two channels. Then you can switch to that provider — you can see it's flexible. Our idea is not to be locked into one single platform, but to have the possibility to deploy it here and there and use the best one. So, an example: for Sentinel-2 processing, the endpoint computes the NDVI for a given scene — I got the scene name from somewhere, but you can also search for it — and then, as before, you can poll for the result. You get the NDVI back, plus a screenshot-like preview so you can see what you have done, plus the GeoTIFF file as well, which is of course a bit larger.

Okay, some more features: you can also write to Google Cloud Storage. If you deploy actinia yourself — it could even be your laptop — then you can naturally write there too, or to S3 buckets. Then, for the GRASS users here, we have added the possibility of "ace" — actinia command execution. That means you take one GRASS command and just write "ace" in front of it — of course you need to have the credentials — and then the same command is sent to the cloud, so it is not executed locally but in the cloud. So you can play around and prototype on your laptop, and once you know it works, you set the resolution — this is one of the nice GRASS features — to the original resolution and do the heavy computation in the cloud itself. As mentioned, we have openEO support there.
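What an "ace"-style wrapper essentially has to do is turn one GRASS command line into a one-module process chain that the REST API understands. A hypothetical sketch of that translation — the real ace tool's internals and the exact chain schema may differ:

```python
def command_to_chain_item(cmd, item_id="cmd_1"):
    """Translate a GRASS command line like
    'r.slope.aspect -e elevation=dem slope=slope'
    into a single process-chain entry (illustrative schema)."""
    parts = cmd.split()
    module, params = parts[0], parts[1:]
    inputs = []
    flags = ""
    for p in params:
        if "=" in p:
            key, value = p.split("=", 1)
            inputs.append({"param": key, "value": value})
        elif p.startswith("-"):
            flags += p.lstrip("-")  # collect single-letter module flags
    item = {"id": item_id, "module": module, "inputs": inputs}
    if flags:
        item["flags"] = flags
    return item

item = command_to_chain_item("r.slope.aspect -e elevation=dem slope=slope")
print(item["module"], item["flags"])  # -> r.slope.aspect e
```

This is why prototyping locally and then prefixing "ace" works so smoothly: the command syntax is identical, only the execution target changes.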
We are one of the back-end providers. We will probably not implement everything — no back end implements everything — but the relevant parts, and you find the related information on GitHub and also on the openeo.org site. That's a Horizon 2020 EU project, by the way; if you haven't been here this morning, you can see the related talks in today's video archive. And then, eventually, there is something very interesting called actinia algebra: that is something to do massive computation in parallel. Since we are in the cloud, we also want to make good use of that. Imagine you want to compute something for an entire country — watersheds, vegetation index, runoff, whatever you can imagine; as I mentioned, it can be GIS, it can be earth observation — then all this is parallelized and executed in much less time. Of course, you need some more resources for that.

So, what's upcoming? We are almost through with the implementation of process self-description. That means — this is almost the last slide — just as you saw that we can see what data are there, we also want to see what methods are there, a kind of catalog. And if you then want to wrap around something like a WPS-style interface, you can make use of it.