Good morning. Today I'm talking about STAC, the SpatioTemporal Asset Catalog. Has any of you heard about STAC already? Well, yeah, that's great. Where have you heard about it? Anyway, this is an introductory talk, so I'm just giving the basics today.
The starting point for this was: what is annoying about metadata? If you have data and want to expose it to a search engine so that users can use it, you basically need to know a metadata standard and expose it, write XML and things like that. And if you use the data, you need to understand the metadata format as well. We're trying to tackle that with a focus on search and discovery of metadata.
We may fall into a trap with this, of course: there are 14 competing metadata standards, we make a new one, and then there are 15 competing metadata standards. That might happen, but we tried to avoid it and, of course, to give good reasons for why we're doing this.
First of all, at the moment, when you're trying to search for geospatial data, you probably end up at one of the portals that are out there. There are dozens of portals where you can download data, like the Copernicus Open Access Hub for Sentinel data, or the NASA CMR, and so on. But to find your data you need to know all these portals; otherwise you won't be able to find the data anyway.
We could also see whether there is something like a directory where everybody puts in their data, or a person who looks over all the data and puts together a directory of it. But we already realized there is too much data out there for that. Yahoo is not a thing anymore, where people looked at everything and put it into a directory.
So we now have crawlers, like Google's for example. There, everything is in one place: you can just visit Google and find everything. So we think finding things via Google or any other search engine is better than going through all of these portals to find data. And of course there are so many satellites up at the moment that there are petabytes of data, and you just need good tools to find it all.
For example, if you're looking for Sentinel data, you get this from ESA: if you just want a single granule, a single tile, you get all of this data. What you really want in the end is maybe just the metadata and the actual data file. So it's just these two things out of this whole bunch of information that you get. Then you have to look through all these files. For Sentinel-2, the metadata XML files are 20 megabytes of XML that you need to go through. In contrast, it's only 22 kilobytes that you as a normal user can really understand. For comparison, the plain-text Bible is just 4,000 kilobytes. So that's quite a lot to read if you want to understand the data. And even then I need to find some kind of documentation on how this all works and what the data and the metadata I'm finding actually are.
So why are we doing this now? There are so many standards, and also proprietary solutions, for the APIs that the portals offer. They have very similar scopes and capabilities, but they are not interoperable, and that's a barrier for adoption. It would be a good idea to unify them and make them interoperable so that one client can access all these APIs and all this data.
And so we thought STAC could be a good idea to develop. So what is STAC, actually? It basically defines a metadata standard, or rather a specification. It's not a standard, because we're not working as a standardization body, but it's a specification of what we think is useful for geospatial catalogs and assets, with the focus on search and discovery. So in most cases you won't find any information there on how to process the data. You can still link from STAC to the original metadata for processing, but to first find and discover the data you can use STAC.
It's very simple. It's JSON-based, and most people can read JSON, and it's just a very thin layer on top of the metadata. And it's extensible. Previously, when you had XML, you needed to write an XSD schema and adapt it before you could add things. Now it's just JSON, where you can put in your own things in addition to what we have specified already.
Another difference from previous standards is that you can have a static catalog that you control. You can put your metadata files together with your data, or next to it. If you have exposed Sentinel data in an S3 storage bucket, for example, you can open another S3 bucket with your static files that conform to STAC, and then you can crawl through all these metadata files. They are linked together with links, so you don't need a server to run it. You can just put it on your file storage and then it's there and Google can make use of it. You don't need to write any software for that.
Those are the static catalogs. Then there are dynamic APIs as well, of course, because if there are thousands and thousands of files, you probably need to put them into a database and index them so that you can search them better.
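The static-catalog idea described above, plain JSON files joined by links, can be sketched in a few lines. This is only an illustration under stated assumptions: the URLs, IDs and the in-memory `FILES` table are hypothetical stand-ins for files in a storage bucket, and the documents are trimmed to the fields the crawl needs. The link relations `child` and `item` are the ones STAC uses to point from catalogs to sub-catalogs and items.

```python
from urllib.parse import urljoin

ROOT = "https://example.com/stac/catalog.json"  # hypothetical URL

# Hypothetical stand-in for JSON files lying in a storage bucket.
# Documents are trimmed; real collections and items carry more fields.
FILES = {
    ROOT: {
        "stac_version": "0.9.0",
        "id": "example-root",
        "description": "Hypothetical root catalog",
        "links": [
            {"rel": "self", "href": ROOT},
            {"rel": "child", "href": "./sentinel-2/collection.json"},
        ],
    },
    "https://example.com/stac/sentinel-2/collection.json": {
        "stac_version": "0.9.0",
        "id": "sentinel-2",
        "description": "Hypothetical collection",
        "links": [
            {"rel": "root", "href": "../catalog.json"},
            {"rel": "item", "href": "./items/tile-a.json"},
            {"rel": "item", "href": "./items/tile-b.json"},
        ],
    },
    "https://example.com/stac/sentinel-2/items/tile-a.json": {
        "type": "Feature", "id": "tile-a", "links": [],
    },
    "https://example.com/stac/sentinel-2/items/tile-b.json": {
        "type": "Feature", "id": "tile-b", "links": [],
    },
}


def fetch(url):
    """Hypothetical loader; a real crawler would do an HTTP GET here."""
    return FILES[url]


def crawl(url, seen=None):
    """Follow child and item links depth-first and return the item IDs found."""
    seen = set() if seen is None else seen
    if url in seen:  # guard against link cycles
        return []
    seen.add(url)
    doc = fetch(url)
    if doc.get("type") == "Feature":  # items are GeoJSON features
        return [doc["id"]]
    found = []
    for link in doc.get("links", []):
        if link["rel"] in ("child", "item"):
            # Relative hrefs are resolved against the current document's URL.
            found.extend(crawl(urljoin(url, link["href"]), seen))
    return found


item_ids = crawl(ROOT)
```

No server logic is involved: the crawl only reads files and follows hrefs, which is why such a catalog can live on plain file storage.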
So you can also expose an API, which is based on the recent version of OGC API Features, the former WFS, the Web Feature Service from the OGC. We just put a thin layer on top of this standard to make it searchable. And it's an open specification, of course. That's why we're here: it's open source and everybody can contribute.
So what is it not? It's not a full-fledged metadata standard. As I said, it's not for processing and the like. It really focuses on search and discovery, although you can put your processing information into it if you want, since it's extensible. It's also not a replacement for the data provider's internal metadata. From your item file you can link to your other metadata or other files that you had previously, so STAC is not the single source of truth in this case. And it's not for all kinds of datasets; it's just for spatiotemporal data. You can link to additional documents, but it's not meant for putting things other than spatiotemporal data into your STAC catalog. As such, it's also not a replacement for ISO standards, for example, or CSW. There is the recent innovation work at the OGC on catalogs, or Records, which is also a new API, and we try to align with this effort too.
So what's the state of STAC at the moment? We're at version 0.9, released just some days ago, and we're heading towards releasing the first stable version, 1.0, in the third or fourth quarter of this year. There are also plans to separate the actual specification work for the metadata and for the API in the next weeks, so that each is more streamlined towards its use cases.
So what is this specification actually about, what do we expose there? There are catalogs, collections, items, the API, extensions, and best practices. So what is all this about?
A catalog is basically a very small thing for cataloging: you can group your collections and items with it. It's very simple, basically just an ID, a description, and links to whatever you want to group.
A collection is basically an extension of a catalog. It extends it and adds collection-level metadata, for example the spatial and temporal extent, the license, the providers, and so on. If you want to expose Sentinel data, for example, you want to say what the Sentinel data actually is: which platform it uses, which temporal and spatial coverage it has, which license applies and where you can find licensing information, who the provider is, and so on.
This can be used standalone. If you don't expose any assets or granules, you can use it to expose only your collections. Google Earth Engine, if you know it, for example just exposes its collections, and then you need to use their tools to actually use the data. So you can find the data as a collection, and the collection then tells you that you need to use their tool. So even if there is data that you can't download in the traditional sense, you can at least find it at any cloud provider out there that exposes it as a STAC collection. Collections are also useful for summarizing the actual item data that is exposed.
Items, then, are the actual granules, so the individual tiles. Items are basically GeoJSON features.
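To make the relation between catalog and collection described above concrete, here is a minimal sketch. It assumes that in STAC 0.9 a catalog requires `stac_version`, `id`, `description` and `links`, and that a collection additionally requires `license` and `extent`; that reading of the spec, the helper `kind`, and all values (IDs, bounding box, dates, provider) are illustrative assumptions, not taken from the talk.

```python
# Required top-level fields, assuming the STAC 0.9 spec (worth verifying).
CATALOG_FIELDS = {"stac_version", "id", "description", "links"}
COLLECTION_FIELDS = CATALOG_FIELDS | {"license", "extent"}

# A catalog: just an ID, a description and links (all values hypothetical).
catalog = {
    "stac_version": "0.9.0",
    "id": "example-root",
    "description": "Hypothetical catalog that groups collections",
    "links": [{"rel": "child", "href": "./sentinel-2/collection.json"}],
}

# A collection: the catalog fields plus collection-level metadata.
collection = {
    **catalog,
    "id": "sentinel-2",
    "description": "Hypothetical Sentinel-2 collection",
    "license": "proprietary",
    "providers": [{"name": "ESA", "roles": ["producer"]}],
    "extent": {
        "spatial": {"bbox": [[-180.0, -56.0, 180.0, 83.0]]},
        # An open end date is written as null (None in Python).
        "temporal": {"interval": [["2015-06-23T00:00:00Z", None]]},
    },
    "links": [{"rel": "root", "href": "../catalog.json"}],
}


def kind(doc):
    """Hypothetical helper: classify a document by the fields it carries."""
    keys = set(doc)
    if COLLECTION_FIELDS <= keys:
        return "collection"
    if CATALOG_FIELDS <= keys:
        return "catalog"
    return "unknown"


kinds = [kind(catalog), kind(collection)]
```

Because a collection carries every catalog field, anything that can read a catalog can also walk a collection; the extra fields only add information.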
So the feature is basically the geometry of the asset that is exposed. An asset in an item could be, for example, the file for band 1, another asset is band 2, and so on. All these assets can then be downloaded, and the item can carry additional links, for example to the provider-specific metadata in ISO format or whatever.
This is very nice if you combine it with cloud-optimized GeoTIFFs. What you can see here is basically just a browser that is working on a cloud-optimized GeoTIFF. A cloud-optimized GeoTIFF is basically a GeoTIFF that is structured a bit differently, so that with HTTP GET range requests you can browse it on a map without any server software. This one is, I think, Leaflet. If you zoom in now, it loads just the data it needs. There could be a 500-megabyte file behind that, and it only downloads the parts you see here. That's of course pretty nice if you don't want to set up a WMS especially for that: you can download just the data you need and view it while discovering, to see whether it contains the data you need or not.
The API itself is, as I said, aligned with OGC API Features. It's pretty simple.
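The range-request idea behind cloud-optimized GeoTIFFs mentioned above can be sketched without any network access. The sketch below only builds the request and parses a TIFF header from bytes; the URL, the chunk size and the helper names are hypothetical. It relies on two general facts: HTTP byte ranges are inclusive, and a classic TIFF starts with a byte-order mark (`II` or `MM`), the number 42, and the offset of the first image file directory.

```python
import struct
from urllib.request import Request


def range_request(url, offset, length):
    """Build (but do not send) a GET request for `length` bytes at `offset`."""
    last = offset + length - 1  # HTTP byte ranges are inclusive
    return Request(url, headers={"Range": f"bytes={offset}-{last}"})


def parse_tiff_header(first8):
    """Return (byte order, first IFD offset) from the first 8 bytes of a TIFF."""
    order = {b"II": "<", b"MM": ">"}[first8[:2]]  # little or big endian
    magic, ifd_offset = struct.unpack(order + "HI", first8[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF uses 43)")
    return order, ifd_offset


# Hypothetical file; a client would first ask for a small header chunk ...
req = range_request("https://example.com/scene.tif", 0, 16384)

# ... and then read the directory offsets from the bytes that come back.
# These 8 bytes are a hand-made little-endian header for illustration.
sample_header = b"II*\x00\x08\x00\x00\x00"
order, ifd_offset = parse_tiff_header(sample_header)
```

A map client repeats this pattern: read the header and directories once, then request only the byte ranges of the tiles that are currently visible.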
There is a landing page with the capabilities. There are the collections that you expose, for example Sentinel-1 and Sentinel-2, and then the items would be each granule that you can download as data. And then there is the STAC-specific search endpoint, where you can search for whatever is in the files, like the cloud cover, the extent, the provider, the license, and so on. That's defined as OpenAPI documents, so it's pretty easy to implement with the OpenAPI ecosystem as well.
For items, the metadata fields are very slim: there is a title and the extents, and you can specify some things like when the metadata was created or updated. The core is very slim, and then you extend it with extensions. For content, for example, we have extensions for describing data cubes; for EO data, which in this case means electro-optical; for machine learning, to specify the labels; for point cloud data; for SAR data; a specific one for satellite data, which is basically inherited from EO and SAR; and one for scientific data, for exposing DOIs and the like.
For the API, of course, the core is also very slim and you can extend it. With the fields extension, for example, you can say that you only want a certain set of fields in the response so that it gets smaller. You can query it with some specifics, and you can sort it. There is a transaction extension to add, remove, and update items. And of course there is one for versioning, so if there are different versions of assets you can version them.
There is a growing ecosystem behind STAC. There is already a validator, for example, where you can just put in your catalog and it validates whether it's okay and conforms to STAC. And there is an extension for Intake.
I don't know Intake, but I was told it's a big thing in the Python world. Then there is PySTAC for catalog creation and all the work with STAC catalogs, and sat-stac works similarly.
Then there are a number of clients, for example STAC Browser, which you already saw when we had the cloud-optimized GeoTIFF preview. It is basically a human-readable version of the catalog, of the JSON files, and it also exposes a schema.org translation, for example, so that the catalog can be crawled by Google and their new Google Dataset Search. There is a QGIS plugin, sat-search for searching data, sat-fetch for fetching or downloading the data, and then there is sat-api-browser. STAC Browser is more for the static catalogs and just goes through the links that are in the data; sat-api-browser is more for the API part, because there you can also search.
Then there are a couple of server implementations that you can use to expose your data, for example Staccato in Java, sat-api, which I think is a Node.js application, and sat-api-pg, which I think is basically Python with a PostgreSQL database behind it.
This, for example, is a QGIS plugin where you can just specify your query parameters. It then searches for data and loads the data directly as cloud-optimized GeoTIFF into your QGIS instance to work with. As it's a cloud-optimized GeoTIFF, you can also zoom directly into it and it loads the proper data, and so on.
We're working on making several catalogs openly available. At the moment there are Sentinel-1 and Sentinel-2 and Landsat 8, and USGS Landsat Collection 2 is offered directly as STAC and COG catalogs by the USGS. There is CBERS-4, which is a Chinese-Brazilian Earth observation satellite, and NAIP. NASA CMR is also translated into STAC, and there are a couple more things that are coming and in preparation. And maybe in the future also your data.
It's pretty simple to expose such things. So if you have data that you want to be found, I think it's a good idea to expose it as a STAC catalog. Here's an overview of who is already exposing their data as STAC and working with STAC: quite a number of organizations.
And now maybe I'll just show you a single example. This one is a catalog, a collection in this case. It's basically just JSON, and I guess most of you can read what is in there. It gives you an ID, which is COPERNICUS/S2 here, then the title, a description of what it is about, a license, keywords, provider information, the extent information, temporal, and all the other summary data: how to cite it, what the ground sampling distance for the individual bands is, the constellation, platform names, projection information. Then the bands, so which bands are in the actual assets that you can download, band 1, band 2, and so on, with common names and wavelengths, and the links that you can visit to get more information. That's a collection.
Now you can also look at the items, each of which is an individual granule. It says, for example, which ID it has and which collection it belongs to, and there are again links to get additional information, the bounding box of the granule, the geometry, and the assets that you can download, which are JSON files or JPEG 2000 files, either the thumbnail or the actual data. Is there anything else in here? Yeah, some additional properties like cloud cover values and so on. So that's how it works.
And yeah, I'm happy to take questions if there are any. Thank you very much for listening to this talk.
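The item walk-through above can be mirrored in a short sketch. Everything concrete here is hypothetical: the IDs, coordinates, dates, URLs and the helpers `make_item`, `clear_items` and `asset_hrefs` exist only for illustration. The sketch assumes the item layout described in the talk (a GeoJSON feature with `id`, `collection`, `bbox`, `geometry`, `properties`, `assets`, `links`) and the `eo:cloud_cover` property from the EO extension.

```python
def make_item(item_id, cloud_cover):
    """Build a hypothetical, trimmed STAC item for one granule."""
    west, south, east, north = 10.0, 45.0, 11.0, 46.0
    base = f"https://example.com/sentinel-2/{item_id}"  # hypothetical URL
    return {
        "type": "Feature",  # items are GeoJSON features
        "stac_version": "0.9.0",
        "id": item_id,
        "collection": "sentinel-2",
        "bbox": [west, south, east, north],
        "geometry": {
            "type": "Polygon",
            "coordinates": [[
                [west, south], [east, south], [east, north],
                [west, north], [west, south],
            ]],
        },
        "properties": {
            "datetime": "2020-03-01T10:20:30Z",
            "eo:cloud_cover": cloud_cover,
        },
        "assets": {
            "thumbnail": {"href": f"{base}/thumbnail.jpg", "type": "image/jpeg"},
            "B01": {"href": f"{base}/B01.jp2", "type": "image/jp2"},
        },
        "links": [{"rel": "collection", "href": "../collection.json"}],
    }


def clear_items(items, max_cloud):
    """Keep items whose cloud cover is known and at most `max_cloud` percent."""
    return [
        item for item in items
        if item["properties"].get("eo:cloud_cover", 100.0) <= max_cloud
    ]


def asset_hrefs(item):
    """Map each asset key to its download link."""
    return {key: asset["href"] for key, asset in item["assets"].items()}


ITEMS = [make_item("tile-a", 12.5), make_item("tile-b", 80.0)]
selected = clear_items(ITEMS, 20.0)
downloads = asset_hrefs(selected[0])
```

This filtering happens on the client. An API with the query extension could apply the same kind of condition on the server, so that only matching items are returned.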