Thank you, Frederik. Hello, everyone. Yeah, we are going to talk a bit about Wikidata and OpenStreetMap and how we can use them in our applications. Why do we care about that? When we think about applications, the first thing that comes to mind for most of us is probably code. But that is by far not enough to actually make applications work and be useful. A free media player only solves half of the problem as long as the audio and video files I want to play are stuck in some proprietary streaming service. A free software email client only solves part of the problem as long as the emails are hosted at Google. And a free software spell checker only solves half of the problem as long as the dictionary for my language isn't freely available. That last bit is an example of data we need for our applications to function properly. We actually have a number of examples in KDE where applications rely on data. One of the obvious ones is probably Marble, which is basically just data, but there are a lot of cases where this is much more subtle. The metadata retrieval in media players comes to mind. There are a number of places that do some form of coordinate-to-time-zone or coordinate-to-country-or-region mapping, for image viewers, for setting up the clock, and things like that. And then of course there is the crazy stuff in things like KDE Itinerary, which is of course the reason why I looked into this entire topic. If you haven't seen it: there we do things like checking whether the power plugs of your home country and the travel destination country are compatible or whether you have to bring an adapter, and detecting airports and stations mentioned in a travel document. That of course relies heavily on data as well. And there are probably many, many more examples all over KDE where this is the case. So where do we get that data from?
Fortunately, there has been a positive change over the last ten years or so, with open data becoming a thing in the same way free software became a thing in the decade before that. That results in organizations like government agencies or even bigger companies publishing relevant data sets. However, those are often fairly disconnected from each other and don't use any kind of standard identifiers, so merging the individual data sets is quite some non-trivial work. The formats are often fairly primitive, like CSV or spreadsheets. They come under a varying set of licenses. And often this data is de facto read-only: you find a typo in there, and there is usually no process to get back to the government agency it came from and tell them "here's a patch", and then they would need some kind of internal process to propagate that and eventually release a new data set. Which results in you having to do all kinds of workarounds in your code instead. Fortunately, there is a much better way to work with this, and that's the two giant databases maintained by the Wikidata and OpenStreetMap communities. They aggregate an enormous amount of data from various different sources, provided by somebody else or collected themselves, into unified databases with a unified format, a unified interface, a unified license, and in a way that makes it really easy to edit and change. So if we find a few typos in there, we fix them upstream. That's the easiest way for us to deal with the problem, and it benefits everyone else as well. For Itinerary, we were missing some special French and Belgian station identifiers in Wikidata, so we just added them upstream. That avoids a whole lot of complication in our code, and it's possible if you use those systems. So, a very brief overview of what's actually in Wikidata and OpenStreetMap.
And I'm sorry, Lydia, if I'm butchering your work here a bit; I had to simplify this enough to squeeze it onto two slides. Wikidata aims at basically being the machine-readable form of Wikipedia. That means a very, very broad scope: any factual statement is basically in scope. So far that has resulted in 8 billion statements about 100 million objects, and counting; this is growing very rapidly. And it is accompanied by another 60 million media assets: images, logos, that kind of stuff. Technically, what this database contains are subject-predicate-object triples. If you're old enough to have been around in the KDE 4 era, you might remember some of this from Nepomuk, and indeed you'll find a number of similarities here. The subjects are items, which are basically just numerical identifiers prefixed with a capital Q. The predicate can be any of about 9,000 different properties, again represented by a numerical identifier, prefixed with a capital P. That can be anything from very widely used and very generic things like "instance of" or "creation time" to something very niche and very specific like "power plug type". And the object can be a primitive type like a string, number, or date, or a reference to a media asset or another item. Those statements can be qualified, so we can specify, for example, in which time frame they were valid; that way we can also model how things change over time. As an example, in Wikidata speak, something like "Q1431 P31 Q2989352" obviously means "KDE is a free software community". OpenStreetMap is very similar in database size: about 6 billion points, and 700 million lines and polygons of geometry, representing anything on Earth. The license for OSM, the ODbL, is something we need to pay a tiny bit more attention to than Wikidata's CC0: the ODbL has share-alike and attribution requirements, somewhat comparable to what the LGPL requires for code.
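To make the triple structure just described a bit more concrete, here is a minimal sketch of the Wikidata data model in Python: items as Q-identifiers, properties as P-identifiers, and statements as subject-predicate-object triples with optional qualifiers. This is purely illustrative and not any actual Wikidata client API.

```python
# Illustrative sketch of Wikidata's statement model, not a real API.
from dataclasses import dataclass

@dataclass(frozen=True)
class Statement:
    subject: str          # item id, e.g. "Q1431"
    predicate: str        # property id, e.g. "P31" ("instance of")
    obj: str              # another item id or a primitive value
    qualifiers: tuple = ()  # e.g. (("valid from", "2010-01-01"),)

def is_item(identifier: str) -> bool:
    """Items are a capital Q followed by a numeric identifier."""
    return identifier.startswith("Q") and identifier[1:].isdigit()

def is_property(identifier: str) -> bool:
    """Properties are a capital P followed by a numeric identifier."""
    return identifier.startswith("P") and identifier[1:].isdigit()

# The example from the talk: "KDE is a free software community".
stmt = Statement("Q1431", "P31", "Q2989352")
```

The qualifiers tuple is where things like validity time frames would go, which is how Wikidata models change over time.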
So it is easy for us in the free software world to comply with, but we actually have to take care of it. Technically, there are three basic element types in here: nodes, ways, and relations. Nodes are points, so basically just a geographic coordinate, at 100 nanodegree resolution; depending on your latitude, that is in the centimeter range, so you can even model room-scale objects. Ways are ordered sequences of nodes; that's what is used to build lines or simple polygons. And relations are used both for semantic groupings, like "these five buildings belong to a campus", and for modeling complex polygons, i.e. polygons that have holes in them and that you can't represent with a single way. And then comes the part where this gets really interesting: each of those elements can be annotated with a large set of key-value pairs, and that's what actually adds the meaning to the geometry data, because everything in here is described on a semantic level. For a line, you won't find an annotation telling you "render this 10 pixels wide in a black-and-white dash pattern on the map"; it will instead tell you "this is a railway track", and then any visual display decides how that is rendered. That's not in the data. And the data in here goes far beyond what you would usually see visualized on a map. Imagine you're building a digital assistant and you have a query like: find me a pizza place within 500 meters of where I am right now, that I can enter with a wheelchair, that offers vegan options, where I can pay with one of my credit cards, and which is near a parking spot where I can charge my electric car with a Type 2 connector. The data to answer that is all in here. So it's incredibly detailed and can be used for much more than just rendering a map. And if either of those two databases isn't good enough on its own, they offer cross-referencing between OSM elements and Wikidata items in both directions.
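The three OSM element types can be sketched roughly like this. Coordinates are stored as integers in units of 100 nanodegrees (seven decimal places), as described above; the concrete tag values in the sketch are illustrative, not an attempt to document real OSM tagging.

```python
# Rough sketch of OSM's node/way/relation model; tags carry the
# semantic meaning. Illustrative only, not a real OSM library.
from dataclasses import dataclass

SCALE = 10_000_000  # 1e7: degrees <-> 100 nanodegree integer units

@dataclass
class Node:
    lat: int     # latitude in 100 nanodegree units
    lon: int
    tags: dict

@dataclass
class Way:
    nodes: list  # ordered sequence of node ids -> a line or polygon
    tags: dict

@dataclass
class Relation:
    members: list  # semantic grouping or complex (multi-)polygon
    tags: dict

def to_degrees(value: int) -> float:
    return value / SCALE

# A railway track is described semantically, not visually:
track = Way(nodes=[1, 2, 3], tags={"railway": "rail"})
node = Node(lat=int(52.525 * SCALE), lon=int(13.369 * SCALE), tags={})
```

The point of the sketch is the last part: the way carries no rendering instructions at all, only the semantic statement that it is a railway track.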
Okay, so how can we make use of that in our applications? There are basically two approaches. One is bundling the data we need with the application for offline use, and the other is accessing some kind of online API. Bundling works if you have a reasonable amount of data, and data that is fairly static, i.e. unlikely to change within a release cycle. That actually applies to a surprising number of use cases, because since you're doing the preparation offline as a developer, you can put quite some effort into packing the data you ship so that it needs very little space and can still be accessed very efficiently at runtime. There's a lot you can squeeze into a few hundred kilobytes if you're not using XML or JSON. Then of course there remains the question of where you, as the developer, get the data from. For that there are the online query APIs that we'll see in a second. In some cases there are derivative databases; for example, there's the time zone boundary shapefile, a just 120 megabyte extract with the exact vector borders of the time zones, generated from the full OSM data set. It is of course a lot more efficient to work with a small subset like that, if you find one matching your use case. Otherwise there's always the possibility of working with the full data sets; 60 gigabytes of download is hard, but still manageable if you don't have another option. Then, for online access, there are generally two different types of APIs, found in both Wikidata and OpenStreetMap. One is a simple single-item access API. That usually has a very fast response time, in the millisecond range; Wikidata for example uses it to power its auto-completion. So that scales very well, but offers only a somewhat limited way of querying.
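To illustrate the "don't ship XML or JSON" point about bundled data, here is one possible approach: packing coordinates into fixed-size binary records that are compact and can be binary-searched or memory-mapped at runtime. This is a generic sketch of the technique, not the actual format any KDE application uses.

```python
# Sketch: pack coordinate pairs as int32 values in 1e-7 degree units
# (8 bytes per point) instead of JSON text. Illustrative only.
import struct

RECORD = struct.Struct("<ii")  # lat, lon as little-endian int32

def pack(points):
    """points: iterable of (lat, lon) in degrees -> compact bytes."""
    return b"".join(
        RECORD.pack(round(lat * 1e7), round(lon * 1e7))
        for lat, lon in points
    )

def unpack(blob):
    return [
        (lat / 1e7, lon / 1e7)
        for lat, lon in (RECORD.unpack_from(blob, off)
                         for off in range(0, len(blob), RECORD.size))
    ]

points = [(52.5200066, 13.404954), (48.8566101, 2.3514992)]
blob = pack(points)
# 8 bytes per point here, versus roughly 40+ bytes per point as JSON.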
The second option is the complex query services, using query languages like SPARQL (again something you might remember from Nepomuk) or Overpass QL for OpenStreetMap. The little example I have here lists all the members of the KDE community as known to Wikidata. Those services come with very nice interactive tools to work with, and that already shows they are much more focused on research, experiments, and obtaining data for offline processing than on use from applications. That's emphasized by the fact that if you're lucky the response comes within seconds, but even waiting minutes isn't unheard of. So definitely not something to use from within an application, but very useful for you as a developer. And then there are a number of third-party services built on top of the Wikidata or OpenStreetMap data sets that might be usable for specific purposes. One of them is actually hosted by KDE itself: maps.kde.org, the backend for the vector maps in Marble. We are currently working on updating it to ensure it has worldwide coverage and typically doesn't lag behind the upstream data by more than 24 hours. What this offers us is a very efficient way to retrieve basically all the OpenStreetMap data for a very small region, say a range of a few hundred meters. If you look at an area of that size, the full raw data is usually just a few kilobytes. So this gives us a very flexible and powerful mechanism for any kind of use case you might come up with, as long as it has a very localized need for data. And there are plenty of other services worth mentioning, like the routing or geocoding services offered around OpenStreetMap, for example. When we use online access from within an application, there are a few things to consider.
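The SPARQL example mentioned above might look roughly like the sketch below, reusing the Q1431 identifier from the earlier slide and assuming Wikidata's "member of" property (P463) — both of which you should verify before relying on them. The request is only constructed here, not sent; as said, such queries are developer tooling, not something to run from inside an application.

```python
# Sketch of building (not sending) a SPARQL request against the
# Wikidata query service. Property/item ids are assumptions to verify.
from urllib.parse import urlencode

ENDPOINT = "https://query.wikidata.org/sparql"

QUERY = """
SELECT ?member ?memberLabel WHERE {
  ?member wdt:P463 wd:Q1431 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

def build_request_url(query: str) -> str:
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})

url = build_request_url(QUERY)
```

Pasting the query into the interactive editor at the query service is the more comfortable way to experiment with it.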
Privacy is one of the obvious issues, because it becomes very, very easy here to leak high-resolution coordinates, i.e. possibly the location where the user is right now or where the user lives, as well as specific interests or activities. Remember the query example for the digital assistant from earlier: if you send that as-is to a server, you leak the exact position where I live, as well as my very specific interests and the specific constraints on what I'm searching for. If on the other hand you run it against something like the maps.kde.org interface, that already reduces the coordinate resolution to a few hundred meters, and it doesn't leak anything about what I'm actually looking for, because it just gives me the entire data set for that area. Whether that's good enough from a privacy point of view, I guess, depends on what alternatives you have. The offline approach will always win against it, but that's not always feasible, obviously. And if you use somebody else's online APIs, check the guidelines and rules for them, because some of this can cause quite some load on somebody else's server. Somebody has to pay for that, and the Wikidata and OpenStreetMap communities are communities just like us. So it's always important to keep the cost on that side in mind. If you want to do anything like this in your application, I have listed a few examples here, just covering code that I touched recently. There's probably more; Marble comes to mind, where some of the things I mentioned are actually used or done in some way, so that might give you some examples or inspiration on how this could be approached: offline and online access for those kinds of data, and various forms of more or less elaborate local pre-processing. And yeah, with that, we are already coming to an end.
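The coordinate-resolution reduction just described can be sketched in a few lines: snap the position to a coarse grid on the device, and only send the tile coordinate to the service. The grid size is an illustrative choice, not what maps.kde.org actually uses.

```python
# Sketch: coarsen a coordinate before sending it to an online service,
# so only an approximate tile is leaked, never the exact position.
def coarsen(lat: float, lon: float, grid: float = 0.01):
    """Snap a coordinate to a grid (0.01 degrees is roughly 1 km)."""
    return (round(lat / grid) * grid, round(lon / grid) * grid)

# The exact position stays on the device; only the tile goes out.
approx = coarsen(52.5200066, 13.404954)
```

Combined with fetching the full raw data for that tile and filtering locally, the service learns neither where exactly I am nor what I'm looking for.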
So I hope that gave you some ideas of what's possible and a bit of an overview of how we can approach using OpenStreetMap and Wikidata in applications. I also have a few questions where I'd be interested in feedback. For one, do you think it makes sense to extract or collect some of those building blocks for working with this kind of data in, say, a separate library or framework? And do you see data-based features that would make sense in Frameworks? I have the suspicion that some of the coordinate-based lookup features, to get time zones or countries and so on, might be of broader interest there. So that would be something I'd be interested in discussing this week, for example. And yeah, I think we have a few minutes for questions.

Yeah, thank you so much for your talk. It's so fun to see Volker dive deeper and deeper into data and itinerary and transport. We currently have two questions. The first one is: is there data with shop opening times?

Yes, I actually missed that in my query example. That is of course in the OpenStreetMap data. And that, again, is probably a topic for an entire talk and an entire framework, because it models opening times while considering local public holidays and various different patterns and seasonal things; they have all of that in there. It's a very, very detailed, comprehensive specification of the format. Writing a parser and interpreter for it is something we still need to do; I expect that to be similar in complexity to the iCal recurrence handling. But yes, it's there, and it's extremely powerful in what you can model with it.

Okay, second question: are you reusing data from Nomantim? No, Nominatim, sorry, for address lookup. You probably know what that is.

I have no idea... ah, that's the geocoder, or reverse geocoder; I always mix up which one is reverse.
That is one of the third-party services that come to mind for specific use cases with geographic data, namely geocoding. We are currently not doing geocoding in Itinerary yet, but we will need it at some point. If you get a hotel reservation with just the address, we need to know where that actually is in order to plan a way to get there, and to fill in all the missing bits: which country is this in, which time zone is this in, and so on. So that is important to have. I am not sure it's possible to do that offline, seeing that this was added to the question; I suspect the database behind it to be in the multi-gigabyte range, so not something you want to deploy on a mobile phone. That is typically something only realistically solvable by an online service, I think.

All right. Yeah, great discussion in the chat: how do I find the way from my work desk to my bed, and stuff like that. Not sure if... are you solving that?

We are working on that, not necessarily for the bed-to-desk scenario, but one of the reasons I find myself digging deeper and deeper into OpenStreetMap is that we are looking into indoor maps for big train stations and airports, and into doing some sort of navigation in there: what is the most efficient way to switch between your trains if you're in a hurry, in stations like Berlin Hauptbahnhof where you have like eight floors and only one working elevator, and so on. But ultimately, if you can model that in OpenStreetMap and work with it, then if you add your bedroom and your office to the OpenStreetMap data, it should work there as well.

Nice. All right, then I guess we are out of official questions. One question from my side: accessibility, I still care about that. Wheelchair accessibility data, blind navigation lanes and so on; I guess there's a lot of data for that as well?

I have only scratched the surface of that, but there is a huge amount of tagging on that subject, for wheelchair access.
There is a complete model for toilets and restrooms. There's tactile paving and tactile signs, and information about that, and probably more that I haven't found yet or don't even understand. So there seems to be a large sub-community with an accessibility interest involved there. There are things like Wheelmap that are built on top of OSM, and they also have APIs for the live status of elevators and that kind of stuff. So there are people looking specifically at those topics, and of course it would be interesting for us to leverage that as well.
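To give a feel for the opening hours parsing work mentioned in the Q&A, here is a deliberately tiny sketch that evaluates only the simplest kind of expression ("Mo-Fr 09:00-17:00"). The real OSM opening_hours format is vastly richer (holidays, seasons, exceptions), which is exactly why a proper parser and interpreter is its own project.

```python
# Toy evaluator for a trivial subset of OSM-style opening hours
# expressions. Illustrative only; nowhere near the full format.
import re
from datetime import datetime

DAYS = ["Mo", "Tu", "We", "Th", "Fr", "Sa", "Su"]
PATTERN = re.compile(r"(\w\w)-(\w\w) (\d\d):(\d\d)-(\d\d):(\d\d)")

def is_open(expr: str, when: datetime) -> bool:
    m = PATTERN.fullmatch(expr)
    if not m:
        raise ValueError("unsupported opening hours expression")
    d1, d2 = DAYS.index(m.group(1)), DAYS.index(m.group(2))
    start = int(m.group(3)) * 60 + int(m.group(4))
    end = int(m.group(5)) * 60 + int(m.group(6))
    minutes = when.hour * 60 + when.minute
    return d1 <= when.weekday() <= d2 and start <= minutes < end
```

Even this toy version hints at the complexity: day ranges, time ranges, and calendar awareness all interact, much like iCal recurrence rules do.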