You've got the label, which is how the item is known in Wikidata, then the item types, and an extract taken from Wikipedia. In this case there's only a French Wikipedia article, so the extract is in French. Underneath that, we've got the suggested matching OpenStreetMap item: the system thinks these two things are the same thing. Over here we've got the map; the blue pins show where the Wikidata item coordinates are.
So we've got this option, "show on map", and that will zoom in on one of these matches so you can see it. I'm going to use the Maison du Roi in the Grand Place as my example. So this is zoomed in. On the map, the red pin is the selected Wikidata item. There's a blue border, a polygon, around the OpenStreetMap object, and the system is saying this is an exact match. If we like a match, we tick the box next to the item to say that it's good. So we can go through all of these and check that they're valid.

Once we're happy, we click the Save button to add the links to OpenStreetMap. You get a confirmation page where you see the same list of matches again; the system is asking, are you really sure you want to save these? There's the list of matches, and then you get a change comment. You can set the comment for your changeset. It makes one up by default, but you can edit it. You hit Save, and it saves these links into OpenStreetMap.

So people are using the tool. 140 users have used it, and you can see there are 6,500 changesets and almost 180,000 Wikidata links added to OpenStreetMap.

The system uses these matching criteria for deciding whether something is a match between the two systems. Here is an example. Here's a pub. So: entity type, coordinates. The system looks at the name and sees if the name is the same. It does some normalisation on the name: it lowercases the name and removes "and" and bits and pieces like that. And if it can't match on the name, it will try matching on the street address.

I've got some more examples. This is Paddington Station in London. Here it will look at the identifier, the station code, which is in both Wikidata and OpenStreetMap. In fact there are lots of identifiers that I compare. These identifiers all have a key that appears in OpenStreetMap and a property in Wikidata that I can use to match on. Here's another example. Here's a lighthouse.
And all lighthouses have a standard reference number that I can match on.

So this is a theatre in the centre of Brussels. You can see the pin where the theatre is. The system knows from Wikidata that it's a theatre, and it knows that OpenStreetMap uses amenity=theatre to represent theatres. But how do we get that mapping between the two? If we have a look at Wikidata, here's the theatre on Wikidata, and it's got "instance of: theatre". There's a kind of type system in Wikidata, so we know this is a theatre. Then if we look at the theatre page, you can see there's a property within Wikidata for "OSM tag or key". So Wikidata knows OpenStreetMap tags, and that's how the system can figure out how a theatre is represented on OpenStreetMap.

The important thing if you want to work with Wikidata is that you need to use the Wikidata Query Service. In the background, my system is using this. There's a user interface that you can look at and try out. The queries are written in SPARQL, which is a semantic query language. This example query is theatres in Brussels. SPARQL is kind of complicated. You don't have to know it to use the tool, but it's very useful; if you want to work with Wikidata, you should figure out SPARQL.

I can use SPARQL to look at the OSM tag or key within Wikidata as well. This is a search for amenities within Wikidata, and you can see it's found a list of the various types of things that are amenities and the OSM key that goes with them. This is the kind of search the system is doing underneath, with much more complex SPARQL queries. And SPARQL supports bounding box searches, which is important for geospatial data. So that's the bit of the code that you use for searching in a bounding box.

I'll just talk a bit about how the matcher runs.
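As a rough illustration of the bounding box search mentioned above, the sketch below builds a query string for the Wikidata Query Service. It is not the tool's actual query, which the talk describes as much more complex. The function name `bbox_query` and the example coordinates are made up for illustration; the properties used are P625 (coordinate location), P31 (instance of) and P1282 (OSM tag or key).

```python
def bbox_query(south: float, west: float, north: float, east: float) -> str:
    """Build a SPARQL query for Wikidata items located inside a bounding box.

    Hypothetical helper for illustration only; it returns the query text
    and does not contact the query service.
    """
    if south >= north or west >= east:
        raise ValueError("bounding box corners are in the wrong order")

    # WKT points are written longitude first, then latitude.
    south_west = f"Point({west} {south})"
    north_east = f"Point({east} {north})"

    return f"""
SELECT ?item ?itemLabel ?location ?osmTag WHERE {{
  SERVICE wikibase:box {{
    ?item wdt:P625 ?location .
    bd:serviceParam wikibase:cornerSouthWest "{south_west}"^^geo:wktLiteral .
    bd:serviceParam wikibase:cornerNorthEast "{north_east}"^^geo:wktLiteral .
  }}
  ?item wdt:P31 ?type .        # instance of
  ?type wdt:P1282 ?osmTag .    # OSM tag or key stored on the type item
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
""".strip()


# Example: a small, made-up box somewhere in central Brussels.
query = bbox_query(south=50.84, west=4.34, north=50.85, east=4.36)
print(query)
```

The resulting text can be pasted into the query service's user interface to try it out. The simple corner check means this sketch does not handle boxes that cross the antimeridian.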
You do a search for a place, it gives you some search results, and you pick one of them. The search results come from the Nominatim API, which also gives us the polygon for the place. Once we've got the polygon we can figure out the bounding box and go and ask Wikidata for items within that bounding box. We also grab the first few paragraphs of text in every language, so that we've got the excerpts to show on the comparison page and also to get the street addresses, which often appear in the excerpt.

This is built with WebSockets: the user sits looking at this map for a minute or two while it's doing some processing, and it shows you the status as it's updating. The next step is that it goes off and searches the OpenStreetMap Overpass API to find matching items within the bounding box, and then it loads all of that data into PostGIS to be able to do the comparison. Then it runs the matching process to try and find things.

If anyone's interested, this is the stack that I've used to build it. It's all written in Python with Flask and SQLAlchemy. I'm using Leaflet and Bootstrap on the front end.

I'm just going to talk about some of the other features in the software. One of the problems I had was what language I should use for showing labels. There isn't an easy way to find the preferred language for a particular country, and even country level isn't useful because sometimes it varies by region. So the system tries to guess what language to show the labels in. In central Brussels, it's decided to use Dutch as the top one because that has the most labels, then French, then English. But if we don't like it, we can change it. There's an edit button and you can drag and drop to reorder, so I can switch them around if I want to change it. I'll show you some more features.
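The language guess described above, where the language with the most labels comes first, could be sketched along these lines. This is only an illustration of the idea as stated in the talk; the function name `rank_languages`, the data layout and the sample labels are assumptions, not the tool's real code or data.

```python
from collections import Counter
from typing import Dict, List


def rank_languages(items: List[Dict[str, str]], top: int = 3) -> List[str]:
    """Order language codes by how many items have a label in that language.

    `items` is a list of label dictionaries, one per Wikidata item,
    mapping a language code to the label text (an assumed layout).
    """
    counts: Counter = Counter()
    for labels in items:
        # Each item contributes at most one vote per language.
        counts.update(labels.keys())
    return [language for language, _ in counts.most_common(top)]


# Made-up sample labels for three items in the search area.
sample_items = [
    {"nl": "Grote Markt", "fr": "Grand-Place", "en": "Grand Place"},
    {"nl": "Broodhuis", "fr": "Maison du Roi"},
    {"nl": "Beurs van Brussel"},
]
language_order = rank_languages(sample_items)
print(language_order)
```

Because the result is only a guess, it makes sense to treat it as a default ordering that the user can override, which is what the drag-and-drop reordering does.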
The system detects that the centre of Brussels is quite small, and complains and says I might want to choose a larger area. It gives some suggestions for bigger areas that I might want to search on. Equally, if I try searching for Belgium it'll work, but it's big for the system. If I click on Belgium you get to the page where it runs and tries to find the matches, but it's too big, so the system splits it up into chunks. If you just try to do the whole of a country at once, you'll get a timeout from the Wikidata Query Service and from the OpenStreetMap Overpass service. So I split the area into chunks, do them one at a time and then recombine the results. Even with the chunking I sometimes hit timeouts, so the system detects when I hit a timeout, then splits it into four chunks and retries.

So this is one approach for doing large areas. The problem you'll have is that the list of matches will have something like 10,000 items on it that you've got to go through and check. And there's no kind of bookmarking where you can do half and then come back later. If you leave the browser window open it will work. So it would be better if we had a different approach, which is that we can use the browse interface. There's a link here for browsing. If I click browse then I get a list of the subregions within Belgium and I can zoom in on these. If I click on Brussels-Capital Region then these are all its municipalities.

Could I just get a quick word? People who are sitting on the stairs or standing here in the exit, could I please ask you to leave, because you're blocking one of the emergency exits. So either take one of the free seats if they're still available; otherwise, people sitting on the stairs and standing here are kindly requested to leave. The live stream is on and it is working, so you can watch that. There are some places available here. I'm sorry to disturb you, but I couldn't reach you. Sorry? You're not the same. Okay, great. Thanks.
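The chunk-and-retry approach described above, where an area that times out is split into four and retried, can be sketched roughly as follows. This is a minimal illustration under stated assumptions: `split_in_four`, `fetch_with_retry` and the fake `fetch` function are hypothetical names, and a real fetch would call the Wikidata Query Service or the Overpass API rather than the stand-in used here.

```python
from typing import Callable, List, Tuple

# Bounding box as (south, west, north, east).
BBox = Tuple[float, float, float, float]


def split_in_four(bbox: BBox) -> List[BBox]:
    """Split a bounding box into four equal quadrants."""
    south, west, north, east = bbox
    mid_lat = (south + north) / 2
    mid_lon = (west + east) / 2
    return [
        (south, west, mid_lat, mid_lon),
        (south, mid_lon, mid_lat, east),
        (mid_lat, west, north, mid_lon),
        (mid_lat, mid_lon, north, east),
    ]


def fetch_with_retry(bbox: BBox, fetch: Callable[[BBox], list],
                     max_depth: int = 3) -> list:
    """Fetch results for a box; on timeout, split into four and retry."""
    try:
        return fetch(bbox)
    except TimeoutError:
        if max_depth == 0:
            raise  # give up rather than splitting forever
        results = []
        for quarter in split_in_four(bbox):
            results.extend(fetch_with_retry(quarter, fetch, max_depth - 1))
        return results


def fake_fetch(bbox: BBox) -> list:
    """Stand-in for a remote query: 'times out' when the box is too large."""
    south, west, north, east = bbox
    if (north - south) * (east - west) > 4:
        raise TimeoutError("area too large")
    return [bbox]


chunks = fetch_with_retry((0.0, 0.0, 4.0, 4.0), fake_fetch)
print(chunks)
```

One thing a real implementation would need to watch is that an item sitting exactly on a shared edge could come back from two neighbouring chunks, so the recombined results would want de-duplicating by item ID.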
So that's most of the features I want to talk about in the software. I'll just talk about Wikidata a bit for people who aren't familiar with it. It's a database of structured data run by the Wikimedia Foundation, the same people as Wikipedia. It's been around since 2012. And why do we want to do this is the other question.

I'm going to use the Grand Place as my example. Here's the Grand Place on Wikipedia, and here's a link to take you to Wikidata. Here's the Wikidata item that represents the Grand Place. We get lots of links to Wikipedia: there are 50 languages with articles written about the Grand Place, which is useful. This is the main chunk of a Wikidata page. You get a list of statements. This is a bit like tags in OpenStreetMap, with a key and a value. And then this is the key thing for referring to a Wikidata item: they all have a unique identifier. It starts with a Q followed by a number, and that appears in the URL as well. Wikidata identifiers are permanent and stable; they don't change over time when something gets renamed. So they're a useful way of linking into a catalogue.

This is what it looks like when you look at the Grand Place on OpenStreetMap. You can see it's got the Wikidata link in there as a tag.

So again, what do we get from Wikidata? We get a link to Wikimedia Commons; if you want photos of the Grand Place, there are over 200. We get some more labels: you can have the name in different languages, more than appear in OpenStreetMap. And we get some external identifiers. Wikidata has the Freebase ID, the GeoNames ID and the World Heritage Site ID. All very useful. Just by having the Wikidata link, we're linked into these external catalogues. So just to recap, this is what we get: labels in more languages, links to Wikipedia, links to Wikimedia Commons, and identifiers for other data catalogues.
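To make the identifier format concrete, here is a small sketch that reads the `wikidata` tag from a dictionary of OpenStreetMap tags, checks that it looks like a Q followed by a number, and builds the item URL. The helper name `wikidata_url_from_tags` and the sample tags are made up for illustration; Q42 is used purely as a well-known example ID, not as the ID of anything in the talk.

```python
import re
from typing import Dict, Optional

# A Wikidata item ID: the letter Q followed by a positive number.
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def wikidata_url_from_tags(tags: Dict[str, str]) -> Optional[str]:
    """Return the Wikidata item URL for an OSM object, if it has a valid tag."""
    qid = tags.get("wikidata")
    if qid is None:
        return None  # the object has not been linked yet
    if not QID_PATTERN.fullmatch(qid):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return f"https://www.wikidata.org/wiki/{qid}"


# Made-up OSM tags; Q42 is only a placeholder ID.
example_tags = {"name": "Example", "wikidata": "Q42"}
url = wikidata_url_from_tags(example_tags)
print(url)
```

Because the ID stays the same when an item is renamed, storing just this one tag is enough to reach the labels, Wikipedia links and external identifiers listed above.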
So this is a good thing. There are people adding the tags to OpenStreetMap by hand, but it's time-consuming, and that's why I thought it would be good to automate it. But there are also some difficulties in trying to link the two systems. The licences are different. Wikidata is CC0, which is like public domain, whereas OpenStreetMap uses its own database licence. So you can't copy any data from OpenStreetMap into Wikidata because of the difference in licensing.

But it gets even worse than that: they use different intellectual property jurisdictions. OpenStreetMap relies on database rights under European law, and the Wikimedia Foundation is keen on US intellectual property law, which says that things like coordinates are facts and aren't protected by intellectual property. So some people within the OpenStreetMap community are suspicious of where the coordinates in Wikidata come from. They question whether a lot of them were copied from Google Maps: people look up where something is on Google Maps, get the coordinates and put them in Wikidata, in which case does that make Wikidata a derived work of Google Maps? But I think these problems don't really affect this tool, because I'm not copying any data between the systems. I use the coordinates to find the matches, but the only thing I'm doing is adding the link.

My first attempt at this was a fully automated system where I was just uploading tags without checking first, and that was against the rules. People were unimpressed. I had a role account doing that, which got blocked. So it's better to have the user interface where people can check things, and also local people can check things in their own area. It's not just me trying to do the whole world. So yeah, machine-assisted editing is good.

What about adding links in the other direction? It would be nice to put links in Wikidata that point to OpenStreetMap.
Now that is difficult, because OpenStreetMap doesn't have stable identifiers. This is the URL for the Grand Place, and you can see it's got an ID in there. That ID isn't guaranteed to stay the same. Someone is free to come to OpenStreetMap and redraw the Grand Place, maybe in finer detail, and the ID will change. There have been discussions within OpenStreetMap about adding permanent IDs that don't change, but those have been going on for years and it still doesn't have permanent IDs. They're quite permanent; this one probably won't change, but they're not quite permanent enough for us to start putting them in Wikidata. So we just have the links going in one direction at the moment.

So, just another screenshot of the tool, and that's mostly it. I'm just going to do a live demo and see if this works.

This is the page that I was just describing. At the top, I've got English as the preferred option. Not a lot of it is in English, but here we've got the name of a pub called the King of Spain, and it's come up in English because I've got English selected. I can click "show tags" and it shows the tags that represent this. The building=yes is highlighted because it's got "building" over here. So that's the matching type. This one actually matches on identifier. None of the names match perfectly; the names are a bit all over the place, but it's got this website address here which matches this website address here. This website is from Wikidata and this is OpenStreetMap, and it's managed to match it. I can do "show on map" and then you get to see the pub highlighted.

So I've checked all these and I can scroll down to the bottom. Here you've got the Brussels Stock Exchange, and it knows from the categories on Wikipedia that the Brussels Stock Exchange is defunct, as it's in the defunct stock exchanges category. So it's saying maybe this isn't a good match, because maybe the stock exchange doesn't exist any more.
Actually, if I click on "show on map" you can see it's highlighted the building; that's the match it's found. And I've got two pins here which are both the Brussels Stock Exchange: there's a Bourse of Brussels and there's a Brussels Stock Exchange. What's going on there is that there are two items within Wikidata that represent the stock exchange. One of them represents the building and the other represents the institution, but they both have coordinates and they've both matched, so the system doesn't know which to use and it gives you an error. It's got "OSM coordinate matches multiple items" and a cross.

So if I scroll to the bottom and click add, this is the confirmation page, and you've got this warning here suggesting you talk to your local mapping community. But I'm just going to hit save, and it's using WebSockets, and it's going through and saving. So this is editing OpenStreetMap, and it's edited OpenStreetMap. Then I can say "view your changeset", and you can see, if I scroll down just here, I've edited all of these things and added Wikidata tags. So that is my talk. Are there any questions?

Hi. My software doesn't consider the Wikipedia tag. I think quite a lot of the Wikipedia tags are wrong. There are a lot more Wikidata items than there are Wikipedia articles, so the Wikidata tag can be a lot more precise. You might find a seaside resort has a beach, and the beach is referred to in the article, so people link the beach to the seaside resort; but there might be a Wikidata item that just represents the beach, so you could do a more precise link that way.

And does anything use it? The OpenStreetMap web interface is using it; well, it understands it and links through. And the OpenStreetMap editor that's on the website, the iD editor, understands Wikidata and will query Wikidata and pull the title from it. I've actually got an example, if I can figure out how to...
Here we go. Well, so you're asking about the Wikidata tag. In this example, the Maison du Roi is a building in the Grand Place, but if you look at the Wikipedia tag, it says this is the Grand Place article in Dutch. So this Wikipedia tag is wrong. Now I'm going to add the correct Wikidata tag. Maybe the software should be taking out this Wikipedia tag at the same time, or correcting it. I don't know. I haven't written any code to handle Wikipedia tags at all, and I need to do that.

Yes, that's a good point. Here I've got "already tagged". Let it load. It'll show you a list of things that are already tagged, and it'll say whether my suggestion matches what is there already. Here we go. That one's not a great one, but these are all matching. It's a bit unsure about the central station. So yeah, there's something there to do that.

I'm not being paid; this is just for fun, and I don't really have any kind of official connection to Wikidata.

If you just search for your local area and have a look, try that. Or you can browse, so you might pick your country and then zoom in a bit. One of the pieces that I'm missing is keeping track of progress. I should be able to say, you know, Brussels is 100% done, or the browse screen should have percentages next to each subregion. So there isn't a good way to figure out where to go and work at the moment.

Basically work will be lost, so how about using the browser storage to store what I'm adding? Basically, put a hash into the URL that refers to one of the things stored in the local browser storage, so that you don't have to store something on the server side. On the other hand, you don't believe the URL is good to solve this issue; obviously you probably shouldn't do huge edits, but still.

Yeah, this is a good idea. The other option would be that it might show you one match at a time and ask, is this good or bad, and you hit save, and I could be sending something back to the server and storing it. So there's ways of doing it. I've just been avoiding the problem by working on smaller areas, not trying to do massive areas at once. The other thing is how the changeset looks to other people who come and have a look at OpenStreetMap. If you try to do a whole country, it'll be very overwhelming for someone to try to look at your work, so that's a nice reason for using smaller areas.

Just a small, tiny comment. You talked about the stable identifier, but I was just lucky with them in OpenStreetMap. So, just to everyone, if you want to see an example of an unstable identifier: node 1. It's got a pretty interesting history. Thanks.

It doesn't work very well with linear features. The tool that I'm using doesn't seem to load rivers, I don't think. It does streams and canals, but again not very well. The canals are often represented as a series of ways, and when I built this I was very keen to have a one-to-one mapping between OpenStreetMap and Wikidata, and there isn't a one-to-one mapping. OpenStreetMap tends to represent a road as a series of ways, and you know they're the same road because they've got the same reference or the same name. So I get into difficulties with bridges and tunnels, because in OpenStreetMap bridges and tunnels tend to be represented as two ways, one in either direction, for a road bridge or a rail bridge. So if I want to add the Wikidata tag I need to add it to both ways, and I don't support that; the system will say, I found two matching things.

Yeah, stuff like that. There's another tag, man_made=bridge, which is supposed to be a polygon drawn around the bridge, and I've got some special-case code that detects that and says, I'll use that one, and ignores the others. But a tunnel doesn't have something like that. If it's a two-bore tunnel, it's always represented as two lines on OpenStreetMap, and I need to change my code or convince OpenStreetMap that there should be an object that uniquely represents the tunnel.

It works with hiking
routes; a hiking route is a linear feature and is a relation, and it matches those up. Yeah, it does relations. It does all three types of objects.

I don't go near that. I don't touch those. That's for other people. Maybe sometime.

Brands are complicated, because I have a problem with banks. When you try to do a city where a bank has an office, I often match a nearby branch. Things like that it doesn't handle. And libraries get confusing because of the main library and branches. There are a lot of libraries in Wikidata; someone has been loading all the libraries into Wikidata, but it's tricky.

Add a link between a Wikidata property... oh, I see what you mean, the identifiers. There's a config file; well, there's a bit of code. It's just a mapping between them. Yeah, it's not difficult at all. Maybe one day it'll move to the database and you can just click a button to do it, but at the moment it's set in the code.

Go for it. You said facts? Yeah, because they're too trivial under US intellectual property law to be protected. You know, if you tell me the coordinates of something, you can't claim any kind of intellectual property on that. I mean, for a single coordinate it's the same in the EU, but for a database of coordinates, that is a lot of work you've done collecting a database, and so the EU says that's protected. But in the US, no, it's not protected. Okay, then I'm starting to understand what happens still in the White House.

How can they help you, help the community? Yeah, I mean, people can use the tool, or file bugs. I've got bug tracking on GitHub. That's it, I think. I mean, if anyone wants to contribute: I'm the only developer at the moment. The code is kind of tricky to install. There is an Ansible playbook for installing it, but there are a lot of moving parts; there are lots of bits to it to try and get working.

Yeah, good question. Well, like I say, the code is out there, so someone could take it over. I don't have an answer really to how to make it more sustainable.

Thank you very much. Thank you.