So, yeah, I'm Ilya. It's wonderful to see you all here. This is the geospatial devroom, which means we use, like, and work with maps, and we do that in multiple ways. Yesterday, I believe, most of you had to find your restaurant for the evening or your hotel. Who opened their map app and typed something into the search field? Oh, you see, that's a lot more than I expected. That's the thing we do often: we type something into an application, and the application gives us coordinates, the location to go to. That is called geocoding: converting a phrase into a location on the map. It's one of the three pillars of applied cartography: finding things, plotting a route, displaying a map.
But when you take a geospatial specialist and make him or her work on geocoding, I doubt they will like it. Because underneath all that, geocoding is not exactly a mapping problem. It's more of a language processing problem. You take a phrase, you split it into tokens, you do a full-text search, you return a result. It's not about maps at all. But the reverse problem, taking a location and producing an understandable address string, is much more interesting and challenging, as I found out. I was tasked with installing a reverse geocoder at my last job, at Juno, a ride-sharing operator in New York. And instead of using one of the pre-built solutions like Nominatim, I decided to try to write my own geocoder. Because how hard can it be?
How do you know you're in Belgium right now? Given your location, you can just take a set of country polygons and do a simple point-in-polygon search. You will get a result like "Belgium", which you can present to a user. Point-in-polygon is one of the basic operations in geospatial. It's highly optimized, and there are very fast algorithms. You can do it a thousand times a second and find your country. Except in the real world, sometimes you can get two or three countries at once.
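The country lookup just described can be sketched in a few lines. This is only an illustration: the two square "countries" and their names are made up, the ray-casting test stands in for whatever indexed point-in-polygon routine a real database would use, and it is not the code from the geocoder in this talk. The lookup returns a list, so a point can come back with one country, several, or none.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]   # (x, y), for example (lon, lat)
Ring = List[Point]            # polygon outline, last vertex connects to the first


def point_in_polygon(pt: Point, ring: Ring) -> bool:
    """Ray casting: count how many edges a ray going right from pt crosses."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def countries_at(pt: Point, countries: Dict[str, Ring]) -> List[str]:
    """Return every country whose polygon contains pt (possibly none or several)."""
    return sorted(name for name, ring in countries.items()
                  if point_in_polygon(pt, ring))


# Hypothetical toy data: two squares that overlap in the strip 3 <= x <= 4,
# standing in for a river that belongs to both countries.
COUNTRIES = {
    "Alphaland": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "Betaland": [(3, 0), (7, 0), (7, 4), (3, 4)],
}

print(countries_at((1, 2), COUNTRIES))     # ['Alphaland']
print(countries_at((3.5, 2), COUNTRIES))   # ['Alphaland', 'Betaland']
print(countries_at((10, 10), COUNTRIES))   # []
```

Returning a list instead of a single name is the design choice that matters here: the caller has to decide what to show for two countries or for none.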
Like on the river, I think between Luxembourg and Germany, that belongs to both countries. Or disputed territories like Crimea. And sometimes you get no country at all: not only in the oceans, there are also unclaimed territories, in Africa for example. You have to convey that somehow to the user who sent the coordinates.
It's the same for regions. But how do you get a city? How do you know you're in Brussels, for example? Again, you can take a set of city boundary polygons and find out that, yes, you're in Brussels. But what if you're in a smaller town or a village, and there are no databases of village polygons? All you get are points. How many people can fit in a single point? Thousands of them, because towns and even cities are stored in many databases as points. So what you need is to find the nearest point. That is, again, a basic geospatial operation. It's indexed, it's optimized, you can run it a lot.
But cities are different; they come in different sizes. Sometimes there is both a point and a polygon for a city, like in OpenStreetMap. There are points for city districts and neighborhoods. To know where you are, you have to make a lot of queries: point-in-polygon for countries and for regions, nearest point for cities, then rank the results and choose what you need. That's a lot of queries, they can take a long time, and you have to optimize them. One of the useful tools here is Voronoi polygons. For each point you build a polygon such that, from anywhere inside it, that point is the nearest one. With that you convert your nearest-point request into a point-in-polygon request. With a bit more bounding, a bit more optimization and preprocessing, you can get the entire administrative hierarchy, from country down to city neighborhood, in just one query, and a very fast one. That's how you do administrative queries. But frankly, the question "which city am I in" doesn't come up often, no more often than "what year is it".
It can come in handy, but not for most people. I know I'm in Brussels; what I'm interested in is a more precise location, which means an address, which for most countries means a house number and a street. How do you get that? There are multiple open databases of address locations, that is, coordinates plus an address. You can use them very simply: you find the nearest point again and take its address. Like, what is the address for the red point? You can shout. No, it's K7. Which point is red? This one. I'm really sorry. K7. Given this database, it's K7.
It's a pretty close answer, except that in many countries addresses are assigned not to entrances, like in Brussels, not to points, but to buildings. That's the case for the United States, for Russia, for Belarus, for quite a lot of countries actually. And since buildings are not points, these results can sometimes be not very precise. But thankfully we use OpenStreetMap, and OpenStreetMap has a lot of building polygons with addresses on them. So the task of finding an address for a point, of reverse geocoding, is very simple in most cases. You just find the nearest addressed object in the database and you're set. That is how virtually every reverse geocoder in the world works, including Nominatim. They keep a database of addressed objects, find the closest one, and return its address. Very fast, very simple, and it works in most cases except some corner cases.
Which corner cases? Like corner buildings. They are often addressed by the two or more streets they stand on. How do you map that? It's one object, and one object can't have two different values of the same property. Well, in OpenStreetMap anything goes, so people just add two address points inside the building and are done with it. So what address do we expect for this formerly red point? Somebody answered K5. Why? Yes, exactly: you're on K Street.
If you give the P6 address, then your car or your friends will wait on the wrong side of the building. So you need K5, but if you employ a nearest-object lookup here, you will get the wrong address. Why is that? It turns out that some points need special processing. When an address point is inside a building, it's not actually a point; it's a property of the building. So it doesn't matter how far away the point itself is: you take the distance to the building and take the address from the point. And you can imagine that a query for looking up an address becomes much more complex. It's not a nearest-object search anymore, and it's not point-in-polygon. It's special processing in which you find the closest building, then find the points inside it, then pick the proper address point, and so on.
And in OpenStreetMap it can be even worse. A building can have many addresses on different streets, up to four streets. And you may need to find a proper address description for any point, including points inside buildings. What is the address for this point? I don't know, it can be anything. But you have to produce a string that is understandable, so that people can see how you came up with it. And in OpenStreetMap anything goes.
So these were the simple cases. There are more interesting ones. For example, a location inside a building that has an address itself and also has an address point with a different address inside. And there is also a cafe with yet another address. These can all be there for different reasons. So is it number 14 here, or still 16, because this point is not too far away? And the cafe's address most definitely applies to just a single entrance; move away a bit and it's a different address. In these cases you have to come up with a ranking of the different objects and how they influence the outcome, correctly work out the distances over which each address applies, and so on.
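A minimal sketch of the building-aware lookup described above, under stated assumptions: address points that fall inside a building are treated as that building's property, so the building competes with its distance to the query, and among its interior points the one closest to the query is chosen. That tie-breaking rule, the function names, and the toy coordinates are all illustrative assumptions, not the geocoder's actual SQL.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Ring = List[Point]
Address = Tuple[Point, str]  # (location, label)


def dist(a: Point, b: Point) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


def dist_to_segment(p: Point, a: Point, b: Point) -> float:
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return dist(p, a)
    # Project p onto the segment and clamp to its endpoints.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return dist(p, (a[0] + t * dx, a[1] + t * dy))


def inside(p: Point, ring: Ring) -> bool:
    """Ray-casting point-in-polygon test."""
    x, y = p
    result = False
    for i in range(len(ring)):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % len(ring)]
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            result = not result
    return result


def dist_to_building(p: Point, ring: Ring) -> float:
    if inside(p, ring):
        return 0.0
    return min(dist_to_segment(p, ring[i], ring[(i + 1) % len(ring)])
               for i in range(len(ring)))


def naive_address(q: Point, points: List[Address]) -> str:
    """Plain nearest-object lookup over address points."""
    return min(points, key=lambda item: dist(q, item[0]))[1]


def building_aware_address(q: Point, points: List[Address],
                           buildings: List[Ring]) -> Optional[str]:
    candidates = []   # (distance used for ranking, label)
    claimed = set()   # labels of points that belong to some building
    for ring in buildings:
        inner = [item for item in points if inside(item[0], ring)]
        if not inner:
            continue
        claimed.update(label for _, label in inner)
        # Rank by distance to the building, take the label from the closest inner point.
        label = min(inner, key=lambda item: dist(q, item[0]))[1]
        candidates.append((dist_to_building(q, ring), label))
    for pt, label in points:
        if label not in claimed:
            candidates.append((dist(q, pt), label))
    return min(candidates)[1] if candidates else None


# Hypothetical corner building with two address points inside, plus a
# standalone address point next door. The query stands on the K Street side.
BUILDING = [(0, 0), (10, 0), (10, 10), (0, 10)]
POINTS = [((5, 1), "K5"), ((9, 5), "P6"), ((11, -1), "K7")]
QUERY = (8, -1)

print(naive_address(QUERY, POINTS))                        # K7
print(building_aware_address(QUERY, POINTS, [BUILDING]))   # K5
```

In this toy layout the standalone point is closer to the query than either interior point, so the plain nearest-point lookup picks the neighbor, while the building-aware version ranks the building by its wall one unit away and returns the K Street address.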
And again, in OpenStreetMap you can find anything, like an addressed building inside an addressed building, with different addresses. This might even be correct, because one of them is underground. We had to fight this case. And addresses can go on anything, including territories. Imagine a school with a very big yard. It's fenced, and the entire school grounds have an address. When you're standing at this point, which address would you give to your friend to meet you? Is it K5? Who knows where that is? Or K6, which is across the road? It's farther from you, but it seems more correct. And you have to account for that in your hypothetical address lookup query. In this case we might preprocess the data to move the address from the territory to the biggest building inside it.
The OpenStreetMap data model poses multiple challenges, some of which are quite hard, even for me, like address interpolation. What is the address of that point? Anybody? Take a guess. Yeah, for example K6. It's plausible. You just find the nearest point on the interpolation line. Interpolation means there are sequential numbers along that line, from 2 to 16. So maybe K6. But in the real world there might be no such building. There might be K2, K10 and K16, and that's all. So using just the interpolation data, you would give the user a non-existent address. Is it okay to give them that? It's roughly precise, but if they type that address into a different app they will find nothing, and it just won't work.
So there are multiple things you have to consider when you do reverse geocoding, and it's all quite fun to do, especially the testing. And the point of this talk is that everything I've been talking about is real. It happens in the OpenStreetMap database, and the geocoder that I've been writing accommodates these cases. All of them came up through testing on real OpenStreetMap data. The geocoder was published. It was installed in production. It could perform about 50 queries per second.
It performs better than Nominatim and other geocoders because it doesn't just return the nearest addressed object. It knows about all these corner cases. And it is open source. It was written for the Juno company, and the Juno company has ceased to be. It is no more. It's an ex-company. But I don't want the geocoder to be buried among GitHub open source projects. So I urge you to open this link and just take a look. It's several dozen screens of SQL code, heavily commented, with hundreds of unit and integration tests. It's really simple to deploy. It has a Docker container. It works on a pretty plain osm2pgsql database. And it's very fast. I don't work on it anymore because I'm now at a different company. But I'll be happy to find some more corner cases, to think about how to accommodate them, and maybe to improve this and see this geocoder used. So, thank you. Any questions?
Yes, the first one. What's the reason you don't use this at your current company? Because my current company doesn't use OpenStreetMap for geocoding. It's much, much bigger. I work at Lyft now, and it has its own data with different constraints.
Is it as interesting as this? Well, I'm solving the problem of making an address easy to understand and easy to pronounce. Many taxi drivers do not use the taxi app for navigation. They switch to Waze or Google Maps, type in the address, and go to that point. So the address has to be correct so that they plot a correct route. And basically that's it. You need a location description that's easy enough to understand. Services like what3words or Plus Codes don't solve this, because few people can process those codes.
Okay, next question. Can you go back one slide for the link? Of course. It's the same link if you use the QR code. But yeah. Did you have a question? I think she asked. Go ahead.
I wonder about countries where there are no precise addresses, like African countries or some places in the Arab world. What can be done in general to locate buildings there? What do you think reverse geocoding will do for points in Africa? Right, so will it work in Africa or other countries with low building address coverage? It will work no worse than any other reverse geocoder, obviously. But in countries with no established street name and house number addressing, it may show some weird results. Again, it depends on the addressing model in OpenStreetMap. In Japan it probably won't work, because addresses there are very different. It was written specifically for American addresses, but many countries, including Eastern European countries, share that scheme, so it will work there. I would be pretty happy to see some test cases and corner cases from other countries.
Okay, we have time for one last question. Well, I've used Nominatim for reverse geocoding photos, like several million photos. Would this work for that? I guess so. Again, as long as you don't need point-of-interest information, as long as you're okay with having an address instead of "Empire State Building" or some other name. So as long as you need country, city, street and house number, this should be better. And it's faster, because it's just SQL queries for PostgreSQL, and it has a lower footprint. I think the database is two times smaller than the Nominatim database. So it should work.
Okay, that's the final question. Short one. What does the name stand for? It's JRG. I don't know how to say it, I can only pronounce it in Russian, but it's an acronym of Juno Reverse Geocoder. We spent something like three days on a name and decided to go with the simplest one. Okay, that was it. Thank you very much. Thank you.