I was saying I'll try not to put you guys to sleep, but some of the things I'm talking about are a little bit boring. There are a lot of databases involved, stuff like that. But first, a couple of things to agree upon. There's no big data involved. There's literally no big data involved. There are large data sets, and there are a lot of maps. This is probably slightly different from some of the presentations around the conference because there's no enterprise infrastructure, there's not a lot of money, that sort of stuff, right? This is a totally open source project that I'm going to be talking about, completely driven by people, like Wikipedia. Okay, so let's give it a shot. How many of you guys have heard about OpenStreetMap? That's exciting. Have you guys used OpenStreetMap? Have you edited stuff in it? Have you used the map elsewhere? Have you used it for routing, finding directions? It's not very good. We're getting there soon. Excellent. So, OpenStreetMap, for those of you who don't know about the project: it's a 10-year-old open source project. It's the world's largest open source geographic data repository, completely created by people like you. It's often described as the Wikipedia for maps. It's for the entire planet. If you go to openstreetmap.org, you can see the map. I'm going to show some numbers and try to convince you that OpenStreetMap is the thing right now. Okay. There are about 2 million users, 2 million mappers. That's a pretty big number for a project that's just 10 years old. There are about 2,000 people editing the map every day, adding at least one waypoint or GPS point every day. It's pretty big. I like to call OpenStreetMap an insanely successful project, not because I've been involved with it for over six years, but because the amount of data, the amount of complexity the community has gone through, and the kind of infrastructure that the community supports right now is insane. It's fabulous.
We don't have a lot of enterprise big data infrastructure, but we use a lot of open source solutions to get around these kinds of large data set issues, and geographic data especially is complicated. How many of you guys have worked with maps, geographic data, GPS devices? Yeah. So it's an insanely successful project, and right now, as of this morning, there are about 4 billion GPS points in our database, and it's all open. It's all across the world and it's open, like I said before. There's a lot of complex data. Geographic data is in itself a bit complex, if you think about it. I just have a bunch of screenshots. This is a screenshot of Bangalore where I've selected all the nodes and all the roads and points and other things. It kind of gives you a sense of the complex data that goes into making a single piece of map, right? There's a lot of information there. Sadly, that picture didn't come out really well. If you guys can see it, it's a roundabout. This is a place called Swindon in England, and this is a very famous roundabout. It's called the Magic Roundabout because it's a complicated situation. There are all these blue arrows that you can see. They tell you which direction your traffic can go and things like that. So we do have lots of traffic information. It's not very complete because it's volunteer driven, and if you guys can help out, that'll be excellent. There are lots of people trying to use OpenStreetMap for navigation and things like that, and that's exactly why the data is so complex now. There are about 2.7 billion nodes, about 260 million ways, and 3 million relations. I'm going to talk about nodes, ways, and relations in a bit, but this is to give you an idea of how big the project is and how large the data set is. Do you guys think it's big data? Not quite sure? Yes? No? Yes? Still sleeping. Fine. How does this sound? The uncompressed data, the production database, is as of yesterday just over 400 GB. Is that big data?
No? Some people are saying yes. I'm interested. Excellent. So this is a large data set. I don't quite know what big data is, but like I said, it's been around for 10 years. The project started in 2004, and since then everything's been developed in the open. It's been an exciting journey since 2004, and we're celebrating the birthday on August 9. We might have some events in Bangalore, so if you guys are around, come over. Everything about OpenStreetMap is open: all the infrastructure, all the data, all the mailing list archives, all the IRC archives, all the conversations that people have, all the source code. Everything is open. The reason I want to introduce OpenStreetMap as a reasonable infrastructure for spatial data is my experience dealing with OpenStreetMap and geographic data for a while. I've been involved in a very exciting project which we do in the Democratic Republic of Congo. That's where the map is. The project's called Moabi. I don't know whether you guys have heard about it. We just had a launch in April. Moabi is an independent, collaborative mapping platform to track natural resource extraction in the Congo. We're talking about forests and palm oil and other resources like that. This is sort of an aerial view of one of the major forests in the Congo Basin. It's probably the second largest rainforest that is still intact, after the Amazon. The Congo Basin has been an area of conflict for a decade, and it's been increasingly peaceful in the last couple of months and years. This also means that it's more prone to extraction of natural resources, because there are lots of stakeholders involved, lots of organizations involved, and it's a fairly complicated situation.
If you want to get into this, it's fairly risky, and a lot of these stakeholders have way more control than they're supposed to have, and this creates a lot of issues. This large area of forest also has a somewhat unknown importance to global climate change, and research is still happening around it. So we work with the local communities. We collect a lot of data on the ground. This is a picture from one of the recent mapping activities we did in the forest with the Pygmy communities. This is more like local, on-the-ground action, a very localized approach to data collection. Most of the apps that we use are picture-based. There's no text in them because most people can't read. So we're targeting different kinds of users. This is another one: they actually draw a lot of pictures on paper, and that encourages conversations around the several issues that we're dealing with. There are several kinds of users that we're dealing with, to give you a sense of what sort of stuff we're doing, right? We deal with these local communities who don't have access to GPS devices or phones or computers for that matter, but what they understand is pictures and mostly some printed maps. Then we have people who are involved in OpenStreetMap, and other people on the internet who like to read about things and get to know what's happening around them, and also some people who want to go on the map and edit it, help us build the database, help us correct things, and make the data even better. Moabi's been around for a while. This is the second iteration of the project. The first iteration was sometime in 2009. At that point in time OpenStreetMap wasn't a major thing. It was around, but the infrastructure was still being experimented with and things like that.
So the people who were involved with Moabi at that point in time built it in a somewhat obsolete technology called Drupal 6, and at that point in time people also thought that Drupal was the solution for everything, right? Are there any people who think Drupal is the solution for everything right now? Oh no, there are some people there. Excellent. Oh no, these pictures are horrible. So we took the entire OpenStreetMap infrastructure to build out the new version of Moabi. This is the OpenStreetMap interface. You have a map and you have a bunch of other layers. You have your base layer, you have your natural resources layer, and things like that. Then we make these beautiful maps from the large data sets that we collect on the ground and from satellite, which I will go into in a minute. Then, to target the rest of the users, who just want to browse the map and also read about things, we created this whole reports interface where people can use the maps that we make and write stories about them. We have a lot of researchers working with us, dealing with these actual issues on the ground, and they write about these things and use our maps. They make more maps and things like that. So that's what the platform does: it takes data, creates maps, and then with those maps you tell stories. That's kind of the thing. It doesn't do anything else. It just takes some data, creates some maps, and tells stories. But making maps and collaboratively editing geographic data is a tricky thing, because geographic data changes very often and making maps is not that straightforward. It used to be very difficult, but with the kind of tools that we have right now it's actually quite great. So most people just want to create maps, right? How many of you have maps in your product? A lot. Do you guys use Google Maps in the product? Do you guys like custom maps, maps that are colored to your color scheme, where your designers are able to change things? Right?
So most people just want to do that. They just want to create custom maps in their products. They just want to serve custom styles, custom colors for their maps, and not just use the good old Google Maps style or the standard OpenStreetMap style. This is where we start talking about a tile server, and this is the slightly boring part, because we now have to talk about infrastructure, right? So we have the data, we use something called a tile server, and then we make tiles. Do I have to explain what tiles are? Yes, okay. Do you guys know when Google Maps came into being? 2004? Have you used Google Maps back then? What was it, MapQuest? Much later? Google Maps is 2004, 2005. So, the early web maps: when maps came onto the web, it was amazing, because you could pan, you could move around, you could zoom. It was mind-blowing, right? You could also copy things. But the problem was that when the map was loading in your browser, it was just one single image, one massive image, and that would often crash your browser, and it was nearly impossible for you to pan the map. Right? Then times changed and people came up with this really cool idea called map tiles. You take the whole map and split it into several square pieces, and that creates a fairly seamless user experience in the browser. You essentially load map tiles for the area that you are looking at right now and don't load the rest. Since map tiles are small, you can quickly load them when the user pans to another area. Right? Are you guys with me? Do tiles make sense? Excellent. That's probably the first time I've explained it right. Okay. So that's where this thing called a tile server comes in. You need to create map tiles and not just one canvas of map, right? That's how you serve web maps these days. Here we're talking about raster tiles. There's also this thing called vector tiles these days. I'm not going to get into those, but I'll show you a little bit of infrastructure stuff as well.
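To make the "only load the tiles you are looking at" idea concrete, here is a minimal Python sketch. It assumes the widely used Web Mercator "slippy map" tile numbering (zoom/x/y), which the talk does not spell out, and the function names are made up for illustration.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the tile containing a lon/lat point.

    Assumes the common Web Mercator tile scheme: at zoom level z the
    world is a 2**z by 2**z grid of square tiles, with (0, 0) top-left.
    """
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edge of the world stay inside the grid.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


def tiles_for_view(west, south, east, north, zoom):
    """List the (zoom, x, y) tiles needed to cover a bounding box."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # top-left corner
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # bottom-right corner
    return [
        (zoom, x, y)
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
    ]


if __name__ == "__main__":
    # A small box around Bangalore at zoom 12: only a handful of tiles
    # are needed, not one giant image of the whole map.
    for tile in tiles_for_view(77.55, 12.93, 77.65, 13.02, 12):
        print(tile)
```

The browser only requests the tiles returned for the current view, and when the user pans it asks for the neighbouring ones.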
Okay, so most people just want to create custom maps, and this is where you should go if you want to take OpenStreetMap data, if you just want a base map in your product which is styled to your design, whatever design scheme or style guide you have. You need to get the planet dump, the OpenStreetMap data, which is also 400 GB plus uncompressed. I have some slightly boring flow charts here because I thought it would be easier to explain this infrastructure stuff that way, right? So you have the planet dump, and there's this thing called the diff. We all know what diffs are, right? Right? Yes. Like I said, the planet is a massive file. It's 400 GB. And the data keeps updating; like I said before, there are about 2,000 people updating the map every day. You want to be able to keep up with that. When the master database changes, it creates diffs, and then you apply those on top of your existing planet. That's how you keep up with the new updates. This is a kind of replication process, if you guys are familiar with that, and there's a tool called Osmosis which does this sort of stuff. You can set up replication, create diffs, and apply them to your database. The other important piece here is this thing called osm2pgsql. I have to make it very clear that we're using only PostgreSQL. That's the canonical database that's used across the OpenStreetMap infrastructure. So you have your data, you have your diffs, and you run them through this tool called osm2pgsql, which will give you a nice and shiny database. Now is where you start styling your data. You need to say, okay, this line is a road, so it has to be black; this area is a lake, so it has to be blue. So you attach your styles to the data and you render them. That's just about it. The thing that we use to render is a very popular open source project written in C++ called Mapnik, a high-performance map rendering engine.
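The replication idea described above (a full planet dump kept current by applying diffs) can be sketched in a few lines of Python. This is a toy model and not how Osmosis or osm2pgsql are implemented: the in-memory dictionary, the `apply_diff` function, and the element layout are all assumptions made for illustration. The three actions mirror the create, modify, and delete sections of OpenStreetMap change files.

```python
def apply_diff(planet, diff):
    """Apply an ordered list of (action, element) changes to a local copy.

    `planet` maps (element type, id) to the latest known element.
    The input is left untouched; an updated copy is returned.
    """
    updated = dict(planet)
    for action, element in diff:
        key = (element["type"], element["id"])
        if action in ("create", "modify"):
            updated[key] = element      # newest version wins
        elif action == "delete":
            updated.pop(key, None)      # ignore already-missing elements
        else:
            raise ValueError(f"unknown action: {action}")
    return updated


if __name__ == "__main__":
    planet = {
        ("node", 1): {"type": "node", "id": 1, "lat": 12.97, "lon": 77.59, "version": 1},
        ("node", 2): {"type": "node", "id": 2, "lat": 12.98, "lon": 77.60, "version": 1},
    }
    diff = [
        ("modify", {"type": "node", "id": 1, "lat": 12.971, "lon": 77.591, "version": 2}),
        ("delete", {"type": "node", "id": 2}),
        ("create", {"type": "node", "id": 3, "lat": 12.99, "lon": 77.61, "version": 1}),
    ]
    planet = apply_diff(planet, diff)
    print(sorted(planet))
```

The point is that each diff is small compared with the 400 GB planet, so a local copy can follow the master database without being re-imported.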
It's open source, so you guys can go and figure it out. It's fairly straightforward to set up. It supports multiple languages, Unicode, all that stuff. The only catch is that the styles are defined as XML. Do you guys like XML? I'm going to make you love XML towards the end of the talk. Oh, I'm sorry. How about now? Okay, sorry, I'll try and be a little louder. Okay, so we have Mapnik and we have the data. Now we need to style it, and we use something called Mapnik XML. How many of you guys have heard about TileMill? Yes, excellent. TileMill is this new map design studio written in Node.js, built by the nice people at Mapbox. It helps you not write Mapnik XML anymore. They introduced this idea called CartoCSS, which is like CSS but for maps. You use CSS selectors: you select your map objects based on the selectors and then you apply styles to them, and then you can convert that into Mapnik XML, throw it at Mapnik, and rendering starts. There are two more things in this whole stack. You have Apache. Your browser is requesting tiles, your browser is requesting maps, and Apache is in the middle of this. It grabs the request, and there's an Apache module called mod_tile. I think it was written in C or C++. Basically, what mod_tile does is grab the request, find which tiles have been requested, and tell this guy called renderd, the render daemon, to make those tiles and serve them back. renderd tells Mapnik, which creates those tiles, and then it caches them and gives them back to mod_tile, and then you have your map. Right? Is this okay? Does it make sense? Right. So this is your tile server. It's a fairly straightforward process to set up. It takes under an hour if you know what you're doing and also read the documentation properly. So now you have your custom map, and you can add an endless number of styles onto this.
The same data can take endless styles, and you can create multiple map layers which look different, and that's exactly what we are doing too. I'm going to cover a couple of things that we did on our project which made our lives much easier. As you can already see, updating these styles and adding new styles is fairly complicated, because you have to tell Mapnik about these things, you have to update a config file, things like that. So we introduced a git workflow to manage cartography and styles. Essentially, your designer pushes these Carto files, the TileMill files, into your git repository, and there's a nice little Fabric script which does all the magic for you. It takes all the changes, pushes them to the server, deploys and redeploys, and all that stuff. All this is open source. You guys can look it up. So that's the rendering toolchain. It's called the rendering toolchain if you are in the maps world, and that's how you build a tile server. Everything I've been talking about so far is on a website called switch2osm.org. It's a very informative website where they have several guides about using OpenStreetMap data and setting up a tile server: what sort of stuff you need in terms of setting up a tile server, how you tune your database, what the considerations are, what optimizations you need, things like that. Okay. So now we have the tile server. That's mostly what 90% of people would want. They just want to serve custom tiles. But we also want to collect data. We want to collect geographic data, structure it, and then make it collaborative and easy to edit and things like that. So it's a little complicated. Like you've seen with the tile server, there are several moving parts, and with the whole system coming into being there are lots of moving parts and lots of things that you have to consider. It's not that bad. Everything's quite well documented.
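The request path described earlier (the browser asks for a tile, mod_tile checks what is needed, renderd has Mapnik render it, and the result is cached) boils down to a "serve from cache, otherwise render and store" loop. The Python sketch below only illustrates that control flow. The `TileCache` class, the plain z/x/y.png directory layout, and the placeholder renderer are assumptions for illustration; the real mod_tile and renderd use their own storage format and talk to Mapnik.

```python
import os
import tempfile


class TileCache:
    """Serve tiles from disk, rendering and storing them on a cache miss."""

    def __init__(self, root, render):
        self.root = root        # directory holding cached tiles
        self.render = render    # callable (z, x, y) -> bytes, stand-in for Mapnik

    def path(self, z, x, y):
        # Simplified layout assumed for this sketch: root/z/x/y.png
        return os.path.join(self.root, str(z), str(x), f"{y}.png")

    def get(self, z, x, y):
        tile_path = self.path(z, x, y)
        if os.path.exists(tile_path):
            with open(tile_path, "rb") as f:   # cache hit: no rendering
                return f.read()
        data = self.render(z, x, y)            # cache miss: render once
        os.makedirs(os.path.dirname(tile_path), exist_ok=True)
        with open(tile_path, "wb") as f:
            f.write(data)
        return data


def placeholder_render(z, x, y):
    """Hypothetical renderer that returns fake bytes instead of a PNG."""
    return f"tile {z}/{x}/{y}".encode()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        cache = TileCache(root, placeholder_render)
        print(cache.get(12, 2931, 1899))   # rendered and stored
        print(cache.get(12, 2931, 1899))   # read back from the cache
```

Because rendering is the expensive step, the second request for the same tile never reaches the renderer.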
OpenStreetMap really has only three steps: edit your map to add data, style it, and render it. We've actually seen styling and rendering already. I haven't gone into a lot of depth with styling because it's all available online and you guys can look it up; it's fairly straightforward. So you can style and render, and that's kind of it. Now the editing bit. How do you add data? How do you collect this data? How do you tag and structure it, all that stuff? Okay, so mostly you have three kinds of geodata coming into your system. You will have satellite images; we're always speaking a lot about satellite images and QGIS and things like that in the maps world. You have GPX files, things that come from your GPS devices, NMEA and other similar formats which come straight from your GPS device. And you also have your traditional GIS things like shapefiles, which you download from the internet or just collect from some university or something. OpenStreetMap does this whole data editing through things called editors, of course. There are two editors which, if you guys have played with OpenStreetMap, you must have seen: this one, iD, and the other one called JOSM. iD is built in JavaScript, pure JavaScript, built on top of D3. Do you guys know about D3? It's a really cool project. They don't use any JavaScript framework. It's just vanilla JavaScript and D3. It's a great open source project; if you're getting into single-page web development you should take a look. I've been working with iD and it's mind-blowing. It's a really interesting set of source code. iD lets you take a satellite image and draw on top of it, so you can draw vectors from it.
So this is how you would add data into the system using iD. Since it's open source, you can customize it the way you want. We've customized iD in several ways, very specific to the data that we have, and it's really straightforward. JOSM is the Java OpenStreetMap Editor. It's a slightly more advanced, offline, heavy-duty application. People use it for bulk imports or large data sets and things like that. JOSM is also customizable; you can take it and make it the editor your data needs. So that's the editors. Now, the really cool thing about OpenStreetMap is that it has a nice API which takes all the data that's coming through your editor and puts it in your database. This API is also used to create users, and you can do revisions with your geographic data, which I'll talk about in a bit and which is also very important. So you have this API which does all this magic, and then you take this data into your tile server, which is the next big part, right? That's kind of the whole infrastructure. You take your data, you edit it, you import bulk data sets if you want, and you use the OpenStreetMap API. We'll talk about how this data is modeled in a minute. OpenStreetMap models the data using three data primitives. One is called a node, another a way, and the third a relation. Earlier I showed a slide with the numbers of these things in the database, right? A node is actually a point. It has a latitude and longitude attached to it. It's just a point. A way is an ordered list of nodes, an ordered list of two or more nodes. Then there are closed ways, where the starting node and the ending node are the same, and that gives you an area. For instance, if you draw a building, it'll be an area, right?
It'll be a closed way. And then you have relations. Geography is all about relationships: capturing things on the ground and the relationships between them. So you have a data primitive for relationships. Compare that with the three geometric objects you generally use to represent geographic data digitally in vector format, right? Points, lines, and polygons. A point is your point of interest, a line is a road for instance, and a polygon is a building or an administrative boundary or things like that. Now this is actually the super secret sauce of OpenStreetMap. They are called tags, and this is how OpenStreetMap adds metadata to these geometry objects. I'll tell you how you can scale these tags. Yeah, that picture is not very clear. A tag is essentially a key-value pair. Right? For instance, highway equals primary is a key-value pair, and name equals [unclear] is another. You guys have heard about this highway. And then there's a relation: it belongs to a national highway relation, right? Does that make sense, this bunch of key-value pairs? You essentially draw a line and say, okay, this is highway equals primary, it has a name, and it belongs to a relation, right? This is actually very scalable because you can have any number of tags. You can attach any sort of metadata to your geographic data, and this is very well documented in the OpenStreetMap wiki. I'm going to leave that link there and take you to this whole idea of presets. We've seen key-value pairs, but what if we want multiple key-value pairs that represent a single entity on the ground? Here's the example. I think it's a mining concession. We use a bunch of tags to represent a mining concession. It's essentially a geographic feature with multiple pieces of metadata attached to it, but you use a single preset, which is a collection of tags, to represent that object in the database.
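The data model described above (nodes, ways, relations, free-form tags, and presets as bundles of tags) can be written down as a small Python sketch. The class names follow the talk, but the field layout, the `apply_preset` helper, and the tag values in the mining concession preset are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A single point with a latitude and longitude."""
    id: int
    lat: float
    lon: float
    tags: dict = field(default_factory=dict)


@dataclass
class Way:
    """An ordered list of two or more node ids."""
    id: int
    node_ids: list
    tags: dict = field(default_factory=dict)

    def is_closed(self):
        # A closed way starts and ends on the same node (e.g. a building).
        return len(self.node_ids) > 2 and self.node_ids[0] == self.node_ids[-1]


@dataclass
class Relation:
    """A group of members, each a (type, id, role) triple."""
    id: int
    members: list
    tags: dict = field(default_factory=dict)


# A preset is just a named bundle of tags. These values are made up.
MINING_CONCESSION_PRESET = {"landuse": "concession", "concession": "mining"}


def apply_preset(element, preset_tags):
    """Attach all tags of a preset to an element, keeping its other tags."""
    element.tags.update(preset_tags)
    return element


if __name__ == "__main__":
    road = Way(id=10, node_ids=[1, 2, 3], tags={"highway": "primary"})
    area = Way(id=11, node_ids=[4, 5, 6, 4], tags={"name": "Example area"})
    apply_preset(area, MINING_CONCESSION_PRESET)
    route = Relation(id=20, members=[("way", 10, "")], tags={"type": "route"})
    print(road.is_closed(), area.is_closed(), area.tags, route.members)
```

Because tags are plain key-value pairs, adding a new kind of feature means adding a preset, not changing the three primitives.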
If you're familiar with iD, iD uses presets: when you start drawing things, it lists all the presets on the left side. There's a slightly complicated way of adding new presets and new tags to the OpenStreetMap production database, because you have to join a mailing list, you have to propose a tag, it has to be voted upon, and it has to be decided whether it's actually meaningful to add those things. But when you're using the OpenStreetMap infrastructure for your own data purposes, you just want to add more presets, and we've been working on a preset editor that lets you create these presets. All right, I'm going to have to leave that there. If you deal with geographic data and don't use PostgreSQL, it kind of sucks. It doesn't work otherwise. Do you guys recognize this logo? There are lots of elephants. This is the logo for PostGIS. It's an extension to PostgreSQL to deal with spatial data. OpenStreetMap uses Postgres with PostGIS, and this is how you scale key-value pairs and presets: you can use this cool thing in PostgreSQL called hstore and shove all your tags into hstore, and you don't have to change your schema. You can add your geographic data and keep storing your tags, and you don't have to decide your tags early on, before you model your data. If you think that at some point you'll want to add another tag, you just shove it into hstore. It will still work, and all the software in the pipeline dealing with OpenStreetMap data supports hstore. You just have to enable it. I'll quickly talk about the API. It's a RESTful XML API. It's XML because all the data formats that OpenStreetMap deals with are XML. This is what the XML looks like. Yeah, this doesn't look so good. You represent all the nodes and ways and relations with XML. Essentially you can see who added this, the user ID, the visibility, the changeset, which I'll talk about in a minute, and then all the key-value pairs and things like that.
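Since the slide with the XML did not come out well, here is a small Python sketch of what reading such a document can look like. The sample XML is hand-written for illustration (the ids, coordinates, user name, and changeset number are invented), it follows the general shape of OpenStreetMap XML with `tag` children holding `k`/`v` pairs and `nd` children holding node references, and `parse_osm` is a hypothetical helper and not part of any OpenStreetMap library.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the general shape of OpenStreetMap XML.
SAMPLE = """<osm version="0.6">
  <node id="1" lat="12.9716" lon="77.5946" user="mapper1" uid="42"
        visible="true" version="1" changeset="100"/>
  <node id="2" lat="12.9720" lon="77.5950" user="mapper1" uid="42"
        visible="true" version="1" changeset="100"/>
  <way id="10" user="mapper1" uid="42" visible="true" version="2" changeset="101">
    <nd ref="1"/>
    <nd ref="2"/>
    <tag k="highway" v="primary"/>
    <tag k="name" v="Example Road"/>
  </way>
</osm>"""


def parse_osm(xml_text):
    """Return a list of dicts, one per node, way, or relation."""
    root = ET.fromstring(xml_text)
    elements = []
    for el in root:
        if el.tag not in ("node", "way", "relation"):
            continue
        elements.append({
            "type": el.tag,
            "id": int(el.get("id")),
            "user": el.get("user"),
            "changeset": int(el.get("changeset")),
            # Tags are key-value pairs stored as <tag k="..." v="..."/>.
            "tags": {t.get("k"): t.get("v") for t in el.findall("tag")},
            # Ways list their nodes, in order, as <nd ref="..."/>.
            "nodes": [int(nd.get("ref")) for nd in el.findall("nd")],
        })
    return elements


if __name__ == "__main__":
    for element in parse_osm(SAMPLE):
        print(element["type"], element["id"], element["changeset"], element["tags"])
```

Every element carries the user and changeset that last touched it, which is what makes it possible to see who changed what.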
And this is another cool thing that you get for free with this infrastructure. Most often, when you collaborate on geographic data, you want to be able to version it, and you might want to go back, because there are lots of people editing it. OpenStreetMap uses this idea called a changeset. Every time you make a change, a new changeset is created. All the changes that you have made are part of a changeset and get saved into your database, so you can go back to a changeset at any point in time. People have built really cool things around OpenStreetMap data. There's this thing called Overpass. It's open source. It takes your database and creates a nice API around it so that your apps can talk to your production database, and it lets you export stuff in GeoJSON and other custom JSON formats, things like that. So Overpass is a really interesting tool. That's kind of what I wanted to share, to give you a sense of what this whole geographic infrastructure looks like. So yeah, I'm open for questions. Did you mean the data or the infrastructure? So the question was, he wanted to know whether there are any government organizations or authorities using OpenStreetMap data or the infrastructure. Infrastructure, yes. Data, yes. The National Park Service in the US uses the OpenStreetMap infrastructure. In India? Not that I know of. Elsewhere, a lot of them, yes, because it's free. Can you repeat the question? His question was again the same question: who are the people using OpenStreetMap and also the infrastructure? NGOs. He wanted to know about NGOs. Yeah, NGOs specifically. Yes, a lot of them. I work for an NGO. I work for multiple NGOs, all of which use OpenStreetMap. Apple uses a little bit of OpenStreetMap.
If you want to talk about commercial service providers, if you've heard about Telenav, it's a major navigation company based in the US. They use OpenStreetMap for navigation. Foursquare uses OpenStreetMap. Foursquare switched last year, I think. So we're going to give the others a little chance. We'll come back to you if there's time. Yeah. Thank you. Anyone else with questions here? So yeah, we'll go to him and then come to you. Yes, actually last year we were exploring OSM for reverse geocoding and geocoding users, and we didn't find it very useful there. But I guess it's a work in progress and the data will get there. Right. So yeah, geocoding is a tough problem, and reverse geocoding especially gets even harder. It depends on the quality of data, which all of us are working on very closely, trying to improve things. The situation in India is a little complicated because we don't have a specific addressing scheme. Even where we have one, it keeps changing from city to city and suburb to suburb. With a couple of recent changes, you should be able to find a couple of new tools which do really cool reverse geocoding even for India, and I'd be interested to know your feedback as well. So just let me know. I can help you. Just one follow-up question. How about the coverage in China? We've got a lot of big providers. They are useless in China because they have their own providers. Interesting. I haven't checked China, but I assume there's a fairly decent community in China. Are there any comparisons? Trying to find you. So are there any comparisons of your venue data with the Foursquare venue data or Factual? Good question. So his question is more on the data side of OpenStreetMap. The way Foursquare uses OpenStreetMap is as a base layer. They have their data on top of it and they haven't actually mixed it. But did someone compare how good OpenStreetMap information is with respect to those? So again, it depends on the neighborhood that you are in.
If you are in a place where there are a lot of contributors, you get a really cool map, and you get it for free. You're not aware of any studies? No, I don't think there's any specific study around points of interest, but there are a lot of people using OpenStreetMap for points of interest. Even Factual takes a lot of stuff from the map and also gives it back. Any other questions? Start with the gentleman in pink. Use the mic, please. In India, we have Survey of India maps. There's a big project called the National Optical Fibre Network. For that, a lot of maps are required. So other than Google Maps, Survey of India maps are used, but I don't think they have comprehensive information. Have you been approached? So yeah, Nisha is here and she will know a little bit more about this. We've been trying to get in touch with the Survey of India to talk about the open geographic data situation in India, which is also quite shitty because they don't want to open it. There's not a single source of ground truth in India. It's all over the place. There are a lot of people doing stuff, but I always use OpenStreetMap for whatever projects I'm involved in, and I've found it fairly comprehensive. And if I find something wrong, I can always fix it. No, it's better to collaborate with the Survey of India on the application. I mean, unfortunately, the situation is that they're not as responsive as one would imagine. So we've been trying to approach them. We've been trying to set up a meeting with them to have a chat about how to open up more of this data. Yeah, with a platform that's this editable, have you ever run into problems where people do dubious edits? Like, one of the funny things that happened with Wikipedia a while ago was that the CTO, the ex-CTO, of SAP had a Mr. Bean photo. So, anything like that?
So that's a generic problem with any crowdsourcing platform. People do tend to vandalize a lot, but OpenStreetMap is like Wikipedia. It's a vast community and there are always people watching out for things, and there are also several tools built on top of OpenStreetMap for validation. If you make lots of edits in a minute, or if your changeset is really huge, there are things that trigger alerts in the IRC channel, and then people go and check that. So it's a fairly nice workflow where you can see who's editing and what's being changed, and people keep an eye on those things. But if you're using OpenStreetMap for your own data infrastructure, that problem doesn't arise. We have time for two more questions. Gentleman in black. If somebody wanted to do the color schemes on the tiles, it's better to set up a tile server, right? There's a project that's called either lease.js or leaflet.js which lets you do that on OpenStreetMap data without having your own tile server. That's right. It's called Leaflet.js and it's a JavaScript mapping library. What it does is take your tile server URL and display that map in the browser, but it won't let you change the colors. There are some plugins which will let you change the map color to gray and black, but it doesn't do more than that. Oh, that's about that. Yeah, time for one last question. All the way at the back. That's pretty much the front-end view of the data, right? So let's say there's a logistics company which wants to use OpenStreetMap data. How can that be looked at? Right. So there has recently been a lot of work around routing and fleet management and things like that. There are a couple of open source libraries. One is called OSRM, Open Source Routing... I'm not getting the name right, but it's called OSRM. It takes OpenStreetMap data and you can build routes out of it and things like that.
So it's something that's completely a work in progress, because the data is continuously changing, and the availability of data is also not the same everywhere in the world. It depends on where the contributors are and how excited people are in that neighborhood. So yeah. Ladies and gentlemen, please put your hands together for him.