So, welcome, Justine. She's talking about road safety, a new mistake. Road safety, you've got road blocks, traffic safety.
Yeah, so I'll just get started. While this is slowly coming on, it might be quite mesmerizing to look at the moving images. So I'm going to talk about navigating the obstacles of working with local government, especially when trying to answer really hard questions using open data. I'm going to share a project from the United States, from Washington, D.C., that we're working on, and we're going to look more closely at that in a minute. I'll have some more general examples of what we did before, from other cities around the globe.
I'm Christina Franken. You can guess my accent later. I think the question is, who here knows Mapbox? I guess some people do. Mapbox is a technology company, a location platform for developers. So it's mostly developers. The project I'm here to share is Mapbox Cities. It's a program where we work with cities specifically. Next to data analysis and working with open data in these cities to answer really big urban challenges, we're researching specifically how new technologies like augmented reality change the way that consumers, and ultimately citizens, engage with the data layer a city has.
So let's dive straight in. Essentially what we're doing is turning this, which is a very common sight on open data portals around the globe. This is an example of a very attractive spreadsheet from the City of Melbourne in Australia. We're helping Melbourne to turn this into this. As a public front end for a massive open data set, this is much more attractive for most members of the local community.
Essentially this is the building application spreadsheet that we've seen before, turned into a 3D interactive model of the city, showing how Melbourne is right now in 3D and then how it's going to change over time when all these new development activities have been completed.
So I'm not sure if I mentioned it, but Mapbox is an open source company. It's a big part of the program that we work with cities to better understand the potential of open source. And this is a very interesting study that was only recently published by the World Bank. It's essentially showing how organizations that invest in open source can expect an increased return on investment compared to traditional closed-by-default software investments. So ideally our big picture is to build a toolbox of solutions for cities to pick from, rather than having to build from scratch every time. This example from just now, from Belgium, I was like, it's my street. It would be like, why can't you all just use the same tool, or even connect the databases rather than keeping them separate? And that obviously happens all around the globe, and a lot of urban issues are very similar. So ideally, if we're using open source tools and open source code bases, we could get into the really important issues rather than getting stuck at showing some bits of data on a map for different cities at different times.
So yeah, let's dive straight into this. That's how it could be. It could be just a visualization of schools in this case, but it could be a visualization of anything else, and the code is open source and any city can use it and adapt it for whatever they need it for. This is a very common example at the moment in the US. They're just discovering cycling. I know this is not so exciting over here, but this whole bike share thing is huge and there are so many different providers, and that was one of the first ones the DC government actually made available on GitHub for other cities to copy and reuse.
So I mentioned earlier, I think, that the example I'm talking about is traffic safety, or traffic and road issues. I think it's hard for us to imagine how different this is in the United States. The whole thing started when, in 2015, the numbers of traffic fatalities were announced and it was suddenly 7.2% higher than the year before. That's quite a jump, and that also made the public aware of what was going on on the roads. This is a big number. Obviously it's a very big country, but I'm going to have some comparison later. There are a lot of Vision Zero initiatives that started after this, but then in 2016 the numbers still continued to rise, and we're currently waiting for the 2017 numbers. It's expected to be around 40K people actually dying in traffic, not just accidents. To put it in perspective, it's roughly the same category of size as breast cancer or colon cancer, and that's really shocking for a developed country.
I'm not going to go into the craziness of people driving in the US. I could, but I won't. What I want to show is how we first got into this. As Mapbox Cities we were asked to visualize this first big number from 2015. The team working on that, and I can't really take the praise here, found a really good way of looking at the data, because it's about informing the public what could happen to them and understanding what has happened in the past. So rather than asking people to look at all the data and explore by country or by county or by city, they ask them to put in their daily commute route and then explore the data for the route that they know from every day, like getting to work and home, right? So that was our first way of getting into this subject.
Yeah, I'm not sure who is familiar with Vision Zero. It's a Swedish concept. The ironic thing is that it's not that popular or important here in Europe. Obviously the impact that can be made in a city like Los Angeles, with like 244 people dying from traffic, from just being on the road, really, like pedestrians and cyclists, is very different. This is a number for LA, and then compared to Amsterdam, unfortunately it's not very easy to find these numbers, it's about 15 people. Obviously, LA is much bigger than Amsterdam, but I would assume that there are a similar number of, or even more, cyclists and pedestrians on the road in Amsterdam than there are in LA. So, you know, I don't want to say this is scientific to compare, but I think it's just showing the sizes of what we're talking about. Essentially cities like LA and DC are trying to reduce their number of traffic deaths to zero in a given time frame.
Vision Zero in DC actually started in 2015, a bit ahead of the curve. They wanted to reduce their traffic fatalities to zero by 2025. It's run by the District Department of Transportation, so DDOT. I'm going to mention this abbreviation a few more times today. Essentially, those are the people we started talking with after they saw the other visualization that we'd done, and they were quite interested in doing something with the data. The reason is that one of their three main goals for Vision Zero is to make this a very data-driven campaign and use insights from data to prioritize and ideally justify measures that have been taken on the roads of the city. This is important since the local media is quite active, and they are often called on to justify or explain why certain decisions have been made, especially in the light of equality and unprejudiced decision making by the local government. I'm not sure, has anybody here ever been to Washington DC?
Essentially, it's an interesting place, because when you get there all you see is a very nice, very clean, very wide, very American city. A lot of green space, a lot of open water and just a lot of government buildings, of course. But there is much more to it. So, this is a map by my colleague Eric Fischer. He looked at the 2010 US census data, and every dot on this map represents 25 residents. Red is for white, blue is for black neighborhoods or black racial areas. Green is for Asian and orange is Hispanic, and you obviously don't see many of those. And yellow is other. So, this kind of explains how the city is mixed or not mixed by these two extreme kind of racial... It's really shocking if you think about it, because all we know of DC is often the very white and very central neighborhoods. But that explains as well why, when the government spends money on a campaign like Vision Zero, they are asked to prove how the decisions that govern the measures on the roads have been made, and whether they've been made with equality in mind.
So the main goal for using data in this campaign is to act faster and act fair. Ideally, DDOT would make decisions continuously as the data is collected and updated. The problem is that if it takes almost forever to make the data public, then how long is it going to take to actually make policy from this, and how long is it going to take to make action happen in the streets?
So, it's interesting. Let's dive a bit further into how this project unfolded. It really started in December 2016, after the first map we published. But then there were all sorts of other things happening that delayed it, and I'm going to come back to this a bit later. We kicked off with an in-person meeting in August and had the external deadline to finish and present this project during Smart Cities Week DC last year in October.
So, the whole project happened in three months, because until the meeting in DC in August I could not get the government people in this project, DDOT, to focus on giving us a real research question, the main question that they wanted us to answer with all the data that we tried to explore. There were all sorts of distractions before that. But that's also fine.
So, let's take a look at this data. On the left is a screenshot from just now. It's part of the data set, which is of course much, much bigger. And on the right is how it roughly looked mid-last year. It was about 150,000 entries and 45 attributes. It's quite a huge spreadsheet. Essentially, the biggest issue was data quality. It was very poor. There were a lot of missing values, or things just didn't make sense. Honestly, if you download it and you already have a direct connection with the government, you're like... I really thought all the time it's me, I'm just not good enough with Python and I just make all these weird beginner's mistakes. But actually the quality was just that bad. And they had a great idea halfway through the process, like July or something: they decided to split the data into two massive data sets, because they wanted to split some of the attributes up, like the difference between cycling, pedestrian and vehicle injuries and all that. But that made it totally impossible to merge those two data sets back together. So it was almost comical. I think I should present it in a much more fun way. But essentially they did not believe me that there was not one attribute that had only unique values.
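The "not one attribute with only unique values" problem can be checked mechanically before attempting a merge. The following is a minimal sketch, not the code used in the project: the function name `candidate_key_columns` and the sample column names (`crash_id`, `report_date`, `ward`) are hypothetical and chosen only for illustration. It lists the columns that could serve as a merge key, meaning every value is present and no value repeats.

```python
from typing import Dict, List, Sequence


def candidate_key_columns(rows: Sequence[Dict[str, str]]) -> List[str]:
    """Return the columns whose values are all present and all distinct."""
    if not rows:
        return []
    keys = []
    for column in rows[0]:
        values = [row.get(column, "") for row in rows]
        # A usable merge key has no blanks and no repeated values.
        has_missing = any(v is None or str(v).strip() == "" for v in values)
        if not has_missing and len(set(values)) == len(values):
            keys.append(column)
    return keys


# Hypothetical sample records; the column names are invented for illustration.
crashes = [
    {"crash_id": "A1", "report_date": "2016-03-01", "ward": "2"},
    {"crash_id": "A1", "report_date": "2016-03-01", "ward": "6"},
    {"crash_id": "", "report_date": "2016-03-04", "ward": "2"},
]

# crash_id has a blank and a duplicate, the other columns repeat values,
# so no column qualifies as a merge key for this sample.
print(candidate_key_columns(crashes))
```

If the function returns an empty list for both halves of a split data set, as in this sample, there is no single attribute to join on, and the records cannot be reliably matched back together.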
So I could not merge those two data sets back together. We had a few people looking over it and actually presenting it to them, and I think they still haven't found the errors, because the problem is not that the data is so bad on their servers, on their end. The data only gets bad by the time it goes through all the processes that are needed, and the IT teams and different departments that put it in the open data pool at the end. So yeah, that was a bit of a challenge, but we worked around it, and essentially we used a 2016 data set for the bigger analysis and data model.
So the first step for us was visualizing it. Mapbox is a mapping platform, so we visualized it. I don't know if you can see it very well, but essentially putting all the different dots on the map is one thing. We aggregated it to census tracts to get a better idea of how this is spread around the city. And then what's turned on here is this "show high-risk intersections", and that's the actual work we've done: bringing a lot of different data sets that are relevant for this into a scientific collision frequency model, to explore and prioritize the intersections based on the frequency of crashes that have been happening there over the past years in the data that we have available. So we worked with the assumption that if there are more vehicles and more pedestrians, there must be more opportunities for incidents. Another assumption, and I think that's very common: if people tend to drive fast in an area, then there must be more crashes happening. And another one was that if there are more shops, restaurants, economic or business activity, or maybe also schools, then that also influences the frequency of crashes in that neighborhood.
The data we used from the open data portal is the crash data I just mentioned, which they were, and still are, very proud of, and they should be, but the amount of time I spent on that data set was just very frustrating. Then the 2010 census data and intersection data. Then we looked at the Howard University traffic data center, because DDOT somehow gives all their traffic counts to the university traffic center, but they somehow haven't maintained their own version of it anymore. The traffic data center essentially analyzes and visualizes it on a map, and that's online. So we could not get these traffic counts, which we needed in order to normalize the crash data. We had to scrape the website. They just could not explain it themselves either. It was quite comical.
And lastly the speeds, in order to explore areas where people tend to drive faster. I have to mention that in the U.S. there is no uniform inner-city speed limit, which means that any road in a city can have a different speed limit based on the traffic signs they have. Ironically, again, they don't have that data available as open data. It was just a sign inventory that they may have been able to publish, but they were too slow to do it in the short time frame that we set for this project. So we ended up using proprietary mobile sensor data that we use for our navigation products, and extracted areas where people tend to drive faster than the average of the city, just to make a roughly informed decision. But ideally DDOT would get faster in publishing the data they have internally on their systems.
So the collision frequency model looked at the crash data in relation to several other data sets that we had available, and looked at the conditions under which more crashes than usual happen. So the outcome was something that DDOT will likely use to prioritize measures across the city to improve traffic safety for pedestrians and cyclists this year.
The outcome for us: well, we got featured in the Washington Post. That was kind of a big deal, apparently, when you're in DC. I didn't know; it was kind of funny. But in general, for a company like Mapbox, the company started in DC and we have had some form of relationship with the city, but it's been intensified since, which is good, and we've been speaking with more departments and actually just got an office there. So it's good to do something like this. Obviously we're in a special situation where we don't have to win the city as a customer, but it's certainly been an interesting way to get our foot in the door, and I think that could work for other strategies as well.
I mentioned that I was going to come back to why it takes so long. One thing was that we were sending agreements back and forth, because we weren't sure at the beginning what kinds of data we would look at and what types of data we would exchange. Essentially, when we agreed that we would only look at open data for now, it was okay to just go without an agreement. The other thing is there were some data sets that the DDOT GIS people were very proud of and they wanted us to use them, but it just did not make any sense without having a real research question or instruction in order to approach this whole thing. So we wasted a lot of time trying to understand a complex road centerline system that they have built, with like 14 aspects alongside every road, to really exactly identify the width of every bike lane. This was just information overload very early on, and that's what really caused the whole project to be delayed.
We got a lot of speaking engagements afterwards and are still doing it, but I think at this point we want to make sure that we've got our takeaways from this and think about what the next steps are.
There was a huge team involved, and I'm lucky that Mapbox is so diverse and has offices in Bangalore, Washington DC and San Francisco. We worked with a research fellow in London. The final team is some very passionate people around open data, with Eric Fischer and Morgan, who both maintain their own projects, so it's really important to have that in place. The project is currently still a private repo at Mapbox, but we're working on making it open, so we can hopefully soon talk about sharing not only the data set but the tool and the code that went into the model, to use it in other cities around the globe.
So yeah, as I mentioned, takeaways. One thing I would like to say, and I mentioned it earlier in a conversation as well: show, don't just tell. I mean, I'm somehow in between, so I'm not totally geeky, but it's very easy when you talk with government that you use the wrong wording, or you use terms that are very well known to you but very difficult for them to understand. And the same goes for visualizing things. If I think of the building data set from Melbourne that I showed in the beginning, I can see it already, but not everybody can see it when they see a spreadsheet like that. So I think it's very important to just prototype and make a mock-up.
Deadlines, yeah. I literally signed us up for Smart Cities Week DC because I knew that if I get in and they like it, then I have an internal reason as well to pull the whole team along. And that's what happened, and that's what happened with lots of other conferences as well.
I said diverse team, but it's not so much necessary that the team is diverse as in gender diversity or racial diversity; I think everybody has to be very passionate about the cause. And that's something I really notice working on traffic safety: there is always going to be someone that is more passionate about this than others.
And that's the end already. If this is interesting for you, sign up for the newsletter. If you have questions, ask me now if you can, and if not you can email me. Thank you.
Yeah, we do have some time for questions. Anybody have a question? Questions?
I'm wondering, so you've talked about the fact that you have a partnership with DC, of course. Why, in general, did Mapbox start the Mapbox Cities project?
Oh, that's a good question. I don't think it was a strategic decision per se. I did some research in 2016, around the same time as this whole thing started, about smart cities and why cities should use data and what's the importance of open data and open source for cities. But then the White House, under Obama, launched this fact sheet for smart cities, and because we're a DC company and the White House is kind of important, when we had the opportunity to announce something as part of this fact sheet, I got, again, the internal buy-in from everybody, and I just made a website and pushed it within like three days, honestly. After that it all came together. We basically had to come up with something that is cool enough to announce in a White House fact sheet, and we came up with the idea to launch an open call to invite three cities to work with us. The three cities last year were Melbourne, Australia; Bloomington, Indiana, which is a small university town, similar to this, in the US; and then the West Midlands, just north of London. And we had applications from all around the globe, with 70 applications worldwide. So there was really a need for it, even though we still probably don't really know why we're doing it. We get so much good feedback from both cities and people that work with cities, because we are not trying to sell to cities. This is not a commercial. I think it's more like combining the research and beliefs that Mapbox shares with the open data community, and doing it rather than just talking about it, you know.
Cool, thank you. Thank you.