My name is Gleb, and I'm from Belarus. We run an open data group in Minsk with probably more than 100 people participating, mostly students and researchers from scientific libraries, plus some journalists. We're trying to find sources of data that will help us understand what's happening in our country and how to deal with social and civic issues.

According to the Open Data Barometer, the situation with open data in Belarus is pretty bad: we're ranked 93rd out of 115 countries surveyed, which basically means the government is collecting some data but sharing almost none of it. The problem is that even when data exists, either the government wants to sell it because it costs money to produce, or the people in charge are overloaded and don't want to be responsible for any mistakes in the data. There is no freedom of information legal framework, no law at all, so we cannot file a freedom of information request. There is a procedure for requesting a response from the government, but the answer can come as a printed letter saying that the data could be available but isn't, so they're not responsible. And most people, like journalists and researchers, don't know about Wikidata or OpenStreetMap, and we're trying to change that. There is also a special case: some of the data is closed for national security reasons, basically KGB archives and archives about whatever was happening during the USSR period.

So I just want to give a bit of background about the country. In terms of population, it's roughly the size of New York City. There are two official languages, Belarusian and Russian, which makes things a bit more complicated, because we have one alphabet but two variations of it, with special characters. When you work with archive data from, let's say, 100 years ago, you have five to eight languages that were in official use.
Basically, there are no minorities because everyone is multilingual. In my family we speak four languages, because my grandparents came from four different communities. The cities, where more scientific and archive data was traditionally collected, were traditionally Jewish, Polish, and Russian, while the titular nation, the Belarusians, had less access to education and was concentrated in rural areas, so it's much harder to find information from non-Russian, non-Polish, and non-Hebrew sources. It's all changing very fast, within a matter of decades, as there were waves of migration both before and after the Soviet Union disintegrated. But it's a really interesting place. You could say it's a flyover country: when people track planes flying over Minsk, most of them are just passing through, on the way from Russia and Asia to Western Europe, and that's an actual picture from Flightradar24. Despite that, it's an interesting place to explore. I'm sure the situation is really different from where you come from, but I think there are some similar problems and similar approaches that could be interesting.
Just as a random example of using data as art: people in our community took the map of Belarus, split it into polygons, and looked at geographic names, like river, village, and city names, and the patterns contained in those names. It turns out those patterns carry information about the people who live there, obviously, and there are local references to animals and things like that. You can group them, and it's an interesting dataset to play with. All the data comes from OpenStreetMap, about 23,000 location names. It's interesting to see how Baltic place names, for example, are concentrated in the north, in the places closest to the Lithuanian border, and here are the places with more Russian-specific suffixes. So it's not true that there is no data. There is some data, and it's interesting to discover it. And this map shows the more Belarusian-oriented suffixes on the map of Belarus.

Moving on to less art and more data-oriented projects. We try to get civic data out of news reports and press releases from the police and government ministries. Sometimes, when the bureaucrats are feeling lazy, they use the same template, the same kind of sentence, over and over again, and we try to parse that for actual numbers. I'll show some examples from the police reports later. We look at municipal services, so the hotline reports about things like broken curbs and illegal parking. We look at maps of transportation, because public transportation is really big in Belarus and most people use it. We look at environmental data, which is really important in Belarus because it used to be a mostly rural country: most people lived in the countryside up until the 80s, so the landscape is very, very non-urbanized and the population density is fairly low, and most of the land area is forests and wetlands. We also look at how public transportation is giving way to private cars and car ownership, and we also try
to extract data about home ownership, basically apartments, because it's obviously the best way to understand how cities are changing. The government doesn't publish enough urban data, so we're trying to understand this from classified ads, for example. We try to work with commercial data as well, because nowadays companies accumulate more and more data, even compared to government services. And we try to use as many open source datasets as possible, from OpenStreetMap and Wikipedia, and to popularize them.

So that's an example of government data that we try to parse. The animation on the first slide came from us trying to parse the official zoning map of Minsk, the major city, into polygons to understand what the population density was in different parts of the city. We actually get this as JPEGs. That's the best you can get, and it's kind of secret: you can only get it if you know someone in the ministry, but once somebody got it, it was shared in the community. The reason it's secret is that people started protesting against the shrinking of green areas, parks, and all the public places that were handed over for commercial construction to foreign and domestic investors building shopping centers and new apartment buildings. It's kind of a mess, and the discussion process is really a conflict point for the government, so they don't want to publish any zoning maps. But we try to get as much of this data as we can.

About parsing news reports and police reports: this is again the map of Minsk. We had a parser running for three years that extracted every daily police press release about all violent crimes in Minsk. I think this map shows reports from only one or two months, so the numbers are not that large, but it was actually useful: you could see which areas had more crime and which had less. In general the crime level is very low in Belarus. I think it's kind of comparable to Portland. The number that's stuck in my head is that the
number of murders in Belarus per year was lower than in Baltimore, but I guess that's not a fair comparison at all. Comparing the populations gives you an idea.

Here's another example, where we extracted all the human rights organizations' reports about police arrests and harassment during protests and demonstrations, because there were major waves of demonstrations against government austerity plans in 2017. We extracted data about all the arrests and all the punishments people were given: 224 people were given arrests, averaging around 10 days in jail, and fines averaged around $250. That's a map showing the concentration of people arrested in different places in Belarus.

Another interesting source of data is the traffic police reports. I know the visualization here is kind of small. We were just trying to show how different types of traffic incident reports split between men and women: the blue part is the men who were involved in the incident and the pink is the women. It shows data from 2012 to 2015. It was just a random thing we wanted to try out. It shows all incidents, incidents caused by intoxicated drivers, where women were basically non-existent, accidents with injuries, and accidents with fatalities. So whenever there is a discussion online about women drivers and men drivers (I don't know how popular this discussion is in this part of the world, but in Belarus it's somehow very active), yeah, it's an argument. The right part shows the split by age, so 19 to 23, 24 to 28, 29 to 33, and so forth, and the youngest drivers caused the most accidents.

This is another view of the same data, showing the number of accidents by day. The line chart shows the days and the peaks in accidents per day. Most of the peaks were either holidays or the first day after a holiday, and there was one huge peak during the hurricane and the snowstorm that caused a lot of accidents, and you can
find other different things that matter.

Okay, another major source of data is the municipal hotline data, which was published completely by accident. Basically, the government is using a private company to build a portal showing all the municipal problem reports, and it's working really well. It's just that the JSON responses contain all the data about the people who report things, and all the reports themselves are completely available. So it's a huge dataset, and it's growing. It's not complete yet, because right now most reports are submitted by phone rather than through the app, and we only get the reports submitted through the app. And they're actually really nice people: when we talked to them and told them all this data was available, they were like, "Yeah, we know, don't do anything bad with it."

So this is a nice picture of Minsk showing all the requests reported through the app. The number of reports made through the phone hotline is actually an order of magnitude larger, but we don't get that data in the dataset. They're trying to switch to this new system, but it's really expensive, because the private company uses Oracle as the database, and I think they're not yet aware of how much it will cost.

When we pulled this dataset, we split the user names, which contained first names. We threw out the last names because we didn't want to keep that personal data. The first names gave us information about gender, so we looked at how reports were split across different categories of issues: reports about sidewalks, problems with parks, and things like that, versus problems with staircases and doors in apartment buildings. It turned out that men were much more active in reporting problems with sidewalks that were closest to bike paths, and women were more active in reporting problems with infrastructure
inside the apartment buildings. So that was an interesting bit of information. It also gave us an idea of the differences in how much time was spent on resolving those reports. I haven't covered even one third of the data, but I think I'm almost out of time. This shows the split by district in the city and how long it takes different parts of the city to fix problems. It's all from the same dataset.

This is from the KGB data, completely unrelated to the municipal data. I'm just showing a map. We worked with human rights activists who collected data from the KGB archives, and it shows all the people who were arrested and sent away to Siberia or to local prisons during the Stalinist repressions. The dots show the places with the most arrests per year. There were waves of repression, and you can kind of see that.

We also used some pirate data, because it's not possible to get all the data you want officially. Sci-Hub, the pirate scientific database, made available some download information for different countries, so we were able to see, by IP ranges, what people from Belarus were downloading: what types of publications, journals, sections of journals, and topics. It gave a lot of interesting information, which we published online as an actual scientific publication.

Traffic data is interesting. We also run some community and activist projects where people just go outside and measure pollution in local water sources, and we map that. I think you'll just have to download the presentation with the links if you're interested, or talk to me later, because there's too much that I wasn't able to show.

Just one more random source of information that was really interesting to use: we downloaded all the classified ads for used cars in Belarus, around 200,000 ads per month, and we did that for 5 months. Then we looked at the oldest cars per city and the most polluting cars per 1,000 population in different parts of
the country, and it showed really significant variation, because the oldest cars pollute a lot more. You can see it all from the classified ad data and from the addresses. There is also real estate data and density data, and all of that can be combined to gain insights into what's happening in different parts of the cities. When you put OpenStreetMap on top of that, you can actually see which parts of the city have the most beer stores per district and whether that correlates with density. It doesn't. It basically correlates with the number of the cheapest apartments and with places where people don't have anything else to do. It's interesting how much you can get out of OpenStreetMap when the state doesn't publish anything. I actually made a screenshot from Tim Berners-Lee's TED Talk: this circle shows Belarus on the OpenStreetMap activity timeline, so I'm kind of proud.

We have also been running an infographics school for 5 years now, with 4-month courses that teach journalists and activists to use data, and most of the screenshots were from things people did in the school of infographics. Thank you, and I'm happy to answer any questions.