yours.

Thank you, Tom. So hi everyone, I'm Maarten Lambrechts. I'm a big fan and a big user of open data. I'm a freelance data journalist and visualization consultant, and I think I can safely say that I owe a big part of my career to open data and the use thereof. But over the years I have noticed some things that annoyed me a bit when looking for and using open data, and I think a lot of these things are also keeping open data from being used more than it is today. So in the next couple of minutes I will walk you through some of these annoyances and try to come up with solutions, so that open data publishing is more centered on users. The subtitle of my talk is "Open data for humans": not only for computers, but also for humans.

Okay, I think I might discourage some of you in the room who are open data publishers, but don't be discouraged. I don't want to criticize your work, because as I said, I'm a big fan. But I do want to raise some concerns, and maybe there are some things you can take away from my talk and implement in your data publishing or in your data portal. So please don't be offended. I just want to make you think about the user a little bit more.

I'm going to start off with a little story about what I did with open data. This is a quote I think many of you can agree with, and it's not really a very strong quote, except when it comes from this guy.
This is Jan Peumans, the president of the Flemish Parliament. And of course, when he says that there are some people in parliament literally doing nothing, well, as a data journalist I'm triggered. I want to find out who those lazy members of parliament are. So I went looking for the data. The Flemish Parliament has an open data API and policy, so I went looking for the documentation of this API. It's all there, of course: all the data I was interested in is in the API. But when I looked into how I could retrieve information about the activities of all the members of parliament, I had to look into a lot of these API calls, store a lot of information, and then make other calls. In the end I did not use the API of the Flemish Parliament. I'm a nerd and also a bit of a programmer, so I know my JSON from my XML, but I wasn't able to extract that information from the API, because for me it was a bit too technically challenging. As a journalist you're under a lot of time pressure, so I ditched the idea of using the API, because it was giving me too many headaches, and switched to another strategy.

What I did was go to the website of the Flemish Parliament, where you have a nice overview of all the members of parliament, and when you click through you get an overview of how many documents each member filed and how many times they said something in parliament. So instead of using the API, I wrote a scraper that went to this page, clicked through to the individual page of each member of parliament, and took the data from there. In the end I was happy: I had the data I wanted, and we made this piece, in which we identified the lazy MPs. I'm sorry, this is in Dutch. We made a nice graphic out of it. Each dot is one MP, as you can see, and on the top right are what we called the busy bees: people
talking a lot but also filing a lot of documents. Here are the chatterers, who talk a lot but don't file a lot of documents, and Herman De Croo is obviously one of them. These are the silent forces, who don't say much but are very active: they file a lot of documents. And then of course here are the lazy MPs. We made a little button to zoom in on them. The laziest one was Gwendolyn Ritter, who is no longer in parliament, and surprisingly Jan Peumans is there himself, but I think his activities as president don't count in this data set.

So this was a nice exercise, but after I made it I started thinking: this wasn't really meant to be. I had to scrape the website in order to get the data, while the parliament has an open data policy. And this is one of the things I want to talk about. If I, as a technical journalist, can't access this information, then I think there's something wrong, and that's really the baseline of what I want to show you next.

So maybe we can talk a bit about what open data really is, and for that I want to go to the Open Knowledge website. The print is a bit small for the people in the back, I think, but it states here that everyone must be able to use, reuse and redistribute: there should be no discrimination against fields of endeavor or against persons or groups. I think this is a really valuable part of the definition, but you can interpret it in multiple ways. If you focus on the first part, "everyone must be able to use, reuse and redistribute", you can interpret that as: anyone should have access. But you can also say that anyone, with the knowledge they already have, should be able to get access to the data, and I think at the moment that's not really the case. I'll illustrate this with a little analogy, which I will keep using throughout the rest of my talk. I think you can compare open data in an API to a can of soup: there are some juicy things in there,
there's value in there, but in order to access it you need some tools and knowledge to get to the content. Obviously, if you want to open a can you need a can opener, and you need to know how to handle a can opener; only then can you get access to the content. With the API of the Flemish Parliament, I didn't have enough knowledge to get to the juicy bits of the soup, so I had to use another strategy. My strategy was a bit like this. This is actually a screenshot from a video on YouTube that shows you 10 ways to open a can without a can opener. It's very interesting, but I challenge you to open a can of soup with an axe like that. It will be really difficult, and a bit dirty, I guess.

So my first point here is: is your data accessible to non-programmers? Because if you interpret the definition of open data like I did, then a lot of open data isn't really open, because you need some skills and tools in order to access the data. It's also important to have the data human readable, because you can say everybody has access, but if it's only you and your peers that can read the data, then you will not reach a lot of the people interested in it. So instead of offering a can of soup or a can of food, offer data in this way: really open. Anyone can access it, you can eat it with your bare hands, there's no knowledge needed to get to the data. Anyone really can eat this juicy bowl of pasta. So my first advice would be to provide non-technical and easy-to-digest views of your data, instead of only offering data as an API, for example, that a lot of people will have a hard time accessing.

Then, when people have found the data, they need to know how to really start making a delicious meal with it. If you think about data sets as ingredients, then you also need some kind of stepwise guidance for your users, so that they know how to process
the data, what can be concluded from it and what not, and how they can bring this data into a product they want to build or use. So my second advice would be to provide documentation, examples and tutorials on how to use the data. The description on the API page of the Flemish Parliament was really cryptic. There wasn't a lot of explanation, and it wasn't enough for me. So I think if you want users to use your data more, you have to offer some kind of guidance for the more non-technical users as well. An excellent example, I think, is the website of Transport Focus, which is, well, the TreinTramBus of the UK; I don't know what the equivalent in Wallonia is. They are publishing a lot of data from surveys about the quality of public transport, and on their site they have these nice videos explaining how people can get access to the data and what can be done with it. So I think this is a good example of how you can lower the barrier for non-technical users to get access to your data.

Okay, next point. Imagine you are selling meat. Of course you're selling data, but in this analogy you're selling meat. When you're selling meat, I think you also have to talk about this little cutie here. You need to give a bit more context to what you're selling. You need to explain where the data comes from, and you see the tag on the ear of the little cow, and how it fits into a broader picture. You can even go one step further: when you're talking about raising cattle, you also have to mention things like deforestation. So when you're publishing data, I think it's a good idea to give your user a bit of context, to explain how this data fits into the rest of the data you offer, but also how it fits into the public debate or the news, for example, and
you need to explain why this data is important and what can be done with it, given the broader context of society. So my advice would be to give more context. This can be just a little text describing what the broader context is, or you could give some experts room to write about why this data set is important and what could be done with it.

Next point. This is what I imagine to be a really delicious meal. Okay, I wanted to show the Google Translate, but it doesn't come up, so it doesn't matter. I think if you don't speak Arabic... it's there, okay. So when the recipes are like this, I think most of us won't understand them, but maybe some people read Arabic and will be able to prepare this delicious meal. I think there's something lost in translation here: "potato with rice and sauce". I think the original name will be something else. But my point is: if you offer the data and your documentation in only one language, then a lot of people will simply not have access to the data. So what you should do is publish in multiple languages, and I'm sure in Belgium most of us do. One example is Statbel: a lot of their content is published in four languages, also in English, so that's really great. Two weeks ago I gave a course on data visualization to statisticians from national statistical offices from across Europe. One of them, a Swiss guy, did a survey of the data portals of the official statistical bureaus, and he told me that almost all of them publish information and data in English as well. I think this is really great, because if you don't do that, you'll exclude a lot of your possible users.

This is also an interesting book. It's actually a cookbook for blind people, so it's a cookbook in Braille. I must admit, I make a lot of data visualizations and I almost never take into account people who have problems with their sight. But at the course
I gave to these statisticians there was one guy from Slovakia who had bad eyesight, and that was the first time I really started thinking about this group of people, who are also very interested in data. With some simple measures we can give them some kind of access to the information we are publishing. So I would say: try to publish data, descriptions and documentation that are compatible with screen readers. Once again, the Transport Focus website is a nice example. If you go there, you can see they're really investing in making their content accessible to people who have bad sight or are blind.

Okay, that was the first part, about the data itself, but I also want to talk a bit about data portals, where the data is published. If we go back to the Open Knowledge website, or maybe I can just skip to the quote: this is also part of the definition of open data from Open Knowledge. The data must also be available in a convenient and modifiable form. So it should be convenient to handle the data, but you can also interpret this as: the portal, or the way the data is made accessible, should be convenient too. And I think in a lot of cases that's not really the case.

We'll keep on using the food analogy. I think a lot of data portals are a bit like this. This is a giant supermarket, and as a user you don't know where to start looking; there's no good guidance on where you should go. I think in a lot of cases there's also just too much information, and before you start opening more data, I think you have to think about how to open data better. My apologies to Elaine: the next image is from the European open data portal. As you can see, at that time there were 804,000 data sets; there are already more. But for me, as someone looking for some kind of information or a specific data set, this is a bit intimidating. When I see this, I think I'll never be able to find the data set I'm looking for. So I think this is also something to take
into account.

Okay, this is Collect&Go, the Colruyt webshop. What I wanted to show you here is that you could compare this to a data portal. We saw the European open data portal, where you have the countries; here we have different kinds of food. But what they do really well, of course, is showing you the things you can't miss. If you go down a little further you also see what's new, and on top you also have the menu of the week: they just give you a recipe for each day, and you can order the ingredients directly from there. So why am I showing you this? I think you also have to think about how you can market your open data. If you're just showing all the data that's published on your portal chronologically, then you're missing out on a big opportunity. You could put the most popular data sets on top, for example, or the most relevant ones, or the data sets with the highest quality. You can also track which data sets are downloaded the most, or which pages people visit the most, and I think it's a good idea to put these things on top of your portal and not bury them somewhere below. One good example of this is Data USA. This is a website collecting open data from different sources in the US and making it available in a very attractive way. You can see here that the new data sets are highlighted, and I'm sure they are also tracking which of these pages are the most popular, so they put the most popular ones on top, because a lot of users are interested in these data sets.
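As a rough illustration of the point about putting popular data sets on top instead of listing everything chronologically, here is a minimal Python sketch. It is not something shown in the talk: the `Dataset` record, the `downloads` counter, the `front_page_order` helper and the sample catalogue are all hypothetical, and a real portal would take these numbers from its own download or page-view statistics.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Dataset:
    title: str
    published: date
    downloads: int  # hypothetical counter, e.g. taken from the portal's access logs


def front_page_order(datasets, top_n=3):
    """Return the top_n most downloaded data sets; on ties, the newest comes first."""
    ranked = sorted(
        datasets,
        key=lambda d: (-d.downloads, -d.published.toordinal()),
    )
    return ranked[:top_n]


# Made-up example catalogue, for illustration only.
catalogue = [
    Dataset("School addresses", date(2017, 3, 1), 1250),
    Dataset("Bus punctuality survey", date(2018, 1, 15), 310),
    Dataset("Parliament activity", date(2016, 11, 20), 1250),
    Dataset("Tree inventory", date(2018, 2, 2), 42),
]

for d in front_page_order(catalogue):
    print(d.title, d.downloads)
```

The same idea works with any other signal a portal tracks, such as page views or a quality score: only the sort key changes.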
Okay, next one. At university I had a friend, and when other friends had their birthday, he would go to the supermarket, buy some cans of food, strip off the labels, and give these cans as a gift. So you had cans of soup, but also cans of cat food or dog food, for example, and without the labels you don't know what's in there. It's a nice idea, but it's not very usable. What you of course need to do is offer some kind of teaser or sample, so people will know what's in there and whether what they may be going to buy is really what they are looking for. If we translate this to data publishing, this means offering a preview of the data. If you don't offer a preview of what's in a data set, people have to download it first, then open the file, and only then see what's in there. If you can give a preview directly on the website where people are searching, they will have a much easier time selecting the data sets they're interested in. Data USA is doing this. I'm just going to open an article, and a visualization should appear here. What you can do directly here is view the data, and then you get a preview of what's in there, and as a user you can then decide: okay, this is what I want, or not. If you don't have the preview, you have to download first before you can even see what's in there. So I think giving a data preview can help users a lot.

Okay. If you want to prepare a meal with different ingredients, well, we have seven ingredients here, and people are not interested in visiting seven different shops to get them all. They want everything in one place, like we saw with Collect&Go by Colruyt. People don't really care about where the data is coming from. For example, if I want a list of addresses of schools in Belgium, I need to go to the Flanders open data portal or the institution responsible for education, I
need to go to Wallonia, I need to go to Brussels. As a data journalist, it's much easier if there's one authority offering all these data in one place. So users want everything in one place, and they don't really care who is responsible for generating the data. So I think aggregating things also has a lot of value for end users.

Okay, this is a picture from a supermarket, and it says "the choice is yours", but the message is obviously not in line with what is offered in this shop. So I think we need to talk about usability here. I did a little bit of research, and maybe I'm wrong, and I think some of you know better than I do, but my feeling is that there is not a lot of research on how a data portal could be aimed more at users, and that there's not a lot of user testing involved. One thing I found was this study, and I think it's really interesting. What these researchers did was put people in front of a data portal and ask them to look for certain information, or basically just use the website, and then they followed people along their journey on the website. If we quickly look at what they found: in the end they have some suggestions for making data portals, such as better meta descriptions, clearly showing whether selecting multiple filters combines them as AND filters or as OR filters, sorting, and other elements. So I think if you're serious about data publishing and you invest in a data portal, then you should also do some user testing, because otherwise you will miss a lot of users who simply will not find the information they're looking for.

Okay, we're almost there. This is someone looking at the menu, and I think it's pretty obvious that you should provide an index, or that your data should be indexable or searchable by search robots. In a lot of cases this is not really the case, because if you can only get to data when you have to
select from a drop-down menu, for example, the information that's in the data will not be indexed by these search robots. So you have to make sure that you have good descriptions for every data set, that you use the right keywords, and that those are not hidden behind UI elements like drop-downs or filters. And here my food analogy stops, because I couldn't think of a good example for this one.

Number 12 is about embeddable data. One good example of that is Our World in Data, a website publishing a lot of data from research, mainly about the longer-term trends on different topics in our world. They have this excellent module, I think. Here you see a visualization, and we can, for example, look for Belgium here, and the chart then updates. If you use this one, I think you get an embed code, so you can simply embed this chart in your own website. It also respects the filter you set, so you can see here that the data is now filtered on Belgium. Why is this important? A lot of people and institutions are interested in publishing data visualizations, but it's oftentimes very hard to make good ones. If you offer something like this, you're giving users the power to just take your data and publish it without any technical knowledge or any knowledge about data visualization. They can just go and embed this in an article they are writing, for example, and this also lowers the barrier for using the data you are publishing.

So that's the URL for these slides, so you can access these articles as well. These are two articles I found very interesting, and I basically took the points from them to make this presentation. So if you're interested in getting more users to use your data, I think you should read these articles, because they make excellent points about usability and lowering barriers
for users. So these are really great articles. And these are my slides: all the links are there, and you can click on the images and they will lead you to the things I showed. With that I think I can conclude, and I thank you for your attention.