I want to start out by just asking you to consider one thing. I'm sure pretty much everyone here has made at least one chart, and some of you have probably made hundreds and hundreds. But how many times have you actually been in the data set that you're visualising? This is something I've thought about for a really, really long time.
In 2010, I started working for the statistics department of the International Organization for Migration. I was working in the Iraq office, trying to keep track of how many families had become refugees as a result of the war in Iraq and how many were displaced within the country. But more important than simply counting how many of those families there were, I was trying to understand what those families needed: whether they needed water, an education, food, what they needed to basically stay alive. Those statistics also served a second purpose, which was to be able to go to donors and say, this is the money that we need to provide for those Iraqis. And I have a really important and slightly difficult confession to make, which is that I was really, really bad at my job.
And that's not me being modest at all. You can look at this chart, which proves it. I made this chart. It comes from one of the reports that we published, and I have some big issues with it. It's actually surprising: the thing that bothers me about it is not the fact that it's an ugly pie chart, because if you think about the people outside of this room, all around the world it's charts like this that people are producing in classrooms and in offices. The fact that it's 3D obviously is not great. It is plenty misleading. But there are other things that I think are deeply problematic about it, and they explain why I was so bad at my job.
The first of those is just simple geography. I wasn't actually based in Iraq. Like so many of the humanitarian organisations at the time, we were based in Jordan because the situation in Iraq was just so, so bad. I think that geographical separation from the data that you're analysing is super important, and it's really easy for us to forget about it. It meant that I couldn't sanity check some of the results that I was looking at. Let's say, for example, I noticed that there were electricity problems in one part of the country. Without actually being there, it's really hard to understand whether those electricity problems were because of a short-term outage, or whether they were systemic within the country, or whether in fact people had just misunderstood the question that was being put to them and were talking about electrical products as opposed to electricity itself. So that geography was a big, big problem for me.
There's another reason why I have a big problem with this chart. If you remove the labels, I could be depicting absolutely anything here. I could be showing the types of underwear that women wear.
I could be showing murder weapons, anything at all, because the actual visualisation is so divorced from the subject itself. And I know that the idea of communicating something personal about the data might feel a bit alien to some people.
This is a really cheesy transition. Speaking of aliens, this is Lieutenant Commander Data from Star Trek. For those of you who don't know him, and I feel like this is probably an audience with some Star Trek fans in it, Data is insanely smart. He can compute just about anything, and that's why he was such an asset to the captain. There's only one thing that Data can't do, and that's understand human emotions. Except in season one, episode 13, Data's identical twin Lore shows up. Lore is supposed to be some kind of upgrade on Data, because Lore has the emotion chip that Data doesn't have. But actually that's kind of a problem, because while Data is super virtuous, Lore with his emotion chip is manipulative and self-serving.
I think the fact that it was depicted in that way reflects how a lot of people think about data: that it's emotionless and that it's good for that reason. That's problematic for us, because very often people who are working with data don't really have an incentive to correct those people. It's great if people think that we're these perfect humans with no input and no biases in what we're doing. But I think it is really important to be honest and upfront about what it is that we're doing. There's no such thing as emotionless data visualisation, because any visualisation is being made by a human who feels emotion, so we just need to acknowledge that. We make choices about fonts, about colours, about scale, all the time.
Now, reason number three why I think I was really, really bad at my job is, I think, the most important reason: the charts I was making were being shown to a couple dozen people, sometimes a hundred, mostly donors and colleagues. They weren't being seen by the people that needed to see them the most, which is the Iraqis. So the Iraqis who had provided us with this data weren't being given any kind of mechanism to say, hang on a second, you've got that wrong. And I think this happens all the time.
One aspect of this that was relevant is that this report was eight megabytes in size. It was absolutely huge. Download speeds are not the same around the world, and the fact is that some Iraqis would have had to wait for ages and ages in front of their screens just to download the thing. And that's even just the Iraqis that were speaking English. Some of the reports were translated into Arabic, but not all of them were. I can understand the reasons for that. We were working in an incredibly difficult environment. We needed to be efficient, and the most important people that needed to see these charts were the ones who were giving us the money to just do our jobs.
But at the same time, it meant that what we were doing lacked transparency. We weren't being transparent about the limitations of the data. We weren't being transparent about the limitations of the visualisation. We weren't being transparent about the limitations of us, the people that were actually putting this stuff together. You can't see my process here. You can't see whether I collected this data from questionnaires or face-to-face interviews. You can't see how many people I'm depicting. It was a failure all round, especially, as I said, the fact that this was just spat out in a PDF that didn't give Iraqis the opportunity to say you got it wrong.
And again, that's super ironic when you're saying that the people you're visualising are disempowered, and you're not giving them the opportunity to say you're wrong. So it's a really, really intense experience.
As I said, I was really, really bad at my job, and I don't like being bad at my job. So in 2012, I decided to go to Iraq to find out if I was getting my numbers right. It was... I've just jumped to the wrong part in this, sorry. It was pretty intense, because I realised how little my charts were really communicating the reality that was there. They were completely out of touch. And so, as a way of capturing something that felt less abstract, I started to take photographs that felt a lot more real to me, of what life was like there. I started to take photos all the time. I was taking hundreds and hundreds of them. That process of repetition meant that I started to notice patterns in some of these photographs, and noticing patterns, as you know, often leads you to want to visualise data. So eventually I turned some of the photos that I took into charts and produced an exhibition of the photographic images in 2013.
I'm just going to talk you through a couple of the visuals that I made, in case they're interesting. This was a woman just walking down the street in Baghdad, and I turned it into a bar chart. What I'm actually displaying here is the proportion of women aged 15 to 49 who think that a husband has a justification for striking his wife. I know that the terminology around this sounds quite bizarre, but that was the way that the question was worded. I think it's really easy to forget the cultural differences about the way that you ask a question.
So in a country like Iraq, just asking people, you know, have you been abused by your husband, would represent a massive intrusion into someone's private life, right? The data you're going to get is going to be incredibly flawed, so you have to be creative about the ways that you understand trends. This data actually comes from the Iraqi census. And as you can see, one in five women think that a husband has a justification for striking his wife if she burns the food. Really, really shocking data, but I just wanted to remember some of the actual people behind the data, and for me, using photographs was really important in order to do that.
Similarly, this one is just some Iraqis in the north of Iraq. I've actually cropped the axes out of this in case anyone wanted to have a guess at what it is that I'm visualising here. Can you see the line chart at the top? Does anyone want to guess? Sorry? No, that would actually probably have been a better thing to visualise with this particular image. It's actually depicting foreign aid, so it's quite useful for seeing how the country was quickly... I wouldn't say forgotten, but some of that financial aid trailed off. The idea of using electricity was because the economy was running off of that financial aid at the time, but yeah, that would have been much better.
This one here is because, obviously, working for the International Organization for Migration, I was focusing all the time on the numbers of refugees and the number of people that were internally displaced within the country, so I kind of came back to that. Again, the idea of something like this was to really remind people that this data represents humans, and to fix it in the geographic landscape of Iraq itself.
This one was... I was actually helping someone who wanted to write a will. In Iraq, you go to these people out on the street.
This was a guy sitting underneath an umbrella, and you go and get one form from him, and then you get sent to another guy. This comes from the World Bank, and they don't actually publish this indicator anymore, but it just shows you the number of steps that are needed to enforce a contract. I wanted to visualise it this way because when you're actually doing it, it's such a painful bureaucratic nightmare to get anything done. Every single one of these steps creates an opportunity for corruption.
Actually, the endemic corruption in the country is really relevant for the statistics that we're looking at, and it's something that's so easy to forget. All the time, I was looking at things like the percentage of Iraqis that had graduated from high school, literacy rates, numeracy rates, things like that. While I was there, I started speaking to a mother who explained to me that her daughter was falling behind in school, not necessarily because she wasn't doing very well, but because even the teachers in the schools accepted phone cards as bribes for giving the kids better grades. I'd been spending all of this time looking at this data set on school grades with no understanding of the facts on the ground behind that data. Going there really helped me to understand the gaps in some of the things that I was looking at.
This last one here was an attempt to do a better pie chart. I'm not claiming, by the way, that any of these are very good. I look back at them with the benefit of hindsight and think they're pretty bad in lots of ways. But this one showed that for nine out of every 100 Iraqis, the cause of death was suspected torture. The idea, again, was that you couldn't look at a visualisation and lose sight of the people within it, or the human side of things. Even if you remove the labels, you still get a sense of the subject. Everything's fixed in a real time and a real place.
The other purpose of these charts is that they're supposed to be very inclusive. With the process here, you can see that someone has selected the images, and it almost looks like a real collage that's been cut out and stuck down. Hopefully people can see a way that they could actually replicate my process. I think that's really important. I do not think that data visualisation is for geeks. I don't describe myself as a geek, and I don't particularly like the label geek, because I think it just creates a new community of insiders that can be quite exclusionary.
It's really important for me that as many people look at my work as possible, which is the reason why I eventually moved into journalism. I'm not ashamed of the fact that I seek out readers and want a lot of readers because, as I said earlier on, those readers can check my work and tell me whether or not I'm getting things right. So I started working on the Guardian's data blog, and then I moved eventually to FiveThirtyEight over here in the US. While I was there, as Irene mentioned, I started writing a column called Am I Normal?, where the name eventually switched because we felt like it might sound like we were placing some kind of moral or ethical judgement on the readers by providing a response to those questions. The idea was that I could not only get more input on the visualisations I was creating, which was super important, but I could also have input about the questions I should be asking, what data sets I should be looking at, and what hypotheses were out there.
I want to talk about one of these columns in particular. I produced dozens of them, and I still write it now, but for a different site. I got this one question from a lady called Caroline, who was 44 and living in Philadelphia at the time. She wrote to me and asked: Dear Mona, I recently read an article that said most of the prison population is religious, while there are very few atheists in prison.
Please tell me if this is true for the United States.
So, as many journalists do, I merrily found my data, and I started analysing it and visualising it with the help of Reuben Fischer-Baum and Allison McCann at FiveThirtyEight. We thought a lot about the best way to visualise this. Should we be showing raw numbers? We eventually felt that, in order to respond to this reader, what they were really interested in was ratios: which populations are over- or underrepresented in prison relative to the US population as a whole. As you can see, Pentecostal Christians are underrepresented, and then there are more Muslims in prison relative to the US population as a whole, and also Jewish inmates as well.
So I published this data and, as I always do at the end of my columns, I invited readers to get in touch, and they did. I had some former inmates who got in touch to explain to me that part of the reason for this data is that if you are in prison, as these individuals had been, sometimes you get access to better meals if you're eating kosher or halal as opposed to the standard prison meals. I also had it explained to me by former inmates that sometimes, if you're a member of certain religious groups, you get extra recreational time, which gets you out of your cell, and that is super important if you're in your cell for 23 hours a day.
So the ability for people to get in touch and explain the why to me is super important for what I do. It informs the theories I make. My job is quite boring if I just relentlessly describe the what of things rather than understanding why things are happening, which is why my inbox is really important to me. I'm going to share some of the questions I've got in my inbox that I think are quite interesting, because they reveal, again, some of the things that the general public is thinking about data.
So this one, which was sent to me in 2015: do attractive people have more sex than ugly people? I think it's interesting because I could quite easily imagine throwing a blog post together that did really well, with people just accepting that data at face value without any interrogation of what exactly is meant by attractive and ugly. I think the need to take a step back and explain the definition of terms and the methodology behind this stuff is super important.
Here's another one. Am I the only one that doesn't use the designed opening in my underwear when peeing? I thought this was an interesting one. I assume it came from a man, and as someone who doesn't own a penis, I didn't think this question was interesting at all. But I shared it on social media, and actually a lot of people do think this is interesting, and a lot of people wanted to share their data with me. That's quite nice, because what I find interesting isn't necessarily what the people that I should be serving find interesting, so to be able to check that is super important.
This one is quite dark. Someone asked me what the autoerotic asphyxiation success rate was, and they're obviously quite keen for a certain answer: because I'm sure the vast majority of times it's performed it's done successfully, right?
You can kind of hear the desire for reassurance there. But this question actually got me thinking really critically about your responsibilities as a data journalist. Let's say I was actually able to collect that data, and let's say that data showed that 90% of the time, if you do this, you're totally fine. How can I put that into a chart that communicates risk to a reader in a really responsible manner, so that people don't just see that chart and do something that is potentially quite dangerous to them? When you're doing health reporting, these questions are really critical, because you could actually be shaping people's choices.
And then there are just really simple ones, like: how much pee is a lot of pee?
This last one leads me to the last bit of my very strange, annotated, somewhat narcissistic resume, which is the thing I'm describing right now. I want to talk about the last iteration, if you like, in the way that I've tried to think about data visualisation. As you can see from these messages, FiveThirtyEight readers are kind of special, and this actually happens around most websites. The internet doesn't create this perfect flat space. Actually, you get clusters and communities around certain things. I write for the Guardian now, and there's a certain type of Guardian reader that's not necessarily the same as a FiveThirtyEight reader. So I wanted to publish on something that felt a little bit more flat and could hopefully give me access to new audiences. I also wanted to resist the urge to produce interactives all the time. They take a lot of time to do, and I wasn't thinking critically about when I actually needed to ask the reader to get involved and when I needed to just give them the simple image and the simpler story. So I started to use Instagram, because I thought it was a really good way to do that. As well, the comments underneath the images are really, really
fantastic because they're transparent, and they give an opportunity to have a discussion and a debate around that visualisation.
So here's one I did, which was actually a response to this particular question: how much pee is a lot of pee? I try to use everyday objects whenever I can in order to convey scale. I think that's super important. Not everyone necessarily grasps the same units of measurement. We use different units of measurement, and even if we use the same units, sometimes, especially when you're using large units, it's really difficult to understand how much we're talking about. But pretty much everyone has held a 1 litre bottle of liquid in their hands at some point, and that can help them to understand it.
Speaking of scale, and again thinking critically about some of the work that I do, I don't really think I needed to have my hand in this one. It kind of makes it look like a mini basketball. But the idea is to convey the relative size of the basketball and the hoop.
This is a slightly more serious one that, again, is about scale. It contrasts the average parking space in America with the average solitary confinement cell. What was quite good about this, I think, is that I got online abuse from both sides of the spectrum. I got abuse from people who were saying I was implying that prisoners have too much space, and I had abuse from people claiming I was implying that prisoners had too little space, which I think sometimes bodes quite well for the way that you've visualised something.
I also want to use this as much as possible to, again, reach out to the community of non-geeks, to show that data is relevant to everyone's lives in ways that might not be expected. There was a journalist in New York who did a Freedom of Information Act request to find out how many decapitated animals there were in New York parks, and I was able to visualise this and show which ones were the most commonly found animals. So there's
a lot of headless chickens. This is over a ten-year time period, but still kind of interesting. It wasn't always just heads; it was sometimes just the bodies and not the heads, and they were being found in, like, school playgrounds. This is very, very bizarre, but anyway, super interesting. It got people thinking about why this data is the way it is, and it was shared quite a lot, which for me is always a good thing. And then finally, circumcision rates, showing the benefits of doing small multiples sometimes.
Before finishing, I also want to talk about some examples of what I think is good data journalism, and good visualisations, that aren't my own. Again, I want to come back to this idea of conveying the imprecision in what we do, the fallibility of what we do. One way to do that is actually to just focus on one very, very small data set. This comes from a blog that some of you might know called the Quantified Breakup. I think that's what it's called. It was a Tumblr, and it was written by a woman who was going through a divorce in New York, and she visualised all kinds of aspects of her break-up. She visualised the messages that she was receiving, the frequency and the time of day that she and her ex were exchanging messages, and how the relationship fizzled out. She visualised how much sleep she was getting, and the times that she just started crying in public and couldn't control herself. People really loved it, not just because it told a story, but because I think it was really honest in terms of its claims and its limitations. This wasn't one woman trying to claim that she was representing divorce in America, or divorce in New York, or women in general. She was just saying, here's my story in the data. I think that transparency is actually something that's pretty great.
This is a very different kind of example. It came from the New York Times, written by Gregor Aisch, Kevin Quealy and Amanda Cox, and they basically asked people to
draw the data themselves. This charts parents' income versus the likelihood that their children will attend college. You draw out the data, and then you see how you compared to the actual reality. It tells you a story as you're doing it, and it invites you to interact in a way that feels super inclusive. Then you can see how you compared to other people. Again, it's just really friendly, and it had great design.
The last one that I want to talk about is from earlier this month. The Guardian published what I think was a pretty good analysis of 70 million comments that had been published on the website. I don't know if any of you saw this. I think one reason why it was really good was that it didn't overwhelm the viewer with too much data. It broke it down into several different slices, so you could see how levels of abuse varied by gender and by subject area. What was also really good about it, I think, is that it allowed the reader to take part in a little quiz where you could read a real comment that had been left on the site and decide for yourself whether you would have blocked it and defined it as abuse, or left it up.
What that did was, I think, first, help to convey a sense of scale, because you realised that you took a minute or two to decide whether or not this comment was abusive, and then you're like, whoa, 70 million of these have been analysed, that's a lot. It also showed you that even though you've been reading what seems like truly objective data on the scale of abuse, once you read that comment you realise that every single one of those data points represents a small human decision about whether or not this is abuse. It's based purely on subjectivity, even if the whole, altogether, appears objective and is objective in some ways. So I thought that was really smart.
Now, I know I've touched on a lot of disparate topics, so I
would like to close by thinking of some ways that we can get better at what it is that we're doing. I would say that even though data visualisation has obviously come on in leaps and bounds and has progressed hugely, in some ways our data literacy hasn't, and never will. We're flawed human beings. If I was to ask everyone here to say how many people they think are in this auditorium, we would probably get a very wide range of responses, and the bigger the scale, the harder it is. If you're in a huge concert venue, it's really hard to guess how many people are there. So we have a responsibility to communicate scale to readers in much better ways, even for something as simple as 70 million comments: trying to help people to understand how many that actually represents. There are all kinds of simple and complicated ways of doing that, but I think it's really important as a goal to bear in mind.
The other thing that I think is super important, if I haven't made it clear already, is the notion of conversation. As I've said, I think communicating and interacting with people makes me so much better at what I do. Gawker used to do this thing, that I don't think they actually do anymore, where you could leave comments on photographs, and when you hovered over you could see where everyone had clustered together to have a conversation. So if this was the photograph, say, you could see whether people were more interested in the President of the United States or a wealthy royal child. If you imagine this example translated into a chart where people were able to leave comments, it would actually be incredibly powerful. You would be able to see if everyone's having a conversation about one year in the data on a line chart, or whether everyone's focused on one state in America in their conversation. Or even, if everyone is just talking about one dot in your scatterplot, maybe that's a point of inaccuracy in your data and
you need to go back and change it. I think that's super helpful for having a more fruitful discussion. That was what I was going to say about Obama and the child.
Now, the last thing that I think is really important as well is that we just get better at communicating uncertainty. That was the idea behind me starting to do things like hand drawings and using photographs. It was about admitting fallibility better and showing that actually this isn't perfectly precise, which I think is what some data visualisations do communicate to people today, in a way that I find slightly troubling sometimes. There are all kinds of ways that you can communicate the uncertainty, whether it's just plotting out things around an average or plotting out probabilities differently. We just need to think more critically about it.
So the last thing I would say is that, yes, data should strive towards objectivity and fairness, and visualisations should keep on trying to do that, but we shouldn't be alienating people, because if we are, we're actually jeopardising the quality of what it is that we're doing. And the internet doesn't solve everything. Yes, theoretically you do have access to a whole lot of people, but you have to reach out to them. You have to make sure that you're publishing in the same language as them, and that your work is available to them regardless of their internet speeds. In a very concrete way, we're actually unelected officials who are representing people in our data, and if we're not checking in on them to say, have we got this right, I think there's something pretty dark about that. So thank you very much. Thanks for having me.
Oh yes, sorry, it's time for questions, right? Yeah, if anyone has any questions for me.
So I think it's a really great question. In an ideal world, I think I should be picking up the phone every single time I write an article, because the spreadsheet just doesn't tell you everything. Very often that's incredibly difficult
when you're in a newsroom with a couple of hours until deadline. That sounds like a really lame excuse, but it's just the reality sometimes. So I think the point is, as you say, to have it be more iterative, to have people get in touch. And, in my defence, this piece didn't claim to explain why. It just claimed to be showing the what. But people are actually really interested in the why, and so being able to go back to that piece and say, this is what people told us, and then not just saying this is what people told us but making it investigative, and finding out whether or not those reasons actually play out across prisons across the country, I think is really important. And I think the nice thing about data journalism, if it's done right, is that it questions the notion of who the experts are, right? Generally, when you're writing journalism, you call up an academic and you say, hey, you're an expert on prisons in the US, tell us what's going on. That's still super important, but actually the individuals within the data are so well qualified to understand what it is that's going on, and as long as you provide them with the tools to be able to get in touch with you, it's incredibly powerful.
It's a constant challenge, and I think I agree with you that these aren't perfect, but the idea behind them is to show their fallibility. It's very obvious that if a human has hand drawn this, it is not a perfect representation of the data set itself. If anything, I think that they're intended for me to counter what it is that I'm doing in my day job, 9 to 5, which is conveying precision with quite frightening... maybe I'm being a bit over the top to call it frightening, but, you know, I want people to question the data behind the visualisation more frequently. So I don't know if that actually answers your question, but the way I'm thinking about this stuff, it's all
about communicating. It's not saying these are perfect visualisations; it's about showing that these are not perfect visualisations.
I actually think people are super, super sceptical naturally, and I think a lot of people do naturally ask some of the right questions. I think source is always super, super important. So, for example, if polling data is presented, increasingly now people are like, "Was that poll 100 people? Was it based on 2,000?" And they get that there's a difference between the two. I would argue that even a poll that's based on 2,000 people is not necessarily the most accurate thing in the world, and I still think that journalists need to do more to communicate the weaknesses of things like polling to readers. So I think understanding sample size is super, super important, and understanding not just sample size, because that's quite an easy example, but sampling methods. So let's say, for example, very often when a poll is published, it will say "a nationally representative sample of 1,000 people", but sometimes that nationally representative sample, for example, included one black person who was weighted up to represent all of the black people in America, and it's somehow nationally representative. So I think being able to peel back even that top layer of the data and say "there were 1,000 people here", and say "how many people were there in this data set that looked like me?" before I interpret this, is super, super important as well.
Do we have time for a couple more, if there were a couple more? Maybe it's not.
I really think one thing that was quite good was, I don't know if you guys have seen, 538 actually published the probability forecast for different candidates, and by showing the kind of hump of all of the possibilities before showing the ones that they think are probable and their middle one, it really communicates a range of different scenarios to the viewer, which I think is quite effective. I actually think that it's been kind of lagging; we haven't really been thinking
about it very much, so we've got a bit of ground to catch up on in ways that we can communicate that uncertainty.
I think lots of people have done things that are based on deaths, so contrasting the number of deaths that have resulted from thing A versus thing B over time. The problem with all of that, though, is that it involves a lot of personal choices about what you're going to make that contrast with. So if you take, for example, the risks of guns in America, people on both sides of the spectrum will just choose a different reference point in order to convey the scale of risk. So I actually think one thing that can be quite informative is breaking down that risk by different demographic factors. So, for example, showing, something the New York Times did, the likelihood that you're going to end up in prison depending on your age and, very importantly, on your race. So helping readers to find their way into the data by presenting their particular demographic can be super important.
I think there's some really, really small things that you can do. At all points during this conference, it's come up at one point or another for everyone that's spoken, this idea of staggering things and telling it to you in a story. And where attention comes in, which I think is really interesting, is, I don't know if you noticed, but on the slide that I had from the Guardian where it showed that chart, it showed "four of six". And I think communicating to the reader, "listen, you're going to have to see another five of these, or another three of these", helps keep people focused, because it's not like, "Oh my god, is this going to be infinite? I don't know what to call for. I have five minutes before I have to get back to work, or I'm hungry", or whatever. Just being able to let them know how long they're going to have to stay. Before you watch a video, for example, you hover over it and say, am I going to be here for 20 minutes or for five
minutes? So I think that's one really, really good way to keep people's attention: to break it down into different sections. And I think that's part of the problem with some interactives. Some interactives just demand way, way, way too much, and sometimes they can just be split out into several flat graphics that tell a much more compelling story and require less of the reader.
And I think the process question is really, really important. So there's a piece that I'm quite proud of that I did with my colleague Andrew Flowers at 538, and what we tried to do was find out what was the most common first and last name combination in America. And what was nice about that is that every single step of your process tells you a story in and of itself. So you start off with "what are the most common first names?", and people are interested in that data in and of itself. And then you're like, "what are the most common surnames?" Again, people are interested in that data in and of itself. And then you talk about how, if you assume that the probability is even for any given first name and any given surname, this is what the data looks like. Again, kind of interesting. And then you say, actually, when you use a phone book and look at things, you can see how an even probability isn't the way that life plays out, because if your surname is Smith, you're probably not going to want to call your son John, because it's just so boring. So being able to explain that to readers, every single step was inherently interesting in and of itself. And I actually think that as practitioners you find that all the time: as you're doing your steps, you find some interesting little nugget before you move on, and I think remembering them and conveying them can really help keep people with you throughout that process.
I think this idea of going to them rather than expecting them to come to you is really, really important. This is a very, very weird example, on pubic hair,
and it was posted in this forum that is like "feminists for pubic hair justice" or something. I don't know why that is the first one that came into mind. But I think if you go out into the communities that are really, really passionate about the thing that you're doing and share it with them, and say to them, "please come and contribute to this discussion", it shows them that you're not just making the right noises; you really do care. I respond to the people that write to me in my inbox, and I think that's really, really important. And I go into the comments and I'll say, "Hey, that's an interesting point, what about this?" So I just think showing them that you're actually willing to engage is really, really important as well.
All the time, all the time. I'm trying to think of just one example. Again, I don't know why my mind is really, really in the gutter today, but one that came to me was someone said to me, which seems like a completely plausible theory to me, that because of smartphone usage more people have piles, because they're sitting on the toilet for so long just on their phones. And I was like, that's interesting. And yeah, there's no data on piles versus smartphone usage. But yeah, all the time.
But John's really, really nice. I haven't talked about this at all, but I think that we as data journalists don't do enough to actually do the process of collecting the data. So that's one reason why I showed that quantified breakup thing. So there's two examples where I did this, and I can understand why we're reluctant to do it, right? Because if I collect data from Guardian readers, those are Guardian readers; they are not representative of all of America in any way. But I think you can communicate that to people really, really honestly, and in a way it's even more transparent, because you say "we asked Guardian readers", and people get that Guardian readers are not representative of everyone. So I did it twice. Once, I wrote an article on redheads and asked people whether
or not they felt like redheads experienced discrimination. I kind of just did this little blog post and stepped away from my desk, and when I came back (I'd used a Google form) it had completely crashed, because 20,000 people who claimed to be redheads had very, very strong views on whether or not redhead discrimination exists. And I did another one that asked people about their sex habits, and the thing that's really, really nice about that is it was so easy to communicate the limitations of that data, because, for example, someone that entered some data said that they lived at 10 Downing Street. You know, you put that in the article, people get it: David Cameron probably didn't contribute data to my article. And it's quite powerful, because it still shows you, okay, we don't know anything about this phenomenon; this is the best we can do right now in terms of just saying something about what's going on. It's just fun. It's just fun, apart from anything else. Thank you.