Okay, so it is my joy to introduce Fern Bishop, who will talk to you about the visualization jungle.

Oh, I am miked. Cool, everyone can hear me, right? It's very weird being able to hear yourself. So as he said, my name is Fern. I am a PhD student at St Andrews and I work in visualization. But don't worry, that doesn't really mean that I know an awful lot, so if you have any questions, or you want to tell me no about something I say, then please do.

So what I'm going to talk to you about today is the visualization jungle. Probably you've all seen visualization. I'll talk a little bit about what that is, but it's not always very obvious when things are going wrong, and it's useful to have an idea of the different ways visualization can be done, so that you can notice it.

So let's start by talking about what visualization is. This is the most important visualization you'll see today: how much pie I've eaten and how much pie I have left to eat. It's a very awkward visualization, because really I want both halves to be as big as possible, which doesn't geometrically work. But the idea is that visualization is a way of representing data visually so that you can understand it. If I show you just a set of numbers, then it's really hard to tell if something is going on with them.

You've probably seen some things like this. These are all very common visualizations which you will see out and about. You'll see them in the news, and you have probably had to use them at work, things like that. And as you can see just from this, there is a great variety of visualizations, and they work in very different ways. Like I said, it is very hard to know what is going on when you're just using numbers. So if I show you this data set here, this is four sets of data... I'm so sorry, did I break everything?
No, it's fine.

Okay, so if you see, there are columns going down, one, two, three, four, and those are all XY coordinates. If you do some basic statistical analysis on them, they all look about the same: the sum is the same, the average is the same, they all move around in what seem like kind of similar ways. But if you plot them, then you'll see that actually they are very different. This particular set of data is called Anscombe's quartet, and it makes it very easy to... maybe I should stop wandering around so much. It makes it very easy to see why visualization is powerful, because a lot of these things are very different. You can see some of them are very spread out, some of them have very solidly defined lines, and yet if you try and draw a line through them, they all seem like they have the same statistical properties.

So why is visualization so powerful? Well, writing is really quite new: humans have been doing it for about 5,000 years. But we've had to use our eyes for a lot longer than that. Anatomically modern humans appeared about 200,000 years ago, so that's 195,000 years of difference. And throughout all of that time we've needed to run from lions and tigers chasing us mercilessly, and we've had to find delicious berries. So our visual senses have got very good at noticing things that stick out, which makes us very good at spotting patterns.

But it is not always so easy to tell when something is going wrong in a visualization. So what I want to do is take you through some visualizations that have made me angry in the past and talk about why they make me so angry, and at the end of this talk I'll also give you some guidelines for how to not do that.

So if we start with what is quite a simple one. These are salaries in tech. We have people who are paid too much, the cream of the crop. We have people who are paid enough. And does anyone have a guess at what the third section would be? Paid... sorry? All of us.
Yeah. Probably not. So in this visualization it was women, and welcome to why this annoys me. I am happy to say that in this particular instance it was a parody, from a very good set of visualizations which highlight some of the diversity problems in tech. But you will occasionally see visualizations that do this. It's basically trying to compare two things that are not the same. So if we're taking a tour through the visualization jungle, what kind of doesn't make sense? What is a bit split? Well, it's kind of like a platypus, right? It's kind of a duck, it's kind of an otter, it kind of doesn't really know what it is.

Let's move on to the next visualization. We're all in the UK, so we probably all like our pubs, and this map is of every pub in the UK. You can probably tell that it is not a very useful map. Would anyone like to try and point out their local to me on this? So you can tell it's quite hard, and really the problem here is that there's too much data. That is another really easy thing to do in visualization: to say, I need to show everything, so I'm going to put it all into a graph. If you put too much in, you can't see anything. If we're going to go for the animal comparison again (I'm sorry, this next image wigs me out a little bit), it's kind of like ants. When you see a whole group of ants moving around, you can't actually pick out a single ant anymore. They just become a cluster. They become an identity of their own.

Can anyone see what's up with this visualization? Yeah, the y-axis. So if you were to look at the height of the people in this, you would have to believe that men in the Netherlands are about three times as big as people from the Philippines, which is probably not true. I don't know, I've never been to the Philippines, but I've been to the Netherlands and they weren't that tall. Maybe people in the Philippines are really short, but probably something is going on, and as someone called out, it's the y-axis.
The y-axis here is starting at 1.5, which is really exaggerating this change. So what animals do we know that have a bit of a problem with scale? I think the giraffe, right? Giraffes have a real issue: the head is so much higher than everything else, and it's just a little bit nonsensical.

Here is a useful trick if you want to tell your boss you're doing well when you're not actually doing that well. So, for those who are lagging behind (it took me a little while, actually; you guys were all pretty on the ball): this is cumulative annual revenue, which means you take one year and then you add the next year's revenue on top of it, and you just keep adding up, so it will always go up. As you can probably see, there is a little bit of a curve at the bottom, and perhaps you can see what is coming next. If I show you the actual annual revenue, it ain't so pretty.

So what do we know? This is basically a good trick because you're picking your data, right? You're deciding what you want to show and you're choosing it accordingly. You're being very picky. The most picky animal I can think of is the panda, who lives solely on bamboo. And also it looks very cute, but I promise this wasn't just an excuse to put a panda in my presentation.

We're going to move on to one that makes me a little bit angry, and then you'll see me get a lot more angry afterwards. So... It's really nice because I don't need to present at all.
I can just show you visualizations and leave you to get on with it. So the problem here is that there isn't actually any relationship between the data and the bars. 13% isn't five times as big as 34%. One of the reasons that I particularly wanted to give this talk is something you may have noticed in the bottom right of that visualization, which is that it was presented on NBC News. These aren't just things that someone is slipping under the table in their weekly meetings with their boss. These are things that are being used in the news to deceive people, which is bad. So in this particular case, we're going to go for an animal... I think this is quite egregious. They are lying at this point. So we're going to go with the most heinous of creatures, and we have encountered the evil hyena.

And this is the last of the lions, tigers and bears, of which you have seen no actual lions, tigers or bears. With this visualization it's kind of tricky to spot what's going on, which is one of the reasons it's so dangerous. Can anyone spot it?
Yeah, so the y-axis, but for a different reason this time. Instead of just being cut, it's been flipped. This particular visualization was showing gun deaths in Florida after the instigation of a particularly egregious law. If you look, it seems like the number is going down, but if you reverse the y-axis, the law actually had a pretty significant impact, going up. Which makes me really angry, because there is no way that wasn't intentional. For some reason I've given this a pretty cute animal even though it makes me very angry, but the problem is that it blends in. Just like a chameleon, it is very hard to spot when people do a trick like that.

So let's talk about how to not do these things. Making good visualizations is challenging, and there are some guidelines I can give you, but the first thing I'm going to tell you is that everything I'm about to say will be wrong at some point in time. There are always reasons to break the rules. We all know rule breaking can sometimes have more significant impact, but you need to be careful about when you're doing it and when you're not.

So the first thing is: use appropriate visualizations. The source on this slide is a really good website called the Data Viz Project. They have an excellent selection of visualization types, and they'll help you figure out what you want to use. For example, bar charts are really good if you just want to compare numbers, because we're very good at comparing heights. We always want to know how many more berries we're getting, so figuring out larger quantities is something that our visual system is pretty well attuned to. Pie charts are good for showing parts of a whole. Maps are good if you're doing location-based stuff. And before anyone who is into visualization shouts at me: some people don't like pie charts, because we're not very good at judging angles. So you can just turn it into squares, and then it is a little bit easier to judge. You can do it, you can not do it.
I'm not here to preach for or against pie charts. That's someone else's problem.

You can use meaningful colors. In the example we have here, we're showing the colors of fruits, and it is quite natural to use the color of the fruit that you're showing. My personal work is with young children, creating visualizations and looking at how they do it, and I see this in primary school children, even if it doesn't always make sense. I've had people say, "I made this red because maths is hard." You look for meaning in colors. So if there is the possibility of using a meaningful color, then do. One of the places this trips up is gender, like pink and blue, and sometimes people are a bit unsure about that. If you really want to break conventions, then use completely different colors. Don't use pink or blue; use red and green, or something that color-blind people won't be angry at you for. But don't just use pink and something else, because that makes it harder to spot that you've done it.

Follow conventions. We've talked about the y-axis a couple of times. Usually zero is at the bottom, so unless you have a really good reason for flipping it, don't. Seems simple; apparently not.

And use sensible scales. There are times when just a small change is significant. For example, if it was one degree hotter here today, you maybe care about that, and so you would truncate the y-axis in order to show that variance. That is also common in stock markets. But with heights, the way that this ended up visually made no sense.

Filter and highlight your data. Like I said, sometimes you have a lot of things you want to tell people, and that is okay, but you need to help them pick out the important parts. So rather than showing every pub in the UK: I live in St Andrews, and I really only care about where the ones in St Andrews are on a day-to-day basis. So I tend to not look at where every pub in the UK is, as much as that would be a really good road trip. If you can't filter, then you can highlight. In this second section here, you've got a couple of points that are important that have been shown in bold. You can also do small multiples: you can have multiple visualizations showing lots of different things.

Simplify where possible. Everyone loves the Underground map. It is the coolest of all maps. If you try and navigate London above ground using the Underground map, you'll end up very lost very quickly, because it has no relationship to how London is actually laid out. But when you just need to know what connection to get on the Underground, it's fine. It's okay to simplify things so long as the message that you're showing is true to what the data is showing and you're not trying to distort it in some way.

Talking about not trying to distort it in some way: don't lie. That one seems quite obvious, but I'm going to put it up here anyway. This is what that chart should have looked like. You can see it's not that close.

So to summarize all of those points: don't defy expectations. It really is that simple. Just stop and think about how you're doing it, and whether it is the way that makes the most sense, and if it isn't, change something. Go on the internet, look at what other people would do.

I'm going to take you through a really cool visualization which tells a story, which is one of the ways of helping things be more memorable. Has anyone seen this visualization before? Hey, cool. So this is a really neato visualization, which is also very, very old. It shows Napoleon's march on Moscow, and it does it in a very interesting way: you can see the whole journey. You can see the number of troops he had at the start.
That's the beige line, and you can see it dwindling as this march occurred. He was marching on Moscow, and the Russians were pulling back and removing things from the land as he went, so a lot of soldiers had trouble not because of fighting but because of other reasons. And this is the number coming back, so you can see him going back the way he came and the continuation of the dwindle. You can pick out really interesting points in this visualization, because you can see where a camp split off and then rejoined on the march back, and suddenly the numbers jump up again. And you can also see the effect that temperature had over time on the march back.

A few more interesting visualizations. I think another really useful way of showing how a visualization can tell you so much more just by how you look at it is: can anyone tell me where the one on the right is located? New York, right. It's very simple, because it has that big blank space in the middle, which is Central Park. That's one of the powers of using maps for geography-based information: you can see where things are. Maps: great, huh? Who'd have thought it?
You can see the one in the middle has a very strong visual impact, because there are a lot of animals clumped at the top. This is the lifespan of those animals. You can see the poor lonely tortoise taking a really long time, but living for a very long time too.

The important thing to remember is that if you're making a visualization, people aren't going to spend that long looking at it. You're going to spend a really long time making it, and they're going to look at it for 10 seconds. Or not, if it's a really beautiful visualization that tells a story. But just assume that they won't. Make it as easy as possible to read, and remember they may just be skimming.

So in summary: visualization is great, but it isn't always accurate, as you've seen. Think about the visualization you're looking at when you see it, even if it's in a trusted news source, because sometimes they can just lie. Sorry, it's a little bit of a bee in my bonnet. Stick to conventions if you're making them, and go ahead and look at interesting things on the internet. I've got a web page that I'll put up at the end, which is the next slide in case anyone's desperate to leave, where I've got some links for those who are interested, and you're welcome to come and talk to me. So thank you for your time.

Do we have time for questions? Like five minutes for questions, if you want to. Otherwise, I'll go and sit outside and you're welcome to come and talk to me. And yeah, I've got some related reading if anyone's interested. Thank you for joining me on this.

Yeah, okay.

Do you have any tips for visualizing multivariate data? Because this is of course somewhere you can lie with your data: just hide it in some relationship between two variables.

So the idea of small multiples is that rather than having everything in one visualization, you try and split out meaningful bits into separate visualizations. You might end up overlaying things, but it can be helpful. I'd need to look at the data to say more, but come talk to me afterwards and we can talk about it more.

Are there any... do we have time for more questions? Any hands? Okay, in that case, yeah, thank you very much for coming.
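
For readers who want to check the claim made early in the talk, that four sets of XY data can share almost the same summary statistics and still look completely different when plotted, here is a minimal sketch in Python. It is not part of the talk. It assumes the four sets shown on the slide were the standard published values of Anscombe's quartet, and the names `QUARTET`, `pearson`, `summarize` and `SUMMARIES` are made up for this illustration.

```python
from math import sqrt
from statistics import mean, variance

# Standard published values of Anscombe's quartet (assumed to match the slide).
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_SHARED,
          [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED,
           [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED,
            [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def summarize(xs, ys):
    """Collect the summary statistics that hide the differences between sets."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "var_x": variance(xs),  # sample variance
        "var_y": variance(ys),
        "r": pearson(xs, ys),
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, stats in SUMMARIES.items():
        print(name, {key: round(value, 3) for key, value in stats.items()})
```

Running it prints one row of statistics per data set. The rows agree to about two decimal places, which is the point of the example: only a plot of the points shows that one set is roughly linear, one is curved, and two are dominated by a single outlier.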