So I know this is the last talk of the day and it's kind of late and everybody's just kind of sitting there wondering how much longer they're going to be able to stay awake. So the good news, I guess, is that there's not going to be any more code. That was the bad news. So the good news is we're going to do some exercises now to get your blood flowing a little bit, and then hope that maybe you'll stay awake for the next few minutes after that. So help me out a bit and participate in this short exercise. I want you to raise an extremity if the next few things apply to you. Okay? Have you ever seen a movie? All right, that's not too bad. Ever read a book? We're friends here, come on. A book that's not non-fiction. So an actual story in a book. Okay. Fifty Shades of Grey? Okay, no, never mind. Whoops. Have you ever been to the opera? Okay, quite a few. All right. Play, theater? Okay, pantomime? Okay, we're not going to push this too far.
But the point is, and I wish I had a handheld microphone, because I could just drop it now and walk off the stage, at least if I hadn't put three questions there, because I said why, how and when. But I think the why should be clear. Stories are all around us. We're all used to stories. They're just everywhere. And we're really good with stories. We understand them. We know them. And they work for us. They speak to us. And I want to talk a bit about why that is. And then talk a bit about how we can build stories using data. And then get to this last question, because it's kind of the same as the first one, but I'll get to that.
So why questions? Why questions? No, why stories? Let's try that, okay? Why so many questions on my slides? I don't know. Why stories? To answer that, or at least to start answering that, I'm going to tell you a story. And I should also tell you that most of my slides are going to be blank, because I didn't quite finish them.
No, but the point here is there is no visual for this. This is a little story that I'm going to tell you. This is about a young woman who walks into a bar. And this is not a setup for a joke, you know, a rabbi, a priest, and Irene walk into a bar. But a young woman walks into a bar. This is in Berlin in the 1920s. A young Russian woman who is doing her graduate studies at this point in a field that was new at that time, which was psychology. And by the end of this story, she will have made an observation that's going to change the way we understand storytelling to this day.
This is actually pretty cool, because you guys should see yourselves. This is what you look like. You're all paying attention, right? This is a story. A story makes you pay attention. This is why we tell stories, because we get people's attention.
Another reason why we tell stories is memory. I didn't find a good picture for memory, because you mostly find flash drives and all this nonsense. So I'm going to call this stickiness. There's also a book called Made to Stick that I can highly recommend. It's about marketing, but it's all about storytelling. How do you tell a story? How do you tell parts of stories to market, to get people to remember what you're talking about, and to get to them, I guess? Stickiness is important, and memory is important, when you're doing something that's a bit unusual in data visualization, which is not exploration and analysis of data, but communication: when you're trying to get something across to people, when you're trying to make people change their minds, make decisions, spend money, whatever it is. And for that, you need a mechanism that will actually get to them, so that they will remember what you told them. Because if you just throw a whole bunch of bar charts in front of them, they're not going to remember.
But the way memory works, and this is why I didn't want to put those flash drives up there, is not like a computer. We don't just store a piece of information on a shelf somewhere in our brains, and then get back to it and retrieve it. Memory is much more like these floating bubbles that kind of float through our brains somehow. And this is a slightly mixed metaphor here, but they have little bits of velcro stuck to the outside. And when you try to retrieve a memory, what you do is you kind of hold out another piece of velcro and try to hook those memories. And the more hooks you have, the more easily those things will stick, and they will stick together. So once you have one of them, another one will float by and stick to that. So the way memory really works is that you have to remember something to remember something else. And that way you can reconstruct a story. And what stories do is they give you some of the connective tissue between those pieces of information.
And that is why stories are powerful and have been used for many thousands of years, tens of thousands of years probably, to convey information, to pass information on to other generations, and to teach people about things. There are even some arguments that say that stories perhaps predate complex language, that we developed more complex grammar and more complex language to be able to express more complex stories, because we want to be able to express more complex and more abstract thoughts. I'm not gonna go in that direction any further though.
I want to talk a bit about how we can tell stories, and how we can tell stories when we talk about data. So far this has all been very general about stories. But how does this work when we talk about data? There are two parts to this. One of them is techniques, and the other one is structure.
And I will argue that there are techniques, visualization techniques, that are specific to storytelling. I call them presentation-only techniques. And I should perhaps also preface this by saying that this is not something that I'm 100% done with, or that's established knowledge. But this is stuff that I'm thinking about, and that I think is still interesting to think about. So I have a lot of questions, I don't have a whole lot of answers. But maybe I can have you appreciate some of those questions and some of that uncertainty with me.
So techniques. There's a technique that I particularly like that is called the connected scatterplot. Has anybody seen this before? A bit more exercise here. Okay, a few people. This was done by Hannah Fairfield a few years ago. And the way this works is that there's a vertical axis, which is the number of car fatalities, and a horizontal axis, which is the number of miles driven. And there's a data point for each year. Now if you were to just draw this as a 2D scatterplot, it would be a fairly boring, fairly sparse set of dots. But by connecting those points with lines, you're now creating something that is much more interesting, that actually tells you a story. There are little arrows on here that tell you which way time flows. This is from 1950 to 2012, 2011 I believe. And you can look at this and you can see that there is kind of a small structure. There are individual steps. And there's a larger structure where you can see how the direction changes. So sometimes the line goes up, essentially up and to the right. Sometimes it goes down and to the right. Or even to the left, backwards a little bit on itself. And there are little annotations there that have little sparklines that tell you which part of that overall shape they're talking about.
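To make the construction concrete, here is a minimal Python sketch of the idea behind a connected scatterplot: put the points in time order, connect consecutive ones, and note which way each segment heads. The numbers are invented purely for illustration (they are not the data from the graphic), and `segment_directions` is a hypothetical helper name, not part of any library.

```python
# Hypothetical (year, miles_driven, fatalities) rows, invented for illustration;
# these are NOT the values from the published graphic.
points = [
    (2002, 12.0, 49.0),
    (2000, 10.0, 50.0),
    (2003, 11.5, 47.0),
    (2001, 11.0, 52.0),
    (2004, 12.5, 46.0),
]


def segment_directions(rows):
    """Connect the points in time order and describe each segment's direction."""
    ordered = sorted(rows)  # tuples sort by their first element, the year
    segments = []
    for (y0, x0, v0), (y1, x1, v1) in zip(ordered, ordered[1:]):
        horizontal = "right" if x1 > x0 else "left" if x1 < x0 else "flat"
        vertical = "up" if v1 > v0 else "down" if v1 < v0 else "flat"
        segments.append((y0, y1, horizontal, vertical))
    return segments


for start, end, horizontal, vertical in segment_directions(points):
    print(f"{start}->{end}: horizontal={horizontal}, vertical={vertical}")
```

A segment that heads left is a step where the line doubles back on itself, which a plain line chart with time on the horizontal axis can never do.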
And as you see more of this, as I'm revealing more of this here, you can see something that you haven't seen before in a line chart, which is a loop. And then you remember that of course the horizontal axis is not time, because time goes along the line, but number of miles driven. And so as the number of miles driven drops, perhaps the number of fatalities was also dropping, and that created that kind of loop. And the whole thing also makes for a very pleasing, very interesting page layout and page design. This is one of my favorite news graphics. And I think it's a really good example of how you can turn a small number of data points, this is 60 data points or so, into a very compelling story. Of course there's not just the data points, there's also a little bit more background there. But you turn something that is just a bunch of numbers into something that is interesting to explore, that people will spend time with and want to know what it is and want to read those annotations and read all the stuff around it.
So these techniques are really powerful. Or this technique can be very powerful, if it's used the right way. And you've seen an example, you may not even have noticed it, but there was an example in Jeff Heer's talk. And you will see a few more of these, I think, tomorrow when Hannah talks about these. The nice thing about this technique in particular is that it's actually a really bad technique, because if you were to talk to a lot of visualization academics, they would probably look at this and they'd say yes, but this is not general. This is not gonna work for this and that, and they're right. So here's an example. How to make a hairball and not even use a node-link diagram. There you go. And I would even say that for most data sets that you can find, it would look like this. So you get all kinds of really horrible stupid hairballs when you play with this. This is unemployment data, labor force data versus unemployment.
I think this is all change, indexed to the year 2000 or something like that. And I thought it would be a good idea, it would be a fun demo, and it was just pointless. Or it's actually a good bad demo, I guess. But in many cases, it doesn't actually work. But in a way that's the strength of this technique, that it works for some things really well, and for many others it just doesn't. And I think that that should be okay. That is not a common sentiment you will hear in the visualization world.
A case in point was this thing, which was published a few years ago, I think 2008 or so, by Amanda Cox and Lee Byron and a few other people in the New York Times. And this is called the streamgraph. And there was a paper about that at InfoVis, I think the year after. And there was a huge controversy about how bad this basically was. Anybody remember that? Does anybody hang out in those places? Okay. So there was basically a lot of "this should never have been accepted as a paper at InfoVis." And it's supposedly terrible because, so what you see here is box office numbers for a number of movies over time. And they're all stacked. And if you know anything about stacked bar charts or stacked area charts, you know that the problem with those is that the baseline is different for all of these. So it's hard to compare across, because the lower segment is going to push up the baseline. And so you can't really see what the shape is of something on top, because it's being distorted by the lower one, right? And that is totally true, and this is true for this chart. But it's also not the point. The point of this is to get people who don't know a whole lot about this data to start and explore it, and to see the large differences, to see the large patterns that are going on here and look at those. And those you can still see.
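The baseline problem described above can be shown in a few lines of Python. This is a sketch of plain stacking from a zero baseline, which is an assumption made for simplicity; the streamgraph itself uses a different, shifted baseline. The two series are made-up numbers, and `stack` is a hypothetical helper name.

```python
def stack(layers):
    """Stack series on top of each other; return a (bottom, top) edge pair per layer."""
    baseline = [0.0] * len(layers[0])
    bands = []
    for layer in layers:
        top = [b + v for b, v in zip(baseline, layer)]
        bands.append((baseline, top))
        baseline = top  # the next layer sits on top of this one
    return bands


# Made-up values: the lower series spikes, the upper series is perfectly flat.
lower = [1.0, 5.0, 1.0]
upper = [2.0, 2.0, 2.0]

bands = stack([lower, upper])
upper_bottom, upper_top = bands[1]
thickness = [t - b for b, t in zip(upper_bottom, upper_top)]

print(upper_top)   # the drawn top edge of the flat series
print(thickness)   # the actual values of the flat series
```

The upper series never changes, yet its drawn edges follow the spike in the series underneath, which is why shapes higher up in a stack are hard to read while the large overall pattern stays visible.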
And it's a compelling, interesting chart that was printed on half a page lengthwise, or two-thirds of a page lengthwise, in a newspaper, which is an impressive, huge thing to look at and to explore and to actually spend time with. And so this is a way of telling a story. There's also, I forgot to say this earlier about the connected scatterplot, of course a time element. So a large part of me arguing for and against stories involves time and narration and narrative. And so we have an element of this here. It's not quite as strong as in the connected scatterplot. But there is an element of time here as well. So techniques that work well in some cases, especially when they are well suited to engage people and to get their attention and to keep them hooked at least for a few minutes, I think are really good for this kind of task, I guess. Even though, again, just like before, this is not a general technique. I will not tell you to use this instead of scatterplots or bar charts for exploration of data, no. But this is a really good way of getting people interested and getting them to pay attention.
Another one I'm just gonna mention, because it's something I've spent some time on recently, is what's called isotype. Anybody heard of isotype? Isotype charts. Yeah, a few people, okay. This is an idea that was developed in the 1920s in Vienna by Otto Neurath and his wife, Marie Neurath, and a guy named Gerd Arntz. The way this works is that it uses little icons, little symbols, to represent multiples. So this is the number of weavers working at home, those are the black figures, or in factories, the red figures, and how that changes over time. So this is over the 19th century, and it essentially tells you the story of the Industrial Revolution. Those little red boxes there are factories, and for some reason people sometimes get confused by this.
The things that are sticking out are smokestacks, okay? So that's the smoke there. In the past people have been confused by what that actually is. I don't know what else it would be, but they're factories, okay? They're steam engines, that's what they do, they produce smoke. And so you can see how the number of people changes over time, and if you look at the legends there, it's a bit hard to read perhaps, but each little person represents 10,000 workers. And each little blue bundle represents 50 million pounds of product. And you can see how that was increasing very dramatically over time, whereas the number of workers was essentially flat. It didn't change a whole lot over that entire century. But this is a technique that's interesting because it has shapes and it has stuff to look at, and you can actually remember this much better than a bar chart, because bars don't have shapes. In fact, we've done some work on this that's just gonna be published at a conference in a few weeks.
And also, since I'm a recovering academic, I have to show this next chart, because there's really no way of not doing that, I guess, in a talk. So there is Napoleon's March. Anybody seen this before? Okay, I'm gonna just very briefly explain what this is for those three people who may not have. So what you see here is this tan ribbon that's going left to right. That is Napoleon's army, the number of men in Napoleon's army. It starts at 422,000, and the width is the number of men. And it goes from a place called Kovno in Lithuania, or what's today Lithuania, to Moscow. He ends up with 100,000 in Moscow, turns around, goes back, and comes back with 10,000 of his 422,000. I think it's actually a connected scatterplot in a way. It connects points on a map. It also represents the number of people as width. And I think it's often hailed as this amazing example of statistical graphics.
But what it really is, in my opinion, is a super narrow chart for one particular purpose. It worked exactly for that, and for nothing else. I challenge you to produce something using that same idea that actually works for some other data, not for the same data. People have tried different approaches for the same data, and they've mostly been horrible. So try this on different data, and I almost guarantee it's not gonna work. For one, it's a good thing that Napoleon invaded Russia, and not the other way around. Because we can read this left to right first, and so it makes a much more compelling story. So people say that this is telling a story, which is only true because we read it a certain way. And then there's actually another one on the same sheet. Charles Minard, who produced this map, had a map of, I forget the name, but an invasion that was going from southern Spain into Italy. And the problem was that the direction changed a bit, so that army was going north. And so what he did, he actually rotated that map by about 45 degrees, so that it would actually work, so it would be read left to right. So if you wanna try this for the Oregon Trail, or for something like that, good luck. So that's what I mean though, this is really specific. And I think it can be very valuable to do very specific things. But we also have to appreciate that they are very specific, and not pretend or try to make everything as general and as generic as a scatterplot or a bar chart.
Now the second part of the how is about structure. And I'm interested here in low-level structure. So I used to talk in the past about narrative arc and how this all works. But I'm gonna talk about very low-level stuff today, because I think it's actually perhaps even more useful than that. I'm going to show you an example of what I mean by that. This is perhaps my favorite story from the news. This is from a few years ago.
I think it was done in 2009 or so, about the Copenhagen climate talks that took place in 2009. And it's a very nice stepper. So Jim was mentioning this earlier. This is an example of a stepper, no scrolling here, simple stepping. And it walks you through a little argument. So I'm gonna show you this very briefly. It starts with this map, which doesn't do a whole lot. But then we get to the interesting part. So this is now emissions for four different countries: China, the US, Europe, which is totally a country, like Africa, and India. And you can see total emissions here. And these talks are always about seeing things a certain way. So looking at total emissions is one way, but that's only one way of looking at it. And then you could look at things like emissions per capita, which of course looks quite different, because China and India have so many more people. So that looks quite a bit different than before. And then, of course, emissions come from industry. So why not look at a measure like GDP? And that's very nice for China, because China's GDP has been skyrocketing. And so if you use that number that's been growing almost exponentially to normalize your data, it looks like the number is actually dropping quite a bit. And then you can say, well, let's use that and extrapolate into the future, and expect that GDP will be increasing like it is right now, like it was at that point. But in terms of absolute numbers, this means that. Which is not quite as good. And so apparently at this point, the US said, well, actually let's try this with absolute numbers instead and do this.
So this walks you through a little argument. And the nice thing about this is that it has a sort of punchline. But in particular, what's important about it is that it's individual steps. So I'm not against scrolling, but I am against the idea that just because you can scroll, you can do things that are super continuous and people will actually understand what's going on.
Because we know from psychology and from cognitive psychology research that when you think about things that happened, trying to find the general term here, what you remember are events. You don't remember all the time in between. So when somebody tells you how to come here from the Park Plaza, they're gonna tell you where to turn. They're not gonna tell you to walk, walk, walk, walk, walk, walk, walk, and then go. All these boring bits will just not even be there. You wouldn't even think about doing that. So this is why I like this kind of representation, because it shows you that. It shows you the important parts here. And I'm not against animation. Well, actually I am against animation, except when it's about transitions. So transitions are really useful. And actually, what I left out in the example when I showed you this, I just showed you the images. Those transitions, those steps, are animated, which is very useful, because it's much easier to follow what's going on when there's a change.
But when you look at this, you might look at it the way it's laid out here, which I just like to see because that shows me everything, and I can understand the structure of this little narrative. I can look at this and say, well, this reminds me of a comic. And so why don't we think a little bit about comics and how comics work? And how people have been drawing comics and constructing comics. I really like this little image from Scott McCloud's book, where he quotes Will Eisner, who calls comics the sequential art. And the interesting thing about McCloud also, and I think this is an argument he's also taking from Eisner, I'm not entirely sure who actually started with that, is that he talks about how things happen between the frames. So each frame is a snapshot. Each frame is an event, but there's time passing between them. And that time can be useful, because that is something you can imply, but you don't necessarily have to show it.
And what's also happening here is that there is a change between frames, between adjacent frames. And there are different ways of managing those changes, or of showing different things between those frames, I guess. It's a bit too fancy when I say managing those changes. And McCloud has six different steps that he has here. So there's this book, I should mention, Understanding Comics by Scott McCloud. If you haven't read it, I strongly suggest you do. It's really good, it's a lot of fun to read, it's an actual comic. And it's just extremely well laid out and you can learn a ton from it, there's a lot in there. I just stole this part from it for this talk, where he talks about these different steps from one frame to the next.
And when you think about those, you will find that if you want to tell a story about anything, whether it's a story about people or about data or about anything, some of those will apply. So a change from moment to moment, where you go from one step in time to the next, where things happen over time. Or action to action, which is something happens, something else happens. Subject to subject, where you basically, like in a movie, change the frame, or the cut. Scene to scene, where you have a larger change between what you're looking at. Aspect to aspect is basically the same thing, but seen from different points of view. Like you just saw in that Copenhagen example, there will be different aspects of the same thing. And then there's the non sequitur, which is just random things that you can juxtapose. Which is actually surprisingly common in news graphics. There's this whole thing about how news graphics start by giving you kind of this quick overview, and then they go into more details. And so some of the examples, even some of the ones that we've seen today, in my opinion, are somewhat non sequiturs. But it's kind of interesting to watch that sometimes, they're not necessarily constructed like stories.
Now, this book is about 10 years old. No, my math is off. It came out in 1993, which makes it 20-something. That's not 10 years old. But anyway, okay, that's 23 years old. So it's been around for a while, okay. It's still a good book, highly recommended. But people have built on it. And one of those people is Neil Cohn. Anybody heard of Neil Cohn before? Okay, not many people. So I think he's a professor now at UC San Diego. Or maybe he's a post-doc, I'm not entirely sure. But he's done some really interesting work around comics, and the cognitive side of comics. And he has this classification. He basically argues a little bit against McCloud. But the two actually work really well together in a way. He talks about these different frames and what the frame itself represents. So he's not so much interested in the space between the frames as in the frames themselves. And so he has this classification of frames that he uses to understand what the structure of a comic is. And he uses a lot of these four-frame examples. So I'm gonna show you two of those.
This first frame here is what he calls an Establisher. And he has these little letters for that. So E is for Establisher. And this Establisher tells you where you are. So before I showed you this, you did not know what this next slide was gonna be, or what this comic was gonna be about. This is what this tells you. It tells you that there's this cup sitting there. Then there's the Initial. The Initial is the start of the action. So this is when something is starting to happen. Then there's the Peak. And the Peak is where the action peaks, where the main thing happens. So it slaps him. And then there's the Release. And of course this is a bit funnier when there's not a guy talking here about it. But the Release is basically the part where it's funny, because now you're past that peak point and you're kind of waiting for the reaction.
Or you're kind of waiting for what's happening next. And he calls this the EIPR model. So E for Initial, I for, sorry, E for Establisher, I for Initial, P for Peak, and then R for Release. And that's a fairly common model in lots of these little four-panel comic strips. There's another example that shows this: the doggy sees a ball, doggy runs after the ball, doggy gets caught in a soccer game, and then hides somewhere. So it's a little comic, but it establishes a little structure. So the individual items are classified on the bottom level as individual frames. But the whole structure, the EIPR, essentially makes an arc, a narrative arc. And you can imagine mapping those ideas to those frames and thinking about how that actually works. And I would argue that there is a certain amount of that in here, and that there are a few other things going on here. But this actually works fairly well as this kind of structure, even though it's six frames, or seven actually, but I'm ignoring the first one.
And so these steps are important because, as I said, we remember those events. And you can think about data, and you can think about the views that you create, and how those actually fit together, and use those kinds of ideas for structuring a story. And when you look at how steps work, and how steps are actually expressed, even in some of the scrollytelling examples, you will find that people do that. So this is an example that just came out last week. The Washington Post, I think, did this idea of scrolling up Mount Everest from sea level. But even though you can scroll forever in this thing, you will see that there are events marked on this overview, and there are events as you scroll along. If this were just scrolling up all the way, it would be super boring, and you wouldn't actually remember what you did along the way, because you would just keep scrolling through drawings of Mount Everest.
And then here's my last part, about when.
When do you tell stories? And in a way, this is really the same question as the why, because when do you tell stories? Well, when those things apply that I talked about earlier. So another blank frame here. Who was this person that I told you a story about at the beginning? Anybody remember her nationality? Okay, what year? 1920s, okay. Where was this set? Okay, why do you remember that? I mean, seriously, this is not going to get you fed. It's not going to get you laid. Maybe you can try this out tonight, but it's not. You know, there is no reason why you would have to remember this, but there is maybe one reason that you would remember it, and that is because I haven't told you who this woman was. Anybody have a guess? Lynn knows. So her name was Bluma Zeigarnik, not a household name, but a very, very famous psychologist from, I guess, the first half of the 20th century. And there's this effect that's called the Zeigarnik effect, which is about how we remember things that are unfinished. And what she found in this bar or in this pub or whatever she went to is that the waiters there would take orders and would remember them, even for large groups, and they would remember them really well, until they were paid, and then they would just forget them. And that is an effect that's really, really effective in making you remember things. So we remember the story because as the story unfolds, it's unfinished, and we have to remember the pieces of it until the end. And this is also a very effective way of getting people to keep paying attention to things, by using something that's called a cliffhanger. So when an episode of some sort of series ends in a cliffhanger, then you wanna keep watching. You're gonna tune in the next week, in the old days, or you're gonna binge-watch the whole thing on Netflix. And that's why, because it teases the next thing. So, when do we want to tell stories?
Well, if you're looking for attention, memory, or you just wanna see a guy hang off a cliff. All right, thank you so much, and I'm happy to answer questions at this time.