Great, thank you. So the title of my talk is Do You Know Nothing When You See It? And when I say nothing, I don't mean nothing like television static, because that's actually not nothing, that's something, that's the image, it just got garbled as it was going through the air. I mean something more like this, but really not even this, because this is random noise that I generated with my computer, so it's pseudorandom, and probably your awesome human brain is finding patterns, Deep Dream style, already. You can see the bunnies and hearts and clouds in my random data. So our goal is to be able to identify when something that we've created is nothing. And in order to do that, I'm gonna try statistics. I am a statistics professor, and that means whenever I get in front of a group of people, I actually try to sneakily teach them some statistics. So the question is, what is statistics about? And it's about many things. It's about variation, which is one of the main themes that we're gonna talk about here. Sometimes there's modeling; I'm gonna ignore modeling almost completely. And when you take an intro stat class, or if you're doing basic science, you're often asking the question, are these two numbers different? Which is another way of saying, is this one number, the difference, different from zero? And then sometimes you wanna know, for this number, what are some other reasonable numbers that we could have seen if we hadn't seen this one? And in order to answer those questions, we need context about the variation. So the first example that came to mind when I was preparing this talk was ants. I don't know why. So imagine that you had some ants, and you had some big-looking ants and some little-looking ants, and you thought, well, maybe I just got unlucky. I got five big-looking ants and five little-looking ants, but really they're both draws from the same population, and overall the groups don't have any difference in size.
But you found, you observed a difference in the group sizes of three quarters of an inch. You and I know something about ant size variation, so that sounds very large, and we'll see that it actually was, okay? In another context, imagine you had some tug-of-war game set up. You had two teams, there's men and women, they're split into the blue team and the pink team, and we measured their heights, we found the average height for each team, and we found that the average height between those two groups was different. And again, we observed a difference that was about three quarters of an inch. And we wanna know, is that significant? Are those two teams really different, or did we just observe a difference by nature of the selection process, the random generation that we're assuming is happening behind the scenes? So if you've taken a standard statistics class, you're thinking confidence intervals, you're thinking point estimate plus or minus the standard error times something that has to do with some distribution, and maybe you're thinking about some pages in your textbook that had some standard error calculations, there's some square roots, there's some fractions. Okay, so we're thinking about a difference of means, so maybe it's just that first one. But really it's worse than that, because there are all these different formulas for if you have different group sizes, and if the variances are different then you have to pick those things out. And then it actually is worse than that, because once you have your standard error, then you have to know the degrees of freedom that you're gonna look at in your distribution. And even I, and I have a PhD in statistics, get kind of the heebie-jeebies if you ask me to think of the right standard error computation.
Okay, so let's say that you are better at statistics than me, so you were able to come up with the right standard error calculation, you found the degrees of freedom, and now you're gonna go look at some idealized distribution, and we call this a sampling distribution. So it's not a distribution of data, it's a distribution of statistics, which are numbers that were computed from other numbers. In this case, we had the average heights of the two tug-of-war teams, and then we were looking at their difference. And then you could come up with some confidence interval, and you could see, if there was really no relationship between the team and the height, how often would we observe a difference of three quarters of an inch. I'm gonna argue that's not the way that you should do that problem solving. Instead, you should use randomization. And randomization is just what it sounds like. You have two variables that you think might have a relationship. You wanna come up with a sampling distribution, and you want it to be the null distribution, the distribution of essentially nothing. And so what you're gonna do is you're going to take the values, the labels here, and you're gonna mix them up, and you're gonna compute the group height means for those different groups in your mixed-up data, and you're gonna compute the difference in the heights. So you can see, already, sometimes you're getting a positive difference, sometimes you're getting a negative difference. I think one of those turned out to be zero. We're gonna do this, like, a thousand times, and then we can look at the distribution of that statistic. This distribution is centered around zero because it's a null distribution. And then we can calculate, where is 95% of the data? What's the middle 95%? And sort of say, if there really was no difference between the heights of these two tug-of-war teams, what sorts of differences might we observe?
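The shuffle-and-recompute loop just described can be sketched in a few lines. The talk's own code is in R; this is an equivalent Python sketch, and the heights and team labels here are made-up illustrative values, not the talk's data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical heights (inches) and team labels, five people per team.
heights = np.array([66.0, 70.5, 64.0, 68.0, 71.0, 63.5, 69.0, 67.5, 65.0, 72.0])
teams = np.array(["blue"] * 5 + ["pink"] * 5)

def diff_in_means(h, t):
    # The statistic of interest: mean(blue heights) - mean(pink heights).
    return h[t == "blue"].mean() - h[t == "pink"].mean()

observed = diff_in_means(heights, teams)

# Build the null distribution: shuffle the team labels 1,000 times,
# which breaks any relationship between team and height, and recompute.
null_diffs = np.array([
    diff_in_means(heights, rng.permutation(teams)) for _ in range(1000)
])

# The middle 95% of the null distribution: differences we could plausibly
# see if team membership had nothing to do with height.
lo, hi = np.percentile(null_diffs, [2.5, 97.5])
print(f"observed: {observed:.2f}, null 95% interval: ({lo:.2f}, {hi:.2f})")
```

If the observed difference sits comfortably inside that interval, it looks like nothing; if it falls outside, it would be weird to see under the null.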
And so in the case of the tug-of-war teams, if there was really no difference, we could observe height differences anywhere from negative four inches to positive four inches. Our observed difference of three quarters of an inch is tiny, and so we think that that really is nothing. Even though we saw a difference of three quarters of an inch, it's nothing. For the ants, we can do the same thing. That distribution looks different. Randomization distributions are not always symmetric, and they're not always smooth. They can have these lumps. But again, we can compute where the middle 95% of the data is. So for the ants, it goes from negative 0.5 to 0.5, and that means that our observed difference of three quarters of an inch would actually be pretty weird to see if there was really no difference between my two ant size groups. Okay, I'm a statistician. The open-source programming language of my choice is R. And so if you wanted to do this yourself, this is the code. So if I wanted to compute one difference in means, it's that first piece of code. And if I wanted to do 1,000 of them, I could use the second chunk. But there's another technique that we might want to use for assessing whether the number that we got is sort of reasonable, or what some other possible numbers we could have observed would be. And that's bootstrapping. So with bootstrapping, you take the data that you already have. It's kind of like pulling yourself up by your bootstraps: you're making something from nothing, only that's not really possible. So in this case, we're gonna make data from our old data. We're gonna sample with replacement. So we're gonna pull out data points. Here, we're not breaking the relationship between the two variables. We're just pulling them out directly. And you can see, sometimes I get the same data point more than once in my bootstrapped sample.
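The resample-with-replacement idea can be sketched the same way. Again the talk uses R; this is a Python sketch with made-up heights, not the talk's data. Note the contrast with randomization: here the group labels stay attached to their values.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical heights (inches) for the two tug-of-war teams.
blue = np.array([66.0, 70.5, 64.0, 68.0, 71.0])
pink = np.array([63.5, 69.0, 67.5, 65.0, 72.0])
observed = blue.mean() - pink.mean()

# Bootstrap: resample each group WITH replacement (so the same data point
# can show up more than once), keeping labels intact, and recompute the
# difference in mean heights each time.
boot_diffs = np.array([
    rng.choice(blue, size=blue.size, replace=True).mean()
    - rng.choice(pink, size=pink.size, replace=True).mean()
    for _ in range(1000)
])

# The bootstrap distribution is centered near the observed difference and
# shows other differences we plausibly could have seen.
lo, hi = np.percentile(boot_diffs, [2.5, 97.5])
print(f"observed: {observed:.2f}, bootstrap 95% interval: ({lo:.2f}, {hi:.2f})")
```

Unlike the randomization distribution, which is centered at zero, this one is centered at the observed statistic, which is exactly the behavior described next.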
But now I have new data, and I can treat that as my current data and compute the mean heights and then the possible difference in heights. And so if that observed tug-of-war game that we saw was really representative of the world, other than 0.75, I might have observed 0.667. And so then I can come up with bootstrap distributions much the same way as with randomization distributions. They might not be symmetric. These are always sort of centered around the estimate that we got from the real data. So in this case, centered around that 0.75. And then it shows us some possible numbers that we could have observed. So again, if that was our data and we were generating bootstrap samples, we might have seen differences from negative three to positive four. And again, we think that ours is sort of not different from nothing, or from zero. And with the ants, all of the possible observed values that we think we could have seen, they're all negative. So we think that there really is some relationship. One of those groups is bigger than the other. If you wanna know more about randomization and the bootstrap, you could look at this open source textbook. It was written by some really cool statisticians. The source is all on GitHub. You can download the PDF for free. And then if you like physical books, you can buy this statistics textbook, which I used in my class, for $9 on Amazon. So it's really cool and it's really good. If you want some other resources, Jonathan Stray has a five-minute lightning talk from NICAR called Solve Every Statistics Problem with One Weird Trick. And that one weird trick is randomization. So you kind of know about it. And Tim Hesterberg at Google has a paper called What Teachers Should Know About the Bootstrap. I think it really should be called What Everyone Should Know About the Bootstrap. So you could learn more there. Then the question is, what does this have to do with visualization?
So the idea is we're gonna try and do the same thing with visualization that we did with numbers. We're gonna try and say, is this different than nothing? And what are some other things that we could have observed? And this is gonna help us try not to make visualizations that don't show anything. When I say that, I don't mean visualizations of the type you see on WTF Viz, although I like making fun of those too. And I don't even mean showing nothing in the way that Darrell Huff talks about in How to Lie with Statistics in the 1950s. Instead, I'm talking about using techniques like randomization. So there's an awesome paper by Hadley Wickham, Andreas Buja, Di Cook, and Heike Hofmann, and it's called Graphical Inference for Infovis. You can find the link here. I think it's really worth checking out. But one of the techniques that they suggest in this paper is called the lineup. And the idea of the lineup is that you put your data, your plot that you think shows something real, in a lineup with a bunch of innocent plots. And if you can pick your accused plot out of the lineup of innocent plots, that means that it's somehow different than nothing. So again, to illustrate this, let's see if the... Okay, the video's not working. But I would do the same thing. I would break the relationship between the X and Y and look at what other plots I could have gotten from the same data with that relationship broken. Instead of looking at them all in sequence, I wanna look at them all together. So let's take an example. This is some data about loans. And I'm plotting the balance of the loan against the income, and I'm coloring by whether or not the person defaulted on their loan. And my human brain sees a pattern. I say, it looks like people who default on loans are the ones that have really high loan balances. It doesn't matter if they're rich or poor, it's just those high loan balances. But is it possible that I just sort of made that up?
So this is what it would look like if you used the protocol that they're suggesting in this graphical inference paper and you look at the lineup of plots. There are 20 of them because in classical statistics you use a p-value cutoff of 0.05, which is the same as one out of 20. So you'd get the right one one out of 20 times even if you were just guessing at random. So with this one, I think that you can probably identify which is the real plot. Part of it is I showed you the real one before, so you might have recognized it. In practice, you shouldn't look at the accused plot alone before you look at it in the lineup. So either you should build this into your workflow, where you're looking at a lineup of plots every time you make a visualization, or, maybe more realistically, if you make a plot and you think it might sort of show nothing, make a lineup and then show it to someone who hasn't seen the original plot. Again, if you're an R person, this code probably looks pretty familiar. The top code is using the ggplot2 library to make a scatter plot. And then the bottom piece of code is just showing how you could randomize the data to make those null plots. So I'm calling null_permute on the label of default versus not default, and then I'm wrapping those null datasets into a faceted array of plots. And when you run this code, they've done this sort of clever thing where it'll make the plot in your plotting window, but then it also prints this thing in the console. It says decrypt, and then a bunch of sort of nonsense characters. And that's where they hid the answer. So if you were doing this just on your computer and you wanted to not know ahead of time which one is the real plot, but you have to figure it out afterward, this is how you do it. It gives you this piece of code, and then when you run the code, it tells you where the true data was. So in this case, we got it right. You can use the same approach to do a visual t-test.
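The construction behind the lineup (19 "innocent" panels with the label relationship broken, plus the real data hidden in a random position) can be sketched without the R nullabor package. This Python sketch uses made-up stand-ins for the loan variables; only the panel data is built here, since each panel would then be drawn as its own colored scatter plot.

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy stand-ins for the loan data: balance, income, and a default label
# where defaulting tracks high balances (so the real panel shows a pattern).
n = 200
balance = rng.normal(1000, 400, n)
income = rng.normal(40000, 10000, n)
default = (balance + rng.normal(0, 200, n)) > 1400

# Build a 20-panel lineup: the real data hidden among 19 null panels
# where the default labels have been permuted, breaking the relationship.
true_position = rng.integers(20)  # the answer, kept secret until after you guess
panels = []
for i in range(20):
    labels = default if i == true_position else rng.permutation(default)
    panels.append((balance, income, labels))

# Each panel would be plotted as income vs. balance, colored by its
# (possibly permuted) labels, e.g. as a facet grid in matplotlib.
print(f"lineup of {len(panels)} panels; answer hidden at position {true_position}")
```

Permuting the labels keeps the marginal distributions identical across panels, so the only thing that can give the real plot away is the relationship itself.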
So this is the data from those tug-of-war teams again. Somewhere in there, I've hidden the real plot of the different heights. The crosses represent the mean heights. And then the question is whether you could pick out the real data. So in your head, make your guess. It says decrypt, so if you're good at decrypting things quickly, you could see what it is. But the true data is in position five. So I'll show it to you again. I don't know, that wasn't the one that I had picked out. It's certainly not the most extreme of the possible plots. And so what this is telling me is the same thing as the permutation test that I did with the statistics and the distributions at the beginning of the talk: that there really isn't a difference between these two groups. So of these examples, there was like a positive one and a negative one, but for this one I don't think that you really needed visualization to tell you that answer. You could have used standard statistics. The power of the graphical inference technique is that it's very generalizable. You can use it for anything. So one thing where I think humans are great at finding patterns in noise is in time series analysis. So I took some data. This is about the steps that I take. So it's from my Fitbit. And sometimes I like to make up stories about, like, ah, the variance is going up, or the mean seems like it's changing, or there was something that happened in December. I'm really good at making up those stories about the distributions. So this is, again, I hid the accused plot, the real data, somewhere in there. Again, look at it and try and see if you can guess which one it is. Oh, maybe I have another one where I made the scale a little bit nicer. And now we're going to try and decrypt. So there's the string again. It says the true data is in position four. And I'll show it to you again. Again, that wasn't the one that I had really picked out.
But if I had just shown you the one time series plot, you probably would have believed my stories about what was interesting about that plot. Okay, so with the statistics analog, we've kind of been doing the is-this-different-than-zero task. But then I wanna switch to the other one: what are some other reasonable values that we could have gotten for this? So with the numbers, that was, what are some other reasonable values that we could have gotten? And then I'm gonna talk about that in terms of plots. So some of this comes from joint work with a colleague of mine named Aran Lunzer. We worked together at the Communications Design Group. And this is the highest quality image of Aran that I could find. I guess it's almost life-size in this theater, so maybe that's fine. But we worked on this tool, which was a prototype tool, and we call it LivelyR. So this is something that is real in the sense that if you go on GitHub and download the code, you could run it yourself. I don't recommend it. It's very buggy. But it really has R on the back end, and then it has Lively Web and JavaScript on the front end. And what it lets you do is play with the bin width and bin offset of a histogram. Many things let you do this, but this tool also lets you overlay a sweep of parameter values, so that you can see a variety of histograms, all with the same bin width but slightly different bin offsets. And then they form what I like to call a histogram cloud, which is kind of like a kernel density estimate, but it's easier for people to understand. There's one more feature that we've built into this, which is that you can call out the individual histograms if you want to. And so instead of just seeing them overlaid as a cloud, you can do small multiples. And it's actually a two-hand interaction here. There's like an iPad, and you're controlling the one screen with your right hand and the other with your left hand.
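The bin-offset sweep at the heart of the histogram cloud is easy to reproduce. This is a Python sketch (the tool itself uses R on the back end), with made-up skewed data: the bin width is held fixed while the bin edges are shifted, producing several different histograms of the very same data.

```python
import numpy as np

rng = np.random.default_rng(3)
data = rng.gamma(2.0, 2.0, 500)  # made-up right-skewed data

# Sweep the bin OFFSET while holding the bin WIDTH fixed: each offset
# shifts every bin edge, giving a different histogram of the same data.
width = 1.0
sweep = []
for offset in np.linspace(0.0, width, 5, endpoint=False):
    # Start the edges below the minimum, shifted by `offset`, and cover
    # the full data range.
    start = np.floor(data.min() / width) * width - width + offset
    edges = np.arange(start, data.max() + width, width)
    counts, _ = np.histogram(data, bins=edges)
    sweep.append((offset, counts))

# Overlaid, these form the "histogram cloud"; each histogram counts the
# same 500 points but can have a noticeably different apparent shape.
for offset, counts in sweep:
    print(f"offset {offset:.2f}: tallest bin = {counts.max()}")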
So it's a little bit buggy, but you can see the small multiples of the different possibilities of histograms that you could have seen. Again, I think it's really easy with histograms to just use the default algorithm, like the Sturges algorithm or whatever it is in your favorite tool, which will pick the bin widths for you. If you're using ggplot2, it will give you a warning, like, I chose a default but you should come up with something better. But many people don't. I know that you all do, because you're visualization professionals, but I think that it's not always obvious to non-professionals that those defaults make a huge difference. So you might use the defaults and find what looks like a pattern, but it's really just the result of the parameter values that you chose. So giving people the opportunity to play with the parameters is really powerful. So after Aran and I worked on this, which is essentially in one dimension, we're looking at histograms, I started thinking, how could we do something like this in two dimensions? Which made me think about maps. So the modifiable areal unit problem says that if you aggregate spatial data in different polygonal shapes, you're gonna get different spatial patterns. And so this is something that can happen with gerrymandering. It can happen with county data and zip code data, these different levels of polygons. And there are a lot of statistical problems that go along with that. So Aran Lunzer and I worked together on this next tool, which doesn't really have a name, but what it does is it takes point data, it's about earthquakes in Southern California, and then it's aggregating them into these polygons. But instead of just making fixed polygons, you can scale and rotate and move the polygons to see some other possible values of the visualization that you could have gotten.
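The modifiable areal unit problem is easy to demonstrate with the simplest possible polygons, square grid cells. This Python sketch uses made-up event locations standing in for the earthquake points (the tool itself works with real Southern California data and richer polygon manipulation): the same points are aggregated under two different grid origins, and the apparent hotspot can change.

```python
import numpy as np

rng = np.random.default_rng(4)
# Made-up event locations in a 10 x 10 region.
xy = rng.uniform(0, 10, size=(300, 2))

def grid_counts(points, cell=2.0, offset=(0.0, 0.0)):
    # Aggregate points into square cells whose origin is shifted by
    # `offset` -- i.e., moving the polygons under the same point data.
    ij = np.floor((points - np.asarray(offset)) / cell).astype(int)
    cells, counts = np.unique(ij, axis=0, return_counts=True)
    return dict(zip(map(tuple, cells), counts))

# Two aggregations of the SAME points with different grid origins can
# produce different "hotspots": the modifiable areal unit problem.
a = grid_counts(xy, offset=(0.0, 0.0))
b = grid_counts(xy, offset=(1.0, 1.0))
print("hotspot count, grid A:", max(a.values()))
print("hotspot count, grid B:", max(b.values()))
```

Every point lands in exactly one cell under either grid, so the totals agree; only the spatial story changes, which is the point of letting users move the polygons themselves.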
So you start with your default values, and maybe you think that there's some very obvious trend that everyone should be aware of. And you can tell some great story about why there's a hotspot of earthquakes in this one place. But then when you start manipulating the polygons, you can see that there are many other possible patterns that you could have created. So the takeaway that I want you to take from this talk is that there are statistical tasks, and there are very analogous visualization tasks. So in statistics, we want to know if two numbers are different, which is like knowing if one number is different from zero. And we could use tools like randomization to answer that question. In the visualization world, we can use randomization in a different way to answer the question, is this plot different than nothing? In statistics, we might want to ask, what are some reasonable other possible values for this number? So what other differences in heights could we have observed? And you could use something like the bootstrap to show you other possible reasonable values. And in the visualization space, I think that using parameter manipulation, or giving users the ability to see how your parameter choices impact the visual story that they are seeing, can be really powerful. And so, thank you. I think we might have a minute or two for questions if anybody has any. All right, cool. Oh, there's one. I'll talk to him. Yeah, so I think that's a great question. The question was, could we have a computer make those assessments? And the answer is, not really. It's hard to generalize a technique that would allow a computer to make those assessments. In the paper that I was referencing, the Graphical Inference for Infovis one, they do try and come up with some distance metric between the null plots and the interesting plot, and then use some cutoff value to say, this is different enough that we think that it's different.
One thing that I didn't say is that these authors also were sort of using computer tools as an intermediary. They would run a bunch of tests on Amazon Mechanical Turk and get like 1,000 people to look at visualizations, and then they would feel really certain about their results. But if you ask a computer whether a number is different from zero, it's much easier for a computer to assess that than to assess, is this picture different than nothing? All right, thank you so much. Thank you.