So, if you work with data or numbers, you may recognize this quote, which I recently learned is not a quote from Stalin. I looked it up on Wikiquote, and who knows who said it, but I think it's true, and it's horribly true, that one death is a tragedy and a million deaths is a statistic. In fact, I'd say that maybe a hundred deaths is a statistic, unfortunately, in this day and age. And the photo I'm using for this slide is of a little boy named Brandon Holt, who was six years old when he was killed accidentally by his four-year-old playmate. He lived in Toms River, New Jersey, and unfortunately, Brandon's tragedy is now a statistic. He is one of the over 3,500 people who have been killed so far this year in the United States by guns. Okay, this talk is probably going to drive you to drink. Sorry, it gets a little dire, but I'm going to tell the story of data and how there are people who are working to kill data, and there are people who are working to keep data alive. And it's not that people are out there intentionally trying to kill data. They're well-intentioned people, but they sort of take the life out of data in many ways, and I'll talk about those in a bit. I want to use a recent visualization that we created about gun violence as an example of what I'm talking about. You'll all remember, of course, the tragic events in Newtown at the end of last year, not far from here, and that really prompted us as a company. We wanted to do something. We were very moved by what happened, and we wanted to use our skills and our talents to do something about it, and so we started looking at the numbers and really trying to wrap our heads around what was happening. So the latest data that was available from a good source was FBI data from 2010. And so I looked at that large number, and again, that's a statistic, that's just a big number. So I started thinking, how can I contextualize that?
How can I start to understand what that number means? How can I start to see those individual tragedies out there? So I started thinking about smaller numbers of people, and I started thinking about my hometown, which is a small town in Minnesota. There are about 1,800 people in my hometown of Canby, Minnesota, and that's about January and February of gun homicides in 2010. So to get the entire year, I had to branch out into the neighboring towns around where I grew up. In the end, I had to add 15 other towns in the surrounding area to equal the almost 10,000 gun murders in the U.S. at the time. And it really made me think, wow, if those towns were just wiped off of the countryside over a year, we would demand that something be done about it. That seems outrageous, yet that happens on a yearly basis here. And there are other numbers I started thinking about. This is a number that helped us launch a war. This is the number of people who were killed in the 9/11 attacks, and I don't want to diminish that number at all, but we launched an entire war over that, and yet we don't deal with the internal war that we're dealing with inside our country on a daily basis. I want to show just a quick video of the tool we created. This is an animated video we made for a film festival, and it doesn't show the entire tool, but it gets at it. And I'll let it run through, and then I'll talk about it. So some of the text is a little small. In case you couldn't see it, each individual line represents a person who was killed, and it starts off as a bright orange color indicating their life, and then when they are killed, the line continues as a gray line, sort of their ghost, so to speak. That is, we project how long they might have lived, and we're using World Health Organization data to figure that out, statistically speaking.
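As a rough illustration of the projection just described (a bright line up to the age at death, then a gray line out to a projected lifespan), here is a minimal Python sketch. The life-expectancy figures, the function names, and the sample victims are all assumptions for illustration. The talk only says that World Health Organization data was used and does not describe the actual method.

```python
# Hypothetical life-expectancy values, chosen only for illustration.
# The talk cites World Health Organization data but gives no figures.
ASSUMED_LIFE_EXPECTANCY = {"male": 76.0, "female": 81.0}


def stolen_years(age_at_death: float, sex: str) -> float:
    """Projected years lost: assumed lifespan minus age at death, never negative."""
    expected = ASSUMED_LIFE_EXPECTANCY[sex]
    return max(0.0, expected - age_at_death)


def arc_segments(age_at_death: float, sex: str) -> dict:
    """Split one victim's line into a lived span and a projected span (in years of age)."""
    lost = stolen_years(age_at_death, sex)
    return {
        "lived": (0.0, age_at_death),                        # drawn in bright orange
        "projected": (age_at_death, age_at_death + lost),    # drawn in gray
    }


# Made-up victims: (age at death, sex).
victims = [(6, "male"), (34, "female"), (80, "male")]
total_stolen = sum(stolen_years(age, sex) for age, sex in victims)
```

A single flat figure per sex is a simplification. A more careful version would probably use remaining life expectancy at the victim's age, so that an 80-year-old is not projected to have lost zero years.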
So you could see that the focus that we had for this piece was on the individuals, on the individual victims, but when we started out on the journey to create this, when we just started looking at the data, we started with the source that you would all go to, which is the FBI. It's really the premium data source for all of this horrible, horrible data, unfortunately. But it's great, they have tons and tons of data, but if you go to the Uniform Crime Report, which is what we started using for this piece, it's all aggregate data. It comes from incident-based data, and there's an underlying system called NIBRS, which is the National Incident-Based Reporting System, I believe. But what they spit out is this aggregate data. And aggregate data, like statistics, I think, is really this death of data, because somebody, a statistician, an analyst, or somebody is sitting back there deciding what it is we want to see. They're deciding for us: it should be bucketed in this way, it should be summed in this way, it should be averaged this way. And I think of aggregate data as sort of like your spouse on Valentine's Day. They think they know what you want, but they really don't know what you want. And oftentimes, when you see aggregate data, it's like, oh, that's interesting, but how about if we sliced it some other way? That's often the problem I have. So this is an example of what you'll see when you go to the Uniform Crime Report on the FBI's site. It's a long list of table numbers, and in order to even get the table name, you have to roll over the link, and it's really obscure and strange. And the tables have these sort of obtuse names to them. And so you have to go into each table one by one and sort of see how they sliced it and diced it. And so here's an example of one of their tables, you know, and it says murder victims by age, sex, and race. And so you're thinking, oh, I'm going to have this huge matrix of awesome stuff.
But no, it's just a column of race and a column of gender. And then they have these sort of random age groupings by, like, five years. And is that really the best way to look at things? I'm not sure if age one to four is appropriate. Maybe something really bizarre happens at age three. It really hides things a lot when you aggregate data. There are things that, you know, that analyst and that statistician just might not find that might be applicable to my particular cause. The great thing, though, is that you can request the raw data from the FBI. So this is what we get when we request raw data, another horrible, yet further step to the grave, I think, in terms of data. So this is what they spit out at you, and I'm not sure what kind of horrible ancient machine spits this out. But it's one humongous text file, yet it's paginated, and it has page numbers and titles randomly interspersed in the text. And you have to read this separate code file in order to understand how some lines are shorter than others. And it's really complicated. Unless you're a programmer, you know, if you got to this level and you're not a programmer, you're dead in the water, because there's nothing you could do with this unless you know how to, you know, go in there and parse it out and clean it up. Which of course we did, because we're able to do that, thankfully. So, you know, we parsed it out into a nice Excel file. So, I mean, that process, though, takes a long time, just finding the right data. And, I mean, this is sort of a simple example, because homicide data is readily available, and you know where to go to get it. But if it's something obscure like water licenses or something like that, then, one, it's going to take a long time to find that source. It's going to take a while to figure out what their data source is like. If it's not sufficient, is there raw data underneath it that I can get? And, you know, it's this long process.
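To give a sense of the clean-up step mentioned above (one big text file with page numbers and titles interspersed, and lines of different lengths explained in a separate code file), here is a minimal Python sketch. The column positions, the field names, the page-header patterns, and the sample text are all hypothetical. The real layout would come from the FBI's code file, which the talk does not reproduce.

```python
import re

# Hypothetical fixed-width layout: field name -> (start, end) character offsets.
# The real offsets would be read from the separate code file.
FIELDS = {"state": (0, 2), "year": (2, 6), "victim_age": (6, 8), "weapon": (8, 10)}
RECORD_WIDTH = 10

# Assumed shape of the page numbers and titles scattered through the file.
PAGE_NOISE = re.compile(r"^\s*(PAGE\s+\d+|ANNUAL HOMICIDE LISTING.*)\s*$")


def parse_records(text: str) -> list[dict]:
    """Drop page furniture and blank lines, then slice each record by column."""
    records = []
    for line in text.splitlines():
        if not line.strip() or PAGE_NOISE.match(line):
            continue
        # Pad short lines so trailing fields come out empty instead of misaligned.
        padded = line.ljust(RECORD_WIDTH)
        records.append({name: padded[a:b].strip() for name, (a, b) in FIELDS.items()})
    return records


# Made-up sample in the assumed format, not real FBI output.
SAMPLE = """ANNUAL HOMICIDE LISTING 2010
PAGE 1
NJ20100612
MN20103411

PAGE 2
CA201019
"""

rows = parse_records(SAMPLE)
```

From here the rows could be written out with the standard `csv` module, which gives a file that opens directly in a spreadsheet.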
And I think that this is really, you know, a sort of sad secret of open data. Everybody's talking about open data, and it's great, and there's this big push towards open data, yet when you go and look at it, the vast majority of it is not raw data. It's that aggregate data. It's somebody that's sliced and diced and bucketed things for us. And, you know, it sometimes doesn't work with other things. Things aren't linked together. They don't relate, and it's very complicated to work with. So, I think there's a myth about open data. And in fact, if you go to data.gov, they claim to have, you know, tens of thousands of raw data sets, and the vast majority of those are not raw data sets. They're aggregate data. So, I think that it's very misleading. There's a misleading landscape out there, unfortunately. But so, now we have our raw data. It's been, you know, it's been cleaned up. We can use it. We can start looking at it. So, now that data is living. Now it's ready to be consumed, and we can actually do things with it. And it's like a piece of clay. It can be consumed by anyone. And we started also looking at other data sources. And on sort of the opposite end of the spectrum in terms of this data, we found that Slate.com was doing something really interesting. They had also been sort of compelled by the shootings in Newtown. And they started crowdsourcing gun deaths in America, which was such a fascinating concept to us. So, they're basically allowing anybody to tweet them or email them with an article, you know, something citing a death, and provide the data for that killing. So, things like the victim's name, the city where it happened, the age of the victim, and those sorts of things. And so, it's sort of a different animal. It's sort of on the opposite end of the spectrum, whereas the FBI is sort of this trusted source, they have this long process and it takes two years to get this data.
Unfortunately, it's all historic data by the time you get it. So, if it's relevant, I'm not sure. But they have this long process and they have a methodology where they collect the data from all of the law enforcement agencies around the country and make sure that everybody's reporting things in the same way, and they sort of get it up to this high level where it's been vetted and gone through this whole process. So, it's a very accurate and trusted source. On the other end of the spectrum is the Slate data, where anybody can throw it out in there. And they come from thousands of different articles, from thousands of different journalists, and thousands of different papers. And people report in different ways. And did they even report the sex of the person or the age correctly? And there are numerous points where this could fail and the data could be inaccurate. So, it's much less accurate. However, it's real time. We have a cron job running that goes out and gets this data periodically, you know, every couple of minutes or so, to make sure that we have the most recent data. And it's wonderful because it's not historic data. It's what's happening right now. And right now you can go grab the file off of Slate's site, and they provide a nice little tidy CSV file, and it's fantastic. So now we can start doing stuff with this. I can start doing my own analysis. I don't have to go through that FBI analyst who wants to decide what age groups I should look at. So I can look at things like, what's the relationship between the killer and the victim? And what type of weapon did they use? So, you know, we didn't just look at gun homicides. We looked at all homicides and tried to figure out what's interesting about this or what's compelling. And so one of the things that, you know, popped out, which was kind of fascinating, and we recently did a graphic for Scientific American that notes this a bit.
Well, first of all, men kill a lot more than women. I think that was pretty well known. But the interesting thing when you start to look at the weapon preference is that if a woman knows you, she's much more likely to use something other than a gun to kill you. She's gonna strangle you or, like, set you on fire in your sleep or poison you or something strange. Whereas men just like to use guns. By far the best choice for men. I mean, that other-weapon category includes, you know, 40 different weapon types under that gray bar. So it's very splintered once you split up that gray bar, and there are lots of weird things in there that you would never expect anybody to do, unfortunately. But this just shows that this is raw data. It can breathe now. Now we're opening it up and we can actually use this as though it's a medium. It becomes like clay at this point, or paint or something. It can have a life of its own and we can start to play with it. We use R a lot internally. We also use Tableau. I personally like Tableau because I don't have a lot of time and R sort of trips me up sometimes. So it's nice and quick to throw things into Tableau, although our true data scientist prefers R because it's, you know, nerdier. But here you can see, you know, now I can do all these things that I could never do with that aggregate data. I can show, I can't remember what the colors are. Sex, no, race. There aren't that many sexes, sorry. And I can break it down by each individual age. I don't have to, you know, have these five-year increments. So I can see all this, like, crazy stuff that goes on with young people and, you know, what races are killing each other and what the, you know, relationship is between these people. I can start to look at all this crazy multivariate stuff and it's just like, geez, you get your head blown at some point. You're like, what am I even looking at? This is crazy. I can do it in a geospatial format.
I can do line charts. I can look at trends. I can look at how this looked month over month or week over week or by state or, you know, any kind of way I want to. I mean, I personally, I love data, so I could spend each and every day, 24/7, in Tableau just tweaking it. And sometimes I do. But it just shows that, you know, this raw data really is like a medium. It's like an artistic medium. And this is a creative process, going through all of this stuff and just playing with it and seeing, well, what happens if I look at it this way? Oh, what happens if I look at parents killing children or children killing parents? Or what happens when I look at, you know, time of day or all sorts of things? And now you can really start to see these really crazy nuanced things that are happening. And, you know, it doesn't have to be these sort of high-level, you know, governmental, these-are-the-big-takeaways that the government wants us to see. It's more about, like, well, if I have access to these things, I can see what things are relevant to me personally. I want to show a little bit of the creative process that went into this. Unfortunately, I could not find any of the early sketches that we did, but there weren't many. This project was very fast and furious. It took us almost four weeks to create it. And it came pretty quickly once we started looking at the data, and the big thing we wanted to do was just to make it all about the victims, because we felt like after Newtown, there was this big debate about gun control and it seemed like, you know, the sides were getting more divisive and, you know, people were starting to argue much more about things rather than coming together about it. So we felt that if we focused on the victims, that's at least one point where everybody has sort of a common ground. We know that these people shouldn't be dead and, oh, by the way, they were killed with guns, so let's start talking about that.
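The difference between the pre-bucketed tables and the raw records can be sketched in a few lines of Python. The records below are made up for illustration and are not real data, and the five-year bucket boundaries are an assumption rather than the FBI's actual groupings. The team used R and Tableau for this kind of slicing, so this is only a stand-in for the idea.

```python
from collections import Counter

# Made-up incident records, for illustration only.
records = [
    {"age": 1, "sex": "F"}, {"age": 3, "sex": "M"}, {"age": 3, "sex": "M"},
    {"age": 3, "sex": "F"}, {"age": 4, "sex": "M"}, {"age": 17, "sex": "M"},
    {"age": 18, "sex": "M"}, {"age": 18, "sex": "F"},
]


def bucket(age: int, width: int = 5) -> str:
    """Label the fixed-width age group an age falls into, e.g. 3 -> '0-4'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# What an aggregate table gives you: one count per age group.
by_bucket = Counter(bucket(r["age"]) for r in records)

# What raw records allow: counts per single year of age, or any combination.
by_age = Counter(r["age"] for r in records)
by_age_and_sex = Counter((r["age"], r["sex"]) for r in records)
```

In this toy data the "0-4" group shows a count of 5, which hides the fact that three of those five are age three. That is the kind of detail that only shows up when the records have not been bucketed in advance.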
So this was the first sketch. I was lying in bed one night and said, oh, it's just, it's an arc of life, and wrote that, and then that later became the entire piece. And here's a little bit of mathy stuff that surrounded it. And here's our small team that was working on the project. And we were going through some comps. You can see we sketched stuff out on a glass table that we have, as well as on our whiteboard. And Katie, our designer, was doing this in Illustrator and she was like, oh my God, how am I supposed to do 10,000 lines? And she hated us, and so I said, okay, well, let's break open Processing. Processing is an open source tool that's built on Java. It's very similar to Java and it's very easy to sort of rapidly prototype data visualization, so we cracked that open and just started throwing the numbers in there. And more than anything, it sort of proved that we could tell a story through this, and it let us quickly play with things like how do we want to sort the lines, and we ended up doing it by age because it showed a little bit of the distribution of ages. And so that was a great tool for us to use for that. This is me on the airplane. I was in San Francisco and they had finished the visualization as I was boarding the plane, so I was able to pull it up on my iPad in flight with a glass of wine, and it was great. And one of our goals is to make all of our projects work on mobile devices. So it was great to sort of just see that happen in there. And so the end result was, the film that I showed earlier doesn't really capture the interactivity of the piece. If you want to check it out, it's guns.periscopic.com. And the beauty of it is that using that raw data, now I can add in any filter and I can let the viewer choose what's important to them. So we have a filter for everything available to us.
So race, type of gun, whether a single person was killed or multiple people were killed, things that you won't get through the FBI data. So it's a really rich experience, because I don't want to claim that I know what the important pieces are either. I know what they are to me, but they might not be the same for you. And this offers a tool for everybody to come together and just say, okay, I see what's happening. Now let me sort of pick and choose and see, maybe there's a solution in here. Maybe I can find the big problem areas and try to fix that. For instance, lots of young people are victims and killers as well. They tend to kill each other at that age range. So those are big areas of need where we can start creating solutions. And I can let the person view this in different ways. Not only through our sort of fun, beautiful narrative, but I can also show distributions in a more traditional way so that people can use this as a bit of an analytical tool. And I can also tell the individual stories. When you roll over an individual line, you can see the victim's name and their age, where they were killed, and who their killer was. So I can start to get back again to those individual tragedies rather than just getting lost in the statistics. Thank you.
Irene took the clock away, so I don't know how much time we have. We don't have much time. Okay, so if there are any questions, yeah.
We had explored a few different ways, and that's why I was going through our server frantically this morning, trying to find images. Usually, when we sketch, we take photos of them just to have in our history. And I think we were just blowing through this project so quickly that we forgot to do that, unfortunately. But we had maybe three or four different ideas that we had talked about prior to that.
We had brainstormed, and there's some great saying, some quote, I don't remember who said it, but it was something to the effect of, if you're sort of picking the best just because it's the best out of what you have in front of you, it's not the best. It has to be the right one. So we had gone through this big brainstorming session and drawn a few different ideas out and none of them were really like, yeah, that's it. It was just like, yeah, we could do that or we could do that or that would work, that sort of thing. And then I went home that night and I actually watched a really terrible crime mystery. And I don't know, something about it made me focus on that individual person and what is their life about? And it became so maddening to me that we have all these numbers, we have all these individual people that are stuck in these numbers, but it's really the beauty of this one life. And I don't know who those people are, I don't know each individual, but I have these tiny data points about them. And so to me it was just, I don't know where it came from, but it was just that arc of that life. It was like there's a beginning to that life and there's some flow to it and there's the apex and then there's the death and what could have been, and what could have been became the focus. It was the stolen years and the potential of that life. And so that's really how it came to be, and the colors we were playing with at first were kind of strange, and we wanted to sort of note that vibrancy of life, and that's why we settled on sort of the bright yellows and reds and oranges. It's that vibrancy and that sort of fire and that energy that's within us. And then after the death, it was really like the ghost of that person. And I don't know if I have anything more than that to say about it. It was one of the fastest projects that we were able to come up with a concept for, though.
It was really just two brainstorming sessions and we were able to start prototyping on it. Yeah.
How did you find that sound and how did you add it?
Yeah, the sound was added just as an afterthought and really just for the film festival that we were a part of. We felt like we wanted to keep the interactive version sort of more austere and we didn't want that drama to it. But for the film festival it felt more appropriate to have that more dramatic aspect to it. We worked with a great sound designer that we use all the time. His name is Sean Eden. He's in New York. He's a musician and an actor and he's fantastic, and he, as well as my business partner Dino, came up with sort of the crescendo of that sound. But for the individual pieces, my business partner had this idea that they should be sort of like a gasp. They should be sort of the extinguishing of that life. So you'll hear that as it builds, as each person is killed, there's that little bit of a gasp or a flame-dying-out sound. So as that progresses it becomes almost like a voice, I think, as it builds. So that's how that came to be, yeah.
So I was kind of struck, you were talking about the arc of that person's life. I guess I have one question, and you may have said this, I may have missed it, I apologize. How did you decide how long they would have lived? Is it just statistical?
It's statistical, yeah. So we base it on...
It's on demographics?
Exactly, exactly. And then by that we can also pick out what they may have died from as well.
And if you could somehow find each of those deaths, you were talking about the arc of their life, each of those deaths affected, you were talking about what their relationship was to the murderer or whatever.
There's also all those family relationships, things like that. It could be hard to do, but maybe if you could find how many friends they had on Facebook, whatever it was, it occurred to me that the ripple effect of each of those things could also be a really powerful way as well, because it's not just that person.
Exactly, exactly. There's a huge void left from that that we tend to forget about, because it's a story in the paper and it's a blip and then the next day it's gone and it's off of my radar, I don't have to think about it anymore. I mean, some of these stories stick with me just because they're so horrible and whatnot, but you're right, I mean, there is this effect, there's a nuclear bomb that goes off in the lives of many people every day. And that is certainly something that we could explore with the Slate data, since the victim names are available there. In the government data, they remove that for privacy concerns, but the Slate data is, I mean, that is a really rich data source and I would love to continue working with that, and in those ways, I think that could be really powerful. Great point. Anybody else? Yeah.
Obviously, the message came first, right? You knew what it was you wanted to express. How to express it was the question, but you knew what it was you wanted to express, and it was about finding the aspects of the data and the right way to present it that would get across that message. But you're also talking about opening up the data and finding out what the data wants to say, which implies the message hasn't been decided yet, that the message comes last to a certain extent. What percentage of data visualization do you think is message comes first versus message comes last?
32%. I don't know. We take it on a case-by-case basis.
It really depends on what it is we're dealing with. Some data that we're dealing with is really just exploratory, and the client even comes to us and says, we don't know what this is telling us and we just want to make our data open, or we want to make a tool to explore it, or we want to make a tool that enables people to do X, Y, or Z. And that's more of the exploratory realm, where there maybe isn't a message, but it's just meant as a tool, meant as a way for people to go in and do different things. And this is obviously on the other end of the spectrum, where it starts with that message, it starts with a big point, and then after the fact people can go in and explore. But I do think that exploratory side of it should be a part of every visualization, because that's what really engages people and that's what helps people not only find their own epiphanies within that data, but also helps them trust what you're saying, because they can go in and look at the numbers and say, oh yeah, their point is correct. There really is a big problem here, because I can go inside and see that, rather than just throwing up a big number and saying you should care about this, go write your congressperson. It's, look at this terrible thing that's happening, and here's the data that lets you ease into it and shows you what's happening. So I do think that exploratory side should be a part of everything. In some cases it should lead, in some cases it should take the backseat. But it's all really up to what we feel is important in terms of whatever the subject matter is. Anybody? That's it. Thank you.