Well, I'm Lane, a postdoc at Tufts. Tufts is close by, in case you didn't know, so if you're in the area you should drop by some time and visit us. I work in the Visual Analytics Lab at Tufts (VALT) with a lot of great people, and there's a lot of great research going on; today I just want to share some of that with you.

What does it mean to take a user-centered approach to data visualization research? Sometimes we see work that focuses on the technology, the graphics, maybe the aesthetics, that sort of thing. We take a really strong user-centered approach to our work because users matter and, like Nigel said, context matters. We want to explore that, quantify it, and use it to build better data visualizations. And we do that because, as a reminder in case you've forgotten, we've all staked our careers on the belief that visualization is an indispensable tool for analysis and understanding. I know I have. Fortunately, in recent years it's been a lot easier to make that case. Look at successes like Tableau: I can just point to Tableau and ask, why are you doing data any other way? It's amazing. And on the other side there are things like the New York Times. If people say, "What's data visualization? I don't get it," I'll just show them the New York Times work.
They're like, oh, that's great. But casting these in a broader light, we've had a lot of successes in both exploratory data visualization and on the expository side. An exploratory problem, for instance, is one where you don't know what's in the data until you get it in there and discover the story; expository is where you have a point to make. These are important trends and important ways of thinking about data visualization, and we've had so many successes in recent years that I'm genuinely excited.

However, there are areas on both the exploratory and the expository side where we have data, and the data has a lot of promise, but that promise is not being realized. Take my own experience working with cybersecurity analysts: we know the data is there, everybody's tracking it, but it's currently not doing a great job of keeping our data safe, as evidenced by the fact that in the past year I've replaced my credit card three times. It's not really working for them. Part of the problem is this: I actually just came across this visualization last week, from a new product that security analysts are really excited about, and it looks like this. I see the appeal. I watched Minority Report; I look at this futuristic AI and I want to be that. But we know there are things going on here that might not be optimal for the task, and the gravity of the task, that security analysts face. We need to do better.

On the expository side, where we already know the story, think about this: a person getting a diagnosis, and this is happening day after day after day. It's a very difficult time.
The data is there, but it's not being used to its full potential to help people through the difficult decision process of which treatments to get, and what the risks and trade-offs between them are. I just want to point out that these areas demand accurate and precise data visualization. That gives us a license to put data visualization under a microscope and understand how it works.

I experienced something of an existential crisis during my undergrad and PhD years. I was going along building cybersecurity visualization systems for a lot of different contexts, and I knew there was a lot riding on the decisions made with these tools. Then I realized I was just building the same things over and over again: bar charts, area charts, stacked bar charts, node-link charts, line charts, radial scatter plots, and so on. And I couldn't find a way to make an optimal design decision. That's what I wanted to work towards: if I'm presenting really critical information to a user, how can I make a more optimal design choice? That led me to some really interesting research questions that I'm going to tell you about today.

Getting right into it: one goal is to leverage the human ability of perception, and our understanding of it, to improve visualization design. The problem looks like this: we have many techniques that work for the exact same data. A parallel coordinates plot, on the left, is essentially an isomorph of the scatterplot matrix in the middle, which is in turn an isomorph of a radial scatter plot. These can all show the exact same data.
They're quite different, though. What are the trade-offs between them? How do we get at that? Typically in the academic community, and some of you here are familiar with A/B testing, you do a comparative evaluation: I can test the scatter plot against the parallel coordinates plot, the scatter plot against the line chart, the scatter plot against the stacked bar chart, and so on. But you end up with a large experiment: 8 choose 2 is 28 conditions. I don't want to run that; it would be expensive. So there have to be other, principled ways to get at this problem.

Another possible approach is a model-based evaluation. Instead of just doing a comparative evaluation, we model the performance of, for instance, a scatter plot on a given task, then take the same modeling procedure and apply it to a different set of visualizations. Now we have a set of models, and if you've worked with models before, you know they can be compared very directly and efficiently: in this case model 3 outperforms model 2, which outperforms model 4. (Hypothetically; the data does not actually back that up at all.) The point is that models of performance can be compared directly and efficiently. They're easy to get, though choosing the model is the hard part, and models are scalable and falsifiable: if somebody comes along later with a better model, a better way to look at this data, you can just throw it out, regenerate, and start over.

But to really be effective, and I really want to make this point, a model has to be grounded in theory. If you've done any sort of modeling, you know that you can build a model out of anything.
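The pairwise blow-up mentioned here is just a binomial coefficient; a quick sketch, using the chart count of 8 from the talk:

```python
# Comparing every pair of 8 chart types head-to-head means running
# "8 choose 2" separate A/B conditions.
from math import comb

n_charts = 8
n_pairwise_conditions = comb(n_charts, 2)
print(n_pairwise_conditions)  # 28
```

The count grows quadratically with the number of charts, whereas a model-based evaluation needs only one model per chart: 8 models instead of 28 comparisons.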
You can overfit, and so on. For this problem we really want the model to be grounded in some sort of theory, so that whenever we compare a parallel coordinates plot to a radial line chart, or a scatter plot to a line chart, we have some notion as to why we see the effects we see. That connects it to how humans actually work. To get there, in this experiment we turned to a psychophysical methodology. I'm not a cognitive psychologist, but we're going to try this.

I'm assuming you know what correlation is, so I'm going to ask you: which one is more correlated, the left or the right? Good. Lots of people think it's a trick; it's not a trick. If you're on team left, congratulations, you won: there's a 0.05 difference between the left and the right. Let's try it again: left or right? Lots of votes for right. Are you sure? Who says right? That's also defensible if you're thinking of Pearson's correlation. Later on I'll talk about the effect of expertise in visualization; people really do bring beliefs like that, and it does happen. But if you were on team right, I'm sorry to say you were incorrect this time.

The real point I want to make here is that these two pairs have the same difference. The first pair had a difference of 0.05, and the second pair also had a difference of 0.05. So why does one feel so much harder?
It turns out that the perception of correlation in scatter plots, we know from previous research, can be modeled using Weber's law. If you've taken an intro psych course, or sensation and perception, that sort of thing, you might have come across Weber's law. It's a model for low-level perceptual discrimination: things like the perception of line length, the perception of weight (which is actually how it was originally discovered), brightness, loudness, even the saltiness of taste follow Weber's law. It's really interesting. But here it applies to something we think of as a high-level task, the perception of correlation. How can we reduce it to something like this?

So: the perception of correlation follows Weber's law. Don't worry about that equation; let's break it down a bit, in case you're not a mathematician. To perceive a difference, a few things matter. It depends on the actual intensity of the stimulus, a kind of baseline, and it depends on the difference between the two stimuli, and all of that is modulated by a constant k, which you can get through an experiment. If I were to ask you about 200 times, the thing we just did, for one correlation level, I could get a good value of k. (We're not going to do that today. Or we might.) That's how Weber's law works.

One thing I want to point out is that delta P is actually the just-noticeable difference, the JND. If you've studied anything like compression, or how MP3 works, it's the same concept.
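A minimal sketch of the Weber's law relationship described here: the just-noticeable difference scales with the baseline intensity, modulated by the constant k. The value of k below is purely illustrative; as the talk notes, in practice it comes out of an experiment.

```python
# Weber's law: the just-noticeable difference (JND) in a stimulus grows
# in proportion to the baseline intensity of that stimulus.
# The Weber fraction k here is an illustrative placeholder.

def jnd(intensity, k=0.25):
    """Just-noticeable difference under Weber's law: delta_I = k * I."""
    return k * intensity

# The same absolute difference can be noticeable against a weak baseline
# but imperceptible against a strong one.
print(jnd(10.0))   # 2.5  -> a difference of 5 exceeds this threshold
print(jnd(100.0))  # 25.0 -> the same difference of 5 is far below it
```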
Just-noticeable differences are used in that sort of context too. So here's the hypothesis we came up with. We know that the perception of correlation in scatter plots looks like this, with worse performance plotted higher. You were on the right side of the chart for the easy question, between 1 and 0.95, where the discrimination threshold is lower; for the other question you were further down, where the threshold, and therefore the difficulty, is higher. So if we know the perception of correlation in scatter plots looks like this, what does it look like for other charts? Are some linear and some nonlinear: exponential, logarithmic? We really didn't know going into this, and whatever we found would be interesting information. If some of them are nonlinear, that's interesting: at least at this threshold, we can say the green one, which corresponds to a parallel coordinates plot, is more discriminable than the radial line chart.

But there's another possibility. Whenever you have two that are linear, with no crossing between them, you can start to say things like: the perception of correlation for a parallel coordinates plot is strictly better than that of a radial line chart. If we were to find things like this in an experiment, that would be really interesting, because we could make really strong claims about which chart is best for the perception of correlation.

So to test this we ran an experiment, which we are apt to do. First we had to choose some stimuli: which charts do we want to test, and how do we want to test them?
Of course the scatter plot, from the previous research, and the parallel coordinates plot, because everyone uses them; apparently they're good for something. What's interesting here is that we decided very early on to test both negatively and positively correlated data. On the negative side it looks a little different from the positive side, and that's something we'll return to later on. The next set of charts we tested are the stacked area, stacked line, and stacked bar chart. The way we arrived at these: we took one of our data sets into Excel and asked, which charts are one click away? It was a really practical way to pick charts to test, and if there are differences between them, that's interesting. Line charts are also very common, and we also ordered one of the lines to test for a possible effect there. Finally, to round things out, we tested the donut chart (pun intended) and the radar chart.

We had to test all of these charts together, so this again turned into a large experiment, and we turned to our old friend, Amazon's Mechanical Turk, which is a great way to get participants for an experiment. Do pay them enough; there's some controversy about how much you should pay people on Mechanical Turk, and it should be sufficient. The design, in case you're a stats person: about 1,700 people, nine charts, between-subjects, and every statistical analysis I talk about uses a pretty strong correction for determining what counts as statistically significant. Then the moment of truth came: the results.
We ran this experiment and ended up with 200,000 individual perceptual judgments, which I'm showing you right here. Previous research only had the chart at the very top left; now we have 18 of them, nine charts for both positively and negatively correlated data. The real test is: are these actually linear? Can they be modeled using Weber's law? You can fit a line to anything, so the real test comes when you go back to statistics (it turns out statistics are good for something). We tested the correlation coefficient, which was in general very high; we tested the fit, the R-squared value, also high; and we tested the RMS error, which was low. That led us to a really interesting finding: the perception of correlation in every chart that we tested can be modeled using Weber's law. This low-level perceptual law for line length and brightness now gives us a way to connect a scatter plot with a donut chart, or a stacked area chart with an ordered line chart, and talk about them in the same terms. That's really interesting from an infovis theory perspective. But what can you do with it to improve visualization design?
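The linearity checks described here (correlation coefficient, R-squared, and RMS error of a linear fit) can be sketched as follows. The JND measurements below are synthetic, purely to illustrate the procedure, not data from the experiment.

```python
# Fit a line to (r, JND) measurements and report Pearson correlation,
# R-squared, and RMSE, the three checks mentioned in the talk.
import numpy as np

r_levels = np.array([0.3, 0.4, 0.5, 0.6, 0.7, 0.8])
jnd      = np.array([0.21, 0.18, 0.16, 0.13, 0.10, 0.07])  # made-up JNDs

slope, intercept = np.polyfit(r_levels, jnd, 1)
predicted = slope * r_levels + intercept

pearson_r = np.corrcoef(r_levels, jnd)[0, 1]
ss_res = np.sum((jnd - predicted) ** 2)
ss_tot = np.sum((jnd - jnd.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
rmse = np.sqrt(np.mean((jnd - predicted) ** 2))

# A good linear (Weber-like) fit shows |pearson_r| near 1, R^2 near 1,
# and low RMSE.
print(pearson_r, r_squared, rmse)
```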
It turns out that if you have a good model, you can do a lot of good with it. One of the first things we did was come up with a perceptually backed ranking of precision for judging correlation. With these now-validated lines (or at least, lines we know can be modeled using Weber's law) you can make some design decisions, but it's a little messy. So we decided to simplify: we obtained rankings at each value of r, and from that we produced this chart, which we call a noodle chart, for obvious reasons. It's just a simplification of the chart you saw before, but more readable. The best performance is on top: parallel coordinates plots showing negative data, and scatter plots, are among the top performers, which is perhaps not too surprising. Worse performance is at the bottom: parallel coordinates plots again, but showing positive data. And you really want to stay away from a line chart showing positive data; that turned out to be the worst overall.

Another interesting thing you can do with a model: note that we didn't test every correlation value, only the mid-range. One thing you can do is interpolate, or predict performance based on the model. If you have a model, and you have reason to believe the model reflects how humans actually work, you can make a reasonable prediction about how people will perform at r = 0.9 or r = 0.1. That's also pretty fun. And finally, if you're tired of the noodle chart, we can produce an overall ranking.
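Predicting performance at untested correlation values then amounts to evaluating the fitted linear model outside the tested mid-range. The slope and intercept below are made-up placeholders, not values from the experiment:

```python
# Sketch of using a fitted Weber-style linear model to predict JND at
# correlation levels that were never tested directly.
def predicted_jnd(r, slope=-0.28, intercept=0.30):
    """Linear JND model: jnd(r) = slope * r + intercept (placeholder fit)."""
    return slope * r + intercept

# Extrapolating beyond the tested mid-range:
print(predicted_jnd(0.1))  # higher JND -> harder to discriminate
print(predicted_jnd(0.9))  # lower JND  -> easier to discriminate
```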
If you just want a single ranking of precision for judging correlation, you can get this one. So that's the first thing we wanted to do. Another is that we can explore properties of the perceptual space of correlation. Take this example again; I hinted at it earlier, but it turns out there are some really interesting findings here. If you look at the scatter plot, negative and positive look symmetric: you can fold one over onto the other. But you can't do that with parallel coordinates plots. Does that manifest in performance? You might have noticed it before, but I'm going to show you very explicitly now: if you're showing positively correlated data, the version with the parallel lines on the right, it's terrible. People perform horribly with it when detecting correlation. Whereas with negative data, where the lines cross a lot, over here on the left, people actually performed almost as well as with scatter plots. It's really interesting for us to know that the perceptual space of parallel coordinates plots, at least when it comes to correlation, is not symmetric; it's asymmetric. It changes the way I think about visualizations now.
When I look at visualizations now, I look for symmetries and asymmetries and the benefits they might bring. In the noodle chart it looks like this: there's definitely a gap between them.

One final thing you can do with a good model is guide a novice user in depicting correlation. The way we got at this was to look at the charts that were one click away in Excel. We found that the stacked line chart, at the top, performed the worst; the stacked area chart, in the middle, performed okay; and it was actually the stacked bar chart that performed the best, with statistically significant differences between all of them. So if you had a tool for novice users to depict correlation, you might guide them towards picking the chart that's best for them, or at least tell them what the trade-offs are. I don't know how to design such an adaptive system; that's an open challenge. But if you can figure it out, here you go: here's a model you can use. All this is to say that a theory-grounded model can help build the science of visualization and still provide actionable information for visualization design.

So that's one thing, and that's just perception. People do a lot more than just perceive visualizations. If you look at a person, you have to consider their cognitive states (what mental state are you in right now?) but also cognitive traits: who are you?
There are things like personality, spatial ability, numeracy, experience, and bias; these are all things we propose might modulate the effectiveness of visualizations. In the interest of time, I'll give you just one more experiment today, one that deals with cognitive states specifically. Broadly, you can take this as quantifying the impact of individual differences, differences between users, and how they might affect visualization and visualization design. Shout-out to Nigel for mentioning this yesterday: emotion.

I actually came across this problem when I was in grad school. I wanted to know: does emotion impact graphical perception? It turns out emotion is pretty difficult to study. So what we did was leverage a well-studied graphical perception task, something that's been around since, I think, before I was born and has been replicated many, many times, and we adapted a well-studied emotion priming technique. We went to the cognitive psychologists and asked, how do you prime people emotionally? And they said: we have ways. And they are effective. So we ended up with an experiment combining emotion and graphical perception, which is a really fun thing to run.

The task, if you're not familiar: Cleveland and McGill's studies from 1984. This is one of the reasons we know that a bar chart is better than a pie chart; they tested it explicitly. They asked the same questions over and over: which of the two marked values is larger, A or B, and what percentage is the smaller of the larger? And they got really strong results from this. (It turns out there's not a pie in this chart.
Thanks, Jeff.) Their results give you a ranking: if you're a robot, you'll be closer to the 1.0 side (this is log error), and if you're pretty terrible, like on the stacked bar chart in T5, you'll be pretty far up the error scale. And it turns out that Jeff Heer and Mike Bostock replicated this a few years ago on Amazon's Mechanical Turk, which gave us a great baseline to study these experiments further.

But we had to take this task and combine it with emotion in some way, and it had to be appropriate for the web. After scouring the literature for a while, the first approach we came up with was to prime via stories. We turned to the New York Times, and I had the great task of looking through stories, finding ones that were very emotionally sad or emotionally positive, and making sure they were about the same reading level and the same length, that sort of thing. After, I would say, a week or two of looking, I settled on this one for a negative prime: "Looking for a Place to Die," by Theresa Brown. I'll give you a second to read it, just to make sure you agree it's negative enough to prime someone. Pretty negative, yeah; my participants would agree.

After obtaining all the proper approvals, IRB and that sort of thing, we validated this on Mechanical Turk: we tested the stories with users before we did anything with the visualizations. And it turns out that yes, this story primed people negatively, and yes, the positive story I chose primed people positively. A neutral prime, it turns out, is really tough to get: following previous literature, I tested a passage from Stephen Hawking's A Brief History of Time, and it was rated really negative. This is a known problem.
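The log-error measure mentioned above is the one Cleveland and McGill used, and which Heer and Bostock's replication kept: the absolute difference between the judged and true percentages on a log-2 scale, with 1/8 added to avoid taking the log of zero.

```python
import math

# Cleveland & McGill's log-error measure for graphical perception tasks:
# error = log2(|judged_percent - true_percent| + 1/8)
def log_error(judged_percent, true_percent):
    return math.log2(abs(judged_percent - true_percent) + 1/8)

print(log_error(50, 50))  # perfect judgment -> log2(1/8) = -3.0
print(log_error(60, 50))  # off by 10 points -> about 3.34
```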
So we ended up with just positive and negative in this case. For the full study, we tested eight chart types, inspired by previous research, with about a thousand people on Mechanical Turk, and we had ways of measuring whether participants were actually primed. We ended up with these results: log error by chart, with the positively primed in blue and the negatively primed in red. They're pretty close overall, but some show a seemingly strong separation, and one in particular is this. Just to explain what's going on: V1, at the top, is when the bars being compared are always next to each other; V2 is when there is at least one bar separating them. It looks like there's an effect there, but I wasn't sure. Once you come across a new hypothesis, you have to validate it. So I increased the task difficulty: an adjacent task with 20 bars, and a non-adjacent task with a few more, to really tease out the effect if it was there. And I switched out the primes, too; I've tested it both ways, but this time I used pictures from the International Affective Picture System. This isn't one of them (you're not allowed to republish them), but it's reflective of some of the cute material you get, and I guess I'm too kind to show the negative material; you'd probably not be ready for lunch afterwards.

So what were the results from this experiment? We tested 450 people, with just a small number of conditions this time. The reaction times showed an interesting effect: with the more difficult task, we saw clear separation over time.
That's the top left, where the blue and purple were mostly together over time; with the easier task there was some separation. But what was really exciting to me was the error over trials, on the right: for the difficult task, you get this increasing separation between the groups, and it turned out to be strongly statistically significant. We found that people who were primed negatively performed significantly worse, at least on the difficult task.

Like I said, I'm not a psychologist, and I didn't know how to interpret this in terms of theory. It's good that in the academic community you can reach out to people, so we went to Steven Franconeri at Northwestern University, a cognitive psychologist, and he gave us some possible reasons for the effect we're seeing. One is that positive moods can expand the scope of the perceptual spotlight of attention. Another is that they can encourage the observer to process a larger spatial area of the world in a single glance. Conversely, negative or anxious moods can constrict this area. So it's interesting to see that emotion, something that's always present with us (we're always experiencing a flux in emotional state, whether from a hypothesis you've just had rejected or an email you just got), can actually affect the way we perceive and interact with data visualizations. Interaction would be a fun thing to test in the future. But needless to say, emotion does play an important role in visualization, and it does influence graphical perception accuracy.

We have about five minutes left, so I just want to leave you with a problem and a challenge: visualization in critical situations. Can you solve this problem?
Imagine you just got a positive test result. Can you solve this problem? It's a really difficult one, and one that people face daily: understanding conditional probabilities as they relate to screening tests. It's not just breast cancer; it can be prostate cancer, lung cancer, neonatal screening. We're currently collaborating with Tufts Medical School to explore this space and its implications. It turns out that a lack of understanding of conditional probability is one of the causes of overtreatment: if you don't really know where you stand, you're more likely to take the most aggressive approach to treating your cancer, or your potential cancer. I can give you the answer, but I'm not sure that helps much, so take a second to read through it. I'm just kidding; the answer is at the end. The true percentage risk is about 8%. I'm not sure if any of you got that; if you did, congratulations, you're at the high end of the scale. People have been studying this since the '80s. Gigerenzer was studying it back then, and in visualization, at least the people I know thought that visualization would come in and save the day, right?
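The "about 8%" answer falls out of Bayes' theorem. The specific inputs below (1% prevalence, 80% sensitivity, 9.6% false-positive rate) are the classic figures from Gigerenzer's mammography problem, used here as an assumption since the talk only gives the final answer:

```python
# Bayes' theorem sketch for the screening-test problem.
def posterior(prevalence, sensitivity, false_positive_rate):
    """P(disease | positive test) via Bayes' theorem."""
    true_pos = sensitivity * prevalence            # P(+ and disease)
    false_pos = false_positive_rate * (1 - prevalence)  # P(+ and healthy)
    return true_pos / (true_pos + false_pos)

risk = posterior(prevalence=0.01, sensitivity=0.80, false_positive_rate=0.096)
print(round(risk * 100, 1))  # 7.8 -> "about 8%"
```

Most false readings of this problem ignore the low base rate: with 1% prevalence, the many false positives among healthy people swamp the true positives.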
"Oh, I know how to represent this: I'll use a Venn diagram, and maybe some icon arrays," that sort of thing. But we know from previous research that these have been largely unsuccessful; accuracy rates still hover between about 6 and 45 percent. So I want to leave you with this: what would it take to solve a problem like this, a problem with real social impact? I would offer a direction. I think our innate perceptual and cognitive abilities should become an integral part of the next generation of visualization systems. To illustrate, picture the human on one side and the system holding their data on the other. Today I've described just a few factors, perception and cognitive states, and we also know that cognitive traits, things like personality, modulate the effectiveness of a visualization, that is, the connection between the human and the system. What I would like to see are systems that take these into account. How does that happen? I don't care. Well, I do care, but what I really want is to smooth out that connection, so the human connects with the data as well as possible, and I think this is one way we can do that.

With that, I'll give thanks to all my collaborators at VALT; they're amazing people, and they're presenting their research here. Caroline inspired my research on emotion, and Remco has been a great mentor to me. Check out everything we have. Eli is doing some amazing work in user-centered interaction that I didn't even get a chance to touch on today, so it's definitely worth checking out. Any questions?