 Hi everyone. Thanks for joining us today. We'll get started in just a second. I just want to say hello and let everyone know that this session is being recorded. There is also the option for closed captions. You can select that within the zoom features if you'd like to turn those on. We will also be sharing this slide deck and the recording after the winter institute wraps up because this is a session that's part of the winter institute so there are many events going on concurrently and I will make a little pitch for that and just share the link to the wiki. There are events the rest of this week so encourage you to find other ones to register for if you haven't already and yeah thank you again for joining us today. This workshop is around data visualization specifically geared for teaching and learning projects. We're going to be focusing on basically some best practices around visualizations that could apply to non-teaching and learning projects but because that's the capacity of the work that we do that's kind of where we're coming at this workshop from. So my name is Trish. I should introduce myself. I know some of you already. I'm an evaluation and research consultant with the Center for Teaching, Learning and Technology and I'm joined by my colleague Natasha who's a scholarship of teaching and learning facilitator. Throughout the session please feel free to pop questions in the chat. We will have some time hopefully we'll have time for a little bit of a break at the middle as well as some time at the end to do some deeper dives into your questions but do feel free to post your questions in the chat and then we'll see if it's something that a topic we're going to cover. We can take a break to address those concerns as they come up and with that I will pass it over to my colleague Natasha. Great so hi everyone welcome to the session today. 
Just before we get started, I want to do a land acknowledgement. I'd like to acknowledge and honor the existence of the first people of the land where I'm sharing with you today by acknowledging that I'm currently on the land of the Musqueam, in this place that we know today as UBC. I would encourage you to take a moment to reflect on where you're joining us from, and to consider and give respect to the lands that you are situated on. I appreciate this place so much because it provides me with opportunities to learn, to work, and to play, and I would like to state that I'm committed to learning more, beyond just who the land belongs to, about the processes of decolonizing and indigenizing our work and play here. I'm going to put a link into the chat: if you don't already know the lands that you are currently situated on, this is an opportunity for you to discover that. So we'll get started today with a quick outline of our workshop, starting with a little comic from XKCD that I really enjoy. Here is a brief overview of our presentation today. We're going to go over the basic principles of data visualization. We've got a quick introductory section. We'll talk a little bit about the purpose of your visualization: what are you hoping these visualizations will accomplish for you, and what is the context that you're coming from? We'll talk about basic dos and don'ts around data visualization, and that will be the bulk of our session. We'll talk about a few different types: bar and column charts, line plots and scatter plots, some less common visualizations, and a little bit about how to visualize qualitative data as well. That is something that we've heard from past participants that they'd like to know more about.
So we've added a little bit of a section on that, and then we'll also discuss best practices overall for visualizing data and accessibility considerations that you should keep in mind when you're planning data visualizations. Then we'll end with time for questions and a list of potential resources that might be useful for you moving forward. As Trish mentioned, we will share the slides afterwards. There will be references and links on some of the slides, and you can refer to those for more detailed and more advanced information on each of those topics. So just as we get started, I'm going to put a link into the chat for us here. We would love to know what brought you to today's session. This is a link to a Jamboard. On the left side of the Jamboard you'll notice these little sticky notes; you can add a sticky note and put your response in there. Great. So we've got a few responses. I'll read them out as they come in. A few folks are mentioning looking for new ideas, looking to make more interesting slides, looking to improve my teaching and my materials for learning, looking for better ways to show data, looking for how to represent qualitative data. So great, I'm glad we've got that additional piece in there. Hopefully that will be a good foundational start for that. How to integrate images in a helpful way in reports and teaching. Teaching data viz, looking for some inspiration and new tips. If you have any suggestions for us, we'd also love it if you wanted to throw more things into the chat. We're all here to learn from each other and we're always happy to hear from you. Building on a foundation of knowledge regarding data visualizations, learning about accessibility. A couple of people mentioned qualitative data, looking for more catchy ways to share information. Yeah, quite a few different bits and pieces.
So hopefully we'll be able to get you started on that journey. Yes, I can throw the link right in there for you. Hopefully we'll be able to kind of get started on that journey today and then send you off with some resources that will help you if there are particular areas that you already have a bit of a foundation in and want some kind of more advanced knowledge in or things that you want to pursue further. Great. So before we get started with principles and guidelines, we would encourage you to think about why. So think about why you're trying to visualize this data. What is the story that you're trying to tell? So remember that as the person creating the figure, you are the one doing the interpretation of data. So the first thing we would encourage you to think about is who is your audience? Is the audience a novice? Is the audience an expert in the field and expert in the type of data that you are presenting? Do they have a background in this area? Do they know what you're talking about? Will they know the story that you're trying to tell? And if not, how can you simplify so that users of all abilities or users who may not be super familiar with your topic or the type of visualization will still be able to pull out the information that they need? So these considerations are kind of true more broadly and particularly important questions to be asking when you're focused on making sure that your data visualizations are inclusive and accessible. And then the second thing we would encourage you to think about, big picture question is what is your audience trying to learn from this image? So how should the reader benefit from this visualization? Consider how your graph will help the user understand insights from the data and consider what the story is that you're trying to tell. So are you trying to learn more about the data? Are you doing more of an exploratory thing or is the goal explanatory to really explain and tell a story? Great. 
So we've already got some things happening in the chat about cool recommendations for visualization accessibility. Thanks for sharing that, Nadia. We will get into that later as well. We're happy to have more resources at our fingertips and to add those to our list. So, moving on to broad principles of data viz. I really like this quote by Edward Tufte, who is an American statistician. He says graphical excellence gives the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space. I just really like this nice, concise little sound bite. The question really is, can you understand it in five seconds? If the visual that you're using adds more complexity than the text that you already have, think about whether you actually need it and whether it's contributing anything. The goal of visualization is really to reduce complexity, to make it easier and more accessible for someone to look at something and grab the information that they need as quickly as possible. With that in mind, we have the basic principles of good data visualization. First, a good visualization is accurate. It represents quantities accurately, or it represents the data accurately if you're looking at more qualitative data. Second, it clearly indicates how the different pieces relate to one another. What is the relationship between the variables, between the categories? Are they continuous? Are they discrete? Third, it makes obvious how people should be using the information. Your graphical representation should highlight what you want people to pull out from it, very quickly and very easily. Depending on what you're trying to do, there might be a couple of additional things you want to consider.
So you might want to consider making it easy for people to compare different quantities. And you may want to make it easy for people to see a ranking or a ranked order. So those are kind of additional optional pieces, but these are kind of the three big principles of data visualization. So we've got a little bit of a participatory component. The question we've got here is looking at these four graphs quickly. Which of these graphs do you think has the highest mean? So we're going to open a zoom poll. And if you just want to throw in your response, when we talk about being able to pull out data quickly and easily. I'll give us about 10 more seconds. Trish, do you mind sharing the results with us? So we've got quite a spread of responses, which is what we usually get. So some people say one or the other. About a third of the respondents say more than one of the above. I won't spend too much time on it. I'd be curious to know those of you who said more than one of the above. Which ones do you think? But this is a kind of an interesting illustration of a couple of different things, of kind of easy being able to pull out information and also of the importance of visualizing data for yourself before you start visualizing data for others. So this is actually called Anscombe's Quartet. And all of these, all four of these graphs have the same mean, the same variance, and the same correlation value. So they have a correlation of about 0.8. And as I mentioned, the goal here is really to highlight kind of the importance of visualizing data for yourself before you are moving on to providing visualizations for others. So the goal here is to look at, are there outliers or other anomalies in your data? 
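To make the Anscombe's Quartet point concrete, here is a minimal Python sketch that computes the summary statistics for the four datasets. The numbers are the standard published quartet values; the names `QUARTET`, `pearson_r`, and `summarize` are made up for illustration and are not from the workshop slides, and the sketch does not assume which dataset corresponds to which panel on the slide.

```python
from statistics import mean, variance

# Standard Anscombe's Quartet values. Datasets I-III share the same x values.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sum((x - mx) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


def summarize(xs, ys):
    """Summary statistics that look nearly identical across the quartet."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "var_y": variance(ys),  # sample variance
        "r": pearson_r(xs, ys),
    }


SUMMARY = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

for name, stats in SUMMARY.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The printed rows come out almost the same for all four datasets, which is the point: the summary numbers alone cannot tell you which dataset has a curve, an outlier, or a single high-leverage point. Only plotting them does.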
So for example, looking at graphs C or D, being able to visualize the data here will give you an immediate clarification on whether there might be errors in processing or an outlier in your data, and help you consider how you might best present your data for your audience. That matters especially if you're going to use things like means or medians to showcase the data, because when the data isn't normally distributed, these can be misleading. That's where data visualization can really help you tell the story more clearly. The other benefit of visualizing for yourself before visualizing for others is that it can help you identify anything that might be missing, perhaps missing responses in a particular category, and generally help you better understand the data at large. So we would encourage you to think about data visualization as something that you do early on for yourself, before you make nicer versions to share more broadly. With that, we'll get into our most common figure types, and we're going to get started with bar and column charts, which are probably quite familiar to all of you. Just a brief overview of what they're useful for: bar and column charts are best when you're comparing different categories of data, and they're most useful for plotting means or medians, so things like grades, scale means, or Likert scale responses, and ranges of values as well. So once again, I'm going to ask for a little bit of participation. We're going to present you with a problematic bar chart, and I would encourage you to take a minute and type your answer into the chat window. I'll give everybody about 90 seconds to get their answer in there, and then we'll press enter at the same time, so that everyone can go through and read all of the responses simultaneously.
So we'll show you the chart. Take a couple of seconds, and don't press enter quite yet; I'll let you know once we're ready for you to press enter. I'll give us about another 15 seconds. All right. When you're ready, please go ahead and press enter, and then we'll go through and look at all of the responses. Great. Lots of responses. Thank you. All right. So we've got: percentages don't add up to 100 in each year, so there might be some issues with the accurate representation of the data on this bar chart. Too many categories, not enough color division, not enough space. Hard to make direct comparisons between different years. There's too much data. A few people have mentioned too many categories. It's impossible to make comparisons. Bars are too small. The color categories are distracting and hard to figure out. A lack of clarity on what we're supposed to be pulling out from this. Too much overwhelming information, bad choices of color. Great. So, so many problems. Yeah, it's hard to understand. It's hard to pull out information. It's unclear. Great. So I just want to add that this was a figure that I made for a project I was working on, and this was basically the default that was offered to me in Excel. Taking all the data that I had, I was like, let's just throw everything in there and see what it looks like. So I want to make another case that it's fine to plot the data for yourself at first, just to see what it all looks like and what is happening. But I also want to make a very strong case: please don't just use the default options that Excel gives you. Excel is a very powerful and wonderful tool. You can do many, many things with it, but just using the default is not the best choice. I'm sure we've all used or seen graphs similar to this at conferences or at various points in our work.
Our goal here is to figure out how to break down a chart like this into something that's more readable, legible, and helpful. So we'll go through a few points on how to make your data cleaner for a bar and column chart. The first point is a pretty simple one, but one that has a lot of impact: rank your items in a helpful way. If you have a small number of items, we would encourage you to rank by the count or the percentage. We would typically encourage you to use the percentage, because it gives a more representative picture. If you have a large number of items that you really need to get across, consider ranking alphabetically and then highlighting key points. We'll talk about a few ways to highlight in the next little bit, such as using grayscale versus color to pull out the things that you need. This version uses a subset of the data from the previous slide to tell a specific story. Here it's much more streamlined, it's much clearer, and it's much easier to pull out the information that you're looking for. So, moving on to highlighting: use perceptual features to your advantage to highlight what is important for the viewer to focus on. In some cases you don't want to remove all of the rest of the data, but you do want to show the particular items that the reader is meant to focus on, and you can do a few things to make that data pop a little bit more. Here all of the bars are measuring the same dependent variable, so there's no need for these different colors; they only serve to distract the reader. By using a labeled axis, you remove that distracting legend on the side. You're removing a visual feature that doesn't add any information, and then color is used only to highlight where you want the attention to be drawn.
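For anyone who prepares chart data in code rather than in Excel, the ranking and highlighting steps just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rank_and_highlight`, the category labels, the counts, and the two hex colors are all hypothetical, and the result can be handed to whatever plotting tool you use.

```python
def rank_and_highlight(counts, highlight, accent="#1f77b4", muted="#bdbdbd"):
    """Convert counts to percentages, rank them, and pick one color per bar.

    counts:    dict mapping a category label to a raw count
    highlight: set of labels the reader should focus on
    Returns a list of (label, percent, color) tuples, largest percent first.
    """
    total = sum(counts.values())
    rows = []
    for label, count in counts.items():
        percent = round(100 * count / total, 1)
        # Color is used only to draw attention; everything else stays gray.
        color = accent if label in highlight else muted
        rows.append((label, percent, color))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Hypothetical survey counts, for illustration only.
responses = {"Lectures": 12, "Labs": 30, "Tutorials": 18, "Office hours": 40}
bars = rank_and_highlight(responses, highlight={"Labs"})

for label, percent, color in bars:
    print(f"{label:<12} {percent:>5}%  {color}")
```

Because each bar carries its own label, the labels can go straight on the axis and no legend is needed.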
Labeling axes is also a best practice when it comes to accessibility, which we'll talk about a little bit more later. If you're thinking about, for example, someone who may have a color vision impairment, this legend is absolutely useless when you're looking at these different bars. So labeling the axis is always a best practice that we would recommend. Thinking a little bit about stacked plots, which we've often seen used, particularly in teaching and learning contexts: think about how you might best split that data to compare multiple values within each category. If you're trying to look at multiple different factors, stacked plots don't really allow for easy comparisons across groups. If you look at this one on its own, it's quite challenging to separate each of those different pieces, whereas if you break it up, it allows for much easier comparison across all of those categories. So breaking up that stacked plot gives you a really clear picture. It's much easier to pull out, particularly, those dark blue, turquoise, and pink bars, because from the stacked plot it's really difficult to pull out that information. Another minor point: some of these are bigger picture, and some of these are really small practical tips and tricks, but they do make a difference. Think about using a bar chart instead of a column chart if you have labels that are long and need to be there. We would always encourage you to consider how to shorten your labels as much as possible, but if you need to keep long labels, we would encourage you to rotate the graph so that the labels run along the y-axis and the numerical scale runs along the x-axis, so turning a column chart into a bar chart. So, thinking in a little more detail about Likert scale or Likert-type scale data: this is often something that comes up with teaching and learning projects in particular.
We are often surveying our students or our TAs and asking them for their feedback on different bits and pieces. So, a couple of tips and tricks for Likert data in particular. We would encourage you to always be mindful of sorting positive or strongly agree responses from largest to smallest or smallest to largest, and including a legend at the top when appropriate. So instead of doing what we have on the left here, where again things aren't really in order, and this is pretty similar to our principle of ranking order, it's much easier to pull out at a glance what the students said the best things and the worst things were when things are ordered in a logical and helpful way. Showing that distribution of data provides information on all of those responses, where providing a mean alone on a Likert scale may not provide enough information, and this allows you to pull out that information quickly and easily. The other thing that we would encourage you to do, if you do need to see all of those different components, is to use 100% stacked plots on an equal spread. In the picture on the left here, different numbers of students have responded to different questions. Instead of using numbers, we would encourage you to use percentages and to make sure that the data is on an equal spread, so that you can really compare what is happening across those different items. We would also encourage you to be mindful, if you do this, that if group sizes are really unequal across those categories, people might misinterpret the data and see the groups as equal. So be mindful of when it's appropriate to use that stacked plot versus not, and of the caveats that you might need to provide if you do use a visualization like that.
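As a rough illustration of the Likert advice above, here is a small Python sketch that turns raw response counts into the percentages a 100% stacked plot needs, sorts the items by their share of "Strongly agree", and keeps each item's number of respondents so unequal group sizes stay visible. The item names, the counts, and the function name `to_percent_rows` are hypothetical, not data from the workshop.

```python
LEVELS = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]


def to_percent_rows(counts_by_item):
    """Convert per-item Likert counts into percentage rows for a 100% stacked plot.

    counts_by_item maps an item label to a list of counts ordered like LEVELS.
    Rows are sorted so the item with the largest 'Strongly agree' share is first.
    """
    rows = []
    for item, counts in counts_by_item.items():
        n = sum(counts)
        shares = [100 * count / n for count in counts]
        # Keep n with the row: it is the caveat to show when groups are unequal.
        rows.append({"item": item, "n": n, "shares": shares})
    rows.sort(key=lambda row: row["shares"][-1], reverse=True)
    return rows


# Hypothetical counts; each item was answered by a different number of students.
survey = {
    "Helpful feedback": [1, 4, 5, 6, 4],
    "Clear instructions": [2, 3, 5, 10, 20],
    "Fair workload": [4, 6, 10, 15, 15],
}
rows = to_percent_rows(survey)

for row in rows:
    shares = ", ".join(f"{share:.0f}%" for share in row["shares"])
    print(f"{row['item']} (n={row['n']}): {shares}")
```

Putting `n` in each row label is one simple way to provide that caveat when the groups differ in size.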
So this tells us, for example, that 50% of the respondents felt this way, given the sample size, which may or may not be equal across the groups. So, just thinking of alternatives: when a bar chart is too messy, there are a few potential alternatives that you might consider. I also want to highlight that all of these visualizations were created in Excel. These are meant to be simple and straightforward. We're not trying to teach you fancy versions using software that you may not be familiar with. All of these are fairly straightforward to do in Excel, including splitting the data, doing the stacked plots, and doing what's on this slide as well. So consider a slope chart, particularly if you're looking at changes over time, and this will lead us into talking a little bit about line charts as well. You can do a slope chart as one large chart, or you can quite easily split it so that each of the different lines is portrayed side by side, which might be helpful if you're trying to pull out particular trends and differences in the shape of the distribution across those categories. We would also encourage you to consider that sometimes a lollipop chart might help you compare data more easily. When you have two categories of data that you're comparing, it's a little bit harder to see on a bar chart. Here you can see the distribution of the orange bars versus the blue bars, and you can tell that sometimes the orange bars are higher, but you need to do quite a bit more work to see that. A lollipop chart might be more helpful: it's really easy to pull out where the blue ones are higher and where the orange ones are higher, if you're trying to compare between two groups. So we'll take a pause here for a minute. Do we have any questions, anything we'd
like to chat more about? Please feel free to unmute yourself, or if you'd prefer, put it in the chat and I'm happy to read the question out loud so that we have it available to us. Awesome. Maybe we'll just take a quick little break if anyone needs one. Our session is supposed to be 90 minutes. I suspect we'll run a little bit short, but we always have questions at the end, so even though the slide deck might end early, I suspect we might be around a little longer. My clock says 1:57, so let's take a three-minute break if you need to grab another coffee or take a quick bathroom break, and then we'll continue from there. So we've got another good question: do you recommend using Excel to generate graphs, or are there other tools you'd recommend for beginners? I can just speak to Excel. When I started using Excel, I always thought of it as what you use if you don't really have other options available to you. But as I mentioned, there is quite a bit you can do in Excel. As we were creating this slideshow and reading a couple of the resources that we can recommend to you at the end, there is a fantastic book by Stephanie Evergreen, and she's got a great website as well, which we can put into the chat, on how to use Excel to make more complex things than you thought Excel was capable of, in a fairly straightforward way. And then I'll let Trish speak to other options as well. So I'm just posting, someone asked about the slides, and they'll end up in the... oh, there we go, he's got that there. Oh, awesome, thank you, Duncan. I'm appreciating everyone being on top of it for us. So Duncan just shared a link to Stephanie's book, which is fantastic. I read it about a year ago and was so inspired. It was one of the motivations for putting together this workshop: there were just so many great tips and tricks, and
everything in her book uses Excel as the platform. She has a great model for, like, is this better for beginners or do you have to be an Excel ninja to do this? So it's a nice inspiration to get a sense of: I like the way this figure looks, but is it going to be too hard for me to make? So that's just my extra pitch for that book. In terms of other tools, I have used R, in RStudio, for making visualizations. I don't really recommend it for beginners; you do need to have a little bit of a programming or coding, not necessarily background, but mindset. R can be awesome, and it can make really beautiful visualizations and things that are quite complex. It's also nice if you're making repeated visualizations, because it's easy to reuse the template you have. But I wouldn't recommend it for beginners. I do think that Excel has amazing options. I know I made that case about that hideous figure that we used as the example. Again, that's really just that if you go with the default that Excel gives you, you'll end up with a pretty decent figure, but you really want to go back to basically everything Natasha has brought up to date in terms of making your figure more accessible and easier to read, and really that key point: what is the data that you're trying to showcase, who is your audience, and what do they need to know from the data that's there? So yeah, I think Excel is a great place to start. And then if you have that background and you're comfortable in R, I think R can do some amazing things. There are a lot of resources online as well for making different types of figures. It's an open source tool, so the community is fantastic for providing tips and tricks. So if you find yourself going down a path of becoming more and more in love with data visualizations, then I think that could be a place for you to consider for your next
venture into the data viz world. Let's see if there are any other questions. Thank you, Ainsley, for posting that wiki link. That's where the slides will be available shortly after the workshop; once our amazing organizers have a chance to post them, they'll be available there, as well as slides for any other Winter Institute session. I'm not seeing any other questions, in which case I will continue on. Okay, so the next piece I want to talk about, moving away from bar plots, is line and scatter plots, which are another common visualization. These are really great for things like temporal data. In the context of teaching and learning, that means looking at things like changes over a term or changes over years: how motivation shifts over multiple years in a series of courses, or something like pre/post data, so how student knowledge on a specific topic changes from the beginning of the term to the end of the term. They're also good for continuous data, such as student grades. So again, we'll look at some best practices and ways to make your data clearer. The first one is reducing clutter. As we saw with bar plots, just because you have access to all of the data doesn't necessarily mean that showing all of it, or the default, is the best way to pull it out. You can do something like highlighting meaningful data, and you can see that's already been done a little bit in this chart. What could have happened is that you might have shown all of those lines. The ones that are grayed out are other countries that are being displayed, but they're not being highlighted, so here only five countries are highlighted, to pull out what that data is and how it looks across the years. A way to make that even clearer and easier to visualize is by moving the axis over, or sorry, moving the legend over. Natasha mentioned this in one of the
plots earlier: having the labels next to the lines themselves. If anyone has a color vision impairment, it makes it easier for them to see which lines match up with which labels. You don't have to jump from the top of the plot to the bottom to say, okay, which one is the purple one, where is Nepal, where is Germany, where does that match up? So it makes it a little bit easier to process. And then I have another example here. This is basically the same type of principle, but looking at a scatter plot, so again, the way you can use color. In that last example I showcased how some of the items were grayed out so that the countries we wanted to focus on would be highlighted. You can see that here all five areas are on display, but what you might want to do is pull out a subset of the data. By doing something like changing the title, as between these two figures, you can pull out the emphasis that you want to make. Again, this really depends on what you're trying to do with the data, what the message behind the data is, and what you're trying to tell. If you're trying to tell a story of what the rate is in every region, then of course the figure on the left makes more sense. But if you're trying to do something like highlight European countries, then you can do that by having only some of the countries highlighted. It still shows the overall distribution of the data, and that there's a positive relationship there, but you're really focusing on one subset, one population. Something I want to highlight and have people be mindful of with scatter plots like this one is that it can be easy to jump to the conclusion that there's a correlation in the data. Looking at something like GDP and immigration rate, you might say, oh, there's a positive correlation between these two factors. I
don't know this data set; it's an example that we pulled from online. So it's possible that there is a significant correlation, but it's also possible that there's not. You just want to be careful, because it's easy for people to jump to the conclusion that there's a significant correlation in the data. So be mindful when you're presenting your data: if you're going to do any additional statistical analyses, make that clear and include it in your presentation, and if you're not going to do those additional analyses, let people know that it looks like there's a trend, a positive relationship, but a statistical analysis wasn't done to confirm that that's actually the relationship that's there. You can certainly talk about how it looks like there's a trend in the data, but if you don't have that statistical analysis, just be mindful that people might jump to the conclusion that there's something happening that might not be. Okay, so we're going to move on to another poll. Pie charts aren't very common... I shouldn't say they aren't; I should say they are very common, but we wish that they were less common. I'll make a case for why that is. I'm just going to put on another poll here and ask you again, looking at this figure just quickly (I'll give 30 seconds for this so that people aren't thinking too deeply about it): which area do you think is the largest? Well, looking at the current results, a lot of people, 88 percent, are indicating that area one is the largest, and a few people are indicating area five. So you are correct, area one is the largest, but it is kind of hard to tell. You can't really tell, okay, maybe five is the next largest, but I don't know what the difference between two, three, and four is. It's quite difficult to
tell. This is a bar plot of the same data, and it just really highlights that it's hard to compare the slices in the pie chart. Even though many of you got it right, just seeing this visual on its own you don't get a sense of how much of the pie it is. Is it about a quarter? Is it more than a quarter? You're not getting that type of information. I'm not enamored with the chart that's on the right; it's not what my preferred bar chart would look like, and we'll give you some examples in a minute, but just to give you a sense: these are all different values, and presenting a pie chart like this you really don't get a sense of what the data is. There are many other problems with the figure, but it's just a case against pie charts: they really make it hard for people to distinguish the proportion of data within each area. So we're going to talk a little bit more about pie charts and another visualization that should be less common, which is bubble plots. Because they have the same features, they make it really hard to compare areas, and they're rarely the best option for presenting data. Similar to the last one, with this bubble plot at least some of the items do have the amounts within them, a value, so that you can get a sense more easily. But looking at some of the smaller ones, it's hard to know what that value is and how you can compare it. In this example, with Indonesia at 15.5 and India at 208, would 15 really fit into that India bubble that many times? Is it actually a proportional representation? That may well be the case, but it's really difficult to use them to compare, especially comparing between figures. So in this example here, where you're trying to compare Indonesia and Tanzania
for example, they look quite similar. It's great that they have those values included in this example, but it can be quite difficult to say. I'm just going to finish this blurb and then I'll get to your question briefly. Okay, so another way that you could get around this is using something like a donut chart. A donut chart is a fantastic alternative to the pie chart, which gives you extra space. You can see that the center here is hollowed out, which is why it's a donut and not a pie. Again, this isn't necessarily the most ideal format for presenting your data, but if you want a snapshot visualization, a donut chart can be useful if you have a small number of categories and the differences are quite distinct. What you'll notice here is that perhaps the research stream and contract stream faculty look quite close, but we've included the percentages and the label of each item just outside, and we've gotten rid of the legend, which could make it confusing and difficult for people to match up. With the percentages included you can get a sense very quickly of what this little slice represents, so it makes it easier to compare those areas. This white space also just adds further clarity, whereas in a pie chart the actual area of the pie slice is not really representative of the number that's there. So let me go back to this question. I'm happy to take a first pass if you want. So it's a good question: do bubble plots have a place somewhere? Part of the challenge is in the way that they're generally used. If you do want to use them, something that we would recommend you consider is this: much of the time when we see bubble plots used widely, what we see is that people scale the diameter or the radius of the circle to the point value of
whatever it is. So if you're trying to compare something that's, I don't know, one versus two, using one versus two as the diameter or the radius will actually result in a much larger area, I think it's like four times the area, making the difference look much larger than is actually the case. So if you are trying to use a bubble plot, the goal should be to ensure that the total area of the circle corresponds to the value. It just becomes a more complex operation, and it tends not to have the same visual impact, because we're used to seeing these big differences, and the reason for those big differences is that it's misrepresenting the data by using the value as the diameter or the radius as opposed to the full area of the circle itself, if that makes any sense. So that's one thing to consider. I have tried using bubble plots in a couple of different situations, and it just looks a lot less compelling when you're using the area, and it's really hard for us to interpret the area of a circle as a comparative value as opposed to the diameter or radius, but then that becomes misrepresentative, if that makes sense. Yeah, please go ahead. Yeah, no, I don't think I was going to cover anything else. I think the one other thing is, this is just one example and we're using it to really highlight the flaws of this method, but looking at the bubble plot, it's not clear why certain ones have the information included and some of them don't. So if you wanted to really pull out something like the situation in India, where we're highlighting that it's much larger, what you might want to do is apply some of the tips, tricks, and best practices we suggested earlier: perhaps pulling out a subset of the data, highlighting a small number of countries, and changing your title to
reflect what it is that you want to focus on, or doing something like pulling out maybe just the top three and highlighting those, and then graying out the other circles so that it's less visually distracting. So certainly, if you really wanted to use a bubble plot or a pie chart, there are ways that you can make it better, absolutely, and I think some of the tips that we suggested earlier would be ways to consider that. Okay, thanks. I want to talk briefly about pictographs. Again, this isn't a visualization that we use too often, but we do see it, and infographics are around the university everywhere, so you'll have seen them in your day to day. They can be useful, but I just want to pull out some best practices. Sometimes you'll see something like little figures of a person to indicate how many people responded or how many people feel a certain way. Don't truncate people. There's no such thing as a half person, and it's very confusing to say one and a half people agree with this. That's a case where using a percentage, x percent of respondents or something like that, would make more sense. So always use the full image. Likewise, if you're using a similar example like this, don't distort the images just to demonstrate that there's double the amount, 100 people versus 200 people. The larger one is very confusing. It's one thing to use one image to indicate more than one person, so having this little figure represent 100 people, but just doubling its size goes back to the issues that we talked about with the pie charts and the bubbles, where it's really hard to interpret area. This figure on the right, to me, looks like it's at least three times larger than the guy on the left, so it's quite confusing. Instead, match the image and represent it the same way: you can have the single figure represent 100, but then
double it in terms of the actual icon itself. It's also helpful if you provide a numeric label for the number of items. In this example here on the left, there are 20 squares, but it also has a number. Maybe you could quickly work out that five columns and four rows is 20, but it's nice to just have that number there. It doesn't take up too much extra space, and it's a quick thing to add if, for some reason, this visual makes sense. Also, group the information in a way that's easy for people to tally. Like the example above, where that single icon represents 100 people, if you're using a single icon to represent something, make it a number that's easy for people to multiply out. Having one purple cube represent seven and then using six cubes, people might not be great at doing that math, so try to even out the numbers. And if it's a case where it's really only a small number of things or people that you want to represent, maybe it doesn't make sense to display it in this way. So those are just some examples to highlight, again really trying to push that ease of processing and helping people figure out what it is that they're looking at. Okay, so we're going to talk just a tiny bit about visualizing qualitative data. We don't have... oh, sorry, Kathy, I see you have a question, please. Going back to the slide before, because I'm sure we've all seen a half person on these sorts of things, what do you propose, or what are the best practices proposed instead? Yeah, so I would say, if this was on an infographic and you wanted to minimize the data, you might just use a subset of people. The data itself isn't really translating, so saying 56 of your respondents say that they would recommend that this course would be useful for someone
else to take, having 56 little icons... well, it's not 56 icons. It's hard to work backwards from an example. Say seven students... can you think of an example that I can translate this to? It's rare that you would have a situation where having an icon represent an actual person would really make the data meaningful. If you want to say the majority of students felt that this was a useful course, having a single image of a person is not really useful. If you want to say seven out of ten students felt this way, you might choose to do something like have seven students colored and then the last three grayed out. What I would suggest in that case is just rounding down instead of using a half person. So if it's 7.5 out of 10 students, because you don't want to say 75 out of 100 or whatever the actual response count was, I would round down, because it's not a full person agreeing with it, it's a half person agreeing with it, which is just a weird visual to see. And it likely doesn't give you anything extra to say eight and a half people really liked this. It might be a case where just having the percentage listed works better: 85% of students indicated that this course was useful for the learning of x, y, and z topic. Does that help? Yeah, some good points, thanks. Yeah, I can't think of a case; I rarely use the icons, and often it's meant to be more illustrative, in the case of seven out of 10, where you would gray out the last few. But again, you want to use it where you're representing a small enough sample, because say you tallied 42 students in your class, you're not going to have 42 little people and then gray out seven of them or whatnot. It's less meaningful. Yeah, I guess if there's a double personality then that could be it. I think that's a whole
other workshop. Okay, so again, we're not going to spend too much time on the qualitative data side of things. Unfortunately, the background that Natasha and I come from is more with quantitative data, and that's where we often see a lot of these issues and concerns come up, where people are struggling to display their data in the best way. But we do have a few suggestions around visualizing qualitative data. Word clouds and word trees are something that I'm sure people have seen and are familiar with. They can be helpful to give you a snapshot of the themes that emerge. This might be the case where perhaps you had students complete an open-ended question asking what was the most helpful piece of the course, and what you can do is pull the responses into a word cloud. There are lots of websites that will do this for free for you, and they'll basically highlight what the most used words were, and those appear larger. In this example it looks like writing, worksheet, and discussion were the top ones, and things like expectations and guidance were less common. The issue with a word cloud or word tree is that it can be difficult to compare frequencies. We don't really know whether worksheet was happening 20 times more than writing was, or 20 times fewer. Even though you're getting that these were the top responses, you don't really know how frequently they were happening. Also, when you plug your data into one of these word cloud generators, they often just put the data together for you, so you can't really play around as much with things like size or shape. So they can be difficult to process because of the orientation, or because the colors are difficult to see. What we would suggest instead is that you might consider grouping the information together, so break the word cloud down into a
series of smaller word clouds. This is sort of like the start of a thematic analysis, looking at which items go together and what types of comments people had around that. You can see in the lower left corner here that feedback was a really common theme, and that related to responses about guidance or guided or expectations or comments. So it gives you a sense of where people's thoughts were sitting and the type of information that they were trying to convey in their responses. Breaking it down into a series of smaller word clouds gives you a greater sense of the frequency of those responses and the different types of themes that were emerging. Just another example here: this is a word tree. This is basically the same kind of idea as what's appearing on the right in this last slide, where it's taking the qualitative data that was provided and breaking it up in a way that might make it easier for people to understand what was really happening. What this does is create themes, so you can see influence, correct, informed, scholarship, writing, freedom, writing portfolio, outreach, popularized, for myself, add value. These are the overall themes from the responses. This is a tree because the center piece is the root, and then there are branches coming out of it, and then responses, and often these are direct quotes that relate to that theme. So under the example of influence, some of the responses that might have come from that were helping people see the importance of science, promoting trust in science, and advocating for an issue or cause. It allows you to organize the data in a way that tells you what the main themes are and gives some examples of what each theme means. So what does it mean that outreach was a word that people were using quite often, or that categorized the message that people were giving? It provides a little bit more context. Like
in the previous example with the word cloud, it doesn't give you a sense of the frequency or how often something is occurring, or let you compare across examples, but it gives you a little bit more context. I'm just seeing some comments in the chat to answer. Yeah, so Ainsley, it's a good question: how do you break down a large word cloud into themes? For the word cloud example from the last slide, the big one I kind of threw together for the presentation, but the broken-down ones are actually from a real project that I worked on with a faculty member, made as I was doing the thematic analysis and pulling out the different bits and pieces of what was coming up. The example here was a worksheet implemented in labs in a science class, and we were looking at what the big-picture themes of the students' comments were. We broke it down across those different themes and then created the word clouds, not to disseminate the information but just for ourselves, to get a sense of what kinds of things were coming up and to be able to break it down into those different themes. So there was feedback from the students about receiving feedback on their worksheets, there was a little bit about how the worksheets were integrated into class, and that kind of thing. We had just used this as a way to sort out our own thoughts and our own themes of what was coming up with that project, so I made a series of individual word clouds based on those thematic analysis pieces that I had. And then regarding which software, we've got a suggestion in there as well. There are also plenty of free online word cloud generators. Some are better than others, but there are quite a lot at your disposal.
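As a rough illustration of the frequency problem raised above (a word cloud shows which words are common but not how much more common), here is a minimal Python sketch that tallies word counts so they can be compared directly. The responses and the stopword list are made up for illustration and are not from the project described in the session; this is a sketch under those assumptions, not a substitute for a thematic analysis.

```python
import re
from collections import Counter

# Hypothetical open-ended survey responses (invented for illustration).
responses = [
    "The worksheet helped my writing",
    "Discussion and the worksheet were useful",
    "Writing feedback on the worksheet",
]

# A tiny, illustrative stopword list; a real project would need a fuller one.
STOPWORDS = {"the", "and", "my", "on", "were", "a", "of"}


def word_counts(texts):
    """Return a Counter of lowercase words across all texts, skipping stopwords."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts


counts = word_counts(responses)

# Print the most frequent words with their counts, largest first.
for word, n in counts.most_common(3):
    print(f"{word:<12}{n}")
```

A ranked table like this, or a simple bar chart built from it, makes the gap between the top words explicit, which is the piece a word cloud on its own leaves out.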
Yeah, I think the thing too with the word clouds and word trees is making a decision on whether, like Natasha said, you're going through and doing a thematic analysis first and organizing the data that way. In this example with the word tree, you want to be able to pull out those themes yourself. They're not necessarily coming from the responses directly; you can read the examples that are provided and say, oh, this is all relating to influence, so you're the one creating those subcategories. Basically, the software is not doing that analysis for you; it's just displaying the information. Okay, so the next piece I'm going to talk about, again looking at visualizing data, is using participant quotes. This is a nice way to really provide context and information about what we've learned. You can use icons, so here two little figures of people, to give readers something more catchy than just a quote, and this is something you might see on an infographic. Instead of just having the text in quotes, you might have a little person and a little speech bubble to indicate that it's someone's voice that's sharing it. It really hits home the fact that it's not just that experts and students said this, but this is actually the feedback they provided about it. One thing to make sure of is that you have student permission to share quotes. In any data that you're collecting, make sure that you let students know how that information will be used. That's a whole separate workshop around gathering consent, but regardless of the context, just make sure that you're not using a student's voice without their permission to do so. And you can think about how different icons can help contextualize or strengthen the visualization. Maybe you have a picture of a smiley face and a sad face, and
the smiley face is highlighting what was really appreciated in the context of the course, or the worksheets in this example, and then the sad face showcases something that could be improved, or feedback that was gathered that shows how things can change. The next one is looking at a mixed data presentation. I like using this method; again, it mimics that infographic style in that you're getting a snapshot of a bunch of different information, but you're also able to combine both the qualitative and the quantitative pieces. This is an example about a new peer feedback system. The statement that students were asked about, probably on a survey, was "I'd like to see this system used for giving feedback on other assignments in this course as well," and it's highlighting that 60 percent of the students strongly or somewhat agreed, 10 percent were neutral, and 30 percent strongly or somewhat disagreed. What you see below that, mimicking the colors and alignment, is positive feedback and negative feedback. On the right we have some direct student quotes about what they liked and why they agreed with that statement... sorry, I think I said on the right; on the left, in the bluey-green color. And then on the right, in the orange, we have student responses around why they disagreed with that statement. Getting feedback on both the positives and the negatives is really important, including that holistic view. So it's not just that 60 percent of students said that they liked it, but why did they like it, and for students who didn't, why didn't they like it? That's really important to provide. In the teaching and learning projects we support, it's rarely all gold stars and rainbows. There's usually feedback that comes with any project, and it's important to highlight those pieces and really showcase what isn't
working as well as what is working. Okay, so I'm going to talk about a few other best practices, and these are ones that could apply regardless of the type of figure or image you're using, so they could apply to things like bar plots or line plots as well. When looking at the y-axis, you want to make sure that you use a scale that doesn't over- or under-exaggerate the trends or differences. In this example here, the same data is plotted in multiple ways, looking at the grade increase over time, and each of the y-axes is different. In the first one they increment by 25 points, in the second one by five points, and in the third one by 10 points, and you can see that drastically changes the way that the figure appears, and therefore would likely influence the way someone interprets the data. If I had only shown you that first figure on the left, you might say, oh, it doesn't look like there was much of a change happening, whereas if you just looked at the figure in the middle, you'd say, oh, there's a huge change happening. So just be mindful of how the scale that you're using can impact the way that the data is perceived and understood. We suggest here that the best choice is this one with the 10-point increment. Again, this could be something where you have a figure but then you also highlight that over the course of five years we saw a 15-point grade increase, adding some context to the information. That could come in the form of a title or a subfigure label, so that the data is not only visualized but also explained, along with the meaning behind it. The other thing with this is, these are three examples of the same data set, but if you were presenting different data side by side, make sure that you use the same scale for each, so that
in one example the differences aren't more or less exaggerated. So just be sure that you're mindful of that. The last best practice I want to talk about is conveying uncertainty. Data often contains uncertainty, and this is often unintentional. You have uncertainty, but it's not that you're making up the data or that the data isn't accurate; it's that there's just not certainty around how things look. In the context of a teaching and learning project, perhaps you're surveying your students and getting feedback from a number of them. You want to be clear and share what percentage of students completed the survey or task. Again, if you're saying nine out of 10 students said that they loved this course, but only 10 students responded out of a class of 250, that likely doesn't represent the entire class's view. So just be mindful of what data you have and how you're sharing it: what proportion of students completed the task or the survey, or whether you gathered feedback from all of those who could have responded. If possible, and when you're comfortable or feel confident in doing so, including error bars or confidence intervals helps to showcase variability. So when you have the mean of your data, this shows what the distribution of responses around the mean is, and I'll show an example of this in just a second. And then also, when possible and when it makes sense, display the entire distribution, which allows people to have context. Natasha brought up this point earlier: it's always a good idea to plot the data for yourself to get a sense of the entire distribution. This will give you a sense of whether your responses are really skewed, or you're missing a lot of data, or there are some outliers in your data. It's a great idea to just plot all of your data to see how it looks before you do things like
calculate the means and plot those against each other. So I'm just going to give two examples here. The example on the left is an example of that third point, displaying the entire distribution. What's happening in this figure is that the lines showcase where the mean is, and the individual dots showcase individual responses, so you get a good sense of the skew or the spread of the data in addition to the mean. Looking at this first one, fire, you can see that there's a response that's up at 140k even though the mean sits at around 78. When you compare that to the line on the bottom, arts and entertainment, you can see the spread there is quite a bit shorter, so there's a wider range of responses in the first. This can be really informative, again depending on the context and what it is that you're trying to showcase: what you're trying to share, who your audience is, and your comfort with plotting this type of data. The example on the right is about median income, and it showcases error bars, so there's a confidence interval around the values. For each year, the line itself showcases the mean, or median, sorry, in this example the median income year over year, but it also shows a confidence interval around that, so it gives you a sense of the spread of that data. So even though in the first example the median was about 45,000, there is some distribution around that, and when you get to 2012 the range of responses is much greater even though the median is about 55,000. Next I'm going to move on to some best practices for color, thinking about issues that people have with color, color blindness or just difficulty with color. These are considerations that I would really recommend you keep in mind even if you don't
have this issue yourself. It's important for disability and inclusion that people can understand and perceive your figures. This is an example using mixed colors for an ordinal scale. Students responded to each of these items along the y-axis on a strongly agree to strongly disagree Likert scale, and you can see each of the items has a different color. Instead, what I would recommend for an item like this is using divergent colors, so strongly agree appears in the darker blue and strongly disagree appears in the darker red, and you can see that as responses move away from neutral, they shift in color. It gives a clear sense that the red tones are the more negative ones and the blue tones are the more positive ones. Yeah, the one on the left is also very eye-irritating. With that mismatch of colors, it's a lot to bounce back and forth. You can't just say, okay, blue is positive or negative; you kind of have to say, this darker blue is negative and this other darker blue is positive, so it's quite draining to look at. When you're using continuous data, it can be useful to use sequential colors, going from lighter to darker or darker to lighter, to show that increase in values, and to use a mix of colors only when you're using qualitative or categorical data. In some of the examples earlier, we had different courses, and those courses would appear in different colors. Depending on the scale that you're using, it might make sense to group a subset of courses. Perhaps you're looking at a comparison between arts and science courses, so arts courses would maybe appear in blue and science courses in green, so that it creates a visual comparison more easily even though there's a subset of courses within each. So looking at categories and using colors to sort of
create groupings. Lastly, the previous slide talked about how to use color effectively, but the extra piece is also thinking about color blindness or color sensitivities. I'm not sure if anyone in this group has any color sensitivities, or how these display for you visually, but the original is, I don't know what to call it, the raw color set, and the following ones are for different types of color blindness, where people have reduced processing for red contrast, green contrast, or blue, and how those colors appear. So even though they might look very different for you, how they would appear for someone else might not be the same. The colors might look quite similar even when you think that they don't. Something that I do quite often: there's a feature, if you're in a web browser, where you can go into your accessibility features and actually simulate how it would look under each of these conditions, which is a great way to tell if the colors you're using are accessible. Or you can do something like print in grayscale, which would also give you a sense of how the figures look. So just some examples of dos and don'ts. You can see in this top figure here that using a dotted line and labeling the axis helps to clarify which pieces of information are tied to each other. We talked about this previously. In the don't, where the data label is included on the side as a legend, someone with color blindness or a color sensitivity might not be able to tell those apart. In the example at the bottom, there's a filter applied, and you can see that those colors look basically identical. So by including the dashed line as well as the labels pointing off of the lines themselves, it helps clarify exactly which one is pancakes and which one is lost. I also just included some information
here. Again, we'll share these slides with you. It's a helpful way to talk about the importance of considering this. There are also many tools online that can help you pick your color schemes. In the next slide I'll share some resources, and one of them is a website where you can select what type of data you're using, whether it's qualitative or quantitative, whether it's divergent or sequential, and whether you want to include features like color blindness or high contrast, and it'll showcase the hex codes for those different colors for you. It's a fantastic tool. I really love that website; it's easy to implement. I have a color subset that I go to regularly and use for all of my figures, so it's just a really handy tool. With that, I'll pull up the resources. Again, we'll share these slides and make sure that they're accessible on the wiki for everyone, so you can go into the links there, but these are just some of the resources for the content that we talked about in today's session. And with that, I see a question in the chat, so I'm just going to read that: for Likert scales, what's the guidance for which way the scale goes? I read that most positive should be on the right, and least positive or negative on the left, but I've seen it used in both directions. That's an interesting question. I don't know that there is scientific evidence for one or the other. I would recommend going the way you have it, so least positive or negative on the left to positive on the right. I think this just makes sense when you're coding the data later on, with a lower score representing disagreement and a higher score representing agreement, something positive. It's easier to process. I think that's more natural in terms of how folks read it as well. I think it's more intuitive, but I haven't
seen any hard-and-fast science saying definitely do it this way, so that would be my recommendation, generally, as well.
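To make the coding point in that last answer concrete, here is a small Python sketch of a Likert scale ordered from most negative to most positive, with lower scores for disagreement and higher scores for agreement. The five labels and the 1 to 5 scoring are assumptions chosen for illustration; the session does not specify the exact wording or number of response options.

```python
# Response options ordered left to right, from most negative to most positive.
# These labels are illustrative assumptions, not taken from the slides.
LIKERT_ORDER = [
    "Strongly disagree",
    "Somewhat disagree",
    "Neutral",
    "Somewhat agree",
    "Strongly agree",
]

# Lower score = disagreement, higher score = agreement (1 through 5).
LIKERT_SCORE = {label: i + 1 for i, label in enumerate(LIKERT_ORDER)}


def encode(responses):
    """Convert a list of Likert labels into their numeric scores."""
    return [LIKERT_SCORE[r] for r in responses]


sample = ["Strongly agree", "Neutral", "Somewhat disagree"]
scores = encode(sample)
print(scores)
```

Keeping one ordered list as the single source for both the plotting order and the numeric scores means the left-to-right direction of the chart and the direction of the coding cannot drift apart.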