As we continue our introduction to basic graphics in SPSS, we should look at scatter plots, a very common method of looking at associations or, as I like to think of it, a way of assessing togetherness in data. In other words, you want to see what goes with what or, more specifically, which variable goes with which other variable. So scatter plots are a great way of visualizing the association between two quantitative variables.
When you make a scatter plot, there are some things you should look for. First, check whether the association between the two variables is linear, because a lot of common procedures assume that you can draw a straight line through the data. Second, check the spread of the data, especially whether the spread changes as you go from left to right on the scatter plot. That's called heterogeneity of variance, and it can cause problems with certain procedures. Third, look for outliers, either univariate, meaning a score that's unusual on a single variable by itself, or, more important in this case, bivariate, where you have an unusual combination of scores. And finally, you want to try to get some idea of the correlation, or the strength of the association between the two variables. A scatter plot will allow you to do all of those.
Now in SPSS, there are three general kinds of scatter plots that you can do. Number one is a simple scatter. It's a bivariate X and Y chart, and it's easy to do. Number two is a matrix scatter plot, where you have several variables plotted simultaneously. It's a good way of looking at complex associations among collections of variables. And number three, SPSS is able to do a 3D scatter plot, but I'll have some words to say about that a little bit later. But let's try this and see how scatter plots work in SPSS, at least very basically. So just open up the syntax file, and we can see how it works.
When you open up the syntax file, we have the same situation as before, where you can load the data. We'll use demo.sav. You can use this command if you're on a Mac using version 22, and this command on Windows version 22. But we're just going to make a couple of scatter plots, and it's a really basic, easy command.
The first thing we're going to do is make a scatter plot of age and income. So let's come up to Graphs, to Legacy Dialogs, and down to Scatter. I'm going to use a simple scatter, which is just a basic bivariate X Y chart. I'll hit Define. All I need to do here is pick my variables for the X axis across the bottom and the Y axis up the side. I'm going to pick age for the X axis and put it right there, and household income for the Y axis. The idea is that maybe there's an association between household income and how old a person is. That's all I need to do except click OK.
When I do that, I get this basic scatter plot. I have age in years across the bottom, and I have household income in thousands up the side. You can see, of course, that most of the people are near the bottom. That's because most people make less than $200,000 a year, and this graph goes up to $1.2 million. We have a marker that's a large empty black circle. You can change the markers, and there are things you can do to clean up the chart. But it's also easy to tell that the people who make a lot of money, for instance, are generally older. And so we can see that in this data there is some kind of association between age and income.
But let's try to get a more nuanced view by looking at several variables simultaneously with a scatter plot matrix. Come back up to Graphs and Legacy Dialogs and down to Scatter. This time, however, I'm going to pick matrix scatter and click Define. Then all I need to do is pick the variables I want to include. I don't have to specify X or Y, because they're all going to serve as both X and Y in different parts of the matrix. I'm going to pick a few here.
I'm going to get household income. I'll move it over. I will get age and move that over. I'll get address, which is years at current address, and move that over. I'll get reside, which is the number of people residing in the house, and move that. And then finally, I'll get level of education. There's nothing especially meaningful about these. They're just ones that I thought would be easy to look at. Now, as a general recommendation, if you do have one variable that is an outcome variable, you might want to put that one in first. That puts it in the first column and the first row, and it makes it easier to find when you're looking at your analyses. But I've got my five variables in there, and I just press OK.
It takes a moment, and then it comes up, and this is the scatter plot matrix. You have all five variables listed down the side, and you have all five variables listed across the bottom, so each one functions as both an X and a Y. You have empty boxes down the diagonal, because that would be each variable with itself, and that correlation is always one. Now, there are things you can do to clean this up. You can change the marker from a big black circle to something that's smaller and easier to see. You can put regression lines through.
But it's easy to see that there are some really important patterns. So for instance, look at age in years and years at current address, right here. Obviously, there's a limit. You can't live someplace longer than you've been alive. That's why we have nothing in the top left of that panel. But you do see some associations and some cutoffs that go through. Now, this one's really dense. In a lot of situations, it's going to be a lot easier to see the patterns that are there, especially if you change the markers and put in regression lines. But this gives a good idea of what you can do with a scatter plot matrix.
Now, let's go back one more time to Legacy Dialogs and to Scatter, because you saw that there were other options there.
There's a dot plot, which is like a histogram, and there's an overlay scatter, which I don't want to deal with. And then there's a 3D scatter, and you might look at that and go, "Cool. It's interactive. It's 3D. It's a great thing." I'm actually not even going to do it, because every time I've done a 3D diagram, I've found it impossible to read clearly. It's very hard to manipulate in SPSS, and it ends up being a really bad experience. It's much easier to look at the association between variables using a scatter plot matrix. That's why I recommend that you avoid the 3D scatter completely, even though it's available here, and use the bivariate scatter plot and the scatter plot matrix as ways of looking at the associations between variables in your data.
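If you'd rather work from syntax than from the menus, the two charts from this walkthrough can be sketched roughly as follows. This is only a sketch, not the contents of the syntax file used in the video. It assumes demo.sav is already open as the active dataset. It also assumes the variable names age, income, address, reside, and ed. The names address and reside are mentioned in the walkthrough, and the other three are assumptions about how the variables are named in demo.sav.

```spss
* Sketch only: assumes demo.sav is already the active dataset.
* Variable names age, income, address, reside, and ed are assumptions.

* Simple bivariate scatter: age on the X axis, household income on the Y axis.
GRAPH
  /SCATTERPLOT(BIVAR)=age WITH income
  /MISSING=LISTWISE.

* Scatter plot matrix: the outcome variable (income) is listed first,
  so it lands in the first row and first column of the matrix.
GRAPH
  /SCATTERPLOT(MATRIX)=income age address reside ed
  /MISSING=LISTWISE.
```

You can also click Paste instead of OK in either dialog, and SPSS will write the equivalent command into the syntax window so you can check it against what's shown here.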