When you're working in SPSS and you're accessing data, one of the most important things you can do is create labels and definitions for your data. I like to think of this as the statistical version of Alice in Wonderland and the caterpillar asking her to explain herself: you need to explain yourself, or more specifically, when it comes to your data, you need to tell SPSS what your data mean. That is the data description, and I see two kinds of information that you tell SPSS about your data. The first one I'm going to call semiotics, which comes from the study of meaning. This is where you tell SPSS the variable names, the data types, the variable labels, the value labels, the missing values, the level of measurement, and the role that each variable plays. Contrasted with that, there are other elements that you can call aesthetics, and those address variable width, decimal places, column width, and alignment. These are all settings within the data window of SPSS. The most important ones, though, at least for human consumption, are going to be the variable and value labels, so I'm going to take a little time and talk about those.
Start with the variable names. Those are the short names, the ones that you have there at the top of the column, and there are some important rules. Rule number one, the names must be unique. No two variables can have the same name, and that shouldn't be too surprising, because it's an identifier. Rule number two, the names must start with a letter. I put an asterisk there because you can start with an at sign, a pound sign, or a dollar sign, but you don't want to, because those are generally reserved for special functions within SPSS. Rule number three, names can use letters, upper or lowercase, they can use numbers, and they can use the period, underscore, at sign, pound sign, and dollar sign. On the other hand, don't end with a period, because that can cause confusion with the command terminator.
And don't end with an underscore, because that's used for automatic variable names when SPSS is doing computations. Rule number four, names cannot include spaces. Rule number five, names must be no longer than 64 bytes. In most text encoding systems that's 64 characters, but if you're using Unicode it might be only 32 characters. And the last rule, rule number six, is that the names cannot be any of these words: ALL, AND, BY, EQ, GE, GT, LE, LT, NE, NOT, OR, TO, or WITH, because those are all reserved keywords within SPSS. So don't create that confusion.
So those are the short names that go at the top of a variable. On the other hand, with the label that you associate with that name, you can give the variable a more descriptive name. Those are the variable labels, and there are a few rules for those. Rule number one, they must be no longer than 256 bytes. That actually means a label can be really long, although you don't usually want to do that, because some procedures will display as few as 40 bytes, or 40 characters, and you really want to be able to read what it is. So you want to keep it short, but you can go longer if you need to. Rule number two, the labels must be enclosed in quotes, and I'll tell you, they need to be straight quotes, the vertical ones, and not the curly quotes, or SPSS chokes on those. Rule number three, labels can include any character, including spaces, which is something that you can't have in the variable name, but you can put it here. That allows you to put labels that sort of float on top of the variable names, and those can show up in the variable lists, in the charts, and in the output that you create.
Another really important one is value labels. You may have a variable called gender, and you may put in zeros and ones, but do you remember what those zeros and ones are? I'm going to show you some ways of dealing with that. The most important thing is to put value labels on there. So here are the rules for value labels.
Rule number one, they must be less than 121 bytes. That actually is really long, and you generally want to keep your labels pretty short. Rule number two, like the variable labels, the value labels must be enclosed in quotes, and they need to be straight quotes and not curly quotes. Rule number three, labels can include any character, including spaces, and that's good. Rule number four is an interesting one: the value labels do not need to be unique. That is, more than one value can have the same label. So you might have the numbers one through nine, and it could be that seven, eight, and nine all say the same thing. But underneath they have different codes, and there are situations where you might want to do that.
But mostly I want to show you how this works in SPSS. So just open up this syntax file. This one's going to be a little different because we're actually not going to use a data file. I'll refer to one, but I mostly just want to show you the syntax. This syntax file shows how to write variable labels and value labels. Now, you don't necessarily have to break them all down into separate lines. I do it because it makes them a lot more readable, and it's a lot easier to see what's going on.
The first thing is the command VARIABLE LABELS. Because it's an SPSS command, it's written in all capitals. Then you write the short name of the variable, then you have at least one space, then straight quotes, and then the long label. So here, for instance, I've got var01, which would be the first variable, and then this is its label written out. You don't need to have anything after it, no commas or question marks or semicolons or anything. You just go to the next one. I put each one on another line because that makes it easy to follow, and I run them all through here. I'm going to make one important recommendation.
If you have a dichotomous or binary variable, one that has only two possible values, and gender might fit into that category, let me recommend this: code it as zeros and ones. A lot of people use ones and twos, but that gets confusing. Code it as zeros and ones, and name the variable after whatever the one is. Now, when it comes to male and female, I generally give ones to whichever group I think will have the higher score on my main outcome variable, so it'll switch around. If for some reason I think that men are going to have a higher score on an outcome variable, then I will call the variable male, and the label will be "R is male", with R for respondent. On the other hand, if I think women are going to have a higher score, then I will call the variable female and the label will be "R is female". I would obviously only use one of those two.
Now here are some other examples. I tend to give generic names such as var for variable, or really just q for question: q01, q02. And I use the leading zeros so they sort properly in the dialog boxes. When you're done listing all of your variable names and the variable labels in quotes, just end with a period. It doesn't have to have a space before it. That's left over from earlier versions of SPSS, and it's a habit I have. You can run this at any time and it will assign these labels to the variables, and then they'll show up in the data file, which is nice.
Next are the value labels. What you have here is first the command, VALUE LABELS, which is written in all caps, and then you give a list of variables to which the values apply. You can list them out separately: var01, var02, and here I've got var03 with that leading zero. And then if they're all next to each other, if they are adjacent in the data file, you can actually specify a range: var03, then TO in capitals, then var10. So that'll be three, four, five, six, seven, eight, nine, ten.
And then you just go to the next line, and you give the first value and its label. Here that's a zero, so I give 0 equals no and 1 equals yes. When you're done giving the values, you need to put a slash, so SPSS knows you're done with the values for that variable list. Then you can go on to the next variable. As I said, for instance, if I gave one on a gender variable to men, I would call it male. And so 0, which would mean no, they're not male, would be female, and 1, yes or true, would be male. And I do a slash. On the other hand, if you coded it the other way, then you just call it female, and 0, which means no or false, means they're not female, they're male, and 1 means they are female. Obviously, use just one of these. I do the slash.
And then I could have a rating variable. A lot of people call it a Likert scale, but it's just a rating scale. I could do rate01 TO rate10, and I can specify every value. So this is a five-point scale from strongly disagree to strongly agree. Finish with a slash. Or maybe I have a different kind of scale. Here at the end I have scale01 TO scale02. That's an 11-point scale, but I only mark the two ends, the 0 and the 10. So 0 is never or not at all, and 10 is always or completely. And then to let SPSS know that I'm done specifying value labels, I end with a period. So this is actually a single sentence, and it's a way of telling SPSS how you want the numbers to appear both in the data window and in any output that you get.
Finally, I'll mention something about missing values, because it can also be easier to specify these in syntax. The command is MISSING VALUES. You just give the names of the variables, and you can use TO in the same way. Then in parentheses, you put the number that is assigned to missing values. 99 is common, so I've got that there. And then you can do a slash if you're going to use different codes after that. I could do male through female, and here I say 2 THRU HI.
And really what that means is that anything other than a zero or a one is missing. So if I accidentally type in a seven, it's treated as missing. And then here, I specify several different values: I can put 7, 8, 9. So if any of those show up, those would be considered missing values. Do what you want. The nice thing is that SPSS will exclude them automatically from analyses, but it will include them in frequencies when you're getting that output. Finish with a period. And then you just run these like you do any other command. It's going to do a lot to clarify your data and make it easier to follow your analyses and reconstitute your work in the future.
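
The syntax file itself is not reproduced in this transcript, so here is a minimal sketch of what the three commands described above might look like. The variable names follow the spoken examples (var01, male, q01, rate01, scale01), but the label text, the 1 to 5 coding of the rating scale, the middle rating labels, and the choice of which variables get which missing-value codes are all assumptions for illustration, not the original file.

```spss
* Sketch only, with hypothetical variable names and labels (assumptions, not the original file).

* Variable labels: short name, at least one space, then the label in straight quotes.
VARIABLE LABELS
  var01  "Text of the first question written out"
  male   "R is male"
  q01    "Question 1"
  q02    "Question 2".

* Value labels: a variable list, then value and label pairs, with a slash before each new list.
VALUE LABELS
  var01 var02 var03 TO var10
    0 "No"
    1 "Yes"
  /male
    0 "Female"
    1 "Male"
  /rate01 TO rate10
    1 "Strongly disagree"
    2 "Disagree"
    3 "Neutral"
    4 "Agree"
    5 "Strongly agree"
  /scale01 TO scale02
    0  "Never or not at all"
    10 "Always or completely".

* Missing values: codes in parentheses after each variable list.
MISSING VALUES
  var01 TO var10 (99)
  /male (2 THRU HI)
  /rate01 TO rate10 (7, 8, 9).
```

Keep in mind that TO refers to variables that sit next to each other in the data file, so a range like var03 TO var10 only works if those columns are adjacent, and each command is a single sentence that ends with one period.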