 Now that we have a data sheet to work with in the form of a spreadsheet file, we want to import that into SPSS. I've opened SPSS, this is version 24, and what I want to do immediately is go up to file new Open import data. The third one from the top is what I want, and this is an Excel spreadsheet. That is the way that we have saved our data. It is in a flat file made with Microsoft Excel. So I click on that. I'm just going to navigate to that on my hard drive. I see the folder there, the folder is in my SPSS folder on my hard drive. SPSS underscore data is what we save the file as. We can see the extension there, XLSX. That is a Microsoft Excel spreadsheet, and I'm going to hit open. We're going to see this dialog box open, and what you would notice if you do a lot of statistical analysis, that these flat data sets that you do get have a lot of mistakes in them. And with this import dialog, we can actually correct many of these. You would often notice that people who enter the data leave leading spaces in the string, so it will be space city, and some would just say city for instance there, and that would be seen as two different data point values. So I always like to say remove leading spaces and also trailing spaces. So someone would type city and then a space before they moved on to the next patient, and then coastal with a space, and then inland without a space, inland with a space. So I like to remove these. This is just one quick and easy way to clean up the data that we do import, and I'm just going to click OK. And that is some syntax that you saw there that is written, that is the code that SPSS writes to do this import. And what do we see at the top? We see all of our column headers. Those were our variable names, and we see these little funny symbols next to each of these. That is SPSS's attempt at deciding what data type these data point values are for that specific variable. And we see a little ruler there that would suggest that SPSS thought that this was numerical value, and indeed it was. And if we just move along here, we see even though we have grade 1, 2, and 3 on these numbers, we see that SPSS did not see this. It correctly guessed that this variable is actually categorical and not numerical. So that was very good. I can just move in between these and just drag it just to open up. Another thing that you would note is that all of these suddenly look different. For instance, there is heart rate, and it's now one word heart rate, no space, white cell count, one word, no space. The diabetes was always one word grade, one word treatment group. So these are now what SPSS sees as the variable name. There's a variable name and there's a variable label. The label was the long value that we had in our spreadsheet. It concatenates those into something short, because that is what it needs when it writes its syntax. When you push these buttons to do the analysis, code is actually generated called syntax and that syntax requires these words without spaces or any other illegal characters. Now when I hover over that, you can actually see name is treatment group, but label is treatment space group. Now SPSS might make a mistake with deciding what kind of variables we are dealing with, what kind of data types we're dealing with for these variables. So down at the bottom, I'd like to go to variable view. And now suddenly we see transposition almost, because we see all the variables down as rows now. There's ID, there's age, there's referral. And look here, there's the name, but there's the label. So the name was changed to heart rate, but the label still exists heart rate. So you can see the change that was made there. I don't want to spend too much time at the moment with width and decimals, except for saying the following. If we move down to decimals, remember I showed you, there were a lot of decimals. Now we can get rid of them here. They were, they still existed, even with my copy and paste values, they were still there. And there was actual fact for temperature and heart rate, because of the random nature in which we chose those, we have 15 decimal values. So for heart, for temperature, I want to bring that down to one. And for heart rate, I actually want no none. So SPSS is going to do that rounding off for me, and those would be the only decimals that we are going to see. Missing is quite important. We want SPSS to know if any missing data was coded. If we do extraction of data from a patient folder, in many instances you'll note that that data is not noted. Very simply, we did not note that data when we work with a patient's file as clinicians that often happens. And we might have in our spreadsheet coded something specific. So if a patient's white cell count was not noted in the patient's file, we might have coded that in our spreadsheet with a value 999. And we can actually tell SPSS here that that was 999. And we'll get into that later. What I want to show you, more importantly, is this measures column, because I can change here if I drop down whether this was nominal, scale, or ordinal. So let's go to ID. It was actually, I'm just going to leave it at nominal, because it's not something we're going to do any statistical analysis on. There was just purely a patient data as far as the number was concerned for the study. Age is indeed scale, so that was correct. The referral was indeed nominal. I don't want to change that to ordinal. Remember that was just the values either from coastal area, from inland or from city region. Temperature was indeed scale, which is SPSS's name for numeric. Heart rate was numerical values. The white cell count was numerical. The diabetes was just yes or no. There was no order to that. So that's nominal. But you see here grade, I'm going to change that to ordinal because patients with grade 4 disease are worse than grade 1. There's some order to that. So I'm going to change that to an ordinal scale. The treatment group was nominal categorical, so that's fine. The outcome was nominal categorical and the months was scale. So it's very important to spend some time here and just correct these. If you import from other programs, it might not make as good a set of guesses as it made here and please change those. The role I always leave to input. When we do regression analysis, I might want to change one of these as a target or both as other things predict for instance or let's go to outcome. That outcome is a target and these others might be used as predictors of predicting what the outcome is going to be. But you can change that while you do the statistical analysis. So I tend not to change these. So that is what we normally do just when importing a flat file. It's just to do this variable view and make sure everything is correct. If I go back to my data column, there's all the data ready for us to do some statistical analysis.