So let's change these coded values. Before we do that, though, we need to know a few things about our data set, and I want to show you how to play around with it and look at certain cells, how to get to them, because those hold the values you want to change.
First of all, we have this size function, and I pass the data frame to it as the argument. It's going to return two values: the number of rows and the number of columns. Therefore I put two computer variable names on the left-hand side, and I've chosen nrows and ncols because that's quite descriptive. So size returns two values, and I've got to have two computer variables on the left-hand side. Let's run that block of code and we see 120, 6. Indeed, I have that tuple, that ordered pair of two values, that was returned: 120 rows and six columns.
What if I want to know the value of a specific cell? I can use its address: its row number and then its column number. I have to do that in square brackets, and it's always row, comma, column. So I can ask the data frame: what is in row three, column four? Let's look at what that value is: 16.02.
Now, I needn't refer to it as column four. Remember, column four was Var1, so I can refer to Var1 by its name. This name does not go in quotation marks. It has a little colon before it; that's how Julia and the DataFrames package refer to a column name. So I'm going to say: in the data frame, row three in the :Var1 column, which is exactly the same as column four. So I should get exactly the same value back.
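As a rough sketch of the size call and the two ways of addressing a cell, assuming a current DataFrames.jl release (the version used in the video is older, so details may differ). The small data frame here is made up purely for illustration; only the column layout follows the video, and the real data set has 120 rows.

```julia
using DataFrames

# Hypothetical stand-in for the imported data set (values are invented).
df = DataFrame(
    PatientID = 1:5,
    Cat1 = ["A", "B", "A", "A", "B"],
    Cat2 = ["C", "L", "X", "B", "R"],
    Var1 = [38, 41, 16, 52, 29],
    Var2 = [5.1, 6.3, 7.0, 5.8, 6.1],
    Var3 = [3.2, 10.5, 4.4, 8.1, 2.0],
)

# size returns a tuple (rows, columns), unpacked into two variables.
nrows, ncols = size(df)
println(nrows, ", ", ncols)

# A single cell is addressed as [row, column].
by_position = df[3, 4]    # row 3, column 4
by_name = df[3, :Var1]    # same cell, column given by its name

println(by_position == by_name)
```

Both lookups point at the same cell, so the comparison on the last line should hold for any data frame whose fourth column is named Var1.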
I can ask for quite a few rows using this colon separator. Remember, we used it for the numbers from one to five when we wanted to do a for loop, and we're going to use a for loop very shortly. I can say to the data frame: give me rows three, four, and five, then a comma, and if I just use the colon on its own, that's shorthand for saying give me all the columns. To show you, let's run that. I get all the columns, but look, the row labels read one, two, and three. See how the index changes to one, two, and three, but these certainly were rows three, four, and five.
I can pass a list like this, two and four; an array, I should say, [2, 4]. So I want only columns two and four, but I want rows three, four, and five. So you see how you can play around by just selecting certain things. Now, this would be the same as saying give me rows three, four, and five of the columns :Cat1 and :Var1, which were columns two and four. So I'm going to get back exactly the same values there.
I can ask for specific rows as long as I pass them as an array, [2, 5, 99], comma, columns two to four. So it's always rows, comma, columns.
Now that we know how to get to specific cells, let's change all the A's in the Cat1 column to minor infection. I'm going to change the actual data point values inside. Remember, when we imported the data we saw A, A, A, and there were B's further down. I want to change every A into minor infection and every B into major infection. So I've got to run down each and every one: I go to the Cat1 column and run through every row. If it finds A, I want it to write minor infection; if it finds B, I want it to write major infection. So this is a little for loop that I'm going to write. There's my for, with its end at the bottom of the loop.
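The row and column selections described above could look roughly like this, again assuming a current DataFrames.jl release and an invented six-row data frame (so the row list [2, 5, 6] stands in for the [2, 5, 99] used on the full data set).

```julia
using DataFrames

# Hypothetical stand-in for the imported data set (values are invented).
df = DataFrame(
    PatientID = 1:6,
    Cat1 = ["A", "B", "A", "A", "B", "B"],
    Cat2 = ["C", "L", "X", "B", "R", "F"],
    Var1 = [38, 41, 16, 52, 29, 60],
    Var2 = [5.1, 6.3, 7.0, 5.8, 6.1, 6.6],
    Var3 = [3.2, 10.5, 4.4, 8.1, 2.0, 7.7],
)

# Rows 3 to 5, all columns (a bare colon means "everything").
rows_3_to_5 = df[3:5, :]

# Rows 3 to 5, only columns 2 and 4, selected by position...
two_cols = df[3:5, [2, 4]]

# ...or by name, which selects the same columns.
same_by_name = df[3:5, [:Cat1, :Var1]]

# Arbitrary rows passed as an array, with a range of columns.
picked = df[[2, 5, 6], 2:4]

println(two_cols == same_by_name)
```

The result of each selection is a new, smaller data frame, which is why its rows are numbered from one again.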
Inside the loop I'm going to do some Boolean tests. Now remember, we had nrows and ncols, and nrows was 120. So I'm going to cycle through from number 1 to number 120, looping through all of those rows. I'm using this implicit computer variable r here. I could use anything there; I could have said i, it doesn't matter. In essence it says for r in 1:120, and it's just going to run through all of those.
First, every time I run through I want a different cell, and I'm going to pass that cell's value into this computer variable called temp. It's just easier to work with it like that. So at the moment it's row one in the Cat1 column, and whatever value it finds in that cell now goes into temp.
Now, get used to using this first part of the if statement. You're going to do an if, elseif, else, end statement here. Always start with isna and pass it the computer variable temp, which holds the value for that cell. Always test first whether it is a missing value, an NA. We know we don't have any, so on the next line I'm just going to put a comment saying do nothing. It's going to be ignored because of the hashtag there. But you might have CSV files that import with missing values, and you want to be able to do something with them.
Then elseif: if it's not missing, it's going to go to the next test, and we have to use the Julia keyword elseif. It reads elseif temp == "A", with a double equals sign. I'm not assigning here; I'm not saying temp equals A. I'm asking whether the value inside temp, which is the one cell we're dealing with now, is an A. If it is, then make that cell, which at the moment is row one in the Cat1 column, minor infection. If it wasn't A, we do the next elseif: if temp is B, we make it major infection. Lastly there is the else, but I only have the A's and B's.
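A minimal sketch of the recoding loop for Cat1, under two assumptions: a current Julia and DataFrames.jl, where the missing-value test is written ismissing rather than the isna mentioned in the video, and a small invented column in place of the real 120 rows.

```julia
using DataFrames

# Hypothetical stand-in for the Cat1 column (values are invented).
df = DataFrame(Cat1 = ["A", "B", "A", "B", "A"])

nrows = size(df, 1)   # number of rows only

for r in 1:nrows
    temp = df[r, :Cat1]            # value of the current cell
    if ismissing(temp)             # assumption: modern replacement for isna
        # do nothing
    elseif temp == "A"             # == compares, = would assign
        df[r, :Cat1] = "Minor infection"
    elseif temp == "B"
        df[r, :Cat1] = "Major infection"
    else
        # do nothing
    end
end

println(df)
```

Keeping the empty missing-value and else branches costs nothing and leaves obvious places to add handling later if a file does arrive with gaps or unexpected codes.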
So in the else I'm just going to write do nothing, just to have something there. Then end, end. This last end is for my for loop, so it's going to run through 120 times, and the if, elseif, elseif, else, end statements sit in the middle. Execute that section and nothing appears to happen, but I can tell you now it has changed all the A's to minor infection and all the B's to major infection.
I want to do the same to Cat2. Remember, we said if we find a C, an X, or an R we want it to be female; if it's an L, a B, or an F, I'm going to change it to male. Once again I'm going to use this for loop with its end, for r in 1:120 in essence, and I'm going to do the same thing: every time, put that cell value into temp.
Now, this r and this temp are implicit; they live inside of this for loop. Their scope does not extend beyond this little loop. So when we finish executing this loop, I cannot call temp and I cannot call r. Their scope is inside of this loop.
I'm going to run through exactly the same thing. It's good to use this isna for your first if, then elseif for the cases you're really interested in, with the last else just being empty. It's good to learn it this way, and as things get more complicated you can start adding things. The only new thing here is these double vertical lines. On most keyboards, especially on the Mac, that's Shift and backslash. All they are saying is or. So: is temp C, yes or no? Or is temp X, yes or no? Or is temp R? Suppose all three of those tests return false.
Then it's going to go to the next elseif statement. If any one of them is true, and because this is or and not and, they don't all have to be true, it's going to turn that cell into female; remember, we used C, X, and R for female. If it finds L, B, or F, it's going to change the cell into male. So I'm going to run that code cell.
The last thing I want to do, the second-last thing I should say, is to go down the Var1 column and subtract five from every entry. I don't have to do much for that. If I say df with that column, it's going to look at all of those values at once. Well, it doesn't really happen at once, but you can think of it as looking at them all at once. I don't have to cycle through them with a for loop; I'm going to subtract five from all of them. And let's remember how the computer works with the equals sign: it's going to execute whatever is on the right-hand side and then place the new values inside of the variable on the left.
I can just do it. Remember, we had 38 for the first age; now we have 33 for the first age. And because I didn't use the semicolon, it's actually going to print those all to the screen for me as a data array, just for that column.
All right, lastly in this little section I want to change all the column names, and this is how I'm going to do it: rename!, with an exclamation mark. This exclamation mark just means do it in place: not just for the sake of the execution of this cell, make it permanent. It takes three arguments: the data frame name, the old column name, and the new column name. So Cat1 I want to be Infection, and I've got to use those little colons there. Next is Cat2.
That one I want to be Gender, Var1 becomes Age, Var2 becomes HbA1c, and Var3 becomes CRP. If I execute all of this, I'm not going to use a semicolon at the end, because I want it to print to the screen so you can see. There we have PatientID, Infection, Gender, Age, HbA1c, and CRP now, instead of Cat1, Cat2, and Var1, Var2, and Var3. And we have minor infection and major infection for the A's and B's, and females and males, exactly as we've indicated, and five subtracted from every age.
Now, you don't have to do this. You don't have to change the names specifically; if you want to keep the secrecy, the anonymity, in this analysis, you can leave them. I'm doing it here for illustrative purposes, so that when we do the Gadfly graphs, the labels are just done automatically and I don't have to put them in by hand. One thing that is permanent, that you would have to do, is to change all the values back to what they were supposed to be. We cannot do the data analysis on the false ages; that is then not representative of the sample that we are dealing with.
But that's beautiful. All the hard work has now gone into a data frame that we can use. In the next section, most exciting, we're going to do some descriptive statistics, where we start getting a feeling for the data that we are working with. Getting some numbers and plotting some graphs gives us, as human beings, a good feeling for all of these numbers, because certainly looking at them like this does not mean a lot.
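To pull the last three steps of this section together, here is a hedged sketch of the Cat2 recoding with the or operator, the subtraction of five from every age, and the in-place renaming. It assumes a current DataFrames.jl release, where rename! takes old => new pairs rather than the three separate arguments described in the video, and ismissing in place of isna. The data values are invented, apart from the first age of 38.

```julia
using DataFrames

# Hypothetical stand-in for the data after Cat1 has been recoded.
df = DataFrame(
    PatientID = 1:4,
    Cat1 = ["Minor infection", "Major infection", "Minor infection", "Major infection"],
    Cat2 = ["C", "L", "X", "F"],
    Var1 = [38, 41, 27, 52],
    Var2 = [5.1, 6.3, 7.0, 5.8],
    Var3 = [3.2, 10.5, 4.4, 8.1],
)

# Recode Cat2: || means "or", so one true comparison is enough.
for r in 1:size(df, 1)
    temp = df[r, :Cat2]
    if ismissing(temp)
        # do nothing
    elseif temp == "C" || temp == "X" || temp == "R"
        df[r, :Cat2] = "Female"
    elseif temp == "L" || temp == "B" || temp == "F"
        df[r, :Cat2] = "Male"
    else
        # do nothing
    end
end

# Subtract five from every entry of the column, with no explicit loop.
# The right-hand side is evaluated first, then stored back in the column.
df.Var1 = df.Var1 .- 5

# Rename the columns in place (the ! marks a function that modifies its argument).
rename!(df, :Cat1 => :Infection, :Cat2 => :Gender,
            :Var1 => :Age, :Var2 => :HbA1c, :Var3 => :CRP)

println(df)
```

The dot in .- applies the subtraction to each element of the column, which is the "all at once" behaviour described in the video.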