Oops, do you want to do introductions again? Yeah. And I'm Sarah Mannheimer. I'm a librarian, and I am learning R as well, always in the process.

Okay, so today we're going to talk about extending what we learned in the intro and data viz workshops. We'll talk about relational operators and how to join sequences of relational operators together with logical operators. We'll create some conditional statements (so, if/else statements), we'll create for loops, and then we'll also create some functions of our own. I think most people have seen this already, since we used this online tutorial last week, but basically this workshop assumes that you have a little bit of familiarity with R.

The other thing I forgot to make sure of is that you have the link to the tutorial. This was emailed out to you, but it's also at the top of the screen. I'll put it in the chat. (Let's close the door, if you don't mind.) So it's rconnect dot math dot montana dot edu, slash, capital-I Intermediate, capital R. So this is what we'll be using.

Just like in the data viz workshop, if you attended, this is an online tutorial that allows you to play around with code without having RStudio installed on your machine. Each little box that you'll see is like a little sandbox, and each box doesn't rely on the code that you've run before; we've programmed it so that everything goes in order. It's just a way to think only about the code, so we don't have to worry about technical issues with RStudio. So hopefully this is helpful to you.

And then we also have some challenges that we'll work through, and answers to the challenges are available within the tutorial. I'll show you next. So for example, this first... Oh, yeah, thanks. When you go to the challenge, I'm not seeing the answers.
Oh, yeah, here we go. The solution is here. So hopefully you can come back to this later, and if you're having issues, there is a solution. It's not always the only way to do something, but those are available to you. And those are in the handout as well, if you saw that PDF handout that I emailed earlier.

So, okay. First, a little bit of a refresher. The workshop covers content that requires that we remember how to extract elements from vectors and data frames, so let's do a few warm-ups just to get us started.

First, to extract an element from a vector, we use this bracket notation. A vector only has one dimension, so inside the brackets goes one number, and you extract an element using the corresponding index number. Index numbers start from one. So for example, in this vector, this is index number one, index number two, index number three, etc. If we run x and then the bracket notation with 1, that'll extract the first element; you can see it gives you that first number, 5.4. You can also use the colon notation, 1:3, to extract elements one through three, inclusive. And you can use a minus sign to exclude something. So here, in the last one, we do x[-2], which excludes the second index.

So let's do a very quick challenge here. Why does R produce an error when you run the following code?

Yeah, right: "everything except one" is sort of a different idea than "one to three". So if you want to view what's between one and three but without one, then you could just go two to three. That's one way to do it. Or say you don't want one: you could do everything except one. How else can I do this so that it works? We have "not one", and then 6.2, 7.1. Nope, that's not working. So... nice. So you can play around with that in the sandbox as you like. That's coming up.
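The vector warm-up above can be sketched roughly as follows. The values in `x` are an assumption (only 5.4, 6.2 and 7.1 are mentioned in the session), and the failing expression `x[-1:3]` is a guess at what the challenge code looked like, so treat this as an illustration rather than the tutorial's exact code.

```r
# Hypothetical vector; only the first three values are mentioned in the session
x <- c(5.4, 6.2, 7.1, 4.8, 7.5)

first       <- x[1]     # first element (R indexes from 1)
first_three <- x[1:3]   # elements 1 through 3, inclusive
without_2nd <- x[-2]    # everything except the second element

# -1:3 expands to c(-1, 0, 1, 2, 3), which mixes negative and positive
# indices, so R raises an error. Catch it here instead of stopping.
attempt <- tryCatch(x[-1:3], error = function(e) "error")

# Wrapping the range in parentheses negates the whole range instead
without_first_three <- x[-(1:3)]
```

The parentheses matter because the unary minus is applied before the colon operator builds the sequence.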
Yeah. And then let's extract some elements from a data frame. A vector is one-dimensional; a data frame has two dimensions, columns and rows, and so we can extract two types of elements. So here's our example. We're making a little data frame where column a is 1, 5, 9, 13, column b is 2, 6, 10, etc. We created the data frame and we named it example_df; that's what this code is doing. And you can type in these boxes however you like and write your own code. So let's say we want to print out the whole data frame. There we can see we have column a, column b, column c, column d.

If you want to extract a column, you use this dollar sign notation. So here we extracted column a; that's what's showing in the first one here. Or give it a try yourself: extract column d, maybe. And then you'll see it's showing column d below, which is 4, 8, 12, 16.

You can also extract elements using the bracket notation that we used with vectors, but you do use the comma for this one. Here are a few examples, and we have them annotated with comments. If you say [1, 1], it extracts the first-row, first-column entry. Looking at our full data frame, the first row, first column is the number 1. Or you can do blank, comma, one, which would extract every row in the first column, so that would be 1, 5, 9, 13. And then you can do the opposite, one, comma, blank, and extract every column in the first row, which would be 1, 2, 3, 4. (Trying to scroll here.)

So let's do a little challenge. Change the code below to extract the third and fourth columns from our example data frame, so you'll want column c and column d, using what we learned in the vector example and in the data frame example. And then raise your hand when you're done.

Using that same colon notation: if we say we don't want any particular row (so, every row), but we want columns three and four, inclusive of each other... You can use the solution to
do that too. And then, how would you change the code above to extract the second and fourth columns? So write your own code. You can use this example here, but hopefully you did it yourself. Nothing in the row position, but for columns we use this concatenate function, c(). It's kind of a trick question because we haven't shown you this yet. And then run that code. So you use concatenate with a 2 and a 4. Maybe we should add that above, where we show that concatenate function in the first exercise. Any questions? This is the simplest part.

So, let's talk about relational operators. A relational operator shows how one object relates to another, and there are a few relational operators you can use. So we'll talk through a few examples. First, we're creating a few objects in this code chunk: a w, an x, a y, a z, a dna1, and a dna2. And then we'll work through these ideas of inequalities, using some of these relational operators to investigate values.

You can use the double equals, ==, to find out whether two things are equal, or the not-equals, !=, to find out whether they are not. So let's run that code. Okay, so dna1 does not equal dna2; you can see they're just a little bit different. And so you ask, does dna1 not equal dna2? TRUE. Makes sense.

And then we can also use greater than or less than. These should be familiar to you from math class, but there's a bit of a twist. To write greater than or less than, you use those carets. If you want greater than or equal to, you just type them one after the other: the greater-than sign, then the equals sign. Same for less than or equal to. So let's run this code. Is w greater than 10? Yes. Is x greater than y? No. Remember the values here, or you can print them out just to remind yourself. Whoops. So here I've printed out w to remind myself. Okay. Is w greater than 10? Yeah. And is x greater than y?
No. Okay. Then you can check whether a character, number, or factor is included in a vector using the %in% operator, the inclusion operator. Here we're combining these three character strings, basically green, pink, red, into a vector, and then we're asking: is blue in this vector of colors? The answer is FALSE, because we just have green, pink, red. Then we can create this vector of numbers, and we ask: is 5 in numbers? The answer is TRUE. Here's another example. We've created a vector called some_letters with a, b, c, d, e, and then we ask: are a and b in some_letters? And it says TRUE TRUE. So we could change this, like, say, are q and b in some_letters? And it'll say FALSE TRUE, because q is not in that vector and b is. That all make sense?

So let's do a little challenge. Write the code for each of these four ideas: Is two times x plus 0.2 equal to y? Is "hello" greater than or equal to "goodbye"? Is TRUE greater than FALSE? Etc. I'll give you just one minute to do that.

Note that because of the way this sandbox works, we have everything coded in as though it's already been run. But if you were working on your own machine, you would have to run that yourself. Yeah.

How did you do with all of those? And we introduced this new function for you, nchar(), to figure out how long a string is. So this one is asking: is dna1 longer than five bases? When we use that nchar() function, we find out it's actually 15. There are 15 letters in that dna1 string. That's more than five, so the answer is yes. Questions there?

Let's keep going, to comparing decimal-valued numbers. That's you. Sweet.

Okay, so let's take a quick look back at what we saw in challenge three. We looked at whether two times x plus 0.2 was equal to y. We can't use just the single equals sign, because that's as if we're actually doing the math. The double equals sign checks: is it equal to?
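As a rough recap of the relational operators, the inclusion operator, and nchar() discussed so far, here is a small sketch. The values of `w`, `x`, `y`, `dna1`, `dna2`, and `numbers` are assumptions chosen to be consistent with the answers read out in the session (for example, `dna1` has 15 letters); they may not match the tutorial exactly.

```r
# Assumed values, chosen to match the answers described in the session
w <- 10.2
x <- 1.3
y <- 2.8
dna1 <- "attattaggaccaca"
dna2 <- "attattaggaacaca"

dna_differ <- dna1 != dna2   # TRUE: the strings differ by one letter
w_over_10  <- w > 10         # TRUE
x_over_y   <- x > y          # FALSE
x_at_most  <- x <= y         # TRUE: "less than or equal to" is typed as <=

# Inclusion operator: is the left-hand value found in the right-hand vector?
colors       <- c("green", "pink", "red")
numbers      <- c(1, 2, 3, 4, 5)            # hypothetical contents
some_letters <- c("a", "b", "c", "d", "e")

blue_in <- "blue" %in% colors               # FALSE
five_in <- 5 %in% numbers                   # TRUE
qb_in   <- c("q", "b") %in% some_letters    # FALSE TRUE, checked element by element

# nchar() counts the characters in a string
dna_length <- nchar(dna1)                   # 15
dna_long   <- nchar(dna1) > 5               # TRUE
```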
So we got FALSE here. But if we look back at what x and y are, if we do them separately, two times x plus 0.2, and then print y... if I print these separately, I see they're both supposed to be 2.8. So when we ran that one line, why did we get FALSE for an answer? That seems counterintuitive, right?

So R is very, very smart, sometimes smarter than we are, but it does need a little bit of guidance when we're running certain things. When we're comparing decimal-valued numbers, the values R stores internally are approximations, and there can be a tiny discrepancy between them. Although both of these print to one decimal place, so we assume they should be the same, R may report that they are not exactly equal, because the stored values differ far out past the decimal point.

To work around that, we have a function in R called all.equal(). Here we can type all.equal and then put in our two arguments, separated by a comma, and it'll check whether they are equal within a certain tolerance. So we'll type in 2 * x + 0.2, comma, y. When we run that code, it's TRUE. So within that tolerance, those are in fact equal. You can also set the tolerance in all.equal() using another argument: tolerance equals, and you set it to a value that you like. The default is a very, very small value, so it checks quite a few places past the decimal.

Okay. Let's look at comparing characters. Again in challenge three, we were looking at whether "hello" is greater than or equal to "goodbye". These are both words, right? We're comparing two words. So how are we supposed to know if one is greater than or equal to the other? It doesn't seem to make sense to compare those things, right? But it still gives us TRUE. It does compare these somehow, which is quite interesting. And we'll see, if we run this line here, it gives us the same thing, saying that hello is in fact greater than goodbye.
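The decimal comparison and the all.equal() workaround can be sketched like this. The values of `x` and `y` are assumptions consistent with the 2.8 printed in the session, and the tolerance values are purely illustrative. Wrapping all.equal() in isTRUE() is a common idiom, since all.equal() returns a description of the difference (not FALSE) when the values differ.

```r
# Assumed values; both sides print as 2.8
x <- 1.3
y <- 2.8

exact  <- (2 * x + 0.2) == y                  # FALSE: stored values differ slightly
approx <- isTRUE(all.equal(2 * x + 0.2, y))   # TRUE: equal within the default tolerance

# The tolerance argument controls how close is "close enough"
strict <- isTRUE(all.equal(1, 1.001, tolerance = 1e-6))  # FALSE
loose  <- isTRUE(all.equal(1, 1.001, tolerance = 1e-2))  # TRUE

# Character strings are compared in alphabetical (collation) order
words <- "hello" >= "goodbye"                 # TRUE: h comes after g
```

Printing with more digits, for example `print(2 * x + 0.2, digits = 17)`, is one way to see the small discrepancy that the default printout hides.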
Does anyone have any guesses for why this is the case? Great thinking. So, exactly. I'm so proud of you.

Okay, so in R, like we've seen before when we're dealing with levels of a categorical variable, R defaults to putting things in alphabetical order, right? So it sort of ranks these: a is the first, b is the second, c is the third. Within the ones you're comparing, it ranks them alphabetically. Since g comes before h in the alphabet, goodbye would be first and hello would be second. g is a value less than h, so h is in fact greater. Since hello starts with the letter h, it's going to be greater than a word that starts with the letter g.

But as I'm sure you know, dictionaries and alphabetical orders may differ a little bit around the world. So the language you're dealing with, or the dictionary you're working with, is going to have some impact on what you get for these logical results when you're dealing with words. This function right here sets your system to use a certain dictionary based on your locale. It says, okay, I know what region you're in, and automatically uses the default dictionary associated with that region, which is really cool.

Okay, so let's look at another challenge.

Yeah, so you can change the dictionary to whatever you want. You don't have to use the automatic location; you can override it with a different one. Just to repeat that for our WebEx viewers: you can override the automatic location, so you could set it to a certain dictionary as well. Did you have more to say? I'm sorry. Okay, sounds good.

Okay, let's take a look at challenge four. So what is going on in the code below? First, let's look at what some_letters is, to remind ourselves what we're looking at. So some_letters itself is a vector of these five values.
We have a, b, c, d, e. This is what we looked at before. So we know equals-equals is asking "is it equal to?" But with the exclamation point, which we also call a bang, if we do bang-equals, this is saying "not equal to". And then we have a concatenate, so a vector of two elements, just a and c. When we run this code, we get a warning, and we get FALSE TRUE TRUE TRUE TRUE. Something's up, right? That's not what we were expecting to see. So let's figure out what is going on here.

R does this thing called recycling. It's not like recycling your cans, though it's kind of the same idea: it's using something again. What R does here, when you're comparing a longer vector to a shorter vector across the relational operator, is recycle the shorter vector until it gets something as long as your longer vector. So it's going to take this a and c, but it doesn't know what to put in these last three slots. So the shorter vector gets recycled: it repeats itself until it reaches that length.

So as you can see here, ignore the not-equals for a moment. If we were asking whether this was equal, a is equal to a, so we would expect that to be TRUE. The bang is going to switch that, so it's going to give us FALSE. Is a not equal to a? It is equal to a, so the opposite of that is going to be FALSE. Does that make sense? It's a little wraparound. Then we can look at the next ones and see those aren't equal, so this is going to be TRUE. And you can go through the rest the same way.

A fun way to kind of cheat this, too, is you can actually put the shorter-length vector first and your longer one second, and it should work out okay for you most of the time.

So now we're going to look at this with our inclusion operator.
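Recycling in the comparison above can be made visible with a short sketch. suppressWarnings() is used here only so the example runs quietly; in the sandbox you would see the "longer object length is not a multiple of shorter object length" warning. The `explicit` and `swapped` objects are extra illustrations, not part of the challenge.

```r
some_letters <- c("a", "b", "c", "d", "e")

# c("a", "c") is recycled to c("a", "c", "a", "c", "a") before comparing
recycled <- suppressWarnings(some_letters != c("a", "c"))

# Writing the recycled vector out by hand gives the same answer
explicit <- some_letters != rep(c("a", "c"), length.out = length(some_letters))

# Recycling applies whichever side the shorter vector is on
swapped <- suppressWarnings(c("a", "c") != some_letters)
```

Here `recycled` is FALSE TRUE TRUE TRUE TRUE: only the first pair ("a" against "a") is equal, so only the first "not equal" test is FALSE.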
So this is the percent in percent So first let's look at this Without the bang and so now we're seeing if these two letters are in the vector sum letters The vector sum letters whether those five or five elements. That's abc and de So when we run this It's going to give us true true. So it's doing this element wise. So it's it's saying is this a in some letters That's true is b in some letters. That's also true So when we negate this what are we expecting to see? False false exactly that's what we get Okay, now let's look at some logical values. So logical values are What we've already been kind of dealing with so these are those true and false You may have noticed already that both of these words are always typed out in all caps So there's actually two ways you can do logical values in r. So you can either have Just a capital first letter. Oh, not bad. Just a capital t Will automatically read as true or you have to have the completely capital forward same for false capital f will suffice as false But if I try to type out true with some lowercase letters or a lowercase r is not going to recognize that You can see it already autocompletes and wants me to pick that so it can actually read it as true It's trying to read my mind and it's actually doing a pretty good job here, which is great so If you can see color logical very blue And words are for vectors Named objects Yes, unless you change colors In default r, that's what they are. We want to repeat that from oh, absolutely So grata mentioned that logical values there in blue in default r unless you've changed your What's it called our theme or in disobey color scheme? and then Vector names or object names are usually in black So you can tell the difference r will read it as a logical value if it is in fact in blue Thank you for that. 
That's a good point. Okay, so with some_letters, we can also use these logical values to say what we want to extract. This is not the most straightforward method, but in some cases it can be very helpful. Our some_letters had five letters, a, b, c, d, e, right? So here the TRUE and FALSE values are saying: I want a, I want b, I don't want c, don't want d, don't want e. So when we get output, what are we expecting to see? It's a and b, right? And it does that; it can read our minds. It's not always the most intuitive way to extract elements, but it is an option and it may be helpful in some cases, so it's worth knowing.

Now I want to take a look at the which() function. which() is special: it returns the indices of the values. It's finding the location of an element, not the element itself, which is very tricky, because the first time you use this, everybody's like, "that's not what I thought it was going to do." And if you don't know what's going on, it can be a heck of a time trying to figure it out. So what it's doing is locating elements.

First I am going to... oh my goodness gracious. There we go. So I'm saving my x2 vector here, and now I'm asking which elements in x2 are greater than eight. If you haven't run this yet: which elements in this vector are greater than eight? 9, 11, 13, 15. However, when we actually run this... oh, well. We get 4, 5, 6, and 7. That's not what we were expecting, right? What this is actually doing is telling us the locations within that vector where the numbers satisfy this condition: the fourth, fifth, sixth, and seventh places. If we go back and look at our vector really quick, count one, two, three, four: that's where 9 starts, and everything past there is greater than eight. So it's telling you the location. Be careful when you use the which() function. It's not telling you the actual value that's there, just where to locate it.
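A short sketch of logical indexing and which(), under the assumption that `x2` holds the odd numbers 3 through 15 (the session only confirms that 7 sits in position three and that 9, 11, 13, 15 occupy positions four through seven):

```r
some_letters <- c("a", "b", "c", "d", "e")

# One TRUE/FALSE per element: keep the TRUE positions, drop the FALSE ones
picked <- some_letters[c(TRUE, TRUE, FALSE, FALSE, FALSE)]   # "a" "b"

# Assumed contents of x2; the first two values are a guess
x2 <- c(3, 5, 7, 9, 11, 13, 15)

positions <- which(x2 > 8)   # 4 5 6 7: locations, not values
values    <- x2[positions]   # 9 11 13 15: index back in to get the values
seven_at  <- which(x2 == 7)  # 3
```

Feeding the result of which() back into the square brackets is what turns locations into the values themselves.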
So now let's look at the next one: which elements of x2 are equals-equals, so are equal to, seven? What do you think it's going to give us? Three, because it's in the third position in the vector. Let's make sure it gives us that. Perfect. You guys are brilliant. I'm so proud of you.

So now let's talk about some matrices. We've talked about vectors before, right? That's just a string of values (you can have characters, numbers, all sorts of jazz) pushed together in one thing you're going to use. We can actually combine multiple vectors in a matrix. One way to do that is to use the data.frame() function, which gives us a data frame. I'm going to name a vector, call it this, say what's in my vector, and I'm going to name multiple vectors and separate them with commas, as you do with normal arguments. So here our arguments themselves are vectors. You could do this in multiple steps, where you say, okay, okcupid, I'm going to define it up here, I'm going to define match up here, and then I'm just going to put the names in the function. That should also work, right? So your arguments here are vectors, and you could define them outside or nested within, like we have in this example.

In this example, we're going to go on a dating show. We're looking at two different dating sites, OkCupid and Match.com, and this is the number of messages that one person, our lucky contestant, has received each day across the whole week. So let's go ahead. I'm going to run our vectors. We shouldn't expect any output from this, because it's just saving it. I'm going to call it messages. Now, a good thing to do is to always add row names, so I know which day is first and what we're talking about.
So when I add this, it should add the row names. And then I can, oh my goodness, actually type the object itself, and we can see it all printed out. So now we have a very nice matrix with the days of the week as row names and our two columns, okcupid and match. Notice that although our vectors are typed in a way that almost looks like a row, they're actually put in as columns in our matrix or data frame.

So now, another challenge for you. We're going to use this messages matrix to return a matrix of logical values that answers the following question: on what days was the number of messages at either site greater than 12? So here we can just take our messages object, and then what symbol do I use for greater than? One of the carets, right. Perfect. Greater than 12. So it's as easy as that, and it'll give me a printed-out table, keeping the row names, keeping the column names, and it tells me that on Monday, OkCupid did have more than 12 messages. That is true. And it'll do that for each of the observations. All okay?

Oh, I've got a spicy challenge. Are you guys ready? You warmed up? Okay, so we're going to use the messages matrix to return the rows of messages that answer the following questions. This is where we're going to think back to that bracket notation we saw earlier. We're used to seeing the normal parentheses, just curvy parentheses. Those are used when you have a function. When you're trying to index within an object, whether it be a vector or matrix or whatever you're dealing with, that's when we use square brackets. So here, when we index into messages, we're going to think of our square brackets, and since we have rows and columns, we have two dimensions. Inside the square brackets we're going to have something, comma, something, and it's going to be rows, comma, columns. We're going to do that.
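Here is a minimal sketch of building the messages data frame and asking the "greater than 12" question. The message counts are hypothetical (the session does not read them out), and the column names `okcupid` and `match` are assumed spellings.

```r
# Hypothetical message counts for one week
messages <- data.frame(
  okcupid = c(16, 9, 13, 5, 2, 17, 14),
  match   = c(17, 7, 5, 16, 8, 13, 14)
)

# Row names make it clear which day each row refers to
rownames(messages) <- c("Monday", "Tuesday", "Wednesday", "Thursday",
                        "Friday", "Saturday", "Sunday")

# Comparing the whole object returns a logical matrix that keeps
# the row and column names
over_12 <- messages > 12

# Square brackets index as [rows, columns]; a blank means "all"
monday    <- messages["Monday", ]   # one row, every column
first_col <- messages[, 1]          # every row of the first column
```

With these made-up numbers, `over_12["Monday", "okcupid"]` is TRUE, in line with what was seen in the session.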
So we're thinking back to that. This one's spicy because we're going to have a lot of nesting to deal with, and it'll make more sense in a second. Here we know we want to do something with the messages matrix, right? So let's start there. I know I want to index into it to get something: for the first one, I want to see only the rows where messages at OkCupid are equal to 13. So I can use which of the functions (it's a hint) that we've already gone over? The which() function, right? So let's play with the which() function.

Since this one is a function, we're going to use the normal parentheses, so put that in there. And then we're going to say we just want to look at okcupid in this one, right? We don't care about match yet. So I'm going to take the name of my data frame and use that trick we reviewed earlier, that I can dollar-sign into one of my columns: okcupid. And then I want to see when it was equal to 13. But what am I missing here? Two equals signs, exactly. Great answer. Because a single equals sign is like we're actually doing an operation; I want to ask "is it equal?", which is the double equals.

So when I run that code... "undefined columns". Great. Did I misspell something? Oh, exactly. I mentioned it, and I forgot it myself. Okay, so we're just talking about the rows, so we want this in just the rows part. When we square-bracket in, it's rows, comma, columns. So we can put a comma, and since we're not narrowing down the columns (we just want all the columns for the rows that satisfy this condition), we don't have to put anything after the comma.

So just to write it out: when you index into something, you're going to have your object name, then (oh, yep) square brackets, and if it has two dimensions, like a data frame or a matrix, you're going to have rows, comma, columns. And notice, just as a reminder, that when I put this hashtag, it comments the line out, so it won't actually affect the code I'm running.
This is just a note to myself as a reminder.

Okay, now... Sorry, when I did which with messages$okcupid, without the comma, I got a list of all of the... I didn't get an error. I just got... Okay. Are there any differences in how these look? Do you have any idea where that was? Does it matter that I have capital letters?

Yes, so R is case sensitive. It does matter whether you have capital letters or lowercase letters. If it doesn't match exactly what the variable name is typed as, it won't understand what you're trying to point to or get it to call. So when you do turn it all to lowercase, do you get the "undefined columns"? Yeah, okay. Interesting. I wonder how it's reading that. Does it not come up all lowercase? Yeah, we have some things that are saved in the back end for use between sandboxes. Good question, though.

Okay, let's take a shot at the second part of this challenge. Now we want to know in which rows the number of messages at OkCupid was greater than the number of messages at Match. So start with the basics. We're going to look in our messages data frame, and we're going to index into it somehow. I know I want something in there, right? So we can use our which() function again. Start with which, and since it's a function, normal parentheses. And now I want to compare two of the variables within my data frame. So I can kind of steal what I did here, where I indexed into one of the variables. Within messages, I take okcupid, and I want to see when it's greater than match, so I'm going to use a greater-than symbol. And then what am I going to do to get match here? Exactly: messages, dollar sign, match. What am I missing? Exactly, perfect. Something that gets me every time, personally. And let's go ahead and run that, and it'll print out a data frame for me.
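Both parts of this challenge can be sketched as below, again with hypothetical message counts and assumed column names. The last lines show why forgetting the comma can produce "undefined columns selected": without it, the number returned by which() is treated as a column position.

```r
# Hypothetical data, rebuilt so this example is self-contained
messages <- data.frame(
  okcupid = c(16, 9, 13, 5, 2, 17, 14),
  match   = c(17, 7, 5, 16, 8, 13, 14),
  row.names = c("Monday", "Tuesday", "Wednesday", "Thursday",
                "Friday", "Saturday", "Sunday")
)

# Rows where OkCupid messages equal 13: condition in the rows slot,
# nothing after the comma, so every column is kept
thirteen <- messages[which(messages$okcupid == 13), ]

# Rows where OkCupid received more messages than Match
ok_wins <- messages[which(messages$okcupid > messages$match), ]

# Without the comma, the index 3 is read as "column 3", which does not
# exist in a two-column data frame
no_comma <- tryCatch(messages[which(messages$okcupid == 13)],
                     error = function(e) conditionMessage(e))
```

With these numbers, `thirteen` holds the Wednesday row and `ok_wins` holds Tuesday, Wednesday, and Saturday.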
So this is, I think, a different way of doing it than the solutions show. Yep, there you're indexing specifically into the first column and the second column to get okcupid and match. The dollar sign notation is another way you can do it. It does make the code longer, but they do the same thing. The output will be a little different, though. Instead of telling me just the days, this one is going to give me a data frame with the days and the numbers there. So don't worry about that. This shouldn't give you a warning if you're doing it in R itself; it's just that learnr is written for sandboxes, and while it's helping us, it does come up with warnings that in this case are unneeded.

So now let's explore the subset() function. subset() is going to take an object and a condition you give it, and then return all of the things within that object that meet that condition. You all okay with that? Okay. So let's first make an object to work with, so we can reference it. I'm just saving this vector as my x3. Then, within the subset() function (arguments are just the things inside a function, separated by commas), my first argument is going to be the object itself, and my second is the condition that I want it to satisfy. If an element satisfies that condition, it's going to be included in the subset; if it does not, it's going to be thrown away. So within x3, I want all of the elements that are greater than six. What am I expecting when I run this code? Which two elements is it going to drop? Just three and five. Run that, and it did what we expected it to.

Yes, you have to have the object first. That's what it's expecting, unless we're talking about pipe operators.
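A small sketch of subset(), first on a vector and then on a data frame. The contents of `x3` are an assumption (the session only says that 3 and 5 get dropped), and the messages data are the same hypothetical counts used for illustration elsewhere.

```r
# Hypothetical vector: 3 and 5 fail the condition, the rest pass
x3 <- c(3, 5, 7, 9, 11)

kept    <- subset(x3, x3 > 6)   # object first, then the condition
bracket <- x3[x3 > 6]           # the same result with square brackets

# subset() also works on data frames; column names can be used
# directly inside the condition
messages <- data.frame(
  okcupid = c(16, 9, 13, 5, 2, 17, 14),
  match   = c(17, 7, 5, 16, 8, 13, 14),
  row.names = c("Monday", "Tuesday", "Wednesday", "Thursday",
                "Friday", "Saturday", "Sunday")
)
ok_wins <- subset(messages, okcupid > match)
```

For a data frame, subset() returns whole rows, so the result keeps the day names alongside both columns.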
Yeah, the percent-greater-than-percent, %>%. More on that later. But great question.

So, challenge seven, then. Using the OkCupid data from above, we're going to answer this question. We're going to change the which() statement to a subset() statement, extracting the days on which the number of messages at OkCupid is greater than the number of messages at Match. So let's scroll up to challenge six and steal that code, basically, to work with here. I'm actually going to start (oh my goodness) on a separate line and use parts of this code, but I'm not going to change that code itself. I'm just going to reference it for this example.

First of all, we know we want to use subset() instead of which(), right? So I'm going to take out what we've indexed into the rows, and I'm going to paste that on another line. So now I just don't have the square brackets, I don't have the first object name, and I don't have the comma. I'm going to change this which. So now, what function do we want? subset, exactly. I'm going to type in subset. We're missing one thing here, though. What is the first argument in the subset() function? It's not the condition right away; it's the object, right? So we need to put the object's name first: messages, comma. And then we can have our condition. I'm just going to put a hashtag here; that comments that code out, so it shouldn't affect what results we get. And run it there.

Oh, thank you so much. I do math all day; I'm not very good at English. Okay, thank you for catching that. And now we see it's printed out the rows, the entire rows, and we can see the days.

So I think this one is a little different than the solutions, too. Same thing: this is indexing into the first column, so that's telling me I just want okcupid. It does the same thing as this does. Same with this second one: it says, second column in my data frame,
I want to look at match. So, two ways to do the same thing.

Now let's look at logicals. One thing to know, a little nuance: when we've been talking about logicals, we've been referring to the TRUE and FALSE values, right? These are logical values. Now we're going to look at logical statements, which have logical operators. So, just a ton of vocab thrown at you. Operators are like, when I want to do addition, my plus sign is my operator; for subtraction, the minus sign. An operator does an operation. A statement compares two things with an operator.

There are three ways we can do this. The first is with an "and" statement. This is where we use the ampersand, &. In case you don't know where it is, you press the shift key on your keyboard and then hit the seven, and that's how you get your ampersand. The ampersand is special, though, because it only returns TRUE if all of the relational statements are true. If you have even one false, it's going to return FALSE. All of them have to be true. A couple of examples: is three less than five? That's true. And is nine greater than seven? That is also true. So since we have true and true, we're going to get TRUE out of this. With this one, three is not greater than five, so we have false and true, and it's going to give us FALSE. Both of them have to be true for "and" to give a TRUE.

Then our second statement is "or". This is where we have our vertical bar, |, or pipe. But be careful when you call it a pipe, because there is a certain pipe operator used in R that's more commonly referred to as a pipe. So we would suggest just calling it a vertical bar, even though it is more to say. This one is located on your keyboard when you press shift and then the key above your enter key, so it's in kind of a strange spot, especially if you haven't seen it before. What this does is it reads as "or". So if I'm asking you something, do you want to do this or this? If one of them is yes,
We'll go do something. So it's kind of the same deal here, just with numbers and math. If at least one is true, it's going to give you a true. So three is greater than five, that one's false, or nine is greater than seven, that's true; since one is true, it's going to give us true. If they're both false, like three greater than five, false, or nine less than seven, false, it's going to give us a false. If you have both true, what is it going to give you? True, because you have true or true; as long as one of them is true, it'll give you true. Now we have not. We've seen this a little bit before, when we were doing equals-equals and then not-equal-to, right? You probably know it as an exclamation point; they call it a bang also. So if you hear that terminology, you're not just like, what's going on? We can use the bang to negate statements. Essentially, it negates whatever comes after it, so you have to put it before the statement that you want negated. That's why in not-equal-to the bang comes before the equals sign as well. This one is a little backwards logic, because you have to be cognizant that the bang is there; otherwise, you're going to get unexpected results. If I do is.numeric of five, I know that's true. But if I negate it, I expect to see false, because five is a numeric data type. Are you all okay with that? Sounds good. Okay, one thing to note: sometimes you may see code where people use the double ampersand or the double bar. Those are not the same as their single counterparts. They do not do the same thing. Instead of doing an and element by element, where both have to be true to get a true result, the double ampersand will only evaluate the first element of an object. So it's not looking at all of the things within an object. Especially if you're dealing with multiple objects with a lot of elements, it's only going to look at the first one, and that can give you misleading results. Same with the bar.
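The three operators covered so far can be tried on the same numbers used in the explanation. This is a minimal sketch; the variable names are only there so each result can be inspected, and the parentheses are optional but make the two relational statements easier to see.

```r
# "And" (&): TRUE only when every relational statement is TRUE.
and_true  <- (3 < 5) & (9 > 7)   # TRUE and TRUE   -> TRUE
and_false <- (3 > 5) & (9 > 7)   # FALSE and TRUE  -> FALSE

# "Or" (|): TRUE when at least one relational statement is TRUE.
or_one  <- (3 > 5) | (9 > 7)     # FALSE or TRUE   -> TRUE
or_none <- (3 > 5) | (9 < 7)     # FALSE or FALSE  -> FALSE
or_both <- (3 < 5) | (9 > 7)     # TRUE or TRUE    -> TRUE

# "Not" (!): the bang goes in front of the statement it negates.
is_num  <- is.numeric(5)         # TRUE
not_num <- !is.numeric(5)        # FALSE

print(c(and_true, and_false, or_one, or_none, or_both, is_num, not_num))
```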
It's only going to evaluate the first one if you have the double bar. This one is where it gets a little tricky, too, because if you're dealing with something with multiple elements and neither of the first ones is true, it's going to give you a false. But with a single bar, as long as at least one of them is true, it gives you true. So it's not considering the rest of the elements, and if you have a true in the rest of the elements, it could give you misleading results as well. So let's look at these a little bit closer. Within this example, we're just doing one ampersand, so this is and. We need both of them to be true for it to give us a true, right? So in this sense, when we're comparing logical values in a logical statement, we're seeing if they match, is essentially what we're doing. So we're going to take the first element here. Does it match the first element there? Yes, so what's it going to give us? True. For the second element, it does not match, so we're expecting false. And then the third element, it does match, but they're both false. So here you have to, well, actually, let me run it. I think you have to have two trues to have a true, especially with and. So with the ampersand, this is giving you a false. Think of it this way: there could also be a statement here asking you, is three greater than five? That's going to give you a false result. Since there is already one false result, it's going to give you false. Are you all okay with that? Okay, so I guess I misspoke when I said if they match. Imagining that you plugged statements in there is a better way to think about it. Okay, now let's look at the bar. So what does the bar mean? Is this our and, or, or not? This is our or. So as long as one of the elements is true, then we're going to have true, right? So we're going to match these up element-wise. So when I match the first ones, is one of them true? They're both true.
So we're going to expect a true. Second: at least one of them is true, so we're going to expect a true here. And then no trues in the third element, so we're going to expect that to be, oh my goodness, we're going to expect that to be false. Expecting true, true, false, and that's what we get. Okay, so now on to the last one. This is where our caution comes in. When we're doing the two ampersands, it only considers the first element of both things you're comparing. I'll run it first to show you. Instead of giving us what we expect, the three different elements, it only gives us the one, because it only compares the first elements. Since this is true and this is true, both are true, but it's only doing the first, so it's going to give us a true out. Okay? I know I'm saying a lot of the same words again and again. Please ask questions if you get mixed up, or especially if I get mixed up, too. Let's look at challenge eight. All right, we've got two parts to the challenge. In the first one, we're going to see if the last day of the week is under five messages or above 10 messages. So first of all, we have to discern what the last day of the week is in our case. They do give us a very helpful hint. I'm going to take that hint, copy that in, set it in there. But I'm going to go ahead and highlight just this part of it, just to see what the tail is doing. So when I just run the tail part, it's telling me 14. Oh no, I just want... okay, ignore that. So when I run this whole thing, it's going to give us the last row. Tail is going to give us the very last row; that n says I only want the last one row. If we have a bigger number there, like 10, it's going to give me the last 10 rows. But in this case, we only have seven rows for the days of the week, and we're only interested in the very last day. So this is saying, I want the one last row, and OkCupid.
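The element-wise walkthrough of the single and double operators can be reproduced with two short logical vectors. The vectors a and b are assumptions chosen to match the pattern described (both true, one true, both false). One caveat worth knowing: the first-element-only behavior of the double operators applies to older R; from R 4.3.0 onward, giving && or || a vector longer than one is an error, so this sketch only passes single values to them.

```r
a <- c(TRUE, TRUE, FALSE)
b <- c(TRUE, FALSE, FALSE)

# Single operators compare element by element and return one
# result per position.
and_each <- a & b    # TRUE FALSE FALSE
or_each  <- a | b    # TRUE TRUE  FALSE
print(and_each)
print(or_each)

# Double operators return a single TRUE/FALSE. Only length-one
# inputs are used here, because `a && b` on the full vectors is an
# error in R 4.3.0 and later.
and_first <- a[1] && b[1]   # TRUE
or_third  <- a[3] || b[3]   # FALSE
print(c(and_first, or_third))
```

When a single answer for whole vectors is wanted, all(a & b) or any(a | b) states the intent directly.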
So I'm going to save that as last. After that, I'm going to print out last. That should get me what I'm trying to show you: 14. Okay, so let's go ahead and look back at the data frame. Here we are. So what it's doing is giving me the last value in OkCupid, and I can see here that my last value is on a Sunday. So now... what did you say? Okay, sorry. So now we can use this last to do some operations on. We want to find out: is the last day of the week under five messages or above 10 messages? What we can do is, we don't actually have to index into this in this case, so we can just put those side by side with a relational operator. That n says I just want the very last observation. So I want the one last observation; if I have n equals two, it's going to give me the two last observations in that column. Whatever number you specify tells it how many. Great question. So here, the first part of this is under five messages, so I'm going to use less than five. And then what logical operator do I want here? Or, so what's the symbol for that? The bar, right? So we're going to go shift and our bar, and then use the name again and say last, and now I want to see if it's above 10, so greater than 10. So when I run that, it's going to tell me true. And I already saw that when I just print out last, it's 14, right? So since it's or, only one of these things has to be true to return a true. And we can tell which one is satisfied. Is it less than five or greater than 10? 14 is greater than 10, so that's the one that's satisfied. Okay, now let's do some more with it. Now we want to see if the last day of the week is between 15 and 20 messages, excluding 15 but including 20. This is where we can add a little bit to our greater than or less than. We can include with greater than or equal to or less than or equal to, and exclude if we don't include the equals sign. So here I'm going to do something very similar. I'm going to take my last, and I'm going to exclude 15.
So I don't need an equals sign, right? I just want greater than, then 15. And then what logical operator am I going to use? And. And that is our ampersand, just the one. And then type last again, and here I want to include 20. The order you type it out is exactly how you would read it, so this is less than or equal to; the equals sign always comes after. I'm going to run that. What are we expecting when I run it? True? So we know the value of last; when I just look at the value of last, what does that give me? Well, the value of last is supposed to be 14, because that's how many messages we got on Sunday from OkCupid. So since it's 14, is it greater than 15? It's not. And means both of them have to be true, so we don't even have to read the second statement to know what this is going to give us, right? What answer is this going to give us? It's going to go ahead and give us false, and there it prints out my 14, after I needed it. Okey dokey, how are we feeling on this one? Feeling okay? No, go ahead and ask yours first. What if you don't know where in your data set Sunday's data is, like if you wanted to get Wednesday's data? Gotcha. Is that easy enough? How do we subset by row names? We're almost done with your part. We'll cover that. Okay, sounds good. Currently it's going to finish at this page.
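Both parts of challenge eight can be sketched as below. The data values and the lowercase column name okcupid are made-up stand-ins (the real data are not in the recording); they are chosen so that the last OkCupid value is 14, as in the session.

```r
# Hypothetical stand-in for the workshop's `messages` data frame.
messages <- data.frame(
  okcupid = c(16, 9, 13, 5, 2, 17, 14),
  match   = c(17, 7, 5, 16, 8, 13, 14),
  row.names = c("Monday", "Tuesday", "Wednesday", "Thursday",
                "Friday", "Saturday", "Sunday")
)

# tail() with n = 1 keeps only the final observation (Sunday).
last <- tail(messages$okcupid, n = 1)
print(last)

# Part 1: under 5 messages OR above 10 messages?
extreme <- last < 5 | last > 10
print(extreme)

# Part 2: between 15 (excluded) and 20 (included)?
# Both sides must hold, so the two statements are joined with &.
in_range <- last > 15 & last <= 20
print(in_range)
```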
We'll take a little bit of a break and then we'll start with that. Great leaving question. One last thing on this challenge: when you're in a math class and you're writing out an operation, it may be kind of an instinct to say a value is less than my object, which is less than another value. Let's see how R treats that. So if we were going to look at the last one and say 15 is less than last, and last is less than or equal to 20, all in one go. If I run that, let's see what it gives us. It gives us an error. So R doesn't like to read it the way we would normally write it out, shorter and more condensed, in, say, a math class. It needs each relational statement written out on its own and joined with a logical operator. And then, the last little bit of the challenge: we're going to look at the subset command. Just a reminder, let's walk through what these are saying. This is our function; since it's a function, regular parentheses. The first argument in subset is always our object, so here we're using the messages data frame. And then we put our condition, so we can take that condition and put it in here. We're looking at when OkCupid is less than six messages or Match is less than six messages. So notice that if either is true, it's considered a bad day. You're not getting enough messages from either one; it's a bad day. So when I run that, you can see Wednesday, Thursday, and Friday are bad days. And it does what we expected: since it's or, only one has to meet those conditions. It doesn't necessarily have to be both. Now for good days, we want to see them both have a lot of messages, right? We want our contestant to be successful in their dating life. So, same setup here: function, normal parentheses, object, and our conditions. OkCupid has to be greater than 10 and Match has to be greater than 10. So all the days where both of these are true, that's going to be considered a good day.
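The math-style chained comparison and the two subset calls can be sketched together. As before, the data values and the column names okcupid and match are assumptions for illustration; they happen to reproduce the bad and good days named in the session.

```r
# Hypothetical stand-in for the workshop's `messages` data frame.
messages <- data.frame(
  okcupid = c(16, 9, 13, 5, 2, 17, 14),
  match   = c(17, 7, 5, 16, 8, 13, 14),
  row.names = c("Monday", "Tuesday", "Wednesday", "Thursday",
                "Friday", "Saturday", "Sunday")
)
last <- 14

# The condensed math notation is not valid R and fails to parse,
# so it is left commented out:
# 15 < last <= 20
# Each comparison is written separately and joined with &.
in_range <- 15 < last & last <= 20
print(in_range)

# Bad day: either site delivered fewer than 6 messages.
bad_days <- subset(messages, okcupid < 6 | match < 6)
print(bad_days)

# Good day: both sites delivered more than 10 messages.
good_days <- subset(messages, okcupid > 10 & match > 10)
print(good_days)
```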
So when we run that, you can see that Monday, Saturday, and Sunday are good days. The dollar sign, because you've already defined the object? Yes, in this case, I think, yeah, that does work. Absolutely. Okay, let's take a short... how long of a break? Five minutes. All right, so we call this Intermediate R because this is the middle of where we want to end in our series. But a lot of the topics here are things that we don't often think about, so it's actually in some ways advanced R. It's very technical, logical thinking. And so if you're a little bit overwhelmed with the tedium of going through today's workshop, we understand that. Also, you know, Harley is fighting with this a little bit more than I am, because a lot of this is the old way of doing R, or talking to R, or programming in R. But it's important to understand what's actually happening, so that when we talk about more advanced or newer ways of programming in the next workshop, we call that the tidy way, it helps us understand what's actually going on when we just use more functions rather than asking questions directly. So there's a little bit of push and pull between the old way and the new way, and sometimes it's easy for us to be tempted to skip ahead to the next workshop. Yes, yes, next week, or in the next two weeks. In data wrangling we'll primarily use the tidyr package, which we talked about a little bit, mostly to get the ggplot functions. But that next one will bring in other pieces, like the percent, greater-than, percent symbol, to pass things along. Another old-fashioned way of doing things is row names. We don't typically recommend using row names anymore. It works in this particular case for these examples, but even though it's not the best way to program,
I don't know how many times I've encountered data where people have actually named their rows. So it's important to know that you can have row names and work with them, even though it's kind of a legacy thing that we don't recommend. So what Sally asked is, what if you know the name of your row and you just want that row? We can type in messages, and since we know the name, we'll use square brackets. We put Wednesday in quotes, and then outside the quotes we put our comma. And we can run that, and we get the whole row for Wednesday. Again, that only works if you have row names on your object. If we didn't have row names on our object, and this would be the preferred way, we'd have a column of days of the week, and we could test for equality on whether that day of the week was Wednesday. All right, so we need logicals in order to make conditional statements, so we're building in complexity. When we have conditional statements, now we're saying, well, we don't want to just test for equality. We don't want to just find out if things are true or false. We want to actually do something with that. Where we want to start is with an if statement. Typically we don't just say, if this is true, then do that. We typically say, if it's not true, do something else. So we would have an if and an else statement. Yes, yes. Well, sort of. If this condition is true, then we do this statement. Otherwise, so else is more like otherwise: if it's not true, then do something else. So the then is the statement: then do this. And the else is, if it's not true, then do something else. So for instance, let's just say that y took a value of negative three, and we have an if statement. It's a function, so we have parentheses, our round parentheses. Inside of that we have our conditional statement.
So y is less than zero; if that's true, then we'll print out "y is a negative number". So we'll run that, and it is true because y was negative three, and we get "y is a negative number". Let's see what happens if we change it to a positive number, for instance five. We don't get anything, because we didn't tell it what to do if it wasn't true. So we would need a more complicated statement. We'll get there; we're not there yet. First, we'll do a challenge where we're going to use the last number from challenge eight. And again, last isn't saved in memory, so we have to get it again. Then we're going to write an if statement that prints "You're popular" if the number of messages exceeds 10. So if last exceeds 10, if last is bigger than 10, then we have curly braces, so that's shift and the square bracket key. And I like to put in a little bit of spacing here. In RStudio, it'll automatically indent for you, so it'll be easier to track which things are inside your if else statements or if statements. And so we type in "You're popular" and then we close our curly brace. learnr does highlight, so when we close the curly brace, it indicates the first one so that we know we have a matching pair. We should be able to run that. And last was 14; 14 is bigger than 10, so "You're popular". Again, most of the time we want an if else statement, because we don't only want to know if something is true; we also want to be able to handle the case where it's not true. And so we have if, condition; if that condition evaluates to true, then we do statement one. If it evaluates to false, then we jump into the else and we do statement two. Typically, I put my else right after the closing curly brace of the if, just so that I know where it's flowing. And in R that placement matters: outside of a function or another set of braces, an else that starts on its own line gives an error, so keep it on the same line as the closing brace. All right, so we're going to try this again. Same if statement.
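The if statement, challenge nine, and the general if else layout can be sketched as below. The value 14 for last is taken from the session; the status variable and its two strings are placeholders standing in for "statement one" and "statement two".

```r
# A bare if: the body runs only when the condition is TRUE.
y <- -3
if (y < 0) {
  print("y is a negative number")
}

# With a positive y the condition is FALSE, the body is skipped,
# and nothing is printed.
y <- 5
if (y < 0) {
  print("y is a negative number")
}

# Challenge nine: report popularity when the count exceeds 10.
last <- 14
if (last > 10) {
  print("You're popular!")
}

# General if / else layout. The else sits on the same line as the
# closing brace of the if block.
if (last > 10) {
  status <- "statement one ran"
} else {
  status <- "statement two ran"
}
print(status)
```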
So if y is less than zero, we'll print out "y is a negative number". Otherwise, else, we'll print out "y is either positive or zero". We don't know which, because we didn't test. And so let's see what happens when we run that. We get "y is a negative number", because it's still negative three. Let's see what happens if we change it to zero: we get "y is either positive or zero". Let's see what happens if we change it to five: we get the same message, "y is either positive or zero", because we're going into that else statement. We would need a further statement in order to test if it was positive or if it was zero. So that is the traditional way of doing it. We are going to preview one function that does all of this on one line, and that's the ifelse function. It condenses it down, because it puts the condition in the first spot in the function, the statement that will happen if the condition is true in the second spot, and the statement that's going to happen if the condition is false in the third spot. And so it just takes all those six lines of code, seven... oh, yeah, two, three, four, five lines of code and puts them on one line. ifelse: the condition was y is less than zero. If that's true, we say "y is a negative number"; if it's false, we say "y is either positive or zero". We should get the same results, and we do. And you can play around with changing that number if you want to. For our challenge 10, we're going to rewrite challenge nine and give it an else condition. If the condition is not true, we're going to say "Send more messages".
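The multi-line if else and its one-line ifelse equivalent can be compared side by side. This is a sketch using the messages quoted in the session; the variable names long_form, short_form, and the rest are only there to hold the results.

```r
y <- 0

# Traditional form, spread over several lines.
if (y < 0) {
  long_form <- "y is a negative number"
} else {
  long_form <- "y is either positive or zero"
}

# ifelse(): condition first, value if TRUE second, value if FALSE third.
short_form <- ifelse(y < 0, "y is a negative number",
                     "y is either positive or zero")
print(c(long_form, short_form))

# Challenge 10 in the same style, with last = 14 as in the session.
last <- 14
popular_10 <- ifelse(last > 10, "You're popular!", "Send more messages")
popular_14 <- ifelse(last > 14, "You're popular!", "Send more messages")
print(c(popular_10, popular_14))

# ifelse() also works on a whole vector at once.
signs <- ifelse(c(-3, 0, 5) < 0, "negative", "positive or zero")
print(signs)
```

The second challenge call returns "Send more messages" because 14 is not strictly greater than 14.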
So let's go back up to what we did for challenge nine. I'm going to copy that, and I guess I didn't need to get last again. So we're going to take this and condense it down into the ifelse function, and keep our condition, last is bigger than 10. I'm just going to copy that "You're popular" message, and I'll get rid of that. And if last is not bigger than 10, we'll print out "Send more messages". Last was 14, so we're going to get "You're popular". Let's change 10 to 15 and see what happens. Last is not bigger than 15, so we get "Send more messages". What happens if we change this to 14? What message are we going to get? "Send more messages", because last is equal to 14. It's not strictly greater than 14, and so that would evaluate to false. All right, so what's the difference between an if else statement and an else if statement? Well, this is just a more complicated if else statement, where we nest them together. So now, if condition one is false, maybe we want to check condition two, and then we just add on a second if statement. That second if statement can also say what happens if condition two is false, so that also has an else, and we can keep nesting these down as long as we want. I don't know if you all do a lot of IF statements in Excel, but sometimes I do, especially with grading, if I'm trying to have Excel tell me what grade to assign somebody and I've got like eight conditions nested together. I could use R, but, you know, grade books... anyway. So I've got a lot of if else conditions in there. R is just a little bit different, because instead of just saying if, then, we have if else, or we have the ifelse function. So now we can actually decide if y is zero or if it's positive. So the first condition is checking to see if y is less than zero.
So that would tell us if it's a negative number. If that's false, we know it's either positive or zero. So then we add in: if y is equal to zero, then we know that it is zero. Otherwise, the only other option, as long as it's a real number, is that it has to be positive. So now if we run through our three numbers: if y is negative three, we get out "y is a negative number"; if we put in zero, we get "y is zero"; if we put in five, we get "y is positive". What about y is pi? We can just say pi, p-i, and it knows that pi is 3.14 blah, blah, blah. Okay. I like to keep this in here, the modulo operator, even though most people are probably not going to use modulo a lot. But this can help with deciding if things are even, or can help with allocating people to groups. What modulo does is return the remainder of a division, not the quotient itself. So if we were to say five modulo three: five divided by three is one with a remainder of two, so it returns two. So let's try that out: five modulo two and, sorry, modulo three. Five divided by two is two with a remainder of one; five divided by three is one with a remainder of two. And so we can use this to determine if something is even or odd. Or we can say: if x modulo two is zero, then we know it's divisible by two. Otherwise, if x modulo three is equal to zero, then we know it's divisible by three. Otherwise, it's not divisible by two or three. So if x was six, that's divisible by two. Why did we only get one statement and not both? Because six is divisible by both two and three. It hits the first condition, the first condition is true, so it prints out the first message and then it's done. It doesn't go into the second condition.
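The else if chain and the modulo checks can be sketched as two small helpers. The function names classify_sign and divisible_by are hypothetical, introduced here only so the same chain can be run on several inputs; the session typed the if statements directly.

```r
# Hypothetical helper: an if / else if / else chain on the sign of y.
classify_sign <- function(y) {
  if (y < 0) {
    "y is a negative number"
  } else if (y == 0) {
    "y is zero"
  } else {
    "y is positive"
  }
}
sign_results <- vapply(c(-3, 0, 5, pi), classify_sign, character(1))
print(sign_results)

# %% returns the remainder; %/% returns the whole-number quotient.
print(c(5 %% 2, 5 %% 3))    # remainders: 1 and 2
print(c(5 %/% 2, 5 %/% 3))  # quotients:  2 and 1

# Hypothetical helper: only the first TRUE branch runs, so 6 reports
# divisibility by two even though it is also divisible by three.
divisible_by <- function(x) {
  if (x %% 2 == 0) {
    "x is divisible by 2"
  } else if (x %% 3 == 0) {
    "x is divisible by 3"
  } else {
    "x is not divisible by 2 or 3"
  }
}
div_results <- vapply(c(6, 9, 11), divisible_by, character(1))
print(div_results)
```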
It doesn't even know the second condition exists. So if we were to change x to nine, we get "x is divisible by three"; if we change it to 11, we get "x is not divisible by two or three". Any questions about that? Most of you probably won't use that, but it's kind of fun to know that it exists. All right, so now let's talk about loops. Whenever we're doing something multiple times, it's more efficient to write a loop and have the code do it over and over again than to do it by hand over and over again, because you're more likely to make a mistake if you're rerunning code yourself. Loops are similar to if else statements: we'll have some sort of condition, and as long as the condition stays true, we will run whatever statements we want. As soon as the condition evaluates to false, we'll stop running our loop. There are while loops, and while loops are dangerous because you don't necessarily specify a stopping point and they could run infinitely. Usually, if you do a lot of programming, at some point in your life you're going to write an infinite loop that will only end if you shut your computer down or force R to close. For loops are safer, because you're specifying a stopping point to begin with. So for instance, if we have 100 numbers, we'll say for i, where i is some index. Usually in looping statements we have an index indicator that's maybe one to three characters. It's a lot shorter than a variable name, so that it stands out and makes it obvious that it's not a variable, that it's just used in the loop. So for i in one to the number, then we run the statement. Notice this does not have the percent symbols around the in. This is a different kind of in than checking to see if something is in a vector. The in with percent symbols came after for loops, so this is just an older style of doing it, and it's specifically saying, run as long as i is in the sequence.
We're going to keep running it. You can't swap in the percent symbols here; the for statement takes the plain in. We could also say that the sequence was the numbers one to 100, and then we could say for i in that sequence; as long as that remains true, we run the statement. So what happens is we start out with i is one. That's less than 100, and we go through; it auto-increments. So when we hit the end, the last curly brace, it will add one and go through again, and it keeps going through until we get to the 100th iteration, and then it will stop. We can test this by just printing out our index. All we do is print out i, so we go through and it prints out the numbers one through 10. And then we can check that it made it all the way through by printing out i. i remains the last number that was evaluated, so it won't be 11; it'll actually be 10. It never reaches 11, because the loop stops after the last value in the sequence. So this is another tip: if you get an error message when you have a for loop, I type in i and I see where it stopped, and that helps me narrow down where my problems are. All right, so let's say that we have two vectors and we're going to add them together, and we're going to ignore the fact that we could just say x plus y and add them element by element. We're going to do this the crude way, by hand, and we're going to store our results in z. So we're repeating a missing value, NA, the length of x. So however long x is, we're going to get a vector that's set up to be the same length as x, but it has missing values stored in it. We're going to go through each element of x; x is 10 elements long, and we're going to store x plus y. And I need a lowercase z. Another reason why it's probably not a good idea to use x and z by themselves as names. x plus... we don't want the whole vector of x; we want the i-th element of x, and we want the i-th element of y. And let's see what we get.
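The index-printing check described above can be sketched in a few lines. The second loop, which totals the numbers one to 100, is an added illustration of looping over a longer sequence and was not part of the session.

```r
# The loop visits each value of the sequence in turn.
for (i in 1:10) {
  print(i)
}

# After the loop, i still holds the last value it took (10, not 11).
# Checking i after an error shows which iteration failed.
last_i <- i
print(last_i)

# Added illustration: accumulate a running total over 1 to 100.
total <- 0
for (j in 1:100) {
  total <- total + j
}
print(total)
```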
Let's also print out x and y. Sometimes tab complete is too... let's see. So it does look like y is the same length as x. If y was longer than x, it would just ignore the values past the tenth position. And I have a problem here, because I'm only saving the last element. I forgot my index into z. That looks better: 1 plus 10 is 11, 2 plus 12 is 14, all the way up to 38. So if I forget my indexing, it's only going to save the last time through the for loop. All right, I was hinting at this. This is a medium challenge, and I already gave you the answer to it. How do you add x and y? You just say x plus y. And if you want to save that into a vector, you could say z is x plus y, and then we need to print z out. We get the same thing. Okay, so we're going to go a little bit more challenging. We're going to go back to the Blackfoot fish data set that we used for data visualization. We are going to read in the data set and modify the code to write a for loop. We want to divide our sample into a training sample that we could run a model on and a testing sample. This is a really complicated way of doing that, but it's a good practical example. We're going to find the indices needed to sample every seventh row from the data set, starting with the first row. We're going to stop as soon as we've sampled 1200 rows, and then we could run a model on that. So first, read in the data set. We're setting the size of our stopping point, so we want a sample size of 1200. We're initializing the vector that we're going to store our indices in, so we're going to temporarily have NAs in there, and it's going to be of length 1200. And our first index we want to be one, because we want to start with the first row. Then we want to go through: for i in two, since we already know our first spot has to be one, and we want to stop at some point. What is our stopping point? How many elements do we want in our final vector?
Then we're going to store our index in samps i, and we need to put something in for the process that we execute at every index. We want to do every seventh row, so there's lots of different ways of doing this. What's one way? So the second element should be eight, right? So how would we get eight from term one? Yep, samps the previous; we could do that. We could do samps i minus one, plus seven. That's one way to do it. What's another way to do it? We could do it the math way. You could say i minus one, times seven, plus one. So if i is two, two minus one is one, times seven, plus one is eight. Eight plus seven is 15, so three minus one is two, times seven, plus one. So that would be the other way to do it, but probably this way would be most intuitive: it's the previous value that we stored, plus seven. Yes? Well, i is one. If we did it this way, the math way, we could start with i equals one, right? One minus one is zero, zero times seven is zero, plus one, we would get one to start. If we did it your way, we would have to have the first one be one already, because R is one-based, and so samps zero is empty; it doesn't have anything in that zero slot. All right, so then once we have that vector of indices, we could put that in for our rows.
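Both loop patterns above, adding two vectors element by element and collecting every seventh row index, follow the same recipe: set up a vector of NAs, then fill one slot per pass. The sketch below assumes x is 1 to 10 and y is the even numbers 10 to 28, which is consistent with the sums read out in the session (11, 14, up to 38) but is not confirmed by it. The name n_samps is also an assumption, and the fish data frame in the final comment is a hypothetical name.

```r
# --- Adding two vectors by hand ---
x <- 1:10
y <- seq(10, 28, by = 2)      # assumed values
z <- rep(NA, length(x))       # placeholder vector, same length as x
for (i in seq_along(x)) {
  z[i] <- x[i] + y[i]         # without [i] on z, only the last sum is kept
}
print(z)
print(all(z == x + y))        # the vectorised x + y gives the same answer

# --- Every seventh row index, starting at row 1 ---
n_samps <- 1200
samps <- rep(NA, n_samps)
samps[1] <- 1                 # first slot is set before the loop
for (i in 2:n_samps) {
  samps[i] <- samps[i - 1] + 7   # previous stored value plus seven
}
print(head(samps))

# The "math way" computes each index directly from its position.
samps_math <- (seq_len(n_samps) - 1) * 7 + 1
print(all(samps == samps_math))

# With a data frame (hypothetical name `fish`), fish[samps, ] would
# select those rows and fish[-samps, ] everything else.
```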
We could use that for row selection. No, you don't really need the which function. We just showed you that so that you could use it if you needed to for particular situations. That will just select every seventh row, and then we can negate that to get our training set. Okay, run this; make sure I don't have something highlighted. And we can't see anything; it didn't print anything out, because it's all just stored in memory. But we can get the head of samps, and then we can see one, eight, fifteen, twenty-two, et cetera. Just like you can nest if else statements, you can nest for loops. And so if you needed to iterate through a matrix, you could go through the rows and you could go through the columns, and you could do something fancy. If we needed to do something for every row, and then we needed to change what we did for every column, that's how we would do that. So when would we possibly use this in our code? Say we wanted to manipulate a matrix by setting its elements to specific values based on their row and column position. It could go either way; we could say the first index ran over the rows of the matrix and the second one ran over the columns. And when should you use a loop? Whenever you're doing something over and over again, multiple times, use a loop to make your code more efficient and less error prone. But if you're really doing something over and over again, and you're not just doing it for a single data frame, if you're going to do it for multiple data frames, then you probably want to go more advanced than a for loop, and you probably want to write your own function. And this is not as scary as it sounds. Functions can be really simple, with just a few instructions, or they can be really complicated and be made up of multiple functions, so a function that operates on functions. The trickiest thing about functions is that you do not want to reuse a name that somebody else has used,
because as soon as you do that, as long as you're in that instance, it's going to use your version instead of what was already programmed into base R or whatever packages you're using. So for instance, don't create a function and call it mean, unless you really want your version of mean instead of R's version of mean. Or table, lm for linear model, glm, str, t, apply, df. A lot of times people name their data frames df, but df is a function, the density of the F distribution. And even though naming an object df is different than df the function, I still don't like reusing that name. So let's create a function that does something that we can intuitively check. Let's convert feet and inches to centimeters. So first we'll create the name of our function. We're saving our function in feet underscore inch underscore two underscore cm, assignment arrow. The function for making a function is function. It's going to take two parameters: the first one is going to be feet, and the second one is inches. We're naming them, so we can rearrange the order. We start the function with a curly brace, we end the function with a curly brace, and what happens in the middle is the important stuff. So if we want to convert feet and inches to centimeters, first we need to have only feet or only inches. In this particular case, we're going to convert feet to inches.
We're going to convert feet to inches. So we're going to take our feet, multiply by 12, and add whatever inches we had. Then we're going to take that and convert it to centimeters. Then it's important to say what we want to come out of this. You don't have to return anything, but otherwise it's a function that doesn't really look like it's doing anything. So we return centimeters. If we run this, nothing is going to happen; it's just creating the function. If we want to use it, now we can just use feet_inch_2_cm, and we can say feet equals five, inches equals six. I'll put in my height, and I get my height in centimeters out. Because these are named, we can change the order and we shouldn't get a different result. However, if we don't put in names and we did six comma five, that's getting somebody who's six feet and five inches.
So the parameters that we put in the function are like our ingredients. We mix them together in a particular way and we return the particular thing we're making: our cake, or cupcakes, or our steak. The middle part is what we are going to do with our ingredients, and then we return what our final product is going to be.
Function scoping is something that is probably covered more in a lot of computer science classes than you'd think it would be. There's local and there's global. If we define something outside of a function, it will be kept around and understood outside that function. If we have a variable that we only need for the purposes of that function and we define it inside the function, it won't stick around; it'll be gone. So if we have a function that doesn't have any parameters, all that it's doing is assigning the value of one to x and two to y and then concatenating x and y. Then we run the function f with no parameters and print out x.
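Going back to the feet-and-inches conversion walked through above, here is a minimal sketch of what that function could look like. The spelling feet_inch_2_cm is taken from how the name was spoken, the intermediate variable total_inches is a name made up for this sketch, and 2.54 centimeters per inch is the standard conversion factor rather than something read off the workshop screen.

```r
# Sketch of the conversion function described above (names are assumptions).
feet_inch_2_cm <- function(feet, inches) {
  total_inches <- feet * 12 + inches    # put everything in inches first
  centimeters <- total_inches * 2.54    # then convert inches to centimeters
  return(centimeters)                   # say what comes out of the function
}

feet_inch_2_cm(feet = 5, inches = 6)    # 5 ft 6 in -> 167.64
feet_inch_2_cm(inches = 6, feet = 5)    # named arguments, other order -> 167.64
feet_inch_2_cm(6, 5)                    # unnamed, matched by position -> 195.58
```

The last call shows why names matter: without them R matches arguments by position, so 6 is treated as feet and 5 as inches.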
Let's see what happens. So we didn't have to specify return; it just printed out x and y. It's not saved in f. It's not saved anywhere; it's just printing it out, so it's transient. It won't remain around. x is one locally within that function, so it printed out one there, and y is two. If we were to print out y, y was not defined outside of the function, so it shouldn't know what it is, and it doesn't. Okay, so keep in mind where things are used: if it's defined inside a function, R only knows about it inside that function.
So for instance, we can play with changing the value of x. Here we're defining g before we specify the value of x. Then, when we call g, x was not defined inside of g, so it uses what it had saved in memory. It knew from before that x was 15. It didn't know about y globally, so it used the value of two. Then we redefined x, and it grabbed that from the global environment. So you can use that, but you want to be really careful about making sure you know where your values are coming from when you write a function.
So there are a lot of things that you have to be careful about when you use functions, but functions can be a lifesaver. Again, if you're doing a lot of tedious work that's repetitive, that's exactly what functions are well designed for, because, like a loop, a function is for a process that you do multiple times. So for instance, let's say that we want to scale a variable and we're not going to use a predefined scaling function. We're going to create our own scaling function.
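The scoping behavior just described can be sketched roughly as follows. This is a reconstruction, not the tutorial's exact code: the bodies of f and g are guesses based on the description, 15 is the value mentioned for x, and 20 is a made-up value for the redefinition.

```r
# f creates x and y locally; nothing it assigns survives the call.
f <- function() {
  x <- 1
  y <- 2
  c(x, y)          # last expression is the result; no explicit return() needed
}

# g defines y locally but never defines x, so x is looked up outside g.
g <- function() {
  y <- 2
  c(x, y)
}

x <- 15            # global x
f()                # 1 2  (uses f's own local x)
x                  # 15   (the global x was not changed by f)
g()                # 15 2 (x comes from the global environment)

x <- 20            # redefine the global x (hypothetical value)
g()                # 20 2 (g picks up the new global value)
```

In a fresh session, asking for y at the top level after these calls would fail with an "object 'y' not found" error, because y only ever existed inside f and g.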
So we're going to take each observation, subtract the minimum value of the vector, and divide by the range, that is, the difference between the maximum value of that vector and the minimum value of the vector. We can use this to scale the length of a fish or, sorry, the weight of a fish. And this might be something that we would do for all of our quantitative variables in a particular data set. We could do this by hand. Well, let's just run it the naive way. It's saving the result, so it's not going to print it out. It's taking the length from the Blackfoot fish data, subtracting the minimum, removing the missing values, and dividing. We're using lots of parentheses to make sure that our numerator is in the numerator position and our denominator is in the denominator position. So we're dividing by the maximum, removing the missing values, minus the minimum, removing missing values. Again, there are functions that can make this simpler, but we're just trying to illustrate a complicated process here, where we've got a lot of variables in our data frame going around and we really want to make sure that we don't make a mistake. Can we make sure that we don't make a mistake? What if we make a mistake? A copy-paste error, right?
That's like the biggest possible error that anybody can make if they're not being careful. Say we just forgot one place where we should replace length with weight; that would really mess up our conversion here. So in the last eight minutes we're going to work through writing a function to do this instead of doing it by hand. First, let's actually look at what we get here. Our end goal is that we have a variable that's scaled between zero and one, so all of these values should be between zero and one. And we want to turn the snippet of code into something that we can apply to any numeric vector.
So the first step is to examine the process to determine how many inputs there are. If we go back up here, we've got one variable, the Blackfoot fish length, and we just use that in three different ways. So we only have one variable: one input. And we'll have one output; we're getting out the scaled variable. Then we want to figure out how to change the snippet into the output that we want using temporary, or local, variables. We're going to break this down, so we're not going straight to the end result. We're just modifying the code snippet above to refer to a temporary variable x, and we want to make sure our code does not depend on a specific data set. And we're going to save the rescaled vector in a variable called x_rescaled. Go back up here. I'm just going to copy the whole code, and I'm saving the Blackfoot fish length in x, so wherever I see that I'm going to replace it with x. I want to make sure that I save this in x_rescaled, and let's print out the head of x_rescaled, and the head of x as well. All right. So originally x was 232, 208, 164; rescaled, they're all values between zero and one. Now, is there any other duplication that we have?
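As a rough illustration of the step from the by-hand version to the temporary variable x, here is a sketch on a toy data frame. The data frame fish and its values are made up for this example (the workshop data set is not reproduced here), and length_rescaled is a hypothetical name for the naive result.

```r
# Toy stand-in for the workshop data: hypothetical lengths, one missing value.
fish <- data.frame(length = c(232, 208, 164, NA, 16, 986))

# Naive version: the column name is typed three times, which invites
# copy-paste errors when adapting it to another variable.
length_rescaled <- (fish$length - min(fish$length, na.rm = TRUE)) /
  (max(fish$length, na.rm = TRUE) - min(fish$length, na.rm = TRUE))

# Same computation through a temporary variable x, so the formula itself
# no longer mentions a specific data set.
x <- fish$length
x_rescaled <- (x - min(x, na.rm = TRUE)) /
  (max(x, na.rm = TRUE) - min(x, na.rm = TRUE))

head(x)            # original values
head(x_rescaled)   # values between 0 and 1, NA kept as NA
```

Only the line that assigns x has to change to rescale a different column.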
We use minimum twice and maximum once. So maybe it would make sense to store the minimum value, since we're reusing it twice. We can also use the range function for the bottom. So let's see what we get using the range function. If we have a sequence from one to five, the range function gives us two numbers, one and five: the minimum value and the maximum value. So how can we use that here? We want to make sure that when we find the range of x we don't call it range. Remember, that name is already taken by a function. So we'll call it x_range.
So now let's create x_range. We're going to start back up here, take what we have done so far, and, well, actually put this down here. First we need x_range, and that's going to be the range of x. We've got that in there twice, but it's okay. Oh, and we forgot to print it out. Run these three lines of code, and the minimum value is 16 and the maximum value is 986. So now we want to reuse that wherever we see min or max. Instead of min, we're going to put in x_range. How do we access the first element of a vector? Square brackets, one. Yes. So I'm going to do that in both places where we had minimum. And how do we access the maximum? Instead of one, we put in two. Let's see if we get the same thing. Are those numbers similar to what we got up here? Scroll a little bit. Looks the same to me, so we did the right thing. All right. Well, this is already easier to read, right? And maybe we can put a comment in here: range gives the min first and max second. That will help us remember what is in that variable.
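The refactoring with range described above might look like the following sketch. The vector x holds made-up values chosen so that its minimum and maximum match the 16 and 986 mentioned; it is not the workshop data.

```r
# Hypothetical stand-in vector with one missing value.
x <- c(232, 208, 164, NA, 16, 986)

range(1:5)                           # 1 5: minimum first, maximum second

# range gives the min first and max second
x_range <- range(x, na.rm = TRUE)
x_range                              # 16 986

# x_range[1] replaces both uses of min(); x_range[2] replaces max()
x_rescaled <- (x - x_range[1]) / (x_range[2] - x_range[1])
head(x_rescaled)
```

Computing the range once also means the missing-value handling is written in a single place instead of three.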
Now we just need to put it in a function. So remember: you name your function, then you call the function function and give it the parameters, or inputs, that you need, and then you do the operation that you need to do. So here we're going to use the function template, and we're going to call the function rescale. And notice, if we go slow: rescale has already been defined. But you can type slowly to make sure that it's not something the computer already knows about. What is our input? We're going to call it x, because we want it to work for any vector that we give it. The first thing we needed to do was get the range of x using the range function. We put in our input x, and we want to make sure we remove the missing values. We're running out of time, so I'm just going to copy our function here, and then we return x_rescaled. And we can put in a vector x. Let's just say that x is the numbers one through five, and we rescale that. Let's see. Yeah. Let me just copy this one and get all the way to 19. You can put this together and generate a nice little plot, and go through and calculate the condition scale, or the condition factor, for each fish. So that's the reward for this process. All the little pieces, all the logicals, all the inequalities, all the loops, come together into writing a function. And again, wherever you can reuse code, write code to reuse code, so that you're not copying and pasting, making errors, and forgetting what you did.
All right. So, in two weeks: data wrangling. We're not going to go through the same things. We're going to do more advanced data wrangling, but it's using tidyr, which is more like accessing SQL databases or Python; it uses more similar language, if you know what that is. It's more about doing data wrangling using common English words that we would think about if we're talking about what we're doing. So less programming; you'll still be programming, but you're going to be thinking about it in a different way.
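To recap the function assembled at the end of this session, here is a minimal sketch of rescale as described: one input x, the range computed once with missing values removed, and the rescaled vector returned. The body is reconstructed from the spoken walkthrough, so treat it as an approximation of what was on screen.

```r
# Rescale a numeric vector to the interval [0, 1].
rescale <- function(x) {
  # range gives the min first and max second
  x_range <- range(x, na.rm = TRUE)
  x_rescaled <- (x - x_range[1]) / (x_range[2] - x_range[1])
  return(x_rescaled)
}

x <- 1:5
rescale(x)                       # 0.00 0.25 0.50 0.75 1.00
rescale(c(10, NA, 20, 30))       # 0.0  NA 0.5 1.0
```

One thing this sketch does not guard against: if every value in x is the same, the denominator is zero and the result is NaN.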