All right, as always, this is under a Creative Commons license, so feel free to remix it and use it yourselves, just make sure to attribute it. And just a quick review of the highlights so far. So yesterday and today: setting your working directory and using tab, very essential. We went over again and again how vectors are one-dimensional, data frames are two-dimensional, and lists are our bulk storage objects. We're going to be interacting with all of these a lot today. We have really pounded on factors: we've made many factors and debugged issues that came up with them. Inspecting our objects, so using your str, dim, names, etc. functions. And then plotting, lots of plotting. So ggplot and base R, and you'll notice these plots actually look different. Box plots are actually different between the two. The lower and upper limits are the same, so these are matching, but these quartiles are actually different. Base R uses a slightly different method to produce quartiles for even-sized groups, and because we have group sizes of four here, it's doing a different process to find these limits versus these ones. So ggplot is using the actual quartiles, whereas base R is applying a function behind the scenes to find those values. So just so you know, these are different on purpose, and that is what they represent. But a reminder: the middle line is the median, this is the 75th percentile, so 75% of the data should be contained here and below, and this is where 100% of the data should be here and below. Similarly, 100% of the data should be here and above, and 25% of the data should be below this, right? So it's dividing your data into four parts: one, two, three in the upper part, and then four, okay? So a slightly different calculation between these two. All right.
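To make the box plot difference concrete, here is a minimal sketch on a toy group of four values (the numbers are made up for illustration, not taken from the workshop data). It assumes the usual defaults: base R's boxplot draws the box at the hinges returned by fivenum, while ggplot2's default box plot uses quantile at 25% and 75%.

```r
# Toy group of four values (an even group size, as in the plots discussed).
x <- c(1, 2, 3, 4)

# Base R boxplot() places the box edges at Tukey's hinges, as given by fivenum().
hinges <- fivenum(x)[c(2, 4)]

# ggplot2's default box plot uses quantile() at 25% and 75% instead.
quartiles <- unname(quantile(x, c(0.25, 0.75)))

print(hinges)     # 1.5 3.5
print(quartiles)  # 1.75 3.25

# The median and the extremes are the same under both methods.
print(median(x))  # 2.5
print(range(x))   # 1 4
```

With an odd group size the two methods agree, which is why the difference only shows up for even-sized groups like these.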
I guess this is why it would be important, when you report these, that you actually describe how you generated those figures. Exactly. Yeah, whether you use ggplot or base R. Yeah, it makes a difference. But the statistical test will be the same. The results should be the same. Yeah. All right. So data slicing, just a quick review, because we talked about it a lot yesterday. If you want to slice your data frame, you're going to have your row numbers before the comma and your column numbers after the comma. They can also be names. So you could have a vector of row names or a vector of column names, like we did when we selected, oh, actually no, that was a vector of column numbers, selecting which ones had marker in the name when we used grep yesterday, okay? So if we wanted rows one through three and all the columns, so we just want those first three rows, we can write df2 with one colon three in the row position. Here we're using the colon to automatically create a sequence of numbers, and that gives us the first three rows of this data frame, okay? So this is a nice trick, using the colon to create a vector. I'll show it here, or actually I'll keep going, because I think I just showed it in the slides. And then we leave the column space on the right-hand side empty to include all the columns, okay? As I was doing yesterday, this is how I was creating a vector. If you want numbers all in a row, and you want a long sequence, it's easy to use the colon: you just put your starting number and your end number with a colon in between. This produces one, two, three. This is just explicitly writing one, two, three. And then you can also use the seq function, which has a default step of one, and it will also give you one, two, three, okay? So these are all equal. And they're equal if you go backwards.
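The row slicing and the three ways of writing a sequence can be sketched as follows. The data frame here is a hypothetical stand-in for the workshop's df2; its column names and values are assumptions for illustration only.

```r
# Hypothetical stand-in for df2 (column names and values are assumptions).
df2 <- data.frame(
  id      = 1:5,
  age     = c(95, 60, 93, 88, 97),
  marker1 = c(0.367, 0.230, 0.714, 0.512, 0.401)
)

# Rows go before the comma, columns after; an empty column slot keeps all columns.
first_three <- df2[1:3, ]
print(first_three)

# Three ways to build the same sequence of values.
a <- 1:3
b <- c(1, 2, 3)
s <- seq(1, 3)   # default step of 1
print(all(a == b) && all(a == s))

# The same holds when counting down.
print(all(3:1 == seq(3, 1)))
```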
So three, two, one: if you go three to one with the colon, or seq three to one, it will be the same. You can also use conditions. So if you only want individuals who are over the age of 92, you can do this, and it will only give you individuals over the age of 92. You can verify it when you print it: they all have an age over 92 years in your data. Okay? So df2, and then within the row space here, where the rows are, we put this condition and we only keep the rows where it holds. Yes. And then Greg put a really awesome example down here. Sorry, Greg, I'm just going to use exactly what you wrote here, because I think it's really useful. Sorry it took me so long. I was kind of slow. No, this is perfect, because this is actually the perfect time to talk about this too. What Greg did here was subset not a data frame, but a vector. And because he's subsetting a vector, there's no comma inside the square brackets, right? So similar to here, where I have a condition for age, he's putting a condition on the treatment factor, okay? He's saying, for this vector, marker one, subset it to where treatment factor is equal to control. And there's no comma, because it's just one vector he's subsetting. So this is the first vector, right? Like how we were doing the vector one versus vector two comparison in a t-test. The second vector subsets that same df2 marker one, the exact same one, but now he's taking only the treated patients. So now the condition is changed to df2 dollar sign treatment factor equals treated. Notice it's a double equals, because it's a condition. It's saying it has to be equal to, and he's looking to get TRUEs and FALSEs out of that. So I'm actually going to copy the inside part.
And I just want to show you guys what this actually produces, what's inside here. So down here. Oh, I don't know if you guys have noticed that the quotes in PowerPoint, or any word processor, are curly, but the quotes that R wants are straight up and down. So if you copy and paste from text like this, or from Slack, the curly quotes are not going to work in R. You need to retype your quotes. Just so you guys know. But here, we get a vector of TRUEs and FALSEs. Okay. And when we put those TRUEs and FALSEs in brackets and use them to select from marker one, so this is our marker one, what it's saying is, along this vector, only select the TRUEs. Okay. So here, when I run it, it's subset. You'll notice this is all of marker one, and this is only marker one where treatment factor is equal to control. So it's subsetting to only the positions where it's TRUE. Similarly, this is subsetting to only the rows where age is greater than 92. Again, it's TRUEs and FALSEs, and it's keeping the rows where it's TRUE. Okay. So we're going to be doing a lot of this today. If it's not totally solid yet, hopefully it will become so as you repeat it again and again. But it's a really flexible way to subset and extract the data you want, particularly when you're cleaning data and maybe have exclusion restrictions, or you want to do a subset analysis on only a certain group of your data. This is how you would cut and slice it into these different parts. All right. So I'm just going to go over the short exercise that we didn't do, but which I encourage you guys to do. It's the assignment where you test the distribution of skin cancer biopsies: you read in your data, do multiple chi-squares, and get this nice graph. It can all really fit in two slides here.
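The subsetting-by-condition idea described above can be sketched like this. The data frame is a hypothetical stand-in for df2, and the column names treatment_factor and marker1 are assumptions, since the exact spellings are not given in the recording.

```r
# Hypothetical stand-in for df2 (names and values are assumptions).
df2 <- data.frame(
  age = c(95, 60, 93, 88, 97, 70),
  treatment_factor = factor(c("control", "treated", "control",
                              "treated", "control", "treated")),
  marker1 = c(0.367, 0.230, 0.714, 0.512, 0.401, 0.655)
)

# The condition on its own is a vector of TRUE/FALSE, one per row.
# Note the straight quotes and the double equals.
is_control <- df2$treatment_factor == "control"
print(is_control)

# Subsetting a vector: no comma inside the square brackets.
control_marker <- df2$marker1[is_control]
treated_marker <- df2$marker1[df2$treatment_factor == "treated"]

# Subsetting a data frame: the condition goes in the row slot, before the comma.
over_92 <- df2[df2$age > 92, ]
print(over_92)
```

The two marker vectors are the kind of input a two-group comparison such as t.test(control_marker, treated_marker) would take.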
So you just set your working directory, you read in your data, and you inspect it; you want to look at what's there. Maybe you use View at this point if you'd like, or str. Here I'm using head. Then I'm creating my factors. I have multiple factors: benign versus malignant, and male versus female. My sex levels are actually zero and one, so I have to know which is male and which is female. Also, for my city, I want to set Toronto as the baseline. I want to be comparing everyone to Toronto, and Toronto is level zero, so I order it that way. Toronto is the center of the world. Obviously, yeah, as I'm here right now. And then you do a table just to see what the malignancy rates are overall. This would be a single table; it's not going to be a cross tab like we were looking at yesterday. But then you would make cross tabs here. You want to see how malignancy distributes by sex and how it distributes by city. So that's where we have a table with the malignant factor compared to the city factor. This is going to have malignant factor on one side of the table and city on the other side. And then we do a chi-square test by wrapping it around our table. So we actually test whether they're independent, because maybe the malignancy rates are not correlated with where you are. And that's fine too. And then once you do that, and we already did this, you would save the output of the table as an object. Okay, then you would melt that object. You can melt a table to get the counts from that table. Okay. And now we plot the melted table. So we create a ggplot of that table, and then we do a facet grid over the second element of that table, which is the cities, so that we're able to break it up into that nice bar plot. The solutions will be there later today. I believe they might actually be up already.
I can't remember, to be honest. And this should be a good exercise to test yourselves and see what you remember from what we covered yesterday. All right. So now we're going to revise the code that we worked on yesterday to make it more efficient. We did a lot of copying and pasting, a lot of repeated code. There are other ways to do this, using loops and functions in R, that make it more efficient and easier to do. It's not that arduous when you have five markers, but many of us have something like 10,000 markers that we're testing, and at that point you don't want 10,000 lines copied and pasted down. So we are going to write for loops today and we're going to write functions today. Let's go back to the box plots we made yesterday. We'll also be revising the t-test code in the same way. So go ahead and find this in your script. It's around line 65 of the master script that I sent, but in your own script it could be anywhere. And see, if you had 100 markers, it would be great if you didn't have to copy and paste this line again and again. Okay. So, for loops. Once you've found this, go ahead and leave it. We're going to come back to it. We're going to jump into more of the theory of what a for loop is and what loops are. First, I want to say there are multiple kinds of loops. There are while loops and other loops, but for loops are very commonly used. And the reason it's called a for loop is that you want to loop for a certain number of times, over a preset number of iterations. All right. So here I'm saying for. That is what I use to do my loop; it opens the for loop. And then within this, I define a new looping variable. So loop var hasn't been used elsewhere in my script so far. A lot of people will just use the letter i, which is very common.
But here I wanted to write loop var just so you guys know this can be any variable that you make up on the fly. It is being defined in and for your for loop. Okay. So this variable is going to take on a new value on each iteration of your loop. For loop var, which I've just defined, I'm going to say I want it to be a one, then a two, and then a three. Okay. This can be any vector. It's very common for for loops to be like one, two, three, four, five, right? So for loop var in one to five. But you could also have a vector of the city names that I just showed, and you'd say for city in all these different cities. So it can be any vector that you define here. Notice the for closes here: this parenthesis is closing this open parenthesis, so this is enclosed. After you write this for command, it's expecting an opening curly bracket and then a closing curly bracket. Inside these curly brackets is where your looping actually happens. So whatever you want repeated goes inside these curly brackets. Okay. So go ahead and write this loop and then run it. You can run it by having your cursor anywhere on this top line and pressing Control-Enter. You could also select the whole thing and press Control-Enter. And once you've done that, go ahead and click yes. You can also change how this vector is written out. We just learned you can write a sequence of one to three with a colon, you can use the seq function, or you can write it explicitly as a vector like I do here, with c, for concatenate, and then one, two, three. That is also fine. Awesome. We got one. Very nice. Yes. Yes, exactly, Diego. We're not using any data set yet.
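The first loop described above can be sketched as follows. The spelling loopVar is an assumption (the recording only gives the spoken name), and the city names other than Toronto are hypothetical placeholders.

```r
# A looping variable (here called loopVar) takes each value of the vector in turn.
for (loopVar in c(1, 2, 3)) {
  print(loopVar)
}

# The vector can be written with a colon or with seq() instead.
for (loopVar in 1:3) {
  print(loopVar)
}
for (loopVar in seq(1, 3)) {
  print(loopVar)
}

# Any vector works, including character strings
# (only Toronto comes from the workshop; the others are placeholders).
for (city in c("Toronto", "Montreal", "Vancouver")) {
  print(city)
}
```

After a loop finishes, the looping variable keeps its last value, so loopVar is 3 and city is "Vancouver" here.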
So yeah, we're just defining this vector. We've defined this variable and we're just going to loop. Very simple. What you should be seeing is this output: one, two, three. So don't forget to click yes once you've got this. And if you've got this, experiment: try making your loop do different things inside, try changing your loop var name, try changing what you're iterating over. So instead of one, two, three, maybe make it five, eight, 45, right? Make it character strings. Any vector should be fine. All right, we got a no. Go ahead and copy and paste. Yeah, Nagla, give us a screenshot. I'd love to see what's going on here. Check your parentheses. I really encourage you to use RStudio's highlighting, which shows whether you've closed your parentheses. So what's happening here is, oh, actually, one sec. You didn't close your curly bracket at the end, so it keeps expecting you to finish the curly bracket there, Nagla. Press Escape down below to get out of those pluses, because it's just going to keep looking for the curly bracket. And then on line 66, make sure you include that last curly bracket to close the loop, because right now it thinks you keep giving it more to do in the for loop. Sometimes I get kind of overwhelmed with the number of curly brackets and parentheses and everything. R is highlighting for you. RStudio, pardon. It will highlight the bracket that corresponds to the one your cursor is on. So I'll just write it in here. Sorry, guys, I'm just going to move this. For loop var in, I'll do one to three. See, when you open your curly brackets, it gives you the second bracket there, so you can use that. Also notice that in Nagla's code, she wrote it all on one line. That's totally fine.
So you could also write it like that, with print loop var, and it would be totally fine to have it all on one line. But notice when I have my cursor there, it highlights this curly bracket. It's telling me this curly bracket is closing this curly bracket, so I know that it's closed. It helps me find the one that corresponds to it. It does the same with parentheses. I have my cursor on the outside of that parenthesis, and now this one is highlighted. So I can easily see how I closed up all my parentheses or brackets, because this is such a common error. RStudio knows that, so the makers made it easy to find where your brackets are opening and closing. I'll give everyone another minute. Wonderful. Okay. So now let's try to write a loop that adds two to each value from one to 10. Okay. If we did that, we'd write for loop var, and again this could be any variable, in one to 10 with a colon. I open my curly brackets and I add two. Okay. So I'm going to go back to here. Try to do it on your own. Write a loop that adds two to each value from one to 10. All right. And click yes once you've done it. Again, if this is easy, just keep modifying it. Try new things. Make it more difficult. Change your variables. Make them character strings. Try looping over your box plots at this point if you're able. Challenge yourself. So you're going to loop over one to 10, and inside here, instead of just printing, you want to add 10. Or sorry, add two, pardon me. Add two. And don't forget to click yes once you've got it working. You may need to write print around the addition part to see the output. So are you going to show us how to loop a t-test or a box plot? Because I'm trying to do it. Yes. Yes. So good. If you guys can try to do that, we will definitely be going over that. All right. The t-test and box plot. Yeah. Yeah. Thanks.
I'm going to give everyone one more minute to get this one. We'll go over the solution and then we'll advance. Okay. Yeah. No worries, Carl. So I think many of you have found it's very similar to the last one, in that you have for loop var in one to 10. Then we close that for portion of our for loop, and we open our curly brackets, because this is what we want done 10 times. Okay. This is what we want to iterate or repeat. We want to print loop var plus two each time. And here we can verify it in the output. That's great. It goes one plus two, two plus two, three plus two, and so on. So that's good to go. All right. Now let's try to do our box plot. This is what we had before, and you all have that. I'm actually going to leave that. Now give this a try. Let's pick this apart, and then I'll have you guys click yes once you've got it. Essentially, I'm going to set up my par mfrow just like yesterday, right? Here I'm saying I want one row and five columns of graphs. This is separate from the for loop. It's just setting up where my graphs go. Then down here I start my for loop. I'm going to use a variable I call marker, and I'm going to say for marker in paste0. paste0 pastes without any space in between the two things it's pasting. I'm pasting the word marker to the numbers one through five. So let's see what that actually creates for us. I'll do one paste0 down here: paste0 of ABC with one, two, three. So ABC1, ABC2, ABC3. It pastes them and there's no space. If I just used paste, it would automatically add a space. Okay, so that's why I'm using paste0 back here. And when I paste0 one thing, marker, with one through five, which is the vector one, two, three, four, five, I create a new vector where I have marker1, marker2, marker3, marker4, and marker5.
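The add-two loop and the paste0 call just described can be sketched as below. The variable names are assumptions chosen for illustration; only paste, paste0, and the for loop itself come from the session.

```r
# Add two to each value from 1 to 10; print() is needed to see output inside a loop.
for (loopVar in 1:10) {
  print(loopVar + 2)
}

# paste0() joins its arguments with no separator and recycles the single string.
abc <- paste0("ABC", 1:3)
print(abc)            # "ABC1" "ABC2" "ABC3"

# paste() inserts a space by default.
with_space <- paste("marker", 1:5)
print(with_space[1])  # "marker 1"

# This is the vector of column names the box plot loop iterates over.
markers <- paste0("marker", 1:5)
print(markers)
```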
All right, so I had one string and I'm pasting it to a vector, and it creates a vector where that string is pasted on the front of every element. You may remember that those are the column names I actually want. Okay, so that's why I'm doing that. I'm creating this variable marker here, and it represents the different column names that I would like to create a box plot of. That's what this paste statement does. Then down here, inside my loop, for each of these markers: on the first pass through the loop, my variable marker is going to equal marker1. Okay. So the first time I go through the loop, I say create marker column, and that is equal to my data frame with the column that has marker here. Remember, the first time through, it's marker1. So df2, marker1. This is my marker1 column, see, marker1: 0.367, 0.230, 0.714. That's just that column. All right, so here I'm creating one vector that is just that marker column, and here I'm creating a second vector that is my treatment factor column. Then I'm creating a box plot of my marker column relative to my treatment column, just like these box plots. But instead of having the data referenced here, I just put in the full vectors. All right. And sorry, I'm going to clear all. So if you actually got this working already, go ahead and re-click yes, because I just cleared all the yeses. But go ahead and write this loop. And notice, yep? Sorry, why is marker without quotation marks while treatment factor is in quotation marks? Yeah, because marker is a variable. The first time through it represents marker1, the second time it represents marker2, then marker3, and those are character strings. Right.
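A minimal sketch of the box plot loop walked through above, using simulated data. The data frame, the column name treatment_factor, and the object names marker_col and treatment_col are assumptions standing in for the workshop script, which is not shown in the recording.

```r
# Simulated stand-in for df2: two groups of four, five marker columns.
set.seed(1)
df2 <- data.frame(
  treatment_factor = factor(rep(c("control", "treated"), each = 4)),
  marker1 = runif(8), marker2 = runif(8), marker3 = runif(8),
  marker4 = runif(8), marker5 = runif(8)
)

# One row and five columns of plots; this is set up outside the loop.
par(mfrow = c(1, 5))

for (marker in paste0("marker", 1:5)) {
  # marker is a variable holding a column name, so it takes no quotes.
  marker_col <- df2[, marker]
  # "treatment_factor" is a literal column name, so it is quoted.
  treatment_col <- df2[, "treatment_factor"]
  boxplot(marker_col ~ treatment_col)
}
```

Run the whole loop at once (cursor on the for line, or the full block selected) so the closing curly bracket is included.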
So those are, whoops, one sec, these character strings: marker1, marker2, marker3. Yeah. So even though it doesn't look like a character string, it represents one. And here, this is the actual character string for that column. Yeah. Very good question. Do you need to use the names of the columns in this context? Or theoretically, could you also give it the number? You could definitely give it the number. Yeah. So a couple of things. You didn't close your for loop yet, so it's missing that last curly bracket to close it up. But also, it's, it is C plot. Oh, sorry. But the brackets are down there. No. The closing brackets. The plus signs. The reason it keeps giving a plus sign is that it's missing this final curly bracket. Oh, maybe there's an extra row. Maybe. Yeah. I think what actually happened is you ran it line by line, and you just haven't run the line that has the closing bracket yet, and so the console thinks... But I do see it, on line 78. So when you run a loop, you would want to select that whole section. I'm still getting that. Do you want to go in a breakout room or something? You want to escape out of those plus signs, because it's just going to keep looking for the closing bracket. Yeah. And then run the whole thing. Yeah. No, so I'm getting an error. Yeah. So is the box plot written VOC still? No, okay. I can screenshot it again. Okay, perfect. Yeah. So Diego, it looks like it's all continuing on one line. You want to have your commands follow line by line. See how it's a new line here? I don't want to put them next to each other, because it's going to treat it all as one command. So you just have to break it up line by line. One way you could write multiple commands on one line is to put a semicolon in between them. Also, treated factor might not be the name of the column, because it has a space. How do you know when to use the curly brackets versus the square brackets?
One sec. So square brackets are for indexing or slicing, essentially. Curly brackets are for loops and functions, and we're going to go over functions later. They're basically saying, inside here is what you're going to repeat, or do whatever I'm telling you to do. So curly brackets are pretty special for that. And then parentheses we've seen. If you start running in the middle of the loop, it will treat it like you're just running that one line of code, so that may be an issue that's coming up. You can either start running it with your cursor on this first line, or highlight the entire for loop and run it. Like Ray, I'm wondering if that's the issue for you, because your code actually looks fine from what I can see. That's what it was. What's the output supposed to look like? Box plots. The five box plots. I'm not seeing anything. I don't get any more errors, but it doesn't produce anything. Okay. Let's see. I'm going to try this myself. And I'm still encountering nails. And it's not giving you any errors like figure margins too large, right? If it's not printing the box plots. No, I just get all the blue text at the bottom in the console. Okay. No errors. Hey, Rose, I saw you have a raised hand. I messaged in the Slack. Can you please check? Yeah, definitely. Sorry. So for yours, you'll want to run that par mfrow one by five first. It looks like it might still be set up as two by two. And you'll notice, Reza, if you arrow back on your graphs, you probably have a graph of four box plots, and then the fifth one is by itself on the next pane. So par mfrow. Yeah. And Ryan's pointing out there are lots of ways to do this, basically. So you don't have to define this treatment column this way.
You could use a dollar sign, or you could even just use df2 dollar sign treatment factor within here, so you don't have to create a new variable. There are lots of ways to do it that are exactly the same. You basically define your column on the fly here. This, to my mind, just cleaned it up for us. Yeah. So now this is the classic, and Reza, your hand's still raised. I just want to make sure. Yeah, I have one more question. So I got the image right. Right. I'm just wondering about the logic: how does it understand, okay, I have to move from marker1 to marker2? Can you go over it again? Absolutely. Yeah. So perfect. That's right, Ren. The way it knows to repeat is that here I've created a vector that's marker1, marker2, marker3, marker4, marker5, right? This paste0 of marker with one to five creates that vector. And sorry, my gosh, I lost the screen. So here the curly brackets say, now let's repeat for each element of that vector. I'm going to go through this entire sequence between the curly brackets for each one of these names. The first time it goes through, it's marker1. So the marker column is going to be marker1, the treatment column is the same every time, and then it makes a box plot. Now it hits the bottom, and there's still more in this vector, so it goes back and repeats. It says, okay, now it's marker2. So my marker column is now marker2, my treatment column is the same, make a box plot, hit the bottom. Now it goes back up. Now it's marker3. So the marker column here is marker3, make a treatment column, do a box plot, and then back up again. Yeah, no, they look great. I was just wondering if the name of the plot should be marker1, marker2, marker3, because it says treat... no. So okay, yeah, we need to change that later. That's what Greg just... Yeah, exactly. You can add that later.
But yeah, it won't be added as a name automatically. You could do it by revising this code. I'll show you here. We were looking at the base R arguments you can add, and what you could add is main equals marker. Okay, because marker, as we saw before, I don't know, I just printed too much, is going to be marker1, two, three, four, five. So the main plot title is going to be on each of them. Yep, keep making it bigger. Still too small. Okay, so if we run this, now we have a name on top. Nice. So you wrote it after treatment column? Exactly. I wrote it here. I just added it to my boxplot call after a comma. Okay. Yeah, exactly. Okay, I just want to make sure we got everyone. Awesome. And also, because this says marker col on our axis, you could change that too. Here I'm making a new line, just to make it look nice. You might want to say ylab equals marker, right? Then it will say marker1, marker2, marker3 on this y axis. So it's just another argument to my boxplot. I'm going to run it again, and now it's saying it there: marker1, marker2, marker3, marker4. Okay, so you can do lots of things to revise it, and still keep it really fast. Now you're not copying and pasting five lines. And if you get the plot.new figure margins too large error, it just means you've got to make this pane larger. So if I have it down here and I try to make my plot, it's not happy. That just means expand, and then it'll work. I think we all got it. And if you guys have gotten this one and are waiting for others to debug, give it a try with the t-test. All but five are gone. So another thing that may happen: R is filling these in one by one.
So if you have, for example, four slots, let's say I made this par mfrow one by four, but I have five graphs. I'm going to run that. Now par mfrow is one by four, which means I have one row and four column slots for graphs. I'll run my loop. Oh, it looks like I only have one graph. No, it plotted all four. If I use these arrows to navigate back, it plotted one, two, three, four, and then said, oh, I need to keep plotting but I have no more space, so it goes to the next page for the fifth one. Okay. So if you see this happening, just reset your par mfrow. If it's one by five already, just run it again. It's going to reset it, and then it will fit them all in. Okay. All right. So we're going to take a quick break at this point. And we're actually going to first do a group. So the variable that takes on a new value on each loop iteration is called marker here. You could call it anything you want. You define it when you make your for loop. Then you create a vector of values here: the values that the loop var, here it's marker, takes on for each iteration. I'm actually going to edit that right now. Marker takes on for each iteration. And these can be any vector. In this case, we used the vector marker1, marker2, marker3, marker4, marker5. You could have used a different vector; you could have used the column numbers and put them here. Someone made a great point about that. I want to pull up my Slack, make sure. Okay, good. Then you create an object that is one marker vector. This is one vector for one of the markers, and it's going to be a new one each time you iterate over the loop. And then you create a treatment vector. You could have just defined this in here as df2 dollar sign treatment factor. You didn't have to create a new vector, but you can. And then you plot a box plot with your two vectors.
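The recap above, together with the main and ylab arguments mentioned earlier, can be sketched as follows on simulated data. The data frame and its column names are assumptions; the column-number variant assumes the markers sit in columns 2 through 6, which is true only of this toy data frame.

```r
# Simulated stand-in for df2 (names and layout are assumptions).
set.seed(2)
df2 <- data.frame(
  treatment_factor = factor(rep(c("control", "treated"), each = 4)),
  marker1 = runif(8), marker2 = runif(8), marker3 = runif(8),
  marker4 = runif(8), marker5 = runif(8)
)

# Re-run this before the loop so all five plots land on one page.
par(mfrow = c(1, 5))

# Columns defined on the fly, with the marker name as title and y label.
for (marker in paste0("marker", 1:5)) {
  boxplot(df2[, marker] ~ df2$treatment_factor,
          main = marker,
          ylab = marker,
          xlab = "treatment")
}

# Same idea, iterating over column numbers instead of names.
par(mfrow = c(1, 5))
for (col_num in 2:6) {
  boxplot(df2[, col_num] ~ df2$treatment_factor,
          main = names(df2)[col_num])
}

# Return to a single plot per page afterwards.
par(mfrow = c(1, 1))
```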
So simple enough, just like above, just a box plot here. Okay. And as we saw before, we set up this par mfrow one by five so that we'd have one, two, three, four, five box plots all next to each other. This is what you should see. It's identical to the non-loop solution: the output of the loop should be the same as the output of copying and pasting five times. That's what a loop does. It's just doing the same thing a number of times. So let's try to compute the t-tests in a loop. Okay. Everybody give that a try. You can use code from your previous loop. It's going to be pretty similar, because as we saw, the t-test code is very similar to the box plot code. Try to have it do a t-test for each one of your markers without you copying it again and again. All right, we got a no. Please share with the class on the Slack, if that's more comfortable for you. Where are the issues coming up? Yeah. Great. So you're not having any bugs. That's great, Ren. So now you're finding that it's computing within the loop, but it's not printing the output. Before, when we were doing the addition, we had to actually print the output. Surround your t-test command with print: print, open parenthesis, the t-test, close parenthesis. That should give you output. But it's running error-free right now, so that's good. Nice. We got a few successes here. Awesome. Great job, you guys. You know, loops take some people years to really get their heads around. So if it's not totally landing yet, that's okay, but I'm really impressed. You guys are getting it very quickly. All right, I'll give everyone a couple more minutes. So I'm just playing with it a bit, and I placed the print in a different place, right after the thingy before marker column equals df2. And it did something else. I'm not sure what it's supposed to do. Okay. We'll go over it. Yeah.
Do you want to copy and paste that or do a little screen grab of that? Yeah. So you'll have to print the output. So here I'm just going to start going through the solution. One sec. Yeah. So it looks like, Ran, it's not running your loop. It had an error in your loop and then it ended it, and then it just ran one t-test. Okay. But I just moved the print. So maybe it's not correct to do that. You see I moved the print. I just played with it just to see. I'll make it clear. Yes. Okay. So because we're getting a lot of questions about this exact thing, I'm going to go ahead and go through the solution, because it's very similar to what we just did. The variable that takes on a new value each iteration, this one again I'm calling marker. Here it's iterating over it. Here you're creating an object that is one marker vector, you create an object that's a treatment vector, and you conduct the t-test. But here you want to print the output. Loops are not going to print by default. Similar to when we were doing the addition problems or just printing the single values, they're not going to print the output of the loop to the console. So when you have a t-test, for example, you'd want to put print around the t-test portion, and then it will print to your console. Many of you were having issues at that point. So if you have a loop and it seems like everything's invisible, try print. Yes, Pierre. I'm thinking about your question. So one modification I can think of, Pierre, is to first print marker and then print the output for the loop. So I'm going to show you in R, and I'll just move this so everyone can see the solution here. So here is what we're doing, and see, it doesn't give me anything. So I want to print, and now it's going to actually print my output. Good. But yeah, Pierre made a great point. You know, here, looking down, they look exactly identical. So it's hard to know which is which.
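As a rough sketch of the point about printing, the loop below wraps the t-test in print() so each result reaches the console. The toy data frame and its column names are assumptions for illustration only.

```r
# Toy stand-in for the workshop data frame; names and values are assumptions.
set.seed(1)
df2 <- data.frame(treatment = factor(rep(c("control", "treated"), each = 4)))
for (i in 1:5) df2[[paste0("marker", i)]] <- rnorm(8)

for (marker in paste0("marker", 1:5)) {
  marker_column <- df2[[marker]]
  treatment_column <- df2$treatment
  # Inside a for loop, results are not auto-printed, so wrap the call in print()
  print(t.test(marker_column ~ treatment_column))
}
```

Without the print() wrapper the loop still runs all five tests, it just shows nothing.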
So another thing you could do is add a reference, basically, in your loop. Print marker, because we know marker will change values each time. So here let's see what that gives us. Marker five is this one. The result for marker four is this one. Marker three is this one. Marker two. So that may help with referencing, but that's a really good point, because in essence, we can't have this marker column label change as easily. Another thing you can do, I just thought of this, and it will show up in your t-test output, is you can put the df2 marker column directly here. And similarly, as we said before, this could go here where your treatment column is. Now if we look at that output, nope, it's just going to keep that marker. Yeah, because it's just keeping it as a variable. So you'd still want to have your reference. You'd still want to print which marker it is by printing marker. That's great. Thank you. And would it be the same for the box plot? If I want to, I can also put print marker? So for the box plot, because it's plotting, you probably want to include it on your plot. Yes. So the difference there is, with box plots, we can add it as a label. Oh, sorry, this is a different box plot. You can add it as a label, or you can add it as the title above your box plot. So here, if I run this one again, I was just doing an illustration there. So par(mfrow = c(1, 5)), and we want to know what each of them is. "Error in plot.new(): figure margins too large". We're very familiar with that error message. So I'm just going to expand out my plot pane. Let's see if it'll work now. There: marker one, marker two, marker three, marker four, marker five. So the title is a variable, and it's a string that we defined in this vector here, this marker vector. And on each pass through the loop, we're using that as our y label and as our main title for our box plot. Yeah, that's great. I will try that. Thank you very much. Wonderful. Yeah, of course. Yeah, Christina, I don't think so. But you know what?
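To make the labelling idea concrete, here is a minimal sketch that prints the marker name before each t-test and reuses the same string as the box plot title and y label. The toy data and column names are assumptions, and the xlab value is an illustrative choice.

```r
# Toy stand-in for the workshop data frame; names and values are assumptions.
set.seed(1)
df2 <- data.frame(treatment = factor(rep(c("control", "treated"), each = 4)))
for (i in 1:5) df2[[paste0("marker", i)]] <- rnorm(8)

old_par <- par(mfrow = c(1, 5))
for (marker in paste0("marker", 1:5)) {
  print(marker)   # reference line so each test result can be identified
  print(t.test(df2[[marker]] ~ df2$treatment))
  # The loop variable is a string, so it can serve as title and axis label
  boxplot(df2[[marker]] ~ df2$treatment,
          main = marker, ylab = marker, xlab = "treatment")
}
par(old_par)
```

The "data:" line of the printed test shows the expression as written in the call rather than the marker name, which is why the separate print(marker) is still useful.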
No, so this is going to be the next thing we do, to save the output. Essentially, we will do that. Yeah. I think I see what you're saying. Yeah, that's going to be the next step. Like saving each one individually or saving the whole thing as a whole? Yeah, today we're going to go through and save all of them together as a summarized output. So you'll know, for example, I can't remember exactly, it would be like the mean, the standard deviation, etc., for each of the markers, and you basically create a table that has all these statistics for them in one table. In general, though, does R allow you to do that? Like incorporate, in a loop, a changing variable as the name of a file. I think you could do that in Unix, right? Yep, this one. Yeah, as the name of a file. Yeah, absolutely. Thanks. Okay. All right. So it looks like we're getting it. Okay, so we've done our t-test now. I'm going to go ahead and clear all the yeses. And let's move right along. So let's organize these results in a table. We're going to say we've got these results, we really like them, but we would like to summarize them all together. So we want to create a table, a data frame object in R, that has four columns. We want the marker name, because we want to know which marker we've tested. We want the mean for the control group, the mean for the treated group, and the p-value for the difference between the groups. Okay? So this is a common situation: you're testing a difference between groups and you want to have it summarized across all your different markers. There are many, many ways to do this. I'm going to show you how to do it by building four individual vectors that we'll then put into a data frame all together. Okay? So I'm just going to show you the result. This is how I would do it. This is one way you can do it, but there are many ways to do this. Okay? So first, I'm creating empty vectors.
And this is a vector for each of the columns I just described. This first one is an empty vector for marker names. Notice that you have c(), the concatenate function, and it doesn't have anything inside it. This is just a zero-length vector. I'm initializing, so to speak. Then I have the control mean. So this is the mean for the control group. It's going to be an empty vector to start. Case mean, similar initialization, and then the p-value. So these are all just four empty vectors that I'm starting with. I'm going to do for i. So i is my variable that I'm going to define here. And then I'm going to say for i in 1:5. I've got five markers, so I want to run this five times. And then i, the letter i here, this variable is going to take on the values one through five, one for each pass through my loop. So the first time i is going to be one, then it's going to be two, three, four, five. Okay? Now inside my loop, I define my marker, and I do the paste0, like I did before. But here I'm doing it within my loop. Okay? So I'm saying my marker is first going to be marker one: I'm going to paste marker to 1, because i is going to be one first. Then my marker column is, of course, going to be taken the same way I did before. So I create a marker column. I create my treatment column. This doesn't have to be done separately, as we pointed out, but you can do it. And then I run my t-test. So I create a t-test object where I say my t-test is marker column by treatment column. Okay? So this is my t-test output. Now, sorry, I'm going to go back over this: creating a string, create two vectors to use in my t-test, store the output in my t-test. And now I add the marker object to the first position in marker name. So on the first pass through, marker name is this empty vector. I'm saying make the first element of that vector marker, which we defined up here. So marker one is going to be first.
As we go through again, marker two will be second, marker three will be third, and so forth. And then from my t-test object, I'm actually able to do dollar sign estimate. And the first one is my control estimate. Okay? I'm just going to check my, yeah, it's going to be my control estimate. So here I get my control and case estimates from my t-test object, and I'm saving them in control mean and case mean. And again, the first marker's estimate for control is going to be in the first position of this control vector, right? So I'm just adding on to these vectors piece by piece. Similarly, I can extract the p-value. So notice I'm extracting these from my t-test output. All right. So when you run a statistical test in R, you can save the output, and then you can extract different things from it. Okay? So then I'm taking the p-value, and for each marker, I'm going to add another p-value on to that vector. And these are all going to be in an order that is defined by the one through five, so their orders all match, right? And then I combine them after my loop. So here this closes my loop. In this loop, I have now created four vectors that are five in length, so they have five elements each. And I'm going to put them all together in a data frame. The data frame function can take any number of vectors as long as they're the same length, and it will just bind them together as a data frame here. All right. So go ahead and give this a try. And before you do this, I just want to show one more thing here for the t-test. So here we did the t-test for marker one, marker two. Let's say I make an object here, t test out. I'm just going to expand this a little. So I define that object. Nothing happens. It just runs. I run my t-test, but now I have this t test out. Now if I use the structure function, str(), on t test out, I can look at all the elements that are inside the output of this statistical test. Okay. So it's a list object. All right.
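The results-table loop described above might look something like the sketch below. It is one possible rendering under assumptions: the toy data frame, the column names, and the underscore spellings of the object names (marker_name, control_mean, case_mean, p_value, my_t_test, results_df) are guesses at what was on the slide.

```r
# Toy stand-in for the workshop data frame; names and values are assumptions.
set.seed(1)
df2 <- data.frame(treatment = factor(rep(c("control", "treated"), each = 4)))
for (i in 1:5) df2[[paste0("marker", i)]] <- rnorm(8)

# Initialize four empty (zero-length) vectors, one per output column
marker_name  <- c()
control_mean <- c()
case_mean    <- c()
p_value      <- c()

for (i in 1:5) {
  marker <- paste0("marker", i)            # "marker1", "marker2", ...
  marker_column <- df2[[marker]]
  treatment_column <- df2$treatment
  my_t_test <- t.test(marker_column ~ treatment_column)

  marker_name[i]  <- marker                # fill position i of each vector
  control_mean[i] <- my_t_test$estimate[1] # mean in group control
  case_mean[i]    <- my_t_test$estimate[2] # mean in group treated
  p_value[i]      <- my_t_test$p.value
}

# Bind the four equal-length vectors into one data frame
results_df <- data.frame(marker_name, control_mean, case_mean, p_value)
results_df
```

Because every vector is filled at position i, the rows of the final data frame line up marker by marker.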
And as we saw before, with list objects we can use double brackets. So we could actually use a double bracket to subset these elements, or we can use the dollar signs. And here we're using the dollar signs. So here, where it says estimate, down here, estimate and p.value, these are my dollar signs. So for dollar sign estimate, I have two: the mean in group control and the mean in group treated. The first element of estimate is the mean in group control. So that's why I say my t-test dollar sign estimate, to extract the estimate from that list. And I want the first element because I want the mean in group control. And that's what I'm adding to the control mean vector here. Okay. And similarly, the estimate for the mean in group treated is the second element of estimate. So that's why I say my t-test dollar sign estimate. That gives me this. And then I want the second element only, so I use those square brackets to subset that vector to only take the second element. All right. And then for the p-value, whoops, sorry, for the p-value, it's just one value. So here, I want this p-value. I just say dollar sign p.value, and it will give me one value that I would add to my p-value vector here. Okay. So go ahead and give this a try, and click yes once you've got it. And if you get this quickly and easily, you can try to make your t-test call different, so you don't have to define new columns here. If you want, you can also add additional columns to your data frame. So there are other things you could be pulling out here: you could be pulling out the t statistic, you could be pulling out the standard error.
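A short sketch of inspecting and subsetting a saved t-test result, assuming the same made-up toy data as before. The component names used here (estimate, p.value, statistic, stderr) are those returned by t.test() in current versions of R; stderr is only present in R 3.6.0 and later.

```r
# Toy stand-in for the workshop data frame; names and values are assumptions.
set.seed(1)
df2 <- data.frame(treatment = factor(rep(c("control", "treated"), each = 4)))
for (i in 1:5) df2[[paste0("marker", i)]] <- rnorm(8)

t_test_out <- t.test(marker1 ~ treatment, data = df2)

str(t_test_out)               # a list; shows every element stored in the result

t_test_out$estimate           # named vector: the two group means
t_test_out$estimate[1]        # mean in group control
t_test_out[["estimate"]][2]   # same idea with double brackets: mean in group treated
t_test_out$p.value            # a single value
t_test_out$statistic          # the t statistic
t_test_out$stderr             # standard error of the difference (R >= 3.6.0)
```

Any of these could be collected into an extra vector inside the loop and added as another column of the results data frame.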
There are many things. If you use the structure function to look at what the t-test output looks like, any of these things can be pulled out and made into new vectors that are then added to your results data frame. Could you instead create, like, an empty data frame and then fill in specific positions, instead of the vectors? Yep. Yeah, the tricky thing there is you have to make sure that all the data types are the same as you're adding it in, but it should be fine. But yeah, you could do that: you can make an empty data frame and do pointwise addition in there. And if you're doing that, instead of a bracket that's just one position, you want a bracket that has a comma, and then you'll be inputting the row and the column. So you mentioned they have to be the same data type, but if you're creating a data frame in general, oh, in the column, not in the entire data frame. Yes, exactly. Yeah. Okay. Okay. Yeah. Awesome. I see some of you are getting it. I'm not sure if this is possible or something you would even really want to do, but is there a way to just sort of capture the entire output of the loop as, I don't know, text, everything that it's output, kind of the way it's printed in the command line? Yeah, you can create a new list. Like we talked about before, lists are just bulk storage objects, whereas data frames are quite strict about what you have. We want to make the input really normalized before we put it in a data frame. We want it kind of in a matrix. But if you want to just keep it all, you can just make a list, and you would just have an empty list defined up here, like my list. And then down here, it would now be double brackets for the i, and you could just put the entire t-test into each one of those elements. You could also name the elements of the list.
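The list-based alternative could be sketched like this: start from an empty list and store each complete t-test result in it. The toy data, column names, and the name my_list are assumptions; indexing by my_list[[i]] would work the same way as the named version shown.

```r
# Toy stand-in for the workshop data frame; names and values are assumptions.
set.seed(1)
df2 <- data.frame(treatment = factor(rep(c("control", "treated"), each = 4)))
for (i in 1:5) df2[[paste0("marker", i)]] <- rnorm(8)

my_list <- list()   # list() instead of c() gives an empty list

for (i in 1:5) {
  marker <- paste0("marker", i)
  # Double brackets address one element; using the string names the element
  my_list[[marker]] <- t.test(df2[[marker]] ~ df2$treatment)
}

names(my_list)                  # the five marker names
my_list[["marker3"]]$p.value    # pull one value back out later
```

Nothing is lost with this approach, at the cost of a less tidy object than a data frame.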
So it could be double bracket, and then you have marker instead of the i, and it would be like marker one, marker two, marker three, and so forth in the list. So when you say to create a new list, is it the same thing? Like you give it a name for the list, equals, and then still the c(), and the double brackets, same one? It would be list instead of c now, because this is initializing the empty vector, but if you wrote list instead of c here, then you have an empty list. Yeah. Okay. And when you say double brackets, sorry, you mean that instead of one square bracket here, you have two square brackets? Because lists are indexed with two square brackets. Two square brackets, or like something comma something within square brackets? That's a data frame. Okay. So if we say my list equals list(), now it's an empty list. My list, double square brackets, i, is how you would do it. Okay. And that's it, just to tell it it's a list. Yeah. The first one tells it it's a list, and then with the double square brackets, I'm referring to how you would add things in down in this part. I see. Okay. Thank you. Yeah. Ran, just try running your whole loop again. No, I'm getting the same without this last bit. So without the results, Dia, are you able to run just the loop? Yeah. And also maybe reinitialize these, make sure these are empty, and then run the loop. Sorry, again? Yep. So run these four. So just highlight from here and run down to here. Okay, so include them, but all the way to the results. Yeah, but don't include this last line, just to see if you can get it to run without that. I have an error on my t test; it says it cannot find it. But it seems to be right. Oh, you know what else is probably making it really hard for you: you're in a markdown file instead of an R script. And so you're not in anything that's like a coding interface. So yeah, I noticed that it looks so, okay.
So you want to be in an R script, not in a markdown file, because in a markdown file the only place you can legitimately put code is in a chunk. And you're outside of a chunk here, so it's just treating it as normal text. Yeah, yeah. That's also going to help you. Yeah. Let's see. Oh, now it looks better. Awesome. And I didn't get the autocomplete with the tab. Yes, exactly. It's not going to give you any of that in a markdown file. It's really for creating reports only. I would not do my scripting in there. Yeah. Yeah. Okay. But I still get that error with my t test. So I'll see it again there. Let's see. Because it's underscored in one place and then not in the last row. So p value is my t test, but there's no underscore between my and t test for you, Ren. Diego, let's see. Yeah. So yeah, I had an extra quotation mark after the paste0. I think you're also missing the end of your curly brackets, so your loop isn't closing out. I don't see a curly bracket after you define your p value. Oh yeah. Okay. Yeah. So Dahlia, when you get an error like that, it means that these vectors weren't made correctly in the loop. So we have to go back and debug the loop. Are you able to do a screenshot of the full loop? Because the results df part is probably fine. You might just need to run it a second time, Dahlia. Oh, t two mean is NULL. You're missing an i in estimate on t two mean on line 75, Dahlia. Yep. Right. Great. No output or error is correct. What should be created is a new variable in your environment. So here, I'm just going to run this. See here, it's created all my vectors: case mean, control mean, p value, marker name. Those vectors are created. Now when I run results df, I have a new data frame.
That's my results data frame in my environment, and I can actually click on that if I want to view it, and I can see marker one, two, three, four, five. I have the means for each of them and the p-values. So once you guys have that, yeah, then you've done it right. Because the key thing is we want an object out of this that we can then write to a spreadsheet, sort, or parse. Awesome. All right, you guys, I think we've got it. And Dahlia, I'm going to just say how I figured out your bug: by looking in the environment pane. You can see each of these vectors that we create here. So marker name, control mean, case mean, and p value, these are each vectors that we're creating in our environment. So these are all over here. And so I could see that t two mean was an empty vector. So here, I'm just going to reproduce that error, as if there's a misspelling somewhere here. So I'm going to go run, run, run on this. Okay, then my case mean here is NULL, it's empty, because I had a misspelling here. So it wasn't creating it. That's how I would know that my issue lies in creating this vector here, case mean: it's not filling in as I expect. So that's why I would have to add the i. All right, so I'm just going to rerun mine now. Good job, everyone. I'm going to clear off all our yeses. Excellent work. And let's order them by p-values. So again, we have five markers here, which is very easy to parse. But if we had hundreds or 10,000, which is often the case, we want to see what's the most significant, right? We want to order them. And the way we can do this is again with this kind of conditional slicing, but here, instead of slicing, we're reordering our data. So what you do is you take your results df, this data frame that we output, and you use those same square brackets, but we have a comma, right, because it's a data frame, it's two-dimensional.
And we want to order the rows. So we want to reorder the rows. So we have the lowest p value first and the highest p value last. And so that's how we do it here, we use the order function. And then within that, we put the vector that we'd like the order to be based on. And so here that's our p value. So results df p value close it. And this is our rows argument. And we want all the columns. So the right hand side of the comma is going to be blank. So I'll just fill in this. And so the order vector is actually just a position vector. It's just telling you that the second row should go first. The third row should go second. The fifth row should go third. The fourth row should go fourth. And the first row should go last if you're reordering them. And so that's what this vector actually is explicitly creating here. So go ahead and sort your p values using the order. Sorry, what was that? What's the comma that's right at the end of that order? Yeah, yeah. That's doing what again? Yeah. So results df is a two dimensional object. So we have rows and columns, we want to order the rows and we want to keep all the columns. So the left hand side is the rows. And so here we're just reordering the rows. So this vector, this argument on the order, it just goes to the row space. And then the columns, we just keep all of the columns. Great question. And so this again, this is just creating a new object in your environment. Okay. So you should see this object. You can go and inspect it. Or if you want to see the output right when you do it, you can actually surround this entire thing with parentheses. All right. So once you see an ordered an ordered data frame, go ahead and click yes. And I'll add that this guy. How does it decide on the something on a low to high, high to low? The default is low to high. But within the order function, you can actually make it decreasing as well. So here, I have order here. 
But if I go into the order function, so here, this is all the order function. I'm going to add a new argument. I'm going to push tab to see what options there are decreasing. If I say decreasing equals true, then it will go the largest p value to the smallest. So you can order in either direction. And increasing is just a default so you don't have to indicate anything. Exactly. Got it. Yep. And for those of you who have it, go ahead and try different orders. Try sorting on different columns. You could sort based on the control mean or the case mean, or you could also only select certain columns. So you don't have to output all the columns. You could have it only output, maybe the marker name, or you could have it output only the means. So you could have a group of columns if you define a vector of column names or column numbers, or you could give just one column. So you would be sorting the whole data frame, but returning only one of the columns, maybe. So experiment with that. Sorry, guys. And I'll give you all a couple minutes here to make sure you get that order, because this is very important. This is so commonly used. Also, you can try sorting based on a character vector and see what that gives you if you've also completed this. So how about the marker name? What does that order look like? And don't forget to click yes once you got it. So, Ran, the reason it's in values is because you only selected one column. So once it's a vector, it's going to be in values. It's not going to be in a data frame anymore. So yeah. Okay, but okay. So if you keep all the columns or multiple columns, then it will be in your data frames. And you can view it in view. Otherwise, it's going to be a vector. So it's just going to be a series of numbers in the value space. But you can also surround your entire command with parentheses if you want to see the output at the same time. 
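The ordering steps can be sketched as below. The results table here is made up for illustration (its p-values are chosen so that order() returns 2, 3, 5, 4, 1, matching the position vector described above), and the underscore column names are assumptions.

```r
# Made-up results table; all values and column names are illustrative only.
results_df <- data.frame(
  marker_name  = paste0("marker", 1:5),
  control_mean = c(1.2, 0.8, 1.5, 1.1, 0.9),
  case_mean    = c(1.3, 2.1, 0.4, 1.4, 1.6),
  p_value      = c(0.62, 0.004, 0.03, 0.21, 0.08)
)

# order() returns row positions: which row goes first, second, ...
order(results_df$p_value)

# Rows go before the comma; leaving the column side blank keeps all columns
results_sorted <- results_df[order(results_df$p_value), ]

# Largest p-value first
results_desc <- results_df[order(results_df$p_value, decreasing = TRUE), ]

# Selecting a single column returns a vector, not a data frame.
# Wrapping the whole assignment in parentheses prints it as it is created.
(marker_only <- results_df[order(results_df$p_value), "marker_name"])
```

Sorting on a character column such as marker_name works the same way and gives alphabetical order.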
So here, if I just put an open parenthesis at the front and a close parenthesis at the back, so the whole thing is surrounded, it's going to print it out at the same time it's creating it. Well, okay. So that could be, too, if you want, just the marker name. It will also just give you that at the same time as creating it. Okay, okay. Thanks. All right. Nice work, everyone. So I'm going to go ahead and clear them and move on to the next piece, printing your results table, right? This is where you want to keep this, maybe for a manuscript or for sharing with collaborators. So you're going to write a CSV of your sorted results. Okay. The function to write or create a CSV file is by default going to go to the working directory that you set in R. Okay. So you've set your working directory; far, long ago, you will have set that. We've got results sorted, which we created in the last step. And this is the file that we're writing our data to. So Christina, this is something you can do in a loop. If you wanted to write a new file for each of your markers or create a different spreadsheet for each marker, you could also just have this file name change, and write many, many files. Or similarly, you can save many, many different images, for example, if you wanted to. But here, we're just going to write one summarized CSV file. You can call it anything you want. I'm calling it my results table here. Quote equals false. What I'm saying here is I don't want, in the CSV file that's created, marker two, for example, to have quotes around it. That's what quote equals false means. If you don't say that, then it will be quote, marker two, quote, for example. It's fine either way. It's not a huge deal. It's just kind of a style issue. And then row names equals false. What that means is these row names, I don't want these to be included in the file that's created. Otherwise, the CSV file will include them.
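As a rough sketch of the write step, the example below writes a small made-up table with write.csv(), using the quote and row.names arguments discussed above. Two things here are assumptions for the sake of a self-contained example: the file names, and the use of tempdir() instead of the working directory that the workshop writes to.

```r
# Made-up sorted results; values and names are illustrative only.
results_sorted <- data.frame(
  marker_name = c("marker2", "marker3"),
  p_value     = c(0.004, 0.03)
)

# The workshop writes to the working directory; tempdir() is used here
# only so the sketch does not leave files behind.
out_file <- file.path(tempdir(), "my_results_table.csv")

write.csv(results_sorted, file = out_file,
          quote = FALSE,       # no quotation marks around strings
          row.names = FALSE)   # no leading column of row numbers

readLines(out_file)            # look at what was actually written

# A changing file name inside a loop: one file per marker
for (marker in results_sorted$marker_name) {
  marker_file <- file.path(tempdir(), paste0(marker, "_results.csv"))
  write.csv(results_sorted[results_sorted$marker_name == marker, ],
            file = marker_file, quote = FALSE, row.names = FALSE)
}
```

Flipping quote or row.names to TRUE and rereading the file is a quick way to see what each argument changes.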
Again, it's not a huge deal, but the first column in the CSV file would be these numbers, and then it would start with your actual data frame content after that. So you can experiment with setting these to true and false and check how that file changes. And once you've created a file, all this is going to do is print this command in R, but you can go look in your working directory, and you should have a CSV file created there. And that's how you'll know it's run. So go ahead and click yes once you have a CSV file that you've created. So even though we write the command write CSV, the name of the file still has to have the dot CSV at the end? Yes. Yes. All right. It's a good question, because I think you could say write CSV and then say your file is dot txt. I think that should work fine. But it's probably better to have it. This function is very simple, similar to my for loop initially, where I was just printing the output. This is just printing the output, but the way it's doing that is by using return. So this function returns just whatever argument you put in. Let's see. If we run the function, so now we've defined it up here, and you run your function without anything in it, we can do this because we have a default argument for it. So running the function without any argument in here, it will return hello. Okay. Because that is the default. If you run your function and you have a different quoted string in there, it will return yes, because it's going to return now whatever you've input. So very, very simple. So go ahead and write this function, this very simple function, and click yes once you've got it running. So test it. I have one test here where I expect yes to output yes, but put in different character strings, and also see if maybe numeric values will work or maybe not. Just give it a try and click yes once you've got it. Nice. Wonderful.
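A minimal sketch of the simple function described above: one argument with a default, returned unchanged. The function and argument names are assumptions; only the default value "hello" and the "yes" test come from the session.

```r
# Hypothetical names; the function simply returns whatever it is given.
my_function <- function(argument = "hello") {
  return(argument)
}

my_function()        # no argument supplied, so the default is returned
my_function("yes")   # returns the string that was passed in
my_function(42)      # numeric input is returned unchanged as well
```

Once defined, the function shows up in the environment and can be called as often as needed.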
A lot of people have it. I'm going to give everyone a couple more minutes. Great. Good one. All right. So it's a pretty simple function. I'm going to go ahead and clear all, but of course, if you're running into issues, just post it for everyone to see. We can all learn from this. Let's look at a more complex function. All right. So here again, I am including some default arguments for my a and b. These are my arguments. And look how similar it is to a for loop. I have function defined here, I have my arguments in here, and then I have my curly brackets. This is where all the work happens. And then down here is where I return the output of the function. So my sum is what's going to be spit out by this function. All right. So what does my second function return? What should the default return? You can type it into the chat. You can write it up yourself and see. Or you can just do the math on the fly here. Kind of a working memory problem. Close. Six. a is one, so one plus one. b is two, so two plus two. So you've got two plus four being added. So it should output six. No, I'm glad you put a guess in there, Rad. Okay. So see how it just takes whatever is input and puts it through each line, line by line, and then outputs whatever your return statement is telling it to output. Yeah, exactly. Grade one math. It's the furthest back. It's the furthest from what you've done. Similarly here, let's change it so a is now equal to three and b is equal to one. So now we have three plus one and one plus two. So we've got four plus three happening here in my sum. So that's going to be seven output here. All right. So now for you guys, try the following function. The input is any string or number. The output is the string with site underscore pasted to the front of it. So remember paste0. All right. Let me just make sure I'm not giving the answer. Yes. Okay. And go ahead and click yes when you think you've got it. I'll just give some sample output too.
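The two-argument function walked through above was not read out line by line, so the sketch below is a reconstruction from the spoken arithmetic (a plus one, b plus two, then the sum of the two). The function name, the intermediate names x and y, and the exact body are assumptions; only my_sum and the results six and seven come from the session.

```r
# Reconstructed from the arithmetic in the walkthrough; body is an assumption.
my_second_function <- function(a = 1, b = 2) {
  x <- a + 1          # with the defaults: 1 + 1 = 2
  y <- b + 2          # with the defaults: 2 + 2 = 4
  my_sum <- x + y     # 2 + 4
  return(my_sum)
}

my_second_function()              # defaults: 6
my_second_function(a = 3, b = 1)  # (3 + 1) + (1 + 2) = 7
```

Each line inside the curly brackets runs in order, and only the value named in return() comes back out.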
And here I called my function add sample. You can call it a different name if you'd like. Awesome. And if you've got it working, you know, start making more functions, you can make a function to compute your box plots, you can make a function to write a file that we just wrote. And you know, the functions can be as complex as you'd like them to be, a function to run a loop over your markers. So if you input a data frame that has any columns with marker in the name, then it will run a maybe a t test on that. There's lots of options. Also, it may be helpful if you have the slides to look at past functions to help guide how this one is being done. So you'll notice once you have your function created, it will actually be put one second. I'm going to create a function here. Sorry, guys. I'm actually going to move this for a moment. Nice. I have quite a few people getting it. Sorry. So here, once your function is created, you'll see it in functions, I've created add sample as a function. Okay. So you can test your function by running it, but you'll also see it in your environment pane, because now you're creating a function that can be used many times over. You could create a function. So for those of you who are done, you could create a function that extracts only the elements of the t test output that you like care about, for example. So you could create just a single vector that's named. Here's my case mean, my treated mean, and my confidence intervals, and my p value, for example. And then that's the output of your function. So I encourage you to go back through and try to functionalize some of your past code. And I'll give everyone else a few more minutes to get this one. And don't forget to click yes, if you got this function in your environment pane, it's running correctly. I'm still with an error there. I think what you mentioned, but yeah, can you add a little screen grab of it? Yeah, I put it in the thread. 
Oh, because now site number is the variable that you define in your function. So within your function's curly brackets, you want to use site number. Does that make sense? I'm going to go ahead. Yeah, yeah, yeah, site number and not site. No, site number and not my input function, because with my input function, you're almost doing like inception, right? You're taking the function's name and putting it inside itself. But that's the function's name. You want to use the variable that's defined in the parentheses; that is the variable that's used within the function. Okay. So I'm going to go ahead and go through this. With the function, you define the name of the function here. I called it add sample, equals function, and in my parentheses I define a variable that's going to be used inside these curly brackets. Okay, so here is my input val, and I'm going to say Toronto is the default here. Now I start my curly brackets. You don't have to have a default, or yeah, actually here you do, because if you don't set it, this call with no argument won't run. But I mean, it is an option to have a function without a default. And then you have your curly brackets. Now within here, input val, this variable you defined in the parentheses of your function, is a variable that's available inside these curly brackets. And so that's why I'm pasting site underscore and input val to make my new string. And then I return that new string. That can all be done on one line. I don't have to define a new variable. Like, I don't have to define new string here and return new string, but I have to use input val, because that's how the information gets in: when I say orange, now that's my input val, and that's going to be used in the curly brackets to create my string output. All right, so great. I just want to make sure everyone's got it. So just make sure you click yes once you have it. Awesome. Great. All right.
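The add sample function explained above could be written roughly as follows. The underscore spellings add_sample, input_val, and new_string are assumptions about how the spoken names were typed; the default "Toronto", the "site_" prefix, and the "orange" example come from the walkthrough.

```r
# input_val is the variable defined in the parentheses; it is the one
# to use inside the curly brackets (not the function's own name).
add_sample <- function(input_val = "Toronto") {
  new_string <- paste0("site_", input_val)
  return(new_string)
}

add_sample()           # uses the default
add_sample("orange")
add_sample(3)          # numbers are converted to text by paste0

# The same thing on one line, without the intermediate variable
add_sample_short <- function(input_val = "Toronto") paste0("site_", input_val)
```

Both versions return the same strings; the longer one just makes the intermediate step visible.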
So now try writing the following function. We're going to do another little exercise. Input a marker vector; output the proportion of the exponentiated marker values in the vector that are greater than two. So what proportion of the values of, for example, the exponentiated marker one column of df2, so that would be the marker vector, is greater than two? So you should be able to put in any vector and then find, if you exponentiate it, how many are greater than two. So for a proportion, the denominator is the length of the full vector, and the numerator is the length of the part that meets this criterion. And remember, when you define this function, it's just going to run without an error and add a function to your environment pane. So you're going to have another function added here. And you'll be able to run it, but defining it won't give you any output. So there won't be anything new coming out unless you test your function. So you can go ahead and put in marker vectors to test it. Remember, your length function is going to come into play here. Any chance you can put the previous slide back up? Yeah, definitely. Here, I'll do this slide. Yeah, I guess. Would that help? Let you know soon. So this is a different function, just so everyone's clear. This is our previous function. But here's a guide: you'll want to have your variable be a vector now. Okay, so this is some marker vector that you're going to put in as your variable. And inside your function, you're going to be computing what proportion of that vector has an exponentiated value greater than two. And remember, all the slides are on the course website, so you can download the slides and have them in another window. I'm sure you're running out of screen real estate. But yeah, that's a great point, Francis, because you can also look even further back at, you know, the different functions, length, dim, these ones that could help with subsetting. 
So doing this kind of conditional subsetting of your vector, it'll all be useful. Yeah, awesome. We've got one. So very nice. We're getting a lot of sharing on the Slack. It's really helpful. I think a lot of you guys are seeing similar issues. A key thing is your input value: because it's going to be a marker vector, you don't want just a character string. We've just had character strings so far, but you'd want an actual numeric vector to be tested here. A single value is okay; it just won't be as good a test. But, for example, if you exponentiate this one, it will give you an error rather than a number, because it's just a character string. Another thing is you don't want to just compute this; you want to return the proportion of values in the vector that you input. So you want input_val, if that's your variable, or whatever your variable is, to be inside the exponential function here. And you want to find what proportion of that vector meets this condition, relative to the full length of the vector. I know it's a super challenging one, because it's pulling together a lot of what we've been learning these last two days. Yes, Christina, it should be a single value. It's a single proportion, so a fraction is what you're expecting as the output. A lot of you are getting very, very close. Also, feel free to use Google if you're running into issues. If you want some command or some function that does something for you, give it a try. Sorry, Lorraine, I can't find the extra brackets you mentioned. If you use the RStudio ability to find the matching ones, it should be highlighting the opening one. So here I put my cursor here and it highlights the opening one. I think you have an extra one that doesn't have anything that goes to close it. So you can either add a closing one or delete an opening one. But I think that's why. They all seem to be right, all closing. Okay, so maybe you just need to run that chunk again. 
I'm getting the same. Oh, okay. Awesome. Okay. Okay, guys, I'm gonna put us out of our misery. I think some of you are very close. And those of you who are not, I think it will make a lot of sense when we put it all together. This was a really hard, challenging question to ask of you. Okay, so I'm just gonna go ahead. I'm gonna clear our yeses and nos. Input marker vector, output proportion of exponentiated marker values. So even the names, like, it's hard. Okay, so here I'm using one trick. You don't have to do it this way. You can just use length as well, similar to what we did before. So I'm defining my function, marker_prop; you can call it anything you want. All right. For my function, I define this variable in_vec. This is what I'm going to be using down below. So this in_vec could be any vector of values. All right. And now what I do in my first line is I ask how many of these are true. So what you can do is, I'm just gonna go to here. If I have a vector of values, so let me just say marker_vec, and I'm just defining it here on the fly to do a demonstration. Okay. dev.off(), because I'm getting this graphics state issue. Okay, marker_vec. Now, I want to know how many of these are true. So if I just do exp of marker_vec here, greater than two, I get a vector of TRUEs and FALSEs. Okay. And I want to know the length of the ones that are true. Okay. And that, over the total, is the proportion that meets the criterion. So there's a few things I can do here. Down below, I just take the sum, and that will count the number of TRUEs in there. So that's one shortcut way to do this: sum. There's 60 TRUEs, and if I take the length of the total marker_vec, that's 100, so 60 out of 100 meet that criterion. But another way we can do it is we take the length of marker_vec, and we use our square brackets to do that conditional subsetting. 
So marker_vec, oops, sorry, exp of marker_vec, greater than two, goes in here, and with length around it, it will give us the count. But here, if I just did it without the length, it's subsetting marker_vec to only the cases where the exponentiated value is greater than two. Okay. So this is what we talked about before. So then you could make the length of that your numerator. That's the number that's true, but you can also just sum the TRUEs in that vector. And then you divide the number that's true by the total length of the vector. Okay. And that's the proportion that you're getting out. Okay. So that's it. So the key thing, though, was finding how many of the values in the vector the condition is true for, and counting them somehow. So either you do it by subsetting your vector like we've shown before, or you can just sum a conditional vector, and then you return the proportion. Okay. Just going to go to the Slack channel. Yeah. So this one was a challenge.
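The two counting approaches described above could be sketched roughly like this. The spellings marker_prop and in_vec are assumed, marker_prop_len is a hypothetical name for the length-based variant, and the small marker_vec here is made up for illustration (it is not the 100-value vector from the demonstration). The sketch also assumes the vector has no missing values, since sum() returns NA when any element is NA.

```r
# Sketch only: function and variable names are assumed spellings.

# Approach 1: sum the logical vector (TRUE counts as 1, FALSE as 0)
marker_prop <- function(in_vec) {
  n_true <- sum(exp(in_vec) > 2)
  n_true / length(in_vec)
}

# Approach 2: subset with square brackets, then take the length
marker_prop_len <- function(in_vec) {
  length(in_vec[exp(in_vec) > 2]) / length(in_vec)
}

# Hypothetical test vector: exp() of the last three values exceeds 2
marker_vec <- c(0, 0.5, 1, 1.5, 2)

exp(marker_vec) > 2          # vector of TRUEs and FALSEs
marker_prop(marker_vec)      # 3 out of 5
marker_prop_len(marker_vec)  # same proportion via subsetting
```

Both versions return a single proportion; which one to use is mostly a matter of taste.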