To finish up our introduction to basic statistics, let's take a short look at selecting cases. This allows you to focus your analysis: you choose particular cases and look at them more closely. Now in R, you can do this a couple of different ways. You can select by category if you have the name of a category, you can select by value on a scaled variable, or you can select by both. Let me show you how this works in R. Just open up the script and we'll take a look.

As with most of our other examples, we'll begin by loading the datasets package using library; just press Control or Command plus Enter to run that command. That's now loaded, and we'll use the iris data set. We'll look at the first few cases; head(iris) is how we do that. I'll zoom in on it for a second. There's the iris data; we've already seen it several times.

Then we'll come down and make a histogram of the petal length for all of the irises in the data set. So iris is the name of the data set, and then Petal.Length. There's our histogram off to the right; I'll zoom in on it for a second. You see, of course, that we've got this group stuck way at the left, then we have a gap right here, and then a pretty much normal distribution for the rest of it. Zoom back out. We can also get some summary statistics. We'll do that right here for petal length. There we have the minimum value, the quartiles, and the mean.

Now let's do one more thing and get the names of the species, which is going to be our categorical variable, and the number of cases of each species. So I do summary, and it knows that this is a categorical variable. We run it through and we have 50 of each. That's good.

The first thing we're going to do is select cases by their category, in this case by the species of iris. We'll do this three times; we'll do it first for versicolor.
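The setup steps described so far might look like the following in R. This is a minimal sketch reconstructed from the narration, not the presenter's actual script; it assumes only the built-in iris data and its standard column names.

```r
# Load the built-in data sets (this package is usually attached by default)
library(datasets)

head(iris)                   # first few cases of the iris data

hist(iris$Petal.Length)      # histogram of petal length for all 150 irises
summary(iris$Petal.Length)   # minimum, quartiles, mean, and maximum

summary(iris$Species)        # Species is a factor, so this gives counts per species
```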
So I'm going to do hist, and I say use the iris data, and then the dollar sign means use this variable, Petal.Length. Then in square brackets I put this to indicate select these rows, or select these cases. And I say select when this variable, Species, is equal to versicolor; you've got to use the two equals signs. Make sure you spell it and capitalize it exactly as it appears in the data. Then we'll put a title on it that says petal length, versicolor. So here we go, and there are our selected cases. This is just 50 cases going into the histogram now on the bottom right. We'll do a similar thing for virginica, where we simply change our selection criterion from versicolor to virginica, and we get a new title there. And then finally, we can do it for setosa also. So great: that's three different histograms made by selecting values on a categorical variable, where you just type them in quotes exactly as they appear in the data.

Now, another way to do this is to select by value on a quantitative or scaled variable. If you want to do that, what you do is, in the square brackets that indicate you're selecting rows, you put the variable (I'm specifying that it's in the iris data set) and then say what value you're selecting. I'm looking for values less than two, and I have the title changed to reflect that. Now what's interesting is that this selects the setosas; it's the exact same group. So the diagram doesn't change, but the title and the method of selecting the cases did.

Probably a more interesting one is when you want to use multiple selectors. Let's look for virginica; that'll be our species. And we want short petals only. So this says what variable we're using, Petal.Length. And this is how we select: we say iris, dollar sign, Species, so that tells us which variable, is equal to, with the two equals signs, virginica. Then I just put an ampersand, and then say iris Petal.Length is less than 5.5. Then I can run that. I get my new title, and I'll zoom in on it.
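The three kinds of selection walked through here can be sketched roughly as follows. The exact title strings are assumptions, since the narration only paraphrases them; the selection expressions follow what is described.

```r
library(datasets)

# Select by category: match the species name exactly as it appears in the data
hist(iris$Petal.Length[iris$Species == "versicolor"],
     main = "Petal Length: Versicolor")   # title wording is an assumption
hist(iris$Petal.Length[iris$Species == "virginica"],
     main = "Petal Length: Virginica")
hist(iris$Petal.Length[iris$Species == "setosa"],
     main = "Petal Length: Setosa")

# Select by value on a quantitative variable: petal lengths under 2
hist(iris$Petal.Length[iris$Petal.Length < 2],
     main = "Petal Length < 2")

# Multiple selectors joined with &: virginica with short petals only
hist(iris$Petal.Length[iris$Species == "virginica" &
                       iris$Petal.Length < 5.5],
     main = "Petal Length: Short Virginica")
```

The single ampersand is the element-wise "and", so each row is kept only when both conditions are true for that row.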
And so what we have here are just the virginica, but the shorter ones. So this is a pair of selectors used simultaneously.

Now, another way to do this, by the way: if you know you're going to be using the same subsample many times, you might as well create a new data set that has just those cases. The way you do that is you specify the data that you're selecting from, then in square brackets the rows and the columns, and you use the assignment operator, that's the less-than sign and dash here, which you can read as "gets." So I'm going to create one called i.setosa, for iris setosa. I'm going to do it by going to the iris data and, in Species, selecting just setosa. I then put a comma: because this part selects the rows, I need to tell it which columns. If you want all of them, you just leave it blank. So I'm going to do that. And now you see up here in the top right, I'll zoom in on it, I have a new data object in the environment: a data frame called i.setosa.

We can look at that subsample that I've just created; we'll get the head of just those cases. You see it looks just the same as the other ones, except it only has 50 cases as opposed to 150. I can get a summary for those cases, and this time I'm doing just the petal length. I can also get a histogram for the petal length, and it's going to be just the setosas.

So that's several ways of dealing with subsamples, and again, save the selection if you're going to be using it multiple times. It allows you to drill down on the data and get a more focused picture of what's going on, and it helps inform the analyses that you carry on from this point.
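Saving a subsample as its own data frame, as described above, might be written like this. It is a sketch based on the narration; the object name i.setosa is the one spoken in the video.

```r
library(datasets)

# Rows: cases where Species is setosa. Columns: left blank after the comma,
# which keeps all of them.
i.setosa <- iris[iris$Species == "setosa", ]

head(i.setosa)                   # same columns as iris, setosa cases only
summary(i.setosa$Petal.Length)   # summary statistics for the subsample
hist(i.setosa$Petal.Length)      # histogram of petal length for setosa only
```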