In this video, we're going to talk about scatter plots. Scatter plots are used with two quantitative variables to show how those two variables change together. And as in the previous videos, we will be using plotnine, the ggplot-style plotting tool.
So here we are in Google Colab. We're going to use a new data set for this video, the Marcellus shale data set, which I've already read in using the first method from our import lesson. To give you an idea of what this data frame looks like, we can look at the first five rows. Here we've got all of these different columns, with a lot of metadata on different wells within the Marcellus shale formation. But the two columns we're interested in are total gas and max gas. These are the two quantitative variables that we will use to create scatter plots. Because the data is already ready to go, there isn't much we need to do in terms of cleaning, so I'm going to jump right into the plots themselves.
We're going to start off with a basic scatter plot. Again, this is quantitative versus quantitative. And again, we're using ggplot, and we give it our data frame, which we call marc, for Marcellus. Here we have another geom for ggplot, and this time it's geom_point. This is how you specify a scatter plot: by drawing points. Similar to previous plots, we need an aes statement, so we can say x is max gas and y is total gas. For a scatter plot, you do need to specify both x and y, and they both need to be quantitative. From this plot, we can start to judge whether total gas increases with max gas. We can see that maybe there are some outliers over here, but it's really just a basic scatter plot.
To make this a little more descriptive, we can add a best fit line and a confidence interval. I'm going to copy the code from above, because we're building off of the existing plot. Right here, I'm going to add a second layer, and this layer is going to be stat_smooth. stat_smooth works very similarly to a geom, so we still need to give it an aes with max gas and total gas. Then, outside of the aes, we need to specify the method, which is how it's going to fit the best fit line. In this case, lm is for linear regression. And if we want to, we can give it a color so that it looks different from our dots. If we run this, we can see that we now have this best fit line. And if you zoom in, you can see there's a narrow band around the line, which shows the 95% confidence interval for the fitted line.