Okay, hi everybody. I'm going to start on Problem Set 4. Problem Set 4 is also the first set of steps you'll need to take for your semester project, so it's fairly important. I'm going to start by getting the data into the problem set, answering a few of the questions, and doing some of the coding. I'll probably break this up into two videos so that it's not too long, but I'll keep going until I reach a good stopping point. I don't have one chosen in advance.

To start with, this is the Problem Set 4 .qmd file. You can get it either from the GitHub repo or from Canvas, where I also put a copy. Before I start, I'm going to rename it. Your instructions for turning things in say to put your name in the file name, which makes it easier to know whose work we're grading. Instead of doing that, I'm going to rename it problem set 4 demonstration project. You'd put your last name and first name in place of demonstration project, but otherwise you can do the same thing. So I clicked the checkbox next to the file, hit Rename, typed in demonstration project, and hit OK, and it renamed the file.

The next thing I'm going to do is open that file. I'm going to do something here that you don't have to do at this point, but I'm going to go ahead and do it: I want to get this data in here. So I'm going to create a code chunk. As you recall, that's Ctrl+Alt+I on Windows or Cmd+Option+I on a Mac. You're going to see some ghostly suggestions popping up. I'm testing something called Copilot, and Copilot is wrong here. That's not what I want to do. What I actually want to do is load my data and create an object. So I'm going to create an object just called my data. Now I'm going to change that.
We've used that name before. I'm going to call this one demonstration data, and I'm going to load it using read.csv. Copilot is making some good guesses, but it wanted me to load data from a GitHub repository. I'm just going to do read.csv on demonstration data.csv. This should work without setting a working directory, because I've got this set up in a project, so it should read everything as if the project folder were the main directory. Then I'm going to use the head command just to make sure my data loaded okay.

I'm going to run the current chunk, and hopefully this works. It doesn't seem to want to work, which is kind of typical. I'm going to go with the other option, which is to highlight the code and hit Run Selected Lines. Okay, that's why it wouldn't run the current chunk. So you see what happens when you have an error. We may make this a slightly shorter video just to get this in here. The error happened, and R gave me a message showing where it is. When I look at it, I can see the issue is that I didn't type demonstration data correctly. So I'm going to clear this out and run it again, this time with Run Current Chunk. And that time it worked.

Now if I render the file, maybe that'll work and maybe it won't. So I'm going to save and try to render. I still have an error: I typed heads instead of head. So we'll try it one more time. All right. I had a couple of typos that caused a couple of errors, and that's going to happen to you too, so I'm kind of glad that happened.

Anyway, there's the head of my file. This file actually has a lot more than two variables, so later I'm going to have to make a decision. The first question is: what is the source of the data set? Where did you get it? I got this from the V-Dem project, the Varieties of Democracy project.
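For reference, the loading step from this part of the video might look roughly like the sketch below. The exact file name, the object name demonstration_data, and the column names are assumptions (the video only says them aloud), and the three rows are invented placeholders so the example can run without the real file.

```r
# Write a tiny placeholder CSV so this sketch runs on its own.
# File name, column names, and values are assumptions for illustration only;
# in the real project the CSV is already sitting in the project folder.
csv_path <- file.path(tempdir(), "demonstration_data.csv")
writeLines(
  c("country_name,v2x_partipdem,v2clacjstw",
    "Country A,0.1,-1",
    "Country B,0.3,1",
    "Country C,0.5,3"),
  csv_path
)

# Load the file into an object, then check the first rows.
demonstration_data <- read.csv(csv_path)
head(demonstration_data)

# A misspelled function name such as heads() stops with a
# "could not find function" error, and that also stops a render.
```

Inside an RStudio project, the call would more likely be just read.csv with the file name in quotes, since relative paths are read from the project folder.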
I can hit Tab because that suggestion is already there, and the reason it's there is that I've typed similar things before. So I'm going to hit Tab again and see if this is right. Yes, this is correct. This is actually something I typed in another project: the V-Dem project collects data through expert surveys and observations. That's about how long an explanation I expect from you. You don't have to do a lot more than that.

Now, the two main variables of interest. I need to select two, and you're going to need to do the same thing. I just need to figure out what I want mine to be. For my purposes, I think I want to use v2x_partipdem. So I'm going to come down here where it asks what the two main variables of interest are and say my x variable is going to be that. That's the name of the variable, but I want to actually explain what it is: it's a participatory democracy index. I could explain a lot about what that means, and you don't need to worry about that, but at least give me something that tells me a little more than just partipdem. The other thing I want to look at is access to justice for women.

So my goal here is to see how the level of participatory democracy affects access to justice for women. This is not a well-designed experiment. It's not even a well-designed observational study. This is really just data exploration. I want to see whether there is a relationship. Whether it's a good one or not, I'm not even trying to get at, but I'm going to model the relationship with no other factors.

Next I want to know how many observations there are, and there are several ways I can do this. The first way is to look over here, where I can see that I have 179 observations of 16 variables. That's kind of the easy, cheating way, and it's perfectly acceptable at this point. I can also use View.
Actually, I can't use View here. View will not work in one of these Quarto files. View is one of the few things I need to do down here in the console. If I do View, with a capital V, on demonstration data, I get this viewer, and I can just scroll to the bottom and see how many rows there are. There are 179 rows, which is the same thing I got over here.

The other thing I can do is use the length command to count the number. I could open a code chunk and use length, and I can also use it along with the as.numeric command to create an object that I'll use later. I'm not to that point yet, so I'm not going to worry about that.

Now, what is the median? I'm going to let R do the work for me, because I've got 179 observations and I don't want to have to go through and find the median by hand. So I'm going to type median, demonstration data, and the dollar sign, and then pick the variable I want. I want the median for the v2x variable. The little ghost suggestions where I can just hit Tab, you probably won't have. But the list that pops up when I start typing, you will have. So let's see. That's not the right one. I need the v2cla variable I have highlighted down there in the corner. Once these suggestions appear, I can use the up and down arrows on my keyboard to pick the one I want and then hit Return. So this is going to return both of those medians.

I'm going to run the chunk to get the answer. Maybe I'll use Run All, and hopefully that'll give me the answer. All right, this doesn't seem to want to work, so I'm going to try something else. I'm going to try just rendering it. If all else fails, I'm going to save the file and reopen it. So that worked. That was purely a computer glitch. Something like that should not be happening on a brand-new computer, but things happen. I'd already answered what the source is, so I'm down here on question five.
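The counting and median steps described above could be sketched as follows. This is only an illustration: the data frame is a toy stand-in with invented values, and the column names v2x_partipdem and v2clacjstw are assumed from the V-Dem codebook rather than confirmed by the video.

```r
# Toy stand-in for the real data (invented values, assumed column names).
demonstration_data <- data.frame(
  v2x_partipdem = c(0.1, 0.2, 0.3, 0.4, 0.5),
  v2clacjstw    = c(-1, 0, 1, 2, 3)
)

# Number of observations: nrow() counts the rows of the data frame.
nrow(demonstration_data)

# Same count via length() on one column, stored as a number for later use.
n <- as.numeric(length(demonstration_data$v2x_partipdem))

# Medians of the x and y variables, using $ to pick each column.
median_x <- median(demonstration_data$v2x_partipdem)
median_y <- median(demonstration_data$v2clacjstw)
median_x
median_y

# If a column has missing values, median() returns NA unless
# na.rm = TRUE is added, e.g. median(x, na.rm = TRUE).
```

Note that View() is left out on purpose, since it belongs in the console rather than in a chunk that gets rendered.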
The way I set this up, there's actually a place to type the answer. The code produced it, but I'm also going to come down here and type 0.33, and then type 0.914 for the median of y. And I'm going to render that again. All right. So the code produced the answers, and I also typed the answers in the text section. You can see that typing the answers in the text section is different from typing them in the code. It's okay here since these are numbers: if I typed them up in the code, the number would just be returned. But if I typed a written answer, like the answer to this question, in a code chunk, it would return an error.

So Quarto gives us the ability to type things in text areas that come out as regular text. And we can do a little bit of formatting. If I really want to, I can boldface this. You don't need to do this while you're working on the problem set unless you want the practice, but it's a good idea to get familiar with it, because when you produce that final project, you want it to look nice. You can see I put two asterisks on either side, and it turned those things bold.

I'll do the same thing with the mean. I'm going to hit Ctrl+Alt+I. I'm not trying to do anything fancy with this. All I want is to get the means of those variables so that I can type them in, because typing them in is actually what you're getting credit for here. So I run those, and I save this. Once it's saved, I render it. I got the answers, so I'm going to come over here and type them in. My cat is going crazy making noise, so hopefully that doesn't interfere too much. I'm going to go ahead and render this, and I think I'm about 15 minutes in, so I'm going to stop here.
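As a small sketch of the two points above, the chunk below computes the means and also shows why a written answer cannot go inside a code chunk. The data frame is again an invented stand-in, and the column names are assumptions.

```r
# Toy stand-in for the real data (invented values, assumed column names).
demonstration_data <- data.frame(
  v2x_partipdem = c(0.1, 0.2, 0.3, 0.4, 0.5),
  v2clacjstw    = c(-1, 0, 1, 2, 3)
)

# Means of the x and y variables.
mean_x <- mean(demonstration_data$v2x_partipdem)
mean_y <- mean(demonstration_data$v2clacjstw)
mean_x
mean_y

# A bare number is valid R, so a chunk containing only 0.33 just prints it.
0.33

# A sentence is not valid R. Parsing it fails, which is the kind of error
# a written answer would cause inside a chunk. try() keeps the sketch running.
result <- try(parse(text = "The median of x is 0.33"), silent = TRUE)
inherits(result, "try-error")
```

The typed answer itself goes in the text area outside the chunk, where wrapping it in two asterisks on each side makes it bold.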
I'm going to come back with another video where I pick up with problem seven, finding the sample variance and sample standard deviation. I'm not going to go through all the steps to calculate them by hand. I'm just going to rely on the R functions for this problem set, and you are absolutely free to do the same thing at this point.
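As a preview, relying on the R functions for problem seven might look something like this. It is a sketch on invented values, not the worked answer from the next video, and the column name is an assumption.

```r
# Toy stand-in for one variable (invented values, assumed column name).
demonstration_data <- data.frame(
  v2x_partipdem = c(0.1, 0.2, 0.3, 0.4, 0.5)
)

# var() returns the sample variance (it divides by n - 1).
var_x <- var(demonstration_data$v2x_partipdem)

# sd() returns the sample standard deviation, the square root of var().
sd_x <- sd(demonstration_data$v2x_partipdem)

var_x
sd_x
```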