Hi, my name is [name], and I'll be presenting my personal data project for Business Data Literacy. For this research, I used the vote_nsa data set from the usdata package, available in RStudio and pulled from the OpenIntro Statistics website.
The first step was to take a look at the rows and columns within the data set. vote_nsa contains 434 rows and five columns of variables: name, party, state, money, and the vote for or against mass surveillance. Two variables that were of interest to me were party, whether Democrat or Republican, and the voter's preference for or against continuing mass surveillance. I was interested in whether a voter's party and how they voted would pass a test for independence.
After removing the rows of non-voters, I visualized the proportion of voters in each party who wanted to continue mass surveillance. The difference in vote by party appears large enough to be worth checking, so I will test for independence. My research statement is that the data provide strong evidence that the proportion of Democrats who vote to continue mass surveillance is different from that of Republicans. The null hypothesis is that the probability that a Democrat will vote to continue mass surveillance is the same as the probability that a Republican will vote to continue mass surveillance. The alternative hypothesis is that these probabilities are different.
The first step in the Downey-Infer process is to calculate a sample statistic, or delta. In this code chunk, you can see I used the vote_nsa data set and specified party and phone_spy_vote. The success factor was D for Democrat, and I'm looking for the sample statistic. The chi-squared sample statistic in this test is 10.1.
The next step in the Downey-Infer process is to calculate the null distribution. Here you can see I used the same factors: the vote_nsa data set, party, and the phone_spy_vote results. Now we can observe the sample statistic in the null distribution.
As you can see, the sample statistic is far outside the null distribution, so we will calculate the p-value and validate our results. The significance level for our test was set at 5%. Our calculated p-value of 0.15% is far below that level, and therefore we do have sufficient evidence to reject the null hypothesis. Based on this sample, we have sufficient evidence that whether a voter is Democrat or Republican and their vote to continue mass surveillance are not independent.
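The steps described in the talk (sample statistic, null distribution, p-value) can be sketched roughly as follows. This is a minimal Python illustration of a permutation-based chi-squared test of independence, not the R code chunks shown in the presentation. The function names chi_squared and permutation_test are hypothetical, and the party and vote lists are made-up toy data rather than the real vote_nsa counts, so the numbers it prints will not match the 10.1 statistic or the 0.15% p-value reported above.

```python
import random
from collections import Counter


def chi_squared(groups, outcomes):
    """Pearson chi-squared statistic for two categorical variables."""
    n = len(groups)
    row_totals = Counter(groups)
    col_totals = Counter(outcomes)
    observed = Counter(zip(groups, outcomes))
    stat = 0.0
    # Sorted keys keep the summation order fixed across shuffles.
    for g in sorted(row_totals):
        for o in sorted(col_totals):
            expected = row_totals[g] * col_totals[o] / n
            stat += (observed[(g, o)] - expected) ** 2 / expected
    return stat


def permutation_test(groups, outcomes, reps=1000, seed=0):
    """Build a null distribution by shuffling outcomes across groups."""
    rng = random.Random(seed)
    observed_stat = chi_squared(groups, outcomes)
    shuffled = list(outcomes)
    null_stats = []
    for _ in range(reps):
        # Shuffling breaks any link between group and outcome,
        # which is what the null hypothesis of independence assumes.
        rng.shuffle(shuffled)
        null_stats.append(chi_squared(groups, shuffled))
    # p-value: share of shuffled statistics at least as large as observed.
    p_value = sum(s >= observed_stat for s in null_stats) / reps
    return observed_stat, null_stats, p_value


# ASSUMPTION: made-up toy data for illustration, NOT the real vote counts.
party = ["D"] * 40 + ["R"] * 40
vote = ["continue"] * 10 + ["end"] * 30 + ["continue"] * 30 + ["end"] * 10

stat, null_stats, p = permutation_test(party, vote)
alpha = 0.05  # 5% significance level, as in the talk
print(f"chi-squared = {stat:.2f}, p-value = {p:.4f}")
print("reject the null" if p < alpha else "fail to reject the null")
```

Shuffling the vote column while holding party fixed is one common way to simulate the null hypothesis of independence. The decision rule at the end mirrors the one in the talk: reject the null hypothesis when the p-value falls below the 5% significance level.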