Hi. In this video, we're going to continue summarizing categorical data, but this time looking at two-way tables: value counts across two different categorical variables simultaneously. So let's get to it.

We're going to continue with our state energy rankings data. Recall that last time we made a new categorical variable asking whether a state produces natural gas or not, using the np.where function. We're going to borrow the same code here to make a second categorical variable, looking at whether a state has total energy production above one quadrillion BTU or not. We'll call this one "above one quadrillion BTU?", and we'll replace our is-null question with a different statement: we'll ask whether the total energy is greater than a thousand, because total energy is in units of trillions of BTU, and a thousand trillion is a quadrillion. This time, if it is above, we return "Yes": yes, it is above one quadrillion. If it's not, so if the statement returns False, we return "No": it is not above one quadrillion BTU. And again, we'll do the same processing: we'll convert it to categorical and confirm that. Let's go ahead and run this. There we go. We see that "above one quadrillion BTU?" is categorical.

So now we have two interesting categorical variables, "produces NG" and "above one quadrillion BTU?". Let's take a look at how many values we have in each combination of categories between these two variables. To do this, we're going to use the crosstab function available from pandas. We'll make a new object called twt, short for two-way table. We reference pandas with pd, and the function is crosstab. We take our DataFrame, which we're calling df3 right now, and give crosstab the first categorical variable that we want, the "produces NG" variable we developed before.
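As a rough sketch of the variable-building step described above: the DataFrame below is a made-up stand-in for the state energy data, and the column names "Total Energy" and "Above 1 Quadrillion BTU?" are assumptions, since the video never spells them out. Using astype("category") for the conversion is also an assumption about how the conversion was done.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the state energy rankings data (values in trillion BTU).
# The real file and its column names are not shown in the video.
df3 = pd.DataFrame({
    "State": ["A", "B", "C", "D"],
    "Total Energy": [300.0, 950.0, 4100.0, 1000.0],
})

# 1,000 trillion BTU = 1 quadrillion BTU, so compare against 1000.
# np.where returns "Yes" where the condition is True and "No" elsewhere.
df3["Above 1 Quadrillion BTU?"] = np.where(df3["Total Energy"] > 1000, "Yes", "No")

# Convert the Yes/No labels to a categorical dtype and confirm.
df3["Above 1 Quadrillion BTU?"] = df3["Above 1 Quadrillion BTU?"].astype("category")
print(df3["Above 1 Quadrillion BTU?"].dtype)
```

Note that a state sitting at exactly 1000 gets "No" here, because the comparison is strictly greater than.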
And then we give it the second categorical variable that we want in the two-way table, which is the one we just made, "above one quadrillion BTU?". Let's take a look at what we get. There we go. We've got our first variable, "produces NG", with No and Yes making up the rows, and our second variable, "above one quadrillion BTU?", giving the columns No and Yes. Each cell is the count of states in that combination. So there are 18 states that do not produce natural gas and do not produce more than one quadrillion BTU in total energy, and so on and so forth.

Another handy feature of crosstab is the argument margins=True. This adds a total column and a total row to the table. As we see here, it added an All column, which gives the summation across each row: 18 plus 0 is 18, and 18 plus 15 is 33. It also gives us an All row, which gives the summation down each column: 18 plus 18 is 36, and 0 plus 15 is 15. The last intersection, between the All column and the All row, gives the total number of observations in the data set: 51 rows, in this case 51 states.

So let's use this to find some proportions now. First, suppose we want to find what proportion of states that produce natural gas also produce more than one quadrillion BTU in total. We'll use the same structure that we had before for finding proportions, which I can copy and paste down here. Now we have to change this up a little bit, because twt is a DataFrame now, not a Series anymore. So if we want to extract values from it, we need to reference both the row and the column, and to do so we use the .loc indexer that we introduced before. Our numerator for this proportion is the number of states that produce natural gas and also produce more than one quadrillion BTU. That's located in the Yes row and the Yes column: this value, 15. So we're going to start from our twt DataFrame.
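Here is a small sketch of the two-way table with margins. The two label Series are rebuilt from the counts quoted in the video (18, 0, 18, 15) rather than read from the actual data file, and the names "Produces NG?" and "Above 1 Quadrillion BTU?" are assumed. Plain string labels are used instead of a categorical dtype to keep the example short.

```python
import pandas as pd

# Rebuild Yes/No labels for 51 states from the counts quoted in the video:
# 18 No/No, 18 Yes/No, 15 Yes/Yes, and no state that is No/Yes.
produces_ng = pd.Series(["No"] * 18 + ["Yes"] * 33, name="Produces NG?")
above_1q = pd.Series(["No"] * 36 + ["Yes"] * 15, name="Above 1 Quadrillion BTU?")

# First argument gives the rows, second gives the columns.
# margins=True appends an "All" row and an "All" column of totals.
twt = pd.crosstab(produces_ng, above_1q, margins=True)
print(twt)
```

The All column holds the row totals (18 and 33), the All row holds the column totals (36 and 15), and their intersection holds the grand total of 51.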
We want to use .loc, referencing the Yes row and the Yes column, to extract that 15. Let's just make sure that worked real quick. And there we go.

Our denominator, on the other hand, is going to be the number of states that produce natural gas. It comes from the first criterion in the question: of the states that produce natural gas, what proportion also produce more than one quadrillion BTU? So where on the table do we find the total number of states that produce natural gas? It's in the Yes row, where "produces NG" is Yes. Before, we had found that it was 33, and we see that under the All column here. So we say twt.loc with the Yes row and the All column. 33, there we go. And then we just find the proportion: 0.455, or about 45 and a half percent.

Okay, let's do a slightly different proportion. It's only slightly different, but it highlights how much the condition matters. What proportion of states that produce more than one quadrillion BTU in total also produce natural gas? So we're reversing it. Let's borrow our same structure here. We still have the same numerator, because we're still looking at the number of states that meet both criteria: they produce more than one quadrillion BTU and also produce natural gas. But our denominator now changes. Our denominator comes from the first criterion, and that is now the states that produce more than one quadrillion BTU, not the states that produce natural gas. That is under our Yes column, at our All row. All we have to do is flip the order of what we have in this line for the denominator: the All row, Yes column. 15. There are 15 states in total that produce more than one quadrillion BTU, and we see that all of them produce natural gas, so our proportion is 1, or 100%.
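The two conditional proportions just described can be sketched like this. The table is rebuilt from the counts quoted in the video, and the Series names are assumptions; only the .loc lookups mirror what is done on screen.

```python
import pandas as pd

# Labels rebuilt from the quoted counts (18 No/No, 18 Yes/No, 15 Yes/Yes).
produces_ng = pd.Series(["No"] * 18 + ["Yes"] * 33, name="Produces NG?")
above_1q = pd.Series(["No"] * 36 + ["Yes"] * 15, name="Above 1 Quadrillion BTU?")
twt = pd.crosstab(produces_ng, above_1q, margins=True)

# Same numerator for both questions: states that meet both criteria.
both = twt.loc["Yes", "Yes"]  # .loc[row label, column label]

# Of the states that produce natural gas, the share above 1 quadrillion BTU.
# Denominator: Yes row, All column (33 states).
p_above_given_ng = both / twt.loc["Yes", "All"]

# Of the states above 1 quadrillion BTU, the share that produce natural gas.
# Denominator: All row, Yes column (15 states).
p_ng_given_above = both / twt.loc["All", "Yes"]

print(round(p_above_given_ng, 3))  # 0.455
print(round(p_ng_given_above, 3))  # 1.0
```

Only the denominator changes between the two questions, and it is always the total for the group named first in the question.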
Lastly, one more proportion: what proportion of states produce more than one quadrillion BTU and also produce natural gas? Isn't this the same question I just asked? No. By removing the "that produce" condition, we're now asking, out of all 51 states, how many do both things, whereas the previous question looked only at the states that produce more than one quadrillion BTU. That conditional statement changes the game dramatically.

Let's take this structure again. We're still using the same numerator, so we're still looking at the states that do both things. But now our denominator changes again. We want to look at all the states: what proportion of all states do both things? We can get this from the All row and the All column, which gives us 51. So All and All. Go ahead and run this. We see that 29.4% of all states produce more than one quadrillion BTU and also produce natural gas.

Okay. So that is two-way tables, and a bit more about proportions and how they can get more complex when we're looking at two different categorical variables. Thank you.
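A short sketch of this last, unconditional proportion, again using a table rebuilt from the counts quoted in the video with assumed names. The second half uses crosstab's normalize="all" option as a cross-check; that option is not used in the video and is shown only as an alternative route to the same number.

```python
import pandas as pd

# Labels rebuilt from the quoted counts (18 No/No, 18 Yes/No, 15 Yes/Yes).
produces_ng = pd.Series(["No"] * 18 + ["Yes"] * 33, name="Produces NG?")
above_1q = pd.Series(["No"] * 36 + ["Yes"] * 15, name="Above 1 Quadrillion BTU?")
twt = pd.crosstab(produces_ng, above_1q, margins=True)

# Out of all states, the share that do both things.
# Denominator: All row, All column (51 states).
p_both = twt.loc["Yes", "Yes"] / twt.loc["All", "All"]
print(round(p_both, 3))  # 0.294

# Cross-check (not from the video): normalize="all" divides every cell
# by the grand total, so the Yes/Yes cell is the same proportion.
shares = pd.crosstab(produces_ng, above_1q, normalize="all")
print(round(shares.loc["Yes", "Yes"], 3))
```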