Welcome to SSUnitex Social Decide. This is a continuation of the PySpark tutorial. In this video we are going to look at the count function and the countDistinct function. I would strongly recommend watching all these videos in the order in which I am recording and uploading them. Today's agenda: first we will see the count function, then the count function with groupBy, then the distinct function combined with count, and at last the countDistinct function. If we already have distinct combined with count, why would we use countDistinct at all? We will see all of that in this video.

So let me quickly go into the browser and we will try it in practice. Here we are reading data from a CSV file, salesnew.csv, and we have loaded it into the DataFrame df, which contains the sales data. First, we just want to check the total row count of this DataFrame. We can simply call the count function on it, df.count(), and execute. It returns the total count, and as we can see there are 799 rows in this DataFrame, so the output is 799. So we simply call the count function on the DataFrame.

Next, our requirement is to check the item-wise count in this DataFrame. For that we have to group by the item name. So we write df.groupBy(), specify the column name, item name, inside the groupBy, and at the end call the count function. Let me put the result into another DataFrame, df1, and display df1. Let me try to execute it. It returns each item name along with its count in this DataFrame, and that is what we see here: the items and their counts.
So we simply use groupBy together with count. If we have multiple columns to group by, we can simply put a comma inside the groupBy and add another column.

Next we want to check the distinct count of this DataFrame. For that, again with df, we use the distinct function followed by the count function. It returns the total distinct row count of the DataFrame, and we can see it is 799, so all the rows are unique. But here our requirement is to check the distinct count of the items, and we cannot put the item name directly inside count. We have to use the countDistinct function. So let me try it: df1 equals df dot, then the select clause, and inside the select we specify the countDistinct function with the column name, item name. Here we can see that the countDistinct function is not available yet. So what do we have to do? We have to import it. For that we write from pyspark.sql.functions import countDistinct. Now we can use it. Let me execute and we will see: the total is now 40, so 40 distinct items are available in this DataFrame. So from this we can say that if we want the distinct count of any specific column, or a combination of columns, we cannot use distinct with count; we have to use the countDistinct function. If we want the distinct count of the combination of the sales order date and the item name, we can specify that column here as well and execute. It returns the distinct count, which is 500, for the combination of the item name with the sales order date.
So I hope, guys, you have understood how we can use the count function, the countDistinct function, count with groupBy, and count with the distinct clause. Thank you so much for watching this video. See you in the next video.