Hello, and welcome to SSUnitex. This is a continuation of the PySpark tutorial. In this video we are going to look at the ranking functions, which come under the window functions. Today we will see the row_number, rank and dense_rank ranking functions. These functions are very similar to the ones in SQL Server. row_number generates a sequence number based on an order by clause on any particular column, and we can also partition. rank also generates a sequence number, and dense_rank does the same thing. What is the difference between these three? We will see it in practice.

Let me quickly go to the browser and we will try it out. Here I am going to create a data frame with three columns: name, department name and salary. The requirement is to see rank, dense rank and row number, so let's start with row number first. What do I want to do? I want to generate a sequence number here, in descending or ascending order of salary.

To use window functions we first have to import Window, so we can write from pyspark.sql.window import Window. We can also import the functions with from pyspark.sql.functions import *, which imports everything.

Now I am going to take the existing data frame that we created and add a new column to it. For adding a new column we can use withColumn. Its first parameter is the column name, which could be serial number or row number. Next we have to use the row number function, so we simply write row_number(). The syntax is very similar to SQL.
Here we add the over clause with .over(), and inside the brackets we use Window.orderBy and pass the column name. I just want to pass salary as the column name. Let me put the result in another data frame, df1, and use the display command on df1. Let me execute and we will see the output. Okay, we have to write orderBy; let me execute again. Okay, here we also have to add the brackets. With the syntax fixed, we can see that it has generated a sequence starting from 1: 1, 2, 3, 4, 5, 6, 7 and so on. Even where rows have the same salary, the number keeps increasing.

We can also use a partition here. How do we do that? After Window we add a dot and then partitionBy, and inside the brackets we specify the column on which we want to partition. Maybe we want to partition on department name. Let me execute and we will see the output. Here we can see that the partitioning has been applied. For the finance department the serial number is 1, 2, 3, and after that the serial number is reset to 1. For marketing it is 1, 2, and for sales we can see 1, 2, 3, like that. So this is very similar to SQL Server. That is all about row number.

Now we can also use rank. Instead of row_number we simply use rank, and everything else stays the same. Let us execute and see. Here it is generating a rank: 1, 2, 3 as we can see, then 1, 1, then 1, 2, and here we have 1, 1. Why do we have 1, 1 here? Because, as you can see, these rows in sales have the same salary. If rows have the same salary, they get the same number, which is 1, 1. With rank the next number is then skipped, while with dense rank the next number is not skipped, so the following value will be 2. Let me also use dense rank here.
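To make the tie handling concrete, here is a small plain-Python sketch of how the three numbering schemes behave inside one partition. It only illustrates the semantics and is not how Spark implements them; the function name number_rows and the salary values are made up for this example.

```python
def number_rows(salaries):
    """Return (salary, row_number, rank, dense_rank) tuples for one partition."""
    ordered = sorted(salaries)  # ascending, like ordering the window by salary
    result = []
    rank = 0
    dense = 0
    previous = object()  # sentinel that is not equal to any salary
    for position, salary in enumerate(ordered, start=1):
        if salary != previous:
            rank = position  # jumps past the positions used up by ties
            dense += 1       # always advances by exactly one
            previous = salary
        # row_number is simply the position, ties or not
        result.append((salary, position, rank, dense))
    return result


rows = number_rows([3000, 3000, 4000, 4000, 5000])
for row in rows:
    print(row)
# (3000, 1, 1, 1)
# (3000, 2, 1, 1)
# (4000, 3, 3, 2)
# (4000, 4, 3, 2)
# (5000, 5, 5, 3)
```

The row number column never repeats, the rank column repeats on ties and then skips ahead, and the dense rank column repeats on ties without leaving gaps.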
With dense_rank in place, let me execute it, and if you scroll down we can verify the result. For sales we can see the numbers are 1, 1, then 2, 2, then 3, so no number is skipped after a tie. I hope you have understood how we can use dense rank, rank and row number.

We can also put the window specification in a variable. Let me cut it from here and create a variable, maybe called x, and assign the specification to it. Then we can simply use this variable inside over and execute. It works fine. So you can either write the partitionBy directly inside over, or store it in a variable and use that variable in your query. Either way works.

So I hope you have understood how to use the row number, rank and dense rank functions. Thank you so much for watching this video. See you in the next video.