Hello, and welcome to SSUnitech. In this video we are going to look at the collect action in PySpark.

So what is collect? Collect is an action that returns the entire dataset to the driver as an array. Because it is an action, it does not return a DataFrame; it returns the data itself, which in PySpark arrives as a list of Row objects. So for example, if we have a DataFrame and we call collect on it, the entire data is converted into an array, and we can then use a loop to iterate over each value in that array.

And when is it not recommended? Whenever we have a large dataset, collect is not recommended, because it converts the entire dataset into an array on the driver, and that can exhaust the driver's memory. That is why it should be avoided with larger datasets.

So let me quickly go to the browser and we will see collect in practice. Here I am going to create a DataFrame with two columns: the first column is the department name and the second column is the department ID. Let me execute this command and we'll see the output. As you can see, the DataFrame is created.

Now the requirement is that we want to loop through all of these rows. So far we have not used a loop, but it is very straightforward: we write for i in and then specify the DataFrame, which is df. Inside the loop we just want to print i, and since it has two columns, let me print the value of the department name column. Let me execute it and we'll see the output. Here the colon is missing, so we add the colon and execute again. As we can see, it is not returning the proper output, because we cannot iterate over the rows of a DataFrame directly with a loop. So the fix is very straightforward. First we need to convert the DataFrame into an array.
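As a rough sketch of the setup so far, the DataFrame and the direct loop might look like the code below. The video's exact code is not in this transcript, so the column names dept_name and dept_id, the sample values, and the helper names are all assumptions. In classic PySpark, looping over the DataFrame itself tends to step through Column objects rather than rows, which would explain the odd output.

```python
# Sample rows; the department names and IDs are made up for illustration.
DEPARTMENTS = [("Finance", 10), ("Marketing", 20), ("Sales", 30), ("IT", 40)]
# Assumed column names; the video only calls them "department name" and "department ID".
COLUMNS = ["dept_name", "dept_id"]


def build_department_df(spark):
    """Create the two-column DataFrame on an existing SparkSession."""
    return spark.createDataFrame(DEPARTMENTS, COLUMNS)


def loop_directly(df):
    """Iterate over the DataFrame itself, without converting it first."""
    # This does not visit rows. In classic PySpark 3.x it typically yields
    # one Column object per column, so the printed text is not row data.
    return [str(item) for item in df]


def demo():
    # Imported lazily so the helpers above load even without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]").appName("direct-loop-demo").getOrCreate()
    )
    try:
        df = build_department_df(spark)
        df.show()  # displays the four sample rows as a table
        for text in loop_directly(df):
            print(text)
    finally:
        spark.stop()
```

Calling demo() on a machine with PySpark and Java available runs the sketch end to end.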
Then we can loop over that array. So how can we do that? We simply call df.collect() and put the result into another variable, df1, which now holds the array of rows. Then in the loop, instead of df, we iterate over df1. Let me execute and we'll see the output. Now we can see that we are able to iterate over the rows one by one. We can also specify an index: i[0] is the first column, the department name, and i[1] is the second column, the department ID, as we can see.

So I hope you have understood how to use collect and when we need the collect action. Thank you so much for watching this video. See you in the next video.
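For reference, the collect steps from this walkthrough can be sketched as follows. This is a minimal sketch, not the exact code from the video: the column names dept_name and dept_id, the sample values, and the helper functions first_column and second_column are assumptions made for illustration.

```python
def first_column(rows):
    """Return the first field of every collected row, by position."""
    return [row[0] for row in rows]


def second_column(rows):
    """Return the second field of every collected row, by position."""
    return [row[1] for row in rows]


def demo():
    # Imported lazily so the helpers above load even without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]").appName("collect-demo").getOrCreate()
    )
    try:
        # Made-up sample data with assumed column names.
        df = spark.createDataFrame(
            [("Finance", 10), ("Marketing", 20), ("Sales", 30), ("IT", 40)],
            ["dept_name", "dept_id"],
        )

        # collect() is an action: it brings every row back to the driver
        # as a Python list of Row objects, so keep it to small datasets.
        df1 = df.collect()

        for i in df1:
            print(i)               # the whole Row
            print(i["dept_name"])  # a field by column name
            print(i[0], i[1])      # fields by position

        print(first_column(df1))
        print(second_column(df1))
    finally:
        spark.stop()
```

Calling demo() on a machine with PySpark and Java available runs the loop over the collected rows. The two helpers only rely on positional indexing, so they also work on plain tuples.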