Hello, welcome to SSUnitex, and this is a continuation of the PySpark tutorial. In this video, we are going to look at the ArrayType data type and the array and array_contains functions in PySpark. So today's agenda is: first we'll see ArrayType, then the array function, and last array_contains.

So what is ArrayType? ArrayType is one of the data types in PySpark. If a column holds array data, then the data type of that column should be ArrayType. Next is array. The array function is used to create a new array column by merging the data from multiple columns. Let's assume we have two different columns and we want to combine them into a new column of array type. Then we can use the array function. Next is array_contains. As the name suggests, if your array contains the value you are checking for, this function returns true; otherwise it returns false.

So let me quickly go to the browser and we'll see this in practice. Here, as you can see, first I'm going to import a few of the data types: StringType, ArrayType, since we want to include that data type as well, StructType and StructField. Next we have this data. It has three rows in total, and here we can see the skills. In the skills column we have values like ADF, Scala and PySpark. Then we can see the work profile, another column, which here has PySpark and ADF. Then we have the current state and the previous state. So this is the data we have, and now we want to create the schema. StructType, StructField and all those things we have already covered in detail in the previous videos. Here we want to create a new DataFrame from this data. The first column will be name, and its data type will be string.
Next we can see skills, and its data type will be ArrayType. And what is the inner data type, that is, the data type of the values we want to store inside the array? That is StringType. Next, work profile is again an ArrayType of strings, and for current state and previous state the data type is string. To create the new DataFrame, we simply pass the data and the schema. Let me execute and we'll see the output.

So now the DataFrame is created successfully. Here we can see name, skills, work profile, current state and previous state. In skills we can see it has array data. We can also verify the array type here: at index 0 we have ADF, at index 1 we have Scala, and at index 2 we have PySpark. Similarly, for work profile we can see it is again an array of strings, with PySpark at index 0 and ADF at index 1. So that is the DataFrame we have created, and this is where we have used ArrayType.

Next, what is array? To use the array function, I am going to add one more column to this DataFrame. That column will be the combination of current state and previous state, and it will be of array type. So how can we do that? For adding a new column to an existing DataFrame, we need to use withColumn. Here I am going to specify states, so this will be the column name. The first parameter is the column name, and the second parameter is the expression you want to use. I am going to use array. We have to import this array function first. For that we can use from pyspark.sql.functions import, and I am going to use an asterisk to import everything. So here we can use array, and inside the array function we need to specify the columns. The first column will be the current state column from df, and the second column will be the previous state column from df.
Let me put this in another DataFrame, df1, and try to see the output of df1. Let me execute and we will see the output. It should add one more column called states. Okay, this failed because something was typed incorrectly. Now we can see previous state is right, so let me execute again. Here we can see it has added one more column, and its data type is array. We can verify that here, or we can expand this and see the data type. For this states column the data type is array, and the elements inside it are strings. So we have seen how we can use the array function to generate an array column by combining two or more columns.

Next we can use array_contains. What does array_contains do? It helps us check whether a particular string is present in an array column or not. So how can we do that? Here I am going to take df and add a new column, and this column will be called array contains. So this will be the column name. Next we can use the array_contains function. It first asks for the column. I want to check the skills column, so I am going to pass the skills column. The second parameter is the value we want to check for, and I want to check for ADF. If skills contains ADF, we get true; otherwise we get false. Let me put this in df2, so this is DataFrame 2. Let me execute it, and here we can see the output.

Here we can see it has true, true and false. Why is it false? Let me expand skills and make it a little bigger. Here we can see we don't have ADF, so that's why the array_contains value is false. Here we have ADF, and here we have ADF, and that's why array_contains is true for those rows. So I hope you have understood how we can use ArrayType, array and array_contains.
So thank you so much for watching this video. If you liked this video, please subscribe to our channel to get many more videos. See you in the next video.