Hello, welcome to SS Unitech, Sushil this side, and this is the first video of PySpark interview questions and answers. Recently one of my viewers attended an interview at KPMG, and this question was asked there. I got an email from that viewer, and he says we need to write a PySpark query that uses the below input to get the below output.

As we can see, the input has two columns in total. The first is name and the second is hobbies, and hobbies holds comma-separated values, like badminton comma tennis. Similarly, for Bob we have tennis comma cricket, and for Julie cricket comma carrom. In the output we just want badminton, then tennis, similarly tennis, then cricket, and carrom. So all these comma-separated values should be converted into multiple rows. How can we do that in PySpark? Let me quickly go to the browser and we'll see it in practice.

Here I have created the data and the columns, and I am creating the data frame. Let me execute this, and we should see the same input data that we saw earlier: two columns, name and hobbies, with the same data.

Now the first thing: how can we convert this into multiple rows? We can do it with the explode function, but explode takes an array type or map type column as its input, and here hobbies is a plain string. So first we have to convert it into an array type. How can we do that? For that we can use the split function. What the split function will do is take these two values as two separate values and convert them into an array type. Let me write the query so you will be able to understand.

Here we have created the df data frame, so let me use select, and I just want to select two columns: the first is name, and the second is the column on which we want to do the transformation. As I told you, we have to use the split function there. So first we have to import the split function.
So from pyspark.sql.functions we import split. At the same time let me import explode as well. Now I am going to use the split function. First, split asks for the column, and the column is df.hobbies. The second parameter is the pattern, which here is a comma, because the values are separated by commas. Now let me use display here, since we just want to see the output of the split function. It should have two columns, and we can see the second column now holds the values as an array: at index zero it is badminton and at index one it is tennis. That's good.

Now we can use the explode function to convert these array values into multiple rows. How can we do that? We simply wrap the split in the explode function, close the bracket, and execute it. It should convert the arrays into rows like this. The output is as expected, but the column name is not correct. So what can we do? We can use alias. The alias should be applied to the exploded column in this select, and the alias name will be hobbies. Let me execute this again and we'll see the output. Now we can see the same output that we saw in the slide.

So first we use the split function to convert the string values into an array, then we use the explode function to convert the array into multiple rows. I hope you have understood how we can write this transformation query. Thank you so much for watching this video. See you in the next video.