Hello, and welcome to SSUnitex. Sushil this side, and this is a continuation of the PySpark interview question and answer series. In the previous video, we saw one of the interview questions that was asked at KPMG. Today, we are going to see one more question that was asked in the same interview. One of my viewers sent me these questions by email. We already discussed the first question in the previous video of this series, so if you haven't watched it, please watch that video. I'll provide the link in the description of this video.

Now, we are talking about the second question. We are required to write a PySpark query. The input has three columns, CT1, CT2 and CT3, and in the output we are required to have the first not-null value, reading from the left side. As we can see in the first row, Goa is coming first and AP is coming last from the left side, so Goa will be in the output. Similarly, in the second row AP is in CT2, and we can see AP in the output. In the third row we have null, then blank, and then Bangalore, so we should be getting Bangalore. So we are ignoring the nulls and the blanks: the first value that is neither null nor blank comes in the output. How can we write this query to get that output?

Let me quickly go to the browser and we'll see it in practice. I have already created this data frame, and it has the same data that we saw in the slide. It has three columns, CT1, CT2 and CT3, with the same values. Here we can use the coalesce function to get that output. So how do we use coalesce and write that query? First, if we want to use any SQL function, we have to import it. For importing, we write from pyspark.sql.functions import, and then either we go with an asterisk or we specify the function names. I am going with the asterisk.
Now, here we have the data frame, df. I am going to add one more column to it, so we can go with withColumn. Inside the brackets, the first parameter is the column name, and I am going to call this first not null. The second parameter is the actual expression. As I told you, we can use the coalesce function here and pass the columns to it as parameters: the first parameter is CT1, the second is CT2, and the third is CT3. So what is coalesce doing? It is responsible for picking the first not-null value from the left side. For the first row, the first not-null value will be Goa. For the second row, the first not-null value will be the blank. Let me execute this and show you the output. Before executing, let me assign the result to a new data frame, df1.

Here in the output, as we can see, the first row is Goa, which is perfectly fine. But for the second row we are expecting AP, and instead we are getting a blank. That is because CT1 in that row is blank, and a blank is not null, so coalesce, which always returns the first not-null value, returns the blank.

So what do we have to do? We have to check whether each CT value is blank, and if it is blank, we mark it as null. How can we do that? We can use the when condition. The condition is that the value is equal to a blank string, and in that case we specify None, which is treated as null. If that is not the case, then in otherwise we pick the same column, that is CT1. We are required to do the same thing for the others. So let me align this, and let me do the same thing for CT2 and CT3 as well.
So, let me put a comma and add the same expression for CT2 and for CT3. Here it should be CT2 and here it should be CT3, both in the condition and in the otherwise. Now, let me execute and we'll see the output. Here we can see that we have the expected output. We can see CT1, CT2 and CT3 as the input, and in the output we can see Goa, AP and Bangalore.

So what is it doing? Let me recap this query. We are adding one more column to the existing data frame, and we are using the coalesce function to get the first not-null value. Inside it, we first check whether the CT1 value is blank. If it is, we mark it as null; if it's not blank, we return the CT1 value. Similarly for CT2 and CT3. After that, we just display the data frame, and we can see the same output. This is the simplest way to write the query and achieve this output.

I hope you have understood how to write the query to get this output. Thank you so much for watching this video. See you in the next video.