Hello, and welcome to SSUnitex. This is a continuation of the PySpark interview questions and answers series. In this video we are going to see how we can process files that were received before or after a specified time. That means we specify a time, and we want to process only the files that were received before that time, or only the files that were received after it. So while reading the data, we need to check whether each file was received before or after that time.
PySpark gives us two options for this: the first is modifiedBefore and the second is modifiedAfter. When do we use each of them? The modifiedBefore option reads only the files that were modified before the specified timestamp, and the modifiedAfter option reads only the files that were modified after the specified timestamp. So let's assume I specify today's date: we can process only the files received before today, or only the files received after today's date. Both options are available directly in PySpark.
Let me quickly go to the browser and we will try this in practice. Here we are in the input container, and in it we have two sales files, sales 0101 and sales 0102. These are the two CSV files we want to process. We can see that one file was received on 29th Jan 2024 and the other on 30th Jan 2024. So our requirement is: first, process only the file that was received before the 30th, and second, process only the file that was received after the 29th.
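As a rough mental model of the two options described above, here is a plain-Python sketch that filters a set of files by modification time. It is not how Spark implements the options, only an illustration of the idea. The dictionary keys and the times of day are made up; only the two dates (29th and 30th Jan 2024) come from the walkthrough.

```python
from datetime import datetime

# Hypothetical modification times for the two sales files.
# Only the dates are from the walkthrough; the times of day are invented.
files = {
    "file received 29 Jan": datetime(2024, 1, 29, 10, 15),
    "file received 30 Jan": datetime(2024, 1, 30, 9, 40),
}


def modified_before(files, cutoff):
    """Names of files whose modification time is earlier than the cutoff."""
    return sorted(name for name, mtime in files.items() if mtime < cutoff)


def modified_after(files, cutoff):
    """Names of files whose modification time is later than the cutoff."""
    return sorted(name for name, mtime in files.items() if mtime > cutoff)


# Midnight at the start of 30th Jan separates the two files.
cutoff = datetime(2024, 1, 30, 0, 0, 0)
print(modified_before(files, cutoff))  # ['file received 29 Jan']
print(modified_after(files, cutoff))   # ['file received 30 Jan']
```

With a single cutoff at midnight on the 30th, "before" picks up the file from the 29th and "after" picks up the file from the 30th, which is the behaviour the requirement asks for.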
To read the data from Blob Storage, we can use the spark.read method. I first typed df.read, but it should be spark, because we are reading: spark has the read method, and then we chain the option and the format. We specify the header option because the files have a header row, so its value should be true. After that we specify which format we want to read, which is CSV, and then the path. The path is very straightforward: we have already created a mount point, mnt, under that we have the input folder, and under the input folder we pick the files we want to process, which are the files whose names start with sales, followed by anything. Let me put the result into another DataFrame, df1, and call display on df1. Let me execute this cell and we will see the output. Each file has 799 rows, so here we get 1,598 rows, because the rows from both files have been combined.
Now let us come to today's requirement, which is to use modifiedBefore and modifiedAfter. We add one more option, which can be either modifiedAfter or modifiedBefore. Let me go with modifiedBefore first, and here we specify the date and time. We have already seen in Blob Storage that one file was received on 29th Jan 2024, along with its timestamp. So let me quickly specify the time here: the date is 2024-01-30 and the time part is 00:00:00. So whichever file was received on the 29th should be processed. Let me execute this cell and we will see the output. It should have 799 rows in total, and we can see that it now has exactly 799 rows, all read from a single file.
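The read described above can be sketched roughly as follows. This is a minimal sketch, not the exact notebook cell from the video: the helper name read_sales and the path pattern /mnt/input/sales* are assumptions, and spark is expected to be the SparkSession that a Databricks notebook already provides. The option names modifiedBefore and modifiedAfter are the real Spark file-source options, and they take a timestamp of the form YYYY-MM-DDTHH:mm:ss.

```python
def read_sales(spark, path, modified_before=None, modified_after=None):
    """Read CSV files with a header row, optionally filtered by modification time.

    `spark` is the active SparkSession (predefined in a Databricks notebook).
    Timestamps are strings such as "2024-01-30T00:00:00".
    """
    reader = spark.read.format("csv").option("header", "true")
    if modified_before is not None:
        # Keep only files modified before this timestamp.
        reader = reader.option("modifiedBefore", modified_before)
    if modified_after is not None:
        # Keep only files modified after this timestamp.
        reader = reader.option("modifiedAfter", modified_after)
    return reader.load(path)


# Assumed path pattern: mount point "mnt", folder "input", names starting with "sales".
SALES_PATH = "/mnt/input/sales*"
CUTOFF = "2024-01-30T00:00:00"

# In the notebook the calls would look like this:
# df1 = read_sales(spark, SALES_PATH)                           # both files
# df1 = read_sales(spark, SALES_PATH, modified_before=CUTOFF)   # file from the 29th
# df1 = read_sales(spark, SALES_PATH, modified_after=CUTOFF)    # file from the 30th
# display(df1)
```

Both options can also be passed together to select a time window. To my knowledge they apply to batch reads of file-based sources in Spark 3.1 and later, and the timestamp is interpreted in the Spark session time zone unless a timeZone option is supplied.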
The 799 rows we just read with modifiedBefore come from the file that was received on the 29th. Now, if we use modifiedAfter here instead, it will process the file that was received on the 30th. Let me execute this cell: it again returns 799 rows, but this time the data comes from the file received on 30th Jan. So these are the two options we can use, modifiedAfter and modifiedBefore.
In real projects, I have seen requirements where we need to process only the files received today. Let's assume we have scheduled a job that runs at 6 pm every day, and the requirement is that whatever files were received on that day should be processed. In those scenarios we can go with modifiedAfter, passing today's date with the time part set to 00:00:00, so that it processes only the data received on that particular day.
I hope you have understood when to use modifiedBefore and modifiedAfter, and that you can explain it to the interviewer. Thank you so much for watching this video. See you in the next video.
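For the daily 6 pm job scenario, the cutoff does not have to be typed by hand. One possible way to build it, using only the Python standard library, is sketched below. The helper name start_of_day_cutoff is hypothetical, and the sketch assumes the clock of the machine running the code is in the same time zone that Spark uses to interpret the timestamp.

```python
from datetime import datetime, time


def start_of_day_cutoff(now=None):
    """Return midnight of the current day as 'YYYY-MM-DDTHH:mm:ss'."""
    now = now or datetime.now()
    midnight = datetime.combine(now.date(), time.min)
    return midnight.strftime("%Y-%m-%dT%H:%M:%S")


# Example: a job running at 6 pm on 30th Jan 2024.
cutoff = start_of_day_cutoff(datetime(2024, 1, 30, 18, 0))
print(cutoff)  # 2024-01-30T00:00:00

# The string can then be passed to the reader, for example:
# spark.read.format("csv").option("header", "true") \
#     .option("modifiedAfter", cutoff).load("/mnt/input/sales*")  # assumed path
```

If the job's clock and the Spark session time zone differ, files received close to midnight could land on the wrong side of the cutoff, so it is worth checking both settings before relying on this.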