Hello, welcome to SSUnitex. This video is a continuation of the PySpark tutorial series. In this video, we are going to see how we can read data from a CSV file and load that data into a parquet file. Before going forward, if you haven't watched the last video of this series, I would strongly recommend watching it. The first ask: we need to read the data from the CSV file and load it into a data frame. The second: we need to take the data from the data frame and load it into a parquet file. And the last: we read the data back from the parquet file and load it into a data frame. So these are the three things we have to do. Let's start with the first one, reading the data from the CSV file and loading it into a data frame. Here inside the browser, we are under the blob storage of the input container, and under the sales folder we have the sales.csv file. So we have to read the data from this CSV file and load it into a data frame. Now we are inside the Databricks workspace, under this notebook. The first thing to check: the cluster should be up and running, and it is running here. The second thing: we should have a mount point making the connectivity to your blob storage. A mount point is nothing but the bridge between the notebook and your blob storage. First, let me check how many available mount points there are. As we have already created the mount point for the input location, we can use dbutils.fs.mounts(). This command shows all the available mount points, and here we can see the mount point we created for the input location. Now we need to read the data from the CSV file. As we saw in the last video, for reading data from a CSV file we start with spark followed by a dot.
We can press Ctrl+Space so that all the available methods pop up. On spark we use read, because we are reading data. Then we can either specify the format, or use an option if the file has something special; in our case the file has a header, so under option we specify header as true. Then, since we are reading from a CSV file, we use csv, and here we need to specify the path. As we created the mount point with /mnt, then input, then we can see we have the folder named sales. This is case sensitive too, so make sure you are typing the names correctly, and then the file name, which is sales.csv. Let me put this into a data frame; I'm going to call it df_csv. Now it is executing, and the data frame will hold the values. Let me check whether we were able to load the data into this data frame. We can use the display command and execute it; it should display all the values from the sales.csv file. We can see the sales order ID, sales order date, item code, name, quantity and value, and we have 799 rows in total. That looks okay. So the first ask is done: reading the data from the CSV file and loading it into a data frame. The second ask: we need to load the data into a parquet file. For writing the data out from the data frame we can use the write method. Then, again, we can use an option; this option will make sure the file contains headers, so we again specify header as true. And last, we are going to load the data into a parquet file.
So we can simply use parquet and specify the path. I want to keep the file inside the output location, so we can directly use the output mount and then specify sales. Let me quickly go to the output container; as we can see, there should not be any folder named sales yet. Now let me execute it; it will load the data from the data frame into a parquet file. The job is completed, so we can go back and refresh. We should now see the sales folder, and under it the parquet file. If I make it bigger, we can see this parquet file. The parquet file name is generated at runtime, so we have no control over that; we can only specify the folder path, and all the data will be available under that parquet file. Now the second ask is completed: we have loaded the data into a parquet file. The last ask: we need to read the data back from this parquet file. How can we read it? Again we use spark, then the read method, then a dot. Here I'll again set the header option to true (parquet files store their own schema, so this option is not actually required on the read side). Then, since we are reading from a parquet file, we use parquet and specify that particular location; we can copy and paste it here. Let me put this into a data frame; I'll call it df_par, for parquet. The command executed successfully. Let me check how many records there are; it should be 799. Let me execute and we will see. It executed, and we can see 799 rows here. I hope, guys, you have understood how we can achieve this. Thank you so much for watching this video. See you in the next one.