Hello and welcome to this video lecture. Today we shall see how to handle missing data using the pandas library. By the end of this session, you will be able to handle missing data using pandas library routines: how to detect missing data, how to filter it out, and how to fill in values that are missing from a CSV file. Before moving ahead, I strongly encourage all of you to refer to the pandas library documentation for some basic knowledge of pandas, which will help us go smoothly through this lecture.
So let us talk about missing values. Whenever we use a file, say a CSV file, in which our data is stored, there is a chance that some values are absent. For example, a sensor that was supposed to record a reading may not have sensed anything, so the value it was supposed to capture went missing. Many a time, then, values in some rows or columns are missing. To detect such missing values we have two functions: DataFrame.isna(), which detects missing values in a data frame, and DataFrame.isnull(), which is an alias that does the same work. We shall see an example of both these routines.
Say I have a file named data.csv and I want to read it using the function read_csv(), which is defined in the pandas library. Once I print the data frame df that holds the data returned by read_csv(), you can see five rows with the columns serial number, name, age, telephone number and city. As you can see, in the first row the age of Wiki is not given, and it is shown as NaN, that is, not a number. The age of Kaushal is also shown as NaN in the data. Similarly, the telephone number of Puja, in row number 3, is missing, and the city of Puja is also missing.
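To make the reading step concrete, here is a minimal sketch. The actual data.csv is not available in this transcript, so the code reads a stand-in from an in-memory string. The column names sno, tel and the others, the fifth person's name, and every value not mentioned in the lecture are assumptions made for illustration only.

```python
import io

import pandas as pd

# Stand-in for data.csv. Only the names Wiki, Nithin, Puja and Kaushal,
# Nithin's age (20) and city (Nashik), and the positions of the missing
# values come from the lecture; everything else is made up.
CSV_TEXT = """sno,name,age,tel,city
1,Wiki,,9876500001,Pune
2,Nithin,20,9876500002,Nashik
3,Puja,22,,
4,Kaushal,,9876500004,Mumbai
5,Asha,25,,Solapur
"""

# read_csv turns empty fields into NaN automatically.
df = pd.read_csv(io.StringIO(CSV_TEXT))
print(df)
```

With a real file on disk, the call would simply be pd.read_csv("data.csv").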
Many a time, when you are entering data into a database, you may not have, say, a landline telephone number, so you do not enter anything in that field and the value goes missing. So we need some way to handle this missing data, and the isna() method is where we start. When I execute df.isna(), this is the output that I get. Every place where I get True is a place where a value is missing. You can see that the age of Wiki is shown as True, which tells us that the value is missing. Likewise, the age of Kaushal is shown as True, so it is a missing value. This is how the isna() function helps us detect missing values. The isnull() function works along similar lines: when I run it, we get the same kind of output, where True tells us that the value is missing in the data file.
Moving ahead, isna().any() is a routine that returns True for a column if there is any missing value in that column. So after the data frame is printed, the output of .any() tells us, for each column from serial number, name and age through to city, whether it has null values. If you find True against a column, at least one value is missing in that column.
Now let us see how to find the total number of missing values in a particular column. The isna().sum() routine gives us the number of missing values in each column. In the age column two values are missing, so it shows 2. Similarly, two values are missing in telephone number and one value is missing in city.
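The detection routines above can be sketched as follows. The data frame is the same illustrative stand-in for data.csv, with assumed column names and made-up values wherever the lecture does not state them.

```python
import io

import pandas as pd

# Illustrative stand-in for data.csv (column names and most values assumed).
CSV_TEXT = """sno,name,age,tel,city
1,Wiki,,9876500001,Pune
2,Nithin,20,9876500002,Nashik
3,Puja,22,,
4,Kaushal,,9876500004,Mumbai
5,Asha,25,,Solapur
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

# True marks every cell whose value is missing.
missing_mask = df.isna()
print(missing_mask)

# isnull() is an alias of isna(), so both masks are identical.
same_mask = df.isnull().equals(missing_mask)
print(same_mask)

# One True/False per column: does the column contain any missing value?
has_missing = df.isna().any()
print(has_missing)

# One count per column: how many values are missing?
missing_counts = df.isna().sum()
print(missing_counts)
```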
We have seen how to find the missing values. Now we will see two very popular and common ways of handling them. One way is to drop these values using the dropna() function; the other is to replace the missing values using the fillna() function. Let us see how both these functions work.
The dropna() function has the parameters given on the screen. The axis parameter takes 0 or 'index' for rows and 1 or 'columns' for columns; the default is 0, that is, rows. The how parameter takes two values: with how='any', a row or column is dropped if any of its values is missing, and with how='all', a row or column is dropped only if all of its values are missing. We will see how it works in a short while.
Now let us take the same example. I just call df.dropna(), so every row that has a missing value is dropped, and we can see that only one row is printed: the row with index 1, having serial number 2, Nithin, age 20, the telephone number as given and the city Nashik. All the other rows had a missing value, and that is why they were dropped. You can see that the row with index 0 was dropped because the age was missing. The row with index 1 is printed. Row number 3 was dropped because its telephone number and city were missing, and similarly with rows 4 and 5. So only row number 2, having index 1, is printed, because it has all its values present.
When we pass 1 as the axis parameter, it gives us all the columns where every value is present. Any column with even a single missing value is dropped, and every column with all its values present is printed.
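A small sketch of the dropna() behaviour described above, using the same illustrative stand-in for data.csv (column names and unstated values are assumptions). It also shows how='all', which drops a row or column only when every value in it is missing.

```python
import io

import pandas as pd

# Illustrative stand-in for data.csv (column names and most values assumed).
CSV_TEXT = """sno,name,age,tel,city
1,Wiki,,9876500001,Pune
2,Nithin,20,9876500002,Nashik
3,Puja,22,,
4,Kaushal,,9876500004,Mumbai
5,Asha,25,,Solapur
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Default: axis=0, how='any'. Drop every row with at least one missing value.
complete_rows = df.dropna()
print(complete_rows)

# axis=1: drop every column with at least one missing value.
complete_cols = df.dropna(axis=1)
print(complete_cols)

# how='all': drop only rows (or columns) that are entirely missing.
# No row or column here is entirely empty, so nothing is dropped.
rows_not_all_missing = df.dropna(axis=0, how="all")
cols_not_all_missing = df.dropna(axis=1, how="all")
print(rows_not_all_missing.shape, cols_not_all_missing.shape)
```

Note that dropna() returns a new data frame and leaves df itself unchanged.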
Moving ahead, we have passed axis 0 and how='any'. Here we are checking the rows, and only the row having all its values present is printed; all the other rows are dropped. If we change the axis from 0 to 1 and keep how='any', it prints all the columns where all the data is present and leaves out the columns where data is missing. When the axis is 1 and we set how='all', a column is dropped only if every value in it is missing. No column in our data is entirely empty, so all the data is printed, missing values included. It is the same when the axis is changed from 1 to 0: with how='all', no row is entirely empty, so all the rows are printed whether or not they contain missing values.
Now let us talk about the fillna() function. These are its parameters. value is what we want to put in place of the missing values; for example, I can fill in 0 wherever a value is missing. method is the way to fill the missing values, and the options are backfill or bfill, and pad or ffill. axis is, as we have seen earlier, 0 or 1, for rows or columns.
Let us quickly see examples of the fillna() function. We are continuing with the same example. When I call fillna() with 50, every place where the data was missing is replaced with 50. You can see that the age of Wiki was missing in our data frame, but it has been replaced with 50.
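The filling options can be sketched as below, again on the illustrative stand-in for data.csv (column names and unstated values are assumptions). Two choices here differ from what is shown on screen and are my own: the constant 50 is applied only to the numeric columns, and a per-column dictionary with made-up fill values is shown as an alternative. Backfill and forward fill use df.bfill() and df.ffill(), because recent pandas releases deprecate the method argument of fillna().

```python
import io

import pandas as pd

# Illustrative stand-in for data.csv (column names and most values assumed).
CSV_TEXT = """sno,name,age,tel,city
1,Wiki,,9876500001,Pune
2,Nithin,20,9876500002,Nashik
3,Puja,22,,
4,Kaushal,,9876500004,Mumbai
5,Asha,25,,Solapur
"""
df = pd.read_csv(io.StringIO(CSV_TEXT))

# Constant fill, restricted here to the numeric columns.
numeric_filled = df[["age", "tel"]].fillna(50)
print(numeric_filled)

# A different fill value per column (the values are hypothetical).
per_column = df.fillna({"age": 50, "tel": 0, "city": "Unknown"})
print(per_column)

# Backfill: each missing value takes the value from the next row.
# A missing value in the last row stays missing, as there is no next row.
back_filled = df.bfill()
print(back_filled)

# Forward fill: each missing value takes the value from the previous row.
# A missing value in the first row stays missing.
forward_filled = df.ffill()
print(forward_filled)
```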
Similarly, if I use method='backfill', it takes the value from the next row. You can see that the age of Nithin is 20 but the age of Wiki is missing, so with backfill the age of Wiki also becomes 20. Similarly, the city of Puja was missing, and it has been taken from the city of Kaushal, which is the next row. So if you use method='backfill', each missing value is filled with the value from the next row.
At this point I want you to pause the video and answer this question: what is the difference between the dropna() and fillna() functions? The difference is that dropna() is used to drop the rows or columns that contain missing values, whereas fillna() is used to replace the missing values, either with the next or the previous value or with a particular value that the user chooses.
These are the references. Thank you very much.