Students, welcome to another very important area in machine learning: dimensionality reduction. You will enjoy it, because it overlaps with what we covered when we discussed data completeness and data wrangling, so I am sure you will be able to understand it easily and quickly. Some purely mathematical things are involved, but I will try to make them easy for you.

Any big data set has one notable characteristic. When the data is generated, the analysis or the decision to be made on it is normally not in the minds of the software developers, architects, or solution designers. That is why it contains a lot of data that is not needed for analysis or decision making. At the same time, decision makers belong to different areas or functions within the same organization. As we have discussed multiple times, marketing has its own action plan, and so do product development, manufacturing, finance, and budgeting, so everyone has their own interest in the same data set that is available within the organization.

You have to store and process this data before you can analyze it. That needs storage, it needs a lot of processing capacity, and it consumes time, and there are other costs which you will learn about with the passage of time. So this reduction is very, very critical. When you study and design different statistical models or business problems, you realize where you have to reduce the dimensions and what you have to do.

You will remember one thing we discussed in data wrangling: missing values and zero values. If more than 50% of the values, or more than some chosen threshold, are missing or zero, or if the data does not fulfill the parameters of quality, then you remove it from the data on which you have to do the analysis. That logic can be applicable here as well, but now we will apply it using different mathematical and statistical functions. You will enjoy seeing how the things you have studied in statistics or math apply to real-life problems, and we will use them to reduce the data. The amazing thing is that you reduce the feature set in the overall data, but the quality of the result you derive from it is not compromised.

So there are different methods, and in this topic we will study what information is in the rows and what information is in the columns, and I will share the remaining parameters with you: how to reduce the dimensions and how to manage them. Reduction does not have to mean removing a column. You can also prepare a combination of columns and then drop the rest of the columns; that is how it is going to work.

In the next module I will share more with you. First of all you need to understand the concept, and I am sure we have already established it; now we will take it further.
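
To make the two ideas from this lecture concrete, here is a minimal sketch in plain Python. It assumes a tiny made-up table; the column names, the 50% threshold, and the helper functions `drop_sparse_columns` and `combine_columns` are all hypothetical and only for illustration. The first step drops any column in which more than the threshold share of values is missing or zero. The second step replaces two columns that carry the same information with one combined column, here simply the average of their standardized values. That average is a stand-in chosen for simplicity, not necessarily the method the later modules use.

```python
# Toy data: each key is a column (feature); each list holds one value per row.
# All names and numbers are invented for illustration.
data = {
    "height_cm": [170.0, 165.0, 180.0, 175.0],
    "height_in": [66.9, 65.0, 70.9, 68.9],   # same information as height_cm
    "promo_code": [0.0, None, 0.0, 12.0],    # mostly missing or zero
    "spend": [120.0, 80.0, 150.0, 110.0],
}


def sparse_fraction(values):
    """Fraction of entries that are missing (None) or zero."""
    empty = sum(1 for v in values if v is None or v == 0)
    return empty / len(values)


def drop_sparse_columns(table, threshold=0.5):
    """Keep only columns whose missing-or-zero fraction is at most `threshold`."""
    return {name: col for name, col in table.items()
            if sparse_fraction(col) <= threshold}


def standardize(values):
    """Rescale a column to mean 0 and standard deviation 1."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / std for v in values]


def combine_columns(table, names, new_name):
    """Replace the columns in `names` with the average of their standardized values."""
    scaled = [standardize(table[n]) for n in names]
    combined = [sum(row) / len(row) for row in zip(*scaled)]
    reduced = {k: v for k, v in table.items() if k not in names}
    reduced[new_name] = combined
    return reduced


# Step 1: remove columns that fail the quality threshold.
filtered = drop_sparse_columns(data, threshold=0.5)

# Step 2: combine redundant columns into one and drop the originals.
reduced = combine_columns(filtered, ["height_cm", "height_in"], "height")

print(list(reduced))
```

The table starts with four columns and ends with two, yet the height information is still present in the combined column. Choosing which columns to combine, and with what weights, is the part that the mathematical methods are meant to handle.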