Students, when I gave you the example of data wrangling, we looked at dimensions and measures. Here we will go through a few more examples of dimensions and measures and see how they help you. In that example I took you right from the source system where your data comes from, through pre-processing, and then transformation, enrichment, completeness and quality control. All of those parameters are part of it. There is a different angle here for understanding the same thing, and that is why I am also using the concept of wrangling while we discuss reduction. If you take the two together, it will be easier for you.

In many situations we have worked with an SQL database. When you have SQL databases, you handle the reduction in one way; if you have image data, voice data, IoT data or sensor data, you handle it in a different way. Nowadays people talk a lot about NoSQL and about unstructured data. At the same time, for business decision makers it is very important that you do not ignore SQL, because many business applications all have structured databases. So it is necessary to understand both, and that is why I am referring again to the concept of wrangling for the structured data you have to handle.

For unstructured data there are some components that we will use, especially the statistical or algebraic algorithms, and they help you a lot when you have voice, images or other data of this type. But you have to keep in mind that structured data is also there, and as a data scientist in many situations you have to handle that data too. Normally you use the simplest techniques, the ones we used in wrangling, and you apply them based on your judgment.

So basically, the
first thing is that when you have a data set, you prepare something called a data dictionary, also called metadata. That is basically information about the data: which columns you have, what information each one holds, its description, its source and its objective. You prepare a whole document for this. In analytics, not just normally but I would say in each and every project, when you analyze data it is part of the data scientist's responsibility to prepare the data dictionary. Maybe one of your team members will do it, and you will develop a common understanding of the project's target or source data set, its different dimensions and measures, and their definitions.

The concept of the cube is basically that you can look at one set of data from six different dimensions. When you denormalize it, or look at it through SQL, or link it to different tables, you can have many more dimensions, and that is where the star schema and snowflake concepts come into the picture.

Now this is a data set with name, role, salary and age. These are the parameters; you can consider them your dimensions, and the values in the rows are your data. When you perform a calculation on them, the result becomes your measure: for example, if you have to calculate salary brackets, or compare age group and salary, or link them with experience. Then there is a city, a state and a joining date. It is just a sample data set, so when you look at it, you work out what you want to do with it.

In this data set we can see that we have numbers and text. Broadly speaking, the numbers are your measures, and the text values are your labels, which are your dimensions. The date is both a number and a label, because based on it you can calculate someone's age. So this is the concept, and the rest you have are labels; this is
the text. Now gender, we know, is a quality or attribute that you record as a category, such as yes or no, or male or female; we cannot measure it in terms of numbers. Similarly, there is the name and the role. I have already told you about the date: it is both a number and a label. And this is the text.

So you can see how you work with it. If you just have to see how many people in your team or in your data come from a specific area, you can simply select the city; you do not have to select the state or the gender. But if you have to decide something related to facilities or, for example, leave that is applicable only to female employees and not to male employees, then you do need the gender. There are many more things like this: you analyze salary brackets, you analyze different things, and you choose accordingly.

What we have understood is that, based on your business problem, you include or exclude different dimensions and measures from your data set. Accordingly, you plan the dimension reduction and then you execute it.
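
To make the idea concrete, here is a minimal sketch in plain Python of what was just described: a small data dictionary that marks each column as a dimension or a measure, a headcount that keeps only the city column, a salary bracket calculation, and an age derived from a date. The rows, the column names (including `birth_date`, which is an assumption, since the lecture only says "date"), the bracket width and the helper functions are all hypothetical and exist only for illustration.

```python
from collections import Counter
from datetime import date

# Hypothetical sample rows; every value here is made up for illustration.
EMPLOYEES = [
    {"name": "A", "role": "Analyst", "gender": "F", "city": "Springfield",
     "state": "X", "salary": 42000, "birth_date": date(1990, 5, 17)},
    {"name": "B", "role": "Engineer", "gender": "M", "city": "Springfield",
     "state": "X", "salary": 58000, "birth_date": date(1985, 11, 2)},
    {"name": "C", "role": "Manager", "gender": "F", "city": "Riverton",
     "state": "Y", "salary": 75000, "birth_date": date(1979, 1, 30)},
    {"name": "D", "role": "Analyst", "gender": "M", "city": "Springfield",
     "state": "X", "salary": 39000, "birth_date": date(1995, 8, 9)},
]

# A tiny data dictionary (metadata): text columns are labels (dimensions),
# numeric columns are measures, and the date can serve as both.
DATA_DICTIONARY = {
    "name": "dimension", "role": "dimension", "gender": "dimension",
    "city": "dimension", "state": "dimension",
    "salary": "measure", "birth_date": "dimension and measure",
}

def select_columns(rows, columns):
    """Keep only the columns the business question needs."""
    return [{c: row[c] for c in columns} for row in rows]

def headcount_by(rows, dimension):
    """Count rows per distinct value of one dimension."""
    return Counter(row[dimension] for row in rows)

def age_on(birth_date, today):
    """Whole years between birth_date and today."""
    years = today.year - birth_date.year
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

def salary_bracket(salary, width=20000):
    """Label the fixed-width bracket a salary falls into (width is assumed)."""
    low = (salary // width) * width
    return f"{low}-{low + width - 1}"

# "How many people come from each area?" needs the city column only.
city_view = select_columns(EMPLOYEES, ["city"])
headcount = headcount_by(city_view, "city")

# A derived measure: how many people fall in each salary bracket.
brackets = Counter(salary_bracket(row["salary"]) for row in EMPLOYEES)

if __name__ == "__main__":
    print(dict(headcount))
    print(dict(brackets))
```

The point of `select_columns` is only that the business question decides which dimensions stay in the data set; a different question, such as one about leave eligibility, would keep `gender` instead of `city`.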