We will start today's session on data presentation. Let us first review what we did in the first lecture, and then I will take up any questions you might have, so please hand raise if you have any questions.

In the first lecture we talked about the different types of variables and data that are possible in a measurement. I told you that you can have two types of variables: qualitative variables and quantitative variables. Qualitative variables are the ones with which you cannot associate a number, and quantitative variables are the ones with which you can associate numbers. Therefore the data can also be of two types, qualitative data or quantitative data. A variable quantity may be continuous, or it can be discontinuous, that is, it can take only integer values. So you can have basically two types of quantitative variables: continuous variables and integer variables. If you have qualitative data, you can further classify it into two classes. You will call your data nominal if there is no natural ordering in your data, and you will call your data ordinal if there is a natural ordering. This is what we briefly did in the first lecture.

So I would like to know if there are any questions from this lecture. I will randomly go to one of the centres.

So this is S1, Chauhan College of Engineering. My question is: the data we use should be useful for the subject of mathematics, but can it be useful in communication skills also, and how is it useful in communication skills?

You normally deal with data. It is not that you deal with data only in the field of science or engineering. Irrespective of the field you are working in, you will have different kinds of measurements, and whenever you have measurements you will be dealing with data. So data is not something which is restricted to one field of science or engineering.
Even if you are working in some other field which is not connected to science or engineering, you will see the different types of data that are possible. Therefore the idea was to tell all types of students and teachers that these are the different kinds of data that you would normally see when you look at the literature, at experiments, or in books or papers. Even in newspapers every day you see some of the data and charts I have discussed. So data is not only for scientists and engineers. So let us go to.

Hello, good morning sir.

Good morning.

Sir, you are talking about two types of scales of measurement, nominal and ordinal. What about the other two kinds of scales, the interval and ratio scales? As far as humanities and social sciences are concerned, we are more into using the interval scale. So will you elaborate or throw light on this kind of scale and how to use it?

If I understand your question correctly, I have told you that if you have, let us say, either continuous data or discrete data, and you have a lot of it, then you have to divide your data into different intervals or bins and count how many observations fall into a given category. In those cases you use either a bar chart or a histogram. So that is how we deal with intervals, where the data is spread over a broad range and there is a lot of data.

Okay, thank you sir.

1-3-0. Kavikal Guru.

Sir, could you please teach us the content first and then start the first half of the tutorial?

Well, I told you in the beginning what kinds of variables and data you can have. We talked about qualitative variables and quantitative variables, and we further saw that qualitative variables can be of two types, nominal and ordinal, and similarly quantitative variables can be continuous or discrete, that is, integer. So look at some of the assignments and then we will discuss.
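The classification reviewed above can be written down as a small decision rule. This is only an illustrative sketch: the names `VariableType` and `classify` and the four example variables are made up for this note and are not taken from the lecture slides.

```python
from enum import Enum


class VariableType(Enum):
    NOMINAL = "qualitative, nominal"
    ORDINAL = "qualitative, ordinal"
    DISCRETE = "quantitative, integer (discrete)"
    CONTINUOUS = "quantitative, continuous"


def classify(is_numeric, has_natural_order=False, is_whole_number=False):
    """Apply the two questions from the review: can a number be associated
    with the variable, and if not, do the categories have a natural order?"""
    if is_numeric:
        # Quantitative: counts are integers, measurements are continuous.
        return VariableType.DISCRETE if is_whole_number else VariableType.CONTINUOUS
    # Qualitative: ordinal only if the categories can be ranked.
    return VariableType.ORDINAL if has_natural_order else VariableType.NOMINAL


# Hypothetical examples, chosen only for illustration.
examples = {
    "blood group": classify(is_numeric=False),
    "shirt size (S < M < L)": classify(is_numeric=False, has_natural_order=True),
    "number of siblings": classify(is_numeric=True, is_whole_number=True),
    "body temperature": classify(is_numeric=True),
}

for name, kind in examples.items():
    print(f"{name}: {kind.value}")
```

The rule takes the answers to the two questions as inputs rather than inspecting the data itself, because whether categories have a natural order is a judgment about the variable, not something visible in the values.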
So what we are going to do is some assignments online, and we will discuss them. Here is the question on your screen. The first question is: what kind of variable is the gender of people working in an office? Is it a qualitative variable or a quantitative variable?

Quantitative.

You said quantitative. No, qualitative is the answer, because here the gender can have the possible values males and females, and these are just names. They do not have any numbers assigned to them, and therefore it is a qualitative variable.

Let us take the second question: what kind of variable is the age of students studying in a school? So let us go to many caps institute of technology.

Quantitative variable.

Yeah, very good. It is a quantitative variable, because if you look at the ages of different students they will be different, and with each person's age there is a corresponding number.

So let us go to some other institute. Could you please complete this sentence: continuous variables are usually, and you have to complete this sentence.

Continuous variables are something which are like fractions.

Right. Let me give you a hint. Continuous variables are usually in decimal points or fractions. They arise when you are, let us say, trying to measure the height of the students in a class, because if you measure the height of different students the heights will be different and they will not be whole integers. Therefore the correct word to use here is measurements: continuous variables are usually measurements. When you make measurements you will usually get continuous variables, as when you measure the height of the students in a class or the height of the trees in a garden, and so on. Let us go to the next question.
So the next question is 1D and here again you have to complete the sentence that discrete variables are usually okay. So when do you get the discrete variables okay so you have to use the proper word here to complete the sentence. Let us go to center number 1 2 0 3. Quantitative. No, so basically discrete variables means something which is not fractions okay. Now when do you get if you try to do a measurement what are the situations where you get full integer numbers. So think about this and try to complete this sentence. So like in the earlier question we saw that the correct word to use will be measurements okay continuous variables are usually measurements. So what will be the corresponding word here for 1D discrete variables are usually yeah could you please complete this sentence. Yeah sir the answer is countable. Yeah counts not countable. So discrete variables are usually counts that is the correct answer thank you so basically whenever you are trying to count the things let us say number of petals on a flower or number of students in a class whenever you are doing a counting in those situation you do not get something which is fraction okay. So the correct answer is that discrete variables are usually counts okay. Now let us look at the next question next question is when observations fall into separate distinct categories they give rise to then there is a blank and data so you have to complete this sentence. 
Sir, the answer for question E is nominal data. Is it right?

Let us see: when observations fall into separate distinct categories they give rise to nominal data. Let us look at the other answers. Let us connect to some other center.

It will be qualitative data.

It will be qualitative data, when observations fall into different categories. Let us go to some other center.

According to me it could be nominal data, or we can say that it is informational and analytical data.

So you can see the answer is here on slide number 5: qualitative data arises when observations fall into separate distinct categories. So the answer for question number 1E is that when observations fall into separate distinct categories they give rise to qualitative data.

Now let us take the next question, which is 1F: ordinal data has a, then there is a blank which you have to fill up, ordering. So what will be the correct word to use here? Let us connect to some center.

Good morning sir. It might be continuous ordering.

No, here you have to use just one single word. Look at the question: ordinal data has a, and you have to use one word here, ordering.

Proper, natural, something.

Yeah, so ordinal data has a natural ordering. If you find a natural ordering in your data then it is ordinal data, because you can arrange it in a given or specific order. So the correct word to use here is natural.

Now here again we have some questions, and for each question there are two correct answers. The first question is: what type of variable is color of eye? Here you have to choose two correct answers. So let us go to Siliguri Institute of Technology.

It is qualitative and nominal.

It will be qualitative and nominal, yeah. Why will it be qualitative, and why will it be nominal?
Because it is color of eye, it should be nominal data: it is not in an ordering system. And it is not a number, so it should be qualitative.

That is a perfect answer, very good.

So now we have the second question, although my slide says by mistake that it is the first question. The second question is: what type of variable is the weight of a person? Again there are two correct answers. So let us go to LDRP Institute of Technology.

Quantitative.

It will be quantitative, yes, that is right. And then there is one more which is correct out of these.

Ordinal.

No, it will not be ordinal. So one of your answers is correct but the second one is not, because ordinal belongs to qualitative data. Let us go to PSC College of Technology.

Quantitative and continuous.

Yeah, it will be quantitative data and it will be continuous data. The answers are correct, but why are these the correct answers?

Weight may be a fraction; it is not an integer, so it will be continuous data.

And why will it be quantitative data?

It is a number. It is not a color or anything else, it is a number, so it is quantitative.

Perfect, your answers are correct, thank you.

So now we have the third question: what type of variable is how many sheep a farmer has? Let us go to Maulana Ajat.

Options A and C are the correct answers.

Option A, yeah: integer and quantitative. If you count sheep you will get only whole numbers, and therefore it is an integer variable, and it is quantitative because you can associate numbers with it.

So now let us take the next question: what type of variable is satisfaction level, if the survey involves the levels very dissatisfied, dissatisfied, neutral, satisfied, very satisfied?

The right answer would be qualitative and ordinal.

So these are the correct answers, thank you, because in
this case you cannot associate numbers with this and then you can arrange these levels different levels in a given order okay is starting from either very dissatisfied to very satisfied or the reverse of this starting from very satisfied to very dissatisfied okay so now we will quickly review the second lecture of this module data presentation module and here I had discussed how you will use different types of plot to present your data so if you have to present your discrete data you have to first count the frequency of your data and you have to plot the frequency distribution okay and normally for presenting your discrete data you will use something called bar chart okay in the bar chart on the x axis you will put the different categories and on the y axis you will put the total number of counts or frequency corresponding to each category and if you have to show the relative frequency or percentage in that case it is ideal to use something called pie chart and in a pie chart each slice represents proportion of the total whenever you have to look at the correlation between two different types of data in those situations you use something called scatter plot okay for example if you have to look at that if there is a correlation between the temperature of place and altitude of that place in that case since you would like to see if there is any connection you will use scatter plot so scatter plots are normally used to find out the correlation between the two different sets of measurements and whenever you are using a scatter plot if you find the outlier then outlier tells you something interesting so you go and study that point in detail. Now similar to discrete data if you have got a bunch of continuous data so again you can show them because it will not be ideal to see to show this much of data in a table it will be a good idea to prepare a plot corresponding to this data. 
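The counting step behind these plots can be sketched in a few lines: tallying categories for discrete data, and counting observations in equal-width bins for continuous data. This is a minimal sketch with made-up numbers; the function names are hypothetical, and equal-width bins with the maximum value placed in the last bin is one common convention assumed here, not something the lecture specifies.

```python
from collections import Counter


def frequency_table(values):
    """Count how many observations fall into each category (bar chart heights)."""
    return dict(Counter(values))


def relative_frequencies(freq):
    """Convert counts into percentages of the total (pie chart slices)."""
    total = sum(freq.values())
    return {category: 100.0 * count / total for category, count in freq.items()}


def bin_counts(values, n_bins):
    """Split the range [min, max] into n_bins equal-width bins and count
    the observations in each bin (histogram heights)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # If all values are identical the width is 0; put everything in bin 0.
        index = int((v - lo) / width) if width > 0 else 0
        # The maximum value would land one past the end, so clamp it.
        counts[min(index, n_bins - 1)] += 1
    edges = [lo + k * width for k in range(n_bins + 1)]
    return edges, counts


# Made-up discrete data: answers to a survey with categories A to D.
answers = ["A", "B", "A", "C", "B", "A", "D", "A", "C", "B"]
freq = frequency_table(answers)
share = relative_frequencies(freq)
for category in sorted(freq):
    print(f"{category} | {'#' * freq[category]} {freq[category]} ({share[category]:.0f}%)")

# Made-up continuous data: heights in cm.
heights = [150.0, 152.5, 155.0, 158.0, 160.0, 161.5, 163.0, 165.0, 170.0]
edges, counts = bin_counts(heights, n_bins=4)
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:.0f} to {right:.0f} cm | {'#' * count} {count}")
```

The text bars are only a stand-in for a real plot; the dictionaries and lists returned by the functions are what a plotting library would take as input.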
Suppose you have got a continuous data in that case you will look at the minimum and maximum of the data then you will divide the entire range into a number of bins and then you will count how many observations are falling into each of these bins and then the numbers corresponding to each bin will be called frequency and in that case you plot something called histogram so on the x axis in a histogram you will have the bin intervals or bin centers and on the y axis you will have frequencies corresponding to each bin. So based on this let us try to complete some of the exercises so first question is a discrete data obtained by counting number of observations falling into different categories can be shown using so let us go to Azaram Bapu College the first answer is bar chart yeah that is the correct answer here. Now let us look at the next question which is unfortunately again numbered as A by mistake so here there is a blank and then the question is often more appropriate for showing relative frequency or percentage. So here you have to fill up this gap by using the correct name of the plot ok so which kind of plot is often more appropriate for showing relative frequencies or percentages let us go to MET sir answer is pie chart yeah that is the right answer thank you very much. So the next question is to examine relationship between two quantitative data sets a different type of plot is used ok so you have to provide me the correct word here to examine relationship between two quantitative data set. NK Orchid College sir the answer is scatter that is the right answer thank you very much ok. So then the next question is there is a blank and then so essentially what we are asking it which type of plot can help you to find outliers. 
So what will be the correct word to use here? Let us say Coimbatore Institute.

Scatter plot can be used, sir.

Yeah, thank you very much, that is the correct answer. Let us take the next question: the frequency distribution of a continuous quantitative variable can be shown using.

The correct answer is: the frequency distribution of a continuous quantitative variable can be shown using a histogram.

Can be shown using a histogram, yeah, that is the right answer, thank you very much. Let us take the next question. In a scatter plot with dependent and independent variables, the independent variable is plotted on which axis and the dependent variable on which axis? So you have to provide me the axis names.

So the answer is: the independent variable is plotted on the x axis and the dependent variable is plotted on the y axis.

Yeah, that is the right answer, thank you very much. So I guess there are some questions from some centres, right, Henry? Let us take the last one.

Yes, the Aagraj College. Can you give us a simple explanation of the purpose of outliers?

Outliers arise when one of your observations falls completely off the trend. It is possible that you get an outlier due to some error in your measurement, or it is possible that the measurement was correct but there was something which you did not understand. So an outlier either tells you about a mistake you have made in your measurement, or it can help you to identify the problems with your measurement. That is why it is very important to look at that specific data point, the outlier, to find out what went wrong there: have I got something interesting, or was there an error in the measurement?

Thank you sir.

So in the next lecture of the data presentation module we discussed the anatomy of figures and tables. Some of the things which I told you were about how to make good figures. In order to make a good figure you have to choose appropriate axes and origin.
The origin need not be 0 all the time, and you have to make your graphs in such a way that they utilize the entire area which is available on the graph. You have to put a proper number of major tick marks and minor tick marks, and you have to use proper legends to differentiate between different measurements, for example measurements which were done by changing the parameter values. In a typical figure you will have an x axis and a y axis; you have to choose a proper origin, put proper axis labels, and put the legends. Then you show the data points using different types of symbols: if your measurements were repeated by changing a given set of parameters, you show them using different symbols so that you can distinguish between the different curves. Similarly, if you are showing, let us say, a histogram or a bar chart, sometimes you might also have error bars, and we have talked about error bars in the later lectures.

Typically in a table there is a column title for each column, and you have to use lines to separate the data contained in the table from the column titles. In the table body you put the data. If there are certain data points which require explanation, you put that into the footnotes. Captions are generally at the bottom of the table, though in some journals they can also be at the top.

One of the important things was that if you can present your data using a figure, do not use tables. If you can show your entire data using a plot rather than a table, you should use the figure, because a table is just a bunch of numbers, tables as such look ugly, and people would like to see a visual impression of your data. So it is always a good idea to use figures if you can present the same data using figures and avoid the tables.
Then we also talked about misleading graphs. If you are using fancy 3D images, let us say even a 3D pie chart in Excel, this can misrepresent your data, and therefore you should be very careful when you are using, let us say, a 3D pie chart representation. Similarly, truncating your bar chart can lead to wrong conclusions. For example, there were two graphs on the screen: on the right hand side is the correct graph and on the left hand side is the truncated graph. Although it is the same data, due to the truncation the origin is not 0: the axis starts from a higher value, very close to the end values, and therefore you see a lot of difference in the data. So sometimes truncating your bar chart can lead to wrong conclusions. Similarly, changing the ratio of the graph dimensions can also lead to misrepresentation of your data. For example, if you have an original graph and you scale the width by a factor of half and the height by a factor of 2, then your data could appear as if it has more slope; and if you scale the width by a factor of 2 and the height by a factor of half, the slope can appear to be smaller than the original slope.

So let us take some questions from this module, and then we will try to do the online assignments. If you have any questions on graphs and tables, please hand raise and we will try to take some of those questions.

What is meant by truncated chart, sir?

Truncated chart means, let us say, you are making a bar chart and your entire data had a spread from 0 to, let us say, close to 100000. Now what people sometimes do is, instead of showing the entire thing from 0 to 100000, they arbitrarily truncate their bar chart at some other point, let us say close to 80000 or something.
So this was your full bar chart. You could see that the 0 is here and then it goes up to close to 100000 or something. The origin was chosen as 0 here. But somebody truncated this entire thing, and instead of choosing the origin at 0 he started from, let us say, 90000 or something. If you start the axis from 90000 and keep the values up to the maximum, then what you have on the left side is just a small portion of the original bar chart. In the full chart, if you look at the different bars, all of them have almost the same height, and therefore if I have to draw a conclusion from this bar chart I will say that for all of these categories A, B, C, D the counts were almost identical. However, once the chart has been truncated and a different origin has been chosen, the heights of the bars for the different categories appear to be different. For example, the last one has the highest height and the first one has the lowest. So although it is the same data, it is giving you wrong information: it says that E has the most counts, or the highest frequency. Therefore you should not be truncating your bar chart. You should always start it from the proper origin, which in this case was 0.

So let us do some of the online quiz. The first question is: figures should not be used unnecessarily as they take, and what will be the correct thing to put here? Why should we not be using figures unnecessarily? Silicon Institute of Technology, why should we not be using figures unnecessarily?

Because it will take some space in the paper, and I think it is also misleading because it has no significance.
So basically if your figure has a very trivial data and you can replace your figure by just one small sentence one or two sentence and if it is a very trivial data then you should not be ideally using figure because it takes more space and whenever you are putting a figure in your let us say scientific article paper ok you will have to pay some publication charge and therefore it will cost you money. So it will be it is always a good idea to avoid the figures if you can ok. So do not use the figures unnecessarily yeah that is the correct answer thank you. So let us take the second question. Second question is what takes more space than figures. So you have to give me one word here what usually takes more space than figures. The answer is tables. Yeah tables often take more space than the figure and therefore if you can replace your tables by figure it is a good idea. So do not put the tables unnecessarily if you can present the same data by using a figure it is always a good idea to use a figure because figure gives you a visual impression and table has a lot of data and people normally do not like to see a lot of data they like to see the pictures it is the natural human tendency ok. So third question is changing ratio of graph dimensions can lead to St. Francis. Distortion of the graph. Yes it can lead to distortion on the graph and then on top of that there is also a problem. So distortion is the correct answer but you know that is not the reason you do it ok. Hacker college. Misleading graphs. Yeah it can lead to the misleading graph one of the answers was that it can lead to distortion that is correct but that is not the reason you should not be changing the ratio of the graph. You should not be changing the ratio of the graph dimensions because it can lead to a misleading graph and which could lead to the further misinterpretation of your data and therefore it is not advisable. 
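Both distortions discussed in this part of the session, rescaled graph dimensions and a truncated vertical axis, come down to simple arithmetic. The sketch below uses made-up counts for five categories and hypothetical function names; it only computes how steep a line or how tall a bar looks on the page, and draws nothing.

```python
def apparent_slope(slope, width_scale, height_scale):
    """Slope as it appears after stretching the figure: horizontal distances
    are multiplied by width_scale and vertical distances by height_scale."""
    return slope * height_scale / width_scale


def drawn_heights(values, axis_start=0):
    """Bar heights as drawn when the vertical axis starts at axis_start."""
    return [v - axis_start for v in values]


def height_ratio(heights):
    """How many times taller the tallest bar looks than the shortest one."""
    return max(heights) / min(heights)


# Width halved and height doubled: the same data looks four times steeper.
print(apparent_slope(1.0, width_scale=0.5, height_scale=2.0))
# Width doubled and height halved: it looks four times flatter.
print(apparent_slope(1.0, width_scale=2.0, height_scale=0.5))

# Made-up counts for categories A to E, all within about 8 percent of each other.
counts = [92000, 94000, 95000, 97000, 99000]
full = drawn_heights(counts)                        # axis starts at 0
truncated = drawn_heights(counts, axis_start=90000)  # axis starts at 90000

print(f"full axis: tallest bar is {height_ratio(full):.2f} times the shortest")
print(f"truncated axis: tallest bar is {height_ratio(truncated):.2f} times the shortest")
```

With these assumed numbers the bars differ by under ten percent on the full axis but by a factor of several on the truncated one, which is the visual exaggeration described in the discussion.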
Now let us take the next question: blank of the vertical axis of a bar chart can give you misleading information. What do you do with the vertical axis of a bar chart that can give you misleading information? So what is the correct word to use here? Dr. Diva Patel Institute.

Hello sir, the answer is truncation.

Yeah, that is the right answer. You should not be truncating your bar chart, because your actual or original bar chart may not show any differences, but the truncated bar chart can show a difference in the data, and therefore it can lead to misinterpretation. Thank you very much.

So now we have the next question, and you have to tell me whether these statements are true or false. The first statement which I am making is: the origin of a figure should always be (0, 0), or (0.0, 0.0). Is this true or false? Let us take some college, GHI Soni Nagpur.

It is true.

It is true, ok. Thank you very much. Let us take the second one: footnotes should not be used in a table. Is it true or false? So whether you should be using footnotes in a table or not, is the statement true or false? Sarvajani College.

It is false.

Yes, you can use footnotes in a table if one of your data points requires further explanation, so the statement is false. Let us look at the third question: minor ticks should not be numbered. Is that a true statement or a false statement? Do we pertain?

So, minor ticks should not be numbered: it is true.

Thank you very much, that is the correct answer. The last question is: use tables rather than figures if the same information can be conveyed using a table.

False statement, sir.

Ok, thank you very much. So the next lecture was on error bars, and here we discussed different types of error bars.
So rather than going through the entire thing, we will just look at the summary of what we did in this lecture. We looked at different types of error bars, and error bars can be broadly classified into two categories: descriptive and inferential. Range and standard deviation are called descriptive error bars because they describe the data. Range measures the amount of spread between the extremes of the data, the minimum and the maximum data points. Standard deviation is, roughly speaking, the typical difference between the data points and their mean, and in the right hand column there is a formula to calculate the standard deviation. Standard error, or standard error of the mean, and confidence intervals are called inferential error bars because they help you to make inferences from the data. Standard error is a measure of how variable the mean will be if you repeat your study many, many times, and a 95 percent confidence interval is a range of values that you can be 95 percent confident contains the true mean.

So based on this let us try to look at some of the questions. We will start from question number 1: range and standard deviation are examples of.

Descriptive.

Yeah, that is right. Let us take the next question: standard error and confidence intervals are examples of.

The answer is inferential.

Thank you very much, that is the correct answer. Let us take the next question very quickly. The third question is error bars give you information about what conclusions to, that is already done. So let us take question number E: about two thirds of the data points lie within mean plus minus, and here you have to provide me 1 or 2, standard deviations.

The answer should be 1.

Yeah, and if you talk about mean plus minus 2 standard deviations, then that contains about 95 percent of your data. So the correct answer is mean plus minus 1 standard deviation. I wanted to do one more exercise. Let us get back to the exercise.
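The four quantities in this summary can be computed for one sample as in the sketch below. Several choices here are assumptions, not taken from the lecture slides: the function name `error_bar_summary` is hypothetical, the standard deviation is the sample version (dividing by n - 1), the standard error is taken as SD divided by the square root of n, and the 95 percent confidence interval uses the normal approximation of mean plus or minus 1.96 standard errors, which is too narrow for small samples where a t value would be used instead.

```python
import math
import statistics


def error_bar_summary(data, z=1.96):
    """Descriptive (range, SD) and inferential (SE, 95% CI) quantities.

    Needs at least two data points, because the sample standard
    deviation is undefined for a single value.
    """
    n = len(data)
    mean = statistics.mean(data)
    sd = statistics.stdev(data)   # sample standard deviation, n - 1 in the denominator
    se = sd / math.sqrt(n)        # standard error of the mean
    half_width = z * se           # normal approximation for the 95% interval
    return {
        "n": n,
        "mean": mean,
        "range": (min(data), max(data)),
        "sd": sd,
        "se": se,
        "ci95": (mean - half_width, mean + half_width),
    }


# Made-up repeated measurements of the same quantity.
measurements = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
summary = error_bar_summary(measurements)

print(f"n     = {summary['n']}")
print(f"mean  = {summary['mean']:.3f}")
print(f"range = {summary['range'][0]} to {summary['range'][1]}")
print(f"SD    = {summary['sd']:.3f}")
print(f"SE    = {summary['se']:.3f}")
print(f"95% CI = {summary['ci95'][0]:.3f} to {summary['ci95'][1]:.3f}")
```

Returning all four values together makes it easy to state in a figure legend which one the error bars actually show.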
So you can see some of the questions on the screen, and these are true and false questions. Let us look at the first one: standard error increases with sample size. Is this a correct statement or a false statement?

The answer is false.

It is false, ok. And why is that?

SE equals SD upon root n.

Yeah, because n comes into the denominator, and therefore if you increase n the standard error will decrease. Let us take the second question: standard deviation roughly gives the average or typical difference between the data points and their mean. Is this a correct statement or a false statement?

Yes, this statement is true.

This statement is true. Very good, thank you very much, it is the correct answer. The third question is: the mean of data with confidence interval error bars defines the range of values which are most plausible for the true mean. Is this a correct statement or a false statement?

It is true.

Yeah, so that is true. Thank you very much. And the last question is: while comparing results from two groups, say wild type and mutant, to see if they are different, descriptive error bars should be used. So the question is whether you will use descriptive error bars while comparing observations from two groups or not.

We do not use descriptive error bars. We have to use inferential error bars.

Ok, very good, that is the correct answer. Thank you very much.

So my last lecture was on the proper use of error bars. Briefly, I told you that there can be different types of error bars, and therefore it is always a good idea to tell your audience what type of error bar you are using. If you do not say in your figure legend what type of error bars you are using, it could be misleading and could lead to misinterpretation. And you have to be very careful when you are showing error bars and reporting data from replicates or independent samples.
And whenever you are reporting the sample size you have to be very careful that it is distinguished from the number of replicates that you have in your measurement. So replicates are nothing but they are the repetition of the measurement on one individual on a single condition or multiple measurements of the same or identical samples. The other important thing which I told you that if you are showing the data from the representative experiment then you should not show the error bars for the representative experiment the sample size is 1. So that means people perform many many experiments but they are showing their best data and since the sample size is 1 it is not relevant to show the error bars in such situations. Then one of the important things which I convey to you is that whenever you are comparing the data from two different groups then in that case you should be using the inferential error bars. So based on this we have a small exercise to do and again this is a true or false statement. So question is error bars are meaningful even if figure legend does not state what kind they are. Is this a correct statement or a false statement? Sir, we have one doubt in morning session we are seeing standard deviation and errors this are all where we have to apply this sir. You would like to know where you will apply standard deviation and where you will you will encounter such kind of situations where you have to show the standard deviation ok. So in engineering college you might be having some experiments right and you see that whenever students do experiments they just do not do experiment one time ok if they have to do a measurement they repeat it ok few times ok. The idea is that you can get a good average of your data ok and then on top of that if you want to see how each of these data points were different from the overall mean of your result you can use the standard deviation ok. 
So normally the standard deviation gives you roughly an estimate of how much each data point could deviate from the mean that you have calculated from your measurements.

So now let us go back to the question and try to see whether it is a correct statement. The question was: error bars are meaningful even if the figure legend does not state what kind they are. Is this a true statement or a false statement?

This is a false statement.

This is a false statement, very good, because unless you say what type of error bar it is, it is not meaningful. The second question is: error bars should be used with caution when reporting data from replicate measurements. Is this a true statement or a false statement? Graduate school of technology.

True.

True, ok. So the next question is: the number of independently conducted experiments is the same as the number of replicates. Is this a correct statement or a false statement?

This is a false statement.

Ok, thank you very much. The next question is: the sample size n is one for data from a representative experiment. Is this a true statement or a false statement?

True.

True, ok. Thank you very much, that is the correct answer.

So now we have some fill in the blanks questions. The first question is: when comparing new experimental results with control experiments, what type of error bars should be used? You have to give me the name of the correct type of error bar.

Sir, the answer is inferential.

Ok, thank you very much. The next question is: scientists handle the wide variation that occurs in nature by performing what type of experiments?

Representative experiments.

No, not representative experiments. Actually the correct answer here is independent experiments, because whenever you do an experiment and want to measure something, you repeat your measurements many, many times, and each of these is called an independent experiment. So the correct word to use here is independent. Now let us look at the next question.
Replication of a measurement on one individual in a single condition, or multiple measurements of the same or identical samples, is called?

Replicate.

Replicate, yes, that is the correct answer. So this is the end of all the exercises related to the data presentation module, and I will now hand over to the next instructor. Thank you very much for your attention.