Can you just tell me, I just want to get a feel: are you third year, fourth year, second year?

Second years, all of us are second year, sir. Fourth semester.

Fourth semester. Probability and statistics, so you have a course on probability and statistics, right?

Exactly.

And then you also have a course on biostatistics in the sixth or seventh semester, yes or no?

Actually we have only probability and statistics. Biostatistics is something new: there we deal with biotechnological terminology, whereas in probability and statistics we deal with mechanical objects and the like.

Do you have a course on biostatistics in your curriculum?

No sir, no biostatistics.

Okay. Then this will give you a nice overview of everything: the statistical principles, tests of significance, then design of experiments. Normally design of experiments is not taught anywhere, so that will be very new for you. None of the courses teach it, though many courses will teach you ANOVA, the t-test and things like that. One thing I want to ask: does the mathematics become very difficult as you go along? You will see a lot of equations. Do you find it very difficult?

No sir, it is not so difficult compared to the maths papers of the other semesters.

Okay. Another point: nowadays a lot of software is available which can do the statistics, but you will not know what is happening behind it. The software will give you a lot of results, but if you follow this course you will be able to understand how it is doing that, and you will also understand the significance of some of those terms. I am sure there must be some software in your college, like Matlab or SPSS.

Sir, we are studying Matlab.

Matlab does not have much statistics. SPSS or Minitab?

SPSS is in our syllabus for next term.

Okay. So SPSS can do all this very fast: you just put in the data and it will give you a lot of statistical analysis. But if you go
through the course, you will get an idea of how to calculate all those things. That is the main thing about this course. And SPSS does not talk about design of experiments; I don't think SPSS has much design of experiments, so that is also very new for you, which you learn in this course. So, anything specific you want to ask? Do you want to ask about the exam, or something specific?

Regarding the exam, would you provide us a spreadsheet for calculating the problems?

In the exam we will provide all the tables: the F table, t table, Z table, the sign test tables, the chi-square table, so all the tables will be there. But the calculations you have to do with a calculator. The problems will not be very big; I am not asking big data problems, they will be small data problems, so it is not difficult to do. For ANOVA, say, a problem will not have too many data points. Calculating the total sum of squares and so on may take more time if you have too many data points, so it will have fewer data points. So there will be problems on ANOVA, on the sign test, something on the binomial, something on the Poisson distribution, the t-test, the two-sample t-test and the paired t-test. If you have done all the assignments, the problems are almost like the assignments; there is no difference, nothing difficult. You should find the exam much easier. Compared to your quizzes, the problems will be exactly the same type.

Okay, tell me why you wanted to attend this course. I am sure you have so many classes to do; you have some six or seven theory papers and two labs, right? So did you all find time to do this, the five videos and the exam? Why did you want to do this? It's a lot of work, right? Any particular reason?

I can't hear you, I don't get the sound. Hello? Yes, a
city. Could you hear me, sir?

Yes, now I can hear you. You already have some six or seven theory papers and then two labs, so there must be a lot of work for you. Why did you want to do this course also? Because this course also has five videos and then one final exam, so it's a lot of work, isn't it? Did you find time to do this?

Actually we don't find much time for this course, but we go through the videos in the last three days and complete the assignment in time, sir. We would get an extra credit, and we are doing many projects this semester, so it would be helpful in publishing papers; we could go with some statistical ideas. So we opted for this.

Yes, because nowadays if you are doing research in biology or biotechnology, you have to do some statistical analysis; without that, journals will not accept any papers. That means you have to do some t-test, some ANOVA, and that is very important. The other thing is, if you are doing, say, a fermentation, you have to plan experiments: I want to change temperature, I want to change pH. You cannot blindly do those experiments. If you follow a design, like a factorial design or a fractional factorial, or a CCD, a central composite design, then it is easier to publish in journals than if you blindly do experiments changing temperature, pH or carbon. All journals want some statistical analysis: you have to do a two-sample t-test, you have to do ANOVA, you have to give the p-value. Journals don't accept papers without this data analysis, so that's very important.

Yes, but till now we don't have clear ideas regarding design of experiments; we were okay with biostatistics, sir.

Okay.

Sir, I have seen the videos only up to the sixth week.

Okay.

I will do all
Right, those give you all the DOE. The last 10 classes are all about DOE, design of experiments, talking about factorial design and fractional factorial design. What about the others, have they all seen the seventh and eighth week? The other people?

Okay sir, I will hand over to my friends; they would have something.

Have they done DOE?

Hello.

Tell me your name first.

My name is Robin, sir. I have a small doubt in the t-test.

Before that, have you all seen the seventh and eighth week, the design of experiments?

No sir.

Has anybody seen the seventh and eighth week?

No sir, nobody has seen it.

Oh, but you need to do that also. When will you do it?

In the upcoming weeks.

You will finish before the exam?

Yes, before the exam we will complete it.

Okay, when is the exam?

A few people are taking it on March 20.

Because in the last two weeks a lot of DOE is there, the last 10 classes. I do things like factorial design, fractional factorial design, Plackett-Burman design, and then Box-Wilson design and Box-Behnken design. All these are designs which we do in the last 10 classes, so you need to see them. Okay, now what's your question?

Sir, regarding the t-test: while calculating the value of t, some values will be negative. A negative value will mostly be less than the value we read from the table. Can we take that minus value, or do we have to take the modulus of the calculated t value?

You have to take the modulus, because the t distribution is symmetric, right? Whatever is there on the right-hand side is the same as the left-hand side. So you take the modulus and then you can read out the t value from the table.

Okay. I think you have not mentioned that in the videos, so I had some confusion.

So you take the modulus, take the t from the table, and then read the p-value. Because if you look at the normal distribution, if you look at the t distribution, they are symmetric: whatever is there on the right-hand
side is the same as the left-hand side, right? So we take the modulus.

Thanks, sir.

My name is Kishore, sir. Regarding the box-and-whisker plot: is there any limit on the number of samples that we should consider, or is it just a simple graphical representation for any number of samples?

It is a very nice way of looking at the data visually; anybody can see the picture and see how different the data sets are. So it is meant for visual observation; there is nothing numerical in it. You can take even four or five points, that is okay, but it gives a nice visual picture: one data set is very high, one data set is very low, and if there are any outliers they will be marked with a star, so you know where the outliers are. It gives you the range also. So it's just a visual observation, and four or five data points are okay.

Regarding the Weibull distribution: the reliability decreases as the shape parameter, that is beta, increases. If the time is not given, can we calculate the reliability using beta and eta alone, or is time essential?

No, no, with beta and eta itself we can calculate. There was one nice problem you saw, right, a nice biomaterial problem?

Yes sir, the designs A and B.

That's a very nice problem. If you understand that problem, it shows you how important the Weibull distribution is, especially with respect to the reliability of machine parts, the reliability of any item you are looking at. Even the reliability of two light bulbs you can study using this Weibull distribution.

Okay. One more question regarding non-parametric tests. You have asked what the other non-parametric tests are, and the given options are the sign test, the Wilcoxon signed-rank test, the Kruskal-Wallis test and the Studentized range test.

What did he write it as? I can't hear you.

The first two tests are the same, actually; they are for dependent samples, and the
second one, the Kruskal-Wallis test, is for k independent samples. And the Studentized range test, I have a doubt whether it is non-parametric or parametric.

Actually the Studentized range test is a parametric test.

Parametric, right. So what is the right answer for that question?

The sign test. If you see the first slide, I give a comparison of the parametric tests and the corresponding non-parametric tests. That slide is very useful; look at that slide. You remember the slide?

Yes sir, I referred to that.

Equivalent to ANOVA, equivalent to the t-test, and equivalent to the two-sample t-test: use that table for that.

Sir, that is all I asked. Thank you, sir.

I'm Priya. I have a doubt: why do we consider a small sample as one with fewer than 30 values?

You saw the t table, no? The value keeps on changing, it keeps on decreasing, and finally, for say 95 percent, it comes to 1.96, right? Have you seen the t table and seen how the 95 percent value decreases as we increase the number of degrees of freedom? It keeps coming down to 1.96, and 1.96 is the final value for 95 percent confidence. That is why 30 is the best. But actually 30 experiments is too much. So have a look at the t table around seven or eight experiments. Initially the t value is very high, right, when the degrees of freedom are low. As you keep increasing the degrees of freedom, the t value keeps coming down, correct? At six, seven, eight it keeps coming down, and finally at 30 and around that place it is almost the same, 1.96, 1.97, like that. So ideally you should do 30 experiments, but around six or seven itself it has come down quite a lot. That is why everybody does seven experiments or six experiments as the standard; six degrees of freedom is a good approximation. Okay? Because if you see, for a 95 percent confidence, listen, for a 95 percent confidence you have 5 percent on both
sides, right? Five percent in total is okay, so that 1.96 is what is coming. Okay?

Hi sir, I am Balu. Sir, in the non-parametric test for homogeneity of variance we use three methods. Of the three methods, which is the most feasible?

The box plot. If you can plot it, you can quickly see how it looks; that gives you an idea of how far apart the variances are. That's a pictorial representation. If you want a mathematical way, the second method is good: we assume a normal distribution and then carry out a chi-square test. Understand, no? Suppose I give 10 data points. I will calculate sigma and mu for them, assuming a normal distribution, and then I can compare with the data using (expected minus observed) squared divided by expected, the chi-square test, and then check the chi-square value. Most of the software uses that method. You understand what I'm saying?

Chase our city, chase our city, please respond.

Yes sir.

No, no. I was speaking to the other student; please hand over the mic to him. No, not that one, not the other student; the one who asked the question just a while ago. Please respond.

Yes sir.

You understand what you have to do? Suppose I give 10 data points. I will assume the data is normal and calculate the standard deviation and the mean, and from that I will find the expected values, whereas the observed values are the data that is given. Then I do a chi-square test and see whether the chi-square is statistically significant or not. That is the normal method used in many software packages. But the box-and-whisker plot is easy to see; we can immediately tell how good or how bad the data looks.

Okay sir, understood. Sir, actually we are doing projects with plants, the phytochemical analysis of plants. We have chosen three different plants and three different extraction methods for phytochemical analysis, and I have to present some statistical
analysis regarding the efficiency of extraction: which solvent is giving more yield, and regarding some particular compounds, for example tannins. If I select tannin as the particular compound I should get from the plant, every plant will give tannin in the phytochemical test, sir, and every plant will give a different extraction for each solvent. How should I represent this in a simple statistical manner? It is getting complex. How should I do it in a simple manner to explain to the viewers or readers?

So you have three plants and three different extraction procedures, right? For each plant, it will become three into three, nine experiments. But you have to repeat also, because without repeating there will be no error estimate. So you have to repeat each experiment once more; that means I take the plant again and extract again. So you repeat. What is your plan? You are going to repeat it, right?

Actually, Jatropha, Jatropha curcas; brahmi, Centella asiatica; and Tectona grandis, teak. We are using these three plants and three different solvents, sir.

Three different solvents you are using, right?

We are using water, acetone and...

It is breaking up. Can the technical person please...

We are using water, acetone and benzene as the three solvents. And one more complexity arises when I do antimicrobial tests on four bacteria using these three samples.

Okay, okay.

So some complexity is going on and I can't present it in a simple manner so that my friends or my guide understand it. So I need some mathematical assistance for doing it in a neat and simple manner.

Have you gone through the design of experiments, the seventh week and the eighth week?

Sir, till now I haven't seen those.

You see that now. In fact these types of problems can be addressed in
that design of experiments. What you do is go through all of the seventh and eighth week and then send me an email; then you will be better placed. Because: I want to do three different plants, three different extraction procedures, and I am looking at four different bacteria, right, so how do I plan my experiments? Design of experiments talks about exactly that. You go through both the seventh and eighth modules and then send an email, okay? Basically you are doing three into three, nine experiments, right? That means you take each plant and extract with each solvent, then take the second plant and extract with each solvent, so you will get some results. But you have to repeat also, because we need some error estimate. So each time we have to do it once more; maybe three repeats, so you will do about 27 experiments. From there we can use a two-way ANOVA. Simple, right? We have three plants and three solvents: two-way ANOVA.

Yes sir, so we could go with ANOVA.

You have to repeat two times or three times. Then we can do a two-way ANOVA, look at the error bars, and see whether there is an interaction. So we can have a main effect for plant, a main effect for solvent, and we can also see whether there is an interaction. If you want to study the interaction, you need to repeat the experiment, as I taught, at least two times. So we can do a two-way ANOVA, simple. Understood, no?

Yes sir.

So you do it later on and then you can send me the results. I can say whether it looks good or whether there is any problem; I can comment on that also, no problem. You do the experiments, collect the data, do the ANOVA, and send me the Excel file with your conclusion; I can comment on it later.

Okay, thank you sir.

It will take you about three months, maybe.

Sir, actually I have completed the project.

Okay. Did you do a replication? Hello, did you replicate the
experiment, or did you do it only once?

Sir, we did it four times. We got errors in two tests, for two bacteria, so we did it...

I am not talking about the bacteria. Take the first part; the first part is three plants, three solvents.

Yes sir, three plants, three solvents; we did it four times, sir. We extracted four times.

Then you can do it.

Okay sir, I will do the two-way ANOVA and check the results with you, sir.

Yes, sure, you can do a two-way ANOVA for that, very simple, right? Send me the results and I will give you a comment. Okay?

Sir, could you hear me?

I can hear you.

Sir, how are the tabulated values in the t table and F table assigned?

Okay. The value that is given inside the table is the probability value, right? For example, take a normal distribution. You know the equation of the normal distribution: e to the power minus (x minus mu) squared divided by 2 sigma squared, right? So for given mu and sigma, for different x values, we can plot that graph. That is what the software does: it plots the graph, and then for each x value it calculates the area under the curve, which is the probability. The software can do that. You understood, no?

Sir, is there any relation between the normal distribution and the t-test?

Yes, there is. When your data points are very many, your t distribution becomes a normal distribution. The normal distribution is like a bell-shaped curve, right, whereas the t distribution is also symmetric but it falls down very fast, because you have fewer data points. So it is symmetric but it falls down very fast. If you have more and more data, it will become like a normal distribution.

Sir, are the t and F tests applicable for more than two samples, for the sample means?

It is applicable, but if
you have... have you seen the table? The value will be very, very high. Suppose the degrees of freedom are few: the error is very large. If the degrees of freedom are more, the error will come down. So it depends. If you have two data points, the degrees of freedom will be only one, right, so the error will be very high. Your t value should be more than 30 or 40; only then will it be considered different. That is if you have only two degrees of freedom. But if you have many degrees of freedom, like 20 or 25, then even a t of 2.9 means that there is a significant difference. Understand? If the degrees of freedom are few, your variation becomes very large; basically the uncertainty is more. Understand, no?

Yes sir.

You should have more degrees of freedom to reduce the uncertainty.

Thank you sir. One final question from my side, sir: is there any relation between the Weibull distribution and the gamma distribution? Actually, when I saw the gamma function in the Weibull distribution, I was wondering whether there is any relationship between them.

There is no relationship. The gamma function is separate and the gamma distribution is separate; don't mix them up. Okay?

Okay. Sir, in the reliability function, you have said Q(t) equal to F(t) equal to 1 minus e power minus (t by eta) power beta, is that right? The reliability function is just e power minus (t by eta) power beta. For calculating the unknown beta and eta we use these formulas, but I think there are no problems related to the calculation of beta and eta.

No, there are none. It would take a long time for you; if I give such problems it may take much longer, that is why I didn't give any. Yes, correct. But in real life you can do it. Suppose I am
given 20 light bulbs. I see how long each bulb will last: one will last for 100 hours, one may last 200 hours, one 300 hours, and then I can find out these constants from the data. So ideally we can do it if you are doing it in the lab. That is finding out the reliability of light bulbs, the reliability of the strength of your material, anything. I have a rod, I am bending it: how long will it last, how many cycles can I bend it for? That sort of thing we can do. So if you are doing any experiments like that in your lab, it is very useful.

Thank you sir. And a small silly doubt, sir. In the t table, the degrees of freedom are given up to 1000. Actually the t-test is for values less than 30. Can we use it for values greater than 30, for the number of samples?

Okay, I can hear you. Actually it is not that it stops at 30; it can go beyond also. But as I said, if you look at the 95 percent column, it keeps on decreasing, it becomes smaller and smaller and comes to 1.96. Then the difference won't be much: 31 will be 1.960, 1.95991, like that. It comes down almost exponentially and flattens. So there will still be some difference, but it will be in the second decimal, third decimal, fourth decimal. So if you are doing a calculation, you can stop there. It is not that you must stop at 30, because 31 also will have a value, 32 also, 33 also, but the change is minimal, so we ignore it. That is what we do. Sometimes, as a scientist, you may stop at 6 or 7. I want you people to go and see the t table: as you keep looking down the degrees of freedom, the value goes down and down, and at around seven or eight itself it will become 2.5, 2.45, 2.44, like that. It reduces little by little. That is why everybody stops around six or seven degrees of freedom; they don't go
up to 30. Doing 30 experiments is tough, right? It's a lot of work. So six or seven. Like one of the persons said, he did the extraction only four times; if he has to do it 30 times it's a lot of work, he will be doing it for four or five months. So six or seven degrees of freedom is generally good enough.

Thank you sir. It is a small doubt, sir: when we calculate with the F test, t test and all the distributions, the tables use the normal distribution as a base, sir?

The t distribution is different from the normal; the t distribution is an approximation of the normal. If you look at the normal distribution and the t distribution, there is a difference in the graph. Only the Z distribution, when we do that, is like a normal distribution, so the t distribution is an approximation of the normal distribution.

But for the Z critical value, for calculating the level of significance, we would use the normal function, right sir? We would use the normal distribution for calculating the alpha value?

For the Z distribution, that is a normal distribution, yes, correct. But the t distribution is more like an approximation of the normal distribution, so there is a difference between the t and the normal distribution. The normal is a very ideal condition where mean, mode and median are the same, but in the t distribution mean, mode and median will not be the same; it will be like a sharp curve. If you plot it, the t distribution will go up like a mountain peak, whereas the normal distribution will be like this. But when you have more and more data, the t will become like a normal distribution, a bell-shaped curve.

Thank you sir. Thank you for kindly answering our questions, sir. I will hand over to my friends, if there are any questions.

There are no questions, sir. Thank you, sir.

Good luck to all of you in your exam. Good luck in your B.Tech also.
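
Two points from the discussion above can be checked numerically: the t distribution is symmetric (which is why the modulus of a negative t value is used), and it moves toward the normal distribution as the degrees of freedom grow. The following is only a minimal sketch using the textbook density formulas; the function names `normal_pdf`, `t_pdf` and `max_gap` and the chosen degrees of freedom are illustrative assumptions, not something taken from the course material.

```python
import math


def normal_pdf(x):
    """Standard normal density (mean 0, standard deviation 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def t_pdf(x, df):
    """Student's t density with df degrees of freedom.

    lgamma is used instead of gamma so that large df does not overflow.
    """
    log_coeff = math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
    coeff = math.exp(log_coeff) / math.sqrt(df * math.pi)
    return coeff * (1.0 + x * x / df) ** (-(df + 1) / 2.0)


def max_gap(df, grid_step=0.1, limit=4.0):
    """Largest absolute difference between the two densities on [-limit, limit]."""
    steps = int(round(2 * limit / grid_step))
    xs = [-limit + i * grid_step for i in range(steps + 1)]
    return max(abs(t_pdf(x, df) - normal_pdf(x)) for x in xs)


if __name__ == "__main__":
    # Illustrative degrees of freedom only; the gap shrinks as df increases.
    for df in (1, 2, 6, 30, 1000):
        print(df, round(max_gap(df), 5))
```

Because `t_pdf` depends on x only through x squared, `t_pdf(-x, df)` and `t_pdf(x, df)` are equal, which is the symmetry behind taking the modulus before reading the table.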
Okay, if you have any doubt you can write to me. If you are doing any project, you can do the analysis and send me the results, and I can comment on it: I will say yes, it looks good, or I can suggest some changes. Okay, all the best.
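
For the three-plant, three-solvent extraction project discussed earlier, the two-way ANOVA with replication that was recommended can be sketched as below. This is a minimal sketch assuming a balanced design (the same number of repeats in every plant and solvent combination). The function name `two_way_anova` and the numbers in `PLACEHOLDER_YIELDS` are hypothetical: the values are made-up placeholders, not measurements from the student's project. The sketch stops at the F ratios, which still have to be compared with the F table for the listed degrees of freedom.

```python
from statistics import mean


def two_way_anova(data):
    """Two-way ANOVA with replication for a balanced design.

    data[i][j] is the list of repeated measurements for level i of factor A
    (e.g. plant) and level j of factor B (e.g. solvent).
    """
    a = len(data)          # levels of factor A
    b = len(data[0])       # levels of factor B
    n = len(data[0][0])    # repeats per cell
    if n < 2:
        raise ValueError("at least two repeats per cell are needed for the error term")

    all_values = [x for row in data for cell in row for x in cell]
    grand = mean(all_values)
    cell_means = [[mean(cell) for cell in row] for row in data]
    # With a balanced design the mean of the cell means equals the level mean.
    a_means = [mean(row) for row in cell_means]
    b_means = [mean(cell_means[i][j] for i in range(a)) for j in range(b)]

    ss_a = b * n * sum((m - grand) ** 2 for m in a_means)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_means)
    ss_ab = n * sum(
        (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_error = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for x in data[i][j]
    )
    ss_total = sum((x - grand) ** 2 for x in all_values)

    df = {"A": a - 1, "B": b - 1, "AxB": (a - 1) * (b - 1), "error": a * b * (n - 1)}
    ms_error = ss_error / df["error"]

    table = {}
    for name, ss in (("A", ss_a), ("B", ss_b), ("AxB", ss_ab)):
        table[name] = {"ss": ss, "df": df[name], "F": (ss / df[name]) / ms_error}
    table["error"] = {"ss": ss_error, "df": df["error"]}
    table["total"] = {"ss": ss_total, "df": a * b * n - 1}
    return table


# Hypothetical placeholder yields: 3 plants x 3 solvents x 2 repeats.
PLACEHOLDER_YIELDS = [
    [[12.1, 11.8], [9.4, 9.9], [7.2, 7.5]],
    [[10.3, 10.9], [8.8, 8.1], [6.9, 6.4]],
    [[14.0, 13.5], [10.2, 10.8], [8.0, 8.6]],
]

if __name__ == "__main__":
    for source, row in two_way_anova(PLACEHOLDER_YIELDS).items():
        print(source, row)
```

Factor A here stands for the plant main effect, factor B for the solvent main effect, and AxB for the interaction, which can only be estimated because each combination is repeated.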