Good morning, everybody. You may have noticed that besides the color cards out at the entrance there was also a sheet. Perhaps you have never seen one of these sheets before; they are used to evaluate the lectures. In general we use them every second year to evaluate all the courses conducted here at ETH, to get a feeling for how the material is received by the students, and the idea is of course that we use these assessments to continually improve the lectures. The reason for handing out the sheets today is that we have changed significant parts of this lecture on statistics and probability over the last couple of years. This year there have been some major changes, and what we would like to see is how these changes are perceived by you, the students. Of course you are not the same students as last year, but your answers will, I hope, give us an idea of whether we are improving the lectures or doing the opposite. Hopefully we are improving. So I would very much like you to fill these forms out; by all means do so, but please do it in the break if you can. There are also some additional questions: some are formulated not by me but by the school, and then there are the so-called questions from the students ("Fragen der Studierenden"). Those will be put on the overhead projector in the break; I think there are about 10 additional questions, so you can read them there and fill in your answers as well. This evaluation is completely voluntary on our side, and I hope you will participate by answering the questions to the best of your judgment. Thank you very much.

Today, as has become the standard in this lecture, we will first have a very short summary of what we learned in the previous lecture. We will look at where we stand in the framework of model building and estimation, and then we will introduce the two new topics for today, namely testing for statistical significance and the selection of distribution functions. The last topic is one which several of you have already asked me about when we have met in the corridors: how do we decide what type of distribution function we should actually use to model a random variable, an uncertain phenomenon, in our engineering decision making? Today we will introduce one of the approaches we can apply for that. So let's go.

In the previous lecture there was not too much new material, but the two things we looked at were the estimators for sample descriptors, that is, how to assess the characteristics of sample descriptors in a statistical manner, and then we also introduced confidence intervals on estimators. Regarding the sample descriptors, what we did last time was to take this big step where we considered the observations we can make in the laboratory or in the real world when we go out observing, sampling, collecting information. We tried to imagine that these observations were outcomes of random phenomena, and then we made the abstraction that we did not yet know the outcomes of these uncertain phenomena.
But based on statistical considerations alone, we wanted to assess the information we could expect to get out of such observations. The characteristics of the observations we were interested in are what we called sample descriptors: for instance the sample mean value, the sample variance, the sample median, and so on. These are functions of samples of observations, but because we do not yet have the observations themselves, we are interested in assessing the statistical characteristics of these sample descriptors. That is what we looked at last time; I appreciate that there is some abstraction in this, but I hope you get the idea.

Now, what did we learn? We learned that these sample descriptors are associated with uncertainty, and that this uncertainty is due to lack of data. In principle, with an infinite amount of data we could eliminate it; the uncertainty associated with these sample descriptors depends on the amount of data we have available. That means that if we are planning to collect data, then already before we collect them we can establish an idea of the uncertainty associated with, for instance, the sample mean value. This uncertainty is of the epistemic type: it is only due to the fact that we do not know, and this lack of knowledge we can improve. We can buy more data and reduce the uncertainty. This is useful to understand before we go out and conduct experiments; before we plan a rigorous experiment in the laboratory, we can already establish how many samples we need in order to achieve a certain confidence in the results.

If we look at the sample mean, we also learned that it is what we call an unbiased estimator: its expected value is equal to the true mean value of the random variable generating the data. We also learned that the variance of the sample mean is inversely proportional to the number of samples, which means it can be reduced by increasing the number of samples, as we see here. And if you imagine the sample mean value as a random variable, an uncertain phenomenon, which we already agreed on, then you see in the first equation that it is evaluated as a sum of random variables: simply the outcomes of the experiments added up and divided by n. When these uncertain outcomes are added up, the sum follows a normal distribution, so what you see here is a normal probability density function for the sample average. In this illustration the mean value is, just for illustration, centered around zero, but you see that as the number of experiments increases, the uncertainty of this random variable, its standard deviation, decreases. So sample descriptors, sample means, sample variances and so on can be considered as uncertain, and we can model them by random variables themselves. A small simulation sketch of this follows.
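To make this concrete, here is a minimal simulation sketch (my own, not from the lecture notes), assuming a normally distributed data-generating variable; it checks that the sample mean is unbiased and that its standard deviation behaves like sigma/sqrt(n):

```python
# Minimal sketch: the sample mean treated as a random variable.
# Assumed data-generating variable: normal with mu = 0, sigma = 1.
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 0.0, 1.0

for n in [5, 20, 100]:
    # 10,000 repetitions of an experiment with n observations each
    samples = rng.normal(mu, sigma, size=(10_000, n))
    means = samples.mean(axis=1)          # one sample mean per repetition
    print(f"n = {n:3d}: mean of sample means = {means.mean():+.4f}, "
          f"std = {means.std():.4f}, sigma/sqrt(n) = {sigma / np.sqrt(n):.4f}")
```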
So now we have random variables modeling sample descriptors, and the uncertainty associated with them can be reduced by increasing the number of experiments. That was one important insight from the last lecture. We also learned that the usual sample variance is a biased estimator, and we learned how to remedy this: instead of dividing by n, we divide by n − 1, and that estimator is unbiased.

Furthermore, due to the uncertainty associated with the sample descriptors, for instance the sample mean, we of course do not know the exact value; it is an uncertain phenomenon. But what we can do is establish intervals within which we find these values with a certain probability. Such intervals we call intervals of confidence, or confidence intervals. In principle they can be double-sided or single-sided, but what we have introduced are double-sided confidence intervals. We determine such a confidence interval corresponding to a certain significance level, chosen so that the interval contains the sample descriptor with the probability we want to associate with it. So first we choose the probability with which we want to identify the interval where we can expect to find our sample descriptor, here the sample mean value; then we determine the lower and upper limits of the interval. The random variable inside the probability operator has been standardized, so we know that it is normally distributed, and we can evaluate the probability using the standard normal cumulative distribution function.

Looking again at the uncertain mean value, basically the same example as before, we have a case where the sample mean value has an expected value of zero. The density function for the sample mean value is now turned around, so you can imagine the outcomes of the sample mean following the indicated probability density function; the confidence interval limits are also indicated on this density function, and they are determined such that we have a predefined probability of exceeding or falling below them. You see that these confidence intervals depend on the number of experiments. This is more or less the essence of the material we introduced last time: not too much, but a quite important aspect of statistics, namely the uncertainty associated with what we can expect to find when we do sampling and observations. A small sketch of this interval calculation follows below.
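As a code sketch of the interval calculation (the numbers sigma = 1, n = 10 and alpha = 0.10 are assumed for illustration; the lecture gives only the general formula):

```python
# Two-sided confidence interval for the sample mean when sigma is known:
# X_bar lies in mu +/- k * sigma / sqrt(n) with probability 1 - alpha,
# where k is the (1 - alpha/2) quantile of the standard normal distribution.
import numpy as np
from scipy.stats import norm

alpha, sigma, n = 0.10, 1.0, 10       # assumed values for illustration
k = norm.ppf(1 - alpha / 2)           # ~1.645 for alpha = 0.10
half_width = k * sigma / np.sqrt(n)
print(f"X_bar lies within mu +/- {half_width:.3f} with probability {1 - alpha:.2f}")
```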
Okay, now it is time for one of these small exercises. I have introduced a random variable Z, defined by a somewhat strange ratio: a random variable X divided by a random variable Y, normalized by a constant n, and I give you the information that X is standard normal distributed and Y is chi-distributed. Does this strange ratio follow a normal distribution, a t-distribution or an F-distribution? This is one of the questions where you are not supposed to look too long in the script, because there you would find it, I hope. Of course you can find it in the script, but these are tools, and the better you know your tools, the more ready you are for engineering decision making.

So, can I see some colors? Don't be shy, come on. I need more; I only see one half right now. Yes, actually it should be the square root of n; I am sorry, a typing error. But I see many green cards, and you guessed it: it is a t-distribution. Whether it is the square root of n or n itself is actually immaterial until I tell you that n is the number of degrees of freedom of the chi-squared random variable corresponding to Y. Okay, very good. It is very useful to be able to more or less remember these things; it makes it much faster for you when you do exercises. It also gives you the ability we introduced in a previous lecture, namely pattern recognition. If you suddenly see some equation where you have a standard normal distributed random variable divided by something that could be seen to be a chi-distributed random variable, then immediately you see: maybe we can use the fact that this quantity is t-distributed. That kind of pattern recognition is very convenient for understanding and solving the problems you are dealing with; a small simulation check follows below.
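Here is a quick simulation check of the quiz result (my own sketch; n = 5 degrees of freedom is an arbitrary choice):

```python
# If X is standard normal and Y is chi-distributed with n degrees of freedom,
# then T = X / (Y / sqrt(n)) should follow a t-distribution with n degrees of freedom.
import numpy as np
from scipy.stats import chi2, t

rng = np.random.default_rng(2)
n, m = 5, 200_000
x = rng.standard_normal(m)
y = np.sqrt(chi2.rvs(df=n, size=m, random_state=rng))   # chi = sqrt(chi-square)
T = x / (y / np.sqrt(n))

# Compare empirical quantiles with the exact t-quantiles
for q in [0.90, 0.95, 0.99]:
    print(f"q = {q}: empirical {np.quantile(T, q):.3f}, exact t {t.ppf(q, df=n):.3f}")
```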
Now we are still in the business of estimation and model building, and the first part we are going to look at concerns understanding the data we are dealing with. A little later today we will also look at the choice of distribution family, as I indicated earlier, but for now let us focus on the information we can extract from the data alone.

The typical engineering dilemma is this. I always say that what engineers are really expert in is making assumptions: good assumptions are made by good engineers, bad assumptions by bad engineers. We have to establish a basis for making conclusions, and very often we have to do that based on a very limited amount of data. You will find that not only here during your education, but especially when you are done with it, this becomes the main part of your daily business. You will always feel that you are lacking the information you would really like to have, but based on the information you can establish, the information you can afford, you need to draw conclusions. This is what engineers do: they provide decisions involving conclusions, and they deal with a high level of variability, of uncertainty.

For instance, we would like to make a few on-site tests to validate a model we have been using to assess soil strength characteristics. It may seem like a rather banal problem, but if you are constructing perhaps the largest bridge in the world and looking at the foundations of the pylons, I am sure that deep in your heart you would really like to know everything that is down in the underground, the true behavior of the soil. Sorry, but that sort of information cannot be revealed to you; it would be practically impossible and extremely expensive even to attempt. What we can do is take some samples, combine the outcomes of the testing with our engineering models, and use the observations to try to validate those models. There will always be uncertainty associated with this, and in the end the client who wants to build the bridge wants to know: can we do it, and what foundation is sufficient to support the pylons? You will be involved in that decision making. This is just to outline the situation of these dilemmas.

Another situation could be observations of traffic crossing a bridge, made in order to check whether the design traffic volumes, the assumptions in this regard, are actually valid. Whether there is more or less traffic on the bridge than assumed can have a significant effect on, for instance, the fatigue lifetime of the bridge. The traffic volume determines the number of load cycles passing over the bridge during its lifetime; these load cycles induce stresses, and when we are dealing with fatigue in metallic materials, the differences between the maximum and minimum stresses over the lifetime of the bridge have a significant importance for its life before cracks start to grow here and there.

And of course, if we are dealing with potential pollution of groundwater: somebody raises the suspicion that the groundwater at a location has been polluted, perhaps by a small release of chemicals from a small factory in the area. Can this water still be used as drinking water? You cannot take out all the water in the underground and test and filter it, so what can you do? You can make some tests of the quality of the water, but again only a limited number, and not everywhere. Still, people want to drink the water, or they want to know whether they cannot drink it and must get water from elsewhere, and we need a basis for this type of decision making.

So these are typical engineering decision situations. How can we approach this type of problem? First of all, it is of utmost importance that such conclusions are made on a basis which is consistent and transparent. Consistent refers to consistency between the conclusions, or recommendations, and the evidence; we need a formalism to reflect that dependency, and we also need a formalism which can help us standardize the building of such a decision basis. Consistency and formalism are the two qualities we are looking for, and one scheme which has been developed, is broadly applied and is very useful, is first to formulate a hypothesis for what we are worried about or want to verify, and then to test the hypothesis. Can we construct a test under which our hypothesis can be rejected? This is basically what we are aiming for, and this kind of thinking is fundamental for engineering. Any engineering model or recommendation should be presented, documented and built up in a way where it can be scrutinized. We are trying to develop a model, let's call it an air castle.
All our considerations are built up and documented, and then what we really want to do is to see whether it is possible to pull the foundation out from under this model: to find ways, arguments, whereby the model would not be valid. We are trying to find all the weak points of the engineering models we formulate. This is a very important role of engineering: to develop models that can be criticized, and then to try critically to find their weak spots. Scrutiny. And this is what testing is about. Testing is scrutiny: is there any reason why we should reject this hypothesis?

In this procedure, the first step is to formulate what we call a null hypothesis. This sounds very scientific, but it is a standardized procedure applicable to any type of testing, whether in the medical industry or in engineering applications. We simply formulate a null hypothesis postulating, for instance, that a sample statistic such as the sample mean is equal to a given value. That is one example of a null hypothesis; it could be any other hypothesis, but the hypothesis relates to something we can observe, a sample descriptor.

The next step is to formulate what we call an operating rule, on the basis of which the null hypothesis can either be accepted or rejected given the evidence, that is, once we have the test results. Typically operating rules are defined by some interval delta within which the observed sample statistic, for instance the sample mean, has to be located in order for us to accept the null hypothesis. Rejecting the null hypothesis corresponds to accepting the alternative hypothesis, H1; in the literature you often see the null hypothesis and the alternative hypothesis used in this way to refer to acceptance and rejection.

Next, having the hypothesis and the operating rule, we need what we call a significance level, which we call alpha, for conducting the test. This level of significance is the probability associated with the event that the hypothesis is rejected even though it is true. This is unavoidable: it is only a testing procedure, and the test cannot be perfect, because there is uncertainty associated with the information we are collecting, and hence with the sample descriptor we use as the basis for the testing. The situation where the hypothesis is rejected even though it is true we call a type 1 error; depending on our choice of alpha, we have probability alpha of committing it. And of course we would like to choose alpha so that it is really big. No: we would like to choose alpha so that it is appropriately small. We want to conduct significance tests so that the probability of rejecting a true hypothesis is small; we want small type 1 errors. But we should keep in mind that the choice of the significance level also influences the probability of accepting the null hypothesis even though it is false. There is an interdependency there, and of course we are not really interested in accepting a wrong hypothesis either. A small simulation of both error types follows below.
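As a small Monte Carlo sketch of the two error types (all numbers assumed for illustration: a two-sided test of the mean with known sigma, mu0 = 0, sigma = 1, n = 10, alpha = 0.10, and mu = 0.8 as one specific alternative):

```python
# Estimate by simulation the type 1 error (rejecting H0 although it is true)
# and the type 2 error against one specific alternative mean.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
mu0, sigma, n, alpha = 0.0, 1.0, 10, 0.10
delta = norm.ppf(1 - alpha / 2) * sigma / np.sqrt(n)   # half-width of acceptance interval

def reject_rate(true_mu, reps=100_000):
    """Fraction of repetitions in which H0 (mean = mu0) is rejected."""
    means = rng.normal(true_mu, sigma, size=(reps, n)).mean(axis=1)
    return np.mean(np.abs(means - mu0) > delta)

print("type 1 error:", reject_rate(mu0))        # ~ alpha = 0.10
print("type 2 error:", 1 - reject_rate(0.8))    # accepting H0 although mu = 0.8
```

Choosing a smaller alpha widens the acceptance interval and therefore raises the type 2 error; that is the interdependency just mentioned.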
Now, what we would have to do in order to decide on the best alpha is to solve a decision problem; we will look at decision problems in the extra lecture we talked about. But you can imagine that there are consequences associated with rejecting a hypothesis which is actually true: we may lose the opportunity to take an action from which we could benefit, because the test falls out negatively and we decide to do otherwise. On the other hand, we also risk that the test comes out in favor of the hypothesis even though the true situation is different, and that event may be related to severe consequences: we might decide to do something even though the true state of nature is not in a condition where it would be the right thing to do, and then we may have adverse events like failures of buildings, or polluted groundwater being used for drinking water, in the case of a type 2 error. So the choice of alpha can be established by considering the consequences associated with type 1 and type 2 errors, together with the probability of committing a type 1 error, which is given directly by alpha, and the probability of committing a type 2 error, which is more tricky to determine. The latter is not something I expect you to master, but the type 1 error is one aspect I do want you to be able to deal with.

The next step concerns the calculation of the interval delta corresponding to the significance level alpha, and then, if relevant, the calculation of the probability of committing a type 2 error, which as I said can be tricky. The fifth step is to actually perform the planned test, evaluate the observed sample statistic, and check whether the null hypothesis can be accepted or rejected. Now, given that the null hypothesis is rejected, or as I write here, is not supported by the evidence, it does not mean that the true state of nature is not what we would like it to be; it just means that the data, the information we have about the true state of nature, does not support our hypothesis. And when we say that the null hypothesis is rejected, we have to say at the same time at which significance level it is rejected; just to say that it is rejected carries no information without stating the level of significance. Otherwise, of course, it is accepted. This is the general procedure for testing. Testing sounds very scientific, very clinical, but you see that in the context of engineering decision making what we are really dealing with is model building: we want a basis for evaluating the goodness of our assumptions, and for exactly this purpose significance testing is an important tool. I think some of you have maybe a bomb or something which is about to go off; please check.
Now we can visualize the procedure: the first step, formulate the null hypothesis; the second step, formulate the operating rule; the third step, select the significance level, considering the probabilities of type 1 and type 2 errors; then assess the acceptance criterion, the delta of the operating rule; then do the testing, make your observations, and check for acceptance or rejection. These are the operational steps in testing. Finally, conclude and make a recommendation, but document your recommendation based on the evidence: a clear statement of your hypothesis, a formal description of your operating rule, and documentation of your observations. Do it all in a transparent way where people can see what you have done. Doing the right thing is not enough; we have to do it in a way where other people can also see that what we have been doing is correct, so that they can be confident in it and we can communicate this information to other professionals.

If something goes wrong, imagine that you provided decision support, you said the underground is very fine here, you can build the foundation using this or that concept, and then after some years there are severe conditions for the structure for which you made the decision support. People will start running around in circles looking for where the mistake was made; this typically becomes an issue especially when somebody has to pay for the damages. From a professional point of view, the whole idea is of course that you want to do things right in a way where people can go back and see what the basis for the decision making was. Document everything, and nobody will be opposed to your work; everybody will be happy with what you did, because you can document it in a formalized, transparent way. That should be the direction for your future work as engineers.

Now, there are some typical tests in engineering which I am going to introduce here after the break, when you have filled out your assessment forms; please fill out only one each. I am going to look at some of them more specifically; the rest are also explained clearly in the lecture notes, and you will be trained in using these tests during the exercise tutorials. The ones we will look at are: testing of the mean in the situation where we know the variance, in which case we can do some quite easy tests, as I will show you; testing of the mean in the case where we do not know the variance, which is also quite easy, and I will show you specifically how to do it; tests on the variance, where I will explain the line of thinking; and finally tests on two or more data sets, where we compare characteristics of one data set with characteristics of another.
For such data sets we can formulate hypotheses relating the two, and decide whether the hypothesis can be accepted or rejected. These are the four tests you will learn about in this course, but there are many, many tests out in the literature, specific to specific types of decision situations. I simply advise you for your future to keep in mind that there is a big toolbox of statistical testing: you will learn these four specifically, but you may encounter situations where you need other types of statistical tests, and then you can look them up in the literature. The fundamental principle is exactly the same as what you learn in this course, and the rest is quite easy to pick up from the literature.

Okay, then I will call a break here, before the bell, and please remember to fill out the forms. Thank you. I appreciate that you did not know the ID number for this lecture, but I will fill out those small parts myself tonight, at the same time, of course, as I correct your answers. Okay, that was a so-called Danish joke; don't take me too seriously when I am joking. So let's proceed.

Let's look at a small example. Imagine that we are dealing with a nice bridge like this one. Do you know where this bridge is? Yes, it is very close: this is the Farø Bridge in Denmark, and it is actually quite a nice-looking bridge from my perspective, but then again, I have been involved with this bridge for some years. Now, bridges like this one, and many others, are built by selecting materials; then they are erected, executed, actually built, and they stand in an environment where, due to the second law of thermodynamics, nature will try to increase entropy, which means that the materials will somehow deteriorate. This is also what happens for bridges like this one. The bridge girder here is made of steel, and the cables are also made of steel, but everything else is made of concrete, and when we look at the concrete parts, chloride ions migrate through the concrete cover, the thickness of concrete we have outside the reinforcement bars. These chlorides find their way to the reinforcement. The corrosion-protective environment which is inherent in the concrete protects the reinforcement from corrosion, but due to the chloride ions this protection disappears over time: as the concentration of chlorides increases, corrosion at the reinforcement bar will at some time initiate. This is a problem for many reasons, and we could talk for hours on the subject. One of the immediate effects is that corrosion products build up on the reinforcement and the cross-section of the reinforcement is reduced. Normally that does not matter too much, because the outer layer of reinforcement is not really very important for the strength of the concrete structure. But these corrosion products have a volume many, many times larger than the steel itself, and due to this volume increase the concrete cover will start to crack; after the corrosion has progressed, when a sufficient amount of corrosion products has built up, the concrete cover cracks and falls off.
When we are in this situation, the deterioration mechanism accelerates: the chlorides migrate into the next layer of reinforcement, and within a relatively short number of years the structure deteriorates. We don't want that, so when we design structures like this there are several issues, several assumptions, which are crucial for the design. The first assumption is: what amount of chloride, what concentration of chloride, can we expect to be acting on a structure like that? This is a decisive assumption, because it determines the optimal design of the concrete material itself; we can make different types of concrete which are more or less resistant to chloride exposure, and we can also choose ourselves the thickness of the concrete cover outside the reinforcement bars. (Welcome; there is still one chair somewhere.) So the assumption on the chloride concentration in the environment is important, and you see here the effect of bad design, or rather the effect over a very long period of time: even a very good structure will in the end deteriorate; the question is when, and how much. Now I have to say that this picture has nothing to do with this bridge, and now that we are on the worldwide web and everything, I have to say that very clearly: this bridge here is in an extremely good condition, and the deteriorated one is a completely anonymous structure from another planet.

Now consider as an example that we want to verify whether the chloride concentration on the surface of a concrete structure is in compliance with our design assumptions. That is important, because if it is not the case, then we probably need to do something; maybe we have to protect the structure in some way or another. You will learn how to protect structures against deterioration, but not in this course. Let's assume that the design assumption is that the mean surface chloride concentration is equal to 0.3%, and we can use this as our null hypothesis. Let's assume also that, based on experience, we know that the standard deviation of the surface chloride concentration is equal to 0.04%. I am underlining here that we are dealing with the mean surface chloride concentration, and that we know the standard deviation, so we know the variance. In that case the sample mean value of the chloride concentration can be assumed to follow a normal distribution, and we can formulate a null hypothesis at a given significance level: we decide that we will accept the null hypothesis if the sample mean value is located within a small interval around the design assumption. As I said, we know that this sample mean value is normally distributed, so the testing here looks very similar to what we had in the last lecture, namely the determination of confidence intervals. We can write that the probability that the sample mean value is located within this interval equals 1 − alpha, where alpha is the significance level. Now what we have to do is choose alpha, and as I said, we have to bear in mind the probability of type 1 errors and the possible consequences of type 2 errors.
But let's just say that we choose the significance level to be 10%; that means that in 10% of the cases we would reject the null hypothesis even though it is true. And let's choose 10 experiments, so we have to verify this hypothesis by conducting 10 experiments. Of course we know that the sample average nicely follows a normally distributed random variable. Actually, "nicely" we could debate. Why could we debate that; does anybody have an idea? Maybe the teacher is selling bad medicine right now: I am saying it follows a normal distribution, and I am using the central limit theorem more or less repeatedly in these lectures, but you should be a little skeptical of what I am saying, also once in a while at least. The assumption behind the central limit theorem is that there should be a large number of random variables in the sum, and here I immediately say that 10 is already very large; I would have said the same thing had it been 8, but hardly had it been 5. So be careful once in a while. Ten experiments is okay for this assumption; the sample average follows roughly a normal distribution.

Using the assumption that it follows a normal distribution, which is not a bad assumption but is an assumption, we can immediately evaluate this probability by the standard normal cumulative distribution function: we take the random variable and standardize it, subtracting the mean value and dividing by the standard deviation, and then we take the probability corresponding to the upper interval limit minus the probability corresponding to the lower limit of the interval. When you insert the upper interval limit, the design value plus delta from the operating rule, subtract the mean value, divide by the standard deviation, and set the whole expression equal to 0.9, namely 1 minus the significance level, then we can solve for delta. Delta comes out equal to 0.0208; the 0.9 here is the probability between the two limits in the density function. Did you get that? So delta is 0.0208, and what does that mean? It means that if the sample average lies in the interval between 0.28, which corresponds to 0.3 minus delta, and 0.3 plus delta, which is equal to 0.32, then we can accept our null hypothesis; in any case we have no evidence which supports rejecting it. And we said we wanted 10 experiments. Now imagine we conduct 10 experiments to determine the chloride concentration on the structure and obtain a vector of values, samples taken from real life. Based on these values I can calculate a sample average equal to 0.29. You see that 0.29 is located within the interval we determined, and as a consequence the null hypothesis should be accepted at the 0.1 significance level. Okay, that was a significance test; a code sketch of this test follows below.
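Here is that test redone in code. Only the numbers quoted in the lecture are used; the ten individual observations were shown on the slide and are not reproduced here, so the observed sample average 0.29 is entered directly:

```python
# Two-sided test of the mean with known sigma (the chloride example).
import numpy as np
from scipy.stats import norm

mu0, sigma, n, alpha = 0.30, 0.04, 10, 0.10
delta = norm.ppf(1 - alpha / 2) * sigma / np.sqrt(n)
print(f"delta = {delta:.4f}")                        # ~0.0208, as on the slide

x_bar = 0.29                                         # observed sample average
accept = (mu0 - delta) <= x_bar <= (mu0 + delta)
print("accept H0 at the 0.10 level:", accept)        # True: 0.28 <= 0.29 <= 0.32
```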
You may wonder what happens if we did not know, based on experience, the standard deviation associated with the chloride concentration. Well, in that situation, remembering our toolbox, we can formulate an operating rule requiring a statistic T to be located between minus delta and plus delta, where T is now not a normally distributed but a t-distributed random variable. As before, it is a standard normally distributed random variable divided by the sample standard deviation, the square root of the sample variance; the sample variance follows a chi-square distribution, so the sample standard deviation follows a chi-distribution, and dividing by the square root of n we immediately see that T follows a t-distribution with n − 1 degrees of freedom. If we now write up the operating rule that this statistic should be located within minus delta and plus delta, use the fact that T is t-distributed, and keep the probability at 1 − 0.1 = 0.9, then we can look delta up in a table for the t-distribution; there are tables for the t-distribution, the F-distribution and so on at the end of the lecture notes, and you will find delta equal to 1.83.

Okay, now what can we do with the same sample as before, our 10 observations of the chloride concentration? What we need to do here is to calculate the sample standard deviation: not only the sample mean value but also this quantity is now an outcome of the sample. We calculate it according to the usual scheme, but dividing by n − 1 instead of n, and we get a sample standard deviation equal to 0.025. Into the t-statistic I now insert the calculated value of the standard deviation, the calculated sample average and the postulated mean value, and I get an expression equal to −1.27. The delta we calculated was 1.83, so −1.27 is located within plus/minus delta, and what does that mean? It means that the null hypothesis cannot be rejected at the alpha significance level, so we will accept it. Thus both in the case where we knew the standard deviation and in the case where we did not, based on the data we had, we could not reject the null hypothesis. A code version of this t-test follows below.
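And the unknown-variance version, again using only the quoted numbers (sample average 0.29, sample standard deviation 0.025):

```python
# Two-sided test of the mean with unknown sigma (t-statistic, n - 1 dof).
import numpy as np
from scipy.stats import t

mu0, n, alpha = 0.30, 10, 0.10
x_bar, s = 0.29, 0.025                                # observed sample statistics

delta = t.ppf(1 - alpha / 2, df=n - 1)                # ~1.83, the table value
T = (x_bar - mu0) / (s / np.sqrt(n))                  # ~ -1.27
print(f"delta = {delta:.2f}, T = {T:.2f}")
print("accept H0 at the 0.10 level:", abs(T) <= delta)
```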
This is the way these tests are performed: you formulate the hypothesis, you formulate the operating rule, you choose the alpha significance level and the value of the acceptance criterion delta which is part of the operating rule; you do your testing, you evaluate your sample characteristics, the sample mean and the sample standard deviation, depending on the information you have; you put them into the operating rule, and you check whether you should accept or reject at the alpha significance level. That is the scheme, and don't be too worried: you will get training in using these concepts in the exercise tutorials. The scheme is always the same; the difficulty, if there is one, is to recognize what type of statistic to use when formulating the test, and that depends completely on the information you have. In the first case we knew the value of the standard deviation; for that reason we could say immediately that the statistic followed a normal distribution, and we could use that information to develop our acceptance criterion, our operating rule. In the second case we did not know the standard deviation, and it turned out that the statistic became t-distributed, which we used when calculating the acceptance criterion. There are many other examples where the relevant statistic follows yet another distribution, and I will illustrate that for you with testing of the variance.

Let's consider an example where we want to test the variance of the fatigue lives of welded joints. This variance is extremely important when we are designing steel structures with welded joints; that is clear, as it will influence the life of the structure. What we can do is improve the detailing of the joints and the quality of the welds, but that is not so easy, and it is also not very easy to validate how good the quality of a certain weld type is. One of the means is weld surface treatment. If you look at this weld here, you actually see a very big fatigue crack located at the foot of the weld; looking at the close-up of the weld, which is in an offshore platform, you see that its surface is not very nice and that there are a lot of geometrical imperfections, for which there are of course many good reasons. The problem is that, due to these geometrical imperfections, there will be singularities in the stresses induced by the loads on the structural members, and such singularities correspond, in principle, to locations with infinitely high stresses. A small piece of wisdom for you: no material can sustain infinite stresses; it either cracks immediately or it yields. That is of course also what happens in welds like this: we have development of cracks, or local yielding. At any rate, due to these geometrical imperfections we have high stresses, not infinite but high, and that will reduce the fatigue lives. The only way to improve the situation is to treat the surface to reduce the stresses: one obvious way is to smoothen the surface of the weld, so that we do not have the geometrical imperfections; another option is to impose stresses which counteract cracking in the weld, and for that there are also many techniques. In any case, such experiments are very expensive, so only a few data will be available to verify the effect of weld surface treatment; again we are in the engineering dilemma of having only a few data and wanting to test the hypothesis of a beneficial effect of a possible weld surface treatment.

In that case we may postulate a null hypothesis, namely that the variance of the fatigue lives with the surface treatment is smaller than the variance with the untreated weld surface: the new variance is smaller than the old one. The operating rule is then formulated in terms of the sample variance and an acceptance criterion delta, which we determine from a probability equal to 1 minus the significance level, and in that situation we use the fact that the suitably normalized sample variance is a chi-square distributed random variable, now not a normal or a t but a chi-square distributed random variable with the appropriate number of degrees of freedom. We will learn how to handle this in the tutorial exercises; a generic sketch follows below.
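A generic sketch of such a one-sided variance test; all numbers are invented, since the weld example quotes none, and I use the standard chi-square form of the statistic, which may differ in detail from the notes:

```python
# One-sided variance test: H0 says the variance is at most sigma0^2.
# Under H0, (n - 1) * S^2 / sigma0^2 follows a chi-square distribution with n - 1 dof.
import numpy as np
from scipy.stats import chi2

sigma0_sq, n, alpha = 4.0, 12, 0.05      # assumed values for illustration
s_sq = 5.1                               # assumed sample variance (1/(n - 1) estimator)

statistic = (n - 1) * s_sq / sigma0_sq
delta = chi2.ppf(1 - alpha, df=n - 1)    # reject H0 if the statistic exceeds delta
print(f"statistic = {statistic:.2f}, delta = {delta:.2f}")
print("reject H0 at the 0.05 level:", statistic > delta)
```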
Another situation is where we are dealing with two or more data sets which may not be very large individually and may not be of the same size, so there may be more observations in one set than in another, and we would like to know how the data compare in terms of mean values, variances and also correlation. One typical decision situation with individual data sets is that they may be related, perhaps so closely related that they in principle correspond to the same type of observations, and because of the statistical uncertainty associated with the small number of data in each set, it would be interesting to see whether we can justify putting the data sets together into one big data set. To justify such an assumption, it is relevant to check whether the mean values underlying the two data sets can be assumed to be the same, based on the sample mean values from the two data sets, and the same concerns the variances: could the variances underlying the two data sets be the same? In other cases we just want to check whether the mean values and variances are significantly different, or perhaps test whether there is zero correlation between the data sets. These tests we can also conduct.

Here we assume that we have two data sets which are realizations of random variables X and Y, assumed to follow normal distributions with given mean values and variances. We can then form a statistic from the difference between, for instance, the two sample mean values; even though I use the letter T, this statistic is not t-distributed but of course normally distributed. Its mean value is readily calculated as the difference between the two mean values, and its variances are given here, of course relative to the numbers of data in the two sets. Based on that we can formulate operating rules; using that the test statistic is normally distributed, we can easily calculate the acceptance criterion delta, do our sampling, and test whether the mean values are different or not.

If we are testing for equal variances, then we end up comparing the variance of one data set with the variance of the other: we take the ratio of the two sample variances, and what we are dealing with is then a chi-square distributed random variable divided by a chi-square distributed random variable, where the numbers of degrees of freedom of the two are different. We therefore end up with an F-distributed test statistic with parameters k and l, k and l being the numbers of degrees of freedom of the two chi-square distributed random variables. The null hypothesis would be that the variances are equal; the operating rule for accepting the null hypothesis is that the F-statistic is smaller than or equal to delta, and delta can be determined from the corresponding probability. What we do is look up the F-distribution in the table at the back of the lecture notes, go in with the probability 1 − alpha, and find the delta this corresponds to; then of course we have to do the testing, compare with the acceptance criterion, and see whether we reject or accept the hypothesis. A small sketch follows below. With that, we have been through, in principle, all the different tests and statistics we wanted to look at: the normal distribution, the t-distribution, the F-distribution and the chi-square distribution.
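A sketch of the F-test with two made-up data sets (everything here is assumed for illustration):

```python
# Two-sample F-test for equal variances: under H0 the ratio of the two sample
# variances follows an F-distribution with (k, l) = (n1 - 1, n2 - 1) dof.
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(4)
x = rng.normal(10.0, 2.0, size=15)       # assumed data set 1
y = rng.normal(10.0, 2.0, size=20)       # assumed data set 2

F = x.var(ddof=1) / y.var(ddof=1)        # ratio of unbiased sample variances
delta = f.ppf(1 - 0.05, dfn=len(x) - 1, dfd=len(y) - 1)
print(f"F = {F:.2f}, delta = {delta:.2f}, accept H0: {F <= delta}")
```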
A few words I want to close this topic with. Tests for statistical significance can be formulated in different ways for different types of problems, and this constitutes a fundamental issue in testing: we must be careful not to overvalue the information we get out of significance testing, because we can formulate hypotheses in different ways and would in principle get different results. We can also choose different significance levels; there is not one single clear choice of alpha. So if we cannot reject a hypothesis at a certain value of the significance level, you can be sure that if we change the significance level, in the end we can find a value at which we can reject it. That is very important when we are communicating, because some people will focus more on the value of the significance level in their understanding of the test results, while others will focus more on whether the test is accepted or rejected; some people, as soon as they hear that the test shows the hypothesis should be rejected, will not even listen to what significance level you made the test at. So here we are dealing with a problem of communication, really, and what you can do is simply and clearly document and state the assumptions underlying the test. The different choices also affect the probabilities of committing type 1 and type 2 errors, and as I already mentioned, this can be associated with significant economic consequences; as discussed, the choice of alpha can in principle be considered a decision problem.

A small exercise now: can the uncertainty associated with the sample descriptor "sample mean" be reduced by significance testing, by more samples, or by engineering experience? There is no formula in the script which will help you on this one, I can promise you. Okay, let me see your color sheets. Come on, come on, I need many more; I know a few are still missing. I will not mention names, but I might point my finger. Okay, I see green, and I see some red and yellow in the mixture. In this case we are looking at how the uncertainty associated with the sample descriptors, for instance the standard deviation of the sample mean value, depends on the number of samples: the more samples we have, the smaller the standard deviation of, for instance, the sample mean value; the uncertainty goes down. Significance testing? The testing itself is not going to reduce anything. Engineering experience may influence your mindset, but it will not reduce the uncertainty. So the answer is: more samples.

Now, coming back to the general scheme, we have to look at the choice of distribution family, and this is quite a straightforward thing which I will do a little fast, so sit back, tighten your seat belts and be ready. We have discussed earlier, and I will go through a little of this in the next lecture as well, that what we need is a formalized way of choosing the family of the distribution type we are dealing with, basing this choice on observations but also on physical arguments. Information is data; physical arguments are engineering understanding. I think it was Einstein who said that data is only information, experience is knowledge; in the end we have to combine the two, and of course data is the basis for improving experience. A formalized approach is to postulate a hypothesis for the distribution family, to estimate the parameters of the postulated probability distribution, and then to perform a statistical test to try to reject or verify the hypothesis; we will look at that in the coming lecture. But we will also look now at a more heuristic way of choosing the family of the distribution function. The problem, again, is always that we do not have too much data available; therefore we need to use common sense.
First of all, any given physical reason for selecting one particular distribution function should be utilized. We have used the central limit theorem many times, but we have also looked at the different situations leading to extreme value distributions of types one, two and three, the Gumbel, Fréchet and Weibull distributions; at the distribution of times until the occurrence of rare events, with the events occurring according to a Poisson distribution and the times following an exponential distribution; and at waiting times following gamma distributions. So there are many considerations which can give us support, and we have also looked at the statistical distributions arising from the functional expressions we are dealing with, leading to the chi-square, chi, t and F distributions. It is not as though we are completely without a clue regarding the choice of probability distribution family, but sometimes we would like a little more support: we can postulate something and see whether the data are in gross contradiction with our assumption. One of the ways to do this is to use probability paper.

Probability paper is constructed in such a way that the cumulative distribution function, when plotted on the paper for the right distribution family, gives a straight line rather than one of these S-shaped curves. It is a non-linear transformation of the y-axis of the usual cumulative distribution function plots you have seen: we simply stretch or compress the axis non-linearly so that the S-shaped curve stretches out to become one straight line. For the normal distribution such a transformation is easily done, as here: we take the S-shape and transform the axis non-linearly, whereby the S-curve becomes a straight line. The idea is that if we make a quantile plot of our data on such a paper, and the plotted points follow a straight line, that is evidence pointing in the direction that this particular type of distribution is not a bad choice. This is just an example of how to construct such a paper in the case of the normal probability distribution function: we take the S-shape and select values; here, in the central range, the distribution curve is more or less linear, but out here, corresponding to 0.999, we stretch all the way up to the straight line and plot the value on a transformed y-axis. You can buy all these papers, you can find them on the internet, you can construct them yourself; it doesn't matter. I just wanted to give you the idea of how the non-linear transformation is performed: the idea behind the paper is simply a transformation of the y-axis.

If you look at this set of data, these are concrete compression strength results, ordered, I believe, yes, in increasing order of the values. We can then calculate the cumulative distribution function based on the observations, as the number of the observation divided by the total number of observations plus one. And if you plot that on normal probability paper, you get points which are located around a straight line like this, and that would support the choice of a normal distribution to model the concrete compression strength. A sketch of this construction in code follows below.
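Here is a sketch of that construction done numerically, i.e. the non-linear axis transformation in code; the data vector is invented, since the actual concrete strengths are on the slide only:

```python
# Normal probability paper as a quantile plot: order the data, assign plotting
# positions i / (N + 1), and transform them with the inverse standard normal CDF.
# Normally distributed data then fall approximately on a straight line.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
data = np.sort(rng.normal(35.0, 4.0, size=20))    # assumed strengths [MPa]
N = len(data)
p = np.arange(1, N + 1) / (N + 1)                 # empirical CDF values i / (N + 1)
z = norm.ppf(p)                                   # the stretched (transformed) y-axis

# On paper one plots (data, z); the straight-line fit z = (x - mu) / sigma
# has slope 1/sigma and intercept -mu/sigma.
slope, intercept = np.polyfit(data, z, 1)
print(f"implied mean ~ {-intercept / slope:.1f} MPa, std ~ {1 / slope:.1f} MPa")
```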
But there is one very important observation here: most of the data are of course centered around the mean value, and this will always be the case with observations. In many applications, however, the goodness of the choice of distribution should be judged on the behavior in the tails. Here we are in the upper tail of the distribution, values of the concrete compression strength above 40 megapascals, and in this area here we are in the lower tail. Now, if I want to make a probabilistic model describing the capacity of a concrete structural element, then I am mostly worried about possible low values of this uncertain phenomenon, and I want my model to be good in the lower tail. That means that when I evaluate whether these points lie on a straight line, I want to focus on the lower-end values, and here I see some deviations; not big, but they are there. Having said that, that was all for today. I wish you a nice afternoon; thank you, and good to see you again.