Welcome to Dealing with Materials Data. In this course we learn about the collection, analysis and interpretation of data from materials science and engineering. We have been looking at probability distributions using R; we have discussed some discrete distributions, and now we are looking at the normal distribution. In this session we are going to learn about the error function and how the normal distribution is related to it.
Just to remind ourselves, the normal distribution is given by this expression; it is also known as the Gauss function. Here sigma is the standard deviation and mu is the mean of the normal distribution. So it is exponential of minus (x minus mu) whole squared by 2 sigma squared, and there is a normalization factor in front, which is 1 by sigma root 2 pi. It is possible to carry out a transformation of variable from x to z that makes the mean 0 and the standard deviation 1. We take (x minus mu) by sigma as z, and then we get the standard normal distribution, which has the normalization factor 1 by root 2 pi and is just exponential of minus z squared by 2. We have also learnt that random errors or noise follow the normal distribution; we saw the example of electrical conductivity in electrolytic tough pitch copper, where repeated measurements give a normal distribution because of random errors.
Now we want to learn about the error function, which is related to the cumulative distribution function of the standard normal variable, capital F of z. By definition, the error function of x is 1 by root pi times the integral from minus x to plus x of exponential minus t squared dt. Because the integrand is symmetric about 0, we can integrate from 0 to x and multiply by 2. So the error function of x is 2 by root pi times the integral from 0 to x of exponential minus t squared dt.
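The expressions read out above can be written compactly as follows. The symbols f for the normal density and phi for the standard normal density are labels chosen here for illustration; the rest follows the spoken definitions.

```latex
\begin{align}
f(x) &= \frac{1}{\sigma\sqrt{2\pi}}
        \exp\!\left[-\frac{(x-\mu)^2}{2\sigma^2}\right],
  & z &= \frac{x-\mu}{\sigma}, \\
\varphi(z) &= \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{z^2}{2}\right), \\
\operatorname{erf}(x) &= \frac{1}{\sqrt{\pi}} \int_{-x}^{x} e^{-t^2}\,dt
  = \frac{2}{\sqrt{\pi}} \int_{0}^{x} e^{-t^2}\,dt .
\end{align}
```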
So this is the definition that we are going to use. 1 minus the error function of x is known as the complementary error function, and it is denoted erfc of x. For x less than 0, the cumulative distribution function of the standard normal variable is half erfc of mod x by root 2, and for x greater than or equal to 0 it is half of 1 plus the error function of x by root 2. Since the error function is odd, both cases amount to the same thing: F of x is half of 1 plus the error function of x by root 2. We are going to plot these functions and see how they are related. In other words, if we take a standard normal variable and get its cumulative distribution function, we can calculate the error function from it: 2 times F of x minus 1 gives the error function of x by root 2. That is what we are going to look at, and we will also learn how the error function is related to the diffusion problem.
To understand the diffusion problem: the diffusion flux and the concentration gradient are related by a constitutive law, known as Fick's first law of diffusion. It says that the diffusion flux is proportional to the concentration gradient. The proportionality constant is called the diffusivity, and the negative sign indicates that the diffusional fluxes are such that concentration gradients get evened out over a long period of time. Now if you add the law of conservation of mass to Fick's first law, you get the so-called Fick's second law. So it is a conservation law; there is really no new law called Fick's second law. Fick's first law is a constitutive law, and the second law is just the combination of the constitutive law with the law of conservation of mass. That gives you dou C by dou t, the rate of change of concentration with time, as del dot D del C.
So D del C is the term that comes from the first law, which is just the concentration gradient and the proportionality constant D, and there is another del operator acting on it. At this point it is useful to note that Fourier's law of heat conduction is also a constitutive law, and if you add the law of conservation of energy to it you get a very similar equation, the equation for heat conduction. It is given in terms of temperature, and instead of D you have alpha, the thermal diffusivity. So dou T by dou t equals alpha del squared T is a common form of the heat conduction equation. Because of the similarity between the two forms, the solutions for both are mathematically very similar; it is just the interpretation that varies.
However, in this part of the session we are going to stick to the diffusion equation, so we concentrate on dou C by dou t equals del dot D del C. If you assume that D is a constant you can pull it out, and it becomes D del squared C, which has the same form as alpha del squared T. So they are very similar.
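Written out, the two constitutive laws and the resulting balance equations look like this. The symbols J (diffusion flux), q (heat flux), k (thermal conductivity), rho (density) and c_p (specific heat) are not named in the lecture and are introduced here only as the usual textbook notation.

```latex
\begin{align}
\mathbf{J} &= -D\,\nabla C
  && \text{(Fick's first law)} \\
\frac{\partial C}{\partial t} &= -\nabla\cdot\mathbf{J}
  = \nabla\cdot\left(D\,\nabla C\right)
  && \text{(first law + conservation of mass)} \\
\frac{\partial C}{\partial t} &= D\,\nabla^{2} C
  && \text{(constant } D\text{)} \\
\mathbf{q} &= -k\,\nabla T, \qquad
\frac{\partial T}{\partial t} = \alpha\,\nabla^{2} T,
  \quad \alpha = \frac{k}{\rho\,c_p}
  && \text{(Fourier's law + conservation of energy)}
\end{align}
```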
So let us look at the solution to Fick's second law in what is known as a semi-infinite bar. The boundary conditions are these: at x equal to 0 there is a concentration called Cs, to say that it is a surface concentration, and at x equal to infinity, that is, far away from the surface, there is a concentration C naught. In this case the solution gives concentration as a function of position and time, x and t, and they always appear in the combination x by root Dt; root Dt is known as the diffusion distance. The solution is the surface concentration minus the difference between the surface concentration and the far-field concentration, times the error function of x by 2 root Dt.
This solution of Fick's second law is relevant for cases such as doping and carburization. In these cases you can think of a material which has some given concentration, and at the surface we raise the concentration, so that diffusion takes place into the material. As time goes by the concentration in the material keeps changing, and that is what this equation describes. This solution and the next couple of solutions that I am going to show are all described in Porter and Easterling, Phase Transformations in Metals and Alloys.
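As a worked check, here is the semi-infinite bar solution in symbols, with the two boundary conditions recovered from the limiting values of the error function. The last line is an added observation, not from the lecture: it uses the standard value erf(0.5) ≈ 0.52.

```latex
\begin{align}
C(x,t) &= C_s - \left(C_s - C_0\right)
          \operatorname{erf}\!\left(\frac{x}{2\sqrt{Dt}}\right) \\
x = 0:&\quad \operatorname{erf}(0) = 0
        \;\Rightarrow\; C(0,t) = C_s \\
x \to \infty:&\quad \operatorname{erf}(\infty) = 1
        \;\Rightarrow\; C(\infty,t) = C_0 \\
x = \sqrt{Dt}:&\quad \operatorname{erf}(0.5) \approx 0.52
        \;\Rightarrow\; C \approx \tfrac{1}{2}\left(C_s + C_0\right)
\end{align}
```

So at one diffusion distance from the surface the concentration is roughly midway between the surface and far-field values, whatever the time.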
The next one is also a semi-infinite bar solution, but it is slightly different: instead of the surface concentration being some specific value, it is kept at 0, while the far-field concentration is at some value, say C0. This is relevant for things like decarburization. If you have a material that has some concentration of one of the species, and beyond the surface there is none of that species, then there is a concentration gradient, so the species starts leaving the material. In that case the solution is C0 times the error function of x by 2 root Dt.
You can also think of another solution, which is two semi-infinite bars attached to each other. The boundary conditions are that at x equal to 0 the composition is 0.5 times (C1 plus C2); it should be a plus sign here. On the left side, at x equal to minus infinity, the far-field composition is C1, and on the right side, at x equal to plus infinity, the far-field composition is C2. In other words, we take two bars, one with far-field composition C2 and the other with far-field composition C1, and we put them together at time t equal to 0. At the interface where they are put together the concentration is the average of the two, and the solution is (C1 plus C2) by 2, minus (C1 minus C2) by 2 times the error function of x by 2 root Dt. This is relevant, for example, when you take two different compositions of a particular material, weld them together, and ask how the composition changes as a function of time.
You can also think of other cases. You can make C1 and C2 both 0 and imagine putting a small amount of material at the centre and allowing it to diffuse; that solution is relevant for radioactive tracers and for measuring diffusivity, for example.
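In symbols, the decarburization and two-bar solutions read as below. The last line rewrites the error function in terms of the standard normal cumulative distribution function, written here as Phi (the lecture calls it capital F), using 2 Phi(z) - 1 = erf(z / root 2).

```latex
\begin{align}
\text{surface held at } 0:\quad
C(x,t) &= C_0\,\operatorname{erf}\!\left(\frac{x}{2\sqrt{Dt}}\right) \\
\text{two bars joined at } x=0:\quad
C(x,t) &= \frac{C_1 + C_2}{2}
        - \frac{C_1 - C_2}{2}\,
          \operatorname{erf}\!\left(\frac{x}{2\sqrt{Dt}}\right) \\
C(-\infty,t) &= C_1, \qquad
C(0,t) = \frac{C_1 + C_2}{2}, \qquad
C(+\infty,t) = C_2 \\
\operatorname{erf}\!\left(\frac{x}{2\sqrt{Dt}}\right)
  &= 2\,\Phi\!\left(\frac{x}{\sqrt{2Dt}}\right) - 1
\end{align}
```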
All these things are described briefly in Porter and Easterling, Phase Transformations in Metals and Alloys. There are also other classic textbooks on diffusion, such as Shewmon's diffusion textbook, and there are mathematical textbooks which describe the solution of the diffusion equation for different boundary conditions. Because the diffusion equation is a partial differential equation, the solution depends on the boundary conditions you assume. In other words, the complete description of the problem includes the boundary conditions: if you give the equation and the relevant boundary conditions, you can get solutions, and they are different for different boundary conditions. So here are three cases where the error function turns out to be the solution, and as we have seen, the error function is related to the cumulative distribution function of the standard normal variable. So you can use R to plot the error function solution for these three scenarios, and that is what we are going to do in this session.
Just to summarize: we have been looking at the normal, or standard normal, distribution, and we have noticed that the distribution of random errors or noise follows it. We have also seen that the cumulative distribution function of the standard normal distribution is related to the error function, which happens to be a solution of the partial differential equations that describe diffusion, heat conduction and so on. So it is relevant from that point of view also. There is another reason why the normal distribution is common; in fact, the word normal itself says that this is expected to be the most common distribution. Irrespective of the distribution from which you sample, random fluctuations which are the result of many independent random components tend to be distributed normally.
There are exceptions: highly skewed distributions and distributions with no finite variance do not follow this. But in general you will find that the normal distribution is very common, and that is why in many situations we use the normal or standard normal distribution as a sort of benchmark and describe everything else with respect to it. We will see a couple of examples of this. Many a time we are interested in knowing whether a particular data set that we have received follows the normal distribution or not. How to look at those things is something we will take up in the following sessions.
In this session, let us now go back to the diffusion solution and try to use R to get it. As usual we start R; this is version 3.6.1, which is called Action of the Toes. We need to look at the working directory and make sure that we are in the right directory. The first thing to do is to plot the error function. We are going to plot it between minus 5 and plus 5. You can see that pnorm(z, 0, 1) is nothing but the cumulative distribution function of the standard normal variable; 2 times that minus 1 is the error function of z by root 2, as we have seen, so it traces the error function with a rescaled argument, and that is what we are going to use here. If you do this you see that the function goes to minus 1 as z goes to minus infinity and to plus 1 as z goes to plus infinity. That is because the error function of minus x is nothing but minus the error function of x, so about 0 the left half is just the negative of the right half, and that is why the function looks like this. If you go to Porter and Easterling, for example, you will see the error function plotted, and here is the plot that gives the same thing for us.
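A minimal sketch of this step follows; it is an illustration under stated assumptions, not the code used in the session. The helper names `erf` and `erf_by_integration` are introduced here. Because 2 * pnorm(z) - 1 equals erf(z / root 2), the helper evaluates pnorm at x times root 2 so that it returns erf(x) itself, and it is checked against direct numerical integration of the definition.

```r
# Error function built from the standard normal CDF.
# 2 * pnorm(z) - 1 equals erf(z / sqrt(2)), so erf(x) needs z = x * sqrt(2).
erf <- function(x) 2 * pnorm(x * sqrt(2), mean = 0, sd = 1) - 1

# Independent check: integrate the definition (2 / sqrt(pi)) * int_0^x exp(-t^2) dt.
erf_by_integration <- function(x) {
  2 / sqrt(pi) * integrate(function(t) exp(-t^2), lower = 0, upper = x)$value
}

x_check  <- c(0.1, 0.5, 1, 2)
max_diff <- max(abs(erf(x_check) - sapply(x_check, erf_by_integration)))
print(max_diff)  # only numerical integration error should remain

# A small table of values, of the kind usually looked up in textbooks.
print(data.frame(x = x_check, erf = round(erf(x_check), 4)))

# The curve described in the session, 2 * pnorm(z, 0, 1) - 1 on [-5, 5] (solid),
# with erf(z) itself overlaid (dashed) to show the effect of the sqrt(2) scaling.
z <- seq(-5, 5, by = 0.01)
plot(z, 2 * pnorm(z, 0, 1) - 1, type = "l", ylab = "value",
     main = "2 * pnorm(z) - 1 (solid) and erf(z) (dashed)")
lines(z, erf(z), lty = 2)
abline(h = c(-1, 0, 1), v = 0, col = "grey")
```

Both curves run from minus 1 to plus 1 and are odd about 0; the dashed one is simply steeper by the factor root 2 in the argument.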
The next step is to get the semi-infinite bar solution. For that we have to assume certain values for the diffusivity and so on, so let me first get the full code. What we are trying to do is to reproduce some of the figures that are there in Porter and Easterling. Let us assume that the surface concentration is 1.4, the far-field composition is 0.1, and the diffusivity is 4e-11 meter squared per second. Starting from the surface, we are going to look at diffusion up to a distance of 0.5 millimeter, and we are going to consider times of 10 seconds, 50 seconds and 100 seconds. For these numbers, Porter and Easterling tells you that it takes about 1000 seconds for the composition to penetrate a distance of about 0.2 millimeter (root Dt for 1000 seconds with this diffusivity is 0.2 millimeter). But we are going to plot the early stages: 10, 50 and 100 seconds.
So z1 is x by 2 square root of Dt, and this is the parameter that goes into the error function solution that we saw. The solution itself is given in terms of the error function, and this is its argument. For that parameter we have to calculate the cumulative distribution function of the standard normal distribution, multiply by 2 and subtract 1; that is the error function. Strictly, since 2 times F of z minus 1 is the error function of z by root 2, pnorm has to be evaluated at root 2 times the parameter to get the error function of the parameter itself. Because the parameter changes with every x and t, we calculate it for the given x and t, and from that parameter we find the solution. Here x is a sequence.
So x goes from 0 to 0.5 millimeter. Then for time t equal to 10 seconds, for example, we get the parameter, and from the parameter we get the solution, which we are going to plot. Remember that the solution is the surface concentration minus the difference between the surface and far-field concentrations multiplied by the error function. So this is the first solution, for 10 seconds; we do a similar thing for 50 seconds and 100 seconds, and the 50 second and 100 second solutions are going to be marked in blue and red.
After we do that I also want to plot certain lines. This is 0, that is, the surface, which we are going to mark. I am also going to draw lines at root Dt for the different times. If you look at the figure in Porter and Easterling, it shows that the distance from the surface to which penetration happens is proportional to the square root of Dt. That is what these lines, together with the horizontal line, are going to show us.
So here is the solution. This is the surface, and this is the 10 second solution. You can see that the penetration is proportional to root Dt, because this is the root Dt value that we have plotted for 10 seconds, this is for 50 and this is for 100. As time proceeds, the diffusing species penetrates deeper into the material, because there is a concentration gradient: this is 1.4 and this is 0.1. So the species keeps accumulating in the material, so that the concentration gradients get evened out. That is where we are moving towards, and that is what is shown; this is a figure that is there in Porter and Easterling, for example.
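The carburization profiles described above can be sketched in R as follows. The parameter values come from the session; the names `erf`, `carburize`, `times` and `cols` are chosen here, and the placement of the horizontal marker at the midpoint (Cs + C0) / 2 is an assumption about what the figure shows. The helper evaluates pnorm at root 2 times the argument so that it returns erf itself.

```r
# erf from the standard normal CDF: erf(u) = 2 * pnorm(u * sqrt(2)) - 1
erf <- function(u) 2 * pnorm(u * sqrt(2)) - 1

Cs    <- 1.4      # surface concentration
C0    <- 0.1      # far-field concentration
D     <- 4e-11    # diffusivity, m^2/s
x     <- seq(0, 0.5e-3, length.out = 201)   # depth below the surface, m
times <- c(10, 50, 100)                     # s
cols  <- c("black", "blue", "red")

# Semi-infinite bar solution: C = Cs - (Cs - C0) * erf(x / (2 * sqrt(D * t)))
carburize <- function(x, t) Cs - (Cs - C0) * erf(x / (2 * sqrt(D * t)))

plot(x * 1e3, carburize(x, times[1]), type = "l", col = cols[1],
     xlab = "distance from surface (mm)", ylab = "concentration",
     ylim = c(0, Cs))
for (i in 2:3) lines(x * 1e3, carburize(x, times[i]), col = cols[i])

abline(v = 0, lty = 3)                                  # the surface
abline(v = sqrt(D * times) * 1e3, col = cols, lty = 2)  # diffusion distances
abline(h = (Cs + C0) / 2, lty = 3)                      # assumed midpoint marker
legend("topright", legend = paste(times, "s"), col = cols, lty = 1)

# Concentration at x = sqrt(D * t): the same value for every t
at_diffusion_distance <- carburize(sqrt(D * times), times)
print(at_diffusion_distance)
```

Each curve crosses roughly the midpoint concentration at its own root Dt, which is one way to see that the penetration depth scales with the square root of time.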
So you can generate the same solution using R, and of course we can also generate the other solutions. For the second solution we want to keep the surface concentration at 0, with some finite far-field composition, and see what happens to the solution, so let us do that. We are doing the same thing: the surface concentration is 0, the far-field composition is, let us say, 0.6, the diffusivity is the same, and we are again looking at 0.5 millimeter and taking 10 seconds, 50 seconds and 100 seconds. We are going to use the same formula, because when you substitute Cs equal to 0 it reduces to C0 times z11, the error function value, which is the solution that we saw. Because this is also a semi-infinite bar solution, it should be the same solution, and again we have drawn certain vertical and horizontal lines to show what is happening.
So let us plot. In this case the material has a high concentration, 0.6, and at the surface the concentration is 0, so there is a concentration gradient between the inside of the material and the surface. The concentration starts changing: the carbon atoms start leaving from the surface. This is known as decarburization, and it leads to this kind of profile. Here also we see that as time goes on the extent of decarburization increases; the concentration in the material below the surface keeps falling, and how far in it falls depends on the square root of Dt, which is the diffusion distance. That is why we call it the diffusion distance: dimensionally it is a distance, and it tells you, for a given time and a given diffusivity, the depth to which considerable diffusion would have taken place. That is what is shown by these lines: as time goes by, 10, 50, 100 seconds, the depth to which decarburization takes place keeps increasing.
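A sketch of the decarburization case, under the same caveats as before: the values (C0 = 0.6, D = 4e-11 m^2/s, 0.5 mm, 10/50/100 s) are from the session, while the names `erf`, `decarburize` and `profiles`, and the use of `matplot` to draw all three curves at once, are choices made here.

```r
# erf from the standard normal CDF: erf(u) = 2 * pnorm(u * sqrt(2)) - 1
erf <- function(u) 2 * pnorm(u * sqrt(2)) - 1

C0    <- 0.6      # far-field concentration (surface is held at 0)
D     <- 4e-11    # diffusivity, m^2/s
x     <- seq(0, 0.5e-3, length.out = 201)   # depth below the surface, m
times <- c(10, 50, 100)                     # s
cols  <- c("black", "blue", "red")

# Setting Cs = 0 in the semi-infinite bar solution leaves C0 * erf(...)
decarburize <- function(x, t) C0 * erf(x / (2 * sqrt(D * t)))

# One column per time; matplot draws the columns as separate curves
profiles <- sapply(times, function(t) decarburize(x, t))
matplot(x * 1e3, profiles, type = "l", lty = 1, col = cols,
        xlab = "distance from surface (mm)", ylab = "concentration")
abline(h = C0, v = 0, lty = 3)                          # far-field level, surface
abline(v = sqrt(D * times) * 1e3, col = cols, lty = 2)  # diffusion distances
legend("bottomright", legend = paste(times, "s"), col = cols, lty = 1)

# Diffusion distances in mm: about 0.020, 0.045 and 0.063
diffusion_distance_mm <- sqrt(D * times) * 1e3
print(round(diffusion_distance_mm, 4))
```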
The last solution that we want to look at is the two semi-infinite bars put together, with far-field compositions given by C1 and C2. Let us assume again that C1 is 1.4 and C2 is 0.1, so we are taking two steels, with composition 1.4 on one side and 0.1 on the other, and we are putting them together. The diffusivity is the same, 4e-11 meter squared per second. Now the origin is where we have put the two materials together: on the negative side you will see the material with composition 1.4, on the positive side the material with composition 0.1, and at 0 the composition will be the average of the two, 1.5 divided by 2, which is 0.75. Again we are going to look at solutions for times 10, 50 and 100 seconds.
For the solution you just have to evaluate the parameter x by 2 square root of Dt, and for that parameter you have to evaluate the error function. The error function is given by the cumulative distribution function of the standard normal distribution: 2 times that minus 1 is the error function. Remember that the error function is an integral; the variable inside the integral gets integrated out, so what remains is the limit. The integral runs from 0 to x by 2 square root of Dt, so that is what gets substituted, and that is why the solution is a function of this value. Then we want to plot the solution. It is 0.5 times (C1 plus C2) minus 0.5 times (C1 minus C2) times the error function, for 10 seconds, 50 seconds and 100 seconds. As usual we draw some lines to show, for example, where the interface is at which the two bars are put together, and what the far-field compositions are on either side, C1 on the left and C2 on the right. So let us do this.
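A sketch of this two-bar calculation in R. The compositions, diffusivity and times are those given above; the plotting range of plus or minus 0.5 mm is an assumption (the session does not state it), and the names `erf` and `couple` are introduced here for illustration.

```r
# erf from the standard normal CDF: erf(u) = 2 * pnorm(u * sqrt(2)) - 1
erf <- function(u) 2 * pnorm(u * sqrt(2)) - 1

C1    <- 1.4      # far-field composition on the left  (x -> -Inf)
C2    <- 0.1      # far-field composition on the right (x -> +Inf)
D     <- 4e-11    # diffusivity, m^2/s
x     <- seq(-0.5e-3, 0.5e-3, length.out = 401)  # assumed range, m
times <- c(10, 50, 100)                          # s
cols  <- c("black", "blue", "red")

# Two semi-infinite bars joined at x = 0 at time t = 0
couple <- function(x, t) {
  0.5 * (C1 + C2) - 0.5 * (C1 - C2) * erf(x / (2 * sqrt(D * t)))
}

plot(x * 1e3, couple(x, times[1]), type = "l", col = cols[1],
     xlab = "distance from interface (mm)", ylab = "concentration",
     ylim = c(0, 1.5))
for (i in 2:3) lines(x * 1e3, couple(x, times[i]), col = cols[i])

# Initial step profile, the interface, and the far-field and mean levels
lines(c(-0.5, 0, 0, 0.5), c(C1, C1, C2, C2), lty = 3)
abline(v = 0, h = c(C1, C2, 0.5 * (C1 + C2)), col = "grey")
legend("topright", legend = paste(times, "s"), col = cols, lty = 1)

interface_value <- couple(0, times)
print(interface_value)  # the mean of C1 and C2 at every time
```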
This figure is also there in Porter and Easterling. You see that we took a material with 1.4 and another material with 0.1, so initially this step was the composition profile. As time goes by, on this side the composition keeps decreasing and on this side the composition keeps increasing. The area under this part should be exactly equal to the area under this part, by mass conservation, because the carbon that is leaving from here is entering here. The depth of penetration of the diffusion curves is proportional to root Dt, so as you go through 10, 50, 100 seconds you will see that it penetrates more and more; on this side it is losing carbon, and on this side it is gaining carbon. After an infinite amount of time you expect everything to become 0.75, so that is the composition the system is trying to reach after an infinitely long time.
But as time proceeds you can compare how much diffusion took place in the first 10 seconds with how much took place from 10 to 50 and from 50 to 100 seconds: the rate is less, because the concentration gradients are coming down as diffusion progresses. That means it will take longer and longer to achieve the same amount of diffusion, so things slow down compared to the initial stages, and it will take a really long time for the composition to reach the uniform value everywhere.
So we have looked at three different solutions of the diffusion equation, and all of them are based on the cumulative standard normal distribution function, which is related to the error function through this formula: 2 times the cumulative distribution function minus 1 is basically the error function. That happens to be the solution of the diffusion equation, and you can get this solution using R because R can calculate the cumulative distribution function.
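The equal-area and slowing-down statements can be checked numerically. The sketch below is an illustration only; the function `transferred` and the choice to integrate in units of the diffusion distance (out to 10 diffusion distances, where the integrand is negligible) are assumptions made here so that `integrate` works on well-scaled numbers.

```r
# erf from the standard normal CDF: erf(u) = 2 * pnorm(u * sqrt(2)) - 1
erf <- function(u) 2 * pnorm(u * sqrt(2)) - 1

C1 <- 1.4; C2 <- 0.1; D <- 4e-11    # values used in the session
couple <- function(x, t) {
  0.5 * (C1 + C2) - 0.5 * (C1 - C2) * erf(x / (2 * sqrt(D * t)))
}

# Amount lost on the left and gained on the right (concentration x length).
# The integration variable s is distance measured in units of sqrt(D * t).
transferred <- function(t) {
  L <- sqrt(D * t)   # diffusion distance, m
  lost   <- L * integrate(function(s) C1 - couple(-L * s, t), 0, 10)$value
  gained <- L * integrate(function(s) couple(L * s, t) - C2, 0, 10)$value
  c(lost = lost, gained = gained)
}

times   <- c(10, 50, 100)
amounts <- sapply(times, transferred)   # 2 x 3 matrix, one column per time
colnames(amounts) <- paste(times, "s")
print(amounts)

# The transferred amount grows like sqrt(t), so this ratio is constant
scaled <- amounts["gained", ] / sqrt(times)
print(scaled)

# Average transfer rate in each interval (0-10, 10-50, 50-100 s): it keeps falling
rates <- diff(c(0, amounts["gained", ])) / diff(c(0, times))
print(rates)
```

The lost and gained amounts agree at every time, and the average rate of transfer drops from one interval to the next, which is the slowing down described above.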
Typically, if you look at textbooks like Raghavan or Porter and Easterling, you will find that they list the error function for different values, so you have to calculate manually, and for values in between you have to either interpolate, find more detailed tables, or write programs that will do it for you. But of course with R you can just call pnorm and get the solutions.
So to summarize: we have looked at the normal distribution. It is a continuous distribution, and random errors give rise to it; irrespective of the distribution from which you sample, if there are random errors they also follow this distribution, so it is very common to see it. It also has specific relevance to materials science and engineering because of the diffusion equation. Diffusion is one of the important mechanisms by which atomic movement happens in materials, especially in solids, and it leads to many phase transformations and microstructural changes that are very important from an engineering point of view, because all of heat treatment is based on phase transformations and on how fast or slow they take place, which is controlled by diffusion. So it is very important to have an idea about the solution of the diffusion equation, and it so happens that the normal distribution function is related to that solution for certain boundary conditions. That is what we have explored in this session. We will continue looking at the normal distribution a little more before we move on to other probability distributions. Thank you.