Hello everyone. We have been going through a basic introduction to Python. In the last class, we talked about how to create variables in Python, how to assign values to them, how to manipulate strings, and how to do some basic mathematical operations on numbers. Then we looked at if statements and loops, and at the end we talked about how to use the NumPy library, and we also saw how to read and manipulate files.

Today we will continue that discussion, and our focus will be on some of the statistics functions that are available in Python. When we want to use statistics, we have to play with data, and we may also want to visualize it so that we can see what the pattern in the data looks like. For this we will need several libraries. So, as usual, I am going to import NumPy as np. There is another nice library in Python that is mostly used for visualizing data; it is called seaborn, and I am going to import it as sns. Then there is matplotlib.pyplot: from it I am going to import a particular function called figure, and I will also import matplotlib.pyplot itself as plt. There is also another library called SciPy, and I am going to take some functions from it as well. From scipy.stats I am going to import norm, which will help us deal with the Gaussian distribution, and from scipy I am also going to import stats, which will help us look into various statistical aspects. I am also going to import beta from scipy.stats as beta_dist, and I will import beta from scipy.stats once more under its own name. These two are actually the same object; in one case I have named it beta_dist, and in the other it is simply beta.
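The imports described above might look roughly like the following. This is a sketch reconstructed from the spoken description, not the exact notebook cell; the final check is only there to illustrate that beta and beta_dist refer to the same object.

```python
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib.pyplot import figure

from scipy import stats
from scipy.stats import norm
from scipy.stats import beta
from scipy.stats import beta as beta_dist

# beta and beta_dist are two names bound to the same SciPy distribution object
same_object = beta is beta_dist
print(same_object)
```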
Now, let us get started with the normal distribution and how to generate samples from it. Suppose you want a normal distribution with mu and sigma given as 0 and 0.1. Notice that in Python you can assign two variables in the same line, so mu and sigma are assigned the values 0 and 0.1 respectively. In NumPy there is a function np.random.normal which generates samples according to a normal distribution. We just have to pass mu, sigma and how many samples we want, and it will generate them for us. Let me execute the cell now. I now have the generated data s, which is 1000 data points that are Gaussian distributed.

Now, let us say you want to see the PDF or CDF of the Gaussian distribution. For that you first need to specify the points at which you want to evaluate the PDF and CDF. Let us say I want to see the plots between -6 and 6 in steps of 0.1, so I am going to use the function np.arange to get those values. Then I will use the function norm.pdf; notice that norm is what we imported from scipy.stats. This gives me the PDF of a normal distribution at the points x. Similarly, if I call norm.cdf, it gives me the CDF of the normal distribution at the points x. norm.pdf can take other arguments, namely the mean (loc) and the standard deviation (scale), but here we have not passed anything other than the points where we want to evaluate the PDF. So it gives us the PDF of the standard normal, that is, with mean 0 and variance 1, and not of the mu and sigma we used for sampling. Now we want to see these curves, and for that we need the plotting functions in matplotlib.pyplot, which we imported as plt.
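The sampling and evaluation steps above can be sketched as follows. The seed is an assumption added only to make the run repeatable, and the last line (passing loc and scale) is an extra illustration that is not part of the lecture's cell.

```python
import numpy as np
from scipy.stats import norm

np.random.seed(0)  # assumed seed, only for repeatability

# Two variables assigned on one line
mu, sigma = 0, 0.1

# 1000 samples from a normal distribution with mean mu and std dev sigma
s = np.random.normal(mu, sigma, 1000)

# Points at which to evaluate the PDF and CDF
x = np.arange(-6, 6, 0.1)

# With no extra arguments, SciPy uses loc=0 and scale=1 (standard normal)
pdf = norm.pdf(x)
cdf = norm.cdf(x)

# Illustration only: the density that matches the sampled data
pdf_matching_samples = norm.pdf(x, loc=mu, scale=sigma)

print(s.shape, pdf.shape, norm.cdf(0))
```

Note that scale is the standard deviation, not the variance.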
So, first I am going to create the figure and the axes variable ax using plt.subplots, and this creates a figure of size 10 by 6; the pair (10, 6) passed as figsize gives the width and height of the figure in inches. Now I have x, and I want to plot the y values of the PDF. The plot function also lets me assign labels, so I can write label equals PDF, and I can also choose the color of each line. First let us see how it looks. If I run this, you see that the range is from -6 to 6, and this is the plot of the Gaussian PDF and this is the plot of the Gaussian CDF. Notice that at 0 the CDF takes the value one half, as expected. The labels we put, PDF and CDF, have appeared here; red is assigned to the PDF curve and blue to the CDF curve. If you want to label the x and y axes, that provision is there: you can simply set the y label to Probability, in whatever color you want, and similarly label the x axis. That is why Probability and x have appeared in blue. You can also give a title to the plot, and here I have given it as PDF and CDF of the standard normal. Now, plt.legend is what actually draws the legend; if you do not call it, the legend will not appear. So if I remove it there is no legend, and if I put it back the legend appears.

In the same way we can plot the PDF and CDF of various distributions. Now I am going to use another one, the exponential distribution, but here, instead of looking at the PDF and CDF, let us try to generate samples from the distribution with specified parameters. Suppose I want to generate exponentially distributed samples with parameter 3.45.
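Returning to the PDF and CDF plot of the standard normal described above, a minimal sketch could look like this. The exact title string and the use of the axes methods set_xlabel, set_ylabel and set_title are assumptions about how the notebook cell was written.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

x = np.arange(-6, 6, 0.1)

# figsize is (width, height) in inches
fig, ax = plt.subplots(figsize=(10, 6))

# One curve per call; label is what the legend will show
ax.plot(x, norm.pdf(x), color="red", label="PDF")
ax.plot(x, norm.cdf(x), color="blue", label="CDF")

# Axis labels and title
ax.set_xlabel("x", color="blue")
ax.set_ylabel("Probability", color="blue")
ax.set_title("PDF and CDF of the standard normal")

# Without this call the labels above are never drawn
plt.legend()
```

In a script, as opposed to a notebook, you would end with plt.show() or fig.savefig(...) to actually see the figure.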
There is a function for it called np.random.exponential. I pass the parameter value I want and how many samples I want, and it generates that many. Keep in mind that the argument NumPy expects here is the scale, which is the mean 1/lambda, rather than the rate lambda itself. Now I have the data, and I want to see how it looks as a histogram. I hope all of you understand that a histogram groups the data into various bins and counts how many points fall in each bin; the number of samples in a particular bin gives the height of that bin. To see that, I am going to use the function figure with size 10 by 5, and then I am going to use seaborn, which we imported as sns, and call its function histplot, passing in the data I have. Let us see what happens. It has grouped the data into bins; the bins are shown by these vertical bars, and the height of each bar tells us how many of the 1000 samples fell in that bin. You can see that it roughly has the exponential shape. Again we can set the x and y labels; here the y label is Count and the x label is simply x.

Similarly, we can do this for other distributions. In this exercise I am going to look at the PDF of the gamma distribution. As I said, when I want to see a PDF, I first need to specify the points at which I want to evaluate it. Here I have specified the range 0 to 40, split into 100 linearly spaced points. Now, from stats, which we imported from the SciPy package, I can call the gamma distribution and generate the PDF. I need to pass the points, and here I also need to specify the shape and the scale values.
I have actually computed three PDFs here, for different values of the shape parameter a and the scale, and now I want to visualize them. To do that, I again use the subplots function from plt and specify the figure size. I want to see y1, y2 and y3, so I pass each of them to the plot function, and I have labeled each one with its shape and scale parameters. Sorry, earlier I said shift; it is actually the shape parameter and the scale parameter. This is how it looks, and here we have also put a legend. Let us see what happens if I do not have plt.show() here. When I add plt.show(), I do not see much difference in the figure, and the legend is still there. The only difference I see is that some text that was being printed above the figure has disappeared. Without plt.show(), something like a matplotlib legend object was printed there; with plt.show(), it just shows me a clean image without any additional output that I did not ask for.

Next, we can similarly look at the beta distribution. For the gamma distribution, to generate multiple curves I specified three sets of shape and scale parameters, and then I had to call the plot function three times. This can be avoided if we use loops intelligently, and that is what we will demonstrate now for the beta distribution. Suppose we want to generate the PDF of the beta distribution for 5 different pairs of the parameters alpha and beta. For alpha I am taking the values 0.5, 5, 1, 2, 2, and similarly for beta I have taken 0.5, 1, 3, 2, 5. The x values where I want to evaluate it are in the interval from 0.01 to 1 with an increment of 0.01, so there are about 100 points.
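For the three gamma PDFs discussed above, a sketch could look like the following. The lecture does not state which shape and scale values were used, so the three pairs below are hypothetical, as are the figure size and the label strings.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# 100 linearly spaced points between 0 and 40
x = np.linspace(0, 40, 100)

# Hypothetical (shape, scale) pairs; the lecture's actual values are not given
y1 = stats.gamma.pdf(x, a=1.0, scale=2.0)
y2 = stats.gamma.pdf(x, a=2.0, scale=2.0)
y3 = stats.gamma.pdf(x, a=5.0, scale=1.0)

fig, ax = plt.subplots(figsize=(10, 6))  # assumed figure size

# One plot call per curve, each with its own label
ax.plot(x, y1, label="shape=1, scale=2")
ax.plot(x, y2, label="shape=2, scale=2")
ax.plot(x, y3, label="shape=5, scale=1")
plt.legend()
```

In a notebook, ending the cell with plt.show() suppresses the text representation of the last returned object (here the legend), which is the clean-up effect noted above.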
For the figure I have taken the figure size to be 10 by 7. Now, instead of calling the plot function separately for each pair, I can call it iteratively inside a loop. Here I am defining a variable i which ranges over the length of a. I want to plot 5 curves, so the range is just the length of a. Then I use the beta.pdf function, passing x, a[i] and b[i], and I draw the result with the plot function. First let us plot this. Nice, we have got a nice plot here. Now, you see that to write the label I have used an r prefix. This is useful whenever I want to use symbols. Here I want the Greek letter alpha next to the value 0.5. I cannot type the symbol directly, but I can use the LaTeX code for alpha, which is backslash alpha, and to make Python keep the backslash as it is I write an r at the beginning of the string. So in the label you see the symbol alpha, then an equals sign, and then I have appended a string, which is the value of a at location i. After that I have appended another string, a comma and a space, which has appeared here. Then again I need the LaTeX code for beta, so I write r once more for that piece, and then I append the value b[i]. I cannot append b[i] directly because it is a number, so I convert it into a string and append it. I can also specify the range of y by giving the y limits. In this way, plotting and visualizing these density curves is pretty easy; with just a few lines of code we get the plot, and you can play around with it. Of course, we can give the x and y labels as we wish and also specify the font sizes.
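The loop described above can be sketched as follows. The parameter lists, the x grid and the figure size are the ones from the lecture. The dollar signs around the LaTeX commands, the y limits, the axis label text and the font size are assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta

a = [0.5, 5, 1, 2, 2]   # alpha values
b = [0.5, 1, 3, 2, 5]   # beta values
x = np.arange(0.01, 1, 0.01)

plt.figure(figsize=(10, 7))

labels = []
for i in range(len(a)):
    y = beta.pdf(x, a[i], b[i])
    # Raw strings keep the backslash; the $...$ asks Matplotlib to render math
    label = r"$\alpha$ = " + str(a[i]) + ", " + r"$\beta$ = " + str(b[i])
    labels.append(label)
    plt.plot(x, y, label=label)

plt.ylim(0, 2.5)                # assumed y range
plt.xlabel("x", fontsize=12)    # assumed label text and size
plt.ylabel("PDF", fontsize=12)
plt.legend()
```

Adding a sixth curve now only needs one more entry in a and b.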
There are multiple options here, like the size and the color. If you do not specify any color for the labels, by default they are drawn in black; if you want some other color, you have to pass that argument here. Now let us add plt.show() here and see what happens, which we also saw last time. You see that when I call plt.show(), the output is cleaned up: no text is shown other than the figure.