In the previous lecture we looked at describing data. When a large amount of data is given to us, it is difficult to understand it at a glance, so we worked out several methods of describing the data so that one can easily understand what the data is all about and what it contains. We presented here some of the measures for giving such a description. One of them is numerical descriptors, such as measures of central tendency and measures of dispersion, which include the range, variance, interquartile range, etc. These are all numerical descriptors. Then we also presented some graphical descriptors, such as the histogram and the box-and-whisker plot, which help us see the distribution of different variables. It can also help us if you have prior knowledge of what the data really represents; then you can have different measures of describing the data. We will see that as we go ahead with this course, and if necessary we will have an appendix or an additional recording where we will show you some of these special kinds of data-description methods that we use. However, what is presented here are the most common methods of what is also called exploratory data analysis. The same activity is also known as storytelling in data-science parlance, because in order to make people understand the data you have to tell a story about the data; that is how the word storytelling has come about, but it means the same thing.
Today what we plan to do is formally introduce probability through some definitions, such as events and sample space, the three basic postulates of probability, conditional probability, Bayes theorem, and independent events. Bayes theorem in particular is playing a major role in today's world of data analytics; we will talk about that when we come to it. So let us begin. We all live in a world of chance. What is happening now is the only thing we know; this is our Indian philosophy: you know what is now, and you do not know what is coming up in the future. So the future is always a probabilistic and uncertain event, and whatever happens, it happens due to several circumstances; therefore we say that we live in a world of chance. So the occurrence of any event that you can think of in day-to-day life has a probability associated with it. Coming back to the world of materials science, materials engineering, and experimentation, we can say that the outcome of any experiment has an element of probability attached to it. Generally speaking, we count it in this way: we take it as the ratio of the number of event-specific outcomes divided by the total number of all possible outcomes. We will elaborate on this briefly and in future lectures. So then let us start with some formal definitions of an event and the sample space. We say that an experiment that can result in different outcomes (there is a spelling error on the slide; it should read "outcomes"), even though it is repeated in the same manner every time, is called a random experiment. This you must have experienced right from the day you started doing physics or chemistry experiments in small high-school laboratories. The set of all possible outcomes of a random experiment, that is, pick a random experiment and consider the set of all possible outcomes that come out of it, is called the sample space of the experiment, and it is generally denoted by capital S.
We say that S, the sample space, is discrete if it contains a finite or countably infinite set of outcomes. Suppose we are conducting an experiment of counting the number of atoms that come out of a 3D FIM; then the outcomes are numbers of atoms, which form a countable set, and therefore the sample space S is discrete. S is continuous if the outcome is contained in an interval of the set of real numbers. Now, having defined the sample space, we define an event. An event is a subset of the sample space S; we generally denote it by E, A, B, or some other capital letter. The union of two events E1 and E2 is denoted by E1 ∪ E2, the intersection is denoted E1 ∩ E2, and the complement of an event E, namely all the elements in S that do not belong to E, is denoted E'. If we want to draw a picture, we can make use of a Venn diagram. Generally the sample space is drawn as a rectangle labeled S, and within it we draw the event E. Then E complement is everything that is in S but not in E; that whole area is E complement. If E1 and E2 are mutually exclusive, what we mean is that the intersection of E1 and E2 is the null set: draw E1 and E2 in the sample space S with no overlap, and since E1 ∩ E2 is the null set, E1 and E2 are called mutually exclusive. Of course, we know from set theory that the complement of a complement is the set itself. Next we come to the three postulates of probability. Probability is a function of events, that is, a function from events of the sample space to the interval [0, 1]. In particular, the probability of any event is always greater than or equal to 0, because it lies between 0 and 1.
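The set operations just described can be mirrored directly with Python's built-in set type. A minimal sketch (the die-roll events E1 and E2 are illustrative choices, not from the lecture):

```python
# Sample space S for a single die roll; E1 and E2 are illustrative events.
S = {1, 2, 3, 4, 5, 6}           # sample space (discrete, finite)
E1 = {2, 4, 6}                   # event: "even outcome"
E2 = {1, 3, 5}                   # event: "odd outcome"

union = E1 | E2                  # E1 ∪ E2
intersection = E1 & E2           # E1 ∩ E2
complement_E1 = S - E1           # E1' = everything in S not in E1

print(union == S)                # True: together they cover S
print(intersection == set())     # True: E1 and E2 are mutually exclusive
print(complement_E1 == E2)       # True: complement of "even" is "odd"
```

Note that the mutually exclusive check is exactly the null-intersection condition from the Venn diagram discussion.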
The probability of the complete sample space is 1: the occurrence of some outcome from the sample space in any experiment has probability 1. And if A1, A2, A3, ..., An, and so on are finitely or infinitely many mutually exclusive events of S, then the probability of the union of all of them is the sum of their probabilities. These are called the three basic postulates of probability. Now from these we can derive a few things. It is very obvious that the probability of A complement is 1 minus the probability of A, because A complement union A is the full sample space S, the probability of the full sample space S is 1, and A complement and A are obviously mutually exclusive. Therefore P(A') = 1 - P(A). The probability of the null set is 0. If A is discrete, then the probability of A is the sum of the probabilities of the individual outcomes comprising A; this follows from the third postulate, the third point here. If an experiment results in any one of N equally likely outcomes, and event A comprises n such individual outcomes, then the probability of A is n divided by N. Remember that this is exactly what we talked about in the very first few slides; you will recall these are the same things: the probability of an event is the number of event-specific outcomes divided by the total number of all possible outcomes, and this defines what an event-specific outcome is.
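The counting rule P(A) = n/N for equally likely outcomes, together with the derived facts P(S) = 1, P(null set) = 0, and P(A') = 1 - P(A), can be checked with a short sketch (the fair-die sample space is an illustrative assumption):

```python
from fractions import Fraction

def prob(event, sample_space):
    """P(A) = n/N: event-specific outcomes over all equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

S = set(range(1, 7))                     # sample space of a fair die
A = {2, 4, 6}                            # event: even outcome
print(prob(A, S))                        # 1/2
print(prob(S, S))                        # 1  -> second postulate: P(S) = 1
print(prob(set(), S))                    # 0  -> P(null set) = 0
print(prob(S - A, S) == 1 - prob(A, S))  # True -> P(A') = 1 - P(A)
```

Using `Fraction` keeps the ratios exact instead of introducing floating-point rounding.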
For the probability of A union B, where we are now not assuming that A and B are mutually exclusive, we have P(A ∪ B) = P(A) + P(B) - P(A ∩ B). You can see why: draw the sample space S with events A and B overlapping; when you take the union of A and B and count its probability as P(A) plus P(B), the intersection A ∩ B gets counted twice, so it has to be removed once. Now we come to the next stage. So far we have been assuming a sample space S and within it an event A; let us draw that Venn diagram. In real life, a number of times we do not know, or have very little information about, the sample space S as a whole, but we are aware that an event B has occurred within it. What I am saying is that we do not have knowledge of the sample space S as a whole, but within the sample space we know that the event B has occurred, and then we want to learn something about the probability of A. Of course, we would not know the whole probability of A, which is the whole area of A in the diagram; we are aware only that the event B has occurred, and we do not know what the whole sample space S looks like. Therefore all our knowledge of A comes from the intersection A ∩ B, which I will shade dark. That intersection is the area of which we have an idea, because we know B and we know whether A has occurred within it; it means we are aware only of A ∩ B. When this happens, the probability we want to find is called a conditional probability, conditioned upon the fact that we only know about the event B.
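The inclusion-exclusion formula P(A ∪ B) = P(A) + P(B) - P(A ∩ B) can be verified by brute-force counting; the two-dice sample space below is an illustrative choice, not from the lecture:

```python
from fractions import Fraction

# Sample space for two fair dice, enumerated by brute force.
S = {(i, j) for i in range(1, 7) for j in range(1, 7)}
A = {(i, j) for (i, j) in S if i == 6}   # event: first die shows 6
B = {(i, j) for (i, j) in S if j == 6}   # event: second die shows 6

def P(E):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(E), len(S))

lhs = P(A | B)                           # direct count of the union
rhs = P(A) + P(B) - P(A & B)             # remove the doubly counted overlap
print(lhs == rhs)                        # True
print(lhs)                               # 11/36
```

The overlap here is the single outcome (6, 6); without subtracting it, adding P(A) and P(B) would give 12/36 instead of the correct 11/36.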
So, the probability of A given B (the vertical line is to be pronounced "given") is P(A | B) = P(A ∩ B) / P(B). In other words, we are in effect replacing the whole sample space S by the set B alone; the sample space remains S, but we have knowledge only about B, and therefore we work within it. Another way of looking at the same situation is that you may know everything about the sample space S, but you are interested in finding out what happens to event A when the event B has occurred. Let us see what kind of example that could be. Say we are looking into a big set of certain kinds of alloys, but we are interested only in those containing certain alloying elements, and we want to learn about a certain property. The event A concerns that property of the alloy across all compositions, but we only know that certain elements are present, which is the event B. Then, for the probability of that property occurring, we cannot talk in terms of the whole sample space S; we have to talk in terms of only the sub-space B.
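The defining ratio P(A | B) = P(A ∩ B) / P(B) can likewise be checked by counting; the two-dice events below are illustrative assumptions:

```python
from fractions import Fraction

S = {(i, j) for i in range(1, 7) for j in range(1, 7)}  # two fair dice
A = {(i, j) for (i, j) in S if i + j == 8}  # event: the sum is 8
B = {(i, j) for (i, j) in S if i == 6}      # event: the first die shows 6

def P(E):
    return Fraction(len(E), len(S))

# P(A | B) = P(A ∩ B) / P(B): all knowledge of A comes from A ∩ B.
p_A_given_B = P(A & B) / P(B)
print(p_A_given_B)                       # 1/6: given a 6, the sum is 8 only if j == 2
print(P(A))                              # 5/36 unconditionally
```

Knowing that B occurred changes the answer from 5/36 to 1/6, which is exactly the "replace the sample space by B" idea from the lecture.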
This is called conditional probability, and a simple rearrangement, taking the denominator to the other side, gives P(A ∩ B) = P(A | B) P(B), which can also be written as P(B | A) P(A); this is simple mathematics one can check by oneself. Therefore, for any event A, we know that P(A) = P(A ∩ B) + P(A ∩ B'). Any event A can be decomposed in this way: on the Venn diagram, B' is everything outside the event B, so A ∩ B gives one part of A, A ∩ B' gives the other part, and put together they make the whole of A, which is what is written here. Therefore we can now write P(A) = P(A | B) P(B) + P(A | B') P(B'). Sometimes this is called the total probability rule; the name is not used very commonly, but it is better to know it. From this we come to a very important theorem, known as Bayes theorem. It is very interesting that Bayes theorem was discovered by Thomas Bayes back in the 18th century, and today, in the 21st century, we find extreme use of it; it has become a very important theorem. When we were in college studying Bayes theorem, we had to create artificial problems in order to understand it, but today applications are available in abundance, and it is in fact behind one of the most commonly used forms of analysis, known as Bayesian analysis, which we will touch upon in this course as well. But what is this Bayes theorem?
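The total probability rule P(A) = P(A | B) P(B) + P(A | B') P(B') can be illustrated with a small numerical sketch; the supplier and defect-rate numbers below are hypothetical, not from the lecture:

```python
from fractions import Fraction

F = Fraction
# Hypothetical setup: B = "batch came from supplier X", A = "batch is defective".
p_B = F(3, 10)                    # assumed: 30% of batches come from supplier X
p_A_given_B = F(2, 100)           # assumed defect rate for supplier X
p_A_given_Bc = F(5, 100)          # assumed defect rate for all other suppliers

# Total probability rule: P(A) = P(A|B) P(B) + P(A|B') P(B')
p_A = p_A_given_B * p_B + p_A_given_Bc * (1 - p_B)
print(p_A)                        # 41/1000, i.e. a 4.1% overall defect rate
```

The rule simply weights each conditional defect rate by how often its condition holds, reflecting the decomposition of A into A ∩ B and A ∩ B'.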
Consider two events A and B: they are independent if and only if certain conditions, which come from the conditional distribution, are true: P(A | B) = P(A), or P(B | A) = P(B), or P(A ∩ B) = P(A) P(B). If any of these three conditions is met, we say that the events A and B are independent. Now we come to Bayes theorem. Recall from the previous slide that P(A ∩ B) = P(A | B) P(B), which is the same as P(B | A) P(A). This is used in a different way here: since P(A | B) P(B) = P(B | A) P(A), simply rearranging gives P(A | B) = P(B | A) P(A) / P(B). It looks very simple, but it gives very important information. Let us imagine that you have certain data B; B is the data given to you, and A are certain parameters that describe the data. The values of these parameters that we obtain are called estimates; we will learn about this in more detail later. The data is the reality, because you have observed it, and the parameters are described by their estimates. Now, if it is real-time data, the data keeps coming to you; it gets augmented, so your reality becomes larger and larger. How do you improve upon your estimates based on this growing reality? That is what the formula says: the probability of the parameters A given the data B is expressed in terms of the probability of the data given the parameters, the probability of the parameters,
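Bayes theorem P(A | B) = P(B | A) P(A) / P(B) can be demonstrated by inverting a hypothetical supplier-and-defect example; all the numbers are illustrative assumptions:

```python
from fractions import Fraction

F = Fraction
# Hypothetical: A = "batch from supplier X", B = "batch is defective".
p_A = F(3, 10)                    # assumed prior: 30% of batches are from X
p_B_given_A = F(2, 100)           # assumed defect rate for X
p_B_given_Ac = F(5, 100)          # assumed defect rate otherwise
p_B = p_B_given_A * p_A + p_B_given_Ac * (1 - p_A)   # total probability rule

# Bayes theorem: P(A | B) = P(B | A) P(A) / P(B)
p_A_given_B = p_B_given_A * p_A / p_B
print(p_A_given_B)                # 6/41: given a defect, chance it came from X

# Independence check: A and B are independent iff P(B | A) == P(B).
print(p_B_given_A == p_B)         # False here, so A and B are not independent
```

Note how the theorem inverts the conditioning: we assumed P(defect | supplier) and obtained P(supplier | defect).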
and the probability of the data. This becomes an iterative treatment, which can be written as: the next iteration of the parameter estimate comes from the probability of the data given the previous estimate, multiplied by the probability of the previous estimate, divided by the probability of the data. Please remember that this index k runs from one onwards. So you have some initial idea; say you are working with data on rainfall, and you have an initial idea that the rainfall in this area is generally so many percent, or that the average rainfall is so many millimeters. Then you get the latest data, and using the previous estimate together with the latest data, you come up with the estimate for the new data; that is, from Bayes theorem you can come up with the new estimate. So it actually gives you an opportunity to improve upon your estimates. We will go into detail later about how Bayes theorem has gained importance today and why it was not so prominent before. The application of Bayes theorem, or Bayesian analysis, is extremely data intensive; you have to have a lot of data in order to apply it. With today's world of Facebook, Google, mobile phones, etc., real-time data is increasing very fast. Even in industry, with robotics playing its role and artificial intelligence playing its role, we generate a lot of data on any process in the metallurgical industry, and when this data is generated, the question arises whether the parameters we were estimating using this data can be improved upon. That improvement comes from Bayes theorem, and therefore in today's life Bayes theorem has become very important. As I said, and I repeat, we will talk about Bayesian analysis a little toward the end of this course. So with this, let us summarize what we learned today. We first introduced probability postulates, based on the fact that every experiment is governed by chance; it cannot give you the same answer all
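The iterative updating described here can be sketched for a discrete parameter; the three candidate rain probabilities and the observation sequence below are illustrative assumptions, not from the lecture:

```python
# Hypothetical setup: theta is the unknown chance of rain on any given day,
# restricted to three candidate values; each observation applies one Bayes step.
candidates = [0.2, 0.5, 0.8]              # possible values of the parameter theta
prior = {t: 1 / 3 for t in candidates}    # initial (uniform) belief

def update(prior, rained):
    """One Bayes step: posterior(theta) ∝ likelihood(data | theta) * prior(theta)."""
    likelihood = {t: (t if rained else 1 - t) for t in prior}
    unnormalized = {t: likelihood[t] * prior[t] for t in prior}
    p_data = sum(unnormalized.values())   # P(data), the normalizing constant
    return {t: u / p_data for t, u in unnormalized.items()}

belief = prior
for rained in [True, True, False, True]:  # incoming real-time observations
    belief = update(belief, rained)       # yesterday's posterior is today's prior

best = max(belief, key=belief.get)
print(best)                               # 0.8: three rainy days out of four favor it
```

This is the iteration with index k running from one onwards: each new data point turns the previous estimate into an improved one, which is exactly why streaming real-time data makes Bayesian analysis so useful.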
the time; there is an error factor and a chance factor playing in it, and in order to handle that we must have some idea of probability. Therefore we introduced probability through certain definitions, like event and sample space, and then we introduced the probability postulates. We did a little bit of the algebra of probability, that is, the probability of the null set, the probability of the sample space, and the fact that the probability of A complement is 1 minus the probability of A, which is what I call the algebra of probability. Then we introduced the situation in which you may not know everything about the sample space but you may know about a particular event, and you look at the occurrence of the other events in the light of this event which has already occurred; this is called conditional probability. Based on conditional probability, we defined what are known as independent events, meaning that whether one event occurred or not has no effect on the occurrence of the other event; so if A and B are independent events, then the occurrence of event B has nothing to do with the occurrence of event A. Finally, we quickly introduced Bayes theorem, which, as I said, has gained a lot of importance in today's world, and we will talk about Bayesian analysis at the end of this course. Thank you.