Today we are going to discuss the role of statistics and data mining in machine learning. At the end of this session, a student will be able to demonstrate how statistics and data mining play an important role in machine learning.
Statistics, data mining and machine learning have different roles in the understanding of data. They describe the characteristics of a data set of interest, and they find relationships and patterns in that data in order to finally build a model. Data mining and machine learning algorithms are rooted in classical statistical analysis. That is, statistical analysis, data mining and machine learning go together. So we have a combination of capabilities and technologies, and it has to be paired with an understanding of the business problem and the business goal, which is of vital importance, together with subject matter expertise.
Let us now think about what is expected from each of statistics, data mining and machine learning. Statistics is the science of analyzing data. Classical, or conventional, statistics is inferential in nature: it reaches conclusions about the particular data under consideration. It focuses on understanding the characteristics of the variables by which the data is represented. Machine learning models leverage these statistical algorithms for their processing. In a statistical model, a hypothesis is a testable way to confirm the validity of a specific algorithm. It deals with the truth that must hold in the particular environment. Machine learning applies statistics to predictive analysis.
To restate, statistics is the science of analyzing data. Classical statistics is inferential in nature and reaches conclusions about the data at hand.
Statistics focuses on understanding the characteristics of the variables, and machine learning models leverage these statistical algorithms. In a statistical model, the truth or falsity of a particular hypothesis is tested, and in this way we confirm the validity of the specific algorithm. Machine learning applies statistics to predictive analysis.
Data mining is, once again, based on the principles of statistics. It is the process of exploring and analyzing large amounts of data to discover the patterns present in that data. An algorithm is used to find relationships and patterns in the data, and information about those patterns is then used to make forecasts and predictions. Data mining is used to solve a wide range of business problems. Traditionally, organizations used data mining tools on large volumes of structured data, which we usually call big data. The goal of data mining is to explain and understand the data; it is not intended to make predictions or to back up a particular hypothesis that we are using as our assumption. Software solutions enable data mining of a combination of structured and unstructured data. To extract such data from a larger data set for the purpose of classification or prediction, the data is clustered into groups. Therefore, we deal with both classification and clustering.
Data mining tools are intended to support the human decision-making process. Data mining shows the patterns that can be used, and machine learning automates the process of identifying these patterns and uses them to make predictions. Therefore, data mining is related to machine learning.
The interest in data mining comes from several things. We are handling large amounts of data, hence predictions are more accurate. Cost decreases because mass storage devices are cheaper than they were before.
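The idea of clustering data into groups and then using the discovered pattern for prediction can be illustrated with a small sketch. This is only an illustration under stated assumptions: the session does not name a specific algorithm, so a basic one-dimensional k-means procedure is used here as a stand-in, and the function names `kmeans_1d` and `predict_group`, as well as the monthly spending figures, are hypothetical.

```python
def assign(values, centroids):
    """Put each value into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for v in values:
        nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
        clusters[nearest].append(v)
    return clusters


def kmeans_1d(values, k, iterations=20):
    """Basic k-means on 1-D data; returns the k cluster centres."""
    # Deterministic start (an assumption made for repeatability):
    # take k evenly spaced values from the sorted data.
    ordered = sorted(values)
    step = (len(ordered) - 1) / (k - 1) if k > 1 else 0
    centroids = [float(ordered[round(i * step)]) for i in range(k)]
    for _ in range(iterations):
        clusters = assign(values, centroids)
        # Move each centre to the mean of its cluster; keep it if the cluster is empty.
        updated = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:  # nothing moved, so the pattern is stable
            break
        centroids = updated
    return centroids


def predict_group(value, centroids):
    """Use the discovered pattern: index of the nearest cluster centre."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))


# Hypothetical monthly spend of eight customers (made-up numbers).
spend = [12, 15, 14, 11, 95, 102, 98, 105]
centroids = kmeans_1d(spend, k=2)
print(centroids)                     # [13.0, 100.0]
print(predict_group(20, centroids))  # 0 (low-spend group)
print(predict_group(90, centroids))  # 1 (high-spend group)
```

Here the clustering step plays the role of data mining, since it exposes the two groups, and `predict_group` plays the role of the automated prediction that machine learning adds on top of the discovered pattern.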
Interest in data mining is also driven by fielded database management systems, and by faster computers and parallel approaches, which give us results in a short period of time even when a large amount of computation is involved. Development relies on automatic learning techniques supported by machine learning, and the uncertainty present in the data is dealt with.
The purpose of data mining is to process the information held in the enormous stock of data that is available. The tasks involved are summarizing the information in the stored database, identifying a model for predicting particular information, and learning, which captures a change in the environment and uses appropriate software to predict the results. The steps to be followed are these: data is selected, then transformed; then comes the data mining itself, that is, extracting patterns of interest and visiting patterns not visited before; then the results are interpreted and validated; and finally the discovered knowledge is incorporated. This process is known as knowledge discovery in databases. It consists of pre-processing the data, using data mining to form the model, and post-processing the validated model to yield knowledge, which is refined to produce what should go into the knowledge base.
In this context, we should understand the role played by each of these: the role of machine learning and the role of AI, which is the broadest way of describing systems that can think. Artificial intelligence is the overall category that includes machine learning and natural language processing. Machine learning focuses on the ability to learn: the system changes and adapts a model based on the data rather than on explicit programming. The subsets of AI include reasoning.
Reasoning is required because machine reasoning allows the system to make inferences based on the data that is available. Reasoning helps to fill in the blanks where information is incomplete, and so reaches a result even if some data is not available. This is called non-monotonic reasoning. Machine reasoning helps to make sense of connected data so that the predictions generated are accurate.
Natural language processing, the second subfield, is the ability to train computers to understand both written text and human speech. Techniques are needed for capturing the meaning, that is, the semantics, of unstructured text from documents or of communication from the user. Tools such as tag tools and semantic tools are necessary here. Natural language processing is the primary way that systems can interpret text and spoken language, and it is one of the fundamental technologies that allow non-technical people to interact with advanced technologies.
Planning is the third sub-area of artificial intelligence. It is the ability of an artificial intelligence system to act autonomously and flexibly to construct a sequence of actions that finally reaches a particular goal. A properly planned system will give the best results. It requires the system to adapt based on the context given by its surroundings and the challenge at hand.
Putting all these together, we see in this diagram that artificial intelligence is connected to machine learning, and both use techniques from statistics and data mining in their computations. The references used are as shown. Thank you.