Welcome to this video, Fundamentals of Data Mining. Today, we will discuss the fundamentals of data mining. I am Dr. R. V. R. Giddy from the Computer Science and Engineering Department.

The learning outcomes of this video are as follows. At the end of this video, students will be able to give an overview of data mining and explain how it is related to the KDD process. KDD means knowledge discovery in databases. Students will also be able to identify the differences between a database and a data warehouse, that is, what a database is and what is meant by a data warehouse. These are the two outcomes for today.

The contents of this video are the introduction, KDD (knowledge discovery in databases), and the data warehouse.

Why is data mining required? In today's era of data, we have a huge amount of data. Data is available everywhere, but we do not know how or when to use it. The volume of data has grown from terabytes to petabytes, but we do not have the information or knowledge needed to take decisions from that data. To convert data into knowledge, so that the user can take decisions from that knowledge, we require data mining. With a huge amount of data, a huge amount of information is available, and that information can be used to understand behavior, draw out patterns, and make decisions.

Next, what is data mining, and how is it defined? We know that a database is nothing but a collection of data. When we apply data mining techniques to that data to infer some knowledge, that is called data mining. In short, data mining is often defined as finding hidden information in a database.
So, we mine the database to infer knowledge. Data mining is alternatively known as KDD (knowledge discovery in databases), knowledge extraction, data/pattern analysis, data archaeology, information harvesting, business intelligence, etc.

Now, let us see what KDD is. Knowledge discovery in databases (KDD) is the process of discovering useful knowledge from a collection of data. What are the steps involved in KDD? They are preparation of the data, transformation of the data, selection of the data, cleaning of the data, applying data mining techniques, and inferring a conclusion.

Let us make this clear by looking at Figure 1.1, knowledge discovery in databases (KDD). The figure shows the database at the bottom and knowledge at the top. Let us see how the path from database to knowledge is travelled.

Initially, we collect data from various sources. That data may not be in the required form, or it may contain noisy data or data that is irrelevant to our application or requirement. So the first step is data cleaning, in which the noisy or irrelevant data is removed.

The next step is data integration. In data integration, we collect the data from the various sources, such as SQL, NoSQL, or other databases, and place it all in one location. That location is called a data warehouse rather than a database.
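The cleaning and integration steps just described can be sketched in a few lines of Python. This is only a minimal illustration, not a real warehouse loader: the two sources, the field names `customer` and `amount`, and the helper functions `clean` and `integrate` are all hypothetical, and "noisy" is simplified here to mean "a required field is missing".

```python
# Hypothetical records from two sources; the field names are illustrative only.
sql_rows = [
    {"customer": "A", "amount": 120.0},
    {"customer": "B", "amount": None},   # incomplete (noisy) record
]
nosql_docs = [
    {"customer": "C", "amount": 75.5},
    {"customer": "", "amount": 40.0},    # missing customer name
]

def clean(records, required_fields):
    """Data cleaning: keep only records where every required field has a value."""
    return [
        r for r in records
        if all(r.get(field) not in (None, "") for field in required_fields)
    ]

def integrate(sources):
    """Data integration: merge records from all sources into one list,
    tagging each record with the source it came from."""
    warehouse = []
    for source_name, records in sources.items():
        for record in records:
            warehouse.append({**record, "source": source_name})
    return warehouse

required = ["customer", "amount"]
warehouse = integrate({
    "sql": clean(sql_rows, required),
    "nosql": clean(nosql_docs, required),
})
for row in warehouse:
    print(row)
```

Here the single list `warehouse` stands in for the data warehouse: only the two complete records survive cleaning, and each one remembers which source it came from.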
The next step is selection. We have collected the data from various sources and cleaned it. Now we select the data required by our application, so that we can draw proper inferences, and the selected data is given to data mining.

What is this data mining step? Once the data is selected, we can apply different data mining techniques to it, such as clustering, classification, decision trees, artificial neural networks, or association rule mining. Many data mining techniques are available, and we apply one of them to the selected data to get some patterns.

After data mining, the next step is pattern evaluation. Pattern evaluation means that from the patterns generated by the chosen data mining technique, we infer some conclusion. That conclusion is nothing but knowledge, and the end user can use that knowledge to make decisions. This is KDD.

Now, let us see what a data warehouse is. A data warehouse is a technique for collecting and managing data from various sources in one place. It works as a central repository where information arrives from one or more data sources. Data flows into the data warehouse from transactional systems and other relational database systems, and all of it is merged in one place. The data can be structured, semi-structured, or unstructured.
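Looking back at the selection, data mining, and pattern evaluation steps of KDD, a small sketch may help. It counts how often pairs of items occur together, which is a simplified stand-in for association rule mining and not a full algorithm such as Apriori. The shopping transactions, the function names `select`, `mine_pairs`, and `evaluate`, and the 0.5 support threshold are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping transactions (already cleaned and integrated).
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter"],
    ["bread", "milk", "eggs"],
    ["milk", "eggs"],
]

def select(transactions, min_items=2):
    """Selection: keep only transactions with enough distinct items to form a pair."""
    return [t for t in transactions if len(set(t)) >= min_items]

def mine_pairs(transactions):
    """Data mining: count how many transactions contain each pair of items."""
    counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(set(t)), 2):
            counts[pair] += 1
    return counts

def evaluate(counts, n_transactions, min_support=0.5):
    """Pattern evaluation: keep pairs whose support (the fraction of
    transactions containing the pair) reaches the threshold."""
    return {
        pair: count / n_transactions
        for pair, count in counts.items()
        if count / n_transactions >= min_support
    }

selected = select(transactions)
patterns = evaluate(mine_pairs(selected), len(selected), min_support=0.5)
for pair, support in sorted(patterns.items()):
    print(pair, support)
```

With this sample data, the pairs that survive evaluation are bread with milk and eggs with milk. That surviving set is the "knowledge" an end user could act on, for example when deciding which products to place together.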
Looking at Figure 2, on the left-hand side there are four boxes, which show that we are collecting data from various sources: Kolkata, Bangalore, Delhi, and Mumbai. All this data is collected in the data warehouse and passed to further processes, namely data cleaning, transformation, integration, selection, and load and refresh, after which it comes back to the data warehouse, where we apply data mining techniques for analysis. Queries are then applied to the data warehouse based on the requirement of the user, that is, client 1, client 2, or any number of clients. After analysis of the entire data, the query is run against the data warehouse and we get the required output.

Now, a question for the students: find out the issues that may arise in data mining. You can pause the video here, think for a minute, and then give your answer.

The data mining issues are: mining different kinds of knowledge; interactive mining of knowledge at multiple levels of abstraction; presentation and visualization of data mining results; handling noisy and incomplete data; and the efficiency and scalability of data mining algorithms.

Now, let us identify the differences between a database and a data warehouse. First, a database is designed to record data, whereas a data warehouse is designed to analyze it. Second, a database is an application-oriented collection of data, whereas a data warehouse is a subject-oriented collection of data.
Third, a database is normally limited to a single application, whereas a data warehouse serves multiple applications. Fourth, data in a database is available in real time, whereas data in a data warehouse is refreshed or collected as required, whenever it is needed. The last point is that a database is efficient in processing and storage, whereas a data warehouse is efficient in analytics.

References: the textbooks referred to are Data Mining: Introductory and Advanced Topics by Dunham, and a second book by Han and Kamber.

So, this was the video on the fundamentals of data mining. I hope you have understood the topic. Thank you.