Hi, my name is Prabhakar, I am with IIT Kanpur. Welcome to this lecture on data management. We will have a quick look at what a DBMS is.
Here is a simplified model of a computer: there is a central processing unit, there is a memory here, and there is a disk. Programs and data sit on the disk. They are brought into the main memory, this part, and the CPU executes these programs. After the execution is over, some of the data is written back onto the disk. Now the thing is, if you want the data to persist after the program finishes execution or when the computer is turned off, you have to put it on the disk. If the data is sitting in the main memory, it will go away when you turn the power off or when the program terminates. The disk is a persistent data store; the main memory is not. Typically the disk is made of magnetic storage, or these days you get solid state disks, and when you put data on a disk, you get persistence.
So here I have a set of abstractions on what happens on the disk and in the computer. We know that everything is represented as a set of 0's and 1's; when it is on the disk, it is a string of 0's and 1's. Now there is a file manager which looks at these 0's and 1's and gives an abstraction of a set of files. It chunks them together, saying these 0's and 1's are one file, these sets of 0's and 1's are another file, and so on. So at this level I have a sequence of files, and I can look at them as files. You all know about this, right? When you use the disk explorer, you can see various files and directories and so on.
Now some of these files may be documents, maybe Word documents or spreadsheets and so on. So we need a word processor to figure out that a file is a document, and to see and manipulate it. Similarly, some of them may be images, maybe JPEG images or something else, and some other software is needed to figure out that this file is storing an image.
And then we have something like language processors. The file may be a program, and a language processor will understand that this is a program written in a programming language and then compile it, execute it and so on. So this is another piece of software I use to manipulate my files. And then there is this thing called a database management system, which can take a file and provide you the abstraction of a database. So the database management system is the software which can interpret the strings of zeros and ones that are written on the disk as values which you and I can understand.
So essentially it is some software to manage data. It will allow you to store and retrieve data from the disk efficiently, meaning without spending too much time. This is very critical, and as you would expect, it is an essential part of any application. If the data I manage, process, input, type in and so on is not available when I come back tomorrow, the application is no good. We have seen this in the web lecture: this is the architecture of a website, and there is this database sitting at the back, deep inside, storing all the important and relevant information and history.
So what does a DBMS do? Basically it stores and retrieves data. You can write into the DBMS and read from the DBMS. This is the simplest of the things that it does.
There is another, more complicated function that it does. Let's say two people simultaneously want to read from or write to the disk. To give an example, there is one ticket, a train ticket from Delhi to Kanpur. Two people want to buy it. I have to permit only one person to buy the ticket. I can't issue the ticket to both people. This is called concurrent access. So a DBMS engine has built-in mechanisms to permit multiple people to access the data simultaneously.
The third function that a DBMS does is to recover from crashes. What we mean by this is, you know, computer systems fail: the hard disk crashes, or the power goes off in between.
Sometimes the software crashes. But if I have bought a train ticket, it should not happen that, because the computer has crashed, my reservation is lost. There should be a mechanism to recover. Or let's say I am transferring money from this account to that account. If the power fails in the middle of this transfer, the money should either be transferred completely or not transferred at all. You cannot deduct money from my account and not add it to your account. Either you do not deduct from my account at all or, if you deduct, you must also add. And if there is a crash in between, you have to make sure that either nothing is deducted or the full transfer from my account to your account goes through.
So these are very important functions of a database engine, and this is essentially what it does. It's actually quite a complicated piece of software. It takes a huge amount of code and competence to build a database engine.
So when data is stored in a database, it is stored as a set of relations. This is called the relational model. These database engines typically are relational database engines. Now relations are nothing but tables. Here is an example. Let's say I have an organization, and there are employees in the organization. I want to store their data in the DBMS. So I create a table, a relation called employee. This employee relation holds these kinds of information: the employee number, the name of the employee, the department in which the employee is working, and the date of joining. So here, if you see, 221 is the employee number. Deepak is the name of the employee. He works in Media Technologies and he joined in 2007. I have a large number of records like this. This is called a table or a relation. One row in the relation is called a tuple, and the columns are called attributes. So a relation will have many attributes and a large number of tuples, or rows, in it. And finally, if you want to build an application, its database is a collection of tables.
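As a rough sketch of the employee relation just described, here is how it might look using Python's built-in sqlite3 module. The column names (emp_no, name, department, joined) and the choice to store only the joining year as a number are assumptions made for illustration; the lecture does not specify them.

```python
import sqlite3

# In-memory SQLite database, so nothing is written to disk in this sketch.
conn = sqlite3.connect(":memory:")

# The relation "employee" with the four attributes named in the lecture.
# Column names and types are illustrative assumptions.
conn.execute(
    """
    CREATE TABLE employee (
        emp_no     INTEGER PRIMARY KEY,
        name       TEXT,
        department TEXT,
        joined     INTEGER
    )
    """
)

# One tuple (row), taken from the lecture's example record.
conn.execute(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    (221, "Deepak", "Media Technologies", 2007),
)
conn.commit()

# Read the relation back: the column names are its attributes,
# and each returned row is one tuple.
cursor = conn.execute("SELECT * FROM employee")
attributes = [col[0] for col in cursor.description]
tuples = cursor.fetchall()

print(attributes)
print(tuples)
```

Here the list of column names corresponds to the attributes of the relation, and each element of the fetched list is one tuple. A real application would hold many such tables in one database.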
Any organization will have a large number of tables. It will have information about employees. It will have information about the inventory it has. It will have information about the projects, the number of rooms it has, what each room has, and so on, as an example. So a DBMS stores data in a set of tables. This is also called a schema.
Suppose I have data in the DBMS. How do I write programs for it? You know some programming languages, right? You must have heard of them at least: C, Java, BASIC and so on. DBMSs use something called SQL. People pronounce it as "sequel"; it stands for Structured Query Language. Let's see an example of how SQL works. It's quite a simple language actually. Let's say I have this employee table and I want to find out which employees work in the DevOps department. I can write a query like this: select the name of the employee from the employee table where the department value equals DevOps. In SQL: SELECT name FROM employee WHERE department = 'DevOps'. So what does this query do? It goes and looks at the table and checks for all those rows where the department value is DevOps. So these three rows qualify. The first and the second row do not qualify. So I get the answer as Revati, Gaurav and Sunil. This is how you can extract information from the database.
So shall I give you a small homework? Write an SQL query to find the employees whose employee number is 221. What would the SQL query be?
If you look at DBMS software, there are a large number of products which allow you to build and manage databases. The relational database management systems are the ones which have been in vogue for a long time, and typically when we say DBMS we mean a relational database. I have a small list here. MySQL is probably very, very popular and a large number of people know about it. This is an open source, free database. MariaDB is open source and free.
It's a fork of MySQL; the two are very similar. Then SQLite is a database you will see on your Android phones, for example. Ingres and Postgres are also very well known database engines which have been around for a while. These are commercial database management systems: Oracle, DB2 and SQL Server. SQL Server is from Microsoft, DB2 is from IBM, and Oracle is from Oracle Corporation. A database engine actually is a very sophisticated piece of software and it's very expensive.
These days, people are talking about NoSQL database management systems, in the last few years especially, as data has become very big and e-commerce and web-based systems have come into place. These are some examples of NoSQL databases: MongoDB, CouchDB, Neo4j and so on. But today when we say a database, we mean a relational database, and these are the major relational database products.
So that's a quick introduction to what a database engine is. Thank you.