Today, we are going to discuss the second lecture of the database management system course. Before starting, we will briefly review what we did in the previous lecture. First of all, we gave an overview of the entire course. Then we discussed some basic definitions. Then we compared the database approach with the traditional file system approach. And finally, we discussed some of the advantages of the database approach; in this lecture we will cover the rest of them. So, let us start today's lecture. This slide shows what we are going to discuss today. We will start with some more advantages of the database approach. Then we will see some costs that we have to keep in mind, costs that we have to pay when we adopt the database approach or database environment. Then the levels of data that we encounter in the database approach, and finally the database users.
Before starting today's topics, I would like to explain some terms that we used in the previous lecture and will use today and in the following lectures. The first is: what are data and information? What is the link between them, and what is the difference? We have already defined data as raw facts. But most of the time, raw facts by themselves are not of much use to the user. For example, you see some data on the slide. You can guess that there are some names written here. But what do these figures mean? What is this data, and how are these facts related to each other? This is not clear from the data alone. So data by itself is not very meaningful for the user. Then comes information. A common definition is that processed data is called information. So, what do we mean by the processing that we do on the data?
Processing can mean anything, from something as small as just naming the data to large computations and many calculations. So, data is raw facts, and information is data after some processing. In our example, suppose we simply place headings on these data items: we say the first column is employee name, the second column is age and the third column is salary. This adds some information; it adds some knowledge about this data in your mind. We can raise the level of information by doing further processing. For example, we can add that the company name is Super Soft and the department is Sales. This is a small example of processing. Now it becomes a proper screen, with information presented in a proper way. So, I hope you are now clear about the difference between data and information.
The next term, which I discussed perhaps in the previous lecture and which you will come across very frequently, is schema. What do we mean by schema? A schema is basically a place, or repository, where the information about the structure of the data is stored. As you must remember, the definition of a database that we discussed in the previous lecture was that a database is a self-describing collection of integrated records. When we say self-describing, the description of the data is what is stored in the schema. There are different types of schema, and many other things are stored in the schema, which we will discuss later. But for now, it is sufficient to know that the schema is the place where the structure of the data stored in the database is kept.
Apart from this, another term that you may have heard so far is database application.
An application program is the same thing as a database application. These are programs written for a specific purpose, and they perform some kind of activity or processing on the data stored in the database. For example, storing data in the database, retrieving it, displaying it on the screen, printing it on a printer, deleting data or updating data: these are different examples of what you do with application programs or database applications. So, for database applications we can use several terms: we can call them application programs, applications, or database applications. All of these are the same in our context.
Another thing that we will discuss frequently is the database management system. This is also among the basics, and you must be very clear about what the DBMS is. A database management system is a collection of programs used to manage the data in the database and the users of the database. These are the two major activities performed by the DBMS. What do we mean by the management of data? We mean that we define the structure of the data, then we put data into the database, and then any changes, any updates, any removal of data are all controlled by the DBMS. Likewise for the users of the database: who will access this data, and who will perform what sort of operations? This has to be controlled and checked, and it is the second major activity that the DBMS performs. So, from now on, whenever I use the term DBMS, keep in mind that it is software, a collection of programs. It is a huge collection, because nowadays a DBMS may consist of thousands of programs, but collectively they are called the DBMS, and its main job is to manage the data and its users. Let us go to the next slide.
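To make the idea of a database application concrete, here is a minimal sketch, assuming Python and its built-in sqlite3 module playing the role of the DBMS. The employee table, its columns and the two sample employees are made up for illustration; they are not taken from the slides.

```python
import sqlite3


def run_demo():
    """A tiny database application: define, store, retrieve, update, delete."""
    # An in-memory database, so the sketch leaves no file behind.
    conn = sqlite3.connect(":memory:")

    # Define the structure of the data (this is what goes into the schema).
    conn.execute(
        "CREATE TABLE employee (name TEXT PRIMARY KEY, age INTEGER, salary REAL)"
    )

    # Store data in the database (hypothetical sample rows).
    conn.execute("INSERT INTO employee VALUES (?, ?, ?)", ("Asad", 30, 25000.0))
    conn.execute("INSERT INTO employee VALUES (?, ?, ?)", ("Sara", 28, 32000.0))

    # Retrieve the data.
    rows_after_insert = conn.execute(
        "SELECT name, age, salary FROM employee ORDER BY name"
    ).fetchall()

    # Update one record and delete another.
    conn.execute("UPDATE employee SET salary = ? WHERE name = ?", (27000.0, "Asad"))
    conn.execute("DELETE FROM employee WHERE name = ?", ("Sara",))

    rows_final = conn.execute(
        "SELECT name, age, salary FROM employee ORDER BY name"
    ).fetchall()
    conn.close()
    return rows_after_insert, rows_final


if __name__ == "__main__":
    before, after = run_demo()
    print("after insert:", before)
    print("after update and delete:", after)
```

Notice that the program never touches the stored file format itself. Every operation goes through the DBMS, which is the point made above about the DBMS managing the data.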
Here we are going to discuss some more advantages of the database approach, and the first one is data consistency. The database approach provides a very helpful, very convenient environment in which you can maintain the consistency of the database. What do we mean by data consistency? It means that if facts in the database are duplicated, the copies do not contradict each other; either they are the same or they support each other. Where data is duplicated without control, there is a good chance that it will become inconsistent, for example if a change made in one place is not reflected in the other places. In the database approach, two factors work against this. One, the database is designed with the sharing of data in mind, and two, the redundancy is controlled. So there is very little chance that the data in the database becomes inconsistent. Where data necessarily has to be duplicated, the DBMS helps the user and the system to maintain data consistency. We will discuss this in particular when we study the referential integrity constraint.
The next advantage is better security. In the previous lecture we saw a slide showing that all the applications access the data through the DBMS. Because all data access is controlled by the DBMS, the DBMS is in a very good position to check who is accessing the data, what type of access that user has, and what kind of operation he is performing. This means that in the database environment, with the help of the DBMS, you can ensure the security of the data very well.
After this, faster development of new applications. The database environment supports faster development of new applications. How? As we studied, the database is developed keeping in view the future growth potential of the system and also of the organization.
So, when we need to develop a new application, there are three possibilities for the data it requires. I told you that we design the database keeping future needs in view, so the first possibility is that the data is already present in the database as such. The second possibility is that the required data is not present, but we can derive it or compute it from the data that is present. The third possibility is that the data is not present at all, and in that case we have to make changes in the schema of the database. Mind it: a change in the data in the database and a change in the schema of the database are two different things. A change in the data, if it is properly controlled and done with proper authorization, is no problem at all. But a change in the schema can be very difficult to manage. If it is handled improperly, it can cause great loss to the database or to the entire system. That is why you must keep in mind that changing the schema of a database is a serious thing that you have to do very carefully, after a lot of thought. So, whether the required data is already present or has to be added, in both cases the database approach makes the job easier. Especially if we compare it with the file system environment, it is much more convenient to make changes in the schema. So, it supports easier development of new applications.
Another benefit of the database approach is economy of scale. Because we define the database with the objective of sharing data and sharing resources, you do not have to repeat your resources and your data again and again. This saves expense, which means that overall the approach provides you with an economy of scale.
Better concurrency control.
This is something that you will understand better in the later lectures. But just to give you an idea: in the database approach, multiple users can access the data at the same time. This is called concurrent access. It is very useful, but it has great complexities in it, and the DBMS handles them beautifully. There are elegant algorithms and techniques that control the concurrent access of the data by multiple users. DBMSs today support literally thousands of users at a time. You might have already seen an example, even in Pakistan. When we go to an ATM to get cash, the bank has many ATMs in the same city, and of course users will be accessing their data on many ATMs at the same time. They want to get money, but they cannot get it until certain checks are made: whether this is a valid user, and whether he has entered the right PIN. The PIN and the card number of each user are both in the database. Now, say 20 users are drawing money with their own cards at the same time; basically, they are all accessing the database. But the beauty of this concept is that none of them has any idea that any other user is also accessing it. The DBMS controls this automatically and makes sure that concurrent access creates no disturbance for the users, or for the data in the database. We will discuss this in detail later, inshallah. Let us see what is next.
Better backup and recovery procedures. A database means data that is valuable. Second, it becomes huge, especially in big organizations, over a period of time.
Now, the device on which the data is stored, for example a hard disk or any other storage device, can always get corrupted. The data is so useful and important, and has been built up over such a long period of time, that we cannot afford to leave it on just one device. For that purpose, we have to keep backups of our data. Depending upon the importance and the critical nature of the data, organizations keep many backups. So that is what we mean by backup and why we need it.
The second thing is recovery. The environments in which machines, computers and systems run are quite secure, and the chances of things breaking are quite low. Despite that, you can never altogether ignore the fact that, all of a sudden, something can crash in the middle of the work. The power can go off, although big installations usually have ways to guard against that. For whatever reason, the database was not closed properly; it suddenly stopped working. As a result, the data in your database may not be in a consistent state. So it is necessary that when you start the DBMS again, it detects that the last time the system went down, it did not go down properly, and then recovers the data. Recovery means bringing the data back to a correct state. How will it do that? We will study this later. For now it is enough that you have an idea of what we mean by backup and recovery. To conclude, the database environment provides you a very convenient and supportive environment for backup and recovery procedures.
But it is not all sugar. Apart from all those advantages that we just discussed, we have to bear some costs as well.
So, we have to keep in mind the costs involved in the database approach; we can also call them some minor disadvantages. The first one is higher costs. In the database approach, you need some specialized software in the form of the DBMS, some specialized hardware, and some specialized personnel as well. This means that if you adopt the database system, you will have to pay more for these three things, and you should be prepared for that. But believe me, it will definitely be justified. When the database runs and the company benefits, all the costs are justified.
After that, conversion costs. Conversion costs arise if you already have a system running, either a manual system or a file processing environment. Deciding to switch from your previous environment to the database environment is a big decision, and companies normally think a lot about it before they act, because a lot of effort and a lot of cost is involved. So you must be ready for that.
Apart from this, recovery is more difficult. Although the DBMS gives you the facility that in the case of a crash it recovers the data, recovery is more technical to handle, so you must have the appropriate staff. So, these are some disadvantages that we face in the database approach.
Data as a resource. The purpose of this slide is to explain to you the importance of data for an organization. What is a resource? A resource is any asset that is of value to the company or the organization and that incurs cost. If you look at any organization, you will see a lot of assets in it. For example, if you look at a factory, there is the building of that factory. There are the machines in that factory. There are vehicles in that factory.
There is office equipment in that factory. There are even human resources in that factory: the skilled people. All of them are important for the organization. Why? Because the company has spent money on them. And the other thing is that those assets are needed for the company to function and do business. So those assets are of value to the company. It will protect them, and it will not let anyone mishandle or misuse any of them.
Is data a resource? Yes. How? See, what does a company do business for? For profit. If you want to run a business on a profitable basis, you need accurate decisions at the appropriate time: the right decision at the right time. And to make the right decision at the right time, what do you need? You need information. And where do you get the information from? From data. So, if you do not get the data at the right time, you will not be able to make the right decision, and that will ultimately affect your business, its progress and its development. This means that data is valuable like any other resource of the company. Rather, it is more valuable than those resources. Why? Because if you do not use data properly, and you make wrong or untimely decisions, your business goes down. This means that you have to protect your data. You have to take care of your data just as you take care of your other, physical assets.
I will explain this through an example. Suppose you are running a store, and in it you deal in different items. In this kind of business, one thing you have to look at is which items are your dead stock.
Dead stock means items that have not been sold for a long time. While they are not sold, they sit in your inventory, which means that your money is blocked. The second thing is re-ordering. Re-ordering means placing a new order for an item. If your business is small, with only a few items, say 15, 20 or 25, then you can remember which items you have to order, or you can simply look at them. But as the number of items increases, it becomes important to know when you should re-order, because re-ordering at the right time keeps your business flowing. On the one hand, it should not happen that a customer comes to you for an item and it is not available. On the other hand, you should not hold so much stock that it blocks your money. With a few items this is manageable, but if there are very many items, the database can help you a lot. From it you can find out which items are your dead items, and you can identify them at the right time. If items have been dead for some time, or for a long time, you can put them on sale or do anything else, so that somehow you get back whatever money you can from them. The same applies to the re-order point. Some items are hot and sell fast; some items sell slowly. So when should you re-order them? In re-ordering, you have to take the delivery time into account: when you place an order, it takes some time for the items to reach you. So you should re-order at such a time that the new supply arrives just as your stock is running out. In this way, your re-order point or re-order time can be given to you correctly by the database, and on the basis of this information you can make the proper decisions.
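The re-order and dead-stock checks described above can be sketched as two queries. This is only an illustration under assumptions that are not from the lecture: the item table and its sample rows are hypothetical, the re-order rule used is "stock on hand is no more than average daily sales times delivery days", and an item counts as dead stock if it has not sold for more than 90 days. Python's built-in sqlite3 module stands in for the DBMS.

```python
import sqlite3
from datetime import date

# Hypothetical sample data:
# (name, units_in_stock, avg_daily_sales, delivery_days, last_sold)
ITEMS = [
    ("soap", 40, 10.0, 5, "2024-03-30"),
    ("tea", 200, 8.0, 7, "2024-03-31"),
    ("lamp", 15, 0.0, 10, "2023-11-02"),
]


def build_store():
    """Create an in-memory database holding the sample items."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE item ("
        " name TEXT PRIMARY KEY,"
        " units_in_stock INTEGER,"
        " avg_daily_sales REAL,"
        " delivery_days INTEGER,"
        " last_sold TEXT)"
    )
    conn.executemany("INSERT INTO item VALUES (?, ?, ?, ?, ?)", ITEMS)
    return conn


def items_to_reorder(conn):
    """Items whose stock would run out before a new delivery could arrive.

    Assumed rule: re-order when stock <= avg_daily_sales * delivery_days.
    """
    rows = conn.execute(
        "SELECT name FROM item"
        " WHERE avg_daily_sales > 0"
        " AND units_in_stock <= avg_daily_sales * delivery_days"
        " ORDER BY name"
    )
    return [name for (name,) in rows]


def dead_stock(conn, today, max_idle_days=90):
    """Items not sold for more than max_idle_days (an assumed threshold)."""
    rows = conn.execute(
        "SELECT name FROM item"
        " WHERE julianday(?) - julianday(last_sold) > ?"
        " ORDER BY name",
        (today.isoformat(), max_idle_days),
    )
    return [name for (name,) in rows]


if __name__ == "__main__":
    store = build_store()
    print("re-order:", items_to_reorder(store))
    print("dead stock:", dead_stock(store, date(2024, 4, 1)))
```

With 25 items you could do this by looking at the shelves. With thousands of items, the same two queries still answer the question at once, which is the sense in which the data is a resource for decision making.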
Similarly, another example is that some of your customers are not that reliable. Your business terms with them are not that good, so you have to identify them. You may have a general policy of giving credit to people, but the important point is which customers are not so good at paying you back. If you want to change your policy for those customers, you need this information, and if you have a long list of customers, working it out by hand is not easy and will not be accurate. But if you take it from the database, you will get the correct information and you can make the proper decisions. And good decisions, as I have mentioned before, help your business to prosper.
On this slide, you can see the different levels of data. When you are working in the database approach, at what levels do you encounter data? The first is real-world data. In the real world you have running systems in which physical things, people, procedures and processes are at work, and the data at that level is called real-world data. Think of your production system or your banking system and the people involved in it, for example the bankers and the assistants at the counters. They exist in the real world and have their own properties, for example their name, their address and so on. So, one level at which we encounter data is real-world data. When you want to bring this data into a database, the next level of data comes in. This level is called metadata. Metadata means data about data: what the structure of the data stored in the database will be, what its form will be, what its constraints will be. Since the structure of the database is stored in the schema, the schema is metadata. This is the second level at which we encounter data. And the third level is data occurrences.
Data occurrences come when we feed data into the database. Once you have defined the structure of the data, the next stage is to enter the data according to that structure. For example, you say that the name will be stored in the form of text. Then, whether there are a hundred thousand or two hundred thousand records, all the names will be stored as text. In the same way, if we define a field as a number, it will be entered as a number in all those thousands of records. The entities we had in the real world are now represented in the form of data. So, these are the three levels of data that we encounter in the database environment.
On this slide, you can see a system with some people working in it. First of all, it shows the environment of an office, where different employees are discussing something. Take one employee, as we have marked here. His properties are name, age, qualification and salary; these are properties in the real world, associated with this physical entity. Other entities of the system could be shown here in the same way. Now we come to the metadata level. We have to store data about these employees. So, when we define the metadata, we say that Emp will be the record type or file name, that name will be an attribute of type text, that age will be a number, and so on. This is an example of metadata. At the next level, when you store data according to this metadata, you can see here the different employees that have been recorded. So, from this slide you can understand how we see the data at different levels.
On this slide, we discuss the different types of users of the database. The first type is application programmers. What is an application program? We have already discussed this.
So, application programmers are the trained staff who write the application programs for a particular system, for a particular organization. To do this they use development software, and the programs they write interact with the DBMS to process the data. So, one type of user is application programmers.
The other type of users we call end users. End users are the users who ultimately use the database in their daily work. Once the database has been deployed and implemented, the application programmers create application programs for it. The organization for which the database was created will have different application programs for its different users, and the people who use those programs are the end users. For example, if you look at a factory or a company, there are different sections: there is sales, there is production. All of these will have different users, and every user will be accessing the data through the application program that the application programmer has written for him. These are what we call end users.
Generally, there are two types of end users. One is naive users. Naive users are those who do not know about the structure of the database or of the application, and they do not need to. All they need to know is how to use a particular application program. Their access is limited to that application program; they cannot go beyond it, nor do they need to. The other type is sophisticated users. They can access the data through the application programs, but if they want to, they can also access the data directly, bypassing the application programs. But remember that for access outside the application programs you must have two major things. One, you have to be familiar with the structure of the data in the database, and you have to have the appropriate skills for that.
And secondly, you need the authorization. As I said, naive users, who work with pre-written applications, neither need nor are allowed to access the database directly. Sophisticated users are normally in top management and have more authorization than other end users, so they can access the data directly, bypassing the application programs. Let's go to the next slide.
Another type of user normally found in the database environment is the database administrator, in short the DBA. The DBA is a person who has central control over the data and over the programs that access this data. The DBA is a technical person, responsible for the proper functioning and proper maintenance of the database. That is, the operation of the database, its safety and protection, and its backup and recovery are all the responsibility of the DBA. Because of this work, the DBA has to be a technical and qualified person.
What are the functions of the DBA? Let's discuss them one by one. The first is schema definition. Schema definition is generally the responsibility of the designer, and the designer may be an independent, different person, or it may be the DBA himself. It depends upon the scale of the organization and of the business. In some cases both roles, designer and DBA, are played by one person. In other cases you hire an independent company to develop the system for you, and once the system has been developed and installed, its maintenance is the duty of the DBA. But in organizations of moderate size, schema definition comes as part of the duties of the DBA.
Another responsibility of the DBA is granting access to the data. As I have already discussed, there are different types of users. Which type of user should have what type of access?
All this is of course decided by the management and the higher authorities, or at their direction, but its technical implementation is the responsibility of the DBA. Other than that, there is routine maintenance of the database, the routine activities such as backup of the database. Then there is monitoring disk space. When you first define and implement your database, there is no data in it at all, but as time passes and data is added in the course of day-to-day working, the database grows in size. Someone has to monitor, as the database grows, whether there is enough space to hold it and to perform the other activities related to the functioning of the database. This is also the job of the DBA. Then, obviously, monitoring jobs running: the jobs that are running have to be monitored. There are many different activities in this. Are people using the system wrongly, for example trying to access a particular type of data without permission? Are the printers working or have they stopped? Is one job being printed while many other jobs are kept waiting? How should such a situation be handled? So the DBA also manages the day-to-day working.
On this slide, I am showing you the typical components of the database environment, so that the overall picture comes together in your mind. All the components that we have discussed are shown, along with how and where each one plays its role. First, two things are shown: one is the DBMS and the other is the database. The DBMS determines how you access the data in the database and how you store it; this is the link between the DBMS and the database. After this, the application programs determine what kind of data you want to get from the database. After this, the database designers are the people who design
the database. I just mentioned that in a small organization the DBA plays the role of the database designer, but in a big organization the designer would not be the DBA. So the database designer designs the database. After that, as we have studied, the database administrator maintains the DBMS and its functionality. Then come the application programmers, who develop and write your application programs. The two arrows show that the database administrator, as a representative of the organization for which the database is being developed, assists the designer in the design phase of the database, and in the same way helps the application programmers to develop the application programs. And we have the end users. They are the people who use the system, and they interact with the database through the application programs. Since here the end users are concerned with the data only through the application programs, you can understand that by end users we mean the naive users. These components jointly are the software part of a database environment, these components jointly are the users of the database, and this is the data. So, this is a typical database environment that you will find in different organizations.
That concludes today's lecture. In today's lecture we discussed the basic terminology of the database environment. Then we discussed different advantages and some disadvantages of the database approach. Then we discussed the different levels of data, and finally we saw the different types of users in the database environment. In the next lecture we are going to discuss the architecture of the database. The architecture of the database explains at what different levels we see the data in the database, and in order to make proper use of the data and to design the database properly, it is very important to understand the architecture very well. But in this lecture I will give you a brief history of the database architecture. As we
have discussed before, data processing started with the file system environment. Soon people started realizing the drawbacks of the file system environment, and these drawbacks became more apparent in the 1960s, when America was sending the Apollo mission to the moon. It was realized that the mission was generating a huge amount of data, and there were no proper resources to handle and process it. So IBM and some other companies defined and implemented one of the first database systems, which was called IMS. IMS was very popular. Some applications are still running on it on mainframe machines, even though it is not being used for new applications, for reasons we will discuss later. The special thing about it is that it is based on the hierarchical data model. We will study that later, but the hierarchical data model supports one-to-many relationships; it is basically a tree-like structure, in which you have a root and its branches and so on. Between two entities, only a one-to-many relationship was supported. This was a limitation of the data model, and as a result it became a drawback of this DBMS, and people started finding it hard to apply IMS in different applications and situations. The other notable DBMS of that period was IDS. The special thing about IDS was that it was based on the network data model, and the network data model supported many-to-many relationships. So IDS was also popular.
Along with this, efforts started to standardize the architecture. A committee called CODASYL, the Conference on Data Systems Languages, started an effort to standardize the architecture of the database so that all databases could follow it. This committee created a group called DBTG, the Data Base Task Group, and in 1971 they gave their
recommendations, in which they proposed a general architecture for the database. This was a two-layered architecture. In it, the system view meant the view of the entire organization, the database structure representing the structure of the organization as a whole, and the user view was the view held by the different user groups. The system view was expressed through the schema and the user views through the sub-schemas, and a special language was defined for this, called DDL, i.e. Data Definition Language. So through the DDL we defined the schema and sub-schemas.
The recommendations of DBTG were presented to ANSI. ANSI is an organization whose work is to set standards. ANSI did not accept these recommendations and formed a committee called SPARC, which stands for Standards Planning and Requirements Committee. SPARC's objective was to suggest an architecture for the database. So SPARC suggested the architecture that we still use today, which we call the three-level schema architecture or three-layered architecture. In this architecture there are three levels of data: the physical view, the conceptual view and the external view. In the DBTG proposal the system view and user view were represented through the schema and sub-schemas; in the three-level architecture the three views are represented through different schemas, and all these schemas are kept in the database. After discussing the history of the architecture, we will discuss the architecture itself. Thank you.