 So, here as I said we will go back to basics, we will very quickly look at the software requirement specifications, this is the document that you have to prepare for any information system and this will be followed by the actual database design. So, how do you design software, there are two components to be designed, one is the table design, the schema. So, for that we shall study the notion of normal forms and arriving at square tables which are considered the ideal table design for a schema. Additionally, we will have to consider designing software itself, the programs that will have to be written. Let us look at SRS documents, these are the draft contents, we have seen this slide earlier, this is just a quick repeat recap, whatever information system you are undertaking to develop, you will give a brief introduction to that system, you will write functional software specifications, then you will write the data models and data dictionary which is ER model and data flow model. This is what we have discussed in this course, as I have mentioned briefly, UML or Unified Modeling Language is what is often used to describe your system analysis using object oriented approach. We have not covered that here, however conceptually things are very similar and you can look up the UML to study how UML representation for your analysis of information system can be put together. In the same document, you will also describe the user interface requirements, you will describe the interfaces to other systems, you will describe the procedures and workflow, you will describe the acceptance criteria and you will have appendices indicating say details of data dictionary, details of data flow, whatever you want. This is again a reference to the fifth edition of Pressman's book, the software engineering a practitioner's approach and the design document structure which I am now describing is taken from the fourth edition. The fifth edition actually describes the UML and the use case approach to design. So since we have discussed the classical year model, the design is done in a slightly different way from the data flow diagram, but doesn't matter conceptually things are same. So information system design actually will result in the design document, just like you have a SRS document, you have a design document and what you have to consider in designing the system is data design, software architecture design, how these different program modules will interact with each other, interface design. So what are the interfaces between users and the system, a user manual, design of transactions and workflow and design of tests for testing whether your software will work or not. All of this is actually the design document. I primarily expect your groups to complete the data design and to provide at least the architecture design and interface design. So in fact data design and interface design with user manual, these are the two most important aspects from the perspective of a user of an application. Internal things are technology and since software technology is not being covered in this course, if you are not very heavy on elaborating that that is okay. But we should understand how exactly the design of a system is done. We are going to concentrate on data design. Basically data design means that in your ER model, you remember the ER model and data flow diagram. In ER model you will have entities and in data flow diagram you will have stores. Stores which contain some information coming from input going to some process and so on. Basically you want to map all these stores into SQL tables. You want to map all the entities and associations into SQL tables. Now, the fundamental principles of any design is first, the design should avoid storage of redundant information in tables. No unnecessary redundancy, same information being replicated here and there should not be there. And second, the schema must permit all the relevant information of our application to be represented and captured properly. We shall see some examples to show how we might actually miss out on containing some information in our design. To start our discussion, we imagine that my analysis has thrown up let's say a total of 300 attributes for different entities. Let's say I have 20 entities, associations, etc. But there are 300 distinct attributes. One very stupid way of designing the data architecture is that I say I have a single table called my application table, one table. And I have all 300 attributes there, all 300 attributes. So design is done, one table, all 300 attributes. Of course the primary key for this table will be a very complex key, right? It will have to have variety of components, some belonging to this entity, some belonging to that entity, whatever, whatever. And life may be slightly miserable. But from a simplicity point of view, I am delighted because I have exactly one table to handle. What is wrong in having a single table to represent all the data in my information system? That is the subject matter of our discussion in this session. Through this we shall learn a little bit about the notion of normalization theory. Normalization theory is considered an extremely important development in relational database design. We shall see the basics of that normalization theory and how it is to be applied. Quite independent of the mapping of stores into SQL tables, I might have actually files for bulk input output, transaction logs, audit trails. What is a transaction log? Let's say I am entering grades for students. Every time I enter, they should be a record saying I entered this data this time from this terminal or give it credit. So this is a transaction log. Now please remember that the transaction information is not a persistent information from a database perspective. From a database individual account holder's name and the balance, etc. is the persistent information. The rest of it is transient. But I will have to create some placeholder for the transient information. So if I want to cost check next day, etc. Now these are traditionally files. You know you write a log into a file and keep writing logs. Log is log record, logging. And keep writing. And once the database is updated you want to actually forget about that. But you need that file during a transient process. Now for that you don't want to waste your normalization theory concepts and things like that. It's not a database schema design. So when you are using a relational database you will routinely use a single table for a transaction file as a transient table. And you keep dumping logs and records into that table. At the end you may delete all records, whatever. That is how we say you will have to do some file design for bulk input, output, logs, auditors, etc. In our system we will be concentrating on how do we map our stores or entities into SQL tables. And we start with the assumption that ideally if I can have a single table to represent all data in my information system I should be okay. We shall want to find out why it is not okay. And if it is not okay what is okay. And that will lead us to the notion of the normalization theory. So let's look at that design. The question is what do we mean by audit trails? Audit trail is a notion that comes from financial transactions. In a financial transaction typically a debit has to be accompanied by credit. It's a double entry system. Those of you who know accounting will know that there has to be a double entry system. A single entry is not considered valid. That means these account books are somewhere imbalanced. Now an audit trail consists of two things. One is the transaction log itself. So let's say you are issuing cash, somebody is withdrawing cash. Okay. Now when the cash is withdrawn his particular account balance has to be debited. Okay. So you are debited that account, you credit cash, something like that. Account transfer is there. So my account is debited, somebody else's account is credited. These two entries may happen in different books. My account traditionally a register containing my account is one register, register containing your account is another register. Now there are 20 such registers and hundreds of pages. Different people are coming every day for doing this transaction. Imagine that there are such 35 transfers take place. Meaning 35 accounts are withdrawn, 35 accounts are credited. These 35 are not necessarily all distinct, some may be common. At the end of the day it is required by the bank to ensure that whatever I have debited has been credited somewhere. Otherwise somebody will lose money, either the bank or one of the customers. An audit trail helps you record all the debits and credits in a sequence separately. So that you don't have to open up those old registers at the end of the evening. You are sure that every time you have made an entry into that register you have also created an audit record. That is called a trail. At the end of the day what a banking clerk will be required to do is it will simply be asked to go through all the audit trail as we call it and match debits and credits. So if all debits and credits match then the books are balanced. This is of course presuming that every entry corresponds to a real entry in the register somewhere. Now when you don't have registers but computerized systems you still do the same thing. In a computerized system equivalent of updating an account book will be the account database table. An equivalent of checking the audit trail is to ensure that these log trails of debits and credits are balanced at the end of the day. So that's the purpose of the audit. So basically an audit trail is a property to log? Yes, the debit log, the credit log. You see it is not as simple as that because log is typically associated with an individual's activity. So if I am sitting as a clerk whatever I do is my log. However I am only debiting. The crediting might be done by him. There might be an inward remittance of money where I have received money from a branch in Allahabad. A debit has occurred in Allahabad branch. The credit has occurred here. So when I do debit credit in my branch I will not get all matching debits or all matching credits and these leg has happened somewhere else. So the audit trail is used not only to balance your books but also to flag out components which can be reconciled only when matched against something else. So if you take a bank-wise balancing that balancing is a huge affair and it may go into weeks or months in the world times. Because the Allahabad data will go to a central place, my data will go to a central place, Calcutta data will go to a central place and all these debits and credits will be reconciled. But audit trails will come from different branches and an audit trail for a branch may consist of multiple logs combined together for that branch. This is a functional description. This is not exact there are many nuances but it's a financial transaction terminology. The point from our perspective is that an audit trail or a log is not a conventional database schema on whose design a huge amount of energy has to be wasted. It is a bulk data exactly routinely the same contents for every report in that and that needs to be stored actually in a file but since we use database we will use a database table to store that file that's about it. So to begin with we describe the notion of functional dependency and the notion of normal forms. What you have are formal definitions of the three normal forms. The first normal form, the second normal form and the third normal form. We shall illustrate this with some example. So first let's very quickly go through the basic technical definition. First normal form, all attributes have atomic values and a normal form is a term used to as an adjective for a table. So given any database table that table is in first normal form, second normal form or third normal form. Then is a table in the first normal form. A table is in the first normal form if all the attributes of that table have atomic values. That means multi-valued attributes are ruled out. Single value for every attribute. Consequently all the tables that we have seen so far have been actually first normal form table. First normal form requirement is very simple. In fact routinely any table that you can think of as long as you avoid multi-valued attributes it's a first normal form table. In fact our notion that why not I design a single table containing all the data. I am of course assuming that that table will be in the first normal form because I have to avoid multi-valued data. So first normal form is a no-brainer, ordinarily you will get any table and in fact the question we are asking is why that form is not sufficient. So before you understand that let's look at the second and the third normal form. The second normal form says all non-key attributes of a table should be fully functionally dependent on the primary key. We introduce the notion of a primary key. All of you are familiar with primary key. Remember the keys that we have discussed. Super key, then candidate key which is a minimal super key and primary key is a chosen candidate key to be the unique identifier for that row. So these are the notions. So naturally now we say that look every table must have a primary key. Now if there are twenty attributes in a table the primary key may be either one attribute or combination of two or three attributes let's say. So that means for a primary key in a table there are several attributes which are non-key attributes. Like for example take the student table. Role number is primary key. Name is another attribute. Hostile is another attribute etc. So we have n number of attributes which are not primary key attribute. The second normal form definition requires that all such non-key attributes should be fully functionally dependent on the primary key attribute. We shall describe what full functional dependency is. But what should I say? Intuitively the idea is that if I have a table and I have a primary key consisting of a few attributes then every non-key attribute in some sense should be dependent on that primary key. We shall see what that sense of dependency is. There is an extended definition called third normal form which says that not only the non-key attribute should be fully functionally dependent on the primary key they should be non-transitive dependence in this case. So we are now complicated in the matter. To recapitulate we fully understand what first normal form is. All that is required is atomic values, no multiple values. The next one we can intuitively understand that we are talking about non-key attributes of a table being dependent on the primary key in some way. We shall see what that's some way. And in the third normal form we are saying something more complicated which we don't understand at this stage. There is a first second third normal form definition. Incidentally third normal form of design is considered to be a minimal acceptable good database design. That means if your database has 14 tables then each of the tables should be in third normal form is a criteria for good design, thumb rule. There are four, fifth, sixth normal form will not go into the normalization theory. We will just try to understand these basic things here. Here is a table, a single table called academic data or akal underscore data. You will recall our discussion of the academic system, student, course, grades, etc., etc. Now I am giving you the fancy option of saying then why do these bagajamari of database design with multiple tables. Let me put all the data in a single table and that is my design. So how design the schema? What would the schema contain? It will contain roll number, name, hostel, room number, CPI, course code, course grade, course name, credits, etc., etc. You agree that all attributes are covered here? So consequently if I have a single table like that everything is okay. For example 8900501 to Sunita is registered for CS413 which is a course on programming. She got an A grade, the course of 8 grades. Sunita is also registered for CS634. She got an AB grade called information systems, 6 grades. And Keshav Nori, another friend of mine is a student, is registered for this course, etc., etc. Observe that all relevant information seems to be captured properly. Now I may wonder why the hell did I do that ER model where I had student identity, course identity and the association called registers. This seems to be perfectly fine. In fact no complication, single table. So what could be wrong with this table? Any quick, particularly I would like you to remember what I stated as objective of the regular design. What did we say good data design objective? Redundancy of information should be avoided. Is this table design avoiding redundancy? Take for example Sunita Sarawagi. Her name is appearing here, here and any number of times that depending upon how many courses she has registered for. First take a course CS634. The name, information system will appear 80 times if there are 80 students who are registered for this course. Now this is clear redundancy. Can it not be avoided? There is one question. The second aspect was which perhaps you missed out. So let me go back to that slide because those two aspects are the most important aspects. They are the governing principles of good design. Let's go back to that. Mapping stores or entities or whatever into STL tables. Your design should avoid storage of redundant information in tables that we understood. Additionally schema must permit relevant information of our application to be represented properly. It is in this context that we went ahead and defined the second normal form and third normal form because first normal form the table that I showed you is in first normal form. There are no elemental atomic values, no problem. The fact that it violates my objective that redundant information is there is one way of looking at it. The other way of looking at it is from the normalization theory. We shall translate the design objective into normalization theory and see how that is relevant. The second normal form says that all non-key attributes of a table should be fully functionally dependent on the primary key. Let's understand the implication of this. Once again to recapitulate, next slide. This is a table and I am going ahead with a single table design called ACAD data. Primary key is role plus C code. This is understood. S-roll plus C code is the primary key. You all agree? This will uniquely identify every row in the table. This is S-roll. This is C-code. This combination is unique. That is common sense because they have student registers. I am actually capturing that association basically. So there seems to be nothing wrong in that. Primary key is uniquely identified. Now there is a redundancy here. I can see that but there is something more than that redundancy. First let me define functional dependency. If an attribute B is functionally dependent on another attribute A, we write it like this. A defines or determines B or B is functionally dependent on A. What is the meaning of functional dependency? Functional dependency meaning is that if B depends functionally on A, then A and B are two attributes. Now imagine a table with thousands of rows. This functional dependency requires that whenever a certain value of A appears in that column, the same value B must appear, same meaning not the same as A. Let us say there is a combination A value is 5 and B value is 27 in some row. If B is functionally dependent on A, then whenever A value 5 occurs in any other row, B must be 27. Same B value must appear that appears anywhere else. That is the functional dependency. What do we mean by saying? Here is an example. If some whole number appears in a row of table, the name of the student must be same that appeared in any other row. For example, go back to this table. This is a row in which the whole number is 5012. The name appears to be Sunita. We assume that Sunita Sarawak is the name of this student. Now, if the same row number appears in any other row in this table, it is common sense that the name must be Sunita again. It can't be anything else. Take for example, the course code. If CS 634 appears in conjunction with the course name in four systems, then any number of times CS 634 appears later, the course code must be information systems only. So, the course name is functionally dependent on C code. Student name is functionally dependent on row. You will agree that student hostel is functionally dependent on row. You agree? This is a common sense, functional dependencies. Now, what do we mean by full functional dependency? Any idea? What is the difference between functional dependency and full functional dependency? A full functional dependency means that the dependency is only on this attribute. There is no other dependency. Now, of course, there will not be any other dependency. But let's understand this notion of dependency in the context of a key, which is not a single field. See, any time I have a dependency, there are a single attribute marking a key and a single attribute marking a non-key attribute, then this dependency is also full functional dependency because B depends on A and there is no subpar of A on which B can depend. So, that is why we say name is fully functional dependent on a row. In the sense that there is no other attribute in that table which determines S name. S name is determined fully by row number. Nothing else is required. That definition is that dependency is complete. So, here are some functional dependencies that we can easily see from this single table. Hostel, row number, CPI are dependent on row. You agree? Because once a row number comes, same row number must have same row number, same hostel number, same CPI because we are talking about the same student. Similarly, course name, course credits are dependent on course code. Again common sense. Grade is dependent on the composite key. So, grade is dependent on S row and C code. This is the primary key of the table by the way. The grade is fully functional dependent on S row plus C code. Here is the distinction between full functional dependency and partial functional dependency. Why do we say fully functional dependent on this? Because any single component of this primary key cannot uniquely determine the grade. So, we say grade is fully functional dependent only on the composite key and not part of it. Now, here is the question. Let us take the course name. Course name is functionally dependent on C code. Is course name not functionally dependent on S row plus C code also? Of course, yes, because S row plus C code is a unique entry actually. So, there is no question of that entry appearing again. It will appear once and it will have this value. So, in a technical sense, the course name is dependent on S row plus C code also. However, it is not fully functional dependent. Because part of the primary key also determines that uniquely. We said second normal form, all non-key attributes should be fully functional dependent on the primary key. Many non-key attributes in our academic data table are not fully functional dependent on primary key. They are only partly functional dependent. So, what is the problem if this criteria is not satisfied? Let us look at the table again. Remember, I mentioned two things. One, the ability to avoid duplication or redundancy. And two, the ability to represent data correctly. Let us look at this. Data redundancy exists in the table. So, Sunita, Sunita, Sunita, Sunita. What are the problems with data redundancy? It is not only storage by the way. Extra storage is required which is unnecessary. But imagine that if Sunita changes her hostel from 10 to 11, then if Sunita is registered for five courses, I will have to make changes five times. So, five update commands I will have to issue. Or if the name of the course IT 640 is changed, I am sorry, this should be CS 634. If the name of the course is changed from info system to something else and if there are 80 entries there, 80 times I will have to change that. So, please note that data redundancy is not only affecting my storage. It is actually going to cause me a lot of work. Every time a one change occurs in that information, every redundant information will have to be changed appropriately. There is a huge amount of work. So, this is one problem with this design. We do not like this design because of the data redundancy. Notice why this is happening. This is happening because the hostel which was dependent only on the roll number is required to be repeated every time because the primary key is roll number and C code. So, if the roll number appears six times, but this fellow has to appear six times. If the course is registered for by 80 students, the course name will have to appear 80 times. This redundancy is being caused because I am clubbing non-key attributes in a table where the non-key attributes are not fully functionally dependent on primary. This is my friend Mutu Krishnan, this roll number. He is registered for course HS412. And you got a BB grade. Well, let's say forget the BB grade. He is registered for this course called sociology. Now, consider this. Suppose Mutu is registered for only one course, HS412. He is not registered for any other course. In this semester, he is not registered for any other course. He is registered only for one course. So, his entry appears here in this. Now, imagine his faculty advisor says that there goes humanity so much that he drops that course. So, he drops that course. What happens now? This roll goes out. Please remember, since Mutu has registered for only one course, when this roll is deleted, not only is registration information of HS412 is deleted, Mutu Krishnan himself is deleted. Do you get the point now? There is a student who registered for one course. He has asked to drop that course for another reason. Let's say, he had serious health problem. John days, whatever, whatever. And he was a weak student already. So, he has asked to do only one course. Now, let's imagine that he had cleared all courses, but HS412 remains. If he clears that, he gets his degree and goes out. In the last plus one semester, not last minus one, last plus one extra semester, he has registered for this course. His name appears in my table. And suddenly he falls ill. Doctor says, Kibaba, spend one more semester, register for this course next semester. So, he drops that course. What happens to your academic database? Mutu Krishnan gets deleted. Is that fair? Is that fair? Imagine, there is an elective which I have floated. Senators approve that elective. Let's say, performance of large transactional system, some arbitrary name CS679. Some elective, Senators approve that, that elective has been printed in the course of study bulletin and that course is offered. And let's say, no student takes that elective in this particular. Will that course ever come here? No student has registered. The primary key is what? Student roll number and the course score. No student, no entry. What does it mean? It means that a Senate approved elective is not represented in my database at all. Is that a correct situation? So, please notice that this is not, the problem is not to do with redundancy alone. No. The problem is that because of this non-compliance of full functional dependency, I am unable to maintain minimal information that I am required to maintain for my academic program. Namely, if a student exists and is enrolled with the institute, no matter whether the student is registered or not, the student must exist in my academic database. Similarly, if a course is approved by Senate, no matter whether a student exists or not, the course details must exist. The name, the credits must exist in my database. The first normal form does not permit me to guarantee the existence of the minimal information. And I can trace it to the fact that non-key attributes in this table are not fully functionally dependent on the primary key. They are only partly functionally dependent. Suppose I try to redesign this table in a way such that all non-key attributes are fully functionally dependent on primary key. I can't obviously do it with a single table because this single table has this composite key. So, what is the solution? We need to check and if necessary, we must decompose a table into multiple tables by projecting certain columns such that for each table a 2NF definition holds. Now, here is the trick. Let's go back to this table. Here is the single table. The single table has this problem. There are few attributes depend only on one part of the key. A few attributes depend on another part of the key. And only one attribute depends on both the parts of the key. Only one attribute is fully functionally dependent on primary key. So, one way I say is, alright, let me take out those attributes which are dependent only on one part of the key. And let me create a table containing those attributes and that key part only is those attributes are dependent. Let me do it for all non-key attributes. Now, I identify from here. I say, S name is dependent on S wall. SH is dependent on S wall. SR is dependent on S wall. SCPA is dependent on S wall. But C code is not, grade is not, C name is not, C grade is not. So, I say now, let me break this table. Let me create another table by projecting. You remember the project operation? So, project, S wall, S name, SH, SR, SCPA from this table. You get me what? It will get me a subset of this. A subset where Sunita is not repeated because the primary key for this subset will be only a row. So, you see from this single table, I first projecting a table which contains only a row number which is the part of the original primary key and all non-key attributes which are dependent only on a row number. Now those non-key attributes are fully functionally dependent on row number because there is no other part of the key. And imagine now this table. What is this table? This is actually the student table. You will agree that if I had such a table, then even if my friend, Motu Krishnan, registered some HS412, he will still find a place in that table. He will not get deleted because his name, SPI, etc. will be all preserved there. This is one part but I represent the original information completely no. I have taken a projection only of one part. Consider the other part. What is the other part? C code. So, what are the non-key attributes which are dependent on C code? Course name and course credit. Let me do another projection. I create a second table. In that second table, C code is the primary key and all non-key attributes which are functionally dependent on C code are included there. There the non-key attributes are fully functionally dependent on C code so there is no problem. That table is in second normal form. This table is also in the second normal form. Have I taken care of all? No. I am still left with a non-key attribute called grade. I have noticed that grade is dependent on the combination of a row and C code. So, I now do a third projection of this main table. In this third projection, I project a row, C code and grade. And now I say this third table, the primary key is a row and C code and the non-key attribute is grade and grade is fully functionally dependent on this combination. Consequently, I will end up with three tables each one of which is in second normal form. There is the notion of a lossless decomposition which means that if I get this data through projection into three tables then by joining these three tables, I must get the original data. I must not get any extra data or something. That is called lossless decomposition. It is a technical word. We will not worry about it. However, this is the important point that we should appreciate. If I had started with ER model, you remember now the same example that we had taken. What was our ER model? We model student as an entity. We model course as an entity. And we model the association as a third entity relationship or association. And we said as a simplistic way, let me convert the entity student into a student table, the entity course into a course table and the entity register form into another third table which is associative table. You will notice that these are precisely the three tables that you are getting from this procedure. And each of those three tables is in second normal form. Consequently, we conclude that ordinarily when I derive a table definition from an ER model, we will often get our tables in the third normal form. We have not seen the third normal form. We will definitely get the tables in second normal form. So, in fact, one advantage of ER model is that you do analysis using ER model but ER model itself leads to a very naturally good design at least second normal form. We do not know what third normal form and what the problems are but at least second normal form is taken care of. Is that clear? However, we must cross check. We cannot assume. If we translate ER model into tables, we must examine for each table what is the primary key? If there is a composite primary key, what are the non-key attributes? Each of the non-key attributes fully depends on the primary key or only part of it, this needs to be examined. Naturally, when information system becomes more complex, when the primary key for certain tables could be three parts, four parts, five parts, then you have to really examine each non-key attribute for the full functional dependency. I will not discuss the transitive dependency but indicate the problem that you may have when your table is in second normal form but it does not ensure non-transitive dependency and creates a problem. So I leave this as an exercise for you to go back and construct an example of what is a transitive dependency. The third normal form says that not only every non-key attribute should be fully functional dependent on the primary key but it should be non-transitively dependent. What is a non-transitive dependence? What is transitivity? I thought everybody knows what is a transitive thing? Transitive means A implies B, B implies C, means A implies C, am I right? That is transitivity. In terms of dependency, C is dependent on B, B is dependent on A and therefore C is dependent on A. In this case I say C is transitively dependent on A because it is dependent via B. We are saying that full functional dependency should avoid transitive dependency because if there is a transitive dependency there could be a problem exactly of the similar kind that we discussed. Either we are not able to represent that information or there is unnecessary need. The exercise is conceive of a transitive dependency and show that transitive dependency is not good which will lead to the same kind of problem. Show that through an example, construct that example. Take that as an exercise. In fact, if you study any standard database book it will explain a transitive dependency very clearly. The notion is not very difficult in terms of normalization theory. The normalization process is again same. If you find a transitive dependency, take project that thing out. If C is dependent on B, project B and C out in a separate table so that C depends only on B and B depends on A is preserved in another table so you have a third normal form. But what kind of problems you may have if the transitive dependency is persist can be checked by constructing an example. You can check up the standard text. As I said, there are multiple normal forms. Generally, in any practical situation even for a fairly complex information system if you can guarantee that all your tables are in third normal form then you have got a good design. Sadly, there exists many applications where this simple normalization has not been done properly by the analysts and designers because people simply take some model and divide some tables and as long as the information is there is okay. They either have redundancy or have inability to represent information or both. We are not talking about the example that we took is say a three table system which we tried to put in a single table and saw what problems we encounter but imagine if you have 50 entities imagine that a total of about 500 or 300 attributes exist and there are cross relations between them. You could get a very messy model. That way if you start from your model and convert it into tables you at least have a reasonable chance of getting tables which are in second normal form but as I said you must cross check and many times if you don't cross check inadvertently some tables may not be even in second normal form. If they are not you will have serious problems in implementing that information. Additionally, you should cross check for transitive dependency as I mentioned. There is another normal form called there is a fourth normal form, fifth normal form, etc. etc. But like in statistics you have first moment, second moment, third moment, fourth moment you can define 25th moment but generally after first and second moment the other moments are not very relevant except in very peculiar statistical inferences that you draw. I mean standard deviation for example everybody knows statistics right? So standard deviation is something which is commonly used and beyond that the third moment, fourth moment are relevant only in some specific cases. In exactly the same way second normal form and third normal form are fundamentally important. There are peculiar situations where fourth and fifth normal form may have to be considered. Another normalization is called voice call normal form or BCNF. This voice call normal form talks about dependency of non-key attributes not only on primary key but dependency of non-key attributes on candidate keys. You might have a table in which there is a primary key there is also another candidate key. So voice call normal form says that not only you should have full functional dependency on primary key but if there exists a conditional dependency on candidate key as well if you don't do that following problems will accrue etc. So this is an additional form but we will not bother about it except to note that such things exist. But you are now very clear about at least the first and second normal form and why second normal form is essential and how you can easily obtain it. Now this is true for any table. So you take any table. Suppose you have designed 50 tables. You examine one table. Check the primary key. Check all non-key attributes. Are they fully functional dependent? They are not. If they are not, project the portions as we said. So that means there is a constructive way of obtaining a second normal form from first normal form. Similarly a third normal form from second normal form. So it is actually a synthesis mechanism. The design obtains automatically from this analysis.