 Now, that was a many to many. Any relationship which is many to one or one to one, you can actually not create a separate relation, but fold it into an attribute of one of the concern entities, the many side of for many to many. So, here is an example, we had instructor related to department by INS department. He could have created a table for INS department with attributes ID for the instructor ID and department name, two attributes. What is the primary key for this table? Instructor department table, it could have been ID, because the department may have many instructors and instructor has only one department. So, if I used ID and department name, it is not a candidate key because just ID by itself is enough. But now, if you think about this, we are just unnecessarily creating a new table. Given this situation, you might as well merge this table with the instructor table and the only extra attribute is department name, when you merge the tables, the primary key is the same ID. So, whenever you have two tables is the same primary key, you can actually merge them into one table. There is a catch, if an instructor could be there without any associated department and you merge the tables, what is going to happen? The department name would be, would have to be null for that instructor. But since we allow null values, this may be okay. In this case, that cannot happen because the relationship here INS department is total from instructor. Therefore, that department name will actually have to be not null. You can declare it as not null if you created a table. So that would get folded in. That is what happened. Similarly, student department gets folded into student and advisor, we could have folded into student also. We did not do it, but we could have. Now, what about a one-to-one relationship? Turns out that attribute can be folded into either one. A student advisor is one-to-one. You can store the advisor with the student or the advisor with the instructor. Either one would work. So that choice is made by the designer. There is no unique way of doing it. And finally, the schema corresponding to the relationship, the double diamond, linking a weak entity set to a identifying entity set. The double diamond relationship will always be redundant. Why? To create the table for the weak entity set, we had to copy the primary key of the identifying entity set. So course ID was copied into section. Now, it should be obvious that the section course relationship would have simply duplicated the same information. It makes the primary key from the weak entity set would have already contained the primary key from the strong entity set. So this table is completely pointless. So we are not even going to do anything about it. We will just throw it out. And that information is already there in the section because the course ID is there. So that was for strong and weak entities, assuming simple attributes. What about composite and multi-valued attributes? How do you represent them? So we are going to show it by example. Here are composite attributes, name, first name, middle initial, last name. We can basically flatten this hierarchy if the relational schema doesn't support composite attributes. Some databases do support it, but not all. So if you want it to work uniformly, just flatten it out. So in this case, first name, middle initial, last name is unique across the whole thing. So we just make those directly attributes. Instructor, ID, first name, middle initial, last name. Similarly, the address would become street number, street name, apartment number, city, state, zip, date of birth. Those are the attributes. What about phone number and age? Age is not an attribute. It will be a method in a class or it will be a function in SQL. Phone number is going to be stored separately coming up in the next slide. Now if there are name clashes here, you may have to modify the names by prefixing. If I say address underscore street underscore street number, that will be unique regardless of what else is there. But here anyway street number is unique. So I did not have to prefix it. So now coming to multi-valued attributes. So here is our example with phone number which was a multi-valued attribute. Now in the relation which we create from instructor here, we cannot have a multi-valued attribute. So we are going to create a new relation inst phone which contains ID and phone number. So how did we do this in general? Coming back here, if I have a multi-valued attribute, the relation will contain the primary key of this thing and it will contain all the attributes of the multi-valued attribute. So that is what we did here. ID and phone number. Now the primary key for this would be ID and phone number. It cannot be just ID because particular ID may have multiple phone numbers. So the primary key is combination of ID and phone number. So if you had instructor 2222 with these two phone numbers, it would map in the database into two different tuples in the inst phone relation. Now if we just applied this procedure directly over here in our ER diagram to the time slot relation, what would we do? We would end up creating a relation called time slot with a single attribute time slot ID. We would also create another relation time slot time which has time slot ID, day, start time, end time. Now you will notice that this new relation time slot time is basically what we had in our relational schema as time slot. We did not create yet another relation time slot which has only the time slot ID. We could have done that. We chose not to do it. Now what are the trade-offs there? No is every time slot must have at least one associate. The set cannot be empty. We know that. If you could have a time slot which explicitly says there is no time it meets, that is an error. So we did not have to represent time slots which did not have associated times. So the set of time slots is actually a projection on this relation. But time slot ID cannot be a primary key for this relation because it occurs multiple times. But if you had stored a separate time slot relation having just time slot ID, then the time slot attribute by the way this section time slot is many to one to time slot and it is total which is why we made it an attribute of the section relation. Time slot ID was an attribute of section. Each section can have only one time slot. Now if I had a relation called time slot with a primary key being time slot ID, this could have had a foreign key reference into time slot ID. I could have done that. But to keep the schema simpler and also because time slot ID did not have any other attributes, we decided to remove that. But what we lost as a result is the ability to have a foreign key. So in our schema you can put any junk into the time slot ID field of section and the database constraints will not prevent this. Of course the way you can do it is either to implement it as an application program alone or you can create a bunch of triggers to enforce it. We saw very briefly yesterday how to create triggers. So if you are interested go read up the section in the book on triggers to see how to do this. So that wraps up the basic ER features. But there are also other ER features of which specialization generalization is very widely used. There is one more feature called aggregation which is actually confusing because the word aggregation is used to mean different things by different people. I will come to that. I am actually not going to cover it here in detail but I will just get to it later. So first of all specialization is something which everyone is familiar with. If you have done any programming in Java or C++, you know what is inheritance. And when a class inherits from another, one of the classes is more general, one is more specialized. In the ER model, we have exactly the same concept. And as we discussed before the break, maybe we have a person and then we have employees, contractors, consultants and other categories of people associated with the company. All of whom are persons. So that is a specialization. It is also referred to as ISA because employee is a person, a contractor is a person and so on. And associated thing is that attributes of the more general class are inherited by the lower class. So if I define a set of attributes such as name and id for person, an employee will inherit that. I do not have to again say that employee has attributes id and name. I just say that employee is a person and all the attributes are inherited. This is again exactly as happens in Java or C++. When you create a subclass, everything is inherited. So here is a simple example. We have person with id name address. Employee is a person. Student is a person. Employees have an extra attribute salary. Student has extra attributes, total credits. And going further down, an instructor is an employee. A secretary is an employee. Instructor has an attribute rank. Secretary has hours per week maybe. This is a fake example. We actually did not use this when we created the relational schema. But we could have used something like this. There are a few constraints on specialization. By the way, the words specialization and generalization are used interchangeably. The reason that two words came about is because when you do the ER modeling process, you may go either way. You may start off by saying, I want to model a person. After some time, you realize, hey, you want employees who participate in certain relationships. And then you want consultants who participate in some other relationships. And therefore, we need two subclasses or subentities of employee. So there, we took a person. We took a person, specialized it to employee and consultant. But sometimes it works the other way. For example, a library often starts off by saying, I have members who are students or faculty. Then after some time, the institution decides, here's a company which wants to use my library. I will allow a company membership. They'll pay me some money. I will let them come and borrow books using a company library card. So now I have a new type of user who is not a student or an instructor, but is now a company. Then there may be other members who are other libraries, who become members of this library to allow them to exchange books. So now you have all these members. And there was no meaningful thing we could specialize from. But instead of creating multiple separate relationships, the library may decide, we will create a new class called library user. And person is a subclass of that, maybe not person, maybe instructor, student, company, member, library. All of these basically inherit from this newly created library user class. So what we have done is, we had these individual ones. We have generalized up. And in fact, we can, with this new schema, we can have a membership number, maybe up there. And then have a relation, perhaps even, which library members and a tag, whether they're students, instructors, whatever. And now when you borrow a book, you have to relate it to library user, not separately dip to either student, instructor and so on. So when you lend a book, relating it to each of these can be a pain. Now you just have a single relationship to library user. So that is generalization versus specialization. Now regardless of how you did this, the final diagram looks the same. When you see the diagram, the ER diagram, you cannot say whether it happened because you specialized or because you generalized. So it's irrelevant. So we will use the word specialization or generalization interchangeably at this point. Now there are a few concepts regarding how you did this, created the specialized classes from the more general class. So one way is that it is condition defined. For example, if you have an entity set customer and then we have an entity set called senior citizen customer, which consists of all those customers whose age is greater than 65 years. That is condition defined specialization. Automatically by being over 65, a customer becomes a senior citizen customer and gets extra interest and so forth. The converse is user defined where you explicitly have to decide whether a person is an employee or a consultant or whatever. A more commonly used kind of constraint is disjoint versus overlapping. So in a disjoint case, a member of the more general entity has to belong to at most one of the lower level ones. So coming back here, an employee, I have shown it with a single arrow which splits into two. So in our notation, this indicates that an employee can be an instructor or a secretary but cannot be both. So this is disjoint specialization here. In contrast, here we have two separate arrows, heads going from employee and student to person. What does this allow? It allows a person to be an employee and a student. I am sure this happens in many colleges where you have faculty who are doing a PhD while they are also teaching. So somebody is an instructor and a PhD student. We say research scholar but in the database we may still model it as a student. So here you have an overlapping specialization. There is also another notion of completeness which is total or partial for a specialization which specifies whether an entity at the higher level said must belong to something lower or not. Supposing I have a university database where any person who is in that database must be an employee or a student. I will not allow somebody to be neither. Then I do not want to represent them in the database. In that case, the specialization is total. A person must be an employee or a student. But there are other situations. For example, the security database in IIT Bombay stores not only employees and students. It also has information about dependence of employees, dependence of students. It has information about people who come to work in the houses of employees, about people who are employed by a shop which is on campus and so forth. So we may not model those separately but we do model them as persons. But those persons may not be either employees or students but they are still persons. In that case, the specialization is partial. There are some people who we model as just persons. Nothing more is known about them. And some people who are not only persons but they are also employees. Some who are persons and students. So that is a partial constraint here. It is not a total participation. So finally, to wrap up this part of specialization, we again have to see how to convert specializations into tables. It turns out there are two options here. Method one is to form a schema for the top level entity. Note that the top level entity must have a primary key. Even if it is a weak entity, it has to inherit a primary key from somewhere. So it must have a primary key in the table. Now any specialization of that will automatically inherit all the attributes including the primary key. But when we create the table for the specializations, we have an option. With method one, we are going to copy the primary key alone from above and not the other attributes meaning the other attributes here are name, street and city are not copied into student. We only copy id from person and then we store all the local attributes. In this case, student had a local attribute, total cred and we store that. Similarly, employee had an attribute salary and we store that. So this is one method. The other alternative is to copy all the attributes into this table. So student will have id, name, street, city, total cred. Employee has id, name, street, city, salary. So two representations are possible. Which one should I choose? Well that depends on whether the specialization is total or partial, overlapping or non-overlapping rather. So there are several issues here. If the specialization is total first of all, which means every person must be a student or an employee. I don't even have to store the person relation with this method. I just need to store student and employee. So that is one possibility with this method. And the benefit of this method is given a student, I can get all the attributes from one relation. I don't have to join with the previous representation. If I want to get the total credits and name, I actually have to join these two tables. So this may be faster. But there is a drawback. If the specialization is overlapping, so a person can be a student and an employee, there is a problem here. I have to create two tuples for the same person, both of which have name, street, city. I can have an inconsistency. In one tuple in the student, I may have the same id with some name and address. In the other copy, I may have the same name and some other address. Which is correct? I don't know, it doesn't make sense. So what you should do is when you have overlapping, don't use this method, use the previous method where there is no duplication. Pay the price for joints, it's okay. You need to do it to avoid inconsistency. But when you know that it's disjoint, this is a cheaper way, it's faster to use it. But you should still be careful. Maybe it is disjoint today. Tomorrow people may decide it's overlapping. Then it's going to be tough to deal with this. But what object relational databases do is something which lets you eat your cake and have it too. So what they do is, they will allow you to access student.name. Just as if it's an attribute of student. Even though the name may not be stored in student, it may be stored in person. And the database will actually do a join quietly behind the scene without you needing to bother about it. So even if you use this internal representation, externally you can access all the attributes inherited by student without any problem. Incidentally, what happens with objects in a programming language? It turns out that in a programming language, when you create an object, you have to give it a type, a most specific type it's called. So if I create an employee object, it is also implicitly a person object. I can create a person object, then it is only a person object. I cannot go back and say later, oh by the way this person is also a student and an employee. It's not possible in most object oriented programming languages today. But the same idea of overlapping, an object having multiple types which can change with time and was actually present in the earliest object oriented language. In small talk, for example, you can actually have such things which correspond to overlapping specializations. So programming languages for efficiency purposes decreed that overlapping is not allowed. It has to be disjoint, but otherwise the problems are the same. So at this point, I have said let's do another year design, but I am going to slightly change the order. I am going to wrap up with a summary of notation and then we will take questions and a design altogether. So if you have a question at this point, go ahead and submit it on chat. We will get a print out by the time we address the questions. Now let's look at a summary of the notations. A box is an entity set. A diamond is a relationship. Double diamond is identifying relationship. Double line is total participation. We saw how to denote composite, multi-valued and derived attributes. Primary key is underlined. Discriminating attribute is dashed underline. We had many to many, many to one, one to one. We saw cardinality limits of the form lower bound to upper bound L dot dot H. We saw roll names. We saw is our generalization, which could be total, which you indicate by drawing a dotted line and then saying total to indicate that everyone must be either E2 or E3. It cannot be just E1. And we had disjoint versus overlapping. The default here is overlapping. If you have a common arrow head with two things below, it's disjoint. Now I should note that in UML, similar notation is used, but they don't distinguish overlapping or disjoint based on the arrows. You have to explicitly state like this. Now this is the notation we used in this edition. What about the last edition? We basically used the Chen notation, which is also standardized partially in the IDE1FX notation, which used boxes for entities, ovals for attributes. And if you had a composite attribute, those were stuck as further branches off from this oval. Multi-valued attributes had a double line around them. Derived attribute had a dashed line around them. A weak entity set had a double box. In our notation, we didn't draw a double box because we did a dashed underline. That suffices to indicate that it was weak. Generalization and specialization were using triangles like this ISSA with the arrow downwards. Whereas in the UML and new year notation, we have an arrow head pointing upwards towards the more general. Total generalization was shown by having a double line. We don't use that. And then cardinality constraints were indicated using arrows just as we do. That part is the same. But there are other notations. In the Chen notation, instead of arrows, you use star. Many to many is star star. One to one is indicated by one one. Many to one is denoted by star one and so forth. There's yet another notation which is called the cross-feet notation which is quite widely used. In the cross-feet notation, relationships actually are drawn, binary relationships are just drawn by a line. There's no diamond. In addition, the cross-feet notation draws this thing which looks like a foot of a crow to indicate this side is many. So if you do it on both sides, it's many to many. If it's on neither side, it's one to one. However, it's many to one. And there is also this notation which is a bar or a circle. The bar here indicates, think of it as symbol one. This indicates that it's actually a bit funny. This bar over here indicates that even participation is total. It must be associated with at least one e2 object. And the circle here indicates that e2 may be associated with up to zero e1 objects. So the position is kind of flipped. So a circle here indicates that e2 may be associated with zero objects. The bar indicates e1 must be associated with at least one e2 object. That is a total versus partial participation. And finally, uml which is what we have moved towards. As I said, uml is a large diagraming notation with many components. Class diagrams are what we care about. So here, I am just showing a real notation and equivalent notation in uml class diagrams. Actually very close. Uml diagrams have a prefix to indicate whether attribute or a method is public or private or protected. We have got rid of those distinctions here although they are useful in class diagrams. The next thing is that in uml class diagrams, there is a line here. Above the line is attributes. Below the line is methods. We just drop the line and indicate a method with parenthesis derived attributes a method. We indicated a role label for relationships with a diamond. In uml, a diamond is not mandatory. You can just draw a line and it becomes a relationship. You can give role labels in the same way. In our notation, a relationship is a diamond with attributes optionally with dashed line. In uml, it is still a line. A relationship can be just a line. If it has attributes, you put a dashed thing here and then put the name of the relationship and the attributes here connecting it to that line. This is for binary. What if it is a ternary or larger? Uml actually allows diamonds for those cases. Finally, cardinality constraint. This is something you have to be careful about. In our notation, a constraint on this edge indicates how many times this fellow participates in this relationship. 0.1 here means e2 can participate a minimum of 0 in a maximum of 1 times here. Whereas e1 can participate from 0 to many. In uml, binary is simply an arrow line with a name written on the line. But now, 0.star is written on the opposite side. In this line, when I say 0.star, it means e1 participates 0 to many times. Whereas 0.1 indicates e2 participates 0 to 1 times. So, be careful about this flipping. In any relationship, they are actually the same in uml. Generalization, specialization is actually very similar in uml. Except we said that if you have two separate arrows, it can be overlapping. A single arrow branching is disjoint. In uml, actually, you can use single or double arrow as you please, but you have to say overlapping after putting a dashed line here or disjoint to indicate which it is. So, that is it for the diagrammatic notation. Now, we have time for questions. So, let us end perhaps for trying out one more sample ER diagram on the fly. So, let me start by seeing if anybody has questions. M.E.S. Pune. M.E.S. Pune, we can see you. Please go ahead. There is a small question regarding the DFD and ERD. There is a small question regarding the DFD and ERD. In the DFD, we are using the data store and in ERD, we have the entities. So, what is the relation between the data store in data flow diagram and entity in the ER diagram? Is it necessary to be the same or what is your opinion about that? Please turn off your mic while I reply. So, first of all, let me explain what is the difference between an ER diagram and a data flow diagram. A data flow diagram makes sense for a process where data is flowing between different stages of a process. Whereas in an ER diagram, we are modeling a database state, not a process. Process is separate. You can have a database schema, which is what we are modeling with the ER diagram. On top of that, we may have processes which require data to flow between different steps of the process. So, they are two completely orthogonal things. There is no direct connection between these two. Now, the specific question is a data store in a data flow diagram So, perhaps a data flow diagram, a data store may have many things in there. In a database, we normally view a database as covering every piece of data which is relevant. In a data flow diagram, you may say that one step uses a certain piece of data, which is then passed on to another step which may be generate some other piece of data. So, data is actually moving between different steps. So, that is a slightly different way of looking at the world. In the database world, we think more often of the central database, which has all the data. You may have things which take something here, process it and stick it back, but that is not actually flowing from one data store to another. It is just the database. But data flow still makes sense when we are loading data into the database in the first place. Data is coming from somewhere. So, sometimes you get data periodically from external sources and then you have to process it and then load it into the database. So, here you have a data flow. So, your external source is a data store. You get data from there, process it and then stick it in the database. So, it may make sense for an application to have a database and an ER model for it and additionally to have a data flow which you model also, which shows how data gets into the database. I hope that answered your question. If you have a follow-up question, go ahead. Over to you. Hello. Yeah, please go ahead. If I have to change, sir, it is regarding that schema diagram for university database. Sir, if I have to change that, if I have to do some changes in that schema diagram, I want to accommodate the daily attendance of every student. Like, sir, if in one slot, there is a lecture of one subject, then I have to record whether the student has attended that subject and whether that slot was conducted by someone. What changes I should do? Is it better to add a complete new entity there? I should have simply relation. Over to you, sir. Thank you. Please switch off your mic. Thanks. So, that is actually a good question. What if I want to store the attendance of a student for each lecture of a course? What is the right way to model this? So, one way of doing this is to actually model each lecture as an entity, maybe a weak entity associated with a section. So, here we have a cascade of weak entities. A lecture is identified by a section which itself is identified by course. And then we can have a participation of student in lecture, which indicates the student attended the lecture. We can also have a participation of instructor in a lecture which indicates this instructor conducted that lecture. So, sometimes you have courses which have multiple instructors and a particular lecture may be conducted by instructor. One, another lecture by instructor. Two. So, you can certainly introduce a new concept of an entity called lecture related to student, to section and to instructor. This is one way to model it. Now, from a practical viewpoint, sometimes this may result in a lot of data being stored. So, if you want to get away with much less space storage, you may do what we do practically when we record attendance. We have a sheet which already identifies which course section it is for. And it has an array of student names and lecture. So, we basically have two-dimensional array, student name versus lecture. And then we go and tick in that array saying this student attended this lecture. So, an option is to, if your database supports an array, is to maybe not use a two-dimensional array. But maybe the relationship between student and section could have an attribute called attendance. That attribute could be an array with as many entries as there are lectures in that section. And then in each entry, it can be a bitmap, 0 or 1. 0 is didn't attend, 1 is attend. So, this may be a more efficient way of representing attendance. So similarly, maybe in the teacher's relation, you could do the same thing between instructor and section. If you want to record who taught which section, you could go and have a similar array which says that this person taught these lectures by having an array of lectures with 0 indicating this person taught that lecture, 1 indicating that lecture was taught by somebody else. So, this is a practical issue. Conceptually, it makes a lot of sense to have an entity called a lecture. It means something significant here. But certain implementations for pragmatic considerations may not treat a lecture as an entity and just use an array. Thank you for asking that. That is a good question. So, let us maybe move over to some other center. Next question I will take is from Calicut. So now, we have selected NIT Calicut. Calicut, please go ahead, you have a question. Sir, is there any tool for creating an ER model or designing ER diagram by giving the relations? So, you are saying if you started with the relational model, can you go back and create the ER model? Is that your question? Over to you. Yeah, by giving the relations. Sign the ER diagram. Over to you, sir. So, to reiterate what I just said, I understand the question as if somebody gave you a relational schema without having done an ER diagram, can you work backwards and construct an ER schema from the relational schema? That is a good question. You can certainly do that. So, for example, anything which has a foreign key referencing it is probably an entity and you would create an entity corresponding to those. Anything which has foreign keys going out and nothing coming in may well be a relationship. In addition, by understanding the domain, it should be more clear whether it is an entity or a relationship. So, each such table can be mapped to an entity or a relationship. Furthermore, you may have an entity which has foreign keys going out and those will turn into relationships and you remove the attribute and some of the relationships would have been converted to relations. So, going backwards, some of the relations will have turned into relationships, as I just said. So, it is certainly possible to reverse engineer and create an ER diagram from a relational design and sometimes that is required because you do not have the ER schema to start with. I hope that answered your question. In fact, this is a good exercise. I have used it as exam questions occasionally, giving a relational schema and asking people to come up with the ER diagram. One of the nice things about doing that is usually there is not too much alternatives for what you could come up with. If you just give a general design problem to students saying, here is a domain, create an ER diagram, students will come up with 100 different diagrams, 50 of which will be correct. So, sometimes it becomes difficult to go through all of this and correct it. So, if time is short, this is an alternative which forces a unique or almost unique ER diagram and also test the understanding of the students. If they have understood the concepts going one way, they should also be able to figure out how to come in the reverse direction. Although we have not taught it explicitly here, I just very quickly outlined some heuristics you can use to come backwards. But these are things which an intelligent student should be able to figure out at least in part. So, it is a nice challenging question to give to students. I have a few questions which were asked on chat and I think couple of these are very important. So, please pay attention. One of the questions is, is there any tool for drawing the ER diagram by giving the relation? So, maybe that was the question which you were asking. Is there a tool for it? My answer was yes, you can do it manually. So, the answer to the second part, is there a tool which does this automatically? I do not know of any. But a related question is, is there a tool for drawing ER diagrams in the first place? The answer is yes, there is a tool called DIA. It is a diagram editor which is an open source free tool which runs on Linux, Windows and other platforms. So, in today's lab session, there is a related question, is the ER design tutorial in the classroom or the lab? So, that is a good question. So, what I would recommend is that when you draw an ER diagram initially, do not sit near a computer because actually drawing it on a computer is much slower than drawing it by hand. So, today's session, I would call it a tutorial rather than a lab for the first part and what you should be doing is sitting in the classroom or in the lab if you are already in a lab but do not use the computer initially except to see the questions. So, you put up the questions in whatever form print out or local website, whatever but the answer should be written on paper initially. The ER model should be done on paper. The exercise for today includes both the drawing the ER diagram and converting it to tables at the end of it. So, I think the solutions for the table should have also been put up. Research scholar Karthik was working on that in the morning, I hope it is up. So, hopefully we will mail that also out to you now but in any case it will be mailed out to you in a little bit. So, you should have all that available. So, you should conduct a local discussion so that is with respect to conducting the lab today with first part. The second part is that I told you there is a tool called DIA. We use that tool during the co-ordinators workshop here. So, all the co-ordinators should have used it and hopefully it has already been installed on all the machines locally. If not, you can download and install it straight forward. And once you are done with the diagram doing it manually you can take the same diagram and enter it into the DIA editor. Now, why the DIA editor? Why not use some simple tools like go into PowerPoint and draw it. The answer is that if you draw with PowerPoint or similar such tool drawing tool it does not know what is an entity, it does not know what is a relationship. So, you will have to explicitly put text boxes put the draw lines and do various things. That is a very crude way of doing stuff. There are much better editors targeted for specific applications. And Microsoft VCO was the most popular one. DIA is basically an open source alternative to VCO and because it is free and open source it is much more convenient for us to use it. And what DIA does is it has components for each type of thing we put in there. So, for an entity there would be a component for relationship there is something. So, these are shown as tool boxes in DIA. I wish I could do a demo on the screen but the setup here does not allow it right now. But what you can do is you can create entity boxes and then instead of going in there and expanding the box and typing in attributes you have an UI which lets you add attributes specify whether the attribute is public, private or primary key, etcetera, etcetera and then the diagram is generated from this specification. The specification itself is entered through a graphical interface. So, this makes it much easier to create diagrams. Even the connections between relationship and entity becomes a connector which means if you move an entity the line will move automatically. So, there are a lot of nice features which DIA provides and that is why you should be using DIA not a low level tool like PowerPoint draw drawing tools in Word. They do not cut it. So, once you have drawn the year diagram on paper then go back and convert it. A trick here is we have given you a sample year diagram in DIA. I would suggest you make a copy of it and then create new diagrams by simply copying and pasting the same entities and then editing it double click on it to open the entity editor or the relationship editor and that is a lot faster than searching through the toolkit to find the correct thing which you need to put in a place. It is copy paste the same constructs from an existing diagram rather than starting from scratch. I would recommend that. So, there are a few more questions here which I will take. First question is how are derived attributes used in relational tables? Are they stored? Are they calculated each time we require them? The answer is in the basic relational model there is no way to represent a derived attribute. You will have to create a separate method to compute it. However, if you have an object relational database you can create a derived attribute as a method related to a record and then it is exactly what you would expect. You involve the method you get back a result. It is not stored. The next question is how do you convert a ternary relationship to relational model and the answer is very straightforward just like with a binary relationship you will have one set of columns corresponding to the primary key of each of the involved entity. So, if you have a relationship between E 1, E 2, E 3 the table for the relationship will have all the primary key columns of E 1, all the primary key columns of E 2, all the primary key columns of E 3. So, three set of attributes followed by the attributes corresponding to relationship attribute. So, if a relationship has an attribute that becomes an attribute of the underlying relation which you create from the relationship again following the rules for mapping multivalued attributes and so forth. So, that is an useful exercise to take a relationship with a multivalued attribute and try to map that to relations. It is worth trying that out. The next question is when to create table relationship side. Is it when we have many to many mapping cardinality? Yes, the answer is yes. When you have many to many you have to create a table for a relationship. When it is many to one generally you can fold it into one of the on the many side but sometimes you may choose to create a table anyway because maybe you anticipate the constraint will change. Today it is many to one, tomorrow it may become many to many. So, you go ahead and create a separate table anyway too. So, that later the schema will not be required. You simply remove some constraint and allow multiple records in that relation with the same key. So, then that is easy. The next question is we want to create an ER diagram. How do we decide whether an object should be treated as an entity or an attribute? This is again a very good question. You may have an example, maybe a site which let us say people are familiar with LinkedIn maybe or there are other social networking sites which may say what is your educational background. So, they may say you know you can enter in there I did a BE from so and so college. Now, if that site has no further notion of what is a college then it makes sense to store the college as just an attribute. Supposing the site decides that I want to model the college as an entity because I want to track more information. I want to know who all are in that college. I want to keep information about that college where is it located and so on. At that point it may say no, no I should not be treating college as an attribute I should be treating college as an entity and then each person will be linked to the college entity by a relationship. So, it depends on the needs of the application. If you are very sure that you know I am just going to store it as a string I do not need any further information about this thing fine map it as an attribute but if you feel that there is more depth to this that I need to deal with things like relationships to this thing then it should be an entity. So, it is not a very firm answer it depends on the needs current and future needs. So, maybe today you are not doing anything more with college but the site has an idea that tomorrow it may want to add more features about colleges. Therefore, even today it decides that college is an entity not attribute. The next question is how do we document an ER diagram? Well an ER diagram is actually part of the documentation but as we discussed there are many constraints which sometimes you cannot put in the ER notation. Then what you can do is add textual description describing what are those extra constraints with the ER diagram and overall an ER diagram is typically part of a larger design document. So, when we have projects done by students here for the database course this is lab what they do is they come up with a systems requirements specification SRS they first define a problem the first stage is to say what is that they want to do roughly speaking few paragraphs. The second stage is to break this up into details I am sure many of you would have already been using SRS a lot of student projects end up creating an SRS for their final project. Here in IIT we do it for the lab project also and then the ER diagrams become a part of this overall specification. So, it is part of the overall documentation. I am going to stop there and we will take a break. Thank you.