 Now, let us move on to the second part of this thing which is the other features of entity relationship diagrams. This is part the presentation part to the chapter is the same. So, now let us add a new features which will which are motivated by what some of the stuff we saw earlier. The first is a notion of a weak entity set. So, let us start with an example and come back to some of the other stuff there. I showed this notation where the diamond relationship linking section and course was shown with a double diamond. Now, what is this mean? So, the point here is that section has to be associated with the course. So, I have to have a relationship. I also said that every entity must have a primary key. Now, what is the primary key of the section? How is the section uniquely identified? One option is to create a fake ID. I can do that. I will just create an integer ID each section has some unique ID then that is fine. Suppose, I do not want to do that. I know that a section is associated with the course and I do not want to create a fake ID and in our diagram we actually did not create a fake ID. So, it is identified by a course that it is linked to. I could add course ID as an attribute here. In fact, in the relational model course ID is an attribute of section, but if I put course ID here, then I am duplicating stuff. I already have the course ID in the relationship between section and course ID. So, again add course ID purely to get a primary key, I get a view application. So, I should not add course ID here, but instead I am going to say that the section is going to be identified by a combination of the identifying course here. So, there is an identifying relationship for section which is identified by the double diamond, but that is not enough. A particular course may have many sections, how do I uniquely identify which section it is? So, remember we used section ID semester and year. So, instead of showing those as underlines, I am going to have a dashed underline. So, this indicates that the primary key of section consists of the primary key of the other side of this identifying relationship plus all these attributes shown with dashed underlines. And I do not put course ID here. So, since this does not have a primary key, it is called a weak entity set. Its primary key can be inferred, but it is not shown as part of over here. What is shown instead are the dashed underlines which are called the discriminator. It is not the primary key, it is a discriminator. Is this clear? So, this is a way to get around this issue. So, weak entity, yeah, so in the, yeah, that is a good point. In earlier notation, we used a double box around it. Here we did not, we thought it is, the fact that it is linked by a total relationship with an identifying relationship and it does not have a primary key. The two together should make it clear that it is a weak entity. So, we could have added it, could have. One of the reason we did not add it is that, you know, the standard UML tools do not have a way to draw a double box, but yeah, we could have done it. For that matter, they do not have a way to draw a double diamond. We could have done it. We could have done it. We could have done it. We did not. So, coming back to notation, we have a notion of a weak entity set. We have a notion of an identifying entity set for the weak entity set. In fact, you can have more complex situations where you have multiple identifying entity sets for a single weak entity set. So, then you will have multiple identifying relationships which together uniquely identified and then the identifying relationship which is depicted with a double diamond and a discriminator also called a partial key which is part of the primary key. And the actual primary key is formed by the primary key of the strong entity set or the identifying entity set on which the weak entity set is, it is called existent dependent. Now, this brings up another issue. In addition to this notational issue of should it have a primary key or not, there is another deeper issue which says that a section cannot exist without an corresponding course. So, there is a deeper thing which is an existence dependency. This entity is existence dependent upon this other entity. It cannot exist by itself. It makes no sense for a section to exist by itself without a course. So, modeling it as a weak entity with an identifying relationship to another entity set makes this clear that there is an existence dependency. So, what is the impact of the existence dependency? It will have A, a foreign key dependence and B, if you want to allow it to be deleted, the course to be deleted you have to delete the section also otherwise it does not make sense. So, if you want to allow courses to be deleted, you would have an undelete cascade. You could create that automatically. But more likely you will not allow courses to be deleted and cause cascading delete of section. Okay, a strong entity set is one which is not weak. Although in this slide I am saying that the weak entity set is identified by some other strong entity set. It is even possible to have a cascade. You can have one weak entity set which is identified by a strong entity set and then another one further nested within it which is identified by this one and indirectly by that one also. So, you can have a chain of these things. I do not have slides on it, but it is possible. So, now let us with the notations we have seen so far. Let us look at the university enterprise. So, there are many things in here. Let us find some of them student entity is there, instructor entity is there, department entity, section which is a weak entity, course which is an entity, class to classroom which is an entity, time slot which is an entity and then a whole bunch of relationships. Now, if you see here instructor is related to a department, not that we did not put department name in instructor here. In the relational schema we did put the department name as a attribute, but here we do not. Similarly, students are linked to department by student department and students to instructors by advisor, we have seen these. Now, let us come to course and section. Course is of course an entity, it has course ID title credits as before, but it does not have a department because there is a relationship course department. Note a few constraints here. Instructor to department is double line, it is total instructor must be in a department. Similarly, in the relational schema we should put not null, so there are two parts. Because an instructor can have only one department, we could put department name as an attribute of instructor, otherwise we cannot. Second, because of the total participation here, instructor must have a department, therefore we can say not null. So this is where constraints in the year diagram are reflected in SQL constraints, which ensure that the database does not violate some basic property, is this clear? Now, let us come to the other thing which is courses have sections and this is a weak entity and note that because there is a double line here total participation with an identifying relationship, we know that section is identified by course and its primary key when we converted to relations will be course ID, section ID, semester ID. Note that in our relational model, we actually stored the building and rule number with a section, but when we do the year modeling we are not jumping to that. We have a section and associate a section to a classroom by section class. Now, we are assuming that a section, so in general a section might meet in different classrooms on different days maybe, but here we have made the restriction that section should have only one classroom, it should not keep changing, if it change we have to do more, which day is it in which classroom, we are not modeling all that here. So, we are insisting that a section should have should have a classroom and only one classroom, so total and arrow. Now, again this property that a section has definitely has a classroom and only one, let us later on when we convert to a relational model it let us, so it has an attribute in section, otherwise we would have a separate relation section classroom, which keep strike of which classrooms are associated with which section. So, take a teachers and takes for example, are not many to one, therefore they are many to many, therefore we had to keep them as separate relation, teachers is a relation, teacher takes is a relation and grade we made an attribute of takes, so that became an attribute of the takes relation and prereq is a relationship between course and course with role labels here, course ID and prereq ID, so the role indicators are here. And finally, I want to point this part out, time slot in our relational model was actually a bit funny, which causes some complication, but when we do the ER model, we want to model time slot as an entity, now this is a choice we made, we did not have to model it as an entity, but it might be useful to do so, we chose to do so here. Again there is no one true answer in the context of ER modeling, there are many many possible reasonably correct ER models for a particular enterprise, which you choose is a matter of the things which you want to model and your taste both of these. Section 2, classroom, because this is not shown as a double diamond, identifying relationships are shown as double diamond, no no just because a classroom is a strong entity, section is a weak entity, does not mean it identifies section, so if you think about it, section is not conceptually part of a classroom, it has an independent existence, in fact the total relationship here might be questionable, maybe when I first create data, I may actually create sections without an associated classroom, that's how it works actually, therefore I probably don't want to put the total there, but anyway in this case I have done that, so if you think of it at a conceptual level, section is not existence dependent on classroom, it is existence dependent on course and that's why we made the section course be an identifying relationship, so coming back here I chose time slot to be an entity, entity set, it has a time slot id and now it has a multi-valued attribute which is day start time and end time and I am not actually noting here that day and start time is a key for this set, it's not part of our notation, so there's a limit to how many kind of constraints we put in, not everything can be modeled using the built-in constraint, this actually is true in UML also, UML in general has a notion of constraints which it provides by itself, it also has the ability to put in any text you want there, additional constraints which are not part of the standard, but these are things which you can annotate it with, so maybe we could annotate here saying that day and start time uniquely identify a member of this set, so you could add on your own textual annotations to an entity relationship diagram. Now there are two other topics which are covered here, reduction to relational schemas and extra features, the reduction to relational schemas I think I have been telling you on the fly as we did it, so given that we have about 10 minutes left I'll just quickly touch on just one or two topics, first of all every strong entity set will become a relation of its own, every weak entity set would become a relationship of its own, relationships may become relations of their own or may get folded into entity sets, so here section became an entity set, section become a relation rather, whatever its attributes, you have to copy over the primary key from the identifying course, so course ID, section ID, semester year, what are the other attributes, that depends on what relationships we fold into section, we'll come to that in a moment, for the moment the section relationship has course ID, section ID, semester year, the course relation has course ID, title, credit, for the moment, moving on there is an instructor relation, student relation with just these attributes for now, advisor should it become a relation or should it become an attribute of student, that's a choice, we chose to make advisor a relation to potentially give us the flexibility of having a student with multiple adversaries, so first of all if this is many to many there is no choice, advisor has to be a relation, but because it is many to one, we actually have the option of storing an instructor ID along with student, now note that it's not total, which means the student may not have an advisor, you can model that by having a null value for that advisor, so what does it mean, if we keep advisor as an attribute of the student relation, it has to be a foreign key referencing instructor, however it is nullable, because the student may not have an advisor, so this is why SQL allows foreign keys which are nullable, but if it's total a student must have an advisor, you could say not null, forcing it to have a specific advisor, not a null value, is this clear, so supposing advisor had an attribute last meeting time, and I decided to fold advisor into student, then the attribute last meeting time will also get folded into student, now sometimes you can have redundancy here, so for example here, so if I created a relation institute department, then it's actually easier to, you could create a separate relation instructor department which has ID and department name, but it's easier to fold it in here, so when I say redundancy or schema, I'm saying that I could create the relation, but instead I choose to fold it in, the title may not be a very clear one, so I think we have seen all this, there are a few issues with composite and multi-valued attributes, composite attributes are actually easy, I can just create a name, for example a name first name, I could make it as name underscore first underscore name, name underscore middle underscore initial or something, so to make it unique, so it's flat in the relational model, what about phone number, it's a multi-valued attribute, the standard relational model or relational database do not have multi-valued attributes, actually many do support it today, I could use it if I wish, but if I don't want to do that, then what do I do, how do I store phone number, I have to create a separate relation phone number, what would its attributes be, the primary key of the entity ID and the second one would be phone number itself, so every multi-valued attribute would turn into a separate relation, that's shown here, so instructor phone with ID and phone number, I'll skip some of the details, these are obvious, now this brings us to why we did some weird stuff with the time slot ID, so we had section to time slot mapping, now each section has at most one time slot and time slot had a day start time and time, so what do we do here, I folded time slot ID into section, I didn't keep a separate relationship section time slot, I folded it into section, now for this part time slot, I could have normally I would create two relations, one is a time slot relation and then second one is a time slot timings or something like that relation which has this part, now what is the time slot relation contain now, what attribute with the time slot relation have, time slot ID, that's all, nothing else, so I could have done that, I could have kept that relation and then in section time slot ID would be a foreign key referencing time slot, I could have done that and then there's a time slot relation with only time slot ID, nothing else and the third relation time slot details which has time slot ID, day start time and time, time slot ID, day start time and time, what is the key for this because I know that I had this extra information that end time is unique given these three, this becomes the primary key of this relation, now I chose when I did the relational schema, I chose to merge these two tables, time slot and time slot details I merged it, what is the implication of this, basically those two tables one had just time slot ID, nothing else, the other had the details, I merged it into one table, so that's the schema you saw in the end, but there is one catch, in the merge table is time slot ID a primary key, it's not, in this table the primary key is time slot ID, day start time, in section now I have a time slot ID which is an attribute, can it be a foreign key, let me repeat the question, in section I put time slot ID as an attribute, on this side I merge those two tables into one, can time slot ID be a foreign key into this relation, the answer is what, yes or no, no why, because the three attributes together form a primary key, time slot by itself is not a primary key, it's not even unique, so it cannot be a foreign key, SQL does not support foreign key into something which is not unique, so I cannot declare it as a foreign key, so this is something which I have lost, that's a trade off, we chose this, so the flip side is, because it's not a foreign key I can put in any value I want in time slot ID, I can have a section with time slot 59 and no information about 59 in terms of the meeting time, so that is a potential for inconsistency, because I merge these two tables, so that's the quick summary of ER2 relational conversion and the last part is extended ER feature, I'll just do it by example, the first of these extended concept is specialization, this is actually a very very useful feature, I have between instructor and student, we had several common, we had ID, we had name, maybe we would have many more like this, now both instructors and students might be allowed to borrow library books, so as far as the library is concerned, I may have to keep, if I want to keep track of the separately, student, instructor, library, database has to have two separate tables, it actually makes sense for the library to have a single table, a single conceptual thing, the library, which is a library user, which could be any person, any person is a library user, so by modeling a person which has the common attributes here, student has a specialization of person, employee is a specialization of person and has salary, employee again can be specialized the instructor and secretary, instructor may have a rank, secretary may have something else, this is fake, do not pay any attention to the meaning of the attribute, you can create a specialization hierarchy, so this is very useful in ER modeling, we did not do it in our university here, but we could have, we could have combined student and instructor into a person, which we have done here, there are also issues in constraints on this specialization should it, so for example here, if this specialization is disjoint, that means a person can be an employee or a student, but not both, if it is overlapping an employee can be, a person can be an employee and a student, so that is a constraint, disjoint versus overlapping, there are many more constraints, I am not going to tell you complete everything, but completeness which is a constraint which is, should a higher level entity set, must it, must it necessarily belong to a lower level or is that optional, could you have a person who is not an instructor or a student, if so it is a partial, if it is total, if the completeness constraint is partial it is optional, which total every person must be an instructor or a student or both depending on the previous constraint, and now depending on the choices with the constraints that you put in there, you have multiple choices for the relational schema, so one relational schema is the, I have a person, let us just look at person, student employee, not the lower level, I have person with ID name, street city, student with ID total credits, but not named street city, I am not repeating that information, so a student is represented twice, once in the student table, once in the person table, similarly an employee is represented twice, once in the employee table, once in the person table, there is no duplication, another option, but of course the drawback of this is, if I want all the information of a student, you have to join with person and student, another option is to copy all the attributes down, person has ID name, street city, student has ID name, street city total credits, employee has ID name, street city salary, what is the difference between these two options, one is the efficiency, but this one can be redundant under what condition, if it is total, so if it is total, you could put nothing in the person table and have only student and employee, that could be an option, but then if you have some relationship to person, you cannot represent that, but if it is partial again some of these issues crop up again, there is another issue also, so that is one, total versus partial, the other is disjoint versus overlapping, supposing it is overlapping, a person can be a student and an employee, then all this information is repeated for the same ID, it is there in student and an employee, name, street, city are redundant, it is stored twice, so it may be stored twice in person and student, or twice in student and employee, or may be twice in all three, so this thing has a danger of redundancy, so you would do it if we do not care about person anymore, there are no relationships directly with person, so we do not have to store the person table, it is total, nobody is just a person and it is disjoint, nobody can be a student and an employee, then you can do this, otherwise you should go for the previous one, is this clear, so the conversion depends on all these constraints, we do not have time for this and I just want to wrap up, so there are other slides on comparison of this notation, summary of notation and comparison with other earlier notation, the 10 and the IDE 1 FX notation, so it is there in the book as well as in the slides, the stuff on UML again I am going to skip that, this is just comparing our yarn notation with UML, it looks visually close, but there are some differences and we will stop, there are a few more concepts, there is something called aggregation, I do not have time for it now.