 So, now coming back here the student ID and instructor ID is definitely a super key of advisor, but is it a candidate key can just the student ID uniquely identify an advisor relationship? Well, that depends if I said that the participation of student in the advisor relationship is many to one that means or if I use the cardinality constraint notation which indicated a student can have at most one advisor. Now, if I have just the student ID there is only one advisor I do not need that instructor ID anymore to identify the particular relationship. So, in this case when I have a cardinality constraint the pair student ID instructor ID continues to be a super key, but it is not minimal. I can just take the student ID and that would be minimal therefore, if I have a cardinality constraint of many to one or one to many only one of the two super keys from the two entities such as required that will be the primary key of this relationship. And finally, what about the one to one case it turns out either of them is good enough you know if I have two keys from the two entity sets in a one to one relationship together they are a super key each of them is also a super key because each one can participate in only one relationship. So, each of them itself is a super key and in fact each of them is a candidate key. So, we have two candidate keys and we have to choose a primary key. So, this is something which a designer decides that this is going to be the primary key for this relationship. So, we have done a fair number of concepts. Now, I want to focus on certain errors which are very common when people do ER diagrams. The first in the most common error is confusing ER diagrams with schema diagrams where a schema diagram lists all the end attributes which we are going to create in the eventual relational model. So, remember that the ER model is just a step towards creating a relational model in end because pretty much all databases out there are relational. So, we do ER modeling as a higher level way of understanding what is going on and then we are going to convert the ER model into a relational model and we will see later how to do this. But the fact that we are going to convert it means we are going to have a set of attributes and people often start thinking ahead of time as to what those attributes will be and stick them into the ER diagram. So, what happens if you do that? We knew that instructor has department name in the relational schema. If we started off with the ER diagram, but we jumped ahead and said oh we want to put department name as an attribute, we may end up with an entity set instructor with an attribute department name and further we have a department name entity. And then we will say well instructors are related to department. We want to represent this relationship by a relationship called inst depth, so instructor department relationship. So, what do we have now? What is what have we lined up with instructor, department, a relationship which is explicit that is correct actually, but there is also an attribute in instructor department name which is actually an implicit relationship between instructor and department because in our minds we knew this should be a foreign key referencing department. So, the model of this story is we have landed up with duplication and we have landed up with an implicit relationship both are to be avoided. And the fix here is to never ever have an implicit relationship through attributes. Make it explicit by a relationship and throw away the attribute, throw away the department name attribute from instructor. Yes, we may in the eventual relational design we may reintroduce that attribute, but do not put it in the ER model because we already want we want to explicitly have a relationship indicating which department an instructor is in. In fact, if we put a department name as an attribute, we are tying our hands behind our back. Why? What can go wrong? Well, today maybe instructors have a single associated department, we assume that, but maybe we need a case where instructor should have two departments. What do we do then? If we made it an attribute, then you know we are stuck, we cannot put a second department. So, in that case we should really have had a different representation, we will see how it should have been done. But the moral of the story is when you are doing the ER diagram, go in steps, do not make this assumption right away. Show the relationship, decide on the cardinality constraints of that relationship. In the end, after going through it carefully, we may decide as we did that department name should go with instructor because we decided finally that instructor should not have two departments. But we did this after thinking it through. We did not do it randomly. If we did it randomly, maybe the right decision should have been the other way. We did not think about it, we ended up making the wrong decision. And a mistake in the schema is really hard to correct. Once you build an application with department name as an attribute, then you realize oops, there is a problem, I need two departments for an instructor, we are in trouble. Because you have to go change the schema, which means all the application programs have to change and it is a lot of work. So you have to be careful with the schema design. Now we had several examples of relationships between the same entities, which use the same entity multiple times in a relationship. We discussed this father and mother relationship. If I had an entity set person, the relationship father is between a person and another person. So it is actually a looping back to the same entity set. Here is another example, I have courses and then I have prerequisites because a course may require a prerequisite. You should have done this prerequisite course before doing this other course. Now this is a relationship which indicates which all courses are prerequisites for each course. So this is a relationship which has two edges, it is a binary relationship, both the edges go back to course. Now I need to know which one is the course and which is the prereq. If I just say that prereq has CS 352 and CS 313, what does it mean? Is 352 a prereq for 313 or is 313 a prereq for 352, I will get confused. I need to make explicit which one is the prerequisite for which one. So the way I do it is I am going to put role labels on edges. So one edge I am going to say is course ID and the other edge is a prereq ID. So this indicates that this is a prereq for this course. So that is how we have role labels on edges whenever a relationship loop has the same entity participating more than once. Now even if you did not have the same entity participating more than once, it may still make sense to have role labels in certain other situations. So you can always put a role label on an edge anytime where you want to make explicit what it is. And this point in my course I normally stop and ask for suggestions from students saying let us do an ER diagram interactively. This is something which I find very nice because it makes students think and I would strongly encourage you to also do this in your classes. Now it's a little hard to get feedback on a continuing basis through video. But I'm going to give it a try anyway. We have about 10 minutes to the break. So what I'm going to do is ask you to indicate on a view. If you have a suggestion on something which we can try to model to create an ER model for something. So if you had a question about what we have covered so far, please turn off the indication on a view. At this point I'm only asking for people who have suggestions on some model which we want to create. So now let me go and see if somebody still has their hands raised on a view. So NIT Warangal has its hands up. So let me take a question from NIT Warangal, not a question, a suggestion. NIT Warangal, do you have a suggestion for what we should model? If you have, please go ahead. Sir, my question is about the attributes. What is the difference between a simple attribute and a single valued attribute? Over to you sir. Okay. A single valued attribute is what is not a multi valued attribute. So for example, we had a composite attribute such as address which had parts. I would still refer to address as a single valued composite attribute. Now if there were multiple addresses, then I would say that address is a multi valued attribute. But coming back, this is a perfectly good question, but I would like suggestions on something to do an ER model for. I'll take questions certainly in a little bit. Do you have any suggestions on anything which we want us to try modeling on the fly? Can you hear me sir? Yes, please go ahead. We think this particular subject for quite a long time and certainly in this sixth edition you have changed the ER model and how this is going to impact the community. That is what I wanted to know, number one. And number two is that as you have asked for some suggestions on the different applications on ER design, actually in our colleges we do float some beta projects and each project has got certain level of database designing. Based on this, actually we have got a project where we develop applications for a data consultancy service where they are actually giving us some problems and can we take that as a problem and we can proceed with that. So, let me answer the first part of the question and the answer to the second part is yes, do give us a suggestion once I hand over the mic back to you. But first let me answer the first part of the question. We have been using a certain notation for a long time. Now suddenly one textbook out of several goes and changes the notation. What do we do about this? What is the impact? Now I would suggest for a couple of years you make sure students are aware of both the notations. So, if they go and use a place where they see the other notation they need to understand what it means at the very least. Or if they are in an exam where they are asked to use that notation then they should be able to use it. So, although in this lecture I have only covered the new notation. We do have a few examples of how the old notation corresponds to the new notation. And probably you should encourage your students to do one or two designs using the old notation so they are familiar with it. But it is not a unilateral decision. It's not that we decided today that the old notation is bad and we are the first ones to adopt the new one. Industries have for a long time been using UML class notation. And we felt that it makes sense to move closer to that. It's more compact, the ER models diagrams, ER diagrams become very big often unnecessarily occupying space. It does keep some students happy because their reports are now 40 pages instead of 30, but that's not a good criterion for what is a good notation. So, we decided we would use this new notation. So, I hope that answers your question. People do need to know both these notations. Even before people needed to know it really because if they go to industry they are going to use class diagrams which correspond to our new notation. So, they already needed to know both. We are just shifting what is more important, what is less important. And now back to you and go ahead and give a suggestion. There is a particular peculiar kind of problem. The problem is that data company has actually tried to establish or open up a company in our area and hence they actually shifted the people from that particular place to another place. There is a shifting. So, they wanted to keep track of the details about those people who were being shifted and what is their criteria and all those that. But they were unable to look at the details about these people because of the variety and the diversity in the different informations, they couldn't track it out. Hence, they said if somebody could help us to look at and track out what is going on then perhaps it will be easy. So, we took the initiative and we collected certain information related to the people who have been shifted from one particular place to another place. So, can we model this particular problem using this ER model? Whether it is possible or not? That is what I wanted to know. Okay, it seems like in terms of ER modeling, this could certainly be done. It does not seem terribly hard in fact. So, it is a good example to do very quickly. In terms of the problems that TCS had doing this, it is certainly not from the ER modeling perspective. I am sure the problems they had were other practical issues in getting this information from existing databases. But to model this as in an ER model, so let us first list what are the entities. So, first of all, we have persons. These are the persons who we are tracking. We could have even call it employee, but let us just call it person for the moment. I hope you are able to see the whiteboard now. We are going to model several entities here. We have a person entity. Then you said that people were shifted from one location to another. So, I assume that by location, they mean a specific facility or an office of that company. And that office may be in a particular town perhaps also. I don't know whether they wanted to model this, but let's say we model an office, which is a physical office of that company. Maybe we also want to model the fact that multiple offices may be in a particular city. Maybe shifting people from one office in the city to another does not count as a transfer, but shifting across cities would count as a transfer. So, we need to know which city or town that office corresponded to. So, that's another entity. I am unable to think of any other entities right now. I said it is a very simple design here. Maybe there is also a position, which a person held. So, perhaps we want to model, we may want to model position as an entity. On the other hand, we may also model it just as an attribute depending on the needs. So, this is for a point in time. I can say who is currently at which place. But now, I think core part of the question, which I have raised is to keep track of the history, who was where at what point. And in fact, this is a very nice motivation for a problem, which is very real, which is we not only want to model the current point in time in a year diagram. We may also want to model history. What was the state yesterday? What was the state last year and so forth. So, the question is how do we model these temporal aspects in an year diagram? And it turns out that the basic year diagram is not very good. It doesn't provide very good support for modeling these temporal aspects. So, that can be an issue. So, let's see how we can do it here without introducing any new features. There have been proposals by various researchers to extend year diagrams with specific constructs for modeling temporal issues. But none of them has really been viewed as completely satisfactory and general purpose solution. So, none of those has actually been adopted widely. So, temporal aspects have to be modeled at a low level. There is no higher level way of saying that there is a relationship. It also has time associated with it without explicitly storing time attributes. So, we are going to store time attributes explicitly. So, let's say that person is related to an office by let's say we will call it posting. So, posting is a relationship between a person and an office where the person is posted. Now, you said that a person's postings may change. So, we want to know who was at this office earlier in this period of time and so forth. So, since this may change over time and we want to track who was where at what time we are forced to use the attribute of posting which is period start and end. This is the same thing that we did earlier and this is a multivalued attribute because the same person may be posted back in the same office again after some time. So, which were the periods of time when this person was posted in this particular office. Now, supposing we have a constraint that a person can be posted at only one office at a time. You may be tempted to model this constraint. By the way, there should have been a dashed line connecting this. I ran out of space at the top of the figure. So, I was forced to stick it right next to the relationship, but actually there should be a dashed line there. Now, coming back, how do you model the constraint that a person can have at most one posting at a point in time? You may think that we can put an arrow here. That is the notation we discussed later that earlier rather that when you put an arrow to in this side towards if we put an arrow over here from person to office that would indicate a person can have at most one office. Unfortunately, this prevents a person from having different offices at different points in time. So, this is what I meant by explicit support. I cannot model the constraint that a person can have only one office at a time using ER notation. There is no way to do this once I take time into account. I cannot do this. Therefore, I cannot throw in this constraint. The only way to do it is to write this constraint in words separately, but there is no notation in the ER diagram notation for it. Now, what about other things in here? Office could be located in city. So, that is a straightforward relationship which links office and city. Now, let us come to the harder part. What about the position that a person is in? So now, a person may be in a particular office, he may have a particular position. He may continue to be in the same office, but the position may change. How do I deal with this? Or the person may go to a different office with a different position. So, one way of doing this is to actually turn this into a ternary relation. Posting is a ternary relation which links a person, an office and a position. And it has a period. In fact, this works fine because a person can be in the same office with two different positions at two different periods of time. Can a person have two positions in the same office at the same point in time? This happens. Sometimes, you know, person is holding a particular job, but maybe a, you know, I am a professor here, but I may also be in charge of something, I may be a head of a department, I am not, but I could have been the head of a department, I could have been a dean. So, there are many possible positions that a person may hold concurrently. And this can certainly model that person in a particular office can have different positions. So, there are no cardinality constraints here. We mentioned earlier that we are going to cover cardinality constraints only for binary relationships. With ternary and higher degree relationships, if you start drawing arrow notation, it gets very confusing what it means. People have actually, there are different papers or textbooks which have different interpretations of the same diagram which causes confusion. So, we decided, let us not use any notation which actually causes such confusion. And we are going to, in a book, we mentioned that if you have a ternary relationship, we will allow at most one arrow out of it, which means that, let us say, if I draw an arrow here to post thing, that would be a constraint that this person can have in this office can have at most one position. That would also be wrong temporarily because the same person may have one position at this point in time and another position at another point in time in the same office. So, we are not even going to draw any arrows. So, we are just going to leave it unconstrained here. So, this diagram now includes person, office, position, post thing with period located in city. So, let us get back to people from Warangal and see if they have any input on this. Warangal, I am selecting you again and let us hear from you if this solved your problem or there were other aspects of the design. Now, I should mention before I hand over the mic to you that any year modeling process is iterative. So, I understood a little bit of what you wanted. I may have been completely wrong in what I understood. I may have been partially correct and you, the customer will come back to me and say, no, this is wrong or this is okay and then I will refine my year diagram correspondingly. So, customer from Warangal, over to you. Part of the solution, this is one part where the office part is taken care of. But of course, there are certain other aspects like they were dealing with some local people also. So, those people's information were also been taken care of, is to be taken care of. So, anyway this part is okay sir and I have a question. Can I ask it right now related to the topic that you discussed? Okay, let me answer part of your question. The first part, good that it is partially correct, but it, the first iteration usually will not cover all the information required. In this particular case, so I understand what you said as there are some local people who are not employees. So, maybe we want persons who are employees and then persons who are not employees. And we are going to see later in the post break session, how to model these by using specialization. I am sure most of you already know this concept. But we could have used that concept to refine a person into employee and maybe a contractor or some such. So, with that back to you. So, you can ask your follow on question. Back to Varangar. Actually, in some books I found that there is a mention of entity types. Could we know what is the difference between an entity type and an entity set? So, some people, especially in the object oriented world, they have a notion of a type which is just defines what are the attributes. Now, that does not mean there is a unique set of that type. So, maybe I will have a person type which says that, you know, the person type should have name, address, blah, blah. And then I will create entity sets, employee, consultant, contractor, whatever from which all have the same type, person type. So, this is one way of modeling, which is natural for people who have been using programming language, where there is a notion of a type. But the type is not associated with a set. For the database world, people assume that there are sets with types and there is no need for a separate type. If you go to the object relational extension of SQL, they actually introduce both these notions. They end up introducing a notion of a type and then a notion of an extent or a set of things of a particular type. So, I hope that answered your question. I am not sure what the textbooks you have looked at mentioned, but I think this is what they probably mean by type. Back to you. Yeah, thank you, sir. Actually, I wanted to also know about the total participation and the partial participation. What does actually they mean? So, this will be the last question. So, the question was what is the difference between total and partial participation? I think I explained this in one of the, in a couple of examples. One was section and course, where a section must have an associated course. So, I linked a section and a course by a relationship sec course. So, to say that a section must have a related course, I am saying that section must participate in the relationship sec course. In other words, the participation of section in the relationship sec course is total. Partial means it's optional. Optional means a section may not have an associated course, but here I want it to be mandatory, therefore it is total. That is the difference between partial and total. I said that if you have questions, you can submit it via chat. And we have several questions here. So, the first question is how do you create or alter table with single valued attribute to multi-valued attribute? This is a question not so much about the ER model, but about the relational representation. The standard first normal form does not even allow multi-valued attributes. So, in the second half of today's lecture, we are going to see how to convert ER diagrams to tables. And there we will see how to model multi-valued attributes as tables. Another question from DAICT, Gandhi Nagar. Similar question like person, father, mother. The question is employee, manager and supervisor. I am not sure what is the difference between manager and supervisor. In some places, they would mean the same thing. But perhaps you mean supervisor is immediate and then the manager who is next level up perhaps. Should it be ternary or should it be binary relationships? In an organization where there is a strict hierarchy, for a person you only need to know who is the immediate boss, which maybe you would call supervisor or manager depending on how you call it. And then if you see who is the boss of your boss, maybe that's what you call manager. I'm not sure if that's what you mean, but I'm going to interpret it like that. So then there is only need for a single binary relationship. It says, who is the boss of who? There is no need for a ternary relationship. But if you have other situations where your supervisor may report to multiple people, your supervisor has two bosses. But you are under one of the two bosses of your supervisor. So maybe your supervisor is in charge of two separate sections. And for each of those sections your supervisor reports to two different bosses, then you may need to have a ternary relationship. But then again, maybe not. Maybe a better way is to say that you belong to this division. And this person is the supervisor of that division. And therefore it is very clear this person may be the supervisor of two divisions, that's okay. And maybe this other person is the next level person responsible for this division. So we also record who is the immediate supervisor for the division, who is the, maybe we can have a hierarchy of divisions. And for each level in the hierarchy, who is the manager for that level of the hierarchy? Most organizations do this. They have division, subdivision, sub-sub-division. And at each, this is a tree. At each node they would have a manager, who is also the manager for whoever is below in that tree. So this is often called an organizational hierarchy. And it's a very common situation. There's no need for a ternary relationship in any of this. The next question is how to insert null value for date data type. Regardless of the data time in SQL, if you give the value null without quotes, you can insert it into any data type. So date is not special there. The next question is showing cardinality limits more correct than showing arrows. Certainly with cardinality constraints, you can express things beyond what you can express with arrows. So supposing the limit is 5, you can have between 0 and 5 advicees. You cannot depict that using arrows. But you can depict it by saying 0 dot dot 5. So it is more powerful. But the most common situation is many to one, one to many. So in most situations, the arrow is good enough and covers what we need. It's easy enough to understand. But if you have something more complex, use cardinality constraints. The next question is a very good question. This is related to quiz one today. Instead of having two binary relationships, father and mother, can we model that with just one binary relationship called parent with an attribute to indicate whether the parent is father or mother? That's actually a very good question. In fact, that is one of the alternatives I thought of putting in the quiz. But I decided not to confuse people with it. But it's actually a very good point. In the normal situation of things, if we just associate gender with each person, whether male or female, and we just have a parent relationship, then it's very obvious that by looking at the gender of the parent, you can see whether the parent is a mother or father. That's obvious. Of course, this gets very complicated when we have people changing genders. Luckily, that's very rare. So probably we don't care. But if you were in a situation where that matters, then maybe you need a slightly different approach. One approach is to have two relationships. Another approach is to have an attribute with a relationship. So it's a parent relationship with an attribute which indicates whether that person is the father or the mother. Of course, this is slightly silly in this example. But there are many cases where you need a tag like this. All you need is a tag to indicate some extra information about that particular relationship. Next question says, what is wrong with person relation having attributes as father-mother since a person will have, as you specified, only one role to play? So this question is, again, back to the relation, not the relationship, but the table we create out of it. If we create a table, indeed you can have an attribute called father and an attribute called mother. But when you're doing the year diagram, make relationships explicit. Do not put them as an attribute. Yes, the father attribute implicitly says there's a relationship between this person and the father who is referenced. But don't do that. The goal of the year diagram is to have one way of representing relationships, which is by having an explicit diamond with a relationship in place. Use that. Even if you know that eventually it will become an attribute, don't do that. Here it's very clear a person may have only one father. Although in the US this gets very confusing, where there are many people of the same gender who are married, and then they have kids also. So the kids have two fathers or two mothers. This gets very confusing. So maybe the assumption that there is only one father or one mother breaks down in such situations. So it is better to have explicitly represented a relationship, and then said that yes, we have decided that we are not going to care about the case where you want two fathers or two mothers. We are not going to model that. But make it explicit. Don't do it implicitly. So that ends the questions. With that I will move on to the second half of this talk. So the very first topic here is weak entity sets. There's a lot of confusion amongst people who have seen this topic as to what is a weak entity set, what is a strong entity set. The first definition which we see anywhere is that what we have in the first bullet. An entity set that does not have a primary key is referred to as a weak entity set. But any intelligent person may come up and say, but wait. You're the one who created an entity set and its attributes. What prevented you from putting a primary key? Why did you not put it? I mean this seems silly. If you simply forgot to put the primary key for something, does that automatically make it a weak entity set? The answer is no. It's actually there is a deeper significance to what is going on here. And this will become clear when we see the examples coming up. So for the moment, just stick to this definition. Now, whenever we have a weak entity set, we normally do this if there is only if there is a corresponding identifying entity set. So in some sense, this weak entity set is a component of this identifying entity set. So to understand this situation, let's take a bill which you get in a hotel. So it says two masal dosas at this rate, this cost, one vada at this rate, here's the cost, and then there's a total. This is a bill. Now, how do you represent this in a relational database? If you allowed multi-valued structured attributes as in the ER model, you could do it. But if you didn't, you can't do that. And even if you just stick to the ER model, it might make sense to model a bill not as one single thing. There is an entity corresponding to the bill. But each line of that bill can be thought of as a relationship or an entity itself which relates to what is the item. This line relates to the entity masal dosas. This line relates to entity vada. And this entity also is related to this bill. So the line of a bill can be thought of as an entity. But this entity has no meaning without having a bill itself. How can you have a line in a bill if the bill doesn't exist? So what this means is that this is a weak entity whose existence depends on the parent entity. If the parent entity goes, this should also go. This is exactly the kind of situation which weak entities would model. Now once we have this in mind, if we have a primary key for the parent entity, the identifying entity, what we need in the weak entity is simply something like a line number. This line is the first line in the bill. This is the second line, third line. The bill number comes from the parent identifying entity. That's basically why this individual entities don't have a primary key. We'll see easier to understand example coming up. We are going to, in the ER diagram notation, the identifying entity is linked to the weak entity. And we need to know which one it is. The weak entity may be linked to many different entities. But only one or a few of those may constitute an identifying entity. So we are going to use as a standard a double diamond to show that this relationship links a weak entity to its identifying entity. So that's how we are going to do it. And one other concept here is a discriminator is the set of attributes that distinguishes amongst all the entities of a weak entity set. So what does this mean? I said line number for a bill. So if you just keep line number 1, 2, 3, 4, these line numbers occur in many different bills. But within a single bill, the line number occurs only once. So it's not a primary key across all line numbers entities. But within a single bill, it is unique. So it's a discriminator. So now let's see all of these concepts using a course section. And I'll explain why we modeled it as a weak entity set. So first, let's see what we are modeling. We have courses. CS317 is the database course in IIT Bombay. Now every year, either I or somebody else here teaches this course. So CS317 is taught every autumn of every year. And a student is registered for one of these years in autumn. From this year, we started having two sections of the course, not just one. So we need to identify whether the student took the section which I taught or whether the student took the section which Professor Umesh will have taught. So we need a section identifier. Therefore, a section is identified in the relational table, if you recall. A section is identified by course ID, section ID year and semester. So let's say we start off by saying a section is an entity, course is an entity. And now if you see, section has course ID in there, course has course ID. And clearly, the course ID in section should have referred to the course ID in course. And therefore, it is a relationship. This attribute indicates that this section is related to this course. So we do not want such relationships to be implicit. So what we are going to do is remove the attribute course ID from the section entity and add a relationship to course, which is sec course, which is what you see in this diagram here. Section is now linked to section core by this relationship. But wait, there is a problem. Course ID was part of the primary key for section which helped us uniquely identify it. If you remove it, what is left is no longer a primary key. Course ID is required. So what we are going to do is turn this relationship into an identifying relationship, which helps us identify sections with one of the courses. And the primary key for section is going to consist of the primary key for course plus whatever parts of the unique identifier are left behind here, which help us distinguish different sections of the same course. Here they have section ID, semester and year. Notice that I have underlined them as before, but this time with a dashed line, not a solid line. So the section ID, semester and year are the discriminator attributes for this weak entity set section. The eventual primary key for this weak entity set will consist of the primary key of the identifying entity set course, which is course ID, coupled with section ID, semester and year. This is the relational schema which we use in that these were the primary key attributes. So I hope now it is clear why we ended up with a weak entity set. So to recall, very often you start off by deciding that something is an entity set. Then you realize that to identify it uniquely, you need attribute, which is really a reference to another entity set. And that's when you realize that this is actually dependent. If a course doesn't exist, a section cannot exist for that course. So we will say that this section is existence dependent on the course. To make something a weak entity, this existence dependency is required. If a section could exist without a course, then we can't actually represent it as a weak entity. But we know it has to have a course. So that's why we make it a weak entity. So we end up starting with something which we thought was a strong entity and then turn it into a weak entity. Now what if I put an identifying relationship and also put course ID in here? If I have both a relationship and course ID as an attribute, then certainly section would be a strong entity. That is the relationship linking section with course. But now there is duplication. The course ID in section conveys the same information as this relationship section course conveys. We are duplicating information. Therefore, we should not keep the course ID as an explicit attribute. Let it be only represented by the relationship. So I hope that answered a question which somebody had asked regarding what a weak entity is. So why do we need this? This slide is basically summarizing what I just told you. The primary key of the strong entity set is implicitly there in the weak entity set. But it's not stored explicitly. It's kind of inferred through the identifying relationship. And the reason we don't store it explicitly is that although it would make it a strong entity, it would be redundant with duplication. So now we have an ER diagram for a university enterprise on this slide. Unfortunately, the font here is probably too small for you to read. I apologize for that. I intended to break this into several parts, but forgot to do it. If you have a copy of the book, you can refer to the diagram in the book. Otherwise, I hope you will at least be able to read some parts of the diagram on the screen. So this again is a toy university ER diagram. A real university ER diagram will be much more complicated. Moreover, it's not going to fit in one page. So after I describe this, I will tell you how to take a complex ER diagram and split it into pages. I'll come to that in a moment. But for the moment, let us see what are the components of this ER diagram. So let's start with departments. We have department name, building, and budget. So this was the same information as we had in the department relation. Now we have student. We already saw this entity. And as we saw, department name of student was removed. And instead, we have a relationship student department linking student to department. If you can see this figure clearly, you will see that there is a double line from student to student department. What does that mean? The participation is total. Every student must belong to a different. The double line allows them to belong to two, three departments. But at least one. But moreover, there is an arrow on the opposite end here, which means that a student can only belong to one department. So this says at least one. This says at most one, which means exactly one. A student belongs to exactly one department. Similarly, if you come to the left-hand side, an instructor in our ER diagram must belong to exactly one department. Then we have an advisor relationship here linking student and instructor, not the arrow, which forces a student to have at most one advisor. Student cannot have two advisors. Then here is a course entity, which has a course ID, title, and credits. This is actually the same attributes as the table we saw before, except that the department attribute is missing. That attribute has become a relationship course department out here. Just like we removed department from instructor, we have removed it from course also and turned it into a relationship. This is also a total participation. And there's an arrow, which indicates a course must belong to exactly one department. Now moving down, you will notice that pre-rec relates course to course. We have already seen this earlier. And moving further here, we have a section entity, which as you note does not have a primary key. It only has dashed underline. So it has a discriminator, which means it must have an identifying entity. And the double diamond here indicates what is the identifying entity that is course. Note a couple of more things here. The line between a weak entity and the identifying relationship which identifies it has to be a double line. Why? If a particular section is not associated with any course, that doesn't make sense. You require a course to have a section. It exists in the dependent, which is why we have a double line here. Furthermore, we have an arrow here. If you have a weak entity, it cannot be related to two different courses. That doesn't make sense. It exists in the dependent on one course. So you must have an arrow. Whenever you have an identifying relationship, one side will be a double line, the other will have an arrow. Now note that relationships which section participates in directly go to section. Even though section is a weak entity, that doesn't prevent it from participating in various relationships. So down here, we decided that classroom is an entity which has a building room number and capacity. Building and room number is a primary key. Capacity is an attribute. And each section must meet in the classroom. We are assuming that section cannot meet in two different classrooms. A particular section, meaning in a particular semester, has just one classroom assigned to it. And therefore, what we have done is section to classroom is relationship, section class is many to one. So there's an arrow here to classroom, which means a section can have at most one classroom. Furthermore, there is a double line here, meaning the section must have an associated classroom. You cannot have a section which doesn't have a classroom. This is a decision we made, but may not be realistic. There are sometimes self-study courses which run as a section, but don't have an associated classroom, in which case we would have turned this into a single line, not a double line. And finally, we have a time slot in IIT Bombay and in many places there is a notion of a time slot, time slot one, time slot two, time slot three. And each time slot has a set of associated meeting hours. So in IIT Bombay, time slot one is 8.30 to 9.30 Monday, 9.30 to 10.30 Tuesday and so on. So how do we represent this information? We decided time slot is an entity, which has a time slot ID, 1, 2, 3, 4. And it has a multi-valued attribute, which itself has three components, day, start time and end time. So day of the week, Monday, Tuesday, so forth. Start time and end time. And it's multi-valued, allowing the same time slot one to have hours on multiple days. So that is the time slot entity. Each section is similarly linked to a time slot entity. And that is also a total participation. So each section must have an associated time slot. Again, if you have a self-study course, there is no time slot, in which case this would be a partial participation, not total. And a section is not allowed to have two different time slots. Therefore, there's an arrow to time slot. So that is it for our university ER diagram. The next question is, what if the diagram is complicated? It is complicated. We managed to squeeze it into one page, which is readable, at least in the book. But maybe not so readable on the screen. So how do I take a diagram like this, which may be very big and split it into multiple pages? So let me show it by drawing. Come back to the whiteboard. The first principle when I break up an ER diagram into multiple pages is that there should not be duplication. What do I mean by duplication? Now, a particular entity may participate in many relationships. This may get split over multiple pages. So the entity will occur in multiple pages. That by itself is not duplication. Duplication will happen if I list the attributes of the entity in multiple places. So one way to do it is to take each entity and show it once, separately. Without showing any relationships, show the entity student ID, name, and total creds. So that's a student entity. Similarly, I will draw the other entities on one page. I may even, if a space permits, I may even show some relationships on the same page. So maybe student department is a relationship which I will show on the same page. But now I have run out of space on this page. What do I do? So maybe I need to know who is the advisor of a student and I didn't have space to put that in here. So then what I will do is I will take away this page. So to show that, let me wipe out this whole thing. Now I'm on a new page. Here what I'm going to do is student will now just be a box with no attributes. I'm not going to show the attributes. An entity, attributes are not depicted here. So similarly, maybe instructor, its attributes were shown already somewhere else. Now I'm just going to show a box with no attributes. And then here in this new page, I'm going to have advisor. So now there is no duplication of information. And I have split the complicated ER diagram into multiple pages. So if I had remembered, this is exactly what I would have done for this complex ER diagram to split it into multiple pages. The next topic in the sequence is how to take an ER diagram and convert it to relations. After all, we are not going to actually execute queries on the ER model. We have to convert it to relations. It's almost 100% true, but not quite 100% true, because recently Microsoft introduced data model, which they called entity data model, which is a lot closer to the ER model than to the relational model. But it's not quite exactly the ER model. So if you are familiar with that, you will see why it's similar to the ER model. And it is possible in certain ways to directly write a schema which corresponds to ER schema, not bother about reducing it to tables. And then the entity framework of which Microsoft has built will actually store the data in whatever form. If it's storing it on SQL server, it will actually convert it internally to a set of tables. Exactly what we are doing explicitly here manually, it may do automatically inside of the system. So you don't have to concern yourself with those details. You can work at the entity and relationship level. You can write queries at this level using a query language which they have defined. However, this is not widely used at this point, and it has its limitations. So we are not going to cover that here. We will stick to the relational model and convert all our schemas in all our ER models into relational schemas. How do you do that? Again, those of you who have taught the course would know all this, but let me cover it in quickly in any case. Strong entity converts directly to a table with the same attributes, barring composite and multi-valued attributes. We have to do a little bit more to deal with composite and multi-valued attributes. So simple single-valued attributes can just become directly attributes. How about a weak entity? By the way, the primary key of the strong entity will become a primary key of the corresponding relation which we create. We are also going to create foreign keys in a way which we will see in a little bit. Now a weak entity will also become a table. This time, we are actually going to copy the primary key attributes from the identifying relationship into this table. Slight digression. So far I have assumed that each weak entity is identified by a single strong entity. There are situations which are a little more complex. So there may be one weak entity which is identified by a strong entity and another weak entity which is identified by this weak entity. Think of it as a hierarchy. There's a strong entity set A, there's a weak entity set B which is existence dependent on A and then at a further level down, there's a set C which is existence dependent on B. This situation can arise. And in the year diagram, you're going to have a cascade. You'll see that this is connected by identifying a relationship to something which is itself a weak entity. It doesn't have a primary key, it only has discriminators. That in turn connects up to something else. Now, what do we do when we convert such a thing to tables? We have to first copy the primary key from the strong entity to the first of the weak entity sets. And then it'll get a primary key which is the inherited primary key plus its discriminators. And then from there, we copy the whole thing into the next weak entity set. There are even situations where a particular entity set, weak entity set, may existence depend on two different strong entity sets. And then it'll have to inherit the primary keys of both those entity sets down here. They are, these situations are rare, but they do occur. Coming back to our section weak entity set, we are going to create a table corresponding to this with course ID, section ID, semester here. Now, you will notice that both these tables are missing some attributes which were there in our original relational schema. What happened to them? Well, they are still there in relationships which we have to deal with. So, we are going to convert every relationship to a table. Although in some cases, we are going to merge those tables into entity tables as we will see. So, first of all, a many to many relationship will have to have a table where the primary key attributes are going to be copied from the corresponding entity set. So, for advisor relationship, we are going to copy the primary key of instructor and the primary key of student and then create a table containing these two primary keys. That represents the relationship advisor. So, you will notice here that the primary key are called ID in both cases. You cannot have two attributes called ID. So, when you do this mapping, you have to rename the attributes. So, we chose to rename them to I underscore ID to indicate instructor ID, S underscore ID to indicate student ID. So, I am stopping here for the break.