 as apparently too big for a view to load, since it contains a lot of pictures. So, I had to split the file into two parts. So, we are all set now. As I said, the file contains a lot of pictures, and that is the whole point of today's lecture, which is on ER modeling with a lot of focus on ER diagrams, because the diagrammatic representation is a key part of ER modeling. So, what we are going to do today is cover the ER notation which we use in the sixth edition of the Database System Concepts book. Many of you may have used earlier editions of this book, and you should be aware that this notation has changed in the sixth edition. Now, why have we changed it, even though the old notation has been around for many years? The primary reason is that there are several standards for ER modeling, and then there is this whole modeling language called UML, which has gained a lot of traction over the past 10 years and is very widely used in industry. So, there is an old-style ER notation, and then there is the UML class diagram notation, and then there are variants floating around. So, for a while we have been thinking that it does not make sense to have so many different notations, and since UML has really grown a lot over the past years and is very widely used in industry, we should move our notation closer to the UML notation. In the last couple of editions we had presented our old notation and also the UML notation, and given a comparison of the two. In this edition we decided to flip it and change our presentation to something based on UML, but it is not exactly UML. Well, first of all, UML has many components, one of which is UML class diagrams, and that is the part which is relevant to database modeling. There are many other parts to UML, but even UML class diagrams were designed primarily for modeling objects, and relationships and other things were afterthoughts in UML.
Whereas, in ER modeling we are not generally interested in modeling arbitrary objects, but we pay a lot of attention to the relationships between objects. So, the notation we use is very much UML-ish, but there are a few notable differences which I will point out somewhere along the way. In appearance it looks much like the UML class diagram notation. So, that is a short overview of what is different for those of you who already know the old notation. So, now let us move on to the actual talk itself on ER modeling. ER modeling is more than just the diagrammatic notation; it is also a way of understanding or modeling what is out in the real world. The reason ER modeling has caught on may be primarily because the diagrammatic notation is useful, but it has also caught on to a very large extent because it makes a lot of sense to model things in the real world as either entities or relationships. This idea has appeared many times over after the ER diagram was introduced by Chen way back. This idea has been used extensively in database design, but in recent years this idea has also shown up in other areas. People who look at data on the web, and try to model the data on the web in a semantic sense as opposed to just textual documents, are also moving to a model of the world where you have entities and relationships between entities. So, it is a very natural way of modeling things. So, here is the overview, which I am going to skip because overviews are usually not terribly useful until you have learnt the material itself. So, let us start with how entities and relationships serve as a nice way of modeling the real world. So, if you took somebody who did not know about the ER model, who only knew about relations, and said go model the world, they would try to model everything as a table, and they would probably intuitively veer towards something which ER modeling would have landed up with also, but not necessarily.
The chance of their going off in some other direction is significant, because you are working at a low level. It is like asking people to program in assembly language. It is not quite that extreme, but it has that flavor. So, what you want to do instead is first have a higher level understanding of what it is that you are trying to model. And the way the ER model does this is to divide the world into entities and relationships. Entities could correspond to actual objects which are there in the world. As you may be aware, people, places, institutions which have a physical presence can all be modeled as entities. There are other things where it is a little harder to decide whether they are entities or not, but if you think a little bit more, they are entities. For example, if you take a movie, is it a single copy of the movie that we are referring to? That is probably not what we mean. When I say did you see the movie, I am not saying did you see reel number 342 copy of the movie; really the content of the movie is what matters. So, there is no unique physical thing which corresponds to the movie, but there is something conceptual: there is information, the pictures you see in it. So, the movie is still an entity. Now, take this course which we are covering. Does it have any physical manifestation? Sure, we all get together and stuff is going out over the internet and there are web pages and so forth, and all of these may be parts of the course, but if you take the course itself, it is not like a human who is identifiable somewhere; it is an abstract concept. However, even though it is an abstract concept, it is still an entity in the ER model of the world, because it has some unique properties, we can identify it, and we can participate in relationships with it. So, what are these relationships? We will see in a moment.
Now, every entity must be described somehow. It has to have some set of attributes, including perhaps some attributes which identify the entity. Another piece of terminology: an entity in the basic ER model is a single, unique thing. This course is an entity. Another course is an entity, a third course is yet another entity; these are all different entities in the ER world view. However, it should be clear that there are a lot of similarities between courses, and if you are going to store information in a database, you are probably going to use exactly the same attributes for all these courses. Therefore, there is a notion of an entity set, which is a set of entities of the same type. An entity set conceptually is kind of like a relation, because a relation is a set of tuples of the same type, meaning they have the same set of attributes. And similarly, an entity set is a set of entities all of which have the same set of attributes. So, here is a small picture of a number of entities and the corresponding entity sets. So, we have an instructor entity set and we have a student entity set, each of which has several entities within it. And in this diagram we have just modeled a couple of attributes of instructor and student, which are ID and name, although as we have seen before, they would have more attributes even in a toy database, and in a real database they would have many, many more attributes. Now, you will also notice that down below I have used the notation instructor and student. I should have said maybe instructor entity set and student entity set, but we are not going to bother adding the word set wherever it is obvious from the context that we are talking of a set, not an individual entity. So, when it is clear we are talking about instructors in general, there is no need to say the instructor set; we will just say instructor.
And then when we are talking of a particular instructor, we will make it clear that we are talking of a particular entity, not an entity set. Now, how do you identify entities? You need to distinguish them. Therefore, just like in the relational model, you have to have a notion of keys, and just like in the relational model, you have a notion of super keys, which are sets of attributes which uniquely identify an entity, and a primary key, which is a minimal super key which is explicitly chosen as the unique identifier of the entity. As usual, there may be multiple candidate keys which are minimal, and you have to choose one, and as usual, just like in the earlier relational discussion, you would probably choose one which will not change. So, you do not want to choose, for example, name as a primary key: (a) because names may change, (b) because they are not unique. But even if names were unique, for example, email addresses are unique, you may still have a person changing their email address. So, you would choose an attribute which is unlikely to change. There are again several schools of thought on how to create a primary key for an entity. One school of thought, which is what we follow, is that a primary key should be something that is identifiable in the external world. So, not only do we need to identify instructors and students uniquely inside of the database, but when they go interact with various parts of a university, people need to identify them uniquely. In IIT Bombay, in one of our offices which dealt with certain project accounts, a while ago they were rather lax with the system. They did not use identifiers. We had employee identifiers, but they did not use them. So, they would try to manage with department name and initials, and guess what, bills which were meant to be paid by me might have been paid by Professor Sunita Sarawagi or vice versa, because we happened to have the same initials. So, clearly that is a bad idea.
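To make the super key and candidate key definitions concrete, here is a small Python sketch. It is only an illustration under stated assumptions: the entity set is a hypothetical list of dictionaries, and apart from Einstein's ID 22222, which appears in the lecture, all values are made up. Note that a key is really a constraint on every legal instance of the entity set, so a check against one sample instance can only rule keys out, never confirm them.

```python
from itertools import combinations

# Hypothetical instance of the instructor entity set (illustrative values only).
instructors = [
    {"ID": "22222", "name": "Einstein", "salary": 95000},
    {"ID": "45565", "name": "Katz", "salary": 80000},
    {"ID": "98345", "name": "Kim", "salary": 80000},
    {"ID": "10101", "name": "Kim", "salary": 65000},  # assumed second "Kim"
]

def is_superkey(entity_set, attrs):
    """True if no two entities in this instance agree on all of attrs."""
    seen = set()
    for entity in entity_set:
        value = tuple(entity[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(entity_set):
    """Minimal attribute sets that are super keys for this instance."""
    attrs = sorted(entity_set[0])
    found = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            # Skip any combination that contains an already-found smaller key.
            if any(set(k) <= set(combo) for k in found):
                continue
            if is_superkey(entity_set, combo):
                found.append(combo)
    return found

print(is_superkey(instructors, ["ID"]))    # ID distinguishes every entity
print(is_superkey(instructors, ["name"]))  # two entities share the name "Kim"
print(candidate_keys(instructors))
```

In this sample, the pair (name, salary) also happens to be unique, so the function reports it alongside ID. That is an accident of the data, which is why the designer, not the data, decides what the primary key is, and why a stable, externally recognizable identifier such as ID is preferred.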
So, that is as far as primary keys go; nothing new here so far. So, now let us move to the concept of a relationship and a relationship set. A relationship is some kind of an association amongst entities. We will initially look at binary ones, between two entities, then we will see how it generalizes to multiple entities. So, here we have a relationship which is referred to as advisor, between a student and an instructor entity. So, a particular student identified by 44553, whose name happens to be Peltier, has an advisor whose ID is 22222, whose name happens to be Einstein. That is a relationship. This is a fact in the real world: the university perhaps wants to assign one faculty member as an advisor for each student, so that they know who to go to. So, the relationship between two entities, the student entity and the instructor entity in this case, is an advisor relationship. There may be other relationships too between the same entities. We will see some examples of this later. So, a relationship set, just like an entity set, is a set of relationships of the same type. In this case, the advisor relationship between these two particular entities can also occur amongst other entities; other students also have advisors. So, you have a whole set of these relationships between different students and different advisors, and the set of all these relationships is referred to as a relationship set. In this case, it is the advisor relationship set. So, a relationship set, just like an entity set, is also much like a relation, a set of things. So, we will look at a particular relationship in the set, and that is identified by which entities it relates. So, this particular one we just saw relates 44553 with 22222. So, that is an element of the advisor set; that is what is shown at the bottom here, which says this tuple is an element of advisor. Now, note that in general an element of a relationship set is an n-tuple, where the first component of that tuple would belong to the first of the related entity sets.
The second one is a member of the second entity set, and so on. So, when we have two, we just had the first one being a member of the student entity set and the second of the instructor entity set; you can have more entity sets in general. So, when we have a binary relationship, we can depict it pictorially, and here I am talking of individual relationships between individual students and individual faculty, which can be represented by drawing lines between the entities in the diagram. We have a particular row corresponding to a particular student and a particular instructor, and we draw a line between them. This is not an ER diagram, incidentally. This is still just a conceptual tool which we are using to understand how things are related. So, here there are multiple lines between students and instructors. You will notice here that one of these instructors, Katz, has two students whom they are advising, and in this particular case each student has only one advisor. This could be different, of course. Certain universities may have multiple advisors for a student. So, that could change. We are going to switch over to ER diagrams. Later we will switch back to some ER modeling concepts, and then switch back to diagrams, and so forth. We will flip back and forth in this talk. Now, in the book we follow a slightly different order. We first introduce the concepts and then we introduce the diagrammatic notation, but here we are going to mix the two up. So, here is a simple ER diagram corresponding to exactly the things we saw so far. So, we have an instructor entity, and earlier we just had two attributes, ID and name. Now, we are reintroducing the third attribute which we had earlier, which is salary. Observe, however, that we have not introduced department name as an attribute yet. In our relational schema, which we have been using for the past few days, you would have noted that instructor has a department name. Why is that not present here?
We are going to see in a little bit when something should be an attribute and when something should not be an attribute, but instead should be a relationship. In this particular case, department name really identifies a particular department that the instructor is associated with. So, it is really a relationship. It identifies a relationship between an instructor and a department. Therefore, we do not put it down as an attribute. Instead, we will introduce a relationship in a little bit to represent that information. So, this is the first major difference you will note between an ER diagram and a schema diagram. We saw a schema diagram earlier. You have been using it for your labs, presumably. In that schema diagram every attribute was shown. When we do ER modeling, on the other hand, we will not show attributes which correspond to relationships. In fact, as we will see, in certain cases those relationships will actually get mapped to attributes when we create relations. In certain other cases the relationship will not be mapped to an attribute; it will be a separate table which we create from that relationship. We are going to see that in a little bit. So, the important thing to notice is this: if an attribute which you thought of really was something to identify another entity which this one is related with, in other words, if it were a foreign key, it would almost surely be a relationship, not an attribute. So, that is what we have here. This does not have department name. Similarly, student has ID, name and total credits, and department name has vanished from student. And finally, the relationship between instructor and student is shown by a diamond over here, with the name of the relationship, advisor, in here. So, now, ER diagrams really represent entity sets and relationship sets. They do not represent individual entities or individual relationships, but really the sets. But to keep saying set, set every time will get very boring.
So, we are going to say the instructor entity and student entity and avoid the word set, since it is clear from the context. Now, you will notice that in this notation, at the top of the box we are showing the name of the entity set, or entity, in this case instructor and student. And inside the box we are listing the attributes, and you will also notice that we have underlined certain attributes, in this case the ID attribute. So, the primary key attributes which we have chosen are going to be underlined. So, that is part of the ER notation. And as I told you at the beginning, this notation is different from what we had before. Those of you who have used earlier editions of this book, or for that matter Raghu Ramakrishnan or Elmasri and Navathe or most of the other textbooks, would have noted that over there instructor would have been a box, and ID, name and salary would have been ovals sticking out of the box. The first thing you would notice is that this is much more compact. It fits in a lot less space while conveying the same information. So, that is one more reason that industry chose to go with this notation for the most part, rather than the boxes and ovals notation. Now, back to the conceptual level where we are showing actual entities. Again, this is not an ER diagram. This is a conceptual diagram. Supposing we wish to track when the advisor first became associated with the student. Now, this may seem a little artificial, but let us just assume it for the moment. You could track other things if you wish. Maybe you want to track the last time the student met the advisor, if at all you had some way of noting this. It would be nice to track it. It would be nice to realize that a certain student has not met their advisor at all for the last 8 months, and maybe they should be meeting them. But in the real world there are constraints. So, maybe that would not get modeled, because who is going to enter it?
So, for the purpose of this discussion, we will just assume there is a date attribute which is an attribute of the relationship. It records, in this case, when the student was first associated with the advisor. Can this attribute be an attribute of student? In this case, actually it could. If a student has at most one advisor, then you could actually record in the student entity one extra attribute which says when the student last met their advisor. But if a student can have two advisors, he may have met advisor one on a certain day and then met advisor two on a second day. And if you just have one attribute with student, you cannot represent this. Therefore, if a student can have multiple advisors, you must represent the date as part of the relationship between student and advisor, which is what we have done here. So, here are a bunch of dates for each of them. Now, coming back to our ER diagram notation, we would show this information in the ER diagram by adding attributes to the relationship. In the old notation, we would just draw a bunch of ovals off the relationship to show the attributes. In the new notation, what we do is we have a single box showing all the attributes of that relationship. And we connect that box to the relationship by a dashed line. Why a dashed line? Why not a solid line? Well, if it is a solid line, you would probably confuse it with being an entity rather than a set of attributes, although it should be clear still. But to make it extra clear, we use the dashed line to connect it to the relationship. Now, coming to the attributes which an entity may have: in the relational model, we emphasize first normal form. And we say that attributes must be simple attributes. Just a single value, not a set of values. It cannot have components, etc., etc. But that is not actually natural when you are modeling a real set of entities. Entities do have attributes, and those attributes may have a structure to them.
So, for example, supposing we have a set of phone numbers for a particular entity. In the relational model, if you know how this is done, we would end up normalizing by creating a separate relation for this phone number to instructor mapping. We don't have to do that in ER modeling. We can very well represent a multi-valued attribute, which is a set of values. Later, when we convert it to relations, we may still normalize it, and we will see how that is done later. But when we are dealing with ER modeling, we do not have to break up something from a set to a singleton right away. We can have an attribute which is multi-valued. Other attributes, which are not multi-valued, are single-valued attributes. Then, we have simple and composite attributes, which we will see in the next slide. We also have certain attributes which are derived attributes. That is, they are not stored, but they are computed on the fly from other attributes. These are much like methods of an object, which are computed on the fly. So, here is an example of a composite attribute. We have a name, which may be broken up into first name, middle initial and last name. Now, most of our institutions in India don't do this. For example, when people get their 10th or 12th standard certificates, their names are just shown flat. And they are typically entered by students in some way. And people enter them in different ways. So, for example, somebody may enter the name as Sanjay Jain. And another person with the same name may enter it as Jain Sanjay, with the last name first. And people don't care. But then, when I get this list, which comes in through our JEE or GATE, and I take attendance, it gets very confusing. Some of the names have first name first, some have last name first, and so on. So, when I call attendance, I keep flipping, which is really weird.
So, what we would like is to break up the name into its components, which are first name or given name, last name or surname, and then other names. And if we don't want to keep the full list, we may just abbreviate it to a single initial or a few initials. Now, depending on the part of India you are from, you may have just one or two names. You may have three names, which is common in Maharashtra: the given name, the middle name, which is often the father's name, and the surname. Some people are more egalitarian. They have their given name, father's name, mother's name and then surname. And then, we have people from Andhra, whose names include many parts. So, in India, it is a little more difficult to decide what the break-up is. But it certainly makes sense to distinguish a given name from the surname, that is fairly standard, so we want to represent the name as having at least these parts. So, let's stick to what we have here, which is that a name has a first name, a middle initial or initials, and then a last name. So, that is a composite attribute. There are certain times when I want to just say print the name of this person, and implicitly I want to print all the parts. There are other times when I may want to say, I want to get the surname of this person, for whatever reason. Maybe I want to print the surname first, then the first name, for some reason. As another example, we have an address, which we all know is often stored as a single attribute. But sometimes it is stored as address line 1, address line 2, address line 3, which is not terribly structured in any way. But more often, people actually break up the address into address line 1, line 2, then maybe the area, the city, state, pin code and then country. So, the address actually has a structure. So, in this case, we have decided that address has a street part, a city, a state and a postal code. And the street part itself has a street number, a street name and then perhaps an apartment number.
So, that is the structure of composite attribute. Now, here is how we will show these types of attributes, composite, multi-valued and so on in the ER diagram. So, we are going to continue to have a box as before. We are going to have simple, single-valued, simple attributes as before just shown with a single name. But now, if you have a name which has components, we are going to show the name and then show the components below it indented. So, we are purely using indentation to indicate that these three are components of name. Similarly, we have address, which has components street, city, state and zip. And street itself has components street number, street name and apartment number. So, we have, what we have done is we have taken the same information from the previous slide and mapped it into a box with indentation. Then we saw multi-valued attribute phone number, which we are going to enclose in curly brackets to indicate that it is a set of values. Then we have date of birth, which is a simple attribute. Well, actually date of birth, you know, again date could potentially be broken into month, day and year. And people will say this is not first normal form, but you know, that is iffy. So, we are going to assume here that a date is a single-valued attribute, although conceptually you could treat it as a multi-valued attribute, which has, sorry, as a composite attribute, which has year, month and date. That is the date of birth. And age is a derived attribute, which is going to be computed whenever you run a query on this, whenever you get that value. Now, our ration cards in India, at least out here in Maharashtra, have this ridiculous field, which had age. And according to the ration card, I am still 30-odd years old, which is way off. So, that is a very bad idea to store age as a regular value in an entity, because it changes with time. You should really be storing date of birth or year of birth, if you do not know the exact date. 
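The attribute kinds described above can be sketched in Python, purely as an illustration and not as the book's notation. The class and field names below follow the slide as described in the lecture, but the mapping to dataclasses is an assumption made for this example: composite attributes become nested objects, the multi-valued phone number becomes a set, and the derived attribute age is a method computed from date of birth rather than a stored field.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Name:                      # composite attribute
    first_name: str
    middle_initial: str
    last_name: str

@dataclass
class Street:                    # composite attribute nested inside Address
    street_number: str
    street_name: str
    apt_number: str

@dataclass
class Address:                   # composite attribute
    street: Street
    city: str
    state: str
    zip: str

@dataclass
class Instructor:
    ID: str                      # simple, single-valued attribute
    name: Name
    address: Address
    date_of_birth: date          # treated as a simple attribute, as in the lecture
    phone_numbers: set = field(default_factory=set)  # multi-valued attribute

    def age(self, today=None):
        """Derived attribute: computed on demand, never stored."""
        today = today or date.today()
        years = today.year - self.date_of_birth.year
        # Subtract one if this year's birthday has not happened yet.
        if (today.month, today.day) < (self.date_of_birth.month, self.date_of_birth.day):
            years -= 1
        return years

# Hypothetical sample entity; all values are made up for illustration.
sample = Instructor(
    ID="22222",
    name=Name("Albert", "", "Einstein"),
    address=Address(Street("1", "Main Street", "4A"), "Mumbai", "Maharashtra", "400076"),
    date_of_birth=date(1970, 6, 15),
    phone_numbers={"555-0100", "555-0101"},
)
print(sample.name.last_name, sample.age(today=date(2024, 6, 14)))
```

Because age is computed from date of birth each time it is asked for, it cannot go stale the way the stored age on the ration card did.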
So, that is with respect to attributes. Now, a relationship can have a degree associated with it, which indicates how many entities participate in that relationship. So, a binary relationship has exactly two participating entities. A ternary relationship will have three participating entities, then quaternary, and so on. In general, an n-ary relationship will have n entities participating in it. Now, it turns out that although this generality is useful in some cases, in almost all cases binary relationships are the ones which are the most useful. In fact, it is often a struggle, if you take a particular domain, to find out what relationship in there would be ternary and not binary. Why is this? Well, it is not a universal truth. We can certainly create a lot of very meaningful n-ary relationships. For example, if you have a student, a degree program and a department, would that be n-ary or binary? We will see a few examples in a little bit. So, your first attempt may be to say, this is an n-ary relationship: maybe a student is related to a department and a program. But you will soon realize it may actually be better represented by a set of binary relationships. We will see examples in a little bit. So, the first example we are going to see for a ternary relationship is a little concocted, I will admit, but still it could be meaningful in certain situations. So, here we have an instructor. We have a student, and then we have a project. So, what is this relationship? The relationship is project guide. So, what we are trying to record is which person is a guide for which student for which project. Now, for most people who are doing a bachelor's degree, there is just a single project, and then you can very well say student is related to project, and student is related to instructor as a project guide. They are two separate relationships and it does not matter, because there is only one project. But in other programs, there are multiple projects.
For example, in our Master of Design course here at the Industrial Design Center, students are expected to do, I think, three projects. So, in the course of their Master's, three separate projects. Each of those projects may have a separate guide. Then you want to record who was the guide for the student for which project. So, we have a notion of a project, which is uniquely identified, a student, which is an entity, an instructor, which is an entity, and then a ternary relationship, project guide, which relates a student with a project and a guide. So, if you broke this into binary relationships, student project is probably okay. You can record which student took which project. But what about student instructor? Then we would have lost the information of which particular project this instructor guided. So, that is a bad idea. So, we do need a ternary relationship in this case. On the other hand, the earlier one which I mentioned, which is student to degree program and student to department, is probably not ternary, if a student is admitted to a single degree program in one department and graduates with that. But supposing the university decided to preserve the roll number of a student even if that student rejoins. And why would this happen? Let us say that Nandan Nilekani's Aadhaar project is successful. I am sure it will be. And everybody gets a universal ID. Now, it doesn't make sense anymore for a university to go and create a new local roll number or identifier for each person. And in fact, in the US, this is done already. Most universities will use the social security number of people, which is kind of the US version of the universal ID, and use that as the roll number. So, if a student joins again after doing a BE, say they join for an ME, they still have the same roll number. It has not changed. So, now, if I say a student is in BE and, let's say, the electrical department.
And then the student joined again for an ME, but this time in the computer science department. If I just say student program, I am going to say student BE, student ME, same student. For student department, I will say student electrical, student computer science. Now, I no longer know whether that student did BE computer science and ME electrical, or BE electrical and ME computer science, or both in computer science and something else in electrical, or both in electrical and something else in computer science. It is not clear. So, in this situation, where a roll number may get used multiple times for the same student joining different programs, student program department is probably a ternary relationship. So, again, it depends on the specifics of the domain being modeled. So, now, here is a little quiz on the same topic we have just discussed. But as usual, we may want to give people a little bit of time to set it up. So, at this point, all center coordinators, please make sure your receivers are all set up. And if they are set up, please tell your participants to go ahead and press the ST button so that they are all ready for the quiz. Meanwhile, let me explain what this question is. It is a rather long question. So, I will take a couple of minutes to explain the question. So, supposing we are given a person entity set and we wish to represent the relationship between people and their father and mother. So, the question is, there are several possible representations. Which is the most appropriate representation in this case? In this case meaning, in the case where for certain people we may know only the father or only the mother. We may not know both. So, that is the fact which we have to take into account. Even without that, the answer is probably the same, but this makes the motivation for the answer even stronger. So, what are the possible answers? One answer is two binary relationships, father and mother, between persons.
So, it indicates who is the father of a person, who is the mother of a person. The second alternative is the ternary relationship between three persons, one of which is a father, one is the mother and one is the child. And how do you distinguish which is which? In the ER notation, we will see in just a moment. The third thing is, we have an entity set called parent linked by relationships to person who is a father, another relationship to a person which indicates the mother and a third relationship to person which indicates the person whose father and mother these are. And lastly, a single entity set with attributes, a person, father and mother. These three attributes would be the IDs of the corresponding person, the father and the mother. So, the question is, which of these four alternatives is the most appropriate representation in this case? So, I think we have enough time now to have set up the quizzes. So, give me a moment and we will start the quiz over here. All of you should have pressed your ST button by now and have the red light going. The time is on, your red light should be blinking and please enter your option 1 to 4 or A to D. You have entered your answers by now and time is up. So, returning back to the questions and the possible options, based on the discussion we just had about student and the program and so on. Along similar lines here, you will note that it makes sense to have two separate entity sets, two separate relationships. One is the father and one is the mother between a pair of persons. Why is this useful? Supposing we know the father, but not the mother, we can still represent the fact that this person is the father of this person, even though we do not know who the mother was. Similarly, if you know the mother, but not the father, we can represent that without knowing the other one. In contrast, if you use a ternary relationship in the ER model, a ternary relationship must have three entities associated with it. It cannot have two. 
You cannot say there is a null entity related to it. It must have three. So, it is not possible to represent a situation where we know the person's father, but not the mother. That is impossible. The other reason you might want a ternary relationship is if a person could have two fathers and two mothers, where one father-mother pair makes sense and the other father-mother pair makes sense, but the remaining combinations do not make sense. This is a little far-fetched. Maybe it is possible in certain situations where somebody is adopted and so forth, but for most practical purposes, this is not very relevant. So, we are probably going to break it up as father and mother separately. How about the third and fourth alternatives? Having an entity set parent really does not make sense here. There is no new entity which we have here. It is a relationship. So, we stick to relationships. And the last one, an entity set with attributes person, father and mother, is a very wrong design. Remember, I told you an entity set should not have an attribute which corresponds to a relationship. The relationship should be explicit, and there should not be a corresponding attribute in the entity. This is exactly the reason we removed department name from instructor. We are going to introduce an instructor-department relationship. So, we are not going to keep department name as an attribute of instructor in the ER model, although we may end up putting it back in the relational model later. Let us see how the results have come in. The number of institutions with results is a little small compared to late yesterday. A few people have not succeeded in connecting. So, please check your connectivity. The number of people responding is also quite low. It is just 117, and the audience lost here. The audience majority choice was the ternary relationship, and at first glance, that seems to make sense. Person is related to mother and father, so make it ternary.
But as I explained, with the ternary relationship you cannot represent just the knowledge of who the father is without knowing who the mother is, and vice versa. The second most popular choice was the correct answer, and then C and D also had a good number of votes and, as I explained, they are both wrong. I hope this concept is now clear to you. Now, let us move on to the next concept, which is cardinality constraints. So, when you have binary relationships, there are constraints which say how many times a particular entity can be associated through that relationship with entities on the other side. Now, cardinality constraints also make sense for n-ary relationships, but modeling them is a little more complicated. So, we are not actually going to bother, because as I already said, ternary and higher degree relationships are not so common in the first place. So, we are going to stick to binary for now. So, there are four basic cardinality constraints: 1 to 1, 1 to many, many to 1 and many to many. So, I am going to show them not by using ER diagrams, but by showing sets of entities and their relationships. So, in the first one over here, we have each entity in the entity set A related to at most one entity in B and vice versa. Observe that this last entity here is not actually related to anybody here, and equally well, we may have had another entity here which is not related to anybody on this side. So, this is a 1 to 1 relationship. 1 to 1 does not mean that everybody is related to somebody. There can be unrelated entities also, unrelated under this relationship. So, that is 1 to 1. Now, of course, this is a picture. In this particular example, the relationship is 1 to 1, but when we take a relationship set and say it is 1 to 1, what it means is that for any legal instance of this relationship set, the mapping must be 1 to 1.
That is, each entity on the left side can map to at most one entity on the right, and each entity on the right can map to at most one entity on the left. That is a constraint on that relationship. The next one is 1 to many, where one entity on the left side may be related to multiple entities on the right side, but one entity on the right side cannot map to multiple entities on the left side. So, it is 1 to many, but coming from the other side, you can think of it as many to 1. So, an entity in B can be mapped to at most one entity in A. You cannot have more than that. So, this is a 1 to many situation. Now, 1 to many is actually entirely symmetric to many to 1. If you flip A and B, a relationship that is many to 1 from A to B is also 1 to many from B to A. It is completely symmetric. There is really no difference here; whether you call it 1 to many or many to 1 is up to you, provided you put the corresponding entity sets on the left or the right. Of course, saying that a relationship is many to 1 from A to B is not the same as saying it is 1 to many from A to B. For example, we may have an instructor being the advisor for many students, but each student can have only one advisor; that is a constraint. Therefore, it is many to 1 from student to instructor, but I cannot say it is many to 1 from instructor to student. If it were many to 1 from instructor to student, that would mean a student can have many advisors, but an advisor can advise at most one student. That is a rather silly situation. Most universities have far more students than advisors. So, this does not make sense. And finally, many to many, which basically means there is no constraint. Each entity on the left can be related to as many entities as you want on the right and vice versa. So, this is a many to many situation.
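To make the four cases concrete, here is a minimal Python sketch that checks whether one instance of a binary relationship, given as a set of (a, b) pairs, satisfies a stated cardinality constraint from A to B. The function names and the sample IDs (i1, s1 and so on) are made up for illustration; they are not from the book or the slides.

```python
from collections import Counter

def max_fanout(pairs):
    """Return (most B partners of any A entity, most A partners of any B entity)."""
    pairs = set(pairs)  # a relationship instance is a set of (a, b) pairs
    per_a = Counter(a for a, _ in pairs)
    per_b = Counter(b for _, b in pairs)
    return max(per_a.values(), default=0), max(per_b.values(), default=0)

def satisfies(pairs, constraint):
    """Check an instance against a cardinality constraint stated from A to B."""
    per_a, per_b = max_fanout(pairs)
    if constraint == "one-to-one":
        return per_a <= 1 and per_b <= 1
    if constraint == "one-to-many":   # each B entity has at most one A partner
        return per_b <= 1
    if constraint == "many-to-one":   # each A entity has at most one B partner
        return per_a <= 1
    if constraint == "many-to-many":  # no restriction at all
        return True
    raise ValueError(f"unknown constraint: {constraint}")

# Hypothetical advisor instance, written as (instructor, student) pairs.
advisor = {("i1", "s1"), ("i1", "s2"), ("i2", "s3")}

print(satisfies(advisor, "one-to-many"))   # instructor to student
print(satisfies(advisor, "many-to-one"))   # would limit each instructor to one student
print(satisfies(advisor, "one-to-one"))
```

Note that this only tests a single instance. The constraint on the relationship set is the stronger statement that every legal instance must pass the check.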
Even if a relationship is many to many in general, a particular instance of it may turn out to be 1 to many or many to 1 or even 1 to 1. So, there is a difference between what an instance satisfies, which is not very useful, and what a relationship must satisfy, which is a constraint on the database design, and that is what we are looking at here. Here too, there may be things which are not mapped. On either side, there may be entities which are not related at all to any entity on the other side. So, that was the conceptual idea of cardinality constraints. Now, how do we represent this diagrammatically? So, in our ER notation, we are going to use arrows to denote this. Again, arrows are not really used in the UML notation. This is a slight difference, but they are used in the ER notation traditionally. So, we are going to stick with them. So, an arrow from a relationship to an entity will signify one, while an undirected line with no arrowhead will indicate many. So, let us see it diagrammatically. So, here is a diagram which says that instructor and student are related by an advisor relationship, and the arrow on both sides says it is 1 to 1. This is not realistic, as I said. If it is 1 to 1 and there are 100 instructors and 1000 students, at least 900 students cannot have an advisor. So, that is not realistic, but this is just to show the notation. So, each student can have at most one advisor and each instructor can advise at most one student. Now, here is a more realistic situation where we have the arrow pointing to instructor, not to student. So, what an arrow in this direction means is that a student can have at most one advisor, and since there is no arrow in the other direction, you are allowing an advisor to have as many students as required. So, what is this? This is 1 to many from instructor to student, or if you view it from student to instructor, it is many to one from student to instructor.
Then there is many to one from instructor to student, and finally, this shows many to many, where there are no arrows. So, one to many and many to one are one kind of cardinality constraint, limiting how many times an entity can participate in a relationship. So, if a student can participate only once in the advisor relationship, that means a student can have at most one advisor. Now, this can be generalized. We may say that a student must participate exactly once in the advisor relationship. What does this mean? It must participate. That means a student must have an advisor and must have only one advisor. It cannot have more. So, there are several notations for this. I am using a slightly different example here, which is the relationship between section and course. So, remember that a section is associated with a course. Each course may have 0 or more sections in a particular semester. Now, I am going to defer the discussion of what the attributes of section and the attributes of course are for the moment. You will notice that section does not even have a course ID. And this is because it is what is called a weak entity. I will come back to that later. For the moment, let us say there are section entities representing each section, and then course entities representing each course. And we are going to identify which course a particular section corresponds to by a relationship called sec course. You will also notice that this particular relationship is shown with a double diamond. The reason for the double diamond will become clear later. For the moment just treat it as any other relationship. So, now, if you have a section and the database does not know which course the section is associated with, that is idiotic. It is like saying I am teaching a course section, but I am not going to tell you what the course is. That does not make sense. I have to tell you what the course is. So, a section must be related to a course.
And we are denoting that over here by a double line. So, that is called a total participation constraint. In contrast, if you draw a single line, that is the default, which is partial participation. Meaning, if it is a single line, a section may or may not have an associated course. That would be wrong. By putting a double line, we are forcing total participation. That means a section must participate in the sec course relationship, associating it with a course. Similarly, if we had student advisor and a double line from student to the advisor relationship, that would mean every student must have an advisor. In fact, you can generalize this even more. So, over here we have the student instructor situation. We could have drawn a double line, but there is an alternative, even more general notation for cardinality limits. Over here, I have said 1 dot dot 1. What does that mean? This means that a student must participate at least once and at most once in the advisor relationship. That means the student must have at least one advisor and at most one advisor. That is exactly total participation with an arrow pointing towards instructor. On the other hand, on this side, I have said instructor is 0 dot dot star. What does this mean? 0 means an instructor need not participate in the advisor relationship. So, a particular instructor may not be an advisor. This is normal. Many faculty are not advisors to anybody. On the other hand, the upper limit is star. Star means no limit. So, there is no limit a priori on how many students a particular instructor can advise. They can advise any number. So, now the next quiz question. Before I read the question, please press the ST button. I hope I will see a much larger response for this question. Please, please answer the question. Do not just sit idle if you have the remote in your hand. Even if you do not know the answer, make a guess. Try to make an intelligent guess.
We are not actually recording this and saying you made a wrong choice, you get minus 5 points. No, we are not keeping track like that. So, if you are not sure, make an educated guess; it helps me get an overall idea of how well people are understanding whatever material we are covering. So, I hope all of you have already pressed the ST button. We are starting the quiz in a moment. Now, I will just give a couple of seconds. So, now the quiz should be active. You should see the red LEDs blinking. Go ahead and choose from the alternatives here. The above relationship is: many to one from instructor to student, one to many from instructor to student, one to one, or many to many. Pick the appropriate choice. You have about 30 seconds now. I hope you have chosen, and now we are out of time. So, while we wait for the results to come up, as we just discussed, in this particular diagram here, a student must have at least one and can have at most one advisor. Therefore, it cannot be many to many, because the student cannot have multiple advisors. On the other hand, we have said zero dot dot star for instructor. So, an instructor may have many students to advise. So, it is certainly not one to one, and it is certainly also not many to one from instructor to student. Many to one from instructor to student would mean that many instructors can advise one student, but one instructor can advise at most one student, which is wrong. So, in this case, one instructor can actually be mapped to many students. So, it is one to many from instructor to student. Looking at the results, the number of centers has marginally increased. Let us see the participation. We have a marginal improvement in responses, up to 122. It is still very, very low. I am not sure why most people are still not participating. Is it a problem with centers distributing clickers? Is it hardware problems? Network problems? I do not know, but I do hope this will improve.
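Going back to the 1 dot dot 1 and 0 dot dot star limits on the advisor relationship, a small Python sketch may help show what such limits require of the data. The helper name check_limits and the sample student and instructor IDs are assumptions made up for this illustration.

```python
from collections import Counter

def check_limits(entities, participations, low, high=None):
    """Return the entities whose participation count lies outside low..high.

    high=None stands for '*', meaning no upper limit.
    """
    counts = Counter(participations)
    return [
        e for e in entities
        if counts[e] < low or (high is not None and counts[e] > high)
    ]

# Hypothetical data: advisor pairs written as (student, instructor).
students = ["s1", "s2", "s3"]
instructors = ["i1", "i2"]
advisor = [("s1", "i1"), ("s2", "i1")]

# Student side is 1..1: every student needs exactly one advisor.
bad_students = check_limits(students, [s for s, _ in advisor], 1, 1)

# Instructor side is 0..*: any number of advisees is acceptable.
bad_instructors = check_limits(instructors, [i for _, i in advisor], 0, None)

print(bad_students)     # students violating 1..1
print(bad_instructors)  # instructors violating 0..*
```

Here s3 has no advisor, so it violates the 1..1 limit, while i2 advising nobody is fine under 0..*. A lower limit of 1 corresponds to total participation, and an upper limit of 1 corresponds to the arrow.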
So, looking at the choices which people have made, the most popular choice is the correct choice. It is one to many from instructor to student. Many people have said many to one from instructor to student. This is a common mistake; when you say one to many or many to one, it is a little confusing from which to which. So, given time pressure, I can understand why many of you may have flipped this, but please be careful about this. When you see the direction, just reason it out as I did: which side should be one and which should be many. So, many students can have one advisor. Therefore, it is many to one from student to instructor. But since the question was asked the other way, from instructor to student, you will say it is one to many. So, this is often a good way of understanding it. First, understand which is the many side and which is the one side. So, you can say it is many to one from the many side to the one side. Then, if you are forced to say it in the opposite way, it is clear which one it is. A few people have chosen many to many, which is clearly wrong. I just said that a student can have at most one advisor. It is certainly not many to many. Now, moving on to another interesting topic. Given a relationship, we want to identify it, so we want to have keys for relationships also. But the key for a relationship is actually not an attribute of the relationship. It is actually the combination of the keys of the related entities. So, in the ER model as devised by Chen in his seminal paper which introduced ER modeling, what he decided, and what everyone now follows, is that in a given relationship set, a given collection of entities can have at most one relationship. What does that mean? Let me explain this. Let us say that we had a relationship between a student and advisor along with an associated date.
Let us say that we decided that instead of a single date, we are going to have two dates, which indicate that this student was advised by this advisor from this date to this date. So, it is actually a pair of dates which is an attribute of the advisor relationship. Now, with this design, let us say that I was advising a particular student for one year. Then for whatever reason, I am no longer advising that student. But maybe after two years, I am again advising that student. So, what is going to happen? I have advised the student twice across different years, and I want to keep the attributes as in my design. So, let me write it out. It will make it clearer. So, I have student and I have instructor, and I have advisor in between; that is a relationship. And I have a start date and an end date. Those are the two attributes of the advisor relationship. Now, as I just said, I want to model a situation where I was advising a student for some time, then I stopped advising the student, and now I want to advise the student again. So, a lot of people will say, well, if I look at a particular instance, let us say there is a student with ID 111 and let us say the instructor ID is 222. I will say that there is one relationship here, drawn as a line, with associated dates x, y, z to a, b, c. And then there is another relationship between the same pair of entities with dates e, f, g to d, e, f. So, what we have is the same pair of entities, the same student and the same advisor, with two separate relationships of the same type, in the same relationship set advisor. So, that is the one relationship set I am looking at. And the same pair of entities have two different relationships in this relationship set. What Chen decided is that this should not be allowed. A relationship set should be a set; there cannot be two occurrences of the same thing.
Now, this is the constraint which we are going to use when we decide which attributes identify a relationship. In particular here, it should be clear that if I give you the IDs of the student and the instructor, I can uniquely identify a relationship under this constraint. If I allowed two relationships, I would not know which one it is, but I am not going to allow this. A particular pair can have only one relationship. Therefore, if I give you the primary keys of the two associated entities, then that uniquely identifies a relationship. So, that is the super key for the relationship set. That is the constraint which is implicit in the entity relationship model. Now, you may ask, well, what about this case? What do we do? I do want to model the situation where I advised the student for one year, then I did not, and then I advised the student again. And the answer is you can model it by turning this into a multivalued attribute, and I hope you can see it clearly. What I have done is create a multivalued attribute with start date and end date as its parts. To make it more clear, it is a multivalued attribute called period with two parts; the period has a start date and an end date. So, now it is clear that I can model a situation where I advised the student from some date to some date, that is one instance, and then from another date to yet another date. And these are both going to be part of this set. So, since it is a multivalued attribute now, I can have multiple instances inside of it. So, this is the correct way to represent a situation like this, where the same pair of entities may have multiple relationships: it is treated as a single relationship, but with a multivalued set of attributes on it.
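As a rough illustration of this design, the following Python sketch stores the advisor relationship set keyed by the pair (student ID, instructor ID), with period as a multivalued attribute holding (start date, end date) pairs. The class name, the method names and the specific dates are hypothetical and chosen only for the example; the IDs 111 and 222 follow the example above.

```python
from datetime import date

class AdvisorRelationshipSet:
    """Advisor relationships identified by the keys of the related entities."""

    def __init__(self):
        # (student_id, instructor_id) -> set of (start_date, end_date) periods
        self._periods = {}

    def add_period(self, student_id, instructor_id, start, end):
        # The same pair never yields a second relationship; a new period
        # is added to the multivalued attribute of the existing one.
        key = (student_id, instructor_id)
        self._periods.setdefault(key, set()).add((start, end))

    def periods(self, student_id, instructor_id):
        return sorted(self._periods.get((student_id, instructor_id), set()))

    def __len__(self):
        # Number of relationships, not number of periods.
        return len(self._periods)

adv = AdvisorRelationshipSet()
# Instructor 222 advises student 111, stops, and later advises again.
adv.add_period("111", "222", date(2009, 7, 1), date(2010, 6, 30))
adv.add_period("111", "222", date(2012, 7, 1), date(2013, 6, 30))

print(len(adv))                   # one relationship for the pair
print(adv.periods("111", "222"))  # two periods on that relationship
```

Using the pair of IDs as the dictionary key mirrors the idea that the primary keys of the participating entities together identify a relationship, while the set of periods plays the role of the multivalued attribute.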