 Welcome back to the global lab session. So, for those of you who are assembled, I would like to briefly discuss some of the issues which went into designing a near diagram for the two example scenarios which we have. The first example scenario is a railway system and here we have listed a number of things which we would like to model. If you see the problem statement here, it says we want to model stations, tracks which connects stations, train schedules, passengers and after we do the year diagram we have to transform it into a set of tables. Now, this is highly simplified. So, one of the things I want to do is discuss a the solutions which we have made available and b what are the assumptions which we made to simplify it and how we could go around extending it to a more realistic scenario. The reason I simplified it is that this whole thing the two exercises both had to be done in just over two hours of your time and that is fairly short. So, we had to keep things simple. So, first of all starting with stations clearly each station would be an entity there is no doubt about it clearly each train would be an entity there is no doubt about that. Now, the next thing in that list is tracks. So, a track connects two stations in our simplified world. In reality it is more complex tracks can have junctions at various places and realistic system has to model these kinds of situations where there are two tracks in parallel and certain places along the tracks where you have points which later train move from one track to another going in one direction. And so, there is a lot of complexity in really modeling a track system between a pair of stations in Mumbai there may be four or even six tracks in certain situations with a lot of interconnections. So, it is a fairly complex situation, but to keep life simple we said let us just have a single pair of tracks a single track between any two stations. In this simplified world where there is nothing else in between just as edge connecting two stations we can model a track as a relationship between two stations. And if you wish to add any extra information such as the distance between the two stations you can add it to the track relation which will record how far the stations are from each other along the track. So, let me show you the relevant part of the solution. If you look down here we have a station and then a track which is a relationship between two stations. If we needed to bother about directionality and other things we would have to provide a labels for these edges, but in a simplified view we are not going to bother about all that. But if you want to make it more realistic if you want to model the presence of multiple tracks between a pair of stations what do we do? One possible approach is to add a multivalued attribute to the track which says between these two stations here are you know possibly multiple tracks with one attribute set for each of those tracks. That could be useful, but maybe it makes more sense not to model track as a relationship. Why not model it as an entity itself, maybe even a segment of a track could be the smallest piece of this puzzle. So, maybe we can model smallest segment of a track that is between any two points. So, between those two points is a continuous object and we can model that as an entity. So, if you do this what happens we have entities which are track segments. Now we need to know how track segments connect to each other and how they connect to stations. So, let us do this on the white board and in just a little bit, but before we switch to the white board let me wrap up the rest of the basic design which you would have done. So, coming back to the problem specification we need to model train schedules and trains. So, the train schedule in our simplified world records at what time a train passes through each station on its route and we again made life even easier for you by assuming that all trains finish within the same day. Why do we do this? That should be obvious because if a train starts on one day and continues through the next day, we have to model which day it starts and maybe that will be day 0 by default and the timings along the route have to include not only the time of arrival and departure, but also the date relative to the starting date. So, day 0 is the starting date. Day 1 in the case of some very long distance trains in India day 2 or 3 may be even or if you are in Russia in the Trans-Siberian Railway maybe it goes up to day 7 or so I am told. So, we ignored that, but we could have modeled it by adding in addition to the time the day. So, if you see the Indian Railway schedule in fact they do this they have a day in addition to the time. So, then we are modeling the time in and time out and finally, we need to know what is the sequence. We can always get the sequence by sorting on time, but it helps to keep a sequence number to see which are the adjacent stations in a route. So, what we did in our sample solution is to have a train associated with the station and so that is the relationship travels through which links a train and a station. Now, what are the attributes of that relationship? We have got a sequence number 0 means the train starts there and a time in when the train reaches that station. Now, for the very first station on the route where it starts time in does not really have a meaning. Maybe it does if you are also worried about what time the rake empty rake reaches this platform, but ignoring that we can simply set time in to null and time out is the time when it starts. Then you have that would be a sequence number 0 or 1 computer scientists like to start from 0 others may start from 1. So, then you will have a sequence of stations through which the train passes and each of these would have an associated sequence number. Now, if you have a train which actually does a loop and continues on what do I mean by that it arrives at a station goes somewhere and then comes back to that station and continues further in its journey not goes back, but continues further on. Then you have a strange train with a loop in its journey for such a train with a loop something is going to go wrong with this design. What is going to go wrong? The problem is that the same train arrives at the same station at two different times and departs at two different times has two different sequence number all for the same station and we cannot model that with the diagram we have here. What we have here is a time in and time out and a sequence number which are single value. So, we cannot have a train loop back. So, this is again a simplifying assumption we have made here to keep the E r diagram simple. Now, is this a realistic assumption? It is for the most part I do not think there are very many trains which loop back, but then you never know there are all these special super expensive trains which take tourists through Rajasthan or Konkan and so forth. Now, I am sure some of these trains actually loop back they depart on a particular date go somewhere come back and then go further to some other place. So, if you are modeling such trains this is going to fail. So, then you would have to have something slightly different perhaps a multi-valued attribute or perhaps even you can model a particular halt of that train itself as an entity. So, when we tried this exercise during the coordinator's workshop several people actually did that which actually makes a lot of sense in this situation where the halt of a train is an entity by itself and the halt relates to a train it would be a weak entity of a particular train and it would be related to a station. So, then you can easily have a loop I will come back to that design alternative also on the whiteboard, but before getting into those alternatives let us finish up the last piece which is we want to track bookings for passengers. Now, in the Indian railways the railways does not really track who you are across time across multiple bookings it simply has a name and an age and that is it it does not know anything more about you may be it has an address, but if you have an airline reservation system sometimes they have a frequent flyer number and can identify you individually across multiple times in your travel. Again we are not modeling such situations we are going with the Indian railway model and a booking simply has an ID and again we simplified it a lot in general a booking has multiple people traveling together a family or set of friends may travel together we again simplified it to assume we are going to have a separate booking record for each person traveling. So, we have a booking separate booking ID and then a name a coach date of travel and a seat number. So, we are going to have separate entries per person and therefore, we do not have to worry about multi-valued attributes here either. Now, note that the booking is for a train and since train is an entity we have to create a relationship from booking to train not also that it is a total relationship there is a double line and there is an arrow pointing towards train because a booking can relate to only one train you cannot have a multi-train booking in our model although you can imagine a flight booking which has multiple legs or a train booking Indian railways does not support it as far as I know, but maybe you could do that in certain situations we are not modeling it. The other aspect of the booking is it is it starts from some station on the trains route and ends at another station on the trains route. So, you have a from and a to station and since station is an entity also we have to make both of these relationships from booking to station one of those relationships is called from and the other is called to. So, that those are the two relationships as before we have total participation and many to one because you can have only one from station and one to station. So, that wraps up our very simple railway ER model, but now let us see what are the issues if we try to add more features to it and I am going to switch over to the whiteboard at this point. First of all let us take the simpler one which is if you want to allow a train to go in loops and do generally crazy things and so we decide to model a halt of a train as an entity. So, we have a train I am going to omit the attributes of a train because we already know what they are then we can have a train halt and treat it as an entity. A train halt has to be a weak entity because it is for a particular train as a result train will become the identifying entity for it we can call it halt for perhaps I do not know if you can read this I wrote it a little too small let me try again halt for that is an identifying relationship. Therefore, this has to be total and this side this is many to one and a train halt itself has a sequence number which is a discriminant. Now, another issue is if you are modeling a train route in full detail we want to model not only places where it actually hoist, but also places where it passes through. So, if you wish to distinguish this we have a couple of options in our schema we did not really bother too much about this we could have just assumed it hoist at every station along the route it is the slowest passenger train on the system it hoist everywhere, but supposing it does not halt at a station quick hack was to say that the time in and time out for that station are exactly equal. So, minimum halt is one minute if you halt less than that if you time in and out are exactly equal that means it does not halt at all. If you wish to make this more explicit we can have maybe not a train halt we could call this something slightly different we can have a train through passes through or something like that and then we can have a flag or hoist that is a bullion which says whether it hoist or not and then time in as before time out. So, far so good, but this is obviously related to a station so we have a, sorry for the interruption the battery died out on me. So, to repeat what I was just saying towards the very end a train halt is obviously associated with a station. Therefore, we introduce a relationship between a train halt and a station and that relationship is one in which a train halt has to have total participation because halt has to be presumably at a station. If you want to model halts you know at some point along the route which is not a station we may want to go further, but let us ignore that and so a train halt is total participation for train in halt at and each train halt is at only one station therefore, you have a many to one relationship with station. So, now if you have a train which does a loop there is no problem there will be one in a week entity instance for the first time it comes into that with some sequence number and the second time it again comes to the same station it will be with a different sequence number in that can be modeled perfectly well. So, that took care of one part of it the second part is the track itself. Now, if you notice we just said that there is one track between any pair of stations and if you think about this if you are familiar with the data structure graph the graph data structure that basically has nodes which correspond to our stations and edges which link a pair of nodes. So, what we have done in our design where track is simply a relationship between two edges is model an edge of a sorry a track is a relationship between two stations is basically a edge between two nodes in the terminology of the graph data structure. So, what we are saying is that a graph data structure can be modeled as a ER diagram you can translate this to a set of tables and so you can actually populate a graph in a database model a graph in a database. Now, when we have multiple edges and so on it is more than a graph some multi graph and we can still model that. So, let us try to do that supposing we have a station to another station you have multiple tracks. Now, presumably the railway system would have given names to those tracks. So, you have tracks. So, a track has to become an entity now not a relationship because for a given pair of stations there may be many tracks between them and really each track has an its own existence. In fact, there are good reasons to model as an entity. For example, different tracks may have different speed limits based on when they were built. So, maybe there is even a record of when the track was laid and when it is due for maintenance when it is due for replacement and so forth. So, we are going to model a track segment to be precise it is a segment of a track as an entity that is the name of the entity and it can have attributes perhaps it has an ID. Let us make it a strong entity because we cannot really associate a track with one of the stations or the other. So, let us just give it a unique ID globally unique ID and then we have station which is also an entity. So, a track in our current model we may even want to model interchanges between tracks in between stations, but let us throw that out to keep our life simple. So, now we have a track segment, we have a station and a track segment connects two stations. So, basically it has a relationship called connects a track segment connects two stations. Well, how do we put a constraint that it connects exactly two stations. So, one way to do it is to put a participation cardinality constraint here 2 dot dot 2. So, what that means is every track segment here must participate twice in the connects relationship and that will correspond to the two stations. Should a station have a track? Well, when a railway system is under construction it is possible that there is a station without any track coming to it, but once it is constructed that would be rather odd. So, if you want to model only the state where it has actually been constructed then it should have at least one track segment, but what is the limit? It can be fairly large we have stations with you know maybe 10, 20 tracks even. So, let us not put a constraint 1 dot dot star will allow any number of tracks in that connecting to that one station. So, a track itself once you model the entity can have maybe speed limit. Now, if we want to model some other situation like let us say some of the tracks are directional. So, on a particular track you are only allowed to travel in one direction. If you want to model that how do you model it in this situation. So, one way is to create two separate relationships which is track from and track to which denotes the directionality. If you want to enter that track segment you can only enter through the from relationship segment. So, maybe we can replace connect by two things from station into station. So, a track is from a station and it is to a station and now we can actually put a limit that every track is has is definitely related from into and there is only one from station and one to station. So, we can have a pair of entities from station and to station for a track which has a direction or if you are lazy we could maybe you know hack it in here in the connects relationship we can say whether it is the tag which is from or to. So, that is an attribute that is the other alternative. If that attribute is let us say 0 maybe it means it is the from station if it is one it is the to station. So, what you would end up having is maybe a track segment one connects to station one and the attribute here from two is 0 which means it starts from station one and the same thing connects to station two and the tag is one which means the direction is two station two. So, we can have two separate relationships here or we can have one relationship with a tag either of these works. I am sure many of you came up with some such variants and if you did that is perfectly fine. I think I have discussed the railway schema enough. Now, I would like to hear back from you if there are questions on the railway schema. So, let us see if any center. Somaya apparently has a question let us see if Somaya is ready with the question. Somaya over to you. Do you have a question? Hello. Yeah, go ahead. Instead of the relation, instead of the relation to and from if it is directly added to booking as a attribute or. Okay. The question is instead of having a relationship from and to between booking and railway station can you directly make the from station and to station as attributes of the ER diagram. The answer is no. A station is an entity and you really should not turn an entity into an attribute. So, a related entity should be connected by a relationship. In the relational schema which you construct we are certainly going to turn the from and to as into attributes of the relation because you know there is only one from station and one to station which certainly will make it part of the booking relation. However, when we create the ER model since the station is an entity it should not be an attribute it should be a relationship between booking and station. So, I hope that answers your question. I will hand the mic back to you if you see if you have any follow up question back to Somaya. Do you have a follow up question? You are happy. Okay. Let us see if anybody else has a question looks like only Somaya still has some question. If you are asking questions through chat go ahead we will get those and discuss those through chat. Yeah. Somaya back to you you seem to have a question again. Sir, I have a question related to derived attribute. Go ahead. How the derived attributes are used in the RDBMS means we are saying that derived attributes are gets the value from other attribute. So, how these attributes are stored and how we get the value of a derived attribute. Okay. That is a good question. If you have a derived attribute and as I said a derived attribute gets its value from the other attributes it is not actually stored. So, the question is how do you represent this in a relational database and how do you access it? How does it get its value? And the answer to that question is a regular you know bare bones relational database does not actually support derived attributes. So, the best you can do is create a function which takes may be not even an SQL may be a function in Java or something which takes the identifier the primary key and returns the value by computing it. So, the typical example is age the current age. So, why does the current age matter? Because when you travel by railway if you are over 5 you have to buy a half ticket if you are over what is it 12 or something you have to buy a full ticket. So, the date as of the date of travel matters. So, in fact you may not just have an age function, but you may have a function which takes a parameter age as of date to see what the age would be as of that date. So, you cannot store it you but you have to compute it from the date provided may default may be current date and the date of birth which is stored. So, if you have a database system which actually supports methods in addition to attributes and object relational databases some of them do support it. So, then you can just define a method along with the relation slash object and then you define a method which does this computation and then you invoke it in a query which for a language which supports such objects you will just say for example, person dot age open close which open close meaning parenthesis which means the method which uses the current date and compute age may be there is another method which is age which is provided a particular date. So, both of those are methods, but you access them just like their regular attributes, but you may have to pass in some parameter values. But if you use a database which does not support it tough luck you cannot do it, but that does not mean you should not model it in the ER model by all means model it and then may be you will work around by implementing the function in Java or some other language, but at least let it be part of the design. Any follow up question on that ok you are looking happy with that now somebody else has a question. Hello. Yes sir actually at the time of creating trigger or creating procedure we can write create or replace trigger or create or replace procedure, but at the time of creating tables we only used to write create table. Why cannot we write create or replace table? I think this question maybe you asked this question on chat earlier I have seen this question already I thought I answered it, but in any case the create or replace is a convenient syntax in oracle whereby if it is present it simply replaces the current definition. So, why would you do that? If you did not have it and you may be it exists already you first have to do a drop and then you have to create again, but dropping can have problems. So, if you drop a procedure you are going to cause a problem where if somebody uses that procedure calls that procedure they are now invalid. So, it may even for a table if you have foreign key references into that table if you drop and then create the table again you will have problems. So, for tables what we end up doing is we have a alter table syntax which lets you make just the changes you want to the table instead of dropping it and recreating it completely. Now, for a procedure you have the body of the procedure and it is hard to say you know delete line 24 and add this text to line 23 and do this and do that that is a factly complex you cannot do that. So, what you want to say is here is the new body of that procedure and use this and the create or replace syntax basically lets you do it. The procedure does not exist it just creates it if it exists already it replaces. So, maybe we could have a database system which provides it also for creating table. So, you can say create or replace table. So, the table already exists it will replace it. If you see the DDL scripts which we have provided we actually have two scripts one for creating and one for dropping. So, supposing you created the tables then as part of your assignments you went around modifying the tables. Now, you want to start a fresh with a new schema. Now, if it is just modifying the data we have the scripts for loading data which first remove all tuples from the relations and then load all the tuples again fresh that is part of the data loading script. For the script to create the relations on the other hand we have one which deletes and then one which creates. If we had create or replace we would have just used that directly just had create or replace without having to drop and then create again. I hope that answered your question. So, now since we. Hello. Yeah. Hello. Yes. Can we have all the kind of cardinality limits on ternary relationship? Okay. So, can you have arbitrarily cardinality constraints on ternary relationships. So, we saw what happens if you put an arrow on a ternary relationship. So, that means that for a supposing there is a single arrow let me draw it to explain it. So, we have a ternary relationship let us say R which is connected to E1, E2 and E3. So, now if I have an arrow pointing towards E2 we could take this to mean that for a given even E3 combination that can exist only one relationship with an E2. So, that is the general interpretation. Now supposing I have two arrows going to E1 and to E2 what does this mean? So, some books interpreted this to mean for a given E3 there can be only one E1 and one E2 combination. Some others said that if you just look at this arrow head what it means is for a given combination of the other entities in this case E3 and E2 that can be only one E1. And similarly if you look at E2 for a given combination of E1 and E3 that can be only one E2. So, that causes some confusion. So, we decided not to bother about this weird cases where different people are saying different things and we said that let us just assume there is only one arrow out of a particular ternary relationship then the semantics is uniform across whoever has tried to define it. So, that is one approach. The other approach is instead of using arrow heads we can put cardinality limits. So, we can say 0 dot dot 4. Now that is a lot more clear. 0 dot dot 4 means that E2 must participate in any way between 0 and 4 occurrences of the relationship R. In fact, if you have the arrow coming in that is slightly different. The arrow coming in what does that mean? It means for a you know particular combination of other ones there is only one E2, but I cannot say that that is one dot dot 1 that is different. So, it is like arrow coming in is many to one whereas putting this limit here 1 dot dot 1 is basically saying this is the one side it is not the many side. So, how to say that for this combination of E1 and E2 that can be only one E1 and E3 there is only one E2. How do you specify that you think cardinality? You cannot put limits here or limits here to specify that for this combination there is only one here. So, what exactly can be expressed using these notations is slightly different. Since this whole thing is a little confusing and in any way ternary relationships are rare we just left it out of our discussion. One way is to you know write it in words that this is the constraint and then maybe you can use that constraint when you do the relationship design. For example, when you decide what is the primary key of this relation which you create from the relationship this arrow would help you decide that as long as a single arrow. So, I am just going to leave ternary relationship side that I do not know if I answered your question to full satisfaction. If you have a follow up question you are welcome to ask. Sir, I have a question related to constraint. For some reason if we disable the constraint then we should do some DML operation like insert update or delete and after that when we again enable the constraint and if there is some problem with the constraint then what happens to the inserted later or updated data. That is a good question. If you drop a constraint totally and then re enable it what happens if the constraint no longer holds. The answer is that if the constraint does not hold it will not get added to the schema it will fail. So, constraint can be added only if it actually holds on the current state. However, I want to differentiate this from the thing which we discussed yesterday which was to defer the constraint. We did not drop the constraint we did not disable the constraint we deferred it what does defer mean it does not mean defer to some arbitrary point in time it means defer to the end of the transaction. So, the constraint is checked at the time the transaction complete is wants to finish up at the end of the transaction. At that point if the constraint fails the transaction is rolled back that does not mean the constraint is not being enforced. So, the thing here is that if the constraint fails the transaction rolls back therefore, the constraint will continue to hold it cannot get violated. So, deferring is different from disabling I hope that clarifies it ok. I think I will stop taking questions for a minute I am sure you have lots of questions, but I want to quickly wrap up the second part of today's lab oh and before that I want to say one more thing. This each part today also had a corresponding relational schema in the solutions we have provided have given you the table and we have also mentioned what are the primary keys, but in reality when you create the relational schema you should also be including the foreign keys we will add it to that solution subsequently. So, if you have the email copy of the solution it does not have it, but I leave it as an exercise to figure out how to add the foreign key constraints in the relational schema. It is discussed in the book when I covered it in the morning I do not remember if I mentioned it, but every time you create a relation from a relationship there would be foreign keys into the associated entities. Every time you fold a relationship into an entity again there would be a foreign key from that entity from that relation representing the entity to the entity on the other side of that relationship. So, every relationship ends up creating foreign keys. So, that is one extra part to the schema creation. So, if I can come back here. So, here are the tables which we would create from that ER diagram. There is a table for booking and you will note that the last few attributes of this table are train ID from and to. Those are actually references to the train and the station relations. The from and to are both foreign keys referencing station, train ID is a foreign key referencing train. The other ones are all straight forward. Track has station ID 1, station ID 2 to indicate what it connects. Travels through is also straight forward. The attributes of the relationship have all become attributes of the corresponding relation travels through. Sequence number, time in and time out if you go back here. Time in, time out and sequence number are all attributes of travels through. So, travels through has the primary key of train, primary key of station and these three attributes which is how we landed up with train ID, station ID, sequence number, time in and time out. So, that was straight forward. Now, let us come to an ER diagram for this database workshop, modeling a resource center, center coordinators and capacity, faculty who will be attending and the associated institutions for all faculty. So, the solution to that I will show you the ER diagram again. This is again a fairly simple ER diagram. Each resource center is obviously an entity. Each faculty member is obviously an entity. Each institute where the people come from is obviously an entity and those are the only three entities over here. We have also this relationship of who is the coordinator for a particular resource center. So, that is a faculty member whose attends at a particular resource center that is a faculty member. Note that we perhaps could have merged resource center and institute. Resource center is basically where you are all meeting for this and institute is where you belong. Now, why did we keep it separate? I do not have a good idea, but perhaps this was to model the situation where you can have a resource center which is not associated with any institute. But in this particular course, I think every resource center is an institute. So, in reality there is no need to separate these. So, then belongs to, attends and coordinates are all going to be relationships between faculty and institute. So, then the diagram would be extremely simple in that case. Now, resource center is well one way to look at why it is not an institute is maybe it is a sub part of an institute. So, because it has a capacity and institute like IIT or NITs have thousands of students, but a resource center is basically in a particular room it has a capacity constraint which is specified. So, perhaps a resource center can be identified as a weak entity related to institute perhaps or maybe even as a strong entity which is related to an institute where it resides. So, there are other possibilities here which we did not explore, but some of you may have done that if you did that is perfectly fine. So, that was the fairly simple year diagram coming back to the schema which we generate from it. You notice that we have resource center as a table, faculty as a table, coordinator attends our tables. Now, perhaps we could have folded these two relationships coordinator and attends into the faculty member. Now, if I come back to this diagram you will notice that the relationships are many to one from faculty to resource center. So, it would have been easy enough to have one two more attributes for faculty. One is attends and another is coordinates which contain the idea of the resource center. So, we could have got rid of two relations and turned them into attributes in the relational design. In the year diagram they have to be relationships. In the relational design we could have folded them in. Why did we not fold them in? For a couple of reasons. First, not all faculty are coordinators and not all faculty attend resource centers. So, it is not a total participation. So, the moment you have a non-total participation you get null values. So, to avoid that we decided to keep them as separate relations. So, what this means is coming back here coordinator will only have tuples. So, those faculty who are actually coordinators and similarly attends has things only for those who are attending at a center. So, now if I want to see how many coordinators are there I can run a simple query a select count star from coordinator. In contrast if I had coordinator of as an attribute of faculty I would have to say select count star from faculty where coordinator of is not null and so forth have to deal with null values. Null values always cause trouble. So, this schema is probably better. In contrast a faculty member has to be associated with an institute. Therefore, we folded the faculty member belongs to institute relationship into an insured ID field of faculty. And finally, the question last part was if you have clicker questions to be modeled and answer submitted by people to be modeled how do we do this. So, first of all question can be thought of as an entity on its own it has several attributes it does not make question cannot be a relation what is it relate nothing it exists in the abstract by itself. So, question we decided to model as an entity. Now, we already have people who are participating faculty and so faculty member may submit an answer to a question. Now, they can answer a question only once they do not get multiple items. So, we just have to record one answer for a faculty member responding to a particular question. So, what we did is come up with this E R diagram the left hand part is exactly the same as what we already saw the right hand part is what is new. So, the clicker quick quiz question is a new entity identified by an ID each question has an ID. Then we have a question text the four choices which we have been making available. If you want a fifth choice you have to add one more attribute and then a correct answer which is what is the correct answer for this question assuming only one answer is correct. So, that is an entity by itself completely described entity. Now, faculty member attending this workshop submits an answer to a question you can have actually this has a slight problem this arrowhead should not have been there I apologize for that error this is many to one. If it is many to one many faculty members can answer one question, but one faculty member cannot answer multiple questions. So, that is obviously an error that got in inadvertently please delete this arrow here from submits answer to clicker question remove the arrow. So, it is a many to many relationship and further when the person has answered a question we have to record what was the answer cannot just say answered and not know what is answer. So, the answer choice becomes a attribute of the submits answer to relationship. So, that is fairly straightforward. Then we have the last part which is converting this to tables and that is also straightforward we would have a table for clicker question and a table for submits answer. The submits answer table would have faculty ID, clicker question ID and answer choice three attributes. So, that is also straightforward. So, that wraps up the basic set of questions we had for today and I will take a little bit of time for questions. So, that this has the last part one other thing I want to mention the there is a tutorial on how to use dia to draw these diagrams and there is also a sample dia file which you can use to modify you copy modify we had sent these instructions to coordinators. Hopefully they have communicated it to you, but otherwise whenever Moodle at IIT is back you can come and see it. There are two ways of modeling entities in dia because our notation does not quite match existing notations. So, for newer versions of dia there is something called a database module which has a relation construct inside it. So, we have used that. Now if you look at the left of each attribute here there is a little mark that mark really indicates whether that attribute is nullable or not. There is no way to remove that. So, that does show up in the diagram here. In contrast the sample dia file which we have provided also shows you how to use the entity in the class rather in the UML class to model and entity. So, there is a sample class which you have used to model and entity by removing certain fields and setting certain fields. So, all that is described in the tutorial and it has been done in the sample. So, play around with it. So, I will stop here and take any more questions. Let us see if there are any questions please or chat has there were a few questions sent by chat. Let me answer those. The first question was how do you show total participation using the dia tool? So, if you just use a line to connect two things in dia then there is no way to make it a double line. On the other hand in the year module of dia there is a connector which is shown with a double line. So, just copy select that and drag and drop it into your diagram and now connect it up to the two things which to the relationship and the entity which you want to connect. And double click on that connector and it has an option which says participation which is either partial or total. If you select total it becomes a double line. If you select partial it stays as a single line. So, that is the way to get total participation double lines. The next question is given example of a quaternary relationship but ternary itself is hard enough to find and you want a quaternary with four things related. Actually, I will take this opportunity to mention some situations where you start off with something that looks like an n-nary relationship for large n, but after a while you kind of give up and say let me treat that as an entity instead not as a relationship. So, let us take let us say sports let us cricket all of us enjoy cricket well almost all of us probably. So, we have scores at the end of a game. Now, how do you model that score information which is the score card? How would you model that in an ER model? So, that turns out to be a little trickier than it appears. So, if you see a score card let me show it up here. So, score card has something like batsman and then there is a sequence number. So, when what is the order 1 2 3 there is a batsman then there is probably some thing like runs balls faced then there is usually in line not as a separate column you record how the batsman was out. So, we can split that into separate columns separate attributes caught which is who caught them which could be null which means the person was not caught at all bold who was the bowler and for us thumping I do not know if the bowler even matters and then stumped or run out run out stumped I guess means the keeper by default. So, run out by who who did the run out and maybe some other statistics 4's 6's and so forth. So, those are all the statistics. So, we have a table which looks like this. So, now if you see most of these are simply attributes in the table, but there are several attributes which like batsman caught bold run out or stumped by whatever. These things are really references to some entity to a batsman to players in general their references to players. So, how do we model this should be then there is a match of course this whole thing is for a particular match. So, that is not part of this table, but it is I am sorry it is going out. So, this whole thing is for a match and innings combination for a test you have two innings. So, all of this information exists for each match and for each innings. So, how do we model this should be have a match entities batsman entities bowler entities player entities what are all the entities which may exist. So, it is very clear that match should be an entity nobody will doubt that it is clear that player should be an entity it is clear that team should be an entity it is also well what else what are the other things what should be entities and relationships now. So, clearly a player plays in a match. So, maybe there is some relationship between player and match. Now, does it make sense to put all of these things as attributes of this relationship and that becomes a little messy. So, maybe you can have an energy relationship for corresponding to one row of this scorecard that becomes an energy relationship with thing like which looks like this is not a good design mind you I am just showing what you could do I am not saying it is a good idea. One of the edges it is called batsman another edge is called bowler another is caught another is run out and then you have attributes here such as runs balls etcetera these are all attributes. This is one way which you in which you can model this whole thing there is also an innings I will skip that for simplicity. So, you can model this row of this scorecard as a relationship which goes like this, but there is a problem with this the problem is that for a particular row maybe there is no person who caught it was a bold there was no caught by there was no run out. So, then you have a relationship where instead of have. So, in this relationship is actually it has five entities, but it is also a bad idea because if there is no person who did a caught or not for a particular row of the scorecard this will turn into a ternary relationship in that case you cannot have relationships which keep changing their arity as you please. So, then you could go and create separate relationships one for the case where the batsman was bold one was for the case where the batsman was caught and so forth it is really messy. Coming back we continue to have an entity called a player and maybe we want to have a particular row of the scorecard which could be a new entity player innings. So, this is a particular instance where a player played and that is related to player through batsman. So, that is the sorry this is slightly messed up let me draw it again batsman says that for this player innings this was the batsman and then you can also have other relationships all the other ones which we had we could have had a bowler which is another relationship we could call it bold by to make it clear bold by we can have one more relationship caught or caught by if you want to make it very explicit player and so on and so forth you can have run out and so on. And the player innings can have all the associated attributes not shown them here, but runs balls and so on can be attributes of this entity player innings and that actually makes life a lot simpler. So, if you see a player innings this is for batsman there will be a similar one for bowler. So, maybe this is batting innings perhaps bowling innings is actually simpler because it just has the number of overs bowled and so forth number of wickets taken, but for the batting innings it is for a batsman. So, it must have a total relationship with batsman which is again to one player bowled by need not be total there may not be an associated bowler he may just be stumped or caught by again there may not be anybody, but it can be only caught by one person you cannot be caught by two people and so forth. So, we can introduce cardinality constraints also here and this diagram here is a lot cleaner than the n-ary relationship we created earlier. So, whenever you get a very large you know n-ary relationship things go really messy it is probably a bad idea you probably much better off turning it into an entity with separate relationships. In fact, it is discussed in one of the exercises in the book that whenever you have a relationship n-ary relationship you can always create an artificial entity like this. So, if I had an let us take the ternary case e 1 e 2 e 3 an n-ary relationship like this can be turned into you create a new entity let us say e prime and e 1 is still there. So, the instance of this relationship corresponding to that instance we create a brand new entity and that entity is going to be related to each of the corresponding entities. So, e prime is going to have a relation with e 1 e 2 and e 3. So, this is a well known thing that an n-ary relationship can be translated into a fake entity which has a n binary relationships with the original entity sets. So, this can always be done and in fact, we have done something like this here although in this case it was not truly an n-ary relationship it was a worse than an n-ary relationship it was a relationship whose arity kept changing depending on the situation. So, here definitely we want to turn it into an entity not keep it as a relationship ok I hope that answered that question. The last question is how to really how to reduce aggregation into a relational schema I did not cover aggregation here, but for those of you who know what aggregation is I will like describe this for those of you who do not know what is aggregation aggregation simply lets you treat a relationship as one of the kind of like an entity. So, you have a relationship which is linked to another relationship and the book has some examples of that the slides I have used here do not have that, but the book website has more detail set of slides and the book of course has the examples of aggregation. If you know what is an aggregation how to convert it to table it is actually very simple you do not directly create a table for the aggregation, but the aggregation makes sense only if it has a relationship with something else. In that relationship you need to create primary key for the aggregation. The primary key for the aggregation is simply the collection of primary keys of all the entity sets that participate in that aggregation well actually if you have cardinality constraints many to one maybe you can eliminate some of those keys and reduce it, but the primary keys of all the component entities of an aggregation will form a super key which you can use as the key for the aggregation. So, that is basically what you do you do not create a table for the aggregation is not required because it does not have any attributes by itself, but the point of an aggregation is to have a relationship with the aggregation and so what you can do is create a table for the relationship and if you do create the table you need a primary key from the aggregation and that this is how you get the primary key. So, I hope that has answered the question. Let me give a last chance for people to ask questions online. Let me see if anybody has their hands up. If you have a question please use a view and raise your hand. Varangal seems to have a question. Let me see if I can connect to you. Can you hear me sir? Yeah, I can hear you please go ahead. My question is whether there are any restrictions on the use of outer join. So, question is are there restrictions on the use of outer join not really you can always take any two relations and do an outer join between there is no such restriction maybe you have something deeper in mind. So, maybe you can ask that question in a different way. Back to you. Actually, I wanted to perform a natural join between a relation and the outcome of a sub query which involves a outer join and the outer join sub query basically involves a attribute from the outside attribute and when we are referring the outer attribute then there is a problem there. I think I understand the question you are asking and I think that was actually answered in a slide. So, let me bring up that slide that will help you understand what this is. So, I hope you can see this slide. I think this slide will answer the question I hope. So, basically you wanted to have a natural join between some relation here and you wanted this sub query here which you see here to access an attribute of the outer this relation here. So, the thing to note here is when you have a sub query in the from clause by default it cannot access any attribute of any other relation in the same from clause. It can if it is a sub query in a where clause it can access attributes of any relation in the from clause. A sub query in the from clause by default cannot access attributes of any other relation in the same from clause. They are all at the same level they cannot see each other unless your database supports a construct called lateral in which case you say something like this instructor I1 comma lateral and what lateral signifies is that the sub query immediately following it can access the attributes of anything to the left of it. So, this is a feature which not all databases support although it is part of the SQL standards and if it is not supported there is no way you can access attributes of I1 inside here. If you do wish to do this you can probably rewrite the question in query in some other way it depends on the specific query. Now a SQL server has a construct equal to equivalent to lateral although they call it something else. I think they call it cross join or something like that which has effectively the same result as the lateral clause. Oracle I think supports lateral currently in the current versions DB2 probably does. Postgresql does not. So, if you are using postgresql you have to rewrite your query in some other way. I hope that answered your question. I will see if I can get back to you. Yeah, back to you Arangul. Actually as post GRE does not support it I think I have to modify the query. Yes, you will have to modify the query. I do not have a general solution which will work by any query but for the specific query I am sure it can be expressed. It is just a question of figuring out what you need. Thank you sir. Over to you. I think we are well beyond the scheduled end of the day. So, thank you very much.