 Please note that the sheet which you have got today contain the one part of it is the ER thing that is the only thing we are doing in this session. The remaining parts of it are for the afternoon they are not meant for this session so please ignore them for now. So, only the ER the one sheet is for this session the railways and the other the second there are two question two parts. So, the first problem is the railway system with stations, tracks, connecting stations, trains, schedules and bookings. So, the first step in any such design is to identify the entities. So, I think that is relatively straightforward task here although there are several alternatives there are several possible ways of doing this, but let us start. I think everyone will agree that a station is an entity it is no problem there everyone agrees that a train is an entity and passenger booking is an entity. So, what remains is what is a track what is a schedule these are the two things and then of course, there are relationships between these entities. So, in the description here what we mean by a track is a single segment of track between two stations it is a simple model. So, what should track be should it be an entity should it be a relationship. For the purpose of this thing it is good enough to make it a relationship although there are many reasons why you might treat it as an entity we are not going to do that, but for example, if you want to keep track of track maintenance and other things about a track you may want to treat a track as an entity, but since we are not doing all that all that we care about is that there is a track between two stations we could treat track as a relationship let me put it down here to give space. So, it is a relationship between two stations. So, now what are the attributes of a station which you would assume ID name. So, we will stop at that certainly there are many other things you may want to put in if you put it in your design that is fine. What about a train what are the attributes you would expect ID name it is probably good enough for our design. Now, the key thing is that train schedule and there is also this issue of what does a booking relate to we will come to that in a moment. So, first of all let us deal with the train schedule. So, what is the train schedule record says you know at what time does a particular train pass through a particular station and there also a sequence number which is this is the first thing this is the origination point which could be 0 may be for the train then there is 1 2 3 up to the termination point for that train and we also need to record the time when the train passes through a station and we have made some simplifying assumptions here that every train runs every day and it finishes its journey within the same day otherwise you have to keep track you know if you are just storing time of the day that is not good enough you have to know that is it on this starting day or is it on the next day and so on. So, to simplify our life we have pushed all that under the carpet and we are going to store the time in time out and the sequence number those are the 3 things we have understood. So, now how should we do this how should we record this yeah and for that relation we have the attribute time in and time out. Exactly. So, that is the simplest representation although others are possible. So, give this whatever name we want train schedule may be or train station now what are the attributes that is time in sequence number time in and time out. So, I saw at least one or two people had a model this as an entity if you do that then the schedule entity the point where a train enters the fact that a train is going through a station if you model it as an entity it has to be related to a train it has to be related to a station and it has to have these attributes. So, that is the possibility where this is turned into an entity with two relationships on either side and these would be the attributes that is a possible alternative. Although for the purpose of this particular simple design it is not required this is simple. What is the next one passenger booking so, what are the key things in a booking we need to have a train. So, the booking has to be related to a train so, what do we do create a relationship give it any name say booking for that train booking has to also be related to. So, passenger books for the train and for booking we can have the entity attributes like the date coach and other things. So, we could model passengers and entity let us start with that detail and then we will modify it to add that the questions I do not model it, but in reality yes there are good reasons to model a passenger as an entity. So, what all does the booking have you need to know that the booking is from on this train from this station to this station. So, two relationships between booking and station should this be a ternary relationship or two binary there is no particular reason to make it ternary they are kind of independent of each other. What is next you could make a ternary also it does not really cause any harm. It is better to keep them separate conceptually it is better what is it would be related twice to station from and to and then connection to booking destination is the two from is the start source is the from from and to that is why we have two relations with station. I have issue like now we cannot check make a constraint at a particular booking which has made from and to whether the train has stoppage over there or not. Yes. That is a good point. That we cannot model. So, it is necessary that two also supposed to exist. Which exist? The booking from. From and to. So, two if from is there then two must be there. Actually both must exist. Yeah. So, how you can. I am not showing the constraints here to ensure that both the from and to exist and also you cannot have from being related to two different stations. So, that is going to ensure that both the from and the to will exist. How we can put the instance. From and which to. The booking has these two defined for sure. Like for example, we are booking from Pune to Mumbai. Yeah. Okay. So, booking contains the how we can represent this in this particular. So, there will be one instance of this relation. The from relationship which relates this booking to Pune and an instance of the to relationship which links the same booking to Mumbai. Therefore, it is from Pune to Mumbai. Yeah. Both are related to this entity. In the relational table sense, yes, that would be a booking ID in a table. So, coming back to your question. How do you enforce the constraint that this frame actually goes through these stations? So, I have some by the solution also in mind like. Yeah. See, we have instances of this schedule for instance. Yeah. So, what if we, of course, that could be through aggregation what so called. What if we relate booking with the relationship? Yes. That is. That sounds bit complex, but. It is possible. However, we have to relate it to two different instances of relationships. Absolutely. From and to. Yes. Yes. Now, how do you make sure both of those are for the same frame? That problem comes. Yeah. So, it is an interesting question. There are some important constraints, but it seems non-trivial to enforce those constraints on the ER model. Yeah. Somehow as attributes in the booking. So, if you were to, for example, you could have a, you can't do it at the entity level, but if you do it at the table level, you can have a train ID and then a sequence, two sequence number, you know, what is the starting and what is the, it is just sequence number for this train. So, this is an interesting thing. So, it's, if you create a table, it's easy to do this. If you do it through this, how do you do it? The constraint, the problem is the constraint. The academic people want to keep these two things existing in separate space. ER model and relational model. Yes. There are distinction and disjointness. Yeah. So, actually another interesting question is if you took this relational model, a corresponding ER model which could have given us this relational model. So, in case you didn't follow the discussion for the others, the idea was in the booking, we can have a train ID and from sequence number and a two sequence number. And because this is a sequence number for a particular train in its root for this train, we can be sure that it's the same train which goes. So, it enforces that constraint implicitly by keeping a, in some sense, a discriminator. So, yeah. So, if you had this top as a weak entity, so you had proposed that train schedule instead of relationship becomes an entity, we could model it as a weak entity and which is dependent on train and it has a sequence number which is a discriminator. So, let me, I think, let's finish this and then I think we are almost done with this. What is left here? After the statement here, all we need is booking ID, coach, seat number and name. I think that completes the design as stated. Now, we will modify it and try to do something else to it. Only time in will be there, no time out, but the station will get sequence number. So, from sequence? No, the problem is, if you take the terminating station, there is a time in. But there is no time out for that. So, we can model the, you know, the last station by having null value for time out. Default will be hard. Null, in this context would mean does not exist, because there is no time when it goes out. Of course, it leaves the platform, but that's an operational thing. As far as the passenger is concerned, that's it, the train terminates there. After that, it's not the same train. It's just a set of coaches which will now become part of another train, maybe. For example, you want to find out all the trains which start with one particular station and ends with particular station. How to find out? So, it requires something value whose time out is this one and time in is, because starting station has time in zero, time out is something and at the end station is like that. So, for that, we require something to be there. Some special value. So, again, from the point of your query, just like we're discussing that, if we visualize query, what query we want to fire on this particular database and then go for the database. So, here we require that. It's also, yeah, you need to know what is, so, sequence number zero could be the starting station, but how do you know which is the ending station? So, you could ignore this and take a max of sequence number for that train and that would be the terminating station, but of course your queries are going to be more complex. So, this is a nice example where from the point of view of the query, it makes sense to specify exactly what you will have here. Anyway, you need to do that. For the first station, you need to specify what is the value of time in and the only meaningful value is null. There's only a non-null time out and correspondingly for the terminating station. So, in this particular case, even if you ignore the query, just by specifying what values have to be in the design in these two end cases, we have already solved the query problem. So, I don't think in this case we needed to worry about the query, but certainly there may be some cases where you have two design alternatives, one of which makes it easier for query. So, that can happen. Suppose I say that it is terminating station, so let me put null for the time out. Yeah. If there is a situation where for in-between station, I don't know what is the time in. At the moment, I don't know. So, still I am forced to put null there, right? Or I don't know the time out. For example. If you created a schedule, I assume you would know the time in and time out. But there are many other alternatives. For example. So, that creates the ambiguity actually. Can I do Booleans? Start station and end station. So, both are Boolean, yes or no? Well, we don't normally give the time, but in this case, we could always have a tag which says that this is the starting. Then we don't have to worry about what is the value, whether it's null or some other value. So, we can look at this and say, if it's a starting station, I'm going to not even look at time in. If it is a ending station, I'm not going to look at time out. That's another option. You could put it here. Yes. You could do that. So, there are, it won't be an ID. See, the problem is. The station ID is the first thing. It has to be a relation. Start station and end station. Can we think of multi-value? Again, that would duplicate it because the train schedule already has that information. But you could create it explicitly to simplify query. Yeah. Can we use that? Yeah. That's the. End station only for one flag we need that. That is true. So, we could. So, there are many alternatives. We could get rid of this. Zero is start and we could do that. Instead of having those two relations from and to, can we think of having multi-valued attribute for representing the same? For this. For station from and to. For the booking? For the booking. No, you can't really have a multi-valued attribute. Then how do you know which one is from and which is to? Then you have to figure out the schedule of the train and it gets more complex. That doesn't make sense. And furthermore, the problem is if it's a multi-valued attribute, the other thing that it relates to is an entity. Therefore, you cannot make it an attribute. It has to be a relationship. So, again, this keeps coming up. Whenever you have something which is relating to an entity, absolutely cannot be an attribute. It has to be a relationship. This should be the form sequence, the form station. The sequence number should be less than the two things. That is, then we can have a multi-valued attribute with the sequence number both. That is certainly possible. So, this relational design in the end, which in the booking you have train IDs from sequence number to sequence number. You can easily impose a constraint that this is less than that. So, that could be a valid thing. Actually, then we don't require two relationships. We can make a single relationship with multi-valued to form and the sequence number. No, multi-value doesn't make sense. But let me answer this question in a different way. Now, note that both of these relationships are total and many to one. When we convert to relations, what we will end up doing is in booking, we would have a from station ID and a to station ID. But if you want this, the sequence number to come in there, we could have maybe used this alternative design, which was suggested, which I am starting on a new design. This one is complete. Sir, for a termination station, from and to are equal. For a termination station, time out, time out time should not be null. For a termination station, in time and out time are the same. I don't know if that makes sense, because what we have done in our design is assume for intermediate stations where the train does not stop, the in and out are the same. For the termination station, we should not use that convention. There are different ways of representing data. We have to state the conventions. Some of these are not stated in the ER diagram. We have to state it separately as these are the other constraints in there. For example, the constraint that both of these things have to be on the schedule of this train. You can state that in English, but you cannot represent it using the constraints available on the ER diagram. This is a more complex constraint which you can only state in English at this point. You can't use a standard notation, but that backup is always there. UML also does this. You can always specify constraint in English in UML. That is acceptable. As you can see, there are going to be things which... What do we have after all? We have a handful of constraints, which are standard constraints. Not all constraints can be enforced using these few things. Anything else can be either in English or predicate logic or whatever. You specify it somehow and state it along with the diagram. That is one solution to this particular problem which he had raised. How to enforce that both of these are on the route of that train. This is before this on the route. Those are all constraints which we can do in English. If you want to try to enforce it through other means, this alternative design where we create an entity here. Let me redraw this below. Let's call it train stop. Sequence number will be the discriminator. Time in, time out. This is a weak entity. It has to be identified by a strong entity, which is the train. We will have to give this some name. Let's call this TS or something. For lack of space, I am abbreviating and time also. We are running out of time. The train stop obviously has to be related to a station also. Train is enough to identify uniquely the train stop. This could be the identifying entity. That is a double diamond. The train stop is identified by a train and the sequence number. But it also has to be a station and that also has to be a unique station. This is one way of doing it. Again, there is another constraint which we have not enforced that consecutive sequence numbers should be connected by a track. We have not enforced that anywhere. There are many, many constraints which we are not enforcing. This from and to will go to the stop now. Now the proposal is that from and to relate not directly to station. Now what happens is, when you fold these into this thing, you are going to use the primary key of this. What is the primary key? ID and sequence number. In the relational schema, you will get from that. You will have actually a train ID. You will have that as one train ID. If you just do straightforward mapping of tables, you will not have one but three train ID from sequence number, train ID to sequence number. Why are these three coming from? They are coming from here because this is also, by the way, I didn't show the constraint. This also has to be total and mini to one. From this, you will get train ID. From this one, indirectly, you will get train ID sequence number which you will call from sequence number. For this one, similarly, train ID to sequence number. Now we have a constraint that all these three have to be the same. This is not expressed in the ER diagram explicitly but we can write it in English that all three have to be the same. Therefore, we can delete those and keep just one copy. If we started from here and used this extra constraint to remove redundant attributes, we could have landed up with this final design. There are good reasons maybe to use this design over the previous design. But again, it's a trade-off. We did all this work to be able to enforce one constraint. But we have still not enforced some other constraints. So there are trade-offs there. Yes, a particular station can be a halt of two different trains. The sequence number will vary. Yes, that is fine. Then the sequence number is derived from the train ID? No, this is not for the station. The sequence number is for a particular train. So this is a weak entity identified by that. Train. So each train will have its own sequence number, which is linked to the same station. So the same station can have different sequence numbers for different trains. That's not a problem. Good. Any other questions? So let's move to the next one, which is ER diagram for this program. Modeling, what all? Resource centers, center coordinator. This is actually a relatively easy one. So instead of again doing everything on the board, let me just show this. So we have, wait, something is missing. Okay, fine. So we have faculty who is from an institute. So institute is an entity. Faculty is an entity. Resource center is also an entity. And it has a capacity. That's the only thing we are modeling. But actually you will have to have an ID. This diagram is incomplete. So I'll go beat up the RAs students who drew this. I'll beat them up after this thing. But there are a few corrections here. Resource centers certainly should have an identifier. In fact, all of these should have an identifier. So this should be underlined faculty ID. So if you can note this, let us please note these corrections. We will put up, cleaned up one later. So faculty is all of you center coordinators for a particular resource center. Now this diagram is allowing some flexibility, which may not really be required. So among the flexibility it allows is that a resource center need not be an institute. So you could have a resource center in some reliance somewhere, which is not even an institute. This also allows the faculty to be a center coordinator for a resource center, which is not their institute. So this gives you the flexibility. If you didn't want that flexibility, there are alternatives to enforce that resource center is an institute. How would you do that? How do you make sure a resource center is an institute? But for example, if institute is and there is specialization with the resource center. We can't make any difference if it is an institute. Reliance might be treated as an institute. We can define that as an entity which has some sort of common name and ID and other things. Whether TechMindra is the institute, we can treat this as an institute. So in a broader sense, is it difficult? We don't want a faculty to be from a reliance world, for example. So we may want to have a case which is a real institute, a college, some other institution, and a resource center, which could be... So we could even link a center coordinator to an institute. So the assumption is that some of these institutes are resource centers. And which ones? Only those which have a center coordinator. Or maybe we can have an extra flag here which is a resource center. So there are many alternative designs. But in this one, let us just make sure there are no more errors here. This would be a primary key. There would be an ID here. The attendance here being many to one makes sense. You cannot attend in more than one. And you don't have to attend. Therefore, this constraint is fine. You are from an institute. So you may want to add an extra thing that every faculty member has to be from an institute, if you wish to. But then you may have some people who are freelancers of some sort, not associated institute. So if you want to allow that, then you will not make this total. You can leave this as is. So those are decisions you would have to make. So for our design, this is reasonable. And faculty attending a resource center has a registration number. Now what this is not modeling is the fact that you may have multiple courses. This is all for one course. If you have multiple course, the same person may attend a resource center for different course. So actually then this is not a resource center. This would be a course offering in a resource center. So then you would have a resource center which is a separate entity. This would become a course offering like a section. Similar to that, these are all sections of a course which happen to run at different resource centers. And then you would register for a section and the person is a co-ordinator for a section. So that would be another alternative. So depending on what your model will have many different variations from this. This design is okay for the specification, but it may need extension. Any questions on this? There is also a solution here for the other one, the railway, which is basically the same thing. But I think there are a few, one or two mistakes in the constraints here. So again we will fix this also before you take a look at it. You will note from train to station, what's the problem here? There's a constraint error here. But this constraint says that a train can only be there in one station. So this arrow should go. Please note that. And there was one more. Track, what is the problem here? Track is many to many. Each station may be connected to many different stations. So this is again many to many, which means this arrow should not be there. So these two correct. We'll make the corrections on this diagram. I think the rest is okay. From to, this is exactly as we drew it. For train, all the rest is fine. So any questions about either of these designs at this point? We are almost out of time, but I want to give you a very quick demo of the diagram editor, which is, it runs on Linux windows everywhere. And it has a lot of modeling tools. We always say that any system we design, it's part of some other system, right? A subsystem of something. So everything is a subsystem actually. So can you reflect on that with reference to any of those? To these designs? Any designs. Or you want to comment on that? I'd be happy to hear more from you. I mean, it's perfectly meaningful. Then usually in analysis and design type of the exercises, we usually represent the system boundary, saying that this is a subsystem, which will ultimately fit into some bigger picture. Like saying this ISTE, if you want to model it, it will fit into the bigger academic program of IIT, Mumbai or something. So that needs to come out in the ER model at some point of time, I think. So that will actually enable students to understand the scope of the system and also understand that it is not the standalone system which they should make work, but it is in the context of something that it should be working. Right. So at this level of teaching here, we have not got into this detail, but yes, if you already have an existing system and you're adding something to it, you need to know that this is part of a bigger system. And then you'd have to note that maybe this ER diagram even is partial. So it assumes that there is something called instructor, which is part of the main ER system. So you're only showing a part of it. And then you have to define what is the scope of this part, what all it deals with. So absolutely. If you're doing it as a part of a larger design, you have to scope out what is part of this and what is not part of this and can be used from the existing design. I think we are getting some good software engineering insights. Thank you. Okay. So sorry for that wasted time. So this is the DIA editor, diagram editor. Now what DIA has is support for a number of different modeling things. So this has flowchart, UML, ER. So what we are going to do for our purpose is use UML class diagrams to represent entities. So let's class 1 attributes A1. We are not going to give types and so on. So visibility, we are making it implementation. Otherwise in UML it has this plus minus in sharp notation up front. This is a hack, but if you want to underline it, the scope in UML it underlines it. So there is one more, A2 which is again implementation. We don't want to see operations. We want to see only attributes. There is an okay button at the bottom which is kind of hidden because of the screen resolution. So what we have now is A1 which is underlined and A2, the two attributes of this. This is what we have been seeing. We can copy it and then modify this as we want. Now then let's leave this name alone for now. We can edit it later. Now the next thing we need is relationship between these. Unfortunately, I think there is no relationship here. This has something which is not really a relation. I don't think this quite works. I think this doesn't allow you to put a name inside it. So what we found useful is to go to ER and create a relationship. We can call it whatever we want, R1. Now we can have connectors between these. We have the property, no arrows. Similarly for the other one, create a connector. This is just the line. Although actually in the ER model, I am not seeing it here. This is another option. So this is actually a better way of doing it. So let's delete that. Now we can have properties. Is it total? It says no. Now I change it to yes and you get the double line for a total relationship. Anyway, so we can clean this up and we can create other things similarly. Now one other thing is how do you make this an identifying relationship? Unfortunately that is... So what else? If you want to do specialization and so on that is there in the UML tools. So this is a fairly convenient tool to draw diagrams. Some of you have used VCO that also provides similar properties. Students tend to go and use the simplest tools. They'll use PowerPoint or some such thing. And what is wrong with that? The problem is that everything here becomes a diagram. So instead of having this, we conceptually something which Daya understands. You're going to have to draw boxes around attributes and so on. Yeah, the semantics is lost. Whereas Daya and VCO and other similar editors have actually created all these things with associated semantics so that you just click on certain options and immediately the diagram gets cleaned up. It shows you what you want. So it's not actually lines but it's a higher level. So the nice thing about Daya is it's free unlike VCO which you have to pay for. Actually I have one question to ask. Yeah. Suppose the ER model that we are developing for a case study is quite complex. So we would like to develop it as a series of models, right? Refining certain... So is there any way of checking the consistency of... Yeah. So that's another good question. In this... You are breaking up ER diagram into many pieces because it's too big. One of the things which you should do is you define the attributes of a class in one place. In all other places simply use the class. Don't show the attributes. Anyway there will be no question of attributes being inconsistent across different things. It's clear that it's an entity but don't show any attributes. Just hide them. Relationships of course should occur only once. You should not repeat them. So there will be no issue there. It's only entities which get repeated. So I think that should take care of consistency. But then how do you break it up? How do you break up a complex thing into pieces? You would like those pieces to be related in some way. Yeah. And then it's easy to piece them together later. I don't know if there's any scientific way of doing this but certainly you would try to break up an organization's ER diagram into things which are related to each other so that the break up is given. Say for example, if I have a person say relation that I... or entity rather I have identified and later on I would like to refine this person into maybe the student and faculty and registrar and things like that. So to make my ER model simple in the first go it is like say developing a DFD or something like that or flow chart which has the simplistic 0th level view and then keep on refining it. So is there any way of having that series of such ER models and maintaining the consistency across these things? I don't know much about this. My answer is I don't know. Because when I look at... If you know somewhere please tell us. No, no, I really don't know. Myself is trying to find out answers for this. In UML you do have ways to tackle it but when it comes to your data modeling there is apparently at the moment people are not looking at it. I'm not aware. Maybe whatever you can do for UML you can apply to this. But I don't know this part of UML so I cannot answer your question. Maybe you can answer it better than I can. Then at the moment I don't know. I have to ponder it from the database perspective. I haven't done that. But I would like to at some point. Like rational rules. If you want to take the diagram and convert it to something else. Or something else or converting this diagram or some error checking parts. Or even if we have a big diagram we can have a scalar diagram. That can be done in rational rules. Yes, that's true. This is only two draw diagrams which is useful for paper presentation or something like that. I think we should not. We are teaching students. We are learning the concept forward engineering or those rational rules tool things really will hide many things. In database course I don't advise using such things. It may make sense to start with something simple because students just learn these concepts and later they can graduate to better tools. My point is that this is a tool for only draw the diagrams. Yes. If you want to make a scalar means if you want to go for a higher version then you must use some tools. That is my point. It probably is beyond the scope of a database course. That's what you are saying. It is a part of the software engineering course. Certainly you want to take these additional issues into account. Any other questions or shall we move to lunch? Lunch it is. Thank you.