 Now, we will take time for a few questions and what I want to do is actually spend about 10 minutes doing an ER design interactively. Now, if this were a classroom, it would be with a limited number of students, it would be very nice to just have a show of you know let people suggest topics and then I pick one of them. Since I cannot talk to many of you, I cannot hear many of you at the same time, what I will do now is take a few general questions and at some point I am going to say enough general questions at this point I want suggestions for something which we want to model using an ER model. At that point further questions I will take only suggestions for something which we want to design and I will pick one of those and then we will do the design. So, now let us take some questions. This question is related to the schema of the university schema I would like to say. If you look at the section relation of the university schema, we see that most of the columns there are being referenced as the forming teams. Now, we know that forming key exercise does not put an index, does not need an index of its own but it creates something akin to an index. Now, my question to you sir is that with such a design and we also know that when such a kind of a relationship is put into a massive select operation, every time an update or a select or a delete operation takes place, the relation integrity has to be ensured. So, would this design be considered as a good design for a heavy performance point of view? The question is that in the relational design that we have, we have many attributes which together form the primary key and correspondingly in other relations we have many attributes which together form a foreign key. So, now the question is if we have such a design checking if the foreign key constraint whenever there is an update could have a higher cost because we have to look at a number of attributes and see if they all together are present in the other table. So, is it a good design from the performance perspective and the answer to that is if you wanted to focus on just performance there is an alternative which would have been to create a fake identity for every single entity. Which is a key which is a very small key like maybe an integer. So, now we would have a section ID which is different from what we have. In our case the section ID is just enough to distinguish different offerings in a given semester. The alternative which we actually did consider when we came up with this schema was to have a int ID for the section which is unique across all course offerings. And then the foreign keys intakes and teachers would be much simpler because there would be a single ID referencing section. And indeed that is a perfectly valid approach and many database designers use that approach to keep a primary key which does not have a meaning externally. It is just a number which we create inside of the database. So, why did we choose this? No major reason it is just that we did not want to create another fake identifier when there was enough external information to uniquely identify a section which people knew about. We did not see the need to create a new ID which is internal to the database. So, that is choice which we made. We could have gone the other way. I cannot say that one design is wrong and the other is right. Both are reasonable approaches. So, if your focus was on performance maybe the other approach would have been better. Does that answer your question or do you have any follow up? I think we have been disconnecting people because of echo. But let me just answer one more part of the question which was on indices and so forth. So, what indices are created when you declare a foreign key? So, normally an index should be created on every primary key. In addition index should be created on every foreign key declaration. So, that the thing can be efficiently handled. What do I mean by this? If I delete a particular course and there is a section tuple which is referring to this course, I should prevent the deletion of the course. How do I know if there is a section or takes or whatever other tuple referring to this course? So, the way to find that efficiently is to have an index on this table on the foreign key attribute. So, if I have that index I can quickly look it up to see if somebody is referring to this tuple and if so prevent the deletion or if I am going to have a cascading deletion I know which tuples to delete cascading or which tuples to update cascading. So, it is generally good to have such an index and most database systems actually create such an index automatically on a foreign key because it is very important. So, that was the complete answer to the question from Srinagar. Now, let us go to Mahatma Gandhi Noida. Hello. Yeah, sir. Can we have multiple indexing? Okay. Or you want an index on multiple attributes. Yes, you can. Just like we have composite keys with multiple attributes. You can have an index which has multiple attributes. It is not a problem at all. It is standard. Like I said, I am very happy to answer questions later, but at this point I want to do a small exercise on ER design. I only want suggestions. I do not want questions right now. I am seeing a few online music stores. Can you see it here? There are many coming up quickly. Facebook. Somebody has suggested Facebook many times. That is not a bad thing to consider. I am not an expert on Facebook. I do not use it that much. Another good suggestion is Moodle. What if we do Moodle? That gives a little more depth than Facebook. So, let us take Moodle. That is a good suggestion. Moodle is very big. I am not claiming we can do everything in Moodle in a very short time. But let us take a few aspects of Moodle. Now, ideally I should get feedback from you on what are the entities and what are the relationships. So, let us do that. Use chat to suggest what are the entities and relationships which one might use for the Moodle scenario. There are many more suggestions. Good suggestions. I am seeing Aadhar card, railway reservation, online unit test, election commission, stock trading, hotel management. There is a huge, huge number. Time table management, all wonderful suggestions. But let us pick on Moodle now. So, at this point I only want suggestions on what are the entities and relationships that we might use in Moodle. Please stop suggesting other alternatives and just suggest what are the entities and the relationship in Moodle. So, I am seeing a few suggestions. Fourth is an option for an entity. User, provider, I am not sure what is provider, but maybe instructor, quizzes. That is good. Any more? I am waiting to get a few more suggestions. Assignment. Yes, that is an important thing. A post is another. Absolutely. We have something about centers, files, grades. There are quite a few. So, now time tables. There are many, many more suggestions. I am going to switch back to the white board and I would not be able to see your suggestions for some time now. So, what we are doing is Moodle ER diagram and we are going to list the entities. We had course. We had users. Note that we are not differentiating between instructor and teacher, sorry, students at this point. If you are familiar with Moodle, you know that people can have roles. Somebody can be instructor in one course, TA in another and student in a third course. So, we cannot divide users on this basis. That is an attribute of a relationship. Then we have assignments. Then we have posts and we have probably a few more entities. If we think of some, we will get back to it. Then we have relationships. So, what are some of the relationships one could think of? We have a relationship between course, user and we will see what are the attributes of this relationship coming up. So, course user is one relationship. Then we have a relationship between user. So, assignments, we need a little bit more. We have an assignment which is something which the instructor puts up and then we have a submission. So, submission is for an assignment. So, a submission implicitly, assignment submission. So, we have a relationship between, let us first start between assignment to submission. So, which assignment was this a submission for? We also have a relationship between user and submission. So, who submitted this? Now, in Moodle, there is just one person who summits in their model. But as we just saw for a project, we actually ought to have multiple users associated with the submission. So, this is something which Moodle probably did not model initially and we have to work around it by doing some funny stuff. But ideally, it should have allowed many people to be associated, many students to be associated with one submission. But for the moment, submission is only by one user and we will show that in the cardinality constraints coming up. And then there are posts, user to post. But a post is in the context of a course. So, we have a course to post. And for that matter, an assignment is in the context of a course. So, we have a course, assignment. Now, a course in Moodle is really a course offering. You know, they call it a course, but in our university, you have a notation, it is really a course offering. Now, Moodle unfortunately does not have a notion of a course, independent from a course offering. And that shows up in a few ways, which is a little inconvenient. Supposing I have a syllabus for the course, when I create a new course offering, I actually have to dump everything from this course offering and copy it again into the new course offering. So, what do we have there? We have redundancy. But in a way, it is okay, because generally, you know, the course offerings do not happen concurrently in one semester. But if you have multiple sections of a course, it can be a little clumsy in Moodle. So, you have a course offering, but there is no notion of a section. Then there is a notion of a grouping, which is kind of like a section, but it is not quite the same thing. So, it can be a little messy. So, we have done various workarounds. So, currently in this Moodle, there is a notion of groups, and then certain assignments are associated with groups, which are really sections, which are different from the groups of people who together submit an assignment. It can be a bit messy. Some years ago, when I was running CS 101 in IIT Bombay, we had about, I think, 700 students, 600 plus students in the course. We broke it up into sections, but in Moodle, we had to create separate courses per section. Maybe there was no better way then. Maybe there is a better way now. But what all of this reflects is that Moodle, to some extent, was created a piecemeal. It started with something and grew and grew, but maybe a better year modeling had been done at the beginning. Maybe some of the drawbacks of Moodle could have been avoided. Coming back, we have course assignment, submission, posts. I think we have got the major set of relationships. There are probably a few which I have forgotten. Let me see if people have suggested any more. There are a few questions you have been asked. Can we have submission as a relationship between a user and an assignment? That's a good point. We could have Moodle submission as a relationship instead of an entity. Why did we choose this? The reason we chose it is that submission actually persists in the database and it is graded and so forth. There are marks for the submission. There are comments which go back and forth. It's probably more convenient to treat it as an entity and you may want to track which TA graded a particular submission because there are many submissions. Different TAs may grade different submissions. If you treat this as a relationship, it becomes a little more messy. Many times, it actually makes modeling a lot cleaner if you take what looks like a relationship and turn it into an entity. This came up earlier when we had the workshop coordinator. The suggestion there was IPL. This was before all the IPL scam. I noticed that nobody suggested IPL. Maybe all the scam is having an effect over there disregarding IPL. It was really not about IPL. It was about any sporting tournament where there are multiple teams and so forth. There, we wanted to keep track of scores which how many runs the person had made in innings, how did that person get out, who bowled the ball, who caught it, was he stumped, etc. There, we could have had a relationship, annery relationship but that would have been very clumsy. It made a lot more sense to create an entity called dismissal. It is not related to Moodle but let me just mention it on the side. All of you know cricket. You know that a person gets out in a match. You could think of it as a relationship involving a batsman, bowler, cat person who caught it, person who did stumping, person who did run out, etc. But that gets very clumsy. How many participants are there in the relationship? It becomes very big. There, it made a lot more sense to turn a dismissal into an entity and then relate the dismissal to a player through who was the batsman, who was the bowler, who was the person who took the cat and so forth. Coming back, for similar reasons, we decided to treat submission as an entity. Now, what we have is a number of entities and relationships. We move to the next screen and start drawing a few diagrams showing the relationships. So, we had cores. Let us first draw a few entity sets. What would the attributes of cores be? Cores ID which Moodle insists on. Then there is a title. There is a short title also. Moodle actually has some such notion. Then it has start date, end date and many, many more things in there. I am not going to give an exhaustive list. For users, what all does Moodle have? It has a notion of an ID. It has email. It has name. And then it has many, many more attributes. It has, for example, city or country, blah, blah, blah. It can have a photo, many more attributes. Then an assignment. What are the attributes of assignment? Is cores an attribute of assignment? No. That is going to be a relationship. If we convert it to the relational model later on, cores becomes an attribute of assignment perhaps. But at this point, there is a relationship between assignment and cores. Therefore, we do not put it in here. But it must have an ID internally. And I am sure Moodle has some internal ID, some number. In fact, it can be found if you dig a little deep. So an ID which is the, by the way, I am underlining the primary keys for cores ID, user and assignment. What else does it have? It has a title. It has a type. So Moodle has many types of assignment. It could be upload a file. It could be fill in online text. It could be one of many, many more types. It has a start date. It has an end date. It has allow late submission. And many, many more attributes like this. It has attributes like, should we notify the grader? You know, there are zillion things in there. So those are a few of the entities. I will not show the remaining ones. Now let us show a few relationships. We had user. At this point, the entities I am going to show by just boxes without the attributes. We had cores. We had assignment. We had submission. So now let us show some of the relationships. Let us call this participant. I do not know if you can read this. Point size is little small. You probably cannot read it. But that says participant. Then we have assignment is part of a course. Now we have to give a name to some of these relationships. So if you cannot think of a meaningful name, you can just use a default like course assignment. For brevity, I am just going to call it course underscore as. And similarly, submission is related to assignment. So again, what can we call it? Let us call it sub as. Some name, you know, we can come up with better names. You should try to come up with meaningful names. So that is a relationship. But a submission is also related to a user. So submitted by. We have a number of relationships. So let us wrap up this part of the session. We will extend this ER diagram later when we talk more ER features. But for the moment, let us look at what are the cardinality constraints over here and other integrity constraints. Does a user have to be part of only one course? In a number of courses, a course can have a number of users. But what we have missed is what is the role of the user in a course? So we have role. Is the user an instructor, a student, a TA, whatever. This is certainly part of moving. Then we have a relationship between course and assignment. Would that have any constraints? Yes, absolutely. An assignment can be for at most one course. So we will put an arrow head there and it must belong to a course. It cannot exist independent of a course. So let us put a double line there. I think we covered this. This means total participant. Similarly, a submission is part of an assignment and cannot be part of two assignments. Therefore, there is an arrow there and it must be part of assignment. Therefore, there is a double arrow there. Moreover, a submission must be submitted by somebody. You cannot have a submission which is submitted by nobody. So that participation is also total. So, sorry, I do not know why it is not showing up properly. There is a double line between submission and submitted by assignment. A submission must be submitted by somebody. Now in Moodle, a submission is submitted by at most one user. In fact, exactly one user. So in the Moodle system, there is an arrow here. But if we were to redesign this, we might get rid of this constraint and allow multiple people to jointly submit an assignment. Of course, what is the mechanism for joint submission is another question. But it could be done. But in the current Moodle, this is how it is. Are there any other attributes for the submission? Yes, probably there is a time stamp. When it was submitted, it is probably a meaningful attribute for submitted. Sorry, no, no, time stamp is for a submission. I beg your pardon. I started saying that it is an attribute of submitted by. But if you think about it, it does not make sense to have it as an attribute of submitted by. You could do it, but it makes more sense to have it as an attribute of submission. So, let me just cut that out there. Right now, I am only taking questions or comments about the design which we have done so far. We are going to continue extending this design after we study some more ER features. So, I only want to take questions about the ER design for Moodle which we are currently working on. Let us take a few questions. Let us see the institute in those. Let us see if you have questions or comments. Okay, sir. Actually, sir, we are working on the agriculture apps for the Aakash. And for this, we have created the database apps related to the ER diagram. We have some confusion related to the ER diagram that the entity is any agriculture farmer and the relationship with the entity is simply their apps. They bring information and database storage. So, we want to create this application related to the Aakash apps which is based on the agriculture. And we have created the ER based database for this. And we want to centralize the SQL server. So, it is possible we are centralize the SQL server so that the Aakash apps for Aakash tablet can identify the database for the farmers. I am not sure what your question was. You are saying that you are building a database for farmer with Aakash as the interface. That is good. And how do you build a centralized system where Aakash tablets are just the interface? Well, you can use web or you can build apps that you know it is exactly the kind of thing we are targeting in this course how to build such apps. So, we hope that the stuff which we are going to cover will be useful for you. But right now I want only questions about ER diagrams in particular the Moodle thing which we are covering right now. People have any feedback on that, comments, questions. That is what I want to target right now. It is good that you are building an app. It would be very nice if you actually do your project on this to start it off. Or if you have already been working on it you can maybe add some extensions to it as part of the project. Actually, my confusion is that we are creating the entity for the farmers and the farmer is being as entity and for the relationship between the entities and the thing is that all the database to be stored in the archive. So, how can we use the ER design for this, for the entity and for the object. So, if you want to build a system like this, the first step is to identify what are the entities of interest and what are the relationship. So, you are saying it is a system for farmers. So, the first of all even before getting into ER design the very first step is to understand what it is that you are building. What is this system going to do? How is it going to help farmers? So, let us say that you want to keep track of maybe statistics which farmers have generated and maybe incidents and so on. So, there is a notion of a farmer, there is a notion of a particular maybe field which the farmer is showing. Maybe there is information about the crops which are stored and then there is information about the weather, there is information about fertilizers used, there is information about the yield in that particular season and so forth. So, there is a lot of such information around. The question is which of these are entities which are relationships and so forth. So, in this case let us say that field is an entity. Now, there are many things which are done to the field. So, maybe you can have a relationship between field and pesticide. So, you might say that a pesticide is an entity, a field is an entity and you will say that there is a relationship saying that this pesticide was used on this field, on this date. That could be a reasonable way to look at it, but if you apply the pesticide several times in a growing season, you may want to track when all it was applied and how much was applied each time. Now, you could do that by having composite, multi-valued composite attribute which says how much pesticide and date, amount and date and then multiple of these because you may have used amount x on date x 1 and then amount y on date y 1 and so forth. So, you could model that as multi-valued attribute if you want and then you want let us say yield, how much yield this field gives. So, that could be in a particular season. So, how do you model that? Maybe that could be another multi-valued attribute of the field itself or it could be modeled maybe differently. There are alternatives. What else would you want to model? Give me some suggestions. Okay. Yes sir. Actually sir, I have planned out for this absence that the entity is to be the farmers and the entity is to be stored in the form of the relations because we plan out that according to the climates, the farmer can understand what the agriculture crops and everything is to be exposed so that the over the Indian economy is to be increased as the agriculture sector is to be grown up. So, the basically aim is that the database to be stored in the aca-shaps and the entity is the farmer which is connected to the database and all the information that is in the AI diagram we can easily contact the farmer easily getting the information by the aca-shaps. And for the centralized server, we have planned out that we created the one web server in which all the database related to all the farmers all over the India is to be stored in this web server is connected to the aca-shaps. So, this one-to-one relations should be showed that the database is to be stored completely and all the information should be stored in the aca-shaps at the database and it's difficult for me that to connect all the entities to the all the database and gathering the information in the database and retrieving the information in the database for the farmers. So, I will tell you a little bit about this. So, I will tell you a little bit about this. So, I will tell you a little bit about this. So, I will tell you a little bit confusing for this ER diagram to how to getting the information by the farmers and how to retrieve the information to the farmers. That is a separate question which is the interface. The ER diagram only models what is stored in the database. Now, again there seems to be some confusion. You would have a centralized database which tracks everything that you have gathered all the information that you have gathered and an app running on the aca-shaps which interfaces with this central application, central database and applications running on the central database. That is probably how you would build it. The tablet itself would probably not store too much information. Whatever information the farmer is entering should probably be transferred to the back end database immediately. The only reason to store information in the tablet is maybe you want the farmer to have offline access. Maybe when they are in the field with no connection to the net they may want to be able to browse some information and so forth. You may want the app to also have local storage which is available offline. That is another aspect. You may want to do one kind of schema design for the back end database which is all the data stored centrally. You may want to take a subset of this and store it on the tablet. Subset meaning some of the relations and within those relations only some of the rows are stored in the tablet because the tablet's memory is limited. You cannot store everything in a tablet. Then there is an interface what kind of things should you record. You kept mentioning farmer as an entity. There are many entities. Farmer is one of the entities. Like I said field is an entity. The pesticides used are entities. The fertilizer used are entities. They have information. The pesticide has information about it. Fertilizer has extra information about the fertilizer. These are all meaningful activities. Then there is the issue of weather. Now weather is kind of funny. How do you model it in an ER diagram? It's temporal. There are readings at different stations at different points in time. It turns out that the ER modeling approach as originally described in the version we have looked at in the book for the most part does not deal very well with temporal information. So if you have a series of recordings of weather, temperature, rainfall, humidity, whatever, it's kind of difficult to push it into the ER modeling approach. So sometimes these kinds of things which are readings, a set of readings at different times, you may want to just create tables out of it directly because the ER approach is not particularly good at it. Or you could have each reading as an entity. That would work. Then you can say that every single reading is an entity and then reading is at a point in time. What is humidity? The rainfall on the previous hour or day or whatever and so forth. All of that could be one entity. It's a little artificial, but if you want to use the ER model for tracking such statistics, then you may want to create artificial entities like this. That makes life relatively easy. So this entity is related to a particular location. Which location was these readings taken? And the attributes of the entity would be like temperature, precipitation, humidity and so on. So like this, you can take all the data which you want to represent and come up with an ER model. And the initial focus should be on the back-end ER model. And later you can decide what parts of it will reside on the front end. And then you can decide the functionality of the app and eventually build it. So we're going to look at how to build web apps. Akash, you know, an Android app is a little different from a web app. We are not covering it in this course, but there is plenty of material available, including some from the NMEICT program itself on how to develop applications for Akash. That is information. So if you contact the workshop team, they will tell you where to get further information. So let me take some questions from chat. One of the questions related to the ER diagram is, why is there no double line between participant and course? So let me show you the here we are. We have the ER diagram. There is course and user or participant, whatever. It's the same thing. So why is there no double line over here between the user and the course? And it depends on whether you want to allow users to register before they are registered for a course. And in fact, Moodle allows this. So you can have a two-step process where you first login and register to Moodle and later on somebody can enroll you in a course. So initially you are not enrolled in any course, you are just a guest. Moodle does allow this. So we are going to not put a double line there. The next question is, where can we add timestamp to assignment and submission relations? So assignment and submission are in our particular model are entity sets. So we can have time for assignment. We actually have two timestamps. One is start and end. We actually had that already. Assignment had a start and an end. What we did not do was submission itself. So let me see if I can add to this diagram here. Give me a second. Submission is an entity set. Now what are the attributes? Again we could model it using something called weak entity set which we will come up with. But for the moment let's assume there is an internally generated ID. When you make a submission the database system creates an ID. That's probably how it's implemented. Now you want to have a timestamp. Then you may want to have an ID. So let's say for simplicity we have just one file. So there's a file associated with it. And then some submissions allow text to be entered directly. So text and so forth. There may be many more fields for submission. There may be comments back from the instructor or the teaching assistant and then replies to that comment and so forth. So there could be many fields here. So this is the next question is could you let us know why you removed timestamp from the ID. So I did not add it. So what happened is I started drawing timestamp as an attribute of the submitted byte. So I'll just show it to you here. I started adding timestamp as an attribute of the submitted byte and then I realized I was making a mistake and I'm doing this on the fly. This is not set up. I had not thought of Moodle as in one of the things. So we're doing it on the fly. Sometimes mistakes will happen. Sometimes you correct it quickly. Sometimes it may still be there. If any of you find wrong decision in this you can let us know. Another question is is there an entity called login track that I'm logged in in the database? So of course the web app is going to track the fact that I'm logged in but should Moodle actually record information in the database about when all I logged in and so forth? If so should the login be an entity or should it be a relationship and so forth? So if you want to track history of users you could like I said the all problems is to create new entities for everything. So a particular login could be an entity and then it has a login time log out time who is the user and so forth. So yes you could create login as an entity. The next question is total participation and partial participation. So I think we saw this here maybe this question was already answered. User participation of user in over here participant would be total. But we decided that is not so. On the other hand the submission must be submitted by user. It cannot exist without being associated with a user who submitted it. Therefore the participation of submission in submitted by is total. So you add an entity to the module of Moodle maybe that's a typo. So we are only looking at a few modules assignment but there are many more modules wiki and zillion other things quizzes whole bunch of other stuff. So for each of those there would be a corresponding ER diagram. So the next last question which I'm going to take now is why don't we have a user to assignment submission and then submission to assignment. The question is why don't we have a direct relationship between user and assignment. I suppose one could but it would be redundant. We already so if you want to track something about a relationship between a user and an assignment without an associated submission yes then you could have it for example a teacher that is you divide the class up and say that you know group A do assignment 1 group B do assignment 2. In which case you would want to have a relationship between user and assignment which says this user is supposed to do this assignment and then you would need such a relationship. In the module implementation as of now the course so every course student is allowed to do every assignment therefore there is no need for a direct relationship here. The last question is on strong and weak entities sets and I am going to do that and then come back to this diagram and address that. Let me take me one more question. Are there thumb rules for drawing tools available for drawing diagram? There is a tool called DIA which is available cross platform so open source DIA let me rewrite it DIA that's a tool which you can use for doing all kinds of diagram that's kind of open source version of BCO which is a Microsoft tool which some of you may use for ER diagram some called ER there are a few more tools so I put up a page on Moodle which discusses the tool options now each of these tools supports several different ER diagraming notation they support the Chen notation then they support another notation called another notation is really a variant of the UML class diagram your best bet is to go to the UML tool and use the tools they have in the UML section to create classes and then entity is basically just like a class so what we did in DIA is you know a class box which looks exactly like our entity boxes relationships are present in UML so we can use them as is and attributes of relationships are also present in UML we use them as is and then there are a few more things like double diamond which are not there in UML so we so bottom line there are tools if you want to use those for your project do use them you can of course just go to any drawing tool and draw it from scratch but using a tool makes life a lot easier it's much quicker there's a small learning curve but after that your diagrams can be drawn very quickly for example I don't have to draw a box on the right side in any of these tools I can say create a class and then I can pull draws the box of the right side you don't have to pull around with the drawing actual boxes of the right side then you can make connectors if you move things around the lines move around so it's worth using tools to draw here don't do it by hand or using just a plain drawing tool use one of these in particular dia is open source so you can download it for any platform it's top here