 So, let us discuss each of these queries and solutions. If you have proposed alternative solutions, let us know. There are more than one solution for most of these. So, take the first one. Select, this is the solution. Where is the, no that is not what I intended. Yeah, the solutions are available. Are they made visible? Just now they have been made visible. So, you can see it on your laptop also. But, I suggest you focus on the current thing we are seeing over here, because there will be interesting discussions. So, the first one is very easy, trivial. I suppose everyone would have got that. Select name from instructor. So, let us skip that. Now, let us look at the second one. Find the course ID and titles of all courses taught by an instructor named Srinivasan. So, how do you write a query in general? It is good to keep the schema in front of you, whenever you are writing queries so that you know what attributes are in what table. If you are, use the schema a lot sure you know everything. But, since these schemas are new to you, it is worth keeping the schema diagram in front of you. If you want to write now also you may find it useful to keep the schema diagram, which is in chapter 2 of the slides. If you have written it down, that is fine. But, keep it with you to understand what is going on. If people are ready, this one says course ID and titles of all courses taught by an instructor named Srinivasan. So, what are the relevant tables here? Obviously, course ID and title come from the course table. The other part, who has or rather which courses has Srinivasan taught? How do you get that? The name Srinivasan would appear in which table, that is one of them. We are collecting what all tables we need to answer this query. The name is only going to be in the instructor table. The course ID is in the course table. Now, how do you know what course is Srinivasan taught? We have the teachers table. So, we have to join these tables. Do you need any other tables to complete it? No, because teachers has the instructor ID of the instructor who taught it and it also has the course ID of the corresponding course. So, we could join course instructor and teachers. So, we do natural join. So, we have course natural join teachers natural join instructor. If you run this query, it will more or less work. You can write this in SQL. How do you write it in SQL? Select course ID, title from course natural join teachers natural join instructor. That is one small glitch though. You have to watch out for whenever you use natural joins. Natural joins are going to equate any columns with the same name and you can get into trouble if you use natural join with unintended department name and credits. If you see instructor, there is ID name, department name and salve. If you do this natural join, what is going to happen? The department name should be equated. So, what would the query actually give? It will give you course ID and title of all courses taught by Srinivasan which are in the same department as Srinivasan's department. But there is nothing in our schema which says Srinivasan from computer science cannot go and teach a course in mechanical engineering. But you will lose all those courses if you just read it wrong. Yes, all of us have made this. I made the same mistake the first time I wrote many of these. Let us hope that the TAs who did this have done it right. And in fact, they have by the simple experience of not using natural joins. It is safer not to use natural joins. There is an inner join using class, join using which is safe where you specify exactly which columns you want equating. That is safe. So, this query here is joining the usual teachers instructor course. But the join attributes are casually listed. Instructor.id is teachers.id. Course.courseid is teachers.courseid and instructor name is Srinivasan. Now, let us look at this query. Let us understand what it means first of all. The id and names of instructors who have taught a course in the computer science department even if they are themselves not from the computer science department. So, what we need is actually courses in the computer science department and instructor id and name. And how do you link these up? Again teachers. It is in fact basically the same query except the selection condition changes and the output changes. So, this is identical except. So, now you will see that teachers instructor course. This part is identical and course.department name is from that is all we care about. We do not care which department instructor was from and select instructor id and name. I think even instructor dot is redundant here id only. The next query. Find id is of instructors who have never taught a course. So, how do you do this? How do you find someone who has not ever taught a course? So, you can use not in or not exist or except. So, there are several different ways of writing it. So, if you use the except way, what you will do is find instructors who have taught a course and find all instructors. So, all instructors except instructors who have taught a course that is one way of writing. Or you will say instructors where not exist course taught by that particular instructor. We will see this. I think the solution here is this one. So, this one is the not in version. You can also write it with not exist. So, this one is probably the easiest version. Select id from. Instead of where id not in. Yes. This is on set and that is under set. Select id. This minus that is a set minus set. Yeah or if you use not exist you can say where not exist select id sorry select star from teachers where teachers dot id equal to instructor dot id. That is yet another solution. I hope people have given these different solutions. We can put them together as the. No, it cannot be teachers or takes. I am sorry. This should be a section should be a section. This is a mistake. I think teachers. Teachers which courses which are offered. We are not saying if anyone needs to teach it or not. It is possible for a course to be offered as a section without an associated teacher although that is probably a bad idea. The schema allows it and since the English query said courses which are offered we should be using a section relation not the teachers relations. In fact this query solution use the takes relation which is even worse because it only includes courses which are actually taken by some students. We do have courses which are offered and nobody takes it. It does happen occasionally not often but it can happen. Yeah we can probably again if you use intersect there is no need even for this thing because intersect removes duplicates anyway. So, intersect is probably the cleanest way of writing this. What happened? Now the next one was find the id and title of all courses which do not require any prerequisites. So, how do we do this? We have to make sure that this course does not exist in the prerequisite table as the first one the one which requires a prerequisite. It can be a prerequisite for some course we do not care but it should not have a prerequisite. So, again a not in or not exist query is the simplest way but you could also do it using except essentially identical to this query to the previous query rather. So, select course id title course where course id not in select course id from prereq. Pre-req id is the id of the course which is a prerequisite. We are of course assuming that we do not have entries like course id null in prereq that should not be there. In fact, it cannot be there because we have declared course id comma pre-req id as the primary key for pre-req. So, there will be no null values there anyway. The last one is students will not taken any biology department courses. Again we can use the same kind of structure as the previous queries. So, we can make sure that id is not in takes where takes join with course because we need department name biology. So, solution is not giving that. What? That is that is. The solution is totally wrong. Totally wrong. Apologies for not looking at the solution. Yes, I have given select distinct name. So, one minute what is the query what did it want names of students. There is no need for distinct even. So, just select name from student where id we could do it either as id not in or let us do the other one where not exist. So, we have to join takes and course because takes has the course id, but it does not have the department id where takes dot course id equals course dot course id and course dot department id. Name equals and now this is the part we are not yet put any constraint on who is the student here in the sub query. So, we wanted for the outer student. So, takes dot id equals student and that ends the sub query in the rest course. So, again as discussed we can do it using an except the same way. So, we have all students except students who have taken a biology that will simply be a join again or we can use not in all the forms are equivalent. So, we have done I think we have done two of those forms. We have the exist and we have the not exist and we have the not in except is the one form we have not illustrated. So, there is a railway DDL. So, here we have the following queries find the names of all trains which have a stop at Thane. So, what is the schema? So, back to the schema we have train hoards station train and track. So, we want to the query was find things which names of trains which have a stop at Thane. So, that we can use from the train hoards and the station code, but the name is Thane. So, we have to use the station relation and equate the name to Thane and then join this with train hoards on station code and make sure it actually hoards at Thane does not just go through it and what is the condition? It hoards if the time in is not equal to if it is this is just another way of writing that query. We could have wanted the name of the train I guess. So, we have to join with train also the train hoards only gives us the ID of the train. So, if you want the name we have to join with train and train ID equal to train hoards at ID train hoards dot station code. Instead of joining this particular version has used a sub query, but that is not really required. We could have equally well just done a join that is probably simpler. In general it is better to use a join than to use a sub query. Why? Why is it better to use a join than a sub query? If it if the sub query helps to understand what is going on more clearly then that is fine, but if it could have been easily change into a join the issue is that many database systems like in particular PostgreSQL or MySQL for that matter are not very good at optimizing sub queries. So, if you run queries with a sub query on a very large relation they may actually choose a bad plan. Whereas, if you use a join they are going to get a much better plan. Now, it is possible for a good optimizer to say that this query is really equivalent to the join query and it can convert it and good optimizers actually do that, but the optimizers in PostgreSQL and MySQL they are they do a good job of join optimization, but with sub queries they do not do such a great job. So, do not use it unnecessarily for no good reason. Whereas, we are not in then there is a slightly better reason again except could have would probably guarantee a better solution, but sometimes not in is easier to read. So, if you are a one to one option is first write the query with not in or and check out the performance. If the performance turns out to be bad then you can look at how to rewrite the query using except for example. Let us do that before that time underscore in time in. Now, coming back names of all stations which are 20 kilometers from Mumbai. Now, how the hell do you do this? What do we have here? We have station we have track. So, this distance is from one station to the next which are our assumption is that this track shows station which are connected directly by a track they are adjacent to each other. There is nothing in between that was the assumption in this schema we stated that also we discussed it. It turns out that this query I how did this query get in. In fact, if you think about it this is not an easy query. Why is it not an easy query? You can find out stations which are adjacent to assuming this track only has adjacent immediately next station not two stations away and so on. It is easy to find stations which are next to Mumbai which are within a distance of 20 kilometers. But what if this is actually the situation in Mumbai I mean there is no station called Mumbai, but let us say that is Mumbai central. Now, within 20 kilometers of Mumbai central there are a lot of stations. There is a station every 3 or 4 kilometers from Mumbai central. So, there are 4 or 5 stations in there. Now, how do I find out which all stations are within 20 kilometers? How do I find the distance of an arbitrary station? Let us say I do not know Burrivili. How do I know how far from Mumbai central Burrivili is? Does this give me information? How will I find it? From track. How do I find it from track? If track only has adjacent edges you know this is Mumbai central let us call this S 1, S 2 and then there is Burrivili. So, maybe this is 5, this is 3, this is the distance and that is 9 may be. Now, Burrivili is within 20, but the single track does not have 20. Now, let us how do we know the actual distance of Burrivili to the we have to add up these things? Some aggregate. Unfortunately, this is not so easy. This how far do you how many links do you go? So, if you write a query like this, select you know what is track station code 2, S T code 2 from track where S T code 1 well we can say track comma what is track station. Since, we are using the name station code sorry S T code this they are anyway separate 1 and S T code. So, we do not need to qualify them S T code 1 equals S T code and name equals Mumbai central if you wish. So, this will tell us all stations which are directly linked to Mumbai. Now, how do I find a station which is linked by 2 links. So, this one is 2 steps. So, this will give me S 1 and I can get the distance also S T code 2 comma distance. So, I get the distance. Now, how can I get a station which is 2 hops away? We can find things which are 2 hops away by doing another joint. So, what we will do is we will join track with itself. So, let us call this track 1 sorry call it track 1 we have to rename it. So, station comma track T 1 comma track T 2 and let us say this is T 1 dot S T code 1 equal to. So, there are 2 copies of track. So, what this will give me is if I select T 1 dot S T code 2 and T 1 dot distance that will give me things which are immediately adjacent, but there is a track T 2 also which I can use. So, what I am going to do is T 2 dot S T code 1 equals T 1 dot S T code 2 that will what I have done is T 1 dot S T code 1 is you know the station code for Mumbai and the other end of the track is the first end of T 2 dot track and the other end of T 2 dot is what I am going to use here. T 2 dot S T code 2 is this I will get S 2 and any other thing which is within 2 what about the distance? The distance will have to be I have to add up the distances T 1 dot distance plus T 2 dot distance. So, to get stations 2 away I had to take 2 copies of track and join them. So, I want to get 3 away I will get 3 copies and continue this I can repeat this structure however many times I need to get stations which are that much distance away not distance that many hops away. If I want to get things which are an arbitrary number of hops you know what do I do given any single SQL query like this depending on how many copies of track I have there I am limited in how many hops I can go. So, it turns out the query as stated here is actually very hard to answer because it requires what is called the transitive closure operation if because the query does not specify the number of hops. So, there is a all station which are in a range of 20 kilometers you know for all I know there may be a station every 1 meter of course not realistic, but if I do not know anything about the railway system I there may be 1 every meter and then I have 20,000 such things joints which I have to do in Mumbai they are not quite 1 meter, but within 2 kilometers is not at all uncommon 2, 3 kilometers. So, we may have even 7 things in between. So, we need a 7 way join here which is of course very boring to write such long queries. So, there is a solution SQL actually supports recursive queries. So, you have 2 options one is instead of writing a query like this you can write a loop in java or pls2k. So, what we can do though is instead of asking you to solve that we can always update the assignment. So, maybe what was intended here was I think the problem is the who are the TA who inserted this question assume that the track relation included all station pairs not just immediately adjacent ones which is why the query was written that way, but we have made it clear elsewhere that the track is only adjacent one. So, we will modify the query which says. So, now if you assume this is the question let us see if the solution provided with TA matches this. Now, we will change the assignment the database itself is that is not an unreasonable one the queries have to be changed. So, I have already changed this query the solution to it was the first one I gave without doing the extra join. So, we will put the solution in I will just delete this I have no clue what it is doing. So, this is looking for station not equal to Mumbai and track distance. So, this solution intended to be things which are adjacent away on that from the track. Now, what is the other thing doing I am not very sure I think this we made an assumption that if you have a track from A to B there will also be a track from B to A there will be two rows in the database. This query this does not make that assumption it is assuming that you may have this or you may have that and it whichever one is there it will work that is the intention of the query. But obviously, if you do not have both pairs that is if you have two pairs of stations we have stated already in the schema that we will have both pairs. If there is a track from A to B we will have A B distance 10 we will also have B A distance 10 and the goal of that was to simplify query. So, that we do not have to go through all this pain. Let me just delete that whole thing for the moment and we will add the correct one later. I suspect the remaining queries are also very complicated for exactly the same reason that assumption of symmetry was not made in the queries. So, the TS had a hard time writing the query I think the query is correct. If you assume symmetry is not there, but it is unnecessary and we did not introduce to write such complex queries. Therefore, I think we can safely well let us see what is the query I will delete it in just a moment. Find ID of trains which cover Mumbai, Thane and Umbarnath. So, what does it mean to cover this go through these things whether they stop or not is another question. So, maybe it is better to interpret this as cover meaning that they should stop. Otherwise, it is not very useful if it just goes through the station. So, let us interpret this as stop which go through all the stations and stop it. So, this is a not such a hard query, but it is a little lengthy. Now, let us see if the how would you do this. So, you need a train we have the train hoist relation. So, we can check with the train hoist relation if it hoist at a particular station easily with a given name. So, what we will need is let us see if this query is badly formatted. Let us go back to display and see what that solution is giving. So, this could be a good example of a complex little more complex query. We have seen the width clause. So, this is a query where the width clause may be used. So, what we will do for the December workshop is give hints on how to write this query instead of just giving a query like this and say solve it. We will add hints to help people structure their answer and then they would have learnt how to write a more complex query. So, T is just please note this if we will modify this query. So, what we want is a train which hoist at those three stations which requires us to join the train hoist train and station code. Now, we have to do the same thing three types. One is for Mumbai, one for Rambarnath, one for Thani. Now, if instead of writing the join so many times it is a lot easier if you do the following train hoist comma station where train hoist dot s t code equals station dot s t code. So, this is a hoist for that station and we are going to get the name, but we still have to make sure it hoist and time in not equal to time ok. So, this is going to give me all pairs of train ID and station name where the train actually stops at that station. Is this clear? And we have done the join with station here. So, that is the with clause. Now, what do we want select? So, from we will come to what is selected. We are going to have train hoist three times train hoist th 1 train hoist train hoist 2 and train hoist th 3 where now we want all of them to be in the correspond to the same train. This is just one train which has to go through all. So, th 1 dot id equals th 2 dot id and can you read the thing at the bottom. And th 2 dot id equals th 3 dot id. So, we have forced all of these to be from the same train. What we wanted is one at Mumbai, one at Amarnath, one at Thani or whatever that was. So, and th 1 dot station name equals Mumbai. And you can obviously fill in the rest, the other ones are for the other two. And th 2 dot station name equals Thani and similarly, blah blah equals whatever that last one was Amarnath. Is this clear? Are there other ways of writing this query? Anyone wrote this query in any other way? You could of course, have not used this and done all you know there are there is a pair of joints. So, you could have had six things here similarly, each of these would have been replaced by the joint. And then this condition and this condition all through. So, you get a very big query as a result. Whereas, this is much easier to understand. Actually the first this query is basically without the with clause it is doing something very similar. Instead of a joint it is doing something in something else. But this is the first copy as a temp. This is the second copy. So, one was for Mumbai, one was for Thani and the third one was for Amarnath. And then it makes sure that the IDs of the trains are the same. So, this query is correct. It is just that if you use the with clause to write it, it is a lot easier to understand what is going on than that one. Find the names of trains which are covering Thani before the sixth stop. So, there how do you know which stop it is? The sequence number does the sequence number give the stop? It does not. It just tells you what is going through. So, now finding the sixth act stop over there is actually not going to be easy. I do not think this query was also it does not actually do that. Now, it is just assuming sequence number less than 6. So, the English formulation was again messed up. So, if you actually want to find out if it stops, the sequence number is meaningless. The first stop may be sequence number 25 for all we know. So, for a Rajasthani express that is probably true. It is the first thought will be way above high. So, the solution, one solution to this problem is to change it. But I would not work out the detail. What if you did want the sixth stop? If you wanted to solve the problem as stated, how would you do it? I can get two copies of train hoist. Join them for the same train where the sequence number of the first one is less than the sequence number of the second one. So, these are all pairs of train hoist for the same train where the first one is before the second and the first one is actually a hoist. Now, if I group by train ID sequence number for the second one and count how many there are, I can find out how many hoist there are before this sequence number. So, do not understand what I am saying. Pardon me. Maybe I will come back to this later. I do not want to get stuck on this because of one more set of again let us not say covering. Do you want pass through or stop at? Stop at Thane that a reasonable phrasing is that clear which pass sorry which pass through or stop at Thane before the sixth station in the route of the train. If you go back here. So, this query matches the reformulated one. What is it doing? Select distinct name. Now distinct is important because well let us see if it matters. So, what we want is train hoist, train train hoist where train ID equal to train hoist. So, this is simply to get the name of the train and the train hoist sequence number is less than 6 and the code is basically Thane's code. So, this is the station is Thane. It is on the route of this train and its sequence is less than 6. So, this query answers the previous one correct the not the previous one. The new phrasing of the question matches the solution. So, that tends the first assignment. Let us get to that. The third one is find the ID. So, let us get to the solution here. Yes. So, what will happen here is without the distinct. So, let us modify that. So, point is if that person taught many courses in CS that ID and the name will occur that many times. So, if you want to avoid that we could. Yeah, that is a general tip. If the query is becoming very long. So, you can always give short names T I C and then use the alias here. Any other comments on the solutions? Now, let us move on to the second assignment. What I thought was the easier lab. So, unintentionally to have been may be the harder lab unless this one is even harder. Find the maximum number of teachers for any course section. How do you do that? First of all how do you know how many teachers are there for a course section? There is a teacher's relation which has the section and the ID of the teacher. So, how do I find how many teachers are there for a particular section? Group by the section's primary key count of ID. So, that will give me how many teachers are there for each section. What is this query? I want to find the maximum number across all things. Now, how do I do that? So, a simple query will give me the simple group by query gives me a number of teachers for each section. So, one way is to say where this is greater equal to all again of the same thing that is one way or equivalently we could use. We can just take the max of that. Since we just want the number, max is good. In fact, teachers already has course ID. So, there is no reason to join with course. If you see here, it is not doing anything useful. Teachers has ID, teachers has course ID. So, why join with course? So, we will remove that. Usually, it is done to reflect the fact that foreign key constraint. Making sure that course exists. So, that is what some of the database literature says. This particular case, the teachers, because it has a foreign key to course, the course will exist. It will not make any difference at all here. Luckily, it will not make any difference. It does not modify the count. In general, if you do a join, it could mess up the count also. But, you do not want to unnecessarily join with no particular goal here. So, you just want select count ID. So, we do not even need to know which course that was. All we need is the number of teachers for each course section. Again, this is a course or section. It is a section. So, group is also wrong. So, not for a course. If it was for a course, this is okay. Course section's primary key is what? It is course ID, section ID, semester. So, that will give me the number of teachers for that section. But, I think, Alice, we have to write as L. When we run it, it was not working. Which one? When we run it, it was not working. At last, we have to add some as something. Alice. Yeah, as L. Over here. As L. Give it a name. As L. As L. T. No, no, no. T2. Anything. As anything. Good. Good. Yeah. So, I think, there is some requirement in PostgreSQL that, if you just have a Alice we need. Subquery like this, it should be given in alias. Otherwise, it does not have any name. It could have used an internal name. I do not think the SQL standard requires this. But, PostgreSQL has this limitation. I mean, it does not cost anything to add as T2. So, this subquery, from that we are just selecting max T there. So, this is clean. Let us go to the second query. Find all departments that have the minimum number of instructors using a subquery. Order the result by department name in descending order. So, how do you know which have the minimum number? This time, we do not want just the number. We want to find the departments. So, I have created a view first and then. Yeah. So, there are several ways of doing it. You could create, use a with class. We have a similar query in the book slides also. Yes. Use a with class to structure this or you can just repeat the whole thing. It is more clumsy. Now, what is the solution given here? That is not working. What? It is not working. No, no, no. This is completely wrong. Wrong. That is not working. It is the answer is Wrong. It is completely wrong. What is this doing at all? This is finding the, what is it doing? Something. Finding the number of instructors in each department. Yes, yes. It is wrong. It is wrong. But, after that it just. Order. Simply descending. Order. Not working. Whereas the query clearly said. Yes. Only find those which are the minimum number of instructors. Not even doing minimum. What is going on? I think I need to fire all my TAs. Get a new set of TAs or roast them or do something. There are too many mistakes here. Okay. And I should fire myself also for not cross checking all the answers before coming here. Anyway, so. You are with view in subject. So, how would you do it? You would use a, with class would help a lot. So, first of all we, for each department we need to know the number of instructors. So, we can say with dip count or something like that. Department name. Number also. Num instructors. Number. Asselect. Asselect. Department name. Department name. Count star. From instructor. From. Instructor. Instructor. Group by department name. Group by. Well close. That's all. And as they say close but no cigar. This is missing out one special case. What if a department doesn't have any instructor? What is going to happen? We have, whenever you write a query we have to watch out for this one special case. You know. If we have a instructor the department will not appear here at all. So, then if we find the minimum on this the department will not appear. So, if a department is 0 that is the minimum. 0 should be there. But it won't help. It won't help. So, what we can do here? How do we ensure that we get a count of 0? There are several possible solutions. The simplest solution to make sure every department appears is to use department we can say natural left outer joint. Space here. Space. Group by department name doesn't change. Now the one other thing is if I do count star. If a department doesn't have an instructor, count star would be 1. It would not be 0. Because count star will always, there is one tuple. But what distinguishes this from the other things is that the instructor ID would be none. So, if I want to know how many instructors there are in the department, instead of count star I can do count. ID. ID. ID. Yeah. There is only, you don't have to make it unambiguous. It is unambiguous. Count ID. And will the instructor get duplicated? No. Instructors are only in one department. And it will just match here. Exactly. This is the number of instructors for a department. Except for if a department has no instructor, instructor ID would be null. Count of any attribute like this eliminates null before giving the count. Therefore, it would be 0. As an alternative, we could write a sub query which finds the count. That's the other. Both are there in the book. They are described. You can go look it up later. So, there is an alternative. But this one gives us the correct count. So, now with this the rest of your query would be the same. Please. So, what is the rest of the query? Select department name. Yeah. Select department name. Select. Department name and. Department name. And what is that? Number or what? Num instructor. Yeah. Num instructor. From department count, DEPT count. We need to find the min. Yeah. Where num instructor equal to. Yeah. Equal to. Yeah. Select min num. Yeah. Select min. Yes. Min. Min instructor. Yeah. From the same department count. Yeah. From. DEPT count. Order by. Yeah. Order by this. Yeah. That one is over. Order by. Order by department. Name. Name. Descent. Is this query clear? Please note this query so we can enter it later. Okay. So, we are still on query two out of how many more queries? We have almost eight more queries and the descriptions are becoming longer and longer. Okay. I think this assignment was probably a five hour assignment at minimum. So, again we will cut down the questions. I think we told you this was optional. Okay. We said it is part of the disumber lab but I think we will remove this. So, you will not have more than six questions for the disumber lab. Okay. So, now what is the next one? For each student compute the total credits they have successfully computed. This query is actually directly there in the book. Yeah. If you want to look it up. It is easy. And it says do not use the total credits attributes of students. That is a materialized view of some kind which can be computed from the underlying relations but it does not actually match in the book schema. I mean book example schema data rather the number does not match. So, we could run this query. How would you do this? How do you compute the total credits? We have to see which courses they have taken and completed successfully. Meaning the grade is not null and it is not F. So, the takes relation will tell us which students have taken which course and the grade. We can eliminate from their grade rows where the grade is null or F. So, that gives us students who have passed a certain course but we do not have the credits there whereas we need the total credits. So, where do we get the credits from? Course. Course. So, we have to join takes with the course relation. The course relation has the credits. So, we have to join it and then for each student we need the total credits. So, what do we do? Group by. Sum. Student ID and sum of credits. So, that will be the query. Let us hope this one is correct here. Yeah, that is working. That is fine. That is working. Now, it should have been. But not. ID, not name. Null also. Yeah, the null part. Not equal to null that also. Did the thing. Yeah, for each student. So, we can interpret student as show the ID rather than the name. So, let us correct that. Takes joins student.course, where grade not equal to f. And not equal to null. And of course takes.course, id is course.course, id is student.id. Group by id. We actually do not need to join with student here unless we want the name. So, if you want the name we join with students. Otherwise. Otherwise it is not required. Takes itself. Takes is enough. It has the id. So, since the query did not. Well, we can interpret the query as we want the name also. That is better. In which case, but then we have to do one more thing. If we output name here, the query is not quite right yet. We need one more change. What is it? Group by id comma name. Any attribute which appears here also has to appear in the group by list. Unless it is aggregated. Name, id comma name. Some SQL implementations will actually let you get away with this. Because, one course from Srinivasan. How do you do this? We have to figure out who took a course taught by Srinivasan. How do you do that? Find out all courses taught by Srinivasan. Yeah. So, how do you do that? The Srinivasan is in the instructor name. Yeah. So, you have to join that with teachers on instructor id. Select different course id. That is all. Now, you go for students and if it is taken courses equal to sum. Equal to any of this set. So, you could do that. So, you can find students whose id from takes. Actually, we need takes also. Takes and takes. The section has to match the teachers section and then from that we get the student id. So, we can actually just do a join of instructor, teachers, takes and from that we can get the student id. Then, we need to know the number of students. So, on top of that we can do a distinct id. Distinct and constant. So, yeah. Yes. Select count distinct takes dot id from. So, the join is instructor takes teachers. Takes course id equal actually this part is wrong. It is not just course takes course id because Srinivasan may have taught the course once and somebody else Ram may have taught the course another time. So, if you look at only the course id you think that this student has been taught by Srinivasan. But Srinivasan taught it in 2009. This student took it in 2010 when Ram taught it. So, the join of teachers and takes should cannot be just on course id. Did you get the point? What does the join have to be on? It is on the section. It should be the same section. The student took a section taught by Srinivasan. So, what is the primary key of section? How do you know that section from teachers is the same as the section from takes? We need more things here. We want and takes dot sec id equals and takes dot semester equals teachers dot id equal instructor id instructor name is Srinivasan. So, that completes that query. So, again for December I think we will give some simpler queries also. All these queries are hard. They are hard as evidence by the fact that several of our students have goofed up on the answer. So, we better make it a bit simpler than this. We will give some simple queries to start with and a few of these will be retained as harder queries. But with some hints so that it will help the people to solve it. So, we will make all these queries and the solutions available to you before the workshop. So, you can go over it, give your comments. If you feel something is too difficult or it is not clear what is going on, please give us your feedback. So, before the workshop we have something which everybody here is comfortable with. That should be our goal. So, before the workshop you should know the solutions for all of these including alternative solutions. You should be comfortable with it because you will be acting as TAs not TA. You will be the teachers for the lab sessions in that part. We won't even know what is going on. We are not going to be involved. You are the boss for these sessions. That is one possible solution. Let us wrap up in a few minutes. We are behind time. Names of instructors who get the highest salary in their department. How do you answer this? I think this should be hopefully simple enough. Sixth is tough I think. So, this again you can do it different ways. So, this sub query is finding for each department name. This is the maximum salary in that department name among all instructors in that department. So, we do not need the department relation actually. We just need the instructor relation. Yes. Group by department name and find the maximum salary. And also we need the department name because the outer query is going to take instructors from that department whose salary equal to max salary. I have used width function. You used width and made it. Yes, you can do it. This has a small, again there is a small error here which we need to fix. This assumes that max salary is called max which is not quite right. Where did that go? Either we should give this a name outside or give it a name here. So, let us call it max cell. If you just did i2 dot max, what is that? There is no column called i2 dot max. Unless postgresql happened to call that column max. There is no guarantee what any database will call it. If you just say max of salary, it can give it any name. Some internal name it gives. Last one for the day. We are not going to cover these today. Sixth is tough. Now, this one is find students who have taken all courses taken by… Relational algebra is easy, but here… Relational algebra is easy, but here… You can just put in the division operation and you are done. But here I tried and just… So, again if you have access to the book, there is an example. Even the slides it is there which shows how to express the division operator, the for all operator using SQL query. So, it is there in the slides. So, you can take that and map it to this slightly different thing. So, in general how do you do this? This is like a for all query. For all courses taught by… In this case it should… Should we call it courses or… We have to first interpret this correctly. Do we want them to have been taught by Srinivasan? Or if Srinivasan has taught a course, these people should have taken that course. It does not matter who taught the course. So, there is some English ambiguity here. So, we have to first rewrite this to make it unambiguous. What do you think this means? Does this mean they have… It could mean that Srinivasan taught 101. This student should have taken 101. That is probably what… If you will literally read it. If you would say course, then you could say 101. If you say course section, it has to be actually that taught by Srinivasan. So, let us match the query to the answer. What does this answer do? What the hell is the answer? So, this one is ignoring the section aspect. So, that is easier also. Ignore the section. So, we will update this to make it unambiguous. So, they should have just taken the course if at all it was taught by Srinivasan. So, how do we do that? It is always an except. So, first find the courses which were taught by Srinivasan. Take the courses taken by this student and do an except. So, if there is a course taught by Srinivasan which this student did not take, then this will be non-empty. The difference will be non-empty. And that will be the condition here. So, we… Let us look at this here. So, the outer one is takes with… Well, the join with student over here is simply because we want… If you want the name, otherwise that is not required. But look at this not exist part. It is a not exist. Let us take the first part. So, this one is finding all courses taught by Srinivasan. Again, I think there is a redundant join here. We can simplify it. Let us go down. So, teachers and instructor is required. The join with course is redundant. So, now a little simpler. So, can you read this? There is no course dot course ID. That would also be course ID from teacher's instructor where instructor ID equal to teacher's ID. Instructor name is Srinivasan. So, this is giving us all courses that Srinivasan has taught. Now, we are going to remove from this all courses taken by this student, this guy. So, how do we find out what all courses the student has taken? Again, this course is redundant. We are going to remove that. So, what is this query doing? Select course ID from takes where takes T where S dot ID. S is this outer one. This takes. Actually, we do not even need S dot ID. We can do student dot ID where student dot ID equal to T dot ID. So, for this particular student, it is getting us all the courses which that student has taken. So, now if we remove from all the courses that Srinivasan has taught, we remove all the courses the student has taken. If the student has actually taken every one of those courses, the result will be empty. If the student missed even one course, it is going to be the endless. So, the condition here is not exist. Not exist means it is empty. If it is empty, the student has taken every course. Therefore, they should be in the output. So, that is what we do here. And that query is now correct. Any questions about this query? The outer one is joining takes and student to get the ID and name. And the inner one is the sub query is the standard form for the division operator or the for all operation. To take the full set, remove another set and check that the result is not empty. And note also that the second set here would normally have a condition from the outer one. This condition would come from the outer correlation. So, the inner one is being done only for specifically the one outer tuple we are considering at any point of time. So, we will do this for each outer tuple one at a time.