 So, that covers the basic set of queries, we are going to come back later in chapter 4 and cover outer joins and other kind of queries. But before we do that, I want to cover modification of the database. So, there is the delete operation, delete from instructor what does it do wipes out all instructors, all tuples from instructor, which is different from drop ping the table instructor, delete from instructor retains the table removes all tuples, if you drop the table the table itself is gone. Now, you may want to do something less drastic and delete only instructors from the finance department. Now, here is a next query which says delete all tuples in the instructor relation for those instructors who are in a department which is located in the Watson building. So, not only do we you know want to delete one department, we want to delete how many of departments are located in the Watson building. Now, if you remember the schema if you have it with you, you will notice that department has a building name. So, we are assuming that each department has a building, a department cannot have two buildings that is not important here. So, each department has one building, but more than one department can be in the same building and we want to wipe out all those departments which happened to be in the Watson building. So, we can say delete from instructor where department name in select department name from department where building is Watson. So, here is a sub query connected with an in clause which lets us choose which department which instructors to delete. Now, here is another query which is interesting delete all instructors whose salary is less than the average salary of instructors. So, how do we do this? Well, here is a simple way of writing it delete from instructor where salary is less than select average salary from instructor. Note that this is a scalar sub query because it returns a single value is used in an expression where you expect a single value the right hand side of less than would be a single value here. So, now average salary of instructor is a single value well technically the sub query returns a single tuple with a single attribute whose value is the average salary. But in SQL you can use it in a place where a single value is expected and it automatically gets the value out of the tuple in compass. So, if you are familiar with type systems in programming languages these two types are different the right side of less than is set having a single tuple with a single attribute the left side is a value, but the type is automatically cast by the less than operation the type conversion happens automatically. So, now if you did this in a naive way if you execute this in a naive way you first take the first instructor check the average salary check if the salary is less then delete that instructor. Now, get the next instructor again compute the average salary now there is a problem if you deleted the first instructor the average has changed. So, now depending on order in which you chose the instructors you may delete a completely different set of instructors that would be very bad situation you do not want that. So, in SQL the semantics of these deletes are slightly different it is not you cannot evaluate the sub query again and again after doing deletions instead conceptually what you do is first compute the average salary and find all the tuples to be deleted based on the average salary at this point no deletion is to be done the same with other thing insert and so on. First compute the set do not actually do anything after you found out which all tuples to delete you can go ahead and do the deletion at this point we are not again computing averages we have already chosen which all to delete using the original average and that is it afterwards we finish the deletion. So, that is how the semantics is defined now it should be clear that regardless of the order in which you consider tuples this does not matter we computed the average already before deleting any tuple. Now, insertion again this shows you multiple syntax variations for doing insertion the one we saw so far was insert into course values and then list the values in the standard order which was done when we created the table so this is what we saw the second one is an alternative where we say insert into course we list the attributes explicitly in whatever order we choose and then we provide the values in the same order now why is this useful because we can now go and modify the table course and add a new attribute let us say a comment or something now what happens to this insert statement the first one will fail because now course has 5 attributes but this has only 4 attributes this one would actually succeed because explicitly given the 4 attributes the 5th attribute which we have not specified will get some default value if you do nothing it gets a value null you can also declare in the table what is the default value if so it will get the default value if you do not specify it now here is another example if you want to insert a value null you can type null over here so if the total credits for a student is set to null I have to type null note that I should not put a quote if I say quote null quote that is actually the string null by just saying null means a null value especially now the insert statement can actually take a query in there and take the entire set of results of the query and insert it into a new table so here is one thing which says instruct into student select ID name department name 0 from instructor what is this doing it is turning every instructor into a student whose total credits value is 0 now this is not very meaningful but it shows you the syntax for computing the results of a query and then inserting them all at once into another relation this is related to the question which somebody had asked earlier on chat which was what if you want to insert a set of values if that set of values already in the database and I can get it by running a sub query like this this is the syntax if that set of values is outside of the database then that is when you need to use either a series of insert statements or a special bulk load facility which database dependent now again the statement here is fully evaluated before any insert happens otherwise we will run into trouble so supposing we say insert into table one select star from table one and let us say the table one does not have a primary key declaration so now if I am scanning table one and I insert a tuple and I continue scanning it to find other tuples when I come to the end I may find the tuple which I just inserted and I will insert that again and continuing the scan I will find the newly inserted tuple and I will keep inserting again and again I will go into an infinite loop inserting the same tuple over and over which is back so the SQL language definition prevents this by saying first compute the set of tuples which you want to insert get that set and then do the inserted one so you will not see the same tuples over and over of course if I had a primary key it would not matter the primary key violation would occur right away and finally updates again there are many kinds of things you can do with updates here is a simple one which gives a salary increase so it says update instructor set salary is equal to salary times 1.03 where salary greater than 100000 what is this doing it is giving a 3 percent raise of salary to all those with high salaries what about those with lower salaries less than 100000 they have chosen to give it a 5 percent raise so the salary is set to salary times 1.05 so that is how you give differential raise to different employees but this particular update is tricky supposing I flip the two statements I first did the one which updated the salary of those less than 100000 and then did the second I think what will happen supposing somebody's salary was 96 or 97000 I give that person a 5 percent raise boom the salary is now more than 100000 now if I run the this first one if I run it second it is going to find the salary is greater than 100000 and it is going to give the one more raise of 3 percent they will be very happy but others are not going to be happy seeing them getting a double raise so that is an error so it is a little risky to write multiple updates like this if you are not careful so in this particular case much better way of writing it is using a case construct a case construct is sort of like the question mark colon operator in C language and what does it do let us just look at the case construct here case when salary less than 100000 then salary star 1.5 else salary star 1.3 the case construct is returning a value now if you know the C language the case statement there is slightly different the case statement executes one of the statements in the case it does not return a value per say although it could be used to return here the case statement returns a value and returns one of these values depending on which of these conditions so if this is satisfied it returns this else that so now look in the main query update instructor set salary equal to case when this then this else this end not that this is very safe I am just doing one scan of the relation depending on which condition is satisfied I give a 5 percent raise or a 3 percent raise it is safer than the previous one it is also more efficient so that is a very useful construct and finally here are few updates which would be basically impossible unless we use scalar sub queries. So here is the squaring if you recall the schema every student had an attribute called total credits or taught under credits which is the total number of credits which they have completed successfully meaning they passed the course now if you see the sample tables which we have provided in the book you will realize as many students did that the total credits do not match and the reason this happened was we you know reduce the number of rows in the takes table to keep it small and then the total credits do not match that so there are rows which we did not put in the takes table supposing I want to update the total credits for all students based on the actual rows in the takes table I can write a query to do that how do I find the actual total credits for each student now let us just look at the sub query that should help you see what is happening to find the credits I have to join takes with course because the credit attribute is not there in the takes only the course ID is there but each course can have a different number of credits I have to join with course so takes natural joint course now takes natural joint course gives me rows for all students I wanted for a particular student in this case for the specific outer level student I am considering here so I am saying updates student s and in here where s dot ID equal to takes dot ID so first of all I am filtering out all the rows which do not correspond to this student now this student may also have taken a course and failed it I do not want to count the credits for such takes instances so I am going to say and takes dot grade not equal to f now furthermore this person may have taken the course but no grade has been allocated yet because the semester is going on so grade is not yet meaningful so grade may be null I want to eliminate those also so I am saying and takes dot grade is not null so what will be left is only those where the grade is not null and it is not so I am going to sum up those credits and the update statement is actually setting the total taught under credit for this student equal to the result of this sub query so this is a scalar sub query which is now being used to update the total credits attribute of the student I hope this was clear but there are a couple of issues here so first of all if somebody has not taken any course or maybe they took a course but it is grade is still null or they have taken a course but they have not passed it in which case what happens this sub query returns an empty set so what is someone an empty set if you recall our aggregate discussion someone an empty set is null count is 0 but some is null so the total credits for the student would be set to null it would not be set to 0 so if you want to set it to 0 what do we do we can use the following case statement you can say case when some credits is not null then some credits else 0 so that takes care of handling the case where if it was null it will be set to 0 so I hope this query is clear so that is it for now but we will take time for a few questions we have a few people with their hands raised let us try NIT var angle NIT var angle we have a question please go ahead first question is regarding deletion how to delete duplicate rows from the table how do you do this it is actually a little tricky because if you delete one row the other will also get deleted because when you say declaratively delete something whatever condition you give all the duplicates will satisfy that condition so the only way to delete duplicate rows is to first make a copy of the rows of in that relation in some other table temporary table so you can say select distinct star from this table and save it up in something else and then wipe out the contents of the table delete all the rows and then insert the remaining rows back into that table that is the only way you can actually do it in SQL now there are tricks which a few databases offer which let you access row IDs and so on which can help you this thing you know pick the minimum row ID among all rows which are equal on all the other attributes so if your database supports those features there are other ways of doing it but in the standard SQL there is no other way of doing it so if you if it does for example Postgres SQL has a row ID so I can delete everything except the minimum row ID for a particular value for the remaining thing so I will delete where the row ID is not equal to select minimum row ID from the same relation where all the attributes are equal so that that is not too hard to do does that answer your question regarding the assignment one like yesterday's assignment question number three can I ask some question regarding that okay the question goes like this find the ID name of instructors who have taught a course in computer science department even if they are themselves not from the computer science department okay so the point of this question was if you did a natural join across all of the relations involved it would make the department name of the instructor equal to the department name of the course we discussed this question during the slides for earlier in this chapter three so the question was basically to make sure you took care of that so does that answer whatever doubt you had or did you have some other doubt what is the difference between delete and truncate okay delete and truncate is not a part of standard SQL it's part of Oracle and delete you know let's you specify which rows you want to delete truncate says just wipe out all the rows of the table so it's just extra Oracle syntax now the way they executed is probably a little bit different but conceptually as far as I know I have not used the truncate very much but as far as I recall it is equivalent to just deleting all there may be some other minor differences but it's not standard SQL when we copy the structure of a database to another structure do the constraints also get copied that's a good question when you copy the structure so the question is how do you copy the structure depends on how you do that now if you you know export the schema and then reload it you can export the schema in SQL which lets you you know save all the constraints including foreign key primary key everything is saved however if you do something like you know create table as select star from another table that's a way to copy a table and create a new table with the same columns and the same types as the first table so this is supported in many database it's wasn't there in standard SQL earlier I think it is there now and it is widely supported create table as select from star from something else the problem with this is it does not preserve the constraints the newly created table does not inherit any of the primary key of foreign key constraints from the original table does that answer your question hello thank you sir okay so now let's move to some other place which still wants a question VNIT in Akpur still has its hand raised so let us see I can see you VNIT in Akpur go ahead if a query involves real class having class order by class and group by class then what was the execution plan of the query processor that means which class will execute first and why yeah that's a good question if there are multiple clauses what is the order in which they are considered so the SQL language defines the order like this first is a from clause that is central that list the relations which are involved so conceptually the first step is to take the cross product now the actual execution plan may rearrange things as long as the result is the same but to understand what the query means first take the cross product or whatever is in the from clause then apply the where clause to filter out rows which don't satisfy it of course a good implementation would actually combine these two steps by doing a join on a condition rather than just taking a cross product now once you've got a set of rows if there is a group by clause that is applied to do a grouping and then the select clause can be evaluated if the select clause has aggregates it would be one rope a group if it does not have aggregates then you know if there are no group by clause and there is an aggregate it will be just one row if there are no group by clause and no aggregates the select is applied to get whatever you have now if you had a group by you could optionally have a having clause so after you compute the aggregate in the select clause you can apply the having clause and filter out those things in the results of the select which fail the having clause and finally if there is an order by clause that is applied to order the result so that is the order in which this happened conceptually now a good query optimizer may actually mix up these steps and to get a faster plan provided the result is the same does that answer your question any other questions from VNIT thank you thank you very much let's see if anybody else has a question there is one from Amrita Bangalore sir can you explain the query find all the students who have taken all the courses offered in the biology department that slide once again okay that's a good question because many people may have been confused by that so let's go back to okay I hope you can see this slide up which says find all students who have taken all the courses offered in the biology department so first of all I hope the query itself is clear the biology department may offer multiple courses certain students may have taken all of those courses I want them I don't want those who have taken 0 or 1 or 2 but those who have taken all the courses which are offered by the biology department so this kind of a query which requires things which have the for all form that is I'm returning students such that for all courses in biology they've taken that course that is what the question says now how do I write that question since SQL does not have for all construct but it has a not exist construct I'm going to use double negation what is double negation here if you say this is true for all things it is the same as saying that does not exist the thing for which this is not true so I want to find students such that there does not exist a biology course that they have not taken so that is the basic way in which you write such queries in SQL now don't ask me why does SQL not have for all construct I don't know the answer to that the sandage bodies decide that but you can work around it although it is a little confusing at first but you'll get used to it so what we are doing is find students such that there does not exist the course in biology that the student has not taken so not exist course in biology is the first part select course ID from course where department names biology which the student has not taken so I want to remove from these courses all those which the student has taken so if I take the courses in biology and remove all the courses the student has taken what is left is the biology courses that the student has not taken if this set is empty then the not exist condition will be true and the student will be output which is correct but if this set is not empty that means there is some biology course which this student did not take then the sub query result is not empty so the not exist condition here will fail and that student will not appear in the output I hope that answered your question let me come back to you on yeah Amrita it's back to you if you have any follow-up question yes sir thank you thank you