 We are going to start today with a coverage of chapter 4 from a text book and then continue on to chapter 5. In the topics we will cover are more of SQL starting with outer joints and then we will move on to JDBC. So, first of all let us look at the concept of joint relations in SQL. So far we have seen the natural joint construct. It turns out that the natural we also saw the joint using construct. Both of these are special cases of more general class of joint relations which SQL supports. In general what is a joint? A joint takes two relations and returns another relation of an output and what is returned as output can be defined in many different ways. In general the output is you take the Cartesian product and then take a subset that is for an inner joint, but for the case of outer joint it is actually a little bit different let us look at some examples. So, we are going to look at outer joints now and we are going to take the following tiny relations as our running example for outer joints. We have relation course which has three courses bio 301 CS 190 and CS 350 and then we have relation pre-rec which has a pre-recs for bio 301 and CS 190 and a pre-rec for one more course CS 347 which is not in the course relation which of course is not possible if you have a foreign key dependency. So, how would you land up with such a relation? Well, we are just calling these relations as course and pre-rec for convenience, but they could be the result of a select on the course relation or a select on the pre-rec relation and then we join the outputs of the select. So, this kind of situation could happen in that case. The pre-rec relation also on the right hand side it has three pre-recs bio 101 and CS 101 for two of the tuples in this case neither of those is present in the course relation. So, what do we see here first pre-rec information is missing for 315 and B course information is missing for 347 it is also missing for the other three, but for the purpose of this example that does not matter. So, now let us look at the first kind of outer join we are going to start with natural outer join and then move to other general kinds of outer join. So, supposing we take course natural left outer join pre-rec. So, first of all the keyword natural with join means that columns with the same name are equated. So, in this case what are the columns with the same name let me go back here the only common in column in common is course ID. So, we want to match things based on course ID supposing we did a natural join of these two bio 301 would match would get an output CS 190 would match would get an output CS 315 on the other hand CS 315 does not have any matching tuple here. So, in the result of a regular natural join CS 315 the tuple with CS 315 would be dropped in contrast if we say natural left outer join of course with pre-rec we get this tuple CS 315 preserved. However, as we just saw there is no matching tuple from the pre-rec relation. So, the result of a natural join has all the columns of the two relations in this case the course ID column is common and the extra column is pre-rec ID. So, if we keep this particular tuple it does not have a matching pre-rec tuple. So, what value do we put for pre-rec ID we do not have a value. So, we are going to set it to null. So, the result of an outer join is defined to have null values for the attributes from the other relation if a tuple does not match with anything. If you look here bio 301 did match with course from the pre-rec relation it therefore, we do not have an extra tuple for it with null values. It is also possible for bio 301 to match with two or three tuples from the pre-rec relation that is fine. It should have at least one match if it does not have any match then we add it to the result with null values for the attributes from the right hand side relation in this case pre-rec. I hope that is clear. There is a symmetric case which is called natural right outer join. In right outer join we take the rows from the right hand side and make sure we do not lose any rows. So, again the basic principle of outer join is we preserve rows regular join loses rows which do not have a match, but there are many cases where we do not want to lose the rows. So, the outer join operation preserves those rows and fills values with from the other one with null. So, in this case since we said right outer join we are preserving rows from the right relation which is pre-rec. Now, remember pre-rec had CS 347 with a pre-rec CS 101, but there was no course tuple for CS 347. As a result all the extra attributes from the course relation meaning title, department name, credits are all null over here. Only the course id is not null because in this case it came from the right relation. So, in the natural a version of left outer and right outer join the column course id in our example came from both the relations. In if there was a match there would be two tuples one for course one for pre-rec both of which have the same value for course id. In the case where there is no match for one of the sides the corresponding course id does not exist on the other side, but because the natural join has just one copy of the column course id it is going to have just CS 347 there even though the course id from course in this case was actually missing there was no course id from course there was a course id from pre-rec and that value comes over here. And finally, there is a full outer join. So, again sticking to the natural version of it course natural full outer join preserves roles from both sides. So, CS 315 from course did not have a match in pre-rec it is preserved with a null value for pre-rec id. Similarly CS 347 from pre-rec did not have a match in course, but it is preserved with null values for title, department name and credit. So, I hope the natural version of outer join are clear, but natural join is just one kind of join we can actually split the join condition into two parts there is a join operation into two parts rather there is a join condition and then there is a join type. So, what are the join types the first join type is inner join this is actually the default this is the one we have been looking at so far. And then we have the three types if you just left outer right outer and full outer the join condition on the other hand is something which we have seen already. We have seen the natural join we have seen using and the last one which we did not see before, but which we could have used already is on predicate. We did not bother with it because for the inner join case you could always put it in the where clause, but as it turns out for left outer join you cannot always put it in the where clause. So, a join operation has to specify a join condition and a join type sometimes these are taken by default when I say a natural join without saying left or right or full outer then by default it is inner join. So, now here are some other examples of a join, but this time not using natural join since we are not doing natural join columns with the same name from both sides are present. And here in this table we are just showing the column twice, but remember that in SQL you would actually have a prefix here. So, in this case clause inner join prereq on clause dot clause id equal to prereq dot clause id we have clause id title department name credits there is a glitch in this table. This should have been prereq id and clause id and not pre re id. So, clause id appears twice this clause id is from prereq and this clause id is from clause. So, you could prefix it by clause dot clause id and prereq dot clause id. So, this inner join is the usual join we have seen except we have written it as inner join on condition clause dot clause id equal to prereq dot clause id. Instead of inner join we could have said left outer join that is what is the second example is clause left outer join prereq on clause clause id equal to prereq clause id. What has happened now? Since it is left outer join with clause on the left hand side clause has to be preserved. So, the clause cs 3 1 5 which did not have a match in prereq is present, but now you will see that the attributes from prereq are all null including the clause id attribute from prereq is null and of course the prereq id attribute from clause id is null. In the natural join case this clause id and this clause id were combined and essentially the non null value was retained for the non natural the normal case with condition both copies are preserved. So, we have already seen this clause natural right outer join and we also have clause right outer join using a course id. In this case it is basically the same as the natural right outer join. By the way all of these slides have prereq written by mistake as 3 re do not worry about that, but this example is something which we saw earlier it is the same thing except it is written using the using clause. So, this is a good time for a quiz there actually 2 questions in this slide the first question is the one which I will enable first then give you a little time for the second question as well quiz question is actually very easy. It says a left R left outer join S and S right outer join R the same if we ignore the order of column. Now, why is this part of ignoring the order of column in SQL the output of any join like this has the columns of the left relation first followed by the columns of the right relation. Now, for the natural join case duplicate columns are present only from where they appear in the left relation they do not appear in the positions that they would have been in the right relation. So, these 2 queries definitely because we have flipped R and S the order of columns is going to be different. However, if we ignore the order of column in both cases we are preserving tuples from R and S tuples if they do not have a match they are dropped. So, they are actually equivalent and the answer is A. So, the question here is which of the following give exactly the same results given 2 relations R A B and S B C. So, the common attribute is B. So, the natural join will equate things on attribute B. So, R natural join S is really the same as R join S using B because the using clause specify which columns are to be equated. In this case it is the only common attribute is B which is there in the using clause. So, that is actually equivalent to the natural join. However, is it equivalent to the next one which is R join S on R dot B equal to S dot B and it is almost the same, but it is not and the difference as we saw a couple of slides down is that the third query R join S on R dot B equal to S dot B it has 2 copies of column B. It has column from R and a copy from S. So, it is not exactly the same result it is equivalent in some sense because the columns have the same value, but there is an extra copy of the column. So, A and C are not really the same A and B are the same and of course, B and C are not the same. So, the right answer is one I will take questions on outer joins in just a little bit after briefly covering view definition. We already saw some kind of view definition using the with clause. There the with clause defines a temporary relation which is available only to the query. A view is very similar except that it is stored in the database and can be accessed just like a regular relation. So, view is defined using a create view as a query expression and once it is defined it is in the database, but what is stored in the database is the name of the view. By default there is no actual content or in terms of tuples for the view. It is only the view definition which is stored. So, note that defining a view is not the same as creating a new relation by evaluating the query expression. What do I mean? Instead of saying create view as some query I can say create table as select star from you know something query expression. What is this latter one do? It creates a table which basically has the same schema same attributes, but it fills up the table with rows from the result of evaluating the query. The issue is the following, the difference is the following though. In the first case when you say create view supposing we change the contents of one of the tables used in the view definition. Automatically the contents of the view conceptually changes because it is not actually stored. It is defined as the result of evaluating the query. If I change one of the underlying relations I re-execute the query I am going to get a new result. But if I computed the whole table and stored it, it is not going to change. It is fixed. So, it is not really the same thing as evaluating a query and storing it. However, there is some connection and I will come back to that in the context of what are called materialized views. So, before we look at that here are some examples of views and some of the motivations for views. So, one kind of motivation for view is if we want to make a relation available to somebody, but we want to hide certain columns. So, in this case what you said is create view faculty as select ID name department name from instructor that is over here. What are we done here? We have simply removed the column salary. Maybe salary is something which we do not want to expose to certain people. For example, this view might be made available to everyone because we want people to know who are the instructors and what is the name, what is the department, but we do not want everyone to see this. Now, how do we use the view? You could use it in a query. So, here is the query select name from faculty. This is the view we define where department name is biology and that is executed. How is it executed? Conceptually faculty over here is replaced by this thing. Think of it as becoming a sub query in the from clause which is select ID name blah blah from instructor that is a sub query and then that is the resultant query which is executed. So, when we write a query on a view basically the semantics is that the view name is replaced by the view definition. Another use of views which is very common in the context of decision support system where we want to pre compute certain things and store it. So, they are actually little different from normal views, but first let us look at what this view is and then we will see later why it is particularly useful. So, this view creates a view of department salary total. Maybe we want to publish the total salary in a department without publishing the individual salary. So, what do we do? Create view department total salary, department name total salary those are attributes as select department name from salary from instructor group by department name. Now, of course, in this case a department with no instructor will not appear in here. If you wanted that we could have written a slightly different query as we saw yesterday. If you recall that query was to select a star from not not select department name comma and then a sub query which computes the total number of total salary in that department. We did it for count yesterday, but a small variant of that maybe I will cover that in little bit coming up. So, the idea is we can write a query like this and create a view and then we can use it. Now, I told you that you can use a view in a query, but you can also use a view to define another view. So, here is one view do not really bother about this whole definition. Just look at the name of the view create view physics fault 2009. Intuitively it is selecting the courses which the course ID, second ID building room, some part of the section information for physics courses which run in fault 2009. So, that is a view definition. Now, why would you do this? It is a toy example, it is not real and now finally, the second view is physics fault 2009 Watson which says I only want those courses where the building is Watson. So, maybe there can be courses with run in different buildings. Note that this is not the department building. This is the building of the classroom in which the course occurs, course runs. So, maybe some of the physics courses run in Watson, some run in a different room. This is a view defined in terms of the previous view. So, in general how do you define what it means, what is the semantics when a query uses a view. The answer is you replace the name of the view by the definition of the view and that is what happens here. So, this may happen this may have to be done more than once. So, over here this one create view physics fault 2009 has a use of a view here and what we have done is replace this one relation name here by a sub query. The sub query starts here and ends here. This sub query was actually the definition of that view. So, we have replaced it. Now, this is the definition of the view physics 2009 fault 2009 Watson. So, in general the semantics is we take a query. If there is a view in the query, we replace the view relation by the expression defining b. But, the result of this replacement may still have other view names present here. It may have been present in the original query. There may be multiple views or the replacement resulted in a new query which itself had a view name. The replacement that is the view definition itself used another view. So, this may have to be done multiple times as long as you find a view name in the query which is not a actual relation, but a view we keep replacing it and we stop when no more view relations are present. So, it is intuitively very simple. Since the SQL language actually allows sub queries in the from clause, if a view name appears in the from clause, we simply replace it by the definition of the view. So, this works as long as view definitions are not recursive. Now, what is a recursive view definition? Why is it useful? We are going to see it little bit later in chapter 5 and those are handled differently. But, as long as the view definition is not recursive, it is ok. What do we mean by recursive view definition? You all know what is a recursive procedure? A recursive procedure is one which is defined in terms of itself could be directly meaning the procedure calls itself. It could be indirectly. Procedure a calls b which calls c which eventually calls a again. So, there is a recursion. Similarly, view definitions can have recursion. U a can be defined in terms of view b. U b can be defined in terms of view a. If you do this view expansion for a query which uses a, the definition of a is put in place. Now, it has a view name b. If you expand that, the definition of b has view name a and then we are back to square one and it keeps going on and on. So, recursive view definitions cannot be handled like this, and non recursive ones, it is very easy to see how view expansion works. So, I think this is a good point to take some questions. We have covered outer joints and views. So, let us take some questions. After that, we will come back to the view updates and then other thoughts. So, we will take at this point, Walsh and introduce. Sir, my question is we can use exist clause in place of in clause and vice versa. Yes, this question was asked yesterday. Let me use the white board and answer that question with an example. So, to repeat the question for the benefit of other, the question is can you use an in clause or an exist clause interchangeably? The answer is any query with an in clause usually can be rewritten using an exist clause and vice versa is a little bit harder, but it is possible. It is not so simple though. So, let me show this first direction. Supposing we have a query, I will keep the query simple. Select star from r where r dot a in select a from r dot a in select a from r dot a in select a in select a. I am just keeping the query extremely simple, but the same idea works in general. So, what do we have here? We have a query outer query relation r and the in clause make sure that value in r dot a should appear in s. Now, we can easily rewrite this query using a join well modulo duplicates. So, ignore duplicates you know it is easy with duplicates is a bit tougher, but the question is with respect to the exist clause. So, the same query can be rewritten as follows. Select star from r where now note that we are checking if r dot a in s dot a essentially. So, we are going to rewrite this query as exists select anything I do not care what it is. I could say select star, I could even say select 1 what is select 1 means select the constant value 1 it does not matter what I put there. From s where s dot a equal to r dot a note that this r dot a over here is actually coming from this r over here. So, the semantics is for a particular tuple of r I am going to execute this query which selects s tuples is the same value of a and outputs tuple with a just a field 1 I could have called it r dot a I could have said star does not matter. The bottom line is that this sub query succeeds if there is at least 1 tuple in the result which means there is at least 1 s tuple with the same value of r dot a then that particular r tuple comes in output that is exactly the same as this query. In this query r tuple comes in output as long as it is r dot a value is present in some tuple in it. So, these 2 are actually entirely equivalent. Now, there is another almost equivalent, but not query which is select r dot star from r comma s where r dot a equal to s dot a. What is this doing? It is outputting r tuple which have a matching s tuple. So, it is almost the same query, but there is a very minor difference and the difference is how many times that particular r tuple get output in the first 2 cases in this case and in this case an r tuple gets output exactly once. Even if it matches many s tuples it gets output exactly once because the in clause succeed only once regardless of how many s tuples like. Similarly, the exist clause is done only once regardless of how many s tuples like, but if you come down here that r tuple is output many times. How many times? As many as there are s tuples like this. So, we might say select this thing that is very close in most cases as long as it is r dot star and r has a primary key. This is actually now equivalent because the duplicates are removed, but it is a little tricky because instead of star if I had let us say select r dot d this was also r dot d and here too I made it r dot d instead of star. Now, what happened? In this case a particular value in r dot d will appear once per tuple. So, if there are two tuples the same value of r dot d that value will appear twice. So, if there are let me repeat if there are two tuples in r with the same value of r dot d that value of r dot d will appear twice in this first query. In the second query also it is going to be exactly the same, but in the third query because we said select distinct r dot d that value is going to be exactly once. So, it is not exactly the same as the previous query. So, you want to make it the same there is a more complex way of doing it by first projecting a primary key also of r. So, instead of distinct r dot d I will let us say r dot a is the primary key. So, I will have to do distinct r dot a comma r dot d and I enclose this in an outer query which says select r dot b from this inner query. So, what we are doing is first select distinct r dot a r dot b in the inner query that will remove duplicate r tuple which were introduced because of multiple s matrix, but if a particular r dot b value appears in two different r tuples those two r tuples will still exist in this division because we included the primary key r dot a in the select distinct. So, they will be two separate tuples as is and then finally, select r dot b will get you the correct answer. So, if you want to preserve duplicate count you have to jump through a few hoops like this. So, this is a small example, but this idea can be generalized. I hope that answers your question. Let us come back to you. Sir, in terms of execution time, in terms of execution time using exist is better or enclose is it depend on the inner query. So, the question is in terms of execution time is it better to use the enclose or the exist clause. The answer is most databases will be able to transform one to the other. So, I would not expect any difference between the in and the exist clause because if you use the in clause it is very easy to transform it to the exist version. Unless it is a very naively implemented database then yes you could see a difference, but my guess is on most databases you will not see a difference because this is a very easy case, but when there are more complex cases transforming is harder and then you might get different execution time. Shanmuga college. Can you give us idea about object oriented database and relational database which will be the best for the real term applications? Object oriented databases are different from relational databases in the sense that they explicitly model objects and then objects have methods and so forth. Then they model inheritance and many other such features. Now, there was a time when there were many companies which were building object oriented databases. This was back in the late mid to late 1980s that long ago. Object oriented databases were considered a very hot topic lot of research in that area, but they did not take off commercially for reasons which basically were the following. There were some technical and some non technical reasons. So, the first thing about object oriented database the motivation for that is was to integrate the database with a programming language. So, the idea was instead of writing an SQL query and creating a result inside getting values back. We are going to see that in a little while after the break we are going to cover JDBC and you are going to see how clumsy it is to get values from the database. It is a bit of a pain. The idea of object oriented databases was to make this easier that it should be very easy to get a value from the database without writing a query. You should just be able to access it as if it is a object and field of an object. So, the original object oriented databases all those companies there were many they all died more or less. I think one or two of them are badly alive, but none of them succeeded because that is not what the marketplace needed. Sure the impedance mismatch was important for certain applications, but many other applications did actually want the relational model and they were not comfortable going over to objects that was one aspect. There was also another issue which was that that generation of object oriented databases lacked many features which SQL databases provided including a good query language which was very important for many people. They did not do concurrency control and other such things very well. So, there were many drawbacks they did certain things there was a lot of interesting research which came out, but the company is died. Now, but the basic idea has gone away it has come back in several forms at some point when J2EE was big the beans and the bean persistence and so on were hot topics for some time which was in some sense renewal of the same object oriented idea. But these things also fell by the wayside for more or less the same reasons they did not have features which most databases need including recovery concurrency control blah blah blah. Today there is a new variant of it which is actually fairly popular and that is object relational mapping which basically gives an object view of data, but the underlying data is actually in a regular relational database. So, if you want to write a query issue or write them in SQL do you get concurrency control recovery yes because you are using a regular relational database, but if your application wants to view data as objects without writing SQL you can. So, these systems are called object relational mapping or ORM system let me just write it down here. So, these systems are called object relational mapping or ORM system and there are several of them the most popular of them is an actually an open source thing called hibernate. So, what is it about hibernate that makes it popular and it is actually very widely used today something which you should learn about and the reason it is popular is a you know it did away with all the problems of the old object oriented databases by using a relational database underneath, but it makes it easy for programmers to write applications without even knowing SQL. So, how does it pull off this trick basically it has a notion of an object. So, an object in Java for example, which is mapped to tuples in SQL. So, supposing we have an instructor relation here instructor relations might have corresponding instructor objects and hibernate can be used from multiple languages let us take the Java that is the key thing the Java version. So, the instructor object in Java is mapped to an instructor tuple. So, on this side it is an object in this side it is a tuple. So, the instructor object is mapped to an instructor tuple how is this mapping done we say that we define the class instructor class in Java and the fields of that class are mapped to attributes of the relation. And moreover there is a way to say get this object with a specified key value and that key value will be the primary key of the relation. So, in this case the instructor relation has an ID. So, in the Java side we will just say get instructor with a given ID and the programmer does not need to look at the SQL query, but this hibernate system actually generates an SQL query which is sent to the database it fetches the instructor data brings it back and fills in the fields of an instructor object and returns that and then the programmer simply accesses the fields or methods of that object. So, it makes it very easy to access a database and it has been popular for two reasons. One is it reduces the amount of code that a programmer has to write. The second interesting reason is that it gives you database independence. Let me write it here. What do we mean by database independence? You can write your program in hibernate and run it on a PostgreSQL database today. And tomorrow you can run the same program on a Oracle database without changing the program at all. All you have to do is change the mapping definition even that more or less works because it is mostly database independent. So, you can just retarget the application to run on Oracle and it will run with no changes. What is the magic? We have seen that SQL for PostgreSQL and SQL for Oracle are slightly different. The hibernate system hides these differences and it will generate appropriate SQL for Oracle or PostgreSQL or whatever else you choose and give you the same object view of data. So, many applications have been built with that. So, recently national securities depository was building a new application and they were consulting Professor Fatak and Professor Fatak told them use hibernate. This is new technology to them at that point. They were not sure, but they went ahead and tried it out and they were very happy with it. And the reason is not just that it was slightly easier to write the application. The bigger reason is that now they could deal with Oracle and IBM and Microsoft and say that look we can run our application on any one of your databases. It is very easy for us to switch and you better bid low for us to select you. So, they had to bid lower and the price that NSBL paid for their database system was a lot less than it would have been if they had written the application using raw SQL. Why? Because then you are forced to write to a particular database. If you write SQL which works with Oracle, it will not work unchanged on SQL server. You have to go make changes. You have to test it again. It is a lot of work and even between Postgres and SQL server, Postgres and Oracle. Every one of these you have to go rewrite stuff. So, that is actually a very interesting reason why hibernate is doing well. So, why is hibernate not replaced SQL totally? Well, there are some reasons. The first thing is that if you are writing queries which access a lot of data, it turns out hibernate is pretty slow for that. It is terrible in fact. So, if you want to write complex queries, you really do not want to write it in hibernate. You want to write it directly in SQL. So, that is the reason why hibernate cannot replace SQL totally for application. Now, the next question is why is hibernate slow? Why cannot it run faster on queries? This is actually a very interesting research problem. This is a problem which we looked at, not originally in the context of hibernate, but more in the context of web applications. We said look here is a database program, a database application which accesses a database. It is very slow because it makes many different accesses to the database. Can we speed it up somehow? So, we came up with ways of actually going and rewriting Java code in order to improve database access. So, this is part of a project at IIT Bombay which we call holistic optimization. Holistic because the optimization is not in the database. It even spans programming language rewriting, Java code rewriting and it turns out that some of the techniques we developed in that context can be used to optimize hibernate programs also. So, we have shown that our techniques can be used. It is not a product at this point, but we have shown that it is feasible. So, maybe if some of these techniques improve over time and become more stable hibernate could be used even in other settings. So, that was a long answer to a short question which went completely out of topic. Let us come back and deal with a few more questions. Yes sir, we have got ideas there. Yes sir, you have given us the proper explanation sir. This also useful for us research also. We are thinking about this topic and relation database and object in the database, which is the best and which is used for real time applications. This topic and we are all thinking to do research here. I will take a few questions from chat. The first question is comparison between view and with clause and the second question is whether views are stored permanently inside the database. These two are related questions asked by the same person. So, as I explained yesterday when I covered the with clause, the with clause definition is not stored in the database. It is only available to the particular query in which it is used. In contrast, the view definition is actually stored in the database and any other query can use that view definition. So, the syntax is almost the same except that the with clause is part of a query whereas the view is defined and stored in the database and to be used for any query. Another question is what are the practical applications of views in real life? So, there are many applications of views. Sometimes you want to take data which is stored in two different relations and provide a joint view on that. So, why are they in two different relations? Maybe because of normalization things got broken up into two relations. Now, to make it easier for people to view the data, you might need to join the two relations and view provides that. So, you can have normalization and still you can have simpler queries which do not have to do the join. The join is already done in the view definition. Then, you can have aggregate views of the kind we saw which may be a little difficult to write, but you can define the view and then let people access the view to get aggregate statistics. In fact, you can go one step further and do what is called view materialization. So, let me use the white board and give this term there. So, what is view materialization? The idea is if I define a view, create view v as some query q 1. So, normal view definition simply stores the text of the query in the database. The view is stored defined by the text of the query. But if I have an aggregate view which may take a long time to execute, it is sometimes useful to actually execute the query and store the result in the database. So, many databases support a variant of this which is they have different syntax, but if you specify materialized here, create materialized view v as q 1, then q 1 is executed and the result is stored as a table in the database. So, now, view v actually has data. If you access v, it will access the underlying data. The problem with this as I said is that if q 1 uses a relation r and I insert something into r, insert into r something. Now, the result of q 1 has changed because r has changed, but the materialized view has an old result. It is no longer consistent with the current state of r. That is the problem with materializing a view. So, you have to update the view to make it again consistent with r. So, how do you do this? You can re-compute. So, I just run the query q 1 again, throw out all the tuples in v and replace them by the new tuples. But this can be expensive. So, many databases will allow you to keep a view which is somewhat out of date. In particular, for aggregates, most of the time I do not care if the aggregate is consistent as of yesterday. I do not necessarily need today's data to be included in the aggregate because I am using it as statistics for some decision making and I do not care about today's values. Yesterday's values are good enough and so every night, you will re-compute the view and it will be up to date as of last night. There is another option which some database support which is called incremental. What is incremental view maintenance? The idea is as soon as I insert a tuple in r, the database will figure out how to update v to reflect that change in r. Now, if you have to re-compute the whole v, that is going to take a long time. So, that is not a good idea at all. So, you should be able to make just the required changes to v to reflect the changes to r. Now, with complex queries, this is actually quite hard. So, those databases which support incremental view maintenance, they put a restriction on the view. You cannot have arbitrary queries. The queries have to be relatively simple and for those queries, they will give you the option of immediately updating v as soon as r is updated. So, views are very widely used for many reasons including all of these. I will take one more from chat. Somebody wants to know what is a semi-join? A semi-join is a relational algebra operator which is defined in terms of the join operation. So, if you just want the formal definition of semi-join, so what is a semi-join operation? It is like the join operation but slightly different and it is denoted like this in the relational algebra. So, r, semi-join s, there are again the natural and the condition theta version. r semi-join s is defined as project on, so let us say that we have relations r whose attributes are denoted by capital R, s whose attributes are denoted by capital S. So, this is capital S. So, this r semi-join s is defined as project on the attributes of r of r join s. At least this is the set version, the multi-set version is a little slightly more complicated. But what is the basic idea? Conceptually whether it is a set or multi-set version, the semi-join, r semi-join s is those tuples in r that match at least one tuple s. Those tuples in r that do not match any tuple in s are thrown out. Now, this was from the natural join. There is also theta version. What is theta denote here? Theta is any condition. Theta could be for example, r dot a equal to s dot b. That is the join condition. Theta could be any condition like this. So, semi-join theta is defined as the same thing, the join theta with join on that condition. So, in SQL terms it would be r join s on that condition. That is what is theta join. So, this is called the theta join or the theta semi-join correspondingly. So, that is a formal definition of a semi-join operation. Now, why is this useful? There are many uses inside of a database query engine for semi-joins. In particular subqueries like let us say the in or exist subqueries are often translated internally to semi-joins. It is an efficient way of dealing with them. And then the not-in queries are translated to something called anti-semi-join. So, this is used to translate not-in queries. I will not give you the exact translation. I will for lack of time, but I just want to give you the intuition that this is what the semi-join operation is used for in centralized databases. Now, in the context of distributed databases, semi-join operations were used to minimize data transfer. Supposing r is in site 1 and s is in site 2. So, I want to compute r join s at site 2. I want the result at site 2, but s is a small relation and r is a big relation. And very few tuples of r join with tuples of s. Then I could copy s over here send s. And at this site, I will compute r semi-join s meaning I will find those r tuples that match one of the s tuples at least. Ship that over and complete the join by taking this result let us say t 1 equal to this. I ship t 1 t 1 join with s that is equivalent to r join s. But the difference is that I have sent over not all of r, but only those r tuples which match at least one s tuple. So, this can be more efficient than sending all of r. So, semi-joins are also used in distributed database system for to optimize data transfer. So, there are many applications of the semi-join operation. Mostly you would not see it if you write SQL, but it is good to mention it because it reappears in the context of query processing. So, let us go back to and take maybe one more question. Sardar Patil institute and very please go ahead if you have a question. How does view help in database security? So, to repeat the question for the benefit of others the question is how do views help in database security? I give you an example where I projected away the salary attribute in the view definition. So, that is not visible to others. So, that is one aspect of security privacy. There is another aspect of views which can project out certain rules. Supposing you know I want to make available to people in one department only information about that department. Let us say you know I have in a university I have details about all students of the university, but I want instructors from computer science to view only data about computer science students not about other students. So, now I can define a view that looks like this. We go back to the whiteboard create view CS student as select star from student where department name equal to to use our sample data it would have to be comm dot psi dot. So, that is the view and now we can give access to this view to people in computer science, but we may not grant access to all of the student relation to people in computer science. Now, this aspect talks of granting privileges and that is actually coming up later in this chapter. So, I will cover that at the end of chapter 4. Does that answer your question? Yes, what are read only views? That is a very good question which actually takes us into the next few slides which is you know I said that a view is a relation and you can use it in queries, but can you perform updates on view? What does it make sense to allow you to update a view? So, let us say that I gave this view which I just showed you which is computer science students and granted it to people in computer science. Are they allowed to make changes to that the view CS student? Now, the view is not actually stored unless it is a materialized view. Let us forget about materialized views for the moment. A regular view the tuples are not stored. So, if I want to make a change to the view CS student actually I have to go and update the student relation. Should it be allowed? Can there be problems? These are the issues which come up and the simplest way is to say no you are not allowed to do any updates to a view and that is fine you know that is a valid option, but there are many situations where you want to grant access and also allow updates of certain kinds. So, before we see what updates are allowed let us see what are the potential problems if we allow updates on view. So, I am going to cut you off and go back to these slides. So, let us come to view updates. So, remember the faculty view which we saw earlier may be I since it has been a few minutes let me just go back and show you the faculty view that is here create view faculty as select ID name department name from instructor the only thing which we have removed is the salary attribute. So, that is the faculty view. Now, supposing we want to allow insert into faculty values some ID the name is green and the department is music can we do this there is no actual set of tuples for the view faculty it is just a view definition, but we can actually go and insert a tuple into the instructor relation which has the effect of performing this insert. What is it that we add to the instructor relation we can add this tuple into the instructor relation ID 3075 name green department music salary what about salary the view faculty did not have a salary column. So, there is no way to even specify it. So, if you want to insert a tuple to instructor what is the salary value and the answer is null as you should if you do not provide a value we could make it null in fact the SQL language provides another concept which is to say that for this attribute you can define default value something. So, for example, let us say salary and if you do not have specify a salary we want to say salary is 0 because maybe there are many people with 0 salary on very people. So, what you can do is in the table definition we can say salary numeric 6 2 default 0 in which case if you insert a tuple like this instead of null the value would be 0. So, this particular insert into faculty can be translated into an insert on the underlying instructor relation. So, this viewer update of a view is translatable. However, it is not life is not always this and by the way many SQL implementation will allow this situation they will allow updates on views which are simple like this. The view was simply a projection view as long as the columns which are removed can be either set to null or they have a default value one of these two conditions is satisfied many SQL implementations will allow you to update simple projection view and similarly, selection view, but there are some other issues in selection view which are coming up.