 So, this chapter deals with more SQL features and the very first topic we are going to cover here are join expressions and outer join which a few of you had questions about. And then we will move on to views which we will do very quickly and then transactions, integrity constraints, little bit more on data types and at the end a bit on authorization. So, what are these join relations? So, as you have seen earlier one of the fundamental operations in databases is taking joints. So, in the original SQL the only way to do a join is to list all the relations in the from clause and then specify the join conditions in the where clause. This worked fine for inner joints, but how do you specify an outer join using this syntax? What is an outer join? Next we will see an example coming up right next to this. So, these are the two relations one is course and one is prerequisites. Now, what I have done here is I have made some changes in the original relations such that I have a course here 315 which does not appear here. In contrast I have a course here CS 101 which does not appear here, there is also CS 347 which does not appear here. So, if I just take course ID forget the 101 just take this one 347 is here, but not here 315 is here, but not here. Now, if I took the join of these two relations on course ID this tuple will match this one the second tuple would match the second tuple. The third tuple here would not match anything and the third tuple here would not match anything. So, the result would be just two tuples. In contrast if I say I want the natural left outer join what this is stating is that I want all courses to appear in the result even if there is no matching prerequisite. So, it is possible that a course does not have a prerequisite. So, this is not an unreasonable situation. So, if it has a prerequisite I want it to be shown with each of its prerequisites. If it has no prerequisites I still want it to be shown, but what value do I give for values from the prerequisite relation since there is no prerequisite they are going to be null. So, that is the meaning of the left outer join. So, see what has happened here this is the natural join version you are familiar with natural join. So, whatever attributes are common are equated. In addition now whenever they are common there are two copies of attribute one from this relation one from that relation. So, what natural join does is it removes the duplicate attribute and keeps just one copy of the attribute. In the case of outer join this is actually a little tricky. So, if you see course ID title department name credits come from the course relation the prerequisite relation has course ID and prerec ID. So, now we had 315 which did not have a prerequisite. So, it is coming as is the attribute from the prerec the prerec ID is going to be null. It turns out that the prerec relation also had a course ID attribute. Now, there was no matching tuple. So, if I took course dot course ID it is here prerec dot course ID will be null because there is no matching tuple, but since I did a natural join the two copies course dot course ID and prerec dot course ID are actually combined into one. In this case one of them is 315 the other one is null, but still it is combined and the non null value is kept here that is how natural left outer join is defined. If I use the other versions of left outer join which we will come to in a little bit it is going to look a little different, but before that the natural right outer join is similar except the right input in this case prerec is preserved. So, what are the attributes of prerec course ID and prerec ID those are preserved. So, CS 347 101 appears in the result what about the other attributes from course which are title, department name, credits well they all have to be null over here. The first two tuples again are the same as in the regular join and then there is the full outer join which is a combination of the two. So, it has the inner join the normal join results here then this one is the left outer join extra tuples these are the extra tuples from the right outer join. So, full outer join preserves both inputs. So, that is conceptually what is a natural outer join outer joins can also be defined using join conditions I will show you the syntax and then we will see some examples. So, in SQL the syntax is as follows you can specify a join type and a join condition. The join type could be inner join which is a normal join left outer right outer or full outer the join condition could be natural which means equate all attributes whose names are the same in both sides. It could be on predicate. So, you can give any predicate on whatever condition you want and then there is a last form using which lists a set of attributes and what it requires is that these attributes have to be in both sides and they have to be equal. If there are some more attributes which are there in both sides they are not equated they are left alone whereas, natural join will equate every attribute which is common. So, using is a restriction of a natural. So, now let us see some examples course inner join prereq on course course id equal to prereq dot course id. So, this is the inner join just the first two tuples. The second one is course left outer join prereq on course dot course id equal to prereq dot course id. Now, note here that course id appears twice because it is there in both relations. So, this one is course dot course id this one is actually prereq dot course id. Now, if you look at this last tuple the course tuple is there in full and all the fields from the prereq both the fields are null. So, that is a case of left outer join on a condition the natural one just removes the duplicate columns. Here is course natural right we have already seen this and this one is full outer join the table here corresponds to full. So, since the only common attribute is course id this is exactly equivalent to the natural full outer join which we saw earlier. So, that is it for outer joins if you have doubts please ask me otherwise I will move on. The next topic is view definition we have already seen the width clause. So, what we have is a query whose result is given a name and in the width clause that is available only in that query. In contrast when you create a view that definition is stored in the database and it can be used by any query which is running on the database. So, the syntax is create view v as query expression. Note that when you create a view v v is a relation it looks like any other relation, but it is not actually stored in the database at least by default there are ways of asking for it to be stored we will come to that later, but by default it is not stored whenever you use v it is going to be computed as required. In fact, it would not even be fully computed what happens in reality is if you have a query using v the use of v is replaced by the definition it is called view expansion we will come to that. So, a view now is basically a virtual relation which is not necessarily stored defined by the view expression. So, here is an example create view faculty as select id name, department name from instructor. Now, what is this view doing it is giving you instructor, but minus the salve maybe salve is sensitive piece of information which should not be revealed because for all of us from government institutions a salve is not a secret it is supposed to be published on the website. However, in other places the salve is usually closely held secret. So, maybe you want to make these fields available to everyone, but not the salve salve is only made available to authorized people. So, that is an example of a view. So, now somebody can write a query using faculty select name from faculty where department name equal to biology what is the database system do it will actually replace this by this expression and then evaluate the query. You can also do other stuff in a view. So, that was that view was for hiding information this view is computing something complex and allowing you to use it without recompute well without rewriting the expression. So, what is it doing for each department it is computing the total salve and making the view available. So, this is a function definition. So, we are creating a library of functions in effect and we can use that function whenever we want without worrying about how it is defined that is the use of this view over here. So, this is a function of authorization if you are authorized to use instructor you could do that. I will be seeing authorization, but it is a possible that you are not authorized to see salary. So, you do not have any authorization on instructor, but you may be given authorization on the faculty view in which case you can write this if you write that it will be rejected even though the result is the same in this case that will be rejected. Now, you can cascade views just like in the with clause we had one thing used in the next definition and so on. So, this is a very simple and do not bother reading the whole thing it is not worth it, but I will just point out that this is creating a view physics fault 2009. This is creating another view using this view in the from class. So, that is all I want to say here conceptually what the query engine does is wherever a view was used it is going to replace it by the definition of the view. So, here this view was used here the view defined here is used here. So, what is going to do is actually replace this by this whole expression from here down to here as a sub query. So, that is actually what the database does. So, this whole thing was put in place. So, that is basically the semantics of a view in SQN. So, view expansion is basically formalizing that idea of replacing you may have to do this multiple times. So, you replace one view by its definition, but that itself may use another view. So, that is replaced if you keep doing this eventually you will run out of views to replace and you will have a regular SQL big expression, but one which is not using any views and then that is evaluated. And once you have a relation in the database you may be tempted to update it. What prevents you from writing? Insert into faculty values 3075 green music. You can write it in SQL nothing prevents you. The question is what does the database do if you try to do such an update? There are several options. One option is the database can say look this is a view you are not allowed to update it you can only read it and that is what the initial implementations did. Later people said look why cannot I allow this person to insert a faculty with the salary set to null. Now, this person does not have access to salary. So, they cannot give a value for salary. So, the if you use the faculty view it does not have access to the salary. Now, I cannot actually put a tuple into this relation because it is not a real relation. I have to insert something into instructor. And if I insert into instructor what value do I give for salary? I can say well let me make it null. This is not a good example. You probably do not want to create an instructor with null salary. But there are other examples where it makes sense to use a null value or a default value. So, some many databases do allow you to do simple updates on a view insertion deletion and updates with certain restriction. Deletion is actually a problem. Since you are deleting a value which you are not even authorized to see. Well insertion does not have that problem. Update is a little easier. So, if I updated the department here that is probably not a problem. But again it depends on the authorization you are given. So, by default most people would be given read authorization on a view. But if you allow update authorization on the view then maybe you will be allowed to do this. But again general thing is do not do this. This is really for database implementers who worry about what if somebody does this, what should we do? Let us define it clearly. But as a user you probably do not want to use these features. It is just going to get you confused. Excuse me. So, if you do not have the primary key contained in that view for that relation then you cannot. Yes, there are many issues like that. If you cannot identify a specific tuple then there are some view updates which cannot even be translated. So, there are there is some information on these slides. But in the interest of time I am going to skip this. If you are interested it is there. There is in fact a detailed set of rules in SQL which define when you are allowed to update views. But my advice is do not even try to update views. It is not worth it. We can have read-only views also. There are some syntax for creating read-only views. That is database. There is no SQL syntax for creating a read-only view. And then you can grant somebody select access on that view instead of granting update access. So, then they can only read that view. So, that is the function of the authorization maker which we will see later in this chapter. Sir, one more. Sir, normally when we say when we speak about views only the definition is stored in the memory. What do you mean by that? Is it the structure? So, let me rephrase that. When we create a view the view definition that is the SQL syntax is stored in the database. It is stored in a special relation. The actual content of that view which you see when you say select star from faculty that is not stored by default. All that is stored is the actual view definition that text SQL query is stored. When we try to refer to the view it has to scan the table and get the. So, what it does is there is no table for the view. No, no, the view has to be made on top of the table something. Yeah, the view is defined on a existing table. So, will it be slow when we try to work with views compared to tables? No, it depends on the query, but bottom line what happens is if you use a view it actually replaces the use of the view by the definition. We just saw an example. You had a query which use the view. Now, that view relation is replaced by a sub query and that is the query which is evaluated. So, the view actually vanishes from the query once you have replaced it. Now, is this going to be inefficient or efficient? Well, that depends on the actual view and the query optimizer and so on. But the fact that you are using a view does not make it inefficient. That is involved in view because this was we have experienced in practical when some project was been developed. So, using views and the content which has been accessed to the particular view. If it is smaller, then no problem. If it is larger, then there was. Yeah, like it depends on the query. So, supposing you had a view like, take the simplest view, take this view, view faculty. Now, maybe there are a lot of instructors. But if I have this particular view, said department name equal to biology. Now, if the original relation has only a few tuples in biology and it has an index on department name, we can actually use the index to find just those instructors quickly. Or if this query instead of saying department name equal to biology, said id equal to 10101. Now, the instructor relation actually would have an index on id because id is the primary key. So, pretty much any database would automatically create an index on the primary key. So, those queries will run very efficiently. Now, it is not computing all of faculty. If you compute all of faculty and then select, it is going to be slow. But that is not what the query engine will do. It will only pick the subset of it which is actually useful, which in this case will be the instructor with department name biology or id equal to 10101. So, the actual plan does not necessarily compute the view, full view. View can be huge, your query may only access a part of it and the query engine will take care of it. But anyway query needs to be executed, whether it is stored query or it is executed directly. So, the efficiency, issue of the efficiency, I do not think comes in the picture. I mean, if query needs to be executed, it needs to be executed. Whether you directly enter the query or you use the query to view. Yes, exactly. That is how I think. I disagree. Okay. Medina's view I have not covered yet. I think the query is going to be executed only once. But when you use a view, every time whenever you use it is like a macro substitution. So, this is a macro substitution. This is the part I have covered so far. Now, there are situations where, you know, the view has to do a lot of computation. So, supposing I have a very large relation, sales relation. And I want to find the total sales of each type of item, each product that is there in the database. And I have millions of tuples even more. And the query says, the view says the group by item and count maybe, how many items were sold. So, this is a view which whether you write a query on the view or write a query directly. Either way, it has to access millions of tuples when you run the query. Now, this is where the query can be inefficient. So, an alternative to having a plain view is to actually materialize the view. So, what we were just discussing is called materialized. So, what is a materialized view? It is a view definition whose content is computed and actually stored in the database. So, supposing I had a query like this. I have a sales relation item and maybe transaction ID, let us say number of those things which are sold. That is a given relation. So, I create a view total sales as select item sum on number group by, this may have other information also. But let us ignore that. So, what I want here is this query result. Now, the sales table is usually very large. It may have millions of rows, but the number of items is usually much smaller. If you go to a supermarket, yes, they have a lot of items, but it is probably of the order of 10, 20,000. Whereas, their sales are probably millions. So, this result is going to be much smaller than that. So, if I materialize this view, if I actually create, I compute this query result and store it. Then, if I write a query on using this total sales view, it can just look it up quickly without reading the whole table. So, that is where a materialized view is useful. Now, there are a few issues with materialized views. And the first is, once I materialize this result, what if somebody goes and adds a few more rows to the sales table? Now, this materialized result here is not going to be consistent with that. The moment an insertion happens here, this is out of date. So, what do I have to do? I have to update this. This is called view maintenance. So, every time this is updated, I have to update this also. But, who is I? So, there are two answers to this. Generally, it should be the database system itself. So, if I tell it to create a materialized view. Now, the exact syntax, I cannot just say create materialized view. Unfortunately, that syntax is not standardized in SQL. Each database has its own variant of this syntax. So, whatever it is, if I specify that, when the database supports materialized views, the database will do the following. It will create this result and store it. Moreover, when this is updated, this will be also updated. Again, there are different policies. So, there is immediate and then there is deferred. Immediate means the moment you update this, this will also get updated as part of the same transaction. And that is the default behavior, for example, on SQL server. An oracle, on the other hand, if you do not say anything, it will not update immediately. It will update when you use this view. So, it is some form of deferred maintenance is the default. Also, in oracle also, you can say do it immediately. So, these things are not standardized unfortunately. SQL does not have a standard for this. But you can use this on most of the major databases. So, oracle supports it. SQL server, IBM, DB2, I believe Sybase, all the commercial ones support it. But PostgreSQL does not. It does not support materialized views as of now. It may get added at some point. So, then if you want to do it, what do you do? You could still create a relation like this. And then you could create a trigger to update this. Every time something is inserted here, the trigger can update this. So, now the view maintenance is done by a trigger which you have written, as opposed to the database system automatically doing it. Any questions? I want to briefly mention transactions. All of you are familiar with transactions. Anyone here who is not familiar with transactions, raise your hand. Otherwise, I will skip the details. So, you are all familiar with it. And now on every database today, by default, every SQL statement runs as a single transaction. So, if you just run an SQL statement, it immediately commits. So, if you want a piece of work which covers multiple SQL statements, you need to tell the database specifically that all of these statements together are one transaction. Now, how do you do that? That is something in the SQL standard, but that is not implemented uniformly. So, there are basically, in SQL, the syntax is supposed to be, you say begin atomic and then give any number of statements and then say end. If you do this, it is supposed to execute all of them. And if you roll back at any point in between, all the updates done up to that point are supposed to be rolled back. The only problem is nobody supports exactly this syntax. But in PostgreSQL, you can say begin. Do not use atomic. Just say begin, semicolon will begin a transaction. Then we can give any number of statements. In the end, we cannot say end. We have to say commit or roll back, one of the two. Commit to commit it, roll back to undo the effect of all the updates which have happened from the corresponding begin. Or the other option, if you are not directly using SQL, if you are using JDBC or ODBC, for example, you can turn off auto commit for a session. How many of you have done turning off auto commit using JDBC or ODBC or any other API? Any of you have done it? No. How many of you knew that you all know what is the transaction? You know that you can do multiple updates and then commit them all together or roll them back? How many of you knew that the default is every SQL statement is a transaction? Few of you. So when you are using JDBC, the default is every SQL statement you give is immediately executed and committed as a separate transaction. If you want this effect that multiple SQL statements are treated as one transaction, there is a way to turn off auto commit. We will see that syntax later when we come to JDBC. It's like connection.set auto commit false. Now once you do that, then the effect is exactly the same as this begin. Subsequently, all the statements which you submit, one after another, will all run as part of the single transaction. And only when you say connection.commit or connection.rollback, will the transaction be finalized? So if you say commit, all the updates are committed. If you say rollback, all the updates are undone. So that's outside of the scope of SQL. In SQL, this is the way to do it. And we will actually be playing around with this on the last day in PostgreSQL. So that is it for transactions in SQL. Now let's look a little bit at the data definition language. We have seen data manipulation in detail. We saw a little bit of data definition at the beginning. Now we will look at a little bit more on data definition. In particular, integrity constraints. Now I hope most of you are familiar with the basic integrity constraints. Not null, primary key. There is also a unique. And finally, the check predicate, which I will cover in a little bit. Not null, you are all familiar. I am going to skip this. Unique is basically similar to declaring something to be a primary key. But primary key has some other implications. Unique simply says it's unique. And one of the other differences for a primary key, you are not allowed to have any value being null. In unique, it's okay if some values are null. This is not very widely used. Not null is used. It's required in many, many cases. Now check is any arbitrary predicate. So for example, in our schema, university schema, we had semester, which is wire care six. Now you could put any value you want in there. We don't want that. We want only a few valid semesters. So there are just four possible values for semester. So one way of enforcing that is by having a check. Semester in and then list the values. So whenever you insert or update a tuple, if semester is not one of these values, the check fails. And that update or insertion will be rejected. That's the idea. So a check clause can be useful this way. Now is there a way to do this without using a check clause? No, not that. Well, you could, but that's not what I was looking for. The goal is that to enforce that semester must be only one of these four values. It will require a schema change. You are familiar with foreign keys? So how will I use foreign keys to enforce this? Yeah. We'll create a separate relation, which we can call for example, semester master, which has all the possible semester values. In this case, just these four values. And now what should you insist that the semester value here must appear in that table? Now how do you specify that? It's coming up in just a bit. If you already know it, it's fine. If not, I'll get to it in a little bit. So that's the other way of enforcing this. So how do you create a foreign key constraint or it's also called referential integrity constraint? Let's forget the details of the slide. Now here is an example of referential integrity. We have created a table course where course ID is primary key and then there's title and department name. Now note this last fellow here. Department name references department. So this insists that the department name in course must exist in the department table, in which attribute of the department table is the primary key because we have not specified anything more here. We just say references department. It means the primary key. In fact, foreign key constraints have to be on primary or unique keys in SQL. So that ensures that I cannot throw junk values for department into course. Similarly, if I had a semester master table, we'll come to that. Now here is another example. Course. It's the same one, but there's a little bit more. This one just says department name references. This is an alternative syntax. I can say foreign key department name references department. This is the same thing. It's just another syntax. But in addition, I can add these two things. On delete cascade, on update cascade. Now what is this doing? So if that department name is deleted, what this is going to do is delete all the courses from that department. Is this a good idea? Probably not. This is just a toy example. If you delete a department, you don't want to wipe out all courses from that department. Then if somebody had registered for that course, you're going to lose that information about what that course was. But this is just a toy example. The other part is useful. On update cascade. Supposing you decided to change the department name for some reason. Then if you update the department name, then the department name in course will also get updated. So this is actually useful in certain situations where the primary key value itself is updated. Then it makes sense to cascade the update. Now for example in IIT Bombay at one point, student roll numbers were updated sometimes in certain situations. That is a bad idea. But assuming we have to update it and we have a relation which says this student takes this course, we better update the roll number tuple in the takes relation and in other places also. Wherever it occurs, we better update it. So then on update cascade would have made sense. Coming back to the check semester, this one, if I wanted to create a foreign key for this, what would I do? In this table, I will say semester, where care six, references, whatever that table is, semester, master, table, etc. And in that table, I will have to insert four values, fall, winter, spring and summer. Now here is some interesting issues with the referential integrity constraints. So here I have a person table with a name, mother, father and an ID. Now what I would like is to say that, father is a foreign key referencing person and mother is a foreign key referencing person. So if I store the father and mother values, they have to be there in this table. That is the idea. Now what if I do not store father and mother values? I can set the value to null. Both of these can be null because I have not said not null. So a foreign key reference here can be null. That is okay. But if it is not null, then that thing had better be present in this table. So both of these should be IDs. If they are not null, there should be a valid ID, which is also in this table. That is the idea here. Now how do you insert a tuple into this table? Well we need to have the father and mother IDs. How do I insert a series of values into this table? One way is to start from the oldest ancestor, insert that first, then their children, then their children and so forth. So if I order the insertions properly, then there would not be a problem because when I insert a particular tuple, the father and mother tuple have already been inserted. But that is more work. If I have a relation already available, I will actually have to sort that relation properly and then insert in order, which is more work. What I would like to say is, here is a set of things. Insert the whole thing. But please don't check for foreign key violation. The moment this tuple is inserted, wait till I am done inserting a whole set of tuples, then go back and check the integrity constraints. If I am allowed to do that, then the ordering is not important. So that is the idea here. So it is possible in SQL, again not all databases implement this properly. But as per the SQL standard, I can give a name for this foreign key constraint and then set, in this case, I have taken these two constraints and declared them as deferred, which means they will not be checked immediately. When the insertion happens, it will not be checked. It will be checked at the end of the transaction. So that is the idea here. Any questions? Suppose I have a very typical situation. Say I am creating library members table and there the members could be faculty or staff, which are faculty or students. There are two separate tables. Now the foreign key needs to refer to either of them. Yes, this is a very good question. I am glad you asked this question. So to repeat this question, there is a very common situation where let us say in the library, in this example, faculty are allowed to withdraw books. Students are in some case even outside institutions and so on. So when I have a particular tuple in the issue relation, the ID there can be a student, can be a faculty, can be one of the other categories. So what I would like to say is this value should reference one of these different tables. That is what I would like to do. Unfortunately, SQL does not allow it. There is a rather complex workaround using table inheritance. So if it is supported in your database, not all databases are supported, but if it is supported, you can create a table which is let us say library member and then student, faculty and whatever other categories can be created as sub-tables of this library member table. And then the foreign key can reference the library member table. So that is workaround in the SQL standard, but implementation is spotty. But that is not pure relational. It's object relational, extension of SQL. So in pure relational, we don't have any... Unfortunately, no. There are a few niggling things like this. You wonder why haven't they put it in the standard? It would be so useful. This is one of the examples. Table inheritance, it's not very standardized. Only a few databases support it, and each one which supports it does its own syntax and semantics. It's a mess. But the major ones, I think Oracle DB2 certainly support it. I'm not sure about SQL server. PostgreSQL has a type system, but I don't... Oracle certainly supports. Now, here is another example of a retent in SQL which relates to the schema we have. Now, I don't know if you've seen this relation, but in our university schema which is used in this book, there is a relation called time slot. And that relation has a time slot ID and a specific time. So for example, a particular time slot ID one, it'll say it meets on Monday 8.30 to 9.30, so there are different rows in this table. In the time slot table, there are three or four entries for the same time slot one. Then there'll be again three, four entries for time slot two and so on. So what is happening is this time slot ID is not a primary key of the time slot relation because there are multiple entries. Now, what I would like to say is each section has a time slot. What time does this section meet? Which time slot doesn't meet in? I would like to be able to say that this section meets in time slot one. I can store that, but I would like to say that this is a foreign key referencing the time slot relation. This value one here had better occur in the time slot relation. Unfortunately, in SQL this is not possible because in the time slot relation, time slot ID is not a primary key. It can occur multiple times. So in the time slot relation, I have time slot one, Monday 8.30, 9.30, time slot one, Tuesday 9.30, 10.30. And in the section relation, one of the fields is time slot. There are of course other fields and I'm going to say for a particular time slot one, another one is time slot two. What I would like to say is that this value here must occur in the time slot relation and that's a foreign key or to be a foreign key. Unfortunately, in SQL a foreign key can only reference a primary key over here and in this relation, this is not a primary key. The value one is occurs many times, so it's not a primary key. Therefore, I'm not able to create a foreign key reference. Why don't we do two number to the second time slot? No, I want to say that time slot one meets at these three times. In this case, two times. Twice in the week it meets. So there is another way of doing it which is again not very satisfactory. I could have a time slot ID relation which says time slot one, two and so on are valid time slots. And then this one time slot here can be a foreign key referencing time slot ID. So this will make sure that these values occur here. But how do I make sure that this value occurs here? How do I make sure that every time slot actually meets sometime or the other? Unfortunately, there is no way of enforcing that in SQL. Now, what this first check clause over here is trying to do. Unfortunately, it doesn't work. What it's trying to do is to say that time slot ID is in select time slot ID from time slot. We saw the check clause earlier. So it would be nice if you could just write this as a check clause for a relation. But this check clause is special. What it's doing is it has a sub query referencing another relation. Unfortunately, no SQL implementation today supports this. So this is one of the things which is actually very hard to enforce in the database. You can enforce it out in the application. In the application, you can always make sure that if you put a particular time slot for a section, you can run a query to make sure it is present in the time slot relation. So that can always be done. In positive SQL, I found a solution. We can write a function, and that function can be called here. Check and then function name in positive SQL. I'm not sure it will do the right thing because if the thing is subsequently deleted, let's say you have a reference. So you have a time slot right now. It may not be in particular situation, but instead of using sub query, we can write the function which has the sub query, and we can use that function. So what that would do is when you insert into the section table, you can run the function and make sure it is present. But the catch is that supposing subsequently the time slot is deleted, it may not detect that. So it may check it at insertion time, but it will not enforce that it is always present. Subsequently, the time slot may be deleted and you have a dangling reference that will not be deleted. So that is it for referencing. Now there are a few more small details of SQL. There are built-in data types. I won't get into the details. There is date, time, time stamp. Now time is time of day. Oracle for some reason doesn't support this. Such a basic thing, but others do support time. Then there is time stamp which is date plus time of day. Then there are indices we will see those later on. Create index is reasonably standard syntax. We will look at it later. Then there are large objects. You are familiar with this. If you want to store an attribute, it generally has to be small within a few kilobytes, one or two kilobytes typically. If you want a large object, you have to create a type which is one of C-Lob for character large object or B-Lob for arbitrary binary large object. And then the way you access this is also a little bit different. You won't get into details. To wrap up this chapter, we have authorization specification which is the grant statement and the rework statement. Those are the two basic things. The basic syntax in SQL is you have an authorization. You can grant it to somebody who has the primary authorization. The database administrator has authorization on everything in the database and you can grant it to specific people. If you have an authorization, in certain cases you can grant it to others. That is the authorization model. Let us look at this by example. Before that, the basic syntax is grant a privilege on a relation, view name, procedures. There are a few other categories here. There are two user lists where user lists could be a database user or it could be public or it could be something called a role which we will see in a little bit. So, what are the possible privileges in SQL? Select which lets you read values from that table or a view. Insert the ability to insert tuples, update, delete and all privileges is all of these. If you have granted someone a privilege, you can revoke it by saying revoke in the rest of the syntax is the same. So, revoke select on some relation called branch from these guys. Now, the notion of roles can be useful. But in the interest of time, I think I am going to skip this. I think I will just stop there because it is 1 o'clock. If there are questions on authorization, I can get into it. One of the reasons I am skipping authorization is, well, it is two-fold. One is that the authorization mechanism of SQL was designed in an era where users typically directly connected to the database and run queries. Today, how many applications do you know where users directly connect to a database and run queries? How many times? Outside of courses like this, nobody connects to a database and runs queries directly. So, it is always through an application. And at least in today's scenario, applications have all privileges on all relations which they access pretty much. There are a few cases where this is not true. But in general, the authorization mechanism is useless when you have a web user. So, someone connects to a web application. The database has no idea that there is such a user on the web. There is no connection between the web user and the database user. So, the database authorization mechanism is totally irrelevant for this web application which is handling users connecting through the web. It may not be a web. It can be any other interface. But today, the web is the standard. So, the whole authorization mechanism of SQL today is of limited use in these kinds of scenarios. The one kind of scenario where it is useful is we actually do this in IIT. We have our applications running on our intranet. We do not want to make it available outside because it is susceptible to hacking. We are not completely confident it is secure. We believe there may be some holes. So, we do not expose it outside IIT. On the other hand, when our students go home, they still want to see their grades as and when they are assigned. So, what we did is we built a few of the interfaces which are read-only. We put on an externally visible website. Those things though, those parts of the application, they are not going to modify the database. So, what we did is we created another user with read-only access to those relations and the external application uses that login and password. It does not use the regular login password for the login which has update privileges. So, even if somebody breaks into this external facing system, even if they get the login password, they can only read our data. They cannot update it. So, this is one level of protection which we have used. So, here authorization, select authorization without update was useful. But beyond this, it is not... The complex schemes which SQL supports using roles and so on are no longer that relevant for real-life applications. But it is there in the standard because it was put in. Any questions? Yeah. So, in the database, how can we define a non-primitive data type? Like a fingerprint. Okay. And can it be a primary key? Okay. So, first of all, you know, if fingerprint is a small thing, if it is, you know, you can use a character, if it is characters encoded as characters, then it can be a primary key. But if it is a large object which is more than a few kilobytes, it can only be stored as a B-lob. And that cannot be a primary key. That would not be allowed. So, in general, if you want to store fingerprints as a B-lob, you can. Now, why is this restriction? You know, why does it matter? Who cares if it is a few kilobytes or more? This is to do with the internal storage representation in the database. So, all databases store records in pages in the database. And the assumption is that a single record will fit inside a page. It should not overflow a page. Because if you allow records to overflow a page, managing records becomes a lot more difficult. And a page size typically is of the order of 16 kilobytes. So, pretty much any database will insist that a record should be smaller than a page, and an attribute will be correspondingly even smaller, a few kilobytes. So, it's an implementation restriction. That's the reason you have to declare a B-lob or a C-lob, which is actually stored differently. It's not stored with the tuple. It's actually stored separately, and the tuple has a pointer to it. So, it's not an inline. And the way you access it also. So, when you access a B-lob, supposing this blob is a gigabyte, and you access it, and the whole gigabyte is loaded into memory. Well, of course, today a gigabyte doesn't sound large. But if your machine has one gigabyte of memory, you can't load a gigabyte into memory. There's not enough space. There's other stuff in memory. So, the way you access B-lobs is similar to using a file. So, you can read bytes from the blob, or you can write bytes to the blob, rather than fetching the whole blob at once. So, that's why databases create a new type called blob. And the way to access a blob is also different. I'm not showing you examples, but you would access it programmatically by reading or writing to the blob. Any other questions?