 Assalamu alaikum. Dear students, today we are going to discuss lecture number 17 of the database management system course. The course code is CS403. In the lecture number 16, we concluded our discussion on the transformation process of the constructed database design into the logical database design, where we studied different rules regarding the conversion of a design in the ER data model into a design that is in the relation data model. And we studied there that in the ER data model, we have got different constructs, there are different constraints. And in order to transform there into the relation data model, we have got just one construct there and that is relation. So we use the relation and some form of constraints like not null or integrity constraints and we transformed the ER diagram or design in the ER data model into the relation data model. And after that we studied the discussion on the third part of the relation data model that is the data manipulation language. And there we discussed that we have got different types of languages. Thirdly, we will be studying the relation algebra and relation calculus. But practically, we will be using the SQL that has been accepted as a standard language for the rational DBMSs. And mind it, the rational algebra and the rational calculus, they provide basis for the operations of the SQL. So it will be helpful for you to understand the SQL if you understand properly the operation of the relational algebra that we are going to discuss today. So let's start today's lecture. Dear students, in the previous lecture, we studied that relational algebra provides five basic operations. There are different forms for these five different operations, but mainly they are five operations. So we are going to discuss them unary operators, that is select and project. Then binary operators because we call them unary because they operate on single relations and we call them binary because they operate on two operations and that is they need two operations to perform the action that they perform. But the thing is that the basic property with all the relational algebra operators is that they operate on relations and they produce relations and the original relations they remain unaffected. The first of the five operations of the rational algebra, the select operator. The purpose of the select operator or the select operation is to select tuples rows that satisfy a given predicate. Predicate means condition and the notation that we choose for this select operator is Greek symbol sigma and here is a general structure, the general format of the select operator and that is the sigma, predicate and relation. Dear students, as you know that we may have different number of rows in a relation, that is initially there is no row in the relation when you define the relation, when you define the intention of the relation. But as the time passes, the relation keeps on population populating. It can grow up to any size, which can be thousands of lakhs of crores. And then we have discussed that the tuples, the rows in a relation, it is more dynamic. It is frequently additional relation. So, when we want to identify the particular row, the particular tuple in a relation or in a table. So, for that, if we look at the advanced level, we see that we have a primary key to identify them. But the thing is that there are no names of the rows, such as the attributes. So, when we want to select certain rows, we want to select tuples, then we do that by defining a condition. That is, rows are always selected, certain rows, by defining a condition. We define a condition and the condition that we specify, that condition is applied on each and every row of the table of the relation. And the rows that the condition or that predicate fulfills, the condition that you have is a statement, the result of which, when you execute it, when you check it, will it be true or false? So, the condition that you have given, that condition will be applied on every row of the table, on every tuple, on every record. And that record, those tuples, those rows, which fulfill that condition. Fulfilling means that the values of the rows for which the condition value is true comes. I had told you that the value of the condition, when you evaluate it, it will be the true or false. The rows that produce a true for that particular predicate or condition, those rows, they will be included as a part of, as a part of the result of that operation. When you select the operator, the table, the relation on which you are applying this operator, will remove some of the rows. Which rows will it be? It will be the rows on which that condition is true, the condition that you have supplied with this select operator, that is, the rows that you always select, you always distinguish them by a condition. And the condition that I have said is such a statement that the result of this is true or false. That is, you are a row. This means that when you select the operator, then you will give a condition and the table, that you apply a table on this condition, apply this condition on this table. And what will be the condition format? The condition format is that the attributes that you have on that table, you will apply a condition on the value of any of the attributes in that table. For example, first of all, you want to see an employee whose ID is this, whose ID is E349. So, you will select the operator and you will say that the condition is that the employee's ID is equal to E349. So, you will apply this condition on all the rows and the row in which the value of the employee's ID is this, that will be the row. Now, when you talk about employee's ID, obviously this is a unique value. This attribute will be such that the value cannot be duplicated, it will be unique. So, you will get only one row for this. But if you want to apply a condition like this, in fact, many times you apply a condition like this, whose result is unique, not unique. For example, you say that such employees whose domicile is of Multan. So, you will say that you have an attribute, you have defined it in the entire table, in which there is a domicile attribute. So, you will apply a condition like this, that select those employees where now there is an attribute, domicile. So, domicile is equal to Multan. So, in this, you will get that record, in your employee table, in which the domicile is written as Multan. That means those rows, those tuples, those records, in which your attribute domicile is written as Multan, you will get all those rows. And mind it, the select, it gives you the entire row. All the attributes that are included in the table, you will get them all. So, this means that when you have to select some special rows, then you will have to define the condition that which rows you want. And it is possible that you want to see all the rows of a table. You want to see all its records. For that, you will not supply, you will not provide any condition. So, when you say select, and you will give the name of the table or the name of the lesson, that means all the records, all the tuples, you will get all of them. Let us see some examples. To process a selection, look at each tuple, see if we have a match based on the condition, as I mentioned earlier. The degree of the resulting relation is the same as the degree of the relation. As I told you earlier, the output that you will get, all the attributes that are included in your original table. That means the select operator will show all the attributes, it will restrict, select only the rows. All the attributes will be the same. The degree of the select operator will be the same, the degree of the relation that you have, all the attributes will be the same in your select. This commutative means that if you have two select conditions, that is, you said, because we will remember this, we said that the input and output of all the operators is the same relation and the output is the same relation. So that is why we can use the output of one operator as the input of the other. As you can see right now, we have two consecutive select conditions, that is, we said select condition 1 and then select condition 2 and R. If you look at the statement on the left side, then it is written that the select condition 2 is the innermost, it is written inside. First of all, it will be executed. So select condition 2 is executed on R and this is executed first. And then its output will be treated as input to the select condition 1, that is, the output of the select condition 1 will be given to it and as you said, you can use these operators of the relation algebra in a nested format. So it is the same. But the commutative means that if you change these orders, that is, what is happening on the left side is that the condition 2 that you have applied on R and then the output that you have given you, when you have got the output through condition 2, first of all, we have 100 records on R. And in these 100, when you applied condition 2, first of all, you have got the 23rd record. Now, when you applied condition 1 on those 23, then it is possible that you have got 5 records. If you do it like this, that you apply condition 1 on R, it is possible that you have got the 3rd record from here. But now, when you apply condition 2 on it, ultimately you will get 5 records and you will get the same record that you have got on the left side. This means that if you have select operators, their conditions, if you change their order, then they will remain the same. This is called commutative property, which is written in front of you. And to understand this, it is better that you apply it to yourself. You can make a table like this. For example, if you make an employee table, and first think about condition 2. The condition is that where the domicile is multan, and the second condition is that for the first one, it is C1 that the domicile is multan, and C2 that the salary is more than 20,000. So, if you apply this condition on the left side, that the employees whose salary is more than 20,000, you apply it on R. You will get certain tuples. And you apply this condition on those tuples, that those employees whose domicile is multan, then you will get that. But on the other hand, if you first want to apply the domicile of the employee, and the output of the domicile, then you apply it on those whose salary is more than 20,000, then the answer for both of them will be exactly the same. This is called commutative property. Now, this example, we have a table, which we have seen before. If you look at it, we have a student table. There are five attributes in it, which include student ID, student name, address, program name, and current semester of the student. You have these four records in it. If we apply a condition on this, which is, first of all, select current semester greater than three, and the student given in the bracket. Now, look at what is written here. The symbol of sigma is used. This is being used for the select operator. This will perform the select operation. As you can see below, current semester greater than three. Now, current semester is an attribute of the table on which you are applying this operation. So, the student in this case. So, current semester is an attribute of the student, and here you have applied the operator greater than. This is happening, and in the end, you have given that table to the student. Now, look at the student whose current semester is greater than three. What does this mean? Whose semester is either fourth or fifth. You will get all those rows in which the current semester is either fourth or fifth. Because the symbol that is being used here is greater than. If we look at our data on this slide, look at the first row. The current semester in the first row is four. The second row is three. The third row is five. And the fourth row is five. Now, if you look from here, that your rows, this condition will be applied on every row, on every table individually. Now, the rows that we have in the current semester are more than three. And as you can see, the record of Suhel Dar i.e. S1020, S1050, Tahira Ijaz, and Arif Zia, S1018, their semester numbers are greater than three. This means that the number one row in which Suhel Dar's record is, and the number three in which Tahira Ijaz and the number four in which Arif Zia is, you will get these three rows as an output of this operation. Because the condition is that the current semester is greater than three. Now, from this, you may be confused about how Tahira Ijaz could have been delayed in his project. This is another condition in front of you. We said that the student ID is equal to S1038. And again, we applied this on the student. Then look, this is also happening. The table that we have is the student ID. The condition is being applied on this. So, those rows which fulfill this condition and the operator used here is equal to two. And because we have the student ID which is primary, that will give us one record. And since we have applied the condition on S1038, we will get the record of Suhel Bali. Because the record number two or row number two fulfills this condition. Now, how can we apply conditions on the different values of the attribute? Let me tell you something. So far, the condition that we applied on the different attributes of the relation, they were simple conditions. That is, they involved only one attribute or only one condition. You can also form the composite sort of conditions. For example, you can combine different conditions using different operators. You can also form the composite condition by combining AND or operator. For example, you said where the current semester is greater than three and the program name is BCS. That student of BCS whose current semester is more than three or more than four is equal to BCS and current semester is greater than three. The composite condition will be the same rule that it will be applied individually on every row and the condition as such, as a whole, will be true. You will get those rows on the output. I hope you have understood its function. Let's move on. Here is another example. This time, you have the table of the faculty. You can see here it is said that select and salary is greater than 37,000 on the faculty. Here you see a composite condition where salary is greater than 26,000 and rank is equal to associate professor and faculty is applied. The rows that fulfill this condition, they will be selected as a result of this operation or mind it. The original table will not have any effect. The major operation of the relational algebra is projection. The purpose of projection is that it performs the same thing that performs the selection on the rows. It performs the same thing on the columns. Certain columns are selected or shown. Mind it. As we said in the case of rows, we have got a dynamic situation. The changes in the columns do not happen very frequently. The columns are fixed most of the time. To identify them, rather than imposing a condition you select them and you distinguish them by their names. You specifically use their names and select certain attributes. The difference is that your select operator selects certain rows. Horizontally you select them. The projection will be vertical. If you have a table with 10 attributes you will get 1, 2, 3, 4, 5, 10 whatever you want to see. You select the name of the attributes and you will see that attribute. As you select all the attributes in the same way the project operator will show you certain attributes all the rows. No factor will be added to the rows. You have 10 attributes and you have 10,000 records. If you project just 2 attributes you will see the value of 2 attributes but all those 10,000 records are also there. The project operator will select certain attributes but all the rows are in the same way. How does this work? The unique operation that returns is argument relations with certain attributes left out. Any duplicate rows are eliminated. The logic of this is that when we read the basic properties of the relation we read that 2 rows cannot duplicate within a single relation. The project operator is a relational operator. The relation operator has an input p relation and its output is also a relation. Because the project operator has to give you a relation the properties will be all implemented. Because this relation is not important that this is being permanently just created temporarily. You will see it and then it is finished. Although this is being created temporarily but it will be following all the properties of the relation because it is a relation by definition. Every relational operator's output is a relation. So what is a property? It cannot be duplicated. If you do the first thing it is a physical property which is stored on the disk. When you applied the project operator you selected 2-3 certain attributes. Now the attributes you selected the output of this operator is a relation in which these 3 attributes are being done. The output relation which is being created following the properties of the basic relation is being duplicated and then it will be ignored and it will not be shown to you. Just one row and once it is shown to you. For this you use the symbol of Pi and you gave this symbol and apart from that you have written A1, A2 and A3 which are different attributes. Obviously these are the same attributes which exist in isolation R. So you will see these attributes every day. Works on a single relation R and defines the relation that contains a vertical subset of R extracting the values of specific attributes and eliminating duplicates. Now you have seen the same table again. You can see a record of faculty ID, faculty name, department, salary and rank. Now you applied the project operator and you said faculty ID, i.e. you said we should project faculty ID and salary from the faculty table. Similarly, we said faculty name and rank of faculty. Let's take another statement. The first statement which you applied to the project operator and faculty ID and salary you will see 4 records in which there will be 4 rows but that relation will be faculty ID and salary i.e. you will see F2,345 and salary 21,000. Likewise F3456 and salary 23,000 and you will see 4 records. Similarly, if you see the second project operator the table of faculty will be applied and you will see faculty name and rank. Just in that Usman, Tahir Ayesha and Samad and their names are professor and professor. This is the same thing because in your original table there are 4 rows but attributes you are seeing only those attributes that you specify in your project operator. Similarly, if you see the third statement in this you said faculty ID, salary and rank. These 3 attributes you will see but 4 rows because the original table contains 4 rows and you see that you have applied these 3 project operators on this table and your original table will have no effect on it. Dear students, as we said that we can use the relational algebra operators in a nested fashion now you get the real life situation there are many times there is such an output that it involves both of them the selection and the projection. The basic properties you can use are interchangeable not interchangeable but you can use them in a nested fashion for example like we have faculty table you are told that the rank associate professor his name and his department name again we want to see the name and the department name of those employees of those faculty members who are working as associate professor. Now if you pay attention then both things are involved selection and projection you want to see only the name and the department name it means the projection is involved on the other hand you just want to see the name and the department name whose rank is associate professor what happens is that selection is involved because there is a condition that you want to apply on the rows so if you look here both the operators are involved how to handle this there is one operator it does not matter which one you write first so first you use the selection you say select and rank is equal to associate professor and in the bracket you write faculty now you have this operator which will be applied on the faculty you will see all those rows in which the rank but you will get all the attributes all those attributes that are included in the faculty table now you have a table in which only those rows are and if we see this particular example we have got two records where the rank is associate professor and which are f3,4,5 and f4,5,6,7 which is Aisha so you will get because they fulfill the condition they satisfy the condition rank is equal to associate professor but you will get all the five attributes from these you want to see just the faculty name or salary here you will apply the project operator and mind it the input of this project which needs to be operated is not your faculty but the output of that was basically the output of the select operator so those rows the record of Taher and Aisha with all those attributes you will apply the projection so you will get only these two rows name and department name you will get Taher CSE and Aisha English so with this you will get Dear students as you have seen we have applied the selection and then we applied the projection now the question is how we have fixed this order this issue in this there is a question whether we can change the order of the operators or not but one thing is the significance of this is also in terms of validity whether we will get the required output or not the other thing is in terms of the query processing or the efficiency of the execution the efficiency of the execution we will not have to worry about but the first thing whether the execution wise is valid or not there is a concern that you have to see now you will have to see at this stage as we have just done we have applied the selection and the output of the selection that was used as input for the projection operator now if we think we will reverse it first we will do the projection and then we will do the selection how? first we will do the projection and then we will say the faculty name and department and on which? on the faculty now you will get all the required records but only these two attributes now if you try to perform the selection operation the problem is the selection operation the rank is not included in the attributes that you projected out of the faculty table so the input you are getting the selection operator there is no attribute rank although it is there in the faculty table so when you try to apply this selection operation on this particular relation that the select operator is receiving as input so the attribute rank is not included that is the error this will not work therefore the summary of this discussion is that you can switch the operators you can change their order but one thing especially on the outer level the attributes you need they are getting on the inner level if it does not happen if it is on the outer level it will fail this will be more clear from the practice so I will recommend you to practice further Dear students the two operators we have were selection and projection and both were unary operators which are implemented on a single relation now what we are going to read are our binary operators which operate on two relations and we also call them set operators the first one is union assuming r and s are union compatible here r and s are basically two relations and before applying the union operator on any two relations you have to make sure they are union compatible what does union compatibility mean union compatibility means that the two relations on which you are going to apply the union operator should have the same degree the degree means that the number of attributes in both of them should be exactly the same and the second thing is that the corresponding attributes the first one the second one the corresponding attributes they must have the same domain look here the condition is that if you have the same name then that is even better but the requirement is that at least the domain should be the same because union as you might have said what does union do that in two sets the first element shows it once otherwise it shows all the elements so obviously when you have to combine a single table then it is necessary that when you want to combine one table then if the domain is not compatible then how can you combine that is why the condition is that the relation should be union compatible should be the same the number of attributes is equal and the second thing is that the corresponding attributes should be the same domain R union S is a set of tuples in either R or S or both this is a very traditional fixed behavior of the union operator you are definitely familiar with this operator since it is a set there are no duplicate tuples union is commutative i.e. R union S is equal to S union R in front of you there are two tables course 1 and course 2 both the tables have different number of rows but the degree and number of attributes are the same the attributes and the names are the same if now we take the union of these two tables since the union is compatible then the output of union will be the same in this there are 5 rows in your original tables there were 4 rows in the original tables but the course number is C4567 i.e. the third row of the first table is the same which is row 1 in the second row so when you take the union then this row will be the same so the output of the union we will have 5 tuples and the duplicate one will be the same Intersection again this is a set operator maybe you are familiar with this now we will start studying on the sets so to perform the intersection the first condition is that rs has to be the union compatible that is what we have already done and in this we have intersection r, intersection s is a set of tuples in both r and s intersection is commutative as we have already studied r intersection s is equal to s intersection r now here is the example of the intersection considering our previous two tables on which we applied the union if we apply the intersection operator on the same two tables then we get this just one row because this is the row that is common in both the tables i.e. C4567 or program IDA or course of financial management so this is the intersection and mind it union or intersection operator normally we do not use as compared to project or section we use more than that next operation is difference r minus has the set of tuples that appear in r but do not appear in s and here is the example of the difference if you apply the difference operator on the same two relations that we already had see that course 1 minus course 2 means those rows of the course 1 are not in course 2 now if you have course ID 4567 that was also present in course 2 that is why the tuples of course 1 were original 4 rows so we are left with these 3 rows which you can see in front so this is the output of course 1 difference i hope you have understood Cartesian product we represent it from cross sets do not have union compatible now the 3 operators they are applied on the union compatible sets for this Cartesian product the sets are required to be union compatible first of all this is done in r we have attributes from a1, a2 to an similarly if we have second set is s in this we have b1, b2 and bm so r cross s that we have such a set why in which a1, a2, an then b1, b2, bm that is both the relations they have all attributes in it you can see that your r has all attributes and s has all attributes if r has c tuples and s has d tuples c tuples means c number of tuples that is its cardinality number of rows is r if it is c and s's cardinality is d then your q's cardinality means number of rows in q number of rows in the output of the Cartesian product it would be c multiplied by d also called cross product note that union intersection and cross product are commutative and associative i will explain the working of the Cartesian product i will explain the working of the Cartesian product that why the number of rows is multiplied by the number of rows you will understand that the output of the cross product has all attributes if you take the cross product of the student and the book take the Cartesian product it has 5 attributes in the student's table and 7 attributes in the book so the 5 attributes and the 7 attributes of the book will be in the output of the Cartesian product that means the total number of attributes will be 12 the number of attributes is 12 5 of them is 7 12 the rule of the Cartesian product is that the one table of one row of one record is the first record this record will be joined you say it will be integrated it will merge with every record of the other table so if we have 300 records in the student's table and 5000 records in the book's table so the first record of the student is S101 and the first record is the record of the student the record of the student will be with every one of the 5000 records so the first record of the student will be with the 5000 records and the 5000 records of the student's table will be common but the value of the book's table will be 5000 of the 5000 after that when this is done on one then the student's table will have another record and the record of the book will be with all the records and the output will have 12 attributes so the first 5 attributes will be on the student's side and the 5 attributes on the book's side will be with the 5000 records so you will have 5000 records which will be the same of the 5 attributes because that is your one record of the student but the rest of the 7 attributes will be changed because they are coming from the book's table so this is the third record and you will get 5000 records then it will be 5000 records then the fourth record which means that the table of your student which has a relation all the 300 records and each one of them will be with the 5000 records which means that when you have the final table which is the Cartesian record if you write that you have read 300 records in the student multiplied by 5000 which is the number of records in the book you will have the total output in this way you will see what happens in the Cartesian product in which all the attributes are and each one of them is the product I hope you understood let us see the examples we have 2 tables 2 attributes of the course course ID and course title and 2 attributes of the student student ID and student name if we apply the Cartesian product operator, we have given this command then the output the first thing is if you see the number of attributes then there are 4 attributes if you see the number of rows then that is equivalent to the product of the number of rows in both the tables there are 3 records in the course and 2 in the one if we multiply them then we have 6 of the tuples then the first course ID all the records are with S101 which is written that you have 2 attributes in the first 3 records similarly the other 3 records the other record of the student table the 3 records of the student table so to give you an example to show you a small table obviously if you apply this concept on the larger tables then the concept is that each one of the tables will meet with the other rows and this will be called the Cartesian product joint operation Dear students this joint operation is one of the most important operations of the traditional Jabra and it is a little complex compared to the other it is a little difficult and one more thing this is a different form of the Cartesian product and this is the form which is generally used this concept will require a little more effort so try to understand it if you see the working of the Cartesian product then what you see is that it meets with every row now in this if you see the real life in terms of applications in terms of the real world system if you see the usage if you see the reason then the use of the Cartesian product is not so clear its use is very less in real life meaning you say if we take the example of a student and a book which meets with every row of the book then its logic cannot be understood why we do this joint is a thing which has a certain logic that is why it applies to the real life for example if we see that the student table has some days some of these days we want to meet with some days of the book there can be a reason there can be logic which days we will meet the days that relate the days that are linked with each other we say if a student does an effort they have issued some books obviously this is logical and easy to understand from the book table or from the book relation we want to meet with those days that the books have been issued by this student the first thing is how will this link be decided that which books have been issued by this student so if you think about the references of our ER data model then basically the condition is that the book and student are related with each other so if you have an agreement let's suppose it was a one to many relationship so what we have done is that the student has to join the primary of the book so we will say that the student has to join the primary of the book so whatever books have been issued in the book the record will have been read that means the student ID is let's say S109 if the student has issued 2 books out of 5,000 books then in those 2 books the student ID will be written there S109 now the record of this student has the logic of meeting these 2 records because this record of the student and this record of the book table are related to each other why? if you describe it in real life or practical example then you will say because this student has issued these books and if you describe the same thing in the language of database in that case you will say that the primary key and foreign key are the same so they have the logic of meeting each other but it is obvious that the logic of meeting each other is not the same similarly if you apply this example on project and employee if we say that a project has a lot of employee work and one employee works on one project again this is a one-to-one relationship the project the primary key that is included in the employee as foreign key now for a project a record of the project we want to meet some employees so it is obvious that the employees work on this particular project the employees work we have represented the primary key of the project employee table as foreign key so in each of the rows project and employee key this is your attribute the primary key foreign key is a common attribute the primary key foreign key so join that logic provides rather than combining merging or joining all the rows of both the temples we join only those rows that have the same value in the common attribute with that database you will say that they have the same value in the common attribute in real life you will say that they are U related like one student has a book issue or they work on one project so the join operator that gives you that reason that you combine two tables and mix them together Dear students as I told you join is an important operator of the relational algebra and in this in real life it is used a lot in databases and the other thing is different forms each form has its own characteristics and it is implemented in its own way so join that we will discuss in our next lecture in today's lecture we are focusing on the data manipulation language of the relational data model what we have discussed is the relational algebra and the relational calculus we study them mainly for the theoretical purposes otherwise this base provides that language which is used in the DBMS we have read about the relational algebra that there are 5-6 basic operators some of them which are unary which have select and project and the rest are binary that is they operate on two rations or tables we have read that they operate on rations and on this basis the select and project can be used in nested sections in which one output becomes an input apart from this we have applied union intersection of difference that was union compatibility we also studied what do we mean by the union compatible relations and finally we were discussing the Cartesian product that each of each rations will be available and when we have a resultant then both the attributes and the product of each rations will be included and lastly we were reading the joint about which we said that this is the most useful and useful operation we will read about this in the next lecture Allah Hafiz