Welcome to the session on the partition join operation in parallel databases. It is one of the intra-operation parallelism techniques. As for the learning outcome, at the end of this session you will be able to apply the partition join operation to given relations.

What is intra-operation parallelism? In parallel databases, each individual operation in a query is parallelized. The operation may be sorting, searching, or a join; whatever the single operation is, it is taken up for parallelism. Parallelizing a single operation in this way is called intra-operation parallelism. For such operations we have parallel sort techniques and parallel join techniques. For parallel sort, videos are already available, and you can refer to those. Today we will study the partition join operation, which is one of the parallel join techniques.

Let us go ahead with the partitioned parallel join. First, what does a join operation do? A join works on two relations, or on one relation in the case of a self-join. Pairs of tuples are tested, and if a pair satisfies the join condition, that pair is added to the output.

The partitioned parallel join has three steps. The first is partitioning, where we split the tuples we want to join across multiple processors. The second is the actual parallel processing, where the join operation itself is carried out.
Every processor computes the join locally and in parallel, because it already holds the data for its partitions. Finally, once the join has finished on all the processors, the results from each processor are collected at one place. That is the third step.

The partition join operation is basically for equi-joins and natural joins. For these joins it is possible to partition the two input relations across the processors and compute the join locally at each processor. That is, if there are two relations, both are partitioned across the processors, and every processor does the join operation locally.

Consider two input relations, R and S, which we are joining on attribute A of relation R and attribute B of relation S. Relations R and S are each split into n partitions. We have n disks, so we have n partitions, numbered 0 to n-1. So relation R is partitioned as R0, R1, ..., Rn-1 across the n processors. Similarly, relation S is partitioned as S0, S1, ..., Sn-1 across the n processors. Every processor now has one partition of R and one partition of S, and it performs the join on them locally.

For the partitioning we can use either range partitioning or hash partitioning. We have seen three partitioning techniques: round-robin partitioning, range partitioning, and hash partitioning. Round-robin partitioning does not help here, because it does not look at attribute values, so matching tuples could land on different processors. So we use either range partitioning or hash partitioning, and the partitioning must be done specifically on the join attributes.
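The two partitioning choices can be sketched as follows. This is only an illustration, assuming integer join-attribute values; the function names `hash_partition` and `range_partition` and the range vector `[10, 20]` are made up for the example and are not from the session. The point it shows is that the same function is applied to the join attribute of both relations, so equal values always land in the partition with the same index.

```python
from bisect import bisect_right

def hash_partition(value, n):
    """Hash partitioning: map a join-attribute value to a partition 0..n-1."""
    return value % n  # simple hash for integer keys (an assumption of this sketch)

def range_partition(value, vector):
    """Range partitioning: vector holds ascending split points.

    With vector [10, 20]: value < 10 goes to partition 0,
    10 <= value < 20 to partition 1, and value >= 20 to partition 2.
    """
    return bisect_right(vector, value)

# A tuple of R with A = 17 and a tuple of S with B = 17 must meet at the
# same processor, whichever technique is chosen.
a_value_of_r = 17
b_value_of_s = 17
print(hash_partition(a_value_of_r, 3) == hash_partition(b_value_of_s, 3))
print(range_partition(a_value_of_r, [10, 20]) == range_partition(b_value_of_s, [10, 20]))
```

Both comparisons hold only because R and S use the same hash function, or the same range vector, on their join attributes.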
What are the join attributes? As we have seen, the join attribute for relation R is A and the join attribute for relation S is B. So R should be partitioned on attribute A and S should be partitioned on attribute B, because the join is based on these attributes. The scenario is this: if Ri is one of the partitions of relation R and Si is the corresponding partition of relation S, then Ri and Si are both sent to processor Pi. Processor Pi then computes the join locally: partition Ri joins with partition Si on attribute A of Ri and attribute B of Si. Any of the standard join techniques can be applied here.

I have represented this scenario diagrammatically here. This is our join operation, running in parallel across these processors: processor P0, processor P1, up to processor Pn-1. The partitions are partition 0, partition 1, up to partition n-1, and the disks are D0 to Dn-1. We have relations R and S. Relation R is partitioned as R0, R1, ..., Rn-1 across these n disks. Similarly, relation S is partitioned as S0, S1, ..., Sn-1 across the n processors. Now, what does every processor do? Processor P0 locally performs the join of R0 with S0. Similarly, processor P1 performs the join of R1 with S1, and processor Pn-1 performs the join of Rn-1 with Sn-1. So every processor performs its join in parallel; that is what the partitioned parallel join operation is.

Here is another depiction of the same thing. Relation R is partitioned as R0, R1, ..., Rn-1. Similarly, relation S is partitioned as S0, S1, S2, ..., Sn-1.
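The three steps (partition, local join at each Pi, collect) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the implementation used by any particular database system: relations are lists of dictionaries, join values are integers, threads stand in for the processors, and the local join is a simple in-memory hash join chosen for brevity. The names `partition`, `local_join`, and `partitioned_join` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(relation, key, n):
    """Step 1: split a relation into n partitions on its join attribute."""
    parts = [[] for _ in range(n)]
    for row in relation:
        parts[row[key] % n].append(row)  # assumes integer join values
    return parts

def local_join(r_part, s_part, a, b):
    """Step 2: join Ri with Si at one processor (any join method would do)."""
    index = {}
    for r in r_part:
        index.setdefault(r[a], []).append(r)
    return [{**r, **s} for s in s_part for r in index.get(s[b], [])]

def partitioned_join(r, s, a, b, n):
    r_parts = partition(r, a, n)  # R0 .. Rn-1
    s_parts = partition(s, b, n)  # S0 .. Sn-1
    with ThreadPoolExecutor(max_workers=n) as pool:
        # Worker i plays the role of processor Pi and joins Ri with Si.
        futures = [pool.submit(local_join, r_parts[i], s_parts[i], a, b)
                   for i in range(n)]
        # Step 3: collect the local results at one place.
        return [row for f in futures for row in f.result()]

R = [{"A": 1, "x": "r1"}, {"A": 2, "x": "r2"}, {"A": 3, "x": "r3"}]
S = [{"B": 2, "y": "s2"}, {"B": 3, "y": "s3"}, {"B": 4, "y": "s4"}]
for row in partitioned_join(R, S, "A", "B", 2):
    print(row)
```

In this sample data only the values 2 and 3 appear in both relations, so only those two pairs reach the output. Python threads do not give true CPU parallelism; they are used here only to mirror the structure of one task per processor.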
So processor P0 performs the join of R0 and S0, P1 performs the join of R1 and S1, and so on. These execute in parallel.

Consider the example here. We have two relations, student and courses, and you can see their attributes. We want to perform the join between them. The joining attribute is roll number, so the join is based on roll number. We are taking the number of disks as 2, so the number of partitions is also 2, and the number of processors is also 2. The partitioning technique we are applying is hash partitioning, and the partitioning attribute is roll number. As it is hash partitioning, we take the hash function h(roll number) = roll number mod n. What is our n? It is 2. So we compute roll number mod 2, and the resulting hash value is the number of the disk that stores that tuple. We apply this hash partitioning to both tables, student and courses.

Let us see. This is the partitioning we have done; this is partition 0, for processor P0. The student relation is partitioned as student 0 and the courses relation as courses 0. All of this goes to disk 0, and processor P0 works on it. So you can see this is partition 0.

Now pause this video and think about the second partition. Can you show partition 1 of the student and courses relations? We have these tables, and we want partition 1 based on the hash function roll number mod 2. Take a minute and do this. Let us see. So, this is partition 1.
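The mod 2 partitioning can be tried out with a small sketch. The rows below are hypothetical, since the actual student and courses tables from the slides are not part of this transcript; only the hash function, roll number mod 2, comes from the session. The helper name `partition_by_roll` is also an assumption.

```python
# Hypothetical sample rows as (roll number, value); not the tables from the slides.
student = [(1, "Asha"), (2, "Ravi"), (3, "Meena"), (4, "John")]
courses = [(1, "DBMS"), (2, "OS"), (3, "Networks"), (4, "DBMS")]

def h(roll_no, n=2):
    """Hash function from the session: roll number mod n, with n = 2."""
    return roll_no % n

def partition_by_roll(rows, n=2):
    """Place each tuple on the disk whose number equals its hash value."""
    disks = {i: [] for i in range(n)}
    for row in rows:
        disks[h(row[0], n)].append(row)
    return disks

student_parts = partition_by_roll(student)
courses_parts = partition_by_roll(courses)
for i in range(2):
    print(f"disk {i}: student {i} = {student_parts[i]}, courses {i} = {courses_parts[i]}")
```

With this data, even roll numbers go to disk 0 and odd roll numbers go to disk 1, for both relations.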
Partition 1 goes to disk 1, which is operated on in parallel by processor P1. What will processor P1 do? It will perform the join of student 1 with courses 1. That is what the partitioned parallel join is. You can see here that processor P0 has partition 0 of student and partition 0 of courses. Similarly, processor P1 has partition 1 of student and partition 1 of courses. Both processors perform their join operations in parallel, and finally you can see the result of the join operation for the relations we saw earlier. According to the example, this is our result. You can go through the result; the point is that we have performed the join in parallel on processor P0 and processor P1.

These are my references. Thank you.