Hello everyone, welcome to the session on parallel join techniques in parallel databases: the partitioned parallel hash join and the parallel nested loop join. These are techniques of intra-operation parallelism. Let us move ahead with the learning outcome of this session. At the end of this session, you will be able to apply parallel join techniques, namely the partitioned parallel hash join and the parallel nested loop join, to any given relations. Earlier we have seen a few parallel sorting techniques and parallel join techniques. The videos for those are available, and you can refer to them. So today we will be talking about the partitioned parallel hash join and the parallel nested loop join.

Let us move ahead. We are assuming that we have n processors, P0, P1, ..., Pn-1, and two relations, R and S, for the join operation, and that these are already partitioned across the n processors with their n disks. Assume that relation S is smaller than relation R, and therefore S is chosen as the build relation. This technique works on the build and probe idea of the hash join. We are using two hash functions, h1 and h2.

So for the partitioned parallel hash join, this is the scenario you can see here. Relation R is there and relation S is there. S is partitioned with the hash function h1, and R is partitioned with the same hash function h1. Then you can see that each partition is partitioned again with another hash function, h2. Now the join operations are done in parallel, one at every processor. How many processors are there? There are n of them: P0, P1, and so on. So what will processor P0 do? It works on the repartitioned data of the first partition of R together with the repartitioned data of the first partition of S.
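To make the first-level partitioning concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: the toy relations, the `redistribute` helper, and the choice of `h1` as "key modulo n" are hypothetical and not taken from the slides. It shows that applying the same h1 to both relations places tuples with equal join values at the same processor.

```python
def h1(key, n):
    """Hypothetical first-level hash: maps a join-attribute value to a processor id."""
    return key % n

def redistribute(relation, n):
    """Send each tuple to the processor chosen by h1 on its join attribute (field 0)."""
    partitions = [[] for _ in range(n)]
    for row in relation:
        partitions[h1(row[0], n)].append(row)
    return partitions

# Toy relations R(a, payload) and S(a, payload), joined on a.
R = [(1, "r1"), (2, "r2"), (3, "r3"), (4, "r4"), (5, "r5"), (2, "r6")]
S = [(2, "s1"), (4, "s2"), (5, "s3")]

n = 3  # number of processors P0..P2
R_parts = redistribute(R, n)
S_parts = redistribute(S, n)

# Each processor Pi now holds a pair (Ri, Si) that it can join on its own.
for i in range(n):
    print(f"P{i}: R part = {R_parts[i]}, S part = {S_parts[i]}")
```

Because both relations go through the same h1, a tuple of S can only match tuples of R that sit at the same processor, which is why the n local joins can run independently.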
It performs the join operation on that pair. Let us elaborate more here. This is one of the scenarios; you can pause the video, observe the figure, and write down how the parallelism works in it. You can see the same thing depicted in another way here. What is the scenario? The function h1 is used for the first level of partitioning and the function h2 is used for the second level of partitioning, which you can see here, and later you can see the parallel join. This partition joins with this partition, and all of these join operations take place in parallel.

So let us talk about what the hash function h1 is doing. In the diagram you have seen earlier, h1 takes the join attribute value of each tuple of S and maps that tuple to one of the n processors. Each processor Pi reads the tuples of S that are on its disk Di and, based on the hash function h1, sends each tuple to the processor it maps to. Let Si denote the tuples of S that are sent to processor Pi. The larger relation R is then redistributed across the n processors in the same way, using the same h1, so that tuples of R and S with the same join attribute value arrive at the same processor.

You can see another diagrammatic representation here. We take the hash function on relation R, it is divided into this many partitions, and these are then repartitioned again.

Now, what is the hash function h2 doing? The data has already been partitioned by h1. As each processor receives its partition, it applies another function, h2, on it to partition it further, and these partitions are used to compute the hash join locally.
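The local step at one processor can be sketched as follows. This is a hedged illustration, not the lecture's own code: the function names, the bucket count, and the formula used for `h2` are assumptions chosen only so that h2 differs from a simple modulo. The inputs `r_2` and `s_2` stand for the tuples one processor is assumed to have received after the h1 redistribution.

```python
def h2(key, buckets):
    """Hypothetical second-level hash, deliberately different from a plain modulo h1."""
    return (key * 7 + 3) % buckets

def local_hash_join(r_i, s_i, buckets=4):
    """Join the local partitions r_i and s_i on field 0 at a single processor."""
    # Repartition the smaller input s_i into buckets with h2.
    table = [[] for _ in range(buckets)]
    for s_row in s_i:
        table[h2(s_row[0], buckets)].append(s_row)
    # Repartition each r tuple with the same h2 and compare only within its bucket.
    result = []
    for r_row in r_i:
        for s_row in table[h2(r_row[0], buckets)]:
            if s_row[0] == r_row[0]:
                result.append((r_row[0], r_row[1], s_row[1]))
    return result

# Tuples assumed to have arrived at one processor after the h1 redistribution.
r_2 = [(2, "r2"), (5, "r5"), (2, "r6")]
s_2 = [(2, "s1"), (5, "s3")]
print(local_hash_join(r_2, s_2))
```

The equality check inside the bucket is still needed, because two different join values can fall into the same h2 bucket.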
So once the second hash function is applied, each processor does the local hash join operation. This is the scenario: relation R is partitioned and relation S is partitioned with h1, and each partition is partitioned again with h2. One of the partitions is shown here as Ri; h2 is applied to it, and the processors do the hash join locally. The output goes to an output buffer, and from there the results are stored to the output disk. So you can see that the hash function is applied two times: partitioning and repartitioning are both done in this parallel hash join technique.

So we have seen that every processor Pi executes the build and probe phases of the hash join on its local partitions Ri and Si. In the build phase, a hash table is built on the partition Si of the smaller relation S. In the probe phase, the tuples of the matching partition Ri are looked up in that hash table, and the matching pairs form one partition of the final result.

Hash join optimizations can also be applied in the parallel case; one of them is the hybrid hash join algorithm. It can be used to cache some of the incoming tuples in memory, so that the join takes place in memory and we avoid the cost of writing them to disk and reading them back. So the second-level partitioning can be done in memory rather than on disk, and after the join operation the results can be sent to the disk.

Talking about the second technique, the parallel nested loop join: this works like a normal nested loop join, where two loops are there and every tuple of one relation is compared with every tuple of the other relation. The parallel version adds two things: it works on partitioning, and it works on indexing.
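Before going to the parallel version, the difference between the normal nested loop join and the indexed one can be sketched on a single processor. This is an illustrative sketch under assumptions: the toy relations and the function names are hypothetical, and a Python dictionary stands in for the index on the join attribute.

```python
from collections import defaultdict

def nested_loop_join(outer, inner):
    """Plain nested loop join on field 0: every outer tuple is compared with every inner tuple."""
    result = []
    for o in outer:
        for i in inner:
            if o[0] == i[0]:
                result.append((o[0], o[1], i[1]))
    return result

def build_index(relation):
    """A dictionary from join value to tuples, standing in for an index on the join attribute."""
    index = defaultdict(list)
    for row in relation:
        index[row[0]].append(row)
    return index

def indexed_nested_loop_join(outer, inner_index):
    """Indexed nested loop join: the inner scan is replaced by an index lookup."""
    result = []
    for o in outer:
        for i in inner_index.get(o[0], []):
            result.append((o[0], o[1], i[1]))
    return result

R = [(1, "r1"), (2, "r2"), (2, "r3"), (4, "r4")]
S = [(2, "s1"), (3, "s2"), (4, "s3")]

plain = nested_loop_join(S, R)
indexed = indexed_nested_loop_join(S, build_index(R))
print(plain == indexed)
```

Both functions return the same pairs; the indexed version simply avoids scanning all of the inner relation for each outer tuple.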
See the scenario: relations R and S are there, and S is smaller than R. R is stored partitioned, and there is an index on the join attribute of R at each of its partitions. S we replicate. Using the index on R, we probe with the tuples of S, do the comparison, and do the join operation based on that. So here we use the asymmetric fragment-and-replicate join, with relation S being replicated and using the existing partitioning of relation R. R is already partitioned, S is replicated, and the index is on R, as you can see here.

So this is the indexed nested loop join of S with the i-th partition of relation R. Relation R is partitioned and each partition is indexed; S is not partitioned but replicated. Every tuple of S uses the index to find the tuples of R it matches, and the outputs are produced in parallel. The major part here is the indexing. It is a nested loop join because two loops are required: the outer loop runs over the tuples of the replicated S, and the inner loop is the index lookup on the partition of R. So for partition R1, the copy of S drives the outer loop and the index on R1 serves the inner one.

Talking about the parallel nested loop join in detail: every processor Pj where a partition of relation S is stored reads its tuples of S and replicates them to every other processor Pi. At the end of this phase, relation S is replicated at all the sites that store tuples of relation R, and relation R stays partitioned across those sites. Then every processor Pi performs the indexed nested loop join of relation S with the i-th partition of relation R, as you have seen here.
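The whole scheme can be sketched in Python as below. Treat it as a rough illustration under assumptions: the toy data, the round-robin split used as the "existing partitioning" of R, and the helper names are hypothetical, and the worker threads only stand in for the n processors of a real parallel database.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def build_index(partition):
    """A dictionary on the join attribute (field 0), standing in for the index on one partition of R."""
    index = defaultdict(list)
    for row in partition:
        index[row[0]].append(row)
    return index

def join_at_processor(r_index, s_copy):
    """Indexed nested loop join of the replicated S with one partition of R."""
    result = []
    for s_row in s_copy:                          # outer loop: tuples of S
        for r_row in r_index.get(s_row[0], []):   # inner loop: index lookup on R's partition
            result.append((s_row[0], r_row[1], s_row[1]))
    return result

R = [(k, f"r{k}") for k in range(1, 9)]
S = [(2, "s1"), (4, "s2"), (7, "s3")]

n = 3
R_parts = [R[i::n] for i in range(n)]     # existing partitioning of R (round-robin here)
indexes = [build_index(p) for p in R_parts]
S_copies = [list(S) for _ in range(n)]    # S replicated to every site that stores R

with ThreadPoolExecutor(max_workers=n) as pool:
    per_processor = list(pool.map(join_at_processor, indexes, S_copies))

# The final result is the union of the per-processor results.
result = sorted(row for part in per_processor for row in part)
print(result)
```

Note that R does not have to be partitioned on the join attribute here: since every site holds a full copy of S, each tuple of R meets every tuple of S exactly once, at the site where that tuple of R is stored.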
So the index is applied here. These are the partitions of relation R, and relation S is replicated, so every tuple of S is compared against the tuples of R through the index, and every processor does this in parallel. The scenario is shown here: this is relation R, which is partitioned into R1, R2, and so on, and these partitions are stored on the disks. Relation S we replicate on every disk. So this is the partitioning process and this is the replication process. On every partition we have an index, and using that index each partition Ri is joined with S, and the results are stored on this disk. This is how the parallel nested loop join works. These are my references. Thank you.