So, the topic you have taken is a very broad one and we will narrow it down. What are parallel data structures? What are the existing ones? And what have we done? We have focused mainly on parallel hierarchical data structures, and hierarchical data structures include the tree-type structures, of which various kinds exist, such as the B, B+, R, R*, and T trees, the binary search tree, and so on. Our main goal was the heap, the 2-3 tree, and the red-black tree. Why have we chosen these? Because we have proposed some efficient algorithms that do not exist yet. The other problem we have worked on is quadtree representation on the pyramid machine. The quadtree is also a kind of data structure, used for two-dimensional images, and we have constructed an algorithm that performs quadtree construction on the pyramid model. Then come the conclusion and references. As we all know, a data structure is a way of storing data on a computer so that it can be used efficiently. Efficiently means we can perform the dictionary operations, like insert, delete, and update, on that machine efficiently. Well-known examples are the stack, queue, linked list, array, hash table, and so on. A parallel data structure is the implementation of these data structures on more than one processor, so that a number of processors can cooperate to perform the insert, delete, and update operations on them. The existing data structures are arrays, stacks, queues, hash tables, linked lists, trees, and so on. Among trees, the binary search tree, heap, B, B+, 2-3 tree, and red-black tree are the existing ones. For the 2-3 tree and the red-black tree we propose new algorithms, and for the heap we have given an efficient algorithm better than the existing one. Now, when you write the report you must mention the proper references and similar things. In the slides also you have to include them.
Now, the first data structure is the parallel heap, which is a parallel implementation of the priority queue. The basic idea behind the parallelization of the heap is that each node holds r keys instead of one key. Let us try to define the parallel heap. A parallel heap is a complete binary tree such that each node contains r items, and all r items at a node have values less than or equal to the values of the items at its children. This is an example of a parallel heap, where each node holds r keys instead of one. Let us define the dictionary operations on parallel heaps, and first consider the insertion of k keys. First we sort the k keys, and this sorted list is merged with the keys of the root node; it becomes a sorted list of r + k items. The smallest r items are put in the root node, and the remaining k items are carried forward to a child for the next iteration. One point is how to find the child for the next iteration: it may be the left child or the right child. Let n be the index of the target node and i the level of the current node. We check the i-th rightmost bit of n: if it is 0 we move to the left child, and if it is 1 we move to the right child, and that child becomes the current node for the next iteration. This iteration is repeated until the carried keys reach the target node. Speak again, clearly: what is given initially? Initially we have to insert k keys into an existing heap. What is the definition of the heap? Is it the same binary tree? In the parallel heap we are keeping r keys at each node; this node has r keys, and this one also has r keys. So you are inserting r. Is it exactly r, or can it vary? For the internal nodes it is exactly r; only the last node may contain fewer. Now, you want to insert how many keys? k keys. And k can be? Up to r. Now, these k keys you have sorted? First we sort them. Yes. That is the key step. Next slide. Next, you have the sorted k keys, and the current node is the root node.
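Before moving on, the definition just given can be written down directly. A minimal Python sketch (the dict-of-lists representation and the function name are my own, not from the talk): nodes are indexed as in an ordinary 1-based heap, each holding a sorted list of at most r keys, and the parallel-heap property requires every key at a node to be no larger than every key in either child.

```python
def is_parallel_heap(nodes, r):
    """Check the parallel-heap property on a 1-indexed complete tree.

    nodes: dict node-index -> sorted list of at most r keys.  Every key
    at a node must be <= every key in either of its children."""
    for i, keys in nodes.items():
        if len(keys) > r or keys != sorted(keys):
            return False
        for child in (2 * i, 2 * i + 1):
            # largest key here must not exceed the smallest key below
            if child in nodes and keys[-1] > nodes[child][0]:
                return False
    return True
```

For example, `{1: [1, 2], 2: [3, 5], 3: [4, 6]}` is a valid parallel heap for r = 2, while swapping a large key into the root breaks the property.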
The current node is the root node. Now, repeat until the target node is reached? Yes, because the last node will be our target node. In a heap the last one will be the... No. Merge the k items with the r items of the root? Yes, the k items carried by the insert process are merged with the r items of the current node. And what is the result? A single sorted list of r + k items. All these k keys you have inserted? Yes. Then we sorted them. Then the smallest r items are placed in the current node, and the remaining k items are carried forward to a child for the next iteration. That is, to the right child? No, not exactly; it depends on the index of the target node. The smallest you are keeping? Yes, first we find the target node where the inserted keys will finally go, and we store the index of the target node. No, read this: merge the k items carried by the insert process with the r items. So you have merged these k elements with e1 through er? Yes. So you have got e1 through e(r+k)? Yes. Right. Place the smallest r items in the current node; the current node for now is the root node. So the smallest elements go in the root node? Yes. Right. And the remaining? Now carry forward the remaining k items to the child for the next iteration: the insert process moves down to the child lying on the insertion path leading to the target. So, which means the largest k elements are carried forward to a child. Which child? Which child? Now, we found the target node initially. What is the target node? The target node is, as in a heap, the place where the new elements will go: the last available position. We can easily find this position, and then we find the binary representation of its index. No, but how are you getting that target node? From the 0s and 1s; we just check the bits of the index of the target node, and if a bit is 0 we move left. We know the existing number of nodes. What is the 0?
The 0 is a bit in the binary representation of the index of the target node. How are you going to get this target node? Sir, in a heap we know how many nodes are there, and the next item will go to the position just after the last node. In a heap the nodes are indexed 1, 2, 3, and so on, so we have the index of the last node. Simple: the simple sequential process puts the new item at the last node, and then you go up and adjust things. That is the usual heap insertion technique. Sir, but now we are inserting k keys simultaneously. No, no. Let us understand: in the sequential algorithm the insertion is done at the bottom, and then we go up. Agreed, that is the standard process. Now you should explain that you are starting from the top. Once you have inserted the k keys, the larger of the r + k elements will be brought down. How are you bringing them down? For each node on the path we take the current node's keys and the keys coming down from the upper level, and we merge them; we keep the smallest r in the current node and carry the rest down to the next level. And the next cell, how are you going to define it? Sir, through this indexing. How are you getting this indexing? Sir, we know the target node will be here only; that fixes where each carried packet should go. If the bit is 0, and we are at the first level, we check that bit. Okay, now I understood. So, there will be r + k elements at each step. Sir, we can simply find the target: if n is the total number of keys and each node consists of r keys, then the total number of full nodes is n over r, so the next target node is at index n over r plus 1. So now you have the target node, and the carried keys are brought down toward it; that child becomes the current node, and step 3 is repeated until the target is reached. Then you will be merging the k elements again? Yes sir. Right? Yes. And some of the elements are merged again at each level.
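The bit trick and the top-down merge described above can be simulated sequentially. A hedged Python sketch (function names and the dict representation are my own; it assumes all existing nodes are full, as the talk assumes for internal nodes): for a 1-indexed complete binary tree, the bits of the target index after the leading 1 spell out the root-to-target path.

```python
def path_to_node(n):
    # For a 1-indexed complete binary tree, the bits of n after the
    # leading 1 give the root-to-node path: 0 = left child, 1 = right.
    return [int(b) for b in bin(n)[3:]]   # strip the '0b1' prefix

def parallel_heap_insert(nodes, keys, r):
    """Sequential simulation of the top-down parallel-heap insert.

    nodes: dict node-index -> sorted list of exactly r keys.
    keys:  at most r new keys to insert."""
    target = len(nodes) + 1              # next free position in the tree
    keys = sorted(keys)                  # sort the k carried keys first
    cur = 1
    for bit in path_to_node(target):
        merged = sorted(nodes[cur] + keys)   # merge carried keys with r keys
        nodes[cur] = merged[:r]              # smallest r stay at this node
        keys = merged[r:]                    # largest k carried to the child
        cur = 2 * cur + bit                  # follow the target's bit path
    nodes[cur] = keys                        # remaining keys fill the target
    return nodes
```

For instance, inserting [0, 7] into `{1: [1, 2], 2: [3, 5], 3: [4, 6]}` with r = 2 sends the carried packet down the path to node 4 and leaves the heap property intact at every node.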
Then again we place the smallest r items in the current node and carry the largest ones over to the next one. Okay. That means you are following the figure. Okay, I understood. We use r/log r processors to sort the k keys and to merge the k + r items. The time required by the sorting is O(log² r) using r/log r processors, and step 3.1 takes O(log r) time. Step 3 is repeated O(log(n/r + 1)) times, the height of the heap. Then the total time required by this algorithm is O(log r · log n), and we use r/log r processors, so the total cost becomes O((r/log r) · log r · log n) = O(r log n), which is cost optimal: in the sequential case, if we want to insert r items, each item takes O(log n) time, so r items take O(r log n) time. The speedup is r/log r, and with r/log r processors the efficiency is 100 percent. Now let us define the next dictionary operation, deletion. Suppose we want to delete the r highest-priority items. We know the highest-priority items are in the root, so delete the r items from the root, put the r items from the last node into the root, and mark the root as the current node. Merge these r items with the 2r items of its children; it becomes a sorted list of 3r items, and the smallest r items are put back in the current node. Let us define x and y: x is the value of the largest item of the left child, and y is the value of the largest item of the right child. If x is greater than or equal to y, then put the next smallest r items into the left child and the remaining r items into the right child, and the right child becomes the current node. Otherwise, put the next smallest r items into the right child and the remaining r items into the left child, and the left child becomes the current node. Step 3 is repeated until the heap property is satisfied at the current node, or the current node becomes a leaf node. We again use r/log r processors to merge the items.
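The delete rule just described, including the x/y comparison that picks which child to descend into, can also be simulated sequentially. A sketch under my own assumptions (dict representation, function name, and at least two full nodes in the heap are assumptions, not from the talk):

```python
def parallel_heap_delete_min(nodes, r):
    """Delete the r smallest (highest-priority) items from a parallel heap.

    nodes: dict node-index -> sorted list of r keys, 1-indexed complete
    tree.  Sequential sketch; assumes at least two full nodes."""
    removed = nodes[1]                    # the r highest-priority items
    nodes[1] = nodes.pop(max(nodes))      # refill the root from the last node
    cur = 1
    while 2 * cur in nodes:
        left, right = 2 * cur, 2 * cur + 1
        if right not in nodes:            # only one child: plain 2r-way merge
            merged = sorted(nodes[cur] + nodes[left])
            nodes[cur], nodes[left] = merged[:r], merged[r:]
            break
        x, y = nodes[left][-1], nodes[right][-1]    # largest of each child
        merged = sorted(nodes[cur] + nodes[left] + nodes[right])
        nodes[cur] = merged[:r]           # smallest r stay at the current node
        if x >= y:                        # settle the left child, descend right
            nodes[left], nodes[right] = merged[r:2 * r], merged[2 * r:]
            cur = right
        else:                             # settle the right child, descend left
            nodes[right], nodes[left] = merged[r:2 * r], merged[2 * r:]
            cur = left
    return removed
```

On `{1: [1, 2], 2: [3, 4], 3: [5, 6], 4: [7, 8]}` with r = 2 this removes [1, 2], refills the root from node 4, and restores the heap property in one top-down pass.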
What is the difference between your method and their method? There exists a parallel heap already, right? What is the difference? We are using r/log r processors; they used r processors. And their complexity is somewhat different; I have to find that out. Mention all of this there. I have to check. Okay. The next data structure is the parallel red-black tree. What is a red-black tree? A red-black tree is a binary tree satisfying the following properties: every node is colored red or black; the root is always colored black; every leaf is colored black; if a node is red then both its children are black, which means that on the path from the root to any leaf no two consecutive nodes can be red; and every simple path from a node to a descendant leaf contains the same number of black nodes, which means that for every node, every path from that node down to a leaf contains exactly the same number of black nodes. These are the five properties. This is an example of a red-black tree, where all the leaves are black, the root is black, and you can see that no path contains two consecutive red nodes; and every path, if we consider this one and this other one, contains the same number of black nodes. Now the insertion operation. What do we do in sequential insertion? If we want to insert a key in an existing red-black tree, we do this in two phases, a search phase and a rebalancing phase. In the search phase we find the position for that particular key, and we insert the key as a new red node. Then we try to rebalance around this new red node by recoloring and rotations of the red-black tree. This is the typical sequential insertion. Now, what if I want to insert k keys concurrently into that red-black tree?
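Before moving to concurrent insertion, the five properties above can be checked mechanically. A minimal Python sketch (class and function names are my own; as is conventional, the absent children are treated as the black leaves the talk mentions):

```python
class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color = key, color        # color is 'R' or 'B'
        self.left, self.right = left, right

def black_height(node):
    """Black-height of the subtree if it satisfies the red-black rules,
    else None.  Absent (None) children count as black leaves."""
    if node is None:
        return 1
    if node.color == 'R':
        for c in (node.left, node.right):
            if c is not None and c.color == 'R':
                return None                      # two consecutive red nodes
    lh, rh = black_height(node.left), black_height(node.right)
    if lh is None or rh is None or lh != rh:
        return None                              # unequal black counts
    return lh + (1 if node.color == 'B' else 0)

def is_red_black(root):
    # root must exist, be black, and every path must balance in black
    return root is not None and root.color == 'B' and black_height(root) is not None
```

A black root with two red children passes; hanging a second red node under a red one trips the "red node has black children" rule.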
So, in that case, in the search phase we find the positions of all k keys and attach them as sub-packets of keys hanging from the leaves of the tree. This kind of structure is formed, and each leaf, or rather some of the leaves, contain a sub-packet of the keys. Now we will create some waves. First, for each packet we find the median of that particular packet. The median divides the packet into two sub-packets, a left sub-packet and a right sub-packet. The median element becomes a new red node, and the left and right sub-packets hang below it as its two children. Now, each new red node checks its parent: if the parent is black, then the insertion is over and no rebalancing is needed; but if it is red, then that processor will be active for this iteration. All the active processors at this stage form a new wave, and these waves will go up until they reach the root. At the same time, new packets keep being split at their medians, so new waves keep being created. Only that subtree part will be affected, right? Only the subtree part. See, once you are working on this side, the other side is not affected at all. All the active processors at this level are working, and at the next level a new wave is getting created. This side is not required. Suppose the parent of just one of them, the fourth one, is red; then that processor will be active, and at this layer all the processors which are active work. All the processors will be active? No, it depends; a processor can be active or not, and every processor checks this condition. Only the affected ones will go up. They go up within that subtree, right? No, this subtree will not be affected. They are all working in parallel. So, all the active processors become a wave.
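The median-split step that creates each wave can be sketched in a few lines of Python (names are mine, not from the talk); repeating the split on the sub-packets mirrors the successive waves that get created at the leaves.

```python
def split_packet(packet):
    # The median of a sorted packet becomes a new node; the two halves
    # become the left and right sub-packets for the next wave.
    m = len(packet) // 2
    return packet[:m], packet[m], packet[m + 1:]

def build_from_packet(packet):
    # Repeated median splits turn a sorted packet into a balanced subtree
    # of new nodes, one level per wave.  Returned as (left, key, right).
    if not packet:
        return None
    left, median, right = split_packet(packet)
    return (build_from_packet(left), median, build_from_packet(right))
```

So the packet [1, 2, 3, 4, 5] splits into ([1, 2], 3, [4, 5]), and a whole packet unfolds into a balanced subtree after O(log k) waves.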
What do you mean by an active processor? A processor is active if the parent of its node is red. When the parent is red it is active; otherwise it will not do anything. Now, if it is active, then only that subtree will be affected? Yes. See, what about a neighbouring processor? No, it won't be affected. The waves are getting created from the leaves, and the waves that are already created are moving up; there may be several such waves. And when the last wave gets created and reaches the root, it means the red-black tree is balanced. For this creation of waves and the moving up of all waves we use some local rules of insertion. For that we have defined the area of a particular node. The area of a particular node is defined like this: this is the area defined by a processor D; it includes its parent and sibling, and its grandparent and uncle, so the area defined by D consists of all these surrounding nodes. Here also we have considered two cases. If in that particular area only one processor is active, we can handle it sequentially, and if there are two processors active in this area, we treat it as the parallel case. So they are different: there are different local rules for insertion in the sequential case and in the parallel case. For the sequential case, here is an example: this was the tree and D has been inserted here. Both of these nodes were red, meaning this one is red and this one becomes black, so recoloring alone handles it. Here D has been inserted as the left child of B, so one right rotation is required: B will move up here, and then recoloring is done. These are the sequential rules, the rules for rotation and recoloring; this one here needs a double rotation and recoloring. In the parallel case, suppose D and E are two processors that are active, where one parent is red and one parent is black.
In that case we first rotate: we do a right rotation at B, so B becomes the root node here, E is attached as the left child of A, and D is here. So one rotation and recoloring are required. In the other case, if both the parents are red, no rotation is required; recoloring alone will do. We color both of them black and nothing more is required. So, this is the total flow diagram of insertion. We insert the k keys as packets, split each packet at its median, and all the active processors at a level form a wave; these waves move up while the next waves keep being created at the same time. All the waves move up, and when all the waves reach the root, the insertion is complete. The time required in the search phase is log n plus log r, where r is the number of keys to insert. The set of new leaves created by the middle keys constitutes a new wave, so at most O(log r) waves can be created, and the waves move up level by level, so each traverses at most the height of the red-black tree, log n. The waves are created in pipeline, so the total time is log n plus log r, and the speedup is r log n over (log n plus log r). For deletion, the idea is that we check, node by node, which keys we can delete in parallel. We check whether a particular node is a leaf node or not. If it is a leaf node we can delete it directly. If it is not a leaf node and has only one child, we check whether that child also wants to delete some item or not; if it does not, that processor can also run concurrently. And if the node has two children and neither child wants to delete any item from that particular node, then that node can also be processed concurrently.
So, in these three cases we make those processors active, and in that particular iteration, for all the processors which are active, we delete the node and rebalance. We repeat this whole iteration until all keys are deleted. This is proposed by me. Make it more clear; state it again. Yes. This is the analysis of it: O(log n) time for the first part. Give the diagram also, as you did in the case of insertion, and only then will all the cases be properly studied. For deletion, sir, actually no diagram is required; we are just checking whether a node is red or not. No, it is required, because I have to understand; there are four or five cases, or three or four cases, so we have to study them with a figure. Next. The total time required for deletion is c log n, where c is a constant, and the cost of this is O(r log n), because we are using r processors to delete r keys. So the speedup here is of the order of r. The next structure is the 2-3 tree. A 2-3 tree is a special kind of B+ tree such that every internal node has 2 or 3 sons. The idea in the parallelization of the 2-3 tree is similar to the red-black tree. The insertion of keys in a 2-3 tree consists of two phases, a search phase and a rebalancing phase. In the search phase every key searches for its position in the tree, and at the end of the search phase packets of keys hang from the leaves. The only difference between the red-black tree and the 2-3 tree is that in the 2-3 tree all the leaves are at the same level, so the wave created by the middle keys is a plane wave. But the leaves... The 2-3 tree is a balanced tree with all leaves at one level, while in the red-black tree the leaves were not at the same level. See, the 2-3 tree is a balanced tree, a B+ tree. What is the full definition of a 2-3 tree? It is a B+ tree, it seems, where every internal node has pointers and the keys are on the leaves. And it is k-ary, right? It is k-ary. The red-black tree is not k-ary.
It is a binary tree. It is a binary tree, yes. So that is the first difference, and then all the leaves are... At the same level. At the same level, which a red-black tree's are not. And the wave created by the middle keys is a plane wave. In each packet, the middle key splits the packet into two sub-packets, and the middle key becomes a new leaf node. The set of new leaf nodes created by the middle keys constitutes a new wave, and this wave is sent up at each iteration until it reaches the root. So here the difference is that all the leaves move level by level, meaning they stay at the same level, and in the red-black tree it was not this way. This can be seen on the next slide. Let us consider one example. There is a 2-3 tree which consists of the three keys 8, 11, and 15. This is the given 2-3 tree, and we want to insert 12 items: 1, 2, 3, 4, 5, 6, 7, 9, 10, 12, 13, and 14. In the search phase each key searches for its position, and after the search phase the packet 1, 2, 3, 4, 5, 6, 7 hangs at leaf 8, this packet hangs at 11, and this packet hangs at 15. So you have taken the numbers in such a way, like 8 and so on; how did you choose these numbers? Suppose we have inserted this into the tree with the array 1, 2, 3, 4, 6, right. What is the original tree? 8, 11, 15. And you want to insert these. Yes. Now, after that, what did you do? After the search phase, all these elements... That means you must have these elements sorted. Yes. We want to insert these, so we find the packet for each group of keys and hang that particular packet at the corresponding leaf. Now we are finding the median element of each packet and dividing each packet into two sub-packets. The median element becomes a new leaf node, and the two sub-packets hang from it. Yes. And each wave goes up by one level.
And this node becomes overloaded, because every internal node can have at most 3 children. Then this node is split further: 9 becomes the parent node, and there are two internal nodes, one for 8 and one for 11 and 13. Meanwhile the wave moves toward the root at the same time. The next wave is created by the middle keys: the middle keys are 2 and 6, and there 10, 12, and 14. These middle keys become new leaf nodes, which are attached to the parent nodes. This node and this node also become overloaded, and they are split: this node splits like this, and 4 goes up into the parent node, and the same kind of thing happens there. The wave moves up into the parent nodes, and at the same time a new wave is created. Now there are only single keys left, so 1, 3, 5, and 7 become new leaf nodes and are attached to the parent nodes. This internal node becomes overloaded and is split further: 2 becomes a parent node with the two children 1 and 3, and the same kind of thing happens here: 6 goes up into the parent node, and 5 and 7 become child nodes. And this node, yes, satisfies the properties. So these two you are not going to talk about? Yes. Then again the waves go up to the root; the wave reaches the root, and it is done. That was the second structure. Next, quadtree representation on the pyramid model. The quadtree is a different kind of data structure, used for two-dimensional images, and it is based on the principle of recursive decomposition. The quadtree is mainly used in image partitioning, image processing problems, and pattern recognition problems. This is a typical example of a quadtree: at the root we have the original image, that image is subdivided into 4 sub-images, and these 4 sub-images are again subdivided into 16 sub-images. So in the base layer we have sub-images containing only one pixel each.
So the full image is partitioned down to the level of individual pixels. We have taken a square binary image; a binary image means each pixel can have only the value 0 or 1. The decomposition is represented by a tree of degree 4, where the root node represents the full image. We have taken three kinds of nodes here: black represents the object in the image, white represents the background of the image, and gray represents the nodes where further processing is required, a mixture of the two. Next. We have taken some coding for this. At the base layer every node contains either 0 or 1, and we take the ANDing and ORing of the combination of pixels. The base layer computes the AND and OR operations and sends this code to the upper layer, so the upper layer has a 2-bit code. The upper layer then combines these 2-bit codes: on the first bit an AND operation is performed, and on the second bit an OR operation is performed. This is the result of the 2-bit ANDing and ORing. In the algorithm, each processor has three variables: the first variable stores the value of the AND operation, the second variable stores the value of the OR operation, and the third variable stores the concatenation of the AND and OR values. After initializing, each processor sends its value to its father, and at the same time we perform the AND operation and the OR operation; whatever the results of the AND and OR operations are, it concatenates them and places the result in variable 3, and then sends the value of this variable to the upper layer. These two steps are repeated until every layer, up to the top layer, has been processed. The result, the quadtree representation, is then in variable 3 of each processor. So this is the quadtree construction on the pyramid model: we have an extra variable which holds the quadtree representation of this particular image.
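The bottom-up AND/OR coding can be simulated level by level. A hedged Python sketch (the function name and the list-of-lists layout are mine; the talk's per-processor variables 1 and 2 correspond to the two components of each (AND, OR) pair): a code of (1, 1) marks an all-black block, (0, 0) an all-white block, and (0, 1) a gray block that must be subdivided further.

```python
def pyramid_codes(image):
    """Bottom-up (AND, OR) codes on a 2^k x 2^k binary image, one pyramid
    level at a time.  Returns the list of levels, base layer first."""
    level = [[(px, px) for px in row] for row in image]   # base: code = pixel
    levels = [level]
    while len(level) > 1:
        half = len(level) // 2
        nxt = []
        for i in range(half):
            row = []
            for j in range(half):
                quad = [level[2*i][2*j], level[2*i][2*j+1],
                        level[2*i+1][2*j], level[2*i+1][2*j+1]]
                a = min(c[0] for c in quad)   # AND of the children's AND bits
                o = max(c[1] for c in quad)   # OR of the children's OR bits
                row.append((a, o))
            nxt.append(row)
        level = nxt
        levels.append(level)
    return levels
```

On a 4x4 image whose top-right quadrant is white and the rest black, the middle level reads black/white/black/black and the root is gray, exactly the three node kinds described above.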
For the analysis: we perform three operations, the AND operation, the OR operation, and the vertical communication between layers. The time required is the sum of the AND operation, the OR operation, and this vertical communication between the layers, multiplied by the number of layers in the pyramid model. So, this is the conclusion: we studied the different kinds of quadtree structures. I have a few comments. One is that, the way you have written it, you do the red-black tree and then the 2-3 tree and so on, right, and in both cases you are doing it through wave generation. Now, can you write it a different way: discuss the wave generation first, and then, as examples, show the use of your wave concept on the red-black tree and the 2-3 tree. Not clear? The wave, as you define it, is your paradigm. Yes. Now, use this paradigm to solve the problem on the red-black tree and also on the 2-3 tree. Is it clear? Yes. See, your paradigm is that; the wave was very hidden. So you will discuss that first. Right. Then use this wave idea for your red-black tree and 2-3 tree. I mean the red-black tree, and then similarly the others. So instead of doing it that way, first discuss what a wave is and the related ideas, so that we can apply it to other hierarchical structures. It will then look like this: the generic idea is discussed first, and then as examples you consider the red-black tree and the 2-3 tree. That gives more weight to your write-up. In that case also, our presentation was different, but the wave idea is the same. So I am telling you: while writing the report, write it that way. The other comment is that you have already mentioned that you have not done anything on octrees. The octree, sir. Okay. And once you say what tree you are representing, you should say a little about the operations. See, if I represent something, it does not mean anything until and unless I can perform the operations; in the data structures section at least you should tell how to insert and how to delete.
These are the minimum operations you have to describe; otherwise the data structures section does not have much meaning. Sir, the quadtree construction can be used for other things, image operations. Yes. So take one operation. Pattern recognition. One operation; say, in image processing, whichever is a simple operation, say a histogram or some such operation, and show how it can be performed; image processing also has some basic operations, so take one operation and show that it can be performed in that way. That closes the loop; that is the other way the data structures section gets its meaning. Actually, the quadtree can be used for that. Then think about it. Okay. Welcome, everybody. We are here to discuss some parallel algorithms for a very classical and popular problem in computational geometry, the Voronoi diagram. I am Risha, and my friend Tharun and I will present this topic. First of all, let us see what we have in the slides. First we will explain what the Voronoi diagram is, its properties, and some definitions; we will discuss some of the properties that form the basis of our own algorithm. Then we will discuss the best sequential algorithm known for constructing Voronoi diagrams, and then the best existing parallel algorithm. Then I will discuss our approach and how it can help in various other applications and problems of computational geometry. Then, as is obvious, I will discuss the complexity, and I will conclude with the final slides. So let us start: what is a Voronoi diagram? Basically, a Voronoi diagram is this kind of diagram. What happens is that we have a Euclidean plane with a set of points, the sites, in it. We have to partition the whole plane by assigning every point in the plane to the nearest site in P, which means that a point here is closest to this site and not to any other site. We partition the whole plane according to that.
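The "assign every point to its nearest site" rule is the whole definition, so it fits in one line of Python. A minimal sketch (the function name is mine, not from the talk); ties are broken by site order, and squared distances avoid the square root.

```python
def nearest_site(q, sites):
    """Index of the site nearest to query point q: the defining rule of a
    Voronoi cell (ties broken by site order)."""
    return min(range(len(sites)),
               key=lambda i: (sites[i][0] - q[0])**2 + (sites[i][1] - q[1])**2)
```

With sites at (0, 0) and (10, 0), a point at (1, 1) lies in the first site's cell and a point at (9, 0) in the second's; the cell boundary is the perpendicular bisector x = 5.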
To visualize this in a more concrete scenario, consider moving along a road with a number of noise sources along it. If we want to see which noise source we will hear while walking, we can draw the dividing lines, and in each region we can see that here we will hear this noise source, and there that noise source. That is the kind of intuition one can get for Voronoi diagrams. Coming to the definitions. First of all, what is a Voronoi cell? A Voronoi cell is this kind of region: every location within it is closer to the site around which that cell is drawn than to any other site. Then we have the Voronoi edges: the line separating two Voronoi cells is a Voronoi edge. And the Voronoi points: a Voronoi vertex is the point where three or more Voronoi cells meet. Now, moving to some properties. The first is a quite condensed property: a point q is a Voronoi vertex if and only if the largest empty circle centered at q contains at least three sites on its boundary. Basically, it means that if we draw the largest circle with the Voronoi vertex as its center that is empty of sites, it will touch at least three sites, and the second point is that it does not contain any other site in its interior. One more good conclusion is about how a Voronoi vertex is formed: generally we get a point at the intersection of two lines, but in the case of a Voronoi vertex we get it as the intersection of three or more lines. That is the conclusion from this property, and it will become very obvious when we show our own algorithm; then everybody can visualize it quite easily. Then we have one more property: the number of edges in the Voronoi diagram can be at most 3n - 6.
Here n is the number of sites, and the number of vertices can be at most 2n - 5. Where do we get these formulas from? Basically we use Euler's formula, which applies to closed planar figures. To make the Voronoi diagram a closed figure, that is, to have closed faces, we assume we have a point at infinity, so that all the unbounded edges, which did not have a point to connect to, are assumed to meet at this point at infinity. Now we can use Euler's formula. Initially we assumed the number of vertices in the Voronoi diagram to be v; with this extra point it is v + 1, the number of edges is e, and for the faces, each face corresponds to one Voronoi site, so the number of faces is n. One equation comes from here. A second relation comes from the fact that each edge has degree two, that is, two endpoints, while each vertex has degree at least three, since it connects at least three Voronoi edges. That leads to the two inequalities; solving them together with Euler's formula gives the two bounds. Then, coming to the best known sequential algorithm: what is Fortune's algorithm? Basically it has the concept of a sweep line. The intent is that behind the sweep line we try to have the Voronoi diagram already constructed, while on the other side of the sweep line we do not yet have a Voronoi diagram. But there is a conceptual difficulty in that. What is the difficulty? If we claim a complete Voronoi diagram behind the sweep line, then any site appearing on the other side can definitely still affect the Voronoi diagram near the line, so we actually cannot have a complete, final Voronoi diagram close to the sweep line.
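As an aside, the counting argument for the bounds above can be written out in full. With v Voronoi vertices, e edges, n sites (one bounded face per site), and the added vertex at infinity, Euler's formula gives

```latex
(v + 1) - e + n = 2 \quad\Longrightarrow\quad e = v + n - 1 .
```

Each edge contributes two endpoints while every vertex, including the one at infinity, has degree at least 3, so

```latex
2e \ge 3(v + 1)
\;\Longrightarrow\; 2(v + n - 1) \ge 3v + 3
\;\Longrightarrow\; v \le 2n - 5,
\qquad e = v + n - 1 \le 3n - 6 .
```

These are exactly the two bounds stated on the slide.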
So what did Fortune do? He did not use straight boundary lines; he used parabolic arcs. Why the concept of a parabola? A parabola is the locus of points equidistant from a site (the focus) and a whole line (the directrix, here the sweep line). So conceptually, if a site sits at this point, the arc traced out is equidistant from that site and from the line, and as the sweep line advances, the intersections of adjacent arcs trace out exactly the Voronoi edges. That was the basic concept. As the sweep line moves on, the parabolas move on as well, and this outer curve, the sequence of arcs closest to the sweep line, is called the beach line; as we keep expanding, the finished Voronoi edges are left behind it. Yes, this blue part is the final Voronoi diagram; the beach line is just the outside curve, still in the phase of expansion, and ultimately, when it has expanded, it too settles into final Voronoi edges. We did not add all the theory, so that people will follow this. Now I was discussing the complexity of Fortune's algorithm. First of all it sorts all n sites along the x-axis, which takes O(n log n) time. Afterwards, as the sweep line advances, each site is put into the event heap according to the event that will happen; since we have n sites, that is on the order of n events, each costing O(log n). So the total algorithm takes O(n log n); the complexity of Fortune's algorithm is O(n log n).
That is the best sequential algorithm. Now we come to the optimal parallel randomized algorithm, which will be discussed by Tharun. In the next few slides we will be talking about the best known parallel algorithm for constructing the Voronoi diagram, given by Raj Shrikan and Sunita Ramaswamy. The time complexity of this algorithm is O(log n), and the number of processors used is of order n, so the cost of the algorithm is O(n log n). The previously best known deterministic parallel algorithm for this problem was given by V and Z, with complexity O(log^2 n), so this algorithm obtains a speedup of log n over the previous best; but it is a randomized algorithm. The basic idea is that they use the technique of random sampling to divide the main problem into smaller subproblems, solve these subproblems in parallel, and then obtain the final result by merging them. This is the brief outline of the algorithm, which I will explain now. Initially we are given the n sites as a set S: s1, s2, s3 and so on. From these sites we select a small random sample R. They say this random sample R is of size n^epsilon, where epsilon is some factor with 0 < epsilon < 1. During preprocessing they choose this epsilon such that the algorithm produces the correct output with very high probability; so a little preprocessing is involved in selecting the random sample and choosing epsilon for an efficient output. After getting the random sample R, they create the Voronoi diagram of R and use this Vor(R) to divide the original problem into subproblems which will be solved in parallel.
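The sampling step can be sketched as follows: pick a random sample R of size about n^epsilon and bucket every site by its nearest sample point, i.e. by the cell of Vor(R) it falls into. This is a hedged sketch; the naive nearest-point search below stands in for the efficient randomized search of the actual algorithm, and the parameter names are ours:

```python
import math, random

def partition_by_sample(sites, eps=0.5, seed=1):
    """Bucket sites by the nearest member of a random sample R of size ~n**eps."""
    rng = random.Random(seed)
    r = max(1, round(len(sites) ** eps))
    sample = rng.sample(sites, r)               # the random sample R
    buckets = {s: [] for s in sample}
    for p in sites:
        # p belongs to the cell of Vor(R) owned by its nearest sample point
        owner = min(sample, key=lambda s: math.dist(p, s))
        buckets[owner].append(p)
    return buckets

sites = [(float(x), float(y)) for x in range(10) for y in range(10)]
buckets = partition_by_sample(sites)            # 100 sites, sample of ~10
assert sum(len(b) for b in buckets.values()) == len(sites)
```

Each bucket then becomes one subproblem, solved recursively in parallel.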
Now, after getting Vor(R), we need to extract these smaller subproblems. They use some efficient randomized searching techniques to obtain the subproblems, and then the input points must be associated with the subproblem each belongs to. So after processing Vor(R) to associate the input points with their subproblems, the next step is to recursively compute the Voronoi diagram of each subproblem, and finally to obtain the overall Voronoi diagram by merging the recursively computed diagrams. The time spent in each of these steps is of order log n. So the final recurrence we get is T(n) = T(n^(1 - epsilon)) + O(log n), where n^(1 - epsilon) is the size of a subproblem, and the final time complexity that comes out is O(log n). So this randomized algorithm obtains a speedup of log n over the deterministic one using the same order of processors, O(n). After this, we will present our own algorithm. So now it is time for our own algorithm: the soap bubble algorithm. Basically, I was browsing a blog when I saw soap bubble paintings, and that gave me the idea: why not take soap bubbles as our Voronoi sites? So what happens? A soap bubble has a very nice property of expansion: when two soap bubbles meet, each pushes against the other and the interface settles at the equidistant boundary. That kind of behavior can be compared to a Voronoi edge. So what we do is assume each Voronoi site is a soap bubble, inflate every bubble equally so that all expand in an equal fashion, and as the bubbles expand and touch each other, the contact forms the Voronoi edge, which in two dimensions is a line.
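The recurrence mentioned above settles at O(log n) because each level of recursion shrinks log n by the factor (1 - epsilon), so the per-level costs form a geometric series:

```latex
% One level of recursion shrinks the exponent of the problem size:
T(n) = T\!\left(n^{1-\varepsilon}\right) + c\log n .
% At depth i the problem size is n^{(1-\varepsilon)^i}, so the level cost is
c\,(1-\varepsilon)^{i} \log n ,
% and summing the geometric series over all levels:
T(n) \le c\log n \sum_{i\ge 0} (1-\varepsilon)^{i}
     = \frac{c}{\varepsilon}\,\log n = O(\log n).
```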
This can be conceptualized here: where two bubbles meet we get a line, and where three or more such lines meet they form a Voronoi vertex. This is the property I was referring to earlier. What will happen? As two bubbles expand, their contact line keeps growing until a third bubble stops it from expanding in that direction. To get a picture: the third bubble touches this one and this one, so we have one line here and one line here, and the expansion stops only when all three meet; that meeting point forms a Voronoi vertex. When this happens with more than three circles, more than three cells meet at the vertex. Basically, this concept can be used not only for Voronoi diagrams; we have conceptualized more problems it can solve. The first one is a little bit obvious: Voronoi diagrams in three dimensions. We simply replace the circles with spheres and the lines with planes: where two spheres meet, they form a plane; where three meet, they form a Voronoi edge; and when four or more meet, they may form a Voronoi vertex. Now comes the more important part. Voronoi diagrams are used wherever space needs to be partitioned into spheres of influence: each site is associated with a sphere of influence, the region within which we can feel its effect. So what if one site is more powerful than another? The idea came from the fact that while moving on the road, I watched one source that was making noise, but I was not able to hear it; instead I was hearing some other noise made by a more distant source which was more powerful. That kind of situation is not covered by ordinary Voronoi diagrams, since they assume all sites have equal power; the ordinary Voronoi diagram is a special case of this weighted problem. And that problem can be solved by our conceptualization. Basically, how? Think of a person moving between two noise sources who hears only the stronger one. More importantly, if we have a self-guided missile that can locate its target, then at the army base we have to study which target it will go and hit, and that requires an understanding of which target is more influential from the missile's point of view; that kind of question can be answered with this problem. What I was referring to is: these are two sites, and this one is more powerful, so the boundary between them will not be the perpendicular bisector at the midpoint of both of them; it will be shifted according to the power. So will Fortune's algorithm work for this? The answer is no. What happens: Fortune's parabola is by construction equidistant from the site and the sweep line. But if we have a site here which is more powerful than this one, then the boundary cannot be the equidistant line; it has to lie somewhere closer to the weaker site, since the other is more powerful, and that cannot be produced by this construction; the conceptualization of the parabola would have to change, and in that case Fortune fails. But our method needs only a simple modification: when we pump air into each soap bubble, we do the pumping in accordance with the sphere of influence, so that one bubble expands faster than the other, and then we proceed as normal; the resulting diagram solves our problem where one sphere of influence is larger than another. Now, coming to the soap bubble algorithm in detail. We are using the CREW PRAM model, which is a shared-memory, concurrent-read exclusive-write parallel random access machine model. We assign one processor to each site and increase the size of a circle at each site; Voronoi edges will start at the midpoint between two sites. So the first question that popped up was: at what point will two circles meet, and how will we
calculate that? Basically, the first touch happens exactly at the midpoint, and the contact line is the perpendicular bisector of the two sites: from the two points we know the midpoint and the perpendicular bisector, so we have the line equation, which can be found by a linear computation. These contact lines are then used further for finding Voronoi vertices. Each site stores which sites its circle has started forming edges with; if it is forming edges with two other circles and those circles also touch each other, the three form a cycle, and a cycle of mutually touching circles yields a Voronoi vertex. These two points will be explained on this slide. Initially we have a scenario in which 1 was touching 3, 2 was touching 3, and now 2 has touched 1. This information sits in each site, basically with each processor: 2 informs 1 that it touches 3; 1 then finds that it already contains 3 in its own list, so it detects a cycle, and we know that a cycle results in a Voronoi vertex. In this scenario we do not want to run the calculation for every point, so in this way we are cutting down calculation cost. So when all the circles in a cycle touch, we have a Voronoi vertex. Now it is time for the complexity. The circle expansion runs until each circle collides with the others; a circle is limited by the other circles. So the expansion goes on up to the longest radius in the Voronoi diagram: until the circle having the longest radius has finished expanding, we cannot have the final Voronoi diagram. So basically the worst-case time is the longest radius. But finding the longest radius is in itself not trivial. So what we did, to find the complexity of the whole scenario, is estimate the longest radius to be of order AB/N, where A and B are the dimensions of the plane. How did this come about? On an average, the area associated with each site is the total area divided by the number of sites, AB/N. If we consider that region to be x by 1 unit, then x = AB/N, so the maximum extent in one dimension is AB/N; and we are interested in the longest radius, which would be the diagonal of this region, sqrt(x^2 + 1), which is of the order of AB/N. That is where the AB/N bound comes from. So this is the time complexity, and since we are using N processors, the cost comes out to be O(AB). But here is the catch: this AB is not a constant, so we have not achieved something entirely new. What happens is that the distribution of points depends on this area. If the distribution is favorable, the cost will be of order N, which is better than the sequential algorithm, I should say better than the existing parallel algorithm; and if the distribution satisfies N log N = k * AB, where k is a constant, then in this limiting case the cost comes out to be O(N log N). So now, to conclude our slides: we studied the best sequential algorithm and the best parallel algorithm, we conceptualized and analyzed a new approach for solving the problem, and the approach can provide solutions in other domains as well. So here I conclude our presentation; thanks, and any questions? One question: I do not exactly understand how you claim the time complexity is of the order of the longest radius. Say processor 1 knows nothing about the other processors; you must have some method of getting information from the other processors. The answer: what happens is that each circle will always keep expanding, but the other circles, on touching it, limit it from expanding further; it expands only up to that point and cannot expand more, so the time is of the order of the longest
radius, once all the circles have expanded; the limiting case of all the expansion is the longest radius. But how will you find out which circles are touching, and when they touch? When they touch, we stop. Actually, I tried to make my slides a bit easier; if you want to go into the complexity I can discuss it, and we cover it properly in the report. Basically I have conceptualized that each site will have a trie structure. The follow-up question was this: you say the bound is the order of the longest radius because that is the maximum expansion, but after each step, say you increase by one unit, you then have to check whether something is touching. That checking should be done in constant time; only then can the time complexity be the order of the longest radius. If it is not constant time, we cannot claim that bound, and it is not obvious how to do it in constant time, because with n processors, how will processor 1 know which other processors it is touching? That seems to depend on the number of processors. For that: the touching of two circles will be found at the midpoint of the two sites, so we do not have to search over every processor. Initially I make a plan. Each processor knows its own site, since each processor is distributed one site, and finally each processor will end up with a polygon in two dimensions. In the first step, each processor has the x and y coordinates of its site. We have a shared-memory array whose length is the length of the x-axis of the whole plane; each processor goes into this array and marks where its x coordinate is. And since we have an understanding of the distribution beforehand, we know approximately what the longest radius can be,
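The constant-time contact computation discussed in this exchange, namely that two equally inflated bubbles first touch at the midpoint of their sites and that the contact line is the perpendicular bisector, is a closed-form formula per pair of sites. A small sketch (function name is ours):

```python
def contact_line(p, q):
    """Midpoint and perpendicular bisector (as a*x + b*y = c) of sites p, q.

    Two circles grown at the same rate from p and q first touch at the
    midpoint, and their contact line is the perpendicular bisector.
    """
    (px, py), (qx, qy) = p, q
    mid = ((px + qx) / 2, (py + qy) / 2)
    a, b = qx - px, qy - py          # bisector normal = direction p -> q
    c = a * mid[0] + b * mid[1]
    return mid, (a, b, c)

mid, (a, b, c) = contact_line((0.0, 0.0), (4.0, 2.0))
# every point (x, y) with a*x + b*y == c is equidistant from the two sites
```

Each processor can evaluate this in O(1) for a neighboring site, which is what the constant-time claim rests on.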
so for this site, this is the longest radius, and all the sites that fall within it are put into a time stack for that site, a timeline of length of the order of r, since the expansion can proceed only up to r. At each expansion step the marker moves along this timeline; as it arrives here, the processor knows it will touch 3, and similarly 3 knows it will touch 1, so at that point, in constant time, we get a line here. Now, each processor also keeps a trie-like structure over which sites it has touched. What happens in the trie: one site's node has in total n - 1 branches, and only the branch of a touched site is highlighted, with a Boolean, say; so from site 1's trie I will know that 1 has touched 2, 1 has touched 3, and so on, and similarly site 2 has such a structure. So in this example, what happens: 1 touches 3, so 3 is highlighted in 1's trie, while 2 is not yet highlighted. Then 2 touches 3, and then 2 touches 1, so 2 tells 1; 1 consults the node for 3 in its trie, like saying, this is what I have got, have you found this circle? It finds 3 already highlighted in its node, so it says, we have found the cycle, and in that way it communicates to all the others that the cycle is found. This happens in constant time, so the whole detection can be done in constant time. But then 2 has to inform all the touching circles that it has touched 3, right? We have shared memory; through shared memory we can inform each processor. But in the worst case, if you are using shared memory, does each processor have to look for each other processor in the shared memory? No, it knows only these; and how many does it have? OK,
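The trie-based bookkeeping can be sketched with per-site touch sets: when circle a reports touching circle b, a constant-time membership test against each common neighbour reveals whether a mutual triple, and hence a Voronoi vertex, has formed. This is an illustrative sketch, not the report's exact data structure:

```python
def record_touch(touches, a, b):
    """Register that circles a and b touched; return any 3-cycles found.

    touches maps each circle id to the set of circles it already touches.
    A common member c of both sets means a, b, c all touch pairwise.
    """
    cycles = [tuple(sorted((a, b, c))) for c in touches[a] & touches[b]]
    touches[a].add(b)
    touches[b].add(a)
    return cycles  # each triple of mutually touching circles = Voronoi vertex

touches = {1: set(), 2: set(), 3: set()}
record_touch(touches, 1, 3)          # 1 was touching 3
record_touch(touches, 2, 3)          # 2 was touching 3
print(record_touch(touches, 2, 1))   # now 2 touches 1 -> [(1, 2, 3)]
```

The set intersection plays the role of the highlighted trie branches in the talk.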
suppose around circle 2 there are another ten circles, and beyond them is circle number 3; now 2 has to inform all ten circles. Yes, you have to inform all of them. Of course, not all the processors are reading from shared memory at once, but that communication happens in constant time regardless of the number of processors: we are using the PRAM model, a parallel random access machine, in which one processor can communicate with another in constant time. Communication is constant time, yes, but the number of communications required is ten for circle 2, so the order is the number of circles, not constant. Well, this is exactly why the whole timeline structure was made: a circle knows in advance whom it will touch, so communication happens only within the range r; we are expanding only up to r, never beyond, so whatever we do is restricted to the order of r. This can also be done in another way. How? One could keep several copies of the structure, one for each processor, but I think that can be avoided; we will see, otherwise we will think about it. The other issue is the cycle-found step. The problem with cycle-found is that while you are expanding, at any instant a fourth circle can also arrive: suppose a fourth circle is also growing, and all of them are simultaneously knocking on 3 to check. Then are you only finding cycles of three? No, that communication will still happen, so not only three: whatever it connects to, all of those have to agree. Actually, in the report I have mentioned everything, but here I have simplified, because otherwise nobody would understand; the complex case is there, and when you read it you will understand. One more remark from the audience: you have given a very big title, so in the report you must properly situate the work within the classes of parallel algorithms, maybe sorting, maybe numerical, semi-numerical, non-numerical, something like that, and you must mention the proper references; these algorithms are written under different classes, more semi-numerical and non-numerical than numerical. Good morning everyone, this is Deepak. We have included some of the algorithms that were covered in class, and some extra ones which were not. First of all, this is the table of contents: first we will give a brief overview of PVM, then some sum algorithms and the sorting algorithms, which were covered in class, and the transpose also; then there are three others, finding the square root, N-queens, and the travelling salesperson problem. We would like to begin with PVM. PVM enables a collection of heterogeneous computer systems to be viewed as a single parallel virtual machine; it means that programs running in the PVM environment need not worry about the architecture on which they are running. PVM transparently handles all the message routing, data conversion and task scheduling across a network of possibly incompatible computer architectures. This is the computation model that PVM uses: the input is partitioned, based on the function of the task or by partitioning the data, across the number of computers available; these computers perform the assigned tasks and send the results to a node called the master, and the master
node displays the output. These are some of the PVM routines that we used frequently in our programming. pvm_mytid registers the new task and gets its unique id in the PVM environment; pvm_exit removes the task from PVM; pvm_spawn creates the n tasks given in its ntask argument, starting the named executable; pvm_kill kills a task; pvm_parent gives a child the task id of its parent; pvm_initsend initializes the send buffer; pvm_send sends to the destination task with a given message tag; pvm_recv receives a message with a given tag; pvm_mcast multicasts to the n tasks whose ids are given. Other than these, we have functions to pack and unpack the different data types; for integers, floats and so on there are separate functions for each. We also have the concept of a group, where you can assign a few tasks to a particular group and then access them through the group id, which is a sequential id starting from 0. The task ids are unique, but they are not necessarily sequential, and we are not sure about the first number; so if we assign tasks to a group, we know that the group ids will start from 0 and go up to the number of elements the group contains. There are functions to join a group, to get the ids within it, and to broadcast to a whole group just by giving the name of the group. We start with the sum algorithms. The first is the same as was covered in class: we have n elements and p processors; we divide the elements into p groups such that the size of each group is n/p, and send them to the different processors; each processor adds its n/p numbers, then sends the partial sum back. What happens with this merging step: you have processors p1, p2, p3, p4, p5 and so on; p1 sends to p0, p3 to p2, p5 to p4, and so on; then in the next stage the survivors combine again, and so on, so finally you get the result in processor p0, which is actually the master processor, and it displays the result. A question on that: are you using the master processor to spawn
the summation tasks in a strictly master-slave fashion, or is the master also computing, like balanced computation? The master does work too: it spawns the slaves, and it also computes its own share of the numbers, so it is also a slave to itself. Then the sum on a linear array. The difference between the two is that in PVM we can send data directly from one processor to another if we know the id of the processor we have to send to, while on a linear array we only have connections between neighbors. So this implementation follows the same approach, but when the data has to reach p0, it goes through p3, then p2, then p1, and then p0. How are you implementing the receive, non-blocking or blocking? Do you count the number of receives each processor will have? It works like this: p0 receives from p1, p2 from p3, p4 from p5, and all odd-numbered processors have to send only once, their own data; so all the odd-numbered ones send first, and then you are only left with the even-numbered processors, which in turn alternate, sending their data to the previous surviving processor. So what I do is maintain a variable and keep multiplying it by 2 every round, so each processor knows what it has to receive and to whom it has to send. The 2D mesh version is similar; it just uses the two dimensions of the same process: it first collapses the columns and then the first row. Then we move on to sorting. Enumeration sort can take the input in any order; it uses n^2 processors for sorting. What it does is create a mesh-like structure, but the connections along each row and column form a tree, something of this sort, so you can send data from p0 to p1 and p2, and from p2 on to p3 and p6; this way the data is distributed. What we do first of all is give the input values x1, x2, x3 to all of the processors along one dimension, and the same values x1, x2, x3 along the other, storing them in two different
variables, A and B. Then the first row and the first column broadcast both numbers to the other processors; all of them compute a partial rank by comparing the values of A and B; the ranks are summed up along the first column; and after we get the rank of each element, the element moves to its ranked position. The rank over here is 0, so it moves from here to here, stays in the same place over here, then goes up and reaches over there. This is the same enumeration sort that we covered in class, so I will go through it quickly. Then bitonic sort: we have a bitonic sequence, we divide it into two bitonic sequences, and with a bitonic merge we merge the sequences, and so on. With n processors the time taken is of order log^2 n, so the overall cost is of order n log^2 n. It is the same with the odd-even merge sort: it also divides the sorted sequences into two and merges them, taking the same number of processors and the same time. Then the transpose of a matrix. Here the model we use is the MCC, the mesh-connected computer. We initialize a matrix of size n x n and create n x n tasks; actually the number of processors available is less than the number of tasks we have created, so one processor does the work of more than one task. We transfer one element to each process and store that element in a register called A, and in each step we perform a rotate operation on the element stored in register A; a pvm_barrier ensures that all processes get synchronized at that point. After that, all processes again perform a rotate-left on the element stored in register B; if the row index equals the column index, we swap the elements, and again pvm_barrier is used so that all processes get synchronized. This loop continues n times; after n iterations, each process will contain the element of the transpose, and
at the end, each process sends its result to the master process. What is the input, and where is it available? The input is a matrix, and it resides in the master process; the master process creates the n x n slaves. But if everything is available in the master process, then why do all of this? The master process could do the transpose itself. Because of the time: for a 1000 x 1000 matrix on a single process, the time is about n^2; that is the point. But you have to send the data from the master to the thousand processes, and that will depend on how the master distributes the data: the master has to pump 1000 x 1000 elements. Is it only one datum per processor? The master is connected to all of them, and sending one element to one processor takes only constant time, so it is a broadcast. But write it down: there are n^2 elements, and even counting broadcast time, the distribution alone will be of order n^2, not of order n. So if you are ready to spend order n^2 just pumping the data, you could transpose the matrix at the master and hand it out. So what if the data sits in the different processes from the start? That is the right initial state: assuming all the elements are lying in the master process makes the problem unnecessarily complex and you gain nothing; in reality you have to
assume first that a sub-block of the matrix is residing in each process, and then you make the tasks. But how did the sub-blocks get there? From some pre-processing; and if you estimate the time, the broadcasting then need not be included in the main algorithm. The next algorithm is finding the square root of a number using the bisection method. Let X be the number whose square root is to be taken, and consider f(x) = x^2 - X. In the initial state we initialize start and end with 0 and X respectively. We divide the interval from start to end into p equal subintervals, where p is the number of processors that are available, with endpoints a_i and a_{i+1}; each processor checks its subinterval, and we find the one where f(a_i) < 0 and f(a_{i+1}) > 0. We then compute the new interval appropriately using a_i and a_{i+1}, reassign start and end, and repeat with the new start and end until we reach the desired accuracy. The next problem that we implemented is the N-queens problem. The problem involves placing n queens on an n x n chessboard so that no queen can attack any other queen. We have tried to find all possible solutions, and the solutions are calculated by the backtracking method; the complexity of backtracking rises exponentially with the size n. Here is the working of the sequential algorithm: to find the position of the queen in column k, we find a row such that it is not attacked by any of the previous k - 1 queens which are already placed. If it is not possible to find such a position for the column-k queen, then we backtrack and find the next safe position for the queen in column k - 1. So here we place the first queen in the first row, first column; we try to place the second queen so that it is not attacked by the first queen, and the safe position for the second queen is the third row; similarly we proceed for the third queen, whose safe position is the fifth row, third column.
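The column-by-column backtracking just described can be sketched as follows; placed[i] records the row of the queen in column i, and a candidate row for the next column is rejected if any earlier queen attacks it:

```python
def count_solutions(n, placed=()):
    """Count n-queens solutions by backtracking, one column at a time.

    placed[i] is the row of the queen in column i; a new queen in column
    k = len(placed) must avoid every earlier row and both diagonals.
    """
    k = len(placed)
    if k == n:
        return 1
    total = 0
    for row in range(n):
        if all(row != r and abs(row - r) != k - i
               for i, r in enumerate(placed)):
            total += count_solutions(n, placed + (row,))
    return total

print(count_solutions(8))  # -> 92, the classic count
```

The parallelization point made on the next slide is visible here: the calls count_solutions(n, (row,)) for different starting rows are completely independent, so each can go to a different processor.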
Similarly we proceed for the rest. This is the parallel algorithm. Here we note that all the solutions obtained by placing the first queen at the first row, first column, at the first row, second column, and likewise at the first row of each column, are independent of each other; all these can be computed in parallel. So initially we have n tasks. At this stage we gather tasks until their number is at least the number of processors, so that each processor gets one task and computes all possible solutions of that task. To gather tasks we generate the initial configurations level by level; here i and one more variable come into the picture. Initially we have, say, 4 tasks for this board, but suppose the number of processors we have is more than 4; then some of the processors would remain idle. For that case we expand a task: suppose for this first-queen task we go to the second row as well, and the possible positions of the second-level queen are these, so from them we gather 3 further tasks; we do the same at those 3 positions, and likewise we continue until we get p tasks. The number of levels that we have to traverse depends on the number of processors we are working with. There is an array in which we store all the tasks, and now our job is only to parallelize this loop: each individual processor gets one task and computes all possible solutions of that task. So when we parallelize this loop, we call the queens routine with that level and task[x] as its initial values, and it calculates all solutions from there. Now the analysis of the algorithm. The complexity of the sequential algorithm is exponential in n. For the parallel algorithm, in stage one we compute all the tasks that are needed, that is, p tasks. At the
first level we have n tasks; at each subsequent level the number of tasks increases by a factor of order n, and at each level the number of processors available is strictly greater than the number of tasks gathered so far, so we can split the task-gathering work across all the processors available at that level. The time required for this stage is proportional to the number of levels, the number of levels is a function of n and p, and the number of levels needed to gather p tasks comes out to be of the order of log p / log n. The complexity of the second stage is reduced by a factor of p, because we assign each task to one of the available processors and there is no communication overhead; each processor computes all the solutions possible for its task. So the total complexity is of the order of n to the power n divided by p, plus the log p / log n gathering stage. We cannot reduce the complexity of the problem itself, but the execution time can be reduced. Next, symmetry along the vertical axis: all solutions obtained by placing the first queen at row 1, column 1 and all solutions obtained by placing it at the mirrored column position are mirror images of each other, and similarly for the other mirrored pairs of columns, so half of the computations can be avoided. With horizontal-axis symmetry, all solutions obtained by placing the queen here and here are exact mirror images, so again the computation can be avoided for the remaining half. Here are some special cases where, for particular n, a solution can be found easily: for example, via the equation y = ax + b (mod n) with a not equal to 1 — if a were 1 we would get all queens placed on a diagonal, so we require a not equal to 1. For n = 7, if we put suitable values of a, say 4 or 5, and b = 0 to 6, we get valid solutions directly. So the execution time of
these particular cases can be reduced. And when n is a composite number, the product of p and q, then we can directly combine the solutions of the p-queens problem and the q-queens problem and obtain solutions for the n-queens problem. For example, for n = 35, with p = 5 and q = 7, we compute the solutions of the p-queens problem, we compute the solutions of the q-queens problem, and we just take their product; the total number of solutions generated this way is the product of the two counts. We executed this on the parallel machine, and here are the results we obtained. As the number of queens increases, you can see that the time required grows exponentially; if we add a few more queens, the time runs into hours. The gain that we obtained is almost linear for n = 15, because the other overhead factors are nullified and the problem is well parallelized; for n = 15, as you can see in the last column, the gain is almost linear. The next problem that we implemented is the traveling salesman problem. The definition of the problem: for a graph G = (V, E), a tour of G is a simple directed cycle that includes every vertex in V exactly once, and we want to find a tour of minimum cost. This is the sequential algorithm: without loss of generality we assume that our tour starts at vertex 1 and ends at vertex 1, we find all permutations of the remaining n minus 1 vertices, and we pick the permutation which gives the minimum cost. This permutation function computes the permutations starting from index 2, and n minus 1 is the number of elements whose permutations are to be taken. Here is the parallel algorithm: in every tour the first and last vertex is fixed, that is, vertex 1, so from the remaining n minus 1 vertices we can generate n minus 1 tasks. Suppose we have 4 vertices, 1, 2, 3, 4; then the tasks we can gather are
2, 3, and 4. The tours starting from 2, 3 and 4 can be evaluated independently, so each processor gets one of these tasks, performs the permutations, and sends back the minimum value. Here we cannot exploit symmetry, because we assume that the graph is directed. Here nc is the number of cities and n is the number of processors. When nc = 14 you can see that the time is growing exponentially and becomes very large; these figures are from the same machine. This is the slide for the gain obtained, where gain is the time for the sequential execution divided by the time for the parallel execution; for nc = 14, again it is an almost linear gain that we obtained. Here is the second algorithm for the traveling salesman problem: in each iteration we explore the node which has the minimum-cost path from the start. Suppose we have 4 vertices again; from the start node we have n minus 1 paths, so we compute the costs from 1 to 2, 1 to 3 and 1 to 4, and at the next level we explore the node which has the minimum cost. Suppose 3 has the minimum cost; then we explore only 3 to 2 and 3 to 4. At the third level we again compare all these costs and explore the next node. This exploration of nodes can be done in parallel, but here there is a problem: the number of nodes we are exploring goes on decreasing at each level, so some of the processors may remain idle, and the task each processor gets is very small, so the communication overhead dominates both. The conclusion: implementing parallelism gives good insight into how parallelism works, and there are many practical issues, like the number of processors actually available for the computation; implementing any parallel algorithm teaches you what you really get. Now for the demonstration: this is the n-queens implementation. The number of queens I am entering is 13. I had already added the 6
processors to the virtual machine. As for the parallel execution time, we are not separating out the sequential work shown on the screen. The time required for the sequential implementation is 17. What do you mean by the number of possible solutions? The number of solutions: in the n-queens problem there are many different ways the n queens can be placed on the board, and the problem is to find every placement in which no queen attacks another — we have to report all such solutions. No, no, our task is to find all possible solutions; that is the algorithm we implemented, not just one solution. How do you know that the solutions are correct? We printed the solutions — here we cannot print all of them, but we can check them for 4, 5, 6 queens. But it is the same as your backtracking algorithm, right? You are only using the concept of dividing the work: you are gathering the tasks, distributing the tasks to the processors, and each processor completes its task. So this is not, as such, a new algorithm. Yes, the sequential algorithm that we are using is the same one; the only step we add is the stage where we compute the tasks and divide the tasks among the processors — that part is different. But if you write any algorithm, you must verify it. Yes, we checked it by hand for the small cases — and note that a single check can coincide by accident, the way 2 + 2 is 4 and 2 × 2 is also 4, whereas 2 + 3 + 4 and 2 × 3 × 4 differ — so we checked several cases and the counts matched. One thing, sir: the sequential algorithm is the same. The point is that your algorithm is not exploiting the symmetry of the problem;
you are gathering all the tasks and finding all possible solutions. That means you are implementing the sequential algorithm directly. We are parallelizing the sequential algorithm, and we are implementing the sequential algorithm within each task — that is why the gain is only this much. If we exploit the symmetry, then the gain will also increase. OK, note down the recommendations. Now let me start with a definition: a sequence is k-sorted if every element is at a distance bounded by k from its position in S, where S is the sorted sequence. So, in other words, the array is almost sorted — it is not exactly sorted, but it is almost sorted. So in this k-sorted array, if an element is here, then its position in the sorted array cannot be at a distance greater than k. When you say the array is 1-sorted, it essentially means the array is completely sorted. This k-sorting is useful when you want to sort groups of elements in such a way that the ordering within a group is immaterial: you just want to say, I do not care about the ordering within a group, and if the size of a group is k, then k-sorting gives you exactly what you need. So one of the problems is k-sorted sequences. For a k-sorted sequence we define two things. One is LB(i), larger-before — and when I say a sorted sequence I mean an ascending sequence, not a descending sequence — and the larger-before set LB(i) is nothing but the set of all elements before position i that are larger than the element b_i.
So, in other words, these larger-before elements have to be transferred past this element when you want to sort, and similarly the smaller-after set SA(i) is nothing but the set of smaller elements after position i. And it can easily be seen that the number of smaller elements after minus the number of larger elements before is nothing but the position of this element in the final sorted array minus its position in this array: s(b_i) − i = |SA(i)| − |LB(i)|. Because to reach the final sorted array you just move these elements forward and move these elements back, and then you get the actual position. This is one property which can be proved. One question: are you discussing this with respect to the sorted position of the element? Yes — s(b_i) is the position of the element in the sorted sequence; this is the sorted sequence, and we ask what the position of this element is in it. An easy way of saying this: is this element already in its sorted position or not? No, this element is not in its sorted position; this element is at position i in the array b. OK. So the sequence b is k-sorted if and only if for all i we have this result: the number of smaller elements after and the number of larger elements before are both at most k. What is the range you are considering for "after" and "before"? Everything — I am not restricting it to any window here. OK. And by the same result, the number of elements here and here determines the actual position. So we designed a parallel algorithm for k-sorting based on the parallel algorithm for merge sorting; the principle will come shortly. What we are proposing is basically a parallel algorithm for the k-sorting problem. The model which we are using is the shared-memory model with concurrent read and exclusive write access. The number of processors required in our algorithm is of the order of n.
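The two sets and the identity above can be checked on a toy array. This is a small sketch assuming distinct elements; the function names are mine, not the presenters'.

```python
def lb(b, i):
    """LB(i): elements before position i that are larger than b[i]."""
    return sum(1 for j in range(i) if b[j] > b[i])

def sa(b, i):
    """SA(i): elements after position i that are smaller than b[i]."""
    return sum(1 for j in range(i + 1, len(b)) if b[j] < b[i])

def check_identity(b):
    """Verify s(b_i) - i == |SA(i)| - |LB(i)| for every position i."""
    s = sorted(b)
    return all(s.index(b[i]) - i == sa(b, i) - lb(b, i) for i in range(len(b)))

def is_k_sorted(b, k):
    """Every element is within distance k of its sorted position."""
    s = sorted(b)
    return all(abs(s.index(b[i]) - i) <= k for i in range(len(b)))

b = [2, 1, 4, 3, 6, 5]      # every element is off by exactly one place
print(check_identity(b))     # True
print(is_k_sorted(b, 1))     # True
```

The identity is what lets a parallel algorithm recover exact sorted positions from the two counts alone.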
The time comes out to be of the order of log k · log(n/k) + log²(n/k). So this is a fairly good complexity, and clearly the cost is of the order of n · log k · log(n/k) + n · log²(n/k). We can sanity-check the formula: for constant k, the k-sorting problem becomes full sorting, and the time complexity comes out to be of the order of log² n with cost of the order of n log² n, which is the same as we obtained in class by doing merge sort on the network model. Before describing the algorithm for the actual problem of k-sorting, I will describe the parallel merge algorithm that we have designed, which we use with some modification in the k-sorting problem. So, this is the merge algorithm. A and B are two arrays of the same size, and they are sorted. Step one of the algorithm: we set up an array, position, for array B; at the end of the algorithm, the position array will hold the positions of all the elements of array B in the final merged array. So, if b_i is at position j in the final merged array, then the value of position[i] will be j at the end of the algorithm. We initialize all the elements of the position array at this point. The next step of the algorithm: we perform a binary search on A for all the elements b_i of array B, and store the results in another array, predecessor. Since we are using the concurrent-read model, all these binary searches can be done in parallel. After the end of this step, predecessor[i] holds the predecessor element of b_i in array A — it is like every element in B individually sees its position in A and stores it. This is computed in O(log n) time, since a binary search takes O(log n) time.
So, this step takes O(log n) time. Now our task is to find the positions of the elements of both arrays A and B in the final merged array. For that we use the two arrays we have defined, the position array and the predecessor array. What we are doing in this step is performing the cumulative sum approach that we discussed in class, with a slight modification, on the position array, with n processors on the CREW model. Actually, we break every iteration of the cumulative sum approach into two steps. What happens in the cumulative sum approach: suppose we have eight elements, A1, A2, and so on. In the first step we add these two and store the value here, so this will hold A1 + A2; similarly we add these two, so this will hold A3 + A4. In the next step we send the value A1 + A2 to these two processors — actually, we are doing this on the B array, not on the A array — so this will hold A1 + A2 + A3 and this will hold A1 + A2 + A3 + A4. In the same way we send this value to these two processors, so this will hold A5 + A6 + A7, and so on; we keep accumulating all these elements. So, similarly, we can describe what happens in the i-th iteration: we send the value at position j to positions j + 1 through j + 2 to the power i minus 1, and we apply this in parallel. Up to this point this is the same algorithm. The only additional step is that we check whether predecessor[j] and predecessor[j + 1] are the same or not; if they are not the same, then we add one to all these elements.
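One standard way to realize such a cumulative sum in O(log n) parallel steps is the doubling (pointer-jumping) scheme; the talk's variant propagates the partial sums slightly differently, but the idea is the same. This is a sequential simulation with illustrative names, where each array position stands in for one processor.

```python
import math

def prefix_sum_doubling(a):
    """Hillis-Steele style scan: in iteration i every position reads the
    value 2^i places to its left and adds it, so after ceil(log2 n)
    iterations position j holds a[0] + ... + a[j]."""
    a = list(a)
    n = len(a)
    steps = int(math.ceil(math.log2(n))) if n > 1 else 0
    for i in range(steps):
        step = 2 ** i
        old = list(a)          # all reads of one round happen before any write
        for j in range(step, n):
            a[j] = old[j] + old[j - step]
    return a

print(prefix_sum_doubling([1, 2, 3, 4, 5, 6, 7, 8]))
# [1, 3, 6, 10, 15, 21, 28, 36]
```

The extra "+1 when the predecessor changes" test from the talk would slot into the same loop, since it is evaluated independently at every position.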
Actually, why we are doing this: predecessor[j] not equal to predecessor[j + 1] means that the predecessor element of b_j in array A and the predecessor element of b_{j+1} in array A are not the same. It is like this: suppose this is b_j and this is b_{j+1}; the predecessor of this lies here and the predecessor of that lies there, so clearly this element of A will lie between these two elements in the final merged array. So we have to create a space for this element in the final merged array — that is why we are adding one, to create the space. From the position of b_{j+1} we can then also find out the position of the predecessor of b_{j+1} in the final merged array: it sits just before position[j + 1]. So, similarly, after all the steps we can find the positions of all the elements in the final merged array; and after the algorithm we also have the position array, which gives the positions of all the elements of B in the final merged array. So we have the positions of all the elements, and we can produce the final merged array. Step three takes O(log n) time because of the cumulative sum that we perform with n processors. The total time complexity, summing all three steps, is O(log n), and the cost is O(n log n). Next, the total number of merge levels required for merge sort will be O(log n), so the time complexity of merge sort comes out to be O(log² n) and the cost comes out to be O(n log² n). I just want to emphasize this point here: while we are doing this on the B array, we still need to see which elements of A come before each position, and for that you can just count the elements of B.
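The net effect of the three steps is that every element learns its rank in the merged array. Sequentially, that can be sketched with binary searches alone — a sketch of the ranking idea, not the exact prefix-sum formulation of the talk; the function name is mine.

```python
from bisect import bisect_left, bisect_right

def merge_by_ranking(A, B):
    """Merge two sorted arrays by computing every element's final rank:
    own index + number of elements of the other array that precede it.
    In the CREW version every rank is one independent binary search."""
    out = [None] * (len(A) + len(B))
    for i, x in enumerate(A):
        # elements of B strictly smaller than x precede it
        out[i + bisect_left(B, x)] = x
    for j, y in enumerate(B):
        # elements of A less than or equal to y precede it (ties: A first)
        out[j + bisect_right(A, y)] = y
    return out

print(merge_by_ranking([1, 4, 6, 9], [2, 3, 5, 8]))
# [1, 2, 3, 4, 5, 6, 8, 9]
```

Using bisect_left for one array and bisect_right for the other breaks ties consistently, so no two elements claim the same output slot.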
Suppose there is an element of B here: the position of this element in the final merged array does not depend only on the elements of B before it; it also depends on the elements of A which are smaller than this element, and those are accounted for in the cumulative sum by adding one whenever the predecessor changes. And now we come to the k-sorted case. As we saw, we have an O(log² n) merge sort algorithm. Basically, for a given array, multiple k-sorted orderings are possible — there is no unique k-sorted array — and the type of k-sorted array we produce is this: if this is the array with n elements, then we have n/k segments, each segment has k elements, and all the elements of one segment are smaller than all the elements of the following segments; within a segment the ordering is free. So the elements are basically divided into ordered groups — this is the type of k-sorting that we are trying to achieve. For this we define the k-merge step. In the algorithm, n is the input array size, and initially we start with n/k groups; we know that each one of these groups of size k is trivially k-sorted by itself. We then apply the k-merge step recursively, and finally we get the whole array of size n k-sorted. Now I will describe the k-merge step. Here, A and B are two k-sorted arrays of size m each, so these are the segments that are present — m/k segments in each array — and for each array we store a segment-header array whose entries are nothing but the maximum element of each segment.
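The target segment structure just described can be pinned down with a small check — a sketch with an assumed function name, for arrays whose length is a multiple of k.

```python
def is_segment_k_sorted(a, k):
    """Check the n/k-groups form of k-sortedness from the talk: every
    element of a group is <= every element of all later groups, while
    the ordering inside a group is unconstrained."""
    assert len(a) % k == 0, "length must be a multiple of k"
    groups = [a[i:i + k] for i in range(0, len(a), k)]
    return all(max(groups[g]) <= min(groups[g + 1])
               for g in range(len(groups) - 1))

print(is_segment_k_sorted([2, 1, 4, 3, 6, 5], 2))  # True
print(is_segment_k_sorted([3, 1, 2, 4, 6, 5], 2))  # False: 3 leaks past 2
```

This is exactly the invariant the k-merge step has to preserve when it fuses two such arrays.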
So, there will be m/k elements in that segment-header array. This is step one of the algorithm: we do a binary search for these segment boundaries — by a boundary I mean the maximum element of a segment. I do the binary search for these boundaries on A's headers, in parallel, and this takes O(log(m/k)) time. Then suppose for this particular segment I get the position as here: it just means that all the elements of that segment must land at or before here — they cannot be beyond it, because this boundary was the maximum element of that particular segment. So now, for all the k elements of that segment, I get in parallel their positions — positions in the sense of which group of A each one belongs to. This can also be done in O(log(m/k)) time, and this concludes step one. Now, in step two, for every boundary of A that has been matched here — say this one selects this boundary and this one selects that boundary, and so on — I take the k elements of A before that boundary and do the binary search in the other direction, obtaining their positions amongst B's groups. Let me explain why I have to do that: it may so happen that this is the boundary for this segment, which just means this boundary will lie somewhere here, with some elements before it and some elements after it. What I intend to do is merge these two groups and get two new groups of size roughly k. For that I do what I call the reverse binary search, again for k elements in O(log(m/k)) time. But how are you getting log(m/k)? Because I am doing it only amongst the m/k segment headers. But the number of elements can be bigger. Yes —
for the elements of A, when you do the binary search, you are doing the binary search on the maximum elements, right? So that means the maximum elements of the segments should be sorted. The maximum elements of the segments should be sorted? Yes — the maximum elements are sorted, because we are assuming these two input arrays are k-sorted. What do you have initially? Initially you have A and B, both k-sorted, of the form that I explained there: every element of one group is smaller than all elements of the following groups. So I will be doing the binary search only on m/k elements in the worst case, hence order log(m/k). If there are m/k headers, the search is log(m/k). Yes. Now, we get the intermediate k-sorted merged array. What this means is that I get an array of size 2m which has 2m/k headers; but the problem is that the size of each segment in the intermediate array may not be exactly k — I will make it exactly k in the final array. How do I get this intermediate array? I have the segments of A and the segments of B. Now, for all these k elements I do a cumulative sum over their group memberships — cumulative sum in the sense of the positions of these elements, whether this element belongs to this segment or that segment or whichever it is. So, for example, for this element, the cumulative sum determines which segment the element belongs to and how many elements before it belong to the same segment. In other words, I determine its position after the merge in O(log k) time, because I am doing the cumulative sum over groups of size k.
This is because, in the worst case, all these k elements go into the same segment: any of these k elements will lie between here and here, because the place for this boundary is here, so these k elements can be on this side or on that side. So the size of any segment in this intermediate array C is less than 2k. So in O(log k) time I can determine the number of elements in each such segment, in parallel, and then transfer it to that particular segment boundary. And then, with the 2m/k segment boundaries I have, I do a cumulative sum on the segment boundaries: in other words, what I will be finding out is, suppose I am at this segment boundary — how many elements are there before this segment boundary? I get that sum at this particular boundary in O(log(2m/k)), which is O(log(m/k)) time, because I have 2m/k such boundaries. Once I get the cumulative sums, what I want is new boundary dividers such that the size of each new segment is exactly k. These new boundaries will be at the integral multiples of k, because that is what I am trying to achieve, so I can easily find out from this information the positions of these boundary dividers. And in the worst case a new divider falls within one particular segment of the intermediate array, whose size is at most 2k, as we have already seen. We determine their positions; once we determine their positions, we get their relative positions within the segment, and once we get the relative positions we just need to handle one particular segment of the array C and the new divider that is lying in this particular region.
So, say I determine the appropriate smallest element in this range — the element that should sit at the divider. The range can be at most 2k, and this selection can be done in O(log k) time, as was done in class, using the 2k processors for the elements lying here. Once I determine that element, then for all the elements lying in between here I can determine whether they are greater than it or less than it, and that can be done in constant time. Say I have got an element here; the problem is where I will transfer this element to, because I am doing everything in parallel and I do not know directly where this element will go. So I determine its final position in two steps. In step one, whenever an element is greater than that element, I just put a 1, so I get a binary array, and then I do a cumulative sum in O(log k) time; the sum at a position implies that you have got that many greater elements before this greater element, so it determines the position of this particular element, and I get its position. In the second step I interchange: 1 for the smaller elements and 0 for the greater elements, and I get the positions of the smaller elements. So in O(log k) time I can determine the final positions, and once I move them I can determine the maximum in each new segment, again in O(log k) time. So the total time for all these steps in one k-merge is O(log k) plus O(log(m/k)), using n processors. As for the number of k-merge steps: we start with arrays of size k and we keep merging until we get an array of size n, so there are O(log(n/k)) steps, and the overall time complexity is nothing but O(log(n/k)) times the quantity O(log k) plus O(log(n/k)). Is it log(n/k)? Yes, it is log(n/k).
Can you find any parallel algorithm which does this in O(log n) time? That would be O(n log n) cost with O(log n) time, on CRCW. In O(log n) time, on CRCW? With n processors, CRCW — that we will assume. But what kind of algorithm is it? It is a tree-based sorting, but there is a lot of book-keeping; the time complexity is there, but at the same time we have increased the book-keeping, so the algorithm becomes very complex. But I don't think you can do better than log² n with this approach; there is also the sorting-network route, but that has other problems. In any case, the time complexity here comes out to log k · log(n/k) + log²(n/k), which is log(n/k) · (log k + log(n/k)) = log(n/k) · log n = log² n − log k · log n.
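For the record, collecting the per-step costs stated in the talk — one k-merge level costs O(log k + log(m/k)) and there are O(log(n/k)) levels — the arithmetic can be written out as:

```latex
T(n,k) \;=\; O\!\Big(\log\tfrac{n}{k}\,\big(\log k + \log\tfrac{n}{k}\big)\Big)
       \;=\; O\!\Big(\log\tfrac{n}{k}\cdot\log n\Big)
       \;=\; O\!\big(\log^2 n - \log k\,\log n\big),
\qquad
\text{cost} \;=\; O\!\Big(n\,\log\tfrac{n}{k}\,\log n\Big).
```

For constant k this reduces to the familiar O(log² n) merge-sort time, and as k grows toward n the work smoothly decreases, matching the claim that k-sorting is strictly easier than full sorting.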