Hello, everyone. Welcome to the session on parallel databases. In this session, we will see what exactly a parallel database is, how it works, and what its architecture is.
Talking about the learning outcomes: at the end of this session, you will be able to describe the need for parallel databases and what they aim to achieve. You will also be able to illustrate the different parallel database architectures and the kind of machine architecture each one uses.
Let us start with the basic goal of a database management system. What exactly does a database management system do? It provides a way to store and retrieve information. Any kind of information can be stored or retrieved through it, and the important part is that it does this in a convenient and efficient way. What exactly does management of data mean? It means defining a fixed structure for the storage of a particular kind of information, and providing mechanisms for manipulating that information: selecting, updating, modifying, deleting, reading, writing. All of these we can call data manipulation, and the database management system provides an effective mechanism for it.
Let us start with an introduction: why databases are important, and why we are going for parallel databases. We know that databases are growing rapidly nowadays. Whatever you do is treated as information, and that information is stored in some database.
You can see it with any click on the internet. Any message you send, any mail you send, anything you buy online on an e-commerce site: every transaction you do is treated as data, and that data is stored somewhere in a database. So the amount of data is growing very fast nowadays.
The data being generated is very high in volume, and it is collected and stored for analysis. Everything you do online is taken as data and analyzed for future development: how customers are interacting, what their interests are, and many things like that. That is one of the major reasons why databases are growing. We also know that a database does not hold only data in a plain format. It holds images, audio, and video as well; all of that is treated as data and has to be handled effectively. So multimedia objects like images are increasingly stored in databases. You can see that people use many social media sites to upload images and pictures and keep them as memories. That is also one of the major uses of databases.
Now, what is the need for parallelism in this case? With such a large volume of data there are many manipulations, and there is storage and retrieval. So we have to improve the speed of execution, the speed of storing and retrieving, everything. For that we take the help of parallel machines. What exactly are parallel machines? Parallel machines are quite common nowadays; even your smartphones and laptops are becoming more and more parallel.
Earlier, machines started out with a single core; now they are multi-core, and your mobile phones also have many cores. So these are very common nowadays, and the prices of microprocessors, memory, disks, and whatever peripheral and electronic devices we use are dropping sharply. Recent desktop computers have multiple processors and multiple cores, and this trend is accelerating day by day.
Now, what is the basic goal of parallelism? As I said earlier, large-scale parallel database systems are increasingly used nowadays, for several reasons. One is that we are storing large volumes of data. We process most of that data for decision support systems, on which organizations base most of their decisions, and these decision support systems are very time consuming. The other basic reason we use parallel databases is that they provide high throughput for transaction processing: when many transactions are working at the same time, we have to get proper throughput. So all of this is moving towards performance improvement. When we manipulate a large volume of data and run many operations on it, it may take a lot of time. To reduce this time, that is, to improve performance in terms of time, we use parallel systems. That is the goal of parallelism.
Talking about parallelism in databases: the large volume of data is accessed in parallel. How? The data is partitioned across multiple disks and accessed using parallel I/O, that is, parallel input and output, where we can read and write multiple disks at the same time.
So the work is automatically parallelized. You can see the scenario here. I have taken a very small example relation with only a little data in it, but we are really talking about a large volume of data. If the data volume is very large, that relation can be partitioned across many disks, and different processors can read these disks at the same time. That is what we call parallelism in databases.
Relational operations, that is, the database manipulation operations such as sort, join, and aggregation, usually require more time to execute. So these can be sped up using parallel techniques: there are multiple processors, they read the disks in parallel, and so the operation works in parallel. Consider this scenario: there is a single query for sorting, but the sorting is done in parallel. We are taking an example of sorting in ascending order, where one part sorts A to G, and so on. Each part is sorted at the same time, and finally the results are combined. We will see this in detail later.
Now, the data is partitioned across various disks, as I said earlier, and a processor is assigned to each one. You can see here that this large volume of data is partitioned across different disks and every disk has its own processor. So three processors, P1, P2, and P3, are accessing disks D1, D2, and D3 simultaneously. The work is automatically distributed, every processor works simultaneously, and so the speed increases.
Queries are expressed in a high-level language, every SQL query is translated into relational algebra for the parallel system, and different queries can also run in parallel with each other.
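To make the parallel sort scenario concrete, here is a minimal Python sketch. It is only an illustration under assumptions, not how any particular database implements it: the function names `range_partition` and `parallel_sort`, the sample names, and the boundaries "H" and "P" are all made up for the example. The three partitions stand in for disks D1, D2, and D3, and the three worker threads stand in for processors P1, P2, and P3.

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor


def range_partition(rows, boundaries):
    """Split rows into len(boundaries) + 1 partitions by key range.

    With boundaries ["H", "P"], keys below "H" (A to G) go to partition 0,
    keys from "H" up to but not including "P" go to partition 1,
    and the rest go to partition 2.
    """
    partitions = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        partitions[bisect_right(boundaries, row)].append(row)
    return partitions


def parallel_sort(rows, boundaries):
    """Sort each range partition in its own worker, then concatenate."""
    partitions = range_partition(rows, boundaries)
    # One worker per partition, like one processor per disk.
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        sorted_parts = list(pool.map(sorted, partitions))
    # The ranges are already ordered, so joining them end to end
    # gives a fully sorted result without a merge step.
    return [row for part in sorted_parts for row in part]


# Hypothetical sample data, standing in for one column of a relation.
names = ["Mira", "Arun", "Tara", "Gita", "Hari", "Bala", "Zoya", "Neha"]
print(parallel_sort(names, ["H", "P"]))
```

Note that threads in standard Python do not really speed up a CPU-bound sort; they are used here only to show the shape of the idea. In a real parallel database each partition would sit on its own disk and be sorted by its own processor.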
So sorting, join, selection, and many other operations can work in a parallel system. As for concurrency control, conflicts and so on are taken care of by the database management system. So databases naturally lend themselves to parallelism.
Now, the parallel database architecture. You can see this architecture here: there are a number of nodes, and these nodes share resources. The disks might be shared, the memory might be shared, the processors might be shared, whatever it is. The machines are physically close to each other, and they are connected by dedicated high-speed LANs. Whatever communication goes on between them goes through the interconnection channel, and because this channel is a high-speed LAN, the communication cost is very small. There are several architectures you can use here: shared memory, shared disk, and shared nothing. Let us see them one by one.
In the shared memory architecture, you can see how it works: the memories are shared, this is the interconnection network, these P's are the various processors, and every processor is associated with an individual disk. So what is shared here? Only the memory. The number of memories and the number of processors need not be equal; there may be a different number of memories, and they are shared by these processors. But remember that here every processor has its own disk.
So every processor has its own disk, and a single memory address space is used by all the processors. Reading from or writing to memory is more expensive here, because memory is accessed through the interconnection network, so we can say it is somewhat costly. Every processor can also have its own local memory, even though it is not shown here; but these memories are the shared ones.
Now you can pause the video here and think about this architecture. Can you guess what kind of parallel architecture this is? See what is shared here: this is the interconnection network, these are the processors, and this is the memory. If this is the case, what exactly is shared? Yes, you can guess that it is the shared disk architecture, because the disks are shared and every processor has its own memory.
Let us talk more about this one. It is called the shared disk architecture. The disks are shared, this is the interconnection channel where the communication goes on, and every processor has its own memory. So memory is not shared here; only the disks are shared, and all the machines access the same disks. Here too, the number of disks need not equal the number of processors; it may vary. What are the applications? Systems from Oracle and Digital Equipment Corporation use this shared disk architecture.
The next one is the shared nothing architecture. Here nothing is shared: every processor has its own disk and its own memory, and all of them work in parallel with their own resources. This is the most common architecture nowadays. Every machine has its own memory and disk, the machines are many cheap commodity machines, and communication is done through a high-speed network and switches. So the application is this one. These are my references. Thank you.