Hello everyone, welcome to the session on parallel and distributed databases. In this session, we will learn how parallel and distributed databases work and what the differences and similarities between the two are. The learning outcomes: at the end of this session, you will be able to differentiate between a parallel and a distributed database, and you will be able to compare the two on various factors. We will see what those comparison factors are later on.
So, the performance of a database is basically measured by two things. One is the throughput and the other is the response time. What is the throughput? Throughput means the number of tasks, or you can say the number of transactions or queries, that are completed in a given time interval. And response time means the amount of time taken to execute a single task. Whatever single task we take, it might be a query or it might be a transaction, the amount of time it takes to execute is called the response time. So these are the two basic factors by which the performance of a database is measured.
Now, if we talk about the distributed database, you can see here that the data is distributed among various sites: site one, site two, site three, site four, and every site has its own database. So the database is stored at several sites, and every site is managed by its own independent database management system. That is what a distributed database is. Whereas in a parallel database there are nodes, and these nodes might share some physical resources like a processor, memory, or data storage.
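Going back for a moment to the two performance measures, throughput and response time, here is a minimal Python sketch of how they could be computed from a log of completed tasks. The function name `measure` and the sample timings are assumptions made up for this illustration; they do not come from any particular database system.

```python
def measure(tasks):
    """Compute throughput and average response time.

    tasks: list of (start_time, end_time) pairs in seconds,
           one pair per completed task (hypothetical sample data).
    """
    # Response time of one task = time taken to execute that single task.
    response_times = [end - start for start, end in tasks]

    # Observation window = from the earliest start to the latest finish.
    window = max(end for _, end in tasks) - min(start for start, _ in tasks)

    # Throughput = number of tasks completed per unit of time.
    throughput = len(tasks) / window
    avg_response = sum(response_times) / len(response_times)
    return throughput, avg_response


# Four made-up tasks observed over a 10-second window.
sample = [(0.0, 2.0), (1.0, 2.0), (2.0, 5.0), (4.0, 10.0)]
throughput, avg_response = measure(sample)
print(f"throughput: {throughput} tasks/sec, average response time: {avg_response} sec")
```

Notice that the two measures are different things: adding more processors can raise the throughput even when the response time of each individual task stays the same.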
So what we can say for a parallel database is that the machines are physically close to each other, and they may even be in the same server room. Physically close means the machines are connected through a LAN, multiple processors are handling the database, and the database is shared among them. You can see in this diagram that there is an intercommunication channel, all these nodes communicate through it, and some of these resources may be shared.
Now, what is the difference between a parallel and a distributed database? What exactly does a parallel database mean? In a parallel database, multiple processors execute the database operations in parallel. So here we have multiple processors, whereas in a distributed database multiple sites work on a database across a network. The networks also differ: in a parallel database it is a LAN, and in a distributed one it is typically a WAN. In a parallel database the processors are tightly coupled and located at a single premises. You can see that there are different nodes and they are interconnected within a single premises. Whereas in a distributed database the sites are separated geographically. You can see that the sites are at various geographic locations, the database is stored at every site, each site handles its own data, and the sites also work together globally.
Now, in a parallel database, as I told you earlier, the nodes are connected through a LAN, so the speed is very high. For the distributed one, the sites are connected through a WAN, that is, through the internet, so you can say the speed is relatively low. Basically it depends upon the internet speed.
The parallel database supports shared resources: the disks, processors, or memory may be shared. Whereas in a distributed database, as the sites are located separately at various locations, there is no sharing of resources. A parallel database supports three types of architectures: shared memory, shared disk, and shared nothing. In the first one the memory is shared, in the second the disk is shared, and in the third nothing is shared, so every processor works independently with its own disk and memory. Whereas a distributed database works on only one architecture, shared nothing: no site shares anything with the other sites.
Now, a parallel database can handle complex database queries, whereas distributed databases usually work on simpler queries, since they are not sharing any resources. A parallel database is maintained by a single database administrator, whereas in a distributed database there is a separate database administrator for every site. In a parallel database a single database management system runs across the various nodes, whereas in a distributed database it is not necessary that the database management system is the same everywhere; every site may use its own DBMS. In a parallel database the nodes work on a single policy, whereas in a distributed one the policies may differ, because various database management systems are in use. The sites may work on a single policy, but usually they work on different policies.
Now pause the video, think about these scenarios, and answer what kind of database architecture we need to implement in each. Scenario one: there is a nationalized bank with various branches across the country, and it has to manage its database.
The second scenario is Swiggy, a food delivery service provider: how do they handle their database, and what is required there? And in the third one there is a restaurant with three sections, vegetarian, non-vegetarian, and bar, and these three sections are working together. Now you have to think and write down what kind of database architecture is needed to handle the database in each case.
The answer: in the first scenario, the nationalized bank has branches at different locations across the country, so it may use a distributed database. The same goes for Swiggy food delivery: the delivery locations are different and the service providers are various restaurants, so here also the data is distributed among various sites. So for these two scenarios a distributed database is used. In the third one, there is a single restaurant, and the vegetarian, non-vegetarian, and bar sections are all located at a single premises. So here we can use a parallel database.
Now, for the comparison between the parallel and the distributed database, we can study some factors and compare the two based on those factors. The first factor is the system components. As we discussed earlier, in a parallel database the nodes are at a single premises, connected by a high-bandwidth link, and the nodes are not autonomous. Whereas in a distributed database the sites are geo-distributed, connected by a low-bandwidth link, and each site has autonomy. The second comparison factor is the role of the components. In a parallel database the nodes work together on a single global task, whereas in a distributed database the sites work independently on their local transactions and work together on a global transaction.
Next, the design motivation. As we discussed earlier, for a distributed database the basic thing is that sharing of data is important, each site keeps its local autonomy, and because the data is held at multiple sites the availability of the data is very high. So these are the main motivations for designing a distributed database, whereas for a parallel database the motivation is to provide high performance and high throughput.
Now for the data distribution. In a parallel database the data is distributed across the various disks, so data partitioning techniques are applied here, such as round-robin, hash, or range partitioning. Whereas in the distributed one, horizontal and vertical fragmentation are used.
The objective of the parallel database is to improve performance and throughput, whereas the objective of the distributed database is to provide high availability with reliability.
Talking about the speed of execution: a parallel database is tightly coupled and connected through a high-speed LAN, so the execution speed is high. In the distributed one it is low, because the sites are connected through the internet, so the speed depends on the internet speed.
And talking about the geographic location, as I told you, a parallel database is located at one location, and a distributed database is located at different sites.
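To make the data distribution techniques mentioned above a little more concrete, here is a small Python sketch of round-robin, hash, and range partitioning, followed by horizontal and vertical fragmentation. All the function names, the `accounts` table, and its column names are assumptions invented for this illustration; a real DBMS does this internally and in a much more sophisticated way.

```python
import bisect


def round_robin_partition(rows, n):
    """Row i goes to partition i mod n, so rows are spread evenly."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def hash_partition(rows, n, key):
    """A hash of the key column decides the partition (integer keys here)."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(row[key]) % n].append(row)
    return parts


def range_partition(rows, boundaries, key):
    """Sorted boundaries [100, 200] give three ranges: <100, 100-199, >=200."""
    parts = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        parts[bisect.bisect_right(boundaries, row[key])].append(row)
    return parts


def horizontal_fragment(rows, predicate):
    """Horizontal fragmentation: a subset of the rows, all columns kept."""
    return [row for row in rows if predicate(row)]


def vertical_fragment(rows, columns, key):
    """Vertical fragmentation: a subset of the columns, plus the key
    so the fragments can be joined back together later."""
    return [{c: row[c] for c in (key, *columns)} for row in rows]


# Hypothetical sample table.
accounts = [
    {"id": 1, "branch": "Pune", "balance": 50},
    {"id": 2, "branch": "Mumbai", "balance": 150},
    {"id": 3, "branch": "Pune", "balance": 250},
    {"id": 4, "branch": "Delhi", "balance": 100},
]

rr = round_robin_partition(accounts, 2)
hp = hash_partition(accounts, 2, "id")
rp = range_partition(accounts, [100, 200], "balance")
pune_rows = horizontal_fragment(accounts, lambda r: r["branch"] == "Pune")
balances = vertical_fragment(accounts, ["balance"], "id")

print([[r["id"] for r in p] for p in rp])
```

The first three functions are the kind of thing a parallel database does to spread one table over several disks, while the last two are how a distributed database would typically decide which rows or which columns live at which site.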
Node types: in a parallel database all the nodes should be homogeneous, because they are working on the same DBMS software. Whereas a distributed database can be either heterogeneous or homogeneous. The sites may all work on a single DBMS, but various DBMS software may also be used by various sites, so working in a heterogeneous environment is also possible for a distributed database.
The scope: in a parallel database it is difficult to expand the scope, whereas in a distributed one, as the sites are not fully connected and no site is dependent on another, it is easy to expand.
Overhead is very low in a parallel database, because the communication speed is high and the number of nodes is limited, so it can be easily handled. Whereas in a distributed database the number of sites may vary and there may be many of them, so the overhead is very high.
On reliability and availability, the parallel database is lower, and the distributed one is highly reliable and highly available. Backup: in a parallel database it is at only one site, and in a distributed one it is at multiple sites. Maintaining consistency: it is easier in a parallel database and very difficult in a distributed one.
So these are the references for my video. Thank you.