Data is now being generated at a tremendous pace. What we are talking about now is not transaction data, such as withdrawing money from an ATM, making telephone calls, or searching the web, but the data people create when they tweet, put messages on Facebook, and share pictures and videos. Doctors are transcribing. All of this is text, visual, audio, and video data, and it is being generated at a tremendous pace.

Traditional database architects don't have the luxury of sitting in a room for months, maybe a year, to design a database. That's not an option. The good news is that there are solutions in the market, the NoSQL solutions, which cater to these kinds of challenging scenarios. Now you might be wondering what those solutions are, how they work, and what the details are. That's why this module exists, that's why this course exists, that's why you are here, and that's why I am here.

So let's look at the coverage for this module. We will start with the Google Bigtable paper. Then we will talk about the first NoSQL meetup, which predates the Bigtable paper, and then the Amazon Dynamo paper. These are milestones, and they set the standards. Then we will talk about the second NoSQL meetup and what NoSQL means today. So let's look at it in more detail.

First, the Google Bigtable paper. This paper was published in 2006, and what it says is this: Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size, petabytes of data across thousands of commodity servers. So we have petabytes of data, we have thousands of commodity servers, and all of this is distributed. Similar to an RDBMS
model at first sight, Bigtable stores rows with a single key and stores the data in those rows within related column families. Therefore, accessing all related data is as easy as retrieving a record by its ID rather than running a complex join. So what do we get? We get speed, without a join. This model also means that distributing data is more straightforward than with relational databases: by using simple keys, related data, such as all pages on the same website, as given in the Google example, can be kept together.

Similarly, Bigtable is designed to be distributed on commodity servers, a common theme for NoSQL databases. For example, Dell or HP servers with perhaps two CPUs, 8 to 16 cores, and 32 to 96 GB of RAM. Nothing fancy, just lots of them. Of course, with the passage of time these specifications might change, but you get the picture.

Carlo Strozzi is the person who gave us the name NoSQL, and his is the first documented use of the term, in 1998. He was visiting San Francisco and wanted to get people together to talk about his lightweight relational database. Carlo's meeting in San Francisco came and went, and developers continued to experiment with alternative query mechanisms. So NoSQL is read as "not only SQL". In Strozzi's database, the data is retrieved using UNIX scripting instead of SQL; SQL is for the RDBMS. There is a cost to using SQL: complex queries are hard to debug, and it's even harder to make them perform well, which increases the cost of development, administration, and testing. So this is the reason for the popularity of NoSQL.

Now let's move ahead to the Amazon Dynamo paper. What does the Amazon Dynamo paper state? The stress is on consistency,
cost-effectiveness, and performance, and of course availability. It was published in 2007. The paper goes on to describe how a lot of Amazon's data is stored and accessed by primary key, and how consistent hashing is used to partition and distribute the data. Why hash and then partition the data? Because that is how you make it distributed. This is probably the first globally distributed key-value store to be introduced. That is the strength of the Amazon Dynamo paper.

Now, the second NoSQL meetup. That was held in 2009. By then, people were developing solutions whenever they had a challenge, and there were a lot of products in the market: MongoDB, Redis, Cassandra, and the list goes on. The hashtag #NoSQL was used for the first time, for a meetup about open source, distributed, non-relational databases. And as you can see on the slide, there were people from many influential companies, many leading players in the market, which goes to show the popularity, the acceptance, the applicability, and the desirability of NoSQL solutions.

So what does NoSQL mean today? NoSQL is applied in those areas that are weak spots for the RDBMS, the relational database management system: areas that are complex, complicated, and hard to develop and maintain. There are different solutions, as many solutions as there are problems, and all of them are NoSQL databases. But you have to be aware of their limitations, their strengths, and how they work together. From all of this variety, which solutions should you choose? Which ones are going to help you? You should also be aware of the total cost of ownership, the TCO.

So that's all I have for this module. Thank you for your time.
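
As a follow-up to the consistent hashing mentioned in the Dynamo discussion, here is a minimal Python sketch of the idea. It is not Amazon's implementation. The class name `HashRing`, the helper `ring_position`, the use of MD5, the node names, and the figure of 100 virtual positions per node are all assumptions chosen for illustration. What it tries to show is the property that makes this kind of partitioning attractive: when a server is added, the only keys that change owner are the ones the new server takes over.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a position on the ring (MD5 is an illustrative choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual positions per node."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._positions = []  # sorted ring positions
        self._owner = {}      # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Place the node at several positions to spread its share of keys.
        for i in range(self._vnodes):
            pos = ring_position(f"{node}#{i}")
            if pos in self._owner:
                continue  # position already taken (vanishingly rare)
            self._owner[pos] = node
            bisect.insort(self._positions, pos)

    def remove_node(self, node):
        for i in range(self._vnodes):
            pos = ring_position(f"{node}#{i}")
            if self._owner.get(pos) == node:
                del self._owner[pos]
                self._positions.remove(pos)

    def node_for(self, key):
        # Walk clockwise to the first position after the key's hash.
        if not self._positions:
            raise ValueError("ring has no nodes")
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._owner[self._positions[idx % len(self._positions)]]


# Usage: add a fourth node and see which keys change owner.
ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"cart:{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-d")
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all of them to node-d")

ring.remove_node("node-d")
restored = {k: ring.node_for(k) for k in keys}
```

With a plain `hash(key) % number_of_servers` scheme, adding a server would reassign most keys. On the ring, the keys that do not land on the new node stay where they were, which is why this approach suits a store that grows by adding commodity servers.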