I have been working as an architect on MySQL Cluster for close to six years. Today we will discuss NDB Cluster, some of its features, and how to set up a cluster. This is a safe harbor statement. We will mainly cover what is special about MySQL Cluster, the MySQL Cluster architecture, when to consider MySQL Cluster, how to set up an NDB cluster, and the available resources.

Let's discuss what is special about MySQL Cluster. MySQL Cluster is a combination of MySQL servers and the NDB storage engine. Data can be stored both in memory and on disk in this cluster. It was initially designed for telecom databases, where high availability and high performance are key, and after later enhancements it also supports web applications, gaming, high-volume stores, and so on. That is MySQL Cluster in a nutshell.

Auto-partitioning, data distribution, and replication are built into MySQL Cluster. You don't have to worry about how your table data will be partitioned, how the data is distributed across the cluster, or which replicas are currently serving. Two types of replication are supported: synchronous replication between the data nodes within a cluster, and asynchronous replication from a master cluster to slave clusters.

MySQL Cluster offers six nines (99.9999%) availability, which means less than 30 seconds of downtime in a year, and this has been verified by customers. It is a shared-nothing architecture: nothing is shared across the nodes; each data node stores its data on its own local disk. There is no single point of failure: the cluster can sustain multiple failures at any point in time. It also provides transactional consistency. MySQL Cluster supports up to 48 data nodes and around 200 MySQL server connections.
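The six-nines figure above is simple arithmetic; here is a quick sketch (plain Python, nothing NDB-specific) showing where the roughly-30-seconds number comes from:

```python
# Six nines of availability means 99.9999% uptime.
availability = 0.999999
seconds_per_year = 365 * 24 * 60 * 60  # 31,536,000 seconds

# Maximum downtime per year at this availability level.
downtime_seconds = seconds_per_year * (1 - availability)
print(round(downtime_seconds, 1))  # about 31.5 seconds
```

So "less than 30 seconds a year" is the six-nines budget rounded down; even a brief full-cluster outage would blow it, which is why every maintenance operation has to be online.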
Across those 200 MySQL servers you can push millions of queries down to the data nodes, and every application sees the same data: no corruption, no inconsistent reads. It is open source, and it supports both SQL and NoSQL access. That means you can run a MySQL server on top of the NDB storage engine, and you can also write your own API programs, in C++, Java, or other languages, that talk to the cluster directly. Reads and writes scale out as well: as I said, it supports around 200 MySQL servers, so you can run millions of reads and writes against the cluster at the same time. And since data can also be stored on disk on commodity hardware, you can store many terabytes of data in a cluster.

That is what MySQL Cluster is and what is special about it. As I said, it supports very large data sizes: in memory you can store terabytes of data, and on disk many terabytes more. It supports 48 data nodes and around 200 MySQL servers. NDB is known for high performance: in our own benchmark runs we found that it supports 200 million reads per second and around 1 billion updates per minute.

These are the industries currently using MySQL Cluster: mainly telecom, online gaming, financial services, web applications, and many high-volume stores. Among our customers you can see telecom players, web applications, and so on.

Let's quickly look at the cluster architecture. Here you can see the groups of nodes: a group of data nodes, a group of MySQL/API nodes, and the management nodes. The management server is mainly responsible for cluster-wide configuration, and when there is a conflict between data nodes, such as a network partition, it acts as the arbitrator to decide which part of the cluster survives.
It also provides the cluster-wide logging mechanism. There is a client for the management server, ndb_mgm, and through it you can perform your administrative tasks: start or stop a node, start or shut down the cluster, backup, recovery, and so on. The management server is responsible for carrying these out; you issue the commands from the client, and we will see that in the demo slides.

Then you can see four data node processes, ndbd 1, 2, 3, 4; a four-node cluster example is given here. These are the heart and soul of the cluster: they store your actual data and indexes, and the logs live here as well. They also handle transaction coordination, including two-phase commit. When you run a query, it ultimately lands here, and these nodes decide where the data is stored and on which nodes; all of those operations are taken care of at this layer.

Then you can see the SQL nodes; there are three MySQL nodes here for example purposes, but you can have many. The main purpose of a MySQL server here is to execute SQL: CREATE DATABASE, CREATE TABLE, stored procedures, events, any SQL query. That is the layer responsible for SQL.

The last layer is the client and API layer. There are many clients and APIs here; the actual query is submitted from this layer to the cluster, and this layer is responsible for load balancing, data distribution, failover, failback, and so on. There is also the NDB API, the native API for the cluster, written in C++, and it is the only interface that talks directly to the ndbd data nodes; no other API, and not the MySQL server either, talks to the data nodes except through it. You can see that connection here: these are your clients, and these are your API layers.
Whether it is a MySQL server or one of the APIs like ClusterJ, everything goes through the NDB API layer. The request path starts at the client, passes through the MySQL server, then through the NDB API layer, which sends the request to the data nodes; the data comes back to the NDB API, then back to the SQL node or API node, and finally back to the client. That is the flow of operations.

The difference between an API node and a MySQL node is that the NDB API talks directly to the data nodes. So if you run an operation through the API, performance is better than going through MySQL, because a query sent to MySQL is first handed to the NDB API, and only then does the NDB API go to the data nodes; going native saves you one layer of communication. That's why we say that NoSQL/API access performs better than going through the MySQL server.

Let's look at a few of the features. Like any RDBMS, MySQL Cluster offers ACID compliance and transactions, and on top of that auto-sharding, online operations, and scale-out. There are many features; I have just highlighted a few of the key ones.

This is auto-sharding. It is built in, as I said: when you create a table and insert rows into it, you don't have to worry about where the data is stored or in which partition; that is all handled by the cluster. Suppose you have 4 data nodes, as you can see here: 1, 2, 3, 4. Partitions are created automatically based on the number of data nodes. The moment you create a table t1, it is partitioned into four partitions: P1, P2, P3, P4. Each partition stores its data on one of the data nodes. For example, partition P1 stores its data on data node 1; F1 is the primary fragment, while the same data is replicated again on data node 2.
That replica is called the secondary fragment. In the same way, every partition's data is stored twice: one copy is the primary and the other is the secondary. The idea is that if data node 1 goes down, data node 2 still holds both fragments, F1 and F3; the same applies in the other node group. That is why we say there is no single point of failure.

Node groups are formed from these data nodes; if you have more nodes, more node groups are created. I will show you in the demo which node groups get created. The raw calculation is simple: the number of node groups is the number of data nodes divided by the number of replicas. NoOfReplicas is how many copies of the data you want to store; 2 is the common supported value, but you can configure up to 4, meaning four replicas of the data in your cluster.

The cluster also supports online operations, meaning operations you can perform without shutting the cluster down. Take adding nodes: suppose you start with a small business and it then grows rapidly; you can grow the cluster with it. Say you started with a 2-node cluster and decide to move to a 16-node or even a 48-node cluster. You just add data nodes to the existing cluster, and you can keep performing all your reads and writes while the nodes are added one by one. Data redistribution and reorganization happen in the background; things may be a little slower during that period, but you don't have to shut down the cluster, take all the data nodes offline, and migrate the data by hand. That is the uniqueness here. You can also repartition a table online, as we saw on the previous slide, and you can start, stop, and restart nodes online.
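The node-group and fragment-placement rules above can be sketched in a few lines of Python. This is an illustrative model, not NDB's internal code; the node IDs 4 to 7 match the demo cluster later in the talk, and the placement loop simply alternates primaries the way the slide shows, so that if data node 4 fails, its partner node 5 still holds copies of F1 and F3.

```python
def node_groups(data_node_ids, no_of_replicas=2):
    """Number of node groups = number of data nodes / NoOfReplicas,
    filled in node-ID order (illustrative model of NDB's grouping)."""
    assert len(data_node_ids) % no_of_replicas == 0
    return [data_node_ids[i:i + no_of_replicas]
            for i in range(0, len(data_node_ids), no_of_replicas)]

groups = node_groups([4, 5, 6, 7])
print(groups)       # [[4, 5], [6, 7]] -- two node groups, as in the demo

# Place one primary fragment per partition round-robin across the groups;
# the secondary copy always goes to the partner node in the same group.
placement = {}
for p, part in enumerate(["P1", "P2", "P3", "P4"]):
    group = groups[p % len(groups)]       # round-robin across node groups
    idx = (p // len(groups)) % 2          # alternate the primary within a group
    placement[part] = {"primary": group[idx], "secondary": group[1 - idx]}

print(placement["P1"])  # {'primary': 4, 'secondary': 5}
print(placement["P3"])  # {'primary': 5, 'secondary': 4}
```

With this layout, node 5 carries the secondary of P1 (F1) and the primary of P3 (F3), matching the failure scenario described above.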
You can start a node, stop a node, and restart a node while the cluster is up and running. Upgrades, whether OS upgrades or cluster upgrades, can also be done without shutting the cluster down. It can happen that you are on a 7.4 cluster and migrating to 7.6: half of the nodes can still be on 7.4 while the other half has already been migrated to 7.6, and even with these mixed versions you can run your transactions and all other operations. Nothing bad happens: no data corruption, no issues.

Online backup and restore: yes, you can take a backup of the cluster and restore it while the cluster is up and running. And when I say up and running, I mean you can keep running your normal transactions against the cluster; it's not the case that, because an upgrade or backup is going on, you have to throttle your write or read operations. You don't have to do that.

Reconfiguration, as I said: when you move from one version to another, you may want to change some of the configuration parameters. You add those parameters to the existing config.ini, which is the cluster configuration file, and then do a rolling restart, which takes care of everything, again while the cluster is up and running.

So, when should you consider MySQL Cluster? Customers mainly look at these three points: scalability, latency, and downtime. Scalability, as I said, matters when your business is growing; latency, when you are looking for very fast response times; and downtime, as I said, six nines means about 30 seconds of downtime in a year. If these are critical parameters for your business, then you can probably consider MySQL Cluster.

Now let's set up an NDB cluster. I am considering here a cluster with one management node, four data nodes, and one MySQL server.
For illustration purposes, I have shown the data nodes here on hosts 101, 102, 103, and 104; these are all separate hosts, because there is no point in running a cluster on a single box where everything fails together. This host runs the management server (and I am also running the management client on it), and this host runs the MySQL server and the MySQL client. I will show the respective commands on all these hosts.

You can get the source code from GitHub and build it yourself, or, if you don't want to do that, you can download the binaries for your platform from these locations. Either way, you end up with the cluster binaries.

Once you have the binaries, the next step is to prepare the two configuration files: my.cnf, which is MySQL-server specific, and config.ini, which is cluster specific. my.cnf is a standard my.cnf file where you mainly have to mention the address of the management node, along with the usual settings such as user, port, socket, and base directory. In config.ini you describe the management node (its IP address, node ID, and where to store the logs) and then data nodes 1 through 4 with their host names or IP addresses. You can get sample configuration files from the online manual and change them according to your requirements.

The next step is to install the MySQL server; this is the command to install it, and from the client you can see that, yes, the MySQL server is installed. Then the next step is to start the cluster processes. There is a particular sequence you have to follow: the management node should start first, then all the data nodes, then all the MySQL servers.
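As a rough illustration of the cluster configuration file described above, here is a minimal config.ini for one management node, four data nodes, and one SQL node. The host addresses, node IDs, paths, and memory value are placeholders chosen for this sketch; adjust them to your environment (the online manual has complete samples):

```ini
# config.ini -- cluster-wide configuration (placeholder hosts and paths)
[ndbd default]
NoOfReplicas=2                   # two copies of every fragment
DataMemory=512M

[ndb_mgmd]
NodeId=1
HostName=192.168.0.100           # management node
DataDir=/var/lib/mysql-cluster   # where the cluster logs are kept

[ndbd]
NodeId=4
HostName=192.168.0.101           # data node 1

[ndbd]
NodeId=5
HostName=192.168.0.102           # data node 2

[ndbd]
NodeId=6
HostName=192.168.0.103           # data node 3

[ndbd]
NodeId=7
HostName=192.168.0.104           # data node 4

[mysqld]
HostName=192.168.0.105           # SQL node
```

On the SQL node, the my.cnf then only needs `ndbcluster` and `ndb-connectstring=192.168.0.100` in its `[mysqld]` section to join this cluster.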
So, start the management node first: you issue this command to start the management server, and here you can see the MySQL version and the NDB version, and that it has started. Then check the status from the management client, ndb_mgm; this is the command that shows the cluster status. You can see the four data nodes, which are not started yet; the management node, which is running; and no MySQL server connected yet.

Let's start all the data nodes. This is the command to start a data node, and I run it on data nodes 1, 2, 3, and 4; you can see that all of them start. Checking the status from the management client again, the nodes come up one by one: 4 is the node ID of the first node, then 5, 6, and 7. Once they are all up, look at the status: all four data nodes are started, two node groups have been created, 0 and 1, and the management node is running.

The next step is to start the MySQL server; this is the command for that, and checking the status again, the MySQL server slot that was waiting for a connection is now connected. Let's do some transactions on the cluster. Start the MySQL client with this command, create a database test1, create a table inside it, and insert some rows; I have inserted seven rows. Then perform a SELECT to see how many records are there: you can count 1 through 7. So at this point our cluster is up and running, and we have run some transactions on it.
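The demo transaction above, sketched as SQL; `test1`, `t1`, and the sample values are made-up names for illustration. The detail that matters is ENGINE=NDBCLUSTER, which makes the table live in the data nodes, auto-partitioned across the node groups, rather than in a local storage engine on the SQL node:

```sql
-- Run from the mysql client connected to the SQL node.
CREATE DATABASE test1;
USE test1;

-- ENGINE=NDBCLUSTER stores the table in the NDB data nodes.
CREATE TABLE t1 (
    id   INT NOT NULL PRIMARY KEY,
    name VARCHAR(50)
) ENGINE=NDBCLUSTER;

INSERT INTO t1 (id, name) VALUES
    (1,'a'),(2,'b'),(3,'c'),(4,'d'),(5,'e'),(6,'f'),(7,'g');

SELECT COUNT(*) FROM t1;   -- returns 7
```

Any other SQL node connected to the same cluster would immediately see the same seven rows, since the data lives in the data nodes, not in the MySQL server.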
Now let's look at one of the key features: self-healing. The idea of self-healing is that if a data node process crashes, it restarts automatically; you don't have to worry about how to bring it back. Let's do a ps -ef for ndbd, the data node process, then do a kill -9 on that process ID, and see what happens. You can see that node 4 reports a node shutdown; this is the node we killed, and now it shows as not connected while the other three are still up and running. After some time you can see it starting automatically, and then the node is started. If you check the status of the cluster now, all four nodes are running again. You can do the same thing with multiple nodes; they will restart automatically after some time.

Another feature is no single point of failure: you can crash multiple nodes at the same time and the cluster will survive, with no loss of data. Let's kill one of the data nodes; you can see that one data node is going down. Then kill one more data node, this time from the other node group, node group 1. One important point is that you cannot crash both nodes of the same node group: in that case the cluster shuts down, and there is no way to recover unless you already have a backup in place. That is why we say NoOfReplicas of 2 means two copies of the data are kept. In this example, node group 0 contains nodes 4 and 5, and node group 1 contains nodes 6 and 7; node 4 replicates with node 5, and node 6 replicates with node 7. As long as at least one node from each group is alive, the cluster survives with no data loss. So this is a typical scenario where half of the cluster is down and half is up and running; you can check the status and see that half is down and half is running.
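The survive-or-die rule described above, that the cluster lives as long as every node group keeps at least one node, can be captured in a tiny sketch. Again this is an illustrative model using the demo's node IDs, not NDB's actual failure-handling code:

```python
def cluster_survives(node_groups, dead_nodes):
    """The cluster survives only if every node group still has at
    least one live node holding a full copy of its fragments."""
    dead = set(dead_nodes)
    return all(any(n not in dead for n in group) for group in node_groups)

groups = [[4, 5], [6, 7]]  # two node groups, as in the demo cluster

# Kill one node from each group: half the cluster is down, but it survives.
print(cluster_survives(groups, dead_nodes=[4, 6]))  # True

# Kill both nodes of node group 0: that group's data is gone, cluster stops.
print(cluster_survives(groups, dead_nodes=[4, 5]))  # False
```

This is why the demo is careful to crash one node per node group: any combination of failures is survivable except losing an entire group.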
Now let's run a transaction and see whether transactions still work. Going back to the same table, actor, let's do a select: SELECT from actor WHERE actor_id = 5, and the row comes back. That means even when half of the cluster is down, we can still perform our transactions. It's a simple query that I chose for the demonstration, but you could run a big workload against the cluster just the same.

Finally, these are the cluster resources: this is the link where you can download the cluster binaries, manuals, and documentation, everything, and we support multiple platforms.