 Are you glad to see all of you think about memcicoil before? No. Partially. Okay. Okay. Okay. So I introduced my presentation. It's in-memory database memcicoil. And then I would like to describe when and how to use it with Drupal. I'm again in the kitchen. I work at French company Smile on the Drupal team lead position. So when we are talking about in-memory databases, all our expectations that disk aren't used at all, data can be lost after server reboot, and it has to be worked super fast. It's all our expectations. But reality is a little bit different. Actually, data is stored on the disk, and these databases respect ACID. Yes. Actually, these databases can keep data in memory. These systems are built as distributed systems. In-memory databases leverage SSD. So no random writes. It uses all features of SSD drives. MDB is a new trend. As memory became cheaper, cache, for now it's a new RAM. RAM is a new disk and disk is a new tape. But there are still bottlenecks. If when we are talking about all databases, bottlenecks are disks, but for in-memory databases, bottlenecks are in the different places. So let me introduce MemSQL. It's a database of the new era. It can be run everywhere on premises and containers, on cloud. MemSQL provides cloud storage, it can store all data as you need. It supports GAY special, JSON K-value storage. You can load data from lots of sources. And it should really work with new, in our new model world. So a few words about MemSQL architecture. As I said before, it uses in-memory databases as a distributed system. So aggregators and leaves are used. So aggregators are used to store information about leaves and the leaves are used to store data. As you can imagine, there aren't servers in our reality that can store more than a few hundred gigabytes of data in RAM. So when you want to store terabytes of data in RAM, you have to have lots of servers. And for it, we need leaves. How it works. For example, in the operation, we send data to aggregator. Aggregator calculates hash value based on primary key or short key. And then based on this value, it stores information on the sum of leaves. When we read, we send data to aggregator. Aggregator takes information from leaves. Then it combines this data and then returns to the client. MemSQL provides a special tool, MemSQL Studio, where you can see the health of your cluster. There are all aggregators, all leaves, and you can monitor how it works. All resources that are used by your cluster. So everything with a glance. When we are talking about in-memory, we want not to lose any data. And let's have a look on the picture how transaction works in this database. So all transactions are written in memory. And then these transactions are pushed to the transaction buffer. When transaction buffer is full, it is saved to the disk. And then it can be stored to the database snapshot. So you can define should your system be fully durable or not. So transaction buffer can be increased or set as zero. If it will be set as zero, it means that transaction buffer won't be used at all and all data will be saved to the disk right after transaction has been received. Or you can use big transaction buffer. Yes, performance can be better, but you can lose latest transactions. So you need to choose process and cons of these decisions. It depends on your application. In the modern world, all databases can be split on two different types. First type is operational databases. And they are used for operations accounting. And they use Rostow tables. It's well known databases like MySQL, PostgreSQL, MSSQL. So all databases that we usually use in the Drupal world. And another big type of databases is analytical databases. It's databases of new era because for now we have lots of data around us that we haven't had before. And we need new way how to store data. So analytical databases use column store tables. So let's have a look what is the difference between Rostow tables and column store tables in MySQL. Actually MySQL can work with both tables. And actually in the same time, in the same database. So some of tables can be in Rostow. Some tables can be in column store. So Rostow, in this case, data is stored on the disk. But then they are loaded to the RAM where it is read from by client. But it needs longer recovery times because all data is loaded from the disk to the memory. So we need to wait some time when all data will be read from the disk. It's very well optimized to be updated frequently and at random. It searches fast arbitrary data and supports as many indexes as you want. Data is stored as is. And actually more space is needed for storage than source data. So for example, if you will compare CSV file and snapshot file, you will see that snapshot file the table will be much more bigger than CSV file. And MySQL by default use Rostow tables. What are about column store? Column store is stored on the disk and data is read from the disk. Data is ready right after database start. It's optimized for batch update and delete queries. Works effectively with reading of big amount of consistent data. But it has a limitation in the MySQL. It supports one index per table only. What is great that data is compressed there and less amount of storage is needed than source data, especially with resorting. It's great because when you have few terabytes of data and it can be compressed, you will see advantages of this solution. And it's very simple to define column store table just to specify special key clustered column store. So should I pay for MySQL? Actually MySQL is commercial database and they use special term unit. It's 8 core CPU and 32 gigabytes of RAM. And up to 4 units can be used for free. Actually lots of applications can use it because you can store about 128 gigabytes for free. It's more than enough for lots of applications. So actually I have heard lots of things from MySQL company that it's super fast, it's super great, but then I found information that it isn't. Yes, and I have decided to check myself whether it's really good or not. So I have built a testing platform and I have taken MySQL 5.7 that is used for Drupal and latest version of MySQL 7. Different types of tables were used to row store and column store and all aspects of working with databases have been covered. All measurements have been done in milliseconds. So first test, it's loading data from the CSV file. As a source I have taken 1 million lines CSV file, it weights about 1.5 gigabytes and we see that MySQL works much more faster here in 6 times faster. Actually it was impressed when I started testing. Second tests are about reading data. So I have taken 4 data sets. It's 11 million rows, 1 million rows, 50,000 and 1,000 rows. Differences were in rows to tables. I have tried to understand whether indexes can help the databases. So some tests have been made with indexes on some columns and without indexes. So the first test is searched by primary key M and surprisingly MySQL works faster on all data sets. It was surprised for me. Actually, okay, let's have a look. Next test. So next test searched by parameters. For you, I have prepared graphics for biggest data sets and smallest because it's most visible differences between use cases. So on a big data set, MySQL wins in few times than MySQL. But if you have just 1,000 rows in the table, MySQL wins. And searched by range. Absolutely the same situation. On the big data set, MySQL works faster. On the small data set, MySQL works faster. And actually it was for many of tests. Not for all, but for many. We see that MySQL works in few times faster if data set is big. And vice versa with MySQL where it's more performant on the small data sets. So actually it's... Yeah, interesting use case. Using small limit. It's limit by 100. And we see that MySQL wins in both cases here. But if we take big limit, MySQL loses. It can do nothing, actually. MySQL works much more faster on the big data set. And inner joins the same situation. And MySQL wins on the big data set. So next tests were about updating data. First test, it is update data consistently. 10,000 rows have been updated. And when we... When I run this test first time, I saw that MySQL rows store update data faster than other cases. You can see that MySQL comes to update data very slowly. But actually we know from theory that it isn't suited for this use case. For the second run, all primary keys in MySQL have been indexed. And it wins. So in this test and from previous test, we see that if primary key is used in the query, then MySQL works better. It's very well optimized for queries with primary key. Next case, it's update data using range. And actually I think MySQL and MySQL rows store showed about the same value, the same result. Adding data. Actually, in the team of 10,000 rows works faster in MySQL. And in the data by big chunks, actually there is no big difference. Actually it's just 10 transactions. So conclusion. I have found that performance of MySQL and MySQL highly depends on the size of tables and the usage of primary keys and requests. So scope of application of this databases. Tables up to a thousand rows, it's definitely MySQL. Tables with more than 100 rows with frequent updates MySQL rows store. And tables with more than a dozen million rows with rare updates, I would prefer MySQL column store. Actually, why I wouldn't suggest to use MySQL for the small dataset because MySQL requires better servers with more resources. So sometimes it costs more money but it won't give you any advantages. But we work with Drupal and Drupal is a quite complex system and actually it was written by smart people. And Drupal very well works with MySQL. It uses all advantages of MySQL. If you will have a look on the screen you will see common queries in Drupal and all queries use primary key. So next test was a Drupal site where I need to show just 10 pages and I have used three datasets. It's just 50,000 nodes and 500,000 nodes. And actually there isn't big differences. And if you just store content and show this content occasionally I would say there isn't a big difference between MySQL and MySQL especially when data is cached. It's about the same values. So for content storing I would say that using cache mechanisms like internal cache, varnish will work more effectively than using MySQL. But from the previous test we saw that MySQL works really great with a big amount of data where we need to do some calculations with data. So when MySQL should be used with Drupal it's definitely real-time analytics platforms different type of dashboards systems with search by parameters like search by hotels, by planes, etc. and systems with frequently updated data when we need to store the safe data fast and get data fast. It can be some IoT systems. So if you would like to use a test named SQL you need to install it. It's system requirements. As I said, you need at least 4 core CPU server with 8 GB of RAM. TCP ports, these TCP ports should be open. As you see it works on the same ports MySQL and actually if you would like to use MySQL and MySQL at once on one server I would suggest set another port for MySQL. It's simpler. And installation guide is well described on the memsql.com site. Second interesting question is client software for MySQL. Actually memsql is practical compatible with MySQL. What does it mean? It means that memsql says to the software clients that they work with MySQL. So clients have no idea that they work not with MySQL but with memsql. So you can use any database management applications as you want. For example I use SQL Pro on the Mac and you can use the same PHP drivers as we use with MySQL. So it's a very familiar stack that we used before. So in the memsql stack MySQL can be changed on memsql without problems. Then you need to install memsql Drupal driver. There is a module for it, memsql. So it can be installed using Composer. Current version is alpha 1. Yes, actually I wouldn't suggest to use it on production but it's ready for testing. You can jump to the project and help me with issues. Then after installation we need to copy the driver to the Drupal folder. This specific folder that is used by Drupal to find custom database drivers. If you don't want to do it every time the module is updated this command can be added to the post update section in the Composer.json file. So after modules update it will be updated automatically. If everything has been done correctly during set installation you will see memsql option on the database configuration page and you can fill all credentials or you can define credentials in the settings PHP file and you can install site using drash or as you want. What you definitely have to know when you start to work with memsql is the difference between memsql and mysql from development perspective. The first difference is a short key. Short key. Memsql requires short key or primary key is used by default if short key isn't defined. This key is used for hash calculation to define on which leaf it has to be stored. Then primary or unique index must be identical to or superset of the short key. This is an obligation of memsql. And default order. Mysql serves data by primary key by default but actually sql standard doesn't guarantee such behavior and memsql can return data in any order. So if it's important for you to show result in specific order then you need to define this order specifically in the query. Actually, that's all. I was a little bit faster. Any questions? Yes, please. So the free version allows you to use a four node cluster. Is that correct? Actually, it can be four different servers, four nodes, or it can be one server with lots of resources. It's up to you how to use it. How long can you use it for until you actually have to use a free version in your experience in terms of when you actually go over capacity until you actually have to upgrade to a full version and use other nodes? There is memsql portal where you have to register and there you should register your application and take license key. I was just wondering whether you can actually set up a Drupal site and actually have a production site with the free version or if that's just for you to experiment and for you to actually use it in production with you, you'd actually have to use it if it would be capable of running a production site. Obviously it depends on the amount of load and whatever. Actually for now I have prepared just proof of concept of some applications and it works. It really works fine. So I run tests using alpha version of Drupal driver. It isn't ready so need to fix few issues because for now Drupal core tests aren't passed fully. But I think I will be able to finish it in few months and it will be ready. Actually 128 gigabytes of data is quite a lot and for my needs it's more than enough. I saw some of your tests there and they were like 50 nodes. I have started with 50 nodes because it was a question should I use MemSQL for example simple blocks or not. Actually I decided that it has no sense because you have to pay for the server with lots of resources but instead of it I would pay for smaller server and save money because on the small data sets you won't see any advantages of this database. Okay cool. I saw here the full text search and is that something that you could potentially use to replace solar in a simplistic way? For full text I would prefer to use solar. Okay. It isn't for full text search it's more for real-time analysis. It's for calculation of data for all of these things. Okay. Yeah it's just for calculation. Yeah it is but solar is very well suited for text searching so you need to understand what tools are suited for better and use it. Yeah. I mean I'm just wondering if there's some underlying tech there just okay you wouldn't actually have to spin up the solar server you can just leverage whatever they're providing but yeah check it out. Any other questions? Yeah please. You mentioned the driver module and it's not quite stable yet but what functionality does it actually provide if it's MySQL compatible? It's as mentioned on the latest slide we need to define a sharp key or we need to ensure that primary key is used there were issues with unique keys so driver leverages the situation and makes to work it possible also there are differences how MySQL and MySQL works by search by strings but actually MySQL driver extend MySQL driver from the call yeah but there are still issues and MySQL provides MySQL module provides integration with Drash so there are a few differences so this model covers all these differences yeah okay thank you for participation