The presentation is done with Google Slides, so you can ask questions while I'm talking, and I'll answer them when we get to questions at the end of the presentation. You have the URL on the screen. I think we can start soon. OK, let's go.

So welcome. This session is about accelerating and scaling up Drupal with NoSQL solutions. Which first begs the question: what is NoSQL, and especially, what is NoSQL in the context of this talk? Basically, when we say NoSQL, we are talking about two different things. First, there is the original idea of NoSQL meaning literally "no SQL": nothing SQL in the application. This is what is meant when talking about NoSQL engines: key-value stores, structure stores, document stores, graph databases, columnar databases, and so on. Most often there is no fixed schema in these; they don't support joins; they don't support referential integrity with foreign keys, and so on. What they do imply is application-specific design, meaning you can't just take a normalized schema and expect the code to work on whatever engine you put behind it. And this is a poor match with the Drupal architecture, actually.

And then you have the second wave: NoSQL with a big O, usually meaning "not only SQL". This is more what we are doing with Drupal when we talk about NoSQL. It means keeping our SQL database schema and the tools that are built on it, and then extending that with further tools based on NoSQL technologies. Basically, the purpose is to offload work from the main database to other tools so that you can scale better. And this raises the question: what is the difference between performance and scalability here? Mostly, performance is about serving the same page faster.
And this is often confused with scalability, which may mean serving the same page slower at times, but being able to serve more users. NoSQL may improve both your performance and your scalability, but in a small number of situations it will reduce your performance while still improving your scalability. So the question then is: do you need it? How do you know if you need it? Of course, if you are here, it probably means you have a slow Drupal website or two. That happens to everyone, I guess. But to know if NoSQL can help you, you need to do observation: you need observability tools and monitoring in place, and then observe what is making your site slow. Consider the example here. You have this site, and if we didn't have this graph, we would never have known what was causing the site to be slow. In this case, it was a slow request which just happened because of poor configuration somewhere. When you don't have tooling available, you don't know why your pages are suddenly slow, and you can do nothing. But if you see this, you can see it is a database issue, for example.

So what do we have to perform observation on Drupal? Core itself has built-in observability tools. A little-known feature is the database logger. If you have looked at the database docs, you can see that you can log all queries going to the database engine before they are executed and send them to something else. This is what is actually used by the webprofiler. You also have ways to observe the cache operations: in Drupal 7 you have the Heisencache module for that, which I wrote, and a similar set of features in Drupal 8 is included in the webprofiler — and I know that Moshe Weitzman has been doing work on that. And finally, when you are looking at the time spent building your pages, there's a wonderful tool by Wim Leers, which is renderviz, which lets you see what your cache contexts, your cache tags, and your cache lifetimes are.
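To make the database logger concrete: core's Database API exposes it programmatically. Here is a minimal sketch, assuming a bootstrapped Drupal 8 site — the `observe` logging key is an arbitrary name I chose for the example:

```php
<?php

use Drupal\Core\Database\Database;

// Start collecting every query sent through the default connection
// under an arbitrary logging key.
Database::startLog('observe');

// ... run the code you want to observe here ...

// Retrieve the collected queries. Each entry holds the query string,
// its arguments, the caller, and the time spent, among other keys.
$queries = Database::getLog('observe');
foreach ($queries as $entry) {
  printf("%.2f ms  %s\n", $entry['time'] * 1000, $entry['query']);
}
```

This is the same mechanism the webprofiler builds on; it is cheap enough to leave running while you hunt for the slow queries.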
So with all that, you can observe what's going on in the site and see what is being slow. But then you have these metrics, and you need to do something with them: you need to take the observations and put them into a more permanent system. And here Drupal doesn't help you; you really have to have something outside Drupal. The best solution I've seen currently is combining Prometheus or InfluxDB with Grafana. You can also use a module written by a number of people, including Berdir, which is Monitoring; it supports a ton of metrics about your Drupal instance and can be used from Grafana and Prometheus using Telegraf.

There's a well-known saying — "if you can't measure it, you can't improve it" — which applies not only to business, where it was originally coined, but to Drupal: if you want to improve your speed, you need to measure and know what is slow. Then, once you have the root cause of the slowness identified, you can fix it. And hopefully you see something like this. This was a mainstream site in France with many, many slow requests, and it was just a matter of changing settings on Memcache at the time to go from this high number of slow requests to this low number. So that's how you proceed: you measure, you observe, and then you decide what to do. One thing not to do is throw solutions at the problem in the hope that something will fix it, which is all too common with people who are not specialists in performance.

So should you just go ahead and deploy NoSQL on your site? Well, it really depends. First, Drupal, as you know, is built on SQL; it has been from its inception, and some parts of it actually depend on the database being SQL. Views, especially, has deep knowledge of the database schema and builds its queries from that schema. And most sites these days in Drupal 8 depend on Views, even more so than in Drupal 7, I guess. Core, though, can accept the database being replaced.
But many contrib modules don't have this level of flexibility: they assume they have an SQL database because they want to perform SQL queries directly, and then you have a problem. So contrib support for NoSQL is limited. Another limitation is that most contrib modules in the NoSQL space have not been ported from Drupal 7 to Drupal 8; we will see a table later on.

So, still, technically it usually makes sense. But then you have to ask questions about costs. If you add a new server and a new technology, you add build costs. So you have to balance the gains you get from better performance or scalability against the added costs of building and maybe running the site, to see if it's worthwhile. Just take an example: three years ago at Dev Days Milan, there was a driver written by Roy Segal at Gizra for RethinkDB. RethinkDB was a very interesting database, and he had spent a lot of time at Gizra working on this driver; the module was ready at Dev Days Milan. And then, a few months later, the company behind RethinkDB shut down. So the code went essentially dead, because the product has been open sourced, but there's no longer anyone supporting it professionally.

And finally: do you really need it? This is an early comic from the beginning of the NoSQL wave, and it's fairly typical, actually: you see younger developers, usually junior developers, who are in love with NoSQL technologies because it's the hype of the moment, and they really want to use it. But maybe it's not needed. Most of the time, when we use NoSQL stores or other such technologies, it's for bigger sites — the ones which actually have traffic in the million-views-per-day range and above.

So what is in SQL by default and can be moved out of the SQL database, in all versions of Drupal? Well, first, that's page caching. You don't usually think of it as a store, but it still is one.
By default, when you start with Drupal — and you may know this already — browser caching is little exploited, especially in Drupal 7; it's better in Drupal 8, of course. But the page caches themselves are kept in the database. That's true both for anonymous and authenticated users, in different bins. So any time you want to serve a cached page to a user, you need to open a database connection, if it's not persistent, and send a few queries just to ensure that the IP is not banned and that the page is available — queries just to serve something which could be done outside the site.

So what can you do outside your SQL database? Well, the first step is putting the browser cache to better use, meaning the request won't even leave the user's browser. But that doesn't work for the page itself, usually. So you can use a CDN. A CDN will improve your site's scalability a lot. The CDN module, for example, is used by about 2,000 sites, and there's also a module specifically for Akamai, the biggest CDN in existence, which still has about 600 sites using it. This will typically give you a time to first byte of about 15 milliseconds in Europe. And of course, the best thing with CDNs, really, is that they are truly web scale: you can serve worldwide traffic peaks, if you are lucky enough to have them at some point. One level cheaper, you have Varnish or some other reverse proxy in front of your site, which can sometimes give you an even faster time to first byte, usually below 10 milliseconds, but which of course is not web scale, because you typically have only a single hosting location. And when you are using these, all the cached pages don't even make it to your site, which is the best thing for scalability, of course.

Now, when you really have a page that needs to be served by your site, the request has to enter Drupal itself. This is a summary of the performance and scalability solutions, along with their price. The graph is maybe not very easy to read.
Basically, the size of the circles is the runtime cost of the solution. So what you have by default — yeah, correct — is Drupal with no tuning. This is, of course, the simplest and cheapest to build. But it's also expensive to run, because to get decent performance you need powerful machines, and you have to spend a lot on hosting. With a little extra work, you can get a tuned Drupal — no NoSQL solutions involved, no Varnish, no anything — and it still performs much better. And it also costs less, because it is faster on the same hardware. When you are doing a small to medium site, this is often enough. To take an example: in France, a few years ago, we built a site for one of the largest e-tailers in France — Pretty Commerce, for those who know them. On the first day of opening, we had a problem with operations, and Varnish was not available. And that was on the equivalent of Cyber Monday, which we call "les soldes" in France. Of course, the site had trouble with that. We just made a few adjustments to the site, and we served the whole onslaught of users with just the tuned Drupal, and it still worked, at about 3,000 transactions per minute. So when you are below that level, maybe you just don't need anything more.

However, it's usually simpler to do basic tuning and add a reverse proxy in front of your site. Here, you can see that your scalability improves — the vertical scale is scalability, and the horizontal one is time to first byte. Your time to first byte is lower, you are to the left, and your scalability is better, at the cost of having an extra server in the form of Varnish with some configuration. This is usually the optimum for professional sites: not the smallest ones, but anything medium to large is best served by this.
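As a side note: for a browser, Varnish, or a CDN to cache pages at all, Drupal has to emit suitable Cache-Control headers. In Drupal 8 that comes down to a couple of configuration overrides in settings.php — a hedged sketch, where the one-hour max-age is an arbitrary example value, not a recommendation:

```php
<?php

// settings.php — override the page cache max-age, normally set in the
// UI at /admin/config/development/performance. This value ends up in
// the Cache-Control header, so browsers, Varnish and CDNs can all
// reuse the page without hitting Drupal.
$config['system.performance']['cache']['page']['max_age'] = 3600;

// Aggregating CSS and JS also cuts down the number of asset requests
// that reach the site.
$config['system.performance']['css']['preprocess'] = TRUE;
$config['system.performance']['js']['preprocess'] = TRUE;
```

The same values can be set through the performance UI; the settings.php override just makes them deployment-controlled.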
And then, if you have a very high population of users, and especially if your site has a worldwide user base, you want to use a CDN, to better serve your assets especially. But this has a runtime cost which is significant, and the performance may not be as good as serving directly from your own system with Varnish: usually we consider Varnish to serve at around 10 milliseconds, while Akamai, for example, will serve at around 15 milliseconds time to first byte.

Now, when we're talking NoSQL in Drupal, most of the time we are thinking of storage solutions. And for this, we have really three big solutions — maybe even just two big ones and one smaller: Redis, Memcached, and MongoDB. So what are they, for those who don't know them exactly? Redis is usually classified as a key-value store, although it's actually much more powerful: it is a structure store which can do lots of things, most of which are not used very much in Drupal, at least. There's a module for it, of course, which is just called Redis. Redis is also the first-ranking key-value database on the DB-Engines ranking. And it's used by a lot of Drupal sites: around 10,000 Drupal 7 sites and also 10,000 Drupal 8 sites. The interesting thing about it is that this is one of the few modules with more sites using it on Drupal 8 than on Drupal 7 — both are around 10K, but there are more Drupal 8 sites than Drupal 7 sites. Currently it's supported by MD Systems, and, mostly for the previous versions, by Makina Corpus in France.

Then you have the long-standing solution for caching, which is Memcached — known and loved by many, hated by some — which is just a key-value server instead of a structure server like Redis. It's supported by two modules: the main one is Memcache, without the d, and there's also Memcache Storage. Both of them exist. Memcached is very popular too; it's third on the DB-Engines ranking.
And it's also fifth on DB-Engines if you count the implementation by Hazelcast, which has features like consolidation, replication, sharding, and so on. And it's the most used NoSQL solution in all of Drupal, because you have over 30,000 sites on Drupal 7 using it, and 15,000 sites on Drupal 8. By comparison, the numbers between parentheses are for the sites using the Memcache Storage driver. At some point, Memcache Storage was more interesting than Memcache as a module to drive Memcached, but then maintenance caught up in Memcache, and these days you really want to use Memcache and not Memcache Storage: the code is much better, it has been better tuned for Drupal 8, and it's overall the better solution — especially if you consider the companies supporting it, which are real heavyweights in our Drupal world: Acquia and Tag1 Consulting.

And finally, you have MongoDB and Cosmos DB. Cosmos DB, for those who don't know it, is the fourth most popular document store on DB-Engines. It's a Microsoft product for Azure which is compatible with MongoDB, so you can use the MongoDB driver both with the original MongoDB and with Cosmos DB if you're running on Azure. And unlike Redis and Memcached, this is a document store, meaning you can store things in it, like you can in the others, but you can also query values without knowing their ID. This is a big departure from the key-value stores, because in key-value solutions and structure servers like Redis, you have to know exactly the ID of the data you want to access — which is not needed with MongoDB. MongoDB is the first database in the DB-Engines ranking for document stores. But on Drupal, it's used much less than the other two, because it's more complicated to use in most cases: there are only 300 sites using it on Drupal 7 and about 50 sites on Drupal 8. Also, as we will see in detail, like for all of these technologies, the past of the MongoDB driver has been a bit complicated.
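The key-value versus document-store difference can be shown in a few lines of PHP. This is a hedged sketch, assuming running local Redis and MongoDB servers, the phpredis extension, and the mongodb/mongodb library; the key `node:42` and the `drupal.nodes` collection are hypothetical names I made up for the example:

```php
<?php

// Key-value store (Redis via the phpredis extension): you can only
// fetch a value when you already know its exact key.
$redis = new \Redis();
$redis->connect('127.0.0.1', 6379);
$node = $redis->get('node:42'); // must know "node:42" up front

// Document store (MongoDB via the mongodb/mongodb library): you can
// query by the content of the documents, without knowing any ID.
$client = new \MongoDB\Client('mongodb://127.0.0.1:27017');
$articles = $client->drupal->nodes->find([
  'type' => 'article',
  'status' => 1,
]);
```

That `find()` by field values is exactly what a key-value server cannot give you, and it is why MongoDB can back things like entity storage while Redis and Memcached stay in the cache and key-value roles.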
So what do we have with Redis? First, Redis is supported by two extensions at the PHP level, both of which are supported by the driver. The driver includes support for cache, which is its main use, but it also allows you to take the flood service out of your SQL database, and the lock service, including the persistent lock service, which is only used by core itself. And it can also help you take your queue service out of the SQL database, although that's maybe not the optimal solution for queuing. It doesn't have path support, and there's also an extra module, Redis Watchdog, which combines a logger service — like we have with dblog and syslog — with a UI, if you want to use it.

Recently, Berdir asked me to mention that there's been a long-standing issue, for about three years I guess, about node list invalidations causing deadlocks on sites with some traffic. That has at last been fixed only some weeks ago, and it will be available as soon as Drupal 8.8 becomes available, because the patch is only for Drupal 8.8; at that point, Redis will support it. Also, a question I've seen from customers using Redis is about the 5.0 version of phpredis, the extension, which broke the driver and the Drupal sites using it, so they had to roll back. This is now supported with the latest release, even on Drupal 8.7, so the maintainer asks you to please test the module and report on its status.

How does it fare in terms of performance? On a single server, Redis is always among the fastest stores you can find, and it's often actually the fastest one — and that's with concurrent access. About persistence: unlike Memcached, Redis can be either in-memory only or persisted to disk, in two different ways. RDB performs snapshots, which you can then take away for increased resilience; AOF, Append Only File, gives you syncs to disk at a per-second rate. This is less compact.
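As an aside, wiring the Redis module up in Drupal 8 is a matter of a few lines in settings.php. A sketch based on the module's documented settings — the host, port, and module path are example values for a typical install:

```php
<?php

// settings.php — typical Redis module (8.x) configuration.
$settings['redis.connection']['interface'] = 'PhpRedis'; // or 'Predis'
$settings['redis.connection']['host'] = '127.0.0.1';
$settings['redis.connection']['port'] = 6379;

// Send the default cache bins to Redis instead of SQL.
$settings['cache']['default'] = 'cache.backend.redis';

// Optional: also move the flood and lock services out of the database
// by including the services file shipped with the module.
$settings['container_yamls'][] = 'modules/contrib/redis/example.services.yml';
```

With just the `cache` line you already offload the heaviest bins; the services file is what takes flood and lock out of SQL as described above.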
The AOF file cannot be directly taken off site for added security, but it improves your resilience to data loss. However, at this point, Redis becomes slower than many solutions, including MongoDB. Another reason to use Redis may be fault tolerance, which is built in with the Sentinel feature, which supervises a cluster of master/slave deployments. It allows scaling using clusters: you start from a master instance, which is replicated to slaves, which are then replicated to more slaves. The main weakness in that scheme is that there is no strong consistency — and this is a hallmark of all NoSQL solutions. Also, if you really want to deploy the Sentinel-and-cluster topology, the recommended minimum set is six instances. You can do it with fewer — maybe most sites do — but the recommended deployment is six servers or more. It's available in the cloud with Redis Enterprise Cloud, which supports the project itself, of course, and also on AWS, on Azure, and on Google as Memorystore, and many others.

Now about Memcached, a long-standing project. You can use either the memcache or the memcached PHP extension; both are supported. The memcache extension is usually faster, although it's older — often by a significant amount — but it's usually less available, especially on PHP 7.3. memcached is more available and seems to be the future as far as Memcache is concerned. What the module provides is the driver adapter, of course, including the cache driver with invalidations. And that's something which needs to be mentioned, because early NoSQL implementations of caching did support putting the cache data in the NoSQL store, but they still had the invalidations done in SQL, which took away a lot of the performance; at some point, both Redis and Memcache moved to having the invalidations done in the NoSQL store as well. It supports the lock service, but it doesn't include the persistent lock service.
Sessions were once a hallmark of performance-conscious sites using Memcached, but they were removed during the 7.x lifecycle and were never part of the Drupal 8 version. There's also a limited monitoring UI. If you compare with the Memcache Storage module: it has the cache with the core invalidations too, but it doesn't support the lock system, and it also includes a monitoring UI. Recently, the same node list invalidation deadlock has been fixed here too.

How does it stand up to Redis? On a single server, it's much slower than Redis most of the time — that's compared with Redis in memory-only mode. It's still among the fastest stores you can find, apart from Redis, and it's usually comparable to, although a bit faster than, MySQL or MongoDB used in key-value mode. A recent development in Memcached, the server itself, has been the ability to persist data. For a long time — ever since it was created — Memcached was memory-only, meaning any time your instance went down, you lost the data. With extstore support, you can now persist data in Memcached, although it's usually not a great idea: a lot of logic in the Drupal caching system expects that when something stops being available at some point, it doesn't become available again until it is regenerated. And with a persisted extstore, data which stopped being available when an instance went down may become available again when the instance comes back up and the extstore is restored. So, not a great idea, and it's usually pretty expensive.

In terms of fault tolerance, this is really where Memcached shines over Redis: because you typically deploy Memcached as a set of many instances, all of them sharded across at least two servers, you never get data loss when one instance goes down, and you get instant availability, unlike failover solutions which have a delay for switching from one instance to the next.
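For reference, the Memcache module's Drupal 8 configuration follows the same settings.php pattern. A sketch based on the module's documented settings — the server address and key prefix are example values:

```php
<?php

// settings.php — typical Memcache module (8.x) configuration.
$settings['memcache']['servers'] = ['127.0.0.1:11211' => 'default'];
$settings['memcache']['bins'] = ['default' => 'default'];

// A key prefix keeps several sites apart when they share one daemon.
$settings['memcache']['key_prefix'] = 'example_site';

// Send the default cache bins to Memcached instead of SQL.
$settings['cache']['default'] = 'cache.backend.memcache';
```

The `servers` array is also where the multi-instance, sharded deployments described above are declared: you list every instance, and the client shards keys across them.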
Also, if you are using the Hazelcast implementation — a separate product speaking the Memcached protocol, among others — you can have replication, which allows multiple Drupal instances to share a replicated cache, allowing you to go to a very large scale. It scales well because of the cluster sharding. It also has consistent hashing, meaning you can add or remove servers depending on your workload, to improve your site's scalability and gain some elasticity in the face of demand. The fact is, it's often not well understood: you need 10 to 20 instances on a typical Drupal site, whereas some companies will tend to put up just one Memcached and claim it's already enough. It's not, really. You have limited monitoring available in the module itself, but it's better to use a contributed product called phpMemcachedAdmin, which gives you a lot of features to administer, configure, and even modify data in the cache. It's also cloud native, like Redis, on AWS, Azure, and Google.

Now, my personal favorite, of course, is MongoDB, since I'm the main maintainer of the module. In Drupal 7, the module had a lot of features. It supported both extensions — the mongo extension and the mongodb extension. It supported the early MongoDB versions and still supports up to version 4 today. And it includes the driver adapter for MongoDB, like the other NoSQL solutions. It supports a block system which is independent of the Drupal block system and completely replaces it — it's not just a block cache. It supports the cache driver, the path driver, and the queue driver. What about path? This is something which is not very well known. But if you look at the workload on many sites having authenticated users or serving uncached pages, it's quite common for the path queries to take about one third of all the queries done on a Drupal 7 site.
So just by switching to a NoSQL path plugin, like you have in Redis for Drupal 7 or in MongoDB for Drupal 7, you remove, almost magically, with nearly zero configuration, about one third of your SQL load: you just enable the plugin in your settings, and your load goes down. It also supports the queue, like Redis. And some parts of the module are available but not supported; the best known one is the field storage, which enables you to store your Drupal nodes — and any other entities, actually — in MongoDB as single documents, instead of the star schema of per-field tables. This was developed for the White House and examiner.com. It has very high performance, but it's also very unstable: you really need to have someone able to maintain it if you choose to use it. There's also log support, session support — although that's not great — and the watchdog module, which includes the logger and the UI. There's also an extra module on top of it, EFQ Views, which allows you to use Views to build queries on top of MongoDB instead of building them on top of SQL.

Then you have the currently supported version, 8.x-2.x, which was released last year at Drupal Europe. It supports only PHP 7 and the latest driver, with the new contributed PHP library, which is mongodb/mongodb on Packagist. It supports only MongoDB 3 and 4. It includes the driver adapter, and the key-value storage, which is the backing for State and can also be used as a backing for cache. Key-value expirable is included too, which is used for things like previews. And again, the watchdog module, with the logger and UI. It has CLI support for both Drush and Drupal Console, and has experimental entity field storage at this point: you can store entities in MongoDB and load them back, but it's not completely safe yet.
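Configuring the 8.x-2.x module also happens in settings.php, through client and database aliases. A sketch following the module's documented settings layout — the URI and database names are example values:

```php
<?php

// settings.php — mongodb module (8.x-2.x) configuration.
$settings['mongodb'] = [
  'clients' => [
    // Client alias => connection info for the mongodb/mongodb library.
    'default' => [
      'uri' => 'mongodb://localhost:27017',
      'uriOptions' => [],
      'driverOptions' => [],
    ],
  ],
  'databases' => [
    // Database alias => [client alias, database name].
    'default' => ['default', 'drupal'],
    'logger' => ['default', 'logger'],
  ],
];
```

Submodules like the watchdog logger then refer to these aliases rather than to raw connection strings, which keeps per-environment differences confined to settings.php.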
And you also have a complementary module in a separate project, the MongoDB indexer, which allows you to send your Drupal nodes to MongoDB and query them outside Drupal itself, without having to maintain knowledge of the Drupal schema. This is much simpler to use for front-end applications: when you have a headless site built in Node.js or Go or whatever, or Symfony, you can just query the copy of your indexed data in the Mongo store instead of going through the Drupal schema.

Now we reach the realm of the bizarre and the weird, which is the 8.x-1.x branch. This was a Drupal NoSQL distribution — NoSQL in the first sense — sponsored by MongoDB, with the goal of having Drupal run without a SQL engine. It was led by chx, Károly Négyesi, and other developers; I was one of them. However, after several years of Drupal 8 development, MongoDB pulled the plug and development stopped. You can still use it and start from there, and this is exactly what one adventurous developer has done with the DrewMongoose distribution, which supports the current versions of PHP, of Drupal core, and of Mongo itself — again, a NoSQL distribution allowing you to run Drupal without a SQL engine. It needs a significant core patch, which is more or less kept current, and the specialized driver itself. When you are using this, you have a Drupal that runs without SQL. The project is active, and the maintainer is available to respond; there are issues on Drupal.org, but development is done on GitHub.

Now, how does Drupal work with MongoDB? Here we have an example of a situation where we converted a social network developed on Drupal itself. We had an optimized version running for some time on SQL. Then we moved it to MongoDB, and you can see how that went with our slow queries in the SQL engine.
For hard numbers, which can be difficult to read here: basically, our slow queries went down by 85%, and the response time to build the pages server-side went down by 98%. So we were about 50 times faster using MongoDB than using SQL. More generally, if you use MongoDB, what do you get? The engine features fault tolerance within the engine itself: it has built-in replication and sharding, and the recommended configuration, unlike Redis, is not six servers but only two data-bearing servers and one small server as an arbiter. It scales well, and it's also data-center aware, meaning you can have multiple instances deployed to several sites in geographically different areas; queries will then go to the closest server, to ensure the best performance while still staying in sync with your main server. Monitoring is provided by MongoDB: there is, of course, a paid offering, but there's also a free tier which provides online monitoring in the cloud. And it's also available in the cloud natively — by Azure as Cosmos DB, which is the fastest-growing database on DB-Engines; by MongoDB themselves as the Atlas service; and also by mLab, formerly MongoLab.

This is a summary of what you get with the main cache and NoSQL solutions. As you can see, Drupal 7 had more features in all cases than we have in Drupal 8 currently. The features within parentheses are still being worked on, so they may appear at some point. In the case of MongoDB, for example, the next one is likely to be the path plugin, because it's the most useful after those already existing.

And now, if you have been interested in scalability and NoSQL solutions for some time, you may remember that at some point, during the Drupal 6 and 7 lifetimes, there was a lot of interest in NoSQL — more so than today. And at the time, there were a number of engines which became popular and had drivers in Drupal. So let's see what happened to them. The most maintained engine today is Neo4j. Neo4j is a graph database.
It's the only one of its kind to be supported in Drupal. And it has a driver wrapper allowing you to write applications for Neo4j in your own custom code within Drupal. The main point of interest is that it's supported — well, at least it works, and there are no blocking bugs — and it's usable for 8.x. The other one still existing on Drupal 8 is RethinkDB, of which I talked a bit earlier. Of course, the database itself no longer has commercial support, but it's community supported. And it has a Drupal 8.x driver which does not include any services at the Drupal level, but includes the wrapper and an ORM, which means you can write your own application on top of it and still stay within your Drupal code. For the other databases which used to be available, as you can see, there are Drupal 7 drivers and no Drupal 8 ones — the exception being Elasticsearch, which has more complete support and is still maintained for Search API. Some of them didn't even make it to 7.x: these are Apache Cassandra and Tokyo Tyrant, which only exist for Drupal 6.

Now, there's one specific case of storage I want to mention, which is sessions. If you go back a number of years, you'll find that many people tried to scale their Drupal sites by moving the sessions to Memcached. This was very popular, and you may have noticed in this talk that these days there is exactly no session support in any of the NoSQL solutions, and you may wonder why that is — especially if you consider that the memcached extension has support for sessions baked right in. Why did we remove it? The thing is, in Drupal we have session data which can get big, and Memcached, by construction, is limited to one megabyte or less per value unless you configure it specially. And it becomes especially inefficient as soon as your data gets large, because of its page allocation system — I won't dive into the details, but suffice it to say that it's a really poor match for sessions.
Now, another problem with cache — any cache solution outside persistent storage — is that you can lose your sessions at any time: if your cache server goes down and you are not on persistent storage, you lose your sessions, and all users get logged out and lose the content of their sessions. Not only are they disconnected and have to log in again, but they also lose whatever data was stored for them, and this is not something you want for sessions. And one of the consequences of this is a vulnerability: it's very easy to DoS — to apply a denial of service to — a site using Memcached, or any kind of cache, as the basis for its sessions. As soon as you find a form which generates an entry in the session data, you just have to generate enough entries, by hitting that page repeatedly with various parameters, and you will generate new sessions which push out the sessions of real users, locking them out. If you go fast enough, you can even prevent the administrator from logging in. So it's a very bad idea — although the Symfony people do it, and it's even the recommended practice for Symfony these days.

Now, the biggest gain to be had from NoSQL is logs. You probably know the problem with the Drupal core dblog module, database logging: it tends to write a lot to the database, especially if the code is not of pristine quality, because it sends warnings and notices and info — and sometimes debug info — to the database, and that puts a lot of load on your machine. I once audited a site which sent 1,500 insert queries per minute just for dblog, and the site owner didn't understand why the site was slow. Basically, the site had been written by a company which did a great job, so for them there were no spurious insertions and it was just fine to use dblog.
Then it was taken over by a third-party maintainer which was cheaper, and of course the extra work they did was not of the same quality, and the number of warnings and code errors increased dramatically, to the point where the site just didn't work because of the load of the database log. Now, there's a simple NoSQL solution within Drupal core itself just for that reason, and that is the syslog module. You can use syslog to send your logs outside your database, to a syslog server on your machine, and this is recommended for all but the smallest sites you can find. The one problem with syslog is that there is no UI, so although you have logs, you can't easily browse them: you have to go to the command line or deploy extra tools, which is rather weak. See how that looks when you switch from traditional logs to MongoDB logging, for example on a large site I had to work on this year. Here you see the number of operations performed per second, the number of TCP connections, and the number of downloads. I think you can see the time when we deployed the new solution. That's the MongoDB watchdog module, and I must admit, from what I've seen, it's usually the first reason why people actually deploy MongoDB on their site: they want to see their logs, and syslog doesn't help them there, because the hosting company often doesn't provide access, and they don't want to have to use grep to find anything. With that module you have a UI within Drupal, just like with dblog. For events which occur frequently, you still have only one line for them, and it also does something the other drivers don't do, which is tracing. Meaning, as soon as you have a log event for one request, you can go to the page for that request and see all events emitted by that single request, separated from all others. So you can see that during this page, this, this, and this happened, not mixed in with all the other requests. That's tracing.
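For reference, a minimal sketch of wiring up the core syslog module: enable it, then, if you want the settings under version control, override its configuration from settings.php. The keys below come from the module's syslog.settings configuration object; the facility you pick must match what your local syslog daemon routes.

```php
// settings.php: configuration overrides for the core syslog module.
// Assumes the module is enabled (drush en syslog) and that your
// syslog daemon routes the chosen facility somewhere useful.
$config['syslog.settings']['identity'] = 'drupal';
$config['syslog.settings']['facility'] = LOG_LOCAL0;
```

On the syslog side, you would then route LOG_LOCAL0 to a dedicated file, so Drupal log messages stop hitting the database entirely.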
It's available in Drupal 7, sorry, and 8.x-2.x. Now, if this is not a solution for you, the industry standard solution is the BELK stack, which stands for Beats, Elasticsearch, Logstash and Kibana. It's very well known. It requires quite some work to maintain and develop, especially as Kibana tends to break compatibility with each release, but it's very high performance, and if you have an ops team it's a very good solution to use. It's also available as a SaaS offering from Loggly, Logz.io and some others. Then you have Graylog, which is modelled after the industry standard outside Drupal, which is Splunk. It uses both Elasticsearch and MongoDB for storage, it is run at really large scale by some of the largest companies in the world, and Drupal has a driver for it, in the form of the graylog module. It's also available as a SaaS, but when you are at the level of performance where you need Graylog, you usually have an internal ops team which can maintain it, and SaaS is not really interesting for you. I've included this drawing for later use; you can find the slides on the web after the presentation anyway. And now the question is: do I need these logs?
Basically, if you have a very small site, say you are doing your blog because you are a Drupal developer, or you are doing your company extranet, probably not. You can just use dblog; it's no big deal as long as you are below 100,000 users per day. Now, if you have anything larger than that, you should move to some sort of NoSQL logging solution. It can be syslog if your hosting company provides you with a solution: for example, Acquia and Pantheon both include a service for that, so just use it. If your hosting company offers to push your syslog data to Kibana or something else, say yes and use it. But please don't use the database log for anything serious, and be sure to ask your hosting company for your logs, because you want to see what's happening on your site. If you can't find a hosting company doing that, you have good SaaS solutions, especially Scalyr, Loggly and Papertrail. Only 5 minutes left. Now, a personal thing I find very interesting is moving work from the queue in the database to a queue outside the database. In Drupal, the queue API works with the default database driver in core, and it's included in core. There's also, still in SQL, a contrib module called Advanced Queue, which was developed for Drupal Commerce at the time and ported to Drupal 8, and it's the most used queue system still in SQL but non-core, because it's a requirement for Commerce 2.x, so that makes it very much used. Objectively, whenever you are not a very large site with lots of media and feeds to ingest in real time, just use Advanced Queue instead of the standard queue, and it's good for you. But then, if you have lots of data being processed in asynchronous mode, you should use an external server. The diagram you have here is a site for a French television station, that's French TV sport actually, it was published like this, and as you can see they have both the non-core Redis and RabbitMQ drivers. So what can you do with that? Basically, you receive news feeds in real time. For example, when you are
publishing a sports site, you may receive up to 300 events per second, and you can't just handle that in real time, so you push them to a queue, and you have cron workers which pick up the work. Or, when you have been building a commerce site, because commerce has lots of transactions outside Drupal itself, you typically synchronize your inventory system with Drupal in real time with an asynchronous process, and this goes through a queue to avoid overloading your Drupal. So what are the supported queues in Drupal, in 8 at least? You have Beanstalkd. This is a little-known queue server, but it's very simple to use; it's usually your best starting point when you start to evolve from the built-in core system. It's just one file: you start it and it runs, and that's all you have to do. It's powerful enough that Drupal.org itself depended on it for many years. One level up you have RabbitMQ, which is the industry standard. It was little used in Drupal 7 but has increased use with Drupal 8. I wrote the driver for both, actually, but the maintenance has been taken up by other people, and we now have several hundred sites using it. An Amazon SQS driver was seen in Drupal 7, but it's no longer used these days. And then you have Apache Kafka. Apache Kafka is really the biggest queue you can find; the Drupal driver was created for Carrefour in France. Other servers used to have drivers, namely Gearman, IronMQ and ZeroMQ, but they don't have Drupal 8 versions. Here we have a graph of the popularity of the various queue systems. As you can see, the built-in queue drivers are much more used than anything else, but then Redis, which includes a queue, is widely deployed itself, though we don't know which sites are using Redis just for the cache and not using the queue, for example. What we can see is that RabbitMQ is the most used dedicated queue server by far. So, do I need it? As I said, unless you are performing a very high load of asynchronous work, you probably don't need a NoSQL queue, but if you do, then it's better to
use one, and I would recommend RabbitMQ or Beanstalkd if you are just starting. Kafka is for the very big sites, so unless someone mandates it to you, you probably don't need it with Drupal. It's something which is used by LinkedIn, by Twitter, by Spotify, Airbnb, PayPal, and it can handle many thousands of data inserts per second; no Drupal site has this kind of traffic, normally. Search is a well-known topic, and I won't extend much on that. You know that core has a database search which is quite weak but is bundled. You can get better results, still in SQL, using the Search API database driver, but in most cases you want to use something alongside your site, like Elasticsearch or Solr. There are also two good offerings in SaaS, which are simpler and don't have the running cost of a Solr instance or an Elastic instance: these are Algolia and Google Custom Search. Most large Drupal-dedicated hosting companies include a Solr or Elasticsearch selection, among them Acquia and Platform.sh: that's the main share, I should say, of search solutions for Drupal. If you go beyond core, Search API, with whatever backend it takes, is of course the basis for a solution, and there's a presentation about that today in this room; and after that you find the Solr and Elasticsearch connectors. Same thing as with the queues and the logs: if you have a very small site, core search is enough; if it is not, use something else. If you're curious, the slides include some best practices, but I'm afraid we don't have much time for them. So basically, core tries to support NoSQL, but does it only partly; it's difficult. And if you have to write code that should support NoSQL, the best way to do it is to split your features into parts: one should be the driver, which handles all storage features for your contribution, for your code, and one should be the part doing the actual work. Then, to support NoSQL, you just have to plug in another storage driver. This is much simpler than having all your code be aware of the database
itself. It's much more detailed in the slides; you can find them after the presentation. And finally, for MongoDB itself, the 8.x-2.x version has been written to ease developing specific custom solutions; it's designed for that. So what you get is compatibility with the standard practices of MongoDB, which was not the case with the 7.x and 6.x versions. Here, what you get is either a standard MongoDB client or a standard MongoDB database. If you have to handle multiple databases, get the client from the client factory, and then you can handle multiple databases and topology yourself. But if you can remain within one database, use the database factory, and then you gain the ability to be configured just from settings and still have a full-featured standard database for MongoDB. And that's all I have to say, except this last thing: you should contribute to these drivers if you use them, because it's the best way for you to understand how they work and what their design choices are about. You should also be present tomorrow for the contributions, and we have 10 minutes for questions, if anyone has any. So, I have two questions from the room. First: why do you want to run a cache backend in the cloud? Shouldn't it be close to the production server? Yes, it should, most of the time. Basically, the situation where you want to have a cache off-site in the cloud is first when your site itself is in the cloud. When you are running on AWS, for example, you are in an EC2 instance, and Amazon does special work when working out the allocation of your ElastiCache to your instance, so that the latency is low. So if your site is in the cloud, you have specific services, like ElastiCache, or Memorystore for Google, which optimize the topology so that your latency is good. It won't be as good as being on the same instance, but it will be better than if you spread two instances yourself and bound them over the network. And then another question: can you have the per-request page grouping of watchdog events for syslog
solutions, via ELK or Graylog? Yes, usually: you can perform queries on both of these, as they both use Elasticsearch to store the data, and if you use the proper query you can just group your results with an aggregate query and get the result and the aggregate details. Anyone, any question? I don't know if that's because I said it all, or you didn't understand. Well then, if there are no questions, thank you for attending, and I'll be available the whole day; you can find me in the room. Thanks for coming.