Okay, so welcome to this session about delayed operations with queues. Some of you might wonder why we are trying to delay anything when Drupal 8 has already been delayed so much, but it's not exactly the same kind of delay. Let's see what it's about. First off, who are we? Yuri? Hello, my name is Yuri. I work for the company FFW; I came from ProPeople, where I was mainly a developer, then a team leader, and now I work as an architect, so I'm involved a lot in architecting the projects we implement, and that is where my experience with queues comes from. I've also contributed to Drupal a little: I worked on the Services module, Draggable Views and some others. I'm also working on a little visual regression tool, Backtrac.io, but today we are going to talk about my Drupal background and what we have done with queues. And I'm Frédéric Marand, FGM, or sometimes OSInet, on Drupal.org. I mostly do performance consulting work on media sites, mostly in France. I've been a long-time contributor to Drupal core, since Drupal 4.7 I guess, and I also maintain the MongoDB module suite and the XML-RPC subsystem, which is in core for Drupal 7 and in contrib as of Drupal 8. One thing which may be interesting is that we have already worked on four different customer projects for Drupal 8, even before the Drupal 8 release; some of these are even in production these days. And I'm known by some for having added queuing systems to several projects. So of course the first question is: why do we want to use queues? There are multiple areas where they apply. The first one is speed for the visitors. When visitors perform actions on the website that take some time, it is sometimes reasonable to put those jobs into a queue, display something to the user like "your request is in progress", and show the results once the job has been processed. That way they are not just holding the line waiting for something to happen or not happen, which matters a lot for the user experience. The next one is very similar, but for the editors, and the use case can be a little different. When we create content and there are third-party integrations, we need to push that content somewhere else, or perform computationally heavy actions. This is also something people should not just sit and wait for; they should get a message, ideally with a progress bar, but it will depend. It's very important for editors to get a fast response about what is happening with the content and the work they just completed. Another thing that is very important is scalability. When a lot of visitors start hitting our services very hard, we can run into multiple bottlenecks, database writes for example, and a traffic spike will slow all operations down very fast. If you put that kind of operation into a queue, you can survive the spike and scale quite a lot. That is one end of the spectrum: the part where Drupal itself is slow. At the other end of the spectrum we also have jobs which are intrinsically slow because of the data they are handling.
For instance, if you consider video encoding, you obviously don't want to do the encoding while someone is waiting for the page to display. Or consider a photojournalist going to a fashion show and bringing back tons and tons of photographs that need to be uploaded: you don't want him to post his form, wait for a 16-megabyte photo to upload, click on the next one, and so on. You want operations which are performed in the background from the UI user's standpoint, and which don't interfere with the normal operation of the site. This is the set of problems which can be handled by adding a queuing system on top of Drupal. More precisely, some concrete use cases. Yes, from our experience we have identified several. The first one is creating content for a non-Drupal front end; we will talk about each of these in detail. The second one is anticipated content generation, which is about refreshing caches when we know they are about to expire. Then we have deferred submits; this one is about scalability, where the use case is plenty of users bombarding us with comments which they don't need to see immediately; we will talk about that. Also slow operations like video processing, and external data fetching, for when we need to pull or push content from other sources, which can take some time. And the last one, which you have probably been using all along, is batch operations in Drupal; the Batch API does use queues under the hood. So the first use case is about front ends. Yuri? Yes, the architecture here is that we keep Drupal, which is very familiar and convenient for editors with its nicely structured content, as the back office of our application. For the front end we have so many new technologies that are much faster and more interactive; they can be built in other languages, in JavaScript or whatever, and they can use something other than a MySQL database for fetching the data for their pages. One example is a Silex application using Redis for pulling the data. Redis is blazingly fast, but for Drupal, assembling all the data takes work when the content appears on multiple pages: we prepare the data for each page, refresh it, and put it into the Redis store used by the front end. This operation can take some time, and it matters, for example, on media websites where one article may appear on multiple front pages, each of which is large and needs regeneration. So in this case we keep the pre-edit version of those pages; when the editor changes something, we create a job in the queue telling the workers to regenerate this set of pages; and when the job is handled by one of the workers, it goes through all the pages, generates the content, and puts it into Redis, so the front end simply picks up the updated pages.
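As a rough sketch of that flow, in Drupal 8 style, the producer enqueues one regeneration job per affected page and a separate worker process does the heavy assembly. All names here (the 'frontend_pages' queue, the helper functions) are hypothetical, not the project's actual code:

```php
<?php
// Producer side, e.g. called from hook_ENTITY_TYPE_update(): queue one
// job per front page that displays the saved article.
function mymodule_queue_page_rebuilds($node) {
  $queue = \Drupal::queue('frontend_pages', TRUE);
  foreach (mymodule_pages_showing($node) as $path) {
    $queue->createItem(['path' => $path]);
  }
}

// Worker side, running as a separate process: assemble the page in
// Drupal, then store the result where the Silex front end reads it.
function mymodule_rebuild_page(array $data, \Redis $redis) {
  $html = mymodule_render_page($data['path']); // The slow Drupal-side work.
  $redis->set('page:' . $data['path'], $html); // Swap in the fresh content.
}
```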
The beauty of it, of course, is that at no point does the user experience any lag: until the worker is done cooking the page, the front end still serves the previous version, and only when the new version is ready in Redis does it switch over. No user ever gets a page miss. We can go further than that by anticipating generation. This is something which is not so well known, so let's consider how it works. On the top line you have the typical Drupal workflow for a page: at some point content is generated, and at that point it is considered fresh. It can be put in cache, and indeed most sites you have worked on probably have a caching system in place, so that for the lifetime of the cache entry the data will be served from cache; everything is fast and all is well. However, after some time has elapsed, the data is still valid in the cache, it is not expired, but it is not really so fresh. If you are dealing with high-frequency information, breaking news, agency feeds, or maybe stock quotations, you need very fast-refreshing information, and after a few seconds or minutes in the cache the information is still relevant, you don't necessarily want to replace it before displaying the page, but it is stale, no longer fresh. What does Drupal normally do? Since the entry has not expired, you keep serving the page from cache; it's still fast, but it's no longer so relevant for the user. Imagine watching a sports retransmission and getting the report of a goal only two minutes after it has been scored. And then, when the cache finally expires, one unlucky user gets a cache miss: the page is rebuilt from the source data and is fresh again, but at least one user has taken the miss, meaning possibly several seconds to rebuild the page, which is not a good experience.
What we can do is anticipate the regeneration. The content starts by being created, and while it is still fresh it is served from cache, no change there. However, we introduce an earlier threshold at which the content is marked as stale. It is still served from cache just as before, but at the same time as you send the response to the user's browser, you also push a refresh request to a queue, saying "please update this data". Some worker in the background takes the information about what needs to be refreshed, fetches it, rebuilds the relevant information from the data sources, from the database, from whatever, possibly rebuilds whole pages like on the previous slide, and stores the content in an updated cache entry. At this point, before the cache entry has actually expired, you again have fresh information in cache, replacing the previous version. So whatever the time, you are always serving fresh information from the cache, and no user ever gets a cache miss: everything happens in hidden time. This is something we have put to use on many news sites in France.
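A minimal sketch of that anticipated-generation idea, assuming a Drupal 8 cache bin and a hypothetical 'page_refresh' queue; the 60-second "stale" window and the helper functions are arbitrary illustrations:

```php
<?php

use Drupal\Core\Cache\CacheBackendInterface;

function mymodule_serve($cid) {
  $bin = \Drupal::cache('render');
  if ($cache = $bin->get($cid)) {
    // Still valid but close to expiry: answer from cache immediately and
    // queue a background rebuild so nobody ever takes the miss.
    if ($cache->expire != CacheBackendInterface::CACHE_PERMANENT
        && $cache->expire - REQUEST_TIME < 60) {
      \Drupal::queue('page_refresh', TRUE)->createItem(['cid' => $cid]);
    }
    return $cache->data;
  }
  // The cold-start case we are trying to make rare: rebuild inline.
  $data = mymodule_rebuild($cid);
  $bin->set($cid, $data, REQUEST_TIME + 300);
  return $data;
}
```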
Another use case is deferred submits, for when we have a traffic spike. A spike is not hard to trigger: you post something on Reddit or run a social campaign, which is what a lot of media companies do, and then they expect a lot of people to comment on the article. The problem is that if a lot of people comment at once and you just insert the records into the database, the database can go down, because you get locks on the table and all kinds of performance issues. The idea is that instead of writing the comments straight to the database as they arrive, we put them into a queue, and a worker takes the comment records and writes them to the database in batches: instead of doing multiple INSERTs it does one INSERT with multiple records, which also optimizes the write path. I would like to walk through this particular implementation of ours, because it also brought some security benefits. In this project we had two groups of web servers. The public-facing web servers had read-only access to the database, so every user-triggered operation requiring a database write, like commenting or registering, went through the queues. That way we knew exactly what information was coming in, and whatever amount of it arrived, it would not take our service down; it would just make some interactions slower, but everything would keep working. On the technical level, when users submitted comments we were doing cross-domain JavaScript POST requests to a very lightweight PHP application that stored the data in the queue, a custom queuing mechanism built on Redis. Then another little custom PHP application, a worker running as a PHP daemon, grabbed the data from the queues and wrote it to the database; that worker was the one with database write access. The other group of web servers, also Drupal servers, was for the editors; they were not publicly available, you needed a VPN connection to access them. The great thing about this setup is that when Drupalgeddon happened, that huge security issue involving database access, this project was not affected at all, because the publicly available web heads were read-only. So this is about deferred submits, but also about security.
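A sketch of that batching worker, with Redis as the queue, as in the project described above. The queue name, the payload shape and the comment_inbox table are illustrative assumptions, not the production code:

```php
<?php
// Drains queued comments from Redis and inserts them into MySQL in
// batches, so a comment spike produces a few multi-row INSERTs instead
// of thousands of single-row ones.

$redis = new Redis();
$redis->connect('127.0.0.1', 6379);
$db = new PDO('mysql:host=127.0.0.1;dbname=drupal', 'worker', 'secret');

while (true) {
  // Collect up to 50 queued comments (JSON blobs pushed by the front end).
  $batch = [];
  while (count($batch) < 50 && ($raw = $redis->lPop('comments_pending'))) {
    $batch[] = json_decode($raw, TRUE);
  }
  if (!$batch) {
    sleep(1);
    continue;
  }

  // One multi-row INSERT instead of one INSERT per comment.
  $placeholders = implode(',', array_fill(0, count($batch), '(?, ?, ?)'));
  $stmt = $db->prepare(
    "INSERT INTO comment_inbox (nid, name, body) VALUES $placeholders");
  $params = [];
  foreach ($batch as $comment) {
    array_push($params, $comment['nid'], $comment['name'], $comment['body']);
  }
  $stmt->execute($params);
}
```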
The next case, maybe the most common and best known, is getting data into your Drupal site. There are really two ways of getting at data: pulling it from external sources, or receiving data pushed into your site. If we consider pulling data, you are probably familiar with it at least through the aggregator, and if you've built any intranet or extranet application you have probably seen sites where people put blocks on a page doing HTTP requests synchronously with the page. That is awful, because when the upstream server blocks, your own site blocks, which should never happen; it's a SPOF, a single point of failure. And even when it works, it's still not fast for the user, because it always incurs at least the cost of the upstream call. How do you use a queue for this? Typically you have an external process, which can be triggered by cron or by any type of scheduler, and which periodically sends job requests to the queue, saying "please refresh this feed". Whenever a worker is available, because it has finished processing a previous refresh, it takes the description from the queue item, fetches the associated information source, and pushes the result into Drupal over the Drupal API. This is also interesting because it means you control how many workers run in parallel: you are not tied to the single runner on this diagram, you can run two or three or as many as you want, and because the queue acts as a serializer, only one job goes through the queue at any given time, you are in charge of the bandwidth and you avoid putting your server under too much load. This is even more relevant with push data sources. Those are the ones where some provider, we have this with sports news for instance, calls your site, sending either the data itself or a request to refresh from their site in pull mode after receiving the push. What you typically do in this case is implement a web service; it can be Drupal, but usually it will be something a bit faster, a small Silex or Symfony controller, just a web service able to receive the news and push the raw data into the queue. Again this serializes the requests, so even if you have multiple providers sending you a torrent of data, your workers still control how much work is done at any given time. It's very hard to saturate a small web service done in Silex that just pushes data to a queue: you can do over 10,000 inserts per second on any of the queues we will discuss a bit later. And when the rate goes down, your workers catch up with what has accumulated in the queue, still doing inserts one at a time into your Drupal database, so your site does not slow down even when there is an information peak. I can talk about this. Another application of queues is heavy processing, meaning jobs that go beyond your time limit of 30 seconds, 60 seconds, whatever you think acceptable for your users to wait for a page. Take resizing or processing big images. We had a client who warned us they were going to upload big images; we thought, fine, big images, 10 megabytes, right? Then 150-megabyte files arrived, and that was a huge issue for the server: it just went down, images were never processed, nothing worked. We had another constraint too: the images were so huge, and the client really wanted visitors to be able to see very small details, that we went for the solution where you split the image into tiles and display it as a map. Processing each image to split it into tiles takes some time, so the implementation is: the user uploads the huge image to Drupal, Drupal sends a job to the queue saying "we have received a new image, it needs processing", a worker picks up the job, processes the image, stores everything on the file system, and then lets Drupal know by writing to the database: this image is processed, you can now display your map. And this is the general scheme of job servers: when you have computationally heavy jobs, or anything that takes a while, instead of doing them inline during the page request, queue them up and have multiple workers. That way you parallelize: workers can live on other servers, you can go crazy with this, and the results can be stored.
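A compressed sketch of that tile-processing flow; the queue name, helper functions and the image_map_status table are all illustrative assumptions:

```php
<?php
// On upload: hand the heavy work to the queue instead of the web request.
function mymodule_on_upload($fid) {
  \Drupal::queue('image_tiles', TRUE)->createItem(['fid' => $fid]);
}

// In a worker process: do the slow part, then tell Drupal it is done by
// writing to the database, which is what the map display checks.
function mymodule_make_tiles(array $data) {
  $tiles = mymodule_split_into_tiles($data['fid']); // Minutes, not milliseconds.
  \Drupal::database()->update('image_map_status')
    ->fields(['status' => 'ready', 'tiles' => count($tiles)])
    ->condition('fid', $data['fid'])
    ->execute();
}
```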
Moving on: when you are dealing with job servers and planning these operations, you need to be aware of several things. The first very important one is how to get the results back. In some cases it's fine for every worker to process a job and put the results into the database directly. In other cases it isn't, because you don't want workers touching your database at all; for security reasons you don't want anything non-local to have database access. In one implementation we solved this by creating a second queue for the results: one queue held the jobs, workers processed those jobs and put the results into another queue, and cron runs on the Drupal side picked the results out of that results queue. That was one solution. Another thing: jobs can fail. No matter how good they are, experience says you need to anticipate that jobs will fail, and for many different reasons. You can get PHP exceptions, which are easy to track, and you can get fun stuff with server load: the machine was too loaded and the worker was not able to finish the job within the limit. We will come back to these limits; the usual practice is to tell the worker "here is the job, you have, say, one minute to process it, and if you don't, I assume something went wrong, I kill you, and the job has to be rerun". So when you design your own queues, remember to rerun failed jobs several times before declaring that there is no way to process them. And then, what if a job really cannot be processed? In the visual regression project I was working on, we were taking screenshots of pages; that's fine when the website works, but sometimes it doesn't: the latency is too high, the page won't open, or the site is just broken and returns a 501 or whatever error. So I rerun the job a few times, and then I need to tell the administrator: we cannot process this, deal with it. For that you can do several things. First of all, you can create yet another queue, which is what we did: a queue for failed jobs. Each job carried, next to the job data itself, a counter; whenever the job failed we incremented the counter, and when it reached 10 we moved the job out of the main queue and into the failed jobs queue. That way, as the administrator, I could monitor whether anything new landed in the failed queue and debug it, while those jobs were no longer going round and round spamming my workers. You can also go other ways: logs, emails, whatever you like. In our case we kept the failed jobs around for one reason: to be able to restart them after fixing the problem. If the problem was on our side, say we were taking screenshots with Selenium and Selenium sometimes just dropped off completely, then once Selenium was back we moved the jobs out of the failed queue into the main queue, reset the counters to zero, and reran them. Your scenarios may differ, but you really need to anticipate this kind of thing. Another extremely important thing when you develop with queues is monitoring. Everything is fine while you don't have many jobs and they run without problems, but the moment something goes wrong you need to understand: I have multiple queues of jobs, something is wrong in the system, which queues are failing, which fail most often, and what can I do, should I add more workers? There are two monitoring problems. The first is monitoring the queues themselves; the best is a graph showing how many jobs you had over time, so you can see "oh, there was another spike" or "the queue is empty all the time, why do I even have it". The second is monitoring the workers, which matters especially when you build auto-scaling systems, as we did: we monitored the number of jobs in the queue, and if it exceeded some limit we spawned new workers. For that you need worker monitoring, because you need to scale the number of workers up, and back down when the number of jobs in the queue drops. So remember both kinds of monitoring.
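A minimal sketch of the retry pattern just described: the failure counter travels inside the item itself, and after 10 failures the item is parked in a failed-jobs queue for manual inspection instead of being retried forever. The queue names and the process_job() callback are placeholders:

```php
<?php
$queue = \Drupal::queue('jobs', TRUE);
$failed = \Drupal::queue('jobs_failed', TRUE);

while ($item = $queue->claimItem(60)) {
  try {
    process_job($item->data);
    $queue->deleteItem($item);
  }
  catch (\Exception $e) {
    $data = $item->data;
    $data['failures'] = (isset($data['failures']) ? $data['failures'] : 0) + 1;
    // Remove the claimed item either way; re-add it with the bumped counter
    // or park it for the administrator.
    $queue->deleteItem($item);
    if ($data['failures'] >= 10) {
      $failed->createItem($data);
    }
    else {
      $queue->createItem($data);
    }
  }
}
```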
In our technical implementation we ran the workers as PHP processes, and we played around with several ways of starting them. One of them was phpDaemon, which is actually in production right now. The other tool we started using is Supervisor, a daemon that is very much native to Unix systems. The nice thing is that it's configured with a plain text file where you specify the path of the PHP script you want to run and how many processes you'd like. Your worker code can then be "check the queue; if there is anything, do something", or loop and, for example, exit deliberately after processing 10 jobs in a row, so you free all your memory and avoid memory leaks; Supervisor will simply restart the process. So this is a small but very useful tool; maybe it will help in your cases too. I must second the recommendation for Supervisor: it's also what I use to keep my workers running smoothly, it's very simple to use, and probably the simplest option you have if you are running on Debian or Ubuntu.
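An illustrative supervisord program entry for that kind of setup; the paths and process count are examples, not the project's actual configuration:

```ini
; numprocs starts several identical workers; autorestart brings one back
; after it exits on purpose to release memory.
[program:queue-worker]
command=/usr/bin/php /var/www/worker.php
process_name=%(program_name)s_%(process_num)02d
numprocs=4
autostart=true
autorestart=true
```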
So let's be concrete: what solutions are available? There are already many, as you can see, starting with Drupal 6 and including Drupal 8, we have core solutions. Drupal has come bundled with the Queue API and two queue services since Drupal 7, and a backport module exists for Drupal 6. This supplies a reliable database queue and a non-reliable memory queue. One step above, there is a contrib module, developed for an e-commerce site some of you may know, which still uses the Drupal core database but adds several more columns and provides improved monitoring facilities: the Advanced Queue module. It's available only for 7, but porting it to 8 should not be hard. If you are looking for something more advanced and don't want to operate the servers yourself, Amazon has a cloud option, SQS, the Simple Queue Service, which is supported for Drupal 7 by the Amazon SQS module. My personal favourite is the Beanstalkd queue; the module was written by Gordon Heydon and I am also a maintainer. It was written for Drupal 7 and there are already two branches for Drupal 8; neither is very stable at this point, but you can already work with it on some versions of Drupal 8, and it provides you with a true job server instead of relying on built-ins like the database, for reasons we'll see later. And finally, a French job server, EVQ, which is starting natively with Drupal 8: a product written in C on libevent, with the driver available in a few weeks, probably around the Drupal 8.0 release. Some of my own experience: when we were working with phpDaemon we had crazy arguments with our sysadmins, and one of the arguments against it was that it depended on libevent or some other system library that had never seen a stable release despite being used for five or ten years. Another solution we looked at is Gearman. Gearman is also well established, it has been around for a while; it lets you build a job server where you create jobs and have some code execute them, and it has many language bindings, so you can write your worker code in PHP, C, Go, Python, anything. Another thing we tried is PHP-Resque, a PHP port of GitHub's Resque queuing system. What's nice about it is that it has a REST interface, so you can easily write clients: when you create a job you specify the name of the queue and you get back the job identifier, then you make another REST call to check the job's status, and you can see it's waiting, in progress, successfully completed, or failed; there are also calls to fetch results and delete items. Very nice. The reason we moved away from it was monitoring, and honestly my own lack of knowledge: I'm not very familiar with Ruby, the monitoring solutions were implemented in Ruby, and the PHP ones unfortunately didn't work for me. So we switched to RabbitMQ. What's nice there is that it's a standalone daemon without many dependencies, and it implements the AMQP protocol, which is a bit different and comes with a huge feature set; I think we used maybe 10 or 20 percent of it. You get monitoring out of the box: there's a special URL where you log in with credentials and see all your processes, how many workers, everything you usually need. It's a very performant thing, and there are a lot of plugins, like the Shovel plugin, which is used for scaling your queues: you can have one queue server in one data centre, another in a second one, and sync the jobs between them, building a very distributed network of workers. We haven't gone that far, but RabbitMQ can do a lot of stuff. I think that's it. Yuri also mentioned MongoDB here: that's yet another database option, supported by the mongodb package for Drupal 7. On Drupal 6 a branch existed at some point which included the queue, but it has been withdrawn; you can still work on it if you feel like doing Drupal 6 work, I don't. The work on the Drupal 8 version went quite far before chx, who basically maintains the Drupal 8 version, stopped working on it some months ago, but work has resumed in recent weeks to bring MongoDB back to life on Drupal 8, and it includes this queue component, whose advantage over the core queue is that it scales much better to very high workloads without impacting the core database itself. And finally there is of course Redis, a very common option for handling queues; there's a queue for Redis, not in the redis package on drupal.org but in a separate redis_queue package, for 6 and 7 only. Now that we've seen why we want to do queuing and which products exist, what does Drupal provide us with? Basically, as I said, Drupal provides the Queue API, which defines a number of concepts. The first is the queue itself. What is a queue? For Drupal at least, it is just a sort of low-featured tube, a FIFO, first in, first out, without any further mechanisms: a tube with a name, in which you put data at one end and get it out at the other. Another term you will encounter is the worker: the worker is the callback your runner triggers to actually perform the work, based on the data received from the queue. That data is called an item. So someone, at some point, places items in the queue; some runner extracts them from the queue and passes them to the worker to work on; and when the work is done, the runner receives the result and passes the information back to Drupal so that the queuing can proceed.
Another term, as you mentioned, is the batch subsystem, which already uses the queue system but adds another layer of API on top of the Queue API itself. It is also an older API, and maybe more Drupal-specific: it lets you define a starting point for a job, an ending point, and in the middle several sub-jobs, which can themselves be subdivided by time quota and pass information in a sandbox from one iteration to the next. This is what Drupal has been using since Drupal 5, I guess, at least for updates, translation imports and a few other queues, but not, for instance, for the aggregator. In Drupal 8 you also have two related concepts, the queue worker manager and the queue worker plugins. The API has not changed much between 6, 7 and 8. In Drupal 6 and 7 it's available either in core or in contrib, and you declare queues the same way, with an info hook: hook_cron_queue_info. Like many info hooks in Drupal, it has an alter hook to let third-party modules modify existing definitions. The API can be used without cron, but by default queues are run on every cron run. When you declare a queue you can declare that it must be skipped on cron; if it's not skipped and cron is handling the queue, you can declare a maximum time cron may spend working on any number of items from any number of queues. The main parameter in this info is, for each queue, the worker callback, which is the callback that will be called to process the items on that queue. The jobs are run by the cron subsystem, which lives down in the includes of Drupal core, and, as you already mentioned, workers can cause errors and throw exceptions. Drupal core just does what is called Pokémon handling: gotta catch 'em all. It tries the worker and catches everything, catch (Exception $e), and then does nothing with your exceptions; it's not smart about it.
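Here is what such a declaration looks like in Drupal 7; the module name and callback are placeholders, but the keys are the documented ones:

```php
<?php
/**
 * Implements hook_cron_queue_info().
 */
function mymodule_cron_queue_info() {
  return array(
    'mymodule_jobs' => array(
      // Called once per item, with the item data as its argument.
      'worker callback' => 'mymodule_worker',
      // Maximum seconds cron may spend on this queue per run.
      'time' => 30,
      // Set to TRUE to keep cron away and run the queue yourself.
      'skip on cron' => FALSE,
    ),
  );
}

function mymodule_worker($data) {
  // $data is whatever blob was passed to createItem().
}
```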
In Drupal 8 things are a bit better. Many things, as you can see, have not changed: cron still provides support for running the jobs in your queues, except it's now based on the Drupal\Core\Cron class instead of living down in the includes, and there is specific handling for one type of exception, SuspendQueueException. When that is thrown, the runner knows your worker has itself detected an erroneous situation and the job must be rerun: it catches the exception, releases the job the worker had claimed, and leaves it for a later run. Since we no longer use info hooks in Drupal 8, hook_cron_queue_info has been replaced by a plugin manager which instantiates queue worker plugins. Their settings, as you can see, are very similar: enable cron or not, the change being that it now defaults to off instead of on as in Drupal 6 and 7; and time, again the maximum lifetime allocated to run any number of jobs for any number of queues when using the cron runner. Core currently uses this in two places: to refresh the feeds, which is one of the cases we discussed earlier, and to update the locale translations. Finally, a small remaining part of the old info hook is the alter hook, which is used to alter the plugin definitions. So, if you look at the first part of the API, what's in the Queue API itself? The first point is putting data into the queue. For this you have the createItem() method on a queue object: once you have a queue object, you call createItem() and pass, as you can see, mixed data. There is no type hinting, you can put just about anything; it's a blob to the Queue API, which does nothing with it. And it returns nothing: you don't even know whether the submission succeeded. Then, at the other end, the runner will claim items from the queue, that is, mark them as being worked on by the worker, and claim some time to work on them, by default one hour, as you can see. It passes the data it receives from the queue to the worker, again as a blob, except it's a stdClass object with some known properties: the item id, the data itself, and the time it was created. The lease time parameter, which the runner must pass to the queue, is the maximum duration after which the queue is allowed to kill the job for the runner; currently, however, core does nothing with it. When your worker has completed its work, it can delete the item from the queue, marking it as completely done, so the work can proceed. If it failed to perform the job, maybe there was a resource conflict and it could not do the work, it can release the item, so the item is put back into the queue; that's what the release operation provides. There is one very minimal operation for monitoring, which estimates the number of items in the queue. This is specifically documented as a best guess and unreliable: providers are allowed to return zero at any time, and even when they return real information you can still get race conditions when multiple workers are involved. So it's at best an indicator for statistics, not something to rely on in processing. Finally, you can create queues and delete queues. Creating a queue is a very lightweight operation; with most providers it doesn't actually do anything, because queues are really instantiated when you create items in them. Deleting a queue, on the other hand, is not symmetrical: it deletes the queue registration within Drupal, but it also deletes all the items the queue contained. So one is idempotent, while the other loses data. All of these are methods of the QueueInterface in Drupal 8, and we also have an extended interface, ReliableQueueInterface, which has no methods and no constants but specifies that if you request a queue of the reliable type, you get a queue which is expected to provide ordering of the items, FIFO, which is not guaranteed by the default queue, and single execution, meaning exactly once. Finally, in 8 you have some other helpers available. You have the queue service, the only part related to the dependency injection container, which gets you a queue instance without your having to build it yourself: you get the factory, pass a name, and specify whether you want the queue to be reliable. There is a complex interaction between the various settings on your site and the installed code which determines whether you actually get a reliable or non-reliable queue, whatever you pass in this parameter; take some time to read the code to see how it works, because it's quite surprising if you don't understand the settings. The queue worker manager, as I said, instantiates the queue worker plugins; it uses the definitions from the plugin system and applies the alter hook, as in previous versions. And there is also an interface for the queue workers, which specifies that workers must implement just one method, processItem(), which receives the data, again as a blob, and which can throw that specific SuspendQueueException.
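A sketch of the Drupal 8 equivalent of the earlier info hook: a QueueWorker plugin discovered through its annotation. The module name, plugin id and the upstream_is_available() helper are placeholders:

```php
<?php

namespace Drupal\mymodule\Plugin\QueueWorker;

use Drupal\Core\Queue\QueueWorkerBase;
use Drupal\Core\Queue\SuspendQueueException;

/**
 * @QueueWorker(
 *   id = "mymodule_feeds",
 *   title = @Translation("Feed refresher"),
 *   cron = {"time" = 30}
 * )
 */
class FeedRefresher extends QueueWorkerBase {

  /**
   * {@inheritdoc}
   */
  public function processItem($data) {
    if (!upstream_is_available()) {
      // Tell the cron runner to release this item and skip the rest of
      // the queue until a later run.
      throw new SuspendQueueException('Upstream unavailable, retry later.');
    }
    // ... do the actual work with $data ...
  }

}
```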
The last bit in this panorama is the runners. Core provides the cron runner, which we already mentioned, but you also have the very popular Elysia Cron module, in 6, 7 and 8, at least 7 and 8, I'm not sure, which tries to replicate the features of cron while adding its own internal scheduler. There is also a brand new module for Drupal 8, Queue Runner, which adds the ability to chain jobs, much like the batch system. And you have built-in commands in Drush, without any additional plugins, to list the queue plugins in 8 or to execute a cron run. All these runners share similar limitations; they work much the same way and call the same code. They have no support for preempting operations that take too long, so the lease time parameter is not used. They are all single-threaded, PHP being single-threaded of course, and you have a single process working across all queues: an outer loop over the queues and, within it, an inner loop over the items of the queue being processed. That means any queue can starve the other queues just by holding too many items; this is not a fair-share scheduler. In many circumstances you will want to write your own runner, which lets you specify which queues to handle in any given number of processes, and maybe kill items that take too long, things like that. There's room in contrib for this: I didn't find any such module, and I don't think you found one either, Yuri. Basically, most projects doing serious work with queues will want their own runner. This Queue API is very good in that it provides a common way of talking to all the queue and job-server providers we saw, but it has limitations. As I said, it's a limited FIFO paradigm. First, if you don't ask for a reliable queue, you get a queue which does not guarantee FIFO ordering, does not guarantee single execution, and actually doesn't guarantee execution at all: jobs can simply be lost, arrive in any order, or be executed any number of times. It's essentially a datagram service, so always ask for a reliable queue, which most providers actually supply. Also, there is no extra queuing discipline. You don't have priority management; you can't say "I put this in the queue, but it's higher priority, so it should go before the others". You can't tag items so a runner can fetch specific items from a queue. You can't replace an item in the queue; you have to delete it and push it again. You don't have the option of submitting an item for delayed execution, which Beanstalkd provides for instance, or of burying an item, meaning that after a problem I can leave it in the queue, but it won't be eligible to be claimed by any worker until I raise it again. None of this is supplied. And, as you mentioned, there is no built-in monitoring, no peek operation, and no way to do a last-in, first-out system. But for all of this, if you think about it, there are hacks which are possible using the built-in methods: you can delete items and recreate them, and find various ways to implement most of these things. Also, since queues are very lightweight objects, in many situations you can create as many queues as you want, possibly in the thousands; it doesn't really cost much, because most queues don't actually have an implementation of their own; they are just items in another, larger queue.
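Since most serious projects end up writing their own runner, here is a minimal sketch of one, using only the claim/work/delete/release cycle described above. The queue name and the worker callback are placeholders, and a real runner would add signal handling, logging and a stop condition:

```php
<?php
$queue = \Drupal::queue('mymodule_jobs', TRUE);

while (TRUE) {
  if (!$item = $queue->claimItem(60)) {
    sleep(1);                      // Nothing to do; avoid hammering the backend.
    continue;
  }
  try {
    mymodule_worker($item->data);  // The actual work.
    $queue->deleteItem($item);     // Done for good.
  }
  catch (\Exception $e) {
    $queue->releaseItem($item);    // Put it back for a later attempt.
  }
}
```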
And finally, whenever you implement such a system, like we are doing for EVQ, you can provide a richer interface on top of the queue interfaces. So, to finish, how can we go further on performance? Yuri, you have suggestions, I guess. Yes, some things about runners. First, you shouldn't do active polling, having one process checking all the time whether there is anything in the job queue; there are solutions where you park the process on a socket, and when a job comes in, the process is triggered and gets the job. The problem with polling is that you end up with lots of SELECTs hitting, for example, the database, if your queue lives in the database. Another thing that is very important for performance when you build a queuing system is to run your jobs in parallel: create multiple job runners, and, just as important, dedicate separate runners to separate queues. Otherwise you get the Drupal story of multiple queues run one after another, where one problem with one queue means the other queues simply never run. So when you create runners, create multiple processes, say three or four runners per queue; that way you have some guarantee that all queues are processed in parallel, and if there is a problem you will identify it pretty fast. The item about read-after-write concerns caching. When you invalidate caches, and this is what happens in Drupal when we update content, we invalidate multiple caches in the system, one nice trick is to trigger the rebuild of the new caches right away: after invalidating, hit the read. It can be very basic: if we updated a node, just fire a GET request at that node's URL. That refreshes the caches, and there is a far lower chance of some unlucky user waiting a minute on the website for the caches to warm up.
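The read-after-write idea in its most basic form, assuming Drupal 8 and the HTTP client bundled with core, and run from a worker rather than inline in the save request:

```php
<?php

use Drupal\Core\Url;

// After invalidating a node's caches, request its page once so the next
// real visitor finds a warm cache. The response itself is discarded; the
// side effect is the rebuild.
function mymodule_warm_node($nid) {
  $url = Url::fromRoute('entity.node.canonical', ['node' => $nid],
    ['absolute' => TRUE])->toString();
  \Drupal::httpClient()->get($url);
}
```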
And that's all we had to say about the topic. I welcome your questions; we have, I guess, about nine minutes, and you are invited of course to come to the sprint on Friday, and to tell us what you thought about this presentation on the presentation node in the schedule. If we have any questions, there's a microphone over there, or you can just ask and we will repeat the questions for the recording. Yes, please. Okay, so what he just explained, if I understood correctly, is that he wrote a binding of the Queue API to Services in order to expose a standard way to access queues remotely. That's very nice, and I didn't see it when we looked for what existed; is it published on Drupal.org? Okay. So the question is how the workers communicate with Drupal. I think we have two strategies; Yuri, can you re-explain your case with the comment system, where you had to build a quite specific way of writing to the database by bundling inserts together, on a database which was otherwise read-only? It's quite specific; can you say more about it? Yes, this is the problem of where to store results. One solution is that workers have access to the database, and when they get the job done they push the results to the database directly; that was the case for the comments, where the job was precisely to put data into the database. The other application was the screenshots project. There we didn't want the workers to communicate with the database, for security reasons: they were on completely different servers, we wanted to scale those servers up immediately, and we didn't know their IP addresses in advance, so we didn't want the whole world to be able to access our MySQL. In that case we stored all the results in a queue, a different queue in the same queuing system, and a little process on the Drupal side, next to MySQL, picked up those results and wrote them to MySQL. Those are the strategies we used. Yes, we were using external queue services, not the Drupal queues: in one case Redis, in the other RabbitMQ. Okay. I should say this sounds quite unusual to me; in most cases I think you want to write the worker as a Drush command, because that is a quite simple way to ensure you have a properly booted Drupal and access to the whole Drupal API, and since you are running a Drush command you are running on the CLI SAPI, meaning essentially unlimited time and unlimited memory if you configure PHP properly. I think that's the majority case. Yes. Okay, so this is about the module called Concurrent Queue, which apparently allows work to be submitted to multiple queues concurrently so it can be processed in parallel, alleviating the problem we have with the big loops in the standard processing. Thanks. Yes, could you speak into the microphone? Otherwise we have to repeat the questions, which is not very convenient for the people who are not in this room and who are listening to the video, or will hopefully be listening in days and months and years to come. "How do you deal with failures in general? Do you have a standard best practice, or does it depend on the nature of the queue? Do you use some contrib module, or something standard?" I can speak to that, especially for retries. The implementation we had was very custom; it didn't rely on any Drupal contrib modules. We just reran jobs, incremented the counters, and then we had the failed-jobs queue, and my day every morning started with: do we have any failed jobs from overnight? Yes we have; here's another couple of hours of debugging. That was the best practice for me. You hit these situations a lot in the beginning, but then you start understanding: okay, I have eliminated all the failure cases coming from this subsystem, then all the possible cases from that one, and by the end you probably stop seeing them; but in the beginning it's 100% certain you will hit something. No Drupal modules, unfortunately. One last question, maybe; we only have two minutes. "Just a module that I started using recently; it's the opposite of running in parallel. I was using drush queue-run with a Supervisor setup, and I found a module called mop execute that does the opposite: it runs for several seconds and iterates among all the system queues, running all the queues in a fair share, so you are always looping through all the queues."
"You just have to deploy a single process running in a loop, and you can deploy that very same process on as many machines as you want." Okay, this is interesting; I didn't quite catch the name of the module. "It's mop-exit-queue. I started using it recently and it works very well for me." Thank you very much for this information. Okay, so I think we'll close here. Thank you for attending, and for your questions, which were quite insightful. Thank you, bye. Thanks.