Okay, so thank you for joining us on this showcase session with such a long title that it doesn't fit on a single line: a real-life study of large-scale personalization we built with Drupal for one of the biggest extranets in Europe.

First of all, let me introduce who we are. Actancy is a French company with two offices, one in Strasbourg and the other in Paris. We are 65 web experts, and we mainly work on projects from 50 to 2,000 man-days. We do 100% Drupal and Symfony development, mainly Drupal; we don't use any other technologies. My name is Nicholas, I'm a technical project manager and lead developer at Actancy, and I've been working with Drupal since 2006, mainly on our large-scale projects.

So, what were the business requirements of this project? The client was Alcatel-Lucent, and they had an existing B2B extranet, built on another technology, with about 80K users and about 50K contents to manage. Contents were news, documents, and so on. And all those entities, users and contents, were distributed across different services that were managed with other technologies too. This B2B extranet had quite a huge volumetry of visits and of connection peaks for a B2B extranet, so we had quite important issues with performance, as you will see later.

What were our main objectives? First of all, migrate everything to the Drupal platform and host everything on the Acquia network. Those were the two principal objectives of this project. Then there were secondary objectives: the migration of a business rule system and of a quite complex permission system, as you will see; connecting Drupal to all those different services distributed across the Alcatel-Lucent network; and actually improving performance, because the old platform was not performant at all, with a page load of more than 10 seconds on some pages.

Our main challenges on this project were security, with a quite complex permission system, and the business rules that help us generate every user profile, as you will see. The migration in phases was quite a big deal to deal with, as you will see later. And performance was one of the greatest challenges.

So, if you take the Drupal permission system, basically it's a one-way CRUD definition. Every permission is stored on the user side. In Drupal, every user implicitly has a role; it can be at least anonymous or authenticated, anything. But all permissions are always stored on the user side. So, once the CRUD is defined, the contents are either creatable, readable, updatable and deletable, or not.

On our specific project, the CRUD system was a bit more complex. We had profiling data on the user side and on the content side, and we had to define a specific process in the middle that would calculate the final CRUD permissions. If you know Drupal well, you'll know that this is really hard to do. You'll see later how we managed to do this with mainly Drupal tools and some custom modules too.

The volumetry was quite a problem too. As I said, about 80K users and more than 50K documents: if you take a one-to-one ratio, you end up with something like four billion permissions. This is really huge. At the time we ran a quick test of this and, I don't have the exact number in mind, but I think the node_access table just exploded to something like two gigabytes of data, and it really decreased performance a lot.
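To give an idea of the mechanism involved, here is a minimal sketch of Drupal 7's standard node access hooks. This is not the project's code (the realm name and helper functions are hypothetical); it only illustrates why every grant ends up as a row in the node_access table, and why a fine-grained one-to-one model blows up to billions of rows.

```php
<?php
/**
 * Implements hook_node_access_records().
 *
 * Drupal stores one row per (node, realm, gid) in the node_access table;
 * with 50K nodes and fine-grained grants this table explodes.
 */
function mymodule_node_access_records($node) {
  $grants = array();
  // Hypothetical helper: one grant row per market allowed to read this document.
  foreach (mymodule_get_allowed_markets($node) as $market_id) {
    $grants[] = array(
      'realm' => 'mymodule_market',
      'gid' => $market_id,
      'grant_view' => 1,
      'grant_update' => 0,
      'grant_delete' => 0,
      'priority' => 0,
    );
  }
  return $grants;
}

/**
 * Implements hook_node_grants().
 *
 * Tells Drupal which gids the current user holds in each realm.
 */
function mymodule_node_grants($account, $op) {
  if ($op == 'view') {
    // Hypothetical helper reading the market IDs from the user profile.
    return array('mymodule_market' => mymodule_get_user_markets($account));
  }
  return array();
}
```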
Another big deal on this project was the fact that every single piece of content on a single page had its own permissions. Meaning a single user doesn't see the same thing as another one, because every block of content has read or update permissions of its own; basically, two people don't see the same page. Well, if some of you went to the performance or caching session this morning, Drupal 8 does this very well, but with Drupal 7 it's kind of tricky to achieve really efficiently.

The business rule system that the client wanted us to create was kind of tricky too. Basically every user had specific data on their profile: markets, services, things like that. And the client wanted the possibility to add and create a system of rules that would be calculated one after another to determine specific profiling data for the user, meaning all those rules generate the permissions that decide whether the user is allowed to see a document or not.

The migration in phases also gave us quite a few problems. As you can see, we had the legacy platform connected to different services: one for the content, another for the users, another to generate the companies for the users. Everything was connected to everything else, and we had to connect Drupal to all of these services without losing any information at any step of the migration. So you'll see that was kind of tricky too.

Now, what were the impacts of this migration in phases? First of all, knowing that some contents still exist on the legacy platform as references, for example in a block of content on the side or anywhere, you have to be sure that your new system implements the permission system before anything else, because you don't want any user to see a content they're not supposed to see. You also have to keep every link to the un-migrated contents, to be sure that any reference points to something that does exist; you can't lose any content on one platform or the other. We also had to handle an SSO, to be sure that any session exists both on the old platform and the new one. We had to keep contents and user profiles synchronized between the two platforms through all those distributed services. And as some of you may know, when you do a migration, generally you bring every problem from the old system over to the new platform, and you add the new platform's own problems on top. So one of the big deals was also to limit this, to learn from past errors and propose new solutions that fit the client's needs better.

On the performance side, we had many asynchronous calculations for the whole permission system. You'll see that this hit the Queue API very heavily and it brought us some quite big performance issues. We had locks on some tables, because, you know, MySQL doesn't like it when you query the same table multiple times at the same time; sometimes it gives you big locks. The "one page equals multiple web parts with different permissions" situation was also quite a big deal for performance: as I said, a node_access table with two gigabytes of data just decreases performance a lot, and even if you can optimize this, some big problems remain. We had a huge volumetry of data. And the caching had to be chosen wisely, because since we had different blocks with different permissions, we had to cache everything per user and per content.
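The project itself relied on Panels Hash Cache for this (mentioned later in the talk), but as a minimal illustration of the per-user caching granularity Drupal 7 offers out of the box, a block can simply declare itself cached per user. The block name here is hypothetical.

```php
<?php
/**
 * Implements hook_block_info().
 *
 * A minimal sketch: declare a block whose rendered output is cached once
 * per user, since with per-block permissions two users almost never see
 * the same thing.
 */
function mymodule_block_info() {
  $blocks['personalized_news'] = array(
    'info' => t('Personalized news'),
    // One cached copy of the rendered block per user.
    'cache' => DRUPAL_CACHE_PER_USER,
  );
  return $blocks;
}
```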
We also had the classical media performance issues that any project has. So what did Drupal give us for this? First of all, the permission system. As you remember, 80K users and 50K contents means four billion permissions, and four billion permissions is several gigabytes of data in a single table, which kills performance. So what exactly can we do to make this work better for Drupal?

Does 50K contents really mean 50K profiling patterns? If one of your documents, for example, only has a read permission for a specific market, is this permission unique in your whole database? Or are there other documents that share this permission? Is it possible to gather every profiled content on the website into groups of permissions? Well, we tried this, and we found that only something like 3,500 patterns actually existed. And basically if you regroup contents by those permission patterns, it's significantly fewer permissions to store; it's still a big deal, but Drupal can handle it far better and the performance is better.

So is there a way for Drupal to do this? Basically, that's what Organic Groups does. If you know the Organic Groups module, it just allows you to group documents by permissions and to link user accounts to those groups. So basically, a user is able to see some contents because he is a member of those content groups, and if another user doesn't have the same groups, he will see other contents. So we tried to work with Organic Groups, because with Organic Groups your node_access table is really preserved: you have far fewer entries in your table and your permissions are still efficient.

So we used Organic Groups and we developed three custom modules to work with this. The first one is the Content Group Engine. This module helps gather contents by permission patterns. Any time a document is saved, a metadata hash is created, and there should be only one organic group per metadata hash. So any time a new profiling pattern appears, a new organic group has to be created, so that all the documents that fit this particular permission pattern can be grouped under it. Sometimes, when you have simultaneous additions, you may end up with multiple organic groups that handle the same permissions. Well, it may happen; it's rare, but it may happen, and basically it's not a problem, because your logic still works: as long as you have groups, documents linked to them, and members in those groups, even if several groups represent the same permission pattern, everything is fine. You just need some cleaning processes that occasionally merge the organic groups representing the same permission pattern.

So basically, this Content Group Engine module, what does it do? When any document is created or updated, if the permission pattern already exists, it only links your document to that organic group and everything is done. If it doesn't, it creates a new organic group with this new specific permission pattern and profiles it against every user on the website.
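Here is a rough sketch of that find-or-create logic, under stated assumptions: the field name (field_profiling_hash), the group bundle, the helpers (mymodule_get_allowed_markets(), mymodule_link_node_to_group(), mymodule_create_content_group()) and the queue name are all hypothetical, and the actual Organic Groups linkage would go through the OG API.

```php
<?php
/**
 * Implements hook_node_update() (the same idea applies to hook_node_insert()).
 */
function mymodule_node_update($node) {
  if ($node->type != 'document') {
    return;
  }
  // Build a deterministic hash of the profiling metadata of the document.
  $pattern = array(
    'markets' => mymodule_get_allowed_markets($node),
    'services' => mymodule_get_allowed_services($node),
  );
  $hash = md5(serialize($pattern));

  // Look for an existing organic group carrying this permission pattern.
  $query = new EntityFieldQuery();
  $result = $query->entityCondition('entity_type', 'node')
    ->entityCondition('bundle', 'content_group')
    ->fieldCondition('field_profiling_hash', 'value', $hash)
    ->range(0, 1)
    ->execute();

  if (!empty($result['node'])) {
    // Pattern already known: just attach the document to that group.
    $group_nid = key($result['node']);
    mymodule_link_node_to_group($node, $group_nid);
  }
  else {
    // New pattern: create the group, attach the document, and queue the
    // group so the mapping engine can profile it against all users.
    $group_nid = mymodule_create_content_group($hash, $pattern);
    mymodule_link_node_to_group($node, $group_nid);
    DrupalQueue::get('mapping_engine_groups')->createItem(array('group_nid' => $group_nid));
  }
}
```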
Now, let's look at the user side and this famous business rule engine. Users have several pieces of data on their profile, and basically the client wanted to be able to write those rules quite easily. So we used a simplified PHP syntax, quite specific, with some tokens. The line you see here is one of these rules. If you read the first part, you try to reach the market of the user: this little prefix means that you target the user, and you check whether the market equals one. Then here you get the user's company market and you make another test. And if the rule validates, some specific categories are assigned to your user to build its profiling data.

Whenever a user is updated, it may happen that the profiling is no longer up to date. So we generate a hash for each user every time it is created and, mainly, updated. If the hash stored on the user is not the same as the one calculated on the fly, then we know the user's profiling data has changed and we need to recalculate the user profile. So if the hash has changed, we basically recalculate every rule to rebuild the profiling.

Now, finally, now that we have our content groups and our user profiles, we need to match them together to get our final CRUD permissions, and this is what the last module does. The mapping engine is just there to grab everything and mix it together. It's a custom module that is also highly configurable, as you can see here: you have a bunch of settings that allow you to define validation steps. Here, for example, the first line says that you have to compare the category field on the content side with the category field on the user side, and this specific operator says the user has to have all of the content's categories to be validated. You also have a "one of" operator, which is more permissive, to build different kinds of validation steps. And you also have a specific boolean field that lets you create exceptions on your contents: for example, if you want a document to be available and to bypass the mapping engine on a specific market, you just check a little checkbox on the document, which corresponds to this specific field, and the mapping engine checks that value on the fly; if your checkbox is checked, it just bypasses the mapping engine process. We also added some exceptions, mainly for the webmasters, because they are the people who generally need exceptions on the content permissions, particularly for the view, update and delete operations. For them it's quite simple: they just bypass the whole process, without having to configure anything but the roles that you want to process that way.

So the mapping engine is an asynchronous, two-way process, because it has a very huge amount of calculation to do; we had to run all the calculations through the Queue API. There are two ways. The first way is when a user has to be calculated against all content groups: you have a new user, or a user update that changes profiling data, and you have to be sure that this user becomes a member of the content groups whose permission patterns match. The other way is when you create a new content group: you created a new content, this content has a new permission pattern, and you need to add every user that has the right permissions to see it.
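A minimal sketch of that validation step is shown below, assuming hypothetical configuration, field names and a hypothetical bypass field (field_bypass_mapping); in the real module the mapping rows were exposed through an admin UI rather than hard-coded.

```php
<?php
/**
 * Collects the term IDs of a term reference field as a flat array.
 */
function mymodule_field_tids($entity_type, $entity, $field_name) {
  $tids = array();
  if ($items = field_get_items($entity_type, $entity, $field_name)) {
    foreach ($items as $item) {
      $tids[] = $item['tid'];
    }
  }
  return $tids;
}

/**
 * Decides whether a user should become a member of a content group.
 */
function mymodule_user_matches_group($account, $group_node) {
  // Bypass checkbox on the content: skip the whole mapping process.
  $bypass = field_get_items('node', $group_node, 'field_bypass_mapping');
  if (!empty($bypass[0]['value'])) {
    return TRUE;
  }

  // Hypothetical mapping configuration: compare a field on the content group
  // with a field on the user profile, with an "all of" or "one of" operator.
  $mappings = array(
    array('group_field' => 'field_categories', 'user_field' => 'field_user_categories', 'operator' => 'all'),
    array('group_field' => 'field_markets', 'user_field' => 'field_user_markets', 'operator' => 'one'),
  );

  foreach ($mappings as $mapping) {
    $required = mymodule_field_tids('node', $group_node, $mapping['group_field']);
    $owned = mymodule_field_tids('user', $account, $mapping['user_field']);
    $matches = count(array_intersect($required, $owned));
    if ($mapping['operator'] == 'all' && $matches < count($required)) {
      return FALSE;
    }
    if ($mapping['operator'] == 'one' && $required && $matches == 0) {
      return FALSE;
    }
  }
  return TRUE;
}
```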
So basically this process runs on the standard queue for the regular user updates and content group creations, but we had to handle the same process with some kind of emergency queues for specific cases. There is the case of a single user that you want to profile right now, on the fly: you edited its profile and you want to see the result immediately, so we have a specific reprofiling queue for this. We also have a specific reprofiling option for all users active since a given date: for example, if you have some corrupted data in your database and you want to be sure that all users active in, say, the last week have the right permissions to see the right documents, you have a dedicated queue to recalculate this. And finally, we also had a queue for new users, because you always want new users to be profiled sooner than the regular user updates that are processed asynchronously.

So here is the global schema of our process. On one side you have the contents that may be created or updated. The content group engine then finds the content group that matches this specific content's permission pattern. If the permission pattern already exists as an organic group, we just link it and there is no more work to do, because this organic group is already linked to users through memberships. If the content group doesn't exist, we send it to the mapping engine, which creates a queue item for it and will try to map it to all the users. On the other side, when a user is created or updated, you check the hash. If the profiling pattern has changed, you send it to the business rule engine, which calculates every business rule and generates the user profile, and it is finally sent to the mapping engine, which links the user to the different content groups.

Okay, so that's the global schema. Now, what about the migration in phases? The deal was to migrate the whole website, not in a single script, but really feature by feature. So we had to handle the connections to the different services, as I said earlier, for every type of entity. We had to be sure that FIFO was guaranteed, so that the last update of a content is the one displayed to the user: you don't want an old update of an item to be pushed to the production environment; you always want the latest version in production. You also have to take care of message loss: if one of the services goes down, you don't want the message that was sent to be lost. And you have to take care of entity references: if some entities reference a node that hasn't been migrated yet, you want to create stubs, because you don't want them to be linked to nothing; you create the stub and complete the entity later, when the right values reach your service. And of course, we had to ensure that the security of the Alcatel-Lucent infrastructure was preserved.

So to do this, we created a specific Drupal distribution called the Drupal Queue Messaging System. It is inspired by JMS or RabbitMQ, for those who know those systems. Basically, this instance acts as a broker and centralizes every message in a single Drupal instance. FIFO is guaranteed by several re-sequencing processes, and we also have different retry and failure processes to avoid any item loss. The global system is based on topics and subscribers; for those of you who know JMS or RabbitMQ, they use the same kind of workflow, so maybe it's familiar to you. And finally, we used IP restrictions, access tokens and SSL to guarantee security, because you don't want anyone to see what is exchanged between your services, and you certainly don't want anyone to modify that information while it is in transit.
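As a very rough sketch of the broker-side dispatch idea, under entirely hypothetical names (queue names, variables, endpoints): one Drupal queue per topic, subscribers registered per topic, and a retry when a subscriber is unreachable. The real distribution also re-sequences messages to guarantee FIFO, which is not shown here.

```php
<?php
/**
 * Dispatches the pending messages of one topic to its subscribers.
 */
function dqms_dispatch_topic($topic) {
  $queue = DrupalQueue::get('dqms_topic_' . $topic);
  $subscribers = variable_get('dqms_subscribers_' . $topic, array());

  while ($item = $queue->claimItem(60)) {
    $delivered = TRUE;
    foreach ($subscribers as $url) {
      $response = drupal_http_request($url, array(
        'method' => 'POST',
        'data' => drupal_json_encode($item->data),
        'headers' => array(
          'Content-Type' => 'application/json',
          // Shared access token, on top of IP restrictions and SSL.
          'X-DQMS-Token' => variable_get('dqms_access_token', ''),
        ),
        'timeout' => 10,
      ));
      if (!isset($response->code) || $response->code != 200) {
        $delivered = FALSE;
      }
    }
    if ($delivered) {
      $queue->deleteItem($item);
    }
    else {
      // Leave the item in the queue so it is claimed again later (retry).
      $queue->releaseItem($item);
      break;
    }
  }
}
```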
So here is the global schema of the DQMS, the Drupal Queue Messaging System. You have different services linked to a broker, in our case a specific Drupal distribution, with different topics and the services subscribed to them, awaiting messages; the broker just pushes the messages into the right topic queue. And if one of the services is not reachable for any reason, it just retries until the service is available again or, if you have defined a retry limit, it stops at that point and stores the message for a later retry.

On this migration-in-phases step, we also had the problem of session synchronization. In fact, the main problem was that the specific profiling of this platform had to be preserved on both sides. On our side, we used the whole process with the mapping engine, but on the legacy platform we also had to keep the profiling up to date for every user. So we used the DQMS to send the profiling updates to the legacy platform, and we also used session cookie information for smaller updates on the user profile: the legacy platform just grabbed the cookie and updated the user account on the fly.

Now we arrive at the performance problems we encountered. The first problem we had with MySQL was deadlocks, because our mapping engine queried certain tables of the database really often, and MySQL in particular doesn't like that. We could have optimized our code a bit more to avoid this, but we needed really good performance on this mapping engine process, because the volumetry of permissions was really heavy. So we only had to tweak the MySQL configuration a bit, and this allowed us to reduce the locks on most tables. If you guys need more details about what was tweaked exactly, maybe we can talk about this later; I won't go too much in depth about the solutions we had for the performance issues.

We also had quite big issues with page load, as you can guess. With our content group optimization we preserved most of the performance of the permission system, but there were still some leaks that needed code optimization, and this specific page organization with multiple web parts was also quite a big deal. For those of you who were at the caching session this morning, Drupal 8 now truly does an amazing job with this and it's really easy to do. At the time we didn't have the render cache module, so we used Panels and Panels Hash Cache to specifically cache blocks of content on the pages for a particular user. The granularity was really easy to set, every single block of content was then separately cached for a particular user, and that saved us from very important performance leaks.

On the other side, the heavy permission calculation on a single page also gave us some white screens of death for webmasters, due to the Admin Menu module, which added more access checks on every page. For this particular problem we simply replaced the module with some custom admin blocks that were far lighter.

With the Queue API we had probably the biggest real performance problem of the permission system. The mapping engine had a really huge quantity of data to process, and to be sure that the calculation would be done as fast as possible, we chose to multi-thread the process. So we had to think of it from scratch as a multi-threaded process. It's definitely not the same way of working, because when you have multiple processes, multiple threads, you will probably have collisions between some of them. So we had to develop some semaphores, some locks, to be sure that when a thread grabs a piece of information in the database, another thread won't try to write the same information at the same time. Whenever one thread starts working on something in the mapping engine, it grabs a lock, telling the other threads: okay, wait for me; when I'm done, I give you the lock and it's your turn.
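A minimal sketch of that idea using Drupal 7's core lock API is shown here. The queue and helper names (mapping_engine_users, mymodule_map_user_to_groups()) are hypothetical; the point is only that two parallel workers never write the memberships of the same user at the same time.

```php
<?php
/**
 * Processes a batch of mapping engine items, one lock per user.
 */
function mymodule_mapping_worker_run($max_items = 500) {
  $queue = DrupalQueue::get('mapping_engine_users');
  for ($i = 0; $i < $max_items && ($item = $queue->claimItem(120)); $i++) {
    $uid = $item->data['uid'];
    $lock_name = 'mapping_engine_user_' . $uid;

    if (!lock_acquire($lock_name, 120)) {
      // Another thread is already working on this user: wait for it,
      // then release the item so it can be claimed again later.
      lock_wait($lock_name, 10);
      $queue->releaseItem($item);
      continue;
    }

    try {
      // Recalculate the organic group memberships for this user.
      mymodule_map_user_to_groups($uid);
      $queue->deleteItem($item);
    }
    catch (Exception $e) {
      watchdog_exception('mapping_engine', $e);
      $queue->releaseItem($item);
    }
    lock_release($lock_name);
  }
}
```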
This decreased performance a bit, but on the global scale the multi-threading really helped us increase the calculation speed.

The number of items was also quite a big deal to handle. At the beginning, our process stored full entities in each queue item, and this was a real problem that made our queue table grow; I think the biggest size we reached was something like 50 gigabytes for the queue table, which is quite huge. So we had to refactor a bit and only store entity IDs, which are then loaded later by the process itself, and that reduced the size of the queue. Today, on a per-day basis, the mapping engine still handles something like 15 million items, constantly. So it's quite a heavy process.

And finally, another problem we encountered was cache desynchronization with memcache. I don't know if some of you have already worked with Acquia Cloud, but on this particular project we had a load balancer with several servers behind it, and each server had its own memcache bin. This was a problem, because it introduced desynchronization between the different bins, and sometimes a user could be served another version of the same page. So we just had to dedicate one server to the caching, and that solved the problem.

Okay, so I'm done with this presentation now. I didn't go too far in depth into the technical aspects, so if you guys have questions... Thank you for your attention, I'm here for you.