Okay, I think it's time to get started. Can you guys hear me all right? Yeah, good, excellent. Welcome to my presentation here today. We're gonna talk about content staging in Drupal 8 specifically. I submitted this session under advanced site building because while we're not gonna talk too much code, I'm definitely gonna cover some components that we're using to build the solution and some fairly intricate principles around this. So that's what we're gonna talk about today. In a little bit more detail, we're gonna cover what content staging is, why we do content staging, some of the reasons behind that. We're gonna also cover what we want from a content staging solution in Drupal 8. We're gonna cover some of the lessons we learned from Drupal 7 and how it's being built in Drupal 8. With the emphasis on being built, because this is not a solution that is ready to use yet. It's actively being built. Some of what we're gonna talk about here today is not yet built. So there's definitely work left to do on this, but I'm still excited to present these ideas to you guys, and hopefully for those of you that are interested, reach out to me during the code sprints here on Friday and we can build the solution together. That would be very exciting. We're also gonna talk about some reusable protocols. You will understand what I'm talking about there when I come to that. And then we're gonna jump to some conclusions at the end. I would have hoped to get a demo together for you guys today, but as I said, we're in very early stage development here, so unfortunately we won't be able to do that. So first, who am I? I'm Dick Olsson, dixon_ on Drupal.org. On Twitter you'll find me under @dickolsson, simply. I'm a web technologist. I love everything about the web, about open source. I am a passionate Drupal contributor. I contribute to core as well as maintain a couple of modules.
I maintain the UUID module and the Deploy module, both being used for content staging solutions in Drupal 7. I'm from Sweden. I currently live and work in the UK. So that's just a short bit about me. So let's dive into it. What is content staging? I'm first gonna start by asking questions. How many of you are doing some sort of content staging or advanced content staging workflows or anything like that at this moment? Okay, great. And how many of you are looking to build something or want to explore these possibilities? Okay, that's great, awesome. So content staging is different things to different people. We're not gonna cover too much of the editorial aspects of content workflow or content staging in this presentation. We're gonna cover the technical principles of how a system like this would work under the hood. The editorial part, that will be covered by other modules, by other solutions that you can sort of tuck onto this. So we're gonna cover the technical aspects. The simplest example is you have a stage site and a production copy of that site; your editors sit on the staging site, they work on their content, and then you do a push from stage to production. That's the absolute simplest example of content staging. There are also solutions where you stage content only on the production site. There are other solutions for that, so the part we're gonna cover here today is for when you have sort of a multi-server environment going. A little bit more advanced example is a merge workflow, or whatever you wanna call it, where you basically have clones of your site, editorial one, two, and three, for instance, where editorial teams work on individual updates, can be big updates, can be small updates, and then they sort of come together, they get merged into stage, and then further pushed to production. This is a workflow that applies very well to the WF Tools module. It's a module being developed for Drupal 7. Dave is very excited in the back there.
Basically how that works is that you spin up copies of your production website, where you can do big changes or small changes, you can do code changes individually. These changes might sit there and be reviewed for months and months, and then later merge to stage. So this is another example of how you can do content staging. Another model is what I like to call a hub-spoke model. I know a few companies that use and deploy solutions like this, where they have one central editorial site that has no public content. That's where they sit and work, working on collections of content, if you will, and then they push these bits and pieces of content out to their frontend websites. So these are all separate websites driven by a central editorial site. Other examples are not necessarily, let's see if we can get the slide to change here. There you go. Another model is just a simple ring or network model where you don't necessarily have any staging site; you're sharing content across a network of sites. So not necessarily a staging solution, but it's definitely an interesting solution that I've seen people do, where they share content across websites, they sort of stage or replicate content, if you will. So those are just some different examples. Why do we do content staging? Some companies have very specific workflows where they simply can't log into production, or they want to preview content in very specific ways. Drupal hasn't been very good at providing full site previews. So this is often why you separate stage and production into two separate sites or installations, where you can do full site previews. Other reasons are for better security. Some companies don't want their editorial teams to sit on production and edit content, so they put their staging site inside their corporate network, behind firewalls.
That's where they sit and work on revisions and edits, and then when the content is ready, they push it from within their corporate network out to the production site. So increased security, that's one of the reasons. Then it can be legal reasons as well. Maybe you have a type of content on your website that simply is not geared to be changed live in production. Maybe it needs very extensive legal reviews, which can sometimes take months, so content changes might sit on a staging site for months while they're being reviewed and approved and so on. That's also one of the reasons why we do content staging. But no matter why you do content staging or how you do it, it's all based on the same principles. It's content staging; some call it content sharing, content replication or content syndication between different environments. So it's all the same principle, no matter how you configure your workflow or what the reasons are. It's built on the same underlying principles. It's just called something a little bit different by different people. So what do we want from a solution in Drupal 8? We don't just want to build a pure content staging system. We want to build a generic, or more loosely coupled, system where it's gonna be easier to support the various configurations, the various ways that you might want to share or replicate or stage content. The solution that we built in Drupal 7 was very much a pure content staging solution. People tried to bend it and use it in other sorts of ways, so we want to create a more loosely coupled system here. And as I briefly mentioned, we of course want to learn from the Drupal 7 implementation of the UUID module, the Deploy module and the Workflow Tools module as well. So we want to take our experience from those modules and improve and build upon that. And in more detail, and I'm gonna cover this in more detail as we go, but one of the lessons that we've learned is that we need revisions everywhere.
Why do we need that? We need that for things like detecting conflicts. If you have a merge workflow or anything like that, the previous solutions that we built with the Deploy module haven't been very good at detecting conflicts. So we want conflict detection. We also want an easier way of dealing with dependencies, dependencies being: if you want to move a node around, you need to know what the dependencies of that node are. For instance, the author user, or taxonomy tags, or other node references. Those are all dependencies, and we need to manage these dependencies as we move all of these nodes, or other types of entities, around. We also want to support ad hoc replication, ad hoc pushes, or continuous replication of content between our environments. Content staging is usually seen as the ad hoc push: when I'm done, that's when I want to push. But there are use cases where you want to have continuous replication of your content across your environments. So that's also something that we want to support. Also bidirectional replication, or bidirectional content sharing between environments, would be very interesting to support with this model. That's something that you most likely want when you have a network of sites that are sharing content with each other. And of course we want this all to be driven by a REST API. That's a good way of doing it. That's how we did it in Drupal 7. So these are the things we're gonna cover. And how is it being built? We're gonna walk through a couple of modules here. The core modules that we're utilizing are the Serialization module and the RESTful Web Services module. In addition to those core modules, the first module that we're gonna talk about is the Multiversion module. On top of that, we're extending the RESTful web services with what I like to call the RELAXed Web Services module. Yeah, sorry about that. And then last but not least, we're gonna cover the Deploy module.
That's gonna work together with all of these modules. So first, the Multiversion module. This module sort of sits at the bottom of this stack, really. Its first purpose is to track all update sequences in order to make dependency management a lot easier. So we're tracking all the changes that are being done to entities, and in what order they happen. One of the most challenging things in Drupal 7 with Deploy was the part that lived in a module called Entity Dependency. Because when you have a node, as I explained earlier, you have to deal with its dependencies, and we had to parse them recursively and build a graph of how these dependencies come together and in what order we're gonna deal with them. And graphing and recursive dependency management is very difficult to deal with. So the solution here is: just track in what order everything is happening, and then we can simply replay the whole sequence, or parts of that sequence. A lot easier; we don't need to mess with the ordering of our dependencies. We just need to understand what the dependencies of an entity are, and then we can replay them in sequence. The Multiversion module also provides revision support for all content entities. This is quite a significant change, but it's possible now with the entity API in Drupal 8: we have generic revision support for entities. And since last week, a core patch got committed where the schema generation is dynamic. So it's not up to each entity to define its schema; it's derived from metadata about that entity. The schema generation being dynamic makes these things a lot easier. We are also working on a patch for core that we will depend on, and will work on on Friday, which is gonna be crucial for this: it lets you extend entities with more base properties. Base properties not being configurable fields as we know from CCK and so on, but base properties on the entity itself.
So that's also gonna be crucial. And then last but not least, we're tracking revision trees in a way very similar to how Git is tracking revisions. We do this to support conflict detection, with the primary use case being, for instance, the merge workflow that we talked about earlier. Let's say you work on editorial environment one, you work on maybe nodes one, two, three and four. You're doing changes to those. And then on editorial environment three, for instance, you also make a change to node number two. In the Drupal 7 version of Deploy, we weren't really able to detect conflicts. Whatever change got deployed last, that was the revision that won. There was no easy way to detect these conflicts or to deal with them in any way. So this is why we're tracking revision trees for all entities, so that we can more easily detect where conflicts are happening. So that's the purpose of the Multiversion module. We're doing a few more things, and I'm gonna talk about the revisions in a little bit more detail, how that's actually being done. In the Drupal 7 version, we had UUIDs for revisions. UUID support for entity IDs is in core. There aren't any UUIDs for revisions, which is a good thing, because we don't want UUIDs for revisions. A revision, in this case, is a hash string calculated from the actual changes of an entity. This way, it's easier for each server in the network, staging or production or your editorial sites, to detect conflicts without needing to do deep inspection of an entity, and without needing to ask the network, your other editorial sites: what did you change? What was your last revision? A lot of network calls to figure that out. So we have a consistent hashing mechanism, where the revision is just a hash of the changes. The parameters that the hash takes are: whether it's deleted or not, what the current sequence ID is, the previous or old revision, and then the normalized entity as used by the serialization API.
And the string that we're doing the MD5 hash of is a JSON representation of that entity. Looking at a simple entity in its JSON format, the revision trees are tracked in the revision info field. And there you can see the hash, just two examples. We also have a field, or a base property, that holds the local sequence number. That's important for various reasons. And then we have another flag there, which you might think is a little bit strange, but that's a flag saying if the entity is deleted or not. Strange. That's strange because we're changing the semantics of the entity storage. We're doing CRAP instead of CRUD. CRUD stands for create, read, update, delete. That's the traditional model of how Drupal is dealing with entity storage. Instead, we're doing something that is often referred to as CRAP: create, read, archive, and purge. That means that entities are never deleted. They're just archived, and then maybe purged later. So every change to an entity is a new revision. Even a delete is a new revision that just happens to be marked as deleted. This is how Git does it. When you delete a file in Git, it's actually not removing the file itself; it's saved as a new revision that happens to be marked as deleted. So we're changing the semantics of the entity storage here a little bit. And we need to do this in order to handle conflicts, for instance conflicts where one version on editorial environment three maybe was deleted, and in another environment we did an update to that node. When those come together on the staging environment, we need to be able to make decisions about how to deal with this conflict. That's why we can't just wipe the entity away, because then we don't have anything to base decisions on. So that's how we're changing the semantics. And then you can run compaction jobs, as I call them, on cron, where we purge away, actually delete, entities.
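To make the hashing idea concrete, here's a minimal sketch in Python. The actual module is PHP and all names here are mine, not the real API; but it shows how a deterministic revision hash could be computed from the parameters just mentioned: the deleted flag, the sequence ID, the previous revision, and the normalized entity, MD5-hashed via a JSON representation.

```python
import hashlib
import json

def generate_revision_hash(deleted, sequence_id, parent_rev, normalized_entity):
    """Compute a deterministic revision hash from an entity's change parameters.

    Any site in the network that applies the same change to the same parent
    revision computes the same hash, so conflicts can be detected without
    deep entity inspection or extra network calls.
    """
    payload = {
        'deleted': deleted,            # CRAP semantics: a delete is a revision too
        'seq': sequence_id,            # local update sequence number
        'parent_rev': parent_rev,      # previous revision hash, or None for new
        'entity': normalized_entity,   # normalized (serialized) entity values
    }
    # Serialize with sorted keys so the JSON string is byte-for-byte stable
    # across servers, then MD5 it.
    serialized = json.dumps(payload, sort_keys=True, separators=(',', ':'))
    return hashlib.md5(serialized.encode('utf-8')).hexdigest()
```

The key property is that identical edits hash identically, so two editors making the very same change don't produce a spurious conflict, while any real divergence produces different hashes.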
They do take up a lot of space, of course, in your database if we're always gonna store all entities. So a compaction job can help, if you want to actually delete, or purge, deleted revisions. Yeah, a question. So the question was: is it possible during compaction to not delete, but store it in an archive database, on a separate database, if there are legal reasons why you might need to keep deleted things? Yes, it is. It's not built inherently into the system, but Drupal 8 is very nice for you in this respect, because, and I will actually cover how the compaction piece is built a little bit later, but basically we have a compaction manager which lives in the dependency injection container. With the Symfony stack that we're using, you can swap out that compaction manager, and instead of deleting, you push it to a different storage. So it's not inherently built into the system, but it's supported by Drupal 8 through the dependency injection container, which is very nice. Okay, we're gonna cover some components that we used to build this. We have a sequence index to track all the update sequences. That's bundled together in a service that implements the sequence index interface. Injected into this service is a key value factory interface, because we store all the sequences in the key value store that is provided by Drupal 8; tracking all sequences, there are gonna be a lot of them on your site. So storing them in the key value store makes sense. You can swap that out: if you don't want to use MySQL you can use Redis or anything else that you want. So we have the sequence index service. That's wrapped, if you will, by a new entity storage controller. That is using a generic trait, because the changes that we're doing, the semantics that we're changing to do CRAP instead of CRUD, are actually storage agnostic, which is why we bundle everything into a trait.
So for instance the MongoDB storage controller can just reuse this trait and get all the benefits of what we're doing here. And in addition to what we just covered, there are two other important services: we have a conflict manager and a compaction manager, dealing with the conflicts and the compaction that we just talked about. And these you can swap out in the container if you want to. So that's the Multiversion module; that's essentially what it's doing, and it's doing a lot of the heavy lifting of tracking revision trees, conflicts and compaction. So it's a very fundamental, low-level module that's important for this stack. Second, we have the RELAXed module, which provides an extension to the REST module. It basically provides a new JSON API. We're not using the core endpoints, the core JSON endpoints, because they are not very well geared towards what we want to do here. So we use the plugin framework that the REST module provides, but we provide a new set of endpoints. Endpoints for all content entities and file attachments, or file entities as they are. Then we also provide a couple of endpoints for certain administrative tasks. Comparing revisions, for instance; we need to do that while we're staging content. We can start and stop replications, either ad hoc or continuous replications. And there are endpoints for a few other administrative tasks as well. You can for instance trigger compaction over the REST API, et cetera. And there's gonna be a Drush plugin for dealing with all of these things as well. So you don't need to use the REST API; you will have the same set of functionality available through Drush. So we have a document resource, the generic resource that deals with content entities. And this is implementing the plugin system provided by the REST module. We, as I said, have a separate endpoint for attachments, all extending the REST resource base. It's a very nice API, by the way.
It's easy to write new endpoints, so that's very exciting. We also have a bulk document endpoint where we can push multiple entities, multiple content entities, at once. In the Drupal 7 version, if you were pushing 100 nodes, or 100 entities that you updated, in one push, that was 100 HTTP calls back and forth, one for every single document. The more network communication you have, obviously the higher the risk that somewhere along the line it will fail. So we do provide an endpoint for pushing in multiple documents at once. So one HTTP call for pushing 50 nodes, or however many you want to push at once. Then we also have an endpoint for showing the sequence, the update sequence that we talked about before. Basically an endpoint to show all the changes in your database, or among the entities in your database. And this is gonna be important; I'm gonna cover how the replication is actually gonna work a little bit later. And then an endpoint for comparing revisions: what revisions are the target site and source site missing between each other, et cetera. So in practice, how does the replication work on a technical level? There are a few network calls that we need to do here. First, the source and the target each have a UUID, so the source and the target can be identified universally. And when you want to push from the source site to the target site, first we ask the target site: at what sequence number did I last do a push? We're storing that checkpoint on the target site. And when we have that answer, then we can get the changes since that sequence from the source site. This can be all the changes, or we can apply filters to this. So: I only want the changes since this timestamp, for article nodes. You can apply filters to this. I'm not gonna cover the implementation details of this because it's not all done. But basically we ask here: what are the changes since the last checkpoint?
Then we pass this result, these changes, to the revision diff endpoint on the target site to ask: among these changes, what revisions don't you have? Because some of the revisions you might already have. So we get a diff, and we collect all of those missing revisions from the source site, and those are the revisions that we push in with just one network call to the bulk docs endpoint. So it's a little bit more communication initially to figure out what revisions you need to deploy, but once you deploy them it's one network call, so this stack is very quick to go through, and it's less fragile, if you will, compared to the Drupal 7 setup. And then last, we save the checkpoint on the target site, so that we can redo this again and just get the changes that have been made since the last push. And then, last but not least, we have the Deploy module. The Deploy module in this case is just gonna be a simple UI, a simple user interface, on top of this to manage your replications, to manage your deployment plans as we call them. It's also gonna provide you with a user interface to manage your conflicts, your revision collisions and so on. So this is gonna be quite a thin, lightweight module on top of these other modules that we just covered. So that's basically the stack of modules that we're gonna use to build all of these various solutions that we covered. And I'm gonna spend a little bit of time talking about some reusable protocols here. As I said, the revision and conflict detection model is taken from Git, how Git is doing it, and also from CouchDB. They do something very similar in pushing content around between various nodes. We're not using CouchDB in any way; we're just reusing the ideas here. And the replication protocol, the various steps we go through to do replication, is also taken from CouchDB, and the API looks very, very much the same as CouchDB's.
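The replication steps just described, ask for the checkpoint, get the changes, diff the revisions, bulk-push, save the checkpoint, can be sketched roughly like this in Python. This is an illustrative sketch of the CouchDB-style protocol flow, not the actual module code; `source`, `target` and all method names are hypothetical stand-ins for HTTP calls to the RELAXed endpoints.

```python
def replicate(source, target, replication_id):
    """One replication pass from source to target, CouchDB-style."""
    # 1. Ask the target at what sequence number we last pushed (the checkpoint).
    since_seq = target.get_checkpoint(replication_id)  # 0 if never replicated

    # 2. Get all changes on the source since that sequence. Filters (by entity
    #    type, timestamp, etc.) could narrow this list down.
    changes = source.changes(since=since_seq)

    # 3. Ask the target which of these revisions it is actually missing;
    #    some it may already have from earlier pushes or other sources.
    candidate_revs = {change['id']: change['revs'] for change in changes}
    missing = target.revs_diff(candidate_revs)

    # 4. Collect the missing revisions and push them in ONE bulk request,
    #    instead of one HTTP call per document as in the Drupal 7 stack.
    docs = [source.get_revision(doc_id, rev)
            for doc_id, revs in missing.items() for rev in revs]
    target.bulk_docs(docs)

    # 5. Save the new checkpoint on the target, so the next replication
    #    only considers changes made after this point.
    last_seq = max((c['seq'] for c in changes), default=since_seq)
    target.save_checkpoint(replication_id, last_seq)
    return len(docs)
```

Note that the checkpoint is saved only at the very end, which is also why a dropped connection mid-push simply means the same changes are retried on the next run.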
Again, we're not using CouchDB for anything; we're just reusing this protocol, because why reinvent it if someone already solved the problem for us? And reusing this API specification, this protocol, will lend the framework to unexpected use cases that I think can be quite exciting, and that otherwise would not be possible. For instance, we have CouchDB-compatible APIs, quite exciting. Then you can use things like PouchDB, a portable CouchDB, which is a JavaScript implementation. You can write very thin front-end sites that can natively communicate with your site through this sort of standard REST API, if you will. Why invent our own REST API when we can expose our Drupal data through a unified API that more software can talk to? There's also Couchbase Lite, which is compatible with this API specification as well. Couchbase Lite is a local database for smartphones, so those apps can communicate with this same API that we're providing. It's the same specification that we're exposing our Drupal entities through. So that's quite exciting. I would have liked to give you guys a demo, but it's not done. So, you know, reach out to me on Friday during the code sprint, and I can show you the code I have. We have test coverage, so we can walk through how these tests are, you know, walking through all of these network calls and doing replication and so on. So we can sit down and look at that. And we need help to build this. So please step up and, you know, start contributing if you're interested in a solution for Drupal 8. We're gonna do that on Friday. I'm gonna be here also on Friday, Saturday, and Sunday for the extended code sprint. So anytime, reach out to me and we can have a look. So I'm gonna jump straight to some conclusions here. As we can see, the stack that I've presented here is not a straight port of the Drupal 7 modules. We've taken a lot of concepts, and we've improved on them and made them fit better with how Drupal 8 is doing things.
And we're gonna provide a very loosely coupled system that will cover more use cases. We can do the simple workflow. We can do the merge workflow. We can replicate content in a ring, et cetera. And the system will also lend itself to some unexpected use cases, I think, when we're reusing API specifications and so on. More importantly, we're implementing, as I just said, battle-tested protocols here. We know that these things work. Why reinvent them ourselves? Drupal 8 has sort of gone with that concept all the way, with reusing Symfony and so on. So why reinvent the API specification? And that's it, really. I don't have much more to present here today. This is the stack, so I'm very excited to present it to you guys, and reach out to me if you have any questions or anything. We do have quite a bit of time for questions, so please step up to the middle here and hopefully we can have a good discussion around some of these things. Thank you. First question. Yeah, I just wanted to say first that I thought this talk would be interesting, but I just was blown away. This is really, really, really interesting. Great, thank you. And I think it's really promising in terms of moving this stack into enterprise publishing. There are really interesting implications from what it is. And the Git hash versioning of revisions is a very interesting solution. One of the areas that I've been trying to think through, it's kind of on top of the stack that you're talking about, but it has to do with, as you're moving content around between different site instances, one of the problems becomes representing what the differences really are between the different versions of the site.
So we've got content out on a staging site and it's going through a pipeline, getting to production, and we've got to go and kind of visually, or manually, go through to look at what's out on showcase to say: is the change that we really want what's going out there, and then when we go live, did that actually happen? So all that base work that goes around nailing down some way to handle revisions, I think it's, well, for us it's a really big deal right now, trying to think through how you would make that work on Drupal 7 where it stands. And so having kind of a solid ground to stand on, in what you're proposing, would allow us to build solutions that would simply resolve that. So editors can just look at this stuff across sites and verify that it's all out. It's awesome stuff. It's really great. I'm really looking forward to it. Yeah, thanks. Thanks a lot. My question's largely around, we do a lot of, sorry, Paul, we do a decent bit of licensing of our content to providers who probably haven't even heard the word Drupal in their life. Have you thought any about, or had any instances of, dealing with metadata changes, like we've changed this, this might work this way, anything like that, that would make it just less Drupal dependent? Not that I don't want to build everything in Drupal. Metadata around the specific content, you mean? Yeah, yeah. So, I mean, obviously you could provide a JSON feed and say, here it is, good luck, do whatever you want with it. Yeah, so, I mean, there are a couple of different ways you can do that. You can of course attach metadata around the content pieces themselves, provide additional fields on your entities and so on. But then in Drupal 8, it's also gonna be very easy to just provide your own custom REST interface, your own custom REST endpoint, where you provide a JSON feed or an RSS feed or whatever you want, to deal with your metadata changes, say licensing. Check in here, see if it changed.
Yeah, so licensing changed for this piece of content, go here, and then you give the REST endpoint for the content that changed. You can do that, so that would be a very custom extension on top of the core REST module, where you provide your own endpoint, or your own API if you will, to deal with those things. And then that feed, I guess, could point you into the API endpoints provided by this framework, maybe. Okay. Yeah, cool, thanks. Thank you. Hi. I was wondering, in terms of communication between the different servers and everything, in a scenario where the network blips, something blips, and halfway through your process things fall apart, what thought has gone into wrapping these communications into a transaction type of thing, or being able to roll back, or even if you deploy something and then decide, whoops, wasn't ready, kind of rolling back from that standpoint, replaying sequences in reverse order, stuff like that. Yeah, very interesting that you're asking. I didn't cover the details around that because there are still some details to figure out. But basically, if you do a bulk document input where you push in 20 documents at once, maybe the network drops, then that didn't happen at all. Or if it drops halfway through, then you're never gonna come to the point where we save the sequence number, because the sequence number is saved at the very end. And the way the storage controller works here is that we only allow the storage controller to load or query entities that have a sequence ID. So you can even save revisions, but if you never come to the point where we save the sequence number, then the storage controller will never query them. And the entity query API, or the entity field query API, won't query them either, because we do some extensions to how the entity query API works. So if there's no sequence number, then the revisions will never be picked up.
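To illustrate why that works, here's a hedged Python sketch (not the real storage controller, which is PHP; the class and method names are made up) of the two-phase write just described: revisions are persisted first, the sequence number is recorded last, and queries are gated on the sequence index, so a failure between the two steps leaves an invisible orphan rather than half-applied content.

```python
class EntityStorage:
    """Toy storage where only sequenced revisions are visible to queries."""

    def __init__(self):
        self._revisions = {}   # (entity_id, rev) -> entity data
        self._sequence = {}    # (entity_id, rev) -> local sequence number
        self._next_seq = 0

    def write_revision(self, entity_id, rev, data):
        # Step 1: persist the revision. On its own this makes nothing visible.
        self._revisions[(entity_id, rev)] = data

    def assign_sequence(self, entity_id, rev):
        # Step 2: record the update sequence. Only now does the revision
        # become queryable. If replication dies between step 1 and step 2,
        # the orphaned revision is simply never picked up.
        self._next_seq += 1
        self._sequence[(entity_id, rev)] = self._next_seq

    def query(self):
        # Queries only ever see revisions that have a sequence number.
        return [key for key in self._revisions if key in self._sequence]
```

Ordering the writes this way gives transaction-like behavior across a network boundary without needing an actual distributed transaction.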
So that's one way of dealing with that scenario. So, yeah, we will be able to quite nicely handle those sorts of situations, when the network drops or if something else breaks along the way. That's awesome. I also just wanted to say, I use the Drupal 7 stack today, and I didn't know that any of this was being worked on. So I just want to say thank you, and this is great. You're welcome. Thank you. My use case, I was interested in this talk because my use case is: we have a public site and we're developing some new components for that public site between a team. So this guy is working on a bunch of blocks, and theming those blocks, and doing all sorts of stuff, and I'm working on another one. Someone's doing a view. Drupal 8 has the new configuration exports, which is awesome. But our only thing left was content. Does this work in the same way, where you can move blocks and things like that around, where you can just push everything at once and merge everything together? Does that make sense? Yeah. Yes, we will definitely be able to support a wide variety of use cases. And what's very nice in Drupal 8, that we didn't have in Drupal 7, is that we have a very distinct line between what's content and what's configuration, because we have two base classes, two base types of entities. We have config entities and we have content entities. So it's a very clear line. This stack will deal with everything that is a content entity. There is one entity in core that is neither at the moment, and that's menu links. That's neither a config entity nor a content entity. Work is being done in core to deal with that scenario. I plan to work on some of those things on Friday, because it's gonna be crucial. There's also another situation around blocks. We're gonna deal with that this weekend as well. We actually have a meeting scheduled for this.
And that's dealing with config dependencies, because we have some interesting scenarios in core at the moment where config depends on content and content depends on config, which is a bit messy. Blocks are the best example here. The placement of a block, that's configuration, but the block itself is content. So how do you deal with these dependencies and so on? There are still a few kinks that we need to sort out in that respect, but it'll take us a lot further than in Drupal 7. Awesome, thank you. Hi, I wanna preface this by saying that we are, shamefully, still on D6. And I wasn't familiar with the UUID module before today, so I don't know if this solves any of that, but we rely on production being our source for all of our node IDs, meaning we have to have production be the incrementing state for our node IDs. They can't come from anywhere else, but we don't want users to necessarily develop content on production. We'd rather have them somewhere else. But because of that requirement, and because of the way we do translation, we track our node ID, our revision ID, and then we actually implemented another feature to track the translation's source version ID, because that was important to the way we do this. So my question is, is there any way that you can recommend to have some kind of globally unique node ID generator that can be used across dev, test, staging, and production, so that I don't have to develop everything in production? So that's exactly what the UUID module is doing. Okay. You said you were using Drupal 6? Yes. I think the deploy module for Drupal 6 implements its own UUID functionality. Is there a Drupal 6 version of the UUID module, Dave? Yeah. Yeah. So yeah, there is a version. I actually maintain the UUID module, but I've never looked at the Drupal 6 version, because the Drupal 7 version was a complete rewrite. So have a look at that. Okay.
You know, maybe that will take you at least a few steps on the way. So that's essentially what that module is for: providing universally unique IDs and ways to deal with nodes and entities around that. Okay, thank you. Yeah, you're welcome. Many questions, that's great. My question, actually, is: you've talked about kind of the big picture, multi-server environments, things like that. Is there any discussion or talk about this working on a smaller use case, the concurrent editing problem? You get into a node, two people edit at the same time, or you get a system process that edits at the same time as a user, and you get that dreaded message of: it's been edited, you can't do this. Is there any kind of work, I mean, this looks like it could be something that would solve that problem by being able to, in the background, resolve those conflicts and just make it work, unless there's an unresolvable conflict? It's a very good use case. I haven't thought much about that particular instance, but it would definitely be possible, because we do control the entity storage with this solution, right? And we have a revision tree that looks very much like Git, so you will be able to save two revisions. The revisions are hashed, so if you say to two different people, ah, we need to change the title of this article, and they change it to the same thing, then the hash will be the same, so there will be no conflict. But we also track all the revisions, so we would be able to say, okay, there was a conflict here, and then we can bring up both revisions and deal with that. It's a good use case. It definitely makes sense to incorporate that, I think. So just to repeat the question: will there be a UI to compare and look at these different revisions? Yes. Someone has to write it, I will probably write it if no one else does, and it will most likely be written in the deploy module itself.
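The hashing idea from that answer, where two editors who make the identical change end up with the identical revision and therefore no conflict, can be illustrated with a minimal Python sketch. The serialization format and field names here are invented for illustration; the point is only that the revision ID is derived deterministically from the content.

```python
# Sketch of Git/CouchDB-style content-derived revision IDs, as described
# in the talk. All field names are illustrative, not Drupal's actual schema.
import hashlib
import json

def revision_hash(entity, parent_rev=None):
    """Derive a revision ID from the serialized entity content.

    Hashing the content (plus the parent revision) means two editors who
    make the exact same change produce the exact same revision ID, so no
    conflict is recorded; different edits hash differently and surface
    as a real conflict to track.
    """
    payload = json.dumps({"parent": parent_rev, "fields": entity},
                         sort_keys=True)  # canonical serialization
    return hashlib.sha1(payload.encode()).hexdigest()

base = {"title": "Old title", "body": "Some text"}
parent = revision_hash(base)

# Two editors independently make the identical edit ...
edit_a = dict(base, title="New title")
edit_b = dict(base, title="New title")
same_hash = revision_hash(edit_a, parent) == revision_hash(edit_b, parent)

# ... while a different edit yields a different hash: a genuine conflict.
edit_c = dict(base, title="Another title")
conflict = revision_hash(edit_c, parent) != revision_hash(edit_a, parent)
```

Here `same_hash` is true (no conflict recorded for the identical edits) and `conflict` is true for the diverging edit, matching the behaviour described above.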
Maybe we're gonna take that out into a more generic revision diff module, maybe that makes sense. I haven't got that far in the process yet, but yes, we definitely need a UI for these sorts of things to really leverage all the groundwork that we're doing here to track revisions and so on. So the goal is to provide a UI, definitely, yes. More questions? Hi, first of all, this looks really, really awesome. It's really good stuff. I remember looking at something very much like this in Drupal 7, three or four years ago, and not having a good answer when someone asked me, how do you do this? And this looks like the answer. So it's good to finally have something that does this in Drupal. Great, thanks. My question was about how to handle some of the slightly more unusual fields that you might get on an entity. You might have large binary files, for instance. Replicating those out across multiple sites and keeping a full version history for those sounds like it could be expensive, or it could be slow. Is there a way of maybe doing a special case? I know in Git, for instance, there are things like git-annex, which you can use for storing binary files off on S3 or some other kind of storage, so that you don't actually have to keep a full history of that binary file in the repository. Do you think there's a potential use for something like that in this kind of system? So for the file entities in core, first of all, we will have separate endpoints for files and quite a nice way to deal with them. We do have some restrictions around that, the HTTP protocol being one, because we're deploying these things over HTTP. When you base64-encode a file, there are practical size limitations on how much you can transport over HTTP. So first of all, those are some restrictions that we're dealing with.
If you have very large files that you want to replicate, you're probably looking at another solution, I think, like having them centrally on Amazon S3 or something like that, and then just deploying the entity itself without the base64-encoded file contents. But the revisions of the entities, that's just the metadata that Drupal is storing. How Drupal stores files on disk is transparent to this system. I think when you save a new revision of a file, it saves a completely new file on disk. So we're not doing anything specific to deal with the actual file on disk; that's dealt with in the normal way through the Drupal API. But if you have large files, you're probably looking at a different solution anyhow, Amazon S3 or something like that. Thanks. More questions, great. We have a little bit more time I think, so go on. I think this will probably be a short one. So there's the endpoint for bulk pushing of the entities themselves, but before that you have to pull in the sequence numbers on your target site. Is there a bulk way to do that, or are you doing a whole bunch of network calls to pull that in? No, so the changes endpoint, or the sequence number endpoint, will list everything since your last change. So that will be a long list of sequence numbers, so that's dealt with in bulk as well. For multiple entities? For all entities, yes. So that's for all entity types. Great, awesome. This looks really automated, which is cool, but what is the conflict resolution strategy? Say, when does a human being have to be involved when there is a conflict between different revisions or states of the content? So at the moment, for the winning revision, we're using the same concept as CouchDB here. We'll always store both revisions, but the winning one will simply be the one that has the highest hexadecimal hash value. And why are we doing that?
It's just so that we can consistently make decisions on multiple servers. We can make a consistent decision without doing more network calls. Each server in the network can make its own decision, and it will be consistent across all servers if the conflict is replicated into other editorial environments and so on. So that will be the winning revision. But then we'll store the fact that there was a conflict here, and we will be able to provide a user interface where you can say, this revision was conflicting with this one, and then it's up to you really how to merge them. Maybe you just pick one of them. Doing the actual merging will be very challenging to implement. It is possible, because we do have a nice serialization API, so we could do a JSON merge of the JSON-serialized entities and then potentially re-save that. But that will be very challenging. So most likely it will be a user interface where you can just pick one of the revisions. But there always has to be a winning revision for Drupal to be able to continue to operate, and that will just be the highest hexadecimal value of the hash, so that it's consistent across all servers. My second question: how do you deal with content dependencies in terms of pushing stuff? Let's say you push a node or an entity, and then there are further dependencies, like this page had tags, file attachments, something else. Is everything repackaged, or are you kind of starting the transaction and then, when you figure out there are more dependencies, you start initiating more transactions? So if you do a partial deployment where you just want to deploy one node, then we need to detect all the dependencies. We have to go through those recursively, before we do the push. So we're not figuring anything out as we go during the network call; we need to figure out all the dependencies first.
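The winning-revision rule described in the answer above, always keep the revision whose hash has the highest hexadecimal value, is easy to sketch. The point of the rule is that it's a pure function of the conflicting hashes, so every server reaches the same verdict with no extra network calls. A minimal illustration (the hash values are made up):

```python
def pick_winner(conflicting_revs):
    """Pick the winning revision deterministically, CouchDB-style.

    Every server compares the revision hashes as hexadecimal values and
    keeps the highest one, so all servers agree on the same winner without
    talking to each other. The losing revisions are still stored, so a UI
    can surface the conflict for a human to resolve later.
    """
    return max(conflicting_revs, key=lambda rev: int(rev, 16))

# Two conflicting revisions of the same entity (illustrative hash values).
revs = [
    "1a9fe3c0d2b14e77a0c55b9d8f3e21aa",
    "7c01d4b2a9ee43f1b6d2c8a05e99f310",
]
winner = pick_winner(revs)

# The comparison order doesn't matter: every server picks the same winner,
# which is what keeps the network of servers consistent.
same_everywhere = pick_winner(list(reversed(revs))) == winner
```

Here the second hash wins (0x7c... > 0x1a...), and `same_everywhere` is true regardless of the order the conflicting revisions arrived in.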
But we don't need to deal with the ordering, because that's all in the sequence index. So then we say: we have all of these dependencies, now just look in the sequence index to see in what order they were created or updated. That order will also work on the target site. Thank you. So yeah, that's why the sequence index is very, very important here. We don't need to do fancy graphing of all the dependencies to recursively figure out in what order to deal with things, because that was a big hassle in Drupal 7. Yeah, next question. Hi. I know there are a couple of Drupal shops that are, as they say, chasing head on Drupal 8, and they already have production sites running. One of the arguments against doing that is that the upgrade path is totally unclear between alphas or betas. But this presents an interesting opportunity. I'm thinking about doing a small production site in Drupal 8, and I'm wondering about the possibility of using this as a way of moving content from a current alpha to a new alpha or beta. You teased us with the idea of a demo today, but how close is it to being able to do that, and is that an approach that you think might be viable? It's definitely an approach that would be interesting to explore. It will be difficult when the format of the entities themselves is changing, of course. I know that people are using the D7 version of Deploy to do actual upgrades; I've had people reach out to me with questions around that. So it's definitely something that is worth exploring. I think it would probably do a very good job at it if the changes between one Drupal 8 alpha and a later alpha or beta are not too significant. So it's definitely something that's worth testing. We have most of the endpoints, I shouldn't say most, but maybe half of the endpoints are already implemented and working in the Drupal 8 version.
So the replication protocol is not tied together yet, though. I can't give you a timeline for when it will be done, but it won't be too long. The goal is definitely to have it done before Drupal 8 goes into release candidates, yeah. Great, thank you. You're welcome. Okay, we're slowly running out of time. Many great questions, and thank you all for coming. Reach out to me if you have any questions, on Twitter or anywhere else. Thank you. Thank you.