Good evening. Welcome to our last session for today. I'm really happy that Henri Bergius has come to DrupalCon to give a talk about decoupling content management. What that means you will learn in the next 45 minutes, and I leave everything else to him. Please, Henri.

Yeah. Hello. Does this work? Yes, perfect. Okay, so I'm Henri Bergius. I work for a company called Nemein in Finland, though actually by accident I find myself living in Berlin. As a bit of background, I've been doing content management for quite a while. This is from my blog from 1998, when we started Midgard, one of the very early open source CMSs out there.

But anyway, to the subject of the talk. Regardless of what you're using (in this room it's pretty clear most of you are using Drupal), I'm sad to tell you your CMS is a monolith. If you look at any content management system, this is pretty much the general architecture. There's some database, maybe NoSQL, usually relational. There's the file system where you put your images and stuff. And then on top of that there's just this one big blob called a content management system.

Now this obviously carries some problems. For instance, in Drupal you now have people working on Spark, which is a very cool editing interface, and there might be people who say, "Yes, I would really like to use this," but then their IT department says, "No, we are not touching this PHP stuff." Or of course the other way around: the IT department says, "Okay, our strategy is to do everything in Java or .NET or whatever," but then the UI sucks. So by having everything bundled together, you really limit the options quite a lot. So I was thinking: how could we move the state of CMS forward? How could we solve this problem?
And I came up with this picture on the right, which separates the CMS into three discrete blocks. On the bottom you have a content repository, which is responsible for keeping the structure of your data, the storage, the retrieval, all of that. Then in the middle you have a web framework that actually does most of the functionality of the website. And then finally, as a separate piece, you have an editing tool.

This very much follows the engineering principle of a clean separation of concerns. Each part of your setup is responsible for some part of the information, some part of the functionality, and it is the only part dealing with that. So the web editing tool provides the tools for adding images to text, doing formatting, all that stuff that users need to do when they publish content. The web framework provides the business logic, the rendering of the web pages, the routing, all that stuff. And then the content repository keeps your content. It ensures that it's always correct, so it handles the validations and things, provides the retrieval APIs and so forth. And none of them really walks into the others' territory.

For the past couple of years I've been working with this European Union funded project called IKS, because obviously moving CMSs to become decoupled is quite a lot of work. So I'm very happy to have your tax euros at my disposal. I published a blog post about this a year and a half ago, and since then it has become a kind of rallying cry for fixing the state of CMS. Now, thanks to Lukas over there, we have a website for this message called decoupledcms.org, where the idea is to collect these different resources:
Lists of libraries you can use to make your CMS decoupled, and of course also to promote the CMSs that are built in this way. So if you have any ideas for that, please contribute.

So how do we actually move forward in practice? The idea of having these discrete blocks is nice, but obviously you need to have those blocks. The first of them is the editing interface, and there, instead of just saying we need an editing interface, I wanted to do something, and out of that came Create.js. It's a web editing tool which can work with any CMS out there.

But when building a web editing tool, where do you start? As you know, as you have built Drupal for instance, there's quite a lot of work in building an editing interface. So I thought, okay, to come up with an approach we need some sort of constraint for ourselves. So let's see if we can build a CMS without a single form. I mean, it's a very unnecessary limitation. Obviously forms are fine for various purposes, but this decision was made so that we could focus the effort into something. Forms are wonderful when you're actually creating data. You use forms to file your taxes. You use forms when you're entering countries like Russia or the USA. But do you really want to use the same interface metaphor you're used to for communicating with the government to also communicate with your web audience? The modern web can do so much better.

So here's one screenshot of Create in action. Basically what you have is your website as it is. All of your layout, all of your CSS is there, because that's your web page. The only thing we add on top of it is this toolbar, where you have the necessary
functionalities, like the editing buttons, being able to save stuff, publishing, unpublishing, whatever the CMS wants you to have.

How many of you saw the Spark presentation? A few, okay. So all of this works in a very similar way, in that we are not doing anything crazy with your content. The content you're editing, as you're writing it, is actually the content on the page. So if you have some funky web fonts or whatever, all of the editing tools will be editing the stuff as it's shown. And yeah, Spark is using the Aloha Editor. In Create we have different editor options. One of them is Hallo, an editor I wrote, but you can also configure it to use Aloha instead. So in this way it's very flexible. If you want to use something else, let's say TinyMCE (please don't), you could configure Create to do that.

The first big problem with the approach came when we were thinking: okay, if we're not doing any forms, it's very easy to go and edit content that actually exists, but how about actually creating content? How could you do that without forms? For that we came up with the concept of collections. Let's say you have a list of articles or a navigation tree or something. You mark it up as a collection and you get this Add button next to it. You click Add, and an empty content item slides into view. You see these "title", "content", "created at" there? We borrowed that idea from, let's say, PowerPoint or LibreOffice, where when you create a new slide they mark the places where you can write stuff with these kinds of markers. So you know: okay, now I have this new content item, I can just go and write the title, write the content, press save, and, let's say, Drupal would create a new node.

Another difficult one was image handling. Again, how do you upload things without forms and so forth?
Luckily with HTML5 we do have the tools, and this is where the Swiss company Liip did great work by providing us with very nice drag-and-drop image tools. First of all, we can do suggestions. We look at the tags of your content and we can suggest images that have been used in similarly tagged content items. You get this dialog next to the content when you want to insert an image. You just choose an image, or maybe search in your media repository, or upload a new one, and then you just start dragging the image to the content area. To make it very clear what's going to happen if you drop it, we always show this kind of ghosted image there, showing: okay, if you were to drop the image right now, this is how much space it would take with your text, this is how the text would flow then. So you always know what your content will look like when you perform an operation.

Linking is another difficult subject. I like to use a lot of links, because that's what the web is made of, and sometimes it's very hard to figure out where to link what. So one thing where we decided to do some automation for the user was to use named entity recognition tools, so that we actually parse the text you're editing. We find things that are mentioned, like companies, people, products, whatever you have in your so-called knowledge base, and we suggest links to you. This works very much like a spell checker. You write your text, and wherever we recognize a potential link, an underline appears. You click it, you see where the link would go, and you can accept it or reject it. And if there are different potential targets, like the "BBC Radio" here could be any of the BBC radio stations, then you get to choose, or maybe you want to link to something completely different. But anyway, the idea here is to automate a lot of these cumbersome processes that content editors have to do.

So, showing the content as it really is on the page,
and not using forms, but instead using the capabilities of the modern web, was very important for us.

But there's another reason. I think in the Aloha presentation or the Spark presentation people were being polled about where the content actually starts its life. With any web CMS the content is actually not written in the CMS. It's always written in Microsoft Word and then copy-pasted, producing terrible HTML that we as developers of editors have to clean up, and, you know, everybody's in pain. We were asking users why they are doing this. Why are they using Microsoft Word for writing their web content instead of using the collaboration tools, the workflow tools, all the stuff that the CMS can provide? There were two clear reasons.

One of them was that traditional CMSs give you very little space for your content. If you look at, let's say, the current editing interface in Drupal 7, the actual body of the page you're editing is this tiny, tiny, tiny box in a big form, looking like Microsoft Word 97 slapped into the middle of a web page. Sure, you can full-screen it, but nobody actually remembers to do this. So people were feeling very constrained by these small spaces we give them. The approach taken by both Spark and Create is that, because the content edited is the one shown on the page, you will have as much space as you need.

And the other one was: how many of you have ever written something to the web and
lost it? Your browser crashed, your session expired, the server crashed, whatever. Okay, about half of you. So the other half either doesn't write anything or is in deep denial.

So what we decided was: we will never lose your content. What we do in Create is, every character you press, every formatting change you do, all of that we store in your browser's local storage. And if anything bad happens, whenever you come back to the page you will get this nice friendly dialog saying, okay, these and these of your content items have local modifications, would you like to restore them? You click restore, and your changes are back on the page as they were. So you can always safely go back, whatever bad happens.

Now, there are a few CMSs that have gone this way. First of all, there's the Symfony CMF effort, where the idea is to build a generic set of content management tools on top of the Symfony framework, correct? Their reference user interface is based on Create. Now, if you look at this picture, it looks actually quite different from the previous screenshots, right? The reason for this is that, just like Symfony, Create is actually just a set of widgets and a base library for building your own editing tools. We provide a default UX, a default way for Create to look and behave, but you can actually go and configure it in other ways. And Spark is awesome, and I hope this is something where we can eventually collaborate. Wim is smiling, so maybe that will happen.

Anyway, another one, again looking quite different but based on exactly the same code, is OpenCms. This is just there to show that these tools are really back-end independent. It doesn't matter whether your stuff is built in Symfony or Midgard or Drupal, or Java like OpenCms: you can integrate the same content editing tools there. This screenshot is still running TinyMCE.
They are switching to Hallo pretty soon. And then of course there is the CMS that some of you may have heard of. There is an effort to do an integration module for Create.js in Drupal, and there are a couple of screenshots from it. It's still in pretty heavy development, but you can see you can tag stuff, you can add images, all this normal stuff is sort of there. And if you're interested, it's available on drupal.org. Ronny, would you stand up? Here's the developer. So if you're interested in seeing where you can take this, talk to this guy over beer. He likes that. Yeah, so he's from Druid in Finland. They are doing pretty good work here.

So how does this actually work? It's pretty simple. There is one set of APIs we use between the web framework, in this case Drupal, and the editing tool, in this case Create.js. First of all, the web framework publishes web pages, surprisingly, annotated with some RDFa. That's what we use in Create. In Spark they use some data attributes, but this is probably another area where we will try to work together. Then we edit stuff on the JavaScript side of things, and then we use RESTful JSON-LD calls to actually get the content back to the server.

So how many of you know RDFa? Most, okay, so I'll walk through this quickly. Basically the problem here is HTML itself. You can understand what's being talked about here, but a machine can't. To be able to edit things, we need a way for our JavaScript code to agree with your Drupal code about how this content works, what the content is that you edit, and so forth. Otherwise we couldn't save it back. So we add the annotations there, and then JavaScript can understand it. For this we use the VIE library, an MIT-licensed JavaScript library for dealing with RDFa, with JSON-LD, and so forth. Here's a quick example. We parse the RDFa from the whole document, and then we get this book from the parsed set of entities, and you suddenly have normal attributes.
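The slide code is not part of this transcript. As a rough illustration of the step just described (annotated HTML goes in, entities with plain attributes come out), here is a minimal Python sketch. It is not VIE's API or implementation: the class `RdfaLiteParser` and the function `parse_rdfa` are hypothetical names, and only a simplified subset of RDFa (`about`, `typeof`, `property` with text values) is handled.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are not tracked on the stack.
VOID_TAGS = {"img", "br", "hr", "meta", "link", "input"}


class RdfaLiteParser(HTMLParser):
    """Collects entities from a simplified RDFa subset (about/typeof/property)."""

    def __init__(self):
        super().__init__()
        self.entities = {}  # subject -> {"@type": ..., property name: text}
        self._stack = []    # one (subject, property) pair per open element

    def _current_subject(self):
        return self._stack[-1][0] if self._stack else None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # An element without its own "about" inherits the enclosing subject.
        subject = attrs.get("about") or self._current_subject()
        prop = None
        if subject is not None:
            entity = self.entities.setdefault(subject, {})
            if "typeof" in attrs:
                entity["@type"] = attrs["typeof"]
            prop = attrs.get("property")
            if prop:
                entity.setdefault(prop, "")
        if tag not in VOID_TAGS:
            self._stack.append((subject, prop))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Text belongs to the innermost open element that carries a property.
        for subject, prop in reversed(self._stack):
            if prop:
                self.entities[subject][prop] += data
                break


def parse_rdfa(html):
    """Return {subject: {"@type": ..., property: text}} for the given markup."""
    parser = RdfaLiteParser()
    parser.feed(html)
    parser.close()
    return parser.entities
```

With markup such as a `div` carrying `about="/books/1"` and a nested `h1` carrying `property="schema:name"`, the book's name becomes an ordinary dictionary value, which is the "normal attributes" effect the talk describes.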
So essentially we're providing this MVC layer on the client side, on top of the content that your CMS produces. And of course you get bonuses like SEO. Suddenly Google can also understand your content, if you want them to.

JSON-LD is probably a bit less known. How many of you? Very good. So it's a way of serializing graph-based data into JSON format. There are many ways to serialize this kind of data, but the benefit here is that JSON-LD is reasonably clean. Apart from the context, this would pretty much be the same JSON that you would write if you were defining your APIs by hand. So it's still readable, not scary, easy. Laughing. Okay.

The good part of the story for Drupal is that JSON-LD is apparently going to be pretty well supported in the future. I don't know if the story has changed much since this was written. Okay, so yes, Drupal 8 will have JSON-LD. So, fingers crossed. Okay, so Drupal is pretty much ready for this approach. You already have RDFa. There are some issues that need to be resolved, I've been told. And then there's JSON-LD. So you already have everything that you would need to run Create.

VIE is the base library there. This is especially the part that I would like to see in Spark, because that would give us a common client-side API for managing the content. It deals with the entities, it deals with type information, it can read and write RDFa in your DOM, and it can talk to various semantic services out there, like DBpedia or Stanbol, to enrich your content.

And generally for PHP there are various libraries for this. One that I've been talking about today quite a bit is CreatePHP, which is a PHP library that is meant to be used for integrating Create into various PHP-based systems. It was originally made for Midgard, and it's now also being used in Symfony CMF. And I think I saw some commits from the TYPO3 guys. They were also adding this library to their integration. So that's an easy way to do this if you're doing PHP.
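To make the JSON-LD point from this part of the talk concrete (apart from the context, it is the JSON you would write by hand), here is a small Python sketch. The helper names `to_jsonld` and `from_jsonld` are hypothetical, the schema.org vocabulary is an assumed example, and nothing here reflects how Create or Drupal actually build their payloads.

```python
import json

RESERVED_KEYS = {"@context", "@id", "@type"}


def to_jsonld(subject, entity_type, properties, vocab="http://schema.org/"):
    """Wrap plain properties in a JSON-LD document.

    Only @context, @id and @type are added; everything else is the same
    key/value data a hand-written JSON API would carry.
    """
    doc = {"@context": {"@vocab": vocab}, "@id": subject, "@type": entity_type}
    doc.update(properties)
    return doc


def from_jsonld(doc):
    """Split a flat JSON-LD document back into (subject, type, properties)."""
    properties = {k: v for k, v in doc.items() if k not in RESERVED_KEYS}
    return doc["@id"], doc["@type"], properties


if __name__ == "__main__":
    payload = to_jsonld("/books/1", "Book", {"name": "Decoupled Content"})
    # This string is what an editing tool could send to the server over HTTP.
    print(json.dumps(payload, indent=2))
```

A server receiving such a document can read `@id` and `@type` to decide which content item and content type are meant, and treat the remaining keys as field values.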
Then there's the Drupal module already, to which, as Ronny said, contributions are very welcome. And then for Symfony there's the LiipVieBundle. Or is it already renamed? Okay, so yeah, eventually it will be called LiipCreateBundle.

Then the other part of the story. So now we've pretty much solved how to extract the content editing tools from your CMS. The other part is how to extract the data storage out of your CMS, so that you can actually have this clean three-level architecture. For this there is the PHPCR effort. Before I start talking about that, I would like to ask Lukas to stand up, because he's the, let's say, father of this effort.

So PHPCR is a standard API for different CMSs to use to store and retrieve content. The original concepts and APIs come from the Java world, where there's been a standard for this since the early 2000s, which most of the big enterprise CMSs, I think, already support. And now, as far as I understand, the PHP part is also being incorporated into the standard. Yeah, so in the next version of the JCR standard there will also be the PHP definitions. So this is not just a set of libraries that PHP people agreed about by themselves, but an actual, real written standard, which is good. So what is PHPCR, first of all?
It's a collection of interfaces and, more importantly, it's also a pretty big collection of unit tests. The reason for this is that there are different implementations of PHPCR. You will be able to use the exact same PHP API to talk with various different ways of storing and managing content.

So for instance there is Jackalope, which can talk to the Apache Jackrabbit server, which is the Java reference implementation of JCR. It also has a Doctrine DBAL implementation, which is a pure PHP implementation that can store to, I think, just relational databases. So with this you can keep your content in MySQL or SQLite or whatever, without having to have any Java or extension or other dependencies. Then there's Midgard2, which is the implementation I've been behind. So you can also store things in the Midgard content repository, where the downside is that you will need a PHP extension, and the upside is that the same content repository APIs and the same data are available for various other languages, like Python or JavaScript or Java, which helps if you need to integrate with external tools. There's probably a couple of other implementations at work, but these are the ones that I'm mostly aware of.

So yeah, the idea is that you as a CMS developer write against a single API for storing, querying, searching content. And then when it's time to deploy your software to a client, you can choose whichever PHPCR implementation you want to use. Maybe some client wants to keep their stuff in the file system, or some other wants to keep it in Oracle or MySQL or MongoDB or whatever. You only have to choose this at deployment time, because all the APIs are exactly the same. Your code doesn't need to change, just the configuration.

So what can a content repository do for you? I mean, ORMs sort of already provide an API that can abstract different storage methods.
Well, think of a content repository as an ORM on steroids. There's quite a lot of stuff there.

First of all, the content is organized in a big tree. You can traverse the tree, you can add stuff into the different branches of the tree. This is very similar to how a file system works, or how most typical websites work. You can access data by their unique IDs. You have workspaces, which means storing, for instance, draft content for users becomes very simple. You can give a user a workspace of their own, they can handle their content there, do their changes, and then you can move them to the main workspace when it's time to publish. Versioning, another very important thing for content management: handled. Multi-valued properties (a person may have multiple email addresses or phone numbers or whatever): supported.

There are different query languages. There's a query object model, which is a pretty typical query builder. You also have this SQL2 language, which you can use to write string-based queries. There's XPath, but I think that's actually dropped out of the spec now. Yeah.

And then another very useful thing is that there is a standard XML format for importing and exporting content. So if one of these content repository implementations that you deploy the site with doesn't scale enough, or isn't reliable, or whatever, you can always take your content out of it and switch to another one. Permissions are there, so again stuff you don't need to take care of. Capability discovery is important, because some systems don't support all of the features of PHPCR. So you can always check in your code: is versioning supported, is this and this supported, and so forth.

So the base idea here is that doing simple stuff, like retrieving an object from the database and showing it, or updating a property there and saving it, is very easy,
That's very easy but then if you need more you need full-text search you need versioning That's still possible with the same API So this is essentially how it looks like You take a repository you log to log into it. So this is a bit like providing your mysql username and password to connect to the database you get the session out of that Out with the session you can access the different workspaces And then you can access the nodes and their properties through the API All of the content is stored in a tree of nodes This may be somewhat familiar to troopers The nodes have a name they have a type and then they have properties the types can provide various kinds of constraints They can say okay in this node you can have whatever or they can say okay This node will have a first name and last name which are strings So this is the way you can actually define your content structure and then they have can have child nodes of various other types So how does it actually work? Here's an example with check rabbit When I was making these slides the pure PHP implementation wasn't quite there yet. So That's why I'm showing the one where the content is actually stored with this Java service But you know the code is exactly the same. So first of all you give the URL or the Database details or whatever of your repository Then you get the repository instance from this repository factory You present your credentials log in and you get the session after that the whole PHP CR API is available for you and That's really the only repository specific part here The different configurations you provide so with check rabbit you provide just this URL With midgard you provide some other information like where should be the file attachments stored What's the database and so forth? So once you have the session you can actually start working with your content So tree in PHP CR always have a root has a root node. 
So that's where you usually start from. You get the root node. Then in this case we check if there is a sub-node called "example" under the root. If not, then we create it. We set the properties, then we retrieve it and we get the value. So it's a pretty simple way of working with stuff. And there's always this session save. You can do as many operations with your content as you want, and then save all of them together with this save method. So you can bundle up all kinds of database operations together.

So, yeah, a little bit about the node types. This is something you can define yourself. This is probably very familiar, again, to Drupal developers, because you actually have a pretty good way of defining different content types in your CMS. So node types define what kind of properties nodes can have and what kind of children they can have. There are some built-in types, like file, which is useful if you're, let's say, uploading images to your CMS, and unstructured, which is essentially whatever: it can have anything as properties and anything as children. And yeah, then you can define your own type.
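The PHP slide itself is not in the transcript. The workflow described above (start at the root node, check for a child called "example", create it if missing, set a property, then save everything in one go) can be sketched with a toy in-memory model in Python. The `Node` and `Session` classes below are hypothetical stand-ins written for illustration, not the PHPCR API; the `nt:unstructured` type name is the built-in "unstructured" type the talk refers to.

```python
import copy


class Node:
    """A named tree node with a type, properties and child nodes."""

    def __init__(self, name, node_type="nt:unstructured"):
        self.name = name
        self.node_type = node_type
        self.properties = {}
        self.children = {}

    def has_node(self, name):
        return name in self.children

    def add_node(self, name, node_type="nt:unstructured"):
        if name in self.children:
            raise KeyError(f"child {name!r} already exists")
        child = Node(name, node_type)
        self.children[name] = child
        return child

    def get_node(self, name):
        return self.children[name]

    def set_property(self, name, value):
        self.properties[name] = value

    def get_property(self, name):
        return self.properties[name]


class Session:
    """Keeps unsaved changes separate from the stored tree until save()."""

    def __init__(self):
        self._persisted = Node("")  # stands in for the backing store
        self._working = copy.deepcopy(self._persisted)

    def get_root_node(self):
        return self._working

    def save(self):
        # All pending operations are written together, as one bundle.
        self._persisted = copy.deepcopy(self._working)

    def refresh(self):
        # Throw away anything that was not saved.
        self._working = copy.deepcopy(self._persisted)


if __name__ == "__main__":
    session = Session()
    root = session.get_root_node()
    if not root.has_node("example"):
        root.add_node("example")
    node = root.get_node("example")
    node.set_property("title", "Hello")
    session.save()
    print(node.get_property("title"))
```

The point of the sketch is the batching: nothing reaches the store until `save()` is called, so several node and property operations can be committed together.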
So you could say, okay, a person has these and these properties. But the typical recommendation is: before you go too far with defining your content types and stuff, just build your content with the unstructured type. You can always add the structure later. Because often when you're building a web application, you don't really know exactly what data you're going to have in the end. So in PHPCR you can actually do this. You can start by just saying, okay, I have this big bucket of unstructured information, just like you would do with most of these NoSQL systems, like, let's say, CouchDB or, I guess, MongoDB. You just stuff data into your nodes, and then later on, when you know, okay, our articles are structured like this, you can go and say, okay, change the type of these nodes to that type. And then you get all these validations and other things.

Querying. Here's an example of the SQL2 language. You get the query instance, you create the query, you execute it, and you get an iterable list of nodes. Very simple. I guess if you've ever written SQL it looks pretty familiar. And of course you can do the same with the query builder, to have a bit more object-oriented approach.

Versioning. I mean, this is something that is traditionally very difficult for CMSs to have, but if you're using PHPCR, it's just there. First of all you check if the repository you're using supports versioning. Maybe in a CMS you don't want to do this kind of thing, and you'll just skip this method or whatever. And then to a particular node you add a mixin called versionable. After that, that particular piece of content can be versioned. When you have versioning enabled, you can change the properties of an object and you can do these commits whenever you feel like. So maybe you are doing that automatically for the user, or maybe you do that when they actually save something, or however you need to do this.
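As an illustration of the versioning flow just described (check the capability, add the versionable mixin, commit whenever you like, restore later), here is a toy Python sketch. `Repository` and `VersionedNode` are hypothetical classes made up for this example and do not mirror the PHPCR interfaces; `mix:versionable` is the JCR name of the mixin mentioned in the talk.

```python
import copy


class Repository:
    """Advertises which optional features this storage backend supports."""

    def __init__(self, supports_versioning=True):
        self.capabilities = {"versioning": supports_versioning}


class VersionedNode:
    """A node whose property snapshots can be committed and restored."""

    def __init__(self, repository):
        self._repository = repository
        self.properties = {}
        self.mixins = set()
        self._versions = []

    def add_mixin(self, name):
        # Capability discovery: refuse the mixin if the backend lacks support.
        if name == "mix:versionable" and not self._repository.capabilities["versioning"]:
            raise RuntimeError("this repository does not support versioning")
        self.mixins.add(name)

    def commit(self):
        """Store a snapshot and return its 1-based version number."""
        if "mix:versionable" not in self.mixins:
            raise RuntimeError("node is not versionable")
        self._versions.append(copy.deepcopy(self.properties))
        return len(self._versions)

    def versions(self):
        return list(range(1, len(self._versions) + 1))

    def restore(self, number):
        self.properties = copy.deepcopy(self._versions[number - 1])


if __name__ == "__main__":
    article = VersionedNode(Repository())
    article.add_mixin("mix:versionable")
    article.properties["title"] = "Draft"
    first = article.commit()
    article.properties["title"] = "Final"
    article.commit()
    article.restore(first)
    print(article.properties["title"])
```

When and how often `commit()` is called is left to the CMS, which matches the talk: automatically on every change, on explicit save, or on some other trigger.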
But anyway, you just commit whenever you want to create a new version, and then you can walk through the list of versions and restore whatever you want. And this works with every type of content. This works with files, this works with objects that have structured data or unstructured data. So you can even version the images that users upload, if you want.

Yeah, I was talking about the export and import. So again, you have a very simple way of saying, okay, export the content starting from this part of the tree as an XML file, dump it into the file system or serve it to the user. And then again you can import it back. You can also do this to copy content within the same repository, so this can also be a nice thing to have in a CMS. You know, the XML is pretty verbose, but the good thing is it's standard. So the same XML format you can use to migrate your data from Midgard to Jackrabbit to this pure PHP implementation to whatever PHPCR provider, or even just to a JCR system. So if you have a client that is actually using some of these big enterprise CMSs that support JCR, you can use this same interface for getting content out of that Java system and putting it into your CMS.

So essentially this is the picture I would like to see when I come to Drupal a couple of years from now. I would like to see Drupal being a decoupled CMS. So there's the layer in between, which is actually Drupal, which provides all the smarts of a CMS. Underneath there would be this standard content repository API. And above, I mean, Create is there just as an example. In an optimal situation you would have multiple different editing interfaces you can choose from, just like you have different PHPCR providers you can choose from. I don't know if we are actually getting there, but I'd say Drupal is definitely making the right kind of moves. The Symfony stuff happening in Drupal 8 is great, as is the JSON-LD stuff.
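The export and import idea from earlier in this part of the talk (dump a subtree as XML, load it again elsewhere) can be illustrated with a short Python sketch. The XML layout used here, with `node` and `property` elements, is a simplified invention for illustration and is not the standard JCR export format; the function names are hypothetical, and property values are treated as plain strings.

```python
import xml.etree.ElementTree as ET


def export_tree(node):
    """Turn a {"name", "properties", "children"} dict into an XML element."""
    element = ET.Element("node", {"name": node["name"]})
    for key, value in node.get("properties", {}).items():
        prop = ET.SubElement(element, "property", {"name": key})
        prop.text = str(value)
    for child in node.get("children", []):
        element.append(export_tree(child))
    return element


def import_tree(element):
    """Rebuild the dict structure from an exported XML element."""
    return {
        "name": element.get("name"),
        "properties": {p.get("name"): (p.text or "") for p in element.findall("property")},
        "children": [import_tree(child) for child in element.findall("node")],
    }


def dumps(node):
    return ET.tostring(export_tree(node), encoding="unicode")


def loads(xml_text):
    return import_tree(ET.fromstring(xml_text))


if __name__ == "__main__":
    site = {
        "name": "site",
        "properties": {"title": "Example"},
        "children": [
            {"name": "news", "properties": {"title": "News"}, "children": []},
        ],
    }
    # The same text could be imported into a different storage backend.
    print(dumps(site))
```

Because export can start from any node, the same two functions also cover copying a subtree within one repository: export the subtree and import it under a different parent.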
I think Spark is very promising for having a decoupled, or at least partially decoupled, editing layer. And Lukas told me earlier that there's also been some discussion of moving towards PHPCR, maybe in Drupal 9 or whatever. That's probably going to be a bit of a longer route, because that obviously means changes in how Drupal actually stores and retrieves content.

So, you know, decoupling gives you flexibility when you're building your stuff. It gives you flexibility when you're deploying. But it also means something new. It means collaboration between different CMS projects. You've probably never seen these logos together on a slide in this context. You know, we used to hate each other, right? Now suddenly all of these systems can utilize the same JavaScript tools, the same user interface widgets, the same storage logic, the same PHP libraries, thanks to things like Symfony and Composer. So we can finally share code. We can stop reinventing every wheel every time we build a CMS.

And this means we can actually share resources, because building an editing interface, as I'm sure the good people here are realizing, is quite a lot of work, especially as most CMS projects don't have that many JavaScript developers. But put together, these different CMSs do have enough people to get big, complex things built if needed. Most of these projects here do have at least some part of the decoupling story already there. Maybe there's the user interface, maybe there's the PHPCR stuff, or maybe there's both in the best case. And that means we can finally share code with each other, and then, you know, Drupal can stay focused on the things that make Drupal Drupal, and TYPO3 can do the same, Midgard can do the same. We don't have to spend so much time on the drudgery on the low end.

So before we go for questions, we do have a surprise demo.
So scor wanted to show some stuff using some of our libraries, and JSON-LD working in Drupal already right now. Can we get the mic on? The change of the display? Is anybody awake there?

So Henri asked me to come on stage and present a very early prototype of what I started to build, which is an integration between VIE and Drupal. What I want to show you is this: I'm starting from a Drupal 7 site, an empty site that has just the couple of content types that you get by default in Drupal, and it also has no content, so it's a purely empty site. And I have a VIE form generator loaded here. The way this works is that it pulls the schema from schema.org and it builds the form on the fly, based on the schema that it retrieves from schema.org.

Here we are going to add a restaurant to our Drupal site, which is empty right now. I've pre-filled some values here. I'm going to turn on Firebug to see what's going on behind the scenes, and I'm going to save. Fingers crossed. You see the request, and there you go: saved. So now we're going to switch back to Drupal, turn off Firebug here, reload the list of content types, and here it tells us that it imported a template from schema.org, the restaurant template. It also has a new content type, restaurant, and it has all the fields from schema.org. And if I reload this content page, it also got the restaurant that I imported, with the values that I filled in. So that's all done via RESTful web services.

And I can also do the same with another one here. Save. That was faster, because the content type was already there. And if you look at the POST, it returns you the new resource that was created. So if you go here: we had one restaurant, and now we should have two. There you go, the second one is here.

So the way this works is that we have VIE, as Henri explained, with a DOM tree here with RDFa in it, and it maintains the JSON-LD model client side, and that's the data that we send to Drupal.
So the JSON-LD is the data that's sent to Drupal via Ajax. Drupal gets that JSON-LD document, transforms it to match the schema it has locally, and then creates a new resource. As a side effect, you also already have some RDFa for the imported content that you just created. That's just because we have the schema information shared across syntaxes, and RDFa is just a different example here. And yeah, that's pretty much what I wanted to show.

Cool, thanks. Yeah, so I guess we do have some time for questions. So who wants to go first? Yes, please.

Okay, so the question was whether you can have multiple trees in PHPCR referencing each other, with the same nodes used in different trees, right? Yes is the answer. One PHPCR workspace has one tree, but obviously you can organize this as subtrees if you want to use one workspace, and you can have references between nodes. So for example in Symfony CMF you can have the navigation, the routing, and the content as one tree if you want. That's the simple setup. Or you can use the more flexible setup of having separate trees for navigation, routing, and content, and they just reference each other.

Anything else? Yes.
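The more flexible setup from the PHPCR answer above, with separate subtrees for navigation, routing, and content that reference each other, could be modelled roughly as follows. This is a toy in-memory sketch and not the PHPCR API: the `Node` and `Workspace` classes, the subtree names, and the `content` property are all hypothetical names chosen for illustration.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A tree node with properties, children, and a stable identifier."""
    name: str
    properties: dict = field(default_factory=dict)
    children: dict = field(default_factory=dict)
    identifier: str = field(default_factory=lambda: str(uuid.uuid4()))

    def add(self, child: "Node") -> "Node":
        self.children[child.name] = child
        return child


class Workspace:
    """One workspace holds exactly one tree, rooted at `root`."""

    def __init__(self) -> None:
        self.root = Node("")

    def get(self, path: str) -> Node:
        """Look a node up by its absolute path, e.g. /routes/about-us."""
        node = self.root
        for part in path.strip("/").split("/"):
            if part:
                node = node.children[part]
        return node

    def find_by_id(self, identifier: str, node: Optional[Node] = None) -> Optional[Node]:
        """Resolve a reference by walking the tree for a matching identifier."""
        if node is None:
            node = self.root
        if node.identifier == identifier:
            return node
        for child in node.children.values():
            found = self.find_by_id(identifier, child)
            if found is not None:
                return found
        return None


ws = Workspace()
content = ws.root.add(Node("content"))
routes = ws.root.add(Node("routes"))
menu = ws.root.add(Node("menu"))

# The article lives once, in the content subtree.
article = content.add(Node("about", {"title": "About us"}))

# The route and the menu item only hold a reference to it.
routes.add(Node("about-us", {"content": article.identifier}))
menu.add(Node("about", {"label": "About", "content": article.identifier}))

route = ws.get("/routes/about-us")
target = ws.find_by_id(route.properties["content"])
print(target.properties["title"])
```

Because the route and the menu item store only an identifier, the article can be renamed or moved inside the content subtree without either of them needing to change.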
So the question was: what is the content like if you're editing a block of HTML, like the contents of an article for instance? What's the format of what is being sent to the server? By default it is HTML as we get it from the editor. Typically these editors do some cleanup on that before it's sent, but you can actually plug in some other JavaScript code there. For instance, we do have an example of Create actually producing Markdown. So this maybe shows you that there is quite a lot of flexibility there.

Yes? Yeah, so the question was about validation. Obviously you will want to do validation on both sides of the equation. You cannot just trust that the JavaScript side does things correctly, so you want to validate things on the server. But to be more user friendly, you will also want to do some validation on the client side, right? First of all, for the big stuff, like the consistency of your content, a person having the right fields of the right types and so forth, this is something that PHPCR does for you. Or you do it the way your CMS traditionally does it. You can also tell Create about these same constraints, and in the near future Create will be able to do the same validations on the client side. My plan is to also enable you to do per-property validation callbacks in JavaScript. That's probably something I'll do in two weeks when I meet the TYPO3 people, because they did have some concerns in this area. So that's the plan. And I mean, some validations are easy; it's very easy to check if something is an email field, for instance. Does this answer the question? Okay, thanks.

Next, there. So, to shorten it, the question was whether the data model and the views that Create uses are available for other Backbone applications.
The answer is yes. The way VIE works is that every piece of content that you see in Create, for instance, is its own Backbone model instance, any relations between content items are collections, and the actual RDFa in the DOM is a view. You can get access to all of these through the VIE instance, so if you're doing more Backbone stuff on your website, you have access to all of that.

Yeah, so the question was how to do these templated pieces of content, standard ones like, let's say, an image with a caption or a YouTube video or whatever. The answer is that this is pretty much something the editors handle for you. Aloha has the blocks feature; Hallo has a little bit less functionality in that area. We were chatting about this a bit earlier, and one thing that sort of makes it easier is that with Create you have a way of easily refetching content from the server. So you could say, okay, now I want this article without the filters applied to it, so I want the raw contents with the tags as they are, so that then