Thank you. Thank you. Nice to see you all here on the morning of the second day, ready to continue. So let's talk Gutenberg and collaboration, these two things. Do we have the presentation already, somewhere? Okay, we have to wait, but you can watch me in the meantime. I will maybe start, since we have no slides yet.

So why am I actually talking about this? There are two main reasons. First of all, I'm a huge fan of Gutenberg, the project, from day one that it was created. Second of all, I was working as a JavaScript technical lead at CKSource, where I was heavily exposed to collaboration and collaborative editing in the web browser, in JavaScript.

Oh, yeah, we have this. So it's me. I'm David [surname unclear], a full-stack web developer. I was a developer for many, many years, and a tech lead. I'm a huge fan of Gutenberg, React, TypeScript, JavaScript, everything web-development related. And as I already told you, I love Gutenberg, and I worked as a JavaScript technical lead on a collaborative editing experience for the web.

Also, the long-term roadmap for Gutenberg states four phases. The first is easier editing, which is the Gutenberg editor itself. The second one is customization; it includes, for example, block patterns, block themes, the block directory and full site editing. I had to check this out. The third phase is collaboration.
We will be just touching on this a bit today. And phase four, my personal favorite, is native multilingual support in WordPress and Gutenberg. Phase two is still in progress; if you want to check what still needs to be done, you can check this issue on GitHub in the Gutenberg repository. And, kind of unrelated, one of my favorite features in phase two of Gutenberg is this command center feature, which is basically like Spotlight for Gutenberg and WordPress, so you can add stuff and do stuff quickly, eventually even in the WordPress admin, just by hitting a shortcut and doing what you want.

So let's start with the basics. Collaboration: what actually is collaboration? Collaboration is when multiple people work together to achieve one goal. And what is the goal in the context of Gutenberg? The goal is to create or produce content. In bigger companies, producing content could look like this. We have the editing part, where multiple editors are creating and editing the content. Then they may optionally pass it to another stage, like correction, and then maybe compliance, to the legal team, if we are working on, for example, a terms of service page. And then maybe we have someone who can approve the changes, and once all of this is done, we have a ready-to-publish page. So we can say that all of these people have collaborated to create this publishable version of the page.

Real-time collaboration, on the other hand, is just an extension of collaboration. It is the same thing, but it allows you to do this simultaneously. In our case, a great place to put real-time collaboration is the editing part. So instead of people passing the document to each other so they can apply more changes, they can just do it simultaneously, which would be great, like Google Docs, wouldn't it?
Basically, the idea is to get rid of this post-lock model. Instead, we want to allow everyone in to enjoy editing the content at the same time. Still, there is no real-time collaboration without collaboration, so it would be great to have these collaboration flows in Gutenberg before going full real-time collaboration.

Now let's get into some specifics. Solutions for collaboration are usually network-layer agnostic. But when you try to implement this in a real product, which Gutenberg is, you probably just stick to one of the connection methods. We could use short or long polling with traditional HTTP requests, similar to what the Heartbeat API is doing in WordPress. We can open a WebSocket to the server. Or we can use WebRTC to connect people in a peer-to-peer network.

Speaking of peer-to-peer, should we use a server or a peer-to-peer network? This illustrates the difference. In peer-to-peer, obviously, we connect everyone directly to each other. This has an advantage, because there is no server in the middle. But you might already be able to tell that adding synchronization to such a decentralized network is probably a bit more difficult than normal. On the other hand, we can have a central server in the middle, which really reduces the problem of synchronization, because we have one central server that is a single source of truth. But now we have a server, so we also have to ask other questions, like where this server should live. Should it live alongside the website?
If yes, it probably has to be written in PHP, and, trust me, it is not an easy task to create a WebSocket server with PHP. If not alongside the website, then where? Maybe an external server, or a network of geolocated servers hosted by someone like Automattic or WordPress.org. But then we enter into data privacy issues, because all of the data is flowing through those external servers, so we probably need some encryption. Some advantages again, some disadvantages.

If we are working in a system where multiple machines are connected across the network, it's unavoidable that we stumble across concurrency. So let's look at this very simple diagram. We have two timelines and two nodes. Node A sends some information to node B, and node B sends some information to node A, and then each receives the other's. When we look at this from the perspective of an external actor, us, we know the absolute order of all of these events, the total order, because we see both of the timelines. We know that A1 happened first, B1 happened second, A1 prime happened third, and the last one is B1 prime.

But let's look only from the perspective of one node, node B, and let's think only based on the information he has. Can we figure out the whole order? As you can probably tell since I'm asking, the answer is no. And it is no because, from node B's perspective, what is the difference, for him, between this and this? None. Node B's timeline didn't even move. It's the same for him, and this is all the information he has, so going from this to this changes nothing for him. So what we are saying is that A1 and B1, these changes, are incomparable, and therefore they are concurrent.
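One common way to make "concurrent" precise is a vector clock, where every node keeps a counter per node it has heard from. The talk does not go into this mechanism, so treat the following TypeScript as an illustrative sketch under that assumption; the names `VectorClock` and `compare` are made up for the example.

```typescript
// Assumption: vector clocks are one standard way to detect concurrency;
// the talk does not say which mechanism any particular editor uses.
type VectorClock = Record<string, number>;
type Order = "before" | "after" | "equal" | "concurrent";

// Compare two events by the counters each one has seen.
function compare(a: VectorClock, b: VectorClock): Order {
  const nodes = Object.keys(a).concat(Object.keys(b));
  let aAhead = false;
  let bAhead = false;
  for (const node of nodes) {
    const x = a[node] ?? 0;
    const y = b[node] ?? 0;
    if (x > y) aAhead = true;
    if (x < y) bAhead = true;
  }
  // Each side knows something the other does not: neither happened first.
  if (aAhead && bAhead) return "concurrent";
  if (aAhead) return "after";
  if (bAhead) return "before";
  return "equal";
}

// A1 happens on node A and B1 on node B before either has heard from the other.
const a1: VectorClock = { A: 1 };
const b1: VectorClock = { B: 1 };
// A later event on node B, created after node B received A1.
const b2: VectorClock = { A: 1, B: 2 };

const a1VsB1 = compare(a1, b1); // "concurrent"
const a1VsB2 = compare(a1, b2); // "before"
```

The point of the sketch is only that A1 and B1 cannot be ordered from the information the nodes hold, while an event created after receiving A1 can.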
So therefore they are concurrent So at this moment, you might say well just use timestamps just send it when this happened and I I don't have time to enter into these timestamps and clocks in general but If if you want to like and get into this stuff ask me about this on the in the QA session But trust me if I'm saying yeah, if you have multiple machines You cannot rely on the physical clock. They are operating with Also, there is this misconception that you can just avoid concurrency altogether by just using a server And this is simply not true Let's see the same exact example we had before and what we just do let's just change the labels Let's say the top one is Jane the editor and the bottom one is the server So as you can see we have still the same situation. We have still have concurrency. The only difference is that it's between the client and the server and If you think about it for a second, it's great because we still again reduced the Concurrency issue to only two actors and You might also think how is it even possible that the server has some concurrent changes to James? Well, it could receive it from another node that is also connected to the server and as a byproduct We also have a single source of truth on the server about the total order of events Because for the server the order is basically whenever I received the event is the order of the operations And this will be important later For the comparison, this is the decentralized network example. We are sending all the information to all the nodes We have only clients. There's no server now But and the sending part by the way, it looks complicated But it's it can be simply handled by WebRTC for example So it's not an issue but the problem is the ordering of these events How do you can ensure that all of the clients all of the nodes have the same content in the end? 
It's a difficult problem to solve, but we'll get to this.

So handling concurrent editing should ideally have two main properties. The first is strong eventual consistency. That basically means that at some point in the future, all of the nodes that are connected end up with the same content. It's as simple as that. The second one is a bit more tricky to implement, which is intent preservation. What we are saying is that when someone edits the content and someone else edits the content, and we merge those changes, we don't want to just merge them any which way; we want to have some intent preservation. Say we have two words. We don't want their letters to interleave, for example, just creating gibberish. We want to have two words, one after the other. Even if they are in the wrong order, that's still fine, because it's easily fixable. But if the letters interleave, there's no intent preservation, and it's obviously bad.

So are there any solutions already created by people? There are. The two types of solutions are OT and CRDT. If you don't know what they are, bear with me; I will explain in a moment.

OT stands for operational transformation, and it's a super, super old set of algorithms, ideas, software and all the stuff around it. Most notably, it's being used by Google Docs, for example, and CKEditor 5. So how does this actually work? Let's look at this diagram and imagine both of these nodes.
They have some content in the editor: A, B, C and D. Now node 1 wants to insert X between C and D, and node 2 wants to remove B. So now they end up with the content A, B, C, X and D on the left, and A, C, D on the right. Now they exchange information about what they have done. The first node says "insert X at index 3" and the second node says "delete B at index 1".

Now, when we receive this operation, we pass it to a transformation function, and this function is clever enough to know: oh, B is still at index 1, nothing has changed it. So the transformation does nothing, the operation is not transformed, and we end up applying "delete B at index 1" as is. The result is A, C, X and D. On the other side, we have the exact same transformation function, but now it's clever enough to know: oh, something has changed, and we have to account for this change in our operation. So we transform it and say that instead of inserting X at index 3, we now have to insert X at index 2, which again ends up with A, C, X and D.

There are multiple algorithms created for these OT solutions. Truth be told, as you can see, OT is very old, because it even has some stuff from the '80s. But most of these have been proven to be just straight wrong for some situations, and the only two I know to be correct are the ones highlighted on the slide. They both use a server in the middle, and the Jupiter one is used by Google Docs.

So I already told you that order matters, but what if all operations are
commutative? What does that mean? In mathematics it's very simple to explain: commutative means that no matter in which order you apply the operations, the result is the same. So we can shuffle the operations and we get the same result.

So let's see if our operational transformation is commutative. Imagine we are creating a title for this whole conference, and we start with the word "WordCamp" in the editor. I want to insert "Hello" at the beginning and you want to insert "rules" at the end. We represent these changes as operations and then exchange them. Now let's apply the first one: insert "Hello" at index 0. Index 0 is here at the beginning, we insert it, "Hello WordCamp rules", all good. And now let's do the opposite one, and you probably see where this is going: index 8 is now between the "o" and the "r", and we end up with "rules" stuck in the middle of "WordCamp", which is basically wrong. The two contents have diverged, which is basically a critical error of the system, and they will never get back; they will never converge.

So what we are saying is that these two operations are not commutative. We cannot apply them in any order, because the result will be different, and that's the main reason why we need these transformation functions in the first place.

And just to say something about these transformation functions: this is the main criticism of OT, because those transformation functions are very complex. They are full of magic and indirection, because what they do is transform operations against other operations. So you often have chains: transform this operation, then transform that operation, then transform this one, then that one. We end up with very long chains, and nobody is able to debug or even understand the code that creates these transformations, or to foresee the result of this code, which is pretty bad.

If they were commutative, it would be more like this: we start with the
same state, an empty document. We have three operations. We shuffle them, the order doesn't matter, and we end up with the same result.

If they are also idempotent, which is a big word but pretty simple, it means that no matter how many times we apply the operation, we end up with the same result. Multiplying by one is an idempotent operation, because we can multiply by one 100,000 times and we have the same result. In our operational transformation or CRDT world, if we have operations like this and we somehow receive the exact same operation twice, we only apply it once, and we have one paragraph with the text "Hello" in this example.

So if we had a system where operations are both commutative and idempotent, we would have a very simple way to do synchronization. Think about it: if I can send anything, in any order, to anyone, and we don't even care if they receive it multiple times, it would be pretty simple.

Here we enter CRDTs, which stands for conflict-free replicated data types, also known as commutative replicated data types or convergent replicated data types.
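Going back to the two operational transformation examples for a moment, the index problem and its fix can be sketched in a few lines of TypeScript. This is a deliberately tiny illustration, not the algorithm of any real editor: the names `Op`, `applyOp` and `transform` are hypothetical, deletes remove exactly one character, and the exact strings `"Hello "` and `" rules"` are assumed for the conference-title example.

```typescript
// Hypothetical minimal OT sketch; real transformation functions handle many more cases.
type Op =
  | { kind: "insert"; index: number; text: string }
  | { kind: "delete"; index: number } // removes exactly one character
  | { kind: "noop" };

function applyOp(doc: string, op: Op): string {
  if (op.kind === "insert") return doc.slice(0, op.index) + op.text + doc.slice(op.index);
  if (op.kind === "delete") return doc.slice(0, op.index) + doc.slice(op.index + 1);
  return doc;
}

// Rewrites `op` so it can be applied after `applied` has already changed the document.
// `opWinsTies` decides whose text goes first when two inserts target the same index.
function transform(op: Op, applied: Op, opWinsTies: boolean): Op {
  if (op.kind === "noop" || applied.kind === "noop") return op;
  if (applied.kind === "insert") {
    const shifts =
      applied.index < op.index ||
      (applied.index === op.index && !(op.kind === "insert" && opWinsTies));
    return shifts ? { ...op, index: op.index + applied.text.length } : op;
  }
  // `applied` is a single-character delete.
  if (op.kind === "delete" && applied.index === op.index) return { kind: "noop" };
  return applied.index < op.index ? { ...op, index: op.index - 1 } : op;
}

// The A, B, C, D example: one node inserts X at index 3, the other deletes B at index 1.
const base = "ABCD";
const insertX: Op = { kind: "insert", index: 3, text: "X" };
const deleteB: Op = { kind: "delete", index: 1 };
const node1 = applyOp(applyOp(base, insertX), transform(deleteB, insertX, false));
const node2 = applyOp(applyOp(base, deleteB), transform(insertX, deleteB, true));
// Both nodes converge on "ACXD".

// The conference-title example: raw indexes diverge, transformed ones do not.
const title = "WordCamp";
const hello: Op = { kind: "insert", index: 0, text: "Hello " };
const rules: Op = { kind: "insert", index: 8, text: " rules" };
const naive = applyOp(applyOp(title, hello), rules); // "Hello Wo rulesrdCamp"
const transformed = applyOp(applyOp(title, hello), transform(rules, hello, false)); // "Hello WordCamp rules"
```

Even this toy version needs a tie-breaking rule and a no-op case, which hints at why full transformation functions become hard to follow.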
Of these names, conflict-free is the most popular one. CRDTs are a set of data structures and algorithms, and again the stuff around them, that basically enable all of this: commutative, idempotent, everything merging itself, hence conflict-free.

Some advantages of this: for example, you can create software that is ready for offline editing, a.k.a. local-first apps. You can have two different editors that are completely offline, we both make changes, and then once we are online, it somehow magically merges. It supports both operation-based and state-based approaches. So we can use normal operations, as I showed you with operational transformation, or we can just send the whole state of the application and it will get merged. This is obviously more expensive payload-wise, but it's also an option. And there is no need for a server, because we don't care about the order of operations, so we can send them to anyone at will.

Some of the solutions and algorithms for CRDTs are listed here. These are the ones that I know, and I'm not going to go into details about them. But the first one, WOOT, is pretty funny, because the abbreviation stands for "without operational transformation". So you can see that it was created just to kill operational transformation, and it's like a new and more fancy way of doing collaboration these days.

So how do these CRDTs actually work? Because you might think it's kind of difficult: I send things in any order, with any duplication, and it still somehow works. So how does this actually work? I
will give you an example of one of the algorithms. There are multiple ways of doing this, but this one is the simplest to explain, so I will go with it.

In CRDTs, if you have again the same content, A, B, C and D, then instead of operating on indexes, which may shift, we operate on unique IDs. In this case, I made these IDs 1, 2, 3 and 4. So now, if I want to insert X between C and D, well, there's no ID there, because they are 3 and 4. What do you do? You just create a new ID in the middle, 3.5. And now you might think: okay, but if two people want to insert between C and D, they both get ID 3.5, which is bad. We don't want this. And yes, you would be right, because it's not that simple.

I will explain it with this example. Let's say we have a collaborative session between John and Jane, and John has created some content first: A, B, C and D. And now you can see that instead of just adding IDs as numbers, what we have is an array of tuples. A tuple is basically a two-value array, let's say a two-value data structure. So we now have the ID of this element and also an ID of who actually created it. In this case I'm using emojis because it's shorter, but we know that ID 1 was created by John, ID 2 by John, 3 by John and 4 by John.

And now, if Jane wants to insert something between C and D, she no longer needs to shift things and do stuff with indexes. What she does is take whatever was before, 3 by John, and append to it. She says: this is 3 by John followed by 1 by me.

And now let's go even deeper. Now we're removing B, the same situation we had before, and John wants at the same time to insert Y and Z between X and D. So what do we do?
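The ID scheme described up to this point can be sketched in TypeScript as follows. This is a simplified illustration under stated assumptions, not the implementation of any specific library: `PositionId`, `CrdtOp`, `compareIds` and `Replica` are hypothetical names, authors are plain strings instead of emojis, and deleted IDs are remembered so that a delete can arrive before the insert it refers to.

```typescript
// Hypothetical sketch: an ID is a list of [number, author] tuples,
// e.g. [[3, "john"], [1, "jane"]] for "3 by John followed by 1 by Jane".
type PositionId = Array<[number, string]>;

type CrdtOp =
  | { kind: "insert"; id: PositionId; char: string }
  | { kind: "delete"; id: PositionId };

// Order IDs tuple by tuple; an ID sorts before any longer ID that extends it.
function compareIds(a: PositionId, b: PositionId): number {
  const shared = Math.min(a.length, b.length);
  for (let i = 0; i < shared; i++) {
    const [numA, authorA] = a[i];
    const [numB, authorB] = b[i];
    if (numA !== numB) return numA - numB;
    if (authorA !== authorB) return authorA < authorB ? -1 : 1;
  }
  return a.length - b.length;
}

class Replica {
  private chars: Record<string, { id: PositionId; char: string }> = {};
  private deleted: Record<string, boolean> = {};

  // Keyed by ID, so repeating an operation changes nothing (idempotent),
  // and the stored result does not depend on arrival order (commutative).
  apply(op: CrdtOp): void {
    const key = JSON.stringify(op.id);
    if (op.kind === "insert") this.chars[key] = { id: op.id, char: op.char };
    else this.deleted[key] = true;
  }

  text(): string {
    return Object.keys(this.chars)
      .filter((key) => this.deleted[key] !== true)
      .map((key) => this.chars[key])
      .sort((x, y) => compareIds(x.id, y.id))
      .map((entry) => entry.char)
      .join("");
  }
}

const ops: CrdtOp[] = [
  { kind: "insert", id: [[1, "john"]], char: "A" },
  { kind: "insert", id: [[2, "john"]], char: "B" },
  { kind: "insert", id: [[3, "john"]], char: "C" },
  { kind: "insert", id: [[4, "john"]], char: "D" },
  { kind: "insert", id: [[3, "john"], [1, "jane"]], char: "X" }, // Jane, between C and D
  { kind: "delete", id: [[2, "john"]] }, // remove B
];

const inOrder = new Replica();
ops.forEach((op) => inOrder.apply(op));

// Same operations, reversed and then delivered a second time.
const shuffledWithDuplicates = new Replica();
ops.slice().reverse().concat(ops).forEach((op) => shuffledWithDuplicates.apply(op));
// Both replicas read "ACXD".
```

Because every operation is addressed by ID rather than by index, no transformation step is needed when operations arrive late, out of order, or twice.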
We just remove the 2 by John, and John inserts after X: he takes whatever was there and adds Y and Z, appending 1 and 2 by him at the end. This way we have conflict-free merging of the letters.

By the way, these fractional numbers I just showed, like 3.5, are actually created by taking all of these numbers in the ID. So it's 3.1.1 for the letter Y, and therefore we can order all of this based on these values.

There's also one edge case for this. Let's imagine I removed C as John and then added D, also as John. We would again end up with the same ID but for a different letter, and this would be bad. So what we do is use something called a logical clock, or basically just a counter. Every time we do any operation, we bump the counter, which is kind of like a timestamp for us, and the whole row is actually the ID of the letter.

The two main known libraries that are actively used in real production projects for CRDTs are Automerge and Yjs. You can just Google them and find them on GitHub.

Now, let's go. We talked basics, we talked about how collaboration actually works; now let's talk Gutenberg. If we look at this issue, it's actually from 2017, so it's six years ago already, and people were discussing collaboration. And look closely here.
We also have a mention of operational transforms, as it's also sometimes called, and CRDTs, and they also say P2P for peer-to-peer, by the way. It's six years ago already. And it's by Chris blower from Automattic. He's still working there. Legend.

Then, around the same time, a guy called Abyshek Gallot, back then working at Automattic, created this PoC, a proof of concept, using WebRTC. It was then also taken up by Grzegorz Ziółkowski from Automattic and put into the WordPress repository to work on this issue. At some point it was closed, but it was a nice proof of concept.

Then the guy who created Yjs, whose name is Kevin Jahns, took Yjs, which is exactly what they do: they take Yjs and put it into all of the editors, the WYSIWYG editors. He took Yjs and said, okay, if I can add this to any editor, I can also add it to Gutenberg, and he created this proof of concept. And if you are paying attention, there's a link here, and it's live. It's from 2019 but it's still live; you can go there and check out collaboration in Gutenberg right now if you want.

In 2020, this is the latest take on collaboration. As you can see, something is going on, although we don't have phase three yet. This latest take was created by Enrique Piqueras, who back then worked at Automattic; now he works at Google. I took this GIF, by the way, excuse the quality, because it was just downloaded from a comment on the pull request, but you can see how this works in this pull request, and it's, you know, kind of impressive: you type something, the other person sees it, you can comment on stuff, et cetera. And this is also using Yjs, so it's using CRDTs. It kind of superseded the previous pull request I told you about, from the Yjs creator.
I told you from the yjs creator There's also this as blocks calm created by Riyad Benguela who also works at automatic and this is not embedded in wordpress But you can he took just just the Gutenberg and created this a nice piece of software when you can create a session You can share it you can visit the link in another browser and suddenly you have collaborative Editing enabled in both of these Gutenberg editors Until now I was talking only about Like text editing mostly because it was simple enough to explain But it's nice to know that CRDT's also can handle like this all of the different types of data For example, this is showing the offline capabilities on a to-do app Which is probably is just a JSON under the hood. It's just sharing the state and merging it somehow auto magically After the days, you know change the toggle to online from offline, which is super cool so Exciting times to get ahead of us amazing times ahead of us. I truly believe that adding collaboration Real-time collaboration to Gutenberg is going to be the biggest milestone that WordPress Had to face and will actually achieve since the creation of Gutenberg itself. Oh You got a problem But the next slide is thank you. Yeah, something popped up here, but the next slide is thank you. So that's it for me Thank you, David. That was an awesome talk. Thank you If anybody has questions again, we have the mics in the aisles if you want to ask a question Just please proceed to either of the two sides This was very exciting to watch especially the last gif Showing how it could work. I can't wait Yeah, I can't wait to I mean it would be awesome. Yes. Yes. I think we have a question over there Hello. Thank you for a talk. I Have a question. How will this influence? Database and also revisions. Oh, yeah. Yeah. Yes. So it's a good point. 
It's a good point. I probably know what you're talking about. Yeah, it obviously has to be integrated somehow with revision history. And what was the first one? Can you...? Oh, yeah. I mean, I don't know the answer to this, to be honest. I'm not working at Automattic. I did my due diligence and I spoke to people working at Automattic. By the way, thank you, Adam Zieliński. But I don't know, and I don't have answers to all of this, because I'm not very close to the development cycle of Gutenberg.

But I can check this, and I know there's a revision history feature and all of the other features, like suggestions and content locking. So it should also work with locking the templates and everything, where some people can edit some stuff and some people can't, so this should also be aware of all of that, which is not an easy task. That's why we only have proofs of concept. We don't have the real thing yet, but, you know, hopefully at some point in the future we will get it.

Also, to be absolutely honest, and I'm not representing Automattic, I know from the inside that real-time collaboration is not the super main priority. The async flows for content teams are kind of the more important thing for Automattic for the collaboration part, because the phase doesn't specify real-time collaboration. They are working on it, looking at it, but async collaboration is the main priority, because, you know, the bigger content teams are using external tools; they create content there and just move it to WordPress in the end, which is not ideal.

So, yeah. Any more questions from the audience? Looks like... okay. Thank you very much, David. That was amazing. Please, everybody, give it up for David.