Hello all. Welcome to FOSDEM. Welcome to the Encrypted Matrix talk by Matthew Hodgson. Matthew is a technical co-founder of matrix.org. Matrix is a VoIP and IP messaging solution. Matthew has been building IP solutions for around 11 years. Please welcome him to the stage and we can start the talk now. Thank you, Pravee. Can everybody hear me? Wow! Fantastic. So welcome to the land of Matrix, everybody. I'm going to be talking about the epic that we've had over the last few years to add end-to-end encryption to all of Matrix. But before I get into that, I probably need to tell some of you what Matrix is, or what is the Matrix, if you prefer. So just a quick show of hands. Who already knows what Matrix is? Wow, that's really, really scary. It's probably over half of the people here. So I apologize, especially if you came to the talk this morning on reputation, that there's going to be a little bit of overlap whilst I try to bring the other 40% up to speed. I'll try to do it quickly. And also, if anybody has any questions whilst I'm talking, I'm quite happy to be interrupted, so please wave your hands around if you want to get clarification, or if I'm talking too fast or in the wrong language, or you can't understand what I'm saying. So let's get on. What is Matrix? Well, we're a non-profit open-source project, obviously, and we're unusual in that we're providing an open standard here for defragmenting communication. So in practice, this means that we're basically creating a global encrypted communication meta-network that bridges all of the existing silos together, and we call it a meta-network because apparently the word "internetwork" has already been taken. So by silos here, we're talking about all of the islands which don't interoperate today. So that could be IRC interoperating with XMPP. 
It could be Slack, or Gitter, or Telegram, or WhatsApp, or Facebook Messenger, and I'm sure I'm preaching to the choir here, but everybody must be suffering due to the way in which all of the different communities for just communicating online are fragmented these days. There is no equivalent of email for instant messaging or VoIP, or even IoT kinds of data, and that is what we're trying to fix with Matrix. And the whole point is to liberate the communication here so that it is controlled by us, the users. You should not be locked into a walled garden or a silo which is controlled by any single entity for something as basic as your right to communicate. So in terms of the kind of silos we're talking about, we have things like IRC, or Gitter, and Slack, and I'm sure especially in the open source community we're very familiar with the fact that you have a perfectly good IRC channel that lasts for 20 years, and then Slack comes along and half the people want to go to Slack, and then half the people are on Telegram, and then people start using Discord and it breaks down. Well, the point of Matrix is to come in basically as the lowest common denominator and fabric to connect them together. So you have the bridges here that go and connect, I don't know, IRC through into the wider decentralized Matrix network, or for that matter Gitter or Slack or any of these guys. Even an application like GitHub you can go and bridge into the wider scheme of things via a bot or integration that allows you to start talking to issues on GitHub, or filing whatever issue it might happen to be, as if it were a chat system. And inside Matrix we have this full mesh of servers which participate in the conversation, and the really sexy, fun thing about Matrix is that the conversation history is replicated over all those servers. So this is like Git, but for communication. There is no single chat server for any of these conversations. 
So if someone in IRC is talking on Freenode and it's bridged into Matrix, talking to somebody natively in Matrix like this guy, or somebody in Slack, then that server, that server and that server will all have a copy of the same conversation. And this is incredibly powerful because it means that if any of those servers goes down, the conversation lives on. So literally tomorrow matrix.org with its server could go down, but because a lot of the rooms have replicated over hundreds of other participating servers, the room lives on. We think this is pretty cool because it's basically impossible to communicate on Matrix without breaking open a walled garden. The actual act of going and communicating with anybody else gives them equal ownership of that communication. It is impossible to hoard control. So no single party owns your conversations, and the conversations are equally owned by everybody who's participating. What can you use this for? Well, the classic use case is group chat, and WebRTC signalling for VoIP, because I'm sure many are familiar with the fact that WebRTC doesn't have a standard signalling layer. Bridging any kind of communication silo together. It's also kind of cool because in the end it's really an open object database for any kind of real-time data. So you can use it as a big open decentralized fabric where people can do PUTs of data and somewhere else people can GET them, and that can be any data. We've done media over Matrix, we've done drone control, we've put car telemetry in, and really in the end it's basically a pub/sub system but for real-time persistent data. Some people in the audience, how many people are thinking "why are you reinventing XMPP?" 1, 2, 3, 4, 5, 7, 8, 9, about 10% of the audience are having the same reaction as this poor cat. Now, from our perspective, we're not. Really, we're not. I mean, it's very different to XMPP, on both a philosophical and a technical angle. 
On the philosophical angle, the actual spec and the governance model of the project are completely different. There is one Matrix spec. It has one version. There are no XEPs, no extensions. Of course you can extend it yourself and you can experiment with it, but the actual thing that people have to implement to be compatible is today Matrix 0.2. If you don't implement it, you're not talking Matrix. This seems like a bit of a legal point. It actually changes the philosophy of the project enormously. So you have one big spec with all the features in it. Some of them are optional for different profiles, but in general for the classic ones you have your instant messaging, your VoIP, your read receipts, your notifications and all the stuff that you need to communicate. Also, the primitives are completely different. So conversation history is the first-class citizen here. We're not passing a message between Alice and Bob. We're synchronizing Alice's chat history for this room with a server, which then synchronizes it with all the other users in that room. And that's just a huge fundamental difference: you're literally synchronizing eventually consistent history around the place. Also, we have end-to-end encryption, albeit still in beta, as a first-class citizen throughout the whole thing, because if you are replicating data everywhere you obviously want to encrypt it so that the sysadmins aren't reading all your messages. And we're using HTTP and JSON, but if people are thinking, wow, this is just XMPP with JSON and HTTP, you're missing the point completely, because it's not just swapping XML for JSON at all. As I said, this is basically an open database. Think of it like Cassandra where anybody can spin up a node anywhere in the world. That's the kind of model that we're going after here. And obviously, finally, we've got a big focus on bridging and defragmenting. 
That's why the thing is called Matrix, because it is matrixing together all of the different communities out there which otherwise can't talk to one another. Architecturally, we kind of already touched on this. You've got some servers in the middle which store the messages. You have clients that connect to them. You have application services which do the fun stuff, bridging and other things. Oh, I've got a question. Quick question. Jesse, yeah, let me. So it's asking: all the messages get replicated within the servers. I know that it's encrypted end to end and stuff; is the message on each server encrypted on the server itself? And if it is, it's fine. If it's not, is there forward secrecy? Okay, so there's a question about end-to-end, and basically all of the rest of the talk is about end-to-end encryption. So we'll get to it in excruciating detail. But the quick answer is that in order to be end-to-end, by definition, it has to be encrypted on the clients, not the servers. At the moment it isn't turned on by default because we're still in beta, but when it is, it will be there. And then the actual data stored on the servers is obviously encrypted by the end-to-end. Okay, yeah, I was just worried that if one of the nodes gets compromised, then all the messages can be read. No, so actually, good question. So forward secrecy is the big interesting bit of what we've done with the encryption in Matrix, in that we make it customizable per room. So if you were using Signal and you're optimizing for nothing other than privacy, the idea of being able to replay history on new devices would be a disaster, because what if that device was owned by some nasty person? If you want to configure a room like that in Matrix, you can. But then some rooms aren't quite that tinfoil level of security, and you want to be able to add a new device into the room. And so in that instance, we allow people to select chunks of history that can be replayed. 
So it's a kind of deliberate compromise of PFS in exchange for usability. So, the Matrix ecosystem as a whole, very quickly. The main deliverable is this boy here, the Matrix spec, which is an RFC-style doc that describes all of the APIs and all of the actual functionality. And then we have Synapse, which is our original Python home server, which has been out there for two years now. It was very much a proof of concept. It has some major shortcomings, although nowadays, as of yesterday particularly, there has never been a better time to install Synapse, which is on 0.19. It's written in Python and Twisted. Memory usage has got a lot better. CPU usage has got a lot better. And it's kind of the one home server that is usable in reality right now. Dendrite is our next-generation one, written in Go, that uses a really interesting append-only log architecture. So Synapse is completely centralized. It is a kind of monolithic server, a big bunch of Python. Dendrite has got a completely horizontally scalable architecture where you keep on firing up more components and they all talk over a big Kafka-style event bus to scale up. We've got basically the first messages going through it in the last couple of days, so it's very, very early, but that's where the future lies. Then you have bridges that connect through to IRC and Slack and everything else, and bots and integrations and all that good stuff. And then the orange stuff here is stuff from the wider community. The blue stuff is all provided by us, the matrix.org team. The orange stuff is provided by you, the wider community. And there are lots of different servers and services there. There's Ruma, which is a very cool Rust project to go and create a Matrix home server. And there are lots of bridges; over half of the bridges now come from the community. And up here we have clients. So here are the community clients, which range from native to web-based ones. 
Then we provide three stacks, JavaScript, iOS and Android, on the client side. And you have the React SDK, which provides kind of reusable React UI components. You have the HTTP wrapper, which actually exposes this guy as sensible JavaScript bindings. And then on top you have the actual applications. And you have the ugly console apps, which are kind of proof-of-concept wrappers of the UI components. And then you have this guy, Riot, which is a kind of flagship app that we've built in order to make sure that Matrix has a really, really good app that people can use on a daily basis and that can kind of bootstrap the whole ecosystem. So what do you get? Very quickly. Obviously, conversation history. You get a timeline data structure, key-value stores, group messaging, end-to-end crypto, obviously, VoIP signalling for WebRTC, push notifications, server-side search, read receipts, typing notifications, presence, synchronized read state and unread counts. We've got a very, very crap decentralized content repository where you do an HTTP hit to your server and it fans out to the others via HTTP. We're looking at replacing that with IPFS. And finally you can attach arbitrary account data for users, both per user and per room. So basically all the normal building blocks you need for a modern communication system. So how does it work? Very quickly, let me just change to here. And here's the front page of matrix.org. And it's got this little animation at the bottom that hopefully will make a lot of sense. You have three servers and each one has a bunch of clients on it, but here we're just showing Alice on Alice.com, Bob on Bob.com and Charlie on Charlie.com. If Alice sends a message, it is just an HTTP PUT of that message to Alice's server. It gets stored in that server and it gets signed by that server. We're not showing end-to-end encryption here. Then it gets pushed out to the other two, a bit more of a complicated HTTP POST in this instance. 
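As a rough illustration of that first step, Alice's client doing an HTTP PUT to her own server, here is a minimal Python sketch that only builds the request and sends nothing. The path follows the Matrix client-server API's send-message endpoint; the `r0` prefix, the server name, room ID, transaction ID and token are all placeholder assumptions and will differ by spec version and deployment.

```python
import json
from urllib.parse import quote


def build_send_request(homeserver, room_id, txn_id, text, access_token):
    """Describe the HTTP PUT a client would make to send a text message.

    Nothing is sent over the network; the function only returns the pieces.
    """
    # Room IDs contain '!' and ':' so they must be percent-encoded in the path.
    path = "/_matrix/client/r0/rooms/{}/send/m.room.message/{}".format(
        quote(room_id, safe=""), quote(txn_id, safe="")
    )
    headers = {
        "Authorization": "Bearer " + access_token,
        "Content-Type": "application/json",
    }
    body = json.dumps({"msgtype": "m.text", "body": text})
    return "PUT", homeserver + path, headers, body


# Hypothetical values, for illustration only.
method, url, headers, body = build_send_request(
    "https://alice.com", "!room:alice.com", "txn1", "Hello Bob and Charlie", "SECRET"
)
print(method, url)
print(body)
```

The transaction ID in the path is chosen by the client, which lets the server recognise a retried request instead of storing the same message twice.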
And by the way, HTTP is just the baseline transport here. Normally half the room have already left at this point because they don't like HTTP, or "why are you using HTTP and JSON?". The reality is that it's just the lowest common denominator simple thing. If you want to do CoAP, if you want to do MQTT, if you want to invent your own QUIC-style whatever protocol, please do. And that's fine. It's just that the HTTP one is the one the spec really mandates that everyone has to speak. The message gets signed in those servers. The clients do an HTTP GET to receive it, and you've got the basics there: Alice has sent a message to Bob and Charlie. Now if Bob responds, a more interesting thing happens. That message gets built into this directed acyclic graph, this tree of messages within that room. And if, say, Charlie responds at the same time, we have a race between Bob and Charlie. Now this is kind of interesting because right now the state of Alice's server is inconsistent with Charlie's. However, this is okay. It's a feature. It's an eventually consistent database, effectively. And so all that happens is that Bob's message is pushed out to Charlie, at which point the graph bifurcates to show there was a race between message two and three. And likewise message two from Charlie, sorry, message three from Charlie.com, gets pushed out to the other two and they bifurcate too. At which point we're in sync again. So this is the whole point of Matrix. You've got a whole bunch of graph structures for the room, which can be a very bushy or a very linear graph, and you're going and pushing and pulling between the different nodes, just like merges and conflicts in a Git tree. Later on Alice might send another message, and it will go and merge the graph back together, and that gets pushed out to the others and, hey presto, everybody's in sync. So that's how Matrix works. So, clients. Well, we've got at least 40 of them out there now. They range from the text-based ones up. 
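To make the race and merge from the Alice, Bob and Charlie animation concrete, here is a small Python sketch of a room as a directed acyclic graph. It is a toy model and not how Synapse stores events: `RoomGraph`, `tips` and `send` are hypothetical names, and signing, state resolution and federation are left out.

```python
class RoomGraph:
    """Toy model of one server's copy of a room's event graph."""

    def __init__(self):
        self.events = {}  # event_id -> list of parent event_ids

    def receive(self, event_id, prev_events):
        """Store an event that names the events it was built on top of."""
        self.events[event_id] = list(prev_events)

    def tips(self):
        """Events that nothing else points at yet (the ends of the graph)."""
        referenced = {p for parents in self.events.values() for p in parents}
        return sorted(e for e in self.events if e not in referenced)

    def send(self, event_id):
        """Create a local event on top of every current tip, merging forks."""
        parents = self.tips()
        self.receive(event_id, parents)
        return parents


alice = RoomGraph()
alice.receive("m1", [])       # Alice's first message
alice.receive("m2", ["m1"])   # Bob replies to m1
alice.receive("m3", ["m1"])   # Charlie replies to m1 at the same time
fork = alice.tips()           # two tips: the graph has bifurcated
merged_on = alice.send("m4")  # Alice's next message points at both
print(fork, merged_on, alice.tips())
```

Because each event only names its parents, servers can receive `m2` and `m3` in either order and still end up with the same graph, which is what makes the eventual consistency workable.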
There's an Emacs-based client written by Ryan Rix, which is really, really cool. You've got desktop apps like Quaternion, which is a QML and Qt app. You've got NaChat. Let's just switch to NaChat quickly. I've got it open here. So here is the Matrix FOSDEM room on NaChat. Looks a bit like XChat. You've got Matrix HQ here. This is a room with like 6,000 people in it. It's the main one. I can say hi everyone. And meanwhile, if I go to another room, I can say hi to Ryan Rix. And you can see coffee is saying hi FOSDEM. I'll say hi coffee. And you can see Evil Matthew, which is my test user, because he wears a suit and is evil, saying hi everybody. Tolov is saying hello and I can go hello from here. And because it supports Markdown, I can also yell back. We can go and, say, upload an image into here. There's an embarrassing photo of us doing the release yesterday from the web. So here's a new idea. Basically, Riot here looks a lot like Slack or some kind of Gitter-style tool, except everything is backed by Matrix. And also, you might notice there are these read receipts jumping down the right-hand side doing this fun Tetris animation. Where are Slack's read receipts? Can anybody tell me where Slack's ... oh wait, it doesn't have any. Anyway, so you get the idea of what we're doing there. So, Eric here is at ericjonjki.re. You've got, who else is on their own server? This guy on riot.reticule.li, rdaxi is on rdaxi.com, and each of these servers has complete control over this room. It's so fun. On this account, which account am I on here? This is my Evil Matthew one, so I'm only in about 100 different rooms here. If I alt-tab to my real one then I'm in actually about 1,000 rooms. 
So I've got 415 direct messages open with different people there, 400 different rooms, and some of them are IRC-based, some of them are Slack, some of them are hybrids, and a fun example might be something like the Decentralized Web Summit room, which is split between Slack and Matrix: about half of the people are on Slack, half of them are on Matrix. Or I might go to IPFS, which is largely IRC, but then you've also got a whole bunch of people at the top who are natively using it through Matrix. Make sense? I'll just quickly show something else, which is the video calling, as an example of the WebRTC signalling. So in fact, if I go and take one of these messages like that one and I hit view source on it then, well, in my dark theme it's not very legible probably, but you can kind of see that we've got the emoji as the body of the message, and what goes in these brackets can be anything you like. So we specify instant messaging and VoIP, but it could be literally GPS coordinates, thermometer data, car telemetry, whatever. So let's see what else I can do. So what I was going to do was to go and call my test user from Riot on iOS. So Riot is a React app here, it's also an iOS native app, and it's also on F-Droid without any Google dependencies if you so desire. Which probably makes sense for FOSDEM. I'm not going to screen share my phone, but I'm going to go and tell it to do a video call through to Evil Matthew, which will hopefully, like all the best live demos, work perfectly. Come on. It does seem that I have internet connectivity on my phone, of course. Oh, there we go. I've got an incoming video call here from Matthew. I probably should accept it. This is now using WebRTC here on, let me go full screen. It's only a video call. So this is using the Google WebRTC stack, obviously, on Chrome here. It also works on Firefox, and meanwhile on this side it's also using the Google WebRTC stack. Hello everybody. You can wave to yourselves now. 
If I unlock my screen it might even show you that it does correct orientation support. There we go. Wave to yourselves. You look beautiful, everybody. So this is using the Google stack but it also supports other stacks as well. So I have some interesting news about adding new stacks into that in the future. That's basically Matrix as it looks from there. If we go back to NaChat you'll see that the same thing is happening here, hopefully. Meanwhile on IRC we've got the same thing happening here on #matrix, and it's literally the same conversation: Reality Gaps saying great talk so far. Thanks, Reality Gaps. Meanwhile on Slack you would have the same thing going on here, so let's scroll down to the bottom of Slack and you've got, wherever it was. I hope we're in the right place. Oh yeah, there we go. Reality Gaps saying great talk so far. A new bridge contributed by the community, from Simon, if he's in the room somewhere. Hello, thank you for writing this. This is really cool. You've got a whole bridge going through into Telegram. At the moment this one is using bots to bridge people over, but we do also have one that logs in as your actual Telegram account so that you can transparently bridge it into Matrix, and we're really at the point of building lots and lots of these bridges right now. So I think those are probably the obvious things to show you about Matrix itself. Obviously lots of SDKs. Actually, a final cool one would have been matrix-ircd. This exposes all of Matrix as an IRCd, so you can take your existing XChat or irssi or whatever it is and connect to port 6667 on matrix-ircd to reach the entirety of Matrix. So you expose the whole thing as an IRC network, and you can do silly things, obviously, like talk IRC to a Matrix server which then goes into an IRC bridge, and use Matrix as one great big silly over-engineered IRC bouncer. 
Home servers, we've kind of already talked about these. Synapse is about, in fact we measured it earlier, it's now at 66,000 lines of Python and Twisted. We've had some fairly major performance and maintainability challenges, specifically on typing, and also on the ability to do stack traces and profiling over deferred inline callbacks in Twisted. And so all of our work is going now into Dendrite, which is coming along well, as I mentioned, and it's built around these Kafka-esque append-only event logs. Finally you've got Ruma on the Rust side, and you also have projects like Bullettime in Go, Pallium in Go and jSynapse in Java, which are basically experiments from the community. Bridges, well, we've already looked at a bunch of them. We also have Rocket.Chat and Mattermost ones, and we have VoIP ones to FreeSWITCH and Asterisk. A really fun one: if anybody likes libpurple, please come and talk to me about taking over work on the libpurple bridge, because this one should be the coolest in that it can talk anything libpurple can and expose it into Matrix. So when I built it we did it with Skype and it worked pretty well, and you could use this to bridge into WhatsApp or whatever. So if you like hacking on libpurple, please come and talk to me. Meanwhile from the community we've had iMessage recently, Twitter, Facebook Messenger, blah blah blah, you get the idea, and everybody seems to love writing IRC bridges so we've got about eight of them. We've already shown you what it looks like. Community status: we started in September two years ago, so just over two years now. Very late beta; bits of it are out of beta, some of it isn't. We've got 700,000 accounts on the matrix.org server and we're pushing about 700,000 messages a day, actually receiving them. Today we realized that all of our message rates have actually been miscalculated and that we'd only ever looked at the messages coming in, which is about 10 a second. In terms of going out, we're actually pushing about a thousand a second on the server. 
We've got 70,000 rooms on matrix.org, but a really important stat is this one: the 1,500 federated servers out there are run by you guys, and the more the merrier, and in fact almost half of the traffic on Matrix currently is not on the matrix.org server but is actually out there in the wild in the wider ecosystem. So it's pretty centralized on matrix.org right now, but honestly I'd like to get to the point where we can turn off the main server and everybody can run their own. In terms of the user growth, that's how we've got to the 700,000 users so far; you can see that we really took the training wheels off it at that point. But a better graph than total users is the total messages a day, and this is looking just at the unbridged messages per day, so this is ignoring all of the traffic to and from Freenode and Moznet and Slack etc. This is native Matrix traffic, and you can see that we're up at about 100,000 messages a day, so it's not huge, but critically the acceleration is frankly petrifying in terms of the uptake we have there, and that's where a lot of work has gone into optimizing Synapse and making it good enough until we can get Dendrite running so we can really, really go fast. Now let's talk about crypto. Who wants to hear about crypto? Yay! So this is an olm. It's a salamander from the cave systems of Postojna Jama in Slovenia, and it's the closest thing we have to a European axolotl, which is the North American salamander that Open Whisper Systems named their ratchet after, the one they used for cryptography originally in Signal, which was TextSecure, before they renamed it the very boring Double Ratchet. Well, they renamed it Double Ratchet; we're calling ours Olm. And Olm is our own implementation, and it's the foundation of all of the encryption in Matrix. As I said at the beginning, without end-to-end encryption, let me just kill Telegram so it doesn't take too long. 
Matrix's replication of conversation history is a huge privacy problem, to put it mildly, because in a room like Matrix HQ, without crypto, it's going to have the plaintext over 500 different servers. So we spent two years gradually working away putting end-to-end crypto into the absolute heart of Matrix, and the goal we had here was to trade off privacy and usability, so that was the question at the beginning of the conversation. Sometimes you want hardcore secrecy, and sometimes you really want to be able to add new devices and invite people into a room and be able to replay history, and we want to do both. And I think we're basically unique in wanting that, because everybody else, whether it's Wire or Signal, is really going for the hardcore secrecy approach. We decided that we were going to encrypt and trust things per device rather than per user. So this is a big deal: when you send a message, you send it to specific devices. So I'm not sending it to a person and trusting them to go and blindly sync it to a whole bunch of different devices. I have the ability to blacklist particular devices from receiving messages and validate particular ones. What if I don't want to expose my device? If you don't want to expose the devices then rename them to "my device", otherwise people wouldn't be able to encrypt for the devices. So at the moment there's a bit of controversy in that we name the devices based on something vaguely useful and expect you to anonymize it if you want it to be anonymized. In future we're going to default to prompting the user so that if they want to remove any identifying data like the model of their phone or the URL of their client they can do that. We wanted to support big rooms. A lot of people give up on this: if you're going to be encrypting for 10,000 devices, why bother, because one of them is going to get owned and so the trusted compute surface is going to be very big. 
But in practice it's a common use case anyway, in a big company or even a big gathering like this, which I guess is basically public, or it is public. You might still want to at least attempt to encrypt the contents there, and as you've seen, our Matrix rooms are huge and we want to support the whole thing as it explodes. We want to encrypt non-public rooms by default but we haven't yet. We're in beta, but that's the goal. Obviously there's no point in encrypting a conversation like this one which you want to have on the public record. In fact the encryption gets in the way. And finally we want it to be available to everybody in Matrix. So, high-level 10,000 foot overview: we've got two different mechanisms at work here. We've got Olm itself, the little salamander, which is our double ratchet implementation. It is almost identical in functionality to the one that Signal and Facebook and Allo and all the other Axolotl double ratchet implementations use. However, we use it completely differently. What we do is use it to establish a secure channel between two devices, and then we just use it to synchronize the key data that's required for Megolm, which is the new ratchet, the main ratchet that we use for encrypting to a group of receivers. So the deal is that if I want to send all of you guys a message, I will first of all have to set up a one-to-one channel with everybody, but I only have to do this once, in order to share the group ratchet state which I then use to encrypt my messages to the room. So whilst it's pretty nasty that I'm going to have to do 800 one-to-ones at the beginning, and honestly that does take time, and people using this today will see a 10 second delay whilst it does the initial setup, once it's in place I can then use the Megolm ratchet to generate a series of keys for the members of the room, sorry, for the messages which I'm sending into the room, and hopefully everybody else has the same ratchet so they can decrypt. Easy. 
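The shape of that idea, share the ratchet state once per device and then derive one key per message, can be sketched with a toy hash ratchet in Python. This is a deliberately simplified single-chain illustration and not the real Megolm construction: `ToyGroupRatchet` and its methods are hypothetical names, the secure one-to-one Olm channel is reduced to a plain function call, and no message is actually encrypted.

```python
import hashlib
import hmac


def advance(state):
    """One-way step: the next state is easy to compute, the previous is not."""
    return hmac.new(state, b"advance", hashlib.sha256).digest()


def message_key(state):
    """Derive the key for the message at the current ratchet position."""
    return hmac.new(state, b"message", hashlib.sha256).digest()


class ToyGroupRatchet:
    def __init__(self, state, index=0):
        self.state = state
        self.index = index

    def share(self):
        """What the sender would pass to a device over the one-to-one channel."""
        return self.state, self.index

    def next_key(self):
        """Sender side: key for the next message, then ratchet forward."""
        index, key = self.index, message_key(self.state)
        self.state = advance(self.state)
        self.index += 1
        return index, key

    def key_at(self, index):
        """Receiver side: derive the key for a message index, if reachable."""
        if index < self.index:
            raise ValueError("message predates the shared ratchet state")
        state = self.state
        for _ in range(index - self.index):
            state = advance(state)
        return message_key(state)


sender = ToyGroupRatchet(b"\x01" * 32)         # fixed seed only for the demo
early_device = ToyGroupRatchet(*sender.share())
sent = [sender.next_key() for _ in range(3)]   # messages 0, 1, 2
late_device = ToyGroupRatchet(*sender.share()) # joins at index 3
index3, key3 = sender.next_key()
```

In this toy, a device that was given the state at index 0 can derive every later key, while one given the state at index 3 cannot go backwards. Which index a new device is handed is the kind of knob that lets a room trade forward secrecy against replaying history.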
So, when somebody joins the room, is the question there, and this is where it becomes not easy, and there's a whole bunch of guides on the whole problem of what you do when somebody adds a device into a room, because it's not necessarily a person joining a room; they could just be adding a new device in. And, well, how about I just show you. So let's go back to Riot, and this is the right one. I go into our big test room, which is called Megolm test. This one has got 127 users in it right now and I haven't actually validated any of their identities. You can see it working pretty well. I can go and receive all of these messages. If I scroll back far enough you'll probably see some of the bugs which we've been chasing in the last couple of months, but so far it's just fairly inane test chat. Ironically it's working perfectly now. Great. Things never break when you want them to. And there's Megolm test, which is working there. Now if I went and sent a message in here. Brilliant example. So, room contains unknown devices. Since I last sent a message in here, somebody's added a device, which makes sense because there are over a thousand devices in the room, and so I get this dialogue popping up saying this room contains unknown devices which have not been verified. No guarantee that they actually belong to the users. You might want to go and verify them. So I've got one from myself where I've logged in in an incognito tab, and you can see the names of the devices here are the URLs and the browsers and the OSes being used. One of the things we're missing now is a good UX for verification. So if I say, oh, hang on a second, I wonder whether that is Simon. If only he was in the room and I could ask him in person to compare his public fingerprint keys. Now at the moment, depressingly, if I hit the verify button, that's precisely what happens. 
We do have this disclaimer at the bottom saying in future this will be much more sophisticated, but right now I would say, hey Simon, you know your device PSI HDDL QEP? Don't suppose the public key fingerprint is capital O, capital M, capital U and so on and so forth? Obviously we should be doing this with a mnemonic. We should be using a QR code. We should be hashing it down to a smaller amount or whatever. We just haven't done it yet. This dialogue itself, well, at least the one we were on before, is relatively new as of yesterday. So that gives you an idea of where we are in the rollout. On the other hand, if the verification failed and he said, hang on, I didn't add that device at all, then the lovely thing I can do is literally just hit the blacklist, not that hard. Hit the blacklist button, at which point this is now telling my client never to set up a one-to-one session with his device, and that device will never receive the message. You can do this on a per-room basis, which is kind of fun because it finally solves the problem that if you're talking to somebody and they have an iPhone and an iPad and a PC they left logged in at a cyber café, even if they're an idiot with their information security, you have the option to say, hang on, I just want to send this to their phone, or I just want to send it to the iPad on the sofa that the kids can see, or whatever it might happen to be. So let me un-blacklist them. Yes. Oh yeah. 
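The effect of the blacklist button can be pictured as a filter applied before any one-to-one session is set up. The Python sketch below is only a picture of that rule, not Riot's code: the function name, the user IDs and the device IDs are all made up for illustration.

```python
def devices_to_encrypt_for(room_devices, blacklisted):
    """Return the (user, device) pairs the sender will share keys with.

    room_devices: dict mapping user ID -> list of that user's device IDs.
    blacklisted:  set of (user ID, device ID) pairs kept locally by the sender.
    """
    targets = []
    for user_id, device_ids in sorted(room_devices.items()):
        for device_id in device_ids:
            # A blacklisted device never gets a session, so it never gets keys.
            if (user_id, device_id) not in blacklisted:
                targets.append((user_id, device_id))
    return targets


# Hypothetical room membership and blacklist.
room_devices = {
    "@simon:example.org": ["PHONE", "IPAD", "CYBERCAFE_PC"],
    "@matthew:example.org": ["LAPTOP"],
}
blacklisted = {("@simon:example.org", "CYBERCAFE_PC")}
targets = devices_to_encrypt_for(room_devices, blacklisted)
print(targets)
```

Keeping the blacklist as local data on the sending device, rather than fetching it from a server, matches the concern raised in the talk about trusting a server to synchronize that list.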
So if I then go and look at Simon here in the members list, I can see the same list of devices that he has here, and I can literally just go through saying: I can't talk to you ever, ever again. Luckily, at the moment this is just personal data, and in fact it's even done on a per-device basis, which is something we're struggling a bit with, because obviously you can have some interesting security issues if you trust a server to be synchronizing the blacklist information. So at the moment we're just doing it per device. And if we actually look at the messages... and also, critically, it queued my message, so even after all that it needs to be damn sure that this is actually the message that I want to send to the room. So it gives me this chance where I can hit cancel or I can hit send, and if I hit send then it will go through setting up new sessions with everybody and send it out. I'll cancel it for now, but let's look at reality gaps one. The decrypted source is not very exciting; it's literally just going to be "Megolm test" from Janssen. But if we look at the original stuff, it's this beautiful thing which has just got the public key of the device he was sending from; we've got the ciphertext, which is a Base64-encoded, AES-encrypted chunk of the JSON, keyed with the ratchet state that matches the session ID, which is the public key of his particular ratchet that he's sending to me. So basically it's there, and it works, and it scales up. Let's go back to the slides and see where we go from here. So we're using elliptic curve Curve25519 keys. Key pairs get generated at login time. Obviously the private ones get stored on the device. This is a bit of a problem on the web, where there isn't a good place to store your keys, so at the moment we store them in local storage, which is obviously not ideal if somebody can do an XSS attack. Another question? Where is the history stored? Can you access the history offline, or is it on a server?
Yeah, so if you're joining a room... and the question there was can you access history offline, and yes, you can absolutely access it offline. At the moment Riot does stuff offline if it loaded it from the server when it was online, but we're just changing it to store it properly locally using IndexedDB, and you store all of the session keys in your local storage so that you can then decrypt afterwards. And if you add a new device into the room, at the moment you have to export the keys and import them onto the new device, but in the relatively near future you'll also be able to automatically share the keys over... Another quick question? So the question is how long do we persist messages on the Matrix nodes, and the answer is it's up to the guy who runs the node. If I'm running my home server on a Raspberry Pi or on my phone, I might keep a couple of tens of megabytes of messages, but if you're running matrix.org, we're currently up to 750 gigabytes of messages. And you can throw away old history, and that's okay, because it will just get backfilled and re-synchronized if you try to scroll back far enough. So as long as one of the servers for one of the participants in the room has the old history, you're fine. So, going back to the key management: public keys are published on your home server, and verified by comparing public fingerprints, as we just showed, with big bold text like this older UX. Attachments took a lot of work, because decrypting them client-side with the correct content security policy is a nightmare, but we got it to work. They're using AES-CTR, but with an integrity hash, and they get a new key for every attachment that's sent. Olm itself we kind of already talked about; it's on matrix.org.
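Going back to the attachment handling just described (a fresh key per attachment, counter-mode encryption, and an integrity hash), the shape of it can be sketched with nothing but the Python standard library. This is a toy under loud assumptions, not the real implementation: the standard library has no AES, so SHA-256 stands in for the AES block function inside `keystream_xor`; the choice of SHA-256 over the ciphertext for the integrity hash is also an assumption; and the function names and dictionary fields are hypothetical. Do not use it to protect real data.

```python
import hashlib
import hmac
import secrets


def keystream_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Counter-mode XOR. SHA-256 is a stand-in for AES here (NOT real AES-CTR)."""
    out = bytearray()
    for counter in range((len(data) + 31) // 32):
        # One 32-byte keystream block per counter value.
        block = hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        chunk = data[counter * 32:(counter + 1) * 32]
        out.extend(a ^ b for a, b in zip(chunk, block))
    return bytes(out)


def encrypt_attachment(plaintext: bytes) -> dict:
    """Encrypt with a brand-new random key and record a hash of the ciphertext."""
    key = secrets.token_bytes(32)  # new key for every attachment
    iv = secrets.token_bytes(16)
    ciphertext = keystream_xor(key, iv, plaintext)
    return {
        "key": key,
        "iv": iv,
        "ciphertext": ciphertext,
        "sha256": hashlib.sha256(ciphertext).digest(),
    }


def decrypt_attachment(info: dict) -> bytes:
    """Check the integrity hash first, and only then decrypt."""
    actual = hashlib.sha256(info["ciphertext"]).digest()
    if not hmac.compare_digest(actual, info["sha256"]):
        raise ValueError("attachment failed its integrity check")
    return keystream_xor(info["key"], info["iv"], info["ciphertext"])
```

The integrity hash matters because counter mode on its own is malleable: flipping a bit of the ciphertext flips the same bit of the plaintext, so the receiver needs a way to notice tampering before trusting the decrypted file.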
We have a formal spec for Olm, which is very important for many people; the official double ratchet from Open Whisper Systems didn't have an official spec, although they have a really good one now. And this is used for one-to-one communication. We chose it because it's kind of the industry standard now, and we wanted to avoid ruling out compatibility in future with WhatsApp. And a ratchet obviously generates a non-reversible series of keys for encrypting stuff. Back in February of last year, literally this time last year, we were encrypting each message for every recipient device, so it's an n-squared problem, and there was no way to share history. It was a good proof of concept, but it sucked. I mean, it's how iMessage does it today still, and it's how many other kinds of group messengers try to do it, and it just doesn't scale beyond 5 or 10 devices. There's a picture of a double ratchet, which we don't have time to go through. Megolm, however, as I said, is a totally new ratchet. Again, it has a formal spec up on the Matrix spec, and we go and maintain a session per sender for each recipient, called an outbound session. And the big novelty, actually, from the cryptography on Megolm is that you can fast-forward existing ratchets before you share them. So if I want to seal the history of the room, I can throw away my ratchets and start over again: I can go and re-set up one-to-ones with everybody in the room, and then I can start encrypting over that. However, that's going to take ages, and if you do it all the time we're just going to be doing nothing but key exchange. So a fun thing you can do is that if a new device or a new person joins the room, you can take the current one and fast-forward it rapidly so they can join at the right point in time, and they can't go backwards. That shouldn't be there, so ignore that. libolm's architecture diagram is basically this: you've got crypto primitives using AES, SHA-256, Curve25519; you've got ratchets; you've got session management and account keys; then you've got the Megolm ratchet; and then you have a C API over the
top. It's about 100k of 64-bit x86 and it's about 200k of asm.js, and we transpile it via Emscripten into JavaScript for using it on the web, and we use it natively on iOS and via JNI on Android. Security assessment: so this is a big thing. We got libolm, the library itself with the ratchets, assessed by NCC Group in September of last year, and this is critical because we've done the schoolboy error of implementing our own crypto, so it was absolutely critical that we had a proper, professional, independent audit. We even got it released to the public, unlike many of the other ones out there, so if you're interested in seeing all of the interesting problems that we had and all the things they found, they're in the NCC Group public assessment of libolm. At the Olm level there were two low-risk and one informational findings, which was pretty good. Megolm, predictably, was a bit more controversial, with one high, which is an unknown key share attack, one medium, and four low-risk. An interesting problem we had is that three of the findings they found were actually features, in that they said: hang on a minute, you can use this to break PFS. One of the cool things about it is that if you want to turn off PFS, you can; they said, well, technically that's a vulnerability. So three of the issues are vulnerabilities of that form. We fixed everything, obviously, and since the audit we haven't found any more issues. So that was a really, really fun thing, and I highly recommend working with NCC Group if you want to do something like that for your own crypto. We've already demoed it. However, we've had a couple of problems along the way. The big one is that we spent ages obsessing about the ratchet implementation and getting it audited and making sure it didn't suck, and if anything we probably focused on it too much, because it turns out that the problem of what happens when a device joins a room, and all of the reliable and efficient synchronizing of the Megolm state over a federated system particularly
like Matrix is really non-trivial. It turns out that it's about two or three times more code than the actual ratchet, and in some ways it's a lot more fiddly in order to actually get things secure and right. So you have to know precisely what devices are in the room, you need to ensure that your ratchet has been shared with them, and it turns out the scope for races here is just spectacular. I mean, even that sentence itself is basically racy, in terms of at what point you decide what devices are in a room and you send a message, and meanwhile some guy goes and adds a new device whilst you're sending it; they're never going to be able to decrypt it, and they're going to send you hate mail saying, well, I can't decrypt that message. So honestly, the last two or three months have been spent chasing around these, and it's one of the nastiest bugs I've ever seen, because the symptoms are always the same: somebody can't decrypt a message and it says "unknown inbound session ID". And so we get bug reports at a rate of about one every half an hour, with somebody saying, oh, I got the bug. Of course, there are about 20 different races of different incarnations and flavors that can result in that state not being shared correctly. On the plus side, we've hopefully identified them all and we've almost fixed all of them. We did a release yesterday which we hoped would finish up the remaining ones, and predictably enough we got a key share failure, but we're very, very close to wrapping it up, at which point we'll get it audited and we'll be able to take it out of beta. Another interesting problem here is that we went and coupled the implementation to the client SDK. Necessarily, the key exchange stuff is talking to your server to say: hey, here are my keys; what keys are there for the devices in the room? And this was a bit of a disaster, because we implemented it in the client-side code, which means that you have three completely separate implementations, in JavaScript, Objective-C and Java, for precisely the same
logic. So not only do we have some of these nasty architectural race problems to deal with, we're also constantly porting the bug fixes to the three platforms. So there's definitely some kind of learning there, and I'm not entirely sure what it is, because we could push it into a native library like the ratchet itself, but then we're going to have to expose all of these different APIs back and forth to talk to the server, and I'm not convinced that's going to be any nicer than having to implement it three different times. Who knows. Design problems: we also have an interesting nagging concern that Megolm may be over-engineered, in that you end up generating a lot of session keys, and critically, if you want to do the offline replay or if you want to share it with new people, that means you have to keep all of these key sessions. So it's not a key per message, it's a key per N messages or per session, but still, I'm up to about three megabytes of key data in my local storage now on my Chrome, and that's over a couple of months, which doesn't feel very scalable. Plus that's in local storage, and you have a limit of only 10 to 20 megabytes of local storage anyway, so it's feeling a little bit unwieldy there. We're talking about whether we put them on the server, but that also feels a bit dodgy, in that you're taking key data for end-to-end encryption, putting a passphrase on it, and then putting it back on the server. It kind of works, but again, we need to think a lot more about that. There's also the question of whether the funky fast-forwarding is actually worthwhile, because we have so many sessions anyway; why not just go and create another one to invite somebody to a room? Basically, where we're at right now is to see how well it works and tune how often we create new sessions, and then perhaps we might go back and redesign it a bit. And obviously at the Matrix layer it's pluggable, so you can keep evolving the ratchet and the algorithm as required. So in terms of our goals: can we do a trade-off between privacy and
usability? Yes, we can in the protocol, but we haven't exposed that in the clients yet. We can do the encryption per device. We can do big rooms. We will encrypt non-public rooms by default once we're out of beta, and we will be supporting it on all Matrix clients, and the way we're thinking of doing that is to actually write an end-to-end proxy. So if you have one of the other random clients out there and you don't want to spend a couple of months adding end-to-end encryption to it, you can just route it through a proxy that you run on localhost in order to go and turn it into end-to-end. Finally, quickly, metadata privacy. Matrix does not protect metadata currently, so if you want to be private about who you talk to and when, ignoring the contents but just the metadata, then look at something like Ricochet or Vuvuzela. The reason for this is that protecting metadata is basically incompatible with bridging, and the entire architecture at the moment assumes that the servers can see the metadata. However, we have done a thought experiment for the future, and hopefully somebody will help us build this, where if you run the home server client-side, and you tunnel the traffic over something like Tor, and you use anonymous store-and-forward servers like Pond, then you might be able to protect that metadata. The kind of architecture looks like this: the blue home servers are now running on your green client, and the actual client app is the same. And this is really fun because the client can be identical to what you have today, so this is still Riot, it's still WeeChat, it's still a native app, whatever, but it's just happening to talk to a home server that's running client-side. And then if somebody wants to send a message, they would go and send it to a hidden service on Tor, which would act as a store-and-forward thing to send it out to another hidden service, at which point the metadata is going to be exposed only on the client, and in here is going to be one great big nice blob of anonymity. It's
sci-fi. If somebody wants to build one of those, please do. And the latest release, the one that we released today, which was actually yesterday (not that I wrote this yesterday): a warning when you get unknown devices, which we saw; the ability to blacklist; backup and import, which is a big deal; and also rageshake bug reporting, which will go and dump all of your JavaScript logs to us so we can work out why you got your "unknown inbound session ID". Mobile apps are catching up soon; they're a bit more buggy, but the crypto is still there. In practice it works fine as long as you don't have lots of people joining and leaving. What's next on Olm? We kind of touched on it already: the ability to share actual data with new devices; cross-signing, perhaps, to ease verification; better verification, full stop. Push notifications are a bit broken on end-to-end. We need better primitives and performance: we're not using WebCrypto yet, we're using Emscripten-compiled native primitives in JavaScript, which is a major security no-no. We obviously need to get it audited, we want to turn it on by default, and we want to have some kind of way of negotiating when you don't speak it at all. On Matrix: threading, bridges, tagging, ACLs, file management, and then decentralized identity and spam, which is an increasing problem, which I gave a talk about this morning; sorry if you didn't make it. It will be published, though, by FOSDEM, of course, with all the other ones. We need help, really. We need help from everybody here. Please run your own servers, please run gateways, please write gateways. Instead of yet another proprietary HTTP API for messaging, go and consider using Matrix or XMPP or anything rather than reinventing the wheel. And please follow us on Twitter and tell your friends and family. Thank you very much. I think I have one minute for any more questions. Oh no, I've got 10 minutes. Wow, bonus. Oh, okay, anybody got any other questions? Hi, very simple question. Thanks for the talk, it was great. Contact list management? So the question
there is contact lists, as in Pidgin and all that, a roster kind of thing, something that a lot of people feel is missing. We have it in the protocol: we have the concept of presence lists, which allow you to subscribe to the presence of other people. Literally none of the clients actually implement it, because in practice people haven't felt much of an urge for it, but it's something that we will probably add on the mobile versions of Riot, where it makes more sense, because you can steal the address book off the phone and use that as the roster. Are the slides available? Yes, I'll be publishing the slides. I'll put them into Matrix HQ in a few minutes, and we'll put them up on the blog in the near future. Any other questions? We have one there, oh, and one there. I've seen that in Dendrite, the new implementation, there is Kafka so far; is that a new dependency, that we need Java to run a Matrix server in the future? So the question there was, will Dendrite be dependent on Kafka, and the answer is absolutely not. We're using Kafka at the moment to prove the event bus, but all of the APIs and implementation deliberately call it Kafka-esque, because in the simplest case you just use Go channels within Go itself in order to stitch it together, and if you don't want to use Kafka and you want to use a Go implementation of Kafka, then that's fine too. It's just very much a pluggable message bus module. You mentioned a server-side search feature; how is that compatible with end-to-end encryption?
So the question is on end-to-end search, sorry, on server-side search: what do you do about end-to-end? There are two options. You can do homomorphic encryption, to do things on the server side despite it being encrypted, which we do not do; that's a whole different level of complexity. And the other solution for search on end-to-end is that you just do it client-side, and so that's one of the reasons on Riot that we're ending up storing all of your history client-side, not only for offline support but also so you can just spider it locally, just like you're grepping your IRC logs. And time is up, I'm afraid.
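The client-side option from that last answer amounts to scanning history the client has already decrypted and stored locally, much like grepping a log file. A minimal sketch, assuming the local store can be read as a list of dictionaries; the `search_history` helper, the `sender` and `body` field names, and the sample messages are hypothetical, not Riot's storage format.

```python
def search_history(events, term):
    """Case-insensitive substring search over locally stored, decrypted events."""
    needle = term.lower()
    return [e for e in events if needle in e.get("body", "").lower()]


# Hypothetical local history, already decrypted on this device.
local_history = [
    {"sender": "@alice:example.org", "body": "Megolm test"},
    {"sender": "@bob:example.org", "body": "can anyone decrypt this?"},
    {"sender": "@alice:example.org", "body": "yes, decrypts fine here"},
]
hits = search_history(local_history, "DECRYPT")
```

Because the search runs entirely on the device, the server never sees either the query or the plaintext, at the cost of only being able to search history that this client has actually stored.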