Okay, first of all, hello everybody, can you hear me? Apologies, I'm losing my voice, although it's slightly better than yesterday, and I also have no microphone other than this one, which is just recording. Apparently there are no monitors in the room, so I will have to project from the diaphragm and hopefully not lose my voice whilst doing so. So, we've got 20 minutes to play with, and I want to talk very slowly about breaking the 100 bits per second barrier with Matrix.

Does anybody not know what Matrix is? One person, two people, three people. Okay, very, very quickly, I'm going to skip through eight slides to try to explain Matrix. It's an open protocol and open network for secure, decentralized, real-time comms. You can use it for chat, VoIP, VR, IoT. The whole idea is to create a global, open, decentralized network that provides an open platform for real-time communications. In practice, you have a whole bunch of different networks out there. It could be a proprietary closed system like Slack or Discord or Telegram. It could be an open federation like XMPP. It could be a closed federation, but an open standard, like IRC. And Matrix can sit in the middle as a big decentralized network of servers. You have native Matrix clients hanging off it, you have bridges through to other protocols like XMPP and Slack, and it provides a kind of lowest-common-denominator pub-sub layer for the internet that can be used to glue these things together.

Relative to, say, XMPP, we are ideologically and technically very different. Matrix is all about syncing conversation history, replicating it between servers rather than passing messages. So in Matrix, if I'm talking to you guys on your own Matrix servers, everybody gets a replica of the room. It's just like Git; in fact, it's the same data structures as Git. It's the difference between a slightly centralized model, where the conversation happens on a single server, and Matrix, where it gets replicated over all the participating servers. So you end up with no single party owning the conversations; conversations are shared over all participants.

Architecturally, you get clients, you get a mesh of servers, and you get application services, which are clients on steroids. You also have identity servers, which map email addresses and phone numbers through to Matrix IDs. In terms of the ecosystem today, we've got stacks on the web and on iOS. We have two stacks on Android: a Java one, and a brand new one we've just written in Kotlin which is looking really, really good; it performs about 10 times better than the Java one and uses Rx to glue the application together on top. We provide SDKs wrapping the actual API level, then higher-level UI components, and then apps on top, such as Riot, which is the flagship client we have on top of Matrix. We also have lots of clients from the community: desktop clients in Qt and C++ and Rust; nheko is Qt and C++ as well. On the server side, we provide a Python server and an in-development Go server, plus lots of bridges, and there are also lots and lots of bridges and bots and a couple of servers from the community.

The main deliverable, though, is an API: the Matrix spec, which is a big, big monolithic document with loads of HTTP endpoints in it, and by default we use HTTP and JSON. So that's my whistle-stop four-or-five-minute tour of Matrix. Let's talk about the good stuff: low bandwidth Matrix.
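As a concrete taste of that HTTP-and-JSON client-server API, here is a minimal sketch in Go of sending a message to a room; the homeserver URL, room ID and access token are hypothetical placeholders you would substitute with your own.

```go
// Minimal sketch: sending a Matrix message over the plain HTTP+JSON
// client-server API (the same thing the curl example below does).
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"net/url"
)

func main() {
	// Placeholder values; substitute your own homeserver, room and token.
	homeserver := "https://matrix.example.org"
	roomID := "!someroom:example.org"
	accessToken := "MDAxabc..."
	txnID := "m1" // client-chosen transaction ID, used for idempotency

	// PUT /_matrix/client/r0/rooms/{roomId}/send/m.room.message/{txnId}
	endpoint := fmt.Sprintf("%s/_matrix/client/r0/rooms/%s/send/m.room.message/%s",
		homeserver, url.PathEscape(roomID), txnID)

	body := []byte(`{"msgtype":"m.text","body":"test"}`)
	req, err := http.NewRequest(http.MethodPut, endpoint, bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+accessToken)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status) // a 200 response body contains the event_id
}
```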
We wanted to see how low we could go, because historically Matrix has been pretty crap at low bandwidth. We chose HTTP for being the most pragmatic and ubiquitous thing: anybody with curl or a web browser who has ever used a web API can do an HTTP PUT to send a message and an HTTP GET to receive a message, and that's it, that's all you have to do in Matrix. However, it's awful in terms of bandwidth. So we wanted to see whether we could get down to 100 bits a second. Now, the way to think of 100 bits a second is that it takes two minutes, 120 of your Earth seconds, to send one Ethernet packet at the typical 1500-byte MTU. That's a long time to send a packet of data.

So you might be thinking, why the hell are you bothering to do this? Why waste your time going to this ridiculous level? And honestly, there are situations where connectivity can be really, really bad, particularly, ironically, when it really counts, in life-and-death situations. If you've gone and fallen down a mountain into a crevasse and your phone is literally bouncing its signal off the valley walls trying to hit a random base station somewhere, wherever you are, you really want every bit to count. You don't want to keep renegotiating from scratch every time. That has applications in emergency services, in personal area networks if you're on a mesh network without any internet, and obviously uses in government and sort of defense and military contexts as well.

So we looked at HTTP/1.1 and TLS, which is what we do by default today. Anybody, as I said, can just do a curl; that is literally the curl to send a message to a room in Matrix today. And I did that curl last night and it took 7,000 bytes, including Ethernet headers, and eight round trips in order to send the message "test". So a four-byte payload ends up at 7K by the time you've done the TLS handshake, the certificate check, et cetera. At 100 bits a second, that is 10 minutes to do a single HTTP request. So that's not going to work; obviously it could be so much better.

Obviously the next thing to do is to chuck --http2 at curl and see how speedy that is. And we save a whole 300 bytes, because so much of this is still the TLS handshake. Next up, we thought, wow, HTTP/3 is the next big thing; it had just come out that HTTP over QUIC would be referred to as HTTP/3. So now we've killed the TLS and TCP handshakes, we're on UDP, we're using TLS 1.3, and we can do sexy things like zero round trip time setup. We still have to do a TLS handshake, and really surprisingly, it turns out that QUIC currently requires you to stuff padding into the beginning of your handshake in order to mitigate amplification attacks. They actually insist on you stuffing the first couple of packets up to 1,200 bytes just in order to prevent whatever UDP abuse vectors are out there. The good news is that once you have the channel set up, it's only about 983 bytes to send again, although that's still minutes to send a message. But for the setup, we're still up at six or seven thousand bytes. I mean, honestly, HTTP/3 is designed to be at complete parity with normal HTTP, as you'd expect, and they're not optimizing for bit rate over everything else, as we would have to do for this sort of scenario.

So then we started looking at CoAP. Who knows what CoAP is? About a third of the people here. CoAP isn't that well known outside of the IoT space. It was written around 2014.
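For what it's worth, the arithmetic behind those transmission times is straightforward; here is a tiny Go sketch of it, using the byte counts quoted above (which include Ethernet headers).

```go
// Back-of-the-envelope arithmetic: how long each measured on-the-wire size
// takes to transmit at 100 bits per second.
package main

import "fmt"

func main() {
	const bitsPerSecond = 100.0
	cases := []struct {
		name  string
		bytes int
	}{
		{"one Ethernet packet (1500 byte MTU)", 1500},
		{"curl, HTTP/1.1 + TLS (~7000 bytes)", 7000},
		{"curl --http2 (~6700 bytes)", 6700},
		{"HTTP/3, established connection (~983 bytes)", 983},
	}
	for _, c := range cases {
		seconds := float64(c.bytes*8) / bitsPerSecond
		fmt.Printf("%-45s %6.0f s\n", c.name, seconds)
	}
}
```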
CoAP is an RFC, RFC 7252, and it is designed for very, very bit-efficient transport of RESTful APIs over UDP. It looks and smells a lot like HTTP, except it doesn't actually have total parity with HTTP. For instance, clients and servers can talk bidirectionally: the server can do a UDP hit to you as a client just as much as you as a client can do the same RPC hit to the server. So you don't need WebSockets or anything, because you have this bidirectionality. In terms of the bit-packing, we're talking a base header of about four bytes, plus some sequence numbers, plus some message IDs, which can be as small as you like. It's designed for these very, very small constrained networks and devices. Interestingly, their design assumption is that each request and each response fits in a single packet. From their perspective, you're doing something wrong if your packets are bigger than 1,500 bytes on Ethernet, because you have to start jumping through hoops using what they call blockwise mode in order to grudgingly split your payload across multiple Ethernet packets. So that's the mentality they're going in with. The good news is that it means a single request-response is typically one round trip: one packet for the request, one packet for the response. That's typically about 300 bytes to do the send and about 200 bytes for the response, so roughly 500 in total. So now we're down to only 40 seconds to send a message. And in a life-and-death situation, I think that's starting to sound quite plausible as a way you might be able to have a conversation with an emergency responder. So we're getting onto something here, ironically, by ditching the cool new stuff of QUIC and going back five years to CoAP.

So, encryption. Obviously, we want to have transport layer encryption on this. We don't want to cheat and say, oh, look how small this is, if we don't have any encryption at all. CoAP itself recommends using DTLS, very similar to WebRTC, and in fact back in 2014 they mandated DTLS 1.0 with pre-shared key encryption, so you can avoid having to do the certificate check as long as everybody has already entered some random key on both sides of the conversation. Interestingly, there's a comparison that some nice chap did across all the different dialects of TLS you can use with CoAP, and they reckon the overhead per request can be as low as 15 or 16 bytes, which is pretty good. However, much as we were just hearing about with WebRTC, there are relatively few stacks out there that support DTLS, especially in Rust and Go, and we were using Go to prototype this new low bandwidth stuff. In fact, I think there are about three options out there. One of them comes from the Pion project, which was mentioned in the last slides of the previous talk as a WebRTC stack, except it targets precisely the dialect of DTLS you want for WebRTC, which I think is DTLS 1.2 without PSK. Meanwhile, somebody else has done one specifically for PSK, but it hasn't been integrated at all with a CoAP stack. And also, PSKs are all very well, but they're a bit of a pain in the ass, because everybody has to go around ahead of time to make sure they've synchronized their watches and are using the same secret PSK.

So we decided to look at Noise instead. Who knows what Noise is? Okay, about six people. So the Noise protocol is a really, really interesting thing.
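To go back to that four-byte CoAP base header for a second: it really is just four fixed bytes per RFC 7252. Here is a sketch of packing it by hand, just to show how little fixed overhead there is; the token, options and payload would follow these four bytes.

```go
// Sketch of the 4-byte CoAP base header from RFC 7252, packed by hand.
package main

import "fmt"

// encodeCoAPHeader packs version, message type, token length, code and
// message ID into the fixed 4-byte CoAP header.
func encodeCoAPHeader(msgType, tokenLen, code uint8, messageID uint16) [4]byte {
	var h [4]byte
	const version = 1
	h[0] = version<<6 | msgType<<4 | tokenLen&0x0F // Ver(2) | Type(2) | TKL(4)
	h[1] = code                                    // request/response code, e.g. 0.02 = POST
	h[2] = byte(messageID >> 8)                    // Message ID, big-endian
	h[3] = byte(messageID)
	return h
}

func main() {
	// A confirmable (CON = 0) POST (code 0.02) with no token and message ID 0x1234.
	h := encodeCoAPHeader(0, 0, 0x02, 0x1234)
	fmt.Printf("% x\n", h) // prints: 40 02 12 34
}
```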
Noise is written by Trevor Perrin, one of the guys behind the double ratchet that Signal uses for encryption. And Noise is almost a meta-protocol: it's building blocks for creating protocols, and it gives you about 11 or 12 different handshakes you can use if you're trying to build the lightest weight, most targeted encryption imaginable. The two handshakes most people end up using are called XX and IK. XX is this guy here: it's a three-step handshake where you send an ephemeral key, you get back another ephemeral key from the other side, and then various combinations of elliptic curve Diffie-Hellman results. Each of these tokens is 32 bytes long, and it allows you to establish a public key for the party you're talking to which is mutually authenticated. Once you have that, you can encrypt to that public key, so you can chuck a payload in and you have a one round trip time setup. But the really nice thing is that you can cache the secret, and next time you want to set up a connection to the same person, as long as you still have their public key, you can do a zero round trip time setup, which is the IK handshake. So theoretically this is basically the most lightweight encryption handshake, and the subsequent overhead is 16 bytes of AEAD authentication tag in order to do encryption.

The bad news, though, is that you are effectively rolling your own transport layer encryption at this point. The combination of XX and IK is kind of standardized by Noise, and they call it a Noise Pipe. But what it doesn't give you is any framing. It doesn't give you any sequence numbers. It isn't resilient to packet loss or reordering. So as a developer, you have to handle all of that yourself. The good news is that it is so light, it's amazing; the bad news is that you feel slightly like you're re-implementing TLS, which is probably a bad thing at some level. But in the interest of seeing how low we could go, we went ahead with it anyway.

And in fact, one of the things we've done here is that you need to send these handshake packets, and the payload is obviously CoAP, but you need to somehow frame the handshake. So what we did was to hoist the CoAP headers out of the encrypted CoAP payload. They're not particularly sensitive: it's a sequence number, a unique ID and the type of request, and that's it, so you don't leak that much metadata, and it's probably not a disaster if you don't encrypt them. So we took those bytes and hoisted them up outside of the Noise payload, so you have unencrypted CoAP headers and then a Noise-encrypted payload beneath them. And then we used the same trick to cheat with the handshakes and make them look like CoAP headers: we gave them CoAP IDs from unallocated CoAP space, so you kind of multiplex the encryption in at the CoAP level.

I'm going to speed up a bit. Payload: JSON is about 35 bytes for that. Switch to CBOR binary and you get it down to about 75% of the size; it scales like that all the way through, but it's not amazing. So we also want to improve the payload: we want to map each URI to a tiny route, and we want to reduce the size of the IDs. Manually mapping these is crap, so what we did instead is to run everything through a deflate library using a shared dictionary. This is great for saving bandwidth; it's a disaster for a protocol. You're basically saying, hey, to talk low bandwidth Matrix, everybody has to implement the same deflate library using the same deflate dictionary, and you can imagine how fragile that is.
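Here is a minimal sketch of that shared-dictionary deflate trick using Go's standard compress/flate. The dictionary contents here are made up for illustration; the point is that both ends have to ship exactly the same dictionary, which is what makes the scheme fragile.

```go
// Sketch: deflate compression with a preset shared dictionary.
package main

import (
	"bytes"
	"compress/flate"
	"fmt"
	"io"
)

func main() {
	// Strings that occur in almost every Matrix event, seeded into the
	// dictionary (illustrative contents only).
	dict := []byte(`{"msgtype":"m.text","body":"` + `"room_id":"` + `"event_id":"`)
	payload := []byte(`{"msgtype":"m.text","body":"hello world"}`)

	// Compress with the shared dictionary.
	var buf bytes.Buffer
	w, err := flate.NewWriterDict(&buf, flate.BestCompression, dict)
	if err != nil {
		panic(err)
	}
	w.Write(payload)
	w.Close()
	fmt.Printf("plain: %d bytes, deflated: %d bytes\n", len(payload), buf.Len())

	// Decompress: the reader must be primed with the identical dictionary.
	r := flate.NewReaderDict(&buf, dict)
	out, _ := io.ReadAll(r)
	fmt.Printf("round-trip ok: %v\n", bytes.Equal(out, payload))
}
```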
The upshot is that we can get down to 90 bytes, plus 16 bytes of crypto overhead. So that's eight seconds to send a message at 100 bits a second. So, thank you. Architecturally, you have a client that talks to a little Go CoAP proxy, so the Matrix clients and servers are talking normal HTTPS, with no changes really at all other than squidging down the size of some of the IDs. We go from 7 kilobytes of JSON, HTTPS, et cetera, to about 90 or 100 bytes of CoAP, CBOR, deflate, Noise, UDP and IP.

So quickly, before I run out of time, let's actually see this thing happening. What I have here is a simulator, hopefully somewhere... where's it gone? There it is. It's called mesh-sim, and let me run it. There we go. What this does is to fire up a Docker container with that entire stack, and if I go to a little web interface here, you can physically draw a network. Every time I click here, in the background it's firing up a Docker instance and linking these Matrix servers into a genuine mesh. If I shell into one of these, something like synapse0, then bash... I don't know, I can't remember how to use Docker. There we go. And if I start pinging one of these guys, you can see it's 270 ms of latency. I can drag this around and it updates netem to increase it, up to 360 now, and you can see this is between 0 and 1, so a round trip time of about 400 or whatever. Let's add in a bunch more to make it a bit more fun. So we're building up the mesh; it links a maximum of four servers at any given point. If we do docker stats here, you can see it spinning them all up; each one is using about 100 megs, a lot of which is infrastructure on the outside, and the Synapses themselves are about 70 megs.

So there we go, an interesting network. Let's do a quick traceroute just to show you what's going on. Where should we go? Let's go from 0 to 10, that looks like a good one. Inside the thing: traceroute. My keyboard is broken, by the way. The magic of the Mac. That's not very exciting; it went straight from 9 to 10, so it's going 0, 9, 10, which kind of makes sense. In terms of the routing algorithm, we're cheating a little bit by inserting the routes based on the topology of the network here, rather than running OSPF or whatever you might use.

So let's log into one of these quickly. I've got a Riot running locally, and I could talk to it directly over HTTP, but I'm going to talk to the local proxy, which is going to convert it down into CoAP. I should probably run the local proxy, which is there. There we go. So now I can log in, and here I am, now talking over that network. And that is what's going on over the wire: you can see these 50, 55 byte packets, that's the login sequence. If you've ever looked at the Matrix login sequence over HTTPS, it's kilobytes being chucked around the place. So let's quickly log in again on a new Riot; this time I'm going to talk HTTPS to it, because I'm only running one proxy, so I'm exposing it as HTTPS; they're all HTTP, in fact. There we go. In here, let me start a chat with Matthew on synapse0, as I'm logged into synapse1 there. And there we go, we've got the conversation coming in here, and I can say "hello world". Before I hit send on that, I'm going to clear my packet trace. And you can see that, first of all, we're not suppressing echo locally, so I get a bit of crap here from receiving the same message coming back. But actually sending the message is 104 bytes, including Ethernet headers, and then 48 bytes to acknowledge it.
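Going back to the proxy architecture at the start of this demo, here is a toy sketch of the shape of it: an unmodified Matrix client keeps speaking plain HTTP to a local listener, and the listener is where you would squash each request down onto the low bandwidth transport. The sendLowBandwidth function is a hypothetical stand-in for the real CoAP/CBOR/deflate/Noise pipeline, not the actual coap-proxy code.

```go
// Toy sketch of the local proxy idea: the Matrix client points at
// http://localhost:8008 as its homeserver and never knows about CoAP.
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

// sendLowBandwidth is a hypothetical placeholder for re-encoding the request
// (CBOR, shared-dictionary deflate, CoAP framing, Noise encryption) and
// sending it over UDP to the far-end proxy.
func sendLowBandwidth(method, path string, body []byte) ([]byte, error) {
	log.Printf("would send %s %s (%d byte body) over the low bandwidth link", method, path, len(body))
	return []byte(`{}`), nil
}

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		body, _ := io.ReadAll(r.Body)
		resp, err := sendLowBandwidth(r.Method, r.URL.Path, body)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadGateway)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, string(resp))
	})
	log.Fatal(http.ListenAndServe("localhost:8008", nil))
}
```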
And the really fun thing we have here is that if we... thank you. If I shrink this down to size a bit and I send that message again, we actually have a visualization here, going through the mesh simulator, showing the flow of the packets in real time, based on the servers talking back to the simulator to report what's happening. So the final quick thing we'll do is to create a room here and then make all of these servers join it. We have about 10 servers, I think; this goes up to about 200 servers on my Mac, but I haven't got time to play with that. What we're doing is having all of the servers join the same room. In fact, if we look right here, the room is now sitting right there (interesting UI bug right there), and you can see lots of people joining: Matthew on synapse0, 1, 2, 3, 4, 5, 6, 9, 10. That's great. Now, if I send a message within this room, like that, you can see it propagating from the server and out through the mesh.

Now, what you might not have noticed is, if I send a message... in fact, what I'm going to do is quickly slow down time. I'm going to scale the latency up to 500 times normal... sorry, five times slower than normal, and I need to also increase the maximum latency so that the mesh doesn't collapse. So now we've got something like 700 ms of latency. If I do the same thing again and send a message, this is how Matrix works today: you've got a full-mesh propagation of the messages. But what we also did is to implement a binary spanning tree over this set of servers. So if I send another message, you can see at any given point it's propagating through, slowly fanning out across the network. And it turns out to be only about 50% or so slower than doing full mesh.

So, I've run out of time. There you go: low bandwidth Matrix, coming to you. You'll find us on the first floor of the K building. I mean, as I said at the beginning, there are emergency services, any kind of personal area networks, mesh networks, people using radio and sort of half-duplex radio systems, I guess military use cases; imagine submarines underneath polar ice caps. Okay, cool. Thank you, everybody.
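To make the full mesh versus spanning tree comparison concrete, here is a tiny illustrative sketch of the fan-out trade-off; the numbers are indicative only, not measurements from the demo.

```go
// Sketch contrasting the two fan-out strategies from the demo: full mesh
// (today's Matrix: the origin sends to every other server) versus a binary
// spanning tree (each server forwards to at most two children).
package main

import "fmt"

func main() {
	const servers = 10

	// Full mesh: the origin sends n-1 copies directly, one hop to everyone.
	fullMeshSends := servers - 1

	// Binary spanning tree: at most 2 sends per server, at the cost of
	// roughly log2(n) hops of propagation delay across the room.
	depth := 0
	for covered := 1; covered < servers; covered = covered*2 + 1 {
		depth++
	}

	fmt.Printf("full mesh:     %d sends from the origin, 1 hop\n", fullMeshSends)
	fmt.Printf("spanning tree: <=2 sends per server, ~%d hops deep\n", depth)
}
```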