Good morning, everyone. Thanks for showing up in such great numbers; that's always a good thing for such an early session. First, let me start with a story. Last night I had a weird encounter with a locked door. Through the fate we endured during this week, we were out of our apartment, and the hotel owner let us stay in their office. But the guy who stayed there had put the deadlock on. So we tried to reach him. How do we reach him? We thought maybe he has some messaging, maybe he has a mobile number. No, a landline. They have a landline. It turned out that the guy was not at the landline, so we looked around in the bar. This wouldn't have happened if he had mobile messaging. If we could just have texted him, "Hey, we are at the hotel, please open the door," we would have had one more hour of sleep tonight. So let's dive in with the talk of today. This morning session starts with our speakers, Roland Schilling and Frieda Steinmetz. They will first give you a gentle introduction into mobile messaging. I have nine messaging apps on my phone, no, ten; the organizers forced me to install another messaging app. After that, they will give you a quick analysis, or not so quick, I don't know, a deep analysis of the Threema protocol. So let's give another round of applause for our speakers. Thank you, Tilo. I am Roland. This is Frieda. As Tilo already introduced us, we are going to talk about secure messaging. More specifically, we are trying to give a very broad introduction into the topic, because we want to make a field that is somewhat complex available to a broader audience: to leave our expert bubble and bring knowledge about technology that people use every day to the people who are using it.
To do that, we have to start at a very low level, which might mean for the security and crypto nerds in the room that you will see a lot of things you already know. But bear with us, please, since we are specifically trying, at least with the first part of the talk, to convey a few of the mechanisms that drive encrypted messaging to people who are new to the field. What we are going to try today is basically three things. We will outline privacy expectations for when we communicate; we are going to do that by sketching a communication scenario and identifying what expectations we can derive from it. We are going to look at an analogy that helps us map these expectations to mobile messaging. And then we are going to look at specific technical solutions that can make mobile messaging as secure as, and give us the same privacy guarantees as, a one-to-one talk, before, in the second part of the talk, we move on to look at a specific implementation. It's no secret anymore that this implementation is Threema. So let's dive right in. You are at a party, in a house full of people, and a friend approaches you wanting to have a private conversation. What do you do? Ideally, you would find a place at this party that is private, and in our scenario you find a room, maybe the bedroom of the host, where nobody else is. You enter the room and close the door behind you, meaning you are now private: you have a one-to-one session in this room, in private. And we are going to look at what that means. First of all, the most intuitive property is what we call confidentiality, and that means that since nobody else is in the room with you, you are absolutely sure that anything you say and anything your communication partner says, if you imagine Frieda and me having this conversation, can only be heard by the other person.
If that is guaranteed, we call it confidentiality, because nobody who is not intended to overhear any part of the conversation will be able to. The second claim that we make is that you know each other. If I have a talk with Frieda, I know her; I've known her for a long time, more than five years now. I know what her face looks like, I know her voice. I know that if I talk to her, I actually talk to her, meaning I know exactly who my communication partner is, and the same goes vice versa. If we can say, "I definitely know who I'm talking to; there is no chance that somebody else is posing as Frieda," we call this authenticity. Moving on: integrity. Integrity is where the analogy falls somewhat short, but basically: if I can make sure that everything I say reaches Frieda exactly the way I wanted to say it, and there is no messenger in between, I'm not telling a third friend, "Please tell Frieda something," who then alters the message because he remembered it wrong or has malicious intentions. If I can make sure that everything I say is received by Frieda exactly the way I said it, then we have integrity on our communication channel. Okay, the next two are a bit hard to grasp at first, so we are going to take a few minutes to look at them: forward and future secrecy. Suppose somebody entered the room while we had our talk, stayed for some portion of it, and then left the room again. At the point where they enter the room, they wouldn't learn anything about the conversation that we had before, which is intuitive in this scenario; that's why we chose it. They enter the room, and everything they can overhear is only the portion of the talk that takes place while they are in the room.
They don't learn anything about what we said before, meaning we have what we call forward secrecy; we'll get back to that. And after they left, they wouldn't be able to overhear anything more that we say; this is what we call future secrecy. Because those are a bit hard to understand, we have made a graphic, and we are going to come back to it when we translate this to messaging, so I'm going to take a minute to introduce it. We have a blue timeline that goes from left to right. On this timeline, a green bar denotes our secret conversation. The first pink bar is when the third person enters the room. At that point our secret conversation turns orange, because it's no longer secret; it's now overheard by the third person. And after they leave, they don't learn anything that is said afterwards. The left part, the fact that they can't hear anything into the past, is what we call forward secrecy, and if they can't learn anything after they left, we call it future secrecy. Okay. The last property we are going to talk about, since we are trying to keep things simple, is deniability. Since we are only two people in the room and there are no witnesses, we have deniability: after our talk, we return to the party and people ask us what happened. I can always point to Frieda, as you could to your friend, and say she said something. Frieda can always say, "No, I didn't," and it would be my word against hers. If our scenario allows for this, we have deniability, because each of us can always deny having said, or not having said, something. And now we are going to look at messaging. In messaging, a third player enters the room. This could be your telephone provider, if we talk about text messaging like the short messages we used to send in the 90s. It could be your messaging provider if you use something more sophisticated; it could be WhatsApp, for example.
It could be Apple, depending on what your favorite messenger is. But unless you use federated systems (some of you might think, "But I'm using Jabber"; I know, but we are looking at centralized systems right now), there will always be one third party that all messages go through, whether you want it or not, and whether you're aware of it or not. This brings us to our second analogy: postal services. Messaging feels like you have a private conversation with the other person, and I think everyone can relate to that: you have your phone, you are shown a conversation, and it looks like only you and this other person, in my case Frieda, are having this conversation. We feel like we have a private conversation, while actually our messages go through a service provider all the time. So we are now looking at something more akin to postal services: we prepare a message and send it off, our messaging provider takes the message, carries it to our intended recipient, and the recipient can then read it. This applies to all the messages we exchange. To underline that, we're going to look at what I initially called traditional messaging: text messaging, unencrypted SMS. As you may or may not be aware, these messages also go through our providers, more than one provider even. Say I'm with Vodafone and Frieda is with Verizon, I don't know. I would send my messages to Vodafone, they would forward them to Verizon, who would then deliver them to Frieda's phone. Since both of our providers would see all the messages, which are unencrypted, we would have no confidentiality. They could change the messages, and such things have actually happened, so we have no integrity: we don't know if the messages received are actually the ones that were sent.
We also have no authentication, because phone numbers are very weak for authenticating people. They are managed by our providers; there is no fixed mapping to our phones or SIM cards. They can be changed, they can be rerouted, so we never know if the messages we send are actually received by the people we intended. No authenticity and no authentication. Forward secrecy and future secrecy don't even apply, because we have no secrecy at all. We do have some sort of deniability, but this leads into philosophical claims about whether, when I say "I haven't sent anything, this must have been the provider," they can technically prove they did or did not do something. Let's not dive too deeply into that discussion; we can summarize that traditional messaging translates very badly to the privacy expectations we have of a conversation. Okay, moving on. In our postal analogy, our messages are actually more like postcards, because they are in the clear: our providers can read them and change them, all the things we've just described, just as they could a postcard. They can see the intended recipient, they can see the sender, they can read the text and change it. Postcards. What we want now is to find a way to wrap these postcards and make them more like letters, assuming that postal services don't open letters; that's the one point of this analogy that we have to stipulate. To be able to do that, we're going to give you the shortest introduction to encryption that you will ever get, starting with symmetric encryption. Encryption, for those of you who don't know, is what we call the translation of plain, readable text into text that looks random, but which can be reversed and turned back into plain text, provided we have the right key.
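As a toy illustration of this reversible translation, here is an insecure sketch (not a real cipher; all names and keys are made up): the same key, fed into the same machine, turns plain text into cipher text and back again.

```python
import hashlib

def keystream(key, length):
    # Stretch the key into a pseudo-random stream (toy construction,
    # for illustration only -- not a real cipher).
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def crypto_box(key, text):
    # XOR with the keystream: running the same machine with the same
    # key turns plain text into cipher text and back again.
    return bytes(a ^ b for a, b in zip(text, keystream(key, len(text))))

key = b"our shared secret"
ciphertext = crypto_box(key, b"meet me in the bedroom")
assert ciphertext != b"meet me in the bedroom"          # looks scrambled
assert crypto_box(key, ciphertext) == b"meet me in the bedroom"
```

Real symmetric ciphers such as AES provide this same round-trip property with actual security guarantees.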
To stick with a very simple example, please imagine a box that we've labeled "crypto". We are not concerned with what's inside; just imagine it as a machine that takes two inputs, the plain text and the key, and produces something we call cipher text. The cipher text is indistinguishable from random text, but it can be reversed on the recipient's side using the same key and basically the same machine, just doing the operation in reverse, turning cipher text back into plain text. This is what we call symmetric encryption, because if you imagine a line where the cipher text is, you could basically mirror the scheme onto the other side; it's symmetric about that line. And where there is something called symmetric, there is also something called asymmetric. Asymmetric encryption works in much the same way, only there are now two keys; we have drawn them as a yellow one and a blue one. These keys are called a key pair, and they are mathematically linked. The way this works is that anything encrypted with one of these keys can only be decrypted with the other one. You can do it both ways, but the important thing to memorize is: anything I encrypt with the yellow key can only be decrypted with the blue key. Okay. Since we have that, let's capitalize on this scenario. Imagine each of our communication partners now has one of these two keys, and we are still talking about the same key pair we outlined on the previous slide. We call one of them a secret key and one of them a public key; this is probably known to most of you as traditional public key cryptography. We've also added something called an identity to this picture; we will get back to that in a minute. The scenario we want you to envision right now is that both parties publish their public key, and we are going to get back to what that means as well.
And they keep their secret key, as the name says, secret. Some of you might know this as a private key; it's the same concept. We just chose to call it a secret key, because that more clearly denotes that it's actually secret and never published. This means that any message encrypted with one party's public key can only be decrypted with that party's secret key, putting us in a position where I could take Frieda's public key, encrypt my message, and send it to her, knowing that she would be the only one able to decrypt it, as long as her secret key remains secret and she doesn't publish it. The problem is that this is a very expensive scheme. We get something akin to a postal service where we can now encrypt a message: envision putting a plain sheet of paper into an envelope and sealing it. We send it on its way, and nobody along the way can look inside the letter. Since there are addresses on the envelope, they can still see who it is from and who it is to, but they can't look inside. That much is achieved. But as I've said, it is a very expensive mechanism, and by that we mean it is computationally hard for devices, especially the small devices like phones that you ideally do your mobile messaging on. So what if we had a mechanism that would allow us to combine symmetric and asymmetric encryption? It turns out we do. And we are going to keep this very simple by looking only at what is called key establishment, and again, only at one particular way of doing key establishment. We have two new boxes here, called key generators, and the scheme we're looking at works as follows: you take one of the secret keys and another public key, like that of the other party, and put them into the key generator.
And remember, these keys are mathematically linked; each secret key belongs to exactly one public key. The way this key generator works is that, through this mathematical link, it doesn't matter whether you take, let's call them Alice and Bob, Alice's secret key and Bob's public key, or Bob's secret key and Alice's public key: you always come up with the same key. We call this a shared key, because it can be generated independently on both sides and can then be used for symmetric encryption. And as we've already told you, symmetric encryption is a lot cheaper than asymmetric encryption. This has one advantage and one disadvantage. The advantage, as I've said, is that it's way cheaper, and that we come up with a key on both sides. The disadvantage is that we come up with one key on both sides, because, whether or not you've realized it by now, this is a very static scheme: we always come up with the same key. That is going to be a problem in a minute. So let's recap. We have looked at asymmetric encryption, which gives us identities (we're going to look at what that means), but is very expensive. We know that symmetric encryption is cheap, but we have to find a way to get the key delivered to both parties before they can even start encrypting their communication. And we have looked at key establishment, which gives us symmetric keys based on asymmetric key pairs. Meaning we have now basically achieved confidentiality: we can use these keys, put them in the machines with our plain text, get cipher text, and transport it to the other side, and nobody can look inside. Confidentiality is achieved.
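The key-generator scheme just described can be sketched with toy Diffie-Hellman arithmetic. The small numbers are for illustration only; real systems use very large primes or elliptic curves.

```python
# Toy Diffie-Hellman key establishment (illustration only).
p, g = 2147483647, 5          # public parameters: prime modulus and generator

alice_secret = 1234567        # Alice's secret key, never shared
bob_secret = 7654321          # Bob's secret key, never shared

alice_public = pow(g, alice_secret, p)   # published by Alice
bob_public = pow(g, bob_secret, p)       # published by Bob

# Each side feeds its own secret key and the other party's public key
# into the "key generator" -- both arrive at the very same shared key.
shared_alice = pow(bob_public, alice_secret, p)
shared_bob = pow(alice_public, bob_secret, p)
assert shared_alice == shared_bob
```

Note the static property the talk warns about: with fixed key pairs, the two sides derive the same shared key every single time.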
Now, what about deniability? Deniability in this scenario, if you think back to our initial sketch where I could say "I never said that" and the other person couldn't prove I did, would correspond to a letter that could have been written by either of the participants, so that, looking at it cryptographically, you couldn't say whether it was sent by me or by Frieda; you could only see that it was sent by one of us. And if you think of the scheme we've just sketched: both parties come up with the same shared key, each using a different combination of keys to generate it. So you can never tell, just by looking at a message, whether it was encrypted with the shared key generated on one side or on the other, since they are the same. Very simply, and on a very high level, we have now achieved deniability. What about forward and future secrecy? You remember this picture, our overheard conversation at the party from the beginning of the talk? This picture now changes to this one, and what we are looking at is something we call key compromise and key renegotiation. Key compromise is the scenario where one of our keys is lost. We are talking about the shared key that we generated, which, if it fell into the hands of an attacker, would let that attacker decrypt our messages, because it's the same key we used to encrypt them. Now, if at the point where the key was compromised the attacker couldn't decrypt anything from before that point, we would have forward secrecy. And if we had a way to renegotiate keys so that the new ones were completely different, not linked to the ones we had before, and used those in the future, we would have future secrecy. But we don't, since, as we've already said, the keys that we generate are always the same.
And we want you to keep this in mind, because we will get back to it when we look at Threema in more detail. If we had a way to discard keys after having used them, we could achieve forward and future secrecy; since we don't, we can't right now. Okay, next recap. Our key establishment protocol gives us confidentiality, deniability, and authenticity. We don't have forward and future secrecy. And if you've stuck with us, you'll have noticed we are omitting integrity here. That is because we don't want to introduce a new concept right now, but we will get back to it, and you will see that when we look at Threema, it actually does have integrity. Now, you could think we've fixed everything. But you heard us talk about things like identities, and we haven't really said much about them yet. We are going to look at that now, starting with a quote by my very own professor. Don't worry, you don't have to read it; I'll do it for you. My professor says: "Cryptography is rarely, if ever, the solution to a security problem. Cryptography is a translation mechanism, usually converting a communication security problem into a key management problem." And if you think about it, this is exactly the situation we are in. I know that Frieda has a secret key and a public key. She knows that I have a secret key and a public key. But how do I know which of the public keys that are out in the open is actually hers? How would I communicate to her what my public key is? Those of you who have used PGP in the last couple of decades know what I'm talking about, and we have the same problem everywhere public key cryptography is used, so we also have it in mobile messaging. To the rescue comes our messaging server: since we have a central instance between us, we can query it.
I can now take my public key and my identity and tell the messaging server: "Hey, messaging server, this is my identity, please store it for me," so that Frieda, who has some kind of information identifying me, can query you and get my public key back. This of course assumes that we trust the messaging server; we may or may not do that. But for now, we have a way to at least communicate our public keys to other parties. Now, what can we use as identities here? In our figure it's very simple: Alice just goes to the messaging server and says, "Hey, what's the public key for Bob?", and the messaging server magically knows who Bob is and what his public key is; the same works the other way around. The question is: what is a good ID in this scenario? Remember, we are on phones. So we could think of using phone numbers, we could think of using email addresses, or we could think of something else, and "something else" will be the interesting part. But let's look at them one by one. Phone numbers can identify users. Remember that you rely on your providers for the mapping between phone numbers and SIM cards, so you have to trust yet another party in this situation. We're going to set that aside completely, because we find that phone numbers are personal information. I, for one, have had the same phone number for about 18 years now. I wouldn't want it to get into the wrong hands, and I wouldn't want to use it to identify me as a person, or as the cryptographic identity bound to my keys, because I couldn't change it, and I would want to change it if it ever got compromised. Next, email addresses come to mind. Email addresses are basically also personal information, but they are a bit shorter-lived, we would argue, than phone numbers, and you can use temporary addresses; you're far more flexible with email.
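The directory lookup described above amounts to a key-value mapping from identities to public keys. A minimal sketch, with made-up IDs and key strings:

```python
# A minimal directory: identities (phone number, email, or dedicated ID)
# mapped to public keys. All IDs and keys here are made up.
directory = {}

def register(identity, public_key):
    # What a client does when it tells the server "this is my identity".
    directory[identity] = public_key

def lookup(identity):
    # What a client does to fetch a contact's key; None if unknown.
    return directory.get(identity)

register("ABCD1234", "alice-public-key")        # dedicated, service-only ID
register("+491555123456", "bob-public-key")     # phone number as ID

assert lookup("ABCD1234") == "alice-public-key"
assert lookup("unknown-id") is None
```

The sketch also shows why trust in the server matters: whoever controls this mapping can answer a lookup with any key they like.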
But ideally, we want something we call dedicated IDs: something that identifies me only within the bounds of the service we use. That's what we want to have, and we're going to show you how it might work. But we still have to find a way to verify ownership, because this scenario is more or less likely to happen: I am presented with a number of public keys for an identity that I know, and I have to find a way to verify which one is the right one, the one that is actually in use. Maybe Frieda has used quite a number of public keys. She's a lazy person; she forgets to take her keys from one machine to the other, buys a new laptop, sets up a new public key, and bam, she has two. Which one am I supposed to use right now? Now, remember that we are relying on the messaging server for key brokerage. We are now going to add a third line here, and that is this one: basically, we introduce a way to meet in person, and again, PGP veterans will know what I'm talking about, and verify our keys independently. We've chosen QR codes here; Threema uses QR codes, and many other messengers do as well. And we want to tell you why it is an important feature to be able to verify our public keys independently of the messaging server: once we've done that, we no longer have to trust the messaging server's promise that this is actually the key we are looking for. We have verified it independently. Okay. We have basically solved our authenticity problem. We know that we can identify users by phone numbers and emails, and you remember our queries to the server for Bob: we can still use phone numbers for that if we want to, we can use emails if we want to, but we don't have to; we can use our IDs anonymously, and we have a way to verify them independently. The remaining problem is users changing their IDs; then we have to verify again.
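In-person verification via QR code boils down to comparing a fingerprint of the key the server handed you against one obtained directly from your contact. A sketch, assuming a simple hash-based fingerprint (the actual QR payload format is not shown here):

```python
import hashlib

def fingerprint(public_key):
    # A short, human-comparable digest of a public key; this is the
    # kind of value an in-person QR code scan would let you check.
    return hashlib.sha256(public_key).hexdigest()[:16]

key_from_server = b"friedas-public-key"
key_scanned_in_person = b"friedas-public-key"

# Trust the server-provided key only if both fingerprints match.
assert fingerprint(key_from_server) == fingerprint(key_scanned_in_person)
assert fingerprint(b"some-other-key") != fingerprint(key_from_server)
```

If the server had substituted a different key, the fingerprints would not match and the contact would not be verified.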
We'll also get back to that later, but I want to look at something else first, and that is the handling of metadata. We know that an attacker can no longer look inside our messages. They can, however, still see the addresses: who a message is from and who it is to. They can see how large the message is, and they can look at timestamps and things like that. Since we are getting a bit tight on the clock, I'm going to try to accelerate this a bit. Metadata handling: we now want to conceal who a message is from and who it is to. We do this by taking the envelope that we've just created and wrapping it into a second, outer envelope, which we send to the messaging server first. The messaging server thus receives a lot of envelopes, all of them addressed simply to the messaging server. Anyone on the network would see one party sending a lot of messages to the messaging server, and maybe many parties doing so, but they couldn't look at the end-to-end channel, as we call it, and see the addresses on each internal envelope. The messaging server, however, can. It opens the outer envelope, looks inside, sees, okay, this is a message directed at Alice, and wraps it into another envelope that just says: this is a message from the messaging server, directed to Alice. Alice can then open the outer envelope and the inner envelope and see that this is actually a message from Bob. What we have thereby achieved is a two-layer end-to-end communication tunnel, as we call it, where the purple and blue bars are encrypted channels between each communication partner and the messaging server, and they carry an encrypted tunnel between both communication partners directly. But, and we've had this caveat before, the messaging server still knows both communication partners, the times the messages were sent, and the size of each message.
But we can do something about that, and what we do is introduce padding, meaning that into the inner envelope we stick a bunch of extra pages, so the envelope looks a bit thicker. We do that by appending random information to the actual message before we encrypt it, so anyone looking at the encrypted message just sees a large message. Of course, that should be random information every time; it should never have the same length twice. If we can achieve that, we can at least conceal the size of the message. So much for our gentle introduction to mobile messaging. For those of you who stuck around, we are now moving on to analyze Threema, and I want to say a few things before we do that. We are not affiliated with Threema; we are not here to recommend the app or the service to you. We didn't do any kind of formal analysis, there will be no guarantees, and we will not be quoted as saying "use it" or "don't use it". What we want to do is make more people aware of the mechanisms that are in use. We chose basically a random messaging provider; we could have chosen any. We chose Threema because they offer dedicated IDs and don't bind keys to phone numbers, which many messengers do; those of you who use WhatsApp know what I'm talking about. And since it is closed source, we found it interesting to look at what is actually happening inside the app and make that publicly known. We are not the only ones who have done this, we are also not the first, and we don't claim to be. But we are here now, and we want to make you aware of the inner workings of the app as far as we have understood them. With that, I hand the presentation over to Frieda. Thank you. We'll dive right in. I'll be presenting our understanding of the Threema protocol and how the application works, as we deduced mostly from reverse engineering the Android app.
This won't be a complete picture, but it will present the most essential features and how the protocol works. I'll start by giving you a bird's-eye view of the overall architecture. While Roland was giving you this abstract introduction to mobile messaging, there was always this third party, the messaging provider. This now becomes three entities, because Threema has three different servers, doing very different jobs for the app. I'll start with the directory server, in orange at the bottom, because that is the server you will most likely contact first if you want to engage in a conversation with someone you've never talked to before. This is the server that handles all the identity and public key related things Roland was talking about so much; this is the server you'll query for public keys: "I have this Threema ID, what's the corresponding public key?", and so on. Above that, there is the messaging server, which is the central entity in this whole scenario, because its task is relaying messages from one communication partner to another. And above that, we have the media server; I'll be talking about that later. In short, its purpose is storing large media files, like the images and videos you send to your communication partners. But as I said, I want to start with the directory server. In Threema's case, the directory server offers a REST API, so communication with this server happens via HTTP. It is actually HTTPS, so it's TLS encrypted, and this encryption fulfills all the requirements you would have of a proper TLS connection. So if you want to communicate with a new person and you have their phone number, their email address, or their Threema ID, your app will ask the directory server: "Hey, I have this phone number. Do you have a corresponding Threema account and public key?"
And the response will hopefully be: "Yes, I do. Here is the public key, here is the Threema ID, go ahead." As Roland said, we chose Threema partly for its flexible use of IDs, and especially for the system of verifying fingerprints in person by scanning QR codes. Because this is something Threema has and other messengers do not, I want to talk a little about it. If you just ask the directory server, "Hey, I have a Threema ID, what is the corresponding public key?", the Threema application will say: okay, I got an answer from the directory server, I have a public key, but I have very little trust that you actually know who the real person behind this Threema account is; we're not quite sure about that. So it marks this contact with one red dot. If you had a phone number or an email address and asked the directory server for the corresponding Threema account and public key, the app will say: okay, we still have to trust the directory server, but we're a little more confident that the person on the other end is actually who you think they are, because a phone number is probably linked to a real person and you have a better idea who you're talking to. But we still rely on the Threema server, so such a contact is marked with two orange dots. And then there's the final stage: if you met someone in person and scanned their public key and Threema ID in the form of a QR code, that contact is marked with three green dots. In that case, the app says: we're 100% confident we're talking to the person we want to talk to, and we have the proper keys. So right now, thinking of starting a conversation, we are at the point where we have all the necessary details to begin encrypting our communication. But the question remains: how do we encrypt it? Threema uses a library developed by Daniel Bernstein called NaCl; it's pronounced "salt", but spelled NaCl.
So I'm sorry for the play on words, but if you see NaCl, read "salt". This is a library specifically designed for the encryption of messages. It's supposed to be very simple to use and give us all the features we need. And it's NaCl's authenticated encryption that gives us all the properties Roland talked about in the abstract before: it gives us integrity, it gives us authenticity, it gives us confidentiality. And just a quick look at how this library is used. As you can see up there, everything in the gray box is what the library does. We only need our secret key, if we want to encrypt something for someone, the recipient's public key, and our message. So far, very obvious. The library also requires a nonce, which is a number that should be used only once; that's actually part of the definition. So we generate something random and include it in the process of encrypting the message. This is just so that if we encrypt the same content, the same message, twice, we do not get the same ciphertext. The nonce is nothing secret. As you can see at the output, the library gives us ciphertext; Roland talked a bit about what that is. And it will also give us a MAC. I'll stick with a very simple definition of what that is: it is a kind of checksum, so someone looking at the ciphertext and the MAC can make sure no one tampered with the ciphertext, that the ciphertext is still in the state it was in when we sent it. And if we now want to transmit our message in encrypted form to someone, we have to include the nonce. The nonce is not secret, so we can just send it along with the ciphertext; but to decrypt, we need it. So this is what Threema uses for encryption. But as you might remember from Roland's introduction, this scheme does not offer any forward or future secrecy. And we can still try to add some form of forward or future secrecy to this scheme.
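To make that interface shape concrete (keys and nonce in, ciphertext plus MAC out), here is a toy mimic written with Python's standard library. This is emphatically not NaCl and not secure: the XOR keystream and HMAC only stand in for real primitives, so the two properties described above become visible, namely that a fresh nonce changes the ciphertext, and that the MAC detects tampering.

```python
# Toy illustration of the crypto_box interface shape described above.
# NOT NaCl and NOT secure -- stdlib stand-ins only, for demonstration.
import hashlib
import hmac

def toy_box(shared_key: bytes, nonce: bytes, message: bytes):
    # Derive a keystream from key + nonce and XOR it with the message.
    stream = b""
    counter = 0
    while len(stream) < len(message):
        block = shared_key + nonce + counter.to_bytes(4, "big")
        stream += hashlib.sha256(block).digest()
        counter += 1
    ciphertext = bytes(m ^ s for m, s in zip(message, stream))
    # "MAC": a keyed checksum over nonce and ciphertext.
    mac = hmac.new(shared_key, nonce + ciphertext, hashlib.sha256).digest()
    return ciphertext, mac

def toy_box_open(shared_key: bytes, nonce: bytes, ciphertext: bytes, mac: bytes):
    expected = hmac.new(shared_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):
        raise ValueError("MAC check failed: ciphertext was tampered with")
    plaintext, _ = toy_box(shared_key, nonce, ciphertext)  # XOR is symmetric
    return plaintext
```

In real NaCl, the shared key would be derived from our secret key and the recipient's public key; here it is just a raw byte string.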
And this is usually done, sorry for skipping, with something called a handshake. Handshakes are a system of discarding old keys and agreeing on new keys; that is usually what we do with a handshake in scenarios like this. But doing a handshake with someone who is not online at the moment is pretty difficult. There are protocols to do that; the Signal messaging app, for example, does something like that, but it's kind of complicated. The Threema protocol spares the effort and only does this kind of handshake with the Threema servers, because they are always online, so we can always do a handshake with them. So Threema has some form of forward secrecy on the connection to the messaging server, and I'll try to present to you right now how this is achieved. We'll walk through this handshake step by step, and I'll try to put some focus on what every step tries to achieve. So if we initiate a connection, if we start sending a message, the Threema app will connect to the messaging server and start the connection by sending a client hello. This is a very simple packet. It is only there to communicate the public key we intend to use from now on, and a nonce prefix. The nonce prefix is, I'd say, half a nonce; the other part is a kind of counter that will be increased by one throughout the ongoing communication. But it will do no harm if you just see it as a nonce right now. So we start the conversation by saying: hey, we want to use a new key pair from now on, and this is our public key, please take note. And the server will react by saying: okay, I need a fresh key pair as well then. It generates a fresh key pair and lets us know what its public key is from now on. The only thing to note is, as you can see, there's not much more than the things the client sent, just the corresponding things from the server side. But the client's nonce is also included.
So we can see this is actually a response to our client hello, not something that got, I don't know, redirected to us by accident or whatever. And as you can see, the latter part of the message, including the server's public key, is encrypted; that's what the bracket notation on the slide indicates. It is encrypted with the server's long-term secret key and our ephemeral, temporary key. By doing so, the server does something only the holder of the server's long-term secret key can do, and proves to us that the public key we just received in this server hello has actually been sent by the proper Threema server. No one can impersonate the Threema server at that point. So after that, we are at a point where the client application knows: this is the public key the Threema server wants to use, and it's actually the Threema server, not someone impersonating it. The server knows there is someone who wants to talk to it using this public key, but knows nothing else; it doesn't know who is actually talking to it. And this is going to change with the next packet, because the Threema app now sends a client authentication packet, we call it that, which includes information about the client. The first thing is the Threema ID. Threema IDs are eight-character strings of uppercase letters and digits. What follows is a user agent string, which is not technically necessary for the protocol; it's something the Threema app sends. It includes the Threema version, your system, Android or iOS, and, in the case of Android, the Android version, and stuff like that. So it's very similar to the user agent in web browsers. I don't know why they send it, but they do. And the rest of it is nonces; let's skip over them. But the packet also contains the client's ephemeral public key we already sent in the client hello, this time encrypted with our long-term secret key.
So we just repeat what the server did: by encrypting with our long-term key, we prove that we are who we claim to be, and we vouch that we really want to use this ephemeral key. After that, each party knows which new key pair the other party wants to use from now on, and that the other party is actually who they claim to be. The handshake is then concluded by the server sending a bunch of zeros encrypted with the newly exchanged key pairs. This is just so the client can decrypt it, see the bunch of zeros, and know that everything worked out and we have a working connection now. So once we've done that, if you remember this picture, we have established forward secrecy on the link between the app and the server. We have not established anything for the inner crypto layer, which in the case of Threema is nothing more than taking messages, encrypting them with the NaCl library, and sending them over the wire. There's nothing more to it; the scheme I showed you before is used in a very simple way. So we now have channels established, and we can communicate via them. As the next step, I want to look at what we are actually sending over these channels. So I'm introducing the Threema packet format. This is the format of the packets your application sends to the Threema servers; this is what the Threema server sees. In this case, it is the form a packet has if it's something I want to send to a communication partner; for example, the content could be a text message. There are different-looking messages for management purposes, for exchanges with the server that will never be relayed to someone else, but this is the most basic format used when sending images or text to communication partners. As you can see, there's a packet type; its purpose is kind of obvious.
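As a rough sketch of the first handshake step, here is how a client hello could be assembled, and how the nonce prefix combines with a per-message counter to form full nonces. The 32-byte key, 16-byte prefix, and 8-byte counter widths are illustrative assumptions, not a statement about the exact wire format.

```python
# Sketch of the client-hello layout described above: the client's fresh
# ephemeral public key plus a nonce prefix. Field widths are assumptions.
import struct

def build_client_hello(ephemeral_pubkey: bytes, nonce_prefix: bytes) -> bytes:
    assert len(ephemeral_pubkey) == 32 and len(nonce_prefix) == 16
    return ephemeral_pubkey + nonce_prefix

def next_nonce(nonce_prefix: bytes, counter: int) -> bytes:
    # Full nonce = fixed prefix + counter, incremented once per message,
    # so no nonce is ever reused within the connection.
    return nonce_prefix + struct.pack("<Q", counter)
```

The server hello would mirror this with the server's own ephemeral key, plus the encrypted part that proves possession of the server's long-term key.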
What follows are the fields on the envelope, as Roland introduced them, saying: this is a message from me, from Alice, to Bob. As you recall, the server can see that. What follows is a message ID, which is just a random ID generated when sending a message. Then comes a timestamp, so the server knows whether this is a recent message or one that has been stuck in transit for a long time, whatever. What follows is something Threema-specific: Threema has public nicknames. A nickname is just an alias for your account; you can set it in the app, and if you do, it actually gets transmitted with every message you send. So if you change it, your name will change on your communication partners' phones with the first message you send to them. And what follows is a nonce, and that is the nonce used to encrypt the ciphertext that comes after it. The ciphertext you see down below is the inner envelope from Roland's earlier pictures. We're now going to look at what is in this inner envelope: what do the messages look like that we transmit to our end-to-end communication partners? The simplest thing we could look at is a text message. You can see, grayed out above, all the stuff from the outer envelope, and down below it's very simple. We have a message type, just one byte indicating, in this case, that it is a text message. And what follows is text, nothing more; it's just plain text. And after that, noteworthy maybe is the padding. This padding sits, as you can see, in the innermost encryption layer, so the Threema server does not know how big your actual messages are. This is kind of useful, because there's stuff like typing notifications you send to your communication partners, which are always the same size. To hide this from the Threema servers, we have this padding in the inner crypto layer. Next, I want to look at another message type.
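The envelope fields just listed can be sketched as a simple byte layout, together with the random-length padding for the inner layer. The field widths and the PKCS#7-style padding rule below are assumptions for illustration, not the exact wire format.

```python
# Sketch of the outer envelope (what the server can read): type, sender,
# recipient, message ID, timestamp, public nickname, nonce, ciphertext.
# Field widths here are illustrative assumptions.
import os
import struct
import time

def build_envelope(sender_id: str, recipient_id: str, nickname: str,
                   nonce: bytes, ciphertext: bytes) -> bytes:
    packet_type = struct.pack("<I", 1)           # e.g. "outgoing message"
    message_id = struct.pack("<Q", 0x1234)       # random in practice
    timestamp = struct.pack("<I", int(time.time()))
    nick = nickname.encode().ljust(32, b"\x00")  # fixed-width nickname field
    return (packet_type + sender_id.encode() + recipient_id.encode()
            + message_id + timestamp + nick + nonce + ciphertext)

def pad_message(inner: bytes) -> bytes:
    # Random-length, PKCS#7-style padding (an assumption about the scheme):
    # the last byte tells the receiver how many padding bytes to strip.
    n = os.urandom(1)[0] % 254 + 1
    return inner + bytes([n]) * n

def unpad_message(padded: bytes) -> bytes:
    return padded[:-padded[-1]]
```

The padding is applied before the inner encryption, so only the end-to-end recipient, never the server, learns the true message length.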
I'd say one of the basic message types most people use in an instant messaging app is image messages. I want to send someone an image; this is something we do regularly. And this looks a little bit weird at first. It has a message type; we know that, we know what its purpose is. Then follows a blob ID; what a blob ID is, I'm going to explain in a minute. Then follows the size, which is basically just the size of the image to be transmitted. And what follows is a key, and the mandatory padding. So the questions are: what is this blob ID, and what is this key? This is where the media server comes into the picture. I'll show you what happens if you send an image message. Your app will take the image you want to send, generate a random key, encrypt the image with this key, and send it to the media server. The media server will say: okay, I'll store this under the following blob ID. Your app takes note of this blob ID and then sends the kind of image message I just showed you via the messaging server to your communication partner. Your communication partner opens up the message, looks at it, sees the blob ID, sees the key, and goes to the media server and says: hey, do you have something stored under this blob ID? And the media server will respond: yes, I do, here's the encrypted stuff. Your communication partner can take this encrypted stuff, decrypt it with the key you sent, and look at your image. This is how image sending works. So right now, we have the basics of modern instant messaging: we can send text, we can send images. This is the simple stuff. What I want to look at next is something most people would want a modern messenger to have as well, and that is group conversations. Group conversations in Threema essentially don't work very differently from normal messages.
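The blob round trip just described can be sketched with an in-memory dict standing in for the media server. The `xor_crypt` stand-in is not real encryption; it only makes the per-image random key and the fetch-by-blob-ID flow visible end to end.

```python
# Sketch of the image/blob flow: encrypt locally under a fresh random key,
# upload the ciphertext, send only {blob_id, size, key} to the recipient.
# xor_crypt is a stand-in for real encryption, for illustration only.
import os

MEDIA_SERVER = {}  # blob_id -> encrypted bytes (stands in for the server)

def xor_crypt(key: bytes, data: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def send_image(image: bytes) -> dict:
    key = os.urandom(32)            # fresh random key for this one image
    blob_id = os.urandom(16).hex()  # assigned by the media server in reality
    MEDIA_SERVER[blob_id] = xor_crypt(key, image)
    # This dict is what actually travels inside the end-to-end envelope:
    return {"type": "image", "blob_id": blob_id,
            "size": len(image), "key": key}

def receive_image(msg: dict) -> bytes:
    encrypted = MEDIA_SERVER[msg["blob_id"]]  # fetch from the media server
    return xor_crypt(msg["key"], encrypted)
```

Note that the media server only ever sees ciphertext; the decryption key travels solely inside the end-to-end encrypted message.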
Because if you send something to a group, your app will just encrypt the message several times, once for every communication partner involved, and send it to each of them. But your communication partners need to know: this is a group message, and it belongs to this and that group. To do so, Threema has group packets, and they include exactly that information. They include a creator ID, which is the Threema ID of the person who created the group, and a group ID, which is something randomly generated when creating a group. After that follows a regular packet format, in this case a text message. If it were an image message, you would see exactly the same stuff as shown in the image message before. So this is how group messages look. But we need a way to introduce new groups and to change names, and for that, there are special packets. This, for example, is a group set members message, which tells everybody: there is this new group, and it has the following members. As you can see here, there is only a group ID; there is no group creator ID included anymore. That is because Threema's group management is very static. There can only be one person managing a group, and that is the person who created it. So only the person who created the group can send this kind of message, saying, for example, that there is a new member in the group. And therefore, the group creator is implicit in this case: it is the sender of the message. This is kind of annoying, because you cannot have a group where everybody can add members, for example, and stuff like that. If you set a name for a group, the message looks very similar; it just doesn't include a member list, but a name field. What I want to talk about next is something that happens above all the stuff I've talked about so far. I've now shown you that there are different kinds of packets doing all that stuff; there are lots more packet types, for audio messages, for example.
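The group fan-out can be sketched the same way: one logical message becomes one ciphertext per member, each carrying the creator ID and the random group ID in the inner plaintext so the recipient can map it to the right group. Again, `xor_crypt` is a stand-in for the real NaCl box, and the field layout is an assumption.

```python
# Sketch of group fan-out: there is no group key; the sender encrypts the
# same inner message once per member under that member's shared key.
# xor_crypt stands in for real encryption, for illustration only.
def xor_crypt(key: bytes, data: bytes) -> bytes:
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def send_group_message(creator_id: str, group_id: bytes,
                       text: bytes, member_keys: dict) -> dict:
    # Inner plaintext: creator ID + group ID identify the group, then payload.
    inner = creator_id.encode() + group_id + text
    # One ciphertext per member, each under that member's own shared key.
    return {member: xor_crypt(key, inner)
            for member, key in member_keys.items()}
```

This also explains why group management is static: the group is identified by the pair (creator ID, group ID), so only the creator's messages can refer to it authoritatively.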
They look very similar to the image messages, because we have a blob ID for the audio file and stuff like that. But what is kind of interesting, I thought, we thought, is that above this layer of packet formats, there is also some additional stuff happening. A good example of that is how Threema handles captions for images. A lot of modern messengers support adding some kind of text to an image. Threema doesn't have a packet format or a field in the image message for that; instead, they just embed the caption in the actual image, in the EXIF data of the image, and send it along. This has the advantage of being compatible with Threema versions not aware of this feature, because they can just happily ignore the EXIF data. You won't see the caption, but it won't break anything. It is, though, kind of wonky, because it's a feature that is not reflected in the actual packet format. Something very similar happens with quotes. You can quote other people in Threema: you can mark a message and say, I want to quote that. And in the app, it looks like a proper built-in feature; you have the message you quoted included in your new message, and it looks like it's somehow linked to the old message. But in reality, it's just a text message including some markdown which, if your Threema version supports this kind of stuff, is rendered nicely as shown below. If your version doesn't support it, you'll just see the plain text. So, again, being compatible with versions that don't have it introduces some weird layering. And with that, I'll stop showing you all the features Threema has. There's certainly more to talk about, but I think you should have an idea of how it works in basic terms. All the other stuff is kind of similar to what I showed you and differs in particularities which aren't so important, I think.
And I'll just hand over to Roland, who will be wrapping up our talk and say something about the results of our reverse engineering. Okay. We told you we reversed the app, and we told you we weren't the first ones. And this is all true. But we came here to make you aware of things you can expect from messaging apps. We hope that by using Threema as an example, we have shown you how you can relate your own privacy expectations to different apps. And we also hope we gave you enough terminology and explanation so that you can make a more competent decision next time you look at a messenger and at what its promises are. Since we reversed it anyway and did a lot of coding to do that, we put it in a library. Now, I don't know how many of you know the term academic code. We are, of course, working at a university, so we've been doing this on and off for quite some time. We started roughly two years ago, did it for a couple of days, then left it lying around. Eventually we had the whole thing lying in a drawer for about a year before we decided to finish it. So we never actually put a lot of effort into the code; we are not proficient programmers. But we still wanted to publish what we did, with the hope that a small community might form around it, maybe extend it, help us fix a few things that we didn't do so well, help us document it. You don't have to take photographs, by the way; we'll upload the slides. So these repositories exist. We made a GitHub organization and pushed to them yesterday. If you wanted to start coding right away, say if you wanted to write a bot, we'd recommend you wait a few weeks, say two to three, because we still want to fix a few of the kinks in there. Everyone else, we hope, will just look at it; maybe this will help your understanding of what it actually does.
And also, the activists in us hope that this might get the people at Threema to open-source their code, because no matter what we tell you here, and no matter what they tell you about how their app works, and this is always true for non-open-source software, there will never be true transparency. You will never be able to prove that what runs on your phone is actually implemented the same way we've shown you. With our library, you would have these guarantees. You can definitely use it to write bots if you ever wanted to do that. Or if you just want to understand how it works, please go ahead and dive right in. Well, with that said, we thank you for your attention. Okay, thank you very much, Roland and Frida. We only have time for one question, so who has a super eager question? The signal angel is signaling. Yes, a couple of questions, but I will pick the best one. The best one was from Alien: could you use captions to inject malicious EXIF data into the images? What is malicious EXIF data? Well, some data that probably exploits the image parsing library. What we did not do was look very closely at security problems in the implementation of Threema, and I would say this falls into that department. There's also a library handling GIF display and stuff like that. We could have looked at whether that is broken, maybe; we did not. We looked at the protocol from a higher level, so I cannot say anything about it. Okay, and another question was: when a non-group-originating user sends a group update message, what happens? Nothing. The thing is, Threema group IDs aren't globally unique. A Threema group ID only refers to a particular group together with the group creator's ID. So if you send an update group message from your account, the app would look for a different group than you intended, because your message would say: I'm trying to update a group created by me, with this and that ID.
So it won't be the group you want to hijack. Okay, very well. Another round of applause for our speakers.