Okay, so my name is Tomoko, and today I'll be talking to you about challenges with building end-to-end encrypted applications, based on my learnings from EteSync. I originally gave this talk yesterday, and unfortunately, due to some issues, I have to do it again today just to record it, in an empty room. So no crowd, just me and you. Let's begin.
So who am I? I'm a long-time open source dev, a privacy and digital security enthusiast, and the maintainer and creator of EteSync, which I'll cover in a moment. I'm currently building a security startup with Entrepreneur First, which is kind of like a startup accelerator that essentially invests in you pre-team and pre-idea, and just gives you a stipend to work on cool stuff you think is amazing. So check them out if you're interested.
So what is EteSync? EteSync is a secure, end-to-end encrypted and fully versioned personal information sync for Android, the desktop and the web. We currently do contacts, calendars and tasks, but we have more things planned. It's obviously completely open source. You can self-host it, there are Docker images and whatever, but I also run a managed service if you want to use that. And unlike DAV, for example, all of your information is also encrypted on the server, so also at rest. So even if the server is hacked, nothing leaks.
So let's talk a bit about EteSync. The base building block of EteSync is an encrypted and tamper-proof journal. Essentially you have the address book or the calendar, for example, and every change in those collections is tracked. So you have: I added a calendar, added a contact, deleted the contact, added a meeting, modified the meeting. Everything is tracked. So think of it like a sort of encrypted and integrity-checked Git. This is the base building block, and it protects against a lot of tampering, such as removing entries.
Now, the next two slides are a bit heavy. I'm sorry about that, but I promise it's only those two. So let's talk about how the encryption keys are derived. First of all, we use scrypt to derive the key from the user's email and password. Then we derive a key for every journal based on the journal ID. The reason this is important is that when you share journals among several users, you want to make sure that the key is different every time. So we had to modify it based on the journal.
The last heavy slide is how the data is actually encrypted. We use AES in CBC mode with the journal encryption key from before, so standard encryption. Then we use an HMAC, which is a sort of signature, but symmetric. So it's an integrity check, and it makes sure the version, and essentially everything else that's available, hasn't changed. We then generate a random identification code for the journal, just so we know how to refer to it in a way that doesn't leak information, because if we called it "address book", the server would know it's an address book. So we want to keep it completely non-leaky. When it comes to the entries themselves, we take the identifier of the previous entry, and then we encrypt the entry as before. But this time we also use the previous entry in the integrity check, to make sure that no entries are removed or reordered, as I mentioned. And again, this is a key part of the encryption.
Okay, so this is it about EteSync. I just needed a baseline. It's more for the people following at home or following with the slides, just so you know what's going on. Let's talk about some of the challenges. Obviously, this is from the point of view of EteSync, so long-term storage of data and immutable journals, but a lot of these things also apply to a lot of other end-to-end encrypted applications. And the first thing to keep in mind is platform portability.
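As an illustration of the key derivation and entry chaining described above, here is a minimal Python sketch using only the standard library. It is not EteSync's actual implementation: the scrypt cost parameters, the use of HMAC-SHA256 to derive the per-journal key, and all function names are assumptions made for this example, and the AES-CBC step is left out (the `ciphertext` values are stand-in bytes) because the standard library has no AES.

```python
import hashlib
import hmac


def derive_master_key(email: str, password: str) -> bytes:
    # scrypt over the password, salted with the email as in the talk.
    # n, r, p and dklen are illustrative values, not EteSync's real ones.
    return hashlib.scrypt(password.encode(), salt=email.encode(),
                          n=2**14, r=8, p=1, dklen=32)


def derive_journal_key(master_key: bytes, journal_uid: str) -> bytes:
    # Assumption: one key per journal, derived by keying HMAC with the master key.
    return hmac.new(master_key, journal_uid.encode(), hashlib.sha256).digest()


def entry_uid(journal_key: bytes, prev_uid: bytes, ciphertext: bytes) -> bytes:
    # The integrity check covers the previous entry's identifier,
    # so entries cannot be removed from the middle or reordered unnoticed.
    return hmac.new(journal_key, prev_uid + ciphertext, hashlib.sha256).digest()


def append_entry(journal_key: bytes, entries: list, ciphertext: bytes) -> None:
    prev_uid = entries[-1][0] if entries else b""
    entries.append((entry_uid(journal_key, prev_uid, ciphertext), ciphertext))


def verify_chain(journal_key: bytes, entries: list) -> bool:
    prev_uid = b""
    for uid, ciphertext in entries:
        expected = entry_uid(journal_key, prev_uid, ciphertext)
        if not hmac.compare_digest(uid, expected):
            return False
        prev_uid = uid
    return True
```

A client would call `verify_chain` after downloading a journal. Note that a chain with entries missing from the end still verifies in this sketch, so chaining alone does not detect a server that serves a stale copy.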
So in end-to-end encryption, the server is unaware of anything, so everything has to be implemented on every client. All the clients need encryption libraries. For example, if you want to have elliptic curve support, you will need to make sure that every client has support for that, because, first of all, you don't want to implement your own cryptography. It's damn hard and frustrating, but it could also lead to insecurity. Also, if you use a hardware token, for example a YubiKey like I use, or if your mobile phone has a secure area, a secure zone to do encryption stuff, it's good to know in advance that those are actually supported before you go and release everything.
Obviously, we also want library support for our protocols on all the platforms. EteSync relies heavily on vCard and iCal, and trust me, I don't want to re-implement all of those specifications on every platform I go to. It's hard enough to re-implement all the code on every platform. You really want to make sure that support is available for the format you choose. And as I said, you need to write the same code on every platform.
When it comes to account initialization, there's another challenge, which is, again, that everything is implemented on every client. So on every client, you need to have the initialization code. For example, in EteSync, you want to have a default address book and a default calendar, so when users start using it, they get those. You need to have that on every client because you don't know which client they'll be using first. Maybe it's the Android app, maybe it's the desktop bridge. And the same goes for upgrade code. If you want to change something and you want to have code that updates the information accordingly, you need the code that actually does the updating.
The new version has to be available, and you need it on every client, because otherwise it won't be available there. And of course you need to support past and current protocol versions on every client, because otherwise the new code won't be available on that client and that will just not work. So you need to make sure that all of them are available when you release, and that all of them still support the past versions, because the protocol can change. One solution is to have a sort of master client, which in EteSync is the Android client, because we assume most people use that. So the upgrade code or the init code was released there before it was released everywhere else. But this is a hack, and it's much better to have it on every client.
Another thing about protocol upgrades is that every client needs to support the new version. So either you need to update all of them simultaneously, which is a bit hard with F-Droid, for example. This is a bug report from a user saying that it took a long time for version 1.0 to be released on F-Droid, I think a month after Google Play. So it's very hard to plan for that. The other alternative is first deploying support and only after that deploying the upgrade logic, so essentially a long waiting time. This is good practice anyway, but it's really slow when you're just trying to iterate on a product and provide a lot of value for your users very quickly in the early stages.
Another challenge with protocol upgrades is that you cannot transform the data on the server. Usually, what you do in applications is you have API v1, which is what you created in the beginning, and the old application accesses that API. And that's fine. Then, when you want to add a new API, you just add a new endpoint. The new app calls the new one, and the server automatically translates the information between API versions 1 and 2 in the background. Because the server can access the data, it can just do it.
We cannot do that in end-to-end encrypted applications because the data is just not available to the server. So you have to be aware of that. This is normal practice that you just cannot follow. You also, as I mentioned before, need to gracefully handle future versions. So if you decide to change the version at some point, you want the old clients not to crash, but rather pop up a message like "hey, this is a version I don't recognize, please upgrade the client". It's important to remember, because we cannot have this compatibility layer.
So what is considered a protocol upgrade? Literally every damn thing is considered a protocol upgrade. You want to shuffle the data, split the data, add a color to a calendar, remove something: anything you do is a change of the protocol. And if you're not careful about allowing that and being able to handle that, you can have crashes. So it's very important to watch out for that. Obviously, changing cryptography methods is one of them. If you want to add elliptic curve support or, I don't know, post-quantum lattice-based encryption, you have to change the protocol, and that would be a break. Also, if you want to change the parameters of the current encryption, for example of the scrypt derivation function, because you realize that the original parameters you chose were not strong enough, it's a break, and you need to make sure that all the clients know how to deal with it. Then there is changing the structure of the data, which I already mentioned, and literally every other thing you can think of.
You also need to be aware of development speed. Development speed is obviously much slower. I mean, did I mention that everything needs to be implemented on every damn client? You have to have it on Android, on iOS if you support that, on the desktop, on the web. Literally everywhere you have an implementation, you have to duplicate your efforts.
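The graceful handling of unknown versions mentioned above could look roughly like the following Python sketch. The item layout (a JSON document with `version` and `payload` fields), the version numbers and the "color" upgrade are all hypothetical and only serve the illustration. The point is that the client upgrades old data itself after decrypting it, and refuses unknown versions with a readable message instead of crashing.

```python
import json

# Hypothetical protocol versions that this client understands.
SUPPORTED_VERSIONS = {1, 2}


class UnsupportedVersionError(Exception):
    """Raised instead of crashing when an item uses an unknown version."""


def upgrade_v1_to_v2(payload: dict) -> dict:
    # Hypothetical change: version 2 added a calendar color.
    upgraded = dict(payload)
    upgraded.setdefault("color", "#808080")
    return upgraded


def load_item(decrypted_json: str) -> dict:
    """Parse an already decrypted item and return its payload in v2 form."""
    doc = json.loads(decrypted_json)
    version = doc.get("version")
    if version not in SUPPORTED_VERSIONS:
        # The UI can show this text to the user instead of a crash report.
        raise UnsupportedVersionError(
            f"This data uses version {version!r}, which this client does not "
            "recognize. Please upgrade the client."
        )
    payload = doc["payload"]
    if version == 1:
        # The server cannot translate for us, so old data is upgraded
        # on the client, in memory, every time it is read.
        payload = upgrade_v1_to_v2(payload)
    return payload
```

Because the server cannot run this translation, the same logic would have to exist in every client implementation.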
So you have to take that into account before embarking on a feature change or literally any sort of planning.
Also, debugging is really a pain. You can't really ask for data, and when you do, you're not really going to get it. The whole point of end-to-end encryption is not trusting anyone. So you can't just go like, "hey, I know I said don't trust anyone, but can you trust me and give me all your data?" This is a really bad idea, so you just don't do it. And again, if you do it, most people say no. Having no access to the data makes it very hard to investigate issues. Usually what you do is say, "oh, this breaks. Okay, I'm going to have a look, play with it a bit, debug, try to figure out what's going on." You just cannot do that. So you're really just doing it blindly and trying to figure out what's going on.
You also can't test the changes that you've been making. If you think, "oh, I think I fixed it", you cannot run it on the existing data like you normally would. You have to give it to the user and get back the feedback. And it's really hard when the user is not technical, especially if they're not a developer, because they don't know how to debug. They usually give you incomplete information because they don't know what you're expecting to get. So they say, "oh, it crashed", when it actually just popped up a message saying something went wrong, or things like that. So it's really, really hard and really time consuming. And obviously you also cannot look into the data for affected users. You cannot just go through all of your user base and say, "oh, you, you and you, I noticed you are affected by this issue. Please upgrade", or whatever. You just can't.
Another thing that's important to consider is third-party applications. Normally you would want third-party application support.
As I said, EteSync supports contacts, calendars and tasks. I would love to have notes support, and normally it would be trivial: just write a module for any of the existing open source note-taking applications. The problem is that we do not trust those applications. It's not because we do not trust the people behind them. It's just that they don't work in the same secure environment that we try to work in. So we don't want them to be handling or storing encryption passwords. And also, to be honest, we don't want to normalize this behavior with our users. We don't want to tell our users, "hey, could you please enter your password in that application to get it set up?" We don't want to normalize that, because this is how phishing begins.
Another thing, which is more unique to EteSync, is data immutability. Because the journal is immutable, you can't fix saved malformed data. If you had a bug in the past that put malformed information in the journal, you can't fix that. You have to deal with that bug for the rest of eternity. You also can't update the saved format. Even if you decided to change the format, and you already have all the upgrade code on all the clients that we discussed, you still have to support the old format because it's immutable. Again, for all eternity, and you guessed it, on all the clients.
There are of course a few usability issues to keep in mind. First of all, having both an encryption password and a login password is a big pain. Like everything else in this talk, it's solvable. You can have one of those magic login links via email or whatever. But that's a bit more overhead, and it's a bit clunky as well. And you have to implement it, support it, maintain it and make sure it works.
It's also a bit of a pain when it comes to self-hosted servers, because they don't have this whole infrastructure.
Another thing: encryption password recovery is not straightforward at all. More than a few users over the years came to me like, "hey, I forgot my encryption password. Can you help me recover it?" And unfortunately, the answer is no. The whole point of this service is that we cannot recover your encryption password. Again, there are ways. What you can do, for example, is generate a public/private key pair on the server and keep the private key there. You send the public key to the client, the client encrypts the encryption password with it, and then the user saves that file. Using the combination of that file and the private key on the server, you can unlock it and recover the key. But again, this is another potential security issue waiting to happen, and it's not trivial to expect users to save that file. If they forget their password, maybe they're going to lose the file as well. So it's not a full solution, but it's better than what we have at the moment.
And don't forget, you are an end-to-end encrypted application, so you are held to a much higher standard. One of the things we wanted to do with EteSync is have a web app. Everyone uses the web, and sometimes you're really in a bind: you just want to access your data from a phone that doesn't have a client, or from another secure computer. The problem is, how do you know that the server has not changed the JavaScript of the web app and is now leaking all of your private information? You can't really know that the server has not been hacked and is doing that.
So essentially, what I'm trying to say is that the moment you have a JavaScript-based end-to-end encrypted application, it's no longer end-to-end encrypted unless you take extra precautions. That is what we did with Signed Pages, which is a web extension, so a browser plugin, that verifies the PGP signature of the entire web app. The way it works is that the developer signs the page. Actually, you don't even need to do it just for web apps, you can do it for your blog as well. The developer signs the page and the plugin verifies it and, as you can see, shows either a good signature or a bad signature, alerts you accordingly, and could also block the loading of the page or whatever else you want to do. This is something that our users expect from us. As mentioned before, you can't ask for any data. This is part of who you are. And obviously, you need to watch out for what you put in the logs and in debug information, and not leak any information there.
There are a few other things to watch out for. The first is performance considerations. You don't have any server-side search or any kind of server-side processing, really. Normally, if you have an IMAP client, you type your search and the search runs on the server. You can't do that if the server cannot access the information. So you would have to download all of the information in advance and then either search all of it every time or, alternatively, maintain an index, which is a much smaller, optimized version of the data, not the full data, used only for searching. You keep that maintained so you can search based on it and only download that small part every time. But again, it's something to consider. On the bright side, because you're downloading everything locally, you can work offline. Everything is much, much faster when you are doing local stuff.
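To make the client-side search idea above concrete, here is a small Python sketch of a local inverted index built over items the client has already downloaded and decrypted. The item identifiers, the tokenizer and the function names are made up for this example. How such an index would be encrypted and synced is not covered here.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list:
    # Lowercase alphanumeric words; a real client would need smarter rules.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(items: dict) -> dict:
    """Map each token to the set of item ids whose text contains it."""
    index = defaultdict(set)
    for uid, text in items.items():
        for token in tokenize(text):
            index[token].add(uid)
    return index


def search(index: dict, query: str) -> set:
    """Return the ids of items that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result
```

For example, with `items = {"c1": "Alice Smith alice@example.com", "c2": "Bob Smith"}`, calling `search(build_index(items), "smith alice")` returns `{"c1"}`. The search never touches the server, which is also why it keeps working offline.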
So it's a good and a bad thing, but just something to be aware of.
You also need to watch out for a false sense of security. Revoking or changing encryption passwords is something that you would think is trivial, but actually it's not. One way to do it is to just encrypt the old key with the new key, which is okay. If you haven't been hacked and the old key was strong enough, it's not a big deal. The problem is that everyone who ever had access to your account or to your encryption key will still have it. So if you ever shared that journal and have now revoked access, whoever had access to that journal will potentially still have access. Or if you got hacked and you want to change the password because of that, the hacker could have kept that encryption key. So it really does not protect you against any of those, which I think are the main reasons why users change their passwords.
Another thing you can do is to re-encrypt all of the data, but that's a bit problematic. There are two issues with it. First of all, it could be computationally and bandwidth expensive: downloading gigs of data, re-encrypting it, uploading it again. But the other problem with modifying a lot of data at once is that you lose a sort of integrity. What I mean is that for every change, you want a human to be able to review it and see, "oh, a new contact, that's weird", or "why is that event deleted?" If you change a thousand entries, or thousands, at once, you cannot have this manual review, so you're actually reducing security by doing that.
So the best solution is using the old key for the past data, because you assume that's already compromised, or at the very least could be compromised, and a new key for the new data. That's much more complex. And as I said, you need to implement it on every damn client. So it's a lot of work.
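The third option above, an old key for past data and a new key for new data, could be sketched in Python as follows. This is an illustration under assumptions, not something EteSync does: the `Keyring` and `Entry` types are invented for the example, and an HMAC tag stands in for the real authenticated encryption so that the sketch runs with the standard library only. The idea it shows is that every entry records which key protected it, so a rotation never has to touch old entries.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass, field


@dataclass
class Keyring:
    keys: dict = field(default_factory=dict)  # key id -> key bytes
    current: int = 0                          # id of the newest key

    def rotate(self) -> int:
        """Add a fresh random key; older keys stay for reading past data."""
        self.current += 1
        self.keys[self.current] = os.urandom(32)
        return self.current


@dataclass
class Entry:
    key_id: int   # which key protected this entry
    data: bytes
    tag: bytes


def seal(ring: Keyring, data: bytes) -> Entry:
    # New entries always use the newest key (call rotate() once beforehand).
    key = ring.keys[ring.current]
    return Entry(ring.current, data, hmac.new(key, data, hashlib.sha256).digest())


def verify(ring: Keyring, entry: Entry) -> bool:
    key = ring.keys.get(entry.key_id)
    if key is None:
        return False
    expected = hmac.new(key, entry.data, hashlib.sha256).digest()
    return hmac.compare_digest(entry.tag, expected)
```

After a second `rotate()`, entries sealed earlier still verify under key 1 while new entries use key 2, so someone who only ever obtained key 1 cannot produce or read entries made after the rotation. Every client would need to carry the whole keyring, which is part of the extra complexity mentioned above.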
Actually, before I move on: when I gave the talk live yesterday, someone asked what we actually use in EteSync. We use the first one, which is the less secure one, but with massive, massive warnings like, "hey, if it's urgent, or if there's any chance you've been hacked or anything of that sort, please contact us and we'll help you figure it out, because this is insecure." So we can obviously offer alternatives. But the question then is how do you educate your users? I just spent more than a minute here explaining and giving you examples of the minute differences, and I'm sure not everyone got it. So it's really hard to explain it to users, especially since users don't usually read documentation.
You also need to be aware of replay and downgrade attacks. This is actually from my talk at Boston last year. You have to be aware that even though you encrypt everything and you keep an integrity-checked version of all the data, the server can still serve you an old version. The old version was authenticated by you and marked by you, it's all valid, the signature is valid and everything, but it's still a stale version. So it could be missing information. Like here, it's missing a calendar event. So just watch out for that.
Again, leaking user data is a struggle. As we said before, sensitive information in logs and debug info is a big concern. But even more important are the following, because people don't know about them. Mixing together user-controlled data and non-user-controlled data could be very risky. For example, the way some of the encryption schemes work, I mean all of them I guess, is using padding. Let's say you encrypt four characters at a time, just to simplify it. If your input is five characters, it will be rounded up to eight.
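The rounding in the example above can be written down as a few lines of Python. The sketch assumes PKCS#7-style padding, which always adds between one byte and one full block; that choice, and the names `padded_len` and `recover_secret_len`, are assumptions for the illustration. The second function shows why mixing in attacker-controlled data matters: by adding bytes until the padded length jumps, an attacker learns the exact length of the secret part.

```python
def padded_len(n: int, block: int = 16) -> int:
    # PKCS#7-style padding always adds between 1 and `block` bytes.
    return (n // block + 1) * block


def recover_secret_len(oracle, block: int = 16) -> int:
    """Recover the secret's length from observed (padded) lengths only.

    `oracle(extra)` returns the padded length of the secret combined with
    `extra` attacker-chosen bytes.
    """
    base = oracle(0)
    for extra in range(1, block + 1):
        if oracle(extra) > base:
            # The length jumps exactly when secret + extra fills the block.
            return base - extra
    raise ValueError("the oracle does not behave like block padding")
```

With the four-character blocks from the example, `padded_len(5, 4)` is 8. For a five-character secret, the observed length stays at 8 with one or two extra characters and jumps to 12 with three, so the attacker computes 8 - 3 = 5.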
So if you mix user-controlled and attacker-controlled information, you could essentially leak the length of the content, which, again, leaks information. Very dangerous, watch out for that.
Another thing is optimizations. Optimizations can often lead to leaked data. For example, data compression could very much lead to leaks. The way compression works, for some algorithms, is that they detect repeating strings in the text, and then they say, "oh, okay, we have space KE and space KE again, it appears twice. Instead of storing it twice, this time we're going to refer to the previous occurrence." Very good, it saves us some data. The problem with that is that if the attacker can control part of the information, they can brute force it: try different strings, and then, based on the length of the output, know if they got it right or not. So if, for example, the attacker changed this to F, it would not have been deduplicated. If it was E, we see that it actually compresses more, so we can extract information this way.
Another thing is deduplication. One thing that's very tempting is saying, "oh, I have this one gigabyte file, and this one as well. I can ask the client to just hash the file, which does not leak any information about the file, and then upload and refer to the file based on that hash." That's how all the big cloud providers deduplicate information. The problem is that while the hash itself does not leak any information, the fact that different users have the same file is already leaking more information than you would like to leak, and as I said, you're held to a higher standard.
Another example that I very much like comes from video and audio compression. Let's take video, for example. One way to do it is to send you a full frame, say 24 times a second or however often.
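The compression leak described above can be reproduced with Python's `zlib` module. This is a simplified sketch: the page layout, the secret value and the function names are made up, the attacker guesses the whole secret at once instead of byte by byte as real attacks do, and the compressed length stands in for the ciphertext length the attacker would actually observe (encryption hides the content but roughly preserves the length).

```python
import zlib

SECRET = b"7f3a9c51d2e84b60"  # made-up secret that shares a message with attacker text


def observed_length(attacker_text: bytes) -> int:
    # Attacker-controlled text and the secret are compressed together.
    page = b"note=" + attacker_text + b"&secret=" + SECRET
    return len(zlib.compress(page, 9))


def best_guess(candidates: list) -> bytes:
    # The guess that repeats the secret compresses best, so it is the shortest.
    return min(candidates, key=observed_length)
```

Calling `best_guess([b"k2m9x4q7w1z8v5b3", SECRET, b"p0o9i8u7y6t5r4e3"])` picks out the secret purely from output lengths, because the matching guess is replaced by a short back-reference while the wrong ones are stored literally. Compressing attacker-controlled and secret data separately, or not compressing at all, avoids this particular leak.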
So essentially a full picture, think of a PNG, many times a second. This is grossly inefficient, especially if the image does not change much. For example, in the last minute, this whole frame has been almost the same, so there's no reason to use the bandwidth. What they came up with is variable bitrate compression. Essentially, it just measures the delta in information, so it only encodes changes rather than the whole thing. If the picture doesn't change much, not a lot of information is transmitted. But one big flaw with that is this: let's assume there is a drone just outside the window, and I want to know if the drone is spying on me. If I can sniff the Wi-Fi signal, let's assume it's Wi-Fi based, all I can see is encrypted communication. That's not a lot. But what I can do is open and close the blinds a few times. That changes a lot of the picture at once, so if I see a spike in the amount of data transferred over Wi-Fi that correlates with my moving the blinds, I know that drone is watching me. The same goes for bugs in the room: if you see a transmission you don't recognize and it spikes every time you speak, maybe you're being listened to. So those are just a few unexpected ways in which data can leak.
Another important aspect is that the UI can make all the difference. One thing that I mentioned earlier: you want to inform the users when the data is changed, so they can go over it and say, "hey, I did not delete four entries." That way the user will be able to know that they've been compromised. It's very important. Another thing is showing users how many active devices they have and how many encryption keys. This is from Conversations, the Android Jabber client. That's a good example. And there are other potential flaws, and safeguards against them, that you need to check. So just be aware of those.
Now I want to touch for a few minutes on improving the EteSync protocol.
One big issue that we had, which I'm actually a bit annoyed with, is that we tied together the username and the encryption key. I don't know if you remember from one of the earlier slides, we derive the master key from the email and the password. It was a great hack. It was safe, fast and easy to implement. The problem with that is that users apparently want to change their username, especially when the username is their email. I never considered that. When I originally designed the protocol, it made sense. I really want to change it now. So that's one thing that I'm going to improve. Unfortunately, it also made the username inconsistently case sensitive. Usually emails are not, but in EteSync's case they kind of are, sometimes, which is a big pain when it comes to supporting some users.
Another thing I want to do is improve the integrity assurances. This is actually from one of the earlier slides. It's a bit misleading because it looks like a fingerprint, but what it actually is, is an HMAC, which is an integrity check. It's okay in almost every case, but the problem is that when you have a shared journal, you can't know, in a cryptographically sound way, which of the participants made each entry. That is a problem nowadays, but when EteSync was initially designed, there were no shared journals, so it was never considered. Another thing is that everything I talked about with the integrity checks applies only to the journals themselves, but not to the existence of them, or the lack thereof. So I want to integrity-check the global state, and have the infrastructure for that, so there are no replay attacks and no downgrade attacks there.
I also want to move to per-device keys. You remember the slide from Conversations a few slides ago. That would be extremely nice to have. First of all, it can make better use of hardware tokens.
So the encryption key could just never leave my YubiKey or the secure enclave on my phone. That would be great. It can also handle lots of devices. But most importantly, and that's what I'm really excited about, it's a very good infrastructure for third-party support, because essentially, if you have an encryption key per device, you can also have an encryption key per app. And then you can just revoke or give access to a certain app in a cryptographically sound way, which is really great, and I'm really looking forward to having that.
So just a few finishing notes. I know it all sounds very scary and like, "oh my God, this is so hard." It's not that hard. You can solve all of these issues, and really, end-to-end encryption is the only way forward. So please, if you're making secure applications, make sure they are end-to-end encrypted. Privacy is a sacred right, so please don't give it up. Not just for your sake, but for everyone's sake. And don't forget that you are the weakest link. This is a famous xkcd comic. It doesn't matter how much encryption you have if someone has a $5 wrench.
So, a few useful links. This is my blog and website, and you have a link to the slides and the video there. Then the EteSync website: go register, file issues, give feedback. Then the source code, everything is open source, and the Signed Pages extension. If you're developing a secure application for the browser, please let me know. I'm happy to help you integrate that. I think it's really important for everyone's safety. And I guess that's it. There are no questions because I'm alone in this room, so thank you very much.