The next talk coming up is "Practical Mix Network Design: Strong Metadata Protection for Asynchronous Messaging", held by David, who has recently worked on mix networks and is a contributor to the Tor network, and by Jeff, who contributes to the GNUnet project, organized a couple of sessions on this at last year's Congress, and is basically a mathematician trying to get practical.
Okay, so I'm Jeff, this is David, and we're going to tell you about some aspects of designing mix networks. I'm an academic involved with the GNUnet project; he's involved with the Panoramix project. Okay, so first of all, just to be clear: crypto works. If it's properly implemented, we have a huge amount of trust in it, and we can show slides indicating that the powerful adversaries in the world can't break these things. However, we have to worry about metadata leakage, and in this talk we're going to be worrying about traffic analysis. It's time to actually start addressing that.
Okay, so existing solutions to traffic analysis. There's this wonderful Tor project, and we know that, as of about five years ago, the NSA considered it effective. So it works for what it's designed to do. But Tor does not protect against an adversary who can see both ends of the circuit. And if you visit a website that is fingerprintable, meaning the adversary knows its traffic profile, they can tell just from looking at your connection that you're looking at that website over Tor. So, okay, let's concede the case of browsing websites over Tor for now.
We're not going to defeat that kind of adversary very quickly. But what if we only want to message our friends over Tor? The problem with messaging is that frequently the people who want to communicate are in the same country or on the same ISP. So the original problem, the adversary being able to see both sides of the connection, comes up again: with the same ISP, the attacker can very quickly see both ends of the traffic. So how can we actually keep our message metadata private? We're going to say: mix networks. These are message-oriented, unreliable packet-switching networks, and they have a mix strategy, of which there are different kinds. On this architecture diagram, notice, and this is very important, that there are no exit nodes. You go through the different mixes, and we have a public key infrastructure similar to Tor's, which we just call the PKI. But there are some decisive differences between Tor and mix networks. We can do decoy traffic everywhere in this diagram: all the way to the clients, and in all directions. One of the issues with Tor is that even if you wanted to add decoy traffic, you couldn't necessarily protect against this website fingerprinting attack, because the adversary still sees the connection coming out the other side. Okay, one thing to mention here is that mix networks are actually the oldest of these tools, going back to David Chaum's 1981 paper. A few other tools have been proposed since then. One is private information retrieval, or PIR.
This works in sort of narrow situations, when you're trying to retrieve something from some kind of database. The scaling isn't perfect, but there are a few things you can do with it. The other thing that is generally proposed as an alternative to mix networks is dining cryptographers networks. The problem with them is bandwidth: you're literally paying a quadratic cost in the number of users, so your anonymity set is going to wind up being very small. And if you're building something that inherently has a smaller anonymity set, then you have to ask: who are we protecting? You're not protecting whistleblowers anymore. If a whistleblower talks to a journalist and it's unclear which journalist, well, he's still the guy who knew the thing and who talked to somebody at Der Spiegel. Whereas who it does protect is somebody who already has a lot of power, and who is going to be hard to go after anyway. So what we really want is to blow up the anonymity set as large as possible, and that's why we like mix networks.
Okay, so we're going to talk about a few attacks on mix networks and some defenses. Epistemic attacks are not among the attacks we're really going to focus on, because that's a specialized area of research. There's actually a bunch of papers on breaking different public key infrastructure systems for things like peer-to-peer networks. But I guess we should mention that the mix network literature generally assumes you have a PKI that tells clients about the network.
So usually when anonymity researchers talk about a PKI, they assume something like the Tor directory authority system, where you have some trusted people who run the thing. This does present an availability problem. It's what the Katzenpost project in Panoramix is doing. It also presents a scalability problem, and a serious limit on what you can do. There are other ideas for making it more secure than relying on trusted people acting as authorities. We have people in the GNUnet project who are researching this, primarily for peer-to-peer networks, because it's very important there. People have tried to build distributed PKIs, and there are very serious attacks that work against them, attacks that just can't be ruled out completely. If you want to build a distributed PKI, then you really have to know how bad the attacks are, and what happens if so-and-so many participants are compromised. There's a lot of interesting stuff there. But what we take away from the epistemic attack is how scalability comes in, and David is going to tell you about that.
Mix networks can use cascade topologies, where every user uses the same route. The idea there is to guarantee a certain level of security along the route, but cascades have a scalability problem. So we have other options, like free-route and stratified topologies, and free-route actually comes out slightly worse than stratified. Claudia Diaz has an excellent paper on this. There's another point about free-route topologies: you may picture the network as free-route, but in practice it tends to drift away from that.
You imagine a free-route topology, but maybe you just wouldn't end up there anyway, even if you tried. Here we have another diagram, for the stratified topology. With this stratified topology, each layer-0 node connects to each layer-1 node, each layer-1 node connects to each layer-2 node, and so on, and you can calculate the entropy of each mix, in contrast to the free-route case, where that is very difficult. Stratified also scales: you can put more mixes in the layers, and you can add more layers. I should mention that we'll sometimes put citations on the slides; don't fixate on them too much. On this one, Claudia Diaz has a very good paper on the topic.
Okay, so why isn't this Tor? The main thing we can say is that Tor doesn't mix. The packets coming in at a particular point in time are basically the same packets going out within a very small window, so an observer can narrow the candidates down to a very small number. What a mix strategy does is add latency to reduce this correlation. David Chaum in 1981, in the first mix network paper, described the threshold mix. Say this mix has a threshold of four: it accumulates four input messages, and once it has reached its threshold, it shuffles them and sends them out.
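The threshold mix just described can be sketched in a few lines. This is a toy model only: the class and method names are made up for illustration, messages are plain strings, and the per-hop decryption a real mix performs is left out. A real mix would also need a cryptographically secure source of randomness for the shuffle.

```python
import random


class ThresholdMix:
    """Toy threshold mix: buffer messages, flush a shuffled batch at the threshold."""

    def __init__(self, threshold, rng=None):
        self.threshold = threshold
        self.pool = []
        # A seeded random.Random is only for reproducible demos;
        # a deployment would use a CSPRNG such as random.SystemRandom.
        self.rng = rng or random.SystemRandom()

    def receive(self, message):
        """Accept one message; return the shuffled batch once full, else []."""
        self.pool.append(message)
        if len(self.pool) < self.threshold:
            return []
        batch, self.pool = self.pool, []
        self.rng.shuffle(batch)
        return batch


mix = ThresholdMix(threshold=4, rng=random.Random(1))
outputs = [mix.receive(m) for m in ["a", "b", "c", "d"]]
```

The first three calls return nothing; the fourth releases all four messages in shuffled order, so the arrival order no longer tells an observer which output belongs to which input.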
Now, mixes are also unwrapping a layer of encryption on each and every message. So if I were an attacker and wanted to break this, I could wait until the mix is empty, or I could make the mix empty by sending my own messages into it. Then, when a target message enters the mix, I send in enough of my own messages to reach its threshold and flush everything out. I recognize all of my own messages in the output, and the one message I don't recognize is the target message. You can keep doing this at each hop. This is called an n-1 attack, or blending attack, and there are a lot of variations on it.
We also have continuous-time mixes, like the stop-and-go mix and the Poisson mix strategies. These mix strategies allow the client to select the delay for each hop, usually drawn from an exponential distribution. An attacker who wants to break this with a blending attack first empties the mix queue by blocking all incoming messages and waiting for some period of time after which it's highly probable that the mix is empty. Then they allow one target message into the mix, continue to block other messages, and simply wait for that message to come out.
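To make the continuous-time case concrete, here is a small sketch of the two quantities mentioned above: the per-hop delays a client draws from an exponential distribution, and how long a blocking attacker has to wait before the mix is empty with a given probability. The function names, the mean delay, and the assumption that every queued message has an independent exponential delay with the same mean are illustrative choices, not parameters from any deployed system.

```python
import math
import random


def sample_hop_delays(num_hops, mean_delay_s, rng):
    """Client side: draw one exponential delay (in seconds) per hop."""
    rate = 1.0 / mean_delay_s
    return [rng.expovariate(rate) for _ in range(num_hops)]


def wait_until_probably_empty(queued, mean_delay_s, confidence):
    """Attacker side: waiting time t after which `queued` messages have all
    left the mix with probability `confidence`, while all input is blocked.

    With independent exponential delays (memoryless), each message is gone
    by time t with probability 1 - exp(-t / mean), so we solve
    (1 - exp(-t / mean)) ** queued = confidence for t.
    """
    per_message = confidence ** (1.0 / queued)
    return -mean_delay_s * math.log(1.0 - per_message)


delays = sample_hop_delays(num_hops=3, mean_delay_s=2.0, rng=random.Random(0))
t_small = wait_until_probably_empty(queued=5, mean_delay_s=2.0, confidence=0.99)
t_large = wait_until_probably_empty(queued=10, mean_delay_s=2.0, confidence=0.99)
```

Under these assumptions the waiting time grows only logarithmically with the number of queued messages, which is part of why blocking-based blending attacks are practical against a mix that has no other defense.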
Now, we have some defenses against these attacks, for example the heartbeat protocol; it's also in the Loopix paper. We have mixes sending a kind of decoy traffic that we refer to as mix loops, or heartbeat traffic, where a mix sends itself a message: it's self-addressed, it goes through the mix network, and it comes back. If the mix doesn't receive its heartbeat within some timeout, it could be under attack, although of course there could be other problems in the network as well. Maybe you can correlate an attack with several missing heartbeat messages. There are other defenses against blending attacks as well; there's a recent paper published, but we're not going to talk about that right now.
The next category of attack is statistical disclosure attacks. I like to think of it as the adversary abstracting the entire mix network as if it were one mix. A lot of this literature is written from the perspective of point-to-point networks, where Alice and Bob receive messages from the mix network at their home IP addresses. A more modern sort of architecture might involve queuing messages, and this is a concept used in the Loopix design as well. Loopix has a bunch of different decoy traffic types, in order to add noise to the signal at various locations in the network. There's drop decoy traffic, where a client selects a random destination provider to send a message to, which traverses the mix network and gets dropped by the provider. There are also client loops. And I should mention that with these statistical disclosure attacks, we don't know how well a lot of this will work in the real world, because it really depends on the adversary being able to predict the behavior of the users well enough to
say that it's a static system. Of course, that depends on how much information the system leaks. Mix networks always leak information. It's always a question of understanding how much information is leaked, and how much user behavior varies, so that even though information leaks, it's not predictable. That always depends on the specific system and how it's tuned. In the point-to-point style of mix network, where people receive messages directly from the mix at their home IP address, a very passive attacker is enough. In the more modern Loopix design, where messages are queued, it takes an active attacker. And there's some padding to the client, so the client appears to receive the same amount of traffic either way.
Okay. So you might ask: could I get away without paying these costs? The answer is no, at least by some artificial measures. We have some artificial measurements for that, but the anonymity you get is bounded by what you are willing to spend. In the situation of Tor, the question is always whether adding cover traffic to Tor would help. Whatever cover traffic you add, the gain is still something relatively small. So you'll notice here that the anonymity looks very tiny, but the costs are in the number of users. What we're talking about is paying some sort of fixed up-front cost. Part of it is in terms of the user's experience, and part of it is in terms of the network connection, but it's doable.
Just to sort of wrap up this section about topology and so on: people have made these sort of quasi-religious statements about encryption, like "encryption is basically free." In a mix network, we're going to have to pay some costs.
One thing about mix networks is the packet format. There's this wonderful one, a very reasonable one, and it's sort of stopped much further development in this area. It's called Sphinx, from George Danezis and Ian Goldberg. Sphinx is quite compact, and it has a very nice security proof. It has a header and a body, and that's why it's called Sphinx. The body has to be encrypted with what's called a wide-block cipher. There are now some newer wide-block ciphers, like AEZ by Rogaway, and some others. So let's look at the packet format. The header contains a public key, an elliptic curve point. The way we think about a mix node processing a packet is as a key exchange between the mix node and Alice. Alice first does her side of the key exchange, and then the mix node computes the other side of the Diffie-Hellman. From that Diffie-Hellman shared secret, it has to mutate all of the different parts of the packet. And how does it do that? Sphinx gives the rules for how these parts of the packet are changed.
Now, why is this delta, the body, treated differently? The header part is MACed, and delta is not. Why is there a MAC on the header and not on delta? This seems very, very dangerous. If you were just using an un-MACed stream cipher, then an adversary who controls both a mix node and the place where the message is going could just XOR an arbitrary pattern into the body and then check for it when it comes out. But we're using a wide-block cipher, so an attacker doing this same sort of thing only gets a one-bit tagging attack: any change scrambles the whole body, so all they learn is whether the packet was tampered with. That's still an attack. Why would we tolerate even the one-bit tagging attack? The answer is that anonymous receivers really matter. There are a few cases: a journalistic source, some kind of service, some cryptocurrency network, downloading some file. With anything you interact with, there has to be some kind of acknowledgement coming back, so you have to be able to do protocol ACKs in a messaging system.
Okay, so how do we do anonymous receivers? We create what's called a SURB, a single-use reply block, which consists of the node to send it to, the header, and one cryptographic key for one layer of encryption. The recipient makes up this SURB and supplies it to the sender at some point in the past. The sender attaches the delta and can then send to the recipient.
Okay, great. Now let's get into some details. We might worry: if you looked at the key exchange, the sender just made up her alpha on the spot, so her key is ephemeral, but the mix node's key wasn't; it was supplied by the PKI. We would like this to be forward secure. Tor is forward secure, because its nodes do live negotiations with each other. So we need some kind of forward security, and we don't have it. First of all, in a mix network, we need some kind of replay attack protection anyway.
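The tagging problem with an un-MACed stream cipher can be shown in a few lines. This is a toy demonstration, not Sphinx: the keystream is generated with SHAKE-256 purely for illustration, and the key and message are made-up placeholders. It only shows the stream-cipher half of the argument; a wide-block cipher, where any change garbles the entire body, is not implemented here.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream from SHAKE-256 (illustrative only, not Sphinx's cipher)."""
    return hashlib.shake_256(key).digest(length)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


hop_key = b"hypothetical per-hop key"
body = b"meet at the usual place"

# Honest sender encrypts the body for one hop.
ciphertext = xor(body, keystream(hop_key, len(body)))

# A malicious mix XORs a chosen tag into the ciphertext without knowing the key.
tag = b"\x00" * (len(body) - 1) + b"\x01"
tampered = xor(ciphertext, tag)

# After decryption, exactly the tagged bits differ: the mark survives intact.
recovered = xor(tampered, keystream(hop_key, len(body)))
difference = xor(recovered, body)
```

Because `difference` equals `tag`, an adversary who also sees the plaintext at the far end can recognize an arbitrary multi-bit mark, which is the attack the wide-block cipher reduces to a single bit.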
So what this requires is some sort of data structure that will eventually fill up or overflow. To prevent that, we have to do key rotation anyway. So one option is to just rotate the keys fast. The problem with that is that you don't want to stress the PKI too much, because the PKI already has trouble scaling. Another problem is that SURB lifetimes are tied to the node key lifetime: they can't exceed it. So if we want the window for key compromise to be smaller than the SURB lifetime, we have to do something else.
So there are a couple of ideas. The idea is, okay, maybe we can be a little like Tor and use more packets for the packet we want to send, but not the way Tor does it. George, back in the 2000s, proposed using two packets in different key epochs. That's pretty good; it gives you a lot of nice properties. There's another thing you can do that I've been working on: you pick some hops, send a loop message to the mix node to do a key exchange, and then use a double-ratchet-like construction at that mix node for those hops. The problem is that neither of these two things is cheap, and you wouldn't want to do them at all hops, because they create correlations.
So in general we can ask: how do we make the key exchange with our mix node forward secure? We can talk about the different basic technologies for key exchanges and the properties we can get out of them. Anything that's based on elliptic curves is not going to be post-quantum. So if we want that, we need something else.
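The link between replay protection and key rotation can be sketched as follows. This is a minimal model under stated assumptions: the class name, the epoch counter, and the idea of deriving the replay tag by hashing the per-packet shared secret are illustrative, and a real mix would also bound memory use and typically accept a short overlap between adjacent key epochs.

```python
import hashlib


class ReplayCache:
    """Toy replay filter whose contents only live as long as the mix key."""

    def __init__(self):
        self.epoch = 0
        self.seen = set()

    def rotate_key(self):
        """A new key epoch begins: old packets can no longer be decrypted,
        so the tags remembered for the old key can be discarded."""
        self.epoch += 1
        self.seen.clear()

    def accept(self, packet_epoch: int, shared_secret: bytes) -> bool:
        """Return True the first time a packet is seen in the current epoch."""
        if packet_epoch != self.epoch:
            return False  # built for an expired (or future) mix key
        tag = hashlib.sha256(b"replay-tag" + shared_secret).digest()
        if tag in self.seen:
            return False  # replayed packet
        self.seen.add(tag)
        return True


cache = ReplayCache()
first = cache.accept(0, b"secret-1")
replay = cache.accept(0, b"secret-1")
cache.rotate_key()
stale = cache.accept(0, b"secret-2")
fresh = cache.accept(1, b"secret-2")
```

The set of seen tags grows with every packet until the key rotates, which is why the rotation interval bounds both the size of this structure and the lifetime of anything, such as a SURB, that was built against the old key.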
So there is a blinding operation in Sphinx. I did think about doing that in a post-quantum context. We don't know if it works for LWE, and we certainly have no idea how to do it efficiently. Our ratcheting strategy gives us nice key erasure properties. It gives us post-quantum security if the loop did a post-quantum key exchange. And there's another nice property that you can't really get the other way, which is that you can actually have a hybrid post-quantum property. That means you can use both an elliptic curve key exchange and a post-quantum key exchange, and as long as either one of them holds up, the adversary can't break it. If you try to do the Sphinx construction with something like LWE, you're probably not going to be able to get that hybrid property, because the blinding operation itself would rest on the LWE cryptographic assumptions. Nevertheless, I want to conjecture that LWE, which means learning with errors, can give us the post-quantum properties that we want. We can probably find something where we have a nice blinding for LWE, and that even has a puncturing function. What we can currently do there is extremely slow, but I think with learning with errors we can do this much, much faster.
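The hybrid idea, using both key exchanges and deriving one key from both results, can be sketched with a standard key derivation function. This is a sketch under stated assumptions: the two input secrets are placeholder byte strings standing in for the outputs of an elliptic curve exchange and a post-quantum exchange, the label is hypothetical, and HKDF with SHA-256 (RFC 5869) is my choice of combiner, not something the talk specifies.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand to `length` bytes."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def hybrid_key(ecdh_secret: bytes, pq_secret: bytes) -> bytes:
    """Derive one hop key from both shared secrets.

    Length prefixes keep the concatenation unambiguous. The intent is that
    the output stays unpredictable as long as at least one input is secret.
    """
    ikm = (len(ecdh_secret).to_bytes(2, "big") + ecdh_secret
           + len(pq_secret).to_bytes(2, "big") + pq_secret)
    return hkdf_sha256(ikm, salt=b"\x00" * 32, info=b"hybrid hop key (hypothetical label)")


# Placeholder secrets; real ones would come from the two key exchanges.
key_a = hybrid_key(b"\x01" * 32, b"\x02" * 32)
key_b = hybrid_key(b"\x01" * 32, b"\x03" * 32)
key_c = hybrid_key(b"\x04" * 32, b"\x02" * 32)
```

Changing either input changes the derived key, so an attacker who breaks only the elliptic curve exchange, or only the post-quantum one, is still missing an input to the derivation.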
So in that case some classical networking literature can be applied, namely automatic repeat request: acknowledgements and retransmissions. But that could be used for correlation if the attacker is able to drop a packet. We have three cryptographic layers in the project we're working on right now. Yawning Angel wrote a cryptographic link layer based on the Noise framework and X25519. We also have the Sphinx cryptographic layer, the cryptographic packet format that Jeff talked about earlier, and we also have end-to-end cryptographic messaging.
This is another sort of Loopix-style diagram, with Alice and Bob. With some relatively simple changes from this, we can have stronger location-hiding properties, where Alice and Bob don't talk directly to their providers but instead send SURBs through the mix network to retrieve their messages. This would not be low latency.
A lot of the time people tend to propose PIR for this, for retrieving things from your provider. One of the problems with using a PIR scheme is that you have very different assumptions in play, and the way you model it is necessarily going to be quite complex; it makes everything more complicated. That is why, in the scheme David is talking about here, the mix network itself gives you this property. Whereas in the plain Loopix design, you don't have strong location-hiding properties: if someone wants to find out where Bob is and gets into his provider, they get his IP address.
Okay, one problem with these provider models is that, like David just said, your provider can get hacked. There's a way to fix that. It requires modifying Sphinx a bit, but it's a good idea to go through the security proof again anyway. The idea is that in the middle, this hard drive in the picture, we have some sort of mailbox server or accumulation server
that the receiver can move without telling his contacts. His contacts actually reach him by other means, but at the end of the day the messages end up at this server, which I won't go into in much detail. The idea is that the receiver supplies SURBs to this point in the middle when he goes online, and it then sends him what it has stored on disk. One of the nice things is this: at the end of the day, once you have a proof, your security result for the mix network is going to be something like "they're not going to be able to deanonymize you within three months." So we may be able to do a bit better, because he can move this guy in the middle periodically. This is very much work in progress. It's not at all in Katzenpost yet, and to do it we need a number of proofs, and then some more proofs.
Okay, let's talk about applications. So far we've talked about messaging, and we're still just sending messages. But to give you something a bit more concrete: there are a few strategies for doing payments anonymously. Most of them are shit, but a few of them make very strong cryptographic promises about anonymity. Taler uses blind signatures, and those are, as far as anonymity goes, unbreakable. That's something that could work over a mix network. There are also projects that want to package up websites and send them over mix networks. But at a fundamental level, if we want to do something collaborative like Google Wave or an Etherpad, we run into very particular problems, and the latency will also affect the user. We haven't thought this through yet, but we would like to think about how to make people happy with higher-latency applications. It sounds hard, but if you think about how people build more modern web frameworks, with things like CouchDB, it's not really about improving the latency; rather, you decouple things in a way that is pleasant for the user. And it would be cool if you could say: hey,
use this app, it's not too annoying. If I send you something, it might not arrive; then we have to go around again and send it again, and that gives away more information. But people need reliability. In Loopix you can also lower the latency, because you can send more decoy traffic to compensate; it's a matter of tuning, and we can play around with it when we want a mix network with lower latency. Networks are also faster than they were ten years ago. And there is a group that is really trying to build reliability on top of the mix network, or directly into the mix network. That's very important: when you use an app, very often security gets compromised for the sake of reliability, and if you can get reliability in a way that keeps the security, that matters a lot.
We want to thank the researchers who worked with us: Yawning Angel, thanks for the advice and the work on the specifications; George and Claudia, for excellent papers. I talked to Christian, which was very important, as well as to others about the PKI. And Trevor Perrin has a great mailing list where we have gathered a lot of important ideas, and he also worked on the Noise protocol. Here you can find information about our project. That's it from us, on to the Q&A.
Thank you so much. If you have any questions, please line up at the microphones. Do we have questions from the internet? No questions from IRC. There is one question at microphone 1.
You mentioned the latency will be higher than Tor's. Is that a few seconds, minutes? What's the order of magnitude?
So the question is how much higher the latency will be than Tor's. We don't really know until we tune the mix network. I should start by saying that mix networks aren't trying to be a general-purpose service like Tor. We're trying to make customized networks for specific applications. Each application has different traffic patterns and different ways of being used, so the latency number would necessarily come
after some tuning. My guess is that it may be a few minutes, but I can't answer the question yet. Actually, the researchers we're working with are about to publish a new paper on how to tune decoy traffic and latency for the entropy you want in each case.
Microphone number 2, your question.
You mentioned that the PKI in mix networks has a bigger scaling problem than in Tor. It seems that if a mix network is meant as a replacement for email, you want everyone in the world to use it.
If you work through a sort of very bullshit back-of-the-envelope computation, there's an argument that a centralized PKI plus whatever anonymity system is only about 10 million times better than just sending every message to everybody. So, okay, there's that. And specifically, when I said it's less of a problem for Tor: Tor can do certain clever things. There's one idea, which nobody is actually taking that seriously at the moment, where they publish the consensus, but nodes don't actually download the whole consensus at all. They just point to a place in the consensus, and they get back a proof that they were given the correct node. That gives another order of magnitude or two on top of that 10 million.
Okay, back over there, microphone number 3.
Hi, this looks like really good work and I'm very happy to see it. If there are different applications here which have different tuning settings, can they share the same mix network?
Okay, it would be best if they could help each other by increasing each other's anonymity set, but we're concerned that the specific tuning of the decoy traffic may be incompatible in some cases. And there are some other considerations as well: since we're not stream-oriented, all the data has to fit in one packet, and
so if we have an email use case, we're probably going to land at around 50 KB, the average size of an email, let's say. And if we want to build a chat application, I might send really short messages like "yo, what's up", and now we're sending that in a big 50 KB packet. Obviously, if your traffic is going to be quite infrequent, like a payment, then you should be using a network with much more frequent packets, and you're just going to accept the inefficiency. There's another consideration too, which is that in these chat applications the communication partners might be symmetrical and send each other roughly the same amount of data. Whereas with stuff like the web, not that I think mix networks are good for web browsing, it's more like you GET a page and then you receive a bunch of information, so that's asymmetric. So what would the decoy traffic look like for that, versus a symmetrical communication partnership? That's what I meant by some applications not being compatible with each other. For most of the peer-to-peer-like things, your collaborative applications, your email, your payment network, we'd certainly hope that all that stuff could be bundled onto one network that is sort of optimized for this email-like 50 KB. Whether you then need a separate messaging network at all is the question.
All right, microphone number 1, what's your question?
Can you give more concrete examples of software that one could try out? Papers are nice and good, but is there anything you can hack on?
Actually, right now we're running a mix network on several machines that we had lying around, and it was great, thanks for the help with that. But we don't really have anything near production-ready or finished. The things I talked about will work soon, but to answer your question
we don't have anything yet, but we hope that changes. I don't know how soon it will be; that depends in part on how it gets financed.
I was thinking about this in the real world. You set up an app for communicating on a normal phone. Say two users want to use this app to communicate. Someone sends a message, and a while later the other person takes out their phone and the screen turns on. So much happens outside of the mix network: if you can correlate who takes their phone out of their pocket with when a message was sent, you can find things out. What are the problems you're thinking about there?
I have no idea what the solution for making the user happy is. The phone doesn't ding anymore; you don't get notifications. You see your messages when you check your phone. But I would actually like that. The question here is whether that would actually make people happy. Everybody now tries to maximize engagement, and you actually don't want to do that anymore. You want people to use it only when they really want to use it.
All right, thank you. It seems there are no further questions, so thanks a lot to Jeff, thanks a lot to David.