"Maybe", "it's our intuition that" — all these hand-wavy answers don't really make me comfortable as an anonymity software developer, so we had to figure out a way to nail some of this down and get rid of a little bit of the speculation. So we picked two real-world implementations of anonymity software: Mixmaster, which is software that's been around for almost ten years, and Reliable, which has been around for about five. We also wanted real traffic. There's a lot of speculation about how the traffic on the remailer networks behaves — well, damn it, that's not good enough, so we wanted to actually tap that and see what was going on. What I'm talking about here is based on joint work with two lovely PhD students at K.U. Leuven, Claudia Diaz and Evelyne Dewitte, and we're presenting a full paper on this next month in France at the ESORICS conference; you can get that paper from me if you ask. So I'm going to step back here and do a quick refresher on the mechanics of strong anonymity. I'm going to assume most of you are vaguely familiar with this, and I don't have the time to go into exactly how all of it works, but I'll give you an overview. Everything we do in the strong anonymity area falls into a number of different categories, and the one major category is mixes. All of this is based on work that David Chaum, a cryptographer, first proposed in a Communications of the ACM article in 1981. Note that it was mainly pitched in that article as a novel application of public key cryptography, which, you know, was only about three years old at that point in time. He proposes a system with individually untrusted nodes: you don't have to trust any one third party to protect your anonymity. You trust the network as a whole, but you can have some bad guys in that network. You have multi-layered encryption chains. So you've got this network, you've got a bunch of remailers — remailer A, B, C, D — and you want to send a message to your buddy down there in the front row. You address your message to him, and then you encrypt that message to the last remailer in your chain, remailer D. You encrypt that encrypted packet to the second-to-last remailer in the chain, remailer C, and — let's say you're only using three remailers in this chain — you encrypt that one to remailer A and mail it off. Remailer A strips off its layer and says, hey, I've got a message here from Len, and I can see it's going to remailer C. Remailer A doesn't know where it's actually going to end up; it knows where it came from, but that doesn't really tell you much. Remailer C takes it and goes, hey, I got a message from remailer A, and it's going to D — all right, passes it on. And remailer D says, ah, here's a message from another remailer, and it's going to the recipient, and delivers it. That's pretty simple. Now, there's a whole class of attacks on actually tracking these things. If you're an observer on the network, you can watch all of this — sort of like a shell game — and do a lot of things to figure out what's going on. So Chaum had some suggestions for beating some of these attacks, which have been widely studied in the 20 years since but were all just kind of kicking around his brain at that time, including making all of the message packets the same size, using encryption that isn't taggable, and, of course, the key point: randomly reordering the messages at each hop.
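Before moving on, here's a minimal sketch of that layering idea — not Chaum's actual construction or Mixmaster's real packet format; pk_encrypt is a stand-in for a hybrid public-key encryption step, and the dict framing is purely illustrative:

```python
# Minimal sketch of Chaum-style layered encryption. pk_encrypt(pubkey, data)
# stands in for hybrid public-key encryption; addressing is illustrative.

def build_onion(message, recipient, chain, pk_encrypt):
    """chain: list of (address, pubkey) in travel order, e.g. [A, C, D]."""
    addresses = [addr for addr, _ in chain]
    # Innermost layer: only the final remailer learns the true recipient.
    packet = {"next_hop": recipient, "payload": message}
    # Encrypt to the last remailer, then wrap a routing layer for the one
    # before it, and so on back out to the first hop.
    for i in range(len(chain) - 1, -1, -1):
        _, pubkey = chain[i]
        packet = pk_encrypt(pubkey, packet)
        if i > 0:
            packet = {"next_hop": addresses[i], "payload": packet}
    return addresses[0], packet  # mail the finished onion to the first hop
```

Each hop decrypts its own layer, reads next_hop, and forwards the still-encrypted payload: A learns only the sender and the next hop, and only D learns the recipient.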
So a message comes in; you don't want it to be a first-in, first-out protocol, because then an observer can just count things as they go through. So the software Mixmaster — full disclosure, I'm currently the maintainer of it — is a mix net implementation; it's the first real mix net implementation. It deviates in a number of key ways from Chaum's work, basically to address attacks that hadn't been thought of when he invented all of this. Prior to Mixmaster there was an attempt at doing nested encryption, the Cypherpunk remailers, but that doesn't even come close to the threat model that Chaum's original paper protects against, so we're not considering it in this analysis. Mixmaster has clients available for Windows, Macintosh — the old Mac OS and Mac OS X, which is Unix, of course — and we've got clients for Unix. There are servers available for Unix, Windows, and, of course, OS X. It's got low hardware resource requirements: doing RSA operations is pretty cheap these days, and Triple-DES is also pretty cheap. You need a reliable network connection, and you need mail server capabilities, which Mixmaster does not have built in — its way of transferring messages relies on SMTP. That's a reliability problem, but both Mixmaster and the other software we're going to talk about have that same requirement. Here's a little graphic — my only graphic — of Mixmaster packets, showing how we do things. At the bottom there is the message with your padding, and then you've got a stack of headers. As you send the message through the network, headers are pulled off, and they're replaced with junk which is indistinguishable from headers. So the full message stays the same size, and you can't watch things shrink as they go through the network. I just kind of covered all of this when I talked about Chaum's original method, but to drill down: the Mixmaster protocol — or the Type 2 protocol, as it's now known, since Mixmaster is the name of both the software and the protocol, and that's confusing — does chain selection, as I just described, then encryption, padding, and splitting. The payload size is roughly 10K; a total message is about 30K with all the headers. If the content you're sending is larger than the 10K limit, it's split into multiple packets, and if it's smaller, we pad it out, so it's always the right size. A node operator — somebody running Mixmaster, and this is assuming they're running the Mixmaster that I distribute and haven't modified it in any malicious way — sees messages come in, they go in the pool, Mixmaster does its thing, which I'll talk about more in a minute, and then they come out. If you go and ask an honest operator where a given message came from, he's not going to be able to tell you anything. He doesn't keep logs, and he doesn't watch; he doesn't have a systrace running, watching what's going where. But assume he's malicious, and he's actually running a Trojaned version of Mixmaster that logs everything and doesn't mix anything. Still, all he can tell you is, oh, this given message came from this place, which was likely another remailer. Now, an outside observer — somebody who is on the network watching things but hasn't compromised the individual remailer nodes — is watching traffic come in and watching it go out, and he, hopefully, cannot correlate messages from their inputs to their outputs. That's the assumption we make that makes this a strong method of anonymity.
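Backing up to the packet diagram for a second, here's a toy version of what it shows. The sizes and the header-slot count are illustrative stand-ins, not the real Type 2 field layout:

```python
import os

HEADER_SLOTS = 20          # illustrative; not the actual Type 2 counts
HEADER_SIZE = 512
PAYLOAD_SIZE = 10 * 1024   # the ~10K payload limit mentioned above

def split_or_pad(content: bytes):
    """Content over the limit becomes multiple packets; short content is
    padded with random bytes so every packet body is exactly PAYLOAD_SIZE."""
    chunks = [content[i:i + PAYLOAD_SIZE]
              for i in range(0, max(len(content), 1), PAYLOAD_SIZE)]
    return [c + os.urandom(PAYLOAD_SIZE - len(c)) for c in chunks]

def pop_header(headers: list):
    """At each hop: consume the top header and append indistinguishable
    junk, so the header stack (and total packet size) never shrinks."""
    mine, rest = headers[0], headers[1:]
    rest.append(os.urandom(HEADER_SIZE))
    return mine, rest
```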
Now, back to that correlation assumption: if it's flawed, then we don't resist an outside observer, and we're not really much better than simple proxies that strip headers and forward things on. That's not what we want. That's an easy thing to do, and it doesn't beat your ISP, it doesn't beat your boss, it doesn't beat the spouse you're cheating on — whoever you're trying to hide from. There is also a key thing to note here: a system like this requires a large anonymity set. That means you need other people behaving the same way you are. If you're using Mixmaster properly and there's no one else using it — well, gee, who's using it? Just you. The more users there are, the greater the total anonymity provided to each of them. So I've been stressing in my talks recently that privacy software really needs to bring home the point that usability is a security consideration. This is particularly important in the case of anonymity, because if you don't have users, or you have a small number of users, you weaken your anonymity. Mixmaster also uses cover traffic. There are two ways of doing cover traffic — these are dummy messages that aren't actually going to a real recipient; they're just noise added to the system. There is a good reason for using dummy traffic internally in the network: it's a protection mechanism against a number of active attacks, which I'm not going to talk about but will answer questions about if you're curious. And then there's end-to-end dummy traffic for the users, which is really hard to actually make work and tends to be a waste of resources. Both Mixmaster and Reliable do internal dummy traffic, to varying degrees of possible success. Reliable. Reliable is a message-format-compatible remailer. The similarity to Mixmaster ends at the message format, the Type 2 message format. It encrypts and decrypts things the same way Mixmaster does, and it relies on SMTP to transfer messages, but in regard to the way it mixes things, the way it hides your identity, and the anonymity properties it provides, it is not in any way similar. It's a Windows-only server, written in Visual Basic — and that's about as far as I'm going to go commenting on the quality of the source code; draw your own conclusions there. It uses an entirely different mixing method, and we'll talk about that. Now, with mixes, there are two major classes of mixing methods. There are pool mixes, which have multiple variations; largely they differ in how they deal with different active attacks. The canonical active attack that we talk about is referred to as the n-1 attack. Simplest example: we have our pool. The way the pool works, it's basically a bucket — ignoring the startup time, where you have to fill it to a certain level. Messages are in there, more messages come in, and messages are pulled out at random. If this is a black box and you're an observer on the network, you can't correlate input and output. The idea behind the n-1 attack, which is also referred to in the literature as a trickle attack or a flooding attack — a trickle attack is a variation of a flooding attack — is that you somehow flush out all the messages in the pool and replace them with your own.
So if every message in the pool is one you own, then you can sneak in a message that you've somehow managed to get hold of — maybe because you run a rogue remailer and you're grabbing message traffic. This is a message that belongs to the person whose anonymity you're trying to break. You stick it in the honest mix's pool, and when messages come out, they're either yours or one that you haven't seen, which is the one you're interested in. So one easy way of doing this is to just flood the mix: you flush everything out, stick all of your own messages in, and then stick in the target message. Mixmaster uses a variation on the pool mix called the timed dynamic pool mix. That's a big phrase for a fairly simple method of dealing with a couple of these things. Mixmaster has a threshold pool mix, in that it won't let messages go out unless the pool is a certain size, but it is dynamic in that if it starts getting lots of messages, the minimum number of messages that have to stay in the pool before it will flush rises. So you might have a policy of flushing all but 50 messages from the pool, but if you start getting lots and lots of messages, there's another value on top of that that says we'll flush at most a maximum percentage of the pool. I'm not going to do the math off the top of my head right now, because I'm horrible at it, but you can end up with the pool size dynamically growing in response to either natural growth in message traffic or an active attack, such that you actually keep a minimum of more than 50 messages. This deals pretty well with the flooding attack. There's another attack, the trickle attack, where you sneak messages in over time; that's a little more difficult to deal with using the pool alone, but we can hopefully beat it with internal dummy traffic. Okay, that's pool mixes — or rather, that's one type of pool mix, and it's the type that Mixmaster uses. Now I'm going to talk about an entirely different method of mixing: stop-and-go mixes. This is the method that Reliable uses. Stop-and-go mixes are designed with a different idea in mind: they want to guarantee reliability. If you have a pool, and messages are randomly pulled out at whatever time the pool is of appropriate size, you may have high latency; the message may never arrive. These are viewed as problems, and stop-and-go mixes attempt to address them. The thing to notice here is that in a stop-and-go mix, each message has a delay that is independent of all the other messages, whereas the delay in a pool mix is affected by the number of messages, the rate at which they come in, and all those other factors. So what happens in a stop-and-go mix is that a message comes in tagged with a timestamp, and when that timestamp expires, the message is let free. Stop-and-go mixes were originally proposed by a guy named Dogan Kesdogan. He had a lot of magic in his paper that presented ways to address the various active attacks I've already mentioned, such as putting a batch number or timestamp on the message that says if it's older than this, throw it out. These are ways of making sure messages aren't delayed and aren't injected to do flooding attacks and so forth. But in order to make all this work, each individual user needs to know the point in time at which their message will enter and exit each node in the network that they choose — more on that in a second.
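Here's a sketch contrasting the two batching disciplines just described. The constants are illustrative, not Mixmaster's or Reliable's shipped defaults, and a real mix has to draw its randomness from a cryptographic RNG (more on that later):

```python
import random, time

MIN_POOL = 50        # the "flush all but 50" policy from the talk
MAX_FRACTION = 0.6   # illustrative cap on the fraction flushed per round

def timed_dynamic_pool_flush(pool):
    """Called each time the timer fires. Under flooding, the fraction cap
    means the retained pool grows past MIN_POOL, which blunts n-1 attacks."""
    n = max(0, min(len(pool) - MIN_POOL, int(len(pool) * MAX_FRACTION)))
    batch = random.sample(pool, n)   # random selection defeats FIFO counting
    for m in batch:
        pool.remove(m)
    return batch

def stop_and_go_schedule(message, max_delay=3600):
    """Each message gets an independent release time; nothing about the
    delay depends on how many other messages are present."""
    return (time.time() + random.uniform(0, max_delay), message)
```

The structural difference is visible right there: the pool's output depends on everyone else's traffic, while the stop-and-go delay is per-message — which is exactly what goes wrong when traffic gets thin.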
So basically, either your user needs to be able to see the future, or you need some other mechanism for getting this information to them. But regardless, it's yet to be proven whether those countermeasures work or not — and that's not what we're disputing here, because Reliable doesn't attempt to perform the magic. It just disregards these problems altogether: it puts a timestamp on, and when the timestamp expires, it sends the message on. So why does it do that? That seems kind of stupid. Well, the Type 2 protocol was designed for Mixmaster, and Mixmaster was designed from the start as a pool mix, so there was no way — and no reason — to include any of the extra protection mechanisms that Reliable, or any stop-and-go mix, would need, because Mixmaster wasn't doing stop-and-go. So here we have the necessity for clairvoyant users or, as Kesdogan suggests, an information service that can accurately report to users what the future will be. In other words, you don't need to be psychic yourself; you just have to talk to a psychic. Short slide here about dummy traffic. The Mixmaster dummy policy does dummies in two different places. We don't think that's really necessary, but it doesn't cost any real resources, so we figured, what the hell, let's go overkill on dummy messages rather than have people say we're not doing enough. For every message that enters the pool, D1 dummies are generated — these look externally just like other message packets, except they've got no actual body payload, it's all garbage — and they're added to the pool. They follow a geometric distribution with parameter one tenth by default. Every time the pool flushes — when the pool pulls messages out and sends them — D2 dummies are generated, following a geometric distribution with parameter one thirtieth, and they're added to the flow of messages that are exiting. The way this helps is that it thwarts the more clever n-1 attacks, the trickle attacks, where you're not introducing a large number of messages suddenly but instead introducing lots of your messages over time, such that you can reasonably assume the pool is full of your messages. Reliable's dummy policy is that it generates 25 dummy messages every six hours, delayed for an amount of time selected from a uniform distribution between one and six hours. Really, there's not much point to that; it doesn't do much. But you can then say Reliable has dummies, and users get confused. I'll show a sketch of both policies in a moment. So what we set out to do here, originally, was measure the actual anonymity provided by these systems. Back in 2002, two methods of measuring anonymity based on the information-theoretic concept of entropy were proposed — a group at Cambridge and a group at Leuven both proposed this simultaneously. The only difference is that the Leuven group proposed normalizing with respect to the number of users in the system, and since we don't know the number of users, we use the non-normalized method in this paper. I'll define anonymity here the way I have earlier: anonymity is the state of being not identifiable within a set of subjects, which we call the anonymity set. So if you're doing some behavior, and there are X other people who are also doing that behavior and behaving the same way as you, they're your anonymity set. Now, attacks can whittle that down and attempt to shrink the size of the anonymity set, and we attempt to stop that, but those are all in the realm of active attacks, which, again, I can answer questions about.
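Here's the dummy-policy sketch I promised. make_dummy and schedule_send are hypothetical helpers, and I'm reading "geometric distribution of one tenth" as a geometric with success probability 1/10 — the paper has the precise parameters:

```python
import random

def geometric(p):
    """Number of Bernoulli(p) trials up to the first success; mean 1/p."""
    n = 1
    while random.random() >= p:
        n += 1
    return n

def mixmaster_on_arrival(pool, make_dummy):
    # D1: whenever a real message enters the pool, add garbage-payload
    # packets that look identical to real ones from the outside.
    for _ in range(geometric(1 / 10)):
        pool.append(make_dummy())

def mixmaster_on_flush(batch, make_dummy):
    # D2: pad each outgoing flush with a few more dummies.
    for _ in range(geometric(1 / 30)):
        batch.append(make_dummy())

def reliable_dummy_round(schedule_send, make_dummy):
    # Reliable: 25 dummies every six hours, each held for a uniform
    # 1-6 hour delay. Decorative more than protective, per the talk.
    for _ in range(25):
        schedule_send(make_dummy(), delay_hours=random.uniform(1, 6))
```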
Right now, though, we're gonna talk about how we actually compute the anonymity provided by single nodes. Note that what we did was an analysis of single-node anonymity properties. We didn't look at the anonymity of the entire network, because we didn't have a good handle on what the actual anonymity provided by a single node was; we figured that was the place to start. We have two different anonymity metrics, or anonymity levels, that we need to measure — and it turns out at the end of it all they're pretty much roughly the same in a given system — but we measured both incoming and outgoing message anonymity: sender and recipient anonymity. For sender anonymity, we compute the entropy of the probability distribution that relates a target outgoing message to all the possible inputs that could have corresponded to it. For recipient anonymity, we basically do the inverse: we compute the entropy of the probability distribution that relates an input message to all possible outputs. That gets you the effective anonymity set size. We're assuming a passive attacker in this paper — whoever is doing this has a network tap, sits and watches traffic flow in and out of your node, treats it as a black box, and isn't actively manipulating the traffic. We can assume later that they are, and that has a bearing on this, but these are two different things to measure, and it doesn't really change the results. The attacker can see all the messages, but he can't see inside the mix. If he can see inside the mix, game over — we lose, and we should all go home. So we have to assume there's a black-box property to at least some of the mixes in a chain. What the attacker does know is the internal parameters of the mix. We could try to hide those if we wanted, but really, you can always figure them out; there's no way to conceal the parameters. What I mean by parameters: for a pool mix, what is the pool size, what fraction does it flush, and what is the time interval at which it fires? For a stop-and-go mix, what are the values it picks the delay from — if it's putting on random timestamps, in what range, and from what distribution? These things are done in the software, so somebody auditing the code can know them, and if you go and change the code yourself to hide them, you're really not hiding anything, because with a little bit of analysis the attacker can figure them out anyway. So, to figure all this out, we didn't want to go and, say, use a live remailer and calculate the actual anonymity of actual messages going through and break them. So we built simulators — or rather, Claudia's undergrad built simulators and we take the credit. It's all written in Java. We simulated the behavior of Reliable and Mixmaster based on their default parameters, and we fed actual traffic data into the simulators. What I did was modify my remailer to log a timestamp whenever a message came in. It didn't log the message name; there was no way to correlate the messages. It just wrote, "I received a message at", boom, this time, down to the second. So we had a handle on how messages were actually flowing into the network. I didn't need to record when messages left, because, knowing the parameters, we could deduce that in our simulator and figure things out from there.
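To pin down the metric from a couple of paragraphs up: it's the Shannon entropy of the attacker's probability distribution over candidates, in bits, with two-to-the-entropy as the effective anonymity set size. This is just the textbook formula, not our simulator code:

```python
import math

def anonymity_bits(probs):
    """Shannon entropy of the attacker's distribution over candidates."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# 128 equally likely candidate senders -> 7 bits (2**7 = 128), the same
# figure as the Mixmaster lower bound coming up in a moment.
print(anonymity_bits([1 / 128] * 128))        # 7.0

# A skewed distribution over the same 128 candidates is worth far less:
skewed = [0.9] + [0.1 / 127] * 127
print(round(anonymity_bits(skewed), 2))       # ~1.17 bits
```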
We took this data over four months, fed it into some computers that spent a couple of days crunching on it, and got our results. Most of the anonymity literature assumes that network traffic follows a Poisson process: the number of arrivals in an interval of length Δt is Poisson-distributed with mean λΔt — and that's about the depth of the math I'm going to have in here — where the rate parameter λ may be time-dependent. If this were the case, lots of assumptions, typically the ones made for stop-and-go mixes, would be valid, and no one had actually questioned whether it is the case or not. Well, now we've got real traffic data, and we look at it — we basically assumed, going in, that Poisson was what the result was going to be — and we test it, and we find out it comes nowhere close. We actually have no idea how to model the distribution of this traffic data. It's not truly random, it's just strange. I don't have a model for it; none of us do. But the key thing here is that all the previous literature assumes traffic behaves in a way that it really doesn't, so anything based on the assumption that it follows this distribution now comes into question. The results of our simulation of Mixmaster: we've got the timed dynamic pool mix, and in each round — each firing of the mix — every message in that particular round has the same anonymity as every other message in that round. The result we came up with after processing this data was that there is a lower bound on anonymity in Mixmaster. The lower bound is seven bits, which is two to the seventh, or indistinguishability among 128 possible users. As traffic increases, the anonymity increases, and with the data that I fed into it we got an upper bound of 10.5 bits. If we'd had even more traffic, we'd probably have seen an even greater upper bound, but we didn't have that to evaluate. This is a generally good thing. This is the first time anyone has really gone and analyzed an implementation of an anonymity protocol and said: yes, there is a lower bound, we know where it is, and in the worst case we've got seven bits of anonymity. That may or may not be good enough for users, but it's a way of putting our finger on it, and then we can adjust the values of the pool, et cetera, to try to improve that lower bound. The one thing to keep in mind, though, is that there's a trade-off here between latency, user numbers, and the anonymity provided. If the number of messages — the traffic load — decreases, we're gonna go down and hit the floor on anonymity, which is seven bits, but what's gonna happen is the latency is gonna go up. The time a message spends in the pool rises as network traffic falls, and that may or may not be an issue for users. Most people don't mind their email taking maybe half an hour to deliver if what they're looking for is a strong "the government and the mafia can't break my anonymity" system. Then we looked at Reliable. Now, this was a real pain in the ass, because unlike Mixmaster, which has a very well-documented spec for how it behaves, Reliable has no such documentation. Well, we thought, okay, we'll just go through the code and figure out what it does from the comments. There are no comments.
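Backing up to the Poisson point for a second, here's a crude version of that check, assuming timestamps is the per-second arrival log described above. The paper's statistical analysis is more careful; this is just the intuition — Poisson arrivals mean exponential inter-arrival gaps, whose coefficient of variation is 1:

```python
def interarrival_cv(timestamps):
    """For Poisson arrivals, gaps are exponential and the coefficient of
    variation (stddev/mean) is 1. Values far from 1 reject the model."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean = sum(gaps) / len(gaps)
    var = sum((g - mean) ** 2 for g in gaps) / len(gaps)
    return (var ** 0.5) / mean

# Hypothetical usage with the four months of logged arrival times:
# cv = interarrival_cv(arrival_seconds)  # ~1 if Poisson; ours wasn't
```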
Anyway, back to Reliable's code. It's written in VB, and I don't know VB — I can't tell good VB from bad VB, but this was pretty difficult to read. But with the help of some other guys — Peter Palfrader was one of them — we figured out exactly what Reliable was doing, and we wrote it up and documented it. It's in the paper. If anyone ever wants to look at how Reliable does things, we've documented it for you. So, yes, the SG-mix literature assumes the Poisson distribution, which is not correct here, and putting Reliable through the simulation, we find that under periods where Mixmaster's anonymity was seven bits, Reliable's anonymity was zero. There is no lower bound on anonymity in Reliable. This is bad. This means that Reliable is not giving you any anonymity under low-traffic network conditions. Now, the upper bound was close to where Mixmaster's was — it was about 10. So when you've got a high traffic load, you're okay — well, not really, I'll get to that, but assuming everything else were correct, you'd be okay. When network traffic drops, you're screwed. Now, of course, you could partially beat this by applying the magic pieces that Kesdogan's original paper proposes, but Reliable doesn't do that. So if any of you use remailers, you should not make chains composed entirely of Reliable remailers. Of course, if you have one or two in there, well, it's sort of the same as assuming one of them might be compromised, and you've got your ass covered because you're using other remailers in the chain. So it's not something to panic about, but we should see Reliable retired, due to the utter lack of anonymity it provides under certain situations — situations which are, of course, manipulable by an adversary. If you're able to sit and watch the network, you're probably also able to do something to slow the messages coming in and drop the network traffic. So we decided, after learning all this, that Reliable just isn't suitable for its stated purpose, which is email anonymity. It misimplements SG mixes, it doesn't guarantee anonymity, and that's a bad failure mode. If I'm gonna be using a strong anonymity remailer and it has to fail, I'd rather my message not get delivered than get delivered with zero anonymity. That's Mixmaster's failure mode: if the network traffic drops drastically with Mixmaster, the latency goes up. Yeah, your message will probably be delivered at some point in time — it's gonna take a lot longer than you thought, and maybe it'll never get delivered — but at least it's not going to break your anonymity. That said, under normal conditions neither of these failure modes comes into play and you're safe. But do you really want to trust that, and do you really want to build strong anonymity software that advertises a property it doesn't have under certain possible situations? I don't think so. And yes, there are other active attacks still possible here — replay attacks, tagging attacks, flooding attacks — all this stuff is neat stuff, and we've done a lot of work on it, but it's not in this paper, it's not in my slides, and I can point you at places to go read about it if you're interested. Now, we almost stopped there, but when we went through the guts — I'm hesitant to even call it source code — the guts of Reliable, and looked at the VB, we noticed some other somewhat disturbing things. So we decided to extend the paper to cover other possible influences on the actual anonymity provided that aren't directly related to mixing algorithms.
It's a pretty narrow view to think that the only thing that can possibly ruin your anonymity is the mixing algorithm, and that if you get that right, you're safe. So, of course, host server integrity is important: if you have a compromised server underneath you, you're screwed. That's not really the domain of the remailer software, but you can write software that doesn't have exploits in it, that doesn't introduce a vulnerability to the server, and take precautions to that effect. There are, of course, UI issues: if the server is easy to misconfigure, the operator's gonna screw up and the users are gonna get screwed. There are programming language issues: C is a big gaping bag of possible vulnerabilities, but VB is VB, and I'll let the Slashdot weenies debate the merits of programming languages — it's something to consider, though. Documentation is a major issue. For people who want to come in and analyze your protocol, you really ought to have things documented, because Reliable has been in use since, I think, '97 or '98, somewhere around there, and this is the first time anyone has actually gone in and analyzed it, because it was just about impossible to analyze. We looked, of course, at the cryptographic functions. Both Mixmaster and Reliable use external libraries for the cryptographic functions. Mixmaster uses OpenSSL; Reliable, funny enough, uses a really ancient version of Mixmaster, which uses OpenSSL. So the actual generation of the message packets is done by an older, considerably different code base of Mixmaster. They're, of course, trusting that the libraries they're using do things correctly, and then the code itself has to use those libraries correctly. Then we jump to the last piece here: network timing attacks. As I said earlier, as an attacker you can figure out, just by watching things, what the internal pool values are. So those can't be hidden, and you can't base your security model on them being hidden. It might be nice to hide some of these values to make the attacker work harder, but you shouldn't fail miserably if they figure them out, because they will figure them out given enough time. Then we looked at entropy — randomness. That's important in lots of security and crypto-related systems; it's fucking essential in mixes. If you don't have an entropy source, if you don't use your randomness well, you can't have messages come into a pool and leave the pool in a random order, because you have to get the randomness from somewhere. Mixmaster uses OpenSSL's randomness calls, and I'm assuming they're right — OpenSSL is a big enough project that hopefully people are auditing these things. Of course, that's an assumption we shouldn't just make, and we should really look at that too, but that was out of scope here. We weren't actually going to start looking at the quality of the randomness until we noticed this little line in the source code of Reliable that used the Windows Rnd() call. Now, that made me do a double take and scratch my head. I go look at the Microsoft Knowledge Base, where it says, first entry: this should not be used for any security purposes, because it is not fully random. Okay, so that's bad. Really, really bad. We had to decide, just based on this, that even if everything else were equal — if both mixing schemes really worked — you'd still be fucked. You've got a deterministic pseudo-random number generator here, based on known seeds, spitting out numbers where you can figure out where they came from. So, what the hell is wrong with people?
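To see why this is fatal, here's a Python stand-in — VB's Rnd() uses a different generator, but the failure mode is the same: a time-seeded PRNG lets an observer replay the mix's entire "random" reordering, while an OS-entropy CSPRNG does not.

```python
import random, time

def bad_batch_order(pool):
    # Seeded from the clock: anyone who can guess the second can
    # reconstruct the exact permutation. This is the Rnd()-style failure.
    rng = random.Random(int(time.time()))
    return rng.sample(pool, len(pool))

def attacker_replay(pool, guessed_second):
    # Same seed in, same "random" order out.
    rng = random.Random(guessed_second)
    return rng.sample(pool, len(pool))

def good_batch_order(pool):
    # SystemRandom pulls from the OS entropy pool; there is no
    # replayable seed for the attacker to guess.
    return random.SystemRandom().sample(pool, len(pool))
```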
We figured this out in 1996, when Ian Goldberg and David Wagner went and looked at the SSL implementation in Netscape: you really shouldn't seed your entropy with the time. So, yeah, that's bad, and finding it was just a side effect of us having to go in and document how this thing worked — we just happened upon it. But be really careful with your entropy when you're building anonymity systems, and don't make assumptions that you really should be questioning. Now, to be fair here, we could have gone in and fixed the randomness problem. We could have used Peter Gutmann's cryptlib and slapped in a good entropy source without a lot of work, but that's kind of pointless, given that we'd figured out that the mixing scheme was broken. So, what do we want to take away from this as implementers? SG mixes aren't good for variable network traffic. If you can't really predict the distribution of the network traffic, you've got some problems. For high-latency systems — where it's okay for a message to take half an hour to be delivered — pool approaches are better. Now, in a system like the onion router, where you're using it as a network-level protocol and you're trying to fetch web pages, latency becomes a concern. But for email, I'm gonna go ahead and say that I would much rather have a message take two hours to deliver in a bad situation than have the system decide, okay, I'll keep it to half an hour, but I'm gonna give you no anonymity. That's a trade-off that for email, Usenet posts, et cetera, is probably reasonable, but for certain other protocols it's not. So SG mixes may still have their place, but it's not in email systems. And always, of course, be careful with the easy stuff. We didn't set out to do this. We didn't think, hey, let's go do a full code audit on this crap and look at all the simple stuff we figured out 20 years ago and break it that way. We were worried about the mixing protocol — that's the interesting problem — but that's not where the major, simple, easy break was. So when you're building your systems — crypto systems, anonymity systems in particular — be careful about the simple things. Be careful about inadvertent ways of distinguishing message packets. We want to normalize every message packet so it all looks the same, and let's not base our system on things that vary between users. And worry about the libraries. Don't pick a random library that hasn't been audited, or a library that's based on source code that hasn't been maintained in ten years. There might be problems; people aren't using it, people aren't looking at it. And don't just write your code, stick it out there, let people use it, and then sleep well at night because, of course, it's open source software, so of course there are lots of eager fanboys reading it, auditing it, and reporting all the bugs to you, so you're safe. That's a myth. Particularly if it's source code for a system that not many people run themselves — there aren't that many remailer nodes out there, probably about 50, and probably about a third of them are Reliable. Reliable has a good number of users, but they're using, you know, the client software, not the servers. I don't think anyone did any kind of code audit on any of this before we looked at it. Documenting your source code is a really good thing. It helps people do code audits.
It helps when you don't have a spec — it's halfway to a spec. But really, please, just write specs for how your systems work, so that we can compare what you think it does with what it really does, and point out when it doesn't work. So that's the rough overview of the paper. I can point people at it, with all the good, chewy math, and I can point people at the active attacks stuff if they're interested. And right now, I'll take comments. Questions? Applause. Harassment and heckling. Any questions here? Wow. Yes, sir. Oh, very good question: who wrote Reliable, and why do we assume they were intending to make an anonymous remailer? Put your tinfoil hats on. Reliable was written by a guy who was a user — a fairly, you know, stronger-than-average user — of the Type 1 system. He used reply blocks, he used the MIT nym.alias.net system. He called himself RProcess, and he wrote a lot of inspiring stuff about the need for anonymity and the ethics of remailer operators, and it's still referenced today. He, you know, yelled at certain remailer operators in France who were running multiple remailers to try to out people who were abusing them, and he said a lot of really good things. But of course his code is not that great, and he disappeared in the year 2000. Poof, gone. I have no idea who he was, because he was using a nym server himself. I don't know whether he decided, oh my God, this thing I wrote is completely fucking broken, I don't want my name attached to it, I'm gonna go hide in a hole now — or whether he got, you know, grabbed by the feds, or just got run over by a car, or whatever. He's gone. We haven't been able to contact him for four and a half years. I don't know who he is, I don't know who he worked for, I don't know what his motives were, and the point there is: don't just worry about trusting the libraries, worry about trusting the motives of the people writing the software. Most of you here have seen me around, but I could be a deep cover plant myself, and I could be putting back doors in Mixmaster, so you shouldn't even trust what I say. But I'm not a fed, and you all can go and look at the source yourself, and I encourage you to, because I, while not maliciously doing bad things, could certainly be fallible. Other questions? Yes, sir. The level of anonymity provided by the Anonymizer web browsing service. Well, first I'm going to say some good things about the founder of Anonymizer: he's the original author of Mixmaster, and he went and decided to start a company that provided web anonymity. That said — also, disclosure, I used to work for Anonymizer, so if I'm vague, it's because I have to be. Anonymizer is in an entirely different class of anonymity systems than what we're talking about here. Anonymizer is a trusted third-party proxy: you put your anonymity in the hands of Anonymizer, and they promise they're not going to out you to whoever might want to know about you. Now, they are in the business of selling anonymity. They have a reputation based on this, and if it became known that they were outing their users, presumably their reputation would decrease, so would their revenue, and they'd go out of business. So let's just say they have an economic incentive, as a company, not to do this. However, no matter what their intentions, you also have to assume that their network security is pristine, right?
You don't have to get somebody to give up somebody else if you can just go in, look through their stuff, and figure out what you want to know without even asking them. The premise behind mix nets is that you have this layered encryption scheme, where if a couple of the remailers are compromised, as long as you're picking chains that include enough safe, uncompromised, honest remailers, you're still safe even though some of them are broken. If Anonymizer is broken — if Anonymizer is run by bad guys, or run by good guys who've been rooted by bad guys — you're screwed. Then again, Anonymizer also doesn't do all the reordering stuff and doesn't resist any of these active attacks. It's a low-latency network system, not a high-latency network system, so it's better compared to something like Tor than to Mixmaster. And honestly, if you're worried about the feds or the upstream ISP putting a tap on the network and watching the traffic go through, you're probably better off not using it. However, there's one thing to note: Anonymizer has up in the high tens of thousands of users — yes, that was my password, six characters, so go ahead and try to break in — and that's a lot of users compared to other anonymity systems. So against certain adversaries, ones that don't have the capabilities of the strong observer, you may be better off using Hotmail through Anonymizer. But I use Mixmaster when I need to go anonymously. So that's what I have to say about Anonymizer. Question back there. You, yes. God damn it, I knew there was gonna be an anti-spam question. I can't talk about remailers without somebody asking an anti-spam question. Okay, so first let me say that Mixmaster itself — the Type 2 protocol — is being retired in favor of Type 3. Type 3 is not formalized yet. Type 3 is going to have a transport layer of its own; Mixmaster itself relies on SMTP, so SPF and all that stuff gets done at the MTA level with Mixmaster and isn't directly related to the software. So I'll say: when you're running an MTA that's got SPF, or whatever the necessary requirements for delivering mail properly are, you're fine. Back to the spam question — before somebody else asks, "aren't you building tools for spammers?" Mixmaster is not really good for spammers. First of all, it requires you to do a bunch of crypto operations. Those aren't that expensive for an individual user — somebody who's, you know, posting about whatever it is they're into. But if you want to send out 100,000 messages about buying Viagra for your make-money-fast scheme selling Russian brides — let's say you're doing two hops, because you want a moderate amount of anonymity — you're now looking at 200,000 RSA and Triple-DES operations, and that's just expensive. Why wouldn't you just go use an open relay in China? The other problem, of course, is that you get a delayed, spread-out distribution of messages, and you don't have all your messages being delivered at once. There'll be a couple that start trickling in, they'll get sucked into the Razor database and the various other spam-detection databases, and the bulk of your messages won't get delivered. So that's enough on spam. It's not sent through remailer networks. It's not a problem. Other questions? You had a question.
Is anyone developing or supporting Reliable, given that Mixmaster is actively maintained? Well, the author is gone, and has been for four years. There is a fellow — another French guy — who spoke up a few months ago saying, hey, I read Len and Claudia and everyone's paper, and I want to fix the randomness problem. He wants to take over maintaining Reliable and fix the randomness problem, and basically ignore everything else — the bulk of our paper — which was that the mixing scheme is broken. Whether he's actually doing any work or not, I don't know, but I wouldn't use it. Good enough. Other questions? Last chance? Okay, thank you.