My name is Fabian, and I'm a Summer of Code student this year, mentored by Daniel Pocock. My project is extending various one-time password authentication packages in Debian to support a newer specification for challenge-response-based one-time passwords. I'm about halfway done, so this specification is pretty much implemented in two packages. What's left to do is finishing this implementation and the documentation for it. The second part of my project is implementing a SASL extension, so that different packages using SASL as their authentication mechanism can also support a different specification for challenge-response-based one-time passwords. But today I will talk about what I have done so far.
So basically, most of you know what a one-time password is. It's mainly used in two-factor authentication. There are two widely used specifications for this, both published by the same organization. One of them is HOTP, which is event-based. There is a counter, which gets incremented upon a successful authentication. This counter is used to calculate an HMAC using SHA-1, which is then truncated to display a number of six to eight digits. TOTP works pretty similarly. The moving factor is not a counter, but is derived from the current timestamp, so that you have time windows, and one one-time password value is valid for one or more of these windows. So for example, you can say one password is valid for one minute or 30 seconds.
Both of these specifications are already fully supported by packages in Debian. There are PAM modules for them, command-line tools, and there are a lot of hardware tokens for these two specifications, as well as applications for smartphones and other platforms. OCRA, the specification that I implemented, builds on the same principle. You have a hashing function. In this case, it's not limited to SHA-1, but you can also use SHA-256 and SHA-512.
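
For readers who have not seen the two existing specifications in code, the counter-based and time-based schemes described above can be sketched roughly as follows. This is a minimal illustration written from RFC 4226 and RFC 6238, not code from the packages discussed in the talk; the function names `hotp` and `totp` are chosen here for illustration.

```python
import hashlib
import hmac
import struct


def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    """Event-based value: HMAC-SHA-1 over an 8-byte counter, then truncated."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of the last byte
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(key: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Time-based value: the moving factor is the number of elapsed windows."""
    return hotp(key, unix_time // step, digits)


if __name__ == "__main__":
    key = b"12345678901234567890"  # test key from the RFC 4226 appendix
    print(hotp(key, 0))             # 755224 (RFC 4226 test vector)
    print(totp(key, 59, digits=8))  # 94287082 (RFC 6238 test vector)
```

The only difference between the two functions is where the moving factor comes from, which is the point made in the talk: a stored counter in one case, the current time divided by the window length in the other.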
And you don't have a single moving factor as the input for which you calculate the HMAC, but a whole array of possible inputs. The only mandatory input is a challenge, which comes from the other party, the one you want to authenticate to. Yes, you can do authentication in both directions. So when you want to log in at a server, you can send a challenge to the server. The server proves who it is by calculating a value and sending it back to you, so you can check whether you are really connecting to the server you want to connect to. Then you get a challenge, and using both challenges, you calculate your own value. There is also a mode for signing data or hashes of data, but I'm not sure whether this will ever be used in a wider context, because there are other signing concepts and technologies that are a lot better, in my opinion.
So anyway, how does this work? As I said, we don't have a single moving factor anymore, but a big binary array of data, which starts with a string that specifies what else is contained in this binary array. The two zeros over there are just a separator, and then you have the different inputs, which are simply concatenated one after the other, if they are included. The first one is a counter. It works exactly like in HOTP. If the authentication is successful, the counter has to be incremented server-side. The client always has to increment the counter. So as soon as the counters get out of sync, you know something went wrong, and authentication is blocked until the counters are synced again. Then you have the challenge values; I will come back to those later. They are the only input that is mandatory. You can include a password hash, using the same three hashing functions as for the OCRA value itself. And there's space for session information, if you want to include it.
So for example, if both the client and the server have access to SSL or TLS session data, you can include this here to make the OCRA value even more resistant to tampering and replaying. And similarly to TOTP, you can also include a timestamp. The window calculation works a bit differently than in TOTP, because you can specify windows between one second and 48 hours, which usually doesn't make much sense, but there might be applications where this does work.
This is an example of such an OCRASuite string that specifies which parameters are used. It starts off with the version of OCRA. There is only one so far, so that's always the same. The next element is the hashing function. In the first case, we have SHA-1, truncated to six digits. In the second case, we have SHA-256, not truncated at all. So you can either have no truncation or between four and ten digits. This is also extended from the previous specifications, where you could only have six, seven or eight digits. And the last part of this OCRASuite string is the actual data input specification. In the first case, the Q is for the challenge, and it's a numerical challenge using eight characters. In the second case, there's a lot more included: a counter, an alphanumeric challenge with 20 characters, a password hash, session information of 128 bytes, and a timestamp using 12-minute time windows. Usually you won't do something like the second case, because that's a bit much.
Yes, we already had that. So the challenge values can be numerical, hexadecimal or alphanumeric, and should be between 4 and 64 characters per challenge. You can have either one or two challenges, depending on the mode. If you have two 64-character challenges, they are 128 bytes if they're alphanumeric. So you always pad the challenge part of the data input to 128 bytes, no matter how long the actual challenges were.
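
To make the structure of such a suite string concrete, here is a minimal parsing sketch. The slide with the two example strings is not part of this transcript, so the strings used below are reconstructions from the spoken description (`OCRA-1:HOTP-SHA1-6:QN08` appears in RFC 6287; the longer one is assembled here to match the listed fields). The `OcraSuite` class and `parse_suite` function are hypothetical names, not part of OATH Toolkit or Dynalogin.

```python
import re
from dataclasses import dataclass
from typing import Optional

_UNIT_SECONDS = {"S": 1, "M": 60, "H": 3600}


@dataclass
class OcraSuite:
    hash_name: str
    digits: int  # 0 means "no truncation"
    counter: bool = False
    challenge_format: str = ""  # "N", "H" or "A"
    challenge_length: int = 0
    password_hash: Optional[str] = None
    session_length: Optional[int] = None
    time_step_seconds: Optional[int] = None


def parse_suite(suite: str) -> OcraSuite:
    """Split a suite string into version, hash function and data inputs."""
    version, crypto, data_input = suite.split(":")
    if version != "OCRA-1":
        raise ValueError("unknown version: " + version)
    crypto_match = re.fullmatch(r"HOTP-(SHA1|SHA256|SHA512)-(\d{1,2})", crypto)
    if crypto_match is None:
        raise ValueError("bad crypto function: " + crypto)
    digits = int(crypto_match.group(2))
    if digits != 0 and not 4 <= digits <= 10:
        raise ValueError("digits must be 0 or between 4 and 10")
    parsed = OcraSuite(hash_name=crypto_match.group(1), digits=digits)
    for field in data_input.split("-"):
        if field == "C":
            parsed.counter = True
        elif m := re.fullmatch(r"Q([NHA])(\d{2})", field):
            parsed.challenge_format = m.group(1)
            parsed.challenge_length = int(m.group(2))
        elif m := re.fullmatch(r"P(SHA1|SHA256|SHA512)", field):
            parsed.password_hash = m.group(1)
        elif m := re.fullmatch(r"S(\d{3})", field):
            parsed.session_length = int(m.group(1))
        elif m := re.fullmatch(r"T(\d{1,2})([SMH])", field):
            parsed.time_step_seconds = int(m.group(1)) * _UNIT_SECONDS[m.group(2)]
        else:
            raise ValueError("unknown data input field: " + field)
    if not parsed.challenge_format:
        raise ValueError("a challenge (Q) field is mandatory")
    return parsed


if __name__ == "__main__":
    print(parse_suite("OCRA-1:HOTP-SHA1-6:QN08"))
    print(parse_suite("OCRA-1:HOTP-SHA256-0:C-QA20-PSHA1-S128-T12M"))
```

The sketch checks only the syntax of each field; range checks such as the 4 to 64 character limit on challenges are left out to keep it short.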
And the challenge strings themselves are converted to binary data, which actually works a little strangely. If you have an alphanumeric string, you can just copy it, because that already is binary data. Hexadecimal strings are converted from hex to binary. Numerical strings, however, are not converted from base 10 to binary directly; the number is converted to a hex string, and the hex string is then converted to binary. That's how it is in the specification and the reference implementation. It's a bit strange.
So how does the actual authentication work? There are three modes: one-way authentication, two-way or mutual authentication, and the signature mode. In the case of one-way authentication, the client has to tell the server somehow that it wants to log in or authenticate, so it has to send some kind of request for a challenge. This might be waking up from the screensaver or actually initiating some login process. The server responds by generating a challenge according to the OCRASuite specification and sends this challenge back to the client. The client uses the challenge value, the OCRASuite specification, and the secret key to calculate an HMAC value and truncate it, and the resulting number is sent back to the server, which can validate this value by using the same shared secret key and the same specification. It then tells the client whether the login was successful or not, of course.
Two-way authentication works basically the same, but some additional steps are included. This time, the client initiates the authentication by generating a challenge for the server and sending it to the server. The server can use this server challenge to calculate its own value. You can basically have two variants of this. It's not specified in the RFC how you should do this. I chose the more flexible route and store a server specification and secret key for every user. So you can have a different specification and a different key for every user.
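
The challenge conversion and the one-way exchange described above can be sketched as follows. This is an illustration under stated assumptions, not the code from OATH Toolkit or Dynalogin: it handles only the simple suite `OCRA-1:HOTP-SHA1-6:QN08`, pads odd-length hex strings with a trailing zero as the RFC 6287 reference implementation does, and uses hypothetical function names throughout.

```python
import hashlib
import hmac
import secrets

SUITE = "OCRA-1:HOTP-SHA1-6:QN08"  # numeric 8-character challenge, 6 digits


def challenge_to_bytes(challenge: str, fmt: str) -> bytes:
    """Convert a challenge to binary and zero-pad it to 128 bytes."""
    if fmt == "A":  # alphanumeric: the characters are copied as they are
        raw = challenge.encode("ascii")
    else:
        if fmt == "N":  # numeric: decimal number -> hex string -> binary
            hex_str = format(int(challenge, 10), "X")
        elif fmt == "H":  # hexadecimal: hex string -> binary
            hex_str = challenge
        else:
            raise ValueError("unknown challenge format: " + fmt)
        if len(hex_str) % 2:
            hex_str += "0"  # padding goes on the right-hand side
        raw = bytes.fromhex(hex_str)
    return raw.ljust(128, b"\x00")


def ocra_one_way(key: bytes, challenge: str, digits: int = 6) -> str:
    """HMAC over suite string, zero separator and padded challenge."""
    data = SUITE.encode("ascii") + b"\x00" + challenge_to_bytes(challenge, "N")
    mac = hmac.new(key, data, hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = int.from_bytes(mac[offset:offset + 4], "big") & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def server_new_challenge() -> str:
    """Server side: a random 8-digit numeric challenge."""
    return "".join(secrets.choice("0123456789") for _ in range(8))


def server_verify(key: bytes, challenge: str, response: str) -> bool:
    """Server side: recompute the value with the shared key and compare."""
    return hmac.compare_digest(ocra_one_way(key, challenge), response)


if __name__ == "__main__":
    shared_key = b"12345678901234567890"
    challenge = server_new_challenge()             # server -> client
    response = ocra_one_way(shared_key, challenge)  # client -> server
    print(server_verify(shared_key, challenge, response))
```

RFC 6287 includes test vectors for this suite in its appendix, which an implementation along these lines could be checked against.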
You could also use a global key for the server authentication part. The server then has to send its value and a new challenge for the client back to the client. The client can validate the value it received to see whether this is really the server it thinks it is. After checking whether the value was valid, it can calculate a client value using both challenges. So you use the server challenge and append the client challenge to it at the end. This value is then sent to the server again. The server can check whether it's valid and send back the authentication result.
The signature mode works basically the same. You can have a one-way or a two-way signature mode, so you either verify the server before you send the data to be signed or you don't. The only difference is that you cannot use session information, otherwise the signatures would not be verifiable afterwards. And the challenge is not randomly generated, but depends on the data: you can either just use the data as the challenge if it's short enough, or you can use, for example, a hash of the data. For all three modes, you can probably tell that you need some kind of secure channel to transmit the data, otherwise someone can easily just man-in-the-middle everything. The usual way this is done is by wrapping everything in a TLS connection. That's also how we do it in our packages.
So what's the current status? In OATH Toolkit, which is a collection of tools and libraries written mainly by Simon Josefsson, we extended the existing HOTP and TOTP support to also include OCRA, so we can now generate and validate OCRA values, both with the command-line tool and with the library. There's a facility to generate challenges and convert them to binary when necessary. The PAM module that already existed for HOTP and TOTP was also extended, but it only supports one-way authentication so far, mainly because I didn't have time to figure out how to do multiple queries safely in a PAM module.
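
The two-way exchange described earlier can be sketched as follows. This follows the order of steps as told in the talk (the server value is computed from the challenge generated by the client, the client value from both challenges concatenated); RFC 6287 is the normative reference for which challenges go into each value. The suite string, the per-user `UserRecord` with separate keys, and all function names are assumptions made for this illustration.

```python
import hashlib
import hmac
import secrets
import string
from dataclasses import dataclass
from typing import Optional, Tuple

SUITE = "OCRA-1:HOTP-SHA256-8:QA08"  # assumed suite: alphanumeric challenges
ALPHABET = string.ascii_letters + string.digits


@dataclass
class UserRecord:
    """Per-user secrets, one for each direction (hypothetical layout)."""
    server_key: bytes
    client_key: bytes


def new_challenge(length: int = 8) -> str:
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def ocra(key: bytes, challenges: str, digits: int = 8) -> str:
    """HMAC-SHA-256 over suite, separator and the padded challenge field."""
    question = challenges.encode("ascii").ljust(128, b"\x00")
    data = SUITE.encode("ascii") + b"\x00" + question
    mac = hmac.new(key, data, hashlib.sha256).digest()
    offset = mac[-1] & 0x0F
    code = int.from_bytes(mac[offset:offset + 4], "big") & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def server_respond(user: UserRecord, for_server: str) -> Tuple[str, str]:
    """Server: prove identity and issue a fresh challenge for the client."""
    return ocra(user.server_key, for_server), new_challenge()


def client_finish(user: UserRecord, for_server: str, server_value: str,
                  for_client: str) -> Optional[str]:
    """Client: check the server value first, then answer with both challenges."""
    expected = ocra(user.server_key, for_server)
    if not hmac.compare_digest(expected, server_value):
        return None  # not the server we expected, so stop here
    return ocra(user.client_key, for_server + for_client)


def server_verify(user: UserRecord, for_server: str, for_client: str,
                  client_value: str) -> bool:
    expected = ocra(user.client_key, for_server + for_client)
    return hmac.compare_digest(expected, client_value)


if __name__ == "__main__":
    user = UserRecord(server_key=b"server-secret", client_key=b"client-secret")
    for_server = new_challenge()                                # client -> server
    server_value, for_client = server_respond(user, for_server)  # server -> client
    client_value = client_finish(user, for_server, server_value, for_client)
    print(client_value is not None
          and server_verify(user, for_server, for_client, client_value))
```

Using a single global key for the server direction, the other variant mentioned in the talk, would only change where `server_key` is stored.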
The only other thing that's still missing are wrappers for the HMAC functions for SHA-256 and SHA-512, because the crypto library that OATH Toolkit uses at the moment has SHA-256 and SHA-512 methods, but no HMAC wrappers for them. So those are basically missing, and I will probably write them sometime in the next couple of weeks.
The second package I worked on is Dynalogin, which is an authentication client-server architecture. I extended both the server and the client to support OCRA authentication, both one-way and mutual authentication. Dynalogin also includes a PAM module, and I also only did the one-way variant there, for basically the same reasons. Dynalogin right now doesn't have any dependencies on crypto libraries other than OpenSSL and GnuTLS, so I didn't include password hash calculations so far. You could, of course, calculate the hashes each time to include them in the data input, or you could just store them once. But this would probably introduce new dependencies, so I haven't done it so far. Dynalogin supports multiple data storage modules, so you can store the user data in a file or in a database, and I haven't tested the database module very extensively so far, so that's also on my to-do list.
If you're interested, I can give you a short demonstration. There is also a little testing facility built into Dynalogin. Yeah, of course, sorry. So basically what this test client does is use the API methods of the client library to connect to the server and test the authentication mechanism. What it did just now is send a request to the server to get a challenge. The server sent back a challenge, which in this case is a 20-character hexadecimal challenge, as you can see, and we can then use the command-line tool to calculate the one-time password value. The last part here, again, sorry, 3132 and so on, is the reference key in hexadecimal notation.
So what we say here is: use OCRA, use this suite, and this is the challenge, which is the only input apart from the current timestamp and the key. In this case, we have a 5-minute time window based on the current timestamp. The command-line tool also allows us to set other timestamps to test one-time password values in the future or in the past, but of course, the server will only accept the one that is valid for the current timestamp.
Here you can see some debugging output. This is the actual data input array that is given to the hash function. You have the OCRASuite, then the two zeros over there are the separator, then you have the challenge, which is padded, and at the end the timestamp, which is converted to a number of time steps and then copied over. And here you can see the generated OCRA value. If we enter this, the test client returns zero, which means the authentication was successful.
Two-way authentication basically works the same, so you can enter some random challenge for the server. The server will calculate a value, which you can then verify. Again there is the debugging output, which you can ignore, and I think you don't actually see that on the demo. The zero back there again means the validation was successful, so this server code was correct. We get a new challenge, and now we need to pass both challenges, of course, and generate a new client value, which is accepted by the server again. And of course, if you enter gibberish, it's not accepted. Basically, the PAM module uses the same API methods to authenticate with the server.
One of the big problems with using this in practice is that there are no hardware dongles supporting this RFC as far as I know, correct me if I'm wrong. That is of course kind of a chicken-and-egg problem, because besides the reference implementation included in the RFC, I am also not aware of any software implementations.
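
The layout of the data input shown in the debugging output (suite string, zero separator, padded challenge, time-step count) can be sketched as follows. The exact suite string used in the demo is not in the transcript, so `OCRA-1:HOTP-SHA1-6:QH20-T5M` is an assumption based on the spoken description (20-character hexadecimal challenge, 5-minute window), the example challenge is made up, and `build_data_input` is a hypothetical helper name.

```python
import struct

SUITE = "OCRA-1:HOTP-SHA1-6:QH20-T5M"  # assumed: hex challenge, 5-minute window
STEP_SECONDS = 5 * 60


def build_data_input(hex_challenge: str, unix_time: int) -> bytes:
    """Concatenate suite, separator, padded challenge and time-step count."""
    if len(hex_challenge) % 2:
        hex_challenge += "0"  # pad odd-length hex on the right
    question = bytes.fromhex(hex_challenge).ljust(128, b"\x00")
    # The timestamp is not used directly: it becomes the number of whole
    # time windows since the epoch, stored as an 8-byte big-endian integer.
    steps = struct.pack(">Q", unix_time // STEP_SECONDS)
    return SUITE.encode("ascii") + b"\x00" + question + steps


if __name__ == "__main__":
    data = build_data_input("0123456789abcdef0123", 1_000_000_000)
    print(len(data))          # suite length + 1 + 128 + 8
    print(data.hex())
```

Because only the window count enters the data input, any two timestamps inside the same 5-minute window produce the same bytes, and therefore the same OCRA value.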
So as long as there is no software ecosystem, there are no hardware tokens, and as long as there are no hardware tokens, it's kind of hard to use. Maybe this is the first step to actually push this a little further. I don't know. Are there any questions so far?
There are some tokens here. So we've got a couple of these which were given to us by Gooze, and they make these available for free software developers. One of them is an event-based token, so you press the button to get a token code, and the other is a time-based token, so when you press the button, it just shows the current token value, which changes every 30 seconds. So I can pass these around, and yes, you can push the button. And then we have another one. Who's seen these calculator-style tokens before? Most people have seen these already, so that's a typical two-way challenge device. It takes a smart card, and the mechanism is proprietary, so I don't know which mechanism is implemented in this device, but sooner or later, we hope to see open and free alternatives to this that people can use, so I'll pass that around as well.
Yeah, so basically that's also probably one of the reasons why it's not as widespread at the moment: the devices for challenge-response-based one-time passwords are a lot more complicated, because you need to enter the challenges, so you cannot just have one button. But if you only want one-way authentication, actually, you can just have a device with one button that generates a challenge, you enter the challenge, no, the other way around, doesn't work, I'm confused. You need a PIN pad, basically, or even a keyboard if you want more than just numbers.
Does anybody else have questions for Fabian? Has anybody seen the system that the Fedora Project is offering their developers now, where they've set something up with YubiKeys to access some of their servers and websites?
So there's some scope to do a similar thing for Debian infrastructure, if anybody would prefer to have that type of access. It offers the possibility that people can log in from untrusted locations. There are all sorts of discussions about whether you want to do that anyway, but it does give people an extra choice: instead of logging in with a password or with an SSH key, there are some situations where they could gain access to a website using a token, either a physical device or a soft token in their phone.
Even my university is offering it now to all members for their internet access, and they are usually not the quickest to adopt new technology, despite being a technical university, of course.
Hello? Hello? Can you hear me? Right. I was just going to mention that if we've got open solutions to this, then finally you might be able to solve the thing where you end up getting more and more of these tokens from any bank accounts that you've got. And if we get another one from Debian, and if people at Debian have to do it, they probably don't want... What? Yeah. Okay. So, if we can have open ones where you can register your token with several services, you could just have the one token, which is unlikely to happen from banks, but it might happen from open solutions.
Well, for those three specifications, you would have the problem that all the servers would have to know the shared key. So you would need some kind of central infrastructure to verify the codes, like a SAML server, something like this, because otherwise you would have to spread the shared key over multiple servers, and that increases the risk of the shared key getting out, and that's not good.
I was thinking right now, how could you have a challenge-response thing where you don't have to be given the data, but you can calculate it yourself and you don't have to memorize it? And I was thinking, what about zero-knowledge proofs based on a hash database of standardized identity data?
So you fill this out on your own, in your own computer with trusted software, and you hash it, and every time you want an institution to be able to trust you, you give out your hash database. And so they challenge you with zero-knowledge proofs where you know your own data, so you can take the algorithms, calculate, give them the relevant digit, bit, whatever, and yeah, for me would be a very elegant solution to this kind of thing. I don't know how secure, but I'm pretty sure you could make it secure. Zero-knowledge proofs have been studied, right? I'm not sure. What would you use if they can verify that you are you by just information they know about you? How would anyone else, how do you avoid that anyone else that has the same information can be claimed to be you? You need some shared secret in some way or something that only you know that they can verify. But usually people can find some of your data, but to really have a complete profile of your identity seems to me quite hard. In any case, you could mix in there some data which are kind of random-ish. Like they do now with these questions where they ask for data which is kind of insignificant. I don't know. I was just thinking out loud. Maybe it was garbage and you just proved it. The problem with insignificant data is that suddenly my first school that I attended is actually a security token that people would want to steal. And sadly, I've given it to lots of banks. So yeah, insignificant data suddenly becomes significant when you start using it for things that include security. Yeah, no, but you have to tell them the school you went to for them to be able to work out whether it's true because they, how are they otherwise confirming it? Or you mean you need a microphone? Their datum is a hashed secret and they're asking you for zero knowledge proofs that you also hold the hashed secret. But you can rehash your secret any time you want and then apply the zero knowledge algorithms. 
So you mean you choose the question that causes the hash to be mixed up? Okay, yeah, so you- The question and the datum, and then you have to remember that. And then they ask you to prove, to do as you can. That you hold the hashed data, but they don't know what the data is that you've got. Yeah, so you say something like, what was my best friend at primary school's name? And that's only on your computer. And you hash the secret and give them the hashed secret. Yeah, okay.
But I mean, whichever mechanism you use, there is always some new attack vector. And there are always going to be people with a different level of understanding of the technology. So to some extent, what we have here is a good technology for people who are comfortable with how it works: people who would be comfortable using PGP or smart cards can also use one-time passwords, and we can probably extend that to more people through soft tokens very quickly. It's much faster to deploy a soft token than to deploy smart cards and help people attach things to their USB ports. On the other hand, there is the risk that these tokens or the codes can be caught by a man in the middle. So if someone captures the code and enters the code on the server faster than you can get the code to your server yourself, and there is a space of a few seconds there where they could do that, if they have some control over the network or the infrastructure, then there are ways they can attack this as well. So there's no perfect security solution.
But just to give one example of the things that users are exposed to in the real world, there's one scam that was in the papers the other day where the criminals call you. They call you on a Sunday morning after a night out, perhaps after the Debian birthday party. You get this phone call and it's on your landline.
And they say to you, look, someone did some spending on your card last night and we're like the fraud alert service and we want you to call your bank. So when we get finished talking to you, then you should call the number on your card and ask them to help you. And so you put down the phone, you ring the number on the back of the card. And they ask for all your personal details. And because you're worried and it's early in the morning, you just answer the questions that they ask you. What's your password? What's your date of birth? What's your first school? They might go through all these things. And because it's a fraud, you might be tempted to trust the fact that they're asking more questions than usual and you actually give them more details than usual. But what's happening here is that in some countries, when someone calls you like that first call and you hang up the phone, they still have your line open. Yes. So when you think you've put the phone down and you've picked it up again and you're calling your bank using the number on the card, they've still got the line open and they're receiving your call, the people that called you. And I mean, this is a very low-tech way of getting data from the end user. So it's the same thing with these tokens. If someone can actually ring the customer and say, we want you to verify yourself by giving us your code and that attacker is actually going on to the banking website at the same time, then the customer may well be fooled into doing that. So people do have to be alert to these things when they implement these solutions in practice. Yeah. By the way, we used to use what you were talking about in prank calls because people would try to hang up but you still had the call so they'd pick up again and you'd still be there. Anyway, what I was going to say is that's why I was interested in a scheme where, first of all, you are the one who gives something to them so they can authenticate you. 
And furthermore, you're not handing out the secrets. You're handing out something that's somewhere in the middle of the zero-knowledge proof process. So you're handing them data which is abstract but is good enough for someone to verify your identity through challenge-response, which is the basis of the zero-knowledge protocol. So if it gets stolen from anyone, what do they have? They have data that allows them to verify you, but verifying you doesn't allow them to verify before someone else, because the way the zero-knowledge protocol works is that they might generate one set of challenge responses, but they can't go through the whole thing mathematically.
Oh yeah, if anybody wants to look at the code or play around, it's on my GitHub page at the moment.
Okay, so thanks, Fabian, for giving us this overview of the project. It's really good, the progress you've made so far as well. It's good to be able to actually see it working when we've still got more than six weeks of Summer of Code left to go. So I hope that people will be able to try it out and give Fabian some feedback during the summer as well. The packages for Dynalogin and OATH Toolkit are in Debian, they're in Wheezy, they're in unstable, but the latest work that Fabian has done is in your own repository. Yeah, so you have to build it from source to get all these groovy new features. So thanks, Fabian.