Okay, let's start over. As I was saying, my name is Dag-Erling Smørgrav. I'm a FreeBSD developer and an engineer at the University of Oslo in Norway. I've been working with security and authentication, both professionally and as a FreeBSD developer (and sometimes those two have overlapped), for over ten years now.

Basically, I'm here to tell you that we're doing it wrong, and we've been doing it wrong all along. The title of my presentation is "Challenges in identity management and authentication". I'm here mostly to point out what we're doing wrong, but I also have some ideas on how we could do it right.

First, some terminology, just to make sure that we all understand each other. Identity management, in a broader sense, is the act of knowing who a person or a user is, and knowing things about that user, such as their username, their UID, their real name, things like that. Authentication is the act of verifying that a principal who claims to be a specific user actually is that user. Authorization and access control are often confused; I have made that mistake myself, saying authorization when I actually mean access control. Access control is the act of verifying that the person you have in front of you is actually authorized to do what that person is now trying to do, or asking you to do for them. Authorization is actually the act of granting those permissions, but the word is commonly used to mean access control.

So, step into my time machine. I hope you can read them; I'm sure you recognize them.
These are /etc/passwd entries for root, showing how they have evolved over time. The top line is a plain V7 Unix /etc/passwd line with a DES-hashed password; incidentally, the salt is "salt" and the password is "password", I think, in all examples; I don't remember. The second one is a BSD master.passwd line, which, as you can tell, has extra fields: a login class, which is empty, a password expiry time and an account expiry time. This time I used extended DES hashing, which is a different hash function. Below that is the same line, but with the MD5 hash, and then the SHA-512 hash that we use today in -CURRENT.

Here are five lines of code. If you don't count error checking, and if you don't count the printf at the end that just says "welcome", it only takes two lines to identify and authenticate a user in the traditional Unix /etc/passwd world. This is a slightly more complex version of the same code; I've just added two lines which check the expiry time. So now we've gone from there to there. And this final version does a lot more, because it also does a little bit of access control after identifying and authenticating the user and verifying that the account hasn't expired, which is not really authentication and not really access control. It's somewhere in between; it's account management, something like that. What I do next is check that the user is a member of the group staff, and that's access control. That's an access control policy. I now know that the user is who they say they are, but I want to know: knowing that it is them, is that user allowed to do what they're trying to do now?

Okay, so let's move forward a bit.
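Editor's sketch: the slides with the original C code are not part of this transcript. The four steps described above (identification, authentication, account expiry, group membership) can be modeled roughly as follows. This is a minimal illustration in Python, not the speaker's code. The `Account` record, the `toy_hash` function (a stand-in for crypt(3), not a real password hash), and the in-memory `ACCOUNTS` and `GROUPS` tables are all assumptions made for the example.

```python
import hashlib
import hmac
import time
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    pw_hash: str     # "salt$digest": a made-up format, not real crypt(3) output
    expire: int = 0  # account expiry as Unix time; 0 means the account never expires


def toy_hash(password, salt):
    """Stand-in for crypt(3): salted SHA-256, for illustration only."""
    digest = hashlib.sha256((salt + password).encode()).hexdigest()
    return f"{salt}${digest}"


# Hypothetical in-memory stand-ins for the password and group databases.
ACCOUNTS = {
    "root": Account("root", toy_hash("password", "salt")),
    "old": Account("old", toy_hash("hunter2", "pepper"), expire=1),
    "guest": Account("guest", toy_hash("guest", "na")),
}
GROUPS = {"staff": {"root", "old"}}


def login(name, password, now=None):
    now = time.time() if now is None else now
    account = ACCOUNTS.get(name)                      # identification
    if account is None:
        return "unknown user"
    salt = account.pw_hash.split("$", 1)[0]
    if not hmac.compare_digest(toy_hash(password, salt), account.pw_hash):
        return "authentication failed"                # authentication
    if account.expire and account.expire <= now:      # account management
        return "account expired"
    if name not in GROUPS["staff"]:                   # access control
        return "access denied"
    return "welcome"
```

Note how the first two checks are the "two lines" of identification and authentication, and everything after them is policy layered on top of an identity that has already been verified.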
This is a copy of, you probably recognize it, nsswitch.conf. This one was taken from FreeBSD 5.3, which is when it was introduced. What nsswitch.conf introduced was a modularized identity system, because we now had different ways of retrieving information about a user, of identifying a user. We had actually already had different ways of doing it before, but it was hard-coded: you had /etc/passwd, and then if you had a plus at the end of the file you'd check NIS, and possibly Hesiod. But NSS brought us modularization. It was now possible to install plugins which would handle different sources, so you could install an LDAP plugin, for instance; you still can.

This is login.conf from FreeBSD 4.3, which is not when that functionality was added, but it's when we started making extensive use of it. I'm not going to talk too much about login.conf. It actually doesn't have anything to do with authentication. It should be viewed as part of the identity management system because, except for this line and an equivalent line somewhere above there, all of this is just information about the user. It doesn't tell us anything about how to verify that the user is who they say they are.

If we keep moving forward in time, we get to this, which is from FreeBSD 5.
It's an excerpt from pam.conf from FreeBSD 5-CURRENT, actually; this was taken right before we switched to pam.d, the new configuration layout for PAM. At that point we had actually already had PAM since 3.1, but it was only in FreeBSD 5 that we started putting PAM into absolutely everything in the base system and using it for more than just a few specific things. So we now have a PAM configuration that actually lists the entire traditional Unix authentication policy: you check nologin, and then you check the password, and things like that.

The interesting thing is that both PAM and NSS came from Solaris. We had PAM since 3.1, and we had nsswitch since 5.3, but nsswitch is actually much older than PAM; we didn't adopt it until much later, for some reason.

This is the PAM equivalent of the code I showed you earlier. We first have to initialize PAM, and we tell PAM that we want to use a specific policy called "system". By convention, when you use PAM, you normally use the name of the application as the name of the policy, but you don't have to. Then we have one PAM service call there, which is pam_authenticate, which does pretty much the same as the line I had where I compared the hashed password. I'll come back to this code later, so I'm not going to go through it in detail, but you can tell that it is somewhat longer and more complex than the old-world equivalent.

So I'm going to move on now and start talking about what we're doing wrong. By structural flaws, I mean flaws in the underlying model.
I mean flaws in how we think about it, as opposed to flaws in how we do it.

Let's begin with identification. This is a slightly redacted dump of the publicly available information about me in the University of Oslo's LDAP directory, or in one of our LDAP directories, because we have several, with varying levels of detail. There's a lot of information there which won't fit in /etc/passwd. For instance, there's a line there that says I'm an employee. There's a line there that gives my postal address. There are several lines there with my job title, and there are lines there which, in a slightly roundabout way, actually encode which department I work in.

This is my University of Oslo web page, except I don't keep anything there: my phone number, my street address. It doesn't have my office location. So there's a lot of information there that, apart from maybe stuffing it in the GECOS field, you can't encode in /etc/passwd. As a consequence... so, this is struct passwd.
This is our current identity API: getpwnam and getpwuid, which return a struct passwd, which, as you can tell, has no way to store that information.

This is the equivalent for groups, which is also part of identity management: knowing that a user is a member of a group. Our concept of groups is extremely simple. The standardized interface we use for querying group membership, etc., pretty much only cares about flat groups. There's no way to create a group which has other groups as its members, for instance; there is no hierarchy. And this is all we have.

So unless your application actually has built-in LDAP support, and you explicitly tell your application, "hey, I'm actually using LDAP, so you can use LDAP to look up additional information", there is no way for an application to get that information, even though we have NSS underneath, which supports LDAP.

In addition, there is no API for actually modifying this information. The way we do it in the traditional world, if we're still using /etc/passwd, /etc/master.passwd, etc., is that we edit the file and then we regenerate the database.
When we do that, we're sort of mowing the grass under the application's feet. Well, the act of regenerating the database is sort of atomic, and there are locks in place, so you shouldn't actually run into any conflicts and you shouldn't get inconsistent information, but it's really a hack. And if you're using something other than /etc/passwd, then you have to know what it is that you're using, and you have to go there. If you're using LDAP, there is no way to directly make changes from an application that doesn't know that you are using LDAP. So the problem exists in both directions.

This is login.conf again, which I'm bringing up again because of how it came about. I'm saying "we" in a very wide sense here, because this actually predates FreeBSD, if I remember correctly. Kirk, login.conf, login classes, they predate FreeBSD? Yeah, they're from 4.4. So we decided that we needed to store more information about users, and instead of implementing a generic API for storing information about users, maybe having per-user properties or something, what we got was a very specifically named and very narrow API for a very specific type of information, which is resource limits and paths and environment variables and things like that.
It's a key-value store, so yes, you could use it for pretty much everything, except that the file format is very limited. As you can tell there, you can't store a colon in a field in login.conf, which is why the path here is space-separated instead of colon-separated, and you have to do the translation when you get that information from login_getcap-whatever.

So: structural flaws in authentication and access control. Let's return to our PAM example. What you actually see happening in these few lines of code is authentication here, at least on the surface, some sort of access control there, and some sort of identity management there. When I start, I first initialize PAM, and I tell PAM which policy I want to use, and I also tell PAM which user I'm trying to authenticate. The specification (the X/Open Single Sign-On blah blah, which is sort of the standard for PAM) actually says that a PAM module is allowed to modify that username. So once authentication is complete, the application should ask PAM what the real username is, which is what we do here. Then we can do a getpwnam, which goes to nsswitch, which goes to whatever, maybe LDAP, we don't know. But what we do know is that it doesn't go through PAM, because they're completely separate.

There is an additional problem here, though, which is that ostensibly we have some sort of authentication and some sort of access control here, but in fact we have no idea what happens when you call pam_authenticate or pam_acct_mgmt. I have a PAM configuration file here again. Where did I take this one from?
There's a line I wanted to show you, which is pam_nologin. In a previous slide, there, you can see the pam_nologin line in the authentication phase. That's not authentication. That's checking if a file exists in /etc, and if that file exists, it means that we're preparing to shut down for maintenance or something, and we shouldn't allow people to log in at all. That has nothing to do with authentication. It could be construed, perhaps, as access control, but we're actually doing it in the authentication phase. And then we have pam_unix.

If I'd shown you the PAM configuration for su or sudo, you'd see that the PAM configuration for su checks that the user is a member of wheel before it asks for the user's password, which is also backward, because verifying that the user is a member of wheel is access control, and we should authenticate the user first. So basically, we're making a decision based on the identity of the user before we've even authenticated it, before we even know that that identity is correct. The reason why the PAM configuration for su is like that is that that was the historical behavior. When we converted su to use PAM, we just created a PAM configuration that approximated the historical behavior as closely as possible.

But you can also see that the syntax is extremely simple, and there are many things that one would think were obvious requirements that we have no way of doing. There is no way, for instance, to tell PAM that I want at least two out of these authentication mechanisms to succeed. And I have no way of telling PAM that, for instance, if you're logging in on the console, it should accept one of them.
I mean, if you're logging in on the console, your password is enough, but if you're logging in remotely, I want you to also provide a one-time code or an SSH key in addition to your password. There is no way to express that in the PAM configuration syntax. Linux-PAM has a slightly more complex syntax, but it can't do that either. They have some sort of flow control: they have slightly more fine-grained control over when PAM returns instead of continuing down the list.

What I was saying about having different requirements based on where the user is logging in from, etc., is such an obviously useful feature that OpenSSH (this is an excerpt from the OpenSSH man page) actually has a fairly complex configuration mechanism for expressing that sort of policy. You have Match blocks, so you have conditionals: you can match based on the username, on the group name, on where you're logging in from, which can be either a hostname or an address, and you have a long list of knobs that you can tweak based on that condition.

This is OpenSSH specifically, but more generally, OpenSSH isn't the only application to do that. So what we've ended up with is a centralized authentication policy which is so inexpressive that many authentication and access control decisions have actually been decentralized. Or rather, they should have been centralized but haven't been, because there is no good way of doing it. So if you have 15 different applications, then you have the PAM configuration syntax plus 15 different configuration syntaxes, one for each of these applications, with different concepts and different ways of expressing them and different levels of functionality, plus probably several additional ones. Yeah, for instance, hosts.allow, TCP wrappers, in addition.
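Editor's sketch: to make the two missing features concrete, here is one way a policy combining "at least N of these mechanisms" with "depending on where the login comes from" could be modeled. This is purely hypothetical and does not correspond to any existing PAM or OpenSSH syntax. The `Rule` type, the origin labels "console" and "remote", and the mechanism names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    origin: str            # where the login comes from (hypothetical labels)
    mechanisms: frozenset  # mechanisms that count toward this rule
    required: int          # how many of them must succeed


ALL_MECHANISMS = frozenset({"password", "otp", "ssh-key"})

# On the console any one mechanism is enough; remotely, at least two are needed.
POLICY = [
    Rule("console", ALL_MECHANISMS, 1),
    Rule("remote", ALL_MECHANISMS, 2),
]


def allowed(origin, succeeded, policy=POLICY):
    """Return True if the mechanisms that succeeded satisfy the rule for origin."""
    for rule in policy:
        if rule.origin == origin:
            return len(rule.mechanisms & set(succeeded)) >= rule.required
    return False  # no rule for this origin: deny by default
```

For example, `allowed("console", {"password"})` holds, while `allowed("remote", {"password"})` does not until a second mechanism such as a one-time code has also succeeded. The point is only that the policy is data evaluated in one place, rather than logic spread across each application's own configuration syntax.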
I mean, OpenSSH supports TCP wrappers out of the box, so you can do access control at an even lower level, although you could view TCP wrappers as part of the firewall, really, and not part of the authentication or access control system: network-level access control.

So I'm going to speak very briefly about technical flaws. I've been talking about conceptual flaws; by technical flaws, I mean flaws in the way the tools we do have work, the way they're implemented.

Just very briefly: this is the PAM conversation API. This is how a PAM module communicates with the user, how it asks the user a question and receives the answer. You can ask several questions, you can pass messages and prompts, and you can receive input from the user. It's somewhat limited, but actually quite flexible, and it's done with a callback. When you initialize the PAM library, before you start authentication, you have to register a conversation function, which is a callback which the module will call when it needs to communicate with the user. This is what we call inversion of control.

Here's a very simple depiction (I should probably have added colors) of the code path, or the control flow, when an application uses PAM to authenticate a user. The application has an event loop. Let's say, for instance, that this is sshd: it has an event loop which it uses to exchange packets with the SSH client, and at some point in that event loop it decides that it needs to call PAM and authenticate the user.
So it calls pam_authenticate, which goes into the dispatcher in the PAM library, and the dispatcher calls PAM modules one by one based on the PAM configuration. Suddenly we get to a module here that wants to talk to the user, and what that module does is call the callback function and wait for an answer. The problem is, of course, that at this point the event loop has actually stopped. The event loop is stalled, waiting for pam_authenticate to return, but we're trying to perform an action which requires the event loop to run, because we need to send packets to the SSH client and we need to receive packets with the answer. That just plain can't work as long as the event loop is stalled waiting for pam_authenticate to return.

When I added PAM support to OpenSSH, my first draft actually ran the event loop from within the PAM shim layer, so the event loop would call PAM, which would call the event loop. It sort of worked, because the event loop was sort of re-entrant, but it would have broken horribly with anything slightly more complicated than the very simple test cases I was using. So what we had to do was move PAM into a separate process, so that the event loop communicates with the user asynchronously, and it communicates with PAM asynchronously through, call it a proxy: a very thin layer that allows you to do a remote PAM procedure call.

The problem is that PAM modules expect to be running in the process that will eventually, at some point, either perform an action on behalf of the user or fork and exec the user's shell or something, and in this case they don't: they run in a child process. All sorts of things mostly work, but only mostly. For instance, PAM modules can set environment variables, which will then be exported to the user's shell.
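Editor's sketch: the arrangement described above, where the blocking authentication call is moved out of the event loop and the conversation callback becomes a message exchange, can be modeled as follows. This is not OpenSSH's code. A thread stands in for the separate process, two queues stand in for the proxy layer, and `blocking_authenticate` is a hypothetical stand-in for pam_authenticate with a hard-coded expected answer.

```python
import queue
import threading


def blocking_authenticate(conv):
    """Stand-in for pam_authenticate(): blocks until the conversation callback returns."""
    return conv("Password: ") == "secret"


def run(client_replies):
    events = queue.Queue()   # worker -> event loop
    answers = queue.Queue()  # event loop -> worker

    def conv(prompt):
        events.put(("prompt", prompt))  # hand the prompt over to the event loop
        return answers.get()            # blocks the worker only, not the loop

    def worker():
        events.put(("done", blocking_authenticate(conv)))

    threading.Thread(target=worker, daemon=True).start()

    replies = iter(client_replies)
    sent = []
    while True:                         # the event loop stays in control
        kind, value = events.get()
        if kind == "prompt":
            sent.append(value)          # "send" the prompt to the client
            answers.put(next(replies))  # "receive" the client's reply
        else:
            return value, sent
```

Calling `run(["secret"])` returns `(True, ["Password: "])`. A real event loop would be servicing other traffic between those events, which is exactly what the in-process callback design prevents.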
But of course, if a PAM module has a side effect that affects the process that actually calls the PAM module, those side effects will be lost, because that process ceases to exist before OpenSSH starts the user's shell. Environment variables were a poor example, by the way, because the PAM integration code in OpenSSH actually transfers them back, so environment variables work.

What doesn't work is this: the pam_ssh module starts an SSH agent on your behalf after you've authenticated. The pam_ssh module is a module that allows you to authenticate yourself by typing in the passphrase to your SSH key, if you have an SSH key on the machine you're logging in on. So it's kind of backwards; it's not really intended to be used in conjunction with SSH. It's more for when you log in on X, for instance: you type in your SSH passphrase instead of your password, and you're logged in, and you have an SSH agent, and the SSH agent has that key loaded. In order for the SSH agent to have that key loaded, the pam_ssh module needs to either store that key after having successfully decrypted it using the passphrase you typed in, or store the passphrase so that it can later decrypt the key again. And that doesn't work, because that information is not transmitted from the part of OpenSSH that runs the authentication bit to the part of OpenSSH that runs the session setup and session teardown bit, which are parts of PAM that I haven't shown you.

Oh, yeah, that was a red square highlighting the callback. I'd forgotten about it.

I believe I have about ten minutes left. Yes, so I have ten minutes to talk about solutions. I'm not going to outline a solution. Instead,
I'm going to outline some principles which we should follow when, optimistically, we try to solve this problem.

The first principle: consolidate identity and authentication services. I mentioned this very briefly earlier. We have NSS, which is modularized identity management. We have PAM, which is modularized authentication and, to a certain extent, access control, and to a certain extent account management, and a tiny little bit of identity management. It's a hodgepodge. Anyway, they don't talk to each other, and there's no shared code. If you're using LDAP for identity and you're using LDAP authentication (which is a bad idea, because it's based on storing the password, either in plain text or hashed, in the LDAP directory, but never mind that), then the LDAP NSS module and the LDAP PAM module are two completely separate pieces of software. The only thing they have in common is that they both use the OpenLDAP client library to actually implement the LDAP protocol. They don't communicate with each other; they don't cooperate in any way. So we need to merge these two. We need to have a framework where the back end handles both identity management and authentication, so that we can do things like this.

We need to centralize the authentication policies. It sounds obvious, but it's actually very difficult, because we can't hope to ever fully centralize all authentication decisions. There are things like public key authentication in OpenSSH: we can't move that into some framework somewhere, because it's so closely tied to the SSH protocol that it has to be implemented in the SSH server. But the decision of whether to use it, whether to ask for a public key, whether to require a public key, or rather to require a signed
challenge, anyway, should be centralized. And that is actually possible, because in SSH 2 there are multiple authentication mechanisms. You can use one or several, you can use them in any order, and you can switch back and forth between them. Actually, OpenSSH does that: if you look very closely at the logs, it will run through the entire list twice. It runs through the list first to negotiate which ones to use, and then it runs through the list again to actually use them. So if we have tight cooperation between the two, it should be possible for the authentication framework to tell OpenSSH, "hey, can you please do public key authentication now, and tell me whether that worked". It was okay; never mind.

Isolate identity and authentication services from what I call exposed surfaces. If you're familiar with how OpenSSH does privilege separation, or with the concept of privilege separation, then you'll understand what I mean. Identity management and authentication are sensitive. If you're doing traditional Unix authentication, for instance, then you need to be able to access /etc/spwd.db or /etc/shadow or /etc/master.passwd, whatever; you get the drift. And you really don't want to do that in the same process that also speaks to the user and could potentially contain a buffer overflow vulnerability or some sort of code injection vulnerability that would then allow the user to access that data store directly. You want to separate those things. OpenSSH does that with privilege separation, but you want to do that across the board. For instance, su does not. It calls PAM directly, and does potentially sensitive, potentially dangerous things before it knows that it can actually trust the user, or before it has a
reasonable expectation of being able to trust the user.

There are worse cases than /etc/master.passwd, because that contains password hashes, but if you're doing time-based one-time passwords, for instance, if you're using OATH, which is the same as Google Authenticator, then in order to verify the code that the user entered, you actually need the key that was used to generate it, so you actually have to store the key in plain text. Just like with Kerberos: the Kerberos, what's it called, key server? Yes. The server that grants the ticket-granting ticket actually has a plain-text copy of the password, because it's a zero-knowledge protocol based on encrypting a challenge with the password.

If we take this one step further, we can isolate authentication from the application entirely. We no longer perform authentication and access control in the application, not even in a child process of the application like OpenSSH does. Instead we have a service, a daemon running somewhere, and the application communicates with that daemon through some sort of remote procedure call interface. That daemon provides identity management services and authentication services, with a very tightly controlled interface and complete separation between them.

That actually gives us some advantages other than just improved security. It means that the service can cache information in a way that you can't do in OpenSSH, where you're only going to do one request at a time and then lose your context. We can also do session setup and session teardown in a much better way than PAM does, and we can also do first-open, last-close operations: for instance, starting an SSH agent the first time the user logs in, and then every subsequent time the user logs in we just give the user information about the already running agent, but when the
user finally logs out of all SSH sessions and whatever, then we kill the SSH agent. We can't currently do that with PAM, because PAM does not have the big view. PAM only sees one session at a time.

And finally, we must dare to break, or at least bend, compatibility. In FreeBSD, the principle of least astonishment, et cetera, and backward compatibility are good ideas, but I believe that sometimes FreeBSD takes them too far, and that we seem to be afraid of making large changes and incompatible changes. We have to dare to do that. And we have to trust our users to actually understand the need for it, instead of instinctively being afraid that our users will hate us for introducing such a change.

Of course, we have to provide backward compatibility, because we are not going to change every single application in the world to support our new framework, if we develop such a framework. We should ideally be able to use third-party PAM modules in some manner, because of course people are going to write PAM modules for stuff that we don't necessarily want to implement in our base system. So we do have to provide backward compatibility, but we don't have to emulate every single little detail at the cost of actual functionality and at the cost of security which we could have achieved if we dared to bend compatibility.

And that was it. I think: great timing. So, any questions? Yes?

Right. That's interesting, but RADIUS, first of all, it's very network-oriented.
So, yeah, maybe you'd take a RADIUS server and rewrite it to communicate over a Unix socket or something like that. And RADIUS has some nice properties, such as support for back-and-forth conversation: it can do challenge-response, and go back and forth, ask you for your password, ask you for a one-time code, and stuff like that. It's a nice protocol, but it's purely authentication. It's not identity management. So it breaks my first principle, which is that we have to consolidate identity management and authentication.

Next question? Yeah. Maybe I should have brought it. It's called CDSA, and I think the specification is a thousand and fifty pages long. I should have brought a printout just to show you. CDSA tries to do absolutely everything. It's not just authentication and identity management; I think it's also an entire crypto framework. So it sort of replaces OpenSSL and PAM and NSS and GSS-API and whatever, pretty much everything.

Oh, I forgot to mention GSS-API, because GSS-API is something that we absolutely cannot do with PAM, because PAM is entirely text-based, and GSS-API is something that we should support to a much larger degree than we do. So OpenSSH supports GSS-API and Kerberos because it does them itself, because PAM isn't capable of doing that.

So, thank you. Any further questions? No? Okay. Well, thank you for coming.