Alright everybody, welcome to your last talk before afternoon tea. Super exciting. It's my distinct pleasure to introduce Fraser Tweedale. Fraser works at Red Hat on identity management systems, and today he is here to talk about integrating external authentication with Python web apps. Take it away, Fraser.

Thanks everyone for coming to my talk. Can everyone hear me okay? I've got a bit of a cold, so I'm probably going to be talking quite softly to preserve my voice. So yes, I do work at Red Hat on identity management things, mainly on certificate management. I'm the PKI guy, but I'm not going to be talking about that today. I'm going to talk about external authentication: why you want to do it, or to be able to support it, the advantages it brings, and how to do it for Python web apps.

So I guess the 30,000 foot view is that identity silos are bad. Don't build your apps as identity silos. The disadvantages of identity silos, particularly in a single organization, are that users aren't good at remembering two dozen passwords for different services. Chances are they'll just end up picking really bad passwords or using the same password everywhere, and you don't want them doing that. But then there's also a whole lot of administrative overhead in maintaining identities in identity silos. This is a problem for businesses, corporations, and also open source projects, if you have a large open source project. Say, for example, the Python project. The Python Software Foundation has many different services that they maintain, so identity silos there have all of the same problems that you have within organizations and businesses.

There's federated identity for the open web. So these are things like OpenID or OpenID Connect, you know, sign in with Facebook, sign in with GitHub. And there are Python projects, including Python Social Auth and allauth, that are implementing these identity management solutions for Python. But not all apps are built for public consumption.
Some of you in this room are probably working on applications that are for internal use in your organization. And even general applications, which could be useful within an organization or could be useful as publicly accessible applications, still need to have their identities stored somewhere. So for example, and no, it's not a Python app, but WordPress: people deploy WordPress internally and externally. So being able to support external authentication is a desirable characteristic if you're building applications like that.

So what's identity management? Centralized identity management is basically having a centralized store where all of your host, service and user information can be stored, as well as access policies and other kinds of related information. Various solutions in this space include FreeIPA, which is one of the projects I work on at Red Hat; Active Directory, which of course is basically king in the Windows sphere; and plain old LDAP, so just using a bare directory. They're used by corporations and open source projects, they define users, groups and access policies, and they also provide your authentication and authorization services. So these will be endpoints where you can actually query: is this username and password valid, or is a particular user authorized to access this resource?

Single sign-on is a facility provided by a number of different technologies, including Kerberos and SAML. FreeIPA implements the Kerberos Key Distribution Center. This provides security, in that your users only really need to remember one password, so password fatigue is not an issue, and the protocols themselves are secure. Convenience: once you're logged in, you're logged into all of the apps that you need to access, until such time as your Kerberos ticket or your SAML assertion expires.
They're great for onboarding, so you don't have to go and tell each application about the users in your organization. The applications can just receive this information by way of receiving the ticket. And you avoid the duplication of data and duplication of administrative effort.

So Kerberos and SAML are two of the SSO protocols. There are others, including OpenID Connect SSO for the open web, like I mentioned earlier. Kerberos is a ticket-based authentication protocol. Active Directory provides a Kerberos KDC; MIT Kerberos and Heimdal are other, free software implementations. And it's supported by browsers via the HTTP Negotiate extension. We'll see an example of that in the demo shortly. SAML is an XML-based, very enterprise-y sort of format, where service providers receive assertions which contain attributes, which are cryptographically secure, but do require an agreement between the service provider and the identity provider.

FreeIPA is a centralized identity management system. You manage your users, groups and services, as I mentioned earlier. It has a Kerberos KDC and also host-based access control policies. And the System Security Services Daemon, or SSSD, is the client portion that works in concert with FreeIPA or Active Directory, or indeed bare LDAP, but there are additional features available if you're using FreeIPA or Active Directory. It provides a PAM responder for authentication, and it also provides a user information lookup facility. So you say, okay, given a username that we've authenticated, we can now retrieve attributes of that user, like their full name, their email address, and so on. And it can enforce the access policies defined in FreeIPA or Active Directory, so the host-based access control. It has a D-Bus interface as well, so you can write applications in any language that supports D-Bus, or has a D-Bus binding, in order to get information out of SSSD.

Okay, so we'll go straight to the demo now. What are we going to see in this demo?
Basically, we're going to manage a user identity with FreeIPA. We're going to use Kerberos to do the SSO to authenticate to an application. This application is going to be configured such that only users who are members of the django group can access it. We're going to load additional user attributes via the request environment into the application. We're going to see that we can map external groups, so groups that are stored in the centralized identity management system, to groups in the application. And we're going to onboard Alice, so Alice is a new employee in our organization.

So this is a Django application. I'm just going to begin this and go to Groups. Okay, so we can see we've got two groups defined here, ext:helpdesk and ext:moderators. So these are the mappings between the external groups and the internal groups. And in terms of the users, well, we don't have any yet. We just have admin.

This is the FreeIPA web interface. I'll just make it a bit smaller there. Can everyone see that? Okay. So users... Hang on a minute. So active users, we just have the admin user, and don't worry about Portal. But we're onboarding Alice, so user login Alice, first name Alice, last name Able, and we'll set her password. So now we've added Alice to our user directory.

Now I'll just switch to a different host. So this is a completely different machine, a virtual machine running on my laptop, which is IPA-enrolled. Whoops. Okay, so we're going to do a kinit for Alice. This is going to acquire a Kerberos ticket. So if we do klist now, we'll see, okay, there are no Kerberos tickets on the system. If we do kinit alice and log in with the password that I set, okay, this is part of FreeIPA's policy: once you log in the first time, you have to set a new password. But after this, you wouldn't need to do that. Now if we do a klist, we can see that we've acquired a Kerberos TGT, which is a ticket-granting ticket.
So this is our single sign-on ticket. When we talk to services, we'll be able to acquire a service ticket automatically, behind the scenes, for that particular service.

Okay, so now if we go to log in, well, this is not going to work. We can see that we've got a 401 here the first time, which is going to have a WWW-Authenticate: Negotiate header in the response, and then the second time, in our request headers, we provide Authorization: Negotiate and this data here, which is our Kerberos service ticket. If we now do a klist, you can see that we've acquired a service ticket here for f224.ipa.local, which is the host. But the access failed. And why did it fail? Well, because we're using the host-based access control, and only the users who are members of the django group can log in to this application.

So if we now switch back to our IPA web interface, we'll go to Alice, and we're going to add her to the django group. While we're at it, we'll also add her to the moderators group because, say, Alice has been hired to be a moderator in this application. So we'll add those groups, and now we will attempt to log in once more. Excuse me. And you can see, okay, now we've successfully logged in. The application now knows about Alice, and if we flip back to the admin interface for our application, not only has the user Alice been created in this application, but we've also pulled in her email address, first name and last name. So these are user attributes that were defined in the central identity store.

Okay. In terms of how this is all implemented, I'll briefly show the configuration. I'm not going to explain it all. If we have a look at the httpd configuration, so the Apache configuration, for the server that this application is running on, we can see that at this login location we have a whole bunch of directives related to Kerberos. So AuthType Kerberos, KrbMethodNegotiate turned on, the auth realm, IPA.local. We have here require pam-account django.
So this is related to the host-based access control. This says that, okay, once we have a REMOTE_USER in our request environment, which is supplied by the mod_auth_kerb module, then we'll additionally go through PAM to authorize that user. We also have some other modules here, but I'll just skip them. And we can also do a cat of /etc/pam.d/django. So this is the PAM configuration for django. You can see it's using pam_sss, the PAM responder. And the service name here is implied by the actual name of the file, so the service name is django. There are some online resources that I'll show you at the end of the slides. I'll point you to those; they explain how to set all of that up, what the directives are, and how to use them to do all of this. But for now, we'll switch back to the slides.

Okay, so we're going to talk a bit about how to consume the external authorization in your application. So first of all, REMOTE_USER. REMOTE_USER is a standard request environment variable which, like it says in the name, identifies a remote user. This kind of harks back to the HTTP basic authentication days. Well, when I say standard, it's more of a de facto standard. Your old web servers that did basic authentication, or maybe a challenge-response authentication, would set REMOTE_USER in the request environment when a user was successfully authenticated, and then your CGI scripts or whatever could observe this variable and interpret it as meaning, okay, user such-and-such has logged in. So the web server sets this variable, and many applications can observe this variable. If you're a writer of a general web app that is intended for other people to go and deploy, so it might be a free software blogging system or CMS or whatever, you should support this.
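
As a rough illustration of what observing this variable can look like in a Python app, here is a minimal WSGI sketch. The talk shows no code for this step, so the `application` callable and its response texts are assumptions made for illustration; the only part taken from the talk is that the web server puts `REMOTE_USER` in the request environment after a successful authentication.

```python
def application(environ, start_response):
    """Minimal WSGI app that trusts the front-end web server's REMOTE_USER."""
    # The web server (not this app) performed the authentication and
    # exposes the result as REMOTE_USER in the request environment.
    user = environ.get("REMOTE_USER")

    if user:
        status = "200 OK"
        body = f"Hello, {user}\n"
    else:
        # No REMOTE_USER: the server did not authenticate anyone for this
        # request, so the app treats the request as anonymous.
        status = "401 Unauthorized"
        body = "Authentication required\n"

    payload = body.encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ]
    start_response(status, headers)
    return [payload]
```

An app like this should only be reachable through the authenticating web server; if clients can reach it directly, nothing guarantees that `REMOTE_USER` was set by a trusted party.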
It's important because people may, or probably will, want to deploy your application in an environment with a centralized identity management system at some point. You shouldn't assume that people deploying your application are always going to want to use a local identity store.

In practice, REMOTE_USER is not enough. Applications want to send emails, they want to say "Hi Alice, welcome back". That's where some of these server modules come in. So mod_auth_kerb, which we saw in the Apache config, provides the Kerberos Negotiate authentication support. There's also mod_authnz_pam, which provides the access control via PAM, and the PAM service was configured to use pam_sss. mod_lookup_identity is the module that, given a successful authentication, populates the request environment with additional user attributes read via SSSD, and mod_lookup_identity uses D-Bus to talk to SSSD. There's also mod_intercept_form_submit: in the event that you cannot use Kerberos or some other SSO technology, you can fall back to users providing a username and password, but mod_intercept_form_submit will intercept those values and then, on the server side, attempt to use the username and password to authenticate to a centralized identity store. So if you hit the page with the matching form fields transmitted in the POST data, mod_intercept_form_submit will recognize that and say, we want to try and authenticate this user, and it'll do that via PAM. Finally, mod_auth_mellon is a module that handles SAML assertions, but there was nothing in the demo that was related to mod_auth_mellon, and I'm not going to demonstrate that today.

In terms of the middleware and the backend for Django, which is what we're using in the demo, the RemoteUserMiddleware already supports REMOTE_USER and will log in a remote user, but it requires the ticket, in the case of Kerberos, or whatever mechanism it is that is used for the authentication.
It requires REMOTE_USER to appear in the request environment on every single request. If it's not there, then the user gets logged out. PersistentRemoteUserMiddleware is going to be in Django 1.9, I think. What's the current version of Django? 1.8? So it's going to be in Django 1.9. So it's on head now. PersistentRemoteUserMiddleware will create a cookie-based session and will not log the user out if REMOTE_USER does not appear in the request environment on a subsequent request. So using the PersistentRemoteUserMiddleware can allow you to have, like we had in our demo, a particular authentication path, such as login, that performs the authentication and the authorization of the user and creates a cookie-based session.

You'll observe from the demo that HTTP Negotiate requires two requests on every single page access. The first one will always return a 401 Unauthorized with the WWW-Authenticate: Negotiate header in the response. That instructs the browser to acquire a service ticket and request the resource again with that service ticket present in the headers. So by using the PersistentRemoteUserMiddleware you can avoid this additional round-trip on every request, and the additional load on your authentication service if you're actually having to perform the authentication each time.

The RemoteUserAttrMiddleware reads the mod_lookup_identity variables from the request environment. mod_lookup_identity has populated these variables in the request environment; RemoteUserAttrMiddleware just pulls them out. That's not part of Django. That's a middleware that we've written. We did propose it for Django upstream and it was rejected, so we're probably going to distribute that as a third-party package. The RemoteUserBackend, as you saw, actually created a user for Alice in the application's database. It does that by default. You can suppress that behavior.

If you're not using Django, how do you do this?
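
For the Django pieces just described, a settings sketch might look like the following. This is not the configuration used in the demo. The dotted paths are Django's own (`PersistentRemoteUserMiddleware` exists from Django 1.9 onwards, and releases before 1.10 name the setting `MIDDLEWARE_CLASSES` rather than `MIDDLEWARE`); everything else about the project is assumed. The attribute-reading middleware from the talk is left out because its import path is not given.

```python
# settings.py fragment: a sketch under the assumptions stated above.

MIDDLEWARE = [
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.contrib.auth.middleware.AuthenticationMiddleware",
    # Must be listed after AuthenticationMiddleware. The persistent variant
    # keeps the session alive when REMOTE_USER is absent on later requests;
    # the plain RemoteUserMiddleware would log the user out instead.
    "django.contrib.auth.middleware.PersistentRemoteUserMiddleware",
]

AUTHENTICATION_BACKENDS = [
    # Creates a local User for an unknown REMOTE_USER by default. To suppress
    # that, point this at a subclass that sets create_unknown_user = False
    # (for example a hypothetical "myapp.backends.NoCreateRemoteUserBackend").
    "django.contrib.auth.backends.RemoteUserBackend",
]
```

With this arrangement only the login location needs to be protected by the web server's Negotiate configuration; other paths can rely on the cookie-based session.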
Well, the general approach would be to use middlewares to interpret the request environment and provide the information in a form that your application can understand directly. You may want a system to map remote groups to application groups and roles. Actually, there's something I didn't show in the demo. I'll just switch back to the demo and select Alice here. We can see that we added her to the helpdesk group, and this information is also available in the application. She's in the chosen groups. There we are: ext:moderators. If you have some sort of group-based or role-based authorization within your application, you might want to map remote groups into your application groups, so you'll need a system to do that.

Users. The question is, do you want to persist your users to your application database, as we did in the demo, or is it sufficient for them to be transient? For example, you could write them into an encrypted cookie and just have that information exist in the session, or in a server-side store, but with no persisted state in the application. It might be sufficient to do that, and if you can avoid creating application-specific objects that need to be kept in sync with the information that's in the identity store, then that's desirable. And you may need to tweak the views. For example, if REMOTE_USER has been set, if there's a logged-in user, you might not want to show a login form.

So why do this in Apache and not in Python? The Python-only approach would make sense if you only deal with Python and if you need to be server-agnostic. Obviously these were Apache modules, so if you are writing a Python app and you want it to be easy to deploy on Nginx or Apache or whatever, well, this approach isn't going to work on its own.
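
Going back to the middleware approach for non-Django apps, the idea could be sketched as a small WSGI middleware that builds a transient identity from the request environment and maps remote groups to application roles. Everything named here is an assumption for illustration: the variable names (`REMOTE_USER_EMAIL`, `REMOTE_USER_GROUP_N`, `REMOTE_USER_GROUP_1`, ...) are whatever the web server has been configured to set, and `GROUP_MAP`, `ExternalIdentityMiddleware` and the `app.identity` key are hypothetical.

```python
# Hypothetical mapping from groups in the identity store to application roles.
GROUP_MAP = {"moderators": "app-moderators", "helpdesk": "app-helpdesk"}


def identity_from_environ(environ, group_map):
    """Build a transient identity dict, or None if nobody is authenticated."""
    user = environ.get("REMOTE_USER")
    if not user:
        return None

    # Assumed convention: a count variable plus one numbered variable per group.
    raw_count = environ.get("REMOTE_USER_GROUP_N", "0")
    count = int(raw_count) if raw_count.isdigit() else 0
    external = [environ.get(f"REMOTE_USER_GROUP_{i}") for i in range(1, count + 1)]

    return {
        "username": user,
        "email": environ.get("REMOTE_USER_EMAIL", ""),
        # Only groups with a known mapping become application roles.
        "groups": sorted({group_map[g] for g in external if g in group_map}),
    }


class ExternalIdentityMiddleware:
    """WSGI middleware exposing the identity to the wrapped app per request."""

    def __init__(self, app, group_map):
        self.app = app
        self.group_map = group_map

    def __call__(self, environ, start_response):
        # Nothing is written to a database: the identity lives for one request.
        environ["app.identity"] = identity_from_environ(environ, self.group_map)
        return self.app(environ, start_response)
```

Because the identity is rebuilt from the environment on each request, there is no application-side user record to keep in sync with the identity store.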
But in a heterogeneous environment, using the Apache modules, or server modules, means that you don't have to implement the authentication and authorization logic in different languages, and it means that the applications themselves have less configuration and they do less work, do less I/O.

Okay, so a few resources. There's a Django how-to on using REMOTE_USER authentication. This page at www.delton.com has all of the information you need to do external authentication for Django projects, including all of the Apache configuration for the different modules that I talked about. For other languages and more general advice, there's the FreeIPA web app authentication wiki page, and if you are interested in finding out more about FreeIPA or asking questions, like how can I use it with my application, there's the freeipa-users mailing list and #freeipa on Freenode.

So, wrapping up: identity silos are bad. They lead to duplicate data, duplicate administration effort, and typically less security, because users have password fatigue. They're not going to choose good, secure passwords, and they're going to write the passwords down. So: one password, one single sign-on system, particularly within a single organization. If your organization has centralized identity management, and most pretty much already do, then use it. If your organization doesn't, and open source projects maybe have just grown up with different identity silos, start planning to move to a centralized identity management system, and possibly evaluate FreeIPA as a solution for that. Your web server can do the heavy lifting for you if you are intending to use external authentication and authorization. That's pretty much it. So I think we've got about five minutes for questions, hopefully. Thank you very much.

Do we have questions?

Yeah, good question. I know someone was working on that, so Nginx equivalents of mod_auth_kerb and mod_lookup_identity. I think at Red Hat we had an intern, a master's thesis student or something, working on that stuff. The
last I saw was that the project was successful. In practice, I don't know what that means. I don't know if the modules are out there ready to use, or if they just need some cleaning up to be practically useful, but yeah, they were being worked on. I'd have to look more into exactly what the state of play is there.

My understanding is that you end up having duplicate user data in your application and in your centralized identity management. What's the likelihood of the data getting out of sync, and how do you resolve that problem?

Yeah, great question. So you've rightly identified that the application in the demo was creating user data in its own database. What's the name of the module? Yeah, the RemoteUserAttrMiddleware that we're using in this example will keep those attributes up to date. So if, for example, the user's last name has changed or their email address has changed, mod_lookup_identity is going to populate the environment with the new information, and this middleware will observe that and update the user object. As I also mentioned, it may be that you can get away without storing any user information in your app's database and just have transient users, and if you can do that, I'd recommend doing that, because it avoids this problem entirely.

Anybody else? Alright, let's all thank Fraser one more time.