We have Madeline Boyd with us. She'll be talking about a pit of success for permissions in Django. So over to you, Madeline.

Thank you. Hi, everyone. My name is Madeline Boyd, and today I'll be talking about how to handle permissions correctly by removing the need to worry about them. The system I built is for Django, but I'll explain the concepts in a generic way for those familiar with other ORMs. So if you work in a system which involves permissions or sharing, I hope you'll learn something from this talk. I'll try to save about five minutes at the end for questions, but I'll also hang out in the Zulip chat after this talk if anyone has anything else they want to discuss.

All right. Before we begin, I just wanted to thank my employer, Bit.io. I built this while at work, and they're letting me talk about it, which is awesome.

So I wanted to start off by talking a little bit about the title. The title comes from the expression "to fall into the pit of success," that is, to have the easiest default be the correct course. For example, if you're designing an API or framework, you want developers to fall into the pit of success, where without even trying they end up doing the correct thing anyway. In this case, I wanted to avoid permissions bugs by having permission checking built into the ORM itself. So when business logic queries the database, it should only return results that the requesting user has permission to see. Model object edits or deletes should only be allowed if the requesting user has permission to make those changes, and should return an error otherwise.

So, a quick outline of this talk. Spoiler alert: this talk fits the "I had a problem, I couldn't find a solution, so I built a solution" archetype. My problem was the risk of permissions bugs and human error, and I wanted to automate them away. So I'll talk about the solution I built.
In addition, I'll discuss considerations for using it, or when you should not use it, such as the potential performance cost and what you can do to mitigate it.

So a little bit more about the problem statement now. At Bit.io, we're trying to build a website where it's really easy to both share and lock down data. So think Google Docs, but for Postgres. We wanted to make it easy to upload a file, and then you have your database, and you can share it with whomever you want: friends, collaborators. You can even make it public if you want to collaborate openly. Or you can lock it down and prevent access. And we wanted to make it secure, so that hackers can't see it, but more generally so that anyone you don't give permission to won't be able to see or work with your data. That being said, databases are boring, so my forthcoming examples will use cake instead.

So, a quick overview of the existing solutions for managing permissions in Django, just to help clarify the motivation for this particular system. I'll skim over them for the people here who are not Django folks. But I also need to clarify what I mean when I say object-level permissions. There are subject-verb permissions and subject-verb-object permissions. With the former, you have a mapping of users to permissions. So you can say that the user Alice has the eat cake permission, but you can't say that Alice has permission to eat Bob's cake but not Charlie's. You need object-level permissions for that. Django natively only has subject-verb permissions. There are no subject-verb-object permissions. If you want those, you have to use a third-party library. These are the two most popular ones. Both are robust, but in both cases they require you to, at the minimum, check your permissions manually. So anytime someone might eat cake, you need to ask in the code: does this person have the eat cake permission on this cake?
If so, allow them to eat the cake. And that is what we wanted to bake in. We wanted the system to say, if you're trying to eat some cake and you don't have permission, just raise an error.

So I'll give another example. Let's say I have some delicious cake, and I want to make sure that anyone who is either my friend or likes cake can have some. Except for Adam; he knows why. Here's an example of how you might do this with Django Rules. You can define predicates, which are functions that handle the permission checking. In this case, if someone is not Adam, then they will have permission. This is basically just how you might define it in the code. And here's how you might handle a permission check. But again, it's quite manual. We wanted to take the former and simplify it down to the latter, so basically have the ORM only return results that match the permissions to view the cake.

So I'll talk a little bit about how I baked in those permission checks and how I applied them automatically. You basically have to do a few things. You have to be able to always reference the active user, or the requesting user in Django speak. In this case, we use some middleware that wraps around the request to make sure we get that user, and we store it in a thread-local variable, because requests in Django are thread-local in scope by default. I also had to override some of the ORM methods. In the Django ORM, you have model objects, where each model class maps to a particular table and each instance of that model maps to a row in that table, whereas query sets are an abstraction over SQL queries. So think selects, think inserts, think updates. And manager classes do what you would expect: they just manage working with models and query sets. I'll talk about how I override those in a minute.
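The middleware-plus-thread-local idea described above can be sketched in a few lines. This is a minimal illustration, not the code from the talk: the names `CurrentUserMiddleware` and `get_current_user` are hypothetical, and the sketch assumes one request per thread and that `request.user` has already been populated by an authentication layer. It follows the shape of Django's callable middleware but does not import Django.

```python
import threading

# One storage slot per thread (assumption: each request is handled by one thread).
_state = threading.local()


def get_current_user():
    """Return the user of the request this thread is handling, or None."""
    return getattr(_state, "user", None)


class CurrentUserMiddleware:
    """Django-style middleware: built around get_response, called per request."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # Remember who is asking for the duration of this request.
        _state.user = getattr(request, "user", None)
        try:
            return self.get_response(request)
        finally:
            # Clear the slot so a reused worker thread never sees a stale user.
            _state.user = None
```

Any code running inside the request, including overridden ORM methods, could then call `get_current_user()` instead of having the user passed down explicitly.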
And then you also have to make sure to raise PermissionDenied, or the appropriate exception, if a requester tries to take an action for which they do not have permission.

So here are the methods I had to override. The query set's _fetch_all is the workhorse of the Django ORM. This is what takes your filter expressions in Python, generates a SQL query from them, runs that query against the database, instantiates and hydrates Python objects, and then returns that set to you. The most important thing we did for this permission system was to override this, look at the result set, and do the permission filtering there. The model manager's get_queryset is basically a way to inject our version of the query set. And we override model save and delete so that we can raise PermissionDenied if someone tries to change or delete a model object when they don't have permission to do that.

So this is how you handle permissions at the model level. But you can also handle permissions at the field level. First, just a little bit more code. The Django ORM fetches objects and stores them in the result cache, and this is where we're filtering them out. And this is just a list of permission checks, because there might be multiple permissions to apply to a given query.

Before I get to field-level permissions, I will explain a little bit about fields and descriptors. The basic analogy is that a descriptor is an instance-level field. Fields in Django are properties on a model class that map to columns in a database table, where, again, each row in that table will be an instance of your model. And descriptors are a really cool Python design pattern. They're the power that backs the property decorator, if you've ever wondered about that. A descriptor is anything that has a magic method called __get__.
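As a small illustration of that definition, here is a plain-Python descriptor, independent of Django. The class name `Shout` and the `_flavor` storage attribute are made up for this sketch; the only thing it relies on is the standard descriptor protocol.

```python
class Shout:
    """A descriptor: Python calls __get__ on every attribute lookup."""

    def __init__(self, storage_name):
        # Name of the instance attribute that holds the raw value.
        self.storage_name = storage_name

    def __get__(self, instance, owner=None):
        if instance is None:
            # Looked up on the class itself: hand back the descriptor.
            return self
        # Looked up on an instance: compute the value at access time.
        return getattr(instance, self.storage_name).upper()


class Cake:
    flavor = Shout("_flavor")  # defined once, on the class

    def __init__(self, flavor):
        self._flavor = flavor


birthday_cake = Cake("chocolate")
wedding_cake = Cake("vanilla")
```

Here `Cake.flavor` is the descriptor object, while `birthday_cake.flavor` and `wedding_cake.flavor` each run `__get__` and return a value specific to that instance.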
There's also __set__ and __delete__, but __get__ is the most common one. If you set a descriptor as a class attribute, then any time you look up that attribute on an instance of the class, instead of returning a scalar value, Python calls the descriptor's __get__ and returns its result, so you can have dynamic property access in field lookups. This is actually how, in Django, you can define a field on a model class, then access that attribute on an instance and have it return a different value for each instance.

So here is an example of how it's used in Django. We have a flavor field. If you define a flavor field on the Cake class and access Cake.flavor, it will return the flavor field. But when you take an instance of that class and access birthday_cake.flavor, this is actually a descriptor, and it will return the value of whatever that cake's flavor is. So in Django you can think of a field as being to a descriptor what a class is to an instance, but there are some subtleties there.

Here's a fuller example of how we can override fields and descriptors in Django and have our own custom descriptors on objects. Basically, you have to override the field and then use a method called contribute_to_class to get your custom descriptor onto the model class, and the descriptor is where we add our permission checks.

So, to get back to my example. Growing up on the East Coast of the United States, there was this chain called Friendly's that had these ice cream sundaes, and there was always a secret surprise at the bottom of the sundae. You had to eat the whole sundae if you wanted to find out what that secret surprise was. So if this Friendly's ice cream sundae were a Django model, then for each secret surprise, you might want to restrict who can see it to only the people who have eaten the whole sundae.
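To make the sundae example concrete, here is a rough plain-Python sketch of a descriptor that checks permissions on read and on write. It is not the implementation from the talk and does not use Django: `GuardedField`, `ACTIVE`, `User`, and the local `PermissionDenied` class are all hypothetical stand-ins, and the write rule (employees only) is an assumption taken from the sundae story.

```python
from dataclasses import dataclass


class PermissionDenied(Exception):
    """Stand-in for django.core.exceptions.PermissionDenied."""


@dataclass(frozen=True)
class User:
    name: str
    is_employee: bool = False


# Hypothetical holder for the requesting user (a thread-local in a real app).
ACTIVE = {"user": None}


class GuardedField:
    """Data descriptor that runs a permission predicate before get and set."""

    def __init__(self, can_read, can_write):
        self.can_read = can_read
        self.can_write = can_write

    def __set_name__(self, owner, name):
        # Store the raw value under a private name on the instance.
        self.storage_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if not self.can_read(ACTIVE["user"], instance):
            raise PermissionDenied("cannot read this field")
        return getattr(instance, self.storage_name, None)

    def __set__(self, instance, value):
        if not self.can_write(ACTIVE["user"], instance):
            raise PermissionDenied("cannot write this field")
        setattr(instance, self.storage_name, value)


class Sundae:
    secret_surprise = GuardedField(
        # Only people who finished the sundae may see the surprise.
        can_read=lambda user, sundae: user in sundae.finished_by,
        # Only employees may put a surprise in (assumed rule).
        can_write=lambda user, sundae: user is not None and user.is_employee,
    )

    def __init__(self):
        self.finished_by = set()
```

With this in place, business logic simply reads or assigns `sundae.secret_surprise`; the check happens inside the descriptor, and an unauthorized access raises instead of silently succeeding.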
Likewise, anyone in the restaurant can see the sundae, but not everyone can see what the surprise is. And the people who can set that secret surprise are a different set of people again: only the employees. Because if a random stranger were trying to shove candy into the bottom of an ice cream sundae, they would probably get kicked out of the restaurant, if not arrested. So that's how you might define it.

As for how we actually implement those permission checks, you have to override the field and implement the contribute_to_class method. The contribute_to_class method on a field class is how Django inserts the descriptors onto the model class. It's basically just a fancy setattr, but if they can do it, we can do it too. So we just call super's contribute_to_class and then override whatever descriptor they have set with our own custom descriptor, and our custom descriptor does the permission checks. Again, the thing about descriptors is that they can call business logic before returning a value, so we can have our descriptor actually do permission checks. Which is cool.

That's for standard fields like integer fields, text fields, URL fields. Great. But Django also has related fields, which are really nice. With related fields, you have models that point to other models. If you have a related field from A to B, then by default there is also a relation from B to A, and you want to make sure that you're respecting your permission checks on the reverse side too. For example, say you have a friends relation, where A is friends with B and B is friends with A, or an asymmetrical relation. If I need permission to know whether A is friends with B, it's easy to check that on A.friends, but you also want to respect that on B.friends.
So suppose someone has permission to see B but not to see A. For example, you're building a social network, and you have a profile page that lists the friends of that user, or the users that user is connected to. If A has a very hidden profile, then you don't want to show A in B's friends list. So you need to respect both sides of the permission, is what I'm trying to say.

The way to do that is not so bad. In addition to overriding contribute_to_class, you also need to override contribute_to_related_class. So if contribute_to_class is what adds A.friends, contribute_to_related_class is what adds B.friends. It's a little bit trickier, though. I alluded to Django managers, which are the classes that manage query sets. You also have to implement a related manager class function, because what is dynamically instantiated at runtime is the manager class, which gives you the query set. So you have to override a method called related manager class and take the original related manager class that is generated at runtime. The relation, permission name, and reverse arguments are just some of the business logic for how we add our permission checks. Then, at runtime, you dynamically create your own manager class that subclasses the original related manager class, add your permission check in there, and return your new dynamic related manager class. So Python is awesome. You can do stuff at runtime. If Django can do it, we can too.

All right, I skimmed over a bunch of code for the sake of time. Feel free to ask me questions after this or in the chat.

I will also talk quickly about ACLs, which are how you record who has which permissions on which things. There are some permissions that can't be easily defined in functions.
Say you have a list of people who are explicitly allowed to eat a birthday cake, like the list of invitees at a birthday party, or you may have people with different roles. So maybe the birthday boy or girl has special permission to blow out the candles. In this case, we have a way of recording permissions similar to the Django Guardian way, where we have an ACL model object with three fields. You have the accessor, which is the subject of this permissions model, that is, your requesting user. You have the resource, which is the object being accessed, so a particular piece of cake. And you have the role, which determines the level of permission that the subject has on the resource. For example, the role could be birthday boy or girl, birthday attendee, parent of the birthday boy or girl, whatever you want to define.

So this is our ACL class. We use Django foreign keys, which are one-to-many relations, because, for example, a piece of cake may have many ACLs that point to it, but a given ACL will only point to one piece of cake. With the related name, on the cake you might call cake.acls, and it will return the set of ACLs.

We also use a modified delegate pattern. That's because a relation in Django can only point to one other type of class when you define it. If you want to point to multiple different types of classes, then you have to use something called generic foreign keys, which break a lot of the nice metaphors of the Django ORM. In our case, we instead created an intermediary object with a one-to-one relation. That type, in this case a class called accessor delegate or resource delegate, has different fields that point to the different types of classes. So our resource delegate is a stand-in for cakes or pies or toppings or frosting types.
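The accessor, resource, and role triple can be illustrated without Django. The sketch below uses plain dataclasses rather than model classes and leaves out the delegate objects entirely; the role names, the `ROLE_PERMISSIONS` mapping, and the `has_permission` helper are hypothetical and only show how a role on an ACL row could translate into concrete permissions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ACL:
    accessor: str  # who is asking (the subject)
    resource: str  # what is being accessed (the object)
    role: str      # level of permission the accessor has on the resource


# Hypothetical mapping from role to the actions that role allows.
ROLE_PERMISSIONS = {
    "attendee": {"eat_cake"},
    "birthday_person": {"eat_cake", "blow_out_candles"},
}


def has_permission(acls, accessor, resource, permission):
    """True if any ACL row grants `permission` to `accessor` on `resource`."""
    return any(
        acl.accessor == accessor
        and acl.resource == resource
        and permission in ROLE_PERMISSIONS.get(acl.role, set())
        for acl in acls
    )


party_acls = [
    ACL("alice", "birthday_cake", "birthday_person"),
    ACL("bob", "birthday_cake", "attendee"),
]
```

In this toy data, Alice and Bob can both eat the cake, only Alice can blow out the candles, and anyone without an ACL row gets nothing.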
So this is how we get around the Django ORM's limitations on what types of classes you can link in relations, while still being able to use a lot of the nice things about how the Django ORM works. We tried generic foreign keys, but we had a few issues with them. Django Guardian does use generic foreign keys.

Okay, so now I'll talk a little bit about some considerations that should be taken into account when using a system like this. Number one is performance. Do not make database calls in your predicates. This will make your code slow. Django has a nice thing called prefetch_related, which will fetch and populate related fields, and specific related fields on those, so you can fetch all the fields you need to conduct permission checks in one database call. For example, say we wanted to look at a cake object, but we needed to fetch some fields to see if we could even see the cake. What you want to avoid is the default, which is Django saying, oh, you want to look at cakes? Okay, let's go to the database and fetch a bunch of cakes. Oh, but in order to see these cakes, you also need to know what type of frosting they have. So let's also do a lookup on the frosting table, and not just that, let's do a separate lookup on the frosting table for each cake you're looking at. No, no, no. Let's just get everything we need in one database call and then check the permissions on those populated fields. One thing we did to avoid this was to write a hook that raises an exception if Django calls the database during a predicate call, which helped us catch some things. Also, Django Guardian has a method called prefetch_perms to prefetch what's needed for permission checks. So if you're using Django Guardian, use prefetch_perms. I would recommend it.

Another thing you can do is explicitly check the permission you need and then override permission checks.
That way, you're not accidentally over-checking permissions in a redundant manner when you're doing queries. If you're doing this all the time, then Django Guardian or Django Rules will work great and you don't need a system like this. So again, if this is the query you're making, and in order to look at cakes the view cake permission is "is baker or is customer," you may want to prefetch the fields you need to make that permission check. So don't do that, do this.

A system like this can be overkill if, for example, you don't check permissions very frequently, or you want to be explicit with your permission checks because you don't trust magic, or you don't need to check subject-verb-object permissions. In all these cases, there are existing solutions that will work better for you, and you don't need to bake permissions into the ORM.

The other consideration is that, as they say, pick libraries, not frameworks. I believe that, and I will always take a library over a framework. This is not a framework, but it's approaching one, because you have to have custom base model classes and custom Django ORM managers. So I guess it's not that bad, but it would be nice if you could just drop it in a little more easily.

All right, so that is the system I've built. I haven't open-sourced it yet. One of the things about working in a startup is that you have to stay focused, and I just haven't had the chance to dedicate some time to open-sourcing this. But this is my Twitter. Madeline is spelled weirdly. My mother thought this was the common way of spelling this name, and it is not, but this is how you can reach me on Twitter. Or Boyd at Bit.io is my work email, and that's a little bit easier if you'd like to reach me there. If you would like to use this, or you're working in Django and you think this would be cool, let me know.
The reason I built this is that Django has a lot of third-party libraries, and I was hoping that something like this would exist, and then we spent some time looking and couldn't find it. So if someone could tell me, oh, actually you should just use this library, I'd be like, I wish I had found you a year ago or 18 months ago, that would have been great. But yeah, reach out.

Thank you for coming to this talk. I really appreciate you taking the time as PyCon India draws to a close. I also just want to thank my colleagues at Bit.io who helped me work on this and build it. And thank you to the conference organizers for putting this together. So I guess I'll take some questions.

Thank you, Madeline, for the wonderful talk.

Thank you.

And the really nice analogies about ice cream sundaes, and thanks specifically for touching on the considerations with examples. It made it really simple to understand what you were trying to explain.

Thank you.

And that quote, for the easiest default to be the correct course, I think that's how every library or framework should be written.

Yeah, I got the phrase from a former colleague, Nick Schrock. I don't know if he got it from somewhere else, but he was one of the creators of GraphQL.

Nice. Okay, so we do have a few questions for you.

Okay.

So, first one: when I have to implement permissions at a field level or object level, I have to access some methods or attributes beginning with an underscore, so I think underscore or dunder. Is it really okay to use them?

I guess I need to know a little bit more to answer this question. Is it you and your business logic adding the underscores, or is it some internal framework that's adding the underscores?
I mean, in general, underscore means private, and nothing is really private because it's Python. And I mean, I'm also dynamically subclassing things at runtime. So is it really okay to use them? Maybe take a pause and just understand what's going on. It sounds like this is probably some other library or framework that you're using. Be aware that this API could break on version updates, so just look at any version bumps more closely. But it's your application code. And if this is an open source framework, then this is kind of the risk they take by making the code open source. So yeah.

I think that's the perfect answer. We have to look at the circumstances and how we're trying to use it. So yeah, makes sense.

Yeah, I mean, I'd look and see if there's maybe a more robust or easier way, but I'm also happy to look at the details of this a little more closely if you want some more help.

Yeah, so the poster could maybe catch up with you on Zulip.

Yep, find me in Zulip.

Next one. How do you create relationships between two tables in GraphQL?

This is a great question. I haven't worked with GraphQL recently. I was one of the original three beta testers, but I haven't worked with GraphQL in a few years. So unfortunately, I'm probably not the best person to answer this question.

There's actually another one on GraphQL.

Oh, no. I feel so embarrassed. I'm so out of date with my knowledge.

I think you could probably take those on Zulip. I think there's one about performance.

Okay.

How is the performance compared to REST?

I guess the analogy would be... So if you're making a REST request, you can have permissions with that or not. I think the best comparison is not necessarily to REST, but to what happens if you did the permission checks at runtime yourself, or if you didn't do permission checks at all. What is the performance comparison there?
And I would say it depends a lot on the permission check you're doing and how complicated it is. But for a naive check, it can be pretty minimal, so that the overhead of permission checks is on the order of tens of milliseconds, and that's for the total permission checks, which compared to the total request time is not so bad. If you're doing things like not prefetching and making database calls in predicates, then you can easily add a couple of hundred milliseconds to your total time, or whatever the cost of 100 database round trips is.

Yeah, that makes sense.