Okay, first of all, this talk will probably run a little long, so if we have no time left for questions, I will be outside afterwards — feel free to ask me anything. I'll try to keep it as short as possible, but there are many things I would like to say, and some of them are long and complex, so I didn't have enough time to explain them all in full. I recognize that something might not be clear, so feel free to ask me anything. First of all, the reason why I'm here and the reason for this talk: I'm currently maintaining a few Python libraries and frameworks in the Python web development world. The biggest project is certainly TurboGears, which, for any of you who don't know it, is a framework comparable to Django: it has an admin, it has multiple database access layers — it supports both MongoDB and SQLAlchemy — everything you might expect from a full-stack framework. It has been around for seven or eight years, so it's a very big project, with a lot of other big projects relying on it, and it takes a lot of my time. Then there are a few other, smaller projects like Beaker, which is a framework for caching and web sessions, and side projects like DukPy, which is a JavaScript interpreter for Python — you can use it, for example, to run React server side in Python, or to run the Babel.js transpiler in Python without needing Node — and Depot, which is a file storage framework. So you can see there are a lot of projects I'm maintaining, and the idea for this talk came from the experience I had as a developer of libraries and frameworks. Whenever you create something like a library, if you are lucky enough, it will get used — and that's when the real problems start.
The effect is that most of these pieces, which might be independent, start to get integrated into bigger software or other solutions that involve them, and it's not always easy to find a way to make them interact with other pieces of software and to communicate the design and the intent of what you wrote. You might expect your library to do something in a very specific way, and you don't expect people to use it in ways you didn't foresee. You try your best to enforce those ways and communicate them, but it's not always easy, so sometimes we need to resort to stricter enforcement to apply them. This is the most common situation I end up in as a library developer: whenever people open an issue, nearly half the time it's because they used the library in a way I didn't predict or foresee, or that it was never meant to be used in at all. For example, the most interesting case recently was an issue opened against Depot, which is a file storage framework with deep integration with SQLAlchemy: whenever you commit or roll back your transaction, your files get deleted or saved for real according to the state of the transaction. So, for example, if a user uploaded his avatar and your transaction fails for any kind of error, the avatar of the user is not left behind on disk, on S3, or wherever you uploaded it — it's also rolled back with the transaction, so you don't leave broken files around. And there was a guy who opened an issue about the fact that it didn't work well with the Django ORM. I said, yeah, of course — it's supposed to work with SQLAlchemy; that's the first word in the documentation. And then I actually started implementing support for the Django ORM, because people genuinely wanted to use it with that storage system.
So you try to cover whatever possible misuses of the library, the framework, or the code you wrote, and you try to do that in documentation. You show people the way they are supposed to use the library, but it's nearly impossible to show them all the ways they should not use it. If you try — like, I could write "this works with SQLAlchemy and MongoDB", which is what I did, and then I could add one more paragraph to the documentation saying it doesn't work with the Django ORM, and then probably someone will try to use it with MongoEngine, and then I would have to say it doesn't work with MongoEngine, only with Ming for MongoDB, and so on. You will end up trying to list all the things people shouldn't do, which is impossible, because the ways people can misuse your code are virtually unlimited. So you quickly discover that your definition of what counts as a reasonable way to use your library or framework is not easy to guess — it's not a common definition that all of us share. So you don't know what to do anymore. I mean, I already did my best: I told you how to use the library, I told you how you should apply it in your project, I wrote examples, I wrote documentation, I even provided some sample projects. But there is always something you didn't foresee or didn't predict. There is actually a branch of software development that covers exactly these kinds of things — a discipline called defensive programming. It's really common in really big projects: if you need to write the software that drives the Space Shuttle, you will apply a lot of techniques that come from defensive programming, like formal proofs of the requirements, or double fallback systems for every single piece of software you've written. The purpose of defensive programming is to defend you from the impossible, because it will happen.
And the more formal definition, which I took from Wikipedia — so it's probably the standard definition nowadays, it's what you'll read on Wikipedia — is that it's a form of defensive design: a way you design your system, not just the way you write it at the code level, to ensure that it continues to function under any possible circumstance. Or, if it doesn't work anymore, it should at least report clearly what's going wrong, instead of just producing a random error, which is usually what happens. And there are a few things from defensive programming that we actually take for granted — we are used to them from object-oriented programming, from aspect-oriented programming, from many other paradigms. You have interfaces in object-oriented programming; you have immutability of types in functional programming. Those things actually fall under the umbrella of defensive programming, as they are ways to prevent you from doing the wrong thing. If you are subclassing an object that does certain things, you are expected to do those things. Or, if you are working with a specific type, you are not expected to mutate the object itself, which might have side effects — you are expected to create a new one. All those things usually go under the umbrella of defensive programming. And you will notice that everything involved in defensive programming usually resembles protocols and expectations: declare how you want to communicate with the other pieces of the system, and what your expectations are about how the other pieces of the system will work and what they are going to do. For example, if you are a file storage framework, you are supposed to store your files somewhere; you are not supposed to store them nowhere — which, funnily enough, was actually one of the first features that got requested for Depot.
People started asking for a way to write really fast unit tests involving integration with Depot, saying: hey, why can't we just make Depot do nothing when you store something, so that my test is fast, instead of having to actually write the file to disk or upload the file to S3? So, whenever you face a condition in which the software refuses to respect your expectations, you should provide a clear alert to the user that he is doing something wrong, because you cannot rely on the fact that he read the documentation. And I would say that just as the software using your library has expectations of your library — I expect your web framework to render my page, I expect your file storage framework to save my files, I expect that whatever you wrote does what it is expected to do — your library should have expectations too, regarding how the user is going to use it. So try to enforce them, because this is one of the major pain points when you start getting involved in many different projects that do many different things. There are parts of this protocol enforcement that we are already used to in our daily work. Whenever you write a piece of software — it doesn't need to be a library or a framework, it can even be just something you use in a single project, maybe in four different places of that project — at that point you will have to declare clear expectations, and you usually do that through interfaces, through the signatures of your methods, types, assertions. All those kinds of things are actually ways to express a protocol — a protocol in the sense of how your object or your library interacts with other libraries and objects. And you actually need to declare expectations at every single joint your library has with other people's code: every single point where your library can interact with something else should enforce those expectations.
And I'm not talking about the most obvious cases. Like: my method, if you call it, you are supposed to pass a number — that's something we take for granted. It's obvious that if you expect a number, you should check that the thing you are receiving at least resembles a number, or can be converted to a number, if that's the kind of duck typing you want to apply. But I'm also talking about side effects. If you saw the previous talk from Armin, there was a clear problem with side effects in imports: I do an import, and I expect that what I find in sys.modules is actually the module I imported. But there is nothing enforcing that for me; it's just a side effect of how the import system works, and it's not actually guaranteed to work that way for real — something else might have changed the contents of sys.modules as a side effect of an import. So those are all kinds of expectations that your library should enforce about the context it runs in. And the fact is, the context is pretty hard to define in a dynamic language. In this case, we actually have a few tools that can help us in many ways, because the fact that it's a dynamic language cuts both ways. It's not always easy to specify the protocols your classes or objects need to enforce — we are starting to see improvements, with abstract base classes, with type hinting in more recent versions, and so on, but Python has a legacy of being a mostly untyped language. Yet the fact that it's so powerful and flexible in manipulating code and types also means it has powerful and flexible tools to inspect what's going on, because Python itself has to do that: to be able to treat your objects as whatever they claim to be, it has to have powerful tools to check what that something is. So Python provides us powerful inspection tools, because the language itself genuinely needs them on a daily basis.
If you provide something that is supposed to work on a sequence — and that's probably a case every one of you has faced — you know that it's really hard to check what a sequence is in Python. We have a definition in collections: we have a base class that says "I'm a sequence". But for the general idea of what a sequence is, many types that do not respect that hierarchy, that interface, are actually sequences. So it's not uncommon in Python to see complex checks of what kind of methods or behaviors an object exposes. And we are really used to inspection in the context of debugging: whenever your software crashes, you expect to be able to go into the code and see what the local variables were, what the previous frame was, to go back and forth across the call stack — all things Python can do because it's a dynamic language. If you have ever worked with a compiled language like C, it's not as easy, I would say. It's still possible somehow, if you have an understanding of how the internals of the operating system and the computer work, but it's not as easy to move across your stack of calls; at the very least you must know, for example, in which place in memory your variables will be. While in Python, I can just get the frames, and from the frames the local variables — and here it is, what made my function crash. And one thing your library can do is leverage all these tools to inspect the surroundings in which it is running, to check that its expectations are met. So I will try to show you some examples to clarify the concept, because so far this has been a really theoretical talk. I told you about protocols, enforcement of expectations, defensive programming, and so on — but in practice, what does it mean? How can we apply some of these things in Python? I tried to reproduce some of the most common anti-patterns that we see often in Python.
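As an illustration of how fuzzy that sequence check gets in practice, here is a hedged sketch (the helper name is mine, not from the talk) that combines the abstract base class with a duck-typed fallback:

```python
from collections.abc import Mapping, Sequence

def looks_like_a_sequence(obj):
    """Heuristic check: real Sequences, or anything exposing the protocol."""
    if isinstance(obj, Sequence):
        return True
    if isinstance(obj, Mapping):
        # dicts expose __getitem__/__len__/__iter__ too, but aren't sequences
        return False
    return all(hasattr(obj, name) for name in ("__getitem__", "__len__", "__iter__"))

print(looks_like_a_sequence([1, 2, 3]))   # True
print(looks_like_a_sequence({"a": 1}))    # False
print(looks_like_a_sequence(42))          # False
```

Even this is only a heuristic, which is exactly the point being made here: there is no single check that matches everyone's idea of "a sequence".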
Some of them are really widespread, even though they have many side effects. And one of the most common side effects in Python, which is related to the previous import example, is side effects at import time. We often rely on the fact that whenever Python imports a module, it runs the code of that module. So we tend to see things like class declarations not actually as declarations but as statements: when they get executed, they actually create the class. And this is common, for example, in the context of registering something — I've seen people commonly use decorators to register an object in a global registry. This is really common in the context of event handlers and hooks. It's really common to see this pattern, which involves having some kind of registry — in this case it's just a dictionary — and some kind of decorator that declares on which event your function should be called. So, for example, my decorator will register the function you are providing as a handler for the event specified by the decorator. And then, whenever I want to fire an event, I just look up all the registered handlers and call them: in this case, it just goes through all the event handlers for that event and calls them. And this usually ends up being something like this: on event, some event, then my listener, and then I fire the event, okay? And there is one subtle problem in this example: it works in the most common cases. It works in the case where your module gets imported and you register the handler in the global namespace. But what happens if I do something like this? I declare a factory method — which is, I would say, a pretty widespread pattern in many cases — and that factory method creates something that has an event handler. And here our problem starts.
If I fire the event, I will see the handler run only if my factory was called before the event was fired; it will never run if my factory was called after the event was fired — or never called at all — because the handler was never registered: my factory was never called, so the decorator was never applied, and so my function was never called when I fired the event. And okay, if you think about it, it's fine, it's the way the language works. You say: hey, you never created f, so how could it be registered as an event handler? It's pretty obvious at that point. But from the expectations of a new user of Python, it might seem more obvious that "I registered that function as an event handler and I expect it to be called, because it's in my code". So this is an error you might face when users use your globally registered handlers: they open an issue on GitHub and say, hey, I did this and the event handler never worked, and you reply, yeah, of course, because you never actually created it. It would be better if we could detect that this is an error, and tell the user: you should not be doing something like that. And it is actually a condition we can assert — we can implement something that checks for it. We can implement a register decorator that at least traps the context it was used in. In this example, we have the same exact register decorator, but we are using Python's inspection features to check that the parent of the current frame — the place where the event handler was registered — is at module level. Okay, this might not be the clearest or the most elegant way to do it, but it works, and it was short enough to fit on a single slide. What I wanted to show you is mostly the idea that my event handler machinery should be in charge of checking that it was used in the correct way.
So in this case, it's in charge of checking that the place where the event handler was registered is not a transient scope — it's not something that might or might not exist, it's something that will exist, okay? At least in the sense that you usually import all your modules as the first thing. Of course, it will break again if you use lazy imports, but those have a whole set of side effects of their own, so you usually don't want to use them. So in the most common case, it will work. And in this case, I trapped the fact that the user registered the event handler in a scope that is not global — so it might be there or might not — because the only scope we know is always there is actually the global one. And I provide an error, so that whenever the user misuses my decorator, he at least gets an error telling him: here you are registering the event handler in a transient scope. Maybe I can add one more option that says: yes, really do that, because I know what I'm doing — I know that I will always call the factory before firing that event. But in the most common case, I should probably trap that and tell the user: you're doing something that will cause you issues. And this is the domain of tools like static code analysis. They usually do things like that: they check for anti-patterns — things you are not supposed to do — and they usually do that really well at the language level, because they know how the language is supposed to be used. But they are not really useful in the context of checking your own code's expectations — I mean, sorry, that might be a misleading way to put it: the code that you write has its own expectations. Those tools will check that your code satisfies the expectations of the language, but you won't be able to check that the other libraries using your code satisfy the expectations of your code, because you would have to write custom checks for that.
And in many cases, those tools are not really easy to extend: you would need to learn a whole new framework, you would need to write parsers and so on, because they rely on checking the syntax of the language — they rely on checking the source code itself, not the runtime. And there are many things in a dynamic language like Python that are not easy at all to foresee without actually running the code. If you have ever used a development environment like PyCharm — and here we are in their room — you have probably noticed that it's not always easy for your editor to guess the correct types or what's happening in the code, because it doesn't really know what's going on when you run that code. It can try to guess, but it has no guarantee that the guess will be correct. Also, a static analysis tool is one more dependency you must add to your toolchain — one more piece you need to set up before working on a project. And as you add more and more of those dependencies, it tends to become a pain to start working on new projects; people start to complain and don't want to contribute to the project anymore. So for an open source project, it's not always a really good idea. Okay, and here comes another powerful feature of Python, which is inspection. We have a full set of things we can inspect at runtime, and the great thing is that you can inspect not only your own code, but even other people's code. The main problem with inspection is that it is really expensive — so don't try to do it at runtime. Usually you want to do it at test time, while your test suite runs. And you can inspect practically anything in the language — not just objects, modules, and classes; you can inspect the code itself, which is the really interesting part, because we can check what other people are doing with our objects, our code, and our functions.
And here is one simple example that tries to show why inspecting the code can help you understand what's going on. For example, we might have read the Python documentation, which states that all comparison operators evaluate from left to right. So we would probably think that writing something like `True == False in [False, 5]` evaluates to a true value, because we compare True to False — and that's False, of course — and False is in the list that contains False. That might be our expectation if we evaluate the expression from left to right, as the documentation states. But that's not actually what happens, because if we put brackets around the left part of the expression — which should change nothing, since it should still be evaluated from left to right — the result changes. And why is that? Well, it's not so easy to guess at first; my first reaction when I saw this was: why? And we have a few tools that might help us understand what's going on: one is inspecting the syntax tree, and the other is inspecting the bytecode itself. The first thing I'm going to show you is how to inspect the bytecode, so we can disassemble our function and ask Python: what's going on here? And then we clearly see what the language is doing. What's going on is that Python is going to compare True to False, and if that comparison is false, it will bail out: JUMP_IF_FALSE means stop what you are doing next and jump to 1127, which is the end of the function. So it will totally skip the right side of my expression if the left side is false. And why is that? Well, if we go into the abstract syntax tree, it becomes clear why Python is doing that, because Python's expectation is different from mine: Python expects you to write something like that to express equality across three different elements.
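The two inspections just mentioned can be sketched like this (the exact disassembly output varies between Python versions, so only the results are shown as comments):

```python
import ast
import dis

# Chained comparison: Python reads `True == False in [False, 5]` as
# `(True == False) and (False in [False, 5])`, short-circuiting on the left.
print(True == False in [False, 5])    # False
print((True == False) in [False, 5])  # True

# The AST confirms it: a single Compare node with TWO operators (Eq, In)
# applied around the same middle operand, not two separate comparisons.
tree = ast.parse("True == False in [False, 5]", mode="eval")
print(type(tree.body).__name__)                      # Compare
print([type(op).__name__ for op in tree.body.ops])   # ['Eq', 'In']

# The bytecode shows the conditional jump that skips the right-hand side.
dis.dis(compile("True == False in [False, 5]", "<demo>", "eval"))
```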
So those are not two different operations; it's a single Compare operation, and it compares the left value against two other things using two different comparison operators. You're not actually saying "compare the first two, and then compare the result to the third element"; you are saying "compare the first element to this, and then to that" — and if any of the comparisons is false, the whole expression is false. What was misleading was the way I wrote it, because I used the `in` operator. If I wrote it the way of the second example here, it would probably be clear to any of you why Python behaves that way: in that expression it's clear that we want all three things to be equal, while in the original it's not as clear that we are asking for True to equal False and also for False to be in the list containing False and 5. In a simple expression it's easier, but it's not always easy to guess what's going on until you go down to checking these kinds of things. And now I actually need to rush a bit, because I have only about five minutes left, so the next example will be pretty fast. If you have ever worked with cyclomatic complexity: it measures how complex your code is — how many branches your code takes. A good limit is usually stated to be around seven: if your complexity goes over seven, it means your code is starting to get too complex for humans to follow and read. And if we have something like this — just a function with two different if branches — we can compute the cyclomatic complexity by disassembling the code of that function, ending up with something like this, which is the bytecode, and then counting the number of ifs and for loops the code contains. So you count the number of ifs, count the number of fors, and add one, which is the main branch itself. If the complexity that comes out of that formula
is over seven, we can give the user an error telling him he should refactor. So we can use code inspection to enforce best practices on our code: check that your complexity stays low, refactor your code when it gets too long. We can check, for example, that a function has no more than X for loops, and say: hey, you did too many things in this function, you should refactor it. And then, once you understand that you are able to inspect what's going on around you, what should you do? The first place I suggest you apply inspection is when you work on a big project with other team members. It's really common in that context that there are some parts of the system that you understand very well, and other people understand less than you, because you are probably the main developer of that part of the system. And it's really important for you to be able to set expectations in the tests of that part of the system, because someone else might introduce bugs they don't even know they are causing, because they don't know how your code works. For example, one of the most recent cases where I used inspection is this example. I'll try to explain it as best I can, even though I recognize that without the context of what was going on, it's not easy to guess what's happening. Suppose you have some code that destroys an entity — it might be whatever, a blog post. It destroys a blog post, and the blog post has many attachments: every image, every video, whatever you uploaded to that blog post; maybe it has translations into other languages, and so on. And we want to make sure that, however the code that deletes the blog post changes, it continues to maintain the assertion that all the resources uploaded with that blog post are deleted — so that even if our code fails somehow, it doesn't leave around pieces that were supposed to be deleted.
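Circling back for a moment to the cyclomatic complexity check: the bytecode-counting idea can be sketched like this. This is a rough approximation of McCabe's metric (counting conditional jumps and loop iterations in the bytecode), and the opcode names are CPython-specific.

```python
import dis

def cyclomatic_complexity(func):
    """1 + number of branch points found in the function's bytecode."""
    branch_points = sum(
        1
        for instr in dis.get_instructions(func)
        if "JUMP_IF" in instr.opname or instr.opname == "FOR_ITER"
    )
    return 1 + branch_points

def classify(x):
    if x > 0:
        return "positive"
    if x < -10:
        return "very negative"
    return "other"

complexity = cyclomatic_complexity(classify)
print(complexity)  # 3: one per `if`, plus one for the main path
assert complexity <= 7, "too complex for humans: refactor"
```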
In this case, what I did was write a helper function called getMethodsCalledBy, which looks at the bytecode of a method and finds all the other methods called by it. So I did something like: okay, I know this function does about twenty other things; check that if any one of those twenty things fails, it doesn't leave around any stray file. And that was a really important thing to enforce, because when you tear down a resource, you expect it to be gone forever, and you don't want to go and clean things up by hand — when you find them later, it's usually really hard to understand whether they are still used or not; you have to go through all the dependencies and so on. So in my test suite, I wrote a test that monkey-patches every single method called by that teardown method and makes it raise an error: every single method called by the destroy function now raises an error, one at a time, and I check that causing an error in each single method doesn't leave any resource around. So this is a clear example of enforcing expectations through bytecode inspection: I inspected the bytecode to find all the methods the function was using, and then I enforced the fact that each one of them mustn't leave behind any resources if it fails. It can fail, of course — we know something might go wrong; we know the connection to the database system might be down for whatever reason at the exact moment the user deletes the object — but it still needs to be guaranteed that the data stored there will be gone once the connection is back again. And of course, you didn't really expect me to do a talk about code inspection without citing metrics.
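A sketch of that helper (my reconstruction, not the talk's actual code; note that the relevant opcode names differ across CPython versions, which is why both `LOAD_METHOD` and `LOAD_ATTR` are checked):

```python
import dis

def get_methods_called_by(func):
    """Attribute/method names the function's bytecode loads (and likely calls)."""
    return {
        instr.argval
        for instr in dis.get_instructions(func)
        if instr.opname in ("LOAD_METHOD", "LOAD_ATTR")
    }

class BlogPost:
    def delete_attachments(self): ...
    def delete_translations(self): ...

    def destroy(self):
        self.delete_attachments()
        self.delete_translations()

print(sorted(get_methods_called_by(BlogPost.destroy)))
# ['delete_attachments', 'delete_translations']
```

In a test suite, each discovered name can then be monkey-patched to raise, one at a time, asserting that a failure at any step never leaves stray resources behind.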
And at this point, I've tried to do the best I could to explain what defensive programming is, what you can do with code inspection in Python, and why you should enforce the protocols and the expectations of your libraries. Exactly what you should do is not easy for me to tell you, because it really depends on your project and its context. I would say it's already enough if you get out of here and, whenever you write your next piece of code, start thinking: how might every other developer in the world misuse my function? That's usually the question you want to ask yourself whenever you write a new piece of code. Thank you. [Host] We've got five minutes for questions. If you have any questions, raise your hand — we have a microphone for it, so it gets recorded. [Audience] Is there a library or something else that I can use to... [Speaker] Sorry, I can't hear you very well. [Audience] Is there a library or something else that I can use to implement the functionality that you showed in your last example — to get, for example, the functions that are called by another function? [Speaker] Okay. Not that I am aware of. I'm pretty sure I've seen things that do something like that — libraries for inspecting the code or looking at the runtime — but they are not really oriented toward enforcing expectations. So you can usually rely on them to get the context of what's going on, but none of them is really kept up to date, because the Python bytecode changes sometimes — not really often, but from version to version they might add a new operator or something like that. So unless the core developers are really committed to such a project, it usually falls behind pretty quickly, within one or two versions of Python. So, in my experience, it was just easier to write it myself, because when you implement it inside a project, it's usually bound to a specific Python version: you usually know that the project is going to run on Python 2.7 or 3.5 or whatever.
So it's easier to just take that for granted in the context of a project you do for work, of course — you know the version you are going to run on. For an open source project, obviously, it's much harder, and it's usually easier to rely on things like inspecting frames and the call stack than to go into bytecode inspection itself, because the bytecode might change. So I don't have a clear answer to your question: I know of no good framework I would suggest for doing that. That would be my point. I saw some libraries, but I never ended up using them, for various reasons, so I don't have enough experience with any of them to tell you it will solve all your problems. [Audience] Thanks. Going back to your first example, the registry where you register classes or callback functions — I agree this is extremely common, every binding does that. I also agree it's not so clean, but what would you suggest doing instead? [Speaker] Okay, well, in this specific case, for example, there is a really widespread library every one of you has probably heard of, which is Celery, and it does something like that for registering tasks. If you have ever worked with Celery and ever looked at its code, you have probably noticed that it also has some pretty complex logic to try to ensure it finds all the tasks, wherever in the code you registered them. But there are still many ways your code might go wrong and some tasks... [Audience] In that case, I know it very well from experience — I've seen that issue, because the task decorator takes a function as its first argument, and that function is actually the real task. [Speaker] Yeah. [Audience] I totally solved that problem, but I agree the design is wrong. [Speaker] Exactly. So, for example, one of the things I usually do to solve this kind of problem is to have an explicit registration phase at the kickstart of the software.
So, for example, with Celery we ended up writing a task-registration step that ensures that whenever you start the software, all the tasks are registered at that moment; if any task gets registered at any other moment, it will crash and tell you: you registered the task at the wrong moment. So you should explicitly call a register-task function in the startup code. And that was a simple solution that didn't involve changing Celery much — we just had to patch the task decorator to check that it wasn't called anywhere other than the startup function. And there are probably many other patterns you can enforce. Pyramid, for instance, does something really complex: it has a library called Venusian, if I remember correctly, that inspects all your code, trying to find every hook point in every single Python file of your code. That might be another solution, but I guess it won't be easy to catch them all — I'm sure that if I invested enough time, I could find a way to fool Pyramid's lookup of my code. [Audience] Thank you, because it's similar to things I'm doing to solve these issues. Thanks. [Host] We don't have more time for questions. Just a quick note on the EuroPython app: you can rate this talk, thank the speaker, and also provide any feedback you think might be useful for him to give better talks. So please thank the speaker again, Alessandro.