Can you guys all hear me? All working? Okay. So yeah, my name's Tom. I'm gonna talk to you today about writing an auto-reloader in Python. I've broken the talk down into four sections. We're gonna talk about what an auto-reloader is. We're gonna talk about Django's implementation. We're gonna talk about how I rebuilt it. And I'm gonna talk about the aftermath: what happened after I rebuilt it. Sounds good? So firstly, what is an auto-reloader? Like all good programmers, I googled this and nothing came up, which surprised me. There was no definition of an auto-reloader, although it's a common development term. So I wrote this definition, which sounds sufficiently technical and vague: it's a component in a larger system that detects and applies changes to source code without developer interaction. So raise your hands here if you use an auto-reloader in your day-to-day life, in some kind of framework. So yeah, pretty much everyone, right? Raise your hands if you could write one, or you know in detail how it works. Okay, one and a half people. So this is why I find them interesting. They're really common: every developer, or most developers, use them. They're a critical part of frameworks like Django. If the auto-reloader doesn't work, as we'll find out later, it's kind of a big deal. Even though they're not a production thing, they're not really well understood, and they're really language specific. An auto-reloader in Python is very different from an auto-reloader in JavaScript. As an example of an auto-reloader, a really simple one would be automatically refreshing a browser tab every time you change an HTML file or a JavaScript file. That's an auto-reloader. A special case of an auto-reloader is a hot reloader, and this is the holy grail of auto-reloaders because they're really fast and really efficient: a hot reloader applies the changes to your code without restarting the system.
A really simple example of this is changing the style sheet on a web page. This is kind of hot reloading: the browser can take the changes to the style sheet and apply the new styles to the page without refreshing the tab. You can hot reload CSS. And these are impossible to write safely in Python in the general case, and I'll tell you why. A special shout-out to Erlang, where you hot reload code while deploying. That's how you deploy code in Erlang: you hot reload it in production. I wouldn't suggest doing that in Python. So you might say: Tom, Python has reload. Isn't that a hot reloader? Isn't this an implementation of hot reloading a module? Well, reload does nothing but re-import the module. All it does is take a module and re-import it. So yes, this is technically hot reloading a single module, but you need a lot more before this is a hot reloader. I don't know how well that translates into other languages, but what I mean is that reloading a single module is very different from hot reloading an entire system, or components within an entire system. The reason for this is that dependencies are the enemy of a hot reloader, and Python modules have lots of interdependencies. All hot reloaders have one thing in common: they all leverage language or framework features that manage dependencies between things. In Erlang, for example, everything uses message passing. If you want to hot reload a component in an Erlang system, you can just bring it down and bring it up again. There are no direct dependencies between things; the dependency is message passing, which is quite a nice interface to hot reload on. CSS is not really a programming language, so you can just take it down, remove the style sheet from the page and add a new one, and the browser takes care of the rest. React.js has a hot reloader, and it leverages how React components themselves work.
React is all about removing components from a page and adding them again, and having React take care of laying out the page for you, or rendering the HTML. So hot reloading a component in React is just deleting the component and adding a new one, which is really quite nice. It's how React works. So imagine that you could write a hot reloader in Python. It's a little bit wordy: you import a function inside your module. So you have a module, your_module.py, and in it you write: from another_module import some_function. So you have a reference to that function in your module. You then replace the code in some_function with some new code. You've rewritten it, you fixed a bug or something. After your hot reloader kicks in, what does your_module.some_function reference? If it references the old code, then your hot reloader hasn't worked properly. It's not right. Okay, so you could go through and find all modules, all places that reference the some_function function. You could then hot reload those as well, and you could cascade through all the modules that reference the module that references that one, and go through the whole tree of objects. This just sounds complicated. It sounds really complicated, and it's impossible to do in the general case. For any given Python program, it's impossible to do that safely. For limited, smaller cases, it may work. For example, IPython has a hot reloader that works in a lot of cases, but it leverages how IPython is just a shell. You don't hot reload an entire program; you kind of hot reload parts of the REPL that you're using. And similarly, if you have a single reference to something, then you can hot reload that safely. You can use reload: if you have one reference to one module, you can call reload and replace the reference. That's hot reloading. That works. But to do it in the general case, you will end up with bugs.
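To make that concrete, here is a small self-contained demo of the stale-reference problem. The module names your_module and another_module are invented for illustration; the demo builds them on disk, imports one from the other, edits the dependency, and reloads it:

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep the demo free of .pyc cache effects

# Build a tiny two-module project in a temporary directory.
tmp = Path(tempfile.mkdtemp())
(tmp / "another_module.py").write_text(
    "def some_function():\n    return 'old behaviour'\n"
)
(tmp / "your_module.py").write_text(
    "from another_module import some_function\n"
)
sys.path.insert(0, str(tmp))

import your_module      # binds your_module.some_function to the old function
import another_module

# "Fix a bug" in the dependency, then hot reload just that one module.
(tmp / "another_module.py").write_text(
    "def some_function():\n    return 'new behaviour'\n"
)
importlib.reload(another_module)

print(another_module.some_function())  # 'new behaviour': the reload worked here
print(your_module.some_function())     # 'old behaviour': a stale reference
```

The reload re-executes another_module, but your_module still holds a reference to the old function object, which is exactly the cascade problem described above.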
And what's worse than an auto-reloader that doesn't work is an auto-reloader that you can't trust. If you end up with some bugs in development, hard-to-track-down ones, where something's not right, and it's because the hot reloader hasn't worked properly, that's a terrible development experience. You're going to spend time chasing bugs that don't exist. So how do we reload code in Python? We turn it off and on again. We restart the process on every code change, over and over again. This is kind of like refreshing the browser window every time you make a change to a JavaScript file. You lose all the state in the process, so you lose any connections that are open, et cetera, and it starts again fresh. This ensures that the program is correct. It works, pretty much, unlike a hot reloader where you might have some kind of bugs or you can't reload code properly. So this is how the Django auto-reloader works. When you run manage.py runserver, Django re-executes manage.py runserver again with a specific environment variable set. The child process actually runs Django: it runs the entire framework, it imports all your modules and does all the stuff that you want it to do, and it watches for any file changes. When a change is detected, it exits with exit code 3, and the parent Django process restarts it. If it exits with another code, it's an unexpected error, and the parent terminates or shows you a useful message. So it's quite a simple loop: you have a process that's kind of a supervisor, and it restarts the child process when it exits. This is the most common and the simplest form of an auto-reloader. So, a little bit of history of the Django auto-reloader. The first commit was in 2005. There were no major changes until 2013, when inotify support was added. kqueue support was also added in 2013, and it was removed one month later, which is never a good sign. I'll talk about what inotify and kqueue are later on.
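The restart loop just described can be sketched in a few lines. This is a simplified illustration of the pattern, not Django's actual code; the supervise helper and the RUN_MAIN marker variable are assumptions for the sketch:

```python
import os
import subprocess
import sys

def supervise(argv, extra_env=None):
    """Run argv as a child process, restarting it whenever it exits with
    code 3. Any other exit code is treated as a real termination."""
    while True:
        env = {**os.environ, **(extra_env or {})}
        exit_code = subprocess.call(argv, env=env)
        if exit_code != 3:
            return exit_code

# The parent would typically re-execute its own command line with a marker
# variable set, so the child knows to run the real server and watch files:
#   supervise([sys.executable] + sys.argv, extra_env={"RUN_MAIN": "true"})
```

The child does all the real work; the parent only waits for the conventional "please restart me" exit code.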
But the point here is that Django code is usually very high quality, and there's lots of emphasis on testing and readability. The auto-reloader, though, to me was definitely an old and crufty part of Django. The code was very different, purely because it was something that wasn't well understood. It kind of worked: don't touch it, leave it alone. The code was definitely not idiomatic, it was very hard to extend, and it was append-only code, right? Everyone's seen this: you kind of just chuck features on, you bolt them on and hope it works. So there were some new features that we wanted to add to the auto-reloader that just wouldn't have worked with the current implementation. So we needed to rewrite it. So, a summary so far: an auto-reloader is a common development tool. Hot reloaders are really hard to write in Python. Python auto-reloaders restart the process on code changes, and the Django auto-reloader was old and hard to extend. Okay? So, to the fun part: we're going to rebuild the auto-reloader. I like breaking things down into sections, so there are three or four steps. First, we need to find files to monitor. We can't reload on code changes if we don't know what code we're changing or what we need to watch. We need to wait for the changes, and we need to trigger a reload. We need to make it testable, of course, especially if you're refactoring an old implementation. And, bonus points, make it efficient. You shouldn't prematurely optimize stuff, so get it working and then optimize things. Cool. So, finding files to monitor. Everyone here knows sys.modules. It contains all the modules that are currently loaded by Python, and Python has quite a few modules. Just running a hello world in IPython, IPython has 642 modules loaded. Python itself, just importing sys and printing len(sys.modules), has 42 modules loaded. So there are quite a few modules. And sometimes things that are not modules end up in sys.modules.
sys.modules is effectively a dictionary, and it can be modified by arbitrary Python code. So some libraries do some crazy things, especially in development. For example, typing.io isn't a module, even though it's in sys.modules; it's a class. And this was actually a bug in the Django auto-reloader implementation: I naively assumed that things in sys.modules are modules, which isn't true. And Python imports are really dynamic as well. It's one of the most flexible and best parts of Python. You can import from zip files. You can import from .pyc files. You can write arbitrary loaders in Python to do random things on imports. One person wrote an importer, in about sixty lines of code, that imports code directly from GitHub. So you can do from github_com.username import project, and it will download the code from GitHub, make it available to Python, and it's there. Don't do this in production, but there's a lot of magic that can go into imports. They're not as simple as a file on the file system and a module in memory. The more common use cases for these kinds of loaders are things like pytest: pytest rewrites the bytecode of your test files, changing the assert keywords that you use into function calls that pytest can do things with. Cython as well, a library that lets you write C extension modules in a nicer syntax than C, can compile the module on import, which is quite handy in development, I guess. So yeah, there isn't always a mapping between a module and an actual unique file, or you could have two modules with the same file, et cetera. So what can you do if someone wants to import code directly from GitHub in development? You can't really do anything. The point here is that imports are very dynamic, and not all changes can be detected. But we can try our best to detect them. Here is a really simple implementation of something to list the files of all the modules that are loaded.
Each module has a __spec__ attribute, and that spec object has an origin, which is the path to the module's location, which can be a zip file, et cetera. All of these code samples are really simplistic. The actual implementation in Django is over 40 lines long; it wouldn't fit. I was actually going to include a slide with it on, but it just didn't work. It was too big. But this is conceptually what you want to do: iterate over sys.modules and return a list of all of the file paths we want to monitor. Pretty simple. So we've found the files we want to monitor; now we want to watch for changes and trigger a reload. All file systems report the last modification time of a file. There's a function, os.stat. You give it a file path and it returns a structure, and one of the fields on that structure is st_mtime, the last modification time of the file. We can use this to detect changes to a file. The important thing to know here is that the last modification time is pretty abstract. It can mean different things on different platforms and operating systems, and file systems can be weird. HFS+, which was the default file system on macOS before the latest version, had a one-second time resolution. There were no nanoseconds. On the previous slide, the timestamp included nanoseconds; HFS+ would just be to the second. Windows has 100-millisecond intervals, so files may appear to be in the future. On Linux, it depends on your hardware clock: the current time in the Linux kernel is cached in memory, and it's updated by some kind of clock, every 10 milliseconds normally. Python does a great job of abstracting operating system specifics away, but you really can't escape from the realities of the file system that you're running on. A case in point: macOS has a case-insensitive file system by default, which isn't something that you can abstract away.
So there could be different system calls, different ways to find the last modification time of a file on different platforms, and Python can abstract that away. What the actual modification time means, it can't abstract away. Network file systems can be even weirder, and they mess things up completely. os.stat is generally really fast, except on a network file system, where it could require network access. So if you're for some reason developing on a network file system, and that file system lives on the other side of the world, for whatever reason you want to do that, the stat could have a huge latency. Clocks might be out of sync as well. If you have two developers working on it, one clock might be completely wrong and one clock might be right. So you end up with one developer writing a file, the other developer reads the file, the auto-reloader kicks in, and the times are different: one year in the future, one year in the past. And the time can be set by anything. You can change the last modification time of a file arbitrarily; it doesn't mean that the file has been modified. And the mtime not changing doesn't mean the file hasn't been modified. The reason we use this, despite all these limitations, is that it's really easy to implement, it's generally efficient unless you're running on a really weird network file system, and it has pretty good cross-platform support. So here's a really simple implementation of an auto-reloader that uses stat. We have a function called watch_files. We have a dictionary that maps the file paths that we've seen to the modification time as reported by the file system. We have a while True loop, and we iterate through each of the files returned from the previous function that we wrote. We call os.stat on the path, get the modification time, and compare it with the previous modification time. If they differ, then we exit with exit code 3.
Otherwise we sleep for one second. Okay? Really simple. Obviously there's a lot more to this: what if the file doesn't exist, what if it's been deleted, et cetera. This again is a simplistic implementation. So we've found files to monitor and we can wait for changes. Now, how do we make it testable? When I was researching this talk, I went through and looked at a bunch of other projects that use an auto-reloader, and it surprised me: there were not many tests for auto-reloaders in the wider ecosystem. The Tornado project has two, Flask has three and Pyramid has six. Most of these are high-level integration tests. They're like: spawn a process, touch a file, assert that the process exits with exit code 3. The point here is not to shame these projects for not having many tests. The point is that this is a hard thing to test. Obviously these auto-reloaders work very well, and more tests doesn't always mean that something works, but it is a hard thing to test. And the reason is that an auto-reloader is generally an infinite loop that runs in threads and relies on a big ball of external state, which is the file system. Each of these things is hard to test by itself, but they're even harder when you combine them. So how do you make this testable? This isn't some crazy idea that I've had; it's just to use generators. If we make our auto-reloader implementation a generator, the only modifications we make are to add a parameter telling the function how long to sleep for, and to yield after each iteration of the loop. That lets you write slightly better tests. So this is a simple test: we create a reloader, which creates the generator, and we call next on it, which ticks. It does one tick of the loop, then it hits the yield and returns to the test. We fiddle with a file somehow, we mutate the state of the disk, then we call next again, and it should exit with exit code 3.
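Putting the stat loop and the generator idea together, a simplified sketch might look like this (real code also needs to handle deleted files and many other corner cases):

```python
import os
import time

def watch_files(paths, sleep_time=1.0):
    """Generator: yields after every tick, exits with code 3 on a change."""
    mtimes = {}
    while True:
        for path in paths:
            try:
                mtime = os.stat(path).st_mtime
            except OSError:
                continue  # deleted or temporarily unavailable, skip it
            previous = mtimes.get(path)
            mtimes[path] = mtime
            if previous is not None and mtime != previous:
                raise SystemExit(3)  # the supervisor sees 3 and restarts us
        yield  # hand control back: production loops on, a test can pause here
        time.sleep(sleep_time)
```

A test then calls next() once to record baseline times, bumps a file's mtime, and asserts that the second next() raises SystemExit with code 3.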
So we have a way to pause the auto-reloader, essentially, and it allows us to make changes to the file system and then resume it. You can extend this test to work with symbolic links, permission errors, files being intermittently available, et cetera. So we've made it a little bit more testable. Now, how do we make it efficient? Surprisingly, there are two slow parts to the auto-reloader in Django. The first one is iterating the modules, which surprised me, and the second one is checking for file system modifications. On an SSD, checking for file system modifications with os.stat is really quite fast. Iterating the modules every second was the slowest part, especially if you have a really large Django app with maybe 5,000 modules loaded. So how do we make it efficient? We can just use lru_cache. We have the function we wrote before, get_files_to_watch. It calls another function with a frozenset of all of the modules that are currently loaded. That function, sys_modules_files, takes the modules, has an lru_cache on it, and contains the same implementation that we had before. In reality, sys.modules can change, but after an app has booted, it doesn't really change that much. You might import something in a function, so it can mutate, but in the happy path it doesn't. So you can just cache the results of all of this, and skip all the processing of checking whether it's a zip file, resolving the symlinks, et cetera. It can all just be cached into a single list and returned without needing to iterate through the modules. In the old Django implementation, on this MacBook with a solid state drive, this took up 30% of the time of each auto-reloader tick, which was quite a lot. So, can we skip the standard library? Raise your hands: has anyone, during a debugging process, edited a standard library file? Okay, so it happens, but not very many people. Maybe a specific type of developer would.
In the general case, no one really does that. The average developer won't need to. So it would be quite good if we could just skip watching them. We could skip all of the system packages, all of the standard library. We don't need to watch them; they don't really change. This is actually a lot harder than it sounds, because how do we know where the standard library is? I googled it. I got to a Stack Overflow answer and I was like, okay, good, this is going to be simple. There were 20 answers, and each of them was different, which is never a good thing. The first one was site.getsitepackages(). That's cool, but it's not available in a virtual environment, so that's no good. We can call another function, which works, but it returns a single path, and some Linux distributions have more than one site-packages directory. So I went to IRC and asked. I was like, okay, I feel like I'm pretty experienced with Python, I've never needed to do this before, why is it so hard? Am I missing something? Someone linked me to a project, I think it was related to coverage. I couldn't find the code snippet for this, but it used five or six different ways to try and detect the standard library, and it fell back to checking whether site-packages is in the path of the file. So at this point, it boils down to risk versus reward. It might not be safe to do this in all cases. What happens if your project is called site-packages, for whatever reason? And if you make a mistake, then it's going to frustrate users: the auto-reloader won't work in all cases, and that's just not a nice place to be in. And no other auto-reloader I could find does this. The gains could be huge, you could reduce the number of modules you're watching by 70 or 80%, but it's not safe to do in the general case, so it doesn't get done. What you can do, though, is use file system notifications. Calling stat repeatedly is kind of wasteful. You're just asking, are we nearly there yet? Are we nearly there yet?
It'd be nice if the operating system could tell you when a file is modified. You say: tell me when this file is modified, and then you just wait, and the operating system tells you. Each platform has different ways of handling this. Watchdog is a Python library that implements five different ways and is 3,000 lines of code. File system notifications on operating systems are all directory-based, whereas we care about files, which makes it a little bit harder: you get notifications for any file in a directory that has changed, and you need to filter them to make sure they're only files you care about. Notifiers are also potentially expensive. They're generally designed for longer-term monitoring, for a daemon that's watching a bunch of files and performs an action when a change is made. In our flow, we're going to create and destroy them quickly. Every time a Python process shuts down and Django restarts it, it has to register new watches with the kernel, and it's going to use more resources than it should. So this is the actual feature that we wanted to add to the Django auto-reloader: using a system called Watchman, from Facebook. Watchman is a daemon that runs on your machine, and it handles all of the icky differences between platforms for you. You register watches with it, it does the right thing, and it returns changes to you over a socket. And it handles git changes, which is one of the reasons you'd want to add Watchman in the first place. If you check out a new branch in git, you're going to have hundreds of notifications flying at your process, telling you everything's been changed. But with this, it waits until the whole checkout has finished, and then it sends one single bulk update telling you that the process is finished.
Otherwise, what might happen is your process receives "one file has been changed" mid-checkout, it restarts, and it's going to be in an inconsistent state if the checkout is still happening after the Django process has restarted. And the daemon can be shared with other projects. So if you have, say, a JavaScript project that also uses Watchman, and quite a few of them do, they can share the watches and generally make it more efficient. This is how we do it, in pseudocode, with Watchman: we connect to the Watchman server, we tell it what files to watch, and in the while True loop we just tell it to wait. This waits on a socket for a message from the Watchman daemon, and if there are any changes, we exit with exit code 3. This way, we don't write any platform-specific code, and we don't have any issues with weird macOS versions that don't support a particular library, or something like that. Cool, so we've made it efficient as well. So, the aftermath. The code was much more modern and easy to extend. It was faster, and it can use Watchman if available. There were 72 tests. This is in Django now, and it's no longer a dark corner of Django. I might be a little bit biased in saying that, seeing as I wrote it, but it was certainly, in my opinion, a little bit better. So it's all good. I'm a genius. Worked first time, tests were green, ship it, et cetera. Everyone's happy. Not quite. These are all issues from the Django ticket tracker after we released the new auto-reloader in version 2.2. There were quite a few of them, unfortunately. So more tests doesn't always mean that it works. This is my favorite issue, and the issue is essentially that it doesn't work on Windows without using Watchman. It fails intermittently. I want to highlight this because it's a great example of how you can make what seems to be a really simple optimization that makes sense, and have it completely backfire in a way that you don't understand.
In the Django implementation that I discussed before, we might be watching for a file that doesn't exist yet. For some Python files in Django, simply appearing counts as a change. For example, models.py: if you were to create a directory with a models.py and add that file, the stat reloader, on the first tick where it detects the file, doesn't pick that up as a modification, because it's the first time it's seen it. Only on a second modification, where it can compare the previous modification time to the current one, does it reload. So I was like, okay, that's a corner case, I need to fix it. So we store the time of the last loop, and if the previous mtime is None, which means we haven't seen this file before, and the modification time of the file is greater than the time of the last loop, then we reload. Okay? This doesn't work on Windows 25% of the time, and I could never work out why. The process would restart and it just wouldn't work, but then you would restart it manually and it would work, and on all other platforms it worked fine. If you know Windows and you want to tell me why this is, please do, because it keeps me up at night and I don't know. But the point here is that you get all kinds of strange behavior across different operating systems, different disks, different configurations, and simple optimizations can bite you. So keep it simple if you're writing your own, and keep it really simple. In conclusion: don't write your own auto-reloader. Use this library. This is the library from Pylons called hupper, and it's a fantastic library. In the abstract of this talk, you may have seen that I was going to present a library that I wrote myself, which took all of this knowledge and distilled it into a library. This is that library, that someone else has written, probably better than I could.
So check it out. If you're writing your own framework and you want to add an auto-reloader, it's really good. Cool. I'd just like to thank Onfido, the company I work for; they're paying for me to come here and give this talk. We're in the business of identity verification. It's a really interesting problem space, from the theoretical, like what is an identity, to the more practical, like how do you handle millions of identity checks as fast as possible with as little fraud as possible. So if you're interested in any of this, or in Onfido specifically, come talk to me afterwards, send me an email or check our careers page. Any questions? We have time for a couple of questions here. So, does your auto-reloader handle it properly if editors do weird stuff when saving files, like creating a copy first and then renaming? Can you say it again? I'm sorry. Like, many editors nowadays do safe saving, so they don't overwrite the file but rather create a new one and then replace it. So Watchman handles that for you quite nicely: it looks at common patterns where you create a separate file and then do an atomic move. The stat reloader handles that as well, because it doesn't watch for the new file, the .new file, which is then moved. As far as it knows, the original path has been changed, but not the other one. Okay, thank you. Hi. If restarting the process isn't really an option, let's say you have plugins for an application, and you can kind of control how the code in the plugin looks because you're defining the API, would you say reloading without restarting is possible, or just don't? So the exception for a hot reloader is a plugin system where you have a single reference to that plugin, or you control the plugin, and you know that you can safely delete the reference and re-import it. It doesn't always work if you have, for example, a C extension module that that plugin relies on.
It might have some initialization code; you can't really safely hot reload those at all. So it depends. You can write a hot reloader for some specific cases, and a pure-Python plugin system is definitely one of those, but if you run into weird issues, it's safer to just restart the process. So a good implementation might be both: if you can somehow diff the changes and work out what needs to be updated, you could hot reload simple changes and then fall back to a restart if needed. Thank you. Cool. All right, one super quick question. There you go. I tried Watchman years ago when it just came out. Is the API better now, or easier to use? The API is better-ish, but it's still a little bit harder to use than I would have liked. The simplistic code that I showed, where you register a file, is nothing like what you actually need to do. It's directory-based, so you need to work out the set of files, which common directories you want to watch, minimizing the number of directories that you do watch. It doesn't take care of any of that for you, but in general it's quite nice. In the simple case, you say "watch this" and you just get notifications on a socket, and it provides utilities for filtering out specific files, regular expressions on the files, et cetera, in a way that's quite cross-platform and takes a lot of code off you. But it's definitely more complicated than I would have liked. All right, let's thank Tom again. Thank you very much.