Our next speaker is Elisabeth. She works for JetBrains on the PyCharm IDE, which a lot of people probably know, and she's working on the Python debugger and the data science tools in that application. She's going to tell us about the hidden power of the Python runtime, how to retrieve useful information from the Python runtime and the build tools. Elisabeth, could you share your screen? Great.
Thank you very much. I hope everybody can see my slides. So today we will learn a lot of new things about the Python runtime. As you have already heard, I'm a software developer at JetBrains. I'm working on the PyCharm IDE, the most popular Python IDE. Here's my Twitter handle, and feel free to join the Discord room. There is a link to the slides there.
Python is a very simple and beautiful language, but a very big part of its power is hidden from the user and available only during execution, only at runtime. You might not even know about it, but you already use this power every day. For example, we run tests every day. The two most popular ways to do it are the built-in module unittest and the popular test framework pytest. When your code raises an assertion error, unittest will just show you that the assertion error was raised and the place where it was raised. But if your code raises an assertion error under pytest, it will show you not only the location of the error but also the values of the variables you tried to compare. Where did pytest get this information? Of course, from the Python runtime. Today we will learn how the Python runtime works, how you can get a lot of useful and interesting information from it, and how different development tools can use this information to make developers' lives better.
Let's start with learning the basic concepts. When you want to use some objects in your Python program, you usually create them explicitly. For example, you use an assignment statement to create a variable, or keywords like def or class to create functions and classes.
But when the Python interpreter executes your code, it creates not only the objects you declared explicitly. It also creates a lot of different utility objects which contain information about the current execution state. The most important of them is the stack frame object. A stack frame object represents a program scope. It contains information such as the corresponding code object, the local and global variables in the current scope, and a lot of other data. Frames are stored in a stack-like structure. The bottommost frame is called the module frame. When the Python interpreter executes the program, for example here, execution is on line 7 and the interpreter calls function foo. It creates a new stack frame and puts it on top of the other frames. Then it executes the function. When it's about to leave, it removes this newly created frame from the stack, returns some data to the previous frame, and execution in the previous frame continues. This runtime machinery and the concept of the call stack are the same for many different languages. But the major difference between them and Python is that not so many languages expose this runtime information out of the box. In Python you can access the frame object right in your code and work with it like with any other Python object. In the next part we will learn what you can do with it.
You can get a Python frame object with the function sys._getframe. It takes an argument depth, the number of calls below the top of the stack. So if you want to get the current frame, you should pass zero to this function. When you have this frame object, you can inspect which interesting data is stored inside it. First of all, a frame object contains a dictionary of local variables, where the keys are their names stored as strings and the values are the variables' objects. In addition, a frame contains information about global variables. Global means global for the current module.
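As a minimal sketch of the frame attributes just described (the function names `show_current_frame` and `caller` are made up for illustration), the snippet below grabs the current frame with `sys._getframe(0)` and reads its locals, globals, code object, and the link to the calling frame.

```python
import sys


def show_current_frame():
    answer = 42
    # Depth 0 means "the frame of the function that is running right now".
    frame = sys._getframe(0)
    return {
        "function": frame.f_code.co_name,            # name stored in the code object
        "answer": frame.f_locals["answer"],          # local variables, keyed by name
        "same_globals": frame.f_globals is globals(),  # module-level globals
        "caller": frame.f_back.f_code.co_name,       # previous frame on the call stack
    }


def caller():
    return show_current_frame()


info = caller()
print(info)
```

Note that `sys._getframe` starts with an underscore: the documentation describes it as an implementation detail of CPython, so other interpreters are not guaranteed to provide it.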
That's great, but to be honest, we don't need a frame to get to these dictionaries, because there are built-in functions locals and globals which return exactly the same dictionaries. The good news is that this is not the only interesting information stored inside a frame object. In addition, a frame object contains a link to the current code object, which also stores a lot of interesting data. A code object represents a chunk of executable code, but it differs from a function object because it doesn't contain a reference to the global execution environment. The easiest way to create a code object is to call the built-in function compile, like you can see here. And, for example, you can evaluate it by calling the built-in function eval. You can see here we evaluated the result of this a + b piece of code, a very, very small piece of code.
Okay. What does a code object know about your code? First of all, it knows the file name where it was created. It knows the name of the function or module where it was defined. It also knows the names of the variables which are used inside this piece of code. In addition, it contains the compiled bytecode, the instructions which were generated by the interpreter. So if you want, you can use the standard module dis to disassemble it and read the instructions generated by the Python interpreter, just for fun.
Let's return to our frame object. In addition to the code object, it also contains, for example, the current line number which is being executed in your program, the tracing function, which we'll discuss later, a link to the previous frame, and a lot of other things. Talking about the previous frame, as you remember, our frames are stored in a stack-like data structure. This is very useful, for example, for printing a traceback. When some exception appears in your code, you have seen this many, many times: Python shows you this beautiful text.
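The code object attributes mentioned above can be explored with a short sketch like the following. The file name `"<example>"` is an arbitrary placeholder passed to `compile`, not a real file.

```python
import dis

# Compile a tiny expression into a code object.
code = compile("a + b", "<example>", "eval")

print(code.co_filename)  # file name recorded at compile time
print(code.co_name)      # name of the function or module the code belongs to
print(code.co_names)     # names referenced by the code

# A code object has no globals of its own, so we supply them when evaluating.
result = eval(code, {"a": 1, "b": 2})
print(result)

# Print the bytecode instructions generated by the interpreter.
dis.dis(code)
```

The exact instructions printed by `dis.dis` differ between Python versions, so the disassembly is best read as an illustration rather than a stable format.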
And the link to the previous frame is exactly the thing which helps the Python interpreter print this information for you. I've mentioned the most important data stored inside frame and code objects, but there is also the standard module inspect, which has a lot of different functions for inspecting frame objects and your program scope. The most important thing you should remember, if you decide to do something with frame objects, is that you should explicitly delete the frame variable when you're leaving the scope. If you don't, the frame's local variables dictionary will hold a reference to this local variable, and the local variable is a reference to the frame object itself. So there will be a cycle of references, and that's bad for memory management, because, as you know, Python uses reference counting for memory management, and this cycle will be removed only by the garbage collector, which will happen much later. So it's better to delete this local variable explicitly.
Okay, now we know a lot about the Python runtime and how we can get this information in Python. Let's learn how different development tools can use this information, and how we can use it in our everyday lives. As you remember, at the beginning of this talk we mentioned the assertion error, which is very convenient with pytest because it shows you the real values which you tried to compare in your assert statement. Let's try to understand how pytest does it. Where does pytest get this information? Every exception object has a link to a traceback object. It's stored in the __traceback__ attribute. A traceback object has a link to the corresponding frame object, and as we already know, a frame object knows everything about your current program state. So what can we do?
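The chain from exception to traceback to frame can be sketched as below. The helper name `failing_frame_locals` and the `divide` function are hypothetical, chosen only for this illustration.

```python
def divide(numerator, denominator):
    return numerator / denominator


def failing_frame_locals(exc):
    """Return the function name and local variables of the frame that raised `exc`."""
    tb = exc.__traceback__
    # Follow tb_next to the last entry: the innermost frame, where the error happened.
    while tb.tb_next is not None:
        tb = tb.tb_next
    frame = tb.tb_frame
    try:
        return frame.f_code.co_name, dict(frame.f_locals)
    finally:
        # Release the frame reference explicitly instead of leaving it to the
        # garbage collector.
        del frame


try:
    divide(1, 0)
except ZeroDivisionError as exc:
    name, local_vars = failing_frame_locals(exc)

print(name, local_vars)
```

Copying `f_locals` into a plain `dict` keeps the values but lets go of the frame itself once the helper returns.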
Let's define a function which takes an exception object as an argument and tries to get the names and values of the variables used inside the assert statement on the line of code where the exception was raised. We have the exception object, so we can get the traceback object, the frame object, and the corresponding code object. From these we can get, for example, the line number where the exception was raised, and even the source code, the string representation of the source of our code object, with the help of the module inspect. We can call the inspect.getsource() function. Now we have the source code as a string and the line number, and, again with the help of a standard module, ast, we can find the variable names used on this line of code. I don't show this function here because it's quite big, but it's rather simple. You just need to walk through the abstract syntax tree and find the variables you're interested in. You can find this code in my repository with code samples. After that, when we know the variable names and we have the frame object, as you remember, we can find these variables in the local variables dictionary and just print them to the output.
So now, how can we use this function? If an assertion error was raised inside our code, we can pass the exception object to this function, and it will print the variable names and their values right to the output. If, for example, you want to log some errors and understand some exceptions without integrating with pytest, you can use our new function. Of course, the pytest implementation is much more powerful, but our small prototype shows how it works.
The second tool we're going to consider today is the debugger. As I've already said, I've been working on PyCharm's debugger for several years. That's why I know so much about the Python runtime. Let's learn how debuggers work.
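Going back to the assertion helper described above, a minimal sketch might look like this. It is not the code from the speaker's repository and not how pytest is implemented: the names `explain_assertion` and `check` are hypothetical, and it reads the failing line with `linecache.getline` rather than `inspect.getsource` to stay short. It assumes the whole assert statement fits on one line. The demo writes a small module to a temporary file so that the source line can be looked up.

```python
import ast
import linecache
import os
import tempfile


def explain_assertion(exc):
    """Return {name: value} for the variables on the line that raised `exc`."""
    tb = exc.__traceback__
    while tb.tb_next is not None:  # innermost entry: where the assert failed
        tb = tb.tb_next
    frame = tb.tb_frame
    line = linecache.getline(frame.f_code.co_filename, tb.tb_lineno).strip()
    # Collect every plain variable name that appears on that line.
    names = {n.id for n in ast.walk(ast.parse(line)) if isinstance(n, ast.Name)}
    found = {}
    for name in sorted(names):
        if name in frame.f_locals:
            found[name] = frame.f_locals[name]
        elif name in frame.f_globals:
            found[name] = frame.f_globals[name]
    return found


SOURCE = '''
def check(total, expected):
    assert total == expected
'''

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_checks.py")
    with open(path, "w") as handle:
        handle.write(SOURCE)
    namespace = {}
    # Compile with the real path so the failing line can be found on disk.
    exec(compile(SOURCE, path, "exec"), namespace)
    try:
        namespace["check"](3, 4)
    except AssertionError as exc:
        values = explain_assertion(exc)

print(values)
```

In an ordinary script or test module the temporary file is unnecessary, since the failing code already lives in a file that `linecache` can read.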
Modern Python debuggers are based on two main functions, the tracing function and the frame evaluation function. A tracing function is set for a frame and traces all the events which happen in your program. While your program is being executed, events arrive at the function. Inside this function you can analyze these events, and depending on them the debugger can decide whether it should suspend the program in this place, continue execution, or, for example, step into a function. As you can see, the trace function takes three arguments, and the frame object is one of them. The frame evaluation function is executed before entering your frame, and again you can see it takes a frame object as an argument. A debugger based on the frame evaluation function can be implemented the following way. You can insert the breakpoint's code right into the code object of the function, and when execution comes to this place, it just calls the breakpoint code and the debugger stops there. So there is no need to analyze every event in a tracing function. You can just quickly stop in the place you're interested in. If you're interested in this topic, you can check my PyCon US talk. It was about the new frame evaluation API, which appeared in Python 3.6, but it is also relevant for other versions of Python.
Okay, what is interesting for us today is that both these functions take a frame object as their argument. That means we can get a lot of information from this frame object. For example, thanks to the frame object, the debugger knows the file name and line number where the Python interpreter is executing our code, and it can decide whether it should suspend the program in this place or not. In addition, the debugger can use the local variables dictionary to show variables' values to the user, and it can use the f_back attribute to show stack frames to the user. Again, we start with the current frame.
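The tracing function described above can be sketched with `sys.settrace`. This is only an illustration of the three arguments and of reading the location from the frame, not how PyCharm's debugger is written. The names `tracer`, `target`, and `executed` are hypothetical. A real debugger would compare the file name and line number against its breakpoints instead of just recording them.

```python
import sys

executed = []  # (function name, line offset inside the function) pairs


def tracer(frame, event, arg):
    # Three arguments: the current frame, the event name, an event-specific value.
    if frame.f_code.co_name != "target":
        return None  # do not trace other functions
    if event == "line":
        offset = frame.f_lineno - frame.f_code.co_firstlineno
        executed.append((frame.f_code.co_name, offset))
    return tracer  # keep receiving events for this frame


def target():
    x = 1
    y = x + 1
    return y


sys.settrace(tracer)
try:
    target()
finally:
    sys.settrace(None)  # always switch tracing off again

print(executed)
```

Recording the location on every `line` event is also, in essence, what a line coverage tool needs from the same hook.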
We can iterate through these links to the previous frames and show the user the stack frames for the current location where the debugger is suspended. Great. Now we know how a debugger uses this runtime information. Let's move to the next tool.
The next tool is code coverage. Coverage shows you which lines of your code base were executed. It's very useful, for example, to run your tests with code coverage, because it helps you understand which lines of your code are covered with tests and which are not. So you can improve the quality of your tests, and that will make your project much more stable. The most popular code coverage library is Coverage.py. Look, they have an extremely cute mascot, Sleepy Python. Coverage.py also uses a tracing function, which we've already seen in the section about the debugger. And again, as we already know, it takes a frame object as one of its arguments. So it can use the file name and line number to get the location which is being executed, record it somewhere, and later show it to you in the coverage report, which is pretty simple and really cool.
The next tool is actually a group of tools. They are tools for runtime typing. Here is the list of the most popular ones. What do they do? PyAnnotate by Dropbox: you can run your code base with this tool. It will record all the function calls inside your code base, record the types of the arguments of your functions, and later generate type annotations for every argument of every function which was called. MonkeyType by Instagram is very similar. It also records the types of the arguments of your functions and later generates stub files, which can be used by your text editor or IDE to help you write higher quality code. And collecting runtime type information in PyCharm: it's also very similar to the previous tools, but it's integrated with the debugger.
So when you run the debugger and this option is enabled, it also records the types of your arguments, and later it suggests using this information when you want to generate a docstring right inside PyCharm. Let's try to understand how these tools work. PyAnnotate and MonkeyType are both based on a profiling function. You can see it's very similar to the tracing function, at least in its arguments. The main difference is that a trace function traces every event in the program, while a profile function is invoked only on call and return events, not on every line. So it will be called when we call a function or enter a new scope. It's logical to use it here instead of a tracing function, because we are interested only in call events: we want to record the types of function arguments and we're not interested in other events. Collecting runtime type information is integrated with the debugger, and as we already know, the debugger has access to a frame object, so we can get information from the frame there as well.
Okay, in each of these tools we have access to the frame object. How can we get information about the types of our arguments? That's quite simple, to be honest. From the code object, we can get the argument names of the function which was called. After that, again, we can find these variables by their names in the local variables dictionary. Then we have access to the objects, and that means we can get their types. Now we know the file name and line number where this call happened, and we know the variables' names and their types, so we can record them and later show them either in a stub file, in a type annotation, or inside a docstring. This is also really cool, and it will help you generate type annotations automatically just by running your code. This is very useful.
Okay, we've learned some interesting facts about different popular tools, but let's try to create something new, something that didn't exist before.
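The argument type recording described above can be sketched with `sys.setprofile`. This is an illustration of the idea only, not the implementation used by PyAnnotate, MonkeyType, or PyCharm. The names `profiler`, `greet`, and `recorded` are hypothetical, and the sketch handles plain positional arguments only.

```python
import sys

recorded = {}  # function name -> {argument name: type name}


def profiler(frame, event, arg):
    # "call" is reported when a Python function is entered; ignore everything else.
    if event != "call":
        return
    code = frame.f_code
    # The first co_argcount entries of co_varnames are the positional argument names.
    arg_names = code.co_varnames[:code.co_argcount]
    recorded[code.co_name] = {
        name: type(frame.f_locals[name]).__name__ for name in arg_names
    }


def greet(name, times):
    return ", ".join([name] * times)


sys.setprofile(profiler)
try:
    greet("hello", 2)
finally:
    sys.setprofile(None)  # always remove the profile function again

print(recorded["greet"])
```

A real tool would also store the file name and line number from the code object and merge the types seen across many calls before generating annotations.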
There are two ways to execute tasks concurrently in Python inside one process: threads and asynchronous tasks. You can start a new thread with the help of the standard module threading. You can do it like this. For synchronization between threads, there are synchronization objects, and the most fundamental among them is the lock object. A thread can acquire a lock object, and that means the following block of code will be executed by this thread only, until it releases the lock. Lock objects are also context managers, so you can work with them using the keyword with. Running more than one thread and using lock objects can sometimes lead to a deadlock. A deadlock is a situation when you're waiting for resources which can't be released. The easiest way to reproduce it is the following. Take two threads and two lock objects. The first thread acquires the first lock. The second thread acquires the second lock. After that, the first thread wants to acquire the second lock, but it's unavailable, so it starts to wait. And the second thread wants to acquire the first lock, but it's also unavailable, so it also starts to wait. The problem is that they will be waiting forever, because this situation can't be resolved without interrupting the program. This is really sad because, of course, we don't want to have this situation in our program, and the second problem is that it's really hard to detect deadlocks in big projects.
But we will try to help people who are fighting with deadlocks. As you remember, we used the sys._getframe function, which returns a frame object for the current thread, but there is also sys._current_frames, which returns the topmost stack frame for each thread. What does it mean? It means that we can create our own tool, let's call it thread handler. This tool will live in a separate thread, and it will print the tracebacks of all the threads in the process at some interval.
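A single snapshot of such a tool can be sketched as below, assuming hypothetical names `dump_all_threads` and `worker`. The demo does not create a true deadlock: the worker simply waits for a lock that the main thread holds for a moment, which is enough to see a blocked thread's stack. A periodic version would call the same function in a loop from its own thread.

```python
import sys
import threading
import traceback


def dump_all_threads():
    """Return {thread name: formatted stack} for every thread in the process."""
    names = {t.ident: t.name for t in threading.enumerate()}
    report = {}
    # sys._current_frames() maps each thread id to that thread's topmost frame.
    for ident, frame in sys._current_frames().items():
        stack = "".join(traceback.format_stack(frame))
        report[names.get(ident, str(ident))] = stack
    return report


lock = threading.Lock()
started = threading.Event()


def worker():
    started.set()
    with lock:  # blocks here while the main thread holds the lock
        pass


with lock:
    thread = threading.Thread(target=worker, name="stuck-worker")
    thread.start()
    started.wait()  # make sure the worker is running before taking the snapshot
    report = dump_all_threads()
thread.join()

print(report["stuck-worker"])
```

If the same stack shows up for a thread in snapshot after snapshot, that thread is a good candidate for being stuck.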
And if the tracebacks of some threads don't change for some time, we can see that they're stuck in some place. We can look at their tracebacks and understand: okay, they're acquiring locks in a different order, and we can quickly fix it. Pretty simple idea. We could implement it, but there is a problem. This tool is already implemented inside the standard library. There is the module faulthandler and its function dump_traceback, which dumps the tracebacks of all threads into a file. It's implemented natively in C code and, well, it already works. Everybody can use it to detect the location of deadlocks in their code base.
Okay, but as you remember, there is a second way to execute tasks concurrently inside Python: asynchronous tasks. From the user's point of view, asynchronous tasks are very similar to real threads. For example, there are very similar synchronization objects in the asyncio module, so there are asynchronous locks. That means there is a place where we can apply our knowledge of the Python runtime and create a tool which will help us detect asynchronous deadlocks. It will work in a very similar way. There is a function, all_tasks, which returns all the running tasks in the current loop. And each task has a method which returns the list of stack frames for this task. That means we can implement our asynchronous fault handler. We will start a separate thread, and in an infinite loop, at some interval, we will dump the stack traces of all the tasks in this loop. We can do it this way. We're iterating over the tasks and printing their tracebacks. And again, if at some moment we see that some tasks are stuck in some places and their tracebacks aren't changing, we can look at the stack traces and try to understand why it happens. Are they waiting for some lock objects which will never be released, or not? That's really great.
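The task inspection step can be sketched with `asyncio.all_tasks` and `Task.get_stack`. This is a simplified, single snapshot taken from inside the event loop rather than the separate thread with an infinite loop described above, and it is not the code from the speaker's repository. The names `describe_tasks`, `stuck`, and `stuck-task` are hypothetical.

```python
import asyncio


async def stuck(lock):
    async with lock:  # waits here: the lock is held by main() in this demo
        pass


def describe_tasks(loop):
    """Return {task name: [(function name, line number), ...]} for tasks in `loop`."""
    report = {}
    for task in asyncio.all_tasks(loop):
        # get_stack() returns the frames where the task's coroutine is suspended.
        report[task.get_name()] = [
            (frame.f_code.co_name, frame.f_lineno) for frame in task.get_stack()
        ]
    return report


async def main():
    lock = asyncio.Lock()
    await lock.acquire()  # hold the lock so `stuck` cannot acquire it
    task = asyncio.create_task(stuck(lock), name="stuck-task")
    await asyncio.sleep(0.01)  # let the task run until it blocks on the lock
    report = describe_tasks(asyncio.get_running_loop())
    task.cancel()  # clean up the waiting task before leaving
    return report


report = asyncio.run(main())
print(report["stuck-task"])
```

For printing, `Task.print_stack` produces a traceback-like listing directly, which is closer to what a periodic dump would write to a log.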
We implemented our own asynchronous fault handler, which will help us detect asynchronous deadlocks.
Okay, today we've learned a lot. We've learned that the Python runtime is very powerful. Python allows you to easily get the stack frame object and the corresponding code object and inspect them. We've also learned that there are a lot of development tools which use this information, help you write your code, and make your life much easier. I hope that after today's talk you will start using runtime development tools, or you will use them more often if you already do. And maybe, who knows, you will create something new, your own tool which uses Python runtime information. Here are some links: the repository with the code samples which I showed you today, a blog post based on this talk, and also feel free to contact me by email or on Twitter. And of course, come to the Discord channel. I'll be there, ready to answer your questions. Thank you very much for your attention.
Thank you very much, Elisabeth, for the nice talk. Excellent. Let me play the applause. We don't have any questions in the Q&A and I also don't see any in the chat, but I have a question. Since you are working on these debugging tools, you must know the differences between the different Python versions. Has anything much changed in recent Python versions in terms of frame access or the debugging tools in general?
Not much in debugging tools in general, I think. But I know for sure that the frame evaluation function signature was changed in Python 3.9. Yeah, a new argument was added, but it didn't affect the debugger, so it still works. Apart from that, I think nothing else changed. So debuggers still work. They work with different versions of Python. And of course, the PyCharm debugger works with different versions of Python. So if you don't use it yet, give it a try.
Debuggers are really cool tools which help you find bugs in your program and also help you understand your program's execution.
Okay, thank you very much. Excellent. Thanks again for the talk. And yes, enjoy the remaining part of the conference.