Hi everybody, thanks for coming. My name is Ram. This is a talk about an open source project I did a few years back called PySnooper. PySnooper is a debugging tool, and this talk is about showing you how it can be useful for you, and also a little bit about what goes into making a popular open source project. I've been doing open source for years, and I've always tried to make a project that's going to go viral. Most of them failed. I had only a couple of successes, and this was one of them. So I tried to analyze what went right there.

About myself: I've been a Python activist for a long time, and a software developer. My two popular projects are PySnooper and PythonTurtle, and I've contributed to some of the big projects like CPython, Django, PyPy and a bunch more. Now I'm working full-time on research in machine learning. We're using machine learning to understand our society. It's a field of machine learning called multi-agent reinforcement learning, and that's now the thing that's most interesting to me. So, shameless plug: I'm going to give a lightning talk about my research. Please come. It's going to be the first one today.

So this talk is about a debugging tool. I'm going to show you the GitHub page for PySnooper. Feel free to mock me for using Windows. So this is the GitHub page for PySnooper. I posted it online back in 2019 and it got super popular. It has 15,000 stars, for a project I worked on for just a few weeks, so I was very happy with that. I put it on Hacker News and it went viral there. It got to the front page. Lots of people tweeted about it, posted it on Reddit and starred it, and I was very happy to have made something popular.

So let's talk a little bit about debugging, and then I'm going to try to explain what PySnooper is and how it can be useful to you. I want to say, if you have questions, unlike in the other talks, I'd appreciate it if you just ask them during the talk.
So feel free to interrupt me and ask questions about anything.

So let's talk about debugging. Let's say I've got a piece of code that I'm running and it's not doing what I think it should be doing. Either I get an exception or the result isn't what I expected. I'm a big fan of classic, old-school debuggers. This is Wing IDE. Most of you have probably never used it, but it's basically similar to PyCharm or VS Code. I've got a piece of code here that's sort of my case study: it takes a number and converts it into binary, into bits. Let's say I want to understand what's going on here. I can put a breakpoint anywhere, run the code, and the debugger is going to stop on that line. I can press F6, in the case of Wing IDE, and it's going to go step by step through the lines, and at every step I can find out the values of the variables. I can ask what the value of number is, or what the value of remainder is, and if I wanted to, I could also run any kind of code to modify things there.

For me this isn't anything new. I used tools like this when I was developing in C and Pascal as a child. So this is very classic. I love this tool. I use the debugger whenever I can. It's amazing. But when I started working professionally as a developer in Python, I looked at the people around me and they weren't using debuggers at all. They were using print statements in the code. Whenever they wanted to understand what was going on, say there was a bug, they would add more print statements or log statements, or temporarily delete them, and look at the output. At first I couldn't believe it. This is such a great tool. Why wouldn't anyone use it? So this is my explanation for that. Debuggers are awesome.
So many tools, it's so powerful, but getting it set up is a nightmare. I've shown you a debugger running on just one file, on my computer. That's easy: I just press F5 and it runs. But if you're developing at work, you're not running the code on your own computer. It's on a separate computer, maybe in a Docker container or a VM, or on a different operating system, or in a different country. And getting a debugger to connect from your computer to that remote computer is very difficult. I've done it a lot of times. I can even say I'm an expert at setting up remote debuggers, and I think it sucks. It's so difficult. The source mapping always fails. Something always fails. I've done that when I worked for companies, but most people don't want to do that.

And I think that's part of the reason why many people, when something doesn't work for them, end up doing this. Is this text visible? Let's make it bigger. Okay. They add a print statement, they see the output, and they get a clue about what's going on in the function. Then they understand that they don't have the output they wanted, so they add more print lines. Then they understand that they wanted to expose more variables than they did. And when you run it that way, you get an idea of what's happening in the function. So this works.

So debug-by-print works. But I have very mixed feelings about it. On the one hand, I hate it because it's such a crude tool. You have to go in there and put in the print statements yourself. Just the fact that you have to modify your code to debug it is somewhat offensive to me. You have to put in these print calls, then you have to reproduce the problem, which can take a few minutes, right? Then you run it and look at the print output to inspect what happened. And then you realize you haven't exposed the variables you wanted to expose, so you go back and do it again.
And this sort of back and forth, I was seeing developers do it all the time, and it was so sad. So here was my thinking. On the one hand, we've got the classic debuggers, which are very powerful but difficult to set up. On the other hand, we've got debug-by-print, which is very weak but easy to set up, which is why everybody's using it. So I said, I'm going to make a compromise. I'm going to create a solution that is, okay, not as powerful as a debugger, but pretty powerful, more powerful than print, and as easy to set up as print. That's PySnooper.

Let me demonstrate how to use it. I'm going to delete these old print statements. So I imported pysnooper and decorated my function with pysnooper.snoop. There are a bunch of options that are possible here, but I'm going to show them later. Now when I run the code, I get this. Scroll up. So basically, instead of playing the game of putting a print statement here or a print statement there, it's like going all in. It's like saying: I want a print statement everywhere. Every time something runs, I want to know what happened. So yes, it does create a huge dump of text. It's basically similar to set -x in Bash. You can think of it as set -x for Python. And it's nice to get that text. It's a bit big, and it can be difficult if you've got a big function, but it's like an automatic log of everything that happened: every line that ran, and every variable. Every time a variable gets declared, it prints the value of the variable. Every time a variable gets modified, it prints the new value. So basically it's sort of like an X-ray where you see exactly what happened in your function. Instead of the back and forth with print, just put PySnooper on it and see the whole thing.
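The demo described here can be sketched roughly as follows. The function is a reconstruction of the number-to-bits case study from the talk, and the `except ImportError` stand-in is an assumption added only so the sketch still runs where PySnooper is not installed (in that case nothing is traced).

```python
try:
    import pysnooper
except ImportError:
    # Stand-in for this sketch only: a do-nothing decorator factory, so the
    # code below runs (without any tracing) when PySnooper is missing.
    import types
    pysnooper = types.SimpleNamespace(
        snoop=lambda *args, **kwargs: (lambda function: function)
    )


# With PySnooper installed, every executed line and every new or modified
# local variable of this function is logged to stderr.
@pysnooper.snoop()
def number_to_bits(number):
    if number:
        bits = []
        while number:
            number, remainder = divmod(number, 2)
            bits.insert(0, remainder)
        return bits
    else:
        return [0]


print(number_to_bits(6))  # [1, 1, 0]
```

The only change to the function itself is the decorator line, which is what makes it about as cheap to set up as a print statement.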
And of course you can do all kinds of fancy things. With the first argument you can direct the output to a file and then see it in the file instead of in the shell. You can also put a callable there, and PySnooper will call it instead of printing.

More cool options. One cool option is watch. It's basically the same as a watch expression in a debugger. I put a variable here, but I'm already getting the variables for free. I can put any kind of expression in there. If I put sum(bits) and run it, it's going to track sum(bits), and every time the sum of the bits changes, it's going to show the current sum. So any kind of Python expression, I can put there, just like in a debugger.

Any questions so far? Feel free to interact with me. Just step up to the mic if you have any. I'll repeat the question. Okay, okay. Actually, I can't understand you.

Yeah. So my question was, there's sum(bits), which is a string.

There's a what?

The sum(bits), if you go...

Yes. I'll bring it back.

Yeah. So it's a string. Can it also identify if there is a syntax error in that?

That's a good feature for a strong, powerful tool, which PySnooper is not. PySnooper is a cute toy, you know what I'm saying? But you know what, in the spirit of experimentation, let's try. It failed, which I guess is what you want, right?

Yeah, correct. Because sometimes people make a typo by mistake, like sum can become sm or something.

Yeah. Another useful feature is depth. I'll actually have to add more code here to make it meaningful. So let's say I have some code here that calculates the mean of a few numbers.

Okay, I'll repeat the question. The guy was saying, playing devil's advocate, that I am actually modifying the code.
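Stepping back to the options shown before the question, here is a minimal sketch of the output target, `watch`, and `depth`. The names `captured_lines` and `average` are hypothetical, and the `except ImportError` stand-in is an assumption so the sketch runs without PySnooper installed (in which case nothing is captured).

```python
import statistics

try:
    import pysnooper
except ImportError:
    # Stand-in for this sketch only: a do-nothing decorator factory.
    import types
    pysnooper = types.SimpleNamespace(
        snoop=lambda *args, **kwargs: (lambda function: function)
    )

captured_lines = []  # hypothetical sink; any callable taking a string works


# First argument: where the trace goes (a file path, a stream, or a callable).
# watch: extra expressions, given as strings, to track alongside the locals.
@pysnooper.snoop(captured_lines.append, watch=('sum(bits)',))
def number_to_bits(number):
    if number:
        bits = []
        while number:
            number, remainder = divmod(number, 2)
            bits.insert(0, remainder)
        return bits
    else:
        return [0]


# depth=2 also traces functions called by this one (here statistics.mean);
# the default, depth=1, traces only the decorated function.
@pysnooper.snoop(captured_lines.append, depth=2)
def average(numbers):
    return statistics.mean(numbers)


print(number_to_bits(6))
print(average([1, 2, 3, 4]))
print(len(captured_lines), 'trace lines captured')
```

Passing a file path as the first argument instead of a callable gives the static artifact that can be inspected later without rerunning the program.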
And I said that I don't like the way that, when you use debug-by-print, you modify the code, and I'm doing the same thing with PySnooper. You are correct. It's the same disadvantage as print. And you also asked whether there is a way to enable PySnooper without modifying the code, and the answer is, unfortunately, no. Sorry.

Another useful feature is depth. Depth basically means going deeper into the functions that you're tracing. Actually, let's try depth=1 first, which is the default. With depth=1, we only get the lines of the current function. But if I use depth=2, it's also going to trace any function that my function calls. Right here I'm calling statistics.mean, and the whole call to statistics.mean is traced here. It's nicely indented so I can see that it's a different function. It also says where it got the source. And I can basically go crazy. Let's do depth=10. Actually, this isn't even very crazy. But yeah, depth=10 is not easy to read. The nice thing is that it's a static artifact that you can save in a file and inspect later, without having to run your program again and again and again.

Let's see if there are any more interesting options. Yeah, I guess thread_info is cute: if you're running multi-threaded code, it's going to show the thread ID for each line that runs. So if you have multiple threads going on, you're going to see which thread is running which line. So much for options.

Now, some people may be wondering how this magic works. How do I tell Python, "please insert a print statement everywhere"? It's pretty simple. There is a function called sys.settrace. It's part of Python, part of the sys module. You tell it: here is a trace function. Please call this trace function whenever a line runs, and please give the trace function some metadata about what ran.
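A minimal sketch of that mechanism, using only the standard library. This is not PySnooper's actual code; `run_traced` is a hypothetical helper that records each trace event for one function instead of printing it.

```python
import sys


def number_to_bits(number):
    if number:
        bits = []
        while number:
            number, remainder = divmod(number, 2)
            bits.insert(0, remainder)
        return bits
    else:
        return [0]


def run_traced(function, *args, **kwargs):
    """Run `function`, recording (event, line number, locals) for its frame."""
    events = []
    target_code = function.__code__

    def trace(frame, event, arg):
        # Python calls this for each new frame ('call'). Returning the same
        # function asks for that frame's 'line' and 'return' events too.
        if frame.f_code is not target_code:
            return None  # do not trace frames of other functions
        local_values = {
            name: repr(value) for name, value in frame.f_locals.items()
        }
        events.append((event, frame.f_lineno, local_values))
        return trace

    previous_trace = sys.gettrace()
    sys.settrace(trace)
    try:
        result = function(*args, **kwargs)
    finally:
        sys.settrace(previous_trace)  # always restore the previous tracer
    return result, events


result, events = run_traced(number_to_bits, 6)
for event, line_number, local_values in events:
    print(event, line_number, local_values)
print('result:', result)
```

Comparing the recorded locals between consecutive events is enough to report new and modified variables, which is the kind of output described earlier in the talk.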
I mean, what kind of line ran, whether it was a return from a function or an entry into a function, stuff like that. Lots of the code intelligence tools that you use, like debuggers and code coverage measurement tools, just use sys.settrace. And if you want to write your own code intelligence tool, you can do the same. CPython basically provides the actual machinery for that.

Okay, so let's talk not about PySnooper itself, but about the experience of making a popular open source project. I said before that I've made a bunch of open source projects over the years and many of them failed. Many of them I worked on for months, and I was so proud, and I posted them on GitHub and nobody cared. I got like two stars, from my brother and from some random person on the internet. And this one succeeded, and I also have another one that succeeded. So I'm going to try to share what I did, because I know other people might also be interested in making their open source project more popular.

Lots of the things I'm going to say sound super obvious, but people somehow miss them. Basically, if you have an open source project, it should be very easy for people to discover what it's about, how to use it and how to get started. When you go to the GitHub page for a project, you expect to see an example of usage, what the project is, what problem it's solving, how to install it and stuff like that. And the frustrating thing is, I think everyone can agree that it's obvious, but so many projects don't get that quite right. They don't understand that the user has 30 seconds of attention to read the thing and see that it doesn't suck, so they can start using it. I'm going to show the GitHub page in a sec.

Also, I want to say that now that I'm getting into research, I'm using more of the data science packages in Python, and I'm seeing that the standard there is lower.
If you look at the Django, Flask, Requests world in Python, they've got it down pat. They understand that there has to be a quick start, that the GitHub page has to be super clear. The data science world is a wild west. You can see a package where the GitHub page doesn't show how to use it, or where there's an example and you try to run it and there's a syntax error in the example, in the example on the README. So I do wish that more people would uphold these rules.

Let me show you the page for PySnooper. So here's the page for PySnooper. Now I'm thinking as a marketer. I came up with this tagline, "Never use print for debugging again", which already explains the pain point and what I'm trying to solve. Then there is an explanation, including the comparison to set -x, which makes it very easy for people coming from Bash to understand it. This is actually too much text. The explanation of what the problem is is too long, but it did work. People liked the way I wrote: "What makes PySnooper stand out from all other code intelligence tools? You can use it in your shitty, sprawling enterprise codebase without having to do any setup." I just find the pain point and say, this is the thing I'm solving. People like that. And then, you know, just an example of usage, with the output. And of course, installation instructions. So, pretty straightforward, but so many projects miss that.

About the example of usage, there's something I've noticed in the data science world. Sometimes you have a package, and where they would usually have the example of usage in the README, they say, "check out the examples folder to see different examples of the way it's used." And I've recently worked with a package like that, and it was hard to understand.
What I'm saying here is, if there is an examples folder with eight different examples, it feels like you're giving people more, but you're actually giving them less, because now I have to go to the examples folder and I don't know which one is the best one. I'm going to try to choose, and one of them isn't going to work. And then I'll complain to people, and they'll say, "Oh, that one sucks. Try the other one." Having one official example that works, front and center, is something very valuable that I think every project should have.

If you make an open source project and you want it to get popular, I recommend posting it on Hacker News, Reddit, Twitter, your blog. I post it everywhere at once. I also try to get a few friends to upvote it. It's the only dark pattern I recommend. Sometimes it succeeds, sometimes it doesn't, but it's a very nice trip when it does succeed. Are there any questions on that so far? Cool.

The very last thing I'm going to talk about is PuDB. This is a different debugging tool, not totally related, but since we're talking about debugging, and it's an awesome tool that many people don't know about, I'm going to demonstrate it. I said that PySnooper is a sort of compromise between classic debuggers and print calls. PuDB is sort of a compromise between a GUI and a CLI, because it's a debugger that kind of feels like a GUI, but it's actually a CLI, so you can use it in the shell. You can use it over SSH. Here's what it looks like. Come on, don't embarrass me. Okay, I'll use my backup Raspberry Pi. Let's see. Hope it doesn't suck.

Okay, so I've got the same code here, the same function with the bits. Let's try to use it with PuDB. If you're old like me and you worked with Borland programs in the 90s, it's going to look very familiar to you. So I've got my code here, and I can run it like a debugger. Now I'm pressing a key to tell it to continue to a certain function.
Basically the same kind of demonstration I did with the debugger in the GUI, I can do here in the shell. I can go into the shell and ask for the values of variables, I can travel up and down the stack, and I can ask what the value of number is. I can ask for bits, I guess, or bits doesn't exist. Oh, it doesn't exist yet. I can run any kind of Python code I could run in the shell. So basically it's sort of like a toy debugger in the shell. I think it doesn't do multi-threading or multi-processing, which is a shame, but it's still a very awesome tool. The fact that you can just SSH to a remote server and have a debugger like that is awesome. The difficult thing is understanding how to use it with the keyboard. If you press the question mark, you get all these keyboard hotkeys. Memorizing them is the difficult part, but if you succeeded with Vim, you can succeed with this one. You have the features of setting breakpoints, going up and down the stack, setting watch variables. So it's a pretty cute thing. Any questions about PuDB? Cool.

I think this is all I have to talk about. Again, shameless plug for my lightning talk about my research. It's the first lightning talk today. I hope it's going to be very interesting. You can check out my research site. That's the first link. And you can sign up to get updates about it at the second link. So thank you very much for listening.

So if anyone has questions, please line up in front of the microphone. And if there's anyone online who has a question, please let the operator know so that we can pop you up on the screen.

Thanks for the talk. I'd like to ask: sometimes when I debug, I have many variables in the function, but I only care about how one of them changes. I think it would be very spammy to snoop and see ten when I only care about one. Is there a way to filter out the unwanted ones?

I understand, but that doesn't exist.
Sorry. I was very tempted. I got a lot of feature requests like yours when I was developing it, and I made a conscious decision to keep it as simple as possible. Another decision I made is to not have any dependencies, because lots of people are going to use it on old, shitty code on Python 2. Basically, I want it to be a sort of thing that doesn't have too many features, but works everywhere with as little headache as possible.

Thank you.

You're welcome.

Sorry, I was a bit confused by PuDB. What is the difference from PDB? We use it quite heavily, and it seems to be working quite well for us, so why should I use PuDB?

So PDB, I haven't used it in years. The last time I used it was maybe five or ten years ago, and I hated it so much. I don't know if it has changed since then, but when you use PDB... let's see if there's a screenshot of it or something. Come on. Okay, I can't find a screenshot, but anyway, when you use PDB, you can't see the code, and it's like a sort of text-based adventure game from the 80s. You always have to say: yes, I want to continue. Yes, I want to quit. Yes, I want this. Personally, I like seeing what's going on. So with PuDB, the fact that you just see the code, see the stack, see everything at once, for me that's a huge deal.

Thank you.

You're welcome.

How does it look if you're interacting with extensions or built-ins, that sort of thing?

It just skips them, same as a real debugger. I love good questions.

But I really like it. I'm just curious, would you recommend using it operationally, as a kind of prettier traceback?

Like in production?

In production. You know, I know this thing fails a lot, so I put it in production. I wouldn't want it to print out this trace of values except if there's an exception.

No, I wouldn't recommend it. It would be hard. But it's conceivable. Yeah, it is conceivable. It isn't a crazy idea, right? Maybe in some cases you would want to print all that.
One thing that works well in production, that I really love, is the Django debug page. And what's it called, the online service that does the same thing as the Django debug page? Sentry. The nice thing is that when there's an exception, they show the entire stack, and for each level in the stack, they show all the local variables. That is insane for production bug reporting. That's what I like. At one of the previous companies I worked for, I basically implemented that: I just took the code from Django and used it. So every time there was a production bug, we would get all the levels of the stack and the local variables for each. That was awesome. So much time saved in investigating errors.

Yeah, we are using it.

Awesome.

I'm not sure if this is a question or a feature request, but it would be great if you could, using PySnooper, step over the lines: after each line that it prints out, it would stop and drop into a REPL, as an option, for an interactive view of the code like in a debugger. Would that be possible?

It would be possible. It's called a debugger. I wouldn't really want that in PySnooper. But thanks anyway.

I've read a bit of your documentation, and I've found an option you do have.

Wait. Start over. Be close to the mic.

So I found an option that you have implemented, which is to disable debugging with an environment variable.

Yes, that's right. Let me show it.

Oh, yeah. I've got a question about it. Does it disable at the trace level, or does it just disable the output?

Hey, repeat your question.

So when you disable the debugging with the environment variable, does it disable it at the trace level, so nothing runs, or does it just disable the output?

The trace. I think there is just logic in PySnooper that tells it not to run if that happens. You know what? Let's even look at the code. Let's see. Ah, yes. Okay. Now I see the answer. I understand your question. Yes, it does disable it at that level.
It means that PySnooper basically becomes a no-op.

That's brilliant. Thank you.

You're welcome.

Do we have any questions online? Don't be shy, online participants. Okay, go ahead.

I guess it might be out of scope considering the previous feature request slash question. You said that you can go deeper in the call stack with depth. And I guess there is no way to filter some of the calls, so as not to get the whole thing?

Filter? What do you mean by filter?

Like, for instance, I want to go deeper only for one module, but not for the statistics module, or, I don't know.

Yeah, yeah, that's right. That is also too esoteric for me to implement.

Okay.

But I recommend grep.

And you would just grep, like, one line? Or do you have a prefix per module?

No, no, you're right. Grep would be difficult because there is just one source line. You're correct.

I think we have time for one more question, if you want to come up.

And the question was whether PySnooper works on async code. That's something I kind of looked into, but there wasn't enough demand and nobody wanted to implement it. So this is actually a feature I would accept. If someone presents the use case for async and implements it, I'll merge that. So go ahead.

Okay. Let's thank Ram one more time, everyone, warmly. Thank you.