 Hey everyone, welcome. So yeah, so this talk is called, Stop Using Print, Using the Logging Module. And that's true. I am a full-time Python and Pandas trainer. That's all I do every day. I go to companies all around the world, although a little bit more virtually since this pandemic you might have heard about. And I help teach the people Python. If they don't know it, I teach it. If they know some, I teach them more. If they know how to do exercises, I have newsletters, I have books. I have an online boot camp. My latest thing is something called Bamboo Weekly, which is Food for Thought and Food for Pandas. And if you like that joke, you're gonna love this talk. All right, so a question for the audience. Have you ever written buggy code? Probably a few of you, right? So let's dig in and write some already. Why wait? Why do it on office time when we can do it at a conference? So here we're gonna write a short decorator, not decorator, we're gonna write a generator function for Fibonacci sequence, which is undoubtedly something you do every day at work. And this generator function is gonna take an argument, n, and that's gonna tell us how many elements of the Fibonacci sequence we wanna return. First is zero, second is one, and then we're gonna iterate over that range. And one by one by one, we're gonna return the first, and then we're gonna do a little switcheroo, and we're gonna say first thing gets what was previously second, second gets what was previously first time second, and yeah, there's a little bit of a bug there, because if we run this now, well wait, before we run this, what the heck is this syntax here? Print star Fib of three, right? This is meant to be a beginner level talk. What am I doing throwing generator functions and splats at you right at the beginning? And this is definitely the hardest code that we're gonna throw in here, but it's just like one of my favorite things. And basically Fib is a generator function. When we call it, we get back a generator, which is something that's iterable, meaning that we can put it in a for loop. But I wanna take all the things that I'm gonna get back from that generator and hand them to print. So I put a star in front of it, and that sort of automatically unrolls it. It automatically runs the for loop on it. We're gonna get each of those elements there, one by one by one. By the way, don't do this with an infinite generator unless you are paid by the hour. And so if we go back here to our generator function and we run it, print star Fib of three and we get O, one, O. And we do star Fib of five, we get O, one, O, O, O. And this is where we say O, O. This does not work so well because the actual Fibonacci sequence is of course, you know, zero, one, one, two, three, five and so on and so on and so forth. So, uh-oh, we have a bug. But wait, hooray, we have a bug because that's sort of the whole point of this talk that we wanna understand what's going on. So now what? Well, option number one, which I constantly see among my students who come from the worlds of and part of me for using offensive language here, C, C++, Java and C-sharp, right? I know, I know obscene language is usually frowned upon in talks. But basically those people are like, wait, there's a problem. I know, I'll use the debugger. And they throw out the, you know, take out the debugger and the good news is with the debugger you can go deep into the code, you can put in breakpoints, but the bad news is first of all, if you're like a lot of us, you have to relearn how to use the debugger each time. It slows things down, it interferes with the normal running of your code and you can't really use this in production so much. So here's another option, and this is one that I heard about for years and actually last year at EuroPython, I learned more about it from the author from Ram Rakhum. This is something called PySnooper. It's a decorator you can stick on top of your function and it prints things out, it prints out the timing, it prints out the variables. It does a lot of the things that I would wanna print out to understand what's going on. But again, it interferes with the actual running of your code and you probably don't wanna use in production. And so we get to option three, which is probably the most popular option among an awful lot of Python developers, including me, which is print. I know, I'll just throw print at the problem and if I put in enough print statements, then, well, it'll look like I'm busy because my screen is full and my boss will see that, but moreover, I will eventually understand what's going on and then I can fix the bug. So I don't know what a variable value is, I'll print it, I don't know what a function is, when a function is being called, I'll print it. I wanna double check a value's type, I'll print it. I wanna conditionally display something, I'll print it. And by the way, F-strings are especially useful for this because I can stick all sorts of things in there and the new syntax with F-strings with the equal sign at the end, before the end curly brace, it was designed for people like me who like using print. So let's do that. Let's go back to our buggy Fibonacci generator and let's use print. And so I'm gonna say here, what is first and what is second? And I'm gonna say I'm about to yield first and after assignment and it does work, right? It does work, it tells me exactly what's going on here and so you do this for a while and you're like, wow, this is fantastic, I should do this a lot because it is kind of intuitive. You can throw anything at it because anything you put in print is turned into a string and everything in Python knows that it turned itself into a string and I can hand it as many arguments as you want and if I wanna use end and set and have things formatted nicely I can and if I want to I can even use the amazing rich package and make things look even nicer. But print is also awful. What if I want the output to go to a file and not the string? And yes, yes, there is the file keyword argument that can pass to print and have it go to a file but then I have to think and who wants to think when they're programming? I mean, come on. Do we want these printouts to be displayed in production? Probably not. What if we wanna print some things but not others? Oh, now I've gotta have like conditions in there. What if I wanna format all of my print statements in a uniform way? Now, the phrase that I learned from the pragmatic programmers and which I use all the time is dry, right? Don't repeat yourself. Don't repeat yourself, right? And the whole point is you don't wanna have code the same code multiple times or almost the same code multiple times in the same program. You wanna put it in one place or dry up that code as we sometimes say and wouldn't it be nice if I could do that for all these things that I wanna print out? Well, folks, you've come to the right place because you want the logging module. That is what this module is designed to use. And if you're like me, you've heard of the logging module and you've been like, oh yeah, that's that super complex thing I have to use for super big projects but I'm never gonna use it for my tiny little thing because come on, it must be really hard to use. So the good news is that logging is part of the standard library, right? So it comes with every single download of Python. So you don't need to worry about people installing it and it solves all these problems. Now for simple things, it's just a little bit harder and you're gonna see like a teensy weensy bit harder than print, but it makes complex things much easier and it lets you move into those complex situations much more easily as well. So let's do a little hello world of logging. So I'm gonna say here import logging, right? Because it is in the standard library but remember, standard library means it comes with Python. It does not mean that it's automatically imported. You still have to import the module and once it's available, then I can say logging.error and yes, that is a function or a method, if you prefer, but it's a function I think and then we say the string that we wanna print out and sure enough, right when I run this code I see output on my screen. What is that output? Well, that output looks like this. It says error, root, hello world, three different parts. The first part is the log level. We're gonna talk about that more in just a moment. The second part is the logger we used. We're gonna talk about that too. And the third part is the most recognizable part which is the message that I actually wanted to print out. So if you do nothing else, you can just, I'm just gonna go back and slide here, you can just replace every call to print in your program more or less with a call to logging.error and it will display things on the screen and you lose very little and you potentially gain quite a bit as we're gonna see and now you get to see the arrows again, exciting. All right, so what's with the log levels? So I call the function logging.error and it turns out, I mean, as you probably know, when you want to print things on the screen, not everything, ooh, not everything is equally important. Some things need to be printed out all the time. Some things just in development and sometimes like, you know, if you just wanna flick the switch for turning on debugging. And so what I used to do before I discovered logging was I sort of artificially did this. I would have like a global variable that was an integer and I would just sort of crank that integer up or down and say, if my debugging is a five, God, I keep doing that, all right? If my debugging level is a five or above, then I'll print this, but if it's three or above, I'll print that and I had all these if statements all over my code. Well, I can take advantage of the log levels to get exactly the same thing with zero if statements in my code. And so there are five different log levels that are predefined, critical, error, warning, info, and debugging. Critical is the highest priority and debug is the lowest priority. And we thus have five different functions that we can use. So when I import logging, I can say logging.critical, logging.error, logging.warning, logging.info, logging.debug. All these are functions, they take strings and they will display output on the screen. So if I have this code here and I run it, I will get, oh, I'll only get three lines. Well, you know, 60% is pretty good in coding, right? So basically, I see the log level on the left there, critical and then root, which is my logger and then the message hello world. So I see critical error in warning. Notice something about them, let's go back for a moment. It's the top three, the three highest priority ones. And that's because by default, we only see warning and above. By default, if I say nothing else, then the others info and debug are not displayed and that's so that you don't have all this debugging information clogging up the output from your system in production, which is kind of good. But we can change this if we use some configuration. If you're like, oh no, this is where logging gets bad, right? It's really tough configuration. Yes, it is. It is a single line of code. And we say logging.basicconfig and then we pass the level we want. Now, this is where things get a little bit tricky. There are five log levels. For each log level, there is a function with that name in lowercase. However, if you are referring to the level, then it is capitalized. So logging.debug, all lowercase, that's a function to log something and debug the level. Logging.debug all capitalized, that's the level that you specify as the threshold for whether something should be printed. So if I say now, logging.debug, then all of the messages that I printed before will be displayed and I can just crank this up or down depending on if I'm in development or production or development hell. Okay, we're already better off. So I can use logging, error and logging critical for important stuff. I can use warning info and debug for less critical things and then depending on what stage of development I'm at, I can change that configuration. Pretty good. Here's another thing though. Print normally displays things on standard out. And this is one of those things that takes a while to sort of get when you start developing that things displayed on the screen can actually be coming from two or be sent to printed to two different outputs that just appear in the same place normally. You've got standard output and standard error. By default, they both appear on the screen together but you can redirect one or the other two different places. So if you've redirected standard out to a file or some other place, that's fine but standard error will still show up on the screen which is usually a good thing because you wanna make sure errors are actually displayed and in your face and not with the output that's gonna go to your users eventually. So logging normally sends things a standard error which means it has a much higher chance of being displayed on your screen or wherever you're sending your error. So it's not redirected along with standard output and you can redirect it separately if you really want. Okay, so let's get some more interesting output. So I'm gonna say import logging, basic config, logging.debug, I want everything. I get the user's name and then I'm going to display it. I'm gonna say logging.info, user's entered name and I'm gonna use that fancy schmancy fstring stuff that I mentioned earlier. And as I like to say so often, unfortunately this works. Well, what's wrong? Don't use fstrings, I know. I love fstrings, they're really great. I was just in the talk about fstrings and they are truly, truly amazing but it turns out that when you are using the logging module, fstrings are bad. Why are they bad? Well, the problem is that fstrings are evaluated when they're created and so they're gonna take whatever variable values, whatever expressions you have inside the curly braces and they're gonna evaluate them once at string creation. Which means before we even decide whether we're gonna log anything, it's gonna use that computational power. Even worse, and this is bad, someone can feed you a very bad piece of information, a long piece of information and create some sort of denial of service attack by handing big information there and you putting it in your log. And imagine like a lot of times you're gonna be logging stuff from the user, from the network, from external stuff. Hey, I got this input from the user and it was weird, let's log that. Yeah, now it's really weird and you're looking for a new job. Okay, I'm exaggerating a little. You're looking for a new job and you're looking for a defense attorney. Now the thing is, you might think that this is like totally exaggerated and I admit that I'm exaggerating a little bit but it turns out I checked and I think it was logged for J and Java that had a huge denial of service attack of this sort that apparently cost some incredible amount of money across multiple companies to fix. And much as I would like to say, oh, well, that's a Java problem, no, this is a problem of people taking data from the outside world and using it willy-nilly and not understanding the implications. Okay, so if I'm not using F-strings, what am I using? I'm using a time machine and I'm going back to use the percent sign just like we used to do back in ancient times in Python, similar to that actually. Right, where basically I'm gonna say here, right, I'm gonna say user entered percent S inside of my string and then I'm gonna pass it right name. I'm gonna say here, this is the variable that I wanna put in there. Okay, not so bad. Well, let's talk about multiple loggers, right? I mentioned before that by default we have the root logger but we can actually set multiple loggers each one for a different part of our system. So I'm gonna have a back end logger and a UI logger, algorithms logger and each logger then basically can have its own output in different places and just be identified differently. So if I have, for example, one back end logger and one database logger, I just create them using get logger and a different string and every time I use the same string I will get the same logger back in different places in my program. So I'll create the back end logger. I'll create then and I'll use it and then when I say back end logger.error or back end logger.debugger, whatever it's gonna be, it's gonna say this came from the back end as opposed to if I say database logger.error and indeed here we can see the output. We see that some of them came from the back end, some of them came from the database and we can differentiate here between the different levels and the different loggers. Pretty great. But what if I don't wanna write to the screen? What if I wanna write to a file? Well, one easy way to do it is just say basic config and pass another keyword argument, file name equals and give it the string there and that will work just fine and then it will all be written to the file that I specify. But this is an example of something we call a handler and a handler basically tells the logging system where it should send that data, what it should do with it. It could go to the screen, it could go to a file and I can specify which handler I want to use and I can specify it and then have this very fine grain control over each logger and where it goes. Now, this is easier to do if I get rid of basic config. So we're now moving into like using the big guns and making it more complex but it's just breaking apart what we had in basic config. So now what I'm gonna do is I'm gonna get a logger and then I'm gonna say, I'm gonna set the level for that logger. This is something we did before in basic config and then I'm gonna add a handler for that logger and this is gonna be a file handler and then I'm gonna have a separate logger with a separate level and a separate handler and now when I send something to the backend logger it has one threshold and it's gonna go to a file and if I send it to the database logger it's gonna have a totally different threshold and go to the screen. And so I can do this, I can have two loggers with the same handler. You can mix and match them so they can both go to the same file if you really want. Right, and now they'll both go to that same place. I can even have, if I want, two loggers with three handlers because I can really mix and match it and one of them can have, two loggers can share one handler that goes to the same file and then I can have a separate handler that goes on that second logger that will then go to the screen as well. So here what we're saying is my backend logger should go to a file and then it should also go to the screen here. Here we go, add handler, logging, stream handler. So when I log, I get it in all these different places. Notice how I can also set the level per handler, not just per logger. So I can get even more fine-grained there and say, oh wait, if I'm sending stuff to a file it had better be super important. Now stream handler and file handler are perhaps the most useful ones in a day-to-day basis but there are lots of handlers defined in the standard library for all sorts of different things you might wanna do. Right, you might wanna send email about it. You might wanna just keep it in memory. You might wanna queue it up so it doesn't take time dealing with the file system or the network or all that other stuff. Well, wait a second. What if I want all my logs to look similar? Then I'm gonna use a formatter object and a formatter is basically a fancy way of saying let's create strings in a particular template and then all of my messages will look the same and you know what I'm gonna say also, right? That we can have multiple formatters and attach them in different ways and that's true. So I'm now gonna have one logger with two handlers and two formatters. This is as complex as it's gonna get folks. All right, so I'm gonna have two different formatters here. One which just says, oh here's the time and here's the level and here's the message sort of like we would get by default but my F2 formatter says pay attention and then says time level and message. You know if the message says to pay attention we will and then I can attach them in different ways. I can say oh yes, this logger is gonna have a file handler and we're gonna set its level, we're gonna set its formatter but this one is gonna use the other one. See my stream handler will only be for critical things and it's gonna say pay attention, wake up, you've got a problem. Yeah but what if I wanna format it really nicely and fancy, I could use rich, right? Rich is this fantastic package. It turns out that rich has its own handler that you can import and use and then it displays things really nicely so all I have to do is say, right, FH equals rich handler and we're done. Well actually that's not true. That's not true because I actually need to pass one little piece of additional thing when I log, I have to say extra markup equals true and then rich will do its thing. I can even pass a second argument to my formatter saying how I want dates to be formatted and I can use strf times formatting codes which I'm sure you all know super, super well without ever looking them up, right? One last thing we can do is we can use filters and a filter basically lets me dry up that logic. Do I want to actually log this message or not and I can check all sorts of different things. All I have to do is define a class, simple class in that class that I find one method called filter and that filter has self and the filtering record and it then has access to all the fields in the thing we're trying to log. Not only the level, not only the handler, not only the line number and the file and the module and on, on, on, on. And then we just attach it. We say oh, I'm going to attach that filter with add filter and now every time we log, right, we'll find out what's going on so we can see self, see the record, we attach it and then yeah, I know this is ironic, I'm using an f-string here but you know, come on. And if you don't want to write a whole class you can just write a function. It turns out that any callable's okay there and if you put a function there it'll work exactly the same way. Okay, configuration, one last little bit. Until now we've been configuring a piecemeal. Either I do it in basic config and I pass it a few arguments or I say set this, add that, set this, add that. I can actually define a dictionary for what version of the dictionary, the logging formatting I want and then here are my formatters, here are my handlers, here are my loggers and here's the nice thing. I can put this in a module and then load it whenever I want and have it there all the time. And then all I have to do is load it up. I say logging.config, disconfig, bloop, and more sound effects and then I get it and it's great. By the way, if you want to log exceptions, logging has that. I can say logger.exception, right? And if I want to tell people you should avoid dividing by zero, if you put that inside of an accept it knows what to do, it's going to log it appropriately, right? I have it inside the acceptor, I say logger.exception and in my log file it even says here it does all the stuff that I would expect to have out there. Now this is all nice but I have left out one little thing. I know I'm running out of time but that's okay, I'll just talk faster. No, no, one last little thing which is how do I then distinguish between development and production, right? Do I really want to go into my code and say I am changing the code to be configured for development or I'm changing the code for production, my QA people will probably be, shall we say, displeased if I'm like oh, it's just a small change to the code it just changes how the logging works, don't worry about it, you don't need to run it through CI again. Yeah, they might not believe you. So what you can do is you can actually use an environment variable and then check that environment variable, are we in development, are we in production? Something like this, os.getenv, development mode and if so, if that's non-null because the default is to give us back a none, if it's non-null or non-zero then we'll go through and do our logging debug. Otherwise we'll do our basic config with logging.error as per usual, right? And so you can call this of course whatever you want, right, I'm not gonna come check. And so if we wanna go back to where we started all this with our Fibonacci generator, I can put at the top there my get em, see if we're in development mode, are we? Great, now we can actually debug what the heck is going on and try to understand it. And so if I run it, I say development mode equals one and then the code and look at this, it's gonna print debug, root, running here and I see exactly what's going on before and after I do the switch, the root and assignment and I quickly debug the code and I'm the hero in my development team for the day. If your development team does Fibonacci sequences. Okay, so in summary, I hope you see them that using the logging module is not hard, it's not that complex and it's just about as easy as using print but has much, much, much more flexibility. You choose what you wanna write, when you wanna write, you can have conditions in there, you can format it nicely, you can mix and match things in incredible, incredible ways, you can centralize your configuration and that allows you to concentrate on your code and your debugging and not on oh my God, I've got yet more things that I have to do there. One last little thing is, if you go to the documentation for the logging module on python.org, it's got a lot of great stuff there but especially they've got a logging cookbook which walks you through many, many, many of the most important things that you should know and a lot of patterns that people use that you can pick up and understand and take advantage of. All right, thanks so much folks. If you have any questions and if we have time, I'm happy to answer them. If not, I'll stick around here and be happy to answer them as well. Thanks so much. Well, what a dynamic speaker. Thank you, Luan. Now we open the floor for questions. We have five minutes, so keep your questions brief and to the point, they're standing microphone. If you sit closer to the standing microphone, just go to the microphone. Otherwise, raise your hand, I come over and give you the microphone. Go ahead. Thank you. I was, I didn't really get the practical purpose of a filter. Why would I want to filter further than just the level? So maybe you wanna do it only at, okay, I'm gonna come with some ideas, some which are crazier than others. Maybe you've got time sensitive stuff there and so you only really wanna do some logging like at night when your system isn't running like rather when you have customers there. Maybe you wanna check and see, look in the message even, do some regular expression parsing and if it has to do with this product rather than that product because you know that the bug is only being tickled when someone orders widgets as opposed to a thing of a box. So you can, now again, those are two quick examples of those out of my head. I'm not saying you're gonna use a filter every day but when you needed that power is there. And I think part of the power of the logging module is it has all these pluggable options that you can sort of pick and choose when you need them. That makes sense, thanks. Thank you, great call, great to talk everything. Whenever we talk about performance optimization we usually say that don't optimize until you know that it's too slow and you really need to change something. But then when it comes to string formatting and logging it's like everybody says never use F strings, never use string formatting, that's super slow. Is it really an issue where I should tell my team please go back to this percentage sign thing? What do you think about that? So I was kind of skeptical about those claims as well and I just, I really love F strings, I really do. They're like one of the best things that's happened to Python the last 30 years or so. Give or take. Look, as always it depends, right? If you're on a tiny little project and you're not getting user data, then it doesn't matter. But the larger the project, the more time performance it is and the more you're asking for data from the outside world, the more I would say it's not even as much a matter of the lack of performance. It's more a matter of you don't know what people are gonna be giving you and they can really, really mess you up with some sort of DOS. And it was the fact that this was such a huge problem for, again, I think it was logged for J a number of years ago and caused just incredible pain that really woke me up to, oh, this is an issue. I mean, if you're having performance problems with your logging, then from everything I understand I haven't actually used it, but using the cues instead of the actual, like the direct handlers is probably a smart way to go. But this is like a different performance problem. So gently nudge your developers not to use it. Great, thank you. What if I want to add another logging level if I'm not satisfied with it predefined? So I'm pretty sure that each of them has, is defined to be a number and that there are spaces between them. So I'm pretty darn sure you can do that yourself and they did that with that in mind. That said, it's one of these things where in theory that makes a lot of sense. In practice, a lot of the stuff in the logging world assumes you're using these five standard levels so you just need to realize if you add one, a lot of tools might just sort of not know what to do with it. But that trade off might be worth it for you. Well, thank you. It seems to me that there are two kind of use cases for logging. One is more of a monitoring situation where you would leave your logging in the code and when you put it into production you can put your error message or whatever into a file. And the other is the debugging situation where you started off talking about the print. And I'm wondering if you're using logging for that, would you then, after you've solved your problem, do you leave your logging there or do you think, oh, this is ugly now, I'm gonna remove that as you would with the print statements? I would probably then remove it and then regret it. So like, I mean, it sort of depends. Probably the right way to do it is not remove it right away. And then after you see that the problem has actually been resolved, then say, okay, we're gonna clean this up or maybe bump it down a level so it just won't log. And then do it in a staged approach there. But yeah, I agree, having your code littered with lots of ancient calls to logger is gonna create a new problem, which is great, because I'm a consultant. So we thrive on problems. The last portion, we don't have any remote questions. So this pretty much concludes the talk, but we can continue the conversation over lunch. Thank you very much for coming.