Okay, that seems to have worked. Thank you very much to the sponsors that helped us finance the EuroPython conference. Without them, we would not be able to do this for you. So let's invite our next speaker. Hello, Miki Tebeka. Where are you calling us from?
Israel.
Oh, Israel. Nice. So you're one hour ahead of us. And we are already a bit short on time, so I think it would be good if you start your screen share. I think you're going to show us something about IPython, so that actually works well with the pandas we saw before.
Yeah. Okay. So hi everyone. My name is Miki Tebeka from 353 Solutions. I've been a Python developer for around 25 years now, which makes me old. I've been writing code for most of these 25 years, and I've been teaching a lot in the last seven years or so. And during the COVID break, I also wrote a small book called Python Brain Teasers. If you're interested, it's on Gumroad or available on the Pragmatic Bookshelf. And in this talk, I'm going to show you IPython, which is my tool of choice for playing around and starting to program. So IPython is an interactive prompt, also known as a REPL, which is a read-eval-print loop, which means it's going to read your input, evaluate it, show you the output, and wait for another input. And this is a very effective way of working. You can test your code. You can test assumptions about your code while you're writing it, way before you have code that is stable enough for testing. If you read a bit of how Paul Graham describes programming, he says that at the beginning we are doing what is known as sketching. We're not sure exactly how it's going to turn out, so we're playing around with bits of code. And for this type of playing around and trying things out, and also for the exploratory stage in data science, IPython, or Jupyter notebooks, to which most of what I'm going to talk about here is also relevant, is a great, great tool.
Another thing that is great about IPython is that you don't need to context switch for almost anything else. Everything you need is at your fingertips, and what you're going to do is just write in IPython. You don't need to context switch to the browser, or to an editor and running tests. Everything is right there, so you can stay focused and see what's going on. So what we're going to do is I'm going to have a task, which is to load compressed log files from a directory into a pandas data frame. And throughout this task, I'm going to mention some of the IPython features that help. And note that some of the code that's written here is specific to IPython, meaning if you copy and paste it into a Python script and try to run Python on it, it's not going to work. This is not just pure Python. So the first thing we want to have a look at is where am I? What's the current directory? So I'm starting with the percent sign. The percent sign marks what is known as a magic command in IPython. And the %pwd magic command answers what is the current directory. And I can see that the current directory is this directory that I'm in. And there are a lot of magic commands. You can use the %magic magic to have a look at all the magic commands out there. And you can even write your own magic. And we will cover some of them throughout this talk. Okay, so the logs are here, under this directory, in the logs directory. And instead of doing copy and paste, I can use the output. So if you notice, on the left side there is an Out, and IPython is saving the output for you every time. So I can do logs_dir equals Out[4] plus '/logs'. And now I have the logs directory, which is there. And yes, IPython is going to consume memory because it's going to keep almost everything in memory. But it's really nice, especially if you run something that takes five or six minutes and then, oh, I forgot to save it in a variable. It's right there.
Now I want to have a look at the files that are found in this directory. I can use the Python way, or I can use the command line, in my case ls, because I'm running on a Linux machine, to see what's there. And I can pass the logs_dir variable to the ls command, and I can see the files that are found. Right, I can pass the parameter directly like this, or I can put it in curly braces, and both of these, oh, sorry, ls. Okay, so maybe something has changed since the last time I checked it. This used to work as well. But these are live demos for you, right? You've got to live on the edge. So I want to see the files, and I want to get them into a variable so I can play around with them. So I can do files equals ls of logs_dir. Oh, I see, I had a typo. There was an extra s at the end. It should work without it. Okay, and I'm going to use the long format for ls. And now when I'm looking at files, I see that I get the list of files, which is basically the output of the ls command. And note that this thing is not exactly a Python list. It looks like a list, and you can access things from it, but it's not exactly a list. And if you type files, then a dot, and then hit the Tab key, IPython is going to open all the possible completions for you as a menu, and you can scroll down and see what's there. And if you start with a prefix, it will show only the things that start with that prefix. For example, I want to have a look only at the log files, and not at the SHA signature or the total. So I can do glob of the word log, sorry, grep on the word log. And now I can see only the log files. Okay, so I can do log_files equals files.grep of log. And now I have only the log files in this list. Another thing about this list object is that it has already split the output into fields. So if I do log_files.fields and take the fourth field, I'm going to get the file sizes.
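What grep and fields do on that list object can be approximated in plain Python. The following is only a rough sketch of the idea, not IPython's implementation: the `ls -l` output, the user name, the file names and the sizes are all made up for illustration, and a plain substring test stands in for grep.

```python
# Hypothetical `ls -l` output; names and sizes are invented for this sketch.
ls_output = """\
total 3080
-rw-r--r-- 1 user user 1048576 Jul  1  2021 http-01.log.xz
-rw-r--r-- 1 user user 2097152 Jul  1  2021 http-02.log.xz
-rw-r--r-- 1 user user      97 Jul  1  2021 sha256sum.txt
"""

files = ls_output.splitlines()

# Rough stand-in for files.grep('log'): keep only lines containing the word.
log_files = [line for line in files if "log" in line]

# Rough stand-in for log_files.fields(4): split on whitespace, take column 4,
# which is the size column in the long listing format.
sizes = [int(line.split()[4]) for line in log_files]

total_bytes = sum(sizes)
total_mb = total_bytes / 2**20  # 2**20 bytes is one megabyte
print(f"{total_bytes} bytes = {total_mb:.1f} MB")
```

In IPython itself the list returned by a shell command already carries these helpers, so none of this boilerplate is needed there.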
So that lets me see the total amount of data that I have. So this is the amount of data that I have in this directory. If I want to see it in megabytes, for example, again, I can use the output and divide by two to the power of 20, which is one megabyte. So I have 4.8 megabytes in this directory. So I can do a lot of exploration on the file system, on data, et cetera, this way. Okay, so let's pick a single log file. So log_file is logs_dir plus the slash sign plus log_files, where we take the last field, which is the file name, and we take the first one. Right, so now I have a log file that I can play with. Now it's always a good idea to have a look at the data before you start to parse it. This file is compressed with a format known as XZ. In Python, there is an lzma library, which knows how to work with these files. But before we do that, I just want to have a quick look at the file. So I'm going to use another utility from the command line known as xzcat. So I can call xzcat on my log file, and let's take just five lines to have a look. Right, so I can see that these are the lines that are inside this file. Okay, so let's gather some lines. So I can say lines, and let's get 15 lines. And now I have these lines that I can look at and see how I can start parsing them into a data frame. So let's pick one line. Let's say line number three. And here we have the line. So we see we have the host. We have the timestamp. We have the path. We have the HTTP status. And we have how many bytes were sent back. And we're going to use the simple approach of just using split. So lines.split, sorry, line.split. Whoa, I'm making tons of mistakes. Better to make them now, right? So these are the fields that I'm going to have. What I usually like to do is to see the fields with their position. So now I can see that field zero is the host name, fields three and four are the timestamp, field five has the HTTP verb with an extra character at the beginning, et cetera, et cetera.
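Looking at the fields together with their positions can be sketched like this. The sample line is hypothetical and assumes the Common Log Format, which matches the fields described above but is not confirmed by the talk.

```python
# Hypothetical log line, assuming Common Log Format:
# host, two placeholders, timestamp, quoted request, status, bytes sent.
line = (
    'example.com - - [01/Jul/1995:00:00:01 -0400] '
    '"GET /index.html HTTP/1.0" 200 6245'
)

# The simple approach: split on whitespace.
fields = line.split()

# Print every field next to its position to decide which indexes to use.
for position, field in enumerate(fields):
    print(position, field)
```

With this sample, field 5 comes out as `"GET`, that is, the verb with a leading quote still attached, which is why the parsing step has to trim it.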
So now I can go and start writing my code. And you can write code in IPython. It has pretty good support for working with multi-line code. But at this stage, I usually like to invoke an editor. And as I said, I'm old, so I'm going to use an editor which is older than me, which is known as Vim. So the %edit magic command is going to run an editor, which is defined either by an environment variable or by configuration in a file. And once you finish with the editor and exit it, it's going to inject all of this code into IPython. So I'm going to edit logs.py. Oh, I forgot, I wanted to copy all of these lines, sorry. Okay. And now edit logs.py. And now I can put the data that I have here, just as a reminder while I'm working to see what's going on. And then I can do def parse_line of line. So fields is line.split. And now I can return a dictionary with all the fields. All right. So the origin is fields zero. And the time is fields three plus a space plus fields four. And method is fields five, and I'm going to trim this leading character. And then the path, which is fields six. And then we have the status code, which is int of fields minus two. And the last one, which is the size, which is int of fields minus one. Okay. So this is how I can parse the line. And now when I exit the editor, I have this function ready here and I can check it on the line. And see that it looks fine, right? So origin, the time, the method, et cetera, et cetera. All of them look okay. So now I can go over the lines. So for line in lines, and I can do print parse_line of line. So I'm checking my code right as I write it. Everything is fresh in my memory, and I know exactly what I'm doing. If you notice, this printing is a little bit different from how the shell showed the previous parse_line, right? That one is nicer. And this is because IPython is using something known as pretty printing.
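Written out, the function dictated above could look roughly like the sketch below. The key names follow the talk; the sample line is hypothetical, and treating the trimmed leading character as a double quote is an assumption based on the Common Log Format.

```python
from pprint import pprint


def parse_line(line):
    """Parse one log line into a dict (first version, before any fixes)."""
    fields = line.split()
    return {
        "origin": fields[0],
        "time": fields[3] + " " + fields[4],
        "method": fields[5][1:],  # trim the leading character (assumed to be a quote)
        "path": fields[6],
        "status": int(fields[-2]),
        "size": int(fields[-1]),
    }


# Hypothetical sample line, not taken from the real log files.
sample = (
    'example.com - - [01/Jul/1995:00:00:01 -0400] '
    '"GET /index.html HTTP/1.0" 200 6245'
)
record = parse_line(sample)

# pprint gives output similar to IPython's own pretty printing of dicts.
pprint(record)
```

Note that this first version still fails on lines whose size field is not a number, which is exactly the kind of problem that only shows up once the whole file is processed.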
So it shows you some of the built-in data structures, such as dictionaries and lists, in a nicer format. If you want to use it yourself, you can also use it in your code with pprint. And then just add a p here. And now the output is easier to read and see what's going on. Okay, and you can use the arrows to scroll through history in IPython. And now, once I've checked my code on a small amount of data and it looks fine, I want to check it on a whole file. I want to have a look. So as I said, the name of the library is lzma. It's available in Python, I think 3.6 and up, as a compression library. But I'm not really sure how to work with it. All right, so in Python, we have the built-in dir command that shows us what's available as attributes inside the module, and this open command here looks promising. But I want to check the help. So I can use a question mark, which is a shortcut in IPython to get the built-in help about an object. And if you use two question marks, you will get the source code. And sometimes, because the documentation is not that great, viewing the source code is helpful. So when I'm looking at the help, I see that I pass a file name and I pass a mode, which is binary by default. But in my case, I know I want a text file, so that's what I'm going to do. Okay, so with lzma.open, and we have our log file, and we're going to say it's text, as fp, for line in fp. And I'm not going to print it out because there are lots of lines. I just want to see that there is no error, so parse_line of line. And we see that we have some kind of an exception. We'd like to understand what was wrong and how it failed. And this is another magic command in IPython that can help you during development. There is a magic command called %pdb. PDB, if you don't know it, is the Python debugger. It's a text-based debugger that most editors use under the hood to run debugging.
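Reading an XZ file as text with lzma.open can be sketched as follows. To keep the example self-contained it first writes a tiny compressed file into a temporary directory; the file name and its contents are made up for illustration.

```python
import lzma
import os
import tempfile

sample_text = "first line\nsecond line\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name; the real logs live elsewhere.
    log_file = os.path.join(tmp_dir, "sample.log.xz")

    # Create a small XZ-compressed file to read back.
    with lzma.open(log_file, "wt", encoding="utf-8") as out:
        out.write(sample_text)

    # The default mode is binary ("rb"); "rt" asks for text lines instead.
    with lzma.open(log_file, "rt", encoding="utf-8") as fp:
        lines = [line.rstrip("\n") for line in fp]

print(lines)
```

Iterating over the file object line by line keeps memory use low, since the file is decompressed as it is read rather than loaded all at once.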
And if you invoke %pdb, what IPython is going to do, once there is an uncaught exception, is start the debugger exactly at that location. I won't get into PDB itself. There is a shortcut for help about all the commands that you have, and l lists the code around where you are. But basically, I want to look at fields. And I see that this field is a minus sign, because we got a 404, so there were no bytes sent out. So now I know how to fix my code. I can do quit, and then I can go back and edit my code. All right, so I'm going to change it a bit and do size equals 0 if fields minus 1 is a minus sign, else int of fields minus 1. And I'm going to change the dictionary to use size. Okay, and now I can run the code again. And now it's fine. Once you're done with that, I recommend turning %pdb off, because it's going to be really annoying to get the debugger on every mistake somewhere down the line. Okay, I can also check, for example, how much time it takes. So there is the %time magic and there is the %timeit magic. So parse_line of line, and I see the time for parsing one line. And that's a good way, by the way, if you're cold in the winter, to heat up your CPU. Also a good way of wasting some time in live demos. So we see that it's about 1.69 microseconds per loop. If you want to check a bigger piece of code, for example, this code, you can do it with two percent signs. And this becomes what is known as a cell magic, meaning it's going to run this magic on the whole cell and not just on one line. So what I have in line 52 is called a line magic, and this is a cell magic. This is now going to check how long it takes to process the whole file. It's 328 milliseconds. Okay, so once I'm done with that, I can do records equals parse_line of line for line in lzma.open of the log file. And then import pandas as pd. And df equals pd.DataFrame from the records. Now we can have a look at our data frame. I'm running short of time. Okay, and it looks fine.
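The fix for the minus sign, together with a plain-Python stand-in for the timing step, might look like the sketch below. It uses the standard timeit module rather than the IPython magic; the sample 404 line is hypothetical, and the leading-quote trim is the same assumption as before.

```python
import timeit


def parse_line(line):
    """Parse one log line; a '-' size (no bytes sent) becomes 0."""
    fields = line.split()
    size = 0 if fields[-1] == "-" else int(fields[-1])
    return {
        "origin": fields[0],
        "time": fields[3] + " " + fields[4],
        "method": fields[5][1:],  # trim the leading character (assumed to be a quote)
        "path": fields[6],
        "status": int(fields[-2]),
        "size": size,
    }


# Hypothetical line for a missing page: the size field is a minus sign.
not_found = (
    'example.com - - [01/Jul/1995:00:00:02 -0400] '
    '"GET /missing.html HTTP/1.0" 404 -'
)
record = parse_line(not_found)

# Time many calls and report the average, similar in spirit to %timeit.
runs = 10_000
seconds = timeit.timeit(lambda: parse_line(not_found), number=runs)
print(f"{seconds / runs * 1e6:.2f} microseconds per call")
```

The measured number depends on the machine and includes the small overhead of the lambda, so it will not match the figure from the demo.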
Sometimes when I'm showing the whole data frame, it's too much. And the nice thing about the combination of pandas with IPython or a Jupyter notebook is that pandas knows a lot about them. So I can do pd.options.display.max_rows equals five. And then when I'm showing the data frame, I will get less content. Beyond that, there are a lot of options in pandas and in IPython to play around with these things. So there is a configuration file that you can do a lot of other things with. You can save the history, you can do a lot of things. Apart from everything that is built in, you can also use a lot of extensions to IPython. So there's ipython-sql, for example, which is an extension for working with SQL. So I can do %sql and tell it to connect to the SQLite database in weather.db. Sorry, first %load_ext sql, and now I can do sqlite3. Oh, it's my day of typos. I'm really sorry about that. So sqlite, the weather.db, and now I can do %sql select from weather where the temperature is bigger than zero. And this is going to give me a list of rows. I can also do %config SqlMagic.autopandas equals True. And then if I run this again, I'm going to get it as a data frame, which I can join with my data frame for what I'm doing. As I said, there are a lot of configuration options in IPython and you can use them. If you're working with another editor, not Vim like me, but PyCharm or something else, you don't have to drop the IPython support. In PyCharm itself, there is a configuration option in the console settings: use IPython if available. And once you do that, you will see that the console is now running IPython. And what you can do when you're running code, instead of doing run, is select a piece of code, right click, and execute the selection in the Python console. And this is going to work very much like it worked with the %edit magic. This is going to run it.
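For comparison, the same kind of query can be run without the extension, using the standard library sqlite3 module. This is only a sketch: the table layout, column names and rows are hypothetical, and an in-memory database stands in for weather.db.

```python
import sqlite3

# In-memory database as a stand-in for the weather.db file from the demo.
conn = sqlite3.connect(":memory:")

# Hypothetical schema and data, invented for this sketch.
conn.execute("CREATE TABLE weather (day TEXT, temperature REAL)")
conn.executemany(
    "INSERT INTO weather VALUES (?, ?)",
    [("2021-01-01", -2.5), ("2021-01-02", 1.0), ("2021-01-03", 4.5)],
)

# Select only the days with a temperature above zero.
rows = conn.execute(
    "SELECT day, temperature FROM weather WHERE temperature > 0 ORDER BY day"
).fetchall()
conn.close()

print(rows)
```

The result is a list of tuples, much like the list of rows the SQL magic returns before the pandas option is switched on.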
And now you have parse_line here, which you can try out and do many things with. So even if you're not old like me and using Vim, but you want to use PyCharm or other editors, you can still use them successfully with IPython. And of course, there is an extensive configuration file. You can add your own magic commands. This one is very useful for me. And that's about it. Thank you very much for listening in. And I hope you'll consider using IPython in your development process. I think it will speed it up and make you a much more efficient developer. So thank you very much.
Let's stop the video. We have one question for you. Can you also use PuDB as a magic command?
Probably, yes. I don't know. Maybe there is already a magic someone wrote for it, or maybe installing PuDB already installs an extension. I haven't checked it. PuDB is a debugger which is more visual than PDB. So it shows you a nicer environment, but it's still text based, so you can use it over SSH sessions and the like. I haven't checked it. Maybe there is a configuration option.
Okay. Thank you. There are also questions coming in at the last minute, and this one is getting extra votes as well. When you finish the coding, how do you save everything into one file?
So there is a magic called %history, with which you can save things. I won't go into everything it does. I'm using the options for showing the Python prompt, showing the output, and where to save. And then %edit with -x doesn't run the file. I have all of these here. And what I usually do is save it as a Python file, and then I start transforming this Python file into a module or something which I want to work with.
But you could type that into Vim from there as well.
Yeah. But I'm just saying I'm saving to history.log and then I'm opening this log. Probably history.py would have been a better name for that. And from there I start everything.
It's the same problem when you work with Jupyter notebooks and you have your notebook and now you want to convert it to a module that other people can use. It has an option to export it to a Python file, and then you start restructuring this Python file into more structured Python code. And as I said at the beginning, this is for the exploratory phase, when you're trying things out and you're not sure how the code is going to look. Once you have a good notion of how the code is going to look, you should switch to an IDE, which for me is Vim, but any other IDE will do, and start writing tests and writing Python code at large.
Okay. Thank you very much. We don't have time for more questions here. I see more coming in and there's also some discussion in the Discord channel. So if you have further questions, please ask them there. I have to thank you for your talk. It was very interesting.