Yes, thank you Richard. So we have now come to the episode on data visualization with Matplotlib. We will start off with a few questions and some motivation for why a tool like Matplotlib can be useful for post-processing and visualization of data. So what happens if you cannot automatically produce plots? Well, that might for instance slow down your processing of raw data. We will also address a little bit when Matplotlib is a good tool and when one could perhaps have a go at using other libraries. So the plan for this lesson is that we will get going with some examples. We will have some type-along, and then there will also be exercises where you will do your first Matplotlib visualizations.

So automation is generally something which is very neat once you get used to it. One aspect of this is that writing scripts to do your plotting might take longer than, for instance, using a program with a graphical user interface. But that is mostly the case when you do the plotting for the first time. If you need to do a similar type of plotting 10 times over for similar kinds of data sets, then you typically get a good speed-up from scripting. And moreover, it is reproducible, in case you need to pass it on to a colleague or to a third person who might, for reference in the future, like to reproduce the graphs and plots. Yeah, so this does happen. Perhaps you could fill in a little bit more here?

I mean, this has happened to me many times. So I say, okay, I'm writing this paper, I'll make a figure. It takes a little bit of manual work. But then my supervisor always comes in and says things, and okay, I need to adjust this a little bit, and then this, and then this again. If it's automatic, that's easy. If it's not, then I end up spending a whole lot more time on it than it would have taken to figure out the tools to do it automatically.
Yeah, and then it gets even worse when the paper comes back and the reviews are there. And I've done something manually and I don't even remember what it was. So yeah, that's what we're going to see now.

Yeah. So another thing is that manual post-processing is something that we in general would like to avoid, because it can be very time consuming and also perhaps not reproducible. So what can manual post-processing be? This is not covered in this particular lesson, but let's say you're plotting three figures that will go as panels into one joint figure that might go into a paper, or you want to have them grouped on a web page. Then even arranging the positions of these three figures is something that can be advantageous to do by scripting. It might be that you then need to set the coordinates, so that this or that panel is so many points wide or has these coordinates relative to the others. That might be a little bit of a hurdle in the beginning, but once you get used to it, it allows for a rapid workflow.

So why are we starting with Matplotlib? It's perhaps the most standard Python plotting library. There are other Python libraries which are built on top of Matplotlib. If you have earlier experience of MATLAB, then you will feel familiar with Matplotlib. It is relatively low level in the sense that you get handles to the graphical objects and can then manipulate them one by one. Other Python packages, which are a little more high level, might then give you more with fewer lines of code. And we will come to this a little bit towards the end of the lesson, when we explore one other library. Yeah. Okay. So, getting started with Matplotlib. Is it difficult? Is it straightforward?
I would say it's fairly straightforward. Certainly, as always, you need to have the libraries available in a Python environment, and that you have by now because you did the preparation for the course. So I'll go directly to one example here. I scroll down. These are many lines of code; it actually doesn't fit completely on my panel here in the web browser. But I copy this, okay, and I paste it into a Jupyter notebook.

So that's the first line there, %matplotlib inline. Are you going to talk about that?

Yeah, it's good to bring it up. So that's a command with the percent sign that makes the Jupyter notebook display the rendered graphics within the notebook.

Okay. So that's like the integration of Matplotlib and Jupyter somehow. Okay. So then with this, if we run it, I see it makes some artificial data, and it runs some commands to make a figure and axes, and runs scatter, which I guess makes a scatter plot. And then it sets some labels.

Yeah, that's the result we'll obtain. So we could just highlight a few things here in the code. We have the import command, and it's a single library; plt is a common abbreviation used as the name for the library. We have data here in two one-dimensional arrays. These are not NumPy arrays, just plain Python data, and the values are completely arbitrary. We have here the call to subplots, and I think we will talk about that a little bit later, after the first exercise. But for now, just note that we have two handles here, fig and ax, and ax stands for the axes. What we're plotting is the data in the y array versus the data in the x array. And we specify the color in hexadecimal notation. So it's RGB, red, green, blue, with two hexadecimal digits for each component, each digit running from 0 up to F. The set label and set title commands do precisely what they sound like. So I execute. Okay. And then, yeah, we got the same result as the one included on the web page.
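For readers following along without the lesson page open, here is a minimal sketch of the kind of snippet being walked through. The data values, the color code and the label texts are illustrative assumptions, not the lesson's exact code.

```python
# In a Jupyter notebook, run `%matplotlib inline` in a cell first so that
# figures are rendered inside the notebook.
import matplotlib.pyplot as plt

# Arbitrary illustrative data as plain Python lists (assumed values)
data_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
data_y = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]

# One figure holding a single axes object; fig and ax are the two handles
fig, ax = plt.subplots()

# Scatter plot of y versus x; the color is hexadecimal RGB (#RRGGBB)
ax.scatter(x=data_x, y=data_y, c="#E69F00")

# Labels and title are set on this particular axes object
ax.set_xlabel("we should label the x axis")
ax.set_ylabel("we should label the y axis")
ax.set_title("some title")
```

In a notebook the figure appears below the cell; in a plain script one would typically finish with `fig.savefig(...)` or `plt.show()`.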
So these lines of code, about 10 of them, produced something. And it looks pretty reasonable. Not the fanciest thing, but yeah.

Okay. I would say that Matplotlib has reasonable defaults. So what we have here in terms of the sizes of the markers and the sizes of the fonts and ticks is a good starting point, but probably not what we would like to have in the end.

Yes, for sure.

One technical remark here: in case you are running perhaps not on your local computer and perhaps not in Jupyter, but, let's say, on a server, logged in to a supercomputer, then you might need to add this statement here so that you can render the graphics without a display, so that they can be written to file even if you cannot see them.

Yeah. This used to be a big thing. If you're doing something completely automatic, well, when you see the error message, you'll try to remember this and you can search for the right keywords to add in there. But otherwise it tries to use your display. So, okay. Should we go to the exercise now or look at more first?

We could just briefly mention what it is about. But it's also time for a break soon.

Should we have a break and then the exercise introduction, or the exercise introduction and then a long break plus exercise time? Maybe let's talk about the exercise and then we can have a long period.

Yes, that sounds good. Okay. So, the exercise. Richard, can you perhaps paste the direct link in the HackMD?

Oh, yes. Let's do that.

So, essentially, you will start out from the code snippet above, and you're going to edit this plot to make it look a little bit more fancy. And you're also going to augment it with data. It's listed here. So you will add one more data set, data2_y. You will also rescale the data by a factor of 2. And you will also try to add a legend to the data so that you can tell the data sets apart. And, yeah, one comment here.
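The headless-rendering remark and the exercise just described could be sketched together as follows. This is only one possible approach, under stated assumptions: the non-interactive `Agg` backend is assumed to be the statement referred to above, the data values and labels are made up, and scaling both data sets by 2 is one reading of the exercise text.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: draws to files, needs no display
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data (assumed values); NumPy arrays make the rescaling a one-liner
data_x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
data_y = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0])
data2_y = np.array([2.0, 2.5, 4.0, 3.5, 6.0, 5.0])  # the additional data set

fig, ax = plt.subplots()

# The arithmetic (factor of 2) happens right inside the plotting calls
ax.scatter(data_x, 2.0 * data_y, c="#E69F00", label="set 1")
ax.scatter(data_x, 2.0 * data2_y, c="#56B4E9", label="set 2")

ax.set_xlabel("x")
ax.set_ylabel("2 * y")
ax.set_title("two data sets with a legend")
legend = ax.legend()  # uses the label= texts given above

# Write the figure to a file instead of showing it on screen
out_path = os.path.join(tempfile.gettempdir(), "matplotlib-exercise-sketch.png")
fig.savefig(out_path)
```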
Matplotlib is fully compatible with Python and with libraries such as NumPy. So this is what allows us to do this very simple manipulation, the multiplication by a factor of 2, all integrated with the plotting commands. I would say that's a huge advantage compared to, let's say, an older variant of plotting by scripting, where you would combine bash scripts, awk, and perhaps gnuplot. That's perfectly fine, but then you would perhaps need to do the arithmetic, the multiplication by two, in a separate script, and then you have the plotting script. Here you have the full arithmetic, you can integrate with pandas or whatever, and you can do it all in one.

Okay, so here in the panel you can see what you're aiming for. If you compare to the one above, you see it's slightly different because there is more data than you had. So, now the time is 10 to the full hour, and we will have the exercise and we should also leave time for a break. Well, how long should the exercise be?

I think 15 minutes and then a 10 minute break.

So, perhaps until quarter past the full hour?

I'm writing down the break first so it fits within the hour, and then the exercise, but people can do whatever they want. So, yeah, we come back at 16 past the hour, and you should have at least a 10 minute break in there and 15 minutes to work on the exercise.

Yes, so I'm switching to HackMD here with the break notification. Okay, so see you in 25 minutes. Bye.

Hello, we are back. So, let's see. There weren't too many questions, and some of them we will talk about right now. Yeah, let's actually get right to it. So, I'm coming to my screen. Here's the exercise we just did, and now we're getting to the part about Matplotlib having two interfaces. So, what's the point here? As many people have noticed, based on the chat, these two interfaces can be a bit confusing.
So, the traditional interface, the pyplot interface, was designed to look like MATLAB and has global state. Basically, you only call some functions, and somewhere in the background it's remembering what you've done and makes the plots. This is simple, at least, because you don't have to worry about what the plot objects are and things like that. You just run from top to bottom, and it works well in scripts and things like that. So, for example, let's see, what's an example? Yeah, like in a Jupyter notebook, if it's designed to run from top to bottom, then it works. Or you have a single separate Python file, and you run that and it makes a few plots. This can work okay. But what happens if you're doing something really complex? For example, you're reading in a bunch of data and you're generating two plots at the same time and incrementally adding data, or you want to start making a plot and then call another function that will do some standard setup. Then this pyplot interface can start becoming a bit of a bottleneck.

So, let's look in particular at what the difference is here. It looks the same until here, and here we see we call the subplots function, which gives us a figure object and an axes object. The figure contains all the information about the figure, and the axes object contains all the information about the axes. And then when we run the scatter plot, it's a method of the axes object. So we're saying: run the scatter plot on this axes object directly. When we're setting these labels, we set them on this particular axes object that was made here. We could have another axes object around and it would still be clear which one we mean. In the traditional interface, by contrast, we call plt.scatter and then plt.xlabel, plt.ylabel, plt.title. Since plt is the module, and there's only one pyplot module, you are only ever working on one current plot at a time.
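The contrast between the two interfaces can be made concrete with a small side-by-side sketch. The data and titles are illustrative assumptions; the `Agg` backend line is only there so the sketch also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # only so that the sketch also runs without a display
import matplotlib.pyplot as plt

data_x = [0.0, 1.0, 2.0, 3.0]
data_y = [1.0, 3.0, 2.0, 5.0]

# pyplot interface: module-level functions act on an implicit "current" plot
plt.figure()
plt.scatter(data_x, data_y)
plt.xlabel("x")
plt.ylabel("y")
plt.title("pyplot interface")
pyplot_ax = plt.gca()  # grab the implicit current axes, for inspection only

# Object-oriented interface: every call names the axes object it acts on,
# so two plots can be built up side by side without ambiguity
fig, (ax_left, ax_right) = plt.subplots(ncols=2)
ax_left.scatter(data_x, data_y)
ax_left.set_title("left panel")
ax_right.scatter(data_y, data_x)
ax_right.set_title("right panel")
```

With the object-oriented form, a helper function can take an axes object as an argument and add to it, which is the "call another function that does some standard setup" case mentioned above.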
So, yeah, this has been confusing to many people over the years. When I was using Matplotlib a lot, I would try to use the object-oriented interface, which took a little bit more time to get used to and to learn how it works. But, I mean, in the end I was basically copying and pasting things anyway. So, when you're finding examples online, you might see something that's written in one of these interfaces, but you need to translate it to the other one in your code, and vice versa. This lesson recommends trying to use the object-oriented interface, and personally that's what I would try to do also. But they both work. Let's see. Any HackMD questions on the topic? Is that enough? Did I convince you, Johan? Do you have an idea of which is which?

I think so.

Okay.

So, we could move on to the topic of styling and customization of plots. Should I go to your screen?

Yes, please.

Okay. There you go.

So, how to style plots? What was mentioned is that there are benefits to not customizing and styling your plots manually; it's better to do it as an integrated part of plotting the data in the first place. And there are quasi-unlimited opportunities for how you can style the data. Something which can be very useful is that you can have variants of your graphs for different purposes. So, let's say you have a data set with three figures for it. Perhaps these three figures are to go into a manuscript. Then you make them with a certain font size and a certain thickness and color scheme for the markers and the lines, so that it looks good when you put it into a document which is probably in the end shared as a PDF file and perhaps printed on paper. Then you would probably also like to use the same figures, the same contents, in slides when giving a talk, or it could be that you would like to repost them on a website.
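The idea of producing several variants of one figure from a single script can be sketched like this. The variant names, figure sizes and font sizes are hypothetical choices for illustration; `plt.rc_context` is used here as one way to apply temporary style settings, not necessarily the approach taken in the lesson.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render to files only; no display needed
import matplotlib.pyplot as plt

data_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
data_y = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]

# Hypothetical per-channel settings: figure size in inches plus rcParams overrides
variants = {
    "paper": {"figsize": (3.5, 2.6), "rc": {"font.size": 8, "lines.markersize": 4}},
    "slides": {"figsize": (8.0, 4.5), "rc": {"font.size": 18, "lines.markersize": 10}},
    "web": {"figsize": (6.0, 4.0), "rc": {"font.size": 12, "lines.markersize": 6}},
}


def make_figure(figsize):
    """Build the plot once; the styling comes from the active rcParams."""
    fig, ax = plt.subplots(figsize=figsize)
    ax.plot(data_x, data_y, marker="o")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.set_title("same data, different styling")
    return fig


out_dir = tempfile.mkdtemp()
written = []
for name, settings in variants.items():
    # The overrides apply only inside the with-block
    with plt.rc_context(rc=settings["rc"]):
        fig = make_figure(settings["figsize"])
        path = os.path.join(out_dir, f"figure-{name}.png")
        fig.savefig(path)
        plt.close(fig)
    written.append(path)
```

Keeping the plotting logic in one function and the per-channel numbers in one dictionary means a late change to the data or labels propagates to every variant.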
And for both of these other channels, the slide presentation material and the web-hosted material, you might need to use a slightly different set of font sizes and tweak things a little bit to get good ratios, for example between the x and y axes, for presentation on screens of different sizes. With Matplotlib, you can then, within the same script, essentially generate all of the needed versions of the figures in an efficient manner. You will not find these right here in the lesson, but you will find them lower down. Namely, at exercise Customization 3 we have links to the galleries. So perhaps Richard, you could paste that into HackMD.

Yeah, okay, I will.

And I move in here to the screen. So we do have these other libraries, Seaborn, Altair, Plotly, and so forth. Now, holding on to Matplotlib, I'll go here to the examples gallery. Here we find nice examples of different kinds of plots, with both the output and the code needed to render them. So you can start with something simple, bar colors. Here we have a traditional bar chart with broad bars in distinct colors. Okay, apple and cherry here are actually very similar in color. And as you can see, the number of lines of code is rather small, which is not surprising because it's a simple graphic.

Another kind of bar plot would be a horizontal bar chart. This is somewhat longer code. We have had elections in a few countries lately. We had the US midterm elections, and a bit earlier in the autumn we had the national elections in Sweden. So let's say you have one red block and one green block and some other independent parties. Then this kind of plot can visualize how to reach a majority in the House or in the Senate. You have probably all seen this kind of bar plot in the news media. Yeah.
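A stacked horizontal bar of the kind described here, with the seat count printed inside each segment, might be sketched as follows. The block names, seat counts and colors are entirely hypothetical, and `Axes.bar_label` requires Matplotlib 3.4 or newer; this is not the gallery example's code.

```python
import matplotlib
matplotlib.use("Agg")  # only so that the sketch also runs without a display
import matplotlib.pyplot as plt

# Hypothetical seat counts for an imaginary chamber
seats = {"red block": 29, "independents": 10, "green block": 32}
colors = {"red block": "#D55E00", "independents": "#999999", "green block": "#009E73"}

fig, ax = plt.subplots(figsize=(6, 1.8))

left = 0          # where the next segment starts along the x axis
label_texts = []  # the numbers written inside the segments
for name, count in seats.items():
    bars = ax.barh(["chamber"], [count], left=left, color=colors[name], label=name)
    # Seats are discrete, so the integer count is printed in the segment center
    annotations = ax.bar_label(bars, label_type="center")
    label_texts.extend(a.get_text() for a in annotations)
    left += count

# Smallest number of seats that is more than half of the total
majority = sum(seats.values()) // 2 + 1
ax.axvline(majority, color="black", linestyle="--")

ax.set_xlabel("seats")
ax.legend(ncol=3, loc="lower center", bbox_to_anchor=(0.5, 1.0))
```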
And what one can emphasize here is, for instance, that within the bars you find the actual numbers, 29 and 10 and so forth. That works because this is something discrete; after all, the number of seats in a parliament is a discrete number, not a floating-point number.

There's actually quite a lot here, like the legend, the colors, the numbers inside the bars. Figuring this out yourself would take a long time, but copying and pasting is quick.

Yeah, that's a good point indeed. So instead of trying to go through the full API of Matplotlib and search hierarchically for the precise commands to use, you can start by looking at the examples and then effectively copy, paste and modify.

Isn't that a great lesson now? So Matplotlib is too complex, no one figures it out, you look at the gallery and copy. But I think that really is the moral here.

Yeah. I would be rather skeptical of a copy-paste style of programming in general, but for this part here, I do think it's adequate. Finally, what can be shown is a traditional two-dimensional plot here. What we see is that you can have a very wide range of different markers. Here we have six panels. Many tools have subplot commands, so it's easy to get the panels up, but it can also be a real mess if you just push the button. So finding a good trade-off between the size of the panels and the size of the fonts and the markers is very important. And you have probably often noticed that these, in a way, rather big font sizes are actually what goes into print in the PDF versions of figures. So that was a third example.

So I think we go back out to the lesson. And we will now come to the exercises on styling and customization. We have as many as three exercises here. Exercises one and two are on how to tweak the look of the plots.
And exercise three is a bit different, because it's about how to modify the input data. So in a minute we will let you work on that. And then Richard, what is the plan when we reconvene? Will we then walk through another example?

Yeah, I guess we can do one of the examples when we're back, and we stop the lesson at 00 of the hour.

Yes. So I think we could probably take 19 minutes, up until 50, for exercises.

Yeah. Okay. So, as I said, there are three of them. Choose to work with one of them, or work with two, whatever you're interested in. Okay. So we will see you in 20 minutes.

Yes.

Right. Hello, we're back. There are some good questions here in HackMD, as you might see, but we're going to Johan's screen now.

Yes. So we will make a little bit of an example exploration of another library here, namely the Seaborn library. First, we can see what the graphical feel of this library is. So we follow this link here. We find a large number of really appealing figures which have been created with this library. And one of them which catches my attention is this one. It's called a violin plot, and with a bit of imagination, yeah, this looks like a violin. There is a code snippet for it, which we have here. Okay. And what has that produced? It has produced these elongated blobs that are visualizing some data. For the time being, we are just looking at the, so to say, aesthetic and graphical properties of things. So we are ignorant about what this data is. If you look in the code, you can see that it is actually random numbers. So I move back now to my Jupyter notebook. What we're going to do is play around with this code snippet. So I'll paste all of this into the notebook. And before executing it, we could just highlight that we needed to import NumPy and then this Seaborn library. And Richard, how is it?
We don't import Matplotlib here.

Yeah, that's probably not needed. I guess Seaborn would import it as the backend without us needing to know anything about it.

Yeah. So the theme setting doesn't take any arguments here. Here we are creating the data set.

Yeah, so somehow it's a bunch of random stuff. I guess we don't need to worry too much.

And that is then put into this d variable. We will show it with the violinplot command by executing this statement. As far as we see here, written to standard output in the Jupyter notebook, we have the contents of the array. And as you can see, it's a lengthy array. Just scroll past it. And then we get this nice set of blobs, and there are as many as eight of them. What we would like to do is concentrate on how you produce one of these objects. So we play around a little bit and try with a much simpler d, namely just this simple array, which is defined inline in the code snippet. So I'll copy that code, paste it into the Jupyter notebook and execute. What do we get here? Okay, we get two of these plots. Looks good.

So do we add in some other data then?

No, at this point I think we can leave it. We have done what we needed to.

So should we switch to HackMD and look for interesting questions to answer? Or is it on to the wrap-up here?

Yeah, that's a good idea. We could just give a very brief mention here. I scroll back up just to highlight that it's under the heading Exercise Customization 3. Here is where you find all of these links to the other libraries. So we have now looked at Seaborn, and then you have these other libraries. And yeah, Richard, do you want to take the screen for the HackMD?

Sure. So what was the overall summary here? We started with this idea that all plot generation should be automatic. And hopefully we've shown you that that's actually pretty reasonable to do.
So it can take time to figure out all the customizations you need. And maybe sometimes you need to tell your supervisor: okay, that's hard, can we find a more practical way to do it? But if you can do it this way, then you have made a very good investment in your future. And for the most part, when anyone uses Matplotlib, it's a matter of finding the examples and updating them. So with that being said, let's see. There are some errors here, which can be debugged in HackMD; it's hard to say anything right now.

Is it possible to update the x and y labels after they've been created? So yes, if you look into all the methods on the axes objects, there are ways to get those labels and change their sizes. But in practice you might have, say, one function that produces the data, which is then handed off to another function that makes the plots. And it might be easier to remake the whole plot rather than try to update only little bits of it.

Let's see. Here's an interesting suggestion in HackMD, that one can convert Matplotlib code to TikZ plots for LaTeX using the tool tikzplotlib. Okay. And then that can be inserted directly into LaTeX or something like that. Okay, I didn't know about that. TikZ is a kind of drawing package for LaTeX, I think.

So one remark about reproducibility: sharing Matplotlib scripts with colleagues and collaborators is relatively straightforward, as it is widely available, free, open source code that works on most operating systems. Compare that with, well, this is somewhat anecdotal, but what has happened to me many times is that I've had MATLAB scripts which are doing the job really, really well, but if the person you're working with does not have a license for MATLAB, then it might be difficult for them to run them.

Yeah. So we are at the full hour. So we should perhaps wrap it up and conclude for Matplotlib.
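Coming back to the question about changing axis labels after they have been created, a minimal sketch under assumed label texts and font sizes could look like this. It uses only standard methods of the axes object.

```python
import matplotlib
matplotlib.use("Agg")  # only so that the sketch also runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([0.0, 1.0, 2.0], [1.0, 3.0, 2.0])
ax.set_xlabel("x")
ax.set_ylabel("y")

# Later, possibly in another function that only receives `ax`:
old_label = ax.get_xlabel()           # read back the current label text
ax.set_xlabel("time / s")             # calling the setter again replaces the text
ax.xaxis.label.set_fontsize(14)       # the label is a text object that can be restyled
ax.yaxis.label.set_fontsize(14)
```

As noted in the answer above, when the data and the plot come from separate functions it is often simpler to regenerate the whole figure than to patch individual pieces like this.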
And let's see what we have after the break. After the break is data visualization. So maybe let's go straight there, and we can keep answering questions in the notes. Okay. So see you in 10 minutes. Bye.