so on, so forth. Right. Now, we have gotten tons of numbers, and we have seen how to compute a lot of numbers from our data frame, but it's much, much nicer when we can actually show that on the screen. For this, we will rely on two main libraries, which are very widely used in Python. The first is Matplotlib and the second is Seaborn. Matplotlib is pretty much the base plotting library. You can go very, very far with it, but it offers a fairly low-level approach where you really create each point, each line, and so on, so forth. But it's kind of fundamental in Python plotting. And then on top of it, we use Seaborn, which is much higher level and interfaces very well with pandas data frames. They really work hand in hand.
What's more, Seaborn plot objects interact very well with Matplotlib objects. Like I said, one is really built on top of the other, and it's super easy, and it's actually expected, that you mix elements of the two in the same plot to make exactly the plot that you want to have. Again, maybe for people with a little bit of R experience: in R, there is the base R plotting, and then there is ggplot, and ggplot is super nice, but you cannot mix the two in R. In Python, it's completely different. You are expected to mix them. It's very normal.
All right. So let's first import Seaborn. It's typically imported as sns, that's its short name. And then let's say we want to represent one column, and we want to represent the fare. So I can call displot. displot is short for distribution plot. So basically I want to see one column, one distribution for one set of values. And I give it df.fare. And this is what I get, just in one line: a histogram of my fare. On your computer, you might get a slightly different scale when it comes to the size of the labels there.
The reason I get that is, remember, at the beginning of the notebook there was a cell that I executed which tweaks the default parameters of Matplotlib, and thus of Seaborn, because it's built on top. That was just because, when presenting on screen, it's a bit nicer for me to make this bigger. If you have not executed that cell, you will get something slightly smaller on your screen. Okay. So this is for a simple histogram.
Then maybe you say this is a bit small, maybe you don't like this size. You can always change it, for instance by saying that the height should be seven; seven is in inches. By doing it like this, I get the same thing, but larger. Or if I want to have something much smaller, I can also have that. Lovely and small, right? You can always tweak this sort of thing.
All right. Now, we see one thing here: when I call displot, by default it gives me a histogram. But that's something we can tweak. displot is what we call a high-level function, or a figure-level function, to use the Matplotlib vocabulary, in the sense that it lets you control a whole figure, and the whole appearance of a figure, at once. And there is a kind argument, which we will see in other figure-level functions of Seaborn, which lets you choose the sort of plot that you want to show. For displot, you have three kinds. You have the histogram, which is kind equal hist. That's the default, that's what we have seen. And you also have kind equal kde for a density line, or ecdf for the same thing but as a cumulative distribution function.
So now let's look at what this gives us. I will tweak this to a height of six; I think with this zoom level that might be a bit nicer. Yes. So see here, the only thing that I've changed is the kind.
And so I can go from a histogram to a density line. I could also switch back to the histogram if I wanted. Or I could have the empirical cumulative distribution function, that's ecdf. There you have it. All right. So far, so good? Yes? No? Yes. All right, so let's move on.
Each of these kinds offers you different options. If you go and look up the help of the displot function, you will see that there are dozens of arguments, if not more. Very typically, you can change colors, you can tweak the styles of lines, and so on, so forth. Let's see a few. For example, if you specify that your kind is histogram, you have an option called kde. You can set it to True or False; by default, it's False. If you set it to True, it will overlay a density line on top of the histogram, with a color that is somewhat compatible: close, but different enough that you can tell one from the other.
Then the color is set with the color argument, not too many surprises there. You can input any sort of color that you can think of. There are reference documents for the named colors that exist in Matplotlib, and the number is absolutely huge. You can also give a color as RGB values, or as a hex code, and so on, so forth. I detail this sort of option a bit later on in the notebook.
And then there is more stuff that can be tweaked. For example, I can tweak the density line by specifying some keywords specific to it. This is maybe for people who are used to Matplotlib, but any line property that exists in Matplotlib can be specified there, to be applied specifically to the density line. Each element is sort of tweakable that way. In Matplotlib, there is an argument called linestyle, or ls for short, which takes as its value a style of line.
A line style says, for example, that the line should be dashed, or dash-dotted, or things like this. So this is what I specify here. You get, on top, my version with the density line, in color teal. And then below, in dark orange, and now you see that the line is dashed.
All right. Now this can be tweaked ad infinitum, ad nauseam. But because you are operating with one function call that does tons of stuff, you have to specify a ton of arguments if you want to precisely tweak each element. Sometimes, if you want very fine control over each element, rather than making one call that does everything, it might be smarter to make several calls and control each element one by one. So you would make one call to create the histogram, then one call to overlay the density line, and so on, so forth. We'll see how to do this sort of thing as we move along.
Because, as I said, displot is what we call a figure-level function. It does everything by itself, and you don't necessarily add much on top. I like it very much for doing something very quick and having a look, but it's a bit limiting: if you want precise control over all elements, as I said, it becomes hard, and it doesn't play very well with multi-panel figures and so on. That's when we go down a level and start to play with the panel-level, or axes-level, functions.
For that, we need to go back a little to a concept which is very important in Matplotlib, which is the concept of figures and axes and what they mean. So let's see together how one could create a multi-panel figure in Matplotlib; I think that will illustrate it. And by the way, as I speak about this: who has already done a little bit of plotting in Python with Matplotlib? Please put a little green tick if you have, and a red cross if you never have, just so that I know how much time I should spend on this basic stuff.
Okay, most of you have never done that. So yes, I will take the time to discuss it a little. And even if you've done a bit of Matplotlib, I know that many people are not necessarily super familiar with figures and axes.
So in Matplotlib, we have what we call the figure. The figure is the entirety of the graphical panel. And then your figure is composed of one or multiple axes, and an axes (an ax, for short) is basically a subplot, if you will. So for example, here I create a figure with multiple axes: one row and two columns of axes. And I specify that the figure will be 10 by 8. Again, this is a bit more zoomed in than what I usually use. This is the width, and this is the height; I will make the height five only.
The first element I get back is an object that lets me interact with the super high-level stuff. For instance, I can create one title for the whole figure, or I can export the figure to a file, that kind of super high-level function and action. And then the second element is a container (a NumPy array) that holds the different axes. Here, because I only have one row and two columns, it's a container with just two elements. If I had, say, two rows and two columns of axes, that would be a two-dimensional array that I would have to navigate by row and column. But here it's simple: two axes, each one representing one subplot, going from left to right.
So I access each element in this container just by using square brackets, as with any list. axes[0] represents my left panel, and axes[1] represents my right panel. And then I just call, on that, dot and then a Matplotlib function. plot is to plot: I give it x and the sine of x, and I give it a label. And I set a title, a label for the x axis, and a label for the y axis.
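
The cell being described is not reproduced in the transcript; a sketch of what such a two-panel figure could look like is given here. The titles, labels and the range of x are assumptions of this sketch, and the Agg backend line is only there so that it runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 200)

# One row, two columns of axes; figsize is (width, height) in inches.
fig, axes = plt.subplots(1, 2, figsize=(10, 5))

# Left panel: sine.
axes[0].plot(x, np.sin(x), label="sin(x)")
axes[0].set_title("Sine")
axes[0].set_xlabel("x")
axes[0].set_ylabel("sin(x)")
axes[0].legend()  # builds the legend from the labelled artists on this ax

# Right panel: cosine.
axes[1].plot(x, np.cos(x), label="cos(x)")
axes[1].set_title("Cosine")
axes[1].set_xlabel("x")
axes[1].set_ylabel("cos(x)")
axes[1].legend()

plt.tight_layout()  # adjust spacing so the panels do not collide
plt.show()          # signals that the figure is ready (optional in a notebook)
```
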
And then, with this call to legend, Matplotlib will go back to whatever was plotted on that ax and has a label, and will create the appropriate legend for it. Then you see I do the same thing again, but now I change which ax I plot to. And then I call this little tight_layout, which reorganizes the different axes a bit, making sure that the labels and titles don't overlap with the neighboring panels and things like this. And then this call to plt.show: it's not strictly mandatory in a Jupyter notebook, but outside of a notebook, that is what tells Matplotlib that the figure is ready and should be displayed. In a Jupyter notebook, the end of the cell is used as this signal, if you will.
So that's why we get here our sine with its specific legend, and our cosine with its specific legend. The legend is placed to the best of Matplotlib's limited capabilities. In practice, it's also possible to tweak exactly where we put it, what size we want it to be, and so on. For now, I just use the defaults. All right. Is it good so far? Does this make sense, this concept of figures and axes?
All right. So basically, there you have it. That's also why I will sometimes refer to an ax: you will see that we tend to give these axes as an argument to Seaborn functions, so that they know where they should go on the figure. So sometimes, rather than calling sns.displot, we will call the smaller axes-level functions: histplot for a histogram, kdeplot for kernel density estimation, so for the density line, or ecdfplot for the cumulative distribution. And that's a general pattern with Seaborn functions: they are something-something-plot functions. Whatever you want to show, plot: violinplot, boxplot, and so on and so forth. So then I can do sort of the same thing as before.
I create a figure and axes with one row and two columns, and then I call histplot on the age, and I say that ax is equal to this one here, so the first, the left panel. Then a kdeplot of the fare, and it goes to the right panel. Then my tight_layout and so on. And that's what I get to see. So it's kind of the same thing as before, but now we know how to plug the functions of Seaborn into the frame of Matplotlib.
Now, I give you here a small worked example of how it works when you have several rows and several columns. What happens is that you now have a two-dimensional array of axes. So you don't index with just one value; you say precisely which panel you want. Top left is 0, 0. Top right is first row, second column, so 0, 1. Then second row, first column is 1, 0, and so on, so forth. Applied here, it looks like this, where you can see that I have top left, then top right, and then bottom left, and then bottom right.
Now, what I do there is play with an argument of histplot, which is the bins argument. You know that in a histogram, you basically cut your data into subgroups, and each subgroup is a bin. That's a bar in the histogram. Seaborn has a fairly good default, I think; most of the time it's quite nice, but sometimes you might want to tweak it. So I just want to demonstrate this. Here, this is with the default. And then this is with 5 bins, 10 bins, and 1000 bins, and you see what it looks like. So it's fairly intuitive what it does. You also see that the number of bins you choose changes the granularity with which you show your data. There is another argument which I don't show here, but which I think is worth knowing about.
The other argument I mentioned is called binwidth (and I do encourage you to spend time looking at the help of this function as much as possible). Basically, instead of specifying that you want to have 100 bins, you say: I want the bins to be, I don't know, 0.1 or 0.01 in width. That way it's very clear, and you can also easily align them with other histograms and things like this. I actually use binwidth quite a bit. In general, as I said, I do encourage you to look at the help like this. You can see that there are a lot of arguments. We cannot go through all of them, but it's really worth going there. And the online documentation is super nice too, because they give you tons of examples of how to use most of the arguments to create super nice figures.
So remember, and make the difference, between the high-level, figure-level functions and the lower-level, axes-level functions. displot: high-level, figure-level. histplot: axes-level, lower-level. Right.
And now that we have these lower-level functions, I also want to demonstrate that we can fairly easily mix other Matplotlib elements into a Seaborn plot. For that, I just create a little function that takes some data, typically a column, and a Matplotlib ax. It takes this data and computes its mode, its mean and its median. See, this little zero there is just to get the first mode, in case it's a bimodal distribution. Then I create a histogram, plotted on the ax that was given to the function. And then I also add to this ax some vertical lines, with the axvline function, for the mean, the median and the mode, each with its line style, color and label. And then the legend. And just like that, I can now use this function to create a small representation of the fare and the age columns, such as this.
So you can see, I have now mixed together Seaborn and Matplotlib elements to make the figure that I wanted. And that's one instance where it can be useful: here we have a visual representation where we see the relationship between the mean and the median, and whether or not there is a heavy skew in the distribution, or something like this. This is a very, very simple example, and of course the sky is the limit. Okay.