Hi everyone, I'm very glad and honored to be your instructor today. Welcome, everyone, to our tutorial, Data Visualization Using ggplot2 and Its Extensions. We are very proud and honored because it's our first participation in the useR! 2021 global conference, and we are very excited to be here and meet all of you.

Let me introduce myself. I'm Haifa Burmasoud. I'm a data scientist and an engineer in statistics and data analysis. I also have with me Amir Suissir, who is a consultant data scientist, and Catherine Dries, who is a consultant data science engineer and ML researcher. Our colleague Mona Belayd is absent today. You can find us as R-Ladies Tunis on Twitter and on GitHub, and you can also email us at tunis@rladies.org. You can also reach us via Slack, in the channel of this tutorial. One of my colleagues will share the channel with you.

Here is the outline. First we are going to present the peopleanalyticsdata package, which is where the data used in this tutorial comes from. Then we will go through one-variable visualization: box plots, bar plots and density histograms. Then we will do some two-variable visualization: two categorical variables, two numeric variables, and one categorical with one numeric variable. Next we will do some correlation matrices, and finally we will close with graphic appearance enhancement. We will show you how to add an animation to your graph and how to make your graph look professional by adding a logo. We will also see together how to combine two or three graphs.

How to follow this workshop? The GitHub link was sent to every one of you. You can find all the code that we will be using during this tutorial at that link. We also need you to have some prerequisites: a basic proficiency in R and some basic knowledge of the different R data types and structures.
You also have the possibility to communicate with us in the chat room. You can put all of your questions during the training session in the chat room and one of my colleagues will take care of them. At the end of this tutorial I will answer the remaining questions myself. You can also get in touch with us in our Slack channel. You can still add your questions there and we will answer them too.

The learning goals of this tutorial are to get a clear understanding of the ggplot2 R package and to get more familiar with some of its extensions, like ggstatsplot, gganimate and patchwork. We also aim to teach you to design effective data visualizations by choosing the appropriate type of graph for your data. And we will teach you how to enhance your graph by adding some animation or adding a logo to make it look more professional.

So now, what is the grammar of graphics? We will start with the definition of grammar given by Google: the whole system and structure of a language or of languages in general, usually taken as consisting of syntax and morphology, including inflections, and sometimes also phonology and semantics. This is a very general definition of grammar. We also have another definition: grammar is the fundamental principles or rules of an art or science. Applied to visualization, the grammar of graphics is the grammar used to describe and create a wide range of statistical graphics. ggplot2 is an abbreviation of grammar of graphics plot, and it was created by Hadley Wickham.

You can also find an interesting book online, ggplot2: Elegant Graphics for Data Analysis. It is a great book for beginners. It shows you how to use ggplot2 to create graphics and helps you understand your data better, so you can use it as a resource after this training. Now let's move to the components of a graphic in ggplot2.
We have about six main components that you should know in ggplot2 before you start creating graphics. The main one is the data. The data is what we want to visualize, and it should be in data frame format so that you can use it in ggplot2. Then we have the coordinate system, the coord, which describes how data coordinates are mapped to the plane of the graphic. We also have the geoms. The geoms are the geometric objects that represent the data: points, lines, areas, polygons, and so on. Then we have the aesthetics. Aesthetics are the properties of the geoms, such as x and y position, color, shape, transparency, and so on. We also have the scales. The scales map values in the data space to values in the aesthetic space, for example color, size or shape. And finally we have the statistical transformations, which summarize the data in many useful ways. For example, we have binning and counting to create a histogram, and the regression line for regression analysis.

As for prior installation, you have to install some packages in order to start this tutorial: peopleanalyticsdata for the data; ggplot2; summarytools to describe our data; dplyr for some data manipulation; ggprism to change the theme; GGally, which is used to create the correlation matrix; magick to add a logo to the graphs; patchwork, which is used to combine two, three or more graphs; ggstatsplot, which adds statistical details to the graphs; and gganimate, which adds animation to the graphs.

Now let's get an overview of the peopleanalyticsdata package. The data set used in this tutorial comes from this very nice package. It contains 11 data sets from the book Handbook of Regression Modeling in People Analytics by Keith McNulty. And by the way, I want to greet Keith. He was one of our guests. To get the data, you start by loading the package with library(peopleanalyticsdata).
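As a small aside on the installation step, here is a minimal sketch for checking which of the packages listed above are already installed. The package name spellings are our reading of the spoken names, and the variable names (pkgs, installed, missing_pkgs) are hypothetical; the install call is left commented out because it needs internet access.

```r
# Packages named in the tutorial (spellings are an assumption based on the talk).
pkgs <- c("peopleanalyticsdata", "ggplot2", "summarytools", "dplyr", "ggprism",
          "GGally", "magick", "patchwork", "ggstatsplot", "gganimate")

# requireNamespace() returns FALSE instead of raising an error
# when a package is not installed.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]
print(missing_pkgs)

# Uncomment to install whatever is missing:
# install.packages(missing_pkgs)
```

Running this before the session shows at a glance what still needs to be installed.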
After loading the package, the command data(package = "peopleanalyticsdata") shows you all the data sets in this package. We have charity donation data, employee survey data, job retention, political survey. There are many topics, and for this tutorial I will use the sociological data.

So now let's get a little summary of the data. We use datatable on the sociological data to see the number of variables and to get an idea of the variables, their types, and so on. Here we have nine to ten variables: annual income PPP, average working hours, education months, region, job type, gender, family size, work duration and other variables.

Let's move to the data frame summary. Before starting to visualize our data, we do a little summary to learn some basic statistics about it. For example, for the variable annual income we have the mean, the max and the median, and also the frequency of valid data. Here we have about 361 distinct values. The missing values are reported too. Here we have about 10 missing values, which represent 0.4 percent of the data. This summary is generated by summarytools. It's also an R package that you can find and download to get this beautiful summary.

Let's start now with the graphic types, and the first one is the box plot. What is a box plot? A box plot is a graph that aims to study a distribution. It can also show the distribution within multiple groups, using statistical measures like the median, the range and the outliers. The dark line inside the box represents the median. The top of the box is the 75th percentile and the bottom is the 25th percentile. We also have the end points of the lines, which we call whiskers. They extend to a distance of 1.5 times the interquartile range, which is the distance between the 25th and 75th percentiles. The points outside the whiskers are marked as dots and are normally considered extreme points. In ggplot2, you use geom_boxplot() to draw a box plot.
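To make the box plot quantities concrete, here is a minimal sketch in base R that computes them by hand. The income vector is made up for illustration and is not taken from the sociological data.

```r
# Hypothetical income values, for illustration only.
income <- c(12, 15, 18, 20, 22, 25, 27, 30, 33, 90)

# Box edges and median: 25th, 50th and 75th percentiles.
q <- quantile(income, probs = c(0.25, 0.50, 0.75))

# Interquartile range: 75th percentile minus 25th percentile.
iqr <- IQR(income)

# Fences sit 1.5 * IQR beyond the box edges.
lower_fence <- q[["25%"]] - 1.5 * iqr
upper_fence <- q[["75%"]] + 1.5 * iqr

# Whiskers end at the most extreme observations inside the fences;
# anything outside the fences is drawn as a separate point.
inside <- income >= lower_fence & income <= upper_fence
whiskers <- range(income[inside])
outliers <- income[!inside]

print(q)
print(whiskers)
print(outliers)
```

Note that the whiskers stop at actual data values, so they can be shorter than 1.5 times the interquartile range.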
geom_boxplot() requires some parameters. You have to specify the data. You also have to specify the aesthetics x and y. And you can change the appearance of the boxes with the color, fill and alpha arguments.

Now let's create the box plot. Here I have my command line, ggplot(data = sociological_data). Every time you want to visualize something with ggplot2, you have to write this line; it tells ggplot2 which data you will use in the visualization. Then the aes(). Here my x will be the gender and my y will be the annual income PPP. And I add geom_boxplot() to get my box plot.

Here I want meaningful names for F and M, because F and M are not very meaningful. So I add scale_x_discrete() with labels equal to c(), with F for Female and M for Male. In addition, I want to change the color. I want purple for female and blue for male. For that I use scale_fill_manual(). I set the name to Gender, which is displayed as the legend title, I give the labels Female and Male, and I set the values of the colors, because I'm using scale_fill_manual().

Then I want my graph to look more professional, so I want to add a title. I also want to replace annual income PPP with something more meaningful, and I want Gender with a capital letter. So I call ggtitle() with the parameter label equal to Annual income by gender. That is the title I want to show in my graph. Then I have my xlab(), with label equal to Gender, and my ylab(), with the label for the annual income. So here is my graph.

Now I want to change the theme of this graph, so I add theme_bw(). As you can notice, the theme of my box plot has changed. Now you have your first box plot, and it looks very nice. Let's move now to an extension of ggplot2 that adds statistical details to ggplot2 plots. This extension is called ggstatsplot.
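For reference, the box plot built up step by step above can be sketched roughly as follows. This is a sketch under assumptions, not the tutorial's exact code: the data frame soc is a synthetic stand-in for the sociological data, the column names are assumed, and fill = gender inside aes() is our addition so that scale_fill_manual() has something to color.

```r
library(ggplot2)

# Synthetic stand-in for the sociological data (hypothetical values).
set.seed(1)
soc <- data.frame(
  gender = rep(c("F", "M"), each = 50),
  annual_income_ppp = c(rnorm(50, 60000, 15000), rnorm(50, 65000, 15000))
)

p <- ggplot(data = soc,
            aes(x = gender, y = annual_income_ppp, fill = gender)) +
  geom_boxplot() +
  # Replace the F / M codes on the axis with readable names.
  scale_x_discrete(labels = c("F" = "Female", "M" = "Male")) +
  # Legend title, legend labels and one fill color per gender.
  scale_fill_manual(name = "Gender",
                    labels = c("Female", "Male"),
                    values = c("purple", "blue")) +
  ggtitle(label = "Annual income by gender") +
  xlab(label = "Gender") +
  ylab(label = "Annual income") +
  theme_bw()

p
```

Each `+` adds one of the steps from the talk, so lines can be commented out one at a time to see what each contributes.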
ggstatsplot is an extension for creating graphics with details from statistical tests included in the information-rich plots themselves. In a typical exploratory data analysis workflow, data visualization and statistical modeling are two different phases: visualization informs modeling, and modeling in its turn can suggest a different visualization method, and so on and so forth. The main idea of ggstatsplot is to combine these two phases into one, in the form of graphics with statistical details, which makes data exploration simpler and faster.

From ggstatsplot, we will use ggbetweenstats(). It's a function that creates either a violin plot, a box plot or a mix of the two for between-group or between-condition comparisons, with results from statistical tests in the subtitle.

Sorry, I don't have the plot here, so I will switch to show it to you. Here I'm calling library(ggstatsplot) and then ggbetweenstats(). Here ggbetweenstats() takes four parameters. The first is the data, which I set to the sociological data. Then I specify my x-axis, which is gender, my y-axis, which is annual income PPP, and the title of my graph, which is Annual income by gender. And this is the graph. As you can see, we have the median. We also have some other statistical details, and the number of females and the number of males is displayed. Maybe it looks nicer here.

ggstatsplot is very easy to use and feels very familiar. I invite all of you to check the documentation of this package and use it in your graphics. I will share the official documentation of ggstatsplot in the Slack channel so all of you can read it and use it. If you have any other questions related to this, feel free to ping me in Slack, on Twitter or via email.

Let's go back to our presentation. Now we move to another type of graph, which is the bar plot. A bar plot displays the relation between a numeric and a categorical variable.
It is a type of graph in which different amounts are compared using rectangles of different lengths but the same width. In ggplot2 we have a command called geom_bar(), which enables us to create bar plots. It requires you to specify the data. Like geom_boxplot(), you have to provide both x and y inside aes(). You also have to specify the width of the bars with the bins or binwidth argument, and set stat equal to identity to make a bar chart of the values rather than a histogram-like count. You can also change the appearance of the bars with the color, fill and alpha arguments.

As with the box plot, we start by plotting a basic bar plot. Here I have my command ggplot(data = sociological_data). In aes() we have x equal to gender and y equal to average working hours. Then we add geom_bar(). geom_bar() is the main command, and we set the arguments stat equal to identity and binwidth equal to 0.5. And you have your first bar plot.

Now I want to add some color. But first I will change the labels F and M to Female and Male. Here I use scale_x_discrete() because it is a discrete variable. Then I change the color with scale_fill_manual(). I set the name equal to Gender, the labels equal to Female and Male, and I set the colors. Then I also want to add the title. I use ggtitle() with the label Average working hours by gender. My xlab() label is Gender and my ylab() label is Average working hours.

We also want to change the theme and use theme_bw(). So we add this line: scale_y_continuous(expand = c(0, 0)) plus theme_bw(), and the theme is changed. If you want to rotate the bar plot, you can use the coord_flip() command and it will be rotated like this. And now you have your first bar plot.

Let's move to the density histogram. A density histogram is a tool used to visualize the underlying probability distribution of the data by drawing an appropriate continuous curve.
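Looking back at the bar plot steps described above, a rough sketch could look like this. It rests on several assumptions: soc is a synthetic stand-in for the sociological data, the column names are assumed, the rows are averaged per gender first so that each bar shows a mean (with stat = "identity" on raw rows the bars would stack every row), and width = 0.5 is used because width is the geom_bar() argument that controls bar width.

```r
library(ggplot2)

# Synthetic stand-in for the sociological data (hypothetical values).
set.seed(2)
soc <- data.frame(
  gender = rep(c("F", "M"), each = 40),
  average_wk_hrs = c(rnorm(40, 38, 5), rnorm(40, 41, 5))
)

# One row per gender holding the mean working hours.
avg <- aggregate(average_wk_hrs ~ gender, data = soc, FUN = mean)

p <- ggplot(data = avg,
            aes(x = gender, y = average_wk_hrs, fill = gender)) +
  # stat = "identity": bar height is the y value itself, not a row count.
  geom_bar(stat = "identity", width = 0.5) +
  scale_x_discrete(labels = c("F" = "Female", "M" = "Male")) +
  scale_fill_manual(name = "Gender",
                    labels = c("Female", "Male"),
                    values = c("purple", "blue")) +
  ggtitle(label = "Average working hours by gender") +
  xlab(label = "Gender") +
  ylab(label = "Average working hours") +
  # Remove the default padding so the bars start at the axis.
  scale_y_continuous(expand = c(0, 0)) +
  theme_bw()

# Same plot with horizontal bars.
p_flipped <- p + coord_flip()
p_flipped
```

The flipped version is often easier to read when the category labels are long.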
In ggplot2, you use the geom_histogram() command. A histogram corresponds to a set of filled rectangles whose heights correspond to the counts and whose widths correspond to the width of the bins. The density plot requires other arguments: you specify the data and the x aesthetic inside aes(), and you can change the appearance of the bars with the color, fill and alpha arguments. We also have geom_density(). With geom_density() you change the appearance of the curve with the size, col and lty arguments.

Let's start with the basic density histogram. As you can see, we have a strategy here: we start by plotting a basic graph and then enhance it step by step. To create my basic density histogram, I have the ggplot() command. I add the data, which is as usual the sociological data. Then I set my aes() with x equal to education months and y equal to density. And then I call geom_histogram() to make the histogram. And here I have my first basic histogram.

Now let's add the geom_density() line to this histogram to get an idea of the density of this distribution. Here we call geom_density(). We specify the size of the line, the color, which is black, and lty equal to 2 to get a dashed line instead of a continuous one.

Now I want to change the labels of my x and y axes. I use the xlab() command with label equal to Education in months, and ylab() with label equal to Density.

I want to get an idea of how education months is distributed for female and male, so we want to add some color. In geom_histogram(), we add an aes() argument that sets the color and the fill, both to gender, and we also set alpha equal to 0.4 and position equal to identity. Then we add geom_density(), and there we also add an argument in aes(): color equal to gender.
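The density histogram steps above can be sketched roughly as follows. The data frame soc is a synthetic stand-in, the column names are assumed, and the blue color is named rather than given as a hex code because the talk does not state it. The sketch uses after_stat(density), linewidth and linetype, which are the current ggplot2 spellings of the y = density, size and lty settings mentioned in the talk (linewidth needs ggplot2 3.4.0 or later).

```r
library(ggplot2)

# Synthetic stand-in for the sociological data (hypothetical values).
set.seed(3)
soc <- data.frame(
  gender = rep(c("F", "M"), each = 100),
  education_months = c(rnorm(100, 160, 30), rnorm(100, 175, 30))
)

p <- ggplot(data = soc,
            aes(x = education_months, y = after_stat(density))) +
  # Semi-transparent, overlapping histograms, one per gender.
  geom_histogram(aes(color = gender, fill = gender),
                 alpha = 0.4, position = "identity", bins = 30) +
  # Dashed density curve per gender on top of the bars.
  geom_density(aes(color = gender), linewidth = 1, linetype = 2) +
  scale_fill_manual(values = c("F" = "#FF1493", "M" = "blue")) +
  scale_color_manual(values = c("F" = "#FF1493", "M" = "blue")) +
  xlab(label = "Education in months") +
  ylab(label = "Density")

p
```

Using position = "identity" keeps the two histograms overlaid instead of stacked, which is why the transparency matters.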
The difference between the first histogram and this one is the color. If you want to create a histogram and set the color by a factor, you have to add this color argument and give it the factor by which you want to color the histogram. Here we also add scale_fill_manual(), whose arguments are some colors, and scale_color_manual(). We use scale_fill_manual() and scale_color_manual() because we are manually changing the colors of the two layers, and we specify that we want these two colors, the purple and the blue. You can find the color codes here: for the purple we have FF1493, and the other is the code of the blue color.

I just want to remind you that if you have a question, put it in the chat. I will take a look at the chat at the end of the session, and if a question remains, I will answer it.

Another type of graph is the scatter plot. The scatter plot is a graph that aims to understand the nature of the relationship between two variables. We have two commands in ggplot2 to create scatter plots, geom_point() and geom_smooth(). geom_point() requires you to specify the data and the aes(). geom_smooth() draws a smoothing line, based on loess by default, and it can be tweaked to draw the line of best fit by setting the method. The method lm is for linear regression.

Here we want to plot the relationship between annual income PPP and education in months. So we have two variables, and we are in the two-variable visualization type. As usual, I call ggplot() with data equal to the sociological data. I set my two aesthetics, x equal to annual income PPP and y equal to education months. Then I call geom_point() and I get my graph like this. But I want to enhance this graph with a different color and a different shape for the points. So in geom_point(), I add some arguments.
Shape equal to 18 changes the shape of the points. As you can notice, here I have round points and here I have another type of point. I also changed the color from a dark color to purple. Then, as usual, I don't want education months and annual income PPP displayed like this, so I call xlab() and ylab(). In xlab() I set the label to Annual income in USD, and in ylab() I set the label to Education in months, and my graph looks better.

Then I want to add a smoothing line to my graph, so I call geom_smooth() and set the method equal to lm. It's a linear regression between annual income in USD and education in months, and it draws the regression line here. I can also change the look of the regression line by setting arguments like linetype, color and fill. Here I choose linetype equal to dashed, so I get a dashed line; I want the color to be dark red like this; and I want the fill to be blue. And this is my regression line in its new form.

Now I also want to get an idea of the distribution of skilled and unskilled people in my scatter plot. So in my first aes() command, I add an argument: color equal to job type, and the job type distinguishes between skilled and unskilled people. Another new thing appearing here is theme(legend.position = "bottom"). I use this command to place my legend at the bottom of my graph. You can also put the legend at the top, on the right or on the left, as you wish. You specify it in this legend.position argument by giving the place where you want your legend.

Now we want to add the regression equation. For this we have, I would say, more complicated code, and we are calling two new packages.
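Before the regression equation is added, the scatter plot built so far can be sketched roughly like this. The data frame soc is a synthetic stand-in and the column names are assumed. Two variants are shown because a fixed color = "darkred" on geom_smooth() overrides a color mapped to job type, so the grouped version leaves the line color to the mapping.

```r
library(ggplot2)

# Synthetic stand-in for the sociological data (hypothetical values).
set.seed(4)
soc <- data.frame(
  job_type = rep(c("Skilled", "Unskilled"), each = 80),
  annual_income_ppp = runif(160, 20000, 90000)
)
soc$education_months <- 140 + 0.001 * soc$annual_income_ppp +
  ifelse(soc$job_type == "Skilled", 30, 0) + rnorm(160, sd = 20)

# Single-color version: diamond points, dashed dark red regression line.
p_basic <- ggplot(data = soc,
                  aes(x = annual_income_ppp, y = education_months)) +
  geom_point(shape = 18, color = "purple") +
  geom_smooth(method = "lm", formula = y ~ x,
              linetype = "dashed", color = "darkred", fill = "blue") +
  xlab(label = "Annual income in USD") +
  ylab(label = "Education in months")

# Grouped version: color by job type, one regression line per group,
# legend placed under the plot.
p_grouped <- ggplot(data = soc,
                    aes(x = annual_income_ppp, y = education_months,
                        color = job_type)) +
  geom_point(shape = 18) +
  geom_smooth(method = "lm", formula = y ~ x, linetype = "dashed") +
  xlab(label = "Annual income in USD") +
  ylab(label = "Education in months") +
  theme(legend.position = "bottom")

p_grouped
```

Other accepted values for legend.position include "top", "left", "right" and "none".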
We are calling ggpmisc and dplyr so that we can display this equation for education, and we can also display the R squared, which measures the quality of the regression. On the sociological data, we do some manipulation using dplyr. We create a new data set, sociological_data_s, and we filter on job type equal to Skilled.

Maybe I will switch to the other presentation to show you more of the code. Here I can show you the code better. As I said, we are calling ggpmisc and dplyr. First, we create our data set, sociological_data_s. We take the sociological data and filter it so that we only have the skilled people. So we call filter(). filter() is the dplyr command that lets you filter your data, and we keep the rows where the job type is Skilled.

Then we have our p object. p is the previous graph, which holds all of the earlier code. Then we have stat_poly_eq(), which comes from ggpmisc. The formula is y ~ x; it's the linear regression between education in months and annual income in USD. Here we have our data, which is the sociological_data_s created previously. Then we specify the color, and rr.digits, because we want our R squared rounded to three digits. We specify the aes() here with the annual income PPP and the education in months. Then we paste the label; it's like pasting the equation that comes from the formula. We set the parameters we want to show in italic, and there are also some further arguments that adjust how the equation is written. Then parse is equal to TRUE, label.y.npc is equal to 0.95 and label.x.npc is equal to 0.1.

And here we have our regression equation: education equals 143 plus 0.00824 times income, and the R squared is equal to 0.147. So this is the scatter plot with the enhancement.
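To show what ends up in that label, here is a minimal sketch that produces a comparable annotation with a different technique: it fits the model with lm() from base R and places the text with ggplot2's annotate(), instead of using stat_poly_eq() from ggpmisc as the talk does. The data are synthetic, the column names are assumed, and subset() stands in for dplyr's filter().

```r
library(ggplot2)

# Synthetic stand-in for the sociological data (hypothetical values).
set.seed(42)
soc <- data.frame(
  job_type = rep(c("Skilled", "Unskilled"), each = 100),
  annual_income_ppp = runif(200, 20000, 90000)
)
soc$education_months <- 143 + 0.008 * soc$annual_income_ppp +
  rnorm(200, sd = 150)

# Keep only the skilled people (base R equivalent of dplyr::filter()).
skilled <- subset(soc, job_type == "Skilled")

# Fit the linear regression and pull out the pieces for the label.
fit <- lm(education_months ~ annual_income_ppp, data = skilled)
b <- coef(fit)
r2 <- summary(fit)$r.squared
eq_label <- sprintf("education = %.0f + %.5f income, R2 = %.3f",
                    b[[1]], b[[2]], r2)

p <- ggplot(data = skilled,
            aes(x = annual_income_ppp, y = education_months)) +
  geom_point(shape = 18) +
  geom_smooth(method = "lm", formula = y ~ x) +
  # x = -Inf / y = Inf pins the text to the top-left corner of the panel;
  # hjust and vjust nudge it inside.
  annotate("text", x = -Inf, y = Inf, hjust = -0.05, vjust = 1.5,
           label = eq_label)

p
```

The ggpmisc approach from the talk computes the same quantities inside the plot layer, which avoids fitting the model separately for each subset.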
Every time you want to display the regression equation or the R squared, you can use this block of commands with stat_poly_eq(). You choose this formula, you create your data as we did here, and you use all those arguments inside stat_poly_eq() to get a graph that looks like this.

Let's return to our presentation. If you have a question regarding this block of code, please don't hesitate to put it in the chat. I will answer it. If something is unclear, please also tell me in the chat and I will take care of it. Okay, great, thank you.

Now we also want the regression equation for the unskilled people. So we do the same, but this time we create a data set for unskilled people. We take our sociological data and filter on job type equal to Unskilled. Then we use the stat_poly_eq() command with the same arguments, but this time with sociological_data_u, for the unskilled people. We again have the aes() with the annual income and the education months, and then we paste the labels and choose the italic format for them. And here we have our two equations.

This is a graph that contains all the information about a scatter plot. We can consider it a professional one, and you can use it in your articles or your studies. It's very important in a scatter plot to have an idea of the regression equation. So every time you want to display it, you can just, as I said, copy and paste this block of code and make some modifications to it.

Let's move to the correlation matrix. The correlation matrix is a table that displays the correlation coefficients for different numeric variables. In R we have a command called cor(). cor() calculates the correlation between two or more variables.
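As a quick illustration of cor(), here is a minimal sketch on three made-up numeric columns. The data frame soc_num and its column names are hypothetical stand-ins for the numeric variables of the sociological data.

```r
# Synthetic numeric columns (hypothetical values).
set.seed(5)
soc_num <- data.frame(annual_income_ppp = runif(100, 20000, 90000))
soc_num$average_wk_hrs <- 45 - 0.0001 * soc_num$annual_income_ppp +
  rnorm(100, sd = 3)
soc_num$education_months <- 140 + 0.001 * soc_num$annual_income_ppp +
  rnorm(100, sd = 20)

# Pairwise Pearson correlations; use = "complete.obs" drops rows
# that contain missing values before computing.
cor_matrix <- cor(soc_num, use = "complete.obs")
print(round(cor_matrix, 2))
```

The result is a symmetric matrix with ones on the diagonal, since every variable is perfectly correlated with itself.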
Here we are using datatable to show the correlation matrix, but it's a correlation matrix that looks like a table, not like a graph. We want to turn it into a graph. So we load the GGally library, and from GGally we call the ggpairs() command. In ggpairs() we have, for now, only two arguments: the data and the columns.

Just a note: if you want to plot a correlation matrix, all the variables should be numeric. You can't plot a correlation matrix for two categorical variables, for example, or for one categorical and one numeric variable. All the variables should be numeric. Here we have numeric variables, which are the annual income PPP, the average working hours and the education in months, and ggpairs() displays all these graphs as a correlation matrix.

So here we have three types of graph combined in one. This is a density plot of the annual income PPP. We have a scatter plot between the annual income PPP and the average working hours, and a scatter plot between the annual income and the education months. Here we have the correlation coefficient and its significance. This coefficient is significant, because we have three stars. It's a negative correlation, but it's significant. Then we also have the correlation coefficient between education months and annual income, and between education months and average working hours.

Now I want to change the theme of this correlation matrix so that it looks nicer. So I call my new theme, which is theme_bw(). theme_bw() changes the appearance of my correlation matrix: the titles are now placed in these gray rectangles. We still have the distributions of the variables, the scatter plots and the correlation coefficients, and it looks nicer. Now I want to do some comparison in the correlation matrix, and I want to do this comparison between genders.
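The basic ggpairs() call described above can be sketched as follows. The data frame soc_num is a synthetic stand-in with assumed column names; columns = 1:3 selects its three numeric variables.

```r
library(ggplot2)
library(GGally)

# Synthetic numeric columns (hypothetical values).
set.seed(6)
soc_num <- data.frame(annual_income_ppp = runif(100, 20000, 90000))
soc_num$average_wk_hrs <- 45 - 0.0001 * soc_num$annual_income_ppp +
  rnorm(100, sd = 3)
soc_num$education_months <- 140 + 0.001 * soc_num$annual_income_ppp +
  rnorm(100, sd = 20)

# Default layout: density curves on the diagonal, scatter plots in the
# lower triangle, correlation coefficients in the upper triangle.
pm <- ggpairs(soc_num, columns = 1:3) +
  theme_bw()

pm
```

With the defaults, the stars next to each coefficient indicate its significance level.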
Before moving to the comparison, I want to change my density lines into histograms, and I want to add a regression line to my scatter plots. So I add some arguments to ggpairs(): lower, for the lower part here; upper, for the upper part of the correlation matrix; and diag, which is here. Let me explain the role of each argument so you understand it better.

To recapitulate, we have three parts in the correlation matrix: the lower part, which is this one; the upper part, which is this part, as you can see; and the diag part, which is this one. We will make the changes part by part.

We start with the lower part. The lower part is composed of the scatter plots. In the scatter plots I want to add a regression line to get an idea of the relationship between my two variables. So I use an argument called lower, which is equal to a list, and in the list I have to specify some parameters. Here I set continuous equal to wrap(), and the wrap is on smooth, because I want a smooth line between the two variables. The method is lm, for linear regression, and the color is gray, so my points will be gray. In the previous graph, as you can see, the points are dark, and I only want to change them to gray. So now my points are gray and my regression line is dark.

Now let's move to another part, which is the diagonal. Previously I had a density line plot, and I want to change it into a histogram. So in diag I pass a list, and in the list I put the argument continuous, set to wrap() on barDiag. barDiag specifies that I want my density plot to be drawn as a histogram, not as a density line.
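The lower and diag settings just described can be sketched roughly like this. As before, soc_num is a synthetic stand-in with assumed column names; wrap() comes from GGally and passes the extra arguments on to the underlying panel function.

```r
library(ggplot2)
library(GGally)

# Synthetic numeric columns (hypothetical values).
set.seed(7)
soc_num <- data.frame(annual_income_ppp = runif(100, 20000, 90000))
soc_num$average_wk_hrs <- 45 - 0.0001 * soc_num$annual_income_ppp +
  rnorm(100, sd = 3)
soc_num$education_months <- 140 + 0.001 * soc_num$annual_income_ppp +
  rnorm(100, sd = 20)

pm <- ggpairs(
  soc_num,
  columns = 1:3,
  # Lower triangle: scatter plots with gray points and a linear fit.
  lower = list(continuous = wrap("smooth", method = "lm", color = "grey")),
  # Diagonal: histograms instead of density curves.
  diag = list(continuous = wrap("barDiag"))
) +
  theme_bw()

pm
```

The upper triangle is left at its default here, so it still shows the correlation coefficients.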
Then in upper, which is this part here, I just want to change the size of my correlation coefficients. The small size is unreadable, so I want it to be more readable, and I set size equal to five. And now I have a correlation matrix that looks better than the other one.

Now, as an enhancement, I want to add some colors and a breakdown to this correlation matrix, also with GGally. I keep my ggpairs() call on the sociological data, with the columns from one to three, our three numeric variables. And we have a new line in our code, which is ggplot2::aes(color = gender). If I want to set the color, I just have to add this line of code. And here I have my breakdown between the two genders: women in pink and men in blue. As you notice, in the correlation coefficients we have the coefficient for the whole distribution, and we also have the correlation coefficient for female and for male. It's a breakdown between the two genders.

Before moving on, if you have any questions, please put them in the chat. If something remains unclear, please tell us. Sorry Carlos, you're right. So the education which is related. Am I sharing my screen or not? Please tell me. Yes, it's okay Haifa, go ahead. Okay, thank you.

Let's move to another type of graph. In this part we will make some enhancements to the graphs we saw previously. In the previous graphs we used themes from ggplot2. Now we want to change to another theme, from an extension of ggplot2. The extension we will use in this slide is ggprism. ggprism is an extension that gives us other themes to use with ggplot2.
So if you want other specific themes, or want to change the theme of your graph, you can use ggprism. Here we are plotting the individual income and the average number of hours per week, and we want to change the appearance of the graph. So we use theme_prism() from the ggprism library. The size here is 12. You can also choose between the other themes available in ggprism. I will also share all the links with you in the Slack channel of this tutorial, so you can take a look and practice making graphs with the extensions of ggplot2. There are too many other extensions to cover them all in two hours, but we are trying our best to cover the most important ones and give you a better idea of those extensions.

So here we have ggprism. ggprism changed the look of the whole graph. For example, as you can notice, we have a different look for the average number of hours per week. What can I call this? It's a different type, like an area, something like that. We can change it in ggprism. We also have a wide range of colors. And theme_prism() looks nicer than theme_bw().

Now I will move to another type of enhancement, which is animation. I will go back to my other presentation, in Markdown, in order to show you the animation. To create a beautiful animation like this one, there is a great extension of ggplot2. It's my favorite one so far. It's gganimate. You can do many types of animation. Here I'm choosing transition_states(). I chose transition_states() because it's the most suitable for my data and for this type of graph. As you can see, the graph is moving, and we have a transition from one family size to another. And I think this graph is clearer than the other one.
Here, for example, as you can see, we have multiple colors and we can't really distinguish between them. I can see that I have blue, purple, pink, all the family sizes in just one line, so I really can't distinguish between them. So I use gganimate, and with it I get a better idea of the distribution of the individual income and the average number of hours per week, and of the difference between family sizes.

And it's very easy to use. It's just one line of code. Let me explain how to use gganimate. First I assign my plot, the previous plot, to an object called p. Then I create another object, which I call anim. In anim I take p, my previous graph, which is static, and I want it to be dynamic. So I take the static p and add transition_states(), and I specify some arguments that say how I want the transition to happen. Here I want the transition to follow the family size. I want the transition length to be equal to two and the state length to be equal to one. Obviously, there are many other types of transition and many other types of animation. I will share all of this with you in the Slack channel of our tutorial, so you will have a better idea of the different animations you can make with gganimate.

Then I call my object anim. As you can see, the graph becomes dynamic, we have the transition between family sizes, and we get a better idea of the distribution of the individual annual income and the average number of hours per week.

If you have a question regarding this graph, please don't hesitate to put it in the chat. We will take care of it. If you still have other questions, I will take them at the end of this tutorial. We will also stay available for your questions in the Slack channel, and you can reach us via Twitter, via LinkedIn or via email, whatever you want.
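The animation steps described above can be sketched roughly as follows. The data frame soc is a synthetic stand-in with assumed column names, and the static plot p is a simplified guess at the plot shown in the talk. Rendering is left as a comment because it needs a renderer such as the gifski package.

```r
library(ggplot2)
library(gganimate)

# Synthetic stand-in for the sociological data (hypothetical values).
set.seed(8)
soc <- data.frame(
  family_size = rep(1:4, each = 30),
  annual_income_ppp = runif(120, 20000, 90000),
  average_wk_hrs = rnorm(120, 40, 5)
)

# Static plot, colored by family size.
p <- ggplot(data = soc,
            aes(x = annual_income_ppp, y = average_wk_hrs,
                color = factor(family_size))) +
  geom_point()

# Animated version: show one family size at a time.
# transition_length and state_length are relative durations of the
# movement between states and of the pause on each state.
anim <- p +
  transition_states(family_size, transition_length = 2, state_length = 1)

# Printing anim (or calling animate(anim)) renders the animation.
# animate(anim)
```

The result can be written to a file with anim_save() once it has been rendered.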
I will also share all of my social media in the Slack channel. Now suppose you want to add other elements to the graph, like a title and a source, and make the title and the axis labels more meaningful. Here we have some improvements in the code. I have my ggplot call with data equal to the sociological data. I set my aesthetics: annual income PPP as x, education months as y, and color by job type. I display it with geom_point. Then ggtitle: sorry for the typo here, it should read "education". I want my title to be "Education and income by job type", because it is very important to add a title to a graph. A graph without a title means very little. Small details matter: a meaningful title, meaningful labels for x and y, and also the source. You have to state the source of your data in the graph. These small things make it look more professional. So here I am changing my xlab to "Annual income in USD" rather than the raw variable name for annual income PPP, which is not meaningful, and I set my ylab to "Total months spent in education". I also want to add the source of my data, so I have labs with caption equal to "Source: sociological survey data". To summarize, if you want to add a source to your graph, you can use this line of code: labs, which takes caption as an argument, where you give the name of your source. We also have scale_color_manual because we want to change the colors to this blue and this color that I think is between yellow and brown. We have theme_prism from ggprism, and we choose to put the legend at the bottom of the graph by specifying theme with legend.position equal to "bottom".
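The labelling steps described above can be sketched as one ggplot2 call. This is an illustrative sketch under assumptions: the toy data frame, its column names, and the two color names are placeholders rather than the values used in the talk; the functions themselves (ggtitle, xlab, ylab, labs, scale_colour_manual, theme, and theme_prism from ggprism) are the ones named in the talk.

```r
library(ggplot2)
library(ggprism)

# Toy stand-in data; values and column names are assumptions.
toy <- data.frame(
  annual_income    = c(12000, 30000, 45000, 60000, 25000, 80000),
  education_months = c(96, 144, 180, 210, 120, 240),
  job_type         = c("Unskilled", "Skilled", "Skilled",
                       "Skilled", "Unskilled", "Skilled")
)

p <- ggplot(toy, aes(x = annual_income, y = education_months,
                     colour = job_type)) +
  geom_point() +
  ggtitle("Education and income by job type") +       # meaningful title
  xlab("Annual income in USD") +                      # readable axis labels
  ylab("Total months spent in education") +
  labs(caption = "Source: sociological survey data") + # data source
  # Placeholder colors; the talk uses a blue and a yellow-brown.
  scale_colour_manual(values = c(Skilled = "steelblue",
                                 Unskilled = "goldenrod")) +
  theme_prism(base_size = 12) +
  theme(legend.position = "bottom")                   # legend under the plot
```

The theme(legend.position = ...) call comes after theme_prism() on purpose: a complete theme added later would overwrite the legend setting.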
Sometimes when you make a graph you also have to add a company logo. A logo can be added easily to a ggplot2 graph using the magick package. I will share the link to the official magick documentation so you can get an idea of what it does. For example, you can add a logo here, or put something inside the graph or at the top of it. You can also put two graphs together, add a picture, do image processing, and many other interesting things with magick. To add the logo, you first have to download it; here it is the RStudio logo, which you can find through Google. I will switch to the other presentation because I want to go deeper into the code with you. So, to use magick I need a few arguments. First, download the logo; here I chose to download it from the RStudio website. Then we have this line with image_read, which is used to read the image. I give it the path where I saved my logo, ending in the RStudio logo PNG file. I highly recommend the PNG format for a logo because PNG is a lossless, high-quality format. Then I specify my x, so I want it to be just here, and my y, because, as I said, in ggplot2 we always have the x and y axes. Here I have my whole canvas with my plot, my source, and so on. In the previous version, as you can see, there is a blank area here, and I want to fill it with the logo, so I have to say in which position I want the logo to go.
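The logo workflow being walked through here can be sketched as follows. This is a hedged sketch, not the presenter's script: the plot uses the built-in mtcars data as a stand-in, the position values are illustrative, and a plain colored rectangle made with image_blank() replaces the real logo so that the example runs without a downloaded file. In practice the logo would come from image_read() with the path to your own PNG.

```r
library(ggplot2)
library(magick)
library(grid)

# Placeholder "logo" so the sketch runs anywhere. With a real file you
# would call image_read() on the path where you saved the PNG instead.
logo <- image_blank(width = 200, height = 80, color = "steelblue")

# Stand-in plot; the tutorial uses its own data here.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(caption = "Source: mtcars (stand-in data)")

# Draw the plot first, then draw the logo on top of it.
print(p)
grid.raster(
  logo,
  x = 0.02, y = 0.02,          # position in fractions of the whole canvas
  just = c("left", "bottom"),  # anchor the logo by its bottom-left corner
  width = unit(1, "inches")    # logo width; height keeps the aspect ratio
)
```

Because x and y are fractions of the whole drawing area, changing them together with just moves the logo to any corner, for example x = 0.98, y = 0.98 with just = c("right", "top").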
So here I specify my x position, my y position, and how I want to do the justification. I want to put the logo in the bottom left of my graph, like that. I also set the width, which here is one, in inches. And I am calling grid.raster from the grid package. What is the role of grid and grid.raster here? They are used to combine the two pieces: I have my graph made with ggplot2 and I have my logo, and grid.raster draws the logo on top of the graph at the position given by these arguments. To summarize, if you want to add a small logo to your graph: first load magick; download your logo, save it, and give its path to image_read; then use grid.raster from grid, which combines the graph with the logo; and specify the x position, the y position, and the justification, here left and bottom. I could also put it at the top right, the bottom right, or the top left. As I told you, I chose the bottom left because that area was blank and I wanted to fill it with the logo. If you have a question, please let me know. I can see there are lots of questions, so I think we will have time at the end of this presentation to take care of all of them.
Okay, let's go back to this screen. Now I am going to introduce another ggplot2 extension, which is patchwork. patchwork is a great, easy-to-use package for combining two or more graphs using arithmetic operators. For example, we can use the plus operator to simply combine two plots, or three, or five. We also have the vertical bar, which places plots next to each other.
And we also have the forward slash, which places them on top of each other. Let me show you an example. I load the patchwork library. As you can see, I have two graphs. When you want to combine graphs with patchwork, it is very important to assign each graph to an object. So I assign my first graph to p1 and my second graph to p2: p1 holds my scatter plot and p2 holds my bar plot. I want these two graphs one on top of the other, so I use the slash here. If I just want to put them together, I can use plus. And if I want these two graphs next to each other and a third graph here, I can write p1 and p2 with the vertical bar, and p3 with the slash; then two of them are displayed next to each other and one below. This is a very easy way to combine ggplot2 graphs, and it is less complicated than using, for example, grid, grid.raster, and layers. I will also share all the documentation for patchwork in the Slack channel of this tutorial.
Now I will show you how to export graphs using ggsave. I use ggsave to preserve the quality of my graphs: the result is better than copying and pasting from RStudio or taking a screenshot. In ggsave I specify the path. This is the path on your computer; you can copy and paste it from the properties of the folder where you want to put your graph.
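The patchwork operators and the ggsave call discussed in this part can be sketched together. This is a minimal sketch under assumptions: the three plots use the built-in mtcars data as a stand-in, and the output goes to a temporary folder with an illustrative file name and size. The operators (+, |, /) and the ggsave arguments (path, width, height, units, dpi, limitsize) are the ones named in the talk.

```r
library(ggplot2)
library(patchwork)

# Each plot is assigned to its own object, as the talk recommends.
p1 <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(x = factor(cyl))) + geom_bar()
p3 <- ggplot(mtcars, aes(x = mpg)) + geom_histogram(bins = 10)

added    <- p1 + p2          # simple combination
stacked  <- p1 / p2          # p1 on top of p2
combined <- (p1 | p2) / p3   # p1 and p2 side by side, p3 underneath

# Save the combined figure. tempdir() stands in for the folder path
# that you would copy from your own computer.
out_dir <- tempdir()
ggsave(
  filename  = "combined.png",  # the extension selects the output format
  plot      = combined,
  path      = out_dir,
  width     = 8, height = 6, units = "in",
  dpi       = 300,
  limitsize = TRUE
)
```

ggsave picks the graphics device from the file extension, and its documentation lists tiff among the supported devices alongside png, jpeg, and pdf, so changing the file name is normally enough to switch formats.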
Then you specify the name of your graph and the file extension. Here I have JPEG, but there is also the PNG format. You specify the width, the height, the units, which here are inches, the dpi, and limitsize. Then you can easily save your graph and find it in that folder.
Now the presentation has come to its end. Thank you, everyone, for joining. You can reach us on Twitter, on GitHub, or via email, and also in the Slack channel. I will take a look at the chat now, so if you have a question for me, please put it there.
Thank you, Haifa. Thank you. We, the co-authors of this tutorial, have already answered almost all of the questions in the chat. If there is an extra question, we are here and we have extra time, so don't hesitate to ask. We are available to answer at any time, and you can reach us via social media as well. So the floor is yours.
Yes. Just one thing, I can't see the chat. So Catherine, if there is a specific question, please share it with me.
Yeah, sure. Okay. One of the participants asked about the TIFF format: is it possible to export graphs in TIFF format?
I'm not really sure, but let me see. Yes, this is a specific format. I think it is possible to export our visualizations in TIFF format. From RStudio you can export a graph as an image, in PNG or JPEG format, or as a PDF; those are the options displayed there. But I think they can try it with ggsave. Maybe ggsave supports more formats than RStudio.
Okay, great. Let me check the other questions. There is another question: the levels are always labelled in alphabetical order, but I think it would be easier to read if unskilled came first and then skilled. What's the code to do that? They are also asking about the R script, but it's already shared in our GitHub.
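One common way to do what this question asks, assuming it is about the order of the categories of a variable such as job type, is to set the factor levels explicitly before plotting. This is only a sketch with a made-up toy data frame; it is not the code the presenters later shared.

```r
library(ggplot2)

# Toy data; values and column names are assumptions for illustration.
toy <- data.frame(
  job_type = c("Skilled", "Unskilled", "Skilled", "Unskilled", "Unskilled"),
  income   = c(52000, 18000, 61000, 22000, 25000)
)

# By default the levels of a factor are sorted alphabetically.
default_levels <- levels(factor(toy$job_type))

# Give the levels in the order you want them shown on axes and legends.
toy$job_type <- factor(toy$job_type, levels = c("Unskilled", "Skilled"))

p <- ggplot(toy, aes(x = job_type, y = income, fill = job_type)) +
  geom_boxplot()
```

ggplot2 follows the level order of the factor, so the axis and the legend both show Unskilled before Skilled after this change.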
Yes, the R script is already shared in our GitHub. And if you want to specify another order of labels, not the alphabetical one, I will share some code that does this.
Okay, great. Let me check the other questions. Okay, they have almost all been answered by me and Amir.
Thank you. Thank you for your great effort in answering all the questions. I think there were a lot of them.
Yes. So thank you for the great work. If you still have questions, please send them to us. We still have some time to chat and answer all your questions.
Okay. I also highly recommend that people participate in challenges like TidyTuesday. Every Tuesday a dataset is shared by the R for Data Science community, and you can use it to make some visualizations. You can start with basic ones, and with time and experience you will improve. So it is very interesting to participate in this kind of challenge. Also this year, in April 2021, there was a data visualization challenge in which, in three days, we had to create graphs on different topics. It is very important to practice data visualization in order to get better, so I highly recommend participating in those challenges and practicing your skills. This is like the bottom of the stairs: you have to climb the stairs yourself. You now have the tools, and step by step you can get better.
Okay, that's great. Also, Haifa, thank you so much. Many people are asking about the missing values in these datasets. Can you explain why we chose this dataset and how to deal with them? I mentioned in the chat that, as a tutorial for beginners, we didn't apply a data preprocessing step. So if you want to explain more, the floor is yours.
Yes. We chose this dataset because it is a clear one, and it was also used in the regression handbook by Keith McNulty.
That book is designed for educational purposes, so the data was adapted for teaching and for making examples. Because this tutorial is mainly about data visualization, we chose not to deal with the missing values through data preprocessing: it would take a lot of time, and it is a tutorial in itself. Maybe in the next tutorial we will be able to explore how to deal with missing values. So in this tutorial we just omit the missing values; we don't treat them. But we could do some preprocessing. For example, we can fill the missing values in a numeric variable with the median or the mean, and for a categorical variable we can fill them with the most frequent value. There are many other data manipulations we could do. But, as you said, this is a tutorial for beginners and it is focused on data visualization, so we didn't want you to be distracted by data manipulation. We want you to spend your time on the visualization and get to know ggplot2 and its extensions better.
Yeah, exactly. It is also a great idea to think about another tutorial on the imputation of missing values and how to apply it in R and, why not, in Python. It would be another extension of our work, and we will try to do it as soon as possible.
Yes, maybe you can do this as a meetup.
Okay, so if you have any other questions, please feel free to ask us.
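The simple treatments of missing values mentioned in the answer above (dropping incomplete rows, filling a numeric variable with the median, filling a categorical variable with the most frequent value) can be sketched in base R. The toy data frame and its column names are assumptions for illustration; this is not a preprocessing step from the tutorial itself.

```r
# Toy data with one missing value in each column (illustrative only).
toy <- data.frame(
  income   = c(100, NA, 300, 400),
  job_type = c("Skilled", NA, "Skilled", "Unskilled")
)

# Option 1: omit rows that contain any missing value.
complete_rows <- na.omit(toy)

# Option 2: fill the numeric column with its median
# (mean() could be used in the same way).
toy$income[is.na(toy$income)] <- median(toy$income, na.rm = TRUE)

# Option 3: fill the categorical column with its most frequent value.
most_frequent <- names(which.max(table(toy$job_type)))
toy$job_type[is.na(toy$job_type)] <- most_frequent
```

Which option is appropriate depends on how much data is missing and why, which is the kind of question a dedicated tutorial on missing values would cover.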