In this tutorial I want to show you how to create bar charts in Plotly using the R programming language. I've opened a new file here in RStudio, and it is an R Markdown document. I'm going to go through a few steps in case you're not that familiar with R or have just started. Perhaps you're using a different plotting library such as ggplot2, which is the de facto standard. I do, however, really prefer Plotly, and I can use Plotly in other languages such as Julia and Python.

So let's go through a few things. That's our usual setup there. I always set the working directory to the current working directory and set the seed. I'm just using 1234 so that the pseudo-random number generator produces the same values for us. The library that we are going to use is plotly. If you haven't installed plotly, just use install.packages("plotly"); it is on CRAN.

So, the introduction. What are bar charts? Bar charts visualize counts of the unique data point values in the sample space of a categorical variable. So what are we talking about here? On the x-axis you have to have a categorical variable, and a categorical variable has a set of unique values. If I were to think along the lines of medicine, I might have different diseases: diabetes, hypertension, and so on. We're going to count how many times each of those occurs in a variable. So I might have a variable called disease, and I'm going to count the instances of each disease in that variable. The sample space is made up of a list of unique values, and every subject (every row in a spreadsheet, if you want to see it that way) will have one of those values. We're just counting the occurrences of each.

So let's create some simulated data. I'm going to create a computer variable called cities, using the sample command. There is the vector of strings that I'm passing: New York City, Boston, LA and Seattle.
I want 100 values of those, and I have to say replace equals TRUE so that if a name is drawn, it's thrown back in the hat for pseudo-random selection again.

So let's run all of these. I'm going to run my setup here. There it goes, that works perfectly. Let's run my cities, the computer variable that I've created. Now let's use the table command to have a look at what was generated inside of cities. There you go: there were 24 in Boston, 26 in LA, 31 in New York City and 19 in Seattle. So that's great.

Now I want to plot this, but I've got to think about how to get this into Plotly so that it charts properly. Remember there's an as.numeric command. I'm going to call that on the table of the cities, and what that gives me back is just the numeric values as a vector: the 24, the 26, the 31 and the 19. So let's go for that. And we see indeed I have my vector of 24, 26, 31, 19. But I would also like to print the names in my bar chart, so I actually want the names that were generated at the top. For that, instead of using as.numeric, I'm going to use the names command. So that's names of the table of the cities, as simple as that. So let's go. And then we have Boston, LA, New York City and Seattle.

With that in mind, let's look at our first chart, a simple bar chart. I'm going to call my first chart p1; within the confines of the naming conventions for computer variables in R, you can name it what you want. plot_ly (plot, underscore, ly) is the command. It's going to take an x value, which is what goes on the x-axis, and there I want the names: Boston, LA, New York City and Seattle. Then on the y-axis I want the values, so the heights of the bars must be these counts: how many times we had Boston, how many times we had LA, how many times we had New York City and how many times we had Seattle. I'm just going to name it cities.
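The data steps described so far might look roughly like the sketch below. The on-screen code is not reproduced in this transcript, so the exact city strings and helper names such as `city_table`, `counts` and `labels` are assumptions for illustration. The counts you get may also differ from the 24, 26, 31 and 19 quoted above, because the default behavior of `sample()` changed in R 3.6.0.

```r
# Fix the seed so the pseudo-random draws are reproducible (1234 as in the transcript).
set.seed(1234)

# Assumed city labels; the exact strings used in the video are not shown.
city_names <- c("New York City", "Boston", "LA", "Seattle")

# Draw 100 city names, putting each drawn name "back in the hat".
cities <- sample(city_names, size = 100, replace = TRUE)

# Frequency table: one count per unique city, ordered alphabetically by name.
city_table <- table(cities)
print(city_table)

# Split the table into the two vectors a bar chart needs.
counts <- as.numeric(city_table)  # bar heights
labels <- names(city_table)       # bar labels
```

Both vectors come from the same table, so `labels[i]` always belongs to `counts[i]`.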
For now, strictly speaking, I don't have to do that yet, because I'm only plotting one thing on my plot: just these four bars. But later on we're going to see that we can add things on top of that, and then this name becomes quite important. So for now I'm going to say name equals cities. The type of plot that I want is a bar. This is one of the types of graphs or plots that Plotly can do, and we use the name bar. Then we're just going to call p1. Let's see what happens.

So there we have our first beautiful plot in Plotly. If you've never seen Plotly before, and you're wondering why you stuck around until this point in the tutorial, have a look at this. It is interactive. It's beautifully interactive. As I hover over these, you can see there's Boston, and up comes 24. I hover there and it says LA, 26; New York City, 31; and Seattle, 19. Beautiful.

I can also have a look at these tools. I can download this plot immediately as a PNG. That'll just be a snapshot, so it won't be interactive. I can zoom in, I can pan around, and I can zoom out again. If I lose my place, I can just reset the axes. And with these two little buttons I can say where the hover information appears. Right now you can see NYC appears at the bottom and 31 at the top, but if I click on this one, that information is shown all together. And if I click on this, it opens the plot on the Plotly website. Now, Plotly is also a website. You can store your plots there, you can create dashboards, and you can even change the look and style of your plots right on their website. So these are massively, nicely interactive plots, and they look beautiful. I really do love Plotly.

So that's one thing. Let's add a title and axis names. It's exactly the same thing. I'm going to call it p2 this time, but it's plot_ly again, with the names for x, as.numeric for y, the name is cities and the type is bar.
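As a minimal sketch, the first chart described above could be written like this. It assumes the plotly package is installed and rebuilds the simulated data so the chunk runs on its own; the city strings and the name `city_table` are illustrative assumptions, not the code from the video.

```r
library(plotly)

# Rebuild the simulated data so this chunk is self-contained.
set.seed(1234)
cities <- sample(c("New York City", "Boston", "LA", "Seattle"),
                 size = 100, replace = TRUE)
city_table <- table(cities)

# One bar per unique city; bar height is the count for that city.
p1 <- plot_ly(
  x = names(city_table),       # categories on the x-axis
  y = as.numeric(city_table),  # counts on the y-axis
  name = "Cities",             # trace name, used once more traces are added
  type = "bar"
)

# Calling the object renders the interactive chart.
p1
```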
We have our %>% (percent, greater-than, percent) pipe, which creates a pipeline here, because after plot_ly comes the layout. Now I can add some layout. The layout takes a title, an xaxis and a yaxis as arguments. The title is just going to be a string, "Number of offices in each city" for instance. The xaxis argument takes a list of key-value pairs: title being Cities, and zeroline being FALSE.

Now, the zero line. You see this black line at the bottom? That is the zero line of the y-axis. You might also get one going up the side here, and that would be the zero line of the x-axis. So this bottom one actually belongs to the y-axis, and the one going up belongs to the x-axis. You can set that to FALSE if you don't like that big black line, so you just say zeroline equals FALSE. You'll see that I've done that for both of them. It's just my convention; I don't like that black line most of the time. The yaxis then has the title Number. Let's run this and see what the difference is. There we go. We have nice x-axis and y-axis titles written there, and we have the title at the top.

So let's create some simulated data for a data frame, because these are easy enough to do. So far I just generated some values, but most often you will import a CSV file or you will create a data frame. So let's do that. I'm going to create a data frame, and it's going to have a variable called cities, which comes from the cities that I created up top. It's going to have a group column, which samples from A and B, 100 samples, with replace equals TRUE. And let's have a look at the first six rows there. We have the cities, and we see that's a factor, and the group is a factor too; that was done automatically for me here. LA, LA, Seattle, LA, New York City; B, A, A; and so on for the first six rows. Now I'm going to create two new data frames, because what might I want to do here?
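The layout step and the simulated data frame might be sketched as follows. The argument names `title`, `xaxis`, `yaxis` and `zeroline` are Plotly's own; the data frame name `df` and the city strings are assumptions. The transcript reports both columns as factors, which matches the default of `data.frame()` before R 4.0.0; here `factor()` is called explicitly so the result does not depend on the R version.

```r
library(plotly)

set.seed(1234)
cities <- sample(c("New York City", "Boston", "LA", "Seattle"),
                 size = 100, replace = TRUE)
city_table <- table(cities)

# Same bar chart as before, piped into layout() to add titles.
p2 <- plot_ly(x = names(city_table), y = as.numeric(city_table),
              name = "Cities", type = "bar") %>%
  layout(
    title = "Number of offices in each city",
    xaxis = list(title = "Cities", zeroline = FALSE),  # hide the x-axis zero line
    yaxis = list(title = "Number", zeroline = FALSE)   # hide the y-axis zero line
  )
p2

# Spreadsheet-style data: one row per subject, two categorical columns.
df <- data.frame(
  cities = factor(cities),
  group  = factor(sample(c("A", "B"), size = 100, replace = TRUE))
)
head(df)  # first six rows
```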
I might want to group these up by group A and group B, and these might be two companies, or two different groups in an organization. Think about it this way, though. This is how you would get your CSV file, your spreadsheet file or your data. It will just be in columns, these two columns or many more, and rows of data. And somehow I have to change this into something that I can put in a bar chart. So I've decided to start the series with bar charts because they are slightly more difficult to do than the others. You've got to think about how to convert this spreadsheet-style data frame into something that you can plot.

The way that I'm going to show you here is just one way. This is R; there are many ways to go about things, and more efficient ways, but I thought of doing it this way just to reiterate what is available in R and go through a few concepts. So, groups. I'm going to create two sub data frames. One is called group A and the other one group B. I'm just going to use the filter function, and I'm going to group together everyone that has the value A in the group column, and then B. So there we go. We're using the filter function there, or the filter verb I should say, from dplyr. That was made available for us automatically when we loaded plotly.

So let's have a look at the table of group A's cities. If I do that, I get a similar sort of thing to what I got before using this table command. Again I can see Boston, LA, New York City and Seattle and how many there were, but now it's only for group A, because I'm calling it on group A. And again, if I only want the names I use names, and if I only want the values I use as.numeric.

So I'm going to create a bar chart data frame. I'm going to call it G bar chart, G for grouped bar chart. The cities column is going to have the names.
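The split into two sub data frames could look like the sketch below. The names `df`, `group_a` and `group_b` are assumptions. `filter` is called as `dplyr::filter` here so that the chunk does not rely on which packages happen to be attached; dplyr is installed as a dependency of plotly.

```r
set.seed(1234)

# Simulated spreadsheet-style data, as described above (names are assumptions).
df <- data.frame(
  cities = factor(sample(c("New York City", "Boston", "LA", "Seattle"),
                         size = 100, replace = TRUE)),
  group  = factor(sample(c("A", "B"), size = 100, replace = TRUE))
)

# Keep only the rows belonging to each group.
group_a <- dplyr::filter(df, group == "A")
group_b <- dplyr::filter(df, group == "B")

# Counts per city, now restricted to group A.
table(group_a$cities)
names(table(group_a$cities))       # just the city names
as.numeric(table(group_a$cities))  # just the counts
```

Because `cities` is a factor, `table()` reports every city for each group, including any city with a count of zero, so the names line up across both groups.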
Now, because the names are the same for both of these, it works out well for me: for both A and B they're going to be Boston, LA, New York City and Seattle. Then for group A and group B I'm going to have the numeric values, the as.numeric of the table for the respective group. Let's have a look at the head there. There are just four rows here, because remember there are only four cities in this data set, and now the counts go inside the group columns. So you can see what I've done here. It is a sort of pivot table that I've created by using code. We've gone from this to this for each group, because this is what I can very easily feed into Plotly.

So let's go and see how we can do this. I'm going to call it p3, my third plot. It's plot_ly, and now I pass the data first, since it can take a data frame as an argument. So I'm passing G bar chart. For x, we now use the tilde symbol to say that it is one of the columns or variables in this data frame. So cities goes on the x-axis; it's going to take those four. The y-axis is going to be the values for group A. The type is bar. And now, importantly, the name comes in: group A. I'm going to create a pipeline because I'm going to add a second trace. Remember I said you can plot more than one thing on the same plot, and you do that with the add_trace command. The y is now going to be the group B values. And we continue this pipeline because we add some layout. On the y-axis we're going to have a list with a title called cities. And more importantly, we have barmode, and this is the one that's important here. We say group, so it's going to create a grouped bar chart for us.

Let's have a look at that. There we go. Now we can see group A and group B. I've only written the g there in lowercase; I don't like that, so let's fix it. There we go. We rerun this, and we have an uppercase G there. There we go.
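Put together, the pivoted data frame and the grouped chart might be sketched like this. The names `g_bar_chart`, `group_a` and `group_b` are assumptions, and the y-axis title is set to "Count" here as an illustrative choice, since the axis shows counts.

```r
library(plotly)

set.seed(1234)
df <- data.frame(
  cities = factor(sample(c("New York City", "Boston", "LA", "Seattle"),
                         size = 100, replace = TRUE)),
  group  = factor(sample(c("A", "B"), size = 100, replace = TRUE))
)
group_a <- dplyr::filter(df, group == "A")
group_b <- dplyr::filter(df, group == "B")

# One row per city, one count column per group (a hand-made pivot table).
g_bar_chart <- data.frame(
  cities  = names(table(group_a$cities)),
  group_a = as.numeric(table(group_a$cities)),
  group_b = as.numeric(table(group_b$cities))
)
head(g_bar_chart)

# The tilde (~) tells plot_ly to look the column up in the data frame.
p3 <- plot_ly(g_bar_chart, x = ~cities, y = ~group_a,
              type = "bar", name = "Group A") %>%
  add_trace(y = ~group_b, name = "Group B") %>%  # second set of bars
  layout(yaxis = list(title = "Count"), barmode = "group")
p3
```

The second trace inherits `x` and `type` from the first call, so only `y` and `name` need to change.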
So it's a grouped bar chart. It's going to put group A and B together for Boston, for LA, for New York City and for Seattle. And that is why we had to go this long route of creating this pivoted table to generate it. Again, fantastic. If I click on that, group A disappears. Click on the other and group B disappears. I can just look at group A, or I can just look at group B. And that really makes Plotly fantastic.

A stacked bar chart is going to be exactly the same. Everything is generated the same way except that the barmode is now different: it's set to stack. So let's have a look at that. And there we go. They are stacked. Everything else is exactly the same.

Now, these were the default colors. It starts with this blue and then goes on to orange, et cetera. Let's change these colors. So we have our original plot again: x is going to be the names of the table of cities, and y the as.numeric of the table of cities. That was the original plot we did. The name is cities, the type is bar. And now we bring in this new argument called marker. It takes a list of key-value pairs, the color being this RGB value. That's an orange. And I'm using RGBA; the A is the alpha value, for opacity, and that's 0.7, so 70% opacity. I'm also going to put a line around the bars. That is a black color, but at only 50% opacity, and I'm stating a line width as well, of 1.5. It's still part of this pipeline, so the layout follows with all the things that we've seen before. So let's have a look at that. There we go. We have this see-through orange color here and this 50% opacity black, which actually makes it a bit of a gray, as the outline around the bars.

So let's change the tick angle of the x-axis. We've all seen cases where the labels are too long and they start overlapping each other. One quick and easy way to get rid of that is just to change the tick angle, and that's the only thing I've brought in here.
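The stacked variant and the recolored chart could be sketched as below. The exact RGBA values are assumptions (an orange at 70% opacity and a black outline at 50% opacity, as described), as are the object names `p4`, `p5` and `g_bar_chart`. Base subsetting is used to build the counts here only to keep the chunk short.

```r
library(plotly)

set.seed(1234)
df <- data.frame(
  cities = factor(sample(c("New York City", "Boston", "LA", "Seattle"),
                         size = 100, replace = TRUE)),
  group  = factor(sample(c("A", "B"), size = 100, replace = TRUE))
)
city_table <- table(df$cities)
g_bar_chart <- data.frame(
  cities  = names(city_table),
  group_a = as.numeric(table(df$cities[df$group == "A"])),
  group_b = as.numeric(table(df$cities[df$group == "B"]))
)

# Stacked bars: identical to the grouped chart except for barmode.
p4 <- plot_ly(g_bar_chart, x = ~cities, y = ~group_a,
              type = "bar", name = "Group A") %>%
  add_trace(y = ~group_b, name = "Group B") %>%
  layout(yaxis = list(title = "Count"), barmode = "stack")
p4

# Single-trace chart with a custom fill color and outline.
p5 <- plot_ly(
  x = names(city_table), y = as.numeric(city_table),
  name = "Cities", type = "bar",
  marker = list(
    color = "rgba(255, 165, 0, 0.7)",                     # orange, 70% opacity (assumed value)
    line  = list(color = "rgba(0, 0, 0, 0.5)", width = 1.5)  # black outline, 50% opacity
  )
) %>%
  layout(title = "Number of offices in each city",
         xaxis = list(title = "Cities", zeroline = FALSE),
         yaxis = list(title = "Number", zeroline = FALSE))
p5
```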
That is one of the other key-value pairs inside the xaxis list that we create here, xaxis being one of the arguments of layout. Otherwise everything is exactly the same. Let's go for that. And there we go. We see this negative 20 degree angle here, in case the labels become too long.

Now let's specify the color of each bar. In order to do this, you must know beforehand how many bars there are. So you've got to write a few lines of code, specifically this names or as.numeric, just to see how many bars there are, so that you can color each of them separately. That is again under the marker, in the list that is passed to the marker argument, and it's color that we pass this time. This is a vector of RGBA colors, with opacities. We see it's going to be these grays at 70% opacity, then this reddish color, also with a bit of opacity, and then gray again. So the colors go in the order of the bars. The rest is exactly the same. So let's have a look at this. And now we have New York City standing out there in red.

So in short, that's your introduction to Plotly. I think these plots are absolutely fantastic. They really are beautifully interactive and there's so much more you can do
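To recap the last two tweaks, here is a minimal sketch that combines the tick angle and the per-bar colors in one chart (the transcript shows them as separate plots). The RGBA values, the city strings and the names `bar_colors` and `p6` are assumptions for illustration.

```r
library(plotly)

set.seed(1234)
cities <- sample(c("New York City", "Boston", "LA", "Seattle"),
                 size = 100, replace = TRUE)
city_table <- table(cities)

# Check how many bars there are, and in which order, before choosing colors.
names(city_table)

# One color per bar, in bar order: gray, gray, red (New York City), gray.
bar_colors <- c(
  "rgba(204, 204, 204, 0.7)",
  "rgba(204, 204, 204, 0.7)",
  "rgba(222, 45, 38, 0.8)",
  "rgba(204, 204, 204, 0.7)"
)

p6 <- plot_ly(
  x = names(city_table), y = as.numeric(city_table),
  name = "Cities", type = "bar",
  marker = list(color = bar_colors)  # vector of colors instead of a single value
) %>%
  layout(
    title = "Number of offices in each city",
    xaxis = list(title = "Cities", zeroline = FALSE,
                 tickangle = -20),   # rotate the x-axis labels by -20 degrees
    yaxis = list(title = "Number", zeroline = FALSE)
  )
p6
```

The color vector is matched to the bars by position, which is why the order returned by `names(city_table)` matters.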