Hello, my name is Maria Doyle, and in this tutorial I'm going to take you through how you can create a volcano plot using Galaxy. A volcano plot is often used to visualize results from RNA sequencing experiments, and there's an example shown in the image on the right. A volcano plot is a type of scatter plot where, in an RNA sequencing experiment, every point represents a gene, so there are many thousands of genes in the plot. The x-axis measures the amount of change of the genes in one condition versus another, and the y-axis shows the statistical significance. On the right-hand side of the plot we've got the genes that are upregulated, and on the left-hand side we've got the genes that are downregulated.
To generate a volcano plot we need a file of differential expression results. Here we're going to use a file from the RNA sequencing counts-to-genes tutorial that was generated using the limma-voom tool, but you can use a file from other tools, such as edgeR or DESeq2, as long as it has the required columns that we'll show you. The data for this particular example is from a public study, and the comparison we're looking at is genes in luminal cells from pregnant mice versus lactating mice.
Okay, so what we're going to do is import the data into Galaxy and then create a few different volcano plot examples. We're going to look at coloring the significant genes, and also at labeling genes with their gene names.
The two files we need can be found here in the tutorial. So what I'm going to do is copy the file links there and then move to Galaxy. Here I'm going to use Galaxy EU, but you can use any Galaxy that has the Volcano Plot tool. Before importing the files I'm going to give the history a name, which is a good thing to do. Then I'm going to upload the two files into Galaxy: I'm pasting the links into the Paste/Fetch data box, I'm going to set the file type to tabular, and click Start.
Okay, and then my files should start uploading into the history on the right. While they're doing that I'm going to move back to the tutorial, which I'm accessing in this Galaxy through the little graduation cap in the top menu. In this tutorial interface we can click on the Volcano Plot tool link that's highlighted in blue, and that will bring us to the Volcano Plot tool in this Galaxy, which is handy.
Okay, so our two files have uploaded, and I'm just going to check that they are file type tabular. They are, so that's good. Now I'm going to specify the input file, which here is the limma-voom file, and then the columns. So I need to say the FDR is in column 8 here, the P value is in column 7, the log fold change is in column 4, and the gene identifiers, which are what we're going to use for the labels, are in column 2. Then there's the significance threshold that we're going to use to decide which genes to color in the plot, so I'm going to change that to 0.01, as it was in the paper. And I'm going to change the log fold change threshold to 0.58, which is equivalent to a fold change of 1.5. I'm going to click Execute. So those settings, the log fold change and the significance thresholds, will determine which genes get colored the red and the blue that we saw before in the example.
Okay, and while that is running I'm just going to have a closer look at the input file. The only columns we need are one column of gene identifiers, the log fold change, the P value and the adjusted P value; you don't have to have the other columns in your file. In this input file every row is a gene. And for the log fold change, this one here with the negative log fold change is telling us that this particular gene is downregulated in the luminal cells from the lactating mice versus the pregnant. Okay, and our plot has been generated, so I'm going to click on that to open it.
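As a side note on the thresholds set above, here is a minimal Python sketch (not the tool's own code) of how a significance threshold of 0.01 and a log fold change threshold of 0.58 could sort genes into up, down and not significant. The gene names and values are made up for illustration, the `classify` helper is hypothetical, and the strict inequalities are an assumption rather than something confirmed for the Volcano Plot tool.

```python
import math

# Thresholds used in the walkthrough.
FDR_THRESHOLD = 0.01
LOGFC_THRESHOLD = 0.58  # log2 scale; roughly log2(1.5)


def classify(logfc, fdr, fdr_threshold=FDR_THRESHOLD, logfc_threshold=LOGFC_THRESHOLD):
    """Return 'up', 'down' or 'not significant' for one gene (hypothetical helper)."""
    if fdr < fdr_threshold and logfc > logfc_threshold:
        return "up"
    if fdr < fdr_threshold and logfc < -logfc_threshold:
        return "down"
    return "not significant"


# Made-up example rows: (gene, log fold change, FDR).
rows = [
    ("GeneA", 2.1, 0.0001),   # significant, large positive change
    ("GeneB", -1.3, 0.004),   # significant, large negative change
    ("GeneC", 0.2, 0.0005),   # significant, but change too small
    ("GeneD", 1.9, 0.2),      # large change, but not significant
]

labels = {gene: classify(logfc, fdr) for gene, logfc, fdr in rows}

# A log2 fold change of 0.58 corresponds to a fold change of about 1.5.
fold_change_equivalent = 2 ** LOGFC_THRESHOLD
log2_of_1_5 = math.log2(1.5)

print(labels)
print(round(fold_change_equivalent, 3), round(log2_of_1_5, 3))
```

Only genes that pass both thresholds get a color in this sketch, which is why a gene with a tiny FDR but a small change, or a big change but a large FDR, stays in the uncolored group.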
There we go. We have our first volcano plot, with the genes that met our thresholds colored: the red genes have an FDR less than 0.01 and a log fold change greater than 0.58, and the blue genes have an FDR less than 0.01 and a log fold change less than minus 0.58.
Okay, and now I'm going to create another volcano plot, and I'm going to use the rerun button here. This time I'm going to leave all the settings the same, and for the points to label I'm going to say label the significant genes, and I'm going to label the top 10. And generate another plot there.
While that's running, I'm going to go back to the tutorial. So these were the settings that we specified for our first plot, and that was the plot we had created. And there was a question here: why does the y-axis use a negative log10 P value scale? So why is it negative here? That is because we want the most significant genes to be at the top, because that makes them easier to see. If we didn't do that, the most significant genes, the genes with the smallest P values, would be squished at the bottom there.
Okay, and now our next plot is generated, so we'll have a look at that. Okay, great, there we go. Now we can see we've got the gene names for the top 10 most significant genes. And we can see that Csn1s2b is our most significant gene, and it's got quite a large log fold change as well.
Okay, and then for our final example we're going to keep all the same settings again, but this time we're going to label genes that we're inputting from a file. That was the second file that we uploaded, and I'll show you what that looks like. So here it is: it's a file with a single column, and it's got a set of gene names, with 31 genes there plus a header row, and we're going to label those genes in the plot. And this time, just to show you, you can put boxes around the gene names, if that makes it easier to highlight the gene names in the plot.
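To illustrate the y-axis question from a moment ago, here is a small Python sketch of the negative log10 transform and of picking the most significant genes to label. It is only an illustration under stated assumptions: the gene names and P values are made up, the helpers `neg_log10` and `top_n_by_p` are hypothetical, and ranking by raw P value is an assumption about how the "top" genes are chosen, not a description of the tool's internals.

```python
import math


def neg_log10(p_value):
    """Height of a point on the volcano plot's y-axis: -log10(P)."""
    return -math.log10(p_value)


def top_n_by_p(rows, n=10):
    """Return the names of the n genes with the smallest P values (hypothetical helper)."""
    ranked = sorted(rows, key=lambda row: row[1])
    return [gene for gene, _ in ranked[:n]]


# Made-up example rows: (gene, P value).
rows = [
    ("GeneA", 1e-8),
    ("GeneB", 0.03),
    ("GeneC", 1e-3),
    ("GeneD", 0.5),
]

# Smaller P values become larger heights, so the most significant genes sit at the top.
heights = {gene: neg_log10(p) for gene, p in rows}

# Label only the two most significant genes in this tiny example.
top2 = top_n_by_p(rows, n=2)

print(heights)
print(top2)
```

On the raw scale, P values of 1e-8 and 1e-3 are both squeezed close to zero; after the transform they sit at roughly 8 and 3, which is what spreads the most significant genes out at the top of the plot.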
So we'll do that and click Execute. Okay, and while that's running, let's go back to the tutorial again. So here's the second plot we made, and the question of which gene was the most statistically significant. We saw that was this gene here, Csn1s2b. And now this is what we're doing, labeling these genes of interest in the volcano plot. You might do this if you've got a favorite pathway or set of genes and you want to see where they're located in the plot, or to highlight them.
Okay, and that is done now. So we've got our 31 genes labeled with the boxes around them. And we can see that all except two of the genes are significant. There are just two genes in the gray area, with their lines showing that their points are in the gray, and those are Mcl1 and Gmfg. And if we return to the tutorial: how many of the genes of interest are significant? That was 29 out of 31. And which gene is the most statistically significant? That would be the one here, Egf.
Okay, and if you want, you can select in the tool form to output the R code. That will allow you to customize the plot further using R, and we've got another tutorial in Galaxy that shows you how to do that.
Okay, so I hope this has helped you to see how you can generate a volcano plot using Galaxy. If you have any more questions you can ask them through the links here. And we'd love it if you could let us know if you liked the tutorial or if you have any suggestions for improvement. Thank you very much, and I hope you have a nice day.