 So, visualization with Circos. Circos is a piece of software written by Martin Krzywinski a long time ago. It's been around for a very long time, it's been very popular over the years, and it makes these absolutely gorgeous plots. You can see a lot more plots in both his paper and on the Circos website. Today we'll be talking specifically about the Circos Galaxy tool wrapper. This is a Galaxy tool which wraps most of the functionality of Circos. We can't wrap all of it, because Circos allows some things that we can't safely do in Galaxy. So we'll be going through the background first: how Circos plots are made. We'll talk a little bit about the individual track types within Circos and how the data needs to look so you can use those track types. And then we'll follow that up with the cancer genomics example. We won't be doing the other examples for now, but you're welcome to follow them on your own. And of course, Circos is not just for genomics data. The key point that I want you to remember today is that making a Circos plot is an iterative process. You start building your plot and it probably doesn't look great at first, but you improve it over time to get something that you're really happy with, that summarizes all of the data that you want to include. It just takes some time. You need to make these steps and improve on them, and you aren't going to get to the end immediately. Circos is very flexible, but it's very complex. Hopefully we've been able to abstract away as much of that complexity as possible in the Galaxy interface. So let's talk about some Circos basics quickly. This is the ideogram. This is the thing that appears around the outside edge, and all of your data is plotted relative to it. This could be chromosomes, but it doesn't have to be. In most of the examples we'll cover, though, we define the ideogram to be the karyotype of a genome, which is all of the different chromosomes and their lengths. 
However, again, you can use other data if you want. Here are some examples with and without cytogenetic bands. These just make your plots look nicer and may be relevant for biological reasons. You can plot one genome or you can plot multiple. Here in this example, they're plotting three different genomes all together at once. There are lots of different options that you have with Circos. Now, importantly, let's talk about the different types of data that you can plot with Circos. The first and most important type is the scatter plot. This is a very easy way to get started with data. All of your data needs to match this format, though: a four or five column tabular dataset. The first column needs to be the chromosome, or the portion of the ideogram that's relevant. Then it needs a region, so a start and an end, and then a value. You'll notice these are sort of buckets, like a histogram data type. You'll notice in Circos that a lot of the data types share the same format. Maybe your start and end are a single base. Maybe it's more. You have a lot of freedom with Circos, and you'll notice that this is not a standard format. So you are free to convert any of the different types of data that you have into a format that Circos can process. Additionally, you can specify optional attributes on top, in the attributes column. So if you want to color a single data point red to highlight it from the rest, you're able to do that. Then there's the line track type. This is identical to the scatter plot. Histogram, again identical. Heatmap, again identical. All of the data that goes into Circos is really chromosome, start, and end at its very basic. The scatter, line, histogram, and heatmap are all chromosome, start, end, and value. The first different type we have is the tile track type, which is chromosome, start, end, and a label instead of a value, which is more or less a value. 
And you can also have additional attributes if you want, just like with the other data types. Text is identical to the tile plot. Link track types are very different now. We have a chromosome, start, end and then another chromosome, start, end. So this looks a lot like two BED files next to each other, for instance, where we're saying here is one region and here is another region, and these two regions are related, so Circos draws a link between them. There are two different ways you can write link track data, depending on how your data is formatted and how you want to process it. The first option is just region one and region two on the same line. The second option is to have a link ID with region one, and then the same link ID with region two, on separate lines above and below each other. But you have to link them together with this link ID, which is a little bit harder to do sometimes. So we often use the first data format. Ribbons are exactly the same as the link track type. So you'll notice that a lot of these track types are basically the same thing. They're all tabular datasets, which Galaxy excels at editing. The format is the same for histogram, scatter, line, and heatmap. Tile and text labels share a format, and so do ribbons and links. So there's not so much you have to know about how Circos data looks to be able to start using it. Tracks can be customized, of course. You can configure all sorts of different things about the track, like the radius: where between the center and the edge of your plot this track should go. You can specify lots of different rules, which we'll cover a little bit later. These let you say things like, I want values over some threshold to be colored red, or values under it to be colored green. I often use this for GC skew, coloring positive GC skew red and negative GC skew blue, something like this. And you can also specify axes and backgrounds to make your plots a little bit easier to read. 
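To make the three families of track formats described above concrete, here is a minimal Python sketch that builds one tab-separated row of each kind. The helper names, the coordinates, and the `color=red` attribute string are illustrative assumptions, not something taken from the Galaxy tool.

```python
def value_row(chrom, start, end, value, attributes=None):
    """Row for scatter, line, histogram, and heatmap tracks."""
    fields = [chrom, str(start), str(end), str(value)]
    if attributes:  # optional fifth column, e.g. to highlight one point
        fields.append(attributes)
    return "\t".join(fields)


def label_row(chrom, start, end, label):
    """Row for tile and text tracks: a label takes the place of the value."""
    return "\t".join([chrom, str(start), str(end), label])


def link_row(chrom1, start1, end1, chrom2, start2, end2):
    """Row for link and ribbon tracks: two regions side by side."""
    fields = [chrom1, start1, end1, chrom2, start2, end2]
    return "\t".join(str(f) for f in fields)


# Made-up example rows, one per format.
print(value_row("chr1", 1000, 2000, 0.42))
print(value_row("chr1", 2000, 3000, 0.97, "color=red"))
print(label_row("chr2", 5000, 5000, "geneA"))
print(link_row("chr1", 1000, 2000, "chr5", 7000, 8000))
```

Whatever produces your data, the goal is the same: get it into one of these simple column layouts before handing it to the plotting tool.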
And with that, we'll get started with the cancer genomics example. Okay, let's get started with making a Circos plot within Galaxy. So we've switched to Galaxy, and we'll be accessing the training materials again through the training-material-in-Galaxy interface. Clicking on this little icon, we'll find Visualization. And under Visualization is the Circos tutorial. I'll be scrolling down to the example, cancer genomics. Today we'll be reproducing this plot that was published in a paper. It was in a paper written by my partner a long time ago, Saskia Hiltemann, and she produced the Circos plot as part of that publication process using a custom tool. We'll now reproduce this plot using this very generic Circos tool that we've built together. And so this plot sort of starts to showcase what's possible with Circos. From the outside in, we've got a track of the chromosomes. This is human hg19, I believe. And the outside track then is a track of copy number variation. Next up is B-allele frequency. And then we have two tracks' worth of structural variants. The paper they published was about a cancer cell line and chromothripsis within this cancer cell line. So we'll reproduce these plots and showcase them for you. We'll start by uploading some data. We've got some files here and we'll go ahead and import the data. All of these datasets are tabular datasets, which is the primary thing that Circos can process. And once those are uploaded, we'll start with Circos. We will start with a karyotype. We're going to specify a karyotype from one that we uploaded. And then we're going to set some parameters for the ideogram. So I'm going to click on Circos here to activate the tool. Once those files are there, we will start by saying we want a custom karyotype. Here we will select the hg18 karyotype with bands file. Next up, in ideogram, spacing between chromosome units needs to be 50. Radius will be 0.85, thickness 45. 
For labels, we're going to make the label font size significantly larger, 64. And then cytogenetic bands. We will set the band transparency to two and the band stroke thickness to one. And we'll go ahead and click execute. This will create our first Circos plot. When it's done, we should also rename the dataset. We should rename it something like Circos plot ideogram. And while we're at it, we should rename our history. Otherwise, we're never going to be able to find this history again. I'm going to call this history Circos cancer genomics plot. Circos will take a minute to run, and when it's done, we'll be able to see our nice plot. In the meantime, let's look at this karyotype file quickly, since we didn't cover it in the slides. So this is what a karyotype file looks like. It's got this chr, a dash, and then an identifier with the chromosome number, something like this, which will be used to refer to the data later. Then there's a short label, a start and an end, and another identifier. It seems like a lot of duplication, but there are good reasons for that. And so I believe this column is what will be used to refer to the chromosome in the data. So if you have data, it should have chrX or chr19 or chr14 in the first column to identify which chromosome that data is related to. If you want to produce your own karyotype, you can do that. There is documentation on the Circos website and within the tutorial on how you can create these custom files. If you have data that's not genomics related, then you might want to do something like that. So our plot's done. Let's have a look at it. Here we can see we've got nice big, large numbers. That's why we changed the font size. We have our karyotype plotted: all of our different chromosomes have been plotted. And they have bands on them, because our file came with bands, down here at the bottom, showing their Giemsa stain. So, perfect. This is a great place to get started. Let's continue on. Our plot should look about like this. 
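For non-genomic data you would write a karyotype yourself. The sketch below assumes the usual Circos karyotype layout of `chr - ID LABEL START END COLOR`; the section names and lengths are made-up toy values, and `grey` is assumed to be a color name Circos knows.

```python
def karyotype_lines(lengths, color="grey"):
    """Build karyotype rows from a {identifier: length} mapping.

    Each row is: chr - ID LABEL START END COLOR
    The ID is what the first column of every data file must match.
    """
    lines = []
    for name, length in lengths.items():
        # Short label shown on the plot: drop a leading "chr" if present.
        label = name[3:] if name.startswith("chr") else name
        lines.append(f"chr - {name} {label} 0 {length} {color}")
    return lines


# Hypothetical two-segment "genome" with invented lengths.
toy_lengths = {"chrA": 1000, "chrB": 750}
for line in karyotype_lines(toy_lengths):
    print(line)
```

A data row for this toy karyotype would then start with `chrA` or `chrB` in its first column.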
If you've got something else, please ask. And if you want to define your own karyotype file, you can read about this here. We didn't use a built-in genome, because the built-in genomes in Galaxy contain everything in hg18 or hg38. I'm sorry, I'm not a human genomics person. And these include lots of little tiny bits of sequence where they know it's related to a chromosome, but it's not yet assembled properly. And we don't really want that. We just want a nice clean visualization to show our data. We aren't so worried about what might be happening in those bits. So let's start with structural variations. The structural variation tools produce a lot of different output formats. I think every single one of them produces a differently formatted output, but Galaxy is very happy to work with tabular data, very happy to work with text files. So we can use all the built-in tools in Galaxy to rearrange things, to cut out the columns we need and just the data points that we want. So, structural variations specifically, if you're not familiar with them: these are when a sequence of DNA is inserted, removed, or moved around. Maybe it's duplicated. Maybe it's duplicated in a different location. Maybe it's inverted, or maybe it's just translocated. And all of these different structural variations can happen within our genome. One of the interesting structural variations that can happen is known as the Philadelphia chromosome, observed in leukemia. Here there's a translocation of material between two different chromosomes: an interchromosomal translocation, which is something we'll be able to see here once we start working with the structural variation data. So this is the output format of our structural variation tool. It may be different from yours. This was CGA Tools. But we can see some columns here that we recognize, like left chromosome, start, strand, right chromosome, start, strand, and length. 
And we know that the Circos link format is chromosome, start, end, chromosome, start, end. So we can start by pulling out those columns and reformatting them into data that looks how we expect. So we're going to take our VCaP high confidence junctions and we're going to select everything that does not match this pattern. This pattern looks for any of these three characters as the first character on the line. So let's execute that. And then we're going to cut some columns from the output of our select tool. The tutorial suggests we rename this to structural variations Circos, which we can go ahead and do now. And when those get calculated, we'll have a nice reformatted dataset that we can pass directly to the Circos tool. So you'll notice here that we copy the same column twice: c3, c3, c7, c7. If we look at our structural variation data, we have a chromosome in column two, and then we have a start position in column three. With Circos, we can say the start and end of this ribbon or this feature is the same position. That's fine for Circos. We don't have some region. We just have an individual point. It's enough. There is a length column. If we wanted, we could cut out the entire length of the region that's being moved around. We'll start just by cutting out the position. And now we're going to rerun Circos. So in most tutorials, you've probably been running a tool just starting over fresh every time. With Circos, there's this iterative process we talked about, where you start with a plot and then you build and build and build and add some more layers on top of that. So what we're going to do is we're going to rerun the Circos plot. I'm just going to quickly look at this to see if the data looks right. It does. Fantastic. And then I'm going to go back to the Circos plot and I'm going to click rerun. 
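The select-then-cut step above can be sketched in Python as follows. The transcript only says the pattern matches "any of these three characters" at the start of a line and names columns 2, 3, and 7, so the pattern `^[#><]` and the right-chromosome column (column 6) are assumptions here, and the sample junction lines are invented.

```python
import re

# Assumed pattern for header and comment lines; adjust to your own file.
HEADER_PATTERN = re.compile(r"^[#><]")


def junctions_to_links(lines, left_chr=1, left_pos=2, right_chr=5, right_pos=6):
    """Turn junction rows into link rows (chr start end chr start end).

    Column indices are 0-based, so the defaults correspond to the
    1-based columns c2, c3, c6, c7. Each position is written twice
    because a single point is a valid region (start == end).
    """
    for line in lines:
        if not line.strip() or HEADER_PATTERN.match(line):
            continue  # skip blank, comment, and header lines
        cols = line.rstrip("\n").split("\t")
        yield "\t".join([
            cols[left_chr], cols[left_pos], cols[left_pos],
            cols[right_chr], cols[right_pos], cols[right_pos],
        ])


# Invented sample input with the assumed column order.
sample = [
    "#some metadata\n",
    ">Id\tLeftChr\tLeftPosition\tLeftStrand\tLeftLength\tRightChr\tRightPosition\tRightStrand\n",
    "1\tchr1\t100\t+\t50\tchr5\t900\t-\n",
]
for row in junctions_to_links(sample):
    print(row)
```

If you wanted regions instead of points, you could add the length column to the start to compute a real end coordinate.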
And this will have all of the configuration that our previous Circos plot had. And now we're going to add some more. We're going to start by adding this link track data. So we're going to go down to link tracks and we're going to insert some link data. It suggests that the inside radius is 0.95. And we're pulling the data from this SVs Circos file. In the Circos tool, we try to document the expected format every time. So: chromosome, start, end, chromosome, start, end. And you can see that our data looks approximately like that, but it's not exactly the same. The tool is just being a little bit flexible. And we'll be making a basic link type with a thickness of three and a Bezier radius of 0.5. I know these seem a lot like magic parameters, right? We're just setting a bunch of parameters and then, oh, look, we produced some beautiful plot. And yes, that's very much the case. What you're missing here is the 10, or sometimes even 15, steps that were between these two, where we experimented with how we wanted this data to look: we changed the thickness, we changed the Bezier radius to produce a good result, we experimented with the rules to highlight the different things we wanted, et cetera. So we're skipping some of those steps, because they're painful and unpleasant and not necessary for you to understand how the Circos tool works. But when you go to make your own Circos plots, expect some more of these steps in between. You aren't just going to know that, oh, I want Bezier radius 0.3 or whatever. You're going to have to try some different things and see how they look. So we're adding some rules. We're going to insert a condition to apply, based on interchromosomal links. We can do inter- and intrachromosomal. We'll add an action, and we're going to change the link color to red. Change stroke color: that says ribbons only. Let's try change fill color, to bright red, and we're going to execute that. 
So what that should do is, when the link is between two different chromosomes, it'll be red. Otherwise it'll be black, the default link color, which is of course configurable. Circos plots are usually pretty quick to run. If they're taking a long time, it can be because you're sending Circos a lot of data. And there are a number of tools that we've provided that will let you bundle links, or resample data, or change link density. All of these are very useful tools for taking a large amount of data and reducing it. If you have some continuous data like GC skew, and maybe you calculated this at a very high resolution, you can use something like the resample tool to make it lower-resolution data that plots more quickly. You can of course replot later with higher-resolution data if you want. So looking at this, we see exactly that. There are a lot of these little black links which are actually going within the same chromosome. So, clicking to zoom in: if you're a Firefox user, the image is shown full size automatically. If you're a Chrome user, you're zoomed in already and, sorry, there's no way around it. So looking at these links specifically, we can see, okay, all of these are within the same chromosome, whilst these red links, due to our rule, go between chromosomes. So that's so cool. We can now see that there's a lot of intrachromosomal variation and a lot less interchromosomal variation. We should rename this plot Circos Plot SVs, just to keep track of what we're doing. It should look something like this. So there are a couple of questions here. Are there more inter- or intrachromosomal structural variations? We can see a lot more black lines than we can see red. So quite obviously there's a lot more intra than inter. And which chromosome appears to have the most? You'll notice over here that chromosome five is dark black. This is because there are so many lines here that we can't actually see them all. 
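The two questions above can also be answered by counting rather than by eye. This is a small Python sketch that classifies link rows the same way the coloring rule does; the function name and the three example rows are invented for illustration.

```python
from collections import Counter


def summarize_links(rows):
    """Count intra- vs interchromosomal links and links touching each chromosome."""
    kinds = Counter()
    per_chrom = Counter()
    for row in rows:
        chrom1, _, _, chrom2, _, _ = row.split("\t")[:6]
        # Same chromosome on both ends means intrachromosomal (drawn black);
        # different chromosomes means interchromosomal (drawn red).
        kinds["intra" if chrom1 == chrom2 else "inter"] += 1
        per_chrom[chrom1] += 1
        if chrom2 != chrom1:
            per_chrom[chrom2] += 1
    return kinds, per_chrom


# Made-up link rows in chr/start/end/chr/start/end format.
rows = [
    "chr5\t100\t100\tchr5\t900\t900",
    "chr5\t200\t200\tchr5\t400\t400",
    "chr1\t50\t50\tchr5\t70\t70",
]
kinds, per_chrom = summarize_links(rows)
print(dict(kinds))
print(per_chrom.most_common(1))
```

On real data, the `most_common` call gives a quick sanity check for which chromosome is carrying most of the rearrangements.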
They're all overlapping each other. And we'll look at that in a little bit more detail. So let's rerun this Circos job, and we're going to limit it to just chromosome five. We're also going to change the spacing for this. And we'll execute that. And then I'm going to rename it to Circos Plot chr5. So what this will do is, Circos gets all of the data, but Circos knows that we only want to plot what's on chromosome five. So we'll be able to zoom in just on this region here. I was mentioning earlier that you could down-sample large datasets that you send to Circos. There are limits in place on running Circos and how many data points it'll plot. We've set those, but if you run it locally yourself, you don't have to be held to those limits. You can plot however much data you want. The limits, though, are generally enough for every plot you want to make. If you're hitting the limits, you should really down-sample your data. So this is just chromosome five, and we can see a huge number of structural variations here going between different portions of the chromosome. What's going on there? So this is chromosome five of a specific cancer cell line, and the q arm of this chromosome has a large number of variations. They're not evenly distributed across the genome. So something happened just to a portion of the chromosome. Chromosomes have two different arms, a p arm and a q arm, with the centromere between them. The short arm is termed p and the long one q, and so for us it's the 5q arm: this portion seems to be affected here. What could be going on is something called chromothripsis, which we'll talk about here. So chromothripsis is this phenomenon where part of a chromosome, or maybe the whole thing, is shattered in a single catastrophic event, and then the cell puts this chromosome back together, but it doesn't have a good ordering for it, so it just jumbles it up, and sometimes you even lose pieces. 
So chromothripsis is a huge, horrible event that causes a large number of rearrangements, and there are a couple of characteristics of chromothripsis that we'll be able to visualize in our plot. For instance, a large number of complex rearrangements, especially if they're localized to a single chromosome or a single chromosomal arm. We'll see a low number of copy number states, alternating between two, suggesting that the rearrangement occurred in a short period of time. And sometimes you'll see alternation of regions which retain heterozygosity with regions that have loss of heterozygosity. Next, we'll talk about copy number variation. The copy number variation we're getting from Affymetrix SNP arrays. So this is some data that was available during that experiment, and it will help us look at some of the structural variation data and confirm some of our hypotheses about it, like that chromothripsis happened in chromosome five here. So the human genome is diploid to start with. There are two copies of each chromosome, one paternal and one maternal, and for any given gene humans have two different copies of it. But sometimes structural variants will lead to a change in this copy number. If there are duplications, then we expect to see increases in the copy number. If there are multiple duplications, we'll see even more. If there's just an inversion structural variant, we won't see any copy number variation. And lastly, if there's a deletion, we'll of course see a decrease in the copy number, because there are fewer reads that map to that location in the reference genome, or none, because it was deleted. DNA microarrays give us a log R ratio that we often plot, that looks something like this. It lets us know that if it's zero, it's the normal copy number, two in the case of a diploid genome, and if there are significant deviations from this expected value, we'll see that as points above or below. 
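As a rough guide to reading those values, the sketch below uses the idealized definition of a log R ratio as the base-2 logarithm of observed over expected signal. This is an assumption for illustration: real array data is noisy and compressed, so measured values rarely reach these theoretical numbers.

```python
import math


def log_r_ratio(copy_number, expected=2):
    """Idealized log R ratio for a region with the given copy number."""
    return math.log2(copy_number / expected)


def idealized_copy_number(lrr, expected=2):
    """Invert the idealized relationship: copy number implied by a log R ratio."""
    return expected * 2 ** lrr


# Diploid baseline, single-copy loss, and gains.
for cn in (1, 2, 3, 4):
    print(cn, round(log_r_ratio(cn), 3))
```

Under this idealized model a single-copy loss sits at minus one and a doubling at plus one, which matches the minus one to one plotting range used for the copy number track.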
So looking at our VCaP copy number file, we've got a chromosome, start, end, and a value, and this array column that we don't really need. So let's start by preparing our copy number variation data. We'll start by removing the first line of the VCaP copy number TSV file. Is that right? VCaP copy number? Yes, remove first line. Next, we're going to cut the first four columns. So that'll just be the chromosome, start, end, and value from the beginning of that file. And lastly, we're going to down-sample this data. We're going to randomly select 25,000 lines from this file. We talked about this a little bit earlier: a large dataset can sometimes take a long time to plot, or it can even exceed the limit the Circos tool has built in for how many data points you may plot. The limits on Galaxy should generally accommodate every use case you have. And when you're exceeding those, it's a sign that you should down-sample your data. We're going to use a random down-sample to make sure that we're selecting different parts of the genome randomly. And while those datasets are processing in the background, we'll get Circos set up. So we're going to rerun the Circos tool again. Be very careful that we are rerunning Circos Plot SVs. We don't want to just do chromosome five. We want to rerun the entire genome. And here we're told that we're going to add a 2D data track from 0.8 to 0.95. These are the radii where this track starts and where it ends. And we're going to be plotting a scatter plot, I believe. Scatter, with our select random lines data. The columns must be chromosome, start, end, value, which we have. And we're going to be giving it some plot-specific options. We're going to leave the glyph as a circle. We're going to set the glyph size to four and just select a nice neutral gray, something pretty boring. Lastly, we'll set the stroke thickness to zero. 
So we'll just get a little gray dot and nothing else. And we're also going to supply the minimum and maximum values of this plot as minus 1.0 and 1.0. And then we'll execute. Of course, it'll queue up behind the select random lines job. If we look down here at the previous job, we see that there are 1.2 million lines, 1.2 million different data points that would be plotted if we plotted the entire dataset. But we don't need every single position. We don't need all of that data at that density. So we down-sample it to 25,000 lines, and this is a much more reasonable amount of data to plot. Again, if you have the patience to wait some hours for Circos to plot all 1.2 million data points, you're welcome to. But for most use cases, plotting a subset is good enough. Down-sampling produces something that's representative of the data without losing the important details. These Circos plots will take increasingly long to build, of course, as we add more and more data to them. And then we have some questions. Like, looking at this resulting plot, what do we see and how can we fix it? So there must be something odd. Okay. Aha. You can see what's gone on here. We have all of our scatter points on top of our links. That's not optimal. So we're going to fix that by moving the radius of one of the tracks. Right now, our scatter plot goes from 0.8 to 0.95 and our link track also starts at 0.95. So we're going to change this link data to have a decreased inside radius of 0.75. Link track, link data, 0.75. And this should look a lot nicer, and we'll be able to see what the data looks like. Execute. When that's done, it should look something like this. And we get a better feeling for, okay, here's where all the data is, here's how it's distributed. It starts to look okay. But it's still not so easy to visualize what's going on within this data range. So we're going to highlight some of it. 
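The random down-sampling step can be sketched in a few lines of Python. This is only an illustration of the idea, not what the Galaxy tool does internally; the function name and the `seed` parameter are assumptions added so the example is reproducible.

```python
import random


def downsample(lines, n=25000, seed=None):
    """Return n randomly chosen lines, kept in their original order.

    If the input already has n lines or fewer, it is returned unchanged.
    """
    if len(lines) <= n:
        return list(lines)
    rng = random.Random(seed)
    # Sample positions rather than lines so the file order is preserved.
    keep = sorted(rng.sample(range(len(lines)), n))
    return [lines[i] for i in keep]


# Invented data: 1000 points, reduced to 100.
data = [f"chr1\t{i}\t{i}\t0.0" for i in range(1000)]
subset = downsample(data, n=100, seed=42)
print(len(subset))
```

Because every line has the same chance of being kept, dense regions stay dense relative to sparse ones, which is why the subset still looks like the full dataset.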
We're going to color everything with a significant copy number loss red and everything with a significant gain green. First, we're going to wait for our plot to finish. You'll notice when we zoomed in on some of these plots, the detail wasn't super high. If you want more detail, you can render these at a bigger resolution, but again, it takes longer. So we render them at a lower resolution to make the plotting process significantly faster. Okay, we've got our copy number variation track plotted. It looks good. Nothing went wrong. We're going to run this job again, and now we're going to apply the rules that we discussed to the copy number variation track. So under 2D data tracks, this is our scatter plot. We're going to add some rules that'll let us conditionally format parts of that track to behave differently. We're going to apply them based on value, and we're going to say values above 0.15 will be green. We're going to change fill color for all points to green, maybe a brighter green. And then we're going to do the same again. So that's rule one, and we need a new rule. And we need to click continue flow. That's an important thing. Oh, we don't here. Okay. Sometimes rules will apply multiple times or add on top of each other, and then continue flow is important. Here it should not have any effect. So rule number two: everything based on value below minus 0.15, specifically minus. These are going to be the ones that are decreased, and we'll make those bright red. Okay. Let's execute that. The next note is that sometimes it can be nice to see the axes of the plot to more accurately judge these values. So we can add some axes to our plot that will make it a little bit easier to see what's going on. I'm going to wait for this plot to finish and then we'll start that. Okay, look at that, beautiful. Here we can now see the copy number variation where there are significant changes. 
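The two value rules and the "continue flow" option can be modeled with a short Python sketch. It assumes rules are tried in order and that evaluation stops at the first match unless that rule is marked to continue; this is a toy model for intuition, not the tool's actual implementation.

```python
def apply_rules(value, rules, default_color="gray"):
    """Return the fill color for one data point.

    rules is a list of (condition, color, continue_flow) tuples,
    tried in order. Evaluation stops at the first matching rule
    unless that rule has continue_flow set.
    """
    color = default_color
    for condition, new_color, continue_flow in rules:
        if condition(value):
            color = new_color
            if not continue_flow:
                break
    return color


# Thresholds from the walkthrough: gains above 0.15, losses below -0.15.
RULES = [
    (lambda v: v > 0.15, "green", False),
    (lambda v: v < -0.15, "red", False),
]

for v in (0.5, 0.0, -0.5):
    print(v, apply_rules(v, RULES))
```

Since no value can be both above 0.15 and below minus 0.15, at most one rule ever matches, which is why continue flow makes no difference for this particular pair of rules.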
Significant decreases show up as a lot of red points, and significant increases as a lot of green. And with that, let's add our axes to our plot. So we're going to add some axes to the 2D data plots specifically. You can have axes on all sorts of different plots. We're going to scroll down here to axes, insert axis, spacing 0.25. It'll run from y equals minus one to one, and we're going to make it gray again. Okay. Execute. I'm going to rename this plot to Circos Plot Copy Number. And when that's done, we should see some nice axes that'll help us start to evaluate those data points. Rather than just guessing, oh, it's green, or oh, it's red, we'll be able to say something a little bit more quantitative about individual data points. Okay, let's look at our plot, and we now see some nice little lines that tell us 0.25, 0.5, 0.75, and 1. So we can start to say, okay, we've got some points that are really decreased, down to nothing, here, or really significantly increased expression points. Copy number, sorry, not expression. I usually work with nice tiny viruses. So we should have a plot, and it should actually have lines. It looks a little bit different than ours does here, but that's okay. Next, we'll start to look at how to add B-allele frequency. So B-allele frequency is closely related to copy number. There are some nuances to it, but we expect to see these sorts of different distributions. For a copy number of one, we'll see all points at the top or bottom. For a copy number of two, we expect to see one line in the center. For a copy number of three, we expect to see three different lines. So these can be used to estimate copy number changes, and it'll be a much clearer plot. So we have data that looks like this. It's basically ready to go into Circos. It's got a chromosome, a start, an end, and a value. Perfect, just what we want. We just don't want this first line. So we'll remove the first line of our VCaP B-allele frequency file, and then we're ready to down-sample the data again. 
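Where those lines come from can be worked out with a little arithmetic. The sketch below assumes an idealized, pure sample in which a site with copy number n carries between 0 and n copies of the B allele, so the possible frequencies are k/n. Real tumor data is mixed with normal cells and noisy, so the observed bands are shifted and blurred.

```python
from fractions import Fraction


def expected_baf_bands(copy_number):
    """Idealized B-allele frequencies for a region with this copy number."""
    return [Fraction(k, copy_number) for k in range(copy_number + 1)]


for cn in (1, 2, 3, 4):
    bands = expected_baf_bands(cn)
    print(cn, [str(b) for b in bands])
```

The bands at 0 and 1 are the homozygous sites and appear at every copy number; it is the bands in between that change, from none at copy number one, to a single central band at two, to a split pair at three.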
I'm going to select 25,000 lines again, from output number 17. So if we scroll down here to look at the original B-allele frequency file, it's again 1.2 million lines, which is a lot of data, and the down-sampled version will make a perfectly acceptable plot. So this time we're going to add a new data plot. It'll again be a scatter plot, with the B-allele frequency tabular file that we're producing now. And then we're going to make this run from 0.6 to 0.75. As you remember from before, this overlaps the link track that we have in the center. So we're going to have to go ahead and update that to 0.55, so it doesn't overlap. And while those are running in the background, I'm going to come back here and rerun this plot. Under 2D data tracks, let's go down to add a new 2D data track. And it's going to run from 0.6 to 0.75. Plot type histogram is not what we want. We want a scatter plot again, with 18, select random lines. We'll make our plot format glyph size four. We'll leave this gray. I'm going to leave it as this darker gray this time, just for some variety. We do not need a stroke thickness around it. We're going to supply our minimum and maximum values again. This just helps Circos know a little bit about how the data looks so it can plot it best. And we're going to add an axis already, leaving a lot of things as default. I'm going to make that gray again. And then our link track data is now overlapping this, so we of course have to change that to a lower value, like 0.55. Let's go ahead and execute. Here's our random data. We've selected some subset of it. I'm going to just rename it, while we're here, to Circos B-allele frequency. Just to make that nice and readable. Hopefully we'll get a nice plot out that looks something like this, where we can easily see those changes in variation across the genome. And again, we can see these three different patterns. CN2, a copy number of two, is diploid. We expect to see two copies of the genome. 
If one of the copies of a gene has been deleted, then we expect to see copy number one, haploid, or if it's been duplicated, CN3, or even four in some places. That's a question: do you see anything other than these states? So let's have a look at our plot and see how it looks. Okay, good, it's finished. I'm being a little bit impatient now. And here we see, as expected, some variation. So here is a region where we see two different bands, and according to our chart, that means copy number three. So for a lot of the genes in this region we've got an additional copy on top of the diploid two. Over here in chromosome five, what we see is just a mess. Here's one of the single lines. So that'll be a haploid or a diploid region. Like expected, here's one that's probably haploid. And you can see a lot of red over here. So it's lost a lot of copies, down to one copy even. Here you can see some variation even between different regions of this chromosome, where the first region is triploid, then we've got a diploid region in the middle, and then back to triploid. Lots of different copy number profiles. And if you are doing a lot of these different analyses, you can use this to make some assumptions about them. Chromosome five shows a lot of changes. Chromosome 19, let's look at that one quickly. It shows a pattern that could even indicate a copy number of four. So there could be four different lines there. It's a little bit hard to see. Again, this is down-sampled data. If you plotted more data, then you could probably get a better plot. And you'll notice that this bit in the center looks very different now. That's due to the changed parameters for how big it is and how that interacts with the Bezier radius. So let's just change that one last time and plot it to make the plot look a bit more like we expect. Link track Bezier radius, it should be 0.25 now. It's a magic value. It's just something you have to experiment with: produce a couple of different plots and see what looks good. 
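The radius bookkeeping from the last few steps is easy to get wrong by hand, so here is a small Python sketch that checks a set of tracks for radial overlap. It treats each track as an (inner, outer) radius pair and, as a simplifying assumption, models the link track as filling everything from the center out to its radius. The track names and the helper are illustrative only.

```python
def overlapping_tracks(tracks):
    """Return pairs of track names whose radial ranges overlap.

    tracks maps a name to an (inner_radius, outer_radius) pair.
    Ranges that merely touch at a boundary are not counted.
    """
    names = sorted(tracks)
    clashes = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            a_in, a_out = tracks[a]
            b_in, b_out = tracks[b]
            if a_in < b_out and b_in < a_out:
                clashes.append((a, b))
    return clashes


# Layout right after adding the B-allele frequency track.
before = {
    "copy number": (0.8, 0.95),
    "B-allele frequency": (0.6, 0.75),
    "links": (0.0, 0.75),
}
# Same layout with the link radius reduced to 0.55.
after = dict(before, links=(0.0, 0.55))

print(overlapping_tracks(before))
print(overlapping_tracks(after))
```

A check like this only catches geometric overlap. Values such as the Bezier radius still have to be tuned by looking at the rendered plot.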
And then that should produce something like we expect. So the original image in our tutorial split the red and black into two different plots to make them a little bit more readable. That's something we can do here as well. There's an optional final exercise to split up those two data tracks. We can do this by taking our existing data tracks. We already have the red and the black, the interchromosomal here. And we can tell it to plot this link track twice, but with different radii. In one case, only plot the black bits, and in the other case, only plot the red bits. So I'm going to try this manually. There's a solution in the box if you get stuck and would like help. So I'm just going to start rerunning this rather than waiting for the previous job to finish. So we know that there are two types. There's the inter and the intra. And we have some rules for these types. For the interchromosomal, right now we're coloring them red. In this case, I'm going to change visibility to show no. And that'll hide all of the red ones. And then I'm going to more or less duplicate this link track data. I'm going to plot the SVs again with an even further decreased radius. So it's 0.55; for this one I'm going to say 0.45. Don't know if that's right. Again, it's trial and error. So yeah, you just have to test things and see what looks good and what doesn't. We want a thickness of three and a Bezier radius of 0.25. And then we're going to have a rule, insert rule. Let's double check the one above: interchromosomal, we hid there. So this time, whenever it's intrachromosomal, we will hide it. And then we're going to add another action. Oh no, we actually don't need to. And then we're going to just change the link color to maybe blue, just for fun, and see if that looks okay. Let's see this previous plot we made. Okay, we got the Bezier radius we expected. That makes it a little bit nicer to look at. Let's see if I got this right.
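In the walkthrough the split is done with rules inside the tool, which hide one kind of link on each track. The same partition can be sketched outside Galaxy as a quick check on how many links land on each track. This sketch assumes link rows in the Circos link layout of two regions per line (chromosome, start, end, then chromosome, start, end); the function name `split_links` and the example rows are invented for illustration.

```python
def split_links(text):
    """Partition link rows into (inter, intra) chromosomal lists."""
    inter, intra = [], []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed rows
        if fields[0] == fields[3]:
            intra.append(fields)  # both ends on the same chromosome
        else:
            inter.append(fields)  # ends on different chromosomes
    return inter, intra


# Invented example rows, not taken from the tutorial data set.
EXAMPLE_LINKS = """\
chr1 100 200 chr1 5000 5100
chr1 300 400 chr5 700 800
chr5 900 1000 chr19 100 200
"""

INTER, INTRA = split_links(EXAMPLE_LINKS)
print(len(INTER), "interchromosomal,", len(INTRA), "intrachromosomal")
```

Doing it with rules in the tool keeps a single input data set, which is why the tutorial takes that route; the script is only a way to see what each rule will select.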
So this plot, this is more or less what was published in the journal. All of the data that we've taken was from our different experiments. We didn't have to write any custom tools to do this. We didn't have to write any Circos configuration. We could just produce this plot in Galaxy using all of our data sets and configuration. And Circos is really fantastic for this, right? You can take your long genomics workflow. Maybe it does a lot of different computations over copy numbers and sequencing depth and calling genes or other features or doing expression analyses. And you can summarize all of this data together in a single plot. Circos falls in the category of what I refer to as workflow summarization tools, where you have a long workflow that does a lot of genomics analyses and it produces a bunch of text files or tables. But no analyst wants to look at that, right? So as a result, we want to make a nice, easy-to-read plot, and Circos comes in there. It lets us make these nice graphics, saying, okay, here's all of the data within our plot and how we want to visualize it. Okay, I did not get it quite right. You'd have to increase the plotting radius a little bit further or decrease the radius of the inner plot further. But we got more or less close to what's described here. And we've made a lot of Circos plots today. I'm going to search in my history for all of the plots and just go through them quickly with you. So we started today with this plot, really boring, not much going on. We added some structural variations. We added more data and rearranged things to make it easier to see. We added coloring to make it even easier to see what's high or low expression rates. We added axes to let the users start to quantify the data from the plot if they wanted to. We added more data sets, like this B-allele frequency data set, and we fixed our plotting radii. So it really is a long process to get from raw input data to a good plot.
But once you've got this final result, you can just rerun it with different data. You can build this into your workflow. You've got all of these different parameters set. It's really a lot of configuration that goes into the Circos tool. But you get such nice plots out of it. It's worth going through the iterative process, building up your plots, adding new data sets and making something that looks fantastic. So thank you for following along. If you have any questions, let us know. With that, we're done. One last note before we finish: if you make a great plot with Circos, we would love to see it. We would love to have these plots to share with other people in the training material. So if you've made a nice plot, please just let us know. If there's something we missed in the Circos tool, also just let us know. We'd love to take other people's plots, other real-life genomic examples, and introduce them into the training materials, maybe with different data sets that are publicly available, and share them with everyone else. If you have a cool plot, just tell us. Thanks.