We're still in New York: same Creative Commons license, same explanation, same encouragement to share and so forth. I also still have some new slides, so this slide is not in your slide deck. Any slide that has a little white spot on the bottom is a new slide. It's either changed from what's in your book, reordered, or additional, and I'll explain as we go along. Today I'm going to talk to you about Galaxy and how Galaxy can be used in a number of different ways, one of which is on Amazon. We're actually not going to use the Amazon version, although I did include instructions at the end of the class notes if you want to use it on Amazon.
Another hashtag. It's interesting: when you decide to name a software package, if you use a very common word, you make it very hard for people to find you. If you just Google Galaxy, you will not find the Galaxy software package; you have to search for Galaxy, bioinformatics, and something else. But if you use the Galaxy hashtag on Twitter, you will find the people that use Galaxy to do bioinformatics. Same disclaimer as before: I don't make any money out of any products I may mention. I am on the Galaxy Scientific Advisory Board, which is an NIH-funded project, but I do that for free, so I don't get any money out of that. If Galaxy is successful, it makes me look good, I guess. That may be one benefit.
So the outline of what we're going to do today: we're going to look at workflows and examples of using Galaxy to perform DNA sequence manipulation; the ideas of reproducible science and Galaxy; and the various Galaxy servers and different types of Galaxy instances, and I'll explain why you would use one versus the other. We're going to look at getting data in and out of Galaxy and at processing data in Galaxy.
In the lecture, I'm going to do an example of using Galaxy for an RNA-Seq pipeline, but we're not going to do that in the lab. In the lab with Richard this afternoon, we're going to do things in Galaxy similar to what we've done in the last couple of days. Tomorrow we're going to start RNA-Seq per se. This is not an overview of RNA-Seq, so I don't want you to get bogged down in those kinds of details; it's more an example of the kinds of things that Galaxy can do. Tomorrow we will be doing RNA-Seq on the command line: we'll go back on the cloud and do RNA-Seq a different way.
So, first of all, how many of you have a Galaxy instance running at your institution? That's a fair, interesting number. How many of you have never used Galaxy? That's very surprising. Galaxy is an alternative way of helping biologists to do computational biology. As you'll see throughout the lecture, it's not the only way, because you've done it other ways already, but it's a very powerful way, and it's also a very powerful way of sharing and working with colleagues.
Some of these slides you've seen already, so I'm going to quickly jump over them to the important part, which is that one of the things about science is reproducibility in doing experiments. We've done that slide already. The thing about these experiments is that you do them once and you want to repeat them. If you do it a hundred or a thousand times, you probably want to script them up or find a way of automating the process. Galaxy allows you to do that quite easily. It's a simple user interface to reproduce and repeat common tasks. Some new things that have been introduced into Galaxy in the past year allow you to group files together and do actions on groups of files. This was actually a gripe that people had about Galaxy.
It's fine to do one file, push it through, and repeat that one file, but if I want to do a thousand files, how do I do it? They've come up with a solution for that now. Also, the output of Galaxy is shareable with your friends or with the world, so you can make it public or you can share it with specific colleagues. All of that is available and relatively simple within the Galaxy infrastructure. The other thing that Galaxy is very good at is keeping metadata about the versions of the tools you're using, the arguments you're using, and so forth. So it makes reproducibility of science possible.
Some of the requirements that Galaxy meets, which make this possible: the project should be open source, and the Galaxy project is an open source and, like I mentioned, NIH-funded project. The solution should be useful to a large community and well supported, which Galaxy definitely is, and it should be flexible, expandable, scalable, cloud aware, user friendly, and so forth. This is what you would want for any project, and there are several solutions. Galaxy is not the only one.
One solution came out of a paper written by Robert Gentleman, which is actually a very nice paper. I recommend you have a look at it. He's recommending that every figure of a paper should be reproducible: you should provide the R script that allowed you to generate the figure in the paper. In this example, he describes reproducible research and uses bioinformatics as a case study, where you get not only the data but also the script that generated the figure, so that you can put in your own data and see how it looks. You can reproduce the figure yourself, so it's totally reproducible, and you make it shareable with the world.
R and Bioconductor use that, and we have a separate workshop to teach you how to use these tools. It's quite useful, and it's a very large and very useful community.
Another interesting paper, published recently, I think just a couple of years ago, is Ten Simple Rules for Reproducible Computational Research. The ten rules are: for every result, keep track of how you produced it; avoid manual data manipulation steps; archive the exact versions of all external programs used; version control all custom scripts, with something like GitHub or other similar repositories; record all intermediate results, when possible in standardized formats; and for analyses that include randomness, note the underlying random seeds. BWA is an example of such a tool. There is a randomness step in an alignment from BWA: if a read has the same probability of being at two places, it will randomly put it at one or the other. So if you run the same experiment twice, you may get two different results. Both results will be equally good, but understanding that the tool you're using has some randomness is also very important. The remaining rules: always store the raw data behind the plots, which is also a solution to the point in Gentleman's paper; generate hierarchical analysis output, allowing layers of increasing detail to be inspected; connect textual statements to underlying results; and provide public access to scripts, runs, and results. It turns out that the two guys who are middle authors on this, Anton and James, are the two leaders of the Galaxy Project, so of course Galaxy abides by all these rules.
But Galaxy is not the only automatable pipeline system that keeps metadata around. The tool we use at OICR for all our pipelines is SeqWare. SeqWare is another open source project, and that's their page if you want to have a look at that.
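To make the random-seed rule above concrete, here is a minimal sketch in Python. It is a toy and not how BWA works internally: the function name `run_with_recorded_seed` and the idea of breaking ties by drawing a random position are assumptions made up for illustration. The only point is that if you record the seed with the result, the run can be repeated exactly.

```python
import json
import random


def run_with_recorded_seed(seed, n_reads=5, n_positions=2):
    """Place each toy read at one of several equally good positions.

    Hypothetical stand-in for a tool that breaks ties at random.
    The seed is returned with the result so the run can be repeated.
    """
    rng = random.Random(seed)  # private generator, leaves global state alone
    placements = [rng.randrange(n_positions) for _ in range(n_reads)]
    return {"seed": seed, "placements": placements}


first = run_with_recorded_seed(seed=42)
second = run_with_recorded_seed(seed=42)

# Same seed, same placements: the run is repeatable.
print(first == second)

# Store the record next to the results so the seed is never lost.
print(json.dumps(first))
```

Without a recorded seed, two runs could place the tied reads differently, and both answers would be equally valid, which is exactly the situation described above.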
It's not as user-friendly as Galaxy, but it's probably able to handle larger data sets and larger pipelines, and it deals better with compute infrastructures and so forth. So it's for the larger projects. And of course Galaxy is the solution that we're going to study today. There are lots of papers on Galaxy; here's a couple of them, and here's another one.
Galaxy comes in multiple flavors. If you go to galaxyproject.org, which is the Galaxy homepage, you will see all the various types of Galaxy. There's usegalaxy.org, which is the main public Galaxy server and the one we're going to use today. You can also download Galaxy and install it on your own server at your institution. I definitely encourage institutions that have the resources, the compute resources, to have a local version. With that, though, comes some overhead of maintenance. If you have a bioinformatics core facility at your institution, that's a really good way of handling it. Some institutions have also made their Galaxy server publicly available, and what they've done is fine-tune it to serve a specific task. I'll show you some examples, but some of them have made RNA-seq-specific Galaxy servers, or Galaxy servers that deal with metagenomics, or Galaxy servers that deal with specific plants. They are good resources for those biological niches, because they keep track of the things that need to be done for those communities.
There's a cloud version of Galaxy also. We have run this workshop on Amazon in the past.
The reason we didn't do it this year, or the last couple of years, is that we found that the public server has gotten much more robust and usable. The nice thing about the public server is that it's the most up-to-date one, with the most up-to-date versions of many of the tools we use. So it's a bit lazy, I would say, but it saves time, yes. Pre-installed, yeah.
You can upload tools to Galaxy as well. You can't upload your own tools to the public server, but you can definitely install tools on your institutional version. You basically have to have root access to be able to install tools into a Galaxy instance. You can also do it on Amazon or whichever cloud instance you want to use; you can go install tools there. I'll come back to that point about installing tools a bit later.
So, which to use: main, versus local, which is your own instance, versus a cloud version, versus others. If you have small to large data sets, then all of the various Galaxy options are fine. The same goes for ordinary compute requirements. These categories are all very hand-wavy, but a local instance can have more resources than the main version. The main version has several hundred cores, and I think it has a limit of 250 gigs of storage per individual. So they give a lot of storage, and if you need more than that, they'll give you that. They're very generous with their storage, and it's all free, so it's quite useful. If your data sets are very large, they say you can't use main. They won't give you petabytes. You may get one terabyte, maybe, but it's not a bottomless pit, right?
It's used by everybody in the world. I've warned them that 30 of us are going to show up today, and they assured us that it should be fine. But if a few hundred other people decide to show up at the same time, it'll be interesting.
This is the homepage for the Galaxy project. It has information about Use Galaxy, which is a link to the public server; how to get Galaxy, so how to download the software and install it; and how to learn about Galaxy, so there are lots of videos and tutorials. They use Biostar for their help desk and for maintaining all their documentation. Then there are also lots of mailing lists, the Tool Shed, which I'll talk a little bit more about, and the wiki, which has some more useful information.
This is the usegalaxy.org homepage, which you've logged into already. On the left side are all the tools. On the right side is the history of everything you've done so far in this session, and if you've registered, as you have, it will remember your history from session to session, so as you log in and log out, it'll keep track of all that. That's very useful. The middle part of Galaxy is your workspace. It's where you enter things, where you view files, and so forth. There are buttons on either side so that you can expand the middle part and get rid of the two blue side panels, so you can have your whole window dedicated to your workspace.
getgalaxy.org has instructions on how to get the software. usegalaxy.org/cloud is how to use it in the cloud environment. There's Amazon, but if you have your own academic cloud, you can also install it there and make it available to the community. And this is a page that lists all the public Galaxy servers that are available worldwide. Last year there were 50 plus; this year there are 60 plus.
These are all various Galaxy servers that are available and, like I mentioned, specific to certain projects and so forth.
Galaxy integrates the input of data sources. Galaxy allows you to use many tools that you don't need to install and maintain. Galaxy allows you to maintain workflows, reuse them and share them, and list and publish experiments. If you publish a paper, you could include a Galaxy workflow with your paper that reproduces the data you've just published. You can make that publicly available, making it transparent and clear to the community how you generated that data. There are some journals, like GigaScience, which have their own instance of Galaxy that allows you to publish the pipelines that appear in that journal. But even if you publish in a journal that doesn't have its own Galaxy instance, you can put it on usegalaxy.org, the main server, publish it there, and tie it to your publication. You can also do that in a cloud: you can have a pipeline, make an Amazon machine image, publish that as well, and make it available to the community, if that's the way you want to work.
The driving force of the whole Galaxy project is to make things reproducible. With that comes the fact that it's really good at keeping histories of what you did and how you did it, and at making it easy to work with collaborators down the hall or across the globe. One could argue that the real power behind Galaxy is that it was designed for biologists. It's not designed for computer scientists. It's not designed for people who like to do command line stuff. So maybe none of you fit this model anymore, now that you've become command line experts.
But at the same time, if you need to show something to somebody, or if you need to make it reproducible and share data, it may be a great platform to do that.
There are a lot of people who work on making tools for Galaxy. One of the things that the Galaxy community encourages is that when you develop a tool, you wrap it so that it's available for Galaxy. If I have a new tool, I can make a version of that tool that's available in Galaxy, and that helps your tool. Your script would require some input files and would produce output of a certain format. So you wrap it in Galaxy and you tell it: it's going to look for an input file of this file type that looks like this. Your GUI will say, put the input file name here, run this script on it, and generate this output file of this type. Now your script is wrapped in Galaxy and can be put in the Tool Shed, which I'll talk about in a second, which is a place where people go to pick up tools that they will include in their own Galaxy instance. Now, on their sidebar on the left that I mentioned, they will have your favorite script as one of the menu items to pick from among the tools they have available.
Yeah. So basically it's not an API as much as a set of instructions: you deposit your Galaxy-wrapped script in the Tool Shed so that people can download it and make it available for themselves. But if your script needs to go speak to other places in the world, that could be part of the pipeline itself. Galaxy is able to take data from the UCSC browser, for example, and from other places in the world, and I'll talk about that.
One of the big things behind Galaxy, again, is that it helps biologists deal with tools and data. It's been funded mostly, of late, by the NIH, but also by other sources like NSF, Penn State, and so forth.
The two lead PIs are now at Hopkins and at Penn State University. There are more details about that on the wiki and in the many tutorials.
One of the problems with Galaxy is that not all Galaxies are created the same. The Galaxy team is moving toward providing Galaxy as an empty shell, with an app store type of model, where you go to the app store and download the tools you want imported into your own Galaxy version. If there is a public Galaxy server, not the main one but one that deals with your community, it's good to keep track of that one and see which tools and databases it has. One thing that Galaxy main has a limited set of, although it has quite a few, is reference genomes. It obviously has the latest human reference genome, and mouse, and all the model organisms. But if you're working on oysters or some other unusual organism, it may not have the reference genome for that one, and you would have to download it and install it in a private instance.
Yeah. So companies can download Galaxy and run it within their own firewall. You can also share with specific people, for example by adding the email addresses of the people you want to be able to look at it. But Galaxy itself is not somewhere you should put data that you don't want the rest of the world to break into. It doesn't necessarily meet the security levels that many companies need, or that you need for human genome data. For those cases, what you have to do is put an instance inside your own firewall. At OICR, we have an instance of Galaxy within the OICR space.
You have to be inside OICR to be able to use it. You can use a public one, but if you want to work on human data, for example, you have to use the internal one we have.
The Tool Shed I mentioned is a really good solution to the problem of multiple tools and different tools being available on different servers. If you're an administrator of a Galaxy instance, you can go to the Tool Shed and download the tools which have been Galaxy-wrapped, so they're ready to be installed in Galaxy. You can search for things. In this example I searched for SAM, and I get a bunch of tools that have SAM somewhere; it's just doing a string search on a tag or what have you. So it shows me all the tools I could install on my server for working with SAM files.
The general workflow in Galaxy is that you log in, you get data and upload it to the server, you manipulate your data somehow, you repeat this manipulation or do it better as you see fit, then you save your output and you save your workflow. So you have workflow files that you can share, and you can also publish a page which includes the data with the workflow.
The cloud version looks very similar to the one on usegalaxy.org. It has the tools on the left, the history on the right, and the workspace in the middle, and it reminds you that you're in the cloud by saying Galaxy in the cloud. If you look at all the tools that are available, they're the same types of tools. But last year or the year before, I did a diff between the list of tools in the cloud version and the list on usegalaxy.org, and I found a bunch of tools that were only in the cloud and a bunch of tools that were only on usegalaxy.org. So they're not the same, and you have to take that into account.
This is an example of some of the tools, and each of these bulleted items has a bunch of submenus of tools. When I'm looking for a tool, instead of scrolling through the long menu, I just type the word I'm looking for in the top left corner there. For the tool FastQC, for example, you type FastQC, and then you see there are two or three entries that reference FastQC, one of which is called FastQC. That's the one you want.
Some more examples of the differences between the two. Both the cloud version and the one on usegalaxy.org talk to UCSC. They're programmatically linked to each other, so usegalaxy.org and the UCSC browser are aware of each other. You can save files from UCSC to Galaxy, and Galaxy can send files to the UCSC browser, and you can work that way. For example, you can download the latest versions of the various annotation files and the genome that you would need for finding genes and annotating genes on a genome browser. It has very flexible graphical and table views. Most of the time Galaxy uses a table view. It deals with tables very easily, and that's the way it likes to work best.
For those of you who haven't been to the UCSC genome browser, this is an example of it. Other examples of data formats output from UCSC are tab-separated files and sequences in FASTA, which is different from FASTQ. What's the difference between FASTA and FASTQ? Quick spot question. No quality scores, very good. It also has BED files, or browser extensible data format files, GFF files, and GTF files. Here's an example of a FASTA file of nucleotides from Homo sapiens chromosome 22. Here's the BED file format, which is used by many tools: it's chromosome, start, stop, and some information about the type of feature that you're annotating.
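As a rough illustration of the formats just described, the sketch below reads a FASTA record, a FASTQ record, and one BED line in plain Python. The helper names and the toy records are made up for this example; real files are better handled with an established library, and the FASTQ reader here assumes the simple four-lines-per-record layout.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def parse_fastq(text):
    """Return (header, sequence, quality) tuples.

    Assumes exactly four lines per record: @header, sequence, +, quality.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    records = []
    for i in range(0, len(lines), 4):
        header, seq, _plus, qual = lines[i:i + 4]
        records.append((header[1:], seq, qual))
    return records


def parse_bed_line(line):
    """Return (chrom, start, end, extra_fields) from one tab-separated BED line."""
    fields = line.rstrip("\n").split("\t")
    return fields[0], int(fields[1]), int(fields[2]), fields[3:]


# Toy records, invented for illustration only.
fasta_records = parse_fasta(">toy_seq\nACGT\nTTGA\n")
fastq_records = parse_fastq("@toy_read\nACGT\n+\nIIII\n")
bed_record = parse_bed_line("chr22\t1000\t2000\ttoy_feature")

print(fasta_records)  # header and sequence only, no quality
print(fastq_records)  # same idea, plus one quality character per base
print(bed_record)
```

The FASTQ tuple carries a quality string the same length as the sequence, which is the answer to the spot question above: FASTA has no quality scores.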
The GFF file format is, again, chromosome, start, stop, and a number of other fields; you'll have exons and genes and so forth. GTF is a GFF that is specific to coding sequence features, with fields that are used to extract those.
An interesting thing in Galaxy is that you can publish pages, and you can go look on the Galaxy server and see the pages that are publicly available. So you can see other people's work already. For example, one of the ones there is the Galaxy RNA-Seq analysis exercise, so a lab to do, and the community rates them too. Some get five stars, and some get no stars because they haven't been rated. Many of these come from Galaxy staff, but anybody in the world can submit them and make them available.
I'm going to go through an example of the RNA-Seq exercise. It's more of a use case scenario; we're not going to do RNA-Seq today, but we will tomorrow. The analysis takes RNA-Seq data from the Human Body Map project, which is an Illumina-generated data set, and we'll be looking at adrenal and brain tissue and at a specific region of chromosome 19. So again it's a toy, or classroom, data set, specific to one region of the human genome. The adrenal gland is on top of your kidney, and what's the other one? Adrenal and brain? Well, you know where your brain is, right?
This part you've all done already, the logging in, so you're coming back in as a returning user. Has anybody had problems with this in the class? Is everybody okay with this step? We don't need to do it now, but I just want to make sure that you're all okay. Once you've logged in, the page looks like this, and if you look at your preferences, it gives you your settings, what you've set up, and so forth.
In this lab exercise we're getting four files. You can get files by loading them from your computer or by providing a URL, and in the lab today we're going to give you a URL to get these files into Galaxy. This is an example of getting a file with a URL. And don't do it now, right? Don't start doing it. You have a question? Don't do it now. Yes, what's your question? You're not doing it now, right? "I've got a question for when we do it." Okay. Because I want you to pay attention. Richard is going to do the same thing for his lab: you will follow what he's doing, and you will play like he plays. If he's telling you to play this high, you play this high. You know how you can be, right? Sorry? Who's that? You're telling people what they need to do?
On the side panel I have the history I mentioned to you. When you start a command, it first shows up as a gray box; the colors are there in your manual. A gray box means it hasn't started doing anything yet. When it's in process it becomes a yellow box, and in the middle of the box you see a little spinning icon, which means it's actually processing that step. Green means that everything went well. And this is when things don't go well: you see red boxes, which means your step didn't run properly. You used the wrong command, you used the wrong file type, or the program failed for whatever reason. And this one is a file transfer, shown with a little arrow; it's still uploading a large file. Oops, sorry, this is my screen saver or something.
Once a file shows up, it has these three icons next to it: an eye, a pencil, and an X.
The eye I call poke the eye. It asks whether you want to see the file, so you click on the eye, and the middle panel shows you what the file looks like. The pencil is to edit the attributes of that file. It allows you to rename the file or to add notes, for example to say where you got the file from. So you can keep all sorts of information, and the X lets you delete the file itself. If you poke the eye, this is an example: it shows you the FASTQ file and what it looks like in the middle panel.
Edit the file itself? No. You have to do that before you load it. Give me an example of what you would want to do. Yeah. So within Galaxy you can filter, and you can basically grep, so you can select lines, and you can do that on line-formatted files. There are editing features, which I don't think we're going to go over in this class. There's a whole section of text manipulation tools, so you can select columns, you can select rows, and so forth. All of those things are possible within Galaxy, yes. So you don't have to run any Unix command up front before you load the file. It's doable in Unix as well, of course, but you can do it with the help of Galaxy.
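For comparison, the kind of column selection and row sorting described here looks roughly like this when written by hand. This is a minimal sketch in plain Python, not a description of how Galaxy's own text manipulation tools are implemented; the function name `cut_and_sort` and the toy table are assumptions for illustration, and column numbers are 1-based as they are in a tool form.

```python
import csv
import io


def cut_and_sort(tsv_text, keep_columns, sort_column):
    """Sort tab-separated rows by one column, then keep the chosen columns.

    Column numbers are 1-based. Sorting is plain text sorting, so numeric
    columns would need an explicit conversion.
    """
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    rows.sort(key=lambda row: row[sort_column - 1])
    return [[row[c - 1] for c in keep_columns] for row in rows]


# Toy seven-column table, invented for illustration.
table = (
    "b\t2\tx2\t0\ty2\t0\tz2\n"
    "a\t1\tx1\t0\ty1\t0\tz1\n"
)

# Keep columns 3, 5, and 7, sorted by column 1.
result = cut_and_sort(table, keep_columns=[3, 5, 7], sort_column=1)
for row in result:
    print("\t".join(row))
```

In Galaxy the same request is made by choosing the columns and the sort key in the tool forms, and the history records which options were used.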
Say I want to keep columns 3, 5, and 7, sort by column 1, and output to this file. Yeah, that's all possible.
So you can edit attributes. One of the things I like to do, for example for the brain tissue, is to put the brain name in, and it has this URL where I got the file from, so I copy that from the name field into the info field. That way I keep track of where the file came from by adding it to my notes on the file.
Another new feature that appeared in the last while is this set of three little icons here, one of which is highlighted, which is to operate on multiple datasets. You're now able to select multiple files, let's say FASTQ files or multiple BAM files, make them into one dataset, and then do actions on that one dataset. So you can select whichever files you want quite easily. If you click on operate on multiple datasets, these boxes show up next to the files. You can select all or none, or you can manually select the ones you want to group together. If I select all of them, I can build a dataset list, or I can build a dataset pair, or build a list of dataset pairs. What I should have done here, and which I did do, is build dataset pairs, so that brain 1 and 2 are a pair of paired-end reads from one sequencing run, and adrenal 1 and 2 are the other pair.
As an example of a workflow in Galaxy, we also have grooming. They call it grooming, but it's basically what we talked about yesterday, the older quality score scales that were used. Galaxy is able to convert between these various types, so if your tool requires the old format or the new format, you can use Galaxy to change those things. We talked this morning about QC and FastQC. You can run FastQC on your files, and then you get the FastQC results within the middle
section, which is your results section as well as your working section. You can look at your FastQC results in text format or in graphical format. This is for the four files that I was talking about. You can trim files if you want to; that's possible within Galaxy, with various criteria, to remove the bad bases and so forth.
These are the files that you've accumulated, and there are all sorts of different things you can do with them. If you're doing this exercise, you may get different numbers on the left, because each step gives the files it generates the next sequential numbers. As you generate files it adds a number, and if you delete a file it doesn't reuse that number. If you repeat an experiment, you don't just have 16, 17, 18; you have 18, 19, 20 plus 16, 17, 18. So you have all these numbers accumulating, and the numbers you get when you're doing the exercise may not be exactly what's in the class notes.
Here in this example we use TopHat, so I'm just going through this quickly. With paired-end reads we know the distance between the reads from the Body Map data set, and you can use the files as sets or as individual files, as we did before. We're not going to do this in the lab, but this step would take about 30 minutes if we were to do it, even with this small data set. It then generates a number of output files, which are available in the various viewers. Galaxy has its own browser internally to look at output files, and this would be an RNA-seq way of looking at data. If you click on the file, there's actually an icon, which I forgot to put on the slide and will add later, for looking at specific graphic output. There are ways of sharing files: you can view them or share or publish them and so forth, and you can share your history with colleagues
You can also extract workflows. That's from the top right button, this extended menu set: Extract Workflow. It takes all the steps you've taken and puts them all out in a list, and then you can decide: this step I repeated twice, because once I got it wrong and once I got it right, so I'm going to exclude that one from my workflow. So you select the ones you want to include in your workflow, and then you can make them available. Then you have this sort of graphical view that you can also use to edit and then analyze. This is the RNA-Seq workflow, which allows you to see all the steps and all the connections you've made. You can do these manually if you're so inclined, or you could add a step in the workflow and so forth, so it's quite a powerful workflow editor as well.
So, yes? You mean if you want to do it from the command line? Yes, so you can find out the versions of the tools, which arguments you used and so forth. It keeps track of all that, so that's available. I don't think you can, not as a command-line script. There's not a tool to export a command line that you can then copy, and vice versa: you can't copy a script or a command line into Galaxy either. Yes? So it's even, as you can see... at OICR, so he knows it very well as well. Can you repeat that? You can use your own version: if you have your own instance, of course, you know which tool and which version to install. Or from the user interface you can extract those: you can extract the version, but not execute the older version. OK, almost finished.
So: videos, mailing lists, Twitter and so on for Galaxy. There's a vimeo.com Galaxy channel, so all the Galaxy tutorials are on Vimeo, and if you go to that page there are specific tutorials. For example, there's one on ChIP-Seq. We don't cover ChIP-Seq in our workshops here, but if you're interested in ChIP-Seq analysis, there are tutorials and tools and so forth in order to do that. As I mentioned, this is Trackster, how to use RNA-Seq with Trackster, so there's a tutorial on that.
Quickly, I'll make mention of GenomeSpace, which is another tool that uses Galaxy as part of the... It's a space that connects a bunch of tools together, so it connects basically all of these tools: it connects Galaxy with Cytoscape, with ArrayExpress, with the staff package and so forth. And of course UCSC is also fully integrated. In GenomeSpace the output of one program becomes the input of the other program, and GenomeSpace facilitates that transfer of data.
So, other useful resources. I mentioned Galaxy: it's got a page and a Twitter account. User support uses Biostar, and we'll hear some more about Biostar tomorrow. Malachi and Obi are heavy users of Biostar, and so they have lots of useful hints, so that's in there too. Biostar is actually from Penn State University, so that's one of the reasons why Galaxy uses it, because they're also at Penn State. OpenHelix is a commercial... actually you helped us... but it has some free tutorials on the UCSC Genome Browser, and it has some for-sale tutorials on Galaxy, which I don't recommend because there's lots of free stuff available. UCSC has a lot of information. SEQanswers is actually a good website to go to if you want to ask, or find out or read other people's answers, about sequencing technology questions. So you say, "I'm having a problem with my sequence: from the structural variant callers that I've been using, I'm getting this weird sort of display. Does anybody know how to do that?" You can probably look up that question, and somebody's probably already asked that question and somebody's already answered it, or you can go ask new questions yourself.
There are the papers of interest. And before the coffee break, but we're not going to have a coffee break, you should have something that looks like this. I'm going to stop now and get Richard to come in and do his part. Any questions so far before Richard comes on deck?