Okay. Thank you, Steve, and thank you to everyone who's come out tonight. I'm going to spend maybe the next five to ten minutes talking about the 1000 Genomes Browser.
So, as Steve described, the data sets are very large. And honestly, to really work with the full, rich, and complex nature of the data, the ideal way is to download the data sets, especially the VCF files, and use the tools that exist for them. It's very difficult to imagine how a web display scales to 1,000 or to 2,500 or more. So, you'll see some of the conventions that we use in the browser. We display things by populations. We display other summary-level data. But as I said, to really get into the richness of the data sets, there's a good argument for downloading the VCF files and investing in the tools.
The 1000 Genomes Browser is based on version 54 of the Ensembl code base. Those of you who have used Ensembl in the past will find it reasonably familiar because of this. It contains all of the gene information that you would normally expect in Ensembl: gene and transcript information, external references, connections to other databases, sequence data.
It's important to remember that for the 1000 Genomes Pilot Project, all the analysis was done and all the data was released on the NCBI36 version of the human assembly. For the full project going forward, the alignments that are being done, and in fact even the SNPs that were released today as part of the full project, are on the GRCh37 assembly. And so there's a difference between the human assembly that was used in the pilot project and the one that will be used for the full project.
So the pilot project browser incorporates essentially all of the 1000 Genomes Pilot data. There are still a few details that we continue to add and refine as we go forward. Eventually the pilot project browser will represent a stable collection, and we will keep it in perpetuity to be able to view the 1000 Genomes Pilot Project data.
And as I mentioned, as we release data on the full project, we're actually going to create a new browser for that. So there'll be the stable pilot project browser and then a new, updatable browser as we continue to release data going forward. All questions about that, in fact, all questions about data access as well, can go to the info.thousandgenomes.org address.
So here's just a view of the homepage. One of the things that has been mentioned already, and I just want to reiterate this, is that all of the data that was reported in the paper last week had in fact already been released. This was the March 2010 release of the data. So in fact, the data has been completely available. All the SNP calls were completely available for people to use for quite a number of months before the paper was released. And this goes back to what Gil said about how the project is very committed to releasing data as soon as possible to make it as widely useful as possible for the community. This is the most recent version of the browser. We actually just updated it today. And as I said, we have a few additional things to go.
The main location view of the browser, again, is built on Ensembl. There are two important points here. There's navigation on this side. Zooming into that section right here, you have regions of the genome and chromosome summaries. There's a link for resequencing. I'll get back to that in a second. That's where we have information about each of the individuals in the high coverage trios. And then lower down, there are aspects of page configuration. So the key link here is "Configure this page", which changes what you're looking at and changes which tracks are available, as well as "Export data", which I'll come back to again in a few minutes.
So if you click on the page configuration link, this is what comes up. There's a whole host of things that can be configured. If you look over here, the list goes down to the types of genes. Up at the very top, the first category is 1000 Genomes.
So currently, in this configuration, 11 of the 27 1000 Genomes tracks are turned on. I've listed all 27 of them over on the side of the page. And you can see we've labeled the tracks just like we labeled the pilot projects. So there are three types: the trio project, the low coverage project, and the exon project, as well as the three-letter codes for the populations. We display tracks with the entire population's variants combined together. So there are 60 individuals in the CEU low coverage pilot. They're combined into a population-level track. Similarly for the trios, we have trio-level tracks.
There are a couple of other things just to highlight going forward, to give a preview of the things Jan will talk about. We have things like tandem duplications and other structural variants. For more details on what these are, the "Show info" button always brings up a little additional information. And finally, there are a few other supporting data sets, like recombination hotspots and things like that, that we incorporate in the 1000 Genomes set of tracks.
So I mentioned already that we have these SNP tracks. The SNP tracks are based on the populations. This is just an example of what they look like, where they are on the page, and what you see when you zoom in and take a look at them. So here we have three of the SNP tracks, and then the recombination hotspots track is also displayed there. Some of you might notice that the SNPs are different colors here. The colors are associated with how they're annotated within genes. So in this case, these SNPs are just upstream of a gene that you can't see here on the zoomed-in version. But when we have these light blue SNPs, those are color coded for upstream of genes. The green SNPs here, which I hope you're able to see right in this area, are synonymous coding SNPs. So this is actually a coding exon where those SNPs are located.
All of the SNPs themselves are clickable, and they bring up additional information when you click on one of them. This is just an example of the box that pops up when you click on one of the SNPs. It tells you the location. It provides additional information here on where it was seen. You can see it was seen in the low coverage pilot. There's some additional information as well. And a link will take you to a main SNP page that contains information about the SNP properties. Much of this information has been extracted from dbSNP and is provided on that link.
We also have structural variant information. The structural variants are, as you saw earlier, in different tracks for different types of structural variants. They're selectable, again, via the "Configure this page" link. And once the structural variants are showing, and here's an example of a larger deletion on chromosome 1 that I think Jan will also show, if you click on this, it will give you some information on the validation status of that structural variant. You may also be able to see that this structural variant is actually associated with no SNPs whatsoever in the East Asian population there. So things kind of tie together. That's actually a deletion.
So I alluded earlier to this concept of resequencing alignment that we provide for the six high coverage trio individuals. What this allows you to do is view any region of the reference genome with the SNPs from the six high coverage individuals substituted in. So this is not a de novo assembly of these individuals. What it is, is a view of the reference genome with the SNPs from these high coverage individuals substituted in. We also mark the areas where there's no coverage, so that 15% of the genome that Gabor talked about that we can't see into. Again, the "Configure this page" link allows you to set up what you want to look at here. The trios are located right down here, and you choose which of the trio individuals you want to see.
This is the only place where we actually provide the individual sequence data for the samples in the project via the browser. Everything else is either entire trio tracks or the population tracks. This is an example of the output for that. Here we show just dots where the sequence is the same as the reference. You can configure it to show the actual sequence. And over here we have regions color coded where we have homozygous or heterozygous deletions. When it's heterozygous we use the ambiguity code for the nucleotide. If there's a little tilde, which isn't shown here, that means there's no coverage in that region.
So we also provide sequence-level data. You can see the actual sequencing reads aligned. This is only for the trios. To view this, you have to click on this resemble settings button. And eventually you can see all of the aligned reads. For very small regions of the genome, 200 base pairs, you can actually see all the reads themselves, all of the reads with all of the sequence. But again, this is only for the six trio individuals.
You can view linkage disequilibrium information in context with the SNPs. Currently we calculate the linkage disequilibrium from the HapMap and the Perlegen populations. And the populations to view are selectable.
I talked a little bit earlier about data export. This provides summary data from the region being viewed. And we're going to work on some additional export configuration options to make it a little bit easier to pull out just a small chunk of the genome, a small chunk of the SNPs.
There's a whole host of other variation displays where variation is incorporated into different things. If you're on the gene page, and you can tell you're on the gene page if that tab is highlighted at the top, then you have an area on the side where you can get to variation information. This actually shows you variation information in the context of protein domains, so how protein domains are affected by the sequence variation.
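The display conventions described above for the resequencing view (dots where an individual matches the reference, an ambiguity code at heterozygous sites, a tilde where there is no coverage) can be sketched in a few lines of Python. This is only an illustrative sketch, not the browser's actual code: the function name `render_individual`, its parameters, and the toy data are all assumptions made for the example. The ambiguity codes are the standard IUPAC two-base codes.

```python
# Standard IUPAC ambiguity codes for a heterozygous pair of bases.
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def render_individual(reference, genotypes, uncovered=(), dots=True):
    """Render one individual's sequence against the reference.

    Hypothetical helper for illustration only.
    reference: reference sequence as a string.
    genotypes: dict mapping 0-based position -> (allele1, allele2) SNP call.
    uncovered: iterable of half-open (start, end) intervals with no coverage.
    dots:      if True, show "." where the individual matches the reference.
    """
    no_coverage = set()
    for start, end in uncovered:
        no_coverage.update(range(start, end))

    out = []
    for pos, ref_base in enumerate(reference):
        if pos in no_coverage:
            out.append("~")  # no coverage: we cannot see into this region
            continue
        # Positions without a SNP call are assumed to match the reference.
        a1, a2 = genotypes.get(pos, (ref_base, ref_base))
        base = a1 if a1 == a2 else IUPAC[frozenset((a1, a2))]
        out.append("." if dots and base == ref_base else base)
    return "".join(out)


# Toy data, invented for the example.
reference = "ACGTACGT"
calls = {1: ("T", "T"), 4: ("A", "G")}  # homozygous T, heterozygous A/G
print(render_individual(reference, calls, uncovered=[(6, 8)]))
print(render_individual(reference, calls, uncovered=[(6, 8)], dots=False))
```

As in the browser view, this is a substitution of SNP calls into the reference, not a de novo assembly, so insertions and deletions are outside the scope of the sketch.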
On the transcript page, which again highlights the tab to the far right on the page, we have a couple of different displays showing how the trio information affects the transcripts. These are each of the six trio individuals. And down here are all the synonymous and non-synonymous variants in the transcript.
And so with that, I just have to acknowledge a few people at the EBI who have put this together, built the back end databases, and created this Ensembl-based browser for the 1000 Genomes. And again, any questions about the browser or data access in general can go to info.thousandgenomes.org, and I'll take any questions that you have.
Are there any plans to provide an API, for example for BioPerl?
Yes, so in fact, we are going to put up a MySQL server. I hope we will have that available within the next couple of weeks. And then you can download the appropriate version of the Ensembl API and interact with the database that way. So that will be available for the pilot project data, and in the future, as we release data on the full project, that will also be available.
Do we show inverted sequence, sequence inversions? So I don't think so. Have these been identified among the structural variants?
I will add to that in a minute.
Okay, okay. So it is possible to identify some of these by looking at errant aligned reads in the display that I showed you, where you could actually see all the reads aligned. It would be very difficult, though, to go through and look for those individually at this point. And so Jan will really talk about the structural variants. As some of these become discovered in the project, we will put them into the browser in the full project going forward.
So, I'm repeating the question. The question is whether there are other tools to dump genotype data, and visualization for haplotypes, like there were in the HapMap project.
Right now, the most effective way to get the genotype data is to download the VCF files and use the tools for that. We are working towards a graphical user interface that will allow you to do that, essentially by selecting the regions of the genome that you're interested in. As for creating tools, we've actually had some discussions as to how we create LD tools based on the 1000 Genomes data. None of those exist at this point.
So with that, I'll turn it over to Jan Korbel from EMBL, who will talk about structural variants.
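To give a rough idea of what pulling genotype data out of a downloaded VCF file involves, here is a minimal Python sketch. It relies only on the tab-separated VCF layout (eight fixed columns, a FORMAT column, then one column per sample). The function names, the sample names, and the example record are invented for illustration and are not taken from the project's files or tools. For real work, the existing VCF tools mentioned in the talk are the better choice.

```python
def parse_vcf_genotypes(lines):
    """Collect per-sample genotype strings from VCF text lines.

    Hypothetical helper for illustration; it handles only the GT field.
    """
    samples, records = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        chrom, pos, var_id, ref, alt = fields[:5]
        gt_index = fields[8].split(":").index("GT")
        genotypes = {
            sample: call.split(":")[gt_index]
            for sample, call in zip(samples, fields[9:])
        }
        records.append({"chrom": chrom, "pos": int(pos), "id": var_id,
                        "ref": ref, "alt": alt, "genotypes": genotypes})
    return records


def alt_allele_frequency(record):
    """Fraction of called alleles that are non-reference, or None."""
    alleles = [a for gt in record["genotypes"].values()
               for a in gt.replace("|", "/").split("/") if a != "."]
    if not alleles:
        return None
    return sum(a != "0" for a in alleles) / len(alleles)


# Invented example data: three made-up samples and one made-up SNP.
EXAMPLE = [
    "##fileformat=VCFv4.0",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT"
    "\tSAMPLE_A\tSAMPLE_B\tSAMPLE_C",
    "1\t1000\t.\tA\tG\t.\tPASS\t.\tGT:DP\t0|1:12\t1|1:9\t0|0:15",
]

records = parse_vcf_genotypes(EXAMPLE)
print(records[0]["genotypes"])
print(alt_allele_frequency(records[0]))
```

For a compressed file, the same functions could be fed from `gzip.open(path, "rt")`, which yields text lines. Restricting the output to a region of interest would just be a filter on `chrom` and `pos`.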