Okay, thank you, Eric, for that really thorough introduction. I also want to welcome everybody here today and thank you for braving the heat and for participating online as well. I want to give a little more background on the modENCODE project, which hopefully will be a good lead-in to what we're going to hear about for the rest of this symposium.

As Eric was saying, way back in 2002, when we were anticipating the completion of the human genome sequence, we started to ask in earnest how we could read the human genome sequence. I like to say there is no instruction manual; we really don't know the language rules for the human genome. We know that evolutionary conservation can help to identify functionally important regions: about five percent of the genome is conserved, and about one and a half percent is protein coding. We were not really good at finding these protein coding regions, and their fine structure was difficult to predict from the sequence alone. We knew that regulatory regions can be very far away from genes, and so we felt we needed an unbiased experimental investigation to really grapple with this challenge.

So we started this project called ENCODE, the Encyclopedia of DNA Elements. The goal of the project is to compile a comprehensive encyclopedia of all the sequence features in the human genome and in the genomes of selected model organisms. As Eric pointed out, we felt that the use of these model organisms is very important for understanding biology and for understanding the human genome sequence. The approach we took at the beginning of this project was to apply the lessons learned from the success of the Human Genome Project: we started with a well-defined pilot project, and we developed and tested high-throughput technologies.

As Eric alluded to, there are several components of ENCODE. We started back in 2003 with a pilot project focused on 1% of the human genome sequence, just to see how we could approach this problem and how we would apply high-throughput technologies to finding all the functional elements. Based on the success of that pilot project, we scaled up to the whole human genome sequence in 2007, and at the same time we initiated the modENCODE project to find all the functional elements in the C. elegans and Drosophila melanogaster genomes. In 2009, with help from the economic stimulus (ARRA) funding, we initiated a smaller effort, mouse ENCODE, a more limited study aimed at finding functional elements in the mouse genome as a way to help us understand the human genome sequence.

Throughout this whole effort we have been working on technology development. We really needed new high-throughput technologies that could be applied to tackle this problem, and we've had a series of initiatives in that area. Most recently we funded a group of proposals this past year, and interestingly, a number of the technologies developed early on were then implemented in the production activities. We're looking forward to the next phase of ENCODE starting this fall, and you can stay tuned for information about that. It will include additional data production, with these projects collapsed into one large umbrella covering human and model organism genomes, as well as special data analysis projects looking at new ways to use the ENCODE and modENCODE data to understand genomes.
As Eric also showed in his slide of the people standing on top of each other, these are really highly interactive research consortia. We have frequent, lengthy teleconference calls, and we have a number of working groups that address specific problems and issues such as data management, resources, and data release. Each of the projects has a data analysis working group. We have annual meetings of the whole consortium as well as of the analysis working groups. There are a lot of consortium collaborations, which you'll hear about, and also a number of consortium publications.

We view the ENCODE projects as community resources, and so we really want to get all of this information out to researchers to use as best they can, especially for understanding the regulation of gene expression as well as the genetic basis of disease, particularly from the human data. One of the hallmarks is rapid prepublication data release. We also want to get the information out, and explain how to use it, through consortium publications. The analysis of the data has required the development of common data reporting formats, data standards, and analytical tools, and you're going to hear a little more about those later.

We're very interested in getting this data out to the community, so there are a number of portals you can use. The first way to gain access to the human data is through the portal at encodeproject.org, run by the UCSC group. You can get the data from the UCSC browser (this is human and mouse, I should say) as well as from Ensembl. All the data is being pushed out to NCBI, and you'll hear more today about the modENCODE portal, modencode.org; data is also going out to FlyBase and WormBase. So we're trying to make this data as accessible to the community as possible.

The ENCODE pilot project, which I mentioned started in 2003 and ran to 2007, published a large number of papers back in 2007: a large paper in Nature as well as a number of companion papers, including a whole issue of Genome Research devoted to the ENCODE pilot project. The ENCODE project, which is now focused entirely on human, is planning a large set of publications coming out early this fall, so that's a heads-up to be on the lookout for those.

The modENCODE group has produced a number of publications as well. The group published what we call a marker paper back in 2009 to inform the community about the plans for the modENCODE consortium: what types of data there would be, how to access the data, and what the community could anticipate. You're going to hear a lot more about the data, but I'll just highlight some of the features here. This is a figure from that paper showing the data you're going to get on transcription factor binding sites, histone modifications, and origins of DNA replication, as well as extensive analysis of transcription. It's just a little preview of what you're going to hear about later. Back in December 2010 a number of publications came out from the modENCODE group. There were two papers in Science, one focused on the integrative analysis of the C. elegans data and the other on the integrative analysis of the Drosophila data. Along with those came a number of companion papers in Nature, a Genome Research issue, and papers in several other journals. And you're going to hear about plans for additional publications in the future.
We're also trying to get the word out through social media. We have a Facebook page now and a Twitter feed, and you can follow this symposium at ModSymp2012. There are also education outreach efforts. Several volunteer scientists from modENCODE and the Genomics Education Partnership have teamed up with Science and AAAS to create an education website. This is being led by Sally Elgin and Bob Waterston on the modENCODE side and by Laura Zahn and Stuart Wills at Science. The target audience is really the general public, with a specific emphasis on high school students. The site will have six segments providing a rich background on the work in both fly and worm, including an introduction to chromatin structure and eukaryotic transcription, a description of modern high-throughput genomic technologies, and bioinformatics approaches to data analysis. This is not live yet, but if you want to receive notification when the beta test cycle goes live, please email Stuart Wills at this address. You don't have to memorize it, because there is a handout and a poster about the website at the front desk.

Okay, so today we are here to celebrate, as Eric said, the completion of this phase of the project: modENCODE as an entity has been a five-year project that is now coming to a conclusion. We want to showcase the modENCODE findings and to celebrate the project. We're going to hear today about data access and data analysis. We're going to have two panel discussions on the utility of modENCODE data for basic biological processes as well as for human biology and disease. We have speakers from within the modENCODE consortium as well as several outside speakers who have been using the modENCODE data. We're happy that this meeting is being held in conjunction with the Genetics Society of America's model organisms to human biology cancer genomics meeting, and we welcome those of you who attended that meeting and are here as well.

There are a number of people to acknowledge from the modENCODE consortium. There were 10 data production groups. I'm not going to go into the details of all their work and all the people, but you can see here that each major project had a number of co-PIs. The main data production groups were led by Sue Celniker, Steve Henikoff, Gary Karpen, Eric Lai, Jason Lieb, Dave MacAlpine, Fabio Piano, Mike Snyder, Bob Waterston, and Kevin White. In addition, Manolis Kellis led the data analysis center and Lincoln Stein has led the data coordination center. Beyond the people listed here, there have been many additional senior scientists, postdocs, students, technicians, computer scientists, statisticians, and administrators from these groups, so it's been quite a large effort. Here's a picture of many of them, and NHGRI would like to thank this whole group for their very hard work in this endeavor. It's been a lot of fun.

I also want to acknowledge my colleagues at NHGRI. I want to give particular mention to Peter Good, who has been my collaborator co-leading the ENCODE and modENCODE efforts for the past 10 years and who unfortunately couldn't be with us today, as well as Mike Pazin, who joined us as a program director about 18 months ago. Special thanks also to our division director, Mark Guyer, who has been a great help in guiding us through the management of this project and has been a very strong proponent of the ENCODE and modENCODE projects.
In particular, thanks to our program analysts, Leslie Adams and Caroline Kelly, who have really kept us on track and have been an enormous help in running these consortia. I also want to mention some former program analysts: Laura Leifer Dillon, who is here today, along with Rebecca Loud, Jessica Malone, Judith Wexler, and Julia Zhang, who have been part of ENCODE and modENCODE from the beginning. Okay, so does anyone have any questions? I think we have time for a couple; if not, we can move on to the next speaker. Okay, okay. So thank you. It's my pleasure to introduce the first speaker, Gos Micklem, who is director of the Cambridge Computational Biology Institute and a group leader in the Department of Genetics at the University of Cambridge, talking about modENCODE data and tools for the community.