Welcome, welcome, welcome every guest to BioC 2022, 7:40. Let me know, those of you who are remote, and I understand there are hundreds of you, if there are any problems. I'd like to begin by thanking our sponsors. Conducting a virtual conference in the space of computational genomic data science is a complex undertaking. I'll tell you a little more about it, but without these sponsors, many of the things that you're going to experience the next couple of days wouldn't be possible. So thank you, Moderna, NanoString, Microsoft, Seven Bridges, Maze, Genentech, Biogen, Tercen, the R Consortium, bluebird bio, Sticker Mule, and especially Seattle Children's Hospital, and thank you, Marc Carlson, for setting this up. I also want to thank Erdal Cosgun of Microsoft for providing support and resources in Azure's cloud computing infrastructure, and Enis Afgan for arranging to share an Amazon Open Data project.
The organizers of the conference have put in a tremendous amount of effort to make this succeed. Erica Feick and Levi Waldron have led a large group of individuals, Edine, Andrea, Andrew McDavid, Angie, Charlotte, Chelsea, Kristoff, Dania, Daniella, Deepak, Glenn, Elena, Jason, Jairam, Jennifer, Jenny, Joyce, Kayla, Kevin, Kritika, Krutika, Laurent, Linda, Lorena, Lori, Mahmood, Mark, Marcel, Matthew, Michael, Mikhail, Nathan, John, Raphael, Samuel, Sean, Tim, Simena, Simone, Somia, Stuart, and Wes, all contributing to what you're going to be experiencing over the next couple of days.
This is our first hybrid conference. There may be some technical issues. Please let us know what's going on. Those of you who are here, if you need a hand getting into some place, there is a wonderful flashing green badge that Children's Hospital folks are wearing, and you can find them and they can do anything you need to get you where you need to be. We're spread over three buildings. There's a mobile app. You can see maps.
Stop by the registration desk for assistance. We'll help people get from building to building. Everything is within a three-block radius. However, being in the heart of a city, we do ask you to exercise caution when walking alone. Escorts are available, and security can be contacted with any issues. The Community Advisory Board will host a walking tour today for those of you who are here. We'll depart from the Cure Building, I assume, at 5:30 p.m. The tour will be approximately a mile and will include a stop for dinner. For those of you who are not here and cannot do the walking tour, please look out on the BioC2022 Slack channel for a Kahoot that will enable people to interact with one another. There is breakfast and lunch available to attendees right here in this building, and there's a list of restaurants on the mobile app to help you decide where to go for dinner. Take a look at the Slack channel.
Getting together, it's been a while, and I wanted to say a little bit about some of the things that Bioconductor has facilitated as we come back into person-to-person contact for scientific interactions. Just a couple of weeks ago, I was able to work with Sean Davis to put together a course that was taught at the Cold Spring Harbor Laboratory on Statistical Analysis of Genome Scale Data. We had a wonderful collection of co-instructors, Martin, Elana Fertig, Stephanie Hicks, Anshul Kundaje, Mike Love, and Charlotte Soneson, who all presented, either in person or virtually, to 24 extremely talented students who came to the course. They gave me a card at the end, and this was the cover of the card. I think the name of this work of art is Dark Space in the Human Genome, and they wrote notes here, which were thanks to Sean and me for being able to give the course, which instructed many of them in statistics and Bioconductor's usage for the analysis of genome-scale data in a wide variety of formats that they're all working with.
The reason I'm putting this up here is that they thanked us, but the real thanks, I think, go to many of you: people who have written Bioconductor packages, folks who give help on the support site, those who actually write and contribute packages, and those who review them. You can see that the review process here, for packages that were just submitted within the last 10 days, is very active, with people contributing their comments and responding to them. A lot of this is facilitated by our core, and this is Nitesh and Marcel and Hervé Pagès and Jen Wokaty, Alex, a little blinded there by the sun, Alex Mahmoud, and Lori Shepherd, and there I am struggling to get into the picture as usual. We were able to have a meeting yesterday and to discuss some of the things that Bioconductor needs to do in order to keep things going and keep them satisfactory for the user community.
So it really is a pleasure to me to be able to discuss the project, and I think any of you who are involved in teaching in computational genomic data science know that if you understand Bioconductor, you can get up in front of a room of biologists, computational biologists, software people, and so forth, and you can really hold their interest by describing the way this project holds together, describing the packages, the data structures, and so on. So it's a really wonderful privilege to be connected with this project, and I thank those of you in the community who have made the contributions that make it so valuable to so many.
I wanted to say a little bit about how we're trying to expand activities of the project into domains that we haven't really gotten into, and so we made an application to the National Science Foundation to get some computational resources, and that was successful.
And so, you know, depending on where you are situated, commercially or academically, with good computational resources or possibly mediocre ones, the types of resources that we've been afforded here are very significant relative to what we've had to do and what we've had to pay for when we use the cloud. So we have 50 terabytes of storage in the Open Storage Network, which we can use to manage archives, but which we can also use to do experimental work that I'll describe in a little bit. We have four terabytes in Jetstream2 storage. Jetstream2 is a non-commercial cloud, an academic cloud run by the NSF, and so we have terabytes of storage there to help support activity that will be conducted with VMs in Jetstream2 and also GPUs. So the idea here is that we can think about what I call GDADS, Genomic Data Analysis and Development Services.
So one of the things that's going on with the OSN storage is we've started to collaborate with some folks who want to use the Zarr format for image storage and analysis. Just getting one of the tissue microarray data sets into the Open Storage Network, where anybody can go and start using Zarr to compute on the data, has been one step forward, led by Ludwig Geistlinger along with some folks at Harvard. We can use this storage for deep archiving. We do provide versions of Bioconductor source code all the way back to version 1.8, and distributing those via commercial cloud can lead to non-trivial egress costs, but now we don't really have to deal with that. We can also increase the reliability and security of the stuff that we have by using this for redundant storage, and as I mentioned, we want to do research on data representations for very large genomic and other computational biology data resources, and this will help us to do that.
The Jetstream2 virtual machines have been very useful for developing Kubernetes infrastructure to build binary images of Bioconductor packages and may be used to orchestrate the whole Bioconductor build system eventually. You can work with this through OpenStack APIs, and one of the things we've done here is started to develop a persistent database so that we can track activity in the build system over time, something that we have not been able to do as yet. Databases can also be used to renovate the APIs for AnnotationHub and ExperimentHub and so forth, and so we're able to take advantage of this to choose a database that seems to be effective for the type of work we want to do.
Now, we want to make this accessible in certain ways to the community, specifically for development, and that's why we need a concept of identity management. So there will be work on a Bioc identity and access management system so that you can register and be recognized for the things that you contribute here, and we can also use information about folks to produce social networks that may help people develop things in collaboration more effectively than on their own. As we work through this, we're going to have to spend some time developing dashboards so that people understand what's out there, so that we can understand what's being done, and so if you're interested in that, we'd love to collaborate with you to see how you can actually deal with.
What we said here is that this is engineering and disseminating a software and analysis ecosystem for genomic data science. These are big words. I think it's not very clear how to describe this precisely, but we want to try to have experiments and results on how this should be done efficiently, so I welcome any questions about that. Also, a new platform that many people are using is the Apple M1 chip, and we are on the verge of being able to produce binaries for that so that it'll be convenient to use those packages in a native way on the M1.
This is what it looks like, right? This is Bioconductor's release, and you can see that the number of Bioconductor packages that are successfully installing, building, and checking, possibly with warnings, is in the range of 2,000 now. There's lots of software there that many people are taking advantage of, and we're shipping terabytes of data over CloudFront every month, and the entire genomic research community is taking advantage of all of these things, commercial and noncommercial.
I can start wrapping up. It's always been a pleasure to me to look at this picture of some of the early progenitors of Bioconductor: Jean Yang, Rafa Irizarry, Sandrine Dudoit, who we'll be hearing from soon, Robert Gentleman, Cheng Li, Wolfgang Huber, Ben Bolstad. This is a picture from 2003. One of the things that has been on my mind is, did they have any idea that what they were working on then would lead to 20 years of growth, of the community, of the software stack, and of an approach to doing scientific work in computational genomic data science? I have a feeling that some of them did, that they knew that they were shooting for the long haul, and it's worked. And this is just another picture here of folks that we got together in Bressanone a couple of years ago to do teaching in Europe on how you can use Bioconductor to do genomic data science. And so, you know, I like to think of the project as broadening competencies for inclusive and collaborative science worldwide so that more voices are heard and insights are cultivated.
So once again, I just thank all of you for coming and supporting this project and contributing to it. And I'll pause to take any questions. And then in a few minutes, we will introduce Sandrine Dudoit. Any comments or questions? Perhaps from the remote audience? Is there anything in the chat? Can we see a chat? If not, I can bring up Sandrine's CV. It would take a few minutes to review. Nothing in the chat. All right.
Well, in order to keep to schedule, I think I will take a break and be back at 7:59 or so to introduce Sandrine. Thank you all very much.