So, 10 years ago, I was so impressed telling the students how much these databases had grown. GenBank at the time had millions of sequences. GenBank is the raw kitchen sink: anything goes in GenBank. TrEMBL is a large database where we've at least made sure that it contains proper proteins, but there can be a bit of junk. Swiss-Prot at the time, which is now part of UniProt, holds more thoroughly annotated sequences that we know a little bit about, and there we try to clean out more of the junk. But it's almost 200,000 sequences. That's an amazing number of proteins. The Protein Data Bank, by contrast, was a small database at the time. I think this was 2006, but don't quote me on that. A ballpark of 30,000 structures. But still, 30,000 is a lot. When I was your age, Sture Forsén in Lund had me strike out the line in his compendium saying that there were only, I think, 50 protein structures known, because there were 200 instead. So in my lifetime, we went from 200 to these numbers.
But these numbers are more than 10 years old, probably close to 15. It's not even useful to show GenBank on the same slide anymore because the scale is so large. Remember the trillions of base pairs, or the 1.6 billion sequences I told you about, compared to those millions. If you want data today, just for UniProt, these are just the clean, nice, annotated proteins: we have over 200 million known, confirmed, validated protein sequences as of 2021. And in the Protein Data Bank, in the last 15 years, we've moved from roughly 30,000 to 175,000 structures.
This growth is unprecedented. I have no idea what we're going to do with all this information. And if you extrapolate this another 10 years, the rate of change is not slowing down. On the contrary, it's increasing. If you look at the Protein Data Bank, it's growing year over year. Well, not quite exponentially, but pretty darn close.
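To put a rough number on "pretty darn close" to exponential, here is a minimal sketch that assumes steady compound growth between the two Protein Data Bank figures quoted above (roughly 30,000 structures and 175,000 structures). The 15-year span between them is an assumption, since the earlier date is only a ballpark, and the function names are made up for illustration.

```python
import math

def annual_growth_rate(start_count, end_count, years):
    """Constant yearly growth rate that takes start_count to end_count."""
    return (end_count / start_count) ** (1.0 / years) - 1.0

def doubling_time(rate):
    """Years needed to double at a constant yearly growth rate."""
    return math.log(2.0) / math.log(1.0 + rate)

# Figures from the lecture; the 15-year span is an assumption.
pdb_rate = annual_growth_rate(30_000, 175_000, 15)
pdb_doubling = doubling_time(pdb_rate)

print(f"PDB growth: {pdb_rate:.1%} per year, "
      f"doubling in about {pdb_doubling:.1f} years")
```

Under these assumptions the PDB grows by something like 12 percent per year, which means it doubles roughly every six years. That is only an average over the whole period. It says nothing about whether the rate itself is changing.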
It's even more impressive if we compare the growth of the PDB to the growth of UniProt. And this is now, I think, even a logarithmic plot. Do you see that the blue PDB line is not even in competition with the red UniProt line? Here I am comparing structures versus, I think, thousands of sequences, because otherwise it wouldn't work on the same plot. The problem is that the gap between the two is not shrinking, it's increasing. For every new structure we determine, we're probably determining a thousand sequences or so. So it's going to become worse and worse: the sequences for which we know a structure will be a smaller and smaller fraction of the total. And that means that, one way or another, we're going to need to find computational ways to go directly from sequence to either structure or function, which is what we're going to come back to later in today's lecture.
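As a quick check on the "thousand sequences per structure" estimate, here is a small sketch using only the two 2021 totals quoted in this lecture (175,000 structures and 200 million sequences). It is a crude ratio of database sizes that ignores redundancy in both databases, and the function name is made up for illustration.

```python
def structure_coverage(num_structures, num_sequences):
    """Return (structures per sequence, sequences per structure)."""
    fraction = num_structures / num_sequences
    sequences_per_structure = num_sequences / num_structures
    return fraction, sequences_per_structure

# Totals quoted in the lecture for 2021.
fraction, seqs_per_structure = structure_coverage(175_000, 200_000_000)

print(f"Structures per sequence: {fraction:.4%}")
print(f"Sequences per structure: about {seqs_per_structure:,.0f}")
```

With these totals there is well under one structure per thousand sequences, which is in line with the estimate above.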