Today we're going to look at fold stability. So all these structures, for fibrous, globular and membrane proteins, that we've gone through in the last two lectures, I'm now going to try to relate to the fundamental concepts that you picked up from physics. We're going to look a little bit at their size distribution and essentially at what it is that makes a protein a protein, and show that this is very much due to fundamental statistical mechanics.
Based on what we talked about, in particular in the lecture on globular proteins, I'm going to come back to this concept that we had a number of different classes, or whatever we call them. We had the globins; this is an example of the myoglobin fold. We had these large, mixed alpha helix and beta sheet regions that were called alpha/beta, where the helices and sheets are literally in the same part of the protein. And then we had these alpha+beta proteins, where you had the alpha helices and beta sheets in disjoint regions. Now, it's not entirely easy to classify these based purely on math, but you can probably see the difference here, right? Purely alpha, alpha and beta mixed, and then alpha and beta segregated. I didn't have room for a pure beta sheet protein here.
And the question that we're going to concern ourselves with today is roughly where I ended the lecture on globular proteins: we had this observation, originally by Cyrus Chothia, that there were in the ballpark of 1,500 folds, maybe 2,500 or 3,000 if you ask Mike Levitt today. But it's a relatively small number. And more important, it's finite. It's way smaller than the number of sequences. So most sequences that fold into proteins, at least, appear to find one of these folds. Why? Well, it's even more extreme than that. Forget about 1,500 or 2,000 folds. I would argue that the vast majority of all the sequences that we see, if they're actually proteins, are going to fit onto a very small number of folds. Maybe 20 to 25% of those 3,000 folds we saw.
That's why I spent some time going through simple things such as the globin folds, the four-helix bundles, those beta sandwiches and everything. So of course, given a sequence, I don't know exactly which one of these folds it is going to be. But with 80% probability, it will be one of the 20% most used folds. So either nature appears to be pretty conservative here and reuses things, or there is something fundamental in these folds that makes them very suitable to harbor many sequences. And it's going to turn out that it's mostly the latter.
Compare this to something else, say RNA. We haven't talked that much about RNA, but RNA, too, actually folds. Remember, this single strand will fold up into some sort of three-dimensional structure. It's usually not quite as rigid as proteins, apart from, say, tRNA, where you had a very fixed structure. Then on the other hand, you have DNA. DNA in principle just has one fold, right? That double helix. Technically, it has a few slightly different versions of it: you have an A helix, a B helix and a Z helix. But to a first approximation it is just a single double helix. And that is partly because DNA is even more constrained. You have the phosphates that need to face water, and then you have the bases that need to be paired up. So one way or another, evolution has selected for this.
So with proteins, there is some sort of feeling that we need the diversity, and we're going to be looking into the cost of distortions in these folds and why we end up having a relatively small number of them favored.