So now we're starting to get some really cool results. Let me show you that distribution again, this time taken from the book, just so we have something to talk about. The reason this happens far down here, close to zero, is that any Gaussian function goes down towards zero very quickly. So if we take an integral over this tail, the vast majority of the integral is going to come from the last piece. We can't really afford to lose the stabilization energy of this last residue in yellow. The rest of the integral out here towards minus infinity is not going to be enough to compensate for that. And that means that, again, losing a single hydrogen bond or something like that could destabilize the entire structure.
But look at the white space here, the white space under the curve. The vast majority of possible sequences are not going to be stable in this particular fold. What that means is that if I just randomly synthesize sequences of, say, 100 amino acids, the vast majority of them are not going to form proteins according to this prediction. And that is true. That holds. We've tested it again and again in the lab. It's much harder to make a protein than you think. Nature managed to do so because it introduces small changes in existing proteins. But just creating a protein from scratch is a major piece of art, apparently.
How have we proven that? Well, you can certainly try to synthesize things in the lab, but you can only do so many, right? There was an interesting paper a few years ago in Biophysical Journal that went through this. They took a number of small folds, and you've seen a couple of these. This is the villin headpiece. You have some sandwiches there. The last one there is an outer membrane protein, et cetera. This one is called the WW domain, for its two tryptophans. I'm not going to go through all of them. But these range from very small and stable up to large and more complex.
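Going back to the Gaussian argument for a moment, here is a rough numerical sketch of why the last piece of the tail dominates the integral. It assumes a standard Gaussian measured in units of sigma, and the cutoff of 4 sigma below the mean is a made-up value chosen purely for illustration, not a number from the book or the paper.

```python
import math

def gaussian_cdf(z: float) -> float:
    """Fraction of a standard Gaussian lying below z (z in units of sigma)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def tail_share_near_cutoff(cutoff: float, width: float = 1.0) -> float:
    """Share of the tail below `cutoff` that lies within `width` sigma of it."""
    tail = gaussian_cdf(cutoff)
    far_tail = gaussian_cdf(cutoff - width)
    return (tail - far_tail) / tail

# ASSUMPTION: a sequence counts as stable only if it falls at least
# 4 sigma below the mean of the distribution (illustrative value only).
cutoff = -4.0
stable_fraction = gaussian_cdf(cutoff)
near_share = tail_share_near_cutoff(cutoff)

print(f"fraction of the distribution below the cutoff: {stable_fraction:.2e}")
print(f"share of that tail within 1 sigma of the cutoff: {near_share:.3f}")
```

With these assumed numbers, only a tiny fraction of the distribution sits below the cutoff, and almost all of that fraction sits right next to the cutoff. Everything further out towards minus infinity contributes next to nothing, which is the point made above.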
And based on what we said, the larger the structure or domain is, the more sequences it should be able to accommodate, right? So there is something not right here. Well, we're missing one thing. They first looked at the length of the protein, and then at what they call the structural complexity. That just means: how many sequences would we be able to fit in this fold? And that definitely goes up with length. It goes up very quickly with length. So there will be many more sequences you can fit in the long ones. But those are not all the sequences. The problem is that when you start increasing the length, the fraction of sequences that are going to be stable in the protein actually goes down, because the number of non-stable potential sequences grows even quicker. So yes, technically, a large fold will be able to accommodate more sequences, but the number of sequences that will never fold grows even faster. That means you somehow want to be somewhere in the middle, not too small and not too large. You want a bit of diversity, but as we start forming larger structures, the probability of spontaneously folding will very quickly go down to nil.
Have you thought about those prions I mentioned before? Based on this, we can calculate the probability of a random sequence folding into a protein. I'm just going to do a rough ballpark estimate here. This might be 1 in 10 or 100 million, 10 to the power of minus 8. So for a random sequence, the probability of having a well-defined local minimum is in the ballpark of 1 in 100 million. But what about prions? Prions you could think of as sequences that have not one but two reasonably well-defined local minima in free energy. And if we assume that they're independent, we should be talking about the ballpark of taking that number and squaring it. So 10 to the power of minus 15 or 16.
And it's hard to do statistics on sequences because we haven't sampled them evenly. But again, to a first approximation, that appears to hold. So the reason we have prions is simply that, out of all the potential proteins we have, in a few cases we can actually have kind of two native states. It's just that they occur on different time scales. But it's the exception that confirms the rule.
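The ballpark arithmetic for the prion estimate can be written out as a small sketch. The two single-minimum probabilities are the rough figures quoted in the lecture (1 in 10 million and 1 in 100 million). Treating the two minima as independent is the stated assumption, and the function name is just an illustrative choice.

```python
import math

# Rough figures from the lecture: about 1 in 10 million to 1 in 100 million
# random sequences have a single well-defined free energy minimum.
P_FOLD_HIGH = 1e-7
P_FOLD_LOW = 1e-8

def two_minima_probability(p_single: float) -> float:
    """Chance of two well-defined minima, ASSUMING they are independent."""
    return p_single ** 2

for p_single in (P_FOLD_HIGH, P_FOLD_LOW):
    p_double = two_minima_probability(p_single)
    print(
        f"one minimum: 10^{round(math.log10(p_single))}, "
        f"two minima: 10^{round(math.log10(p_double))}"
    )
```

Squaring the two ends of the range brackets the figure of 10 to the power of minus 15 or 16 mentioned above. This is order-of-magnitude reasoning only, and it stands or falls with the independence assumption.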