All right, good afternoon everyone. I am delighted to introduce our seminar speaker for today, Sean Carroll. I will point out that if you're here expecting to hear about wormholes or atheism, that's the wrong Sean Carroll, and now is a good time to sneak out. Otherwise, you're in for a treat. Dr. Carroll is now a distinguished professor at the University of Maryland. He's written many books on evolution and the diversity of form, which are great books, and you should buy them. You can pay me later for that. And he's also an award-winning television and movie producer. He's even been nominated for an Oscar. So that's what we're really going to hear about, his Oscar. No, I'm just kidding. I saw him talk when I was much younger, and he's really one of the great communicators of the science of evolution and developmental biology. So you're in for a treat, and I'm just going to let him talk instead of me. Please come up and welcome Sean Carroll.
Well, thanks for the invitation. Thanks for the introduction. Thanks to the folks who spent time with me during the day. Thanks to all of you tuning in, wherever you may be. I hope there's a little something for everyone in this talk. We're going to roam over a little bit of territory. So I'm going to make a quick amendment to the title. It'll make sense later, but maybe it's the sort of teaser that will make some of you stick around. Today I'm going to talk about the scientific ideas that my lab has been pursuing for quite some time. But I also want to take a moment to acknowledge why we do what we do. I just had a discussion with several young scientists here, and it's only fair that I say why I do what I do. Our personal motivations and inspirations as biologists: I think we actually talk about these too rarely.
And I think that the writer Albert Camus, who I some years ago spent a lot of time with, at least sort of in spirit, captured a general truth about creative lives when he said, and please forgive the gender biases, this was written 80 years ago: "A person's work is nothing but this slow trek to rediscover, through the detours of art, those two or three great and simple images in whose presence their heart first opened." And for me, those images are these creatures. I think it was these animals. I actually owned a speckled king snake, like the one on the left. It's their beautiful spotted and striped patterns that inspired my interest in biology, my interest in pattern formation, and my love for the animal kingdom, which continues to this day. That picture was taken in June at a sanctuary in South Africa. But if I had a hope of someday making a living studying or even helping these animals, I first had to study things that were a bit smaller and largely indoors.
But I've tried to ask a big question, and a big question in evolutionary biology is the origins of novelty. You think about novelties, and the elephant's trunk is a great example. Way back in Darwin's time, one of the hardest things for evolutionary science to explain was the origin of anything, and particularly the origin of something with a new function, because if that was assembled through intermediate stages, which is really the description in a lot of evolutionary science, what good were those intermediate stages? So it's been a fundamental challenge, and it is, of course, central to evolutionary thought to try to explain where new things come from. This has occupied my lab for more than 20 years. And in what I mean by tracking the origins of novelty, I'm biased towards a genetic description.
So for traits, either anatomical or biochemical, what we're really asking is how: from some ancestral structure or molecule, through what series of steps does a novelty emerge into the world? And so there's a how, and there's also, importantly, and I think underappreciated, a why. All of us involved in experimental biology and biomedicine, we're really into the how. How does this work? How does this piece of machinery work? What are the molecules? What are the genes behind it, et cetera? And that's the same for me from an evolutionary motivation. What specifically, for example, are the genetic steps that occur in the assembly of a novelty, and in what order? The order becomes interesting for certain reasons. But the underappreciated side of this is why these steps, in this order, and not others. In other words, are there particular paths that evolution is more inclined to take, either because certain steps are more probable, or maybe they're more permissible under, for example, natural selection, or maybe they're necessary, and you can't even get there without taking a certain kind of step?
So today, I'm gonna address two different kinds of novelties. I'm gonna sort them into two and talk about the genetic changes that underlie the origins of both. The first is new morphologies, those little pieces of pattern and physical form. And the second deals with protein functions. Now, one of the biggest influences on thinking about the genetic source of novelty was a book. I know all the young scientists are thinking, oh my God, this guy is going back into the dark ages, but yeah, I am. It was a great book, and if you've never read it and you're at all interested in evolution, this is part of the canon that you should know about. It was Susumu Ohno's book from 1970, called Evolution by Gene Duplication. And this is kind of the first really concrete piece of writing trying to explain where new things come from.
And right in the preface and in the introduction of the book, Ohno states his case, when he said that natural selection merely modified while redundancy created. So he thought that redundancy was a necessary element of invention, of novelty, and argued that allelic mutations of already existing gene loci cannot account for major changes in evolution. And a few years later, two other great Japanese evolutionary biologists sort of doubled down on Ohno's view and stated that gene duplication must always precede the emergence of a gene having a new function. Now anyone here, anyone listening to this talk, who has spent any time in a genome knows that gene duplication is a pretty prevalent phenomenon, or at least the results of gene duplication are pretty prevalent. So that fact, which has been known for a long, long time, even before there were genomes, has guided a lot of thinking: oh, well, we see duplication, and we see duplication associated with new functions. So the thought was that this was a necessary step.
Now you would think that 50 years later, with all of our powers of genomic analysis throughout the tree of life, we'd understand the role of gene duplication in the evolution of novelty. But interestingly, alas, we don't. Collectively, we might have more misunderstanding than understanding. So I hope in the course of today's talk I don't make that situation any worse. Rather, I hope you're gonna come away with a fresh, maybe even deeper, appreciation of some of the issues, and one or two concrete resolutions. Now, the expectation that gene duplication played a necessary role was the expectation that I was baptized into when I started early work in the evolution of development. That was definitely the framework for at least two decades, until perhaps around the mid-1990s, when, in the study of the evolution of development and animal form, a body of work revealed that this view was not correct.
And a wonderful first clue arrived unexpectedly in my lab in this form. This is a little bit of nostalgia today. We're gonna go back in time a little bit. What I'm showing you on the left is immunofluorescence labeling of the larval imaginal disks of two different species of butterflies, labeled with a particular antibody. You can see that this antibody is lighting up thousands of cells out here at the periphery of this developing wing of a monarch butterfly, and also thousands of cells at the periphery of the developing wing of this East African butterfly. But in addition to those cells, you can see an intense pattern of expression, a high level of expression, in the centers of each of these divisions of the developing butterfly wing. So this one has this general pattern, and this one has, superimposed on this general pattern, seven new spots where the gene is deployed. And I guess I don't think the animation of this is gonna work. So that's just a closeup of that.
Those seven spots correspond precisely to the white pupils, you can say, in these eye spots that are on the butterfly wing. These eye spots are used in predator avoidance. They sort of draw the attention of predators away from the main body, where all the organs and juices are, out towards the wing, where the butterfly can withstand a certain degree of tissue loss without too much compromise. So these eye spots are important to an ecological strategy, and they are a novelty invented in butterflies. So we're very interested in this. When we saw that we had identified a protein product that was expressed in these developing eye spots, we were really excited. But what made us particularly excited was the identity of this gene that we were lighting up in butterflies, because we had learned a bit about that gene at about the same time, about its deployment across the entire animal kingdom.
And what's shown here, using a variety of technologies, is the labeling of a bunch of different kinds of embryos for the protein products of the gene called Distal-less. This is the fruit fly in the upper left. That's a butterfly larva in the middle panel. That's a brine shrimp on the far right. This is a developing sea urchin, an echinoderm embryo, here close to me. This is a polychaete annelid worm in the fifth panel. That's a developing chicken wing bud in the next-to-last panel, and then a fish fin bud in the last. So this gene has been used in the building of appendages throughout the animal kingdom and, based on this distribution, for at least 500 million years. So the significance of seeing it in the butterfly, the reason why we got excited, is that this tells us that... and something's not clicking. Let's see if we got this. I got a signal on the console here. All right, hopefully this is gonna continue. Any reason why this would stop advancing? Oh, now it decided to advance. So it hung up there. There we go, we're all right. There was a message on the board, not for me, hopefully, and it kind of froze things up.
Okay, so simply put, what this told us was that Distal-less appears to have acquired a new job. In addition to building appendages for 500 million years, which it also does in the fruit fly and butterfly, it had been co-opted into making butterfly color patterns. That's an important word in evolutionary lingo. So let's be clear: what co-option means is a new use for something that was pre-existing. It could be a trait, okay? If you think about, for example, bird wings, birds have co-opted the tetrapod forelimb into making wings, while butterflies have co-opted Distal-less into making spots. But this applies to proteins, genetic elements, what have you.
So this first investigation of the butterfly spots taught us something that would turn out to be, and we had no idea in the day or year that this first happened, a general rule about the evolution of development and form. And I really like general rules, so I'm just gonna put it here as a general rule for the evolution of development. It is that new patterns evolve when old genes learn new tricks, when existing genes pick up new jobs. We'll talk in more molecular detail about what I mean by that, but just for shorthand we'll call it a new trick. Now, we could not have known that this one finding would anticipate so much of what was to come in the study of the evolution of development. Distal-less was a leading example of a critical and unexpected discovery concerning developmental regulatory genes, and that has to do with the multiple functions, or pleiotropy, of individual genes.
Distal-less may not be familiar to all of you, so I'll come up with another example, the very famous Sonic Hedgehog gene. Once it was cloned in the mid-1990s and people looked around at where the gene was deployed, they could see it, this is in chicken, in the developing limb bud, where it's known famously for patterning the digits, in the developing neural tube, or later, for example, in the development of the feather buds. So what do these three tissues, think now as embryologists, have in common? Correct: nothing, okay, nothing at all. There's no reason anyone would predict that Sonic Hedgehog would be involved in the patterning or development of the limb bud, the neural tube, and the feather buds, no reason whatsoever. But this pattern was observed again and again by developmental geneticists: individual regulators, in this case a signaling protein that is part of an important regulatory pathway, or transcriptional regulators, were involved in the development and patterning of a wide variety of tissues.
And that, to some of us thinking about evolution, had immediate evolutionary implications. The implications of this gene pleiotropy, these multifunctional genes and proteins, were that individual regulatory genes have clearly acquired potentially many new roles in the course of evolution, but without gene duplication. There's one Sonic Hedgehog locus; in the fruit fly or butterflies, there's one Distal-less locus: multiple jobs encoded by a gene at one locus. That means that, to pick up those jobs over the course of the evolution of the animal kingdom, gene co-option must be widespread. But what we had no evidence for was how exactly co-option occurred. We were arguing that co-option was really important. We were arguing that evolution must be proceeding by using these genes again and again in new ways. How were we gonna figure out how co-option occurs?
Well, I wish I could show you that data from a butterfly, but we learned pretty quickly that butterflies weren't gonna be nearly as tractable as some other organisms in which we were working out the developmental regulatory mechanics. And so we dropped back to working with a little more familiar and malleable creature like this one, which nonetheless turned out to be, I'd say, a reasonable analog. Not as pretty, I know, not as pretty as a butterfly eye spot, but it had a spot on its wing, and we thought we could understand genetically and molecularly what was going on. So these are two species of fruit flies, the familiar Drosophila melanogaster on the top and Drosophila biarmipes on the bottom, a fairly close relative of Drosophila melanogaster. The males of this species, and a handful of others in the same group, display a dark spot on the anterior distal part of their wing that seems to be involved in a courtship display. I didn't bring the video. Just as well, it wouldn't have played.
And so what we tried to do, and part of the art of our business of science, is to find the simplest example of the phenomenon you wanna understand. I guess I'll put it this way: if beta-galactosidase synthesis was the model for prokaryotic gene regulation, well, for the evolution of a piece of anatomy, it's our little two-dimensional dark smudge on a wing that was our model. And the "us" in that plural was Nicolas Gompel, Benjamin Prud'homme, and Trisha Wittkopp, who were all in my lab, overlapping at one point, and worked to dissect this. I'm gonna show you some subsequent work from Nicolas and Benjamin in a second. Now, I'm not gonna walk you through the data. This is long-published work, and I'm kinda setting up for the second part of the talk. I'm just gonna walk through the model. Again, you can look at all the evidence. I'm just trying to build a picture in your mind of how new patterns of gene expression might evolve without any process of duplication or anything like that, because that's what appears to have happened, based on our analysis of the regulatory elements and transcription factors involved in making this pattern.
So, to walk you through the model, imagine, and maybe don't look at the screen yet, a developing wing with all sorts of transcription factors deployed in all sorts of patterns that are involved in shaping that wing, which means putting the veins in the right place, putting sensory structures in the right place, and defining the front edge and the back edge, the outer edge and the inner edge, the top surface and the bottom surface, all that kinda stuff. So, sort of cryptically to your eye and mine, in the Drosophila wing there are all sorts of transcription factors deployed in all sorts of patterns. And then imagine there are also genes, which there are, that are involved in a pigmentation pathway and, when expressed, actually make the dark melanin pigment.
And now imagine some connection between those transcription factors that pattern the overall wing and a gene involved in making the pigmentation pattern, and that's what happened. The gene is called yellow. Just think of it as a black paintbrush, okay? It's an enzyme involved in the melanin synthesis pathway. In the ancestral condition, sort of represented by Drosophila melanogaster, yellow is expressed at low levels across the wing, which just gives it a very light dusting of pigmentation. And there's a regulatory element upstream of the yellow gene that's responsible for that expression. And I've not mastered your pointer here yet because it doesn't work like... there we are. Imagine that regulatory element, okay? So you have a sort of pre-pattern of transcription factors in the wing. You have a regulatory element. There's no connection between these transcription factors and this element in melanogaster.
But go over here to Drosophila biarmipes. What we discovered was that this regulatory element had a novel activity that drove expression in the anterior distal part of the wing. So something has changed in the regulatory element, and what's changed is that it's acquired binding sites for a couple of transcription factors. We identified one of the transcription factors right away, because one of the obvious features of the pattern of expression of this element was that it was excluded from the posterior half of the wing, and we knew a transcription factor that in fact had that domain of expression. It's a well-known transcription factor that's been around for a long time, called Engrailed. This is the transcription factor depicted here in green, and we mapped binding sites for it in the regulatory element. So one of those inputs, and it's a negative input, comes from the green protein called Engrailed.
We didn't know the activator for a little while, but imagine the activator, say, drawn here in purple and the repressor drawn here in green, and those combined inputs into this element would give you a little quadrant of elevated yellow expression. So that's what we think has happened. Hopefully I described that moderately clearly. So you have an extant regulatory element and extant transcription factors; all that had to happen was for binding sites for those transcription factors to arise in the regulatory element, and now you have a regulatory connection that modulates the expression of the pigmentation gene, which gives you the black pattern. Now, what excited us about the identity of that green transcription factor is that the Engrailed protein's been around, again, for 500 million years. It's been expressed in the posterior compartments of arthropod and insect segments for all that time. So what you have is a gene, a protein, doing something else for a long time, now being brought into the pathway for modulating the pigmentation of the animal. So this is how new patterns can arise.
Now, Nicolas and Benjamin continued to work on the identity of the activator, and wouldn't you know it? Well, we didn't know it, but they figured out that the activator is Distal-less. So if you were underwhelmed by that fruit fly spot as a surrogate for the butterfly eye spot, it turns out the darn protein involved in making the eye spot is the one involved here too: Distal-less. So the general point here, having walked you through some of the mechanics, is really how new patterns emerge. These new gene expression patterns and, if they're involved in developmental pathways, new morphologies evolve largely through changes in enhancers, which is the rewiring of these transcriptional regulatory networks.
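The model just walked through (an activator input from Distal-less combined with a repressor input from Engrailed acting on the yellow regulatory element) can be sketched as a toy calculation. This is only an illustrative sketch, not the lab's analysis: the grid size, the expression levels 1 and 3, the placement of Distal-less in the two most distal columns, and the names `Enhancer`, `yellow_level`, and `wing_pattern` are all assumptions made for the example.

```python
from dataclasses import dataclass

# Toy wing grid (assumed size). Rows run anterior -> posterior,
# columns run proximal -> distal.
ROWS, COLS = 4, 6


@dataclass(frozen=True)
class Enhancer:
    """Hypothetical stand-in for the wing regulatory element of yellow."""
    has_activator_sites: bool  # binding sites for the activator (Distal-less)
    has_repressor_sites: bool  # binding sites for the repressor (Engrailed)


def engrailed_present(row: int) -> bool:
    # Engrailed marks the posterior half of the wing.
    return row >= ROWS // 2


def distal_less_present(col: int) -> bool:
    # Simplifying assumption: the activator is present in the distal columns.
    return col >= COLS - 2


def yellow_level(row: int, col: int, enhancer: Enhancer) -> int:
    # Baseline: low, uniform expression (level 1, arbitrary units).
    # Elevated expression (level 3) needs the activator bound and no repressor bound.
    activated = enhancer.has_activator_sites and distal_less_present(col)
    repressed = enhancer.has_repressor_sites and engrailed_present(row)
    return 3 if (activated and not repressed) else 1


def wing_pattern(enhancer: Enhancer) -> list[list[int]]:
    return [[yellow_level(r, c, enhancer) for c in range(COLS)] for r in range(ROWS)]


# Same transcription factor pre-pattern in both cases; only the enhancer differs.
ancestral = Enhancer(has_activator_sites=False, has_repressor_sites=False)
derived = Enhancer(has_activator_sites=True, has_repressor_sites=True)

if __name__ == "__main__":
    for name, enhancer in (("ancestral", ancestral), ("derived", derived)):
        print(name)
        for row in wing_pattern(enhancer):
            print(" ".join(str(level) for level in row))
```

In this sketch the transcription factor pre-pattern is identical for both enhancers. The elevated anterior distal quadrant appears only when the element carries both kinds of binding sites, which is the sense in which no new gene is needed for the new pattern.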
Okay, so with respect to gene duplication, the story of the evolution of development and morphology is not one of gene duplication, but of regulatory rewiring among a fairly stable set of regulatory genes. Distal-less has been around for a long time. Engrailed has been around for a long time. Consider, for example, maybe one of my favorite gene complexes, maybe yours: something like the Hox genes, famous for shaping development along the anterior-posterior axis of all vertebrates. There are no new Hox genes that evolved in the evolution of tetrapod diversity. In fact, a couple have been lost. So if you infer what existed in the last common ancestor of ourselves with coelacanths and frogs and mice and all that sort of stuff, that whole complex has been there, and no new genes were necessary to shape, for example, our fingers, et cetera. All those genes were in place in distant fishy ancestors of humans. And what I want you to do is to hold this picture in your mind and contrast this very long-term evolutionary stability with the part of the story that I'm about to tell you.
Okay, so we're gonna switch creatures here. Now, those of you who are sane, when you look at this picture, probably see a dangerous animal that you would like to avoid if you happened to come across it. First of all, I do walk towards these animals, but I also see an irresistible evolutionary mystery. And that's because the snake body plan, and venomous snakes in particular, display many interesting novelties. First is their limbless body, okay? Which means they got rid of their legs, but that allows a unique locomotion and habitat. In the rattlesnake, the rattle. The forked tongue. In these pit vipers, these infrared-sensing pits that sense their prey. And when they open their mouth, two more impressive novelties: their fangs, which are a delivery system for their venom.
So several years ago, we decided to shift from our exclusive focus on the evolution of pattern and morphology to ask about the evolution of biochemical novelty. And we think that venoms are a great model of biochemical novelty for several reasons. First of all, they are recently evolved, so their invention is pretty accessible in the genetic record. They're a key trait for subduing prey. This is obviously how venomous snakes get their meals, so it's a really important weapon. It turns out that in any snake venom that I can think of, there are multiple novel proteins. Not necessarily novel to that species, but let's just say that there are members of protein families that are novel in terms of their incorporation into venom. And then, from an evolutionary point of view, it's very clear that snakes and their prey are in evolutionary arms races. So that's sort of like evolution on steroids a little bit, because this is how snakes get their meals, and of course, if you're preyed upon by these snakes, any resistance you can evolve to them, whether behavioral or biochemical, is valuable. So that puts evolution sort of in fast motion.
I also mentioned something about recency. I don't know if this little factoid would interest you, but snakes are kind of overlooked as a model for anything, and that may have been your first reaction when I said I'm gonna talk about snakes. But one of the many reasons why they might be of interest from an evolutionary point of view is this: if you just look at the pit vipers in the Americas, we've got things like copperheads and rattlesnakes here, but there are pit vipers throughout the Americas. First of all, they're all descended from one common ancestor that arrived in the Americas about 24 million years ago. And so this is a very recent radiation of all these snakes, whether it's bushmasters and fer-de-lances or fancy pit vipers that live in the tropics, or water moccasins, whatever it might be.
These are all a recently diversified set of snakes that arrived from Asia about 24 million years ago. Now, venom is what they use to subdue their prey, so let's frame the biochemical question here: are venom toxins old proteins with a new job, or are they new proteins, something that was sort of invented from scratch? So, before you start imagining me throwing students and postdocs into a snake pit, I wanna acknowledge the folks who made this work possible, and the pioneers who really helped get this off the ground, both at Wisconsin and then subsequently, as my lab moved to College Park a few years ago. They are Noah Dowell, who's there in the bright blue shirt, and Matt Giorgianni on the far left. We got special help from the National Natural Toxins Research Center, the NIH-funded facility in Kingsville, Texas, from Elda Sanchez and Mark Hockmuller. They handled the dangerous snakes, so Noah and Matt didn't have to. And then, especially back at Wisconsin, there were Victoria Kassner, Sam Griffin, and Jane Selegue, who helped get this off the ground.
So I'll give you a quick little background on rattlesnake venoms. For many rattlesnake species, the venoms are hemorrhagic, so they destroy vascular integrity. In human bites, for example, there's a lot of necrosis, et cetera, but it's really a sort of collapse of the hemostasis system that's gonna be a problem for the prey. And some are neurotoxic and will cause respiratory arrest really quickly. So let me tell you a little bit more about the key toxins and then the genes that encode them. One really important group of toxins for this destruction of vascular integrity is a group of metalloproteinases. We refer to them as snake venom metalloproteinases. As you'll see in a second, they're very familiar to biologists who've worked on metalloproteinases in vertebrates. These are just zinc-dependent enzymes.
So what we did is exactly how you'd imagine we would tackle this puzzle, which was to go peering into the genomes of these creatures, see how these genes were encoded, and see if we could trace whatever we could of their evolutionary history. And when you do that for the venom metalloproteinases, you find all of, say, a western diamondback's metalloproteinases together in one gene complex, bang, bang, bang, bang, bang, 30 genes in a row. So it was convenient that all of these enzymes were encoded in the same place, but those of you who've either analyzed genomes or cloned anything can also appreciate that that's a big complex to analyze and to annotate. The key little evolutionary hint is that all of these genes are adjacent to a gene here, hiding behind the curtain, which I'll try to highlight. There we go, I'll get the hang of this. It's called ADAM; this is in fact the ADAM28 gene. That's a gene found in all vertebrates, a metalloproteinase, and it's not in the venom of the snake. Virtually everything to the right here is expressed in the venom of the snake.
So clearly what's happened is that this family has been expanded in venomous snakes, and we know that history. We can reconstruct that history by looking at lots of other snake genomes. I just wanna draw your attention to this: there's been a massive expansion of the venom genes in rattlesnakes, of these metalloproteinases, from this ancestral ADAM28 locus. If you walk through the genome here and look at, for example, other reptiles and other vertebrates, this part of the genome is sort of stable, there's not been much action, but then you just have this huge expansion of close to 30 genes expanding out from ADAM28. So we clearly understand that all of these snake toxins, the metalloproteinase toxins, are descended from a normal physiological protein, ADAM28, and have evolved from this ancestral gene through a process of expansion and diversification.
And we can see that there are really a couple of steps to this. One is co-option, because ADAM28 is not expressed in the venom gland. There had to be some evolutionary step that got at least the ancestor of the genes of the big complex to be expressed in venom gland tissue. And then, as that subsequently expanded, those genes were also expressed in the venom gland. So this is to tell you that snake venom metalloproteinases and, I'll show you another example, essentially all venom toxin families are derived from ancestral physiological proteins that have normal roles in vertebrate physiology. And they've been co-opted. They've been recruited from other tissues to be expressed in the venom gland. And in the case of the metalloproteinases, which are structurally quite diverse, chemically quite diverse, they've also gone through a massive process of gene duplication and intragenic deletion. So this is a case where at least the biochemical diversity of this toxin has been shaped by lots and lots of deletion steps.
I'm just gonna show you the forensic details of that a little bit, to show you that evolution can also create novelty by whittling things down. There's been a fair amount of whittling going on in this family. These metalloproteinases have a really expanded structure. The ADAM28 gene is something like 25 exons. But if you look at the genes that are used in the venom, they all lack, as you might expect, a transmembrane domain. So the ancestral gene had four domains: a metalloproteinase domain, where the active site for the metalloproteinase activity is, a disintegrin domain shown here in kind of yellow, a cysteine-rich domain shown here in orange, and the transmembrane domain.
And so the ancestral gene, which you can still see in the western diamondback rattlesnake genome, has this transmembrane domain, but in all of the venom-expressed genes, this part of the gene has been deleted. And then there are actually three classes of metalloproteinases, which they call class 3, 2, and 1. Class 3 has all three of these domains: the metalloproteinase, disintegrin, and cysteine-rich domains. But the class 2s have had an intragenic deletion that's taken out the cysteine-rich domain. And there's a class 1 family, which is actually derived from the class 2 family, where both the cysteine-rich and the disintegrin domains have been deleted. So there are actually toxins which are essentially just the metalloproteinase without any of these other biochemical domains. Okay, so: co-option, bringing this physiological protein into the venom, then duplication, and then diversification through this duplication and intragenic deletion process.
All right, so let's turn to the neurotoxin. The neurotoxin, which is in a subset of rattlesnakes, is a heterodimer consisting of two phospholipase A2 subunits, a basic subunit, shown here in purple, and an acidic subunit, shown here in green and orange. This heterodimer is restricted, as I said, to a subset of rattlesnakes, so we're really interested in its history. But there's a pretty similar story in terms of its evolution. These phospholipase A2 subunits clearly came from a region of the genome that encodes multiple phospholipase A2 proteins, some of which are used elsewhere in physiology and have nothing to do with the venom, and several of which have been recruited into venom. The two neurotoxin subunits are highlighted here. So, for example, this phospholipase A2 here, shown in, sorry, gray, and this one over here in gray: these have never been expressed in venom. We don't know of them ever being recruited into venom.
They're related to the venom-expressed genes, but they've not been recruited into this pathway whatsoever. So it's a fairly similar story of, well, how did this gene complex of toxin subunits come together, a similar story to what I told you in terms of the metalloproteinases, in that we know there's a phospholipase A2 locus shown here, designated G, traceable easily through reptiles, that has existed in single copy for a long time, but it expanded into multiple genes in the evolution of vipers and pit vipers, their close relatives. So again, it's a story of recruitment into the venom gland and expansion from a single ancestral locus. Now, what I wanna do is, having given you a picture of this, point out that this is a really different pattern than what I showed you, for example, for the Hox genes or for the other regulatory genes that I told you about, a pretty striking contrast. And that leads me to wanna at least gamble a general inference about what kinds of gene duplicates are retained and why. So with respect to gene duplication and novelty, this suggests there are kind of different rules for different gene types. And if you think about regulatory genes like Distal-less or Sonic hedgehog or engrailed or whatever, these duplications are rarely fixed in animals. Now we know from all sorts of studies of mutation that there are all sorts of mutations going on at all times. So we don't think these duplications don't occur, it's just that they're not retained. Now, why might that be? And I wanna suggest from a variety of evidence, I'll get into some of the evidence, that dosage of these genes really matters. We know that many of these genes are haploinsufficient, and they also have phenotypes when present in three copies. So that initial duplication step, even of a wild-type gene, can have a phenotype that could be deleterious.
And because these things are involved in regulating networks of genes, and some of them may be regulating hundreds of other genes, simply one extra copy may imbalance those regulatory networks and be deleterious. So the inference we take from that is that because duplications of these regulatory genes seem to be fixed rarely, and you can think of exceptions, they're constrained due to dosage effects on the expression of other genes, and these duplicates would actually be selected against. Contrast that with, and I'm sure you have your favorite gene families, cases where duplications may be numerous: olfactory receptors, immunoglobulin genes, et cetera. They're unconstrained, with no effect on the expression of other genes. They're generally expressed in terminal cell types, so they don't have these pleiotropic effects, they don't have the constraint that exists on regulatory genes, and therefore duplicates could be selected for. There's a reason I'm putting the question mark there, but I'm just gonna stop for a second and first let this contrast sink in, and then in the last part of the talk take you through a little conundrum. So while this contrast is empirically supported, and you can quibble with it as much as you like, but it's empirically supported, it raises a conundrum that I wanna explore and try to offer at least a partial solution to in the remainder of the talk, and this is the conundrum. Ever since the time of Ohno and sort of the embrace of this idea of gene duplication being either a necessary or a common participant in the evolution of novelty, that was such a prevalent mindset. Only some decades later did people, particularly population geneticists, start to raise their hands and say, there's a little problem with this model, and the problem is that duplication does not necessarily lead to innovation, nor is the presence of a duplicate evidence of any new function or even of natural selection having acted.
And here's why: since a duplicate is initially identical, it may be redundant, and then it would be neutral, and we know from the rules of population genetics that it would usually be lost, or fixed only rarely, by genetic drift. And even if it's fixed, it's more probable that it will experience an inactivating mutation, something, for example, that would mess up the open reading frame, shift the reading frame, whatever it might be, than an innovative mutation, because the innovative mutations that would give a protein novel activity are gonna occur in precise places in that reading frame. Those are gonna be relatively rare mutations relative to mutations that could disrupt the protein. So this is the conundrum, and the question is, is there any way to sort of get around this? So it became appreciated that the fates of gene duplicates are not simply the middle scenario, which was sort of favored in a lot of thinking, which was that if you duplicate a gene, now you've got a spare copy you can play with, and you play with that copy and go on to make something novel out of it, a process called neofunctionalization. Now the statistical argument is that it's much more likely that two other things are gonna happen. One is that the other copy is gonna get inactivated, for example by something that disrupts the coding region, which I should point out I'm denoting with this yellow segment. Another common fate would be that if the gene has multiple regulatory elements, basically you'll have decay. You'll have mutations in, say, a regulatory element that affects expression in one tissue, and in the other copy maybe mutations in a regulatory element that affects expression in another tissue, and now basically you have two copies of a gene that together just do the same thing the original gene did. So you've gone through subfunctionalization, you've essentially divided the ancestral functions between two copies of the gene, but biologically nothing's really changed.
So we could be fooled when looking in genomes and seeing, say, two copies of a gene, into thinking, oh, something interesting has gone on. No, you can certainly explain the existence of gene duplicates through a subfunctionalization process where really nothing has changed; the ancestral functions are now just divided between the two copies of the gene. So this process which excites evolutionary biologists the most, neofunctionalization, that's hard to explain, okay? So are there evolutionary forces that can surmount this conundrum? I'll just give you a second to think about that. Well, it turns out, as we were starting to face this, and we were seeing this massive expansion of gene families, and we know these toxins are really important in subduing prey, we were thinking, ah, this can't just be a neutral mechanism, something's gotta be going on. Elsewhere in the lab, quite unexpectedly, a different project gave us, I think, the insight that helps us think our way through this, and it's in fact alcohol, or at least flies that like alcohol, that catalyzed two entirely unexpected discoveries and spurred our rethinking about gene duplication in a way that I think will help you through this. So this is a project initiated by David Loehlin when he was a postdoc in my lab; now he's at Williams College. And David was really interested in tackling quantitative evolution. You can imagine that throughout the evolution of the physiology of all sorts of organisms, for example, how much of a protein or how much of an enzyme activity you make in something could be really important. And it just hadn't really been tackled at a deep level, what all the different ways are that one can tinker with the quantitative activity at a given locus. And one of the famous fruit fly genes for evolutionary geneticists, and in fact a really well-studied enzyme, is the alcohol dehydrogenase gene and protein.
And alcohol dehydrogenase is important to fruit flies because fruit flies live in these fermenting habitats and many species have to tolerate a certain amount of alcohol in their environment. And what's shown here schematically in the bar graph is the amount of alcohol dehydrogenase activity in a whole bunch of different species of fruit flies, color-coded by the habitats that those fruit flies inhabit. Some, for example, have adapted to human habitats like breweries and wine cellars, or they live out in the wild on decaying plant matter or, for example, rotting fruit. And what David thought was, if I looked in detail at pairs of species, very recently, slightly diverged species that had big differences in alcohol dehydrogenase activity, maybe I could map all of the molecular contributions to that divergence and understand in general how something like a gene encoding an enzyme like this evolves in an animal. How do the quantitative levels of this enzyme activity evolve? And you can imagine lots of possibilities. You could evolve specific activity differences, so coding changes in the enzyme. You could evolve regulatory differences at the transcriptional level, regulatory differences at the translational level, et cetera. So because it's such a well-studied system, and he knew he wanted to be able to quantitate very small differences, David wanted to use the power of Drosophila genetics to do this in a really controlled way, where he could, for example, put the ADH gene of any fruit fly into the same spot in the genome, so that everything was extremely well controlled, and he was able to measure and see what the genetic basis, the molecular basis, of this phenotypic diversity was. And it's a small gene, a small enough gene that he could do all the transgenics that he would like. And in the course of that study, he made one observation that is what I wanna highlight today.
And that was, he found that a fruit fly that was really adapted to a brewery environment relative to its more wild cousin, Drosophila virilis, had experienced a duplication of the alcohol dehydrogenase locus. So of all of the parts of the gene that were evolving that he mapped in all these pairs of species, in one of these cases there had been a duplication of ADH. And that's such a recent duplication that there were only like two amino acid differences between the two copies of the gene, and he could do all the proper controls of changing those amino acids back, et cetera, et cetera. So the differences in ADH activity were not due to any protein changes; they were due to copy number. And part of his technology here was to be able to reintroduce these genes and quantitatively measure differences in the activity of the ADH genes from different sources, putting them in at the same place in Drosophila melanogaster in all these cases. He was just doing this as a routine experiment. I just wanna tell you that you can do a routine experiment with really routine expectations and sometimes you get surprised, particularly if you measure things carefully. And the surprise was that again and again, when he put in the virilis version that was tandemly duplicated, he always got more than twofold the activity he got with one copy of either of those ADH genes. Now, I'd just say it's all published. He did all sorts of controls you'd think of, including making constructs with other sorts of genes, but it turns out he found reproducibly a greater than twofold effect of genes being in tandem. So if you had two copies that were in trans, they had essentially twice as much activity as one copy, but two copies in cis, of ADH and of constructs that he made with reporters and all sorts of other things, consistently had more than double the activity.
So it looks like the overactivity, in other words, one and one does not make two, one and one can make more than two, comes from the tandem structure of the genes, and this observation has now been supported by studies in a bunch of other species. And I'll just say, I'm just telling you about that data because it might interest you. It's unexpected, it's worth investigating, and David's been investigating it. But the other thing I want to underscore is that he mapped all sorts of contributors to differences in ADH activity. There were a couple of cases of coding changes; those were very, very, very rare. It looks like this enzyme is basically optimized for specific activity. He mapped things in three prime UTRs, five prime UTRs, in upstream sequences, et cetera. But the greatest quantitative effect he measured in all these species comparisons was due to this duplication. So if you think about making a lot more ADH to live in a very alcohol-rich environment, the single largest mutational step essentially was duplication of the ADH gene. And because we saw that there was no difference in the activity of the two paralogs of the ADH gene, no coding differences that were meaningful, we had a little slap-the-forehead moment, because this fruit fly lives in a high-alcohol environment. This duplication gives it a lot more ADH activity, and we said, wait a second, this is a case where selection has probably selected for greater enzyme activity, for making more ADH. And we thought, oh, wait a second, so this is what we've not been thinking about in this step here. Duplication happens, and the worry, the population genetic worry, is that this second copy will be lost due to drift, or it'll be inactivated by other sorts of mutations, that there's nothing keeping this second copy around long enough to play with it. Unless there's selection for increased dosage.
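To put rough numbers on that population genetic worry, here is a minimal sketch that is not from the talk. It uses Kimura's standard diffusion approximation for the fixation probability of a single new variant in a diploid population, treating the new duplicate as that variant. The population size `N`, the selection coefficient `s = 0.01`, and the function name `fixation_probability` are all hypothetical values and names chosen for illustration, and the sketch assumes the effective population size equals the census size.

```python
import math


def fixation_probability(s: float, n: int) -> float:
    """Kimura's diffusion approximation for one new copy (frequency 1/(2n))
    in a diploid population of size n, with selective advantage s.
    Assumes effective population size equals n (an illustrative simplification).
    """
    if s == 0:
        # Neutral case: fixation probability equals the initial frequency.
        return 1.0 / (2 * n)
    # (1 - e^{-2s}) / (1 - e^{-4ns}), written with expm1 for numerical accuracy.
    return math.expm1(-2.0 * s) / math.expm1(-4.0 * n * s)


N = 10_000  # hypothetical population size
neutral = fixation_probability(0.0, N)   # redundant duplicate, drift only
favored = fixation_probability(0.01, N)  # hypothetical 1% advantage from extra dosage

print(f"neutral duplicate: {neutral:.6f}")
print(f"dosage-favored duplicate: {favored:.6f}")
print(f"ratio: {favored / neutral:.0f}x")
```

Under these assumed numbers, a duplicate with even a small dosage advantage is a few hundred times more likely to fix than a neutral one, which is the sense in which selection for "more of the same" can keep a second copy around.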
And you think about proteins, including things like venom toxins, where there might be selection for making more of something, because, for example, if you make more of a particular toxin you might take down your prey more readily, or if you make more antifreeze that might insulate you more from the cold. Think about your favorite sorts of proteins. So this little unexpected gift from another lab bay, actually in the same lab, made us all pay attention to the fact that we had probably just not been thinking through this issue nearly well enough: simply, if the duplicate can be selected for making more protein, then it's gonna stick around. And then we were really excited. Always go back and read your original references, kids, because Ohno had it, okay? We were thinking that Ohno was too caught up with neofunctionalization, and all of Ohno's descendants have all been thinking about gene duplication being necessary. But right there on page 59 in this book in 1970, he wrote about "duplication for the sake of producing more of the same." He knew at the time that things like histones were present in multi-gene copies and ribosomal RNA was present in multiple copies, and thought, if a cell needs a lot of something, well, duplication is just for making a lot more of the same. 50 years, and we discover what he already knew. Okay, so it makes sense for venom proteins, which are highly expressed. And we see other situations in genomes like this. One of my favorite stories is, for example, antifreezes in polar fish. So if you look at something like the icefish, which lives in the Southern Ocean off Antarctica, it makes really high levels of an antifreeze protein in its blood so that it can live in actually sub-freezing waters, sea water that is below zero degrees Celsius. When you see a battery of genes like this, you know, we don't think that these antifreezes are distinguished at all, one from the other. This is just essentially a protein factory. You gotta make a lot of antifreeze.
There's probably only so much product you can milk out of, I shouldn't use milk, because that's another good example. There's only so much product you can get out of one gene, and this battery enables you essentially to have a factory for making large amounts of stuff. And you see the same sort of thing, for example, it uses antifreezes to protect its eggs in the waters. But there are more and more cases you can think about. Even, for example, this might be the case for things like salivary amylases in humans, which have been expanded in the course of human evolution. So, a little bit of rethinking of gene duplication. I think we can say it's unnecessary for morphological evolution in animals and potentially selected against in most circumstances. But if a quantitative effect is biologically significant, then the duplicate can be retained intact, fixed by selection, and set the stage for later innovations, whatever those might be, in a biochemical pathway, in an enzymatic pathway, et cetera. Thanks for listening. That last picture was another novelty, but some novelties you just want to hold, you don't want to really investigate. That was a pangolin in South Africa, for those of you that are fond of rare mammals. How do we do questions, Sean? People can come up to the mic. And then there are folks on Zoom, is that right? Yeah, so Brahmin has got his eyes on just coming in. So my question: you have, in the case of venomous snakes, the duplication of a single type of gene and radiation from that. So how do you see things like the cone snail venoms, which are crazy cocktails of all kinds of different short peptides? Yeah, I think in those cases, those look like kind of novel polypeptides. I mean, in some cases, there's a beautiful story of an insulin-like peptide that's evolved and is associated with a dietary switch in the cone snails.
But I think some of those short peptides, and I wish there were more genomics on these creatures. Last I knew, I don't think cone snails had much. Sorry, I apologize to everybody, cone snail fans, but last I looked, there wasn't much there. What we want to do is track the origin of these things in the genome. I think they're hard. Clearly, if they are related to an existing protein family, that's not so hard to spot. But what you really want to do is track their origin in the genome. And are they just, for example, a little ORF, right? There are all sorts of possibilities. At the size of some of those peptides, they could easily be another reading frame of another protein, right? And if there's some way to just make that stuff and it has any kind of biological activity, you're off and running. So I think it's possible that some short venom toxins are going to be truly sort of de novo polypeptides. What I want to just elaborate on there for a second is the first step in making a new toxin. I glossed over it. I mean, I've inferred and I've made the argument that there's co-option, right? At some point, something gets into the venom gland and you're off and running, okay? What I haven't told you is, I really haven't given you any kind of glimpse of what that step looks like, because I can't. Empirically, we just don't have the evidence yet. It's all inference that it happens. But realize, we think that if it's a metalloproteinase, if it's a phospholipase A2, even a normal physiological protein with no modification, simply if expressed in a venom gland and injected into prey, may be toxic, even if that protein is itself, obviously, a normal physiological protein. So we don't think that these proteins necessarily need to undergo any modification to be toxins.
If, for example, ADAM28, or some piece of ADAM28, was expressed in the venom gland, that would suffice to give you enough activity that selection can now start working on it and make a better and better toxin or toxin family. And the reason we say that is that there's a very interesting array of things that have been, in the course of snake evolution, co-opted into venom, including things like bradykinin, as is. So there are some venom genes that are, in fact, single genes, and the physiological function and the venom function are encoded by the same locus, because they're the identical protein, the one used in normal physiology and the one that the snake injects into its victim. You think about things like, what are snakes often doing? They're often screwing up hemostasis or screwing up neurotransmission, and merely the injection of some normal physiological protein into the circulatory system, or where it can reach the nervous system, can be sufficient to drop things. There are also all sorts of really interesting anticoagulant and procoagulant components in venoms. Again, all you need to do is get the normal physiological protein expressed in the venom gland and you've got the beginnings of a toxin. So I just wanna talk about that early moment, and I think what the cone snails could be doing is just at random expressing peptides, short polypeptides, in their venom, and if anything in that works, it's now a substrate for selection. I wanna connect those two dots. I'm sorry, I left you at the microphone for a long time. Did you have a question? Okay, great, thanks, yeah. So I had a question about the internal deletions that you see. You said a physiological protein can sometimes just be a toxin if it's expressed in the wrong spot. So if you take ADAM28 and use it as a toxin, is it toxic, and are these deletions necessary? The experiment has not been done.
So if we did the experiment, we'd certainly cut the transmembrane part off. But there are reports in the literature, for example, that ADAM28 is alternatively spliced. So I also wanna raise that for people who are thinking about how this gets started, which is that for some of these physiological proteins, there may be multiple forms of them, and maybe a secreted form of ADAM28 was a good place to start. And I know it's kind of disappointing for me to give you this model and then tell you we haven't done the experiment. But part of this is we're just kind of picking our models, because that really requires in vivo work. It's not gonna surprise you that a truncated ADAM28 will digest every in vitro substrate that I give it, fibrinogen and all this kind of stuff. That's to me not gonna be good enough evidence. So we would have to go into animals and show toxicity in some sort of way. And I think we're gearing up to do those experiments with a different family from these two toxins, where I think the readout is gonna be a lot clearer, where we're gonna find out whether the native physiological protein is disruptive to the potential prey and what modifications were necessary. So we're just picking our spot for where we're gonna invest in the in vivo work, and we just decided not to do it with either of these two toxin families. Okay, but we're on the precipice of doing that with some wicked cool toxins. So thanks for asking. Hi, Dr. Carroll, thank you so much for your great talk. When you talked about the Drosophila ADH genes, and then concluded that the duplication was selected for dosage, I know males tend to compensate for having one X chromosome in Drosophila. Did you look at how the female and the male Drosophila ADH mutants reacted to alcohol treatment or different alcohol environments? Yeah, so ADH is an autosomal gene, and so we're putting it on an autosome.
So we're making the animals diploid or, in the case of the duplicate, essentially tetraploid for the gene. So we did not look at any male-female differences. It's good for you to ask anyway, because David did all sorts of experiments, including in vivo testing of the flies for alcohol resistance, which is a straightforward assay. Yeah, the internal anatomy of males and females is not necessarily the same, and the light bulb's just not going off that males and females showed any difference in those assays in terms of alcohol susceptibility. Because I don't remember that, we probably did them all in one sex just to control for that, but I don't know that we actually explored any sex differences. Do you speculate that someone else might be curious, like the effect we see with male mice and female mice, in your case Drosophila, females versus males? I mean, I could expect there would be differences simply because of where these enzymes are expressed. There may be sexually dimorphic places, for example, in the gut, and there may also be little differences in feeding behavior and things like that. So I think it's something to be alert to from sort of the natural situation, but I don't think we had any evidence that there was a male-female difference. So I think you've got to keep these things in mind; it's a great question. Thank you. I think it's good to keep these things in mind, because you could see the hesitation after the question. I started to think, gosh, I hope we controlled for that, but no, I have faith that David controlled for that. Yeah. Hi, so thanks for the wonderful talk. I also really appreciated the general rules. And in that context, I guess I was wondering, you talked about how for many genes, duplication might not be a good route to go, might be deleterious, and that regulatory genes and structural genes might place different demands or bring different constraints. So how is something like whole genome duplication possible?
How can that even happen? Yeah, that's a great question. And we can all bat that around, because I have probably no better answer than anyone else. The only thing is, at the initiation of whole genome duplication, of course there are lots of weird things too. I mean, there's tetraploidization, if anybody here is a Xenopus researcher, et cetera. So we know that ploidy can increase, but of course the ratio between all the regulatory products and targets would still be the same initially, right? And everyone here working on teleost fish knows that there was another duplication there, that all your zebrafish have got more Hox complexes than I showed, right? And subsequent to those polyploidization events, there's some sorting out. So whether it's the yeast genome-wide duplications or whether it's the teleost duplication, we see there can then be some gene loss after that. So it would seem like initially, and of course, I'm only talking about animals, oh well, I just talked about yeast, so I apologize for that. I'm not talking at all about plants, because of course there's a lot of ploidy going on in plants. I'm just gonna stick to animals a little bit, where we'll talk about these regulatory networks and these kinds of genes, because obviously plants seem to be a lot more tolerant of these dosage changes. But the only thing I'd just say in terms of animals dealing with polyploidization is that initially all the ratios are the same. And then maybe you have gene loss after that of the things that can be lost without disrupting those balances. That's the best ad hoc explanation I can come up with. And I should mention, for people in the audience, and maybe you all know some data or maybe you've done these kinds of experiments, what I'm referring to with, for example, a third copy. In the early days there was a lot of manipulating of Hox genes and other regulators, and things like a third copy of Pax6 cause a lot of defects.
A wild-type copy of Pax6, a nice wild-type gene. I think those are the things we should be thinking about. And of course, here I am at NIH and we were talking about genetic conditions, and I think we've gotta understand that there may be consequences of segmental aneuploidy and things like this that are simply due to extra wild-type copies of regulatory genes, and that these may be manifest in human conditions. I don't think there's been a lot of discussion about that in the literature. We're out of time, but thanks, and he might take a few questions afterwards. One more round of applause, please. Thank you. Thank you very much.