Gladys June Everson was an undergraduate at the University of Wisconsin, and she got her degree in 1931. She went to the University of Iowa for a master's degree, came back to Madison, and did a PhD with Harry Steenbock, getting her degree in 1941. After that she was a professor at the University of California, Davis, and at her untimely death in 1969 she bequeathed her estate to the Department of Biochemistry. Those funds are used to support a lectureship that is dedicated to those who have been associated with the department at any level. So I've always said that the University of Wisconsin has absolutely fantastic undergraduates, and we have experimental evidence, starting of course with Everson and continuing with Clara. Clara was my first undergraduate here at the University of Wisconsin. She set a very high bar. She graduated with a 4.0, which I know is extraordinarily difficult to do in biochemistry. After that she did her PhD with Doug Rees in California and a postdoc with Steve Burley at Rockefeller University. She was an assistant professor at Johns Hopkins University and then moved to the University of Rochester Medical Center, where she is now. She has a longtime interest in RNA and the proteins that are associated with it, and that's what she's going to talk to us about today. So with that I'll hand it over to Clara.

All right, today I'd like to tell you about some new views of U2AF in pre-messenger RNA splicing, and I think this is pretty timely since U2AF has kind of come out as being a major splicing factor that's involved in a couple of different human diseases. So as many of you know, most of your genes contain introns, these intervening non-coding sequences that need to be excised from the pre-messenger RNA transcript before it's translated into proteins, and these introns actually can also serve as a source of non-coding RNAs.
It's now known that almost all of your genes are alternatively spliced, that is, a particular part of the transcript is either included or excluded from the final messenger RNA, usually to encode different proteins and also to regulate gene expression, and this process is a major source of transcript diversity. For example, if each of n exons is either included or excluded from a particular transcript, you have two to the nth power different transcripts that are potentially generated. Now I don't want to mislead you: in practice there are only about three to four splice forms on average per human gene, although there are exceptions that have many, many more, and in each tissue type there are about two dominant splice forms that tend to take over in that particular tissue type. So it's pretty amazing that these splice sites are recognized and correctly excised, at least most of the time. The splice sites are marked in the transcripts by relatively short consensus sequences. At the five prime splice site these include this relatively short sequence, which is a GU in the RNA, GT when you sequence the DNA. And at the three prime splice site they are still short, maybe a little bit more of them, but pretty degenerate, divergent. The sequences at the three prime splice site include a branch point sequence, which is this pretty conserved adenosine with a few other nucleotides, and at the three prime splice site itself a highly conserved AG. And this three prime splice site AG is preceded by a polypyrimidine tract. What we're focusing on here are the early stages of three prime splice site recognition by a protein heterodimer called U2AF. The large blue subunit here, U2AF2, normally recognizes the polypyrimidine tract. This is a stretch of either Cs or Us in the RNA here that is about 20 nucleotides long, and it precedes the AG. And U2AF2 is actually required for splicing most splice sites.
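As an aside on the "two to the n" estimate mentioned above, the upper bound can be sketched in a few lines of Python. This is only an illustration under the simplifying assumption that each of n exons is included or skipped independently; the function name `max_isoforms` is hypothetical and not from the talk.

```python
def max_isoforms(n_optional_exons: int) -> int:
    """Upper bound on distinct transcripts if each of n exons is
    independently either included or skipped (an idealized assumption)."""
    if n_optional_exons < 0:
        raise ValueError("number of exons must be non-negative")
    return 2 ** n_optional_exons


# Print the theoretical bound for a few exon counts.
for n in (1, 2, 5, 10):
    print(f"{n} optional exons -> up to {max_isoforms(n)} transcripts")
```

The bound grows very quickly, which is why the observed average of roughly three to four splice forms per human gene sits far below the theoretical maximum.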
It's thought to be important for splicing most of the major class of three prime splice sites. So you've really got to have U2AF2: you knock it out and your cells die. U2AF1 is more conditionally important. It's also essential for viability, but it's important for splicing a subset of splice sites that are dependent on this AG dinucleotide; they're the so-called AG-dependent sites. And these usually have a relatively weak or degenerate polypyrimidine tract. So U2AF, I think, is tremendously important. It's the most important one. However, I should not exclude the fact that there are about 170 loosely associated proteins that are important for RNA splicing. U2AF I like to think of as serving a role analogous to TATA-binding protein in transcription. It has to recognize that three prime splice site and nucleate spliceosome assembly. So it's a very important protein, but it takes a lot to get those splice sites identified correctly. Once U2AF has recognized a three prime splice site, it helps to assemble the core spliceosome, which is a more modest sized molecular machine. It's a ribonucleoprotein machine. It's RNA-based catalysis by five small nuclear RNAs, including the U4/U6 di-snRNP that you may have heard about from David Brow and Sam Butcher's labs here, and the U1 snRNP at the five prime splice site, and U5 at the core. U2AF, as its name implies, helps to recruit the U2 small nuclear ribonucleoprotein particle, which also ultimately is part of the active site of the spliceosome. So altogether we have this huge splicing complex of more than 240 protein factors, and all of these are important for bringing together the core spliceosome. So why do we care about U2AF? Well, as I mentioned, it's one of the early stage splicing factors. It recognizes a three prime splice site and helps to recruit and nucleate spliceosome assembly. So I'm going to show you this very schematic movie here, with U2AF shown way out of proportion in size, because the U1 and U2 snRNPs are much larger.
But for purposes of illustration, U2AF binds to the polypyrimidine tract and AG first with splicing factor 1, which then exchanges, with ATP hydrolysis, for the U2 snRNP, which anneals with the branch point sequence through its RNA, the U2 snRNA. And through a series of ATP-dependent conformational changes, a branched intron lariat is excised in two transesterification reactions. Actually, for splicing itself you don't need ATP, and RNAs can accomplish the task of splicing, but you do need ATP for RNP unwinding to chaperone the steps of spliceosome assembly, and you do need proteins for efficient splicing. Another reason that U2AF is important is that it recognizes these splice site signals in RNAs. Inherited mutations of splicing factors are pretty rare in human genetic diseases because they're just lethal. If you have a mutation in a general splicing factor, you don't get very far in your embryonic development. But mutations in the consensus sequences that mark the splice sites are very frequently found in inherited genetic diseases, and that's because if you're affecting just one transcript, that's going to be a more selective downstream effect than wiping out an entire spliceosome, which is pretty much going to be non-viable. I've just shown a couple of examples here of genes that often have mutations in their splice sites. These are relatively randomly chosen from the Human Gene Mutation Database. U2AF recognizes the polypyrimidine tract and the following AG. The AG itself is very frequently mutated. For the polypyrimidine tract it's a little bit harder to document mutations, because it's in the intron and usually these sequences are found through exome sequencing or DNA sequencing, but it's also known to be affected in certain inherited genetic diseases. Now most recently it's been found that in more than half of all myelodysplasias, mutations have been acquired in splicing factors.
Now as I mentioned, you don't generally inherit a mutation in a splicing factor because it's too lethal. But during pre-leukemias or myelodysplasias, these mutations are acquired at specific sites in certain recurring splicing factor genes, which I'll go into more in a moment. Before I go into that, let me tell you a little bit about what this myelodysplasia is. To be honest, when this came out in 2011 I was like, what is this disease? I've never even heard of it. So let me first introduce to you what a myelodysplastic syndrome is. Essentially it's a disorder of the bone marrow, and I got these slides from an MD at the U of R. The bone marrow cells have abnormal morphology. It can be seen with Prussian blue staining: in certain types of these refractory anemias, the iron has precipitated and can be stained with Prussian blue. And these cells are no longer able to make blood cells normally. This is more common in the elderly, usually because of environmental insult. The poster child is actually a very atypical case. It's Robin Roberts from Good Morning America. She probably got MDS as a side effect of chemotherapy for her breast cancer. Usually it occurs in elderly males, not in relatively healthy females. And the prognosis is poor; it often leads to death. That's because you can't make your blood cells, so your heart has to work harder, and often patients will die of heart failure. They'll die of secondary infections because they can't make white blood cells. And they also sometimes have internal bleeding because they're not making platelets. So basically your blood is really messed up when you have this disease. And currently there is no cure. The only cure right now is a bone marrow transplant. So it becomes very important to understand the molecular basis, and maybe that can lead to new types of therapies.
For example, there are new splicing therapies that are in clinical trials right now. So in 2011 it was found through whole genome sequencing that 60% of the affected genes in these MDS were splicing factor genes. And this was a real surprise, because a splicing factor mutation is generally lethal. In fact, if you express homozygous mutated splicing factors that carry these MDS mutations, it's lethal. But these are always in the heterozygous state. Somehow these are leading to the MDS, and the molecular mechanism for that is still unknown. I wish I could tell you more about that today, but instead I'm going to tell you about what I work on, which is the structure and biochemistry of these affected splicing factors. All right, so four splicing factors are recurrently mutated in MDS, and they're shown here. Among them is U2AF1, which is mutated in about 11% of MDS and also in some lung cancers. ZRSR2 is actually a paralog of U2AF1 that is found in the minor class of spliceosomes. So it's really just a limited set of splicing factors that are affected. And the residues, the amino acids that change, are almost always the same ones in U2AF1. It's almost always S34, which is mutated always to a phenylalanine or tyrosine, a large aromatic residue. In a minority of cases, I think only about 15% of cases, a Q157 mutation will occur. And this Q157 mutation does not occur in lung cancers. It only occurs in some MDS. So really these mutations are concentrated in very specific spots. They seem to have a neomorphic, or new, function, and that's kind of exciting to the field. Much more rare, but they do occur, are mutations in U2AF2, perhaps less frequent because U2AF2 is required for splicing. You can't mess it up too much. But they do occur, and I'll talk a little bit about our work on those in a minute.
What I'd like to tell you about today is the detailed structural work in my lab on how U2AF2 recognizes the polypyrimidine tract signal, and our recent work to see how U2AF2 polypyrimidine tract recognition is altered by disease-associated mutations in the polypyrimidine tract and by the MDS-associated mutations. And then I'll just switch gears a little bit and look at U2AF1. We've recently looked by single-molecule FRET at how U2AF1 can influence U2AF2, so I'd like to share that recent work with you. And also we've looked at how the S34F mutation of U2AF1 can alter its RNA recognition. So let's first look at U2AF2, which recognizes the polypyrimidine tract here. Way back, like 15 years ago when I started working on this protein, I couldn't get it to crystallize. I was a typical crystallographer working in a structural genomics lab, so our usual plan of attack was to cut down the protein, make it smaller. Using that strategy, I did manage to get crystals, which are shown here. The structure is shown here. But it had a 20 amino acid interdomain linker deletion. So this structure was useful. It gave us a nice publication for a starting assistant professor. And it did show us how the nucleotides are recognized by the two RRMs. But it wasn't until quite recently that we were able to show a more intact domain. And perhaps the lesson here for budding young crystallographers is: don't necessarily shorten, maybe go out a little bit. The way that we actually obtained the structure was by adding residues, about 10 residues on either side. That helped to fold up the entire domain, and we then obtained the more intact RNA binding domain crystal structure shown here. So what we see is that we have the two tandem RNA recognition motifs, or RRMs, that bind to nine nucleotides of a polypyrimidine tract.
And the inter-RRM linker folds across and interacts with the two flanking alpha-helical regions, and actually recognizes the central nucleotide of the polypyrimidine tract, which was lacking in our prior structures. So we went on to do mutagenesis and RNA binding studies. We can see that the interactions with the central nucleotide are important. They contribute about five-fold, or one hydrogen bond of energy, to recognizing the central polypyrimidine tract nucleotide. The N terminus is important too: a mutation to alanine of this glutamine residue decreases binding affinity by five-fold. And the C terminus folds nicely across the terminal nucleotide. I don't believe we tested that; it's a glycine interaction here. But altogether, these nice interactions with the terminal nucleotides and the central nucleotide are all made by regions outside the core, so-called RNA recognition motifs that typically are thought to interact with the RNA. And most strikingly, these flanking and inter-RRM regions bind cooperatively. So by simply extending these core domains by 10 flanking residues, we increase the binding affinity to 35, you know, by more than 100-fold, so that it's very comparable to the full-length protein. So in summary, what we know about U2AF2 from this work is that the U2AF2 RNA recognition motifs and alpha-helical linkers recognize a nine nucleotide polypyrimidine tract. So then we went on to see if mutations in the polypyrimidine tract that are known to be inherited in human genetic diseases would alter U2AF recognition. We started first with a disease called retinitis pigmentosa. Mutations in this RP2 polypyrimidine tract shown here are inherited in X-linked retinitis pigmentosa, and they are relatively commonly linked with it. The inherited mutation is this U-to-A change here near the five prime region of the polypyrimidine tract.
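As a side note on the "five-fold, or one hydrogen bond of energy" comparison made above, a fold change in binding affinity can be converted to a free energy difference with the standard relation ΔΔG = RT ln(fold change). The sketch below assumes a temperature of 298.15 K, which is not stated in the talk; the function name `ddg_kcal_per_mol` is hypothetical.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol*K)
T_ASSUMED = 298.15     # assumed temperature in kelvin (not stated in the talk)


def ddg_kcal_per_mol(fold_change: float, temperature: float = T_ASSUMED) -> float:
    """Free energy difference corresponding to a fold change in affinity,
    using ddG = R * T * ln(fold change)."""
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL * temperature * math.log(fold_change)


# Fold changes quoted in the talk: about 5-fold and more than 100-fold.
for fold in (5, 100):
    print(f"{fold}-fold -> {ddg_kcal_per_mol(fold):.2f} kcal/mol")
```

Under this assumption a five-fold change comes out just under 1 kcal/mol, at the low end of what is usually quoted for a single hydrogen bond, which is consistent with the comparison in the talk.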
And if we determine the RNA binding affinity of the wild type sequence for U2AF2, shown here, it's about two and a half micromolar. This is a relatively short polypyrimidine tract. But with the U-to-A mutation, the RNA binding affinity of U2AF does decrease. So here we can see that an A substitution in this polypyrimidine tract that is inherited in X-linked retinitis pigmentosa does decrease the RNA binding affinity of U2AF. We then went on to see if a similar mutation that's inherited in neurofibromatosis would also decrease binding affinity. And yes, it does. These are both in the same location, not through our selection; we were just looking for polypyrimidine tract mutations that were documented to be inherited. It's not because we cherry-picked it. It's because they are in that site. It's a deleterious site to have an A substitution. In contrast, if we do a control mutation near the three prime end of the RNA, there's no effect, or within error, you know, it's very similar binding here. So there's something about this position on the protein that really needs to have a U, and this other position can adapt to an A. This led us to hypothesize that we have a sequence-specific site on the C-terminal RRM2, whereas we have a promiscuous site on RRM1. To look at this more closely, we have actually determined structures of U2AF with all four nucleotides bound at the promiscuous site. I'll just walk you through here quickly and show you how U2AF can adapt to different nucleotides at this site. With a C nucleotide it's almost the same, as it is with either of the pyrimidines. Here I'm just morphing between a crystal structure determined with a C nucleotide versus a U nucleotide. You see an arginine side chain can simply flip to accommodate either nucleotide.
If we mutate to an adenosine, again the arginine side chain can flip, and if we mutate to a guanosine it's a little bit harder, so actually U2AF seems to recognize a syn guanosine, although a caveat of the structure is that we're using a deoxynucleotide, which can stabilize a syn guanosine. We then went on to leverage the structural information to show a direct link to U2AF in cells. We used our crystal structure to say that an aspartate at this position is probably going to disfavor a uridine oxygen here, whereas it can form a hydrogen bond with adenosine. So we mutated this residue to a valine, which packs very nicely with the uridine in a crystal structure. First of all, just by RNA binding, we showed that this D231V mutant protein increases RNA binding to the RP2 site. And then we introduced this into cells. To do this experiment we used an RP2 minigene, and we could show that if we co-express wild type U2AF2 with the RP2 mutant, we have nearly complete exon skipping of the affected splice site. But if we introduce the mutant U2AF2, which now binds better to the defective splice site, we can restore normal splicing. This is an RT-PCR gel here. And here is a quantification of multiple replicates of the RT-PCR, and on the left I'm showing a qRT-PCR assay of the splicing reporter. So from this we can use our structural information to show a direct link between U2AF2 and recognition of this defective splice site. It does appear that in cells U2AF2 is regulating the splice site and that it is being affected by the RP2-related mutation. So we conclude from this that disease-causing polypyrimidine tract mutations can alter splicing via U2AF2 recognition. This is not high throughput by any means, but it does show that in principle this does occur. So now let's look at the U2AF2 protein itself.
Now U2AF2 is not the most commonly affected splicing factor in MDS, but it does get mutated in some cases and also in cancers. So we surveyed The Cancer Genome Atlas and mapped the documented mutations along the U2AF2 structure. We can see that the recurrent mutations cluster in the globular domains. In particular they tend to fall along the RNA interface, which seems to suggest that they could alter its RNA binding. So if we look more closely at two of these mutations today, the two that we've looked at: we looked at a glycine-to-aspartate mutation near the 5-prime region of the RNA, and an asparagine-to-lysine mutation at a residue that actually interacts with the central nucleotide. If we introduce these mutations and do an RNA binding affinity analysis, we see that there are definitely changes in the binding of U2AF2 to the representative polypyrimidine tract. The lysine mutation, as one might expect from introducing a positive charge, increases binding affinity by five-fold, and the aspartate mutation decreases the binding affinity by 10-fold. One of these cancer-associated mutations is recurrent in AML, the N196K, which actually increases affinity, and the other one is in prostate cancer. So they do alter U2AF binding affinity. We're going to try to look at some RNA-seq data on this, so I hope I have more to tell you about that in the future. So altogether, how does U2AF2 recognize a 3-prime splice site? It does so via its integrated RRMs and alpha-helical linkers. And is it affected by disease-associated mutations? Yes, for the ones that we've looked at to date. And hopefully there will be more to come on this in cells soon. All right, so let's look at U2AF1 now. How does U2AF1 influence splice site recognition? And how is it affected by mutations that are common in MDS? This is a very technically difficult topic.
So we're going to start by looking first at its influence on U2AF2, which is a more easily studied protein. And I hope that we'll have some tantalizing glimpses of how it might function in MDS at the end. To date, we don't yet have a structure of full-length human U2AF1 recognizing this AG at the 3-prime splice site. What we do have is this core human heterodimer, which shows an atypical RNA recognition motif that actually doesn't bind RNA, bound to a short peptide from U2AF2. This structure entirely lacks the zinc knuckles that carry the MDS mutations. A few years ago, a structure was determined of a yeast U2AF1, which is shown here. This is the full-length fission yeast U2AF1 protein, with the residues affected in MDS shown here in yellow, and it's bound to a slightly longer fragment of U2AF2. So this is a very exciting structure to many in the field. It shows the zinc knuckles and how they're integrated on top of this atypical RNA recognition motif, with these atypical extensions here. And the MDS-mutated residues are exposed on the surface where they could interact with other molecules or RNA, but it still doesn't have RNA bound. So we still don't know how the AG is recognized here. It's been very challenging to obtain a U2AF1 RNA structure. Many are working on it, including us. So what we have done recently, which I think, hopefully, Aaron will appreciate, is we've been using single-molecule fluorescence in collaboration with Dmitri Ermolenko, who is our neighbor at the U of R. He traditionally works on ribosomes, but we've roped him into the splicing field. It's challenging to label U2AF1 directly because it has cysteines in its zinc knuckles. But what we had already been doing, and successfully used to study U2AF2, is to introduce labels in the distinct RNA recognition motifs. So we placed single cysteines in each of the RRMs and labeled these with a mixture of Cy3 and Cy5.
And that allows us to look at the inter-domain transitions, the FRET transitions, of these two domains in the protein. We've done this, and it was published in 2016, to look at the inter-domain conformational transitions of U2AF2 in the presence or absence of RNA. I'm just going to review this quickly, because next we will add in U2AF1, and we need to compare it with U2AF2 alone. In the absence of any RNA, we're tethering this protein via a relatively long tag; it has a lysine-rich linker and a histidine tag here. So this is tethering the protein to the slide. When we look at the U2AF2 protein in the absence of added RNA, it's a relatively diverse conformational ensemble. The RRMs are dynamic. And this agrees with the lack of inter-RRM contacts in the crystal structure. The direct contacts between the two RRMs are relatively few, only this handful of hydrogen bonds shown here. Instead, they're glued together by binding to the RNA. It also agrees with the relatively broad ensemble of conformations that is observed by small-angle X-ray scattering. We add RNA by two different strategies. This is the original strategy of immobilizing U2AF2 and adding the RNA in. You can see that it stabilized a particular conformation, a particular FRET state, at a FRET efficiency of about a half. And since in this particular immobilization scheme we're also viewing immobilized molecules that may not be bound to RNA, we compared a scheme where we immobilize the RNA and add the labeled U2AF2. We see that here we still have a broad distribution of inter-RRM arrangements, but we've selectively stabilized a particular inter-RRM state that corresponds to a FRET value of about a half. And this FRET value of a half agrees with the conformation that we're seeing in the crystal structure. This is the side-by-side conformation of the two RRMs on the RNA.
In contrast, there's an NMR structure in the apo state that's part of this ensemble, which is back-to-back, and that has a very high FRET efficiency. That may be this part over here, maybe, but it can only bind to RNA by a single RRM. So it's a little bit more detail than we need, probably. But all right, next we looked to see how adding U2AF1 would change the inter-RRM distribution of U2AF2. To accomplish this experiment, we immobilize U2AF1 via this relatively lengthy tag. This keeps it up off the surface, so we don't have any detectable surface interactions. We immobilize it via a histidine tag, which binds to a nickel-NTA reagent that has a biotin that is bound by neutravidin, and that's immobilized on the quartz slide. And then we flow in labeled U2AF2 and look at its inter-RRM conformations, to see what its FRET values are. Our results are shown in the next few slides. First, if you add labeled U2AF2 to immobilized U2AF1, with no RNA, it's very different from U2AF2 by itself. Look at this. There is this beautiful, stable peak at about 0.6 FRET. So it's a higher FRET efficiency, and it's stable. Now we add in RNA: first we bind U2AF2, and then we add in RNA. It's this prototypical AdML strong splice site RNA that includes the 3-prime splice site where U2AF1 will bind. It changes. The FRET value drops way down here to about 0.3. It's still a beautiful, stable conformation. I don't have it up here, but I'm happy to show later if anyone's interested, that these are very stable. There's not a lot of dynamics going on here. These are stable conformations that are stabilized by the presence of the U2AF1 subunit. So this is really acting on the inter-RRM conformation of U2AF2. And then for comparison, here I'm showing an overlay with the scheme where we know U2AF2 is bound to RNA. You can see here the major FRET state of U2AF2 alone is intermediate.
So it doesn't exactly correspond to either of these. As a structural biologist, I'm super excited to see what these are. Is this going to be the crystal structure? It's a much lower FRET state. And this one, could this be the back-to-back apo state? I don't really think so. I understand that FRET values cannot be directly related to distances due to the kappa factor, but still, this is much lower FRET than would be expected for the back-to-back conformation, which is expected to have almost 100% FRET efficiency. So we'll have to wait and see. We've got to get those high-resolution structures and see what's really going on here. But U2AF1 is definitely doing something to U2AF2. And this makes sense, because U2AF1 regulates a completely distinct subset of 3-prime splice sites that don't depend on the polypyrimidine tract, or that have a degenerate polypyrimidine tract. So it has to get U2AF2 to adopt a conformation that's compatible with this subset of specific splice sites. All right, so again, we don't really know how U2AF1 is recognizing the AG, but what we can do is use the MDS mutations as clues to see how it is recognizing the splice site junction. These mutations that recur in the myelodysplastic syndromes are located in the zinc knuckle domains. I'm showing one of them here, a structural homology model based on CWC24, which has 45% sequence identity. And the counterpart of the S34 residue actually interacts with the nucleotide. I'm not showing it here, but Q157 is very similar. This suggests that these residues are involved in RNA binding. RNA-seq of patient samples and cell lines sort of supports this idea. It's been found by a number of different groups in different publications that the minus 3 nucleotide of the affected splice sites differs in the presence of the S34F mutation of U2AF1 that's in MDS.
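As a rough guide to the FRET values quoted above (about 0.6, 0.5 and 0.3), the idealized Förster relation E = 1 / (1 + (r/R0)^6) links efficiency to donor-acceptor distance. The sketch below is only illustrative: the Förster radius `R0_ASSUMED` of 56 Å for a Cy3/Cy5 pair is an assumed, typical value that was not given in the talk, and, as noted in the talk, the kappa (orientation) factor means real FRET values cannot be converted directly into distances.

```python
def fret_efficiency(distance: float, r0: float) -> float:
    """Idealized Forster relation: E = 1 / (1 + (r / R0)^6)."""
    return 1.0 / (1.0 + (distance / r0) ** 6)


def apparent_distance(efficiency: float, r0: float) -> float:
    """Invert the Forster relation to get an apparent distance.
    Only meaningful for 0 < E < 1 and under idealized orientation averaging."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must be strictly between 0 and 1")
    return r0 * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


R0_ASSUMED = 56.0  # angstroms; assumed Forster radius, not from the talk

# FRET efficiencies mentioned in the talk.
for e in (0.3, 0.5, 0.6):
    print(f"E = {e:.1f} -> apparent distance {apparent_distance(e, R0_ASSUMED):.1f} A")
```

Because of the sixth-power dependence, an efficiency of 0.5 corresponds to a separation equal to R0, and lower efficiencies correspond to larger apparent separations.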
With the S34F mutation present, typically the exons that are included, that have increased splicing, are preceded by a minus 3 C. And those that are skipped have a minus 3 U, or T here in the genomic DNA sequence. This would suggest that perhaps the S34F residue is contacting and recognizing that site in a different way than the wild type serine. So to test that, we again used our purified proteins. I'm making it look easy here, but these are actually really hard to make, so we're one of the few groups in the field who are actually making these. We have our U2AF1, which expresses best as an MBP fusion, and then we cleave that off and do gel filtration and mix, yeah. We found that we got the most reproducible results and the highest signal to noise if we use the ternary complex for our fluorescence anisotropy assays. So here we're working with an entire ternary complex, not U2AF1 alone. We use this purified protein complex for RNA binding affinities, and we use fluorescence anisotropy. We're sort of limited in the biophysical techniques that we can use by the amounts of protein. Of course, we're not really able to use isothermal titration calorimetry. We just don't have that amount of material, but we can use our complexes for fluorescence anisotropy with a fluorescein-labeled RNA splice site. I do want to point out that we don't have any differences in fluorescence emission intensity, which could lead to small changes in your apparent polarization; these are the same before and after the titration. So to date, at this point, we've done 19 different sites. And I don't know. It's satisfying and unsatisfying in some ways. In general, we see that the affinity trends do agree with the splicing in all cases, except that MED15 was sort of an outlier here. It had a difference in the affinity that doesn't quite agree with the splicing.
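For readers unfamiliar with how an apparent affinity is read off a fluorescence anisotropy titration like the ones described above, a minimal sketch of a simple one-site binding isotherm follows. This is not the analysis used in the talk; it assumes the protein is in large excess over the labeled RNA and that fluorescence intensity does not change on binding, and all names and numbers (`fraction_bound`, `anisotropy`, the Kd and anisotropy values) are hypothetical.

```python
def fraction_bound(protein_conc: float, kd: float) -> float:
    """Fraction of labeled RNA bound for simple one-site binding,
    assuming protein is in large excess over RNA."""
    return protein_conc / (kd + protein_conc)


def anisotropy(protein_conc: float, kd: float, r_free: float, r_bound: float) -> float:
    """Observed anisotropy as a weighted average of free and bound RNA,
    valid when emission intensity is unchanged by binding."""
    return r_free + (r_bound - r_free) * fraction_bound(protein_conc, kd)


# Hypothetical values for illustration only.
KD_NM = 100.0
R_FREE, R_BOUND = 0.05, 0.15

for conc in (0.0, 10.0, 100.0, 1000.0):
    print(f"[protein] = {conc:7.1f} nM -> r = {anisotropy(conc, KD_NM, R_FREE, R_BOUND):.3f}")
```

In this model the anisotropy reaches the midpoint between the free and bound values when the protein concentration equals the apparent Kd, so a two-fold change in affinity shifts the curve by only a factor of two along the concentration axis.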
In many cases, the S34F mutation will decrease binding to a minus 3 U, but not always. And in some cases, it will increase binding of this U2AF1-containing ternary protein complex to a minus 3 C. So we get these trends. They match the splicing data, but they're very subtle. It's only a two to seven-fold change. So this is OK, but I think there has to be more to the story. I don't know if it's going to be underlying differences in the binding kinetics, because these are equilibrium apparent RNA binding affinities. I don't know if it's roles for other splicing factors that are present in cells, that we just don't have in hand and we don't know what they are, like Prp5, which Aaron is working on. But I'm sure there's going to be more to this. I don't want to leave you here saying, oh, well, she measured RNA binding affinities, and the biophysics says it's just a difference in binding to that site. No, no, no. We have these subtle differences, and they do agree with the splicing data, but they're very small changes. I mean, right, this is like a two-fold change. That's not even a hydrogen bond. So there's a lot more to come here. All right, so what do we know about U2AF1? U2AF1 influencing splice site recognition via U2AF2 does appear to be one mechanism of structural action by U2AF1. And the myelodysplasia, MDS-related U2AF1 mutations, particularly the S34F that we've looked at in detail, do affect RNA binding to affected splice sites, but in a very subtle way. All right, so that's most of what I have to say today. I just wanted to also point out that for all the other MDS-associated mutations in splicing factors, the RNA interface does seem to be a recurring theme. Maybe this is related to the action of RNA helicases that have to dissociate these proteins: you're going to somehow alter the RNA recognition, and that's going to have downstream and other effects. The SF3B1 hotspots map to the pre-messenger RNA.
The other mutation of U2AF1 interacts with RNA based on homology with MBNL1. We don't have a structure. And SRSF2, the proline that's involved also contacts RNA. So it seems to be a recurring theme that at least these mutations do mark out important interfaces, just like maybe in a model system in a lab where you're doing mutagenesis. Well, this is sort of like biology doing an experiment, and biology has shown us that these are the important interfaces of these splicing factors, and now it's up to us to figure out what they are doing. So much more to come. And that's the end of my story, my never-ending story of splicing factors.
So I just want to thank Dmitri Ermolenko for doing the single-molecule FRET with Chandani Warnasooriya. And Rakesh Chatrikhi, who's now a postdoc at UPenn, did the U2AF1 studies. He's a trooper. He made all those proteins. And Anant Agrawal did the U2AF2 work. In particular, he came up with the idea of the D231V mutation. These are really great young scientists out there. And thank you for your attention.
Thank you, Clara, for a fascinating talk. Questions? Heidi.
Yeah, I was curious about the pattern of U2AF2 mutations that you showed, because you said there were a couple of hotspots pertaining to AML and MDS. But the rest of the mutations would be consistent with a loss of function. So was that in just particular cancers?
Yeah, so we simply surveyed the Cancer Genome Atlas, which is not, so these are actually from a variety of different cancers. It just happened that the asparagine 196K mutation was only observed in leukemias. The other ones are in a variety of cancers. So they don't seem to be in any particular cancer. The glycine mutation is in prostate and colon cancer. So, and it recurs in only those types of cancer. So I don't know if, they're not that common. So I don't think we have enough data to draw. Well, yeah, heterozygous deletion, I'm not sure. I know if you knock it down, cells do survive.
People do knock down U2AF2, but there's still some remaining. So I would say probably they would be. It's certainly embryonic lethal in zebrafish. You can't do it. You can't make a knockout of U2AF2. But yeah, I think, yeah.
That data was really striking, going from this distribution to these really beautiful. And it made me wonder, like, how many other things could influence that conformation? Right. So like SF1, for example, is there an indication that it does that to U2AF, or vice versa? I'm just wondering if you.
I was thinking, when you said other things could influence, I was thinking, is it Chandani's hands? Just, it's such really beautiful data recently. And the other was kind of like a mess. So it looks like, but yeah. Actually Chandani wants to do a lot of controls with. But for, yeah, I wonder if. So in this construct that we're using, the C-terminal domain of U2AF, we're really technically limited, right? The C-terminal domain of U2AF2 has five cysteines. So in the construct that we're using, it has a C-terminus that corresponds to the crystal structure. It does not have the interaction domain for SF1 in it. So I agree, but it's harder to look at.
Do you think that U2AF could also influence SF1 structure?
Well, SF1 is just one, it's like a little rock. It's like a, it's not like, U2AF is a multi-domain dynamic protein and SF1 is the KH domain, this is what we know. And it also has a zinc knuckle that nobody pays any attention to. So I don't think a lot of dynamics are going to be happening with SF1. Yeah, SF3B, maybe, that would be interesting, yeah.
You may have covered this and I might have missed it, but I'm a little confused by, splicing is a whole-genome sort of process. So presumably all the transcripts in the cells will possibly be affected by mutations in the splicing factors. So then you kind of get this very specific disease that occurs when you have those mutations. So I'm kind of wondering, like, what is it that causes such a specific disease phenotype?
I think this is the whole field, and as a structural biologist, I'm not equipped to address it with experiments. However, that's what the whole field is wondering. Why do we have these splicing factor mutations that are, they're actually deleterious to cell proliferation? Yet somehow they're leading to leukemia down the line, and they're early stage, and they don't necessarily persist. So if there's a loss of the splicing factor mutation, you still have MDS. It's not required to maintain MDS. So I think that's really the holy grail, is why does one generate the other? And so it may be RNA splicing, but it's only a handful of mis-spliced transcripts, and there's not a lot of overlap between, say, an SF3B1 mutation and a U2AF1 mutation. I mean, I guess we were kind of expecting that there would be like just a few transcripts that are mis-spliced in all of them that would be related to MDS, and that would be those transcripts that are causing the disease, but there doesn't seem to be any core subset that's really related to it.
So I'm actually wondering, maybe Erin could say, if it might be, these aren't just splicing factors, they're also involved in transcription. SF3B1 is chromatin associated, U2AF2 paralogs are involved also in transcription. So, and the U2AF1 mutations have been shown to affect alternative polyadenylation, which is hard to look at by RNA-seq, but it has been done by the green group. So I'm wondering actually if there might be some other pathway that is being really messed up, like DNA repair. But I think that's something for future work by maybe a more multidisciplinary group of investigators, because I think everyone is very splicing-centric right now. The splicing people like me are pretty happy to have our proteins have these mutations that are so disease-relevant, but maybe it's not just through splicing that this is really happening.
Is there any idea about whether the R-loops are forming at the end?
So, what made you think of that?
So, one would think, if you have any broken spot on the RNA, that's going to sort of give it the ability to thread back into the DNA and then get back to the DNA, and that's like for DNA damage and stuff like that.
So, Tim Graubert at Mass General Hospital has done some work on that, and I believe he did see that there was an increase in R-loops. I don't believe he's published that yet, but I think, yeah, that's a great question, and I believe the answer is probably yes.
So, you're splicing?
Do you mean recursive splicing? So, I don't know. So, I thought it might be, actually, because it's just technical difficulties again. So, DEK, which is one of the major, it's an oncogene, and it's one of the major affected U2AF1 sites, and we've done the most work on that. But we wanted to make a mini-gene. The stupid thing has got a 70,000 KB intron. So, I thought, you know, maybe this has to be recursively spliced, so we started to try to look at that, but we didn't get very far, it was sort of a tangent. But yeah, as I was thinking, it just happened that this very affected gene has this huge intron, and so I was like, okay, maybe it's recursively spliced. How in the world are these splice sites finding each other? But yeah, I don't think anyone has looked at that rigorously, but that's a good point.
I guess I tend to process well. I have these questions in my answer, follow-up question. So, it seems like the thing to do would be to look at the RNAs in cells with and without these mutations and see what kind of changes there are in the pattern of the splice forms. Which many groups have done, right?
Yes, yes, and that's where the sequence logos that I showed up here, with the minus three C, A, G, came from. That was from RNA-seq analysis, and those were from patient samples, cell lines, both lung cancer cell lines and leukemias. So yes, definitely. So...
There's some big differences.
Small differences, small cells, subtle, small.
And the sequence logos, they agree with our RNA-binding analyses. But I guess the question is, is that enough? And how does that actually lead to uncontrolled, just really defective marrow cells, right? So, yeah.
That's right. And the structural model, it makes perfect sense, actually, because the Q157 mutation, which is in the C-terminal zinc knuckle of U2AF1, has a sequence logo of changed splice sites at the plus one position of that splice site. And that makes sense, because you have the S34F residue in a structural model with RNA. The S34F residue fits perfectly at the minus three position. And then the Q157, this falls at the plus one position. So if you just simply take known structures and you look at them, it does make sense, especially using the newer CWC24 structure. Don't use the TIS. Don't use the NMR TIS11d, because I think that the RNA is backwards. But if you use all the newer structures, then it does make perfect sense, yeah.
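To put the two- to seven-fold affinity changes from the talk on an energy scale, and to make the "not even a hydrogen bond" remark concrete, here is a minimal sketch that converts a fold change in an equilibrium binding constant into a free energy difference using ΔΔG = RT ln(fold). The function name and the choice of 298.15 K are assumptions for illustration. The comparison value for a hydrogen bond is also an assumption: figures on the order of one to a few kcal/mol are commonly quoted, but that number does not come from the talk.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def ddg_from_fold_change(fold, temp_k=298.15):
    """Free energy difference (kcal/mol) for a fold change in affinity.

    Uses ddG = R * T * ln(fold). temp_k defaults to 25 C, an assumed
    temperature rather than the one used in the experiments.
    """
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL_PER_MOL_K * temp_k * math.log(fold)


if __name__ == "__main__":
    for fold in (2.0, 7.0):
        ddg = ddg_from_fold_change(fold)
        print(f"{fold:.0f}-fold change -> {ddg:.2f} kcal/mol")
    # At 25 C this gives roughly 0.41 kcal/mol for 2-fold
    # and roughly 1.15 kcal/mol for 7-fold.
```

On this scale a two-fold change is a few tenths of a kcal/mol, which is consistent with the speaker's point that the measured differences are real but energetically very small.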