And it's my great pleasure to welcome Scooter Morris today. Scooter will talk about genetic interactions, networks, and structure. Just before he starts, I'll give a brief background. John, or Scooter, Morris is the executive director of the Resource for Biocomputing, Visualization, and Informatics (RBVI) at the University of California, San Francisco. Before finding his home in academia, he was a distinguished systems architect at Genentech, where for 19 years he participated in the joys and trials of life in industry. And some of us will remember those joys and trials. Before that, he received his PhD in medical information science from UCSF in 1990, and he has bachelor's degrees in physics, biology, and computer science from UC Irvine. Irvine, is it? Irvine? Irvine, sorry. Scooter, importantly, is a member of the Cytoscape core development team. He is also the author of several Cytoscape plug-ins and core features. And in his spare time, he's also vice president for conferences of the ACM Special Interest Group on Computer-Human Interaction. That's a bit of a mouthful, but I made my best choice. He also tells me that he is known to occasionally jump off perfectly good boats near Alcatraz Island for a brisk swim across San Francisco Bay back to shore. So maybe we can ask him a bit more about that later. But also, I want to say that Scooter's been here with us all week. He's been helping to teach a network analysis course, and it's really been great to have you here. I'd like to take this opportunity to thank you again for everything you've contributed. It was great fun. We all learned a lot, I think. And I think everyone really enjoyed the interactions and your work. Yeah, and so some people are actually here, so that's great. So thank you again. And on behalf of the organizers, thank you for accepting this invitation. Thank you. Thank you very much. But we're going to have some fun.
So this is a talk that I developed under a lot of stress over the course of the last several weeks, leading up to our NIH site visit for a grant. So now I get to give it without quite so much stress. Here's the outline of the talk. Basically, what I'm going to do is I'm going to give just a brief introduction. And then I'm going to talk about some experimental and computational techniques to show you how we use those techniques in going through a hypothetical scenario. And finally, I'll talk a little bit about the tools that you've seen and how it all glues together if we have time. So a little bit about the RBVI. I'm not expecting you to read that. So I just highlighted the key phrases. We build interactive software tools. So if you're expecting me to talk about the detailed technology and experimental techniques around genetic interactions, wrong seminar. Hopefully we're going to show you some genetic interaction data and how we're doing it, how we're visualizing it. But I'm not in the position to talk too much about the experimental technique. Couple of the tools that we do, we focus on atomic level interaction, molecular visualization. But we also do networks. So what I'm really here to talk about is this combination of structural biology and systems biology. And the idea is, well, first I should say, this is a note to myself. I'm going to, if you're a structural biologist, I apologize in advance for what I'm about to say. If you're a systems biologist, I apologize in advance for what I'm about to say because I'm going to make some gross generalizations here and you may say, well, I don't fit that category. It's okay. So let's think about structural biology. If you think about structural biology, in general, it's low throughput. It takes a long time to get that crystal. And then once you've got the crystal, it takes a long time to solve the structure. It's generally focused on a single protein, complex or reaction. You really focus on the details. 
The function, what you're trying to do is you're trying to look at structure and function at the residue level, at the chemical level. It's really critical that you understand that very low level interaction. Even if you're looking at high resolution cryo-EM, some of the new technologies, you still want to be able to know what's going on at that resolution. It's sort of a bottom-up view of function. Contrast that with systems biology, which is very high throughput. Think of the excitement that we all saw when we saw those publications of those very first interactomes, which we now call hairballs. They're completely meaningless. Nobody could publish one now, but we were really excited when those first came out. It's focused on the systems level, entire pathways, entire interactions within the cell, and more recently, cell-cell interactions. And it's got sort of a top-down view of function. We're looking at the broad brush. Network biology is where we start using networks to understand function. And it can integrate both the structural biology and the systems views, as I'm going to show, hopefully soon. It enables the use of some great analytical techniques using graph theory, and it provides a lot of new visualization opportunities that allow us to bring some of these views together. Okay. So the first thing that I want to do is talk about some experimental techniques. Now, you probably know more about some of these techniques than I do, but it's important that we all be on the same page before I start going into the demo. So, protein-protein interactions. All right, before I bore everybody, how many of you know what protein-protein interactions are, have used STRING or something like that? All right, skip this slide, great. I love it. Fundamental point: there are a bunch of public repositories, really useful for bringing in data and for mapping onto data. Great. Let's talk about genetic interactions. How many of you have done genetic interactions?
Okay, so only four of you can go to sleep. The rest, I expect you to stay awake. If you don't know what genetic interactions are, it's a very simple process conceptually. In this example, we're using yeast, and you have some sort of metric of fitness. In yeast, it's pretty easy, you just use growth, okay? You look at the growth rate of the yeast colony and that gives you a measure of fitness. So the goal for genetic interactions is: I knock out a gene and I measure the growth. Then I knock out another gene and I measure the growth. And then I knock out both genes and measure the growth. Now, if these genes are independent, you would expect that the growth would be the product of the two growth rates. So if I knock out gene A, it grows half as well. If I knock out gene B, it grows half as well. When I knock out both A and B, I would expect that it's gonna grow a quarter as well, right? But often it doesn't. And what that indicates is that there's some sort of functional interaction between those two genes. So either, in this example right here, they're part of the same linear pathway. If I knock out gene A or I knock out gene B, I've just knocked out the pathway, right? There's no difference if I knock them both out. The pathway's still knocked out, okay? If, on the other hand, these are two independent paths that trigger a downstream event, then with just gene A knocked out, I can still take the alternative path. If I knock out gene B, I can take the alternative path. If I knock them both out, I've now knocked out the pathway. So, very simple. Generally this is done on a very, very broad scale, high throughput, and what you wind up with is a heat map, okay? You cluster this and you come up with a signature of some form.
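As a minimal sketch of the multiplicative expectation just described: the function names are illustrative, and scoring the interaction as observed minus expected growth is a common convention assumed here, not something stated in the talk.

```python
def expected_double_fitness(w_a: float, w_b: float) -> float:
    """Growth expected for the double knockout if the two genes are independent."""
    return w_a * w_b


def interaction_score(w_a: float, w_b: float, w_ab: float) -> float:
    """Observed minus expected double-knockout growth (assumed convention).

    Positive: the double mutant grows better than expected
              (for example, both genes sit in the same linear pathway).
    Negative: it grows worse than expected
              (for example, two parallel paths to the same downstream event).
    """
    return w_ab - expected_double_fitness(w_a, w_b)


# Independent genes: each knockout halves growth, the double grows a quarter as well.
print(interaction_score(0.5, 0.5, 0.25))  # 0.0

# Same linear pathway: the double knockout is no worse than either single knockout.
print(interaction_score(0.5, 0.5, 0.5))   # 0.25

# Two parallel paths: each single knockout is tolerated, the double is not.
print(interaction_score(1.0, 1.0, 0.0))   # -1.0
```

All growth values are relative to wild type (1.0) and are made up for illustration.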
In this case, what I've done is a little highlight right there, and this is the prefoldin complex, and you can see that there's a really strong positive genetic interaction, okay? If I knock out any one of those, I've knocked out prefoldin. So it doesn't matter which one I've knocked out, okay? Makes perfect sense. Great. But we wanna take that a step further. Remember, we're talking about structure here. What happens if, instead of two genes, I'm looking at a cross between a point mutation and a gene? Same basic idea. Usually this is done in fairly large structures. So you have a point mutation, you measure the growth rate; you have a knockout, you measure the growth rate; and then you do a cross. Same thing, you do both point mutation and knockout. And what you get is something that looks like this. In this case, these are all of our knocked-out genes up at the top, and then down here, we have all of our point mutations. And you can see there are some strong patterns where we still have strong genetic interactions, okay? Now, as I look at this, and when they first brought this to me, you can look at that and you can see, well, it's kinda hard to tell the structural significance of what we're looking at here, all right? So hold that thought. Okay, next technology: residue interaction networks, okay? Real simple stuff. If you think about a protein or any other macromolecule, you can conceptualize that as a network, where the residues are the nodes and the edges are the interactions between those residues. Now, what is an interaction in this context? We can define what it is depending on what we wanna study. So it can be bonds, either covalent or hydrogen bonds, it can be contacts, it can be clashes. We can look at any kind of interaction that we can plot on a network, right? The nice thing about that is if I have two network representations, I can align those networks and look at the differences pretty easily.
Let's see, I'm sure a lot of you have tried to do super alignments of structures and determine exactly where the spread is, where the differences are. We're gonna see that in just a second; actually, in the demo, we're gonna do some alignments. If you turn on an all-atom view, it's really hard to tell where your clashes are, where your contacts are. If you turn off all atoms and you use a cartoon diagram, it's really hard to tell where your clashes are again, because you turned everything off. So there are a number of techniques for looking at that, but it's hard to visualize. If we render these as networks, it becomes much easier to take a look at the differences between two residue interaction networks. And we're gonna do that in just a little bit. We can also use graph theoretical approaches, which is really nice. So I can look for centralities in these networks, I can look for hubs, I can look for other interesting aspects when I'm looking at this from a network perspective. And then finally, this provides opportunities for integration. If I have a network view, then maybe what I could do is bring in other kinds of networks. A network is a network; I can mix networks of proteins, maybe, with networks of individual proteins. What might that allow me to do? What kinds of new visualizations, what kinds of new analytical techniques can I come up with? All right, this is what this looks like. In this case, what I've done is I've taken this structure, it's a phosphotriesterase, and I've built a residue interaction network. I colored the residue interaction network to be the same, and you can see the pattern: you come up with a nice representation that looks a lot like the protein. And if I could link these two views, then I can sort of travel, as it were, through the network and see where I am on the protein structure, okay? Does that all make sense? Pretty simple, pretty easy, we can go through and figure all that out.
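To make the "residues as nodes, interactions as edges" idea concrete, here is a small sketch. The residue labels and interaction list are hypothetical, and counting neighbors (degree) is only the simplest of the centrality measures alluded to above.

```python
from collections import defaultdict

# Hypothetical residue-residue interactions: (residue, residue, interaction type).
interactions = [
    ("LEU10", "ALA14", "contact"),
    ("LEU10", "GLU52", "hbond"),
    ("LEU10", "PHE60", "contact"),
    ("ALA14", "PHE60", "contact"),
    ("GLU52", "LYS75", "hbond"),
]


def build_rin(edges):
    """Build an undirected residue interaction network as an adjacency map."""
    adjacency = defaultdict(set)
    for res_a, res_b, _kind in edges:
        adjacency[res_a].add(res_b)
        adjacency[res_b].add(res_a)
    return dict(adjacency)


def hubs(adjacency, min_degree):
    """Return residues with at least min_degree interaction partners."""
    return sorted(res for res, nbrs in adjacency.items() if len(nbrs) >= min_degree)


rin = build_rin(interactions)
print(hubs(rin, 3))  # ['LEU10']
```

Which interactions count as edges (bonds, contacts, clashes) is a choice made when the network is built; the sketch simply takes that list as given.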
All right, of all the four techniques I'm gonna talk about, this one's the stretch, at least it is for me. Let's see, do we have any crystallographers out here? Crystallographer as in make the crystal, throw the X-ray, do the, yeah, okay. So if you look at a density map that comes out, generally the process of refinement is to go through and find the best-fit side chain locations. And that's what becomes the structure. And everybody kind of looks at these crystal structures as the answer; they're the gold standard for structures. But as I've now learned working with my collaborators, it turns out that when you do that, you don't explain all of the density. So you kind of have these unexplained areas of density in the crystal structure. So some collaborators at UCSF and Stanford decided to develop this system called qFit, which goes through and looks for alternative side chains that can explain some of that unexplained density. You can imagine you're building a crystal structure and everything's pretty static, but still you can catch that protein with some of these alternative side chain configurations. All right, and that's represented in the density map. So what they do is they go through and they extract that heterogeneity and they add extra side chains, extra locations, to the structure. This is what that looks like. If you look in here, what you see is there are several different positions proposed for that side chain, right? The PDB has always had the capability of having alternative locations for atoms, and they're just taking advantage of that in the PDB so that you can go through and look at the various side chains, all right? Good, so now, when this is saved, you have a structure that has alternative conformations. So what do we do with that? And that's where contact networks come in.
What the contact network allows you to do is, if you imagine I have all these side chains, I'm gonna start at one side chain and I'm gonna look at all of the alternative locations for that side chain. And then I'm gonna see if any of those alternative locations clash with alternative locations of its neighbors, right? So I'm moving this around. If it clashes, great, I'm gonna add that to my network. And now I'm gonna start moving the alternative locations for this next residue around and find where it clashes. And this way, I build out a network of these clashes. And it looks something like this. You can see in this example that you've got this whole area in red that's represented by this network that the clashes will walk their way through. Now, if you think about this, it's really useful. You could look potentially for allosteric interactions, right? It's a big deal figuring out where we have allostery. I could look at the potential for mutation. If I have a perturbation anywhere in this structure and I disrupt one of these networks, it might have an impact on the flexibility of the protein. All right, so those are the technologies. And what I'm gonna show you now is a scenario. This is where we get to have some fun. We're gonna explore together the nucleosome. Specifically, we're gonna look at histones H3 and H4. And we're gonna do this by using a whole set of tools and the technologies that we've just gone through, and show basically a scenario that we could walk through a step at a time to dive in and put all of these together using both structure and network visualization. Before I do that, I'm gonna hold off just a second and find out if anybody has any questions about any of the technologies before I show them all to you in tools. Questions out there? All make sense? "I don't know whether it makes sense yet, but show me the tools and let's figure it out." Okay, that's fine too.
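The contact-network construction described earlier (start at one residue, test its alternative locations against those of its neighbors, add whatever clashes, then repeat from the newly added residue) can be sketched as a breadth-first walk. Everything here is a toy assumption: each alternative location is reduced to a single point, the neighbor lists are given, and one fixed distance threshold stands in for whatever overlap criterion the actual method uses.

```python
from collections import deque
from itertools import product
from math import dist

# Hypothetical toy data: residue -> alternative side-chain locations,
# each reduced to one representative (x, y, z) point.
alt_locs = {
    "A": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "B": [(2.5, 0.0, 0.0), (4.0, 0.0, 0.0)],
    "C": [(5.5, 0.0, 0.0)],
    "D": [(9.0, 0.0, 0.0)],
}

# Hypothetical spatial neighbors worth testing for each residue.
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

CLASH_DISTANCE = 2.0  # assumed threshold, for illustration only


def clashes(res_a, res_b):
    """True if any alternative location of res_a is too close to any of res_b."""
    return any(
        dist(p, q) < CLASH_DISTANCE
        for p, q in product(alt_locs[res_a], alt_locs[res_b])
    )


def contact_network(start):
    """Walk outward from start, following clashes between alternative locations."""
    network = {start}
    queue = deque([start])
    while queue:
        residue = queue.popleft()
        for nbr in neighbors[residue]:
            if nbr not in network and clashes(residue, nbr):
                network.add(nbr)
                queue.append(nbr)
    return network


print(sorted(contact_network("A")))  # ['A', 'B', 'C']
```

Residue D stays out of the network because none of its locations comes within the threshold of C, which is the sense in which the clashes "walk their way through" only part of the structure.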
All right, so the first step is I'm interested in the nucleosome, specifically H3 and H4. What's the first step in this? Well, gee, there's a bunch of known information that we can get using high-throughput interactions, right? We have these big repositories, the protein-protein interaction databases. Why don't we start there? Let's just figure out if we can probe that. So, a little bit about the tricks that I'm playing. You may notice my little slide program here doesn't look like PowerPoint. In fact, this slide program is a molecular visualization program. And if I sort of slide this down and it shrinks, okay, we have some things that let me clean up my screen here. We were having some fun technical time earlier. Great. Okay, so what you can see here is two programs that are running in the background. On the left, we have the structure of the nucleosome. This is the yeast nucleosome running in ChimeraX. We'll talk a little bit more about ChimeraX. On the right, those of you who are in the class should recognize Cytoscape, right? Two tools running, and then literally this browser that has all of my slides is actually the help viewer for ChimeraX. But what's interesting is it has links in it that are commands for ChimeraX. So if I click this command, it's gonna basically tell ChimeraX to use Cytoscape to go query the STRING database and find, in this case, 30 partners that are known to interact with the nucleosome, okay? So if I click that, there we go. So this was a query. We went off to STRING and we queried, using the PDB ID in this case, and went off and pulled them down. Now, I spent some time with this network previously to make it a little easier. And what you see here, these are the two genes that encode H3 and these are the two genes that encode H4. I color coded them the same as what we have over here in ChimeraX. And now we can begin doing some homework. We could begin browsing through this.
We could potentially look for structural information in STRING. There are a number of things that we could walk through to begin to investigate what's known about interactions with H3 and H4, all right? Great. So that's the first step. We are not going to take time to dive much deeper into that, although we could. But there are a number of proteins in here. There are silencing proteins. All of the ones that you see make total sense if you spend the time to look at it. I took a bunch of these and spent time cozying up to yeastgenome.org and confirmed it. Yeah, this all makes sense to me. Great. Back to the slides. So that's great. Now we know, at the protein level, oh, it went bigger. Yeah. There's a little fun here with this that I can show you later. Is that better? Great. Okay, so now with STRING, very quickly, we've gone out. We pulled down some data that's publicly available and we said, now we know a bunch of proteins that are known to interact with H3 and H4. Great. But I'm a structural biologist. I want to know what's interacting at the residue level. How do I find out where those interactions are at the residue level? And that's where these genetic interactions come in. So the second step is we're gonna use genetic interaction information. This is unpublished data, so I'm not going to go into too many details. But the first thing in this is we're gonna bring up a clustered heat map, okay? So just like the others, just like we saw before, this is a pE-MAP. And you can see this is the data that's loaded into Cytoscape. I brought this up from the existing Cytoscape data. Great, I can look at this, I can figure out what's interacting with what, but it's still not very satisfying. What I really wanna see is these interactions in the context of my structure. So where is this, all right? So let's do that. There's a network running right here. Go ahead and do that.
Again, all of these links that I'm clicking are causing commands to happen in here. And what you see is, if you remember that heat map that I brought up, here is the dendrogram at the top. And if we zoom in, there are the actual genes, okay? At the bottom, there was that weird residue interaction network. So we'll zoom back out. This is actually a residue interaction network of the nucleosome. So we're doing a residue interaction network and taking advantage of this pE-MAP, and the edges here represent the genetic interactions. Great, so what? I have this really difficult thing to look at and I still haven't solved my problem. My problem is I wanna see where we are in the structure. Well, let's zoom in. Sort of gonna take a random click. And now what you just saw when I clicked on the gene is not only did I get my heat map for that specific gene, but over here in ChimeraX, we actually annotated the structure by showing the side chains as spheres and coloring those spheres by the level of genetic interaction. So this gives us a really unique way to begin pulling both of these pieces of data together. We can use the genetic interaction data to begin probing exactly what the implications are of this particular interaction, up or down. In fact, I've taken this one step further and I've gone through and defined all the well-known complexes that are known to interact with this, and we can select a complex and you can see the patterns that we get on this structure of both the positive and negative genetic interactions. You can begin to probe more completely what you might be interested in, from both the structural standpoint and its interactions with other proteins, other genes. So it's great. Now I can take this information, I can couple it with my protein-protein interaction information, and begin probing in some detail what we have and why it might be happening.
So what I'd like to do now is I'm gonna look at one in particular. We just zoomed in on a particular side chain. Yup, I think Adobe Connect is trying to catch up. Okay, so let's take a look at this particular side chain. Now this has a very strong genetic interaction. If we look at this, what you'll see is it interacts with a large number of genes. All right, what's interesting about it is that it's down in the middle of an alpha helix buried inside of the structure. So if we look at this in more detail what I'm gonna do is I'm gonna open a modeled structure and let me zoom back out. Everything's kind of slowing down on me here. So what I did is this particular, if you look over here, you see the heat map for this point mutation. That's a lot of interaction. That's a lot of interaction. This point mutation has a huge impact. In fact, what I'll also tell you is that it makes the cell very sensitive to heat. So, but if we look at this and actually model it, we can, it doesn't have a huge change. The modeled structure really doesn't show much. What I wanna do, I'm gonna go and show you where it is. Adobe Connect and I are not having fun. This is very slowly zooming out, something that usually happens very quickly. While it's zooming out, let me tell you what you're gonna see when it finally zooms out. At the end here, what I've done is I've turned on spheres. Okay, and when we zoom all the way out, what you're gonna see is that this mutation is buried inside the structure. So, you have this change which is buried inside the structure. It cannot, because of its location, have any direct binding. So, this can't be a site that has any PTMs. So, you begin wondering what could be happening. What could be going on here that could be impacting this? And Adobe Connect is not gonna let me do this. So, let's see if I have, I don't know why it's, ugh, ugh, let me zoom back in a little bit, slide this out. 
All right, so the little green outlines that you can barely see there in this structure, where we can kinda peek through, that's where the mutation is. This is buried, but yet it has this huge interaction. So, what's going on? How can we begin probing this? And this is where we can take advantage of residue interaction networks. So, what you're looking at here is I've created a residue interaction network of the mutated structure. This is modeled using the Backrub server. And then I took the native residue interaction network and I've compared them. Pretty simple, I can just do a sequence alignment to figure out where the residues are, right? I already know where the structures are. I can get that from my superposition in ChimeraX. And now I can compare them. So, the purple one is the mutation. You can see right there we've gone from a leucine to an alanine, great. If you see a red line, that means that we've modified the contacts such that we now have a contact that is in the second network that isn't in the first network. And if we look at the blue lines, that's a contact that's in the first network that isn't in the second network. And what you'll see if you zoom out is that modifying that alpha helix has the impact of a very far-reaching change in terms of the residue-residue contacts. There's no way that we could have spent the time to look through this in detail at the structural level. Well, we could, people do. And there are tools that can help. We can look at the contacts and things like that. But this gives us a really quick tool to begin probing on that. Now, caveats, okay? This was a modeled structure using Backrub. It uses Rosetta, this is state-of-the-art, good stuff, but who knows, Backrub could have given us a slightly off model, okay? It could be that the genetic interactions really have more to do with the instability this adds.
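The red-line and blue-line comparison described above amounts to a set difference on edges once the residues of the two networks have been matched up. A minimal sketch, with made-up residue labels and a hand-written residue map standing in for the sequence alignment:

```python
def compare_rins(native_edges, mutant_edges, residue_map):
    """Compare two residue interaction networks edge by edge.

    residue_map translates mutant residue labels to native labels
    (assumed to come from an alignment); unmapped labels are used as-is.
    Returns (gained, lost): contacts only in the mutant network ("red")
    and contacts only in the native network ("blue").
    """
    native = {frozenset(edge) for edge in native_edges}
    mutant = {
        frozenset(residue_map.get(res, res) for res in edge)
        for edge in mutant_edges
    }
    return mutant - native, native - mutant


# Hypothetical contacts; position numbers are invented for illustration.
native_edges = [("LEU5", "ILE8"), ("LEU5", "PHE20"), ("ILE8", "ALA30")]
mutant_edges = [("ALA5", "ILE8"), ("ILE8", "ALA30"), ("ILE8", "PHE20")]
residue_map = {"ALA5": "LEU5"}  # the mutated position maps back to the native residue

gained, lost = compare_rins(native_edges, mutant_edges, residue_map)
print(sorted(sorted(edge) for edge in gained))  # [['ILE8', 'PHE20']]
print(sorted(sorted(edge) for edge in lost))    # [['LEU5', 'PHE20']]
```

Edges are stored as unordered pairs so that a contact listed as (A, B) in one network matches (B, A) in the other.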
But as you look at this, even as far as we go, this gives us some pretty compelling evidence to begin looking more. Now, what I can tell you is I went back after looking at this. I looked at that dendrogram, okay? And I found all of the residues, the mutated residues that have the same phenotypic signature, okay? And it turns out that they are all located right up on this alpha helix. So, there's three residues located right up, walking our way up this alpha helix. So, clearly, there's something that's going on that destabilizes or changes that structure enough to have this major interaction. Okay. Whoa, I docked it. One of the unfortunate things about having your browser be also part of your molecular visualization program is occasionally it decides to wanna go home and just docks right in. Okay, so now what we've done is we have these really long range effects, okay? We can see them in the residue interaction network. The immediate next question is, how can we explore these in more detail? And that's where we get to contact networks, okay? So, we can take advantage of our contact networks to build up a contact network and go through that. Now, unfortunately, and I spent the time looking, there aren't any really high resolution H3 and H4 structures. And in order to take advantage of contact networks, you need something around two angstroms. This is a 3.2 angstrom structure. So, what we're gonna do is we're going to cheat and I'm gonna load a different structure and you get to watch the contact network build. That's, I just downloaded the network file and the protein structure. There's the structure popped in, now we're creating the RIN. There's the RIN just like we did before. We're gonna color it. And this time what you see this little sidebar, it's gonna give us the networks that we can choose from. We can adjust the percentage of overlap that we allow. And I'm gonna look at 0.35. I can click on that. 
And what you quickly see is here is the network that works its way all the way through the structure. And this is what that looks like when we probe those side chains. You can see in this example how we're getting that to walk all the way through the structure. So, if we did have H3 and H4 at high resolution, it'd be really interesting to run this and see if there are contact networks that work their way up that helix that might be changed when we mutate that one residue. Okay. So, quickly I wanna mention the tools that you've seen. ChimeraX is our next-generation molecular visualization and analysis program. It's not yet released. You're seeing raw prototype builds. We intend to open it up for users next month, in December, so people can download these builds. And we're shooting for an alpha release in the first part of 2017. This is what it looks like. You can see that it's got a very different kind of visualization. This is sort of the typical molecular visualization that you would see. And this is what you see in ChimeraX. It takes advantage of interactive shadowing as well as ambient occlusion to get a better sense of the depth of the structure. So, again, from that to that, and you can see why we're pretty excited about it. Cytoscape, those of you who are in the class are already familiar with it. It is a platform for visualizing and analyzing networks. It integrates diverse data types and it has this concept of apps. And there are 302 apps now in the app store. And no, it's not a store. It's the app repository, but it's still called a store. And we're currently at release 3.4; 3.5 is in release candidate. It will be out in January. How does all this work? So I'm clicking on links. You're seeing things happen in Cytoscape. You're seeing things happen in ChimeraX. What's going on behind the scenes that has all of this glued together?
And what's really happening is both ChimeraX and Cytoscape support a REST interface. They both have their own little built-in web servers. And we're just sending all of the commands back and forth using those web servers, using REST. So Cytoscape has something called CyREST that we can send commands to. And ChimeraX has something built in called the REST server. And then there's a bundle, that's ChimeraX's equivalent of apps, called Cytoscape, which knows how to send commands to Cytoscape. And in Cytoscape, there's something called structureVizX, which is sending commands to ChimeraX. And so with all of these things, as I'm clicking on links, those links in the browser are actually commands that I'm sending to ChimeraX, which is in many cases doing nothing more than sending a command to Cytoscape or processing it itself. All right. So in summary, what I wanna kind of hit on here is that what I've walked through is this idea of using multiple data sources of multiple diverse types and integrating them together in unique visualizations. And we can do that because we've linked the structural view with the network view. Hopefully you kind of get the value of the tools for doing that. I wanna just make sure that there's one message you take away. It's that network tools and structure tools work well together, and there are a lot of advantages to linking them. And with that, I'll sort of camp on the acknowledgments. The UCSF Chimera team, these are the people that are working hard right now on developing ChimeraX. My collaborators at STRING, Lars Juhl Jensen. The pE-MAP work is done at the Krogan Lab at UCSF. All the contact networks work is done in the Fraser Lab and also by Henry van den Bedem at Stanford. The Cytoscape team, and my collaborators on residue interaction networks. And of course, we are funded. The NIH graciously gives us money, and there's our grant. And with that, I'll take any questions.
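As a rough illustration of the REST glue described in the summary, the sketch below only builds the URLs that such commands would be sent to; it does not contact either program. The URL shapes follow the documented released versions of the ChimeraX REST server and CyREST and may differ from the prototype builds shown in the talk. The helper names, the ChimeraX port number, and the example commands are assumptions.

```python
from urllib.parse import quote, urlencode


def chimerax_rest_url(command, port, host="127.0.0.1"):
    """URL that asks a ChimeraX REST server to run one command.

    The port is whatever the REST server was started on (assumed here).
    """
    return f"http://{host}:{port}/run?{urlencode({'command': command})}"


def cyrest_command_url(namespace, command, port=1234, host="127.0.0.1"):
    """URL of a Cytoscape command exposed through CyREST (1234 is its usual default port)."""
    return f"http://{host}:{port}/v1/commands/{quote(namespace)}/{quote(command)}"


# A link in the help viewer would boil down to something like these:
print(chimerax_rest_url("turn y 90", port=60000))
# http://127.0.0.1:60000/run?command=turn+y+90

print(cyrest_command_url("network", "list"))
# http://127.0.0.1:1234/v1/commands/network/list
```

With both programs running and their REST servers enabled, these URLs could be fetched with any HTTP client; the commands a given app or bundle accepts depend on what is installed.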