 So what I'm going to talk about today: I'm going to spend the first 10 or 15 minutes giving a general introduction to the setup of the course, what we're going to do, what you have to do and what you don't have to do. And then I'm gradually going to go more and more into the actual biology. I'm going to cover the literature too, and while we're at it, I might as well pass around the lecture notes. I'll come back to that. I might continue printing those. It depends a bit. If I realize that half of the class is staying away and watching it on YouTube, again, perfectly fine. But then it's a horrible waste of trees for me to print 30 copies of the lecture notes every time. So if attendance fluctuates a lot, I might just ask you to download the lecture notes as PDFs. I know that some of you like to use iPads to take notes, so I have both handout versions and the full slide versions if you want to download them and annotate. But before we get there, I'm going to spend a little bit of time introducing you to biophysics. Biophysics is a very broad topic and I'm not even sure whether there's one true definition. But the idea of biophysics is to somehow connect, on the one hand, biology. That's the bio part, obviously. The real wet-lab biology, as you see up on the left: zebrafish is traditionally wet-lab work, right? And historically that hasn't been a quantitative subject at all. That has changed a lot in the last 50 years, and in particular the last decade, I would say. First because we're getting so much more information, the whole sequencing and everything I'm going to touch on in a few minutes. But also because, as we're understanding more and more on the molecular level, we're able to go not just top down but increasingly also bottom up. So zebrafish, well, if you're a biologist it's probably a very fun topic, but if you're a physicist it might seem completely pointless to study zebrafish, right? 
On the other hand, there are quite a few people who have been able to build models. This is a basic computer simulation from a colleague of mine, actually, Petros Koumoutsakos in Switzerland, where they study fish and how fish swim together. And the reason why fish swim together, it actually turns out, is that it's energetically more favourable. And they can even do computational fluid dynamics simulations based entirely on physics and calculate roughly how much energy they're saving. And that helps you to actually predict not just patterns; a whole lot of this type of research is then being used for things like patterns in car traffic flow, or for instance, if you have trucks, should you let the trucks drive together to save on air resistance in a similar way. Now, the reason why we're starting with zebrafish is that there is actually a point in this. It might seem completely pointless to study. Researchers are not necessarily enemies of pointless things. So what you can do if you have zebrafish in a small aquarium: you can put a camera on top of the aquarium. And with this camera you can start recording the individual zebrafish, and as a function of both time and location in the aquarium you can then start to plot this in three dimensions and see how the fish swim. And if you have any numerical way of characterizing how the fish swim, you can then check what happens if you now add, say, ethanol or nicotine or caffeine or anything, and then you start to see differences here. And there are two important things here. The first thing is that this is noisy. It's insanely noisy. So what are your backgrounds? Are most of you physicists, or are there a few chemists? Physics? Okay, great. Some things will be repetition for those of you who are physicists, and conversely some things are going to be repetition for those who are chemists. You can certainly characterize this, and compared to classical physics the challenge here is frequently the noise. 
And if you do this experiment a second time you're going to get, well, not just different results; the results might be so different that you can't even reproduce it the second time. And that's a particular challenge with biology. It varies far more than, say, trying to measure Planck's constant or something. So why do you think you do this? Well, 50 years ago people might have been interested in studying animal behavior or something. Today it's much more concrete. We even have facilities for this within SciLifeLab. So what you're doing with zebrafish is actually using them as screens. For instance, if you want to test the effect of new drugs or chemicals and what doses: assuming that caffeine was a new drug, what dose of this drug should you administer before you start seeing changes in behavior, in the brain or something? That's probably an experiment you don't want to start doing in humans, right? Zebrafish, on the other hand, is perfectly fine. You hardly need a, well, you do need a permit for it, but it's not a very difficult permit to get. So it's physics in the sense that we're measuring the location of zebrafish with cameras or sensors, right? It's physics in the sense that you're analyzing the patterns. You're going to need physics to describe: is this nicotine pattern significantly different from the caffeine pattern? Yes, well, you can probably say that they look completely different. But what does completely different mean? Does it mean that there's one chance in ten that it would happen by chance? I would not be comfortable administering that drug to myself. One chance in a thousand? One chance in a million? And then we're getting back to the statistics. 
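One way to put a number on "one chance in ten versus one chance in a thousand" is a permutation test. The sketch below is only an illustration and not something from the lecture: the function name `permutation_p_value`, the choice of mean swimming speed as the measured quantity, and all the numbers are made up. It shuffles the group labels many times and counts how often a difference at least as large as the observed one shows up by chance alone.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate how often a difference in means at least as large as the
    observed one appears when the group labels are shuffled at random."""
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random reassignment of fish to the two groups
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical mean swimming speeds (cm/s), one value per fish; invented data.
caffeine = [2.9, 3.4, 3.1, 3.8, 3.3, 3.6]
nicotine = [2.1, 2.6, 2.4, 2.0, 2.7, 2.3]

p = permutation_p_value(caffeine, nicotine)
print(f"estimated p-value: {p:.4f}")
```

A small value means that random relabelling rarely reproduces the observed gap, which is the quantitative version of "they look completely different". With noisy data and few fish, the estimate can easily land in the uncomfortable one-in-ten range.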
So a whole lot in biology has to do with statistics, and although I'm going to spend quite a bit of time today talking about biology and structure, on Friday we're going to get back to physics, and the more we get back to the physics the more we're going to bring in statistical physics. But what I love with this course is that it's a completely different way of bringing in statistical physics than you normally would in a statistical thermodynamics course. The power of this way of reasoning, combining biology with physics, is that you can also start to think in terms of scales, in length or space or time. Sorry, this is a bit of a faint image, but on the top left you have a tree. That's probably a scale of tens of meters. If you then zoom in on, say, a leaf, you might have a scale of a centimeter or a few centimeters. Inside this leaf you're going to have, say, a cell, which might start to become a scale of a millimeter or something, and you have components of the cell, which start to be on a scale of 10 micrometers or something. Inside those components you're eventually going to get into very tiny organelles in the cell, and eventually somewhere in there, and this is a model, not a structure, you're going to see a small protein which might be part of the photosynthetic reaction cycle or something. And this component is partly what explains why the tree is green, but also what explains the entire respiratory cycle and everything. Now the challenge here: how on earth do you describe a tree at the same time as you try to describe a molecule, right? A tree has a lifetime of maybe a hundred years. The cycles we're going to need when we describe this molecule and how things are moving in the molecule are going to be maybe a millisecond. And that's insanely difficult, and I'm sorry to break it to you, but we don't really have any good ways or mathematical formalism or anything for treating all this at the same time. 
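To get a feeling for how wide that gap in time scales is, here is a back-of-the-envelope sketch. It uses only the two rough numbers mentioned above, a tree lifetime of about a hundred years and a molecular cycle of about a millisecond; the variable names are invented for the illustration.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # length of a Julian year in seconds

tree_lifetime_s = 100 * SECONDS_PER_YEAR  # "maybe a hundred years"
molecular_cycle_s = 1e-3                  # "maybe a millisecond"

ratio = tree_lifetime_s / molecular_cycle_s
orders_of_magnitude = math.log10(ratio)

print(f"ratio of time scales: {ratio:.2e}")
print(f"orders of magnitude: {orders_of_magnitude:.1f}")
```

Under these assumptions the two descriptions are separated by more than twelve orders of magnitude in time, which gives a sense of why no single model covers both ends.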
So that means that we end up with reductionism and try to simplify. This is the way we've always done it. You tend to go top down, usually with observation; the classical way of doing this is a microscope. And as we get better and better microscopes we are able to drill further and further down. What has happened, though, in particular in the last 40 years, is that we've become increasingly better at studying things on the molecular level, partly based on physics and structure. I'll come back to that. But also partly because we've gotten so good at understanding what the structure is that you can use physics: hey, if we know that that's the structure, can you predict what color this component would have? We probably can, because we can calculate what wavelengths of light it would absorb, right? And then you start to introduce a bottom-up approach: if we know what the components are, can I predict how the entire organism would work? And that's kind of similar to the zebrafish: I want to try a chemical on a zebrafish so I understand the influence this chemical has, so that I can use this drug in a patient, which is a much more complicated system. And well, I'm not sure whether this is an ideal illustration, but you could even plot this two-dimensionally, where you have some sort of length scale from microscopic things on micrometers or even nanometers up to macroscales with humans, or you could even think in terms of population dynamics. We're not going to talk about that, but the entire earth is kind of an ecosystem, right? So you have tens of thousands of kilometers, and same thing in time: you have things ranging from nano- or even picoseconds up to, well, if you think of the population, it's about 4.3 billion years, right? Evolution, which is somewhat difficult. You usually don't go beyond the earth in biophysics, not yet at least. 
What we're going to focus most of our attention on in this course is this part of the spectrum, understanding molecules. It's not because the other things are not at all interesting; actually, I'm going to try to connect things up. But this is really where physics is starting to play a pivotal role. We're completely changing the way we can interact with things, and I would really encourage you to think of this in terms of top-down observation versus bottom-up modeling. Both are important, but we will probably put a bit more of our attention here, except for today, when we're going to do more observation. There are a bunch of fun systems that I'm going to keep touching upon. Don't worry too much the first time I bring something up if you don't understand it fully, because if it is important I will be coming back to it again and again and again. What I love, and what can also be difficult, in this course is this ping-ponging back and forth between biology and physics, and that frequently requires me to first introduce something conceptually. Let's pick an example. These are the channels that are pumping different types of ions in and out of your cell. This is a super complicated topic because it involves your entire nervous system. This is the engine in your nervous system. We're using ATP for it, but we also have to understand what a membrane looks like, and you have to understand what the physics looks like at the same time. So that means in this case, and we're going to talk about this later on in the course, we would first say what it does biologically and not worry too much about the details, then start to go into the physics and what the physical part of this process is. Once you understand the low-level physics, I can take a step back to the biology and start to ask: okay, so how will the channel work, based on that physics? 
Once I know how the channel works I can start to think in terms of higher-level physics: what would this mean for all the different conformations in the energy landscape, and how could the channel change its behavior? And eventually bring that back to biology again. So the simplest way of thinking about this: channels are just holes. Pumps, on the other hand, can transport ions in the opposite direction, that is, against a gradient, in this case of the ion. To do that they require energy, and in this course we haven't yet touched upon where you get that energy from, but we'll come back to that later. There are of course structures of these channels; that too we will go through. The energy, you see somewhere here, is catalyzed by, say, ATP. It's a small molecule; again, this too we're going to cover later. And the reason why all this machinery works in the first place, or rather the reason why we can understand how the machinery works, is that for roughly 20 years we have had very detailed, not just molecular models, but very detailed structures of this entire pump complex. We know exactly, well no, we know roughly where every single small atom is sitting here. These are amazing machines. Even after, I was about to say 20, but I've probably had 30 years in this field by now, even after these 25-30 years I'm still in shock and awe that these machines work. I can't understand it. Well, we can understand it, but it's that nature has actually evolved to this, and as a physicist, the amazing thing with these 4.3 billion years of [unclear] evolution is still [unclear]. There are a bunch of other proteins that we're going to touch on. I will just give you a final example before going back: G protein-coupled receptors. When I was roughly your age and studied undergraduate engineering physics and biophysics in Lund, and then as a PhD student at KTH in the early 2000s, sorry, late 1990s, we didn't have these structures yet. This is called a GPCR, which is a G protein-coupled receptor, which sounds fairly boring but 
these are probably the most central components in all signaling. When cells need to talk to each other, when you're binding something so that one cell is going to tell another cell to start a process or something, there are dozens if not hundreds of these different genes in your body, so they're literally the entire communications network. And until 20 years ago, well, until 50 years ago, we just knew: yeah, it exists and it must be a protein, but it was just a black box. We had no real idea what it was like, roughly just that it's sitting in a membrane, which is obvious, because if you need to send a signal from the outside of a cell to the other side, we need to get across the membrane. And at that point in time people were fairly negative, and I remember sitting through a bunch of seminars where people said: look, we've sunk two billion dollars into this field, the likelihood that we will ever see a structure of these proteins is zero. Until people found a structure of it, and now there are probably 50 structures. So that's the other cool thing compared to physics: as much as I love physics, traditional physics evolves fairly slowly. Biophysics evolves exceptionally rapidly. Probably half the stuff I'm going to tell you in this course was not known when they printed the first edition of the book we're using, which is again a challenge, because it's difficult to find good literature. But the cool thing is that things happen in this field every single year. The reason why we have a structure of this, and I'm going to come back to that too, is that you're using X-ray crystallography, roughly the same way that we first determined the structure of, say, normal salt, sodium chloride. I'm not sure about you, but this is slightly more atoms than sodium chloride. On the other hand, when we have a structure you can start to do any type of modeling you can imagine. So almost 40 years ago people realized that this structure is just a bunch of atoms; they will obey the laws of physics, and 
if they obey the laws of physics, well, you could in theory, well, now not just in theory but in practice, put them in a computer and then determine how this G protein-coupled receptor binds a small molecule ligand. And you see that it got really stuck there, and now that the molecule is stuck, this is changing the structure of the entire protein, and when the structure of the entire protein has changed, what this will lead to is a chain effect that will cause a signal to propagate on the inside of the cell. Again, we're going to come back to this several times in the course. And that leads to the extra complication here, which I don't think either of these books covers: in the last 10 or even 15 years biophysics has increasingly become computational. Not in contrast to experiments; computers are increasingly our new experimental tool. We use experimental observation, we're collecting tons of data, and then we're using computers to make sense of it. So what I showed you here was run on a very large supercomputer, Anton, in New York. It's the one on the top right there. But there are a bunch of these machines. All of these machines have at one point in time been the largest in the world, like this new Chinese machine. Basically, people are investing billions of dollars to build larger and larger machines to use essentially as computational microscopes, because we can use computers to study things on scales and with detail that is completely impossible in the lab. Again, this was science fiction when I started my PhD in this topic, but today any modern drug you design is partly designed in a computer. It's too inefficient to do it in the lab. So what that allows you to do: you can for instance compare the docking, and again, that would basically be the computer trying to predict how this small drug should bind. What you see here in purple is how the computer predicted that it would bind, and 
what you see in grey is the X-ray crystal structure showing how it binds. It's pretty darn accurate. The difference is that determining a new X-ray structure of such a molecule can take months if not years; the first time it took a decade. Doing this in the computer, you can probably do one in a week, and in the future people hope to be able to do one in a minute or so, and then you can imagine screening through a million drugs in parallel. So the reason why people are interested in that, both our team and others, is partly understanding physics and interactions, how molecules work in the first place. So again, these are just molecules. There is no special life essence or anything. They're just stupid molecules, in a way, but when they are this large and complex they start to behave in different ways than water or carbon dioxide, which would be the traditional molecules we're used to. And this is exceptionally important, in particular in the pharmaceutical industry. I don't know how many billions of dollars are in this slide, but these are basically companies and trademarks and generic names of drugs targeting just G protein-coupled receptors. I'm going to come back to this later on in the course, but the revenue for these blockbuster drugs can easily go into the tens of billions of dollars today. So it's exceptionally important commercially too, not just for science. I'm going to have one more example before I go into the course. We can certainly zoom out: you can go from that scale of molecules to entire cells and start to look at where in the cells things are present. There is actually a lot of work on that going on out at SciLifeLab that I might tell you a little bit about. We can start to look at where things are in different types of tissues and organs. Are there differences if a patient, say, has a cancer tumor? Are there going to be differences in what proteins are expressed where in the body, or in how the proteins behave? In general the answer to that question is yes, so we 
would maybe like to design a particular drug that would just target some proteins that are expressed too much or too little, or that have changed in a way so that they don't behave the way they normally would. On the other hand, you could also go the opposite way. You could start with something from the outside. Let's say that I have a traditional drug that I would like to deliver, and deliver means that I would like to get this into your cells, because if the drug is just on the outside it doesn't help a whole lot. The problem is that your body is designed to protect the cells from the outside; that's what the skin and everything does. And you could certainly try to inject the drug, but drugs that you inject are virtually never successful. It doesn't really work. Well, it works, but it's expensive: you need to go to the doctor three times a week and you need to take injections three times a week. If you're a diabetic you have to do it, but again, it's difficult and you need a lot of hygienic equipment and everything. So by far the best drug is a drug you can take as a skin patch. There are contraceptives that can be delivered that way, and there are a couple of anti-nausea ones if you get seasick, for instance. But most drugs can't be delivered that way, because if you zoom in from the outside, what you have here protecting you are parts of your cellular components. This is the lipid bilayer you see up there on the right, and the whole idea with this is to protect your skin so that things can't go into your cells. Here too you can use computer simulations. We have three people in our lab working with this together with a company, and they're trying to design computational models that really model exactly what the lipids look like in these upper parts of your skin. And if we know what the lipids look like in the upper parts of the skin, then you kind of understand what the properties of this barrier are: what is it that causes some drugs to 
penetrate, and what causes other drugs not to penetrate your skin? And if you do that you can start to build a computational model of this, and when you do that, in turn, you can start to let the computer screen through thousands or tens of thousands or millions of drugs and see, first, what drugs will be able to go through the skin. Those might be the drugs you want to focus on in your experimental trials, right? Second, if you have a drug that you would like to deliver because it's a really good medical drug, could I modify this drug, add a group or remove a group, to be able to deliver it through the skin? Same thing here: this could easily run to millions of dollars in investments, but of course also much better quality of life for the patients. And this is very much based on physics. What is it that determines whether things happen or not? Well, it's partly diffusion, right, that you've probably all seen in physics, and partly things like energy, free energy, interactions. So before we know it, already on Friday, we're going to begin drilling down here and start to look at how all these different molecules interact: the electrostatics, van der Waals, bonds in the molecules. And then we're going to start talking more and more about energy for a few lectures before we head back into biology. There are a couple of other proteins that we're probably going to mention, simply because they're very important in our research, and again, the beauty of this course is that we can connect things closely to research. What you see on the left there is literally the world's smallest machine. It's just four helices, probably 10 nanometers across, but every time the voltage across your cell changes, when you have a nerve signal, for instance when I'm moving my hand, there are millions of these where the blue helix in the background will move up a bit and cause a chain effect of other channels opening. What you're seeing on the right here is a molecule that's a ligand-gated channel, so that if we are having a glass of alcohol on 
Friday, the ethanol molecules will bind to this channel and will actually cause the channel to open a bit more than it would normally do, which again will alter your nerve signals. So there are really cool ways that you can fine-tune and interact with the nervous system. These types of channels are also what makes it possible to sedate you with anesthetics, and again, we know what they look like now. So there are a ton of proteins that we're going to keep coming back to. If there is something happening in your body and you don't know what it is that happened, guess that it's a protein doing it. In 90% of the cases that's true. We're going to come back to most of these classes later on in the course, and we're really going to talk about first what they do, but also what you can do, here on the right. On the right you have an artificial protein, a protein that has been designed in the lab and in computers to achieve a specific feature, in this case artificial photosynthesis. Why would artificial photosynthesis be useful? That would be, certainly, [unclear] part of it, but also: what does photosynthesis do primarily? It converts light into energy, right? And I'm not sure, if you're a physicist you might have read that there's been an amazing development in photovoltaics in the last decade, and you have these new types of much more efficient solar cells. None of the existing solar cells comes close to what happens in a green leaf in nature. So if you could design artificial photosynthesis, it would be way more efficient than the best solar cells you can imagine. And that's an important memento mori, because as smart as we think we are as physicists, and we are pretty smart, right, and again the new photovoltaics are impressive, nature has had 4 billion years of trial and error to come up with something better, and 4.3 billion years usually beats being smart. So many of these things that were introduced in biology are increasingly being used in engineering, because simply 
they're more efficient. There are all sorts of other ways: we try to use bacteria to produce ethanol, and again, carbon dioxide capture is also a very good idea. So biophysics ultimately is very much about explaining phenomena like this. In some ways it's easier than physics, because we're not going to use quite as much hardcore math here, but that occasionally also makes it more difficult, I think, because it's not as well defined as physics. You're going to need to spend more time on: that's a biological concept, how do you turn this biological concept into a physical model? So it's increasingly going to be up to us to come up with equations. Now, those equations we come up with will be much simpler than the traditional equations you're used to, but this frequently requires more of you, to decide what to model and to make gross, horrible simplifications. And the reason why we need those simplifications is that it's so noisy: unless you make horrible simplifications, nothing will work. And it's very much about knowing what's a known unknown versus unknown unknowns: what are the things that we can try to remove from our model to focus on its core properties? And increasingly the great thing is that if we have a simple model we could of course even put it in a computer, and you're going to use computers in this course a lot. And if you're interested, I'm also going to try to take you out on a site visit to SciLifeLab, so you can see what happens on the experimental side. Most of what we do is going to be applied to proteins and nucleic acids, these small structural components, simply because for physics that's where it all starts. These molecules are micro machines. They will move between different states, and the physics and energy of these interactions determine what happens or what doesn't happen. And what I think is really cool: you don't really need physics, you need a bit of common sense, but then you need to think, and you get surprisingly far with that. We can 
explain, not everything, but a pretty darned large fraction of everything that happens in very complex systems. So if we go a little bit into course meta-info, time is running quickly here. This is a 50% course, not 100%, as I said, at Stockholm University. We will average roughly two lectures per week; some weeks it might be three. There are several days per week that are intentionally kept free, partly because you're taking other courses in parallel and partly because we want to give you some time to study. The key words in that sentence are time to study. If you only go to these lectures you might pass the course, but if you want a high grade in the course you will have to take some time to study and read up on your own. You will also need to spend some time on these hand-in tasks. There are going to be three hand-in tasks, and the lab; they are just pass/fail. The first one will have to be handed in roughly two weeks from now, well, a little bit over two weeks, no, sorry, three weeks from now. The idea is that you will have roughly two weeks for each hand-in task. There is going to be one computer lab divided into sessions. You will get separate credits for the hand-in tasks and the lab, and then there is a written exam at the end. In general I am a big fan of freedom and responsibility. When I studied engineering physics 25 years ago there were, I think, like six months with no mandatory moments in the course. I want to give you that freedom too, but remember, freedom comes with responsibility. If you want to go away skiing for four weeks, be my guest. In that case contact us and say that you will not be able to do the computer lab on site. In principle we are going to okay that, but don't come back after those four weeks and expect to be able to sit down with us and go through the computer lab. That literally means you get the instructions and you read up on what you need yourself. Again, the responsibility part. Same thing with the hand-in tasks and everything: we don't want to 
sit the day before the exam and have to correct 30 times 3 hand-in tasks, because then we die. So that's why we want to space it out. The other reason we are spacing this out: you are really hurting yourself if you don't do them on time, and that's why we are going to force you to do them on time. The first hand-in task, for instance, is very much about the Boltzmann distribution, understanding entropy and everything in a slightly different way than in the usual physics, and I deliberately planned that pretty much in parallel with the way we are bringing things up in the lectures, because the hand-in task will force you to approach it in a different way from what we do in the lectures. It will help you to do them on time. So by on time: when we start to describe the hand-in task, don't think that these two weeks mean you should do it the day before the deadline. Start working on them right away, and then you will hopefully have much smoother sailing through the entire course. As I mentioned, I will try to record the lectures, but there is no promise whatsoever there. If they are not recorded, they are not recorded. We are virtually always going to be here, and there are some events in FB 55, and the computer labs are going to be out in the RB 33 room. I am not going to repeat all the stuff about when things are. All the course information will be in Canvas, and it will be up to date in Canvas. It would have been a really good idea if KTH had actually transitioned to Canvas by now, after 24 months, but KTH thought it would be a great idea to have three different places with course information, out of which I might be able to edit some, sometimes. We will do our best. I have complained to KTH about this multiple times. The Canvas course page I will make sure is up to date. For the other ones I will do my best, but for instance yesterday I couldn't even remove the old schedule from the KTH Social site, because they think that just because I am a professor and in charge of the 
course doesn't mean that I have the right to write anything there. Sorry about that; there's not too much I can do about it. The full schedule will be there. Did all of you get an email yesterday? Good. I don't have your email addresses; I can just decide to send an email to you, and then I have no idea where the thing shows up. Feel free to use the forums for anything. You can use them to talk to each other. If there are questions you have, we will try to be attentive and respond in the forums. You are more than welcome to send an email to me too. The reason why I suggest not sending an email to me is that I want to spend this time engaging with you, and if 30 of you have sent individual emails to me about the same question, I am going to respond to that question 30 times, and that means that there will be 29 other questions I don't have time to respond to. But if you instead post your question, I will respond to it once, and then I have time to respond to 29 more questions. So if you have a question, there are likely more people that have the same question, and that's why I prefer the forums. I will cover virtually all the topics here, and I think the same goes for Lucy and Burke, but do read literature and papers too. I will occasionally include scientific papers; in some cases they are really useful. It's not that I am going to interrogate you and ask you deep, specific questions about papers in the exam, in most cases, but the reason why I hand out papers is that they will help you to understand the material, they will help you to understand the book better, and in many cases they will help you to understand what has happened since they wrote the books. There will be other papers that I don't mention so much or that I don't hand out, and in that case you can consider it optional reading if you are interested in that aspect, but I am not going to ask about that on the exam. And since this is an advanced course: a long time ago, when we first started this course, there was a beautiful book written 
by Alexei Finkelstein, Protein Physics. The course was even called Protein Physics the first few years I gave it. And the beautiful part of this was that it was deliberately a book where each chapter was meant as a lecture; that doesn't exist on this level. So we used that for several years. The problem is the beauty of this field: the field moves. The book is 10 years old, and half of the things I want to tell you were discovered after the book was written. Then on top of that the book went out of print, and for a few years people were able to get it on Amazon or from private sellers in India or something. At some point we had to give up on that and try to see if you could find a PDF copy of the book online, which many people were able to. And then Magnus started to favor another book, written by Thomas Nordlund, which is kind of the opposite. This is a usual US textbook. If you like to read lots of text, these books are great, and I'd say it's a beautiful book, but this goes all the way from the start, and then you're going to spend a lot of time reading here. And I'm not sure about you, but what I love with being a physicist... it will probably take you longer to go through this book, because this book doesn't bullshit as much; he goes straight into the equations. So I'm biased, I still love this book, but it depends a little bit on what type of material you want to approach. I think you can find both of these as electronic loans from the KTH library; you're more than welcome to have a look at them. I will try to provide literature references to both of them, but again, focus on what we go through in the lectures, and read about those concepts in the books too. If there's something we haven't covered at all, we're not going to ask you about that. But again, the concepts we cover in the lectures are reflected to roughly 80% in both books, and then there are some things that we're either just going to cover in the lectures or cover in the slides. And I realize that on the lower level you want a textbook that you read from the first to the last
page; in that case you can pick one of them, but I will deliberately try to provide reading references to both, so that it is the lectures that define what we talk about, not the books. Examination: as I said, we have separated this into two parts. The written exam will be a relatively small part of the course. You will be able to get five ECTS credits just by completing the hand-in tasks and completing the lab, and that will basically get you an E grade... no, sorry, you're going to get a pass grade. Those two moments are examined separately, so you're going to get pass grades on those. If you specifically would like to show how good you are at this, there is a written exam that's divided into two pieces. There will be ten multiple choice questions that count for roughly 60% of the requirements, and 60% of those have to be correct, and if you do that you will get an E. If you want the higher grades, there are going to be ten essay questions where you have to write a bit. The idea here is intentionally to try to take some stress off you. If you don't care that much about the grades and everything, it's not too difficult to pass this course if you just follow the course and do all these moments, as we're helping you. Again, we will be available to help you, but please don't all come and ask for help with the hand-in tasks three hours before the deadline Sunday evening, because that we can't have. The hand-in tasks, as mentioned, are going to be available in KTH Canvas later tonight even, or at least later this week. Same thing there: don't worry, you will have this material available, and there is going to be plenty of information there. But it's not going to be like a lab; it's not just going to take two hours where you get click-by-click instructions, do this, do that. You might have to go back to the course lectures and understand what we were talking about, because again, this is an advanced level course and you are expected to search for your own information. And then, to make it more modern, we have something that is not at all covered in either
of those books: Burke in particular is going to help set up a computer lab, where we're going to do proper molecular simulations the way people, well, we in our research groups and other people, are doing it today: pure computational modeling and calculating free energies, which is beautiful; it's very much related to current research. We are flexible; we can handle virtually everything if you contact us before deadlines. Contacting us after deadlines does not make us happy. We have ten, eight more minutes before the break, so I'm actually going to go through the first few slides on the outline today. Topic-wise, what I'm going to bring up is mostly basic concepts: a little bit about water and a bunch of different biomolecules, a bit about the protein production machinery and protein structure, and then on Friday I'm going to go into the physics. Before I do that, did you have any questions about the course setup? I'm available in the break too for your questions. So, biophysics. The single most important molecule in biophysics is this one: water. This is actually a computer simulation of water, and in the 1970s this was science fiction, the type of simulation that got people the Nobel Prize. Today this is something you can pretty much do on a laptop. So this is liquid water, and you've probably seen ice crystals; liquid water behaves quite differently. In ice crystals, every water molecule participates in four hydrogen bonds: it donates two hydrogen bonds and it accepts two, so that means that you have two full hydrogen bonds per water molecule. In liquid water you have almost the same number, 1.7. So it's not as simple as saying that when you're melting ice you're breaking all the hydrogen bonds. On the other hand, we're not really breaking any real bonds either; there is not a single oxygen that's leaving its hydrogens. So this is all a matter of which state is most favorable: is it more favorable to be a crystal or more favorable to be a liquid? How are all these
molecules interacting, in particular as you're moving between different states? Here it is the finite temperature that causes them to move semi-randomly, and these are things we need to understand. In particular, if you put this at, say, 300 Kelvin, which phase is more favorable for water? It's going to turn out that's not too different from proteins or DNA or anything. If you're studying physics, in particular quantum physics or quantum chemistry, you're usually looking at one molecule, right, and then you want to determine what is the structure, what is the energy. This is more complicated. I haven't even introduced anything quantum yet, but the problem is that we can be in multiple structures. Again, that is very much a characteristic of biology; we're going to come back to that a lot. It turns out that these hydrogen bonds are very much causing the properties of water: high freezing temperature, high boiling point. The energy of each hydrogen bond in water is roughly 2 kcal per mole; it can be slightly more in proteins. That's a number you should know, but we're going to come back and repeat it several times. And the reason why you have this in water, you might remember even from upper secondary school, is that the oxygen will effectively steal some electrons from the hydrogens, so that the hydrogens end up with a small partial positive charge and the oxygen with a small negative charge, and that means that that hydrogen is also happy to interact a bit with that oxygen. There is another molecule that's equally central, and that brings us more into the real biomolecular structure: that's DNA. You've probably seen this; well, I would be a bit worried if you haven't seen that molecule a hundred times by now. You see it all over popular science, literature and everything; it's kind of the hallmark of modern science. I think, yes... oh, hi. It's even embroidered in the carpets. Every single life science center in the world
including SciLifeLab needs to have a model of DNA in the logotype; that tells you something about how important this is. Arguably, although I am somewhat biased, the structure of DNA could be considered the most important scientific discovery of the 20th century, at least if you are biased towards biophysics. But that was not at all obvious when people determined the structure. Do you know what this is? In a way it's not so bad. So, chromosomes. The problem is that science always makes sense retroactively, when we know what we were finding, right? But when you make science you usually start out with a mess and then try to make sense of it. So DNA was actually first isolated in the late 1800s. Somebody was studying pus from infected wounds and everything, and then they came up with some sort of strange compound, and at the time they had no idea what this was. But eventually, in the first part of the 20th century, people were able to isolate this strange molecule that exists in all cells. And when you isolate the molecule, again, at the time there were no obvious easy ways to determine what the molecule looks like. It's a salt or something; you can easily dry it out to get it into the salt, but what do you do from that?
It's a salt; you can't put the salt in a microscope and see what it looks like, because the microscope is limited by the wavelength of light. The obvious thing you could do with physics is of course, if the salt is a proper crystal, to shoot X-rays on it. If it's a crystal, then you have trillions and trillions of molecules, and then you're going to get a diffraction pattern from all these molecules, and you can capture this diffraction pattern. Now that is trivial for sodium chloride: two atoms, and you determine what is the distance between two sodium atoms, what is the distance between two chloride atoms, what is the distance between a sodium and a chloride, and from the patterns you can see the type of unit cell you have. Easy. For DNA, in principle the same thing works, but it's a slightly more complicated molecule. So this is a diffraction pattern from a DNA crystal determined by a scientist who is not as famous as she should be, Rosalind Franklin. This photo, what the scribble there says, is Photo 51, on May 2nd, 1952, and that's all they had. What Rosalind Franklin and Raymond Gosling showed is that there's an amazing amount of things you can do just by approaching this systematically. So do you see that there appears to be some sort of systematic distance between these bars, right? They're spaced roughly the same distance apart. And if you do your physics properly, again, you're just going to have to trust me here, I would so not know this either, if you start to put out some layer lines here, you can actually use those layer lines to derive that this cross would correspond to having some sort of helical shape in the structure, and these layer lines will tell you something about the length scale, that there is some sort of repeat here of a few angstroms or a nanometer. But that's all you can determine, because it's pretty darn low resolution. So you can't go from the picture on the left and tell what the structure is; that's impossible. What you then can do
is that you can sit down, pretty much with a molecular model, and say: hmm, if I have a hypothesis for how DNA would look, I can test whether this hypothetical model would lead to a diffraction pattern that looks like the one on the left. Then there are three possible answers... in principle there are only two: your model could be incompatible with the data, or your model can be compatible with the data. It's impossible to say that the data proves that your model is right. And a lot of very famous people did this. What you have here on the right is a model by Linus Pauling, arguably one of the most famous scientists of the 20th century. So this is a helical model that uses the bases of DNA, because from the chemical composition people knew what the bases in DNA looked like; it's just that we didn't know what the entire structure looked like. So this is a structure that looks like a spiral staircase, with the backbone in the middle, and then you have the bases pointing out. This was published in Nature in 1953, I think. It's a completely incorrect structure, right? Like 50 years later we can laugh at it. But that structure is compatible with the data; the data can't rule out that structure. So, as beautiful as physics is with great models, physics can also lead you wrong. It's not a stupid structure; it's from, again, one of the most famous chemists and physicists in the world, but the structure was not right. We tend to sweep such things under the rug in the history of science, and I think that's really stupid, because when we present science to you as
students, we send the signal that every single prediction we ever made was right, and that's not the case; it's rather the opposite: 95% of the predictions we make are wrong. We're going to come back after the break to how we figured out what was right. But there is a famous interview with Linus Pauling, not specifically about that paper; that's 30 years later, when he was famous with his Nobel Prize and everything. And there is this interviewer, Dave, I forgot what his last name was, and it was: "So, Dr. Pauling, you have had so many great ideas," and the answer was, "Well, Dave, it starts with having a lot of ideas." And the point is that this is how science works: you need to have lots of ideas, you need to dare to make assumptions, and that's one of the things I want to teach you in this course. Don't be afraid of picking up pen and paper and starting to assume things; work on things and see where it leads. The worst thing that can happen is that you get a beautiful model and two pages later you realize, oh shit, that model is completely incompatible with the data. But that we're going to come back to after the break. It's 2pm, so let's meet here at a quarter past and then I will continue. Alright, I will get started again. I'm going to tell you what I told Burke and Lucia in the break: I know that you like structure a lot, so for all of these lectures I will try hard to stop sharply on time and not go over by more than one minute. That might mean that in some lectures I'm going to run out of time, and then I just move the last 5 or 10 slides or whatever to the next lecture, so I'm never going to skip going through them. If you want to understand the DNA structure a bit, this is something that I would like to ask you to do yourselves, because it's hard to understand in 60 seconds: understand the different components of these nucleotides. You have the different bases that you've probably seen in popular science at least, A, G, C and T, and they're binding to a small sugar, which is
also binding to a phosphate, and then these phosphates are linked into this long chain. That was determined by Levene already in the early 1900s. There are a bunch of different ways you can look at the structure, but the whole point of the components here is this: each building block is called a nucleotide, with T, and you can have a number of phosphates there, and you have the bases, which are the actual parts connecting to each other. Unfortunately this was the only one I printed in your handout, but I think this might be a better picture; you can find this online. So the whole thing is called a nucleotide, with T, and that consists of a phosphate, a sugar and a nitrogenous base; the sugar and the base taken together are called the nucleoside, with S. A classical exam question: I will show you the molecule and ask you to name what the different components are. These have been known for a long time, since the early 1900s, and during all the years people spent studying this, Erwin Chargaff found a very interesting result: no matter what organism you have, you have different ratios of all these, but it appears that adenine and thymine always occur at roughly the same concentration, and guanine and cytosine at roughly the same concentration. What that concentration is varies, but they always occur in pairs, and that was the one key discovery that two other scientists used. So there are a number of problems with this structure, even without the Chargaff rules: these outwards-facing bases would need to make hydrogen bonds with water. It's biology: it's noisy, and you see, this is the problem for you as physicists, because you expect to see this number with 14 decimals, right? 26.0 and 23.9: is that the same, Lucy?
Yeah, that's how a biophysicist thinks, because it's 5%, just a 5% measurement error. That's awesome. Yeah, there should be plus-minuses everywhere, but yes. But also, you might have ended up with a bit of RNA in the sample, right? There are 110 things that could have gone wrong here. So the key thing is to focus on the first principle, the simplest possible explanation. The point here is to compare: 19 versus 30, that starts to be a significant difference. Those are the patterns. The problem with this structure is that you would have electrostatic charges: the phosphates are negatively charged, so you're going to take lots of negative charges and put them right next to each other, and that's going to be bad. There are going to be some steric clashes, that is, atoms will bump into each other if you try to build a model of this. So it was not necessarily wrong, but it's not obvious that this model would be so great. And at this point there were two other scientists that stepped in, and depending on which way you tell the story, you can argue that they pretty much stole Rosalind Franklin's discovery. There is probably some truth in that, because I don't think that they were entirely scientifically honest about it, but I would also say the key difference is that they took somebody's data but they spent more time thinking about it, and they came up with one amazing discovery. And there is a reason I'm going to pass around this paper to you; I would suggest that you read this until tomorrow. It's a long paper: two pages, in Nature in 1953. They came up with the idea that the reason why these must occur in pairs is that they must bond to each other, and suddenly you have an obvious explanation for the Chargaff rules. Then they started to build these molecular models you see there on the left, and it fit the structural data completely. And the point is that this explains the Chargaff rules; there was no obvious reason in the Linus Pauling model why we would have the Chargaff rules. And there is a
beautiful formulation towards the end of this paper that you should remember. Basically, I don't remember exactly: it has not escaped our attention that the proposed mechanism would also provide a mechanism for inheritance. Period, that's it. That's basically the argument: this is the genetic material of mankind, and that sentence is indicative of the largest discovery in the 20th century. This structure is also heavily stabilized by hydrogen bonds; it's hydrogen bonds that are tying these pairs of bases together, and the hydrogen bonds again provide the molecular basis of Chargaff's ratios, his rules. There are a ton of other stabilizing features. If you look at that structure you even have so-called pi-pi stacking; I'm not sure if you've seen that in a biophysical chemistry course or something, but you have aromatic rings, and these aromatic rings on the bases will interact with each other in very favorable ways when they're stacked this way. DNA is a surprisingly stable molecule. It's also a slightly more complex molecule: why would you have four bases? Being physicists you probably work with computers; how many states do you have in computers? Two, binary. So why would you have four?
That's more complicated. So if you only had two bases, and let's say that you have eight positions, in eight positions you can encode 256 different states. If you have four bases, in eight positions we can encode 65,536 different states. So it's because of more compact storage when you allow slightly more levels. This is incidentally what you use in modern flash storage too: you have more than two levels, you have multiple levels. So it's a very efficient way of storing information in molecular structures, and there's a ton of information in DNA that we won't have time to go into. Of course, long after the initial discovery, we've now been able to sequence whole genomes, and on the top you have an amoeba and at the bottom, mankind. There's a huge variation, diversity, here in how large the genomes are. A small genome could be just a few thousand base pairs; there are bacteria that are in the ballpark of 10,000 base pairs, or at least 100,000. Mankind: how large are your genomes? That's a number you should know: 3 billion base pairs, again plus or minus 10%. 3 billion base pairs. How many proteins does that correspond to? Because each such gene... well, the genome will code for proteins. How many proteins do you have in your bodies?
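The counting argument about two versus four bases can be sketched in a few lines of Python. This is only an illustration, not part of the lecture material: the function names `n_states` and `bits_per_symbol` are made up for this example, and the genome estimate assumes an idealized 2 bits per base pair with no redundancy, which is an upper bound rather than a statement about real biological information content.

```python
import math


def n_states(alphabet_size: int, positions: int) -> int:
    """Number of distinct sequences of a given length over an alphabet."""
    return alphabet_size ** positions


def bits_per_symbol(alphabet_size: int) -> float:
    """Information carried by one position, in bits (idealized)."""
    return math.log2(alphabet_size)


if __name__ == "__main__":
    positions = 8
    for label, size in [("two bases (binary)", 2), ("four bases (DNA)", 4)]:
        print(f"{label}: {n_states(size, positions)} states in "
              f"{positions} positions, {bits_per_symbol(size):.0f} bit(s) per position")

    # Rough capacity of a 3-billion-base-pair genome, assuming 2 bits per base pair.
    genome_bp = 3_000_000_000
    genome_bytes = genome_bp * bits_per_symbol(4) / 8
    print(f"idealized capacity: {genome_bytes / 1e6:.0f} MB")
```

Under these assumptions, the 3 billion base pairs quoted above come out at well under a gigabyte of raw storage, which gives a feel for how compact the four-letter encoding is.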
For a long time we didn't know; we were guessing, we were guessing a lot. So when these first structural... and also, if you look at the previous slide, do you think that we are the most advanced organism here? There are certainly some plants and everything that have 100 billion. As we were able to sequence the entire human genome, the sequencing was just the first step. The second step, which has gone on for over 20 years, was trying to make sense out of these sequences: what proteins do they code for? There are some deep questions, in that most of the DNA in your cells doesn't code for anything at all, or at least it doesn't appear to code for anything. So like 90% of your DNA is kind of strange dark DNA that is important but doesn't appear to code for proteins directly, and that has been one of the long debates. Originally it was thought that just 1% of your genome codes for proteins; today it might be 5 or 10%. I think, yes, we have a later paper on this where it is argued that 8.2% is coding for, or at least related to, function. But something like 90% is just strange DNA. It's not useless. Not all organisms have this: in a bacterium, virtually 95% of the DNA will code for proteins, so we are different from bacteria here. And the general consensus today is that most of this DNA is regulatory, so you have pieces of DNA that can kind of shut off RNA, so that the DNA can control whether a protein should be expressed when you're young or when you're old. There is the classical example of fetal hemoglobin, which has higher affinity for oxygen because the fetus needs to be able to steal oxygen from the mother. Now, the second we are born we no longer need that fetal hemoglobin, and then we shut off the gene. A bacterium can't do that, but a human can. This is not quite science fiction, but very much active research, so it's a bit of a mess. To come back to the question of how many proteins we have: that has been going down year by year. We thought at first it was 50,000 to 100,000; the latest number sits
at slightly below 20,000. So as fancy as you think you are, you are not that particularly fancy: there are only 20,000 small molecules that determine everything about us, from the color of our eyes to the hair to our length, everything. If you compare it to something else, a Christmas tree, given the season: how many proteins does a Christmas tree have? Take a wild guess. Good or bad, but wrong: 200,000. So we are roughly one tenth as complex as a Christmas tree. We don't really know why; it likely has to do with evolutionary pressure. Well, I guess if you are a Christmas tree it's not good to be beautiful. But the most extreme evolutionary pressure is on bacteria, and that's likely why bacteria have to focus on so few, maybe just 4,000 or 5,000 proteins, because they can't afford to carry around things that are not absolutely necessary. But all this evolutionary stuff we don't know; we are going to come back to that, the evolutionary pressure on biology and everything. There is another type of molecule called RNA. Just to remember what DNA looked like: if you take that sugar, in DNA we have an OH and an H group; in RNA you have two OH groups. It's a tiny difference, but this will completely alter the stability and how it behaves. So while DNA forms the classical double helix, RNA typically prefers to be a single strand. I'm not going to go into details about exactly how these reactions work. Instead of the A, G, C, T bases, in RNA there are A, G, C and uracil instead of thymine. There is a molecular reason that I'll cover in 30 seconds, and it has to do with the fact that cytosine can actually be chemically degraded to uracil. It takes a while, but if that were to happen and you had uracil all over the place in DNA, the problem is this would lead to errors all the time that would build up, and that would be very bad. So in DNA, where we store information for a long time, it's very important that they are different. In RNA we typically store information transiently, so it
doesn't really matter that much, and that's likely the evolutionary reason. It also means that RNA is not really stable; RNA will break down. If you work with RNA in the lab you typically keep it on ice to make sure that it doesn't break down. What you get from these Neanderthal genomes and everything, that's always DNA. And there is a small famous story there about my uncle: in the 1960s there was a lot of research going on into understanding how RNA breaks down and everything. There were lots of labs in the world; he was sitting in Princeton working, and after a while I think he got bored that the entire rest of the lab was working with RNA, and he asked: can I start with DNA? And everybody told him: but that's really stupid, everybody knows that DNA is stable. So he started studying DNA and actually found out DNA is not necessarily stable; DNA breaks down too, and that's what got him the Nobel Prize a few years ago. And this is the reason for a whole lot of cancers and everything. DNA too is degraded, but orders of magnitude slower than RNA. It's Tomas Lindahl. I think we covered the structures already; you've seen the DNA. We could plot RNA that way; do you think that molecule would be stable? It would never look that way. What RNA will do is coil up; it's self-complementary, so the single chain will coil up into some sort of complex structures, and these are much floppier than DNA and they exist in many more conformations. There are a bunch of different places where RNA occurs. If you look in the ribosome, the protein factory in the cell, that's actually a mix of protein and RNA. So the ribosomal RNA is the dark blue parts here and the... sorry, the dark blue is the small subunit and the dark red is the large subunit, and it's slightly darker versus brighter shades that separate the RNA versus the protein part, but it's not that good. So the lighter colors here should be the protein while the
darker ones should be RNA. So this is a semi-horrible mix of RNA and protein. You don't even see that here because it's so large. There is transfer RNA, which is used when we're moving these small building blocks that are going to help us to build the proteins; again, I'm going to come back to that in a second. And at some point now we're starting to have a lot of different molecules, and the place where all of these molecules are used is when we build proteins. And I figure, in particular since most of you are physicists, you might have seen this in upper secondary school but you've probably forgotten most of it, so I'm going to take a couple of minutes here and, well, repeat roughly how proteins are created and how genetic material works, but introduce it in a slightly different way. This is something called the central dogma of molecular biology. The two books here, actually I can pass around the books if you want to have a look at them and see if you like either better, both introduce this, but they introduce it in slightly different ways. So the central dogma of molecular biology can be formulated as: sequence leads to structure leads to function. The sequence is what you have encoded in your genome, in the DNA; that's pure sequence information. Well, it is encoded by molecules, but the sequence is A, G, C and T. This is converted through RNA to protein structure, and for a particular sequence of DNA you will have a particular sequence of amino acids in the protein, and that will determine whether this is an antibody or an ion channel or hemoglobin or something else. Given a particular structure, that will give it a particular function: for instance, hemoglobin carries oxygen, an antibody, well, that acts as an antibody, and a pump will maybe move ions across the membrane. But it's always: sequence, the DNA, leads to structure, the protein, which leads to the function, what the protein does. If you want to change the function we need to change the structure, and if you want to change
the structure, that means we have to change the sequence: sequence, structure, function. And the way this happens is that we need to start from DNA. Normally when you see DNA it always has this double helix, but if it's a double helix we kind of protect the bases on the inside, right? So to be able to read this you need to start by splitting the DNA. There is a particular combination of bases that says where to start reading; then we need to cut the DNA open, and we're going to need a small machine to do this. Remember what I said in the first part of last lecture: if you need a small machine to do something, what would your first guess be? It's a protein. So we need some sort of protein doing this. This protein of course is also encoded in DNA, so then we have a bit of a chicken-and-egg problem, but let's not worry about that for now. So this is a protein called Pol D, or DNA polymerase. This opens up DNA and it can then create copies of DNA. So then I have two strands: first you're opening here, and now you have one DNA strand there and one DNA strand there, and that's how we can replicate the information. That's good if you want to create more DNA, but then we would live in a DNA world where we have more and more DNA; it's not useful unless we can somehow read it. That we do with a slightly different protein, in a process called transcription: transcribe, read something, right? So then we need to open the DNA in a small piece, and this is a molecule called RNA polymerase. The polymerization process here is where we're stitching things together; that's why they're both called polymerases, and -ase is always an enzyme, a molecule helping you to do a process. But in this case we're not copying to a DNA strand; in this case we're leaving the DNA as is, because the DNA will close again, rewinding, but we're creating an RNA copy of the molecule. It's actually not just a single protein; it's an entire class of proteins that are related to each other. They
were... this structure was finally determined at high resolution in the early 2000s by Roger Kornberg at Stanford. It's quite fun, because I was a postdoc in this department, in a different lab, at the time, and of course around the year 2000 they were all working so hard on getting this structure. It's a completely different way of looking at this while it's happening, because people had no idea whether they would succeed; they had spent 20 years working up to this structure and nobody knew whether they would be successful. And of course then they were; coming back two, three years later they were super famous for this. They were in fact so famous that they got the Nobel Prize for this structure too. So at this point we now have this red RNA chain. We have the genetic information, but we've copied it from DNA to an RNA chain; that's still just a sequence of bases, and now it's fairly fragile: this will break down spontaneously, as I mentioned, because RNA is not stable. There is a second process: this RNA moves into a molecule called the ribosome. This is an old slide, a super old picture, so it's 2005, four years after I got my PhD. Again, in the early 2000s we had no idea what the ribosome looked like. So this ribosome, as a matter of fact, is a complicated structure, with pieces of both RNA and protein and everything. This takes the messenger RNA coming in, the small red chain there, and then this other piece of RNA, transfer RNA, is kind of small carriers, and each such carrier has bound one amino acid, and then you just have a sequence of these carriers, we're getting more and more of them, and what the ribosome does is that it stitches all these amino acids together into a long chain, and that's how we create the protein. There were a number of groups involved here, in particular Tom Steitz, Peter Moore and Venki Ramakrishnan, and the Nobel Prize came in 2009 for this, with X-ray structures. So we're talking
about fairly modern science here. Again, only 15 years ago these structures were not available at all. So what you effectively have here is triplets of bases, always, and each triplet of bases will uniquely code for a particular amino acid. It was originally Francis Crick who discovered that, and this concept is what we call the genetic code, which I'm sure you've heard about but you might not remember. This table you need to know by heart. Now, you destroyed my joke. You don't know it by heart, nobody does. Actually the sad part is that there are molecular biologists who know it by heart, but you're physicists, you're not going to know it by heart. It's completely useless knowledge to know by heart. So, if you have four bases and you put three of them together, how many different combinations can you have? That would require the molecular biologists to bring out the calculator, but you're physicists, I expect you to be slightly better at that part. Four to the power of three, right? So four by four by four, that's 64. 64 combinations. Do you know from upper secondary school how many amino acids there are? 20. That's not the same as 64. So did they lie to you, are there 44 more? No. So there are a couple of caveats here. First, we need to know when to stop. You also need to know where to start, but that can actually be redundant with an amino acid. But at some point we need to decide when to no longer code, and there are actually three of these combinations that are stop codons. So when you see a stop codon, we assume this is the end of a protein, we're happy. But even then there are 61 remaining. That does not mean that there are 41 amino acids you don't know of. So there is a built-in redundancy in the genetic code here, and this is also a classical exam question. If you look at the amino acids, I'm not sure whether you can see the table here, do you see that some amino acids are more common than others? For some amino acids there are at least
four, five, six different combinations that code for arginine, while the poor tryptophan has pretty much just one combination, right? So which amino acid do you expect to see more of in cells, tryptophan or arginine? And if you just were to measure it, do you think that you would see six times more arginine than tryptophan? You do. Again, within the normal 10-15% variation, this is what explains the relative abundance, i.e. how much we have of each amino acid. Not evolution or anything; it's all based on physics, mathematics, combinatorics, whatever you call it. It's a simple code. These are the building blocks, and we can combine them in different ways, but we can't change the building blocks. That's probably what DNA has done over 4.3 billion years. Now of course here we talk about your entire genome in general. There are of course some specific proteins that might not have any arginines, membrane proteins for instance, but these are the general numbers. Let's compare this with Lego. The relative distribution of all Lego pieces that are being produced by Lego Inc, that's one thing. The Lego pieces I use when I'm building a particular toy, that's a different matter. But those are the building blocks that are available, and the relative abundance of them in nature is determined by the genetic code. And again, that is Francis Crick's discovery. So to go from there: once we know what these building blocks are, and this was discovered in the 1950s, the other obvious thing is, can't we just determine the structures of these proteins with X-ray crystallography? Because we know that there should be a unique structure. Again, sequence leads to structure leads to function. If the structure is not unique, it's not going to have a unique function. So there has to be a well-defined structure that only depends on these building blocks. So if I just know the sequence of the DNA, I can translate that using this code to the sequence of the amino acids in the
protein, and then, if I have 100 amino acids, I just need to find a way to determine what the protein looks like. That seems like such an obvious problem, and people actually started on that in parallel with the discoveries of DNA. One of the big heroes in this field was Max Perutz. It took them 22 years to determine the structure of hemoglobin. Again, in hindsight it was worth it. Can you imagine starting a project now that you would finish when you're my age, and you're not sure it's going to work? Talk about pouring your passion into something, right? Again, today it's simple, because we know that there have been 500 other people, or 5000 other people in the world, who have determined structures. But to do one back then, that, for me, is what real science is about. The other thing is that we are so horribly spoiled with our computers. This is how these structures were determined: they were sitting with rulers and measuring the distances between every single atom and building them up manually. In addition to hemoglobin there is another protein called myoglobin that was determined in parallel; we will come back to what those differences are. Since then there have been a number of structures determined. They got the Nobel Prize for the first structures, of course. Ion channels: when I was a PhD student the first ion channel structure appeared, by Rod MacKinnon. I remember there was a small talk outside of Karolinska when the guy came over and presented his structure, and I and some other people went because I was a bit interested in membrane proteins. And it was so embarrassing, because nobody wanted to go out and have dinner with the guy, so the host there convinced me and another colleague: can't you join us at the pizzeria, it's a bit embarrassing otherwise. So we were four people sitting around the table. Three years later, of course, he got the Nobel Prize, and now there tend to be slightly more people who
want to meet him when he's in Stockholm. These are the structures that you have in, actually this is not the human channel, but these are more similar to the proteins that exist in humans, where you actually have ion channels controlled by voltage. Aquaporins: these are the water channels that determine how much water goes in and out of your cells. Peter Agre shared the Nobel Prize with Rod MacKinnon, actually not this one, in 2003. GPCRs are the ones that I talked about; Brian Kobilka, also at Stanford, got the Nobel Prize for those structures in 2012. And the way virtually all of these structures have been determined is through X-ray crystallography. That's what I said: just take your protein, grind it into a powder. Powder just means you will still have crystals, but they're going to be randomly oriented crystals. Then you put them in a very small droplet, a microgram of crystal or something, and then you use a very large facility like this, MAX IV down in Lund, a synchrotron. So you're using light: you have electrons circulating around in the ring, and when you use magnets to force the electrons to accelerate, so you're changing their direction, they're going to emit synchrotron radiation. So you get very high intensity X-ray pulses, and then you use that and collect the scattered rays. Not on film today, of course; you have computer CCD sensors, and then you're able to record that type of pattern. It's slightly faster than 22 years nowadays because we have computers to help us. But then we determine what electron density corresponds to that pattern, and then you use computers to build a molecular model. This can easily be a project that takes a few years. Just getting a crystal can take a year, if you can get it at all. But still, for a very long time the safest way to the Nobel Prize in Chemistry was to determine the structure of a new important building block in your cells, and there are probably still more in the pipeline. It's a bit sad: when I first gave this course and I had
all these GPCRs, I could say, oh, this is likely a future Nobel Prize, and it was so cool that the next time I gave the course I could say, and this was awarded the Nobel Prize last year. So that was a few years ago. There are other ways to determine structures, but historically nothing has come close to X-ray crystallography. And this is the beautiful thing: when things change in science, they change much faster in biophysics than in traditional physics. There is another way called cryo-electron microscopy. I have a paper on this that I'm going to pass around too if you're interested; it's three pages, but the first page is just an image, and this is a 2015 paper. Cryo-electron microscopy builds on the fact that, you've heard about or seen electron microscope pictures, right, and the whole idea is to replace light with electrons. If you just accelerate electrons to a few hundred thousand electron volts they will have very short wavelengths, and then you can image with electrons. That is amazing in materials science; you can literally see almost individual atoms and everything. So that way you can see structures that are as tiny as you want. The problem with that is that you only see that if you throw a lot of energy at it; you need to throw tons of electrons on this, and these proteins are fragile. So if you start to throw lots of electrons on the proteins they will break, and then you don't see the structure anymore. So you need to have very, very low doses of electrons, and you need very sensitive detectors. The only problem: if you're shining electrons on those detectors, the detector breaks. Most things don't like 500,000 electron volt electrons being shot at them. So for years everybody was joking about calling this "blobology" and everything, because you got these faint outlines, roughly, of the shape of the structure. But then through a sequence of events suddenly there was a generation of new, better detectors, the microscopes were better, so during two or three years it was literally like turning a page. So suddenly the method
was good enough, so today it's the hottest technique in the field, and we have two high-end microscopes. So this is how that works: instead of collecting diffraction patterns, you have a very, very thin film of frozen protein sample, maybe just 100 nanometers thick or something, and then we're literally using a microscope, well, it's an electron microscope, and then we have a real sensor here, and then we're collecting images of these proteins. We'll see them from 100,000 different directions. Now you have slice-through images of your protein from 100,000 random directions, and that you throw into a computer and tell the computer, please fix this. And the amazing thing is that it works. We're gonna, well, now I don't think I'm gonna cover it that much, but it's an example, right: modern experiments would not be possible without computers, and you can imagine the amount of physics and math that has gone into these reconstruction algorithms. So modern biophysics is in many ways more computational and mathematical than it is experimental. So there are some very cool proteins. Up on the left here, it's the TRPV1 protein. I'm not even gonna go through the details of what it does, but up on the left there, that gray outline was how well you knew the structure in the blobology days, and then Yifan Cheng, a few years ago, they were able to get 3.4 angstrom structures. Do you see all the detail? You can see the individual helices and everything, and we're gonna go through these components and structures on Friday. We can find the binding sites, we can understand what these molecules do. And in particular these molecules, they're actually pain and heat receptors. So if you're eating chili peppers, this molecule, capsaicin, binds to these channels and causes the channel to open, and that is how your cells sense pain or heat. And then of course there are hundreds of structures like this. Virtually every issue of Nature now has one or two high-impact papers with a new structure determined by cryo-electron microscopy. So the unifying factor with all these
things is that by having the right sequence of DNA, or a particular sequence, we can build a chain. Actually in this case it's four different chains, but that's a complication we can go back to later. We can build a chain of amino acids, and this particular sequence of amino acids will magically, for now, form some sort of structure. And what I'm gonna argue, but that I can't prove to you yet, is that this process is based entirely on physics. You might think that's obvious, but it was not at all obvious when people started studying this 60-70 years ago. They went, oh, maybe the cell has some sort of very special machinery for building all these proteins. But the fascinating thing is that the entire structure, the way that this protein always finds the same state, is based entirely on physics, which is remarkably cool. But once it has that particular structure, that's what creates the specific binding site here that causes it to bind that molecule. The way the entire molecule looks is what gives us the properties. If it's binding that molecule in the right place, the binding of that molecule will change the state the entire ion channel would like to be in. So suddenly, when we have bound this molecule, it's better for the channel to move over to another state where it's open. Then, let's say a minute later, but probably a fraction of a second later, this molecule unbinds, and now it's better for the channel to move back to the closed state again. So these are stupid molecules, they're literally just molecules, right? But because they're large, they're way more complex than say water or carbon dioxide, they can exist in multiple different states. They literally move. And that's a complication, because how do you see those motions in X-ray crystallography? We can't, right? You could imagine determining two structures. First we determine the structure of just the protein, and then I can take my crystal, or before I make the crystal I can soak it, I can add lots of
this molecule, and I hope that this molecule will be bound everywhere in the crystal. And if I'm really lucky, maybe I can see one structure with the molecule and one without it. But how do we actually see the process when it's moving? Historically the answer is, you can't. And this is where the modeling comes in: understanding the difference between different states. Virtually all modern pharmacology and everything is software, because you can imagine that if all these molecules steer and control your nervous system, what if we could deliberately steer them? What if we could create a molecule that's a better anesthetic or something? But that will literally require us to be able to go in and fiddle around with the process a little bit. So what I'm going to talk about is that the reason for this diversity is this polypeptide sequence. Polypeptide: I spoke about amino acids. Sorry, there are lots of different names here and some of these things will just slip out of my mouth. A peptide string is just a sequence of amino acids, and the reason why I call it a polypeptide, I'm going to come back to this, is that these amino acids polymerize through something called peptide bonds. So each amino acid ends with a COO minus group, and then the next amino acid typically starts with an NH2 plus group, sorry, NH3 plus. If they move together, and I think we have, yes, I have an example of this, so I have these two amino acids: if they move together they can release a water and instead form a larger molecule that sits together. So I have one amino acid there, and if this happens again and again, this is what happens in the ribosome, then I create a very long chain of amino acids. So the point is that there are only 20 amino acids, but if I now have a string of say 10 of these, there are now 20 to the power of 10 different combinations, and that starts to be fairly large numbers. So by connecting them in specific strings I get much larger diversity. But just the string is
not enough. The string just means that I have a string of amino acids. What tells different proteins apart is the side groups that I had on the other slide; again, that's literally what makes one amino acid different from the other. You have a tyrosine there in the middle, you have an alanine there, and you have a glycine there. So different side chains, the ones in red here, will cause a diversity, and we will come back to this too in the course. This whole polymerization process is fairly simple. You deal with polymers every day, like plastic. Plastic is just polymers. Ethylene is a small molecule derived from ethanol, and you might have heard of polyethylene, PE plastics, every single shopping bag. But those are fairly boring, right? The reason why they are boring is that every single monomer is the same; it's just billions and trillions of copies of the same monomer. For amino acids you can have different monomers, and because they are different we get much greater specificity and much cooler molecules. But fundamentally, from the physics point of view, a whole lot of this is similar to plastic. When it comes to sequencing, it was Fred Sanger in 1952 who argued that proteins have a unique sequence. He managed to sequence it. There is a paper, Sanger's first paper on sequencing, that he got the Nobel Prize for, in Canvas if you want to read it. It's a bit more chemical, so I didn't include it in the things I hand out, and I'm not going to cover it in detail. But to give you an idea: at SciLifeLab we sequence roughly the equivalent of one human genome per day nowadays. The first human genome was a worldwide endeavor that took 10 years, and now we do it in one day at one site. And Fred Sanger sequenced insulin, and that probably took him 3 or 4 years. So there is no other method that creates more biological information today than sequencing. It's exceptionally powerful. These are old structures, beginning of the 1900s. So the fascinating thing: the people, all this
research on the proteins, on the one hand protein sequence, on the other DNA structure and DNA sequence, it all happened in parallel, and it wasn't really until the mid 50s or early 1960s that we realized that all these things fit together. I mentioned that the DNA structure was arguably the world's greatest discovery in the 19th and 20th century. Do you remember from the slides when the Watson and Crick paper showed up? You have the paper in front of you. 1953. When did they get the Nobel Prize? It took almost a decade. So how on earth could it be that it took a decade to award the greatest invention, well, arguably the greatest discovery of the century, a Nobel Prize? At the time we didn't know, because the Nobel committee deliberations are secret; they're secret for 50 years. So about a decade ago this was lifted, and it turned out they were not even nominated until the early 1960s. Nobody understood how important the DNA structure would be. And there's this fallacy: we tend to look at the future, but we think the future will look like the past. Science is obvious when you look in the rear-view mirror, but in 1958 nobody could imagine that DNA would be important. It was just another small molecule that could be useful. A small paper: they determined the structure of a specific salt, might or might not be important. And beware of this; the challenge with research is that it's hard to make predictions, in particular about the future. Today it's completely absurd that nobody bothered nominating them. It was Sir Lawrence Bragg who nominated them. So, the summary today: you need to understand a little bit why protein structure and function is important, how they relate to each other, and how we determine structures today. Understand the hydrogen bonds; we're going to come back to that. We didn't talk that much about it here, but the hydrogen bonds will be important later on. We will go back on Friday and talk about fundamental physical interactions and how they give rise to the hydrogen bonds. You need to
understand DNA and RNA structure and function, because even if we're just going to work with proteins, this is where proteins are coming from. And you also need to understand the central dogma of molecular biology, and that leads to one question here. So what was the central dogma? So those are the molecules, but if you think in terms of information: sequence to structure to function. So do the arrows always go in that order, sequence to structure? Do the arrows ever go in the opposite direction? Does the sequence ever depend on the function? In principle no, not when we talk about the central dogma. But there is a very important arrow; everybody is aware of this arrow, but we don't point it out, and you need to think about it. There is an arrow that goes this way. What is that? Evolution, yes. So this is what Charles Darwin found, and that's another one, that's roughly a hundred years before we knew of any of these structures, right? So evolution literally means survival of the fittest. If you have sequences that are more or less randomly created, and that leads to structures that have a good function in some sense, these individuals will have an advantage in evolution. If they get more offspring, those random deviations will start to become more common in the sequence, and then you have an amplification there. Which is also quite fun, because biology also obeys physics, but it's not static, in the sense that we can change the sequences, and biology does change the sequences. Reading-wise, for the first lecture here I've actually found chapters in both Nordlund and Finkelstein that cover this reasonably well. Finkelstein doesn't cover DNA and RNA at all, but again, to a first approximation, for the general stuff you can find a ton of information online. I'm going to update the following lectures too, already today and tomorrow, so that whenever possible I will have reading instructions for both books. On Friday we're
going to go deeper into the sequence-structure-function relationship, and we're going to talk about, in a small protein like this, how do all these things interact, why do they interact, and why does that lead to stabilization of certain functions. Before that: Magnus started to do this last year and I think it's a good idea. If you want to be successful in this course, this is what came up in the course evaluations last year: what did they recommend you to do? Actually I think this is from two years ago. They thought that attending class helped a lot in understanding the book contents. If you prefer to see this online, be my guest, that's up to you. They also strongly recommend that you start reading and do the hand-in tasks in the beginning. Don't save the hand-in tasks just because they have a deadline on February third. I promise you that we're not going to grade you down because you completed them earlier, and you will have a chance to interact way more with us. Oh, somebody also liked the lectures. The other part, this is a famous quote, and actually there were some other quotes by Ralph Waldo Emerson: nothing great was ever achieved without enthusiasm. I am frequently super busy, but when I am teaching you have my attention and enthusiasm. I love to teach, although my time is limited. If you want to be successful, not just in this course but in research, life, and anything, whatever you do, do it with a full heart. I will start sharply on time, I will end sharply on time, I will be available over email, and again, while we give this course you have my attention. I will respond quickly unless I'm on a flight. But that also means I don't want you to show up 10 or 15 minutes late. Things happen; in that case go in quietly and sit at the back. But take responsibility for your own learning, and I will make sure that I deliver to 110%, and I think the same goes for Lucien Burke. The other thing that I will do, sorry, now I used that last minute: study questions. Every lecture there will be a bunch of study questions. If you know
those study questions, you're going to pass the multiple choice questions on the exam. I promise the multiple choice questions will be taken from these. So, let's say, know these. And that's it for today. See you on Friday.
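As a check on the genetic code arithmetic from the lecture (four bases read in triplets, three stop codons, 20 amino acids, six codons for arginine against one for tryptophan, and 20 to the power of 10 possible strings of 10 amino acids), here is a minimal Python sketch. It assumes the standard genetic code written in the usual compact one-letter form; the names `CODON_TABLE`, `degeneracy` and `translate`, and the short example mRNA, are made up for illustration and do not come from the lecture.

```python
from collections import Counter
from itertools import product

# Standard genetic code in compact form (assumed here, not taken from the slides).
# Codons are enumerated with base order U, C, A, G at each of the three positions;
# "*" marks a stop codon, and the other letters are one-letter amino acid codes.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # codons starting with U
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)

# Map every triplet (e.g. "AUG") to its amino acid or stop symbol.
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# How many codons code for each amino acid (the redundancy of the code).
degeneracy = Counter(CODON_TABLE.values())

n_codons = len(CODON_TABLE)        # 4**3 combinations
n_stop = degeneracy["*"]           # stop codons
n_sense = n_codons - n_stop        # codons that code for an amino acid
n_amino = len(degeneracy) - 1      # distinct amino acids (excluding stop)


def translate(mrna: str) -> str:
    """Translate an mRNA string codon by codon until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(n_codons, n_stop, n_sense, n_amino)   # 64 3 61 20
    print(degeneracy["R"], degeneracy["W"])     # 6 1  (arginine vs tryptophan)
    print(translate("AUGCGUUGGUAA"))            # MRW
    print(20 ** 10)                             # 10240000000000 strings of length 10
```

The 6-to-1 codon ratio only suggests a first estimate of relative abundance; the sketch counts codons and says nothing about measured frequencies in real genomes.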