everyone, if you're watching this on Moodle or later on YouTube, if you have any comments and stuff, just ask them. So let's go back in history a little bit, because it's good to know what we discovered in the last couple of hundred years and why we actually believe that this central dogma is true. So the first guy is Friedrich Miescher. In 1868 he was doing research, and Friedrich Miescher was also in the DNA lecture, of course. When they extracted the nuclein from white blood cells, they figured out that there are two different types of substances. They already knew that proteins existed, but besides the protein they extracted nuclein, and then they figured out that there are different chemical properties to this nuclein, which led to the discovery that there are two different types of nuclein within white blood cells. So this was one of the big discoveries from Friedrich Miescher, and we will talk about the different chemical properties that they have. And then it is very quiet, because RNA is a really difficult substance to work with: it hydrolyzes very quickly, it breaks down very quickly, and even nowadays when you work with RNA you have to work super, super clean, right? So you have a clean bench and you have to get rid of all the RNases, so the proteins that break down RNA, because they are literally everywhere. But in 1959, more or less around the same time as the structure of DNA was, well, not invented, because it's not an invention, but when they figured out what the structure of DNA was, they also figured out that messenger RNA carries information that directs protein synthesis. This was discovered by Severo Ochoa between 1950 and 1959. The discovery that RNA actually carries information and directs the synthesis of protein led to the discovery of the ribosome. So the ribosome is this big, more or less 
protein molecule, which has not only proteins in there but also RNA and other things, and this ribosome makes new proteins for the cell. This was discovered in more or less 1960. In 1965 we have Robert W. Holley, who figured out that transfer RNAs are the physical link between RNA and proteins. We will be talking about transfer RNAs: they are this hybrid molecule which on the one side is an RNA molecule but on the other side has an amino acid attached to it, and this cloverleaf structure with this amino acid allows the ribosome to synthesize any type of protein that it wants. The first purification of RNA polymerase was done by Arthur Kornberg in 1957, and RNA polymerase is of course the thing that makes RNA from DNA. So, similar to the ribosome, which translates RNA into proteins, RNA polymerase is the thing that binds the DNA and then moves along the DNA, transcribing DNA into RNA. In 1983, the year that I was born, we have Kary Mullis, who invented the polymerase chain reaction, which allowed us to look into the structure of DNA much more, and also into the sequence of DNA, because PCR is more or less the method in molecular biology which made it possible to amplify DNA into large amounts so that we can really do something with it. In 1989, so a couple of years later, the polymerase from the thermophilic bacterium Thermus aquaticus, called Taq, was purified, and this made it possible to do polymerase chain reaction much more easily, because polymerase chain reaction is a temperature cycle, so you need different temperatures. You start off at a temperature where the DNA opens up, then you have the protein, the DNA polymerase, binding to the DNA, and then you have the extension, the copying of the DNA, and all of these things work at different temperatures. Thermus aquaticus is actually an extremophile, so it lives in hot springs and near these volcanic vents, and there 
it works under relatively high temperatures. So this bacterium is able to survive temperatures of 90 to 100 degrees Celsius, and because of this the polymerase that it has is very, very stable, so it doesn't easily break down. This was a major invention, which allowed polymerase chain reaction to be used literally in every lab in the world, and still the Thermus aquaticus polymerase is probably one of the most sold proteins in the history of molecular biology. In 1978 there is the discovery, which we already talked about, that we have introns and exons, right? So the genetic code of the DNA does not code one to one into RNA, but there are more or less gaps: there are sequences within the DNA which are transcribed into RNA, but the RNA that is transcribed does not code for part of the protein, so it has more or less a regulatory function. Then in 1973 we have the prediction that there is something at the end of the DNA which is called a telomere. At the end of the DNA, the DNA is coated with a cap, and this cap protects it from chemical degradation, because you can imagine that if you have a double helix then the end of the double helix is more or less open, so all kinds of chemicals can come in and modify the DNA there. To prevent this, at the end of the DNA you have these long caps called the telomeres. These are long stretches of short repeated sequences, and these stretches need to be maintained, and the maintenance of the ends of the DNA is done by telomerase. Telomerase is a protein which uses an RNA template to synthesize new DNA at the end of the DNA molecule, which keeps the DNA molecule stable. In 1984 we have the transposon discovery, so the discovery that many mobile DNA elements use an RNA intermediate. This RNA intermediate was discovered around 1985. We knew that there was something which is called 
transposons, or jumping genes, but the fact that this uses RNA as an intermediate was discovered much later, in the 1980s. And small RNA molecules regulate gene expression by post-transcriptional gene silencing; this is something that we will talk about in the rest of the lecture as well. RNA is not a dumb molecule which just transfers information from the cell nucleus into the cytosol, where the ribosome makes proteins; RNA itself also has a regulatory function. So the cell uses RNA to do things like counting how many proteins it makes, but it also uses RNA as a feedback mechanism to silence genes which it does not need to express at that time. Furthermore, we have non-coding RNA that controls epigenetic phenomena, which is called ncRNA, described in 2001 by Eddy. The epigenetic phenomena are the way that DNA is wound up and made inaccessible: some parts of the DNA are open and can be used for protein production, other parts of the DNA are not necessary, so they are more or less wound up. One very common example which people always talk about is the silencing of the X chromosome in females. Females have two X chromosomes, but in every cell one of the two is silenced, and this happens very early in development. So when you are still a blastocyst of like 16 to 32 cells, at that point in each cell a decision is made that one of the two X chromosomes needs to be shut down and silenced, because otherwise you would be producing twice the amount of protein, right? A male only has one X chromosome, and you only need one X chromosome to make all of the proteins that you need. If you have two X chromosomes and both of these were active, that would be a massive issue, because then you would have double the amount of proteins coming from the X chromosome. So to prevent this, 
at like a 16 to 32 cell stage, when you're still a blastocyst, the decision in each cell is made to silence one of the two, and this is at random. This causes all kinds of things that we take for granted. For example, in cats this leads to the fact that male cats only have two colors and female cats can actually have three colors, because the X chromosome in cats encodes the coat color, and having two X chromosomes means that you can have two different coat colors as well as the standard coat color coming from the autosomes. So that means that female cats can have three types of color while male cats can only have two. If you see a cat and this cat is multicolored, so it has three different hair colors, you can safely assume that this cat is a female cat, while a cat which has two colors can be male or female, but has a higher likelihood of being a male, because females generally have three colors. This has to do with the whole RNA silencing of the X chromosome.
So, a little bit of a look at DNA in detail. We know this already because we had the DNA lecture, right? DNA has thymine, adenine, guanine and cytosine, so G, A, T, C. When we look at it in detail we see that we have the sugar-phosphate backbone, so we have phosphate molecules which couple the sugars together and form the backbone. Of course the backbone is also there at the other side. And what we can see is the chemical group here: this is an OH group, so this is a slightly acidic group, right? It loses a hydrogen, so it makes the water surrounding it a little bit more acidic. And of course we have hydrogen bonds which keep the different base pairs together. The thing to remember here is that if we have a GC pair, we have three hydrogen bonds keeping them together, while if we have an AT pair we only have two hydrogen bonds keeping them together, which means that a GC pair 
binding in DNA is a stronger binding, around 50% stronger, than an AT binding. So that is just something to remember about DNA. And of course you see the nice structure of the molecules as well, so you have base pairs and you have the backbone. And why are we talking about the backbone and how DNA looks? Because we want to compare DNA with RNA. On the one side here we see DNA, so the standard DNA molecule: we see the backbone which goes around, and we see the base pairs in the middle. The main difference between DNA and RNA is that DNA almost always comes in a double helix, while RNA almost always occurs as a single strand. Actually, within your cell you have protection mechanisms which detect double-stranded RNA and break it down immediately, because double-stranded RNA is not used in eukaryotic cells; it is only used by viruses and bacteria. So for a eukaryotic cell it is very important to have single-stranded RNA; double-stranded RNA is directly degraded. And this is also where RNA silencing comes in, right? Because RNA silencing is a small RNA molecule binding to a big messenger RNA; it forms a double helix in a way, and then this is detected and directly broken down by the cell, because double-stranded RNA is very, very dangerous. We can see here the different bases of the RNA molecule. The first three bases are identical to the DNA's; the only difference is that the T, the thymine base, is coded differently in RNA. In RNA we don't have a T but we have a U, which is called uracil, and this is slightly different: it is the same as the T but it doesn't have this CH3 group, so it misses a side chain, which makes it a little bit different and also gives it some different chemical properties. So if we just summarize the differences between RNA and DNA, then the strandedness: well, RNA is 
almost always single-stranded, while DNA is almost always double-stranded. The bases: RNA contains uracil, versus DNA, which contains thymine. The sugar in the backbone is a little bit different: in DNA it's deoxyribose and in RNA it's just standard ribose, and RNA is much more reactive because of that. If you look at this part here, right, so you have the base and then the sugar, and you look at the structure of the sugar, then ribose has an additional OH group where deoxyribose doesn't have the OH group, and this means that RNA hydrolyzes much more easily, so RNA has a much shorter lifetime than DNA. Another major difference is the size of the molecules. DNA molecules are big, because they are chromosomes, they are massive structures, while RNA is really relatively small compared to DNA. Location-wise, the DNA is confined to the nucleus, at least in eukaryotic cells, right? In bacteria this is not really the case. But the RNA molecule is a molecule which is used for information transport: it is not a stable molecule like DNA, which is more or less confined to the cell nucleus; RNA is created in the nucleus and then moves to the cytoplasm, where of course it is bound by the ribosome and then causes proteins to be made.
All right, so different types of RNA. There are many, many different types of RNA, and RNA is a really difficult molecule to work with because it's not very stable. DNA, when you cool it, can be stable for hundreds of years, and in theory we could even extract DNA from dinosaur bones that we find. RNA is a molecule which has a very short lifetime, so it generally is made, translated, and then it breaks down, because the molecule itself is not very stable 
because of the fact that it's just an intermediate. So the different types of RNA that we will be talking about are: messenger RNA, where we already know a lot, but I wanted to talk to you about the difference between hnRNA and mRNA, so messenger RNA is called mRNA, but before processing it is also called hnRNA; the transfer RNAs, the tRNAs, which are used to synthesize proteins; ribosomal RNA, which is the RNA that is part of the ribosome; small nuclear RNAs, called snRNAs, which are involved in splicing and other reactions; and catalytic RNAs, which are called ribozymes. These are very interesting molecules, because they actually have an effector function and break this biological dogma, right? The biological dogma is: DNA is storing the information, RNA is carrying the information and guiding protein synthesis, and proteins are the effector molecules. But that is not really true, because RNA molecules can also be effector molecules; they can also participate in chemical reactions and make things happen within the cell. I also want to talk a little bit about microRNAs, about small interfering RNAs, and about non-coding RNAs. These are all different types of RNA: in the way that we classify them it's all the same molecule, but they are different classes, so they have different functions and different properties. So when we talk about messenger RNA, right, we say that messenger RNA is the carrier: messenger RNA is made in the nucleus and then it is transported into the cytosol. We have to remember that when we talk about eukaryotic cells, messenger RNA comes with introns and exons, right? Only the exons code for protein; the introns are not coding for a protein but generally have regulatory functions, so they 
have to be removed from the final product, right? And what can happen is that different exons can be retained or skipped to produce different proteins from the same gene, so a single gene can sometimes produce like 20 different proteins. This is because skipping an exon will leave out part of the protein, right? So the protein can be very long, but sometimes it's a lot shorter. The introns themselves are removed by the spliceosome complex, which removes introns from pre-mRNA. Pre-mRNA is also called hnRNA: when messenger RNA still has the introns, it is called hnRNA instead of mRNA. So of course we need to get all of these introns out, which is done by the spliceosome, and after that we have mature messenger RNA, abbreviated as mRNA. We then call this an ORF. The ORF is the open reading frame, and the open reading frame is the genetic code: three bases in the ORF code for a single amino acid, the next three bases code for the second amino acid, and then bases seven, eight and nine code for the third amino acid. So the ORF is the part that makes sense. We go from something which has exons and introns to something which only has the exons, and now that the exons are directly behind each other we call this part the coding sequence, the CDS. Furthermore, we have the untranslated regions, the UTRs, which are there both in the pre-mRNA, so the hnRNA, and in the mRNA. These untranslated regions are important for the ribosome to detect the messenger RNA as being messenger RNA, so the UTRs are there for the ribosome to recognize: oh, this is a messenger RNA that I can translate into a protein. But they are also there to control the lifespan of the RNA, which means that if you have a long poly-A tail, so if there are a lot of A's at the end, then every time that a protein is made one of the A's is cut off, and when all of the A's are gone, like the 
telomeres at the DNA level, right, where if the telomeres are gone the DNA breaks up and degrades, the same thing happens with RNA: if all of the A's at the end of the messenger RNA are gone, then no more proteins are produced. From a single messenger RNA you can literally produce thousands of proteins, so if you want to have a thousand proteins, then you are going to produce one messenger RNA with a long poly-A tail. Every time that a protein is made an A is cut off, and in the end, when all the A's are gone, the RNA itself is degraded and protein production stops. The nice thing about the codon structure in messenger RNA is that it is very stable; it is stable across different species. One of the things which is more or less universal in all living things is that the start codon, so the beginning of the protein production, is always encoded by A, T and G. Of course A, T and G is not how you read it on the messenger RNA, because RNA does not have the T: when you look into the genome it says ATG, but if you look at the messenger RNA arriving at the ribosome it is actually AUG, with U for uracil. There's also a stop codon, so there's also a signal to the ribosome that it needs to stop the protein at this point, right? Because we need to start making the protein, and then there needs to be a signal for the ribosome to stop. This stop codon is more or less a degenerate codon, because it can be UAG, UAA or UGA, and this is very species dependent. So based on the stop codon being used you can more or less figure out if this is messenger RNA from a dog or messenger RNA from a human, because humans use a different stop codon than dogs. That might not be true, but you get the drift: stop codon usage is dependent on the species that you're looking at.
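As a rough illustration of the start and stop codons just described, here is a minimal Python sketch that is not part of the lecture. It assumes the input is the coding strand of the DNA, so transcription is simply T becoming U, and it reads codons in frame from the first AUG until it meets UAA, UAG or UGA. The function names `transcribe` and `first_orf` and the example sequence are made up for illustration.

```python
# Minimal sketch (illustrative only): transcribe a coding-strand DNA sequence
# into mRNA and read the first open reading frame, from AUG to a stop codon.

STOP_CODONS = {"UAA", "UAG", "UGA"}


def transcribe(dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return dna.upper().replace("T", "U")


def first_orf(mrna: str) -> list[str]:
    """Return the codons of the first ORF, from AUG up to (not including) the stop.

    Returns an empty list if there is no AUG or no in-frame stop codon.
    """
    start = mrna.find("AUG")
    if start == -1:
        return []
    codons = []
    # Step through the sequence three bases at a time, starting at the AUG.
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            return codons
        codons.append(codon)
    return []  # ran off the end without an in-frame stop codon


dna = "GGCATGAAAGCTTGATTT"  # made-up example sequence
mrna = transcribe(dna)
print(mrna)             # GGCAUGAAAGCUUGAUUU
print(first_orf(mrna))  # ['AUG', 'AAA', 'GCU']
```

The sketch ignores UTRs, the poly-A tail and splicing; it only shows how reading in steps of three bases turns a sequence into a list of codons.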
So UTRs, like I said, control the lifespan of the RNA, and this is generally done by the poly-A tail: the longer the tail, the more proteins will be produced from a single messenger RNA. All right, so that's everything that I wanted to say about hnRNA and mRNA. The next step of course is the transfer RNA. Transfer RNAs are 73 to 94 nucleotides long, and they are the physical link between the RNA sequence and the amino acid sequence of the protein. What we see here is a standard cloverleaf, that's how they call it, and the cloverleaf has a very specific structure. Here at the top, the ACC with the OH group, this is the position where the amino acid is coupled to the tRNA, right? Because the tRNA is an RNA molecule, but at this position at the ACC there is an amino acid coupled to it, so it's a hybrid: it's not pure RNA, it's RNA plus an amino acid. This is called the acceptor stem, so the acceptor stem is actually holding the amino acid. Then we have the D loop, which is important for the transition through the ribosome, and then here we have the anticodon loop. The anticodon loop determines which amino acid is encoded by which three bases: these three bases here in the anticodon loop bind the opposite, complementary bases on the messenger RNA, and when they bind there is a recognition, and this recognition makes it so that the amino acid is incorporated into the protein. We then have the little variable loop here and the TΨC loop, which are loops that are again important for the ribosome to recognize tRNAs and guide the tRNAs to the correct position so that it can synthesize proteins.
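To make the codon and anticodon matching a bit more concrete, here is a small Python sketch, again not from the lecture. It assumes plain Watson-Crick pairing (A with U, G with C) and ignores wobble pairing and modified bases, which real tRNAs do use. The name `anticodon` is just an illustrative choice.

```python
# Minimal sketch (illustrative only): which anticodon pairs with a given mRNA codon.
# Assumes strict A-U and G-C pairing; wobble pairing is deliberately ignored.

PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon: str) -> str:
    """Return the anticodon (written 5'->3') that pairs with an mRNA codon (5'->3')."""
    # The two strands pair antiparallel, so complement each base and reverse the order.
    return "".join(PAIR[base] for base in reversed(codon.upper()))


print(anticodon("AAA"))  # UUU
print(anticodon("AUG"))  # CAU
```

A codon of three A's is matched by an anticodon of three U's, and the start codon AUG is matched by CAU.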
So how does this work? Here we see a schematic, right? In this case the messenger RNA is entering the ribosome, so it is going from left to right, and here we see the growing peptide chain. The first codon is of course the ATG, or AUG, for the start, and then the next three bases will be bound by the next tRNA. Here we see three A's on the messenger RNA, and of course three A's can be bound by UUU, which would be TTT in DNA, and the UUU tRNA is carrying its own amino acid, right? Because the tRNA binds, the amino acid is then coupled to the protein, and then of course comes the next one. In this structure you can see that this happens two at a time: a ribosome has two tRNAs bound at the same time, and when the next tRNA comes in it pushes the last one out, so there are always two which are bound to the messenger RNA. This synthesizes the peptide chain and makes it so that the messenger RNA sequence is translated into amino acids. So you can imagine that if there are mutations in your tRNA, this will have a massive effect on your proteins, right? If you have a mutation in one of your tRNAs, for example your tRNA doesn't have an A here but a G, then all of a sudden this three-base codon produces a different amino acid everywhere. So instead of having a functional protein you all of a sudden have a non-functional protein, because the link between the code in the DNA and the protein is broken. This is why this is so conserved across different species: mutations in tRNAs generally are directly lethal, because instead of making a protein which goes alanine, valine, leucine, leucine, you now all of a sudden put a completely different amino acid in instead of a leucine. That is why this structure is very, very stable and is more or less the same in dinosaurs that lived millions of years ago as well as 
humans who are living now. The genetic code is kept stable because of this fact, that single mutations in the anticodon loop directly cause the wrong amino acid to be incorporated. If we look at the ribosome a little bit more in detail, then the ribosome of course translates mRNA to protein, and if we zoom in a little bit more you see here the A site and the P site. The A site is the site where the tRNA enters. Again, in this case the mRNA is pulled from left to right, and what we see is that tRNAs are brought to the ribosome, and then there is a matching of the tRNA to the messenger RNA, which happens in the A site. In the A site the tRNA still has the amino acid bound, and then it works like a motor, like a two-stroke motor: it ticks very quickly, and every time that it ticks a new tRNA is bound, the old one is expelled, the one that was at the first position goes to the second position, and the amino acid is separated and bound to the newly forming protein. So the A site, A for aminoacyl, is the binding site for the charged tRNA. The P site is the peptidyl site, which holds the tRNA that is linked to the growing peptide chain. And then at the last step the tRNA is moved to the E site, the exit site, which is the final binding site for the tRNA before it is pushed out of the ribosome by the next incoming tRNA. So it's like a two-stroke motor: it continuously works and it continuously produces proteins by incorporating a new tRNA; the new tRNA pushes the chain forward, and because it pushes the chain forward it also pushes the last one out. So the ribosome always binds two tRNAs, and as soon as a new one comes in the old one more or less gets ejected. Ribosomes contain RNA themselves, because they need to recognize these sites, right? Because for the ATG there is a recognition pattern, so the ribosome 
needs to bind this AUG and needs to know: okay, now I need to start producing peptides, so now I need to start a new peptide chain. A ribosome consists of different subunits: you have the large subunit and the small subunit, which couple together and then pull the RNA through. The components are named with a number and then an S, and this S stands for the sedimentation speed, the Svedberg unit. This comes from way, way back, when we were still classifying molecules based on how long it took for them, when you put them into water and spin them in a centrifuge, to sediment out, so to get a little bit of stuff on the bottom, right? When we look at prokaryotes, the large subunit contains a 23S and a 5S rRNA, and the small subunit contains a 16S rRNA. In eukaryotes it's a little bit different, because the large subunit contains a 28S, a 5.8S and a 5S rRNA, and the small subunit contains an 18S rRNA. So the ribosome is always made out of two subunits; it is not a single protein, it is two parts, and each of them contains not only proteins but also RNA, to do recognition of things like the start and stop signals, because a stop codon, for example, has no tRNA, right? So the RNA contained in the ribosome is there to recognize tRNAs and the mRNA sequences which are needed.
All right, so I wanted to show you guys what it looks like, because I always like looking at proteins in 3D. I wrote my own little 3D engine years and years ago, and I love 3D programming because I'm a visual person, so I like looking at things, and I just wanted to show you guys a small 3D visualization. This might actually crash the whole stream, because the computer program that I wrote is, well, it's not bad, it's actually pretty good I think, but it uses up a lot 
of CPU and also GPU, so it might actually crash the whole stream, but we're just going to do it. So I'm just going to start up the program. Am I still there? I'm still there. I'm just seeing my CPU usage jump to like 90, so I'm just going to add a new window capture, which is actually this thing here. Okay, so this is the really, really nice thing that I made to show you guys, and we'll make it a little bit bigger. I can't make it that much bigger, so I'll make it a little bit bigger here. So this is the little 3D engine that I wrote, and the nice thing is that you can move around like this and you can load in different protein files, right? What I did here is I took the large molecule, so this is the E. coli 23S molecule. Every little dot that you see here is actually an atom, and you see the amino acids in different colors. Let me see if I can zoom in and zoom out a little bit. If we zoom in, then here we see one of these amino acid chains, right? Because the ribosome contains proteins itself, and those are also made by ribosomes. So you see that there are of course thousands and thousands of atoms. Let me look it up: here we are talking about 147,000 atoms, so dots, we have in total 11,000 amino acids, and these are coded in 57 peptide chains. So there are 57 different proteins which come together to form one unit of the ribosome, and in total we're talking about around 11,000 amino acids which are used. So why do I want to show you guys this? Well, because the ribosome itself has this big hole in there, right? If you just move around you don't really see the hole, it depends on how you look at it, and this is the hole where the messenger RNA is pulled through. Of course this hole is covered by the small subunit, but what I wanted to show you guys is how many RNA molecules 
are actually in the molecule itself. So you can see these really nicely, and I don't know if it's very visible, let me actually move a little bit, and a little bit down. Here, because these are not amino acids, right, they don't have these little squares and triangles, you can see the RNA molecule more or less as a ghost image. Yeah, it's probably not visible for you guys. Let me actually change the code a little bit so I can show you guys how I do this. Let me go to Notepad. I have here the PDB file which I load in, and I can actually make the point size a little bit bigger, so let's make it two and a half, and then execute the program again. That will probably fill the whole screen. It takes a little while, it's a big molecule that needs to be loaded in. Okay, there we go, it starts loading in the molecule. I'm gone for a little bit because it covers me as well. Then we have the big protein here, and now we see all of these dots, and now you can relatively clearly see here on the bottom the nice RNA structure, right? You see this kind of half helix going into the ribosome, and you can see the RNA molecule combined with all of these tens of thousands of protein atoms. The 3D engine allows you to zoom in, and you can actually see this really nice RNA structure. This is an electron microscopy, no, it's an X-ray crystallography image, so it's a crystal structure where you shoot an X-ray at the crystal to get the location of each of the atoms in the file. The nice thing is that you can really clearly see this RNA molecule coming into the ribosome, and of course you see that these RNA molecules are more or less all around this hole where the messenger RNA is pulled through. So I just wanted to show you guys that this is also part of 
bioinformatics: writing little programs which do 3D visualization to learn a little bit more about proteins. If you zoom out you can see that this is a very, very complex structure, and that proteins are only a minor part of the whole ribosome. The proteins here are the ones which are drawn with these little squares, while all the other molecules which are part of the ribosome are there without them. The ribosome is one of these fundamental complexes: without ribosomes, life would not be as complex as it is today. All right, so that was just a little intermezzo, and I think that's enough about the ribosome. We now know how the ribosome works: it's like a two-stroke engine where one of the tRNAs comes in, it pushes one out, and it grows this peptide chain. And the ribosome itself contains around, what did I say exactly, around 60 different proteins, and the 60 different proteins are composed of like 11,000 amino acids. Good.
So the next type of RNA is small nuclear RNA, also called snRNA. I talked to you guys about messenger RNA and pre-messenger RNA, and pre-messenger RNA actually needs to get rid of the introns, right, because only the exons code for protein. The spliceosome is the complex which is responsible for this, and this complex also needs to use RNA. So we have ribosomal RNA, which is an integral part of the ribosome, but the spliceosome is also a big complex, and it contains five small nuclear RNAs, called U1 to U6, and we don't have a U3. I don't know exactly who named this, but there are five different small RNAs which are part of the spliceosome. These snRNAs, along with their associated proteins, form the ribonucleoprotein complexes, snRNPs, which bind to specific sequences on the pre-mRNA. Small nuclear RNAs are found within the splicing speckles and Cajal bodies of the cell nucleus, so the splicing itself still 
happens within the cell nucleus. Before the messenger RNA is transported out of the nucleus, it is first transported to these little areas of the cell nucleus. If you look at an electron microscope photo of a cell nucleus, you see that there are these big holes where stuff gets transported in and out, but besides these holes you see that there are all these little dots on the thing, and these are all little structures where splicing takes place. So the small nuclear RNAs process pre-mRNA into mRNA. There are two ways that we currently know of in which this happens, and these are called U2 and U12 splicing. What happens is that these small nuclear RNAs bind sequences on the pre-mRNA. Here we see an exon, then we have a whole big intron, and then we see the next exon, and these small nuclear RNAs recognize these sequences. So A G G U R, where an R encodes a multi-option: it doesn't have to be exactly an A, a G or a U, but it needs to be one of a subset of them. Then here we see the branch site, and then we see the other part which is recognized, which is Y A G G. Again, Y stands for two or three different bases that can be there. What happens is that a protein binds on the branch site, and the protein uses RNA to bind there. Then the three prime splice end and the five prime splice end are also bound by snRNAs, which then recruit proteins, which then fold it together, and then the RNA is cut and glued together again to remove the intron from the messenger RNA. This A here is the important part: this sequence in the middle, the branch site, is very important because that is the thing that is recognized. If you have a mutation in your branch site, then again what happens is that the protein is unable to
bind this exact sequence, which means that the removal of the intron fails, which means that a certain protein cannot be made. That's what I wanted to tell you about splicing. Splicing happens in something which is called splicing speckles and also Cajal bodies, and these Cajal bodies are around the pores in the cell nucleus. These pores are important because that is where the stuff is transported in and out, but around these pores you have a very specific set of proteins which use RNA to modify RNA. There are two types of splicing: U2, which recognizes AGG, and the other one, U12, which just has a slightly different recognition site. And now we can also see why there are five, right, because here we have one, two, three, four, five, and these two are equal, which are recognized by five different small nuclear RNAs. So these are very important RNA molecules which are encoded in your genome, and again they are under heavy selection. Splicing is similar between humans, elephants, dinosaurs and sharks because of this pressure: having mutations in these small nuclear RNAs means that splicing cannot take place, which means that in the end you are unable to produce the proteins that you need to survive. Next we talk about snoRNAs, which are different from snRNAs. I know, there's a lot of different RNA stuff going on, right? Small nucleolar RNAs, or snoRNAs, play an essential role in RNA biogenesis: they do chemical modification of ribosomal RNAs and of other RNA genes like tRNAs and snRNAs. So snRNAs, small nuclear RNAs, are for splicing, and snoRNAs are small nucleolar RNAs, and these do the conversion of, for example, uridine into pseudouridine. What you can see here is the standard uridine base which is used in RNA, which is chemically modified into a different structure. This chemical modification is necessary because RNA can have its own enzymatic function. So RNA molecules can be modified, and one of the most well known of
these small nucleolar RNAs is actually snoM1. So snoM1 is the molecule which transforms uridine into pseudouridine. I don't know if you guys have been paying attention or not so much attention, but you may know the BioNTech/Pfizer vaccine. It is an mRNA vaccine, right, so it contains mRNA which is given to your cell in these little fat bubbles, and which then tricks your ribosomes into making the spike protein. In this mRNA vaccine this chemical modification is also done. The chemical modification is done before you get the vaccine, to turn all of the uridines in the messenger RNA into pseudouridines. This is a very important step, which was discovered only a few years ago, and which actually makes these mRNA vaccines work. If you give an mRNA vaccine which contains uridine as the U base, then this causes your immune system to react to the mRNA vaccine itself, so it starts attacking the mRNA molecule which you give to the cell. But you don't want that; you want the immune system to start attacking the protein that is actually encoded. To prevent this, you can take these uridine bases and use snoM1, or rather the protein, which is actually a small nucleolar RNA together with a protein, which changes all of the uridines in the vaccine to pseudouridines. This means that your immune system will just allow the mRNA to go through, so it reduces the initial reaction of your immune system to the mRNA itself. So, a very important step. snoM1 is actually everywhere. But where is this pseudouridine? It is found in transfer RNAs: here you can see that in the TΨC loop you have this Ψ symbol, and you also have a Ψ symbol here, and these are the pseudouridines. Pseudouridine is the most abundant RNA modification in
cellular RNA, and that's why it actually has its own symbol. If you think about RNA, people always say: well, we have A, C, G and U. But that's not entirely true, because we actually have A, C, G and U and this Ψ residue, which is pseudouridine. It looks like uridine, your cell treats it like uridine, but chemically it is different: it is more stable than uridine and it is less immunogenic. So a very important snoRNA is snoM1, which transforms uridine into pseudouridine. Very important. All right, so small nucleolar RNAs: again, all of these work in the nucleus. One of the other types of RNA is catalytic RNA, which is a very important type of RNA with its own separate name: generally people call them ribozymes. We generally treat proteins as being the enzymes, which participate in a chemical reaction but are not used up; they are there to, for example, make a chemical reaction easier. But RNA molecules can do that as well. RNA molecules are also catalytically active, so they can participate in a wide range of RNA processing reactions, for example RNA splicing. All of these RNAs are also catalytic RNAs, because they catalyze a chemical reaction themselves. Catalytic RNAs are also very important in viral replication and in tRNA biosynthesis. So an RNA molecule, when it's not an mRNA molecule, is not just a stupid molecule that floats there and does nothing, like DNA. RNA molecules themselves are also effector molecules, so they have a very big role to play in cellular biochemistry. And of course here the structure of the RNA becomes very important: just like the structure of a protein is very important for the function of the protein, the structure of a catalytic RNA is very important for the functioning of the RNA itself. There are different examples of ribozymes, like the hammerhead ribozyme, the VS ribozyme, the leadzyme and the hairpin ribozyme. The hairpin ribozyme is the one that
cuts the hairpin out of the micro RNAs. But I don't want to go too much into detail about ribozymes. I just want you guys to know that there are ribonucleic acid enzymes: things which participate in a chemical reaction but are not used up in it. They just catalyze, so they make the reaction run more efficiently or make a reaction possible. Ribozymes were discovered in 1982. So RNA can be genetic material, storing genetic information like mRNA, similar to what DNA does, but it can also be a biological catalyst, like a protein or an enzyme. This is why in 1982, after the discovery that RNA molecules can be both a carrier of information and an effector molecule, the RNA world hypothesis was born. The RNA world hypothesis states that before life existed on this planet as we know it, all life was RNA life. So at a certain point in time, before we had proteins and before we had DNA, the only thing that we had was RNA molecules, which were kind of the only living thing: RNA molecules which would copy themselves and would then do their chemical function. The idea is that RNA is at the basis of all life. So it was not DNA or proteins which formed first. No, first RNA molecules formed, and then these RNA molecules started making proteins and started making DNA to extend their own lifespan, since RNA has a very short lifespan. You can think of an RNA creature as something like an RNA virus, which uses RNA molecules. But the RNA world hypothesis is that before DNA and before proteins, all living creatures on the world, or at least on our planet, consisted of nothing but RNA. In 1989 Thomas Cech and Sidney Altman got the Nobel Prize in Chemistry for discovering the catalytic properties of RNA. So we now know that RNA molecules can be their own enzymes, they can act as a protein or an enzyme, and there are actually
some really fun online tools which you can play with, where you can simulate different RNA molecules in competition with each other, and you can see RNA molecules copying themselves and evolving. So you can actually make an RNA molecule which makes a copy of itself out of nothing, just by binding complementary base pairs. All right, one of the last types of RNA are small RNAs: micro RNAs and small interfering RNAs. Micro RNAs and small interfering RNAs are slightly different, but for this lecture we will just group them into one. They are abundant in eukaryotic cells, and what they do is post-transcriptional control over mRNA expression. They exert their function by binding to a specific site within the messenger RNA and inducing cleavage of the mRNA via a specific silencing pathway, the RNA degradation pathway. So how does this happen? Well, I told you guys at the beginning that RNA is always single-stranded in eukaryotic cells. When a eukaryotic cell detects a double-stranded RNA, it is directly degraded. There's a whole pathway in the cell that does nothing but degrade double-stranded RNA. The cell also uses this for control. Imagine that the temperature goes up, so a certain heat shock protein needs to be produced to protect the cell from this increase in heat. The cell will start producing this messenger RNA, but producing a messenger RNA takes some time: it takes some time for the messenger RNA to be made, then it needs to be spliced, and then it needs to be transported into the cytosol, to the ribosome. All of these processes take time. In the meantime the transcription machinery has started, so it's already busy producing mRNA, but now the temperature has gone down again. So what the cell can then do is, it can make
these small micro RNAs, which then bind to the just recently created mRNA and have it degraded very quickly. In that sense it's a very fast way of controlling the production of proteins, or not so much controlling the production but shutting down the production of proteins before the cell even starts making them. So it's kind of an interference mechanism saying: well, we needed these proteins five minutes ago, but they're not needed anymore, so quit making them, or don't even make them. What the cell does is, it makes this micro RNA, it binds the mRNA, and then the whole thing is degraded, because double-stranded RNA is just bad for eukaryotic cells. Micro RNAs are 21 to 22 nucleotides long, and they are processed from hairpin RNAs encoded by cellular DNA. They regulate gene expression primarily by inhibiting translation and promoting mRNA degradation. So they bind the mRNA, it hits the ribosome, the ribosome kind of clogs up and stops making protein, and then the whole thing is pulled out and degraded because it's not needed anymore. In total there are around 250 to 350 micro RNA genes encoded in the human genome. So they are very abundant, and they are produced frequently, especially when situations change very quickly. For example, they are also very important in the regulation of neurons and synaptic communication, because there you also can't take 15 minutes to start producing a protein to have something done. When you're talking, you need to have very fast control, and this fast control is done by micro RNA genes. So how does this look? You have your cellular gene, which forms these pre-micro RNAs. You have the Drosha protein, which cuts off the edges, so you get this little hairpin, and then you have Dicer, which cuts off the hairpin loop. Now you have two micro RNA molecules which are more or less complementary; they are
separated from each other, and then the mature micro RNA is transported into the cytosol. What it does there: when it is exactly complementary to the messenger RNA, the messenger RNA is degraded. If it's not entirely complementary, then it still binds the messenger RNA, but the messenger RNA can now no longer be translated by the ribosome because of the double-strandedness, and of course after inhibition it is also degraded. All right, so the last type of RNA is non-coding RNA. Of course, all of the types of RNA that we have seen, except for messenger RNA, are non-coding RNAs. But long non-coding RNAs also exist, so they're not all very small. The snoRNAs, the micro RNAs and the short RNAs are all very small, like 100 to 200 bases max. We also have very long RNAs which do not code for a protein, which can be even longer than micro RNAs. They are there to do genome defense and chromosome inactivation. The two most well-known long non-coding RNAs are, on the one hand, piRNAs, which prevent genome instability in the germline, so they are there to make sure that the genome is stable and that telomeres are put onto the ends of the chromosomes, and on the other hand we have Xist, which does the X chromosome inactivation in mammals. This is a single long non-coding RNA that, when it binds one of the X chromosomes, causes that X chromosome to shrivel up and become completely inaccessible for expression. As soon as it binds one of the two X chromosomes, the other X chromosome will still be active. And this is a kind of funny image where it says: "I can't find the open reading frame, could it be a non-coding RNA?" When you find an open reading frame, so an ORF, then you know you're looking at a messenger RNA, because it codes for protein. But all of the other types of RNA don't code for protein; they are more or less
ribozymes, so they have their own catalytic function and their own place in cellular regulation. Long non-coding RNAs are modular: if you look at long non-coding RNAs, they have different functional domains, and they are more or less put together like Lego by the cell. For example, you can have a region of the non-coding RNA which binds RNA, you can have regions which bind proteins, you can have regions which bind DNA, and you can have conformational switches, which move, like the ribosome moves, based on temperature or based on the presence of iron or other molecules. All of these are more or less modular. Here you have on the one axis the different types of long non-coding RNAs and on the top all of these different functions. Some of them are RNA binding and protein binding, others are RNA binding, protein binding and also bind DNA, and this one does something else. So they are modular: like Lego, they can be connected together by the cell to produce a non-coding RNA which has a very specific function. Okay, so those are all the different types of RNA. I don't like it either. When I got the course I first saw the RNA lecture and I thought: oh my god, there are so many different types of RNA, it's just going to put everyone to sleep. And it does, but it is important to know that there are all of these different types of RNA, because when you do structure prediction of RNA you have to be aware of the fact that something can be a long non-coding RNA or a micro RNA. They are very important in the regulation of genes, and that is the part that we are interested in. As a bioinformatician you're interested in how the DNA sequence causes messenger RNA to be made and causes proteins to be produced, right? That's the biological dogma. We are interested in
things like disease: why do some people get sick and others don't? So we have to understand how messenger RNA is made and how it's regulated, and then how this leads to protein production, because some proteins can make you sick and other proteins can help you defend against disease. All right, so mRNA expression is the thing that we are interested in. Well, not directly: we are interested in disease, right, but to understand disease we need to know which proteins we need to make to prevent us from getting sick, and these proteins are of course made from messenger RNA. So we need to understand how a cell decides which mRNA to make, so which parts of the genome are active and which parts of the genome are inactive. What we generally do in bioinformatics is measure and compare RNA expression between diseased tissue and healthy tissue, and so we estimate the environmental or genetic effects on the phenotypes that we're interested in. For example, you can be interested in cancer, and then we look at how a cancer cell is expressing its genome versus how a healthy cell is expressing its genome, because when we find differences in expression, that generally allows us to explain or reason about why there are differences in phenotype. When we talk about mRNA expression, there are currently three different ways to measure it. One of them is quantitative real-time PCR, which is for example also used to detect whether you have SARS, or influenza A or influenza B. There are also ways to measure it genome-wide: we can use microarrays to measure 20,000 to 30,000 genes at the same time, and we can use RNA sequencing as well. RNA sequencing allows us to get the expression of the messenger RNA, and not only that, it also allows us to look at post-transcriptional modifications, like whether something is a uridine or a pseudouridine, which can be very important for the cell. Good, so I will pause here and we
will do another 10 minute break, so I will see you guys in 10 minutes, and then we will be talking more about mRNA expression, how we can measure it, and how bioinformatics plays a role there. Good, so the next break, I don't know what it's going to be. What did we have in the last break, was it koalas? I don't know, so you will just have to find out. So we'll stop
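As a small illustration of the comparison described before the break, measuring expression in diseased versus healthy tissue and looking for differences, here is a minimal Python sketch. It is not the method used in the course. The gene names, the expression values, the function name `log2_fold_change` and the pseudocount of 1.0 are all assumptions made up for this example. It simply computes a log2 fold change of the mean expression per gene.

```python
import math
from statistics import mean


def log2_fold_change(healthy, disease, pseudocount=1.0):
    """log2 ratio of mean expression in disease tissue versus healthy tissue.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a gene is not expressed at all in one of the groups.
    """
    return math.log2((mean(disease) + pseudocount) / (mean(healthy) + pseudocount))


# Made-up expression values for illustration only (not real measurements).
expression = {
    "geneA": {"healthy": [10.0, 12.0, 11.0], "disease": [45.0, 50.0, 43.0]},
    "geneB": {"healthy": [30.0, 28.0, 32.0], "disease": [29.0, 31.0, 30.0]},
}

fold_changes = {
    gene: log2_fold_change(values["healthy"], values["disease"])
    for gene, values in expression.items()
}

for gene, lfc in fold_changes.items():
    print(f"{gene}: log2 fold change = {lfc:.2f}")
```

A positive value means the gene is more highly expressed in the diseased tissue and a value around zero means no difference. A real analysis would also need normalization and a statistical test across replicates, which this sketch leaves out.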