everyone, if you're watching this on Moodle or YouTube or Twitch later. Thank you for coming to the lecture. Today we will be talking about RNA, and about RNA from a bioinformatics point of view. So let's just start, right? This is the overview for today. So of course we will first do the answers to the assignments from last week. I will also put the assignments somewhere open, I think, so that people watching on YouTube can get the assignments and do them as well. For today we will be talking about history, the difference between DNA and RNA, and then a large part of the lecture goes into all kinds of different types of RNA, which I always find very boring, but it's part of the curriculum, right? So we have to do it. It's just that there are so many different types of RNA that it gets confusing. So I hope that at the end of the lecture you will have a better understanding of which types of RNA there are, what they do, and why they are so different. Then we will look a little bit at microRNA expression. We already did microRNAs, I already showed you some microRNA work, but the assignments for today will be analyzing microRNA data. Not your own, but I will give you a data set with a little bit of microRNA data. And we will of course be talking about next generation RNA sequencing. So how do you use RNA sequencing to measure the expression of microRNAs, so the activity of genes? After that, I will show you where you can get a whole bunch of free microarray data and a whole bunch of free RNA-seq data. If you are very interested in doing some research but you don't have the money to spend on your own microarrays, then, fortunately, a lot of researchers have done a lot of microarrays in the last 20 years or so. Most of that data is available for free, so you can just take data from there, and you can get relatively high-scoring publications by using publicly available data.
So that's really nice. That's one of the advantages of being a bioinformatician compared to a wet lab biologist. Wet lab biologists generally have to spend a lot of time generating their own data. But as a bioinformatician, if you're lucky, there's a lot of free data available, and you can make discoveries in free data which other people failed to notice. And then we will be talking a little bit about the structure of RNA, because I wanted to add a little bit to the lecture. So we will be talking about COVID-19 and the structure of its RNA molecules, which, in theory, is not that important, right? Because for the virus it is really the structure of the different proteins that matters. But I just use it as an example of how you can do structure prediction of RNAs, since for RNAs, just like for proteins, the structure matters a lot. And, of course, we will be talking about different bioinformatics tools there.
All right, so let's go to the previous assignment. So let's see if I've set up everything correctly, and I did not. So here are the answers. So let me also get the assignments. It's a little bit annoying since the whole computer rebooted, but... Assignments 3, DNA. Loading, loading. All right, so the first questions were about using Ensembl to compare the mitochondria of mouse, human, and zebrafish. So let's just do that. Instead of looking at the answers, we can just look at a Firefox window. So the first thing that you need to do is go to, of course, the NCBI, right? So NCBI is the National Center for Biotechnology Information. And they... Why do I need to retype my password? That seems a little bit nonsensical. So this is the main NCBI website. And this is the entry to all of the different data that you can find in bioinformatics. So it has things like PubMed, but also genes and genomes. But we don't want to go to the main NCBI website, we actually want to go to Ensembl, which is a separate genome browser run by EMBL-EBI rather than by NCBI.
But it is the entry point for many... The entry point is just a little bit different. So you can click on your species and then see. First things first, let's go to human. And here we can go to the karyotype. The karyotype is just the overview of the different chromosomes and their banding. If we go all the way to the back, we see this very, very tiny chromosome at the end, which is the mitochondrial DNA. So let's just get the summary of the mitochondria for humans, right? So here we look at the human chromosome MT. What we can see is that the human mitochondrial genome is 16,569 base pairs long. There are 13 protein-coding genes. There are 24 non-coding genes. These are the tRNAs and other things which do not code for a protein but are important for the functioning. And here we see short variants. The variants are locations on the mitochondria where we know that there is a difference between different humans, right? So if we take a sample from the population, 10 different humans or 20 or 100, then we know that at these 7,000 points there might be a difference. So these are known differences.
So this is human. I will just open up a new one, go to Ensembl, and we will look at mouse. Again, we just go to the karyotype. Mice have slightly fewer chromosomes, but again, they have the mitochondria at the end. So we can just say, give me the summary, and then we can compare mouse versus human. The first thing that we observe is that the mouse one is a little bit shorter. So human mitochondrial DNA is about 300 base pairs longer than the mitochondrial DNA in mouse. Furthermore, we see that the same number of protein-coding genes is there, so they code for the same proteins, and we can also see that they have the same number of non-coding genes. Of course, there are fewer known variants in mouse, because people are generally not that interested in mice. They are interested in humans.
So because of that, we see that there's a big difference, well, not in the length, but in the number of short variants. We just haven't discovered that many variants in mouse yet compared to humans. All right, so that's the mouse-human thing, and then in Ensembl we can go to zebrafish, which is the third most common species used. And we can see here that, again, zebrafish also have mitochondria. If we look at the summary, we see the same thing: 13 coding genes, 24 non-coding genes, and in zebrafish we have 143 known variants. So the conclusion, when the question is what is the overlap and what are the differences: the overlap is that they are very consistent. Humans, mouse, and zebrafish have the same number of protein-coding genes. They have the same number of non-coding genes. They have the same length, give or take a couple of hundred base pairs. The main difference is that we know much less about the mitochondria in, for example, mouse and zebrafish than we know about the mitochondria in humans, because in humans you can see that we found a lot more variants. These variants are single nucleotide polymorphisms, so SNPs, and small indels.
All right, so use the Ensembl database to download the FASTA sequence of a mouse gene: mitochondrially encoded NADH dehydrogenase 1. The official name is MT-ND1 (mt-Nd1 in mouse). So just to show you guys how to do this, we go to mouse, right? So we scroll up, we see the mouse here, which is perfectly fine. And then here it says search all species. No, we want to search in mouse, and then we can just type in the name of the gene that we want. So in this case, we want to have MT-ND1. It already comes up because it's a known gene, so Ensembl knows what you're looking for. And it switched to human again. I hate that it does that, actually, because I don't want to go and search in human.
I want to go in mouse and then I want to search mouse for MT-ND1. All right. And just search then. And then here we have the mouse gene, right? So again, we get an overview of everything which is known about this gene. We know that this is mitochondrially encoded NADH dehydrogenase 1, which means that the location is on the mitochondria. And indeed here we can see that it's located on the mitochondria and it's around a thousand base pairs. So the gene is encoded in a thousand base pairs. And if we scroll down, then we see that the gene encodes 318 amino acids. There is only one variant, otherwise we would have had more entries here in the list. But that's just the way that it is.
So we want to get the sequence, right? So we can go to export data. We then say we want to have the FASTA sequence. I don't want to have any upstream or downstream sequence, because I just want the sequence of the gene. And then here we have some options, which means that we can have the unmasked sequence. Unmasked means that repeats, duplications, and these kinds of things are not hidden. Generally when you do primer design, and we will come back to that in a later lecture, you want to have the masked sequence. So you want to have the repeats masked, either soft or hard, just so that you are prevented from designing primers in areas which are repeated multiple times, either in the gene or in different parts of the genome. So what do we want to have? In this case, we want to have the base pairs. We are not interested in the peptides. We just want to have the cDNA. The cDNA, the complementary DNA, is more or less the mRNA sequence written as DNA. So this is what we want. Then we press next, and it makes it for us, and we just say, give me the text file. And then you see we get two of them, right? So we get the protein-coding cDNA and we get the same sequence again, but now as the sequence of the chromosome, right?
So if there were introns and exons, then this sequence would be longer than the sequence here. But we see that we get the exact same sequence twice, and this is because the mitochondria are originally of bacterial origin, and since they are of bacterial origin, they don't have an exon-intron structure, because prokaryotes code genes differently than eukaryotes. So although eukaryotes use introns and exons on the autosomes, the mitochondria of eukaryotes do not use introns and exons. So both sequences are the same. So we can just use the export data and we just want to save this file, right? So I'm just going to copy this. I am going to open up my Notepad++ window, make a new file, copy in the sequence, and then we're just going to save it somewhere. So I'm just going to save it on my C drive as mtnd1. Yes, save it. So now I've saved it. Oh crap, you guys can't see it because the Firefox window is over it. So here what I did is just create a new file and put in the sequence. So I just copy-pasted it in and then saved it, relatively easy.
All right, so the next question is about creating a small R script to analyze this little bit of sequence, right? Because we're doing bioinformatics, so we really want to use a computer to answer certain kinds of questions about our sequence. So the first thing that we do is we create a new file. So that's what I did here. So I called my file answers03DNA. Biting bits, thank you for following. Welcome to the lecture. So I already prepared this, but we can just go through it one by one. The first thing that we want to do is set our working directory. We already talked about how to use the terminal and how to use the cd command to change where you are on the hard drive, and the same thing holds for R. So when I go to the R window, which if I would have installed R would look like this, then I have no idea where I am, right?
Because the program just opens up and we have no idea where it will save files or where it will load files. So we need to first explicitly say where we want to go. So the position where I saved the file previously, let me switch back to Notepad++: previously I saved it on the D drive, in project lectures, because I have a massive file structure on my hard drive. So I just put it here, right? So the first thing that I need to do is tell R where I want to go. Where do I want to load files? Where do I want to save files? So I just set my working directory. So let's just copy-paste this into R, and now you see that nothing really happens, right? If a command succeeds, you get no feedback, and that is the Unix strategy. It doesn't say command succeeded or anything. If things go right, then there is no feedback. If things go wrong, then you get an error or you get a warning, but as long as there's no error or warning you can assume that it did what it's supposed to do.
So let's look at what's actually stored in this folder. We can use ls, no, in R ls lists the objects in memory, so we use dir, same as on the Windows command line. And then here you can see that in this folder I have all of the different assignments and the different answers, so that's what's stored here. And you can see that there's a folder called data, and that's where I saved this file. So I then want to load, of course, this file that we just downloaded from Ensembl. The file is called mtnd1.fasta and it's stored in the data folder. So what I'm going to do is just use the readLines function, and this is going to load the whole file into R, and then we can start modifying it. So that's the first step. So let's load it in, mtnd1. And here you can actually see that when I loaded it in, this file looks different from the one that I used to have, because you guys are looking at Notepad again.
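As a rough sketch of the three steps just described (set the working directory, list the folder, read the file), something like the following could be used. The temporary folder and the three-line toy FASTA file are assumptions so that the example runs anywhere; in the lecture the folder is the lecturer's own project folder and the file is the one exported from Ensembl.

```r
# Assumption: a tiny stand-in FASTA file written to a temporary folder, so the
# sketch runs anywhere. In the lecture this is the file exported from Ensembl.
workdir <- tempdir()
dir.create(file.path(workdir, "data"), showWarnings = FALSE)
writeLines(c(">toy_gene cdna", "GTGTTC", "ATTAAC"),
           file.path(workdir, "data", "mtnd1.fasta"))

oldwd <- setwd(workdir)   # prints nothing on success; returns the previous folder
files <- dir()            # list the files and folders in the working directory
mtnd1 <- readLines("data/mtnd1.fasta")   # one vector element per line of the file
setwd(oldwd)              # go back to where we started

print(files)
print(mtnd1)
```

Storing the old working directory and restoring it afterwards is a choice made for this sketch only; in an analysis script you would normally just call setwd once at the top.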
So the file that I downloaded just now, I told it to not give me anything else except for the cDNA sequence. But the original file that I made actually has much more in there, because it has the coding sequence in there but it also has, for example, the protein in there, because when you do the export, there are many options that you can choose. So in this case, when I made the file previously, I just clicked all of the options and said, well, give me the DNA sequence, give me the protein sequence, give me the cDNA sequence and all of these things. And of course, a lot of these sequences are equal, because the GT, GTT and so on is the same for all of the DNA sequences that we see, because again, being mitochondria, there are no introns and exons. So all of these coding sequences will be identical to the genomic sequence in this case, because we're looking at mitochondria.
All right, so let's go back to the questions. So okay, we did question number four: go into the directory by using set working directory. We loaded the file, and now the question is how many FASTA sequences are in the file. Each sequence in a FASTA file starts with this greater-than symbol. The greater-than symbol tells you that this is a description, right? So here, what is described? Well, we are looking at this gene, we're looking at cDNA, and it is a known protein-coding gene. So this is the information line, and then the next lines are what belongs to this description, and when you see the greater-than symbol again, we start with a new sequence, right? So here we see again, it's the same. It's CDS, known protein coding. So it's more or less the exact same sequence. I have no idea why it exported the same sequence twice. Oh no, this is the cDNA and this is the coding sequence. But again, there's no difference, because it's kind of a bacterial genome. Then if we scroll down, we see another greater-than, right?
And here we again see that this is an exon, and of course, since there are no introns, the exon is exactly the same as the coding sequence, which is exactly the same as the cDNA. And then we see the peptide, so the known protein, so the peptide sequence is the next one. And then in the last one, we see the chromosomal sequence, which is again the same as the first three sequences. So the question is how many FASTA sequences are in the file. In the file that I downloaded, there are five: there's the chromosome sequence, we have the protein sequence, and then we have the exon, we have the CDS, so the coding sequence, and we have the cDNA. So five different ones.
All right, so now we can use square brackets in R to look into the object we just created, right? Because here you see that this is a vector, right? Every line in the file became an entry in a vector in R. So we can actually take stuff out. If I say square brackets one, it takes the first line, right? If I say square brackets 15, it gives me the 15th line in the file. This is a little bit different from other programming languages, because generally programming languages start counting from zero. So the first line is entry number zero in the array, but R doesn't do that. Since R is based on mathematics, it starts counting from one. That's always a little bit annoying in a way, because computer scientists start counting from zero while mathematicians start counting from one. This just leads to a kind of conflict, and every programming language makes a certain choice. So if you think about the C programming language, it starts counting at zero. R starts counting at one. Java starts counting at zero. Matlab starts counting at one. SAS starts counting at one. So you just have to know this for the language that you're working with. But using the square brackets, we can take a certain line out of the file.
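In the lecture the five sequences are counted by eye, but the same count can be obtained in R. The sketch below is an assumption rather than the lecturer's answer: it uses a small made-up vector in place of the real file and the base R function startsWith to find the description lines.

```r
# Hypothetical toy lines standing in for the exported FASTA file
mtnd1 <- c(">toy_gene cdna", "GTGTTC", "ATTAAC",
           ">toy_gene cds", "GTGTTC", "ATTAAC",
           "",
           ">toy_gene peptide", "MFFINI")

mtnd1[1]                              # R counts from one: this is the first line
is_header <- startsWith(mtnd1, ">")   # TRUE for every description line
n_sequences <- sum(is_header)         # number of FASTA records in the vector
header_lines <- which(is_header)      # line numbers of the description lines

print(n_sequences)
print(header_lines)
```

On the real file from the lecture, the same two lines should report five records, one per export option that was ticked.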
Not only that, but we can use a split statement. So what we can do is use strsplit to get the individual letters, and then use the table command to see how many of each are present. So if I look at the 15th line, I see that this is the code. So the code is A, T, blah, blah, blah. But if I want to know how many As there are in this code, I can just say, OK, strsplit this line by nothing. So just split it at every letter. When I do this, I get something back which looks like this. It's now a list, and this has a vector in there, and you see that all of the letters are individual letters. So if I want to use this, I can say, well, use the first list element. So I use these double square brackets to select the first element. And this is because strsplit can actually do multiple lines at the same time. I could have split not just one line, right? I could have split two lines as well. So I could say, split lines 15 and 16, right? So now 15 and 16 get split, and you see that I get a first entry back, which is the split of line 15, and I get a second entry back, and this second entry contains the split of line 16. Because we're only interested in a single line first, right, we have to explicitly tell R that we want to get the first element back. So we just do double bracket one, right? It also tells you how to select the element, because it says, well, this is double square brackets one, this is double square brackets two, right? So if we take just the first element, then we get just the letters back. So now we're just looking at the individual letters, and then we can use the table function to have R count for us how many letters there are. And now it will tell you that at line 15, there were 17 A's, 18 C's, 7 G's and 18 T's. Good. So let's go back to the question. So the second statement, however, we can split up using strsplit.
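The split-and-count step described above can be sketched as follows. The two short sequences are invented for illustration; with the real file you would pass mtnd1[15] or mtnd1[15:16] instead.

```r
# Hypothetical sequence lines; in the lecture these are lines 15 and 16 of the file
seqlines <- c("ATTACG", "GGCAT")

parts <- strsplit(seqlines, "")   # split by nothing: one list element per input line
length(parts)                     # two lines went in, so the list has two elements

letters1 <- parts[[1]]            # double square brackets: letters of the first line
counts <- table(letters1)         # tabulate how often each letter occurs
print(counts)
```

The double square brackets matter here: parts[1] would still be a list of length one, while parts[[1]] is the plain character vector that table expects.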
So we did that, and then we get a nice tabulated representation. So how many A's are there in the first line of the sequence? Of course, if we look at the sequence, right, so we look at mtnd1 and we look at the first line, then the first line is not a sequence. So we have to look at the second line, because the first line is the description line. So when we look at the second line, this is the second line. We can do the same thing. We do a strsplit. We want to split by nothing, so split all the letters from each other. We want to take the first element that comes out, because strsplit can do multiple lines, but we're only doing one line. So we're interested in the first one. These are the letters when they are chopped apart, and then we can use the table function to count how many there are. In this case, there are 13 A's, 20 C's, 5 G's and 22 T's. So let me see if it matches up with the previous answer that I had. The question only asked for the number of A's. So I do the table of the whole thing after I strsplit it, and it says 13 A's. So that is indeed what we find again. There are 13 A's in the first sequence line of the file that we downloaded.
All right, so now we can go and have a little bit of logic, right? Because we want to start programming something, and programming is not just splitting a string and then looking at the number of A's. What we want to do is decide if a line is a DNA sequence, if a line is empty, or if a line is an identifier, right? Because the identifiers start with the greater-than symbol. So again, we can do more or less the same thing. Of course, now we want to go through each of the lines. But it's always easier to first make the logic, so the thing that goes into the loop, right?
Instead of doing every line and then trying to figure out what to do, we are first going to figure out what to do, and then we are going to use this code to go through each of the lines. So if we look again at this mtnd1, right, then we see that by looking at the first letter we can determine if something is an empty line, if something is a code line, or if something is an identifier, right? Because identifiers start with the greater-than symbol. Empty lines don't have any first letter, so that's easy to detect. And then of course code starts with a G, an A, a T or a C. So let's first try that, right? We're just going to look at the first line, and we know that this is going to be an identifier. What we are going to do is first strsplit it, right? Because we need to get the individual letters: strsplit, right? We take the first element, so we get rid of the fact that it can do multiple lines. We just want to do it for the first line and we just want to look at the first entry, so we have to get rid of the list. So this works. So now we have to say, well, okay, show me the first one. And the first one in this case is a greater-than. So we can use this to start building up the code, right? Because if we strsplit the second line, then the first letter is not a greater-than symbol. It is not an identifier, but a code line. And we can then go to the third one and see, okay, the third one starts with an A. So it's not an identifier. It's probably going to be a line which contains DNA code. So how do we now do this, right? We can now say, okay, let's store this in a variable, right? So let's go back to the first line and call it first letter, right? Because the code that we just wrote will give us the first letter. And this is the first letter for the first line, because we're using one here.
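A minimal sketch of this first-letter extraction is shown below, assuming a made-up three-line vector in place of the real file. The variable names first1, first2 and first3 are only for illustration; the lecture uses a single variable for the first letter.

```r
# Hypothetical lines: an identifier, a sequence line, and an empty line
mtnd1 <- c(">toy_gene cdna", "GTGTTC", "")

# Split the line into letters, drop the list with [[1]], take the first letter with [1]
first1 <- strsplit(mtnd1[1], "")[[1]][1]   # identifier line: gives ">"
first2 <- strsplit(mtnd1[2], "")[[1]][1]   # sequence line: gives "G"
first3 <- strsplit(mtnd1[3], "")[[1]][1]   # empty line: no letters at all, so NA

print(c(first1, first2, first3))
```

The NA for the empty line is worth remembering, because comparing NA with a letter inside an if statement is an error in R.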
Of course, we're going to change the one in the next iteration, when we put it in a loop. So we're going to change it to x, and then x is going to be first one, then two, then three. But this is the first letter. So now we have to write a little bit of logic testing if the first letter is a greater-than symbol, if it's empty, or if it's something else, a G, an A, a T, or a W, because there can be protein sequences as well. So the way that we do it, and I'm going back to Notepad, since we're now starting to write code, right? I'm just going to take the line that we just had, right? And I'm just going to say, okay, this is my first letter. So this is the code that I can use to extract the first letter from a given line of this variable at this position. The first thing that I need to do is go through all of them, right? So I'm just going to say, well, for x in one to the length of mtnd1. And I'm just going to say, okay, this will give me the first letter. And now by changing the one here to x, we can do it for all of them. Of course, I want to check that this works. So I use the cat function to print to the screen. I say cat, and I say x, which is the line number. Then I'm going to do a space. Then I'm going to print the first letter. And then I'm going to do a backslash n, so a new line, an enter, afterwards. So the thing that I want to happen now is that it goes through all of the different lines of the file that it loaded in, and it will show me the first letter, but it will also show me the number of the line that I'm on. So let's just take this piece of code, go to R, see if it works. If we do it, we indeed see that, oh, yeah, it seems to work, right? First line, we get a greater-than symbol, perfect. This is an identifier. Then we get a G, which is a DNA line. The third line is also a DNA line. And then here at line 18, we see, oh, here's another identifier. So now we just have to figure out what we want to do.
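The loop described above could look roughly like this. The five-line vector is a made-up stand-in for the real file, and firstletter is an assumed spelling of the variable name used in the lecture.

```r
# Hypothetical toy lines standing in for the exported FASTA file
mtnd1 <- c(">toy_gene cdna", "GTGTTC", "ATTAAC", ">toy_gene cds", "GTGTTC")

# Go through every line number, from 1 to the number of lines
for (x in 1:length(mtnd1)) {
  firstletter <- strsplit(mtnd1[x], "")[[1]][1]   # first letter of line x
  cat(x, " ", firstletter, "\n")                  # line number, space, letter, new line
}
```

Writing 1:length(mtnd1) follows the wording of the lecture; seq_along(mtnd1) does the same and also behaves sensibly when the vector is empty.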
And what we want to do is extend this logic, right? We can now use an if statement to say if it is empty or if it's a greater-than. So we're just going to take this step by step. So the first step is to, oh, that's the wrong one. So the first step is to say, well, we can do an if statement, right? So if first letter is equal to the greater-than symbol, then instead of printing this cat, right, instead of printing what was on the line, I want to do something else. So I just want to say cat. And what do I want to print? I want to print identifier, yay, because that's what we found, right? And of course, I want to add a new line. And this will now print identifier in big letters. And then I'm going to say else. So if this does not match, I want to do something else. And what do I want to do? Well, I want to print x again, like we did before, and I want to then print the first letter. So now for every line that starts with a greater-than symbol, this little piece of code will yell identifier. And for all of the other lines, it will just print the line number and then the first letter on that line.
All right, so let's go back to R, see if this code works. So we do the same thing. And now you see that indeed it starts yelling identifier. But we also see that we get an error, right? Missing value where TRUE/FALSE needed. And this is because some of the lines are actually empty, right? So line 59, if we go back, right, 59 here actually says NA, so not available, because there is no first letter, because the line doesn't have a first letter. So let's fix that as the next step, right? So we go back to our Notepad++ window. And now, before we do this if statement, we want to make sure that the first letter exists. So we can say again, if is.na of first letter, right? So this checks if the first letter is NA. I will make it a little bit bigger so that you guys can see. And I will actually do code highlighting on it as well.
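To see where the error comes from, the failing comparison can be reproduced on a single empty line. This is an illustrative sketch, not the lecturer's script: tryCatch is used only so that the example keeps running and can show the error message instead of stopping.

```r
# An empty line has no letters, so taking the first letter gives NA
firstletter <- strsplit("", "")[[1]][1]
print(firstletter)

# Comparing NA with ">" gives NA, and if() cannot decide on NA, so R raises an error
result <- tryCatch(
  if (firstletter == ">") "identifier" else "other",
  error = function(e) paste("error:", conditionMessage(e))
)
print(result)

# The check that fixes it: test for NA before comparing
print(is.na(firstletter))
```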
So now I'm just going to say if the first letter is NA, then I want to say empty, right? Because I just want to say this line is empty. And else, I want to do this, right? So I'm just going to put the whole thing in an else branch. So if it is NA, print empty, else we now know that the first letter is not empty. So we can check if it is a greater-than, and then we can shout identifier when it is a greater-than symbol. And else, we're just going to print the first letter that is there. All right, so let's go back to R, run the code, and now it should not give us an error at line 59, the first one which is empty. Instead it should yell empty at these points, right? So now we see indeed that at line 59, it yells to us, this is an empty line. And here at the end, we also see that there are two empty lines at the end. And now, to make sure that we catch everything, instead of printing 61 G or 62 A, we go back to Notepad and we change the else branch here to say, okay, we know that if it is not empty, and if it is not an identifier, then it has to be a sequence. All right, so let's go back to R. And in R, we now see indeed, yes, it works. So we have an identifier, then we have a whole bunch of lines which contain sequence. We have another identifier, and then we have a whole bunch of lines with sequence again.
And this was the whole assignment. The whole assignment is just for you guys to get familiar with what a for loop is. So how do I iterate through something, right? So the mtnd1 thing in this case. So for each line of the file, do some splitting. So do some manipulation on the thing that you are looking at, right? So strsplit it and then get the individual letters, take the first letter of the first split item, put that in first letter, and then use a little bit of logic to write an if, else statement. Of course, there are other ways to do this.
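Putting the pieces together, the nested if, else logic walked through above might look like the sketch below. The toy vector and the exact words that are printed are assumptions; the structure (check for NA first, then for the greater-than symbol, otherwise sequence) follows the lecture.

```r
# Hypothetical toy lines: two records, each followed by an empty line
mtnd1 <- c(">toy_gene cdna", "GTGTTC", "", ">toy_gene cds", "ATTAAC", "")

for (x in seq_along(mtnd1)) {
  firstletter <- strsplit(mtnd1[x], "")[[1]][1]
  if (is.na(firstletter)) {
    cat(x, "EMPTY\n")            # no first letter: the line is empty
  } else {
    if (firstletter == ">") {
      cat(x, "IDENTIFIER\n")     # description line of a FASTA record
    } else {
      cat(x, "SEQUENCE\n")       # anything else is treated as sequence
    }
  }
}
```

Checking is.na first is what keeps the second comparison safe: by the time the code compares with the greater-than symbol, the first letter is known to exist.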
So if I look at my original answer, then I actually used slightly different code, because instead of going into an else branch, I used an else if statement. So saying that if it is NA, cat empty, else if the first letter is a greater-than, it's an identifier, and else it is a DNA sequence. Good. So those were the assignments from last time. I hope everyone was able to at least start R and load in the file. Of course, it depends on which options you clicked, if we look at the Firefox window, right? If you only selected give me the coding sequence and the genomic sequence, then of course you only had two identifiers in the file. But again, the first sequence line of the file should contain 13 As, because all of these DNA sequences are identical. And of course, we could go back and say, well, we don't want to have only two. In the export, I can go back and say, well, give me everything, right? So bloop, give me the cDNA, the coding sequence, the peptides, the exons and the introns, and then you press next, and then of course you would have a file which looks very similar to the file that I had. So depending on what you export, you will have more or fewer identifiers.
Good. So any questions, any remarks about the assignments? We'll wait a little bit since there's a little bit of delay. And of course, I will put my answers online and you can compare them to the answers that you had. And I will also make sure that I put the assignment somewhere where people that do not follow the course can get access to it. So no questions. Everyone's still awake. Everyone's still happy. Everyone like, oh, bioinformatics, yay, RNA lecture. I hope so. Good. So those were the answers. If there are no questions, then I assume that everyone is able to start R, and with all of the text, right? The assignments are actually like three, no, two A4s of text.
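The else if variant of the answer mentioned above can be sketched as a small function. The function name classify_line and the returned labels are hypothetical, chosen here for illustration; the lecture's own answer prints with cat inside the loop instead of returning a value.

```r
# Hypothetical helper: label one line of a FASTA file by its first letter
classify_line <- function(line) {
  firstletter <- strsplit(line, "")[[1]][1]
  if (is.na(firstletter)) {
    "empty"                      # no first letter at all
  } else if (firstletter == ">") {
    "identifier"                 # description line
  } else {
    "sequence"                   # DNA or protein letters
  }
}

# Hypothetical toy lines standing in for the exported FASTA file
mtnd1 <- c(">toy_gene cdna", "GTGTTC", "", ">toy_gene cds", "ATTAAC", "")
kinds <- vapply(mtnd1, classify_line, character(1), USE.NAMES = FALSE)
print(table(kinds))
```

Compared with nesting a second if inside the else branch, the else if chain keeps all three cases at the same indentation level, which is easier to read once there are more than two cases.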
So I hope that everyone was able to follow along with the assignment and saying, okay, so yes, I can do the for loop and then I can do this and I can write a little bit of an if statement. And of course, like learning how to program is part of bioinformatics, but like we don't, the course is not aimed at teaching you guys how to program. The course is aimed at you guys knowing where to find the data, which tools are out there and how to kind of research a certain gene for your own master project or PhD project in the end. But I do think it's important to see a little bit of R and to see more or less how code is structured. It's the same as with using the command line, right? You can get away with not using the command line but using the command line is one of these skills that you need as a bioinformatician. And the same thing is programming. If you really want to make a career out of bioinformatics, you just have to be able to program because otherwise you're stuck with just using tools that other people develop and that is not the core of bioinformatics. All right, so if there are no questions and no remarks, was everyone able to do it? Did people do it? Did people not do it? I'm just curious because like I'm a little bit worried about this year's participation. People are not really active in chat and not asking questions. Furthermore, I also have gotten no suggestions for the your own choice lecture. So remember at the end of the lectures that I have prepared, we have space for two lectures based on stuff that you guys want. So if you say I'm really interested in machine learning or I want to know what random forest is or I'm thinking about doing a master project and in this master project, there will be data analysis or there will be literature research and I want to know how I can use a computer to do automated literature research. 
Then let me know because like for me, it's really hard because I don't see you guys to kind of estimate if you're all snoozing and like, oh, we already know this or if everyone's sitting there biting their nails thinking this is way too hard and I'm never going to pass this course. So a little bit of feedback is always appreciated and if you have an upcoming master project or if you're a PhD student, do send me an email saying that I would like to know more about X because preparing a lecture takes a lot of time. And if I don't get any suggestions in like a couple of weeks, then you guys will just be stuck listening to something that I like which is well, could be fun but could also be very boring. So if you have something and you think like, no, I really want to learn more about DNA sequencing analysis, right? We had the whole DNA sequencing lecture last week and if you're thinking, no, I really want to know exactly how to do this, right? And I am probably going to do DNA sequencing then please show me how you do it. And then we can just do that, right? I'm here for you guys. So every week we have four hours and if you think I want to know exactly how to do DNA sequence alignment and trimming and base recalibration and then call SNPs, then let me know and then we can make a very detailed lecture where we, for example, take an example data set and go through it. All right, so for the rest of the lecture, a little bit of a word in advance.
First, there will be a lot of theory and this is just because of the fact that there are so many different types of RNA. I had to shift more or less all of the bioinformatics to the end of the lecture and that is because I think a lot of people that follow the course come from things like PQM and there is not a big background in molecular biology and for myself, I think that knowing a lot about molecular biology helps a lot when you do bioinformatics because bioinformatics always looks at things in a cell, so bioinformatics is very tied to molecular biology in a way. So you have to know all of the different levels that are there and you have to know, okay, so what are the peculiarities when looking at DNA? What are the peculiarities when looking at RNA or at proteins? So because of that, there will be a lot of theory about RNA and I hope that it's not too repetitive but again, if you already know a slide or if you're saying, well, we know what messenger RNA is, please stop talking about it then also let me know because that will save a lot of time. All right, so question to you guys. What is the central dogma of molecular biology? And I always think that this is a good question, right? Because molecular biology is something that, well, if you're doing biology, you should know the central dogma. And so what is the fundamental belief that we all have in kind of molecular biology on which all the biology is kind of based? So if we look at a cell, very simplistic, what does this central dogma look like? And it's a little bit of an esoteric question in a way but I just want some answer from you guys to see what you think is the central dogma in molecular biology. And molecular biology, of course, is molecular cell biology. So we're looking at cells and how a single cell functions and what is our kind of Bible? So what do we hold to be true above anything even if it might not be entirely true?
I should be using the cricket sound a little bit more. All right, nothing yet. Does my moderator want to do a guess so that at least someone says something in chat? You think this is small, but wait, it gets smaller. That is a good central dogma. That is a good central dogma. Although molecular biology is, of course, only concerned with things which are bigger than a couple of atoms, right? Because you have physics when you deal with electrons and protons. Then if you have molecules, then you're in the realm of chemistry and molecular biology starts above that. So above chemistry and above physics, there is molecular biology. Anyway, the central dogma is, of course, DNA is the carrier of genetic information. It is transcribed into RNA, which is the intermediate and then this RNA is translated into proteins, which are the effector molecules. So this is the central dogma. DNA codes for RNA, which codes for proteins and one is the carrier of genetic information. Then we have an intermediate layer for more or less communication between the DNA world and the protein world. And then we have the protein world, which are the things that do things. And this central dogma of molecular biology is not 100% true, but it's the thing that everyone always assumes to be true. A bit boring for a dogma, though. Yes, yes, yes. But it's better than love thy neighbor and God above everything, right? That's also a certain dogma, but that's a little bit boring as well. But this is the central thesis in molecular biology. So molecular biology says we have DNA. DNA is the carrier, the storage. We have RNA, which is the transportation. And we have proteins, which are the effector molecules. All right, good. So I think we should do a little break and then we will start the rest of the lecture more or less. The real part of the lecture, so history. So everyone get a nice cup of coffee, rest up, and enjoy the animated GIFs. And I will be back in around 10 minutes.
It's still a little bit short. We've only been doing like 50 minutes. So again, if there's any question, don't worry, shout it in chat, be active, participate. Participation is appreciated. Otherwise, I just have the feeling that I'm sitting here talking to myself, which is true and which is not bad. Like talking to yourself can be a very entertaining thing, but it's nicer if people just kind of say something and then like ask questions, right? Good, so yeah, I will take a short break and then we will start with the history overview and then we will more or less go through all of the different types of RNA and then we will talk about bioinformatics and RNA. So for now, I will stop the recording.