Good. So, control structures. Control structures are the things that make code tick. If you're writing code, or long scripts, then things happen not so much through variables but through things that are called control structures. Control structures come in two variants. We have branching, which means that a box can either go left or go right. And we have looping, which means that a box gets processed over and over again, or that you process boxes in a different order. In R you loop with a while loop, a for loop, or a repeat statement.

So, the if statement. I used to have the switch statement here as well, but I shortened it a little bit; I don't think the switch statement is that useful to learn. It exists, though, and it can do the same thing. In this case I am creating a variable called box and I put something in there, right? I could create a box with a logical value in it, a box with a numerical value, or a box with a character value. Now suppose I have code which works on a character, for example a function called right, which can only take a character, and I have a function called left, which works on numeric or logical values. Then I can just check the box: I check the class of the box using an if statement. I say: if the class of the box is equal to character, send the box to the right function; else, send the box to the function called left. Here you see how an if statement is structured. You have the keyword if, then round brackets. Everything between the round brackets is evaluated to a logical value. So it tests whether the class of the box is equal to character; if so it sends the box right, and otherwise it sends the box left. Of course, you can make multiple routes: R also has an else if statement.
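The branching just described, including the else if route, can be sketched roughly as follows. The functions right, middle and left are hypothetical stand-ins (the lecture only names them), and wrapping the test in a function called route is an assumption made here so the three cases can be tried side by side.

```r
# Hypothetical stand-ins for the lecture's right(), middle() and left() functions.
right  <- function(b) paste("right got:", b)
middle <- function(b) paste("middle got:", b)
left   <- function(b) paste("left got:", b)

# Send a box down one of three routes, depending on its class.
route <- function(box) {
  if (class(box) == "character") {
    right(box)
  } else if (class(box) == "logical") {   # note the space between else and if
    middle(box)
  } else {
    left(box)
  }
}

route("hello")  # character box goes right
route(TRUE)     # logical box goes to the middle
route(42)       # everything else goes left
```

The test on class(box) works here because each box holds a single simple value; for other kinds of objects class() can return more than one string.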
So that looks like this. Again the same code: if the class of the box is character, you send it to the right. Then else if, and there has to be a space in between else and if: else if the class of the box is equal to logical, send it to the middle function. And else, call the left function with the variable box. So the route depends on the class of the box. Of course, you don't have to test on the class of the box; you could also look into the box and see if the box contains five. So if box equals-equals (==) five, do something, else do something else with it. If statements are guiding boxes by doing logical testing. If you have something like a for or a while statement where you do a whole bunch of things, then usually you have an if statement inside it to process different things differently.

All right. Of course, there are more comparisons that can be done. For a single number, you can ask: if x is smaller than five, print "x is smaller than five". You can test if x is smaller than y. And of course, this only works for single numbers. It doesn't work for vectors, and that's the thing in R: if you want to use a vector, then you can use the all or any functions. So if all of x smaller than five, then all the numbers in x are smaller than five. Or if any of x smaller than five, then there is at least one number in x which is smaller than five. These things become very complex as soon as you start asking questions like: if the sample that I'm looking at is a male and has a certain body weight, then add it to the matrix, right? So you can use if statements to make subsets of your data when you go through all of the rows in a matrix. Often this is combined with a for or a while loop.

So: for, while and repeat. For example, I can put a thousand in the box. I say box equals a thousand, so it just has a value of a thousand. Then I'm saying for x in one to 10, right?
So I'm just saying: make a vector one to 10, and let x go through this vector one by one. And every time, what I want to do is take x out of the box and save the result back into box. The first time that I go through this loop, x is one, so a thousand minus one is 999. The second time, it is 999 minus two, so I end up with 997. The third time, x equals three, so 997 minus three is 994. After I'm done, I will have subtracted the numbers one to 10 from the box, so the sum of these numbers, which is 55, and the box will contain 945.

You can do the same thing with a while loop. For example, I again put a thousand in the box. The nice thing about a while loop compared to a for loop: in a for loop you need to know the bounds. I can only go when I know from where to where I want to go. A while loop is more flexible; you don't have to know the bounds. You don't have to say from one to ten. You say: while something is true, continue doing it. So if you have a matrix and you don't know how many rows there will be, because sometimes there might be five rows and sometimes there might be 500, then you can use a while statement and say: while I still have rows left in my matrix. But in this case, I'm just taking the same code that I had in the for statement and making a while statement out of it. I have a box, and now I have to define my own variable, takeout, which plays the role of x in the previous example. And then I say: while takeout is smaller than or equal to ten, make box equal to the box that I had minus the value that I take out. And then in the next statement, I have to increase my takeout variable, right?
Because in the for loop this happens automatically: x goes from one to ten and goes through the loop ten times. But here, of course, I have to make takeout bigger myself. Otherwise I would have an infinite loop. Without this statement it would just say box minus one is box, again and again, because the value of takeout would never become bigger than ten. So you have to be careful with while statements, because with while statements you run the risk of an infinite loop, where you just continue doing things forever and ever.

The repeat statement is in R. I never use the repeat statement; I always use a for loop or a while loop. It is there, and if you want to use it, feel free. I have never used it in all the years that I've been programming R.

All right, a little example that is a bit more logical. Here we define a variable called even, which contains all the even numbers from two to a hundred. Even numbers are numbers which are divisible by two, so we go from two to a hundred in steps of two. It contains two, four, six, and so on up to a hundred. And now I'm going to add up all the even numbers from two to a hundred. Before I start adding up these numbers, I define a variable called total, set to zero, which will hold the answer. Then: for number in even, total is total plus number. The first time, total is zero plus two is two. The next step is two plus four is six, then six plus six is 12, and it continues like that. It just goes through all the numbers in even and adds them all up. Of course, you could have used the sum function in this case, so sum of even, and that would also tell you what the sum is. But here you go through them one by one, so that you get to know what a for statement is.
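The three loops walked through above can be sketched as follows. The variable names box, takeout, even, total and number are the ones used in the lecture; box_for and box_while are extra names introduced here only to keep the two results apart.

```r
# For loop: x walks through 1, 2, ..., 10 automatically.
box <- 1000
for (x in 1:10) {
  box <- box - x
}
box_for <- box            # 1000 minus the sum of 1 to 10

# While loop: same result, but the counter is managed by hand.
box <- 1000
takeout <- 1
while (takeout <= 10) {
  box <- box - takeout
  takeout <- takeout + 1  # without this line the loop would never end
}
box_while <- box

# Adding up the even numbers from 2 to 100, one by one.
even <- seq(2, 100, by = 2)
total <- 0
for (number in even) {
  total <- total + number
}
total                     # same value as sum(even)
```

In everyday code sum(even) is the shorter route; the explicit loop is only there to show what the for statement does on each pass.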
So, the if statement. I'm always mixing up the words statement and expression, so let me call the thing between the round brackets of an if the condition. The condition always evaluates to a logical value. And then within the if you have expressions, and the expressions are the things that do something. They change, for example, a variable, right? And conditions can and will become complex. You can have a not: if not box equals-equals one, so when the box does not contain one, do something. Or: if the box is larger than one and, for which you use the double ampersand (&&), the box is smaller than a hundred. In R, because this is a logical value and this is a logical value, we now want to say that both of them need to be true. So if true and true, then you do the expression. If only one of them needs to be true, then you can use the vertical line (||): if the box is smaller than or equal to zero, or the box is larger than a hundred, you do something. This is more or less the opposite of the previous one, because there you are demanding that box contains a value larger than one and smaller than a hundred. And here you're demanding the opposite: the box needs to contain something which is smaller than or equal to zero, or something which is larger than, well, not equal to, a hundred. So a hundred itself does not pass here.

All right, there are different comparison operators in R. You have equals-equals (==), which means equal to. You have exclamation mark equals (!=), which means not equal to. And you have smaller than (<), larger than (>), smaller than or equal to (<=), and larger than or equal to (>=), and they are coded like this. So if you ever want to test a number in R, you can use these.
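A small sketch of these conditions on a single number. The helper name check is an assumption introduced here for illustration; the three tests inside it are the ones described above.

```r
# Evaluate the three conditions from the lecture for one box value.
check <- function(box) {
  c(not_one = !(box == 1),                 # box does not contain 1
    inside  = box > 1 && box < 100,        # both sides must be TRUE
    outside = box <= 0 || box > 100)       # one TRUE side is enough
}

check(50)    # not_one and inside hold
check(100)   # 100 is neither inside nor outside
check(-3)    # not_one and outside hold

# The same conditions used directly in an if:
box <- 50
if (box > 1 && box < 100) {
  cat("box is between 1 and 100\n")
}
```

The value 100 shows why the two conditions are not exact opposites: it fails both of them, and so does any value between 0 and 1.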
All right, and then we have a slightly difficult thing, because there is a vectorized and a non-vectorized version of the Boolean operators for AND and OR. You have the single ampersand (&) and you have the double ampersand (&&). The same thing applies to the single vertical line (|) and the double vertical line (||), which is the OR. The single AND is vectorized, and it can be used on logical vectors. So if I have v1, and I have 1, 2, 3, 4, and 5 in v1, then I can use the single AND to say v1 smaller than 4 and v1 larger than 2. This only holds for the number 3, so R will say false, false, true, false, false. This is of course very useful when you're taking a column from a matrix and saying: give me all the individuals which are more than 20 days old and less than 40 days old, or give me all the plants with a weight larger than 20 grams but smaller than 1 kilogram. Then you use the vectorized version.

However, in if statements you always need to use the double ampersand, or the double vertical line for OR. This version is not vectorized: it takes the first element of a vector and only uses that first element. True and false is of course false, but if I have a vector which contains true and false, and I ask true comma false, double ampersand, true, it will say true. Although one of the values is false, the non-vectorized version just takes the first element, compares that, and says: true and true, that's true. It will also issue a warning in this case. If you use the non-vectorized version on a vector, it will throw a warning and say something like: error in statement, only the first element is used. It's not an error, it's just a warning.
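The vectorized operators on v1 can be sketched as below, together with any and all for collapsing a logical vector into the single value an if needs. One caveat on versions: recent R releases treat the situation described above as an error instead of a warning (since 4.2.0 for if conditions longer than one, and since 4.3.0 for && and || on vectors longer than one), so that call is shown only as a comment.

```r
v1 <- c(1, 2, 3, 4, 5)

# Single & and | work element by element.
both   <- v1 < 4 & v1 > 2      # only the number 3 passes
either <- v1 < 2 | v1 > 4      # the first and the last element pass
v1[both]

# For an if, collapse the vector to one logical value first.
if (any(both)) {
  cat("at least one element lies between 2 and 4\n")
}
if (!all(v1 < 5)) {
  cat("not every element is smaller than 5\n")
}

# Deliberately not run: c(TRUE, FALSE) && TRUE
# Older R versions used the first element and warned; R 4.3.0 and later stop with an error.
```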
But when you see the warning that only the first element is used, then you are probably using the non-vectorized version where you should actually have used the vectorized version, or you forgot something else. In if statements, you need a single logical value. So the single ampersand and the single vertical line should not be used there, and the if statement will just take the first comparison. You get this warning message: in if, v1 or v1, the condition has length larger than one and only the first element will be used. As soon as you see this warning message when you're writing an if statement, the if statement is wrong, because you're using the vectorized version where you should actually be using the non-vectorized version. And then you have to make sure that v1 only contains a single number or a single value.

All right, so what can you do with this? You can use logical vectors as the index to another vector, right? For example, here we take ten to one (10:1), all the numbers from 10 down to 1. Then we can say 10:1 smaller than 5, which is false six times, for 10 down to 5, and then true four times, for 4 down to 1. And we can use this directly as the index into the vector, to subset the vector 10:1 and get only the elements which are smaller than 5. You can do this with matrices as well; all of these things that work on vectors also work on matrices. With matrices it is much more logical. Normally you have a big data frame with, for example, five different measurements. You have measured the body weight, or the yield, or the length of the tail, or the flowering time. And then you want to make a subset of your matrix. You can use the same structure: take a column of the matrix, ask which elements are smaller than five, and then use this as an index to the matrix. So do you want a little example of that?
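A minimal sketch of logical indexing on a vector and on a matrix. The vector is the 10:1 from the lecture; the matrix m with its weight and tail columns is made-up data, used only to show the pattern.

```r
# Logical index on a vector.
v <- 10:1
small <- v < 5        # six FALSE (10 down to 5), then four TRUE (4 down to 1)
v[small]              # keeps only the elements smaller than 5

# Same idea on a matrix: test one column, use the result to pick rows.
# Made-up measurements for four individuals.
m <- cbind(weight = c(3, 7, 4, 9),
           tail   = c(10, 12, 11, 15))
light <- m[m[, "weight"] < 5, ]   # rows whose weight is below 5
light
```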
Because I think, for me, it's very clear. But the thing is that when people start out programming, sometimes these things are not that obvious, and I've got experience, so I know how to look at these things. So if you want, just shout and I can give you a little example.

The thing with these logical vectors is that they need to be of the same length as the vector you index. If not, they will start looping around without a warning. So if you say 10:1 indexed by c true comma false, it will look in 10:1: 10 gets true, so it gives you back 10. The next one, nine, gets false. Then it starts over, so eight gets true again, and so on. You get only the even numbers here, which sometimes you want and sometimes you don't want. So make sure that when you're using a logical vector as an index, it has the same length as the vector from which you are selecting. And yeah, if you want an example of this, I can show you one on the microarray data that we have loaded in.

All right, so vector statements: you can combine these logical vectors to make an advanced selection in a vector. You can use, for example, the pairwise AND or the pairwise OR.

All right, example, good. I like doing examples much more. So we have this microarray data loaded, right? Let's take the microarray log data and call it XA, for example. If we look at XA and we look at the first five lines, then it looks like this: six columns, five lines. Now imagine that I want to have all of the genes which are highly expressed in the first two samples, but low in the third sample, right? Highly expressed is, for example, above 10. So I want column 1 of XA to be higher than 10, and column 2 of XA also higher than 10, right? So highly expressed in the first two.
And I want it to be lower expressed in the third column, so lower than 10, right? When I now press enter, it does a test for each of the elements and says whether it's true or false. Now I can say: this is my subset vector, the stuff that I want to subset from the matrix. So I store it in subset, then I index XA with subset, and I'm only going to show you the first three columns, and then it looks like this. You see that every row that is now in my subset has a value higher than 10 in the first two columns and a value lower than 10 in the third column. So this is how you can build things up. You could do the same thing when you have males or females; you can make this as complex as you want. And you could do a subset like this in one go: instead of storing it in a variable subset, you could write the test directly inside the square brackets.

Generally, when you do this, you put a which around it. Then which gives you back the indexes, in this case the rows of XA for which this is true, and you store them in something which I always call II, which is the indexes for me. I can now ask, for example, the length of II, and this tells me that out of the 22,000 probes on the array, there are 16,332 probes for which this is true. And then I could say: that's too many, right? I can't make a nice plot of that. So let's make this a little more strict, and say that XA columns 1 and 2 need to be higher than 11. Then I ask for the length again, and now I have 130 probes for which the first two samples are really high and the third sample is really low. And 130 is something that I can plot, right? So I can take XA at II, and then I just say: give me a histogram of this thing for the first three rows, right?
So I'm saying: give me the first three columns, and then make a histogram, no, not a histogram, make a heat map. And now you can see that indeed, the first two samples, which are shown here by name, so it reorders them based on the box plot here, but you can see that it's high in these three, and it's low in these other ones. So it allows you to do very complex subsets in a single statement.

Often when we are measuring, like, 3,000 mice at work, we only want the measurements which were done at day 70, so after 10 weeks, and then we only want the mice with a weight higher than something. So you have a whole bunch of conditions. If you have a big matrix, you often want to make it a little smaller, a little more manageable, because you want to look at only the males or only the females, and then only the males at day 70, or only the males at day 63. So making these subsets and selections can become really, really complex. But that's how I do it. Clear? Another example? More examples? A better example? If you didn't like the example, just say so; I'm here for you guys. You said that you wanted to have an R lecture and learn more about programming, so this is how you do this.

And you can even use the OR statement. Like: I want XA column 1 higher than 11, or XA column 2 higher than 11. Of course, if you did just that, you would get a whole bunch more, right? But then add something like the third column lower than 8. So now the first one or the second one is high, and the third one is low. And in this case, because I'm using an OR statement and an AND statement together, I have to add another pair of brackets, because here we are dealing with the operator precedence of the Boolean operators. If I now ask the length of II, there are 43 of them. So I can then just show the first 43.
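The selection steps from this demo can be sketched on a tiny stand-in matrix. The lecture's microarray data is not available here, so xa below is made-up data with six probes and three samples; the thresholds 11, 10 and 8 are the ones mentioned in the demo, and the names jj and kk are introduced here for the two OR variants.

```r
# Made-up expression matrix: 6 probes (rows) x 3 samples (columns).
xa <- matrix(c(12.0, 11.5,  7.0,
               12.0,  9.0,  7.0,
                9.0, 12.0,  7.5,
               12.0, 12.0, 11.0,
                8.0,  8.0,  6.0,
               11.2, 11.8,  9.0),
             ncol = 3, byrow = TRUE,
             dimnames = list(paste0("probe", 1:6), c("s1", "s2", "s3")))

# High in samples 1 AND 2, low in sample 3; which() returns the row indexes.
ii <- which(xa[, 1] > 11 & xa[, 2] > 11 & xa[, 3] < 10)
length(ii)
xa[ii, 1:3]

# High in sample 1 OR sample 2, and low in sample 3: the brackets matter.
jj <- which((xa[, 1] > 11 | xa[, 2] > 11) & xa[, 3] < 8)

# Without brackets, & is evaluated before |, which selects different rows.
kk <- which(xa[, 1] > 11 | xa[, 2] > 11 & xa[, 3] < 8)
```

On this toy data jj and kk differ: without the extra brackets, every probe that is high in sample 1 passes, whatever its value in sample 3.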
And now you see that they're still high in one of the two, but they don't have to be high in both; that most of them are still high in both probably has something to do with the biology. All right, back to the PowerPoint. So you can make these subsets on vectors, but also on matrices, to select on certain columns having a certain value.

That's my phone. I'm going to have to pick that up. Let me just mute myself for a second. PhD students, PhD students: can't live with them, can't live without them.

All right, so complex vector statements, and especially for matrices, these are really, really handy. The next one. Oh, I already had a slide about this: this works for matrices as well. The example here only uses a single column, but you can add more columns, right? It's the same as the example that I just showed you. So m1 is a matrix; I just put the numbers one to nine in the matrix. And then I say: give me everything for which column one is lower than three, by testing the first column against three and then making a subset. I can also ask for a single one, and then of course it gives me back a vector.

So can you get a list of probes ordered by value? Yeah, you can just use sort or order. If we go to R, let me open up my R window. Here we have, for example, the subset from before. This is not too much, right? You can see here that one of the two is high, the other one might be not high, so lower than 11, and then this one is low. Now imagine that I want to order them by this column here, GSM-something. I could just take this column and sort it. Sort will give me back the sorted numbers. But in this case I don't want to have the numbers back, I want to have the indexes back, right? So I say sort, with index.return set to true.
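The sort call being described here can be sketched on a small made-up matrix; xs and its values are assumptions for illustration, while sr is the lecture's name for the sorting result. The index.return argument and the x and ix elements of the result are part of base R's sort.

```r
# Made-up subset: 4 probes, 2 samples.
xs <- matrix(c(11.4, 7.1,
                9.8, 6.5,
               12.3, 7.9,
               10.6, 5.2),
             ncol = 2, byrow = TRUE,
             dimnames = list(paste0("probe", 1:4), c("s1", "s3")))

# Sort one column and ask for the positions as well as the values.
sr <- sort(xs[, "s1"], index.return = TRUE)
sr$x     # the sorted values, low to high
sr$ix    # where each sorted value sat in the original column

# Reorder all rows of the matrix by that column.
sorted <- xs[sr$ix, ]
sorted

# order() returns only the positions, so it gives the same row order directly.
same <- identical(order(xs[, "s1"]), sr$ix)
```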
So now it gives me not only the sorted values, from low to high, but it also tells me that the first element here, the lowest element, is at position number 13. You see that here it has something called ix. So I can store this in SS, or I can call it SR, for sorting results. And then I take XA at II, because I already made a subset before I sorted it, and from this I index by SR dollar ix. So: take the subset of probes for which the condition holds, and then order them by the sorting order that I just created. If I do this, showing all of the rows, not just the first three, but only the first three columns, then you can see that the first column is ordered from lowest value to highest value. And the other columns are, of course, reordered along with it. You can use sort, or you can use order. There's also an order function, which only returns the indexes. But I like to use sort explicitly.

You could sort the entire matrix as well; then I don't have to do the subset first. I just say: sort everything by the first column. And then instead of using II here, I take XA, order the rows by the sort index, take, for example, the first column, and plot it. Then it shows that all the values go from low to high in this matrix. And it sorts 22,000 values really fast. So you can order them by a numerical value. You could even order them by factor values, right? If you have males and females in the matrix, which are not ordered, then you could say: first give me the males and then give me the females. So you could do something like that as well.

Back to the PowerPoint. So matrices are just like vectors; you can use the same kind of subset selection on them. So, some special control structures, which you don't really use a lot.
But when you're writing code which is going to be used by other people, you write a lot of these. You saw that R sometimes issues a warning. You can issue a warning yourself with the warning function. So if x is smaller than zero, and you wrote your algorithm and it might not function properly with values lower than zero, then you can issue a warning to the user saying that this might go horribly wrong, or: hey, I suspect that you did something wrong. If you write an algorithm and it cannot deal with negative numbers at all, like the square root function, then you can just throw an error. An error in R is thrown with the stop function. You call it, and it breaks execution at that point.

There's also the tryCatch statement. You might want to do stuff which gives you an error sometimes, and with tryCatch, when there is an error, the code will still continue. One of the things which I use this for: you have, for example, a microarray data set and you're doing a t-test between one group and another group, right? The t-test needs at least three values to work. But because it's a microarray, there are sometimes missing values. So for each line, sometimes there might be five versus five, sometimes five versus four, sometimes five versus three, and sometimes five versus two, because there are just three missing values. But the five versus two test will throw an error in the t-test. So then I write a tryCatch around it to make sure that when the t-test throws an error, I just put in an NA, right? Because I do want to go through everything. I don't want to stop halfway through and then say: okay, this one I cannot test, so I should remove it. I don't do that. I just write a tryCatch expression to slurp up the errors.
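A rough sketch of the three tools, warning, stop and tryCatch. The function names check_input, strict_input and safe_pvalue are hypothetical, as are the numbers. To trigger an error that t.test is certain to raise, the sketch uses a group in which every value is missing; that is an assumption made for the example rather than the five-versus-two case from the lecture.

```r
# warning(): tell the user something looks suspicious, but carry on.
check_input <- function(x) {
  if (x < 0) warning("x is below zero, results may be unreliable")
  x
}

# stop(): break execution because the input cannot be handled.
strict_input <- function(x) {
  if (x < 0) stop("x must not be negative")
  x
}

# tryCatch(): run the t-test, and return NA if it throws an error.
safe_pvalue <- function(a, b) {
  tryCatch(t.test(a, b)$p.value, error = function(e) NA)
}

group1  <- c(5.1, 4.9, 5.3, 5.0, 5.2)
group2  <- c(6.1, 5.9, 6.3, 6.0, 6.2)
missing <- rep(NA_real_, 5)          # a row where one group was not measured

pvals <- c(safe_pvalue(group1, group2),
           safe_pvalue(group1, missing))
pvals   # first entry is a p-value, second is NA instead of a crash
```

Inside a loop over all rows of a matrix, this pattern lets the loop finish and leaves NA in the places that could not be tested.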
But for you guys, as starting programmers, it's not that important to know how to use a tryCatch. You generally want to use a tryCatch when you're writing code for other people.

All right, a little overview. Variables and control structures: like I told you, variables are boxes, at least in my mind, and control structures manage program flow. You have branching, which is if statements and switches. You have loops, which are for and while. And then you have warnings and errors. And this is more or less everything that you need to write a full operating system, right? You don't need anything more. If you have branching, looping, and data input and output, like the read.table function and the write.table function, then you have all the tools that you need to write anything that you want, because all of the code in R that you see is built up from these basic blocks.

Well, there's actually one block that I didn't tell you about, and I don't know if I put it in, but I did want to show you some of the advanced looping. We used the apply function in the previous assignment, right? You have lapply for applying a certain function to a vector or a list. And then you can add this dot-dot-dot, and these dots are additional parameters. So let's go to R, to show you an example of the lapply function. Imagine that I have a vector containing one, six, seven, so just some numbers, right? I have v. And now I want to lapply to v the function plus. In this case I have to quote the plus, otherwise it will be a syntax error. And then I want to add eight to everything, right? Now it gives me back every number in v, but increased by eight. You can make this more complex; you can write other functions. But you could, for example, apply the mean function to this list, right?
And then you would just get the same numbers back, because of course each element of v only contains one number, right? It takes the first element of v, which is one, and calculates the mean of one. It takes the second element of v, which is six, and calculates the mean of that. So this is not that useful. But if I have a list with v and v in it, and I call that aa, then I have a list with two little vectors in there, right? Now I can use lapply again to calculate the mean, and it will calculate the mean of the first element and the mean of the second element, which is just five and five. So the lapply function takes a vector or a list and then does the exact same thing for every element in the list or in the vector, in this case the mean function. But you can also use the plus function to add something to each element in the list, which is not a very useful example, but I hope it helps you understand how it works.

The apply function on a matrix is more or less the same, but now it works by rows or by columns. You have this MARGIN argument. So you can say: apply a function to each row or each column of a matrix. And this could, for example, be the t.test function, right? I could say apply to the matrix, over the rows, the function t.test. Then I have to specify which things I want to test against which, so I have to add some parameters to the t-test. But then you can use the apply function. And you can set MARGIN to c(1, 2) to indicate both rows and columns. I never use this, so forget about that last one. Always apply across rows or across columns. It has its usefulness, but I've never used it, so just forget about that one.

So, lapply, a little example. I have, for example, a list with one to five in it. And then here you have the NA.
So I can lapply the mean to my list. But then you see that the NA propagates: the NA in this second list element causes the second mean to be NA. So I can add additional parameters to the function, and these parameters are given to the mean function when it calculates the mean. In this case I'm instructing the mean function to ignore NAs, and then it calculates a mean for the second element as well.

And yeah, an apply example. I have my matrix, which is a matrix of one to 50, 10 rows, five columns. Here on the slide the columns argument went to the next line, so forget about that; it should be at the end of the line above. So I can apply to the matrix, over the rows, the mean. Then it computes the mean of each row. And I can apply to my matrix the mean over the columns, and then it calculates the mean for each of the columns. So means of the rows, means of the columns.

All right. So why use lapply and apply? They are almost always more efficient speed-wise, because they can use vectorization. They can be up to like 100 times faster than just using a for loop. But they are memory wise, much, much faster. And of course, when you use arithmetic functions, like the plus, then you should quote them, because otherwise it's a syntax error. R just reads the syntax, and if you don't quote the plus, then it will see it as one to two, comma, plus, comma, five, and that does not compute.

All right. So lapply will normally give back a list. You can use the unlist function to get a vector, right? If you saw here, when we did the plus, then you see that you get a list. It's a list which at the first element contains one number and at the second element contains one number. So there's a vector with one element in here, and a vector with one element in here. And then you want to use the unlist function to get back a vector which contains the means.
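The lapply, apply and unlist calls discussed above can be sketched as follows. The vector v and the 10 by 5 matrix follow the lecture; the exact contents of the list aa (one clean vector and one containing an NA) are an assumption based on the description of the slide.

```r
# lapply with an extra argument: add 8 to every element. The "+" must be quoted.
v <- c(1, 6, 7)
plus8 <- lapply(v, "+", 8)     # a list with one number per element
unlist(plus8)                  # back to a plain vector

# A list with two vectors, the second containing a missing value (assumed contents).
aa <- list(1:5, c(1, 2, NA, 4, 5))
raw_means <- lapply(aa, mean)                     # the NA propagates into the second mean
means <- unlist(lapply(aa, mean, na.rm = TRUE))   # na.rm is passed on to mean()

# apply on a matrix: MARGIN 1 works over rows, MARGIN 2 over columns.
m <- matrix(1:50, nrow = 10, ncol = 5)
row_means <- apply(m, 1, mean)
col_means <- apply(m, 2, mean)
```

For plain means of rows and columns, base R also provides rowMeans and colMeans; apply is used here because it works with any function.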
So unlist you can use to go from a list with a single value in each of its elements to having a vector again. And unlist is a very powerful command; you can do a whole bunch of stuff with unlist, and it helps you to parse through big data sets very quickly.

All right, a little overview of brackets, right? Round brackets you use when you call a function, and when you write a control structure statement, like the if statement. Square brackets you use when you index a vector, a matrix or a data frame. Double square brackets you use when you index a list. And curly brackets you use when you define a block of code, for example when you surround multiple expressions that belong to an if statement. So where does the if begin and where does the if end: that is marked by the curly brackets. You also use curly brackets when you build functions. This is just an overview slide for you guys, so that you know when to use which bracket. If you want to call a function: round brackets. If you want to index a vector, a matrix or a data frame: square brackets. When you want to index a list: double square brackets. And curly brackets to surround code that belongs together.

All right, and then the last section for today is escaping the inevitable. About strings: they are enclosed by the double floaty air comma thingy or the single floaty air comma thingy. If you want to combine two strings together, you can use paste. If you want to print a character string to the screen, you can use print. And if you want to print to anywhere, to the screen or to a file on your hard drive, you can use the cat function. I always prefer to use the cat function; I almost never use the print function.
But the paste function I use a lot, when you have multiple strings and you want to combine them together into a single one. However, of course, if you forget to close the double quote in a string, which happens a lot, then no command will produce any output, right? So then you have to look at the symbol in front of the cursor. If you have a larger-than symbol, that means that you are typing in a new command. But when you get a plus symbol in R, right, so you type in five plus five and it doesn't give you an answer, then that is because the string was not closed. I opened up a string which I forgot to close, so everything I type goes into the string. You can press the stop button at the top of the R window to stop the execution at that point, and then it will break off the input for the string. So for example, I can print things to the screen using print when I paste together hello and world. So hello and world will then go to the screen. If I want to save hello world to a file, I can do cat of paste, because I want to combine hello and world into one string, and then I use the cat function to write it to a file called out.txt. So this will make a new file, or clear the whole file if it already exists, and then put the words hello and world in there. If we want to print the double quote itself, which you sometimes might want to do, we need to escape the character. So some characters that need to be escaped are, for example, quotes, new lines, tabs, backslashes, and the backspace character. These are all characters that you might want to print to the screen or to a file, and you have to escape them by putting a backslash in front. So if I want to type, for example, let's go to R. I don't think that I have an example for this. But when I have, for example, so if I say cat, this is a slash, and then I want to print a backslash, right? Then if I do this, then I now escape this one, the closing quote.
So you get the plus symbol, so I did something wrong. But this I can print, right? If I want to print the slash, I have to use a double backslash. Then it prints the backslash character. If I want to put a new line at the end, right? You see that it prints it, and then it just continues on the same line. Then I have to say backslash n, which is the new line symbol. So now it prints this is a slash and then puts an enter there. I can also use the backspace character. So I can use backslash b, backslash b, backslash b, and then it says this, this I slash, right? Because now it just presses the backspace key three times here. So it will delete the a, delete the space, delete the s. So you can do fun things with this. But the escaping is necessary. And of course, it is very necessary when you want to print something like she said, colon, and then I want to use something between quotes. So I have to escape the quotes like this. And now it prints out she said, colon, hi in quotes, right? Because otherwise you would close the string by using this quote. So if you want to print a quote, you have to put a backslash in front of it, just to escape the inevitable string stuff. All right. So there are many. You have backslash t for tabs, backslash n for new lines, and you have the backslash and the backspace. These all need to be escaped. When you use cat, it prints verbatim. That means that you have to make sure to add an end-of-line character yourself. Otherwise R continues on the same line, like I showed you. And we can use the separator argument when we want to separate elements when we use cat. So I could use hello, world with a comma as the separator, and then it will print hello comma world. You can use a space as a separator, but you can also use a minus as a separator. So the separator allows you to put things together. I often use it when I create things like row names.
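A minimal sketch of the escaping and cat examples from this part of the lecture. The exact strings typed in the live demo are not in the transcript, so the ones below are approximations, and writing out.txt into a temporary directory is an assumption to keep the example harmless.

```r
# \\ prints one backslash; \n ends the line (cat does not add one by itself)
cat("This is a slash: \\\n")

# \" keeps the double quote inside the string instead of closing it
cat("She said: \"Hi\"\n")

# \t is a tab
cat("one\ttwo\n")

# sep sets what cat puts between its arguments
cat("Hello", "World", sep = ",")
cat("\n")

# paste with an empty separator, e.g. for building row names
row_ids <- paste("individual", 1:3, sep = "")

# Write to a file instead of the screen; an existing file is overwritten
out_file <- file.path(tempdir(), "out.txt")  # temporary location (assumption)
cat(paste("Hello", "World"), "\n", sep = "", file = out_file)
```

In the source code the backslash is typed twice, but the string itself holds only a single backslash character.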
So individual one, individual two, individual three: I just say paste, individual, comma, one to ten, comma, separator is, and then an empty string, so no separator. All right. Last one, some randomness. So if you want to get a random number in R, you can just always use four. But there are several distributions in R. So you have, for example, the uniform distribution, where every value has the same chance of being drawn. It's done by the runif function, so R-unif. runif will draw a number between zero and one, and every number has the same chance of being selected. We have the Gaussian distribution, also called the normal distribution. And this is done by the rnorm function in R. So if you draw like 10,000 or 100,000 numbers and you do a histogram of that, it will look like a perfect, perfect normal distribution. You also have the Poisson distribution. So the Poisson distribution is very famous in biology because it occurs everywhere. And, of course, the Poisson distribution is based on whole numbers. So the number of bees on a flower follows a Poisson distribution, right? Usually, there are no bees on a flower. Sometimes there's one. Sometimes there are two. But it's very unlikely to find like 10 or 20 bees on a flower. So it's a Poisson distribution where there's a high likelihood of getting numbers which are either zero, one or two. But from then on, the probabilities start going down kind of exponentially. So the chance of drawing a number like 50, the chance of finding 50 bees on a single flower, is of course very, very small, while the chance of finding zero, one or two is relatively high. So Poisson distributions. And you can do that in R by using the rpois function to draw a number from the Poisson distribution. If you need repeatable randomness, you have to set a seed. Otherwise, when you start R, it will set the seed based on the current time. So every time you will draw different random numbers.
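The three distributions and the seed can be tried out along these lines. This is only a sketch: the sample sizes and the Poisson mean (lambda) are made-up values, not numbers from the lecture.

```r
set.seed(1)                          # fix the seed so the draws are repeatable

u    <- runif(5)                     # uniform: 5 numbers between 0 and 1
g    <- rnorm(10000)                 # normal: mean 0, standard deviation 1
bees <- rpois(1000, lambda = 0.8)    # Poisson: whole-number counts, mostly 0, 1 or 2

# hist(g) would show the bell shape; table(bees) shows how rare large counts are

set.seed(1)                          # same seed again ...
u_again <- runif(5)                  # ... gives exactly the same five numbers
```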
But if you want to have a repeat, so if you want to draw the same random numbers as before, then you can set your seed to a number and draw some random numbers. So in this case, I set my seed to one, draw five uniform numbers between zero and two, and round them. So I draw one, one, one, two, zero. If I set my seed to two and then do the same thing, then I draw a different set of numbers. But if I set my seed back to one and then draw again five numbers between zero and two, then I get the exact same numbers as I did before. So because you want to have repeatable research, often your scripts need to have repeatable randomness as well. All right. So that was it for today, I think. So clean and reusable code, I just wanted to remind you. Programming is like working in a lab. Make sure you work cleanly and neatly. Code goes into a directory. Input data goes into another directory, and output data goes into yet another directory. You always want to keep these three things separate from each other, because the input data you never want to overwrite by accident. So that's why the input and the output data go into two different folders on your hard drive. Code goes into a separate directory because you don't want to overwrite your code from your script. And make sure that you think of speaking names for variables and functions. So i, of course, is not a very good variable name. So think of something like microarray, microarray log, microarray log QNorm. Speaking names: the name should represent what you put in there, if you name a variable. You always want to use indentation when you are using blocks, and you should align your comments at the end of the lines, because it just looks better. And this also makes the code much more readable for others. So here I have a function, which is mySum, which takes some input. It counts up, so it adds all of the numbers together, and then it returns the count.
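The slide with mySum is not part of the transcript, so the following is only a guess at what such a function might look like, written with two-space indentation and aligned comments. The argument name and the variable names are assumptions.

```r
# Hypothetical reconstruction of a function like the mySum on the slide
mySum <- function(numbers) {
  count <- 0                     # running total, starts at zero
  for (number in numbers) {
    count <- count + number      # add each number to the total
  }
  return(count)                  # give back the total
}

total <- mySum(c(1, 2, 3, 4))    # adds up to 10
```

In real code the built-in sum function does the same job. The loop is written out here only to have nested blocks to indent.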
But you see, every time that I open up a curly bracket, I then put the stuff inside of the curly bracket with two spaces or a tab. Use spaces. People who use spaces in programming earn more money than people who use tabs. I don't know why. It's research by Stack Overflow, but it's a very significant difference. It's around 10,000 euros a year more when you use spaces compared to people who use tabs, even when you use the exact same language. So looking at people like Java programmers or C programmers, of course there's a difference in income between them, but there's a major difference in income between using spaces and programming Java, or using tabs and programming Java. And of course, make code readable for others. So add comments, explain what the code is doing and why it works in the way that it does. 5:03. So just a little bit over time. I hope that there are still people listening, as we didn't have much interaction in the last couple of minutes, but that was it for today. The homework for today I did not make yet. I told you that I was kind of arguing with one of my PhD students this morning, and that's the same one that just called on the phone as well. So I will put it on Moodle tomorrow. When I put my presentation on Moodle, I will also put the homework there. I will at least make sure that it's there before Friday at five, because then I'm going home. So don't look at Moodle yet. Don't start mailing me like, oh, the homework is not there. The homework will come. There is some homework in R. I just wasn't able to sort it all out, because I want the homework to kind of align with the lecture, and I changed the lecture a lot, or at least I put two lectures together. All right, are there any questions, comments, ask me anything, speak now or forever be silent? I don't know what people say nowadays. So yeah, if you come up with a good question, that's really nice. If you don't have any questions, then: how to use different directories?
Well, using set working directory. So what I do is, I told you that I always start with a set working directory, right? So I set my working directory, read in the input files, and then I have a set working directory to another location to write the output files. Yeah, so if I would look at code, so let's get a window, let's use a Notepad++ window for this. So I have my header, then I have set working directory, and then I set C, colon, slash, input, slash, files, slash, where they are, and then I load. So I read a couple of tables, right? So I do read table, I read in table1.txt, and I store this in v1. And then I do this a couple of times, right? So I read all of the things that I need. And then I do another set working directory, and then I say output files, where are they? And from then on, when I do a write table, I can never overwrite one of the input tables, just to make sure that I never touch them. So I always have a set working directory at the beginning to go to where the input is. Sometimes I actually have an additional one, which is where my scripts are located. So I have something like this, and then I do a source, because you can load R scripts like this. So I generally have a functions.R script where I put all my functions. So then I load that first. So I go to where my scripts are located. I source the scripts that I need. Then I set the working directory to where my input files are. I load my input files, and then I set my working directory to where I want the output files to be, and then I write them. So this is a very common structure that I have in all of my coding files: just go to where the scripts are, load the scripts, go to where the input is, load the input, go to where the output is, and then write. And of course, the code is here, right? So here you would have the code to analyze everything, because the code to analyze goes here. More questions.
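The script structure described above might look roughly like this. It is a sketch under assumptions: the directories are created inside a temporary folder so that the example runs anywhere, whereas the lecture uses fixed paths on the speaker's own disk, and the stand-in function double_it and the contents of table1.txt are invented.

```r
# Hypothetical project layout inside a temporary folder
base       <- file.path(tempdir(), "project")
script_dir <- file.path(base, "scripts")
input_dir  <- file.path(base, "input")
output_dir <- file.path(base, "output")
for (d in c(script_dir, input_dir, output_dir)) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# Stand-ins so the sketch runs: a functions file and one input table
writeLines("double_it <- function(x) x * 2", file.path(script_dir, "functions.R"))
write.table(data.frame(value = 1:3), file.path(input_dir, "table1.txt"),
            sep = "\t", quote = FALSE)

old_wd <- setwd(script_dir)          # 1. go to where the scripts are
source("functions.R")                #    and load them

setwd(input_dir)                     # 2. go to where the input is
v1 <- read.table("table1.txt", sep = "\t", header = TRUE)

v1$doubled <- double_it(v1$value)    # 3. the code to analyze everything goes here

setwd(output_dir)                    # 4. go to where the output should be
write.table(v1, "result.txt", sep = "\t", quote = FALSE)

setwd(old_wd)                        # put the working directory back
```

Because every write.table call happens after the last setwd, nothing can land in the input folder by accident.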
Do you want to do a headcount of how many people are still awake? Is my moderator still awake? Oh, just as Sara has redeemed the highlighting message. Oh yeah, you can earn channel points. That's true. That's true. You can highlight your message to make it stand out more. It's not like there's like a hundred people shouting, but yeah, you earn channel points. So I totally forgot about that. We could actually do a commercial break now just before we finish the stream. So my moderator is also still awake. So that's good. That's good. All right. So let's see who else is in the chat. So Alexander should still be here. Damn, there's such a lot of bots in here. Bing Cortana, casino tanks, clear your browser history, troop dog, extra more, ice wizards. Oh, that's a nice name. Joint effort. Katharina Fernando Rosenthal is also here. Let's do this music. Let's do this whole stuff. Maybe I look out of this window. Strow here. So Max actually joined. He didn't say anything, but he joined. Oh, you have 1.6k channel points and counting. Oh, that's really nice. I have infinite channel points for my own channel. It actually has the infinite symbol there, which I don't know what I can do with it. Why would I use channel points? Oh, they come with the affiliate status. Yeah, the channel points. Yeah, they're affiliate. So you earn channel points by just watching me and listening. All right. So are there any more questions? Any more questions? Any more questions? Oh, Max is just collecting the, yeah, that's probably what he's doing here. I think he actually earns them even faster than the normal viewers because he's actually an Amazon Prime Twitch subscriber thingy. I mean, the bots come with the status. Yeah, that's true. Yeah, they're just watching me. All right, Alexander, thanks. Thanks for watching. Choose. Let's come with the status. There's nine people watching, so there's no real status attached to that.
It's just funny to see that the user list is like 20 people long and there are only like nine viewers listed. All right, so if there are no more questions, then I am going to wrap it up. Thank you all for watching. Thanks for the questions. Thanks for joining and at least trying to answer some of the questions. And next week we will have proteins. So I will see you all next week. Then bye-bye.