the recording, and then we can continue. So welcome back, everyone, also when you're watching this on Moodle. All right, so types of data. You have logicals, which can be true or false. You have numerics, which are any number. You have characters, which can be single characters, words, letters or sentences, so you can have as many characters within the quotes as you want. We have vectors, which can be numeric, character or logical, and we have matrices, which are created with the matrix function: you fill in the numbers that you want to have in the matrix, the number of rows and the number of columns.
There are some more advanced types in R. A vector and a matrix can only be of a single type: a vector is either numeric, character or logical, and a matrix is also only numeric, character or logical. But if I want a matrix in the sense of a table that you have in Excel, then in an Excel table the first column can contain numbers and the second column can, for example, contain letters like F for female or M for male, or other types like true and false. So a data frame is kind of a matrix, but not quite, because it can contain multiple basic types: each column in a data frame can be of a different type.
A list is even more complex. A list is very similar to a vector, it's things behind each other, but now every element of the list can contain anything. So I can make a list which has a character at the first position, at the second position I put in v1, which is a vector, and at the third position it has a number. I can even put a matrix in there if I want to: I could say I have a list where the first element is Fred, the second element is a matrix and the third element is a single numerical value. And because R is based on statistics, it also has factors.
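To make the overview above concrete, here is a minimal sketch in R. All variable names and values (is_raining, weight, v1, df and so on) are made up for illustration and are not taken from the slides.

```r
# Basic types (all values here are illustrative assumptions)
is_raining <- TRUE      # logical: TRUE or FALSE
weight     <- 23.5      # numeric: any number
name       <- "Fred"    # character: anything between quotes

# A vector holds several values of ONE basic type
v1 <- c(1, 2, 3)

# A matrix: the values, the number of rows and the number of columns
m <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3)

# A data frame: like an Excel table, each column can have its own type
df <- data.frame(weight  = c(21.3, 25.1, 19.8),
                 sex     = c("F", "M", "F"),
                 treated = c(TRUE, FALSE, TRUE),
                 stringsAsFactors = FALSE)

# A list: every element can be anything (character, matrix, number, ...)
l <- list("Fred", m, 42)

class(is_raining)   # "logical"
class(weight)       # "numeric"
class(name)         # "character"
sapply(df, class)   # one type per column
str(l)              # shows the structure of the list
```

Calling str() on any of these objects is a quick way to check what R thinks the type is.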
So factors are categorical variables which have a certain number of levels. For example, something like gender: when you do statistics, gender is normally a categorical variable, because you have males and you have females. Tell us, okay, thank you for following. So the factor is a different type, so that R understands that in statistics it has to treat a factor differently than, for example, a numerical value or a character value.
And of course in R you have comments. Use them often: always use comments to describe what you are doing when you're writing a script. Comments start with a hashtag, and everything after the hashtag is ignored, so this is your free space to write. You see that even here I use comments to write things down, so that I can still copy-paste this into R and it will still work: here "numeric vector" is just ignored by R, so it will still create a vector a with the numbers in there, and the comment says what it does.
All right, so that's more or less the whole type system of R. You have logicals, numerics and characters. A vector is something which is of one basic type, so either numeric, character or logical. A matrix is the same, so every element in the matrix is numeric, character or logical. If I want a matrix which is more similar to a table in Excel, then I can make a data frame, and now every column can have a different type. If I just want to store a list of different things, I can use the list keyword and put in anything that I want: I could make a list where the first element is a matrix or a single character, the second element is a vector, and so on. And I can have a factor, which is a special type used in statistics for a categorical variable. All right, so a lot of terms, a lot of difficult things, and to check we have a quick self-test for you guys.
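A short sketch of the two things described above, comments and factors. The numbers in the vector a and the values in sex are assumptions chosen for illustration.

```r
# Everything after the hash sign is ignored by R, so this block can be
# pasted into R as it is.
a <- c(1, 2, 5.3, 6, -2, 4)   # numeric vector

# A factor is a categorical variable with a fixed set of levels
sex <- factor(c("male", "female", "female", "male"))

levels(sex)    # the categories, sorted alphabetically: "female" "male"
nlevels(sex)   # 2
table(sex)     # how often each level occurs
class(sex)     # "factor"
```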
Good, so just throw in chat what you think it is: what is the type of this thing? I'm just going to wait and not continue. Okay, so the first guess is from Tokoforol: logical. Alexander says character. Any more guesses? There are no wrong answers. Well, there are wrong answers, but participation is key. Okay, Mata Klaus, yes, Tokoforol, logical. All right, so this is a character. Characters you can recognize because they have these quotes, so this is a character which has TRUE written in there, but it is not a logical. Yeah, mean. Yeah, of course. Well, that's why the type system in R is screwing you over every time: it's the minutiae that count. It's just like an experiment, right? If you do an experiment and you take the chemical which has the quotes around it, then you're not taking the chemical that you want.
All right, second one, what is this? Character, character, character, character. Yeah, the same. Sure, you're not going to fall for the same trick twice, right? So this is indeed a character, which has the value 1 in there. "Fool me once", yeah, that's true. There will definitely be something like this on the exam, and I will do my best to trick you guys, so be very mindful when I ask these types of questions.
All right, third one: 1e+11. Numerical, numeric, numeric, numeric, numeric, numeric. Very good, this is indeed a numeric value, just written in scientific notation. R understands scientific notation.
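The first three quiz answers can be checked directly in R with class(). This is a minimal sketch; the variable names q1 to q3 are only there for illustration.

```r
q1 <- class("TRUE")   # quotes make it a character, even though it says TRUE
q2 <- class("1")      # same trick: a character that contains the digit 1
q3 <- class(1e+11)    # scientific notation is still a numeric

q1              # "character"
q2              # "character"
q3              # "numeric"
class(TRUE)     # "logical": without the quotes it is a logical
class(1)        # "numeric": without the quotes it is a number

1e+11 == 100000000000   # TRUE: 1e+11 is just another way of writing this number
```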
All right, number four. Same, numeric, same, numeric. You guys are hard to trick. This is indeed a numerical value, so bonus points for anyone who knows which number this is, and you have to be quick, because otherwise I assume that you Googled it. Zero, zero? No, no, this is the number 137. It's written in hexadecimal format: 0x means that it's in hex, so it is eight times 16 plus nine, I think. Let me check that: eight times 16 is 128, plus nine, so 137.
All right, next one. Oh, you can see my mouse. What is the type of this? Comment, comment, comment. Very good, very good. A lot of people think that this is a color, and it isn't. It's a comment because of the hashtag. The funny thing is, if you put quotes around it, so double quote, this thing, double quote, then R can actually use it as a color. The R type system is complex. Written like this, it is not a color; if I put the quotes around it, then R recognizes it as being a color. This is a color notation, so if you're used to doing HTML and stuff, then this is what a color looks like, but if you type it into R like this, R will just ignore it, because it is a comment.
All right, next one. Logical, logical, logical, logical. Yes, this is a logical, very good. Next one, what is this? Mario says factor, Alexander says "function?". Factor, yes, it is a factor, because it takes the value TRUE and then forces it to be a factor. An "as" function always forces something to be of that type, so as.factor will always return a factor value.
All right, last one. People? Silence. I should have a sound effect of a cricket.
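The hexadecimal, comment and as.factor answers above can be reproduced as follows. This is a sketch: the value 0x89 follows from the "eight times 16 plus nine" calculation in the lecture, while the color code "#FF0000" is an assumed example, since the transcript does not say which color was on the slide.

```r
# Hexadecimal: 0x89 means 8 * 16 + 9
hex_value <- 0x89
hex_value          # 137
8 * 16 + 9         # 137
class(hex_value)   # "numeric"

# A line that starts with a hash sign is a comment, so R ignores it:
#FF0000

# With quotes it is a character string that graphics functions can read
# as a color ("#FF0000" is an assumed example value)
colour <- "#FF0000"
class(colour)      # "character"
col2rgb(colour)    # red, green and blue components of that color

# as.factor() forces its input to be a factor
f <- as.factor(TRUE)
class(TRUE)        # "logical"
class(f)           # "factor"
```

Note that class() of the quoted color is still "character"; it only acts as a color when it is handed to something that expects one, such as col2rgb() or the col argument of a plot.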
All right, so we get numeric, true, logical, "no clue", false, character. It's checking if it is a character: false. False, logical, logical, indeed. "Is this a character?" will return FALSE, but the type of FALSE is logical. So I take a number, then I ask a question; the answer can be true or false, in this case false, and the type of FALSE is logical. Of course, yeah, it can become very complex and very sneaky as well. So just remember, the type system in R is difficult. I still struggle with it even after programming in R for 15 years; I sometimes get surprised by how R interprets certain factors or automatically casts one type to another. But I think you guys now have a pretty good understanding of which types are there.
So, one more thing about indexing. I told you that indexing you do with square brackets, unless you have a list. The list is a special type because it can contain anything, so you index a list by using double square brackets: square bracket, square bracket, the element that you want, square bracket, square bracket. So say I make a list w which has as its first element something with the value Fred, which I named name, then a numerical vector which I call numbers, then a single numeric called age, and at the fourth position I just put in a matrix, and I say this matrix has two rows and two columns. If I type w into R, it shows me that these are the things that are in there. If I want to select the first element of w, then I have to say: from w give me the first element, and then give me the first element again, because R assumes that, since it is a list, it can contain anything. If I say with single brackets, from w give me the first element, I get the element which is called name together with Fred, and if I only want to have Fred, I have to put in this double-bracketed 1. It is of course a lot better to just use the name, so I can also use the dollar sign to select from a list. So I can say w$, so from the list w take the thing
which is called numbers, and then from numbers take the second and the third element, so this will return 2 3. I can also select the fourth element from w and then select the first column of the element that got returned, and this will give you the first column of the matrix, so it will tell you 1 0. Again, it's better to use the dollar sign, because it's much clearer what you're doing: from w, from the thing which is called matrix, give me [1, ], which is the first row of the matrix. So double square brackets you use with lists. That's just the way it is; I don't like it, no one likes it, but it's because a list can contain anything, so you have to be explicit about which element of the list you want to select. So double square brackets: only use them with lists.
All right, so matrices and data frames are slightly different, because instead of a single length they have an nrow and an ncol. The nrow function tells you how many rows a matrix has, and the ncol function tells you how many columns a matrix has. If you want to get the row names of a matrix, you say rownames of the matrix variable. You can also get the column names, and you can also set the column names and the row names. So if I have a three by three matrix, I could set the row names to be A, B and C, and I can also set the column names of the matrix to be A, B and C. So the function rownames allows you to pass in the row names that you want.
When we talk about matrices and data frames, the t function also occurs a lot. t stands for transpose, and the transpose of a matrix is just taking the columns of the matrix and making those the rows. So row one becomes column one, row two becomes column two and row three becomes column three. So it kind of puts the matrix on its side.
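The list indexing and the matrix helper functions described above can be sketched as follows. The list w is reconstructed from the spoken description; the concrete values (the numbers 1 to 3, the age 5.3, the contents of the matrices, and the lower-case and upper-case names) are assumptions chosen so that the outputs match what is said in the lecture.

```r
# A list with four named elements (values are illustrative assumptions)
w <- list(name    = "Fred",
          numbers = c(1, 2, 3),
          age     = 5.3,
          matrix  = matrix(c(1, 0, 0, 1), nrow = 2, ncol = 2))

w[1]             # single brackets: still a list, holding name = "Fred"
w[[1]]           # double brackets: only the value "Fred"
w$name           # same thing, selected by name
w$numbers[2:3]   # second and third element of numbers: 2 3
w[[4]][, 1]      # first column of the matrix: 1 0
w$matrix[1, ]    # first row of the matrix: 1 0

# Matrices: number of rows and columns, names, and the transpose
m <- matrix(1:9, nrow = 3, ncol = 3)
nrow(m)                           # 3
ncol(m)                           # 3
rownames(m) <- c("a", "b", "c")   # set the row names
colnames(m) <- c("A", "B", "C")   # set the column names
rownames(m)                       # get them back: "a" "b" "c"

tm <- t(m)       # transpose: rows become columns and columns become rows
m["a", "B"]      # 4
tm["B", "a"]     # 4, the same value at the swapped position
```

Selecting by name with the dollar sign keeps working if the order of the list elements changes, which is one reason it is easier to read than w[[4]].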
Transposing happens a lot. I don't know why, but data is always in the wrong format. For example, the correlation function calculates the correlation between the columns, and often you want to calculate the correlation between the rows, so you transpose the matrix first and then throw it into the correlation function.
All right, a little bit about variables. We've already seen variables a lot. Variables are like boxes, and you can put things in them. You can use the arrow, so the less-than and minus symbols, or you can just use a single equals sign for defining a variable and putting something in. The nice thing about variables is that, in my mind, they are boxes, and you can use the box without knowing what's in there. We've already seen a lot of variables being defined: I can say variables and assign the number 1.5 to this word, or I can define a variable which is called can and then put a vector in there. So variables can have many names, and the names that you choose for your variables should be meaningful. If I put, for example, body weight in there, then I will name my variable bodyweight, or if I have, for example, the length of a tail, then I would call the variable taillength.
All right, so we're almost through. Coding clean means clean scripts. The way that I want you guys to do the assignments is to use a new directory for each new lecture. So on your computer, go to your C drive or your D drive, create a folder called R lecture or R course, and then within the R course folder you make a new directory called assignment one, and within that you put the script and the data that you are going to use during this assignment. This is to kind of separate these things out, just to work clean. You want to have a structure: if you're doing an experiment, you're also putting your chemicals at different positions, so you put all of the things which are highly toxic in a certain place, and the things which you can
like tip over and spill, you put in another place. The same thing holds for coding: coding is like working in a lab, in a way. So what I always do is: name your files in a logical way, and use a directory for things which belong together. If you're doing assignment one, assignment two or assignment three, you don't put all of these in the same folder; no, you put each of them in a separate directory on your hard drive. Besides that, add a header, a comment section, to each file. In the comment section put your name, the date at which you did something and the purpose of the file, and add a copyright statement saying that this is copyrighted by me, or that it's copyrighted by me but I'm working for the Humboldt University. And use a lot of comments when you create a script.
The header is there so that, in case anyone steals your data or your analysis script in the future, you can prove that you wrote it. So the header is something that allows you to claim ownership of the file or of the code. It's even better if you put it under version control, but just adding a simple header expresses who you are, when you did something, and what the purpose is of what is in this file. And add a copyright statement just to be sure, because people might steal your code, and it does happen.
So this is more or less what my scripts generally look like. My script starts here by moving to a certain folder on my hard drive, but it contains the purpose: this is the analysis of the Hardy-Weinberg equilibrium. It is copyrighted in 2015 by the HU Berlin; because I work for the HU, things that I write are copyrighted by the HU. It is written by me, it was last modified in April 2015 and it was first written in February 2015. This is just so that you can remember what you did. And if in 10 years you go back to the R course, you have all of the PowerPoints and all of the assignments, you have those neatly packed in a certain
directory, and then you have your own answers to the assignments in each of the assignment folders. It seems like a lot of work at first, and it is a lot of additional effort, but it will help you in the long run to structure things properly. It's all about being diligent.
Use a good text editor. We already talked about this in the beginning: for Windows I always advise Notepad++, for Mac OS X TextWrangler, nowadays called BBEdit. For Linux, use what you want. If you can use Linux and you installed it yourself, then you're perfectly capable of deciding which text editor you want to use. A lot of people like Kate or gedit or something like that under Linux, but if you're capable of installing Linux, then that's perfectly fine and you can use whatever you want. Most importantly, if you want to have a good text editor, have one that supports code highlighting and bracket matching, like I showed you in Notepad++: if I select a bracket, then it highlights the bracket which belongs to it. And in this case everything which is a comment is colored green, everything which is a known keyword in R is colored purplish, logical values are in blue and strings are in gray. That directly separates out all of the code and makes it clear what we are looking at.
So remember, clean code is smart code. You can write stuff like this; the computer doesn't care, but you will. When you look at your own script in 10 years, you will hit yourself in the head for writing it like this, so make sure that you have a certain structure. This is just HTML, but yeah: clean code is not so much more correct, it can be just as correct, but it just looks better and it's easier to maintain. "My R code for my study project is a confusing mess." Yeah, that's what I mean. When I started out, my supervisor
told me to wear a rubber band, and every time that he looked at my code and said "this is not correct", he would flick me with the rubber band. You are a professional. Working in a lab means being a professional, and coding, being a bioinformatician or learning how to code, also means that you're a professional. So you should take pride in your work, and things should look good. If you're a carpenter, a chair which just looks like a mess is still a chair and can still be functional, but it's not something that you can be proud of. So make sure that when you write code, in the end, when you look back on it, you are proud of it and it looks good. If it looks good, it's generally also well written.
Questions? I think this is the end. No. So, questions so far? Anything where you say, I didn't really understand that, I kind of want to have an assignment or an example for it, or something like that? General questions, like AMAs, like "what's your favorite color?" while Paka's. All right, so: "how R interprets things and what it knows depends on how it has been written in the first place, right?" Yes, yes. It goes from top to bottom; it just executes instruction, instruction, instruction, instruction. And of course you can loop, you can go back and loop, but the way things are written is just as important as how it looks.
I can show you some examples of stuff that I did in the past, just to give you an idea. So let me get you a piece of code which I am proud of, and then show you another piece of code where I think, oh my god, what the hell did I do here? So let's just move you guys to Notepad++. The nice thing about Notepad++ is that you can also have this thing on the side so that you can browse your directory. One thing that I'm really proud of is my own web server that I wrote. It's not written in R, but that doesn't really matter; it's written in D, and it's also available online. It's written like this, right? So you have nicely a
module statement, then you have all of your imports, and then you have things. You can already see that there could be more comments here, but the code itself is nicely indented: every time that I use an if statement, I give it a little bit of extra space so I can see what is within the if statement, and because of code highlighting you can see that. This looks kind of okay. It's not perfect, it's not as good as I wanted, and one of the things that is missing is that there's no header in this file. But in this case that is not required, because here I have a license, and the license is in a separate file so that the thing can track it.
If I look at something of which I'm less proud, then I could probably take any file which I did for the MegaMUGA analysis, and then something like this. It does have a header, which is still quite okay. Well, this actually looks pretty good as well, right? It's nicely structured, it doesn't look too messy. There's probably some stuff which is more messy. Well, this is just an empty file. And here we have a file which also looks kind of structured. But if I go back to older code that I wrote when I was much, much younger, then let me see if I can find a good example.
"I already have RStudio installed. Is it possible to use R without RStudio, or do I have to install..." You can just install R, so you don't have to reinstall RStudio, that's not necessary. I don't have RStudio installed, but yeah, you can install both side by side. Not only that, but I don't have that on my window capture currently. Let me open up a new command prompt and then add a new input capture. I don't want to capture the display, I want to capture a window. All right, and then something like this. So normally when I do R, it looks kind of like this: I just start R from the thing, and this is also R. If you have RStudio installed, then just opening up the command line and typing R will also give you R, and here I can do the same thing. So I can say X equals
five, I can print X, I can do 78 divided by five, and so it still works the same. In the end, it's just how it looks. All right, let me see. All right, so this you can ignore. Get them from Moodle; I don't put them online anymore, so I'm not putting them on my own website. You can look at my own website. My girlfriend and moderator put a lot of work into making it look good. I had a website whose design was relatively old, but at the start of last week I updated it to a new design, so if you want to see that, then just go to www.DennyOut and spend a now. The assignments and the lectures are not there, so you have to get them from Moodle.
And that was it for today, which is pretty good. I thought that we would run out of time, but we still have like 10 more minutes if you guys want to talk about other stuff. Just a little bit about Twitch, right? For all the people that subscribe to me: when you are subscribed and watching the lectures, you earn something which is called channel points, and these you can use to do interesting stuff like highlight your message or get an emote. So in theory, if there are a lot of people talking to each other, you can use your channel points to highlight a message. No one used it today. There are two special things that you can buy with a lot of channel points, and I put them in just to make the lecture a little bit more fun for you guys, and those are "next slide in Dutch" and "next slide in German". Officially I'm Dutch, so I can speak Dutch, and I can also speak German a little bit. There is no next slide, we're on the last slide, but if you redeem your "next slide in German", then I get a message on my Twitch that says that I have to do the next slide in German. So that would be, for example, this slide. So then I would go: here you can find the tasks. The tasks can be downloaded from my own website, which you can find under www.dannyadens.nl.simmons, but you must not forget the slash at the
end of the URL. But this is all old, because this is an old slide, so you can download them from Moodle. So in case you just want to have me struggle with German, or you just want to hear me speak in my own voice in my own language, then you can redeem this thing. I put it in just to make it more fun for you guys, so that you can hear me struggle with German, and to make the lecture a little bit more interactive.
Besides that, on top of me you also see the Twitch thing, which shows how long I've been streaming for and how many people are viewing. You see the mood box, and you can use words in the mood box. So the mood box, let me see where that is. "Nice German." Yeah, well, try it on a slide which is complex, then you will see that I start struggling much, much more. But I've been living in Germany now for six years, so my German is okay. That's why I put it in. I didn't put in "next slide in Spanish" because my Spanish is just very, very minimal. "7!" Seven out of ten, or seven years? Oh yeah, seven years in Germany. Yeah, I forget how old I am. But you can also put stuff in the mood box, no problem. "I'm a Spanish teacher." I know, I know, that's why I said Spanish, but I'm not going to get further than hola and hasta la vista and vamos a la playa and these kinds of things in Spanish. But you can see me struggling in German if you want, or you can have one slide in Dutch, which just means that we have to do the slide two times, because I'm going to do it in English anyway. But I will do the slide in German or in Dutch if someone wants to get rid of their channel points.
The mood box upstairs, which you can see: you can fill in keywords which will show your current emotion. So you can say things like sleep, and then it will show a little bed, and you can do things like zombie to change your thing. If you type it in in capital letters, then my robot that I wrote, which monitors my Twitch channel, will
pick that up and will allow you to show your emotion, like my moderator and girlfriend had heart eyes. The same for Rita. "Very cool lecture format, very refreshing." Thanks. I like doing it like this. I was very, very unhappy about not being able to do the in-person lectures because of the pandemic, and I do like this much more. I've seen people that just pre-record their lectures, put them on Moodle and have like a 30 minute Zoom talk. I like the interaction; I like being able to have people directly ask me questions and having to respond to that. And hey, it could be better, I could have a professional microphone and these things, but in the end I think it's cool.
So for the mood box, if anyone wants to use it, let me see where the commands are. Twitch overlay, and then it's the engine, so let me show you. Oh, thank you, thank you so much. Yeah, well, it's my day job, teaching people how to program and how to do bioinformatics. So this is the bot that I wrote, and all of the commands that it supports are things like grinning, grin, smiling, joy, open, smiling, halo, wings and these kinds of things. Quote: "I'm really looking forward to the rest of the lecture." Thanks, thanks. I like doing it like this; it's good to have the interactivity, and now you can directly ask for an example if you want one. So we'll see how this works out in the long run, but I do like it, but I think that it would be good to have like a Zoom meeting. "And I think we had a new record of viewers." Yeah, I think so as well. This is going to work really nicely. One of the things which I don't like is the fact that there are commercials, but of course it's Twitch, so they have to make money as well, so they can show you like 10 seconds or 30 seconds. You need to remove the quotes surrounding it. This is just because it's programming, so I have to program it in, but if you would do
halo like this, then it should work. And of course, when you subscribe... so it's just a little bit of an interesting... good, yeah. And there's a lot more, right? If you have anything that you want to add, and you think, oh, this is an emotion that I often have and there's a kind of emoticon for it, just let me know and I can put it in. The Peter Arans one is for Daniel; he's one of the students from the previous course who I used to game with, and so we put that in. So he also has his own one, the Daniel one, which will be like a little boy emoticon.
So that's about it for today. If you want to know anything more or just have some general comments... let me actually directly upload the assignments and the PowerPoints for you guys so that I do not forget. So let me go to Moodle, data analysis, and then turn editing on. Now, I don't want to drag and drop files, I just want to add a new section. No, label. Where's that label? Label, label, label, there's the label. So, lecture one, save and return to course. "Is the Q from Star Trek?" I think I did that because of the question mark; it's just like raising your hand. Bye bye, see you next week. Soil Rob and Linda Russel, I see. All right, so lecture one, introduction, and I will add the lecture and the other one