All right, let's do a pre-stream audio check. Go, YouTube. Hey, Misha, welcome. It's different this time, right? I hope people can hear me. We tested everything yesterday, so everything should be fine. We'll start at two. Misha, give me some feedback if you can hear me. I hope my moderator is here as well, then we can really start, but I really hope people can hear me and nothing went wrong with the audio. Okay, perfect: "Everything's fine, can hear you." Perfect. It's always difficult, right, because I can't put on my own audio; then I get feedback, hearing myself with a 30-second delay every time. But perfect, it's good that everything works. We still have about seven minutes before we really start, so if anyone has any questions, just throw them in chat, and I will just continue sitting here drinking my coffee, waiting for two, so that everyone can join and we can have a nice stream. That will be fun; I'm really excited about it. I talked to one of the students yesterday who had an issue with not being able to chat via YouTube. If that's the case: I sent around an email this morning with a Zoom link for all of the students. So if you have any issues with chatting on YouTube, or you just don't want to make a Google account (you can watch YouTube without a Google account, of course, and I'm not forcing you to make one), then you have to go to the Zoom meeting if you have any questions. A little bit of a warning: don't unmute yourself in Zoom, because that might be picked up by OBS and might then end up on the YouTube stream. I don't think it will; I fiddled around with the settings, but I couldn't test it.
So anyway, five more minutes, people. In theory we could switch to the lecture layout. That's kind of the same thing as the other one, so then you guys can see me instead of just listening to me, but I don't know. It's always difficult just talking to yourself for five minutes in an empty office. All right, we have someone who entered the Zoom. Hello, moderator, thank you for being here as well. So, five people already. I'm expecting 32 students to show up; we had 32 people who signed up for the course. Let me type a message quickly: "We don't have sound here. It's just for questions via chat." "That's a nice amount." Yeah, I'm really happy with 32 people signing up. It's been great: in the eight years that I've been doing the course, we grew kind of steadily from having six to ten attendants in the first year. "I wanted to ask if I can still join the course since I am not registered yet." Sure, just send me an email, Jessica, then I will put you on the list of participants. That's perfectly possible. So that means we have 33 students; we're growing while we haven't even started yet. But yeah, I'm really happy that over the years the course has grown into what it is now. I think we currently have people from more or less all over the Humboldt University. Officially the course belongs to the Process and Quality Management master, which is part of the Albrecht Daniel Thaer Institute, but it kind of grew. I think we have a lot of PhD students this time, and we have other people joining from all over the world, I think, because it's on YouTube, which is good, right?
I think education should be for everyone. If I teach the course and there are three students in it, then only three people get to benefit from listening to me, but if there are 33 people who registered and another 50 people who just follow it on YouTube, then that expands the number of people that get to learn R. And I think it's a very important skill: programming is one of these things that will help you in the rest of your career, whatever you're doing. Let me quickly look at the list of registered students, because we have a bunch of people that are not even from our faculty. There should be a lot of fish people, a lot of PhD students, and I think we have one or two Nebenhörer (guest auditors) even from outside the Humboldt University. There's someone from physics, which is interesting. Of course, most of the examples and assignments will use biological questions, because we're from a biological master. I'm really excited about that, and I'm really happy that so many people are following it. So let's switch to the lecture layout so that you guys can see me as well. Let's see if that goes okay. Perfect. I hope the light is not too bad. Yesterday when we did the test stream, half of my face was kind of blocked out by shadow, so I hope it's better now. "Fish people say ahoy." Yeah, that's actually quite nice. I don't know why, but I do like the fish people. At a certain point they started joining the course, and we also had them in the bioinformatics course, which you can see on YouTube from last year and the year before. I drew a fish yesterday. I'm horrible at drawing fish, so don't ask me to draw too many fish. "Horticultural sciences." Yeah.
Yeah, horticultural is of course the same as the fish people: you're more than welcome here. It doesn't matter, right? In the end programming is for everyone, and if you enjoy it, then we're just going to make it fun and teach you guys how to do it. All right, it's two, so I think we can more or less officially start. I will admit one more person into the Zoom meeting room so that they can also chat in there. "The puffer fish." Yeah, you can see me drawing the puffer fish on YouTube, and that's really bad. That just went off the rails horribly. It was a great... well, it was a fish; people could make out that it was a fish. So for the people that didn't see the test stream, check it out, it's very fun. All right, let's start. Welcome, everyone. I'm so happy that you guys are all here, and I hope that we can have a really good R programming course this year. It's the summer semester course, and it will run from now until July 21st, I think. July 21st is not going to be the last lecture; it's going to be the exam. We will get into that: I made a couple of slides about the course overview, and then we do an introduction. So what will we do today? First, general course announcements, because there are a lot of them. Then I want to show you guys what you will be able to do after you follow the whole course, because you guys are here and I don't want you to go home not knowing what you will learn. And then we will start with my introduction to R. I changed it up a little bit from last year. It's not that different, but I changed some images and added a couple of slides, so I hope it will be different enough for you guys to enjoy it.
And of course I will talk about the assignments, because practicing is really important. I was talking to a student yesterday and I described programming to him, or at least my idea of programming, as being like playing a violin. Everyone can pick up a violin and move their hands, but in the end, if you want to become a concert violinist, you have to put in the practice hours, and the same thing holds for R. When you want to learn R, you just have to put in the hours. You can listen to 100 streams or read multiple books, but in the end the only way that you're going to learn how to program in R is by doing it: sitting behind the computer and failing. Because at first you will get errors, you will get warnings, you will bash your head against the screen, and that's okay. That's what I'm here for. So a tip in advance: if you get stuck with the assignments, which you will, because some of them are really hard, as I noticed last year, just send me an email. The only thing that I require is that you show me that you tried. If you say, "I tried this, it didn't work; I tried that, it didn't work," then I'm more than willing to help you. And even if you're not an official student of the course, if you're just a random YouTube person that happened to enter the stream, then I'm more than willing to help you as well, because I think that's my job as an educator. Actually, one of my colleagues came up with a new term, saying, "No, you're not really a lecturer, you're more of a lecturencer," because I took my influencer circle thing to have a little bit better light. I like that term. It's a good term. All right, next slide.
So like I said, the lecture slides and the assignments will be made available on Moodle. If you're a student, you can just go to Moodle; everything's there. There are also three books on there. So if you like learning from a book, there are three different books that were available for free during the pandemic, which you can download there and read. That's more or less an additional way of learning. I don't know if I'm allowed to distribute them on my website, but if you want one of the books and you're not a student, so you don't have access to Moodle, then just send me an email and I can privately share the PDFs with you. Please attend the lectures on YouTube and ask questions. The best thing would be to make as much noise as you can. Even if you just have other things to say, you can always put them in chat. That's perfectly fine. That's one of the things that YouTube really likes: the more interaction there is in chat, the better it is for the visibility of the stream. So that's something from my side that I want to give you guys: don't be quiet, don't hold back, ask questions. Like I said, the lectures are supported by practical exercises. Spend some time on them, and if you get stuck, send me an email. And don't be afraid, because I sometimes have the feeling that people get stuck for hours on end and are afraid to ask for help. Don't be afraid to ask for help. I know that when you have no programming experience, or only a little, you will get stuck on the assignments, which is perfectly fine. For the students, I actually want to offer you guys an in-person meeting before the lecture. So during the first break, I will walk up.
I will see if the lecture room that I have in mind is available, so that we can meet there next week before I do the stream. That would mean that the stream would be from four to six, and the in-person assignment session would come before that. But I'm not too sure about that, because I got a new PDF from the university with all of the rules that I have to adhere to when I want to do lectures in person. It's not too much, but I have to make sure that all of you guys are vaccinated and wearing masks and all of those things. So it's a big hassle, but I'm more than willing to do that, because I do believe that doing the assignments together is good: you can learn from other students, and students who are a little more advanced can help the students who are a little less advanced, so everyone benefits. The exam date, because people always ask me about the exam: at this point in time it is unknown, but it's probably going to be the 21st of July, because that's my aim. And I'm not sure if it will be online or in person; that depends on the Prüfungsbüro (the examination office). The problem with the Prüfungsbüro is that they only tell me a couple of weeks before the exam. "Three shots here. Let's do it." Yeah, I'm excited. Good. So those are the general course announcements, and then I have some questions for you guys. First question for people in chat or in the Zoom meeting: why did you decide to follow the course, and how did you hear of the course to start off with? Every time I search for my own name in Agnes, I'm actually surprised that I can find my own course. The idea is, of course, that this course is for you guys, so I'm just curious: how did people from physics actually end up finding our course?
It is listed under the Albrecht Daniel Thaer Institute, which is biology. I know there's a little delay, so I will just wait a little bit for answers, and of course you can answer the other questions as well; you all can read. So: any prior statistics courses, or any other experience that you have with programming. It would be good for me to know that, especially when we start doing the assignments. If there are one or two or three people doing the assignments who have a lot of programming experience in Java or in C++, then that would be perfect, because then I don't have to help everyone. I'm only one person, and like I said, we have 32 or 33 people who signed up. So that would be nice. All right, first answer: "I searched in Agnes for 'statistic' and found the course interesting." Interesting. I never find anything with Agnes when I try to search for something. The only way I know how to find my own course is by filling in my name in the lecturer field, and then it comes up. "Fellow students told me about it and I thought it sounds really interesting and like something important to learn." I agree, I agree a lot. Biology is, well, not a dying field, but as a wet-lab biologist you will be forced more and more in the future to do programming, because we currently have so many techniques that generate so much data that without any programming experience you won't be able to handle your own data. And I think that's a shame, because in the end, if you are doing an experiment and you're spending time in a lab collecting samples and extracting DNA and doing sequencing or whatever you're doing, being able to analyze your own data just makes it more fun, right?
I see people in the lab here that are really good at doing lab work, but in the end they get a hard drive back with their data, because the analysis is done by some company, and then they have to come to me and I get to do the fun part. I get to look into their data and see if there's any signal or any other really nifty biological effects, and that's what I like: extracting this from data sets. So does anyone have a very specific objective in what they want to learn? The course is structured in such a way that there are 11 to 12 very basic lectures where I will teach you guys how to load in your data and how to do some basic statistics, but there are always two free lecture slots, which I try to reserve for people that are working on a very specific data set, or people who have very specific requirements for their master. "Searched all over for training for social scientists. I'm a sociologist doing ... but I need to upgrade my quantitative skills." Okay, that's very good. "I found it by a recommendation. No experience, only with data preparation." Perfect. All right, so that's good. But yeah, like I said, there will be two additional lectures. Let me read chat: "As mentioned by the professor before, the R language is used to process the data more scientifically, so I want to learn R, including better statistics and even graphing." Yeah, there will be a lot of focus on visually exploring your data.
I am a very visual person myself. If I see an ANOVA table, I can kind of get an idea of what's going on, but when I physically see the data by using box plots or scatter plots, I understand much better what's going on. It's good that people are actually recommending the course from last year; that means the news is spreading that we have a fun R course. So if anyone has a data set that you are currently working on, or going to work on, and you think it would be good that this guy has a look at it, then we can make a lecture out of it. If you go to last year's course on YouTube, you'll see that there is one lecture, split into three parts, called the fishy data lecture. There we took some data from the Baggersee project, which was collected by Professor Arlinghaus, and I just went through the data and made a presentation about it: how would I analyze your data? So if you have these kinds of things, do mention them. "The information was shared. I find it interesting and am excited to learn about it. I didn't have any experience. My main objective is to gain the skills for data analysis." Well, then you're at the right place, because that's what we're going to do. Perfect. So yeah, if you have any data sets, just drop me an email, or throw it in the comments or the chat, saying, "I'm working on this beautiful data set." Like, one year we had someone from... "I hope you can read this." Yes, I can read this. I'm looking both at the chat on YouTube and at the Zoom meeting; that's why we have the Zoom. "I have a little experience with R and my main aim in this course is to learn time series analysis, which I need for my dissertation." Perfect, I love time series analysis. I actually have a paper under... no, it wasn't under review, because it was directly rejected.
So I have to resubmit it to another journal. It's about time series analysis, and I love time series analysis as well, because it's nice and visual. I like looking at things like growth curves or other things over time, and I think this time dimension in data is really interesting. So sure, I will put time series analysis on my to-do list. We have it in there a little bit already when we talk about generalized linear models and how to use them to analyze repeated measurements. Repeated measurements can also be repeated over time, right? Time series data is just measuring the same individual at different time points. So cool. All right, then I have some ideas on what to do for the two lectures at the end. And if you have a data set, like I said, send me an email. I'm more than willing to look at it and make a nice presentation based on your data. All right, I told you guys that I wanted to tell you what you will be able to do, and this is not going to be after these first three hours, of course; this is for the whole course. What you will be able to do is format your data for R, which is something that I noticed is hard, especially for biologists. All right, message in chat: "Here to learn data analysis too. I'm garbage at statistics and the master thesis is coming." Oh yeah, that's one of the reasons I see a lot for people following the course: they have some kind of data already collected for their master, but when they have to do the thesis, they have to do the statistical analysis and put p-values on everything. Which, in a way, I don't think is that important. I think that if the effect is big enough, right?
And you can see it clearly in the data, or you can see it clearly when you plot it in a box plot, then the p-value itself is relatively meaningless, because in the end, if it is really significant, it's significant, and then it doesn't matter if it's 1 × 10⁻³ or 1 × 10⁻¹⁶. Significant is significant. So the exact value of the test is not that important in many cases, especially when the effect is big. All right, back to the slide. What will you be able to do? Format your data, which is generally a big issue. I see a lot of people that collect their data in Excel and then use different colors in Excel to denote what went wrong or what went right, and of course a computer cannot understand this. A computer needs a very specific structure of data so that it can load it in and reason about it. And I see it in... "Is prior knowledge required?" No, no. All knowledge that you will need to pass the exam and follow the course is going to be presented. That's why the lectures are generally relatively long. I think the last time we did the R course it was about 48 hours in total of me talking, and you guys asking questions, of course. So no, there's no prior knowledge required. When I talk about a t-test, I will show you guys what a t-test is. When we talk about linear models, I will tell you what a linear model is, what the assumptions are, and these kinds of things. So: load your data into R and format your data. That's kind of the first three lectures, so we will spend around six to nine hours of lectures on just the basic understanding of R and how to get your data in. And we will discuss some very advanced topics at that point already, right?
Like reading in binary data, which during my whole career, 14 years of working in bioinformatics, I have only used a couple of times. So we will go very deep into formatting your data and loading in your data, and some of these things you won't use for the next 10 years, even if you're working as a bioinformatician. But I just want you guys to know how to do it and know that it exists, so that in case it comes up in five years when you're working in a lab, you know: okay, this exists, I'm taking the lecture again, and I'm just looking up how this guy did it five years ago. Next: use statistics to answer your questions. This is of course very important. I will teach you how to formulate questions and then turn these questions into programming code, which the computer can then answer, because asking the right question is very important. And of course interpreting these results. Last year, I think I went a little bit short on the custom plots. I think people like custom plots a lot. I noticed this also during the bioinformatics lecture when we did the volcano plots: people really love having code to make custom plots, and I think that last year I skimped a little bit on that. So I think we will extend the plotting lecture to make sure that you guys can use base R to make really beautiful-looking plots. It will take a lot of time: if you make a very good plot which is suitable for publication, it generally takes you on the order of three to five hours, just tweaking everything, making sure that the fonts are correct, that the axes have the things that they need, that everything is the right color, and that there's enough difference between colors. So this year I'm going to try to add much more to the plot lecture, which is lecture five, I think. And then in the end I want you guys to be able to create your own analysis, right?
I have an Excel file. How do I load everything in? How do I write functions to analyze my data? How do I do statistics? How do I create a plot in the end? And then lecture number nine or something is going to be making an R package, because in the end, if you wrote some really genius code, then of course you want to share your code with the world, and you want other people to be able to make the same beautiful plots that you made. So: publish your work as an R package for others to use. That is, I think, more or less the end goal. If we get there, and everyone is able to write a couple of basic functions, do a little bit of statistics, load and save their data, and make a package out of these things, then I would be very happy, because then we have reached my goal. From that point on it is just practice, just looking up the things that you need, because there are thousands of packages available for R, for machine learning and all of these things. But that's the end goal: that you can write your own functions, make your own plots, and then publish your work so that other people can use it. All right, so: analyze your own data. What you will be able to do is make plots like this, where we look at guinea pigs' tooth growth when giving them different types of feed supplements. There are guinea pigs that are drinking ascorbic acid and some that are drinking orange juice, and then some person measured their tooth length at each dose. This is one of the basic data sets that's in R, and I would like you guys to be able to visualize it, use nice colors, and have legends in there so that everyone understands what's going on. And still this plot is not perfect; you can't publish this plot in a journal. For example, if we look at the y-axis, you see there's tooth length, but is that tooth length in millimeters or meters or centimeters? I don't know, right?
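As a minimal sketch of the kind of plot described above (not the code behind the slide, which is not shown in the stream), the guinea pig data is available in base R as `ToothGrowth`. The colors, labels, and the variable name `bp` are illustrative assumptions; the y-axis label deliberately leaves out a unit.

```r
# ToothGrowth ships with base R: len (tooth length), supp (OJ or VC), dose
data(ToothGrowth)

# Assumed color choice for illustration: orange juice vs ascorbic acid
supp_cols <- c("orange", "lightblue")

# One box per supplement/dose combination; boxplot() also returns its statistics
bp <- boxplot(len ~ supp + dose, data = ToothGrowth,
              col  = supp_cols,
              xlab = "Supplement and dose (mg/day)",
              ylab = "Tooth length",
              main = "Guinea pig tooth growth")

# A legend so that readers know what the colors mean
legend("topleft",
       legend = c("Orange juice (OJ)", "Ascorbic acid (VC)"),
       fill   = supp_cols)
```

The boxes alternate OJ and VC within each dose, so the two colors are recycled in the right order.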
I don't know right So there's still some little things that need to be improved in this plot And one of these things here is for example, this double axis on the y are on the on the x-axis 0.5 0.5 So here there's still something that can be improved You will be able to make other kinds of plots, right? So these are things like heat maps or in R. They're called image plot And these are can be combined with things like these lines which you see in Geography maps, right? So so these kinds of things But in the end I want you to create your own, right? So this is one of the old images that was in the previous presentation and I didn't like it It was just a very basic plot But have because the way that I'm using a black background I want everything to be white, right? So I want to have a white axis white labels white legends And the same thing holds for for these kinds of plots, right? So in the end, I want you guys to be able to take your data Visualize it and learn something from that data. And of course the most important thing is making your own Because like just taking standard stuff off the shelf Doesn't make you a programmer. It just means that you can use other people's code But in the end what I want you guys to do is to be able to write your own code make your own plots make them look fancy All right, so that's it for more or less the introduction the course overview So if there's any question just throw it in chat or throw it in throw it in Moodle If you have very specific questions like Then I'm more than willing to answer those But if they're not then we are going to start with the introduction to our So this is a very basic presentation that I always give I've been giving this presentation. 
I think since I was like 20 years old. I used to do this at King's College in London, and some of these slides date back to 2008 or something, when I first did the King's College lectures. But it just grew into a full presentation, and I do think that it's a nice thing to show you guys. So what will I be talking about during the coming 30 to 35 minutes? Mostly about the history, and then we will take a break. Then we will talk about the look and feel of R, although I might add that to the history, depending on how quickly I go through it. We will be talking about using R as a calculator, so that at least we discuss some of the basic stuff. So... let me actually delete that. I will say remove. Good. I have to do some moderating once in a while, and actually, let me just hide this user altogether. I don't have the right tools yet. I'll put the user in timeout as well. I'm still learning YouTube, so bear with me when stuff like that happens. All right, so: the look and feel of R, and how to use R as a calculator, so that we start with the basic things, like typing in five plus five and it gives you 10. We won't go in depth, but we will go broad: how to use complex numbers, how to use other things. We will be talking a lot about the R type system, and this will be a recurring theme throughout the lectures, because it's one of these things which is really hard for people who are beginning with programming. Different types have different properties. For example, if you have a character, you can't do computation with it, right? You can't say, "I have the word 'School', and now do plus one on it," right?
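A short sketch of what "R as a calculator" and the remark about types could look like at the R prompt. The variable names (`total`, `z`, `msg`) are made up for illustration, and `tryCatch` is only used here so that the script keeps running after the error.

```r
# R as a calculator
total <- 5 + 5
print(total)               # 10

# Complex numbers are built in
z <- (1 + 2i) * (3 - 1i)
print(z)                   # 5+5i

# Every value has a type
print(class(total))        # "numeric"
print(class(z))            # "complex"
print(class("School"))     # "character"

# Arithmetic on a character value is an error; catch it to show the message
msg <- tryCatch("School" + 1,
                error = function(e) conditionMessage(e))
print(msg)
```

Typing `"School" + 1` directly at the prompt stops with an error instead of returning a value, which is the point of the example.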
So there are different types in R, and because R is also a language for statistical analysis, there are data types which don't occur in other programming languages. For example, a categorical variable is not something that is easily translated to a language like Java or C++. So sometimes people who have experience in other programming languages are totally confused by the different types that R has. We will be talking a lot about that, and also about how to index things like a vector and a matrix, which is really hard and will take some time to get used to, to get the mindset of how these things work in R. We will talk a little bit about variables, but that is already going to be part of the next lecture; going really in depth on variables is something we will do next week. And I wanted to show you guys how I want you to answer the assignments. Answering the assignments means writing little scripts, so that's something that we will go through. I have some specific requirements for scripts, because I think it helps when people work diligently. I always say programming is kind of like working in a lab: you have to be very careful, you have to look at every bracket, every dot, every comma. It's the same in a lab, right? If you don't read the tube correctly and just throw the chemicals on your sample, then you run a big risk of destroying your sample. Fortunately, in computer science, or in R, it's really hard to destroy your computer. So feel free to type anything into the R window. It's really hard to crash the whole computer or do other crazy stuff, which in a lab is relatively easy: mix the wrong chemicals and you have an explosion. Fortunately, programming is not like that. Why do I want to start off with history?
I think history is very important. I think many people know the beginning of the quote, but they never saw the whole quote. So why do we do history in an R course? Well, if I weren't a bioinformatician, I would probably be a historian. I really love history, have loved it since high school, and I think it is very important. So I will just read the quote to you. It's the full quote, by John of Salisbury. And I need to have some kind of knowledge questions as well that I can ask during the exam. So if I give you the quote and ask who said it, then of course you have to tell me that's John of Salisbury. There will be multiple quotes throughout the lectures, just so that not all questions have to be about programming and writing programs, but that we have some basic introductory questions as well. So let me read the quote: "We are like dwarves on the shoulders of giants, so that we can see more than they, and things at a greater distance. We are carried high and raised up by their giant size." And this is really true, especially in computer science. The computer that you're currently using to watch this is the end result of almost two and a half thousand years of development, which involved hundreds, if not thousands, if not hundreds of thousands of people, and all little incremental steps. Every new generation of computer is a little bit better than the one before, and I think that is what this quote signifies for me. The same thing goes for programming and programming languages. C is kind of the most important programming language in the world, and C itself became 50 yesterday, but there are still new versions of C coming out. It's still an evolving language, and language itself and the expressiveness of language is really important. So by knowing what was done in the past and what we are looking at in the near future, I think you can get a more...
I think you can get a broader overview of what programming is, and it just helps you to see more than the people before you. So we start off. Originally, when I gave this course at King's College, I would start off in a kind of Sheldon Cooper way: 2400 years before Christ, in Babylon, on a quiet evening with stars falling down, there was someone whose name we don't know who developed the abacus. The abacus is the first kind of tool that people used to do computation, right? Very basic things like adding things up, multiplying things, subtracting things. It's kind of an external memory, so that you don't have to keep everything in your mind. I don't know how many people here actually used an abacus in elementary school, but my elementary school still taught us how to use the abacus, the same as the geometric triangle and the other things that you use when you navigate a ship. But the abacus itself is kind of the starting point of modern-day computers. It's an external memory so that you don't have to keep everything in your mind, and it can help you do computation on larger numbers, so that you don't have to write it down or do other things. Then the next big invention came in... You did, 40 years ago? Yeah, no, I think the abacus is something that people should learn. It's an interesting tool, and it shows you how people in antiquity actually did computation and calculated prices, right? Because people have been bartering and trading and trying to navigate the seas for literally tens of thousands of years, and this is really one of these fundamental things. So in 1100 BC there's a big new invention, which came from China, and that is using differential gears. Differential gears are the things that enable you to do more than just multiplication, division, plus and minus, right?
It allows you to do integrals, so you can integrate over a range of things. If you think about functions, you have a line which goes across an x-y plane, and at each point of the line you can calculate, for example, the directional coefficient (the slope), right? This directional coefficient can be calculated using gears, and these differential gears stayed with us almost until the end of the Second World War. A lot of tools that have been developed use differential gears. But the Chinese used them to build one of these things here, which is called a south-pointing chariot. It is a little car, for children, that you can play with, but the little man on top always points in the same direction. So it always points south, or it always points north, depending on where you point it originally. This is done by a mechanism inside the little car which uses gears. When the wheels turn, it counts the number of revolutions on the one gear and on the other gear, and based on that the little man on top is moved in a certain direction. The next big improvement is then the Chinese abacus. That is a more advanced abacus than the one you see here: an abacus which allows you to do plus, minus, multiplication and division, but also to do integration, so to calculate x to the power of two or the square root of something. So a much more advanced abacus. I think that's the one that you usually see in classrooms, the Chinese version, and I think it hasn't really changed from 200 BC until like 1980 or 1990, and then no one used it anymore because we use calculators. So in 120 BC, which is still in the middle of the Roman age, right, there is one of these more or less magical or mystical objects from antiquity, which is called the Antikythera mechanism. That was developed in Greece, in Corinth, and you see a graphical design here. So this uses these differential gears.
This uses a gear system to, most likely (we don't know exactly what it was used for), keep track of all of the celestial bodies in our solar system, right? Because most of the computation that we do today has its basis in navigation. If you are on a ship, you know the time of day, you know where the stars are: can you figure out where you are on the planet? That was the main difficulty for almost two and a half thousand years. Nowadays we use GPS, which is really easy, but for almost all of human history this was the biggest challenge: to be on a ship, on the water, in the dark, and not crash into the shore. That's kind of the trick, right? Because then you die, so you don't want to do that. So a lot of tools have been developed, and the Antikythera mechanism is one of these things that is almost like a modern computer, right? It's really good at tracking the sun and the moon and their movements, and at figuring out where you are on the planet. An updated version of this is called the astrolabe. Astrolabes were used until like the 1940s or 1950s. People would use a sextant to measure the inclination of the sun relative to the horizon; you would measure the North Star, or if you're in the southern hemisphere you would use the Southern Cross. Then based on this, together with the time of day, you can calculate where you are on the planet. So it will give you a kind of x-y coordinate of where you are. And then actually nothing really happened for around 700 years. In around 800 we have the development of cryptography, which is one of these major steps again in computation. Before that, people used very simple ciphers to encrypt their messages, right? You can imagine that if you are a general leading an army and you want to send messages to your troops, then of course you're not going to write down "We attack at 7 a.m.", right?
Because then any enemy finding this message will know your plans. So Al-Kindi is more or less credited as the main inventor of cryptography, and he also developed frequency analysis. So he developed a formal system for cracking encryption, and because he developed a system to crack encryption, he also developed a better system of encryption. This is where kind of the NSA, FBI, FSB arms race starts, right? Where better and better encryption schemes get developed to hide the messages from the enemy, and to be able to say "attack at dawn" without anyone else knowing except for the people that you want to know. This is of course in the golden age of the Arabic world. And then, 400 years later, we have the development of the first castle clock. This is a mechanical clock which keeps track of the time, but it also keeps track of the position of the sun and the moon. It is one of these things from antiquity which I think still exists; I think if you go to Mosul in Iraq you can still see this ancient castle clock. There's also one in one of these European cities. I would say Switzerland; my moderator would know, we actually saw it. It's one of these things from the dark Middle Ages, this big fancy clock which you can see in Europe, but the oldest one is actually located in Mosul in Iraq. And then of course in 1673, which is the Enlightenment period, we had a big step forward again, and this big step forward is called the Leibniz wheel. This was developed in Germany by Gottfried Wilhelm Leibniz, and he actually made the first, more or less, pocket calculator. It's not a pocket calculator; the thing was like this size, so you couldn't put it in your pocket, but it is kind of the predecessor of all of the calculators that we have. You missed the topic,
I'm afraid. The castle clock, or the fancy clock within Europe: we went there on holiday with your parents. Just as a thing, which city is it in? I would say Vienna or something, but it's not Vienna, I think it's something else. It doesn't really matter, but there's a very old one in Europe as well. I think that's like 1400, 1500, so that's 200 years later. Prague, that's it. It's the fancy clock in Prague; people go there and it's massively expensive. Anyway, I will read you guys the quote from Leibniz. I think if you are a programmer, then you identify with this quote directly: "It is beneath the dignity of excellent men to waste their time in calculation when any peasant could do the work just as accurately with the aid of a machine." Right? So this is kind of the basis of it, because before that, smart people would spend hours and hours and hours doing long divisions on pieces of paper. Computations that we now just punch into a calculator were done by hand before that. People would do long division, and this would literally take hours, if not years, for some big computations, right? There's this famous example of people calculating pi by using this circumference method, and this would take literally years of your life to compute one additional digit of pi, right? Now of course we have computers, so we can calculate pi to like the thousandth digit accurately. But in that time you could spend more or less your whole life as a PhD student, so like a four-year PhD, just doing computations to find one additional digit of pi. And it's just an excellent quote. I love the fact that it's so ancient, and it shows how people thought, right? There are excellent men and there are peasants. I kind of like that. I also love his wig. He has a good wig, that's a perfect thing. All right, so then we start with the real computer history, right?
So with Leibniz we had the Leibniz wheel, which is kind of the predecessor of the pocket calculator. Then we have Charles Babbage, and in the history of computers he is more or less the inventor of the modern-day computer. His computer is called the analytical engine, and it was not built during his lifetime. It is a kind of general-purpose computer which can be used to compute anything that you want, but at that time the steel, and the ways that they could make steel, were not good enough to make the little cog wheels that were needed for his computer. So this is not a digital computer. It is an analog computer using cog wheels. You would set the input using the cog wheels, then you would turn like a big thing, and it would do the computation and the output would come out. But they actually built one in the UK a couple of years ago, 10 or 20 years ago I think, and it turns out that his machine really worked. So the analytical engine is the first general-purpose computer, one that only existed in the mind of Charles Babbage. Not just in his mind, but also in the publications that he wrote. At the same time we have Ada Lovelace. Ada Lovelace and Charles Babbage had a strange relationship, but what she is known for is that she is more or less credited as the first real computer programmer. She wrote algorithms, computer programs, that would run on Charles Babbage's analytical engine. So she would write programs for a machine that didn't even exist. She didn't get to be too old, but in memory of her there is a language called Ada, which is one of these famous programming languages. So she is the first computer programmer. She is the inventor of regression, which we will come back to in like lecture six or something. But in her letters with Charles Babbage she wrote computer programs for his non-existing analytical engine, which they would hope
would be built one day. Which actually we did: using modern steel and modern CNC machines you can make cog wheels and gears very tiny, and you can build this analytical engine, and it turns out that it really, really works. Then, from 1912 to 1954, we have the life of Alan Turing. Alan Turing is one of these people who is more or less the godfather of computation. So not the inventor of the computer, but he is the one that cracked the German Enigma machine in the Second World War. In computer science, though, he's not so much known for his ability to crack the Enigma machine; he is known for developing the Turing machine. The Turing machine is a hypothetical machine that allows you to reason about what you can compute and what you cannot compute. I added an extra slide, because I think this Turing machine is really, really interesting. The Turing machine, again like the analytical engine, is an abstract machine. It is a machine that does not exist in reality, because it is a machine which has symbols, so zeros and ones, on a strip of tape, and for the whole computational complexity theory to work, this tape needs to be infinitely long. And of course you cannot produce an infinitely long tape. But besides this tape, where you have a head which can read and write numbers on the tape, there is also a table of rules for manipulation. Here you see a very basic Turing machine built in Lego, and here you see one which is built using more or less a digital computer memory. But what a Turing machine does: it has a table of rules. So the rule says, if you read a one, move to the left, right? And it has internal states, so based on what it did previously it can do other things. So if you moved left and you read a one, then write a two. So it has this table of instructions, and with this table of instructions and this infinitely long band of numbers which you can write to and read from, you can reason
about what you can compute and what you cannot compute. And the nice thing is that any computer algorithm that exists today can be implemented by a Turing machine. So we often say a certain language is Turing-complete. That means that you can compute anything that you will ever be able to compute using this language. Something like HTML is not Turing-complete, but something like R is Turing-complete. And this is where the whole computational complexity theory comes from. So he is more or less the godfather of computational complexity; he allows us to reason about what we can and what we cannot compute. All right, then we start with the real computers. The first real computer was built and finished in May 1941. The first real computer in the world, and you heard it here, is the Z3. If you talk to an American, they will say that's not true, the first real computer was built by Americans, but that's not true. The first real computer was built right here in Berlin by someone called Konrad Zuse. So Konrad Zuse built the Z3. He is an autodidact, so he taught himself everything, and in the end he became a professor. But he built this computer for the Germans, for the Nazis at that point in time, and this computer was used to study wing flutter. Wing flutter is the phenomenon that when you fly an aircraft very close to the speed of sound, the wings start flapping because of the air resistance, right? The faster you fly, the more air kind of builds up on the wing, and when you reach the speed of sound, you also have the sound wave.
So you get this kind of instability, and the wings start flapping, and if you go any faster, then at a certain point your wings will just fall off your airplane. To study this phenomenon they built the Z3, and the Z3 is the first general-purpose computer. It's not fully digital, there are some analog components still in there, but it is generally considered by most experts to be the first real computer in the world. The programming language which was used to program this computer, which was also invented by Konrad Zuse, is called Plankalkül. And in the year 2000 the FU Berlin actually wrote a Plankalkül compiler, so it is nowadays a real computer language. For Zuse it was more or less a toy language, which he used to reason about what his computer could do and how to program the computer. But the original Z3 was destroyed in 1943 by an Allied bombing run on Berlin. You can actually find a Z1 replica in the German Museum of Technology, and there are other replicas across Germany, because this is really one of these main achievements which people generally credit to the Americans, but which actually should be credited to the Germans. So of course the Americans struck back, and the ENIAC is the first real computer that the Allies built in the Second World War. It is called the Electronic Numerical Integrator and Computer, so that's also kind of where our term computer comes from, and it is the first real fully electronic general-purpose computer. It is Turing-complete, which means that any algorithm that you could think of can be programmed on the machine. It was fully digital, so there are no analog components, and it was capable of being reprogrammed. Reprogramming at that point means that someone had to physically go to the machine, pull out a cable and plug it in somewhere else, right? Think about people being on the phone and having these phone switchboards.
This is kind of how this computer was programmed. You would input the program by connecting certain parts of the switchboard, and this would have a certain meaning. But the computer is a full computer. It supported looping, so you could do a for loop or a while loop. It supported branching, which means that you can do if statements, and it supported subroutines, which is what we now know as functions. The purpose of this machine was to calculate ballistic tables. Of course, when the Second World War ended, the big issue was that we had nuclear weapons which we wanted to put on top of rockets, but when you launch a rocket you kind of want to know where the rocket will land. For this you need ballistic tables, in which you can look up: okay, I have my speed, I have my inclination, and these kinds of things. That was what the ENIAC was used for, to calculate: if I launch my rocket, where will it drop down and what city will it hit? So ballistic tables is what they computed with it. Another very famous person is John von Neumann, and in 1945 John von Neumann came up with the von Neumann architecture. Nowadays all computers that we have in the world are von Neumann machines, which means that they are structured in a way like this, and every computer still is, right? You have an input device, which for example can be a keyboard or a mouse or a drawing pad. Then you have the computer itself, and the computer itself is connected to an output device, be it a screen, be it a printer, be it something else, a robot arm for example. But within the computer you have two main components. One of them is the memory unit, which stores things in memory, and you have the central processing unit, which is nowadays called the CPU. And CPUs are divided into two parts.
So any CPU in the world will have two different parts. One of them is the control unit, which keeps track of what we are doing now. And then you have the arithmetic and logic unit, which does arithmetic, so plus, minus, divide, but also logic. Logic means Boolean logic, which means that if I say true and true, then that's true; true and false, then it's false; true or false is of course true. So you have these logic tables, and a computer can only work because of this, right? Doing plus, minus and multiplication is not good enough. You also need to be able to evaluate whether a certain statement evaluates to true or whether it evaluates to false. John von Neumann kind of popularized this architecture, saying that you need these four parts to make a computer, and that this central processing unit (he's also the one that coined the term CPU) needs to have a control unit and an arithmetic and logic unit. The latter is nowadays called an ALU. So it's a CPU, and every CPU has several ALUs which can do computation, and it has a control unit to direct the flow of the memory into the different ALUs. So nowadays, I always like talking about this.
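As a side note to the logic tables mentioned for the arithmetic and logic unit, here is a minimal sketch in R of the three statements from the lecture (true and true, true and false, true or false), followed by the full tables. The variable names `inputs`, `and_table`, `or_table` and `x` are made up for this illustration and do not come from the lecture slides.

```r
# The three statements from the lecture, using R's logical operators.
TRUE & TRUE    # TRUE
TRUE & FALSE   # FALSE
TRUE | FALSE   # TRUE

# Full truth tables: every combination of the two inputs a and b.
inputs <- c(TRUE, FALSE)
and_table <- outer(inputs, inputs, "&")
or_table  <- outer(inputs, inputs, "|")

# Label rows (a) and columns (b) so the printed tables are readable.
labels <- list(a = c("TRUE", "FALSE"), b = c("TRUE", "FALSE"))
dimnames(and_table) <- labels
dimnames(or_table)  <- labels

and_table
or_table

# Arithmetic and logic together: compute first, then evaluate the statement.
x <- 7
(x + 3) > 9    # TRUE, because 10 is larger than 9
```

Typing these lines into the R console one by one is probably the easiest way to see that a comparison such as `(x + 3) > 9` is itself just a value that is either `TRUE` or `FALSE`.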
I don't know much about it, because I'm more or less a bioinformatician, I'm not a quantum physicist or anything. But I think it is really interesting to see that since 1976 we started developing a new type of computer, completely different from the von Neumann machines. It started with Roman Stanisław Ingarden, who developed quantum information theory, so the theory of how to use quantum superposition to do computation. Then David Deutsch in 1985 postulated that you can actually build a universal quantum computer. So you can take these qubits, and using qubits you can build something which resembles a computer like we know, but which doesn't use zeros and ones; it uses superposition, which means that a single byte can be both zero and one at the same time, or more or less 255 different states in between. So a very interesting theory, but we couldn't really do anything with it until 1991, when people figured out that you can actually use this quantum entanglement. You can have two particles, you can entangle them together, and then you can use this to secure communication via undersea cables, right? If you are America talking with Europe, you don't want a Russian submarine to cut your cable and listen in on all of the communication that you do. So nowadays, if you think about fiber optic cables, these are secured using quantum entanglement to provide secure communication. In 2011 the D-Wave One became available to the market, and the D-Wave One is the first real quantum computer. It is a computer consisting of 128 qubits, and you can run quantum algorithms on these machines. In 2020 the D-Wave Advantage was launched, and the D-Wave Advantage is the newest generation of quantum computers. It's not a full quantum computer.
It's not a universal quantum computer; it only does a certain part of quantum mechanics. But the D-Wave Advantage is one of these computers which is really interesting, because it has like 5000 qubits. That seems really small, right? Nowadays your computer is a 64-bit computer, but you have like terabytes of them, so it's a big CPU, big memory. So this is a relatively small computer, but it outperforms any normal classical computer that there is on certain tasks. Not all tasks, just certain tasks. So how does this thing look? You see a picture of the D-Wave One here. It's just a big black box, and the CPU, so the central core where everything happens, looks like this. It is a gold-plated little chip which is cooled to around 200 degrees below zero, and this chip can do quantum computation. You pay around ten million dollars for it. That was in 2011, so in 2011 a computer like this would set you back ten million dollars. But this is very similar to the price of the original ENIAC. The original ENIAC was around $500,000 when it was built, which, if you would bring this amount of money from the past into the present, corresponds to around 6.1 million US dollars. So the D-Wave One is kind of the first computer of a new generation of computers. The D-Wave One was used in biological research to study lattice protein folding, so how proteins fold. You can use a classical computer to compute this, but then you have to go through a lot of iterations. You have to do a big for loop, and every time you have to adjust all of the atoms. This computer doesn't need to do that. It gets the input protein sequence, and then, because of the way that quantum computers work, it more or less instantaneously knows the minimal energy state of the protein. So lattice protein folding is what the D-Wave One was used for. Of course, nowadays we have the D-Wave Advantage.
The D-Wave Advantage is used by many big corporations in the world. It's again the same box, just with different colors and "Advantage" written on it. It's used by Menten AI to do protein design, so to design new proteins to help people in disease research, or to do protein docking. And it is also used by Volkswagen to schedule the paint shop. Volkswagen creates a lot of cars, right, and these cars need to be painted. But people order different colors of cars, and you don't want to paint one car blue, then switch all of the equipment and paint the next one red, and then switch everything again and make a blue one again, right? So you need to schedule this in a way that you do as many blue cars in a row as you can, and then you do as many red cars as you can before switching to a different color. This paint shop scheduling is actually done by a quantum computer nowadays. The same thing goes for Save-On-Foods. They use the D-Wave Advantage to optimize their grocery delivery schedule. This is one of these problems like the traveling salesman problem, or the grocery delivery problem: if you have some number of stops that you need to make, and you know the physical positions and the roads between them, then a normal computer cannot give you the optimal route. It can give you a good route, but it cannot give you the perfect route. A D-Wave Advantage, or a quantum computer, can be used to do these grocery optimizations much faster than a normal computer. It still won't give you the optimal route, but it will give you a very good route in a very limited amount of time. Amazon also uses D-Wave Advantages, and this actually was a really great opportunity for people who are interested in quantum computing, because they used to offer free access for COVID-19 researchers, because of the protein folding that you can do with these machines, right?
So when COVID first hit, people didn't know much about it. They knew the genomic sequence, but they didn't know anything about the proteins, how they would fold, that the spike protein has different conformations and stuff. And that was the thing that you could actually apply for: you could write an email to Amazon, and then you could get free access to one of these quantum computers to do protein optimization and these kinds of things on COVID-19. I don't know if it's still free, but I think so. Will I need to remember everything about these computers in order to successfully use R? No. But you will for the exam, because the exam of course is going to have some questions which are more general, right? So I love people that want a Nobel Prize. I tend to ask questions like: who is considered the first computer programmer? Just a couple of warm-up questions before we get to the hard stuff. So you don't have to; it's just to let you know where we are and where we are going, right? All of the programming that you are learning today in R probably won't exist anymore in like 50 years, because we will have switched away from classical computers to quantum computers, and that's a whole different way of programming. But you don't have to know all of it. Just get the general gist: we moved away from mechanical calculators to digital computers, and now we're in the process of moving away from digital computers to quantum computers. All right, the same history again, but now for programming languages. Because of course there's a history of computation, but there's also a history of talking to these machines, right?
And especially with digital computers there is a history of how you need to talk to them. I don't know if anyone is old like me and Misha, but when I was younger and I would go to the hospital, I would get one of these little cards with holes in it after entering the hospital, and this card would hold part of my information. They would plug the card into the machine that they had, and that would do all kinds of rattling. "Thanks, man." Yeah, I'm sorry, you're just as old as me, right? Or just as young as me, that's how to say it. I don't want to offend you, but we're not squeaky young master students anymore, unfortunately. That's just the way it is. "I'm 46." Okay, I'm just as old as you. Not 46 yet, but almost there; I feel 46 on some days. But the low-level programming language is punch cards, right? Physical cards with holes in them which you plug into a machine. It's like these music boxes that you used to have: the information is like the music in a music box. It's the spool, and it's a zero or a one. Having a hole is a one, not having a hole is a zero. To get away from that, people developed something called assembly. Assembly is the first language which is an operand language, and this is a very, very low-level language. It's something that the CPU can directly understand, so you're directly talking to the CPU. But like I said, C was 50 years old yesterday. It was developed by Dennis Ritchie and Ken Thompson, between 1969 and 1973.
So there's no real official "this is C" date. But C is kind of the computer language that is ubiquitous. If you have a computer, then the operating system that you are using is using C at some point in booting up, and it doesn't matter what operating system you use, be it Android, be it Windows, be it Linux. And then of course there's a whole bunch of high-level programming languages. What's the difference between a low-level and a high-level language? With a low-level language you are talking directly to the CPU, but high-level languages have an abstraction layer on top of that. So you are not talking directly to the CPU; you are describing what you want the computer to do. This is then fed to a compiler or an interpreter, and this compiler or interpreter is a program which translates your high-level language into computer instructions which the CPU, so this ALU, will execute. So you don't have to code "take this memory location from random access memory and drag it to the CPU". But you have to be aware that this still plays a role. If you're dealing with gigabytes and gigabytes of data, then sometimes using these high-level languages can cause your program to be really slow, because it knows about the CPU, it knows about the RAM, but it cannot optimize, for example, the throughput from the memory to the CPU. So, some high-level languages: Plankalkül is the oldest one, developed by Konrad Zuse. We have Autocode, which was developed by Glennie in the 1950s. We have ALGOL, which is still being used a lot.
And that was developed in the late 50s. Lisp is one of these languages which is still used a lot today, just like C is. It was developed by John McCarthy, and it is the core language for artificial intelligence, so many of the artificial intelligence algorithms that we use nowadays are written in some form of Lisp. I always like to mention COBOL, because people that want to earn a lot of money with programming are in the wrong course. R won't make you rich; learn COBOL. COBOL was developed by Grace Hopper in 1959, and it more or less runs the aviation industry and the financial industry. If you can program COBOL, you can earn a lot of money working for a bank. You will have a very shitty job, because you're just maintaining code that was written by other people like 10 or 15 years ago, but that's kind of where the money is, making sure that the world financial system... The fact that they call the language Lisp? Yeah, well, I think it's an abbreviation, right? Most of these things are. But Lisp is an interesting term indeed. But COBOL is where the real money is at; Lisp is for if you want to do artificial intelligence and develop the next Skynet or something like that. And there are literally hundreds and hundreds and hundreds of languages around. Here we see a table going back from 1954 all the way to 2011, and we see how different languages are related to each other and how languages kind of evolve from one to another, right? New concepts are being thought up, new abstraction layers are being thought up, and languages branch out. Some languages die out at a certain point in time, but the main languages from the 1950s are still with us. There are still people programming in C every day, there are still people programming in COBOL, and still people programming in Lisp, because they have their own very dedicated fields. So why would you want to learn R? Do you know Robot Karel? No, unfortunately not.
Some of you might be interested to learn about it. So why would you want to learn R, right? The R language is free and open source, and as a Dutch guy, I love that. I love everything that's free; that's kind of the thing that we are known for. And it's open source, so that means that if I don't like anything in the language, I can change it. I can write code to change how the programming language works, I can submit it, and they can accept that or they can reject it. It is a language for statistical computing, so the language itself is built with an understanding of statistics. It knows that there's a difference between a numerical variable and a categorical variable. It knows that there's a difference between different distributions: it knows what a normal distribution is, what a t distribution is, what a beta distribution is. It provides built-in graphics, which many programming languages do not. You can say plot something, right, plot 1 to 10, and it will show you a plot. So that's built into the language. If you want to do something like that in C, that's going to be a lot of code to write. R itself is written in R, Fortran and C, which means that you can use C to write really, really efficient code, you can use old Fortran libraries, which were developed in like the 1980s, and you can write things in R, of course. But one of the other big advantages is that it's operating system agnostic, which means that if you have something written in R, it will give you the exact same answer no matter if you execute it on Windows, on Linux, on Mac or on your smartphone. And this is not true for many languages that are out there. Many languages are tied in a way to a certain operating system, or you have to write code which is specific for Linux, code which does the same thing for Windows, and code which does the same thing for Mac OS X. But for R that's not true: if you
have R code, it will run exactly the same on each operating system that you use. Even on your smartphone you can nowadays run R. It has built-in knowledge of linear and non-linear models, which is really useful when you are modeling biological relationships, or when you're modeling anything in general, right? So if you write statistical models, the language knows what a statistical model is, it knows what a formula is, it knows how to define these things. It has built-in help and testing, which is nice if you want to develop your own packages in the end, because then it's nice that it has a built-in testing framework to test that your code is still doing the same thing even after you change a couple of lines of code. And one of the biggest advantages of R is that a lot of people in the world are using it: people who work in the financial industry, people who work in AI development, people who work in biology. So there are literally thousands and thousands of add-on packages available. A quick scan shows that in the main repository, the CRAN repository for R, there are 4,000-plus packages, and then we're not even talking about all of the other packages which are available. All right, then I think it's time to take a little break, and then we will continue. Or I can show you the look and feel right now; at 3:12 it's not that long. It's a single live stream on Twitch. We used to have real breaks, but... So, a little bit about the look and feel. If I start R, I see this, more or less. Well, not with the plot, because I have to make that, but this is how it looks when you download the RGui window. So this is, I think, on Windows 7. This is how R looks on Windows XP or an older operating system; again very similar, right? So you have the input console, where you type your commands, and you have a graphic view, which allows you to make plots. Same thing here.
You have your console, where you input things, and then here you have your output. So this is just a screenshot I made for one of the pictures. For some people R looks like this. This is RStudio, which is a shell around R that provides more things. It has a built-in code editor where you can type code. Here it has the console, so this is where you type in the commands, and then those commands are interpreted by the R interpreter. It has of course a plotting window as well, but this plotting window can also be used to view the files on your hard drive and to manage packages. "Does the download link you sent us work on Windows 10 and 11?" Yes, yes. Nowadays it doesn't matter if you're on Windows 10 or on Windows 11. They're the same thing; it's just that Microsoft is re-skinning it and reselling it. I can have a whole rant on why I think RStudio is good or why I think RStudio is bad. I think RStudio is bad for when you're a starting programmer, because there's a little bit of an additional layer between you and R, right? Here it's very bare bones: you type in your commands and then you see what happens in the graphic window. But here it's all kind of click and play for some parts, right? You can go back to the previous plot, you can load packages by just clicking on them, and that, I think, is not good for beginning programmers. But I don't care if you use RStudio or if you use the standard R GUI, even if it looks like this, because this is generally how I interact with R. I generally interact with R via the Windows command line. That means that you can just type R, it starts up R, and then you can type in your commands. And when you do a plot, it will open up another window, which you can just click away when you don't need it anymore. Good. So R looks and feels very different in very different situations.
So this is the Rterm window. This is the RStudio window, which some people like, but I'm not a big fan of it. This is how my R generally looks, but I use R a lot like this as well, because it's useful when you want to use it on machines. Good. With that being said and done, and the discussion going on, we will continue with R as a calculator, but we will first do a short break. So the first break, and all of the breaks, are around 10 minutes. I will quickly run down for a cigarette and drink some coffee, have my voice relax a little bit. And for you guys I have prepared funny animated GIFs. As is tradition, the first set of animated GIFs will be guinea pigs, because everyone loves guinea pigs, so why not? And I will probably start some music as well, because otherwise it's so damn quiet. Here we go, GIFs. Good. So we will do some GIFs and I will be back in around 10 minutes. All right, let me start the music. Like this, and you can't hear it. Let me stop the music, unmute the thing. So now you should be able to hear the music. And then we will go to the first break. I will be back in 10 minutes. Enjoy yourself and behave yourself, and see you in 10 minutes. Guys, let me read the chat quickly. All right, so there's a little discussion about eating guinea pigs, and feeding them to snakes, which you shouldn't do, because they're nice and cute. True about the alpacas. Oh, alpacas will come in one of the other breaks, so that's good. Let me see, anything more? "Does anyone have an issue with the examination office (Prüfungsbüro) to register officially to this course?" If you have any issue with registering, let me know what the issue is. Generally, if you are not from our institute, you need to have a letter from me saying that you can join the course. Something like that.
So that's generally the issue that they have, because they want to make sure that there's enough space in the course. "Chicks and cows are nice and cute too, but that doesn't stop you from eating them." Yeah, that's true. "Those guys are simply ignoring me like the plague." Yeah, that's another issue. I am well aware of that issue. They ignore me as well; not like the plague, but sometimes they do. So, yeah, send me an email. I will see if I can contact someone. Since I'm working here, I might have a little bit more push to see if I can get you registered. All right, so that's something that we can figure out. It might be because of the Easter holidays as well, because that of course always creates an issue with people being on holiday and these kinds of things. In the last couple of years I kind of learned whom to speak to directly, and to bother them to get a mail back. So I'll do my best to get you registered. Just send me an email so that I don't forget. All right, so let's start doing some R, right? I haven't even used my fancy button thing here, where I can just say: ooh, go to R, and go back. So first things first: R is a very fancy calculator. You can just type in whatever you want and it will compute. For example, if I want to know what one plus four is, I just type in something like one plus four. So I can do one plus four, and, oh no, that's 124; one plus four. And it will tell me this is five. "Sorry, where did you send the download link for R?" Oh, it's in the assignments, which are already online on Moodle. But let me actually put the link in chat. So if you're on Windows, then this is the link. There you go. This is where you can download R. So this is the R GUI, right, the one that I'm using. If you want to use RStudio, which some people like better;
I'm not a big fan of it, but then you can just go to rstudio.com and you can download it from there. So those are the links at which you can download it. All right, so R is, at its very basics, a calculator, right? You can type anything in. Just remember, and that's on the slide, so let's go back to the slide, that the decimal point is the decimal separator. So writing 1.5, it is always a period, never a comma. The comma is never a decimal separator in R, not even in files. It is one of these issues that we will run into when we start loading in Excel files, because Excel likes to localize your numbers, right? In some areas of the world it is "four comma five", and in other areas of the world the comma actually means thousands, so if you write a thousand you write "one comma zero zero zero". In R the decimal separator is always a period, and it will cause some issues, and then you will remember this slide and think: oh yeah, there's a comma in there, and R doesn't recognize commas as decimals. So 5 divided by 10 will yield 0.5, not 0,5. There are some special operators in R. You can do exponents, so five to the power of two, and you can write it in two different ways. You can use shift-6, if you're on a US keyboard, to get this little rooftop thingy, the caret. But you can also put two multiplication symbols behind each other. So five, shift-8, shift-8, which is the multiplication symbol, two, will also do five to the power of two, and of course this will yield 25. Then there are some special operators which are used a lot in computer science, which are called the Euclidean division and the Euclidean division remainder. I have two slides about that to explain to you guys what Euclidean division is. But it is division by integers.
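The calculator examples from this part of the lecture can be sketched as follows. This is only an illustration of what could be typed at the R prompt; the variable names are made up for the example and do not come from the slides.

```r
# R as a calculator: the decimal separator is always a period
sum_result  <- 1 + 4     # 5
half        <- 5 / 10    # 0.5, never written as 0,5
power_caret <- 5 ^ 2     # 25, using the caret (shift-6 on a US keyboard)
power_stars <- 5 ** 2    # 25, two multiplication symbols mean the same thing

print(c(sum_result, half, power_caret, power_stars))
```

Typing the bare expression (for example just `5 / 10`) at the prompt prints the result directly; the assignments are only there so the values can be reused.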
So it makes it so that you divide, but if you divide and there's something left behind the comma, which is a dot in R, then it will give you another output. And of course R comes with a whole bunch of built-in numerical constants, which you have to be aware of. If R tells you that something is Inf, that means that it is infinite. If it tells you NaN, it means that it is not a number. That generally means that the result of a calculation is not defined, like zero divided by zero. If you do something weird in R with a character, like this, and you say I have the string "hello", right, and I want to do plus something, then it will just give you an error saying "non-numeric argument to binary operator". But sometimes a calculation gives you NaN, meaning that it is not a number. So that's just one of these things. And the nice thing about R is that R understands, because it is built on statistics, that there are missing values, and it will deal with these missing values when you do computations. So NA is missing. So three very special meanings. R also has minus infinite, so something can be positive infinite or minus infinite. It doesn't understand things like countable and uncountable infinity, but those are very, very specialized subjects in mathematics, which we won't get into in this course. But it understands what infinite is. So, Euclidean division, right? Imagine that I want to divide 100 by 39, right?
So then the way that we used to do this in elementary school, or at least when I was in elementary school, is just to say: well, we write down the number that we want to divide between these brackets, and we write down the number that we divide by before that, right? And then the first thing that you would do is see that 39 does not fit into the first digit, 1. So you would then say, okay, then the next digit comes in, so it's 10, and 39 does not fit into 10 either. It does of course fit into 100. So how many times does 39 fit into 100? Well, it fits into 100 two times, right? And then you say: I have 100, and two times 39 is 78, which means that there's 22 left. And this two is the result of the Euclidean division, the integer quotient. So the Euclidean division tells you how often a number fits wholly into another number. And then the Euclidean division remainder is how many units are left when we do this first division, right? So in this case 22 is remaining, because there are still 22 units left over. And this comes back when we start doing loops. When you start doing loops, you want, for example, to use multiple cores on your CPU, and then you need to know how many batches can I do and how many do I have left, right? If I have four cores in my computer and I have to do 73 computations, then the question is: how many times does 4 fit into 73, so how many units can I do per core, and then how many units do I have left? And it comes back in a lot of mathematical problems. So Euclidean division and Euclidean division remainder, those are things for when you divide using integers, so whole numbers. Furthermore, R has a whole bunch of built-in character constants. It knows the letters, right? So if I type LETTERS in capitals, then I get a vector which contains the 26 uppercase letters of the Roman alphabet. If I type letters using small characters, then it knows that I mean the 26 lowercase letters of the Roman alphabet, right?
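Going back to the Euclidean division for a moment: in R the two operators are `%/%` (integer division) and `%%` (remainder). The sketch below reruns the two examples from the lecture, 100 divided by 39 and 73 computations over four cores; the variable names are invented for illustration.

```r
# Euclidean (integer) division and its remainder
quotient  <- 100 %/% 39   # 39 fits twice into 100
remainder <- 100 %% 39    # 100 - 2 * 39 = 22 is left over

# Splitting 73 computations over 4 cores
n_jobs  <- 73
n_cores <- 4
per_core  <- n_jobs %/% n_cores   # 18 computations for every core
left_over <- n_jobs %% n_cores    # 1 computation remains

# Sanity check: quotient times divisor plus remainder gives the original back
stopifnot(per_core * n_cores + left_over == n_jobs)
print(c(quotient, remainder, per_core, left_over))
```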
So if I go to R and I want to know what the fifth letter in the alphabet is, then I can just say letters, and I do square brackets and I put five in between. So I'm saying here: from the vector letters, select the fifth element. Oh, letters, sorry. So the fifth letter of the alphabet is e. The 20th letter, I wouldn't know that by heart; the 20th letter is a t, right? So R knows this. Furthermore, it also knows all of the months of the year. There are two things: it has month.abb, which is the three-letter abbreviation for the English month names, and you have month.name, which is the English name for the month of the year. So again, if we go back to R and we ask: what is the months dot names, is it months? It's month dot names; and we ask what is the 10th month. month.name, sorry. Then it says the 10th month is October, right? So it knows a little bit about time, it knows a little bit about months, and that's really useful when you have to deal with, for example, dates, right? Often in biology we have a birth date, and then we have a date at which the experiment started or a certain treatment was applied. So R can be used to calculate how many days are in between the day at which an animal was born and the day at which the animal went into an experiment. Of course R also knows pi. I think everyone knows pi: that's the ratio of the circumference of a circle to its diameter, and the built-in constant is accurate to about 15 or 16 significant digits. So don't use 3.1415 when you want to calculate something on circles; use pi. It is just a built-in constant. Typing pi will show you pi with a whole bunch of digits behind the comma. R also understands imaginary numbers. Imaginary numbers come into play when we want to calculate things like a weight on a spring where there's some kind of damping, and in all kinds of mathematics imaginary numbers come back, right?
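A small sketch of the built-in constants mentioned so far, plus the date calculation from the biology example. The two dates and the radius are made up for the illustration; they are not from the lecture.

```r
# Built-in character constants
fifth_letter     <- letters[5]      # "e"
twentieth_letter <- letters[20]     # "t"
capital_e        <- LETTERS[5]      # "E"
tenth_month      <- month.name[10]  # "October"
tenth_month_abb  <- month.abb[10]   # "Oct"

# Days between a birth date and the start of an experiment (made-up dates)
birth_date  <- as.Date("2023-01-15")
start_date  <- as.Date("2023-03-01")
age_in_days <- as.numeric(start_date - birth_date)   # 45

# pi is built in; circumference of a circle with a made-up radius of 2
circumference <- 2 * pi * 2

print(c(fifth_letter, twentieth_letter, capital_e, tenth_month, tenth_month_abb))
print(age_in_days)
print(pi, digits = 15)
```

Subtracting two `Date` values gives a time difference in days, which `as.numeric()` turns into a plain number.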
So we know that we generally can't take the square root of a negative number. But in mathematics, of course, we have advanced calculus, and in advanced calculus we have the definition that the square root of minus one equals i. So you can have R use imaginary numbers, but then you have to specify that you want that. If you say to R, give me the square root of minus one, it will say this is not a number. But if you ask it for the square root of minus one plus zero imaginary units, then it will tell you that the correct answer to the square root of minus one is zero real units and one imaginary unit. So it can deal with imaginary units. Of course R also supports all the basic trigonometry functions, like the sine, the cosine, the tangent, the arcsine, the arccosine and the arctangent. Those are just built in, and these are functions. So if I, for example, want to know the sine of 20, right, then I can just say sin, round bracket open, 20, round bracket close, and then it will tell me that the sine of 20 is 0.9. So just a built-in function, and of course it can do the same for the cosine, and it can do the arctangent and these kinds as well. These are built-in functions in R, which you can just use. Furthermore, R has a good understanding of what logarithms are. You can take the natural logarithm of five by using the log function. So this is when the base of the logarithm is e, 2.7 something. But if you want to have the base 10 logarithm of five, then you can write log10 and then five. The inverse of logarithms are of course the exponents. So if I take the e value: e to the power of one is exp(1), e to the power of five is exp(5). So it has logarithms and exponents built into the language, which is really, really useful. So, furthermore, when we are dealing with R, we are dealing with mathematics, right?
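As a rough sketch, the special values, the complex square root, and the trigonometry and logarithm functions described above look like this in R. Variable names are invented; note that R's trigonometric functions work in radians, which is why the sine of 20 comes out at about 0.91.

```r
# Special numerical values
pos_inf <- 1 / 0      # Inf
neg_inf <- -1 / 0     # -Inf
not_num <- 0 / 0      # NaN

# Square root of -1: NaN (plus a warning) for an ordinary number ...
real_root    <- suppressWarnings(sqrt(-1))
# ... but 0+1i once the input is written as a complex number
complex_root <- sqrt(-1 + 0i)

# Trigonometry (arguments are in radians)
sin_20 <- sin(20)       # about 0.913

# Logarithms and exponents
ln_5    <- log(5)       # natural logarithm, base e
log10_5 <- log10(5)     # base 10 logarithm
e_value <- exp(1)       # e itself, about 2.718

print(c(pos_inf, neg_inf, not_num, real_root))
print(complex_root)
print(c(sin_20, ln_5, log10_5, e_value))
```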
Then we have to know the order of operations. The order of operations for mathematics is that we first do the exponents and the roots, so the square roots or the cube roots or the fourth roots, then we do multiplication and division, and then we do addition and subtraction. So there is this "donkey bridge", as we call it in Dutch, an ezelsbruggetje. I don't know what you call it in German, but probably something similar. If you want to remember the order of operations, then it goes "Please Excuse My Dear Aunt Sally". So, read from the back, it is subtraction, addition, division, multiplication, exponents, and the P is for the parentheses, so the brackets come first. "We call it Eselsbrücke too." Okay, very good. So if I write down these things, right, and I ask you what will be the answer when you type this into R; and these are all of these stupid Facebook quizzes that your aunts and uncles are always sharing. Then of course the answer to this is four, right? Because we first do the multiplication, so three times two, which is six, and then we subtract that from ten. And it's not ten minus three, which is seven, and then multiplied by two. So the answer is not 14, right? That's just stupid. But I see these things on Facebook all of the time, because my aunts are very bad at mathematics, so they always mess up the operator precedence. Of course brackets can change the operator precedence, right? If I would write this down and I would say round bracket, ten minus three, round bracket close, then it would do the subtraction first, right?
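The Facebook-quiz example can be checked directly in R. This is a minimal sketch; the third line is an extra example that was not on the slide, added to show that exponents bind more tightly than multiplication.

```r
without_brackets <- 10 - 3 * 2     # multiplication first: 10 - 6 = 4
with_brackets    <- (10 - 3) * 2   # brackets first: 7 * 2 = 14
power_first      <- 2 * 3 ^ 2      # exponent before multiplication: 2 * 9 = 18

print(c(without_brackets, with_brackets, power_first))
```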
So the operator precedence is like in normal mathematics, but in case it is confusing, use brackets, right? Brackets are really useful in that sense, to be explicit. And this is one of these things that sometimes goes wrong when you write your own code and you get answers that don't make sense. But standard operator precedence, like in mathematics, also applies to R. All right, so in R, when we start the R window, we have a session. By starting R we get a session, and R at this point in time is started from a certain point on your hard drive. So if I want to know where I currently am on my hard drive, I can use something which is called, let me move here, get working directory, getwd(), right? So I'm currently on my D drive. I could be somewhere else. I can say, for example, set working directory, setwd(), go to my C drive, right? So now when I do a dir(), right, for listing all of the files in the current directory, then it will tell me: well, on your C drive you have something which is called Recycle Bin, and then you have bioinfo 2021, which is the old course, I have a folder called GitHub, and of course I have a folder called Windows as well. Right, so by using set working directory I can move somewhere else on my hard drive, or to a different hard drive, right? And get working directory will tell me where I am at this point. So, get working directory: where does R save and retrieve files from? If I type dir(), then it will list which files are located in the folder that I am currently in. I can use set working directory to go to another place to save or load files. And then there's this ls() function, and this tells me what is in the current environment. So in R, when I do an ls(), and I don't think that I have that many defined; well, I have four variables defined, but I can define more variables, right?
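The four session functions described here can be tried out safely with a sketch like the one below. It is an assumption-laden illustration: instead of the C and D drives from the stream it moves into R's temporary folder (`tempdir()`), so it runs on any operating system, and it returns to the starting folder afterwards.

```r
old_dir <- getwd()     # where R currently saves and retrieves files
print(old_dir)

setwd(tempdir())       # move somewhere else (a temporary folder in this sketch)
print(getwd())         # confirm the new location
print(dir())           # list the files in the current folder

setwd(old_dir)         # go back to where we started
print(ls())            # names of everything defined in the current session
```

On Windows the lecture's own call would take a path string such as `"C:/"`; forward slashes are accepted there as well.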
I can say numberX, right, and then I can assign the value 10 to this. So now when I do ls(), it will tell me that I have defined a new variable called numberX. So ls() gives you an overview of everything which R is currently aware of: all the variables that you defined, all of the things that you assigned to a name. So just that you're aware that this is the session, right? And in R everything is in RAM. So when I try to assign like a billion billion billion numbers to a variable, then R will run out of memory. Let me show you guys if I can actually force that, right? If I say bunch of numbers, right, and I say put all of the numbers from one to a million in there, then that fits, because I have 16 gigs of RAM, right? If I would do even more numbers, it still fits, but then if I would do even more, right, then at a certain point it will say: "result would be too long a vector". So that would mean that it would not fit into my memory anymore, and it just gives me an error saying that I can't do that. So you have to be aware that you can't create an infinite amount of variables, right?
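A sketch of the assignment and memory demonstration. The exact variable names typed on stream are not visible in the transcript, so `numberX` and `bunch_of_numbers` are assumed spellings, and `1:1e16` is just one request that is large enough to be refused on current versions of R.

```r
numberX <- 10                    # assign the value 10 to the name numberX
print(ls())                      # numberX now shows up in the session

bunch_of_numbers <- 1:1000000    # one million integers fit easily in RAM
print(length(bunch_of_numbers))

# Memory needed to store one million decimal (double) numbers
print(object.size(numeric(1e6)), units = "Mb")

# Asking for far too many numbers fails with an error instead of filling the RAM;
# tryCatch() keeps the script running and hands back the error message
too_big <- tryCatch(1:1e16, error = function(e) conditionMessage(e))
print(too_big)
```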
The memory that R uses is limited to the amount of memory that you have in your computer. So be aware that that is one of the limitations of R and that you have to deal with that, right? You can't load in a 50 gigabyte file when you only have 16 gigabytes of RAM. I told you that R comes with literally thousands and thousands of packages. So for example, there are packages to do QTL mapping. QTL mapping is a method to find regions in the genome which are controlling classical traits, right, and there's a package for that. So if I want to install a package, then I can say install.packages, and then I put the name between quotes, because I have to give it a string. If I would not give it the quotes, it would say qtl is not found, because it would look for a variable called qtl, which is not defined. So these quotes make it a string, a special data type that you can use to store text. So here I'm just going to say: install a package called qtl, right? And this will go online, look at the online repository, see if there's a package with that name, and then install it into your R version. If I want to use a package, I have to make it active, so I have to say library(qtl). So install.packages will install it, and you only need to do that once, but every time that you start R you have to load your package, so you have to make it active. So for example, thinking about things like multiple sequence alignment, there is a package msa, which allows you to do multiple sequence alignment. You have to first install the package, and then when you want to do multiple sequence alignment, you just say library(msa), and it will then give you a function to do multiple sequence alignment. And of course, like I said, there are like four or five thousand, perhaps even 12,000 packages. So, hey, you can just search for what you want, right?
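The install-once, load-every-session pattern can be wrapped in a small helper. `load_or_install` is a hypothetical function written for this sketch, not something that ships with R; it is demonstrated on `stats`, which is always installed, so running the example never goes online.

```r
# Hypothetical helper (not part of R): install a package only when it is
# missing, then make it active for the current session.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)               # goes online; only needed once
  }
  library(pkg, character.only = TRUE)   # needed in every new R session
}

# "stats" ships with R, so this call never touches the network
load_or_install("stats")
print("package:stats" %in% search())
```

For the lecture's example the plain calls are `install.packages("qtl")` once, followed by `library(qtl)` in each session. As far as I know, msa is distributed through Bioconductor rather than CRAN, so it would likely be installed with `BiocManager::install("msa")` instead; treat that as an assumption to verify.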
If I want to do machine learning in R, then, hey, I can just search on Google, saying I want to have a package in R which allows me to do machine learning. Again, you find the package, you say install.packages, and you give it the name of the package that you want to install. This will download the package from online and put it on your computer. If you then want to use this package, you still have to type library(qtl) to make the package active. If you want to save something to the hard drive; so for example, I just defined this variable, right, called numberX. If I type numberX, it will tell me this contains 10. And I can save this, so I can just say save numberX, and save it to something which is called myNumberX.RData, right? So now it will... no, sorry, I have to say "file =", I have to specify the name under which I want to save it, right? I get a warning message, and that's "permission denied", because if I do my get working directory, I am actually on my C drive. I'm not allowed to save things on my C drive because of the way that my Windows is set up. I can go to my D drive, I can save stuff on my D drive, and then I can say save numberX, and it will now create a file on my D drive called myNumberX.RData. If I now clear my session, right, so if I now say numberX is NA. So now when I type numberX, it has no value, and now I can just say load, and then load in myNumberX.RData. And now it loads it in from the file on my hard drive, and when I type numberX, you can see it has the value 10 again, right? So if you want to save stuff to your hard drive, then you can use the save function; if you then want to reload it, you can use the load function. Good, then there's another thing: if you want to just save everything, right?
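The save and load round trip can be sketched as below. To avoid the "permission denied" problem from the stream, this illustration writes into R's temporary folder; the file name follows the one used in the lecture, while `rdata_file` is an invented helper variable. The sketch removes the variable with `rm()` where the lecture overwrote it with `NA`.

```r
numberX <- 10
rdata_file <- file.path(tempdir(), "myNumberX.RData")   # a writable location

save(numberX, file = rdata_file)   # the file argument has to be given by name
rm(numberX)                        # remove the variable from the session

load(rdata_file)                   # restores numberX under its original name
print(numberX)                     # 10
```

`load()` puts the object back under the name it had when it was saved, so there is nothing to assign.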
Because I just did this ls() function to see, and it seems that I have more than one variable defined. I actually have bunch of numbers, number x, offset x, x sleep and offset i. For some reason; I probably started R before I started the stream. But if I just want to save all of them, right, I can use the save.image function. Then I don't have to specify what I want to save; I can just say save.image and the name of your .RData file. If I want to quit R, I can press the button on the top, right? And then it will ask me: do you want to save your workspace, do you want to save your session? Always answer no. Because otherwise, the next time that you start R, it will load in everything that you had defined, every time, and this is generally not what you want. It means that if you load in a 4 gigabyte file, then the next time that you start R it will take like 10 minutes, because it has to reload all of that stuff from your hard drive. So when you quit R, don't save your whole session. If you want to save stuff, do it explicitly, right? So say save or save.image, but don't do it when you're quitting R, because sometimes you have a lot of data loaded, and that would mean that by quitting it saves it to the hard drive, and then the next time you start R it starts loading in everything, which can sometimes make R take like half an hour to start up. All right. I told you guys one of the advantages of R is that help is available, and there's a lot of help available. So, hey, you can say question mark, function name, and then it will open up the help file for that function. Let's just give an example, right? So if I go back to R and I say give me the help of the function called seq, right, sequence, which is a function that allows you to make a sequence. So I can say seq, 1 to 100, go by 2, right? Then it will do one, three, and so on.
So that's just generating a sequence. If I want to look at the help file of seq, I can just do question mark seq. And then, I hoped not, it opens it in the wrong window, but that's okay; that's just because I'm streaming that it does that. So then it opens up the help file, and for you guys it will look like this, right? So now it opened up the help file, saying that seq is something which is in base R. It is "Sequence Generation", and it just says "generate regular sequences", it is a standard blah, blah, blah. It has the usage, and then when I scroll down, it will actually give me an example, right? So I can say seq, zero to one, length.out is 11. So split the range from zero to one into 11 equally spaced values. Hey, you can use by. So it just gives you a little overview, and it gives you a couple of examples at the bottom. So the good stuff of the R help files is always at the bottom. If you use this function, for example, in your publication, then here it also gives you the reference. It tells you: well, you used this function, this function was made by R. A. Becker, J. M. Chambers and A. R. Wilks, and this is the book that you have to cite. So in this case it's a book, so there's no DOI or journal, but it's called The New S Language, and the publisher was Wadsworth & Brooks/Cole. So, hey, it has all of this information for you, and that's really, really useful. So by doing question mark seq, I get the help file for the seq function. If I don't know exactly what I'm looking for, and I'm thinking, no, I want to look for something like, for example, the standard deviation, right? I can say: search for the term deviation. So I use the double question mark.
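The two `seq()` calls and the help lookups from this part can be sketched as follows. Because help pages open a viewer, the sketch only asks for them when R is running interactively; the variable names are made up.

```r
odd_numbers   <- seq(1, 100, by = 2)          # 1, 3, 5, ..., 99
eleven_points <- seq(0, 1, length.out = 11)   # 0.0, 0.1, ..., 1.0

print(head(odd_numbers))
print(eleven_points)

# Help pages open in a viewer, so only request them in an interactive session
if (interactive()) {
  ?seq          # help file of a single function
  ??deviation   # search through all help files for a term
}
```

Note that `length.out = 11` produces 11 values, which cut the range from 0 to 1 into 10 equal steps.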
Then I give it the term deviation, and then it will search through all of the help files that are there, and every file that mentions the term deviation will be listed here. And then of course I can click on it, and it will give me more information about it, right? So "variance and standard deviation of a vector", that's probably what I want. Or I can, for example, see surface area. So I can search through the help files, and that's very useful when you don't know exactly what the name of the function is that you should use or that you're going to use. All right, let me switch back to PowerPoint. So of course you can search for terms like obesity. And remember, you can even get the help file of the plus function, right? Because every function in R has a help file. In the case of plus you have to put quotes around it to make it a string; otherwise it would be a syntax error, because question mark plus could mean something else. But, hey, the plus function also has a help file, just in case you forgot how addition of numbers works. So use it a lot. It's really, really useful to just look at the help files, and it always gives you a small example at the bottom that you can just copy-paste in and see what happens. All right, data types. So now it starts getting serious, because this is something that you need to understand to understand how R works. In R, data is separated into different types. For example, you have a data type which is called logical. A logical value is something that we consider to be true or false. So a variable which is a logical variable can only have two states, right? It can be true or it can be false. There's no other option. It's not a quantum computer that we're using; we're just using a classical computer, so a bit in the computer can be on or it can be off. Furthermore, we have numerical values, right? So 5, 7.9, 10.6.
Those are numerical values. In R we have a type called character, which is badly named because it's not a single character. It can be a word or a line or a whole piece of text. A character is something which starts with a double quote, then you have something, and then it ends with a double quote. Everything between the two quotes is interpreted more or less literally and is just stored in memory as a character.
Because we are dealing with a statistical language, R has something called a vector. A vector is a list of things, and this list of things has to be of a single type. So I can for example say c, which stands for combine: combine these numbers together in a vector and then store it into v1. Let me show you guys how that works in R. I have some numbers that I measured, so I have my measurements. The first animal had a measurement of 10, the second 12, then there was an animal of 11, then an animal which is 9, and then there was an animal that had like 80. Just random measurements. Now if I type measurements, it will show me these measurements. So this is a list of things. You can do the same thing with characters and you can do the same thing with logicals. A logical vector is for example TRUE, TRUE, TRUE and FALSE.
R has built-in matrix support, unlike very different languages. A language like Java has no built-in matrix type; it does not understand what a matrix is. But because R comes from a statistical, mathematical background, it knows what a matrix is. A matrix is a two-dimensional array, like an Excel file. A single sheet in an Excel file is more or less a matrix, because you have rows and you have columns.
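A minimal sketch of the vectors described above, using the measurement values from the lecture. The variable names words and flags and the character values are made up for illustration.

```r
# Measurements from the lecture example, combined into one numeric vector
measurements <- c(10, 12, 11, 9, 80)
measurements

# The same works for characters and logicals (illustrative values)
words <- c("red", "white", "red")
flags <- c(TRUE, TRUE, TRUE, FALSE)

words
flags
```

Typing the name of a variable on its own line prints its contents, which is what happens when measurements is typed in the lecture.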
So if I want to define a matrix, I can use the matrix function. I can say: make a matrix containing the numbers 1:20. The colon is a shorthand for writing 1 to 20. And then give it five rows and four columns. So this will create a matrix which has five rows and four columns, and it contains numeric values again. In R a matrix is always of a single basic type. You can have a numeric matrix, you can have a logical matrix, or you can have a character matrix. But you can't have one column being numeric and another column being logical. That's not allowed in R. Vectors and matrices are always required to be of one basic type.
This comes back when we start loading in our own data and formatting our data for R. In general when we measure things we measure them in, for example, a numeric way, but we also use characters, because we say that this animal belongs to group one and it has a measurement of 10. In R you can't load this in as a matrix.
But let's just create a very basic matrix so that you guys know. I can say matrix(1:20, 5, 4), so five rows and four columns, and then it looks like this. The first thing that you observe is that R fills it on a column basis: the first numbers, 1 to 5, go into the first column, 6 to 10 go into the next column. If you don't want this, if you want to fill by rows, you can add byrow = TRUE. Oh sorry, that's all small letters: byrow is TRUE, and now it fills per row, right?
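As a sketch of the two fill orders just described (the names m_col and m_row are chosen here for illustration and are not from the lecture):

```r
# Default: the values 1..20 are filled column by column
m_col <- matrix(1:20, nrow = 5, ncol = 4)
m_col   # first column holds 1, 2, 3, 4, 5

# byrow = TRUE (all lowercase "byrow") fills row by row instead
m_row <- matrix(1:20, nrow = 5, ncol = 4, byrow = TRUE)
m_row   # first row holds 1, 2, 3, 4
```

Both matrices have the same dimensions and the same values; only the position of the values differs.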
So it says 1, 2, 3, 4 in the first row. So there are two different ways of creating a matrix. The standard way is taking the numbers that you gave it and filling it column by column. If you want to fill it row by row, then you have to specify byrow to be TRUE. All right, so that's generating a matrix.
If you want to work with types, you sometimes need to know how many measurements there were. You can ask for the length. So if we go back to R, where I defined my measurements, and I ask for the length of measurements, then it is five. There are five numbers in my measurement vector.
If you have a very complex object, which is possible in R, you can use str. str will show you a kind of representation of the structure. We have no complex object yet, but let me just show you the str of the measurements. It says it is a numeric vector, it has five elements, and then it lists the elements afterwards. Of course you can have a list with a list in there and then a matrix in there, and then these structure overviews become more useful. At the moment, because we're only dealing with basic types, it just tells you: it is numeric, it has indices 1 to 5, and these are the values inside of it.
class will tell you what class has been assigned. So class will tell you this is logical, numeric or character. You can also force things to be numeric. You can go from a character representation to a numeric representation, or from a numeric representation to a character representation, by using as.numeric or as.character. So let's show you an example of that.
So for example, if I have the string "1", then I cannot add something to it, because R will say this is an error: you have the character "1" and you're now trying to add a numeric value to it. Of course, we can convert this one from a character representation to a numeric representation by using as.numeric. So convert this character "1" to a numeric representation, and now R is able to understand what we want. We go from the character "1" to a numeric 1, and then we can add another 1 to it, and the answer will be 2, because one plus one is two.
We can also ask the question whether something is numeric, using is.numeric. If I have a certain character loaded, then is.numeric("100") will be FALSE, because it's a character. But if we ask is.numeric(100), then it will tell me TRUE. This also works for vectors, or matrices for that matter. So if I define my matrix, this is a numeric matrix; I call this y. If I now ask is.numeric(y), it will tell me yes, y is a numeric matrix. If I create a vector containing 1 and 2 and then "hello", which is perfectly fine because they all become characters, I can put them in my character vector, and now I can ask: is this thing numeric? And it will say no. I can ask: is it a character? Then it will tell me yes.
So you can ask questions about what is in a variable. All right, so working with types is very important. We'll come back to it, and it'll be fun, because this is going to screw you over every time. It even screws me over after programming in R for like 16 years. The type system in R is massively complex.
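The type questions and conversions above can be sketched as follows. The name charv is made up here, and the failing addition is wrapped in try() only so that the rest of the script keeps running; in the lecture the error is simply shown in the console.

```r
measurements <- c(10, 12, 11, 9, 80)
length(measurements)   # number of elements
str(measurements)      # compact overview of type, size and values
class(measurements)    # "numeric"

# Adding a number to a character is an error
res <- try("1" + 1, silent = TRUE)
class(res)             # "try-error"

# Convert first, then add
as.numeric("1") + 1    # 2

# Asking questions about types
is.numeric("100")      # FALSE, it is a character
is.numeric(100)        # TRUE

y <- matrix(1:20, 5, 4)
is.numeric(y)          # TRUE

# Mixing numbers and text in c() turns everything into character
charv <- c(1, 2, "hello")
is.numeric(charv)      # FALSE
is.character(charv)    # TRUE
```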
It will screw you over, definitely. It happens all of the time.
All right, so when we want to create vectors and matrices. We already saw a couple of these. We can use c, the combine function, and then we just say 1, 2, 3, 4, 5, whatever we want to put in. So we can c things together, and it will create a vector out of the individual objects. We can also use the seq function to create a sequence which goes from something to something by something. We already saw this with seq from 1 to 100 by 2. So let me give you an example: seq from 1 to 100, stepping by 7. It will say 1, 8, 15, 22 and so on. We can also use rep, the repeat function, and then we just repeat a single object a number of times. Oh, that's the wrong button. I can say repeat the letter "a" 10 times, and then it just gives me 10 a's. Very basic functions, but they are going to be very useful at a certain point, when we start having conditions and you say: the first 100 animals were in condition one and the next 50 animals were in condition two. Then you can use the rep function to generate a vector which has the length that you want.
Matrices we also already saw. We can create a matrix by just giving it a vector object and then saying how many rows and how many columns there need to be. We can also use cbind and rbind, and these are really useful functions, because I can take for example two vectors of the same length and cbind them together. If I have two vectors of length four, it will take the first vector as the first column and the second vector as the second column. rbind works the same, but for rows: it will take the first vector and make it the first row, and then take the second vector and make it the second row.
So for example, if I want to have an empty character matrix, I can say m is a matrix using the empty character string, so just double quote, double quote. It's a string with nothing in there. I could have used "a" as well, but then I would have a matrix which is not empty but filled with a's. And then it will generate a matrix of, in this case, 10 rows and 10 columns. So m is matrix("", 10, 10).
So, some examples. v1 is 1:4. This is shorthand: the colon just means "to", so 1 to 4 stepping by one, which is just 1, 2, 3, 4. I can use the seq function from 1 to 100 stepping by 7. I can use the repeat function: repeat the number 1 four times, so I will just get 1, 1, 1, 1. I have v4 here, which is just repeating the letter "a" four times. The same thing I can do with matrices. I can say m1 is a character matrix, so empty character, 10 by 10. If I want to create an empty numerical matrix, I have to use NA, because the empty numeric is called NA, missing. So this is just a matrix which has 10 rows and 10 columns, and all of the values in the matrix are missing.
I can take the cbind function, so I can say column-bind v1 and v3. Let me show you that in R. I say v1 is 1:4, and then I say v2 is, for example, the number 10 repeated four times. And I can say cbind(v1, v2). Then you will see that the first column is 1, 2, 3, 4 and the second one is 10, 10, 10, 10. If I use the rbind function, it does it the other way around: it puts the first vector in the first row and the second vector in the second row.
Good. I know this is a little bit boring, because it's just "how do you do these things?" But it is very important to really understand what's going on when you're dealing with vectors and matrices, because they will be everywhere, right?
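A short sketch of the construction functions listed above. The names condition, cols and rows are illustrative. One nuance that goes beyond the lecture: a matrix created with a plain NA is stored as logical until a number is assigned into it, and NA_real_ can be used if a numeric matrix is needed from the start.

```r
v1 <- 1:4                    # 1 2 3 4
v2 <- seq(1, 100, by = 7)    # 1 8 15 22 ... 99
v3 <- rep(1, 4)              # 1 1 1 1
v4 <- rep("a", 4)            # "a" "a" "a" "a"

# 100 animals in condition 1, followed by 50 animals in condition 2
condition <- c(rep(1, 100), rep(2, 50))
length(condition)            # 150

m1 <- matrix("", 10, 10)     # empty character matrix
m2 <- matrix(NA, 10, 10)     # matrix of missing values
m3 <- matrix(NA_real_, 10, 10)  # missing values, but already numeric

cols <- cbind(v1, v3)        # vectors become columns: 4 rows, 2 columns
rows <- rbind(v1, v3)        # vectors become rows: 2 rows, 4 columns
cols
rows
```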
If you load in your own Excel file, then all of a sudden you have a matrix, and you need to select certain columns from it, because you want to know what the average of a certain column is. So then you need to be able to select that column.
So how do we select things? Well, we already saw this in R. When I type something in R, it gives this number here. Let me switch back. That's what it does here: this is [1], so this is the first "a" in my vector. When I have a very long vector, which is not just 10 but like 100 elements, then it will say that this is the first one and that one is number 16. So it helps you directly by displaying where in the vector you are. The same thing holds for matrices. Here it says that I have the first column, second column, third column and fourth column. Same thing for the rows, but in this case, since I used vectors which have names, it uses the name of the vector as the row name, or as the column name when I'm using cbind. So these are just the indexes: when you type something in R, it generates indexes for you automatically, so you know how to select from the thing that you just created.
If we index a vector, we use square brackets. Here, for example, I have a vector v which contains the first letters of the alphabet, and I can select the fifth letter using v[5]: v, square bracket open, five, square bracket close. So if I'm selecting from vectors or matrices, I use square brackets. If I'm calling a function like the seq function or the repeat function, I use round brackets.
I can also use a trick. I can say: from v, select the letters 2 to 5, so v[2:5]. Then it will select b, c, d and e. And I can use the combine function again to say: select from vector v. You always read from the inside out. So combine the numbers 2 to 5 with 8, and then use this as the index into vector v.
So I'm saying c: call this combine function, combine the numbers 2 to 5 together with 8, and then select these from the vector called v. Then it will select for me b, c, d, e and then the letter h.
Good. We will practice this really hard during the assignments, because this is the thing that you need to be able to wrap your head around.
"In R indexing starts with one." That is a very good observation, Leonardo. That's a plus one for you. Yes, because R is based on statistics and mathematics, things start with one. If you think about mathematics, if you have a matrix, then the first element of the matrix is at row one, column one. This is different from many other programming languages. Languages like C generally count from zero, and that's the difference between someone who does computer science and someone who does mathematics. A computer scientist counts to ten by saying zero, one, two, three, four, five, six, seven, eight, nine. But the mathematician counts to ten by saying one, two, three, four, five, six, seven, eight, nine, ten. So a mathematician is like a normal person; a computer scientist is like a crazy person who just happens to count from zero. This goes wrong a lot, especially if you're used to programming in another language. These off-by-one errors are very common when you come from another language and you start programming in R. But this has all got to do with statistics, because in statistics the first measurement is called measurement one and not measurement zero. So because R is based in statistics, we count from one. Very good observation, Leonardo.
All right, so for a matrix we have the same thing, right?
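The vector selections just described can be sketched like this, assuming v holds the first ten letters of the alphabet (the exact length of the vector on the slide is not in the transcript). The last line is an extra illustration of what an off-by-one mistake looks like in R.

```r
v <- c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j")

v[5]            # "e", the fifth letter
v[2:5]          # "b" "c" "d" "e"
v[c(2:5, 8)]    # "b" "c" "d" "e" "h": read from the inside out

# Indexing starts at one: the first element is v[1]
v[1]            # "a"
v[0]            # character(0): index zero selects nothing
```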
So from a matrix called m, select the first to the third row, comma, column one: m[1:3, 1]. It will select this part, and it will give you back a vector which has the numbers 0.76, 0.299 and 0.211. I can also say: from the matrix, select from the fifth row the columns three to six, so m[5, 3:6]. Then it will select this area here, and it will give you back a vector which has a length of four, containing these four numbers. You can also select a single element, and then it will give you back a single element, not another vector. So from the matrix, from the eighth row, select the seventh column: m[8, 7].
And in R you don't have to specify what you want. If you just want everything, say I want to have the whole ninth column here, then I write m[, 9]. Because I don't specify anything for the rows, it will automatically give me all the rows. So: from the matrix, give me column number nine. You can do this for the rows as well. I can also say m[9, ], and then it would select the ninth row of the matrix. So not specifying anything means "give me everything", which is also a little bit counterintuitive.
All right, so just to repeat. There are three basic types in R for now. We have the logical, which can be TRUE or FALSE. We have the numeric values, which are numbers. Numbers always have a dot as the decimal separator, never a comma. We have characters, which are text between double quotes. And then we have vectors: a vector is an array of things which are all of the same type. And we have matrices, which are the same as vectors but two-dimensional, and they are also all of the same type.
To make it a little bit more complex, because I told you that the R type system is really difficult: we also have something called a data frame. So a data frame is like a matrix.
It's a two-dimensional structure, but it can contain multiple basic types: in the columns, not in the rows. So if I have a vector of length four containing the numeric values 1, 2, 3 and 4, I have another vector which contains the character values red, white, red and missing, and I have v3, which contains logical elements, and I want to put all of these together into a single two-dimensional thing, a data frame in R speak, then I can use the data.frame function. I give it v1, v2 and v3 and then store this into a variable called d. Now d is not a matrix, it's a data frame, and it is allowed to have multiple types in the columns. So v1 is the first column, v2 is the second column and v3 is the third column.
We also have a list, which is not a vector. A list can contain multiple types, and it can contain anything. So I can say: make a list, and the first element of my list is name, a character vector containing the word Fred. The next thing in my list is actually a vector, v1, and the next one is a numeric value which is designated age. And now it becomes really complex, so now you can see where the str function comes in. Let me make you a quick list. I say list, the name is, let's just do the thing that is on the slide, so name is "Fred". numbers is v1, we probably have a v1 defined. And then let's just put the matrix in, so I'll say y, which was this big matrix that we made.
A question from Zoom: is it also possible to make a matrix with three or more dimensions? Yes, but not using the matrix function. That is very advanced and will come all the way at the end. In the first 10 lectures we're only going to deal with vectors and two-dimensional matrices. But yes, R supports n-dimensional matrices as well, so you can have as many dimensions as you want. It just makes it more.
It just makes it more complex, so we're not going to deal with that today.
So here we see that I can create a list. Let me name the last thing as well, so this is matrix. Now I create a list: the first element of the list has the name name and contains "Fred". We have numbers in there, and then we have my matrix in there. So if you want to build really complex objects, which hold different types together, or even matrices, then you can use a list. Let me store this somewhere. Not d, but call it mylist. And now if I use str of mylist, it will tell me what this list looks like. It's a list of three. The first thing has the name name, and then it says chr, so it's a character, and this character has the value "Fred". The second element of my list is called numbers. It is of type integer in this case, not numeric, because 1:4 produces whole numbers. There's a difference between integers and numbers with a decimal point. And here you see my matrix, which is of type integer because it holds whole numbers as well, and it has five rows, so 1 to 5, and four columns. So this is where the str function comes in handy.
All right, so we have a data frame for when we have a matrix and each column has a different type. So this is more like a real Excel file, right?
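As a sketch of the data frame and the list described above: the values of v3 are not given in the lecture and are made up here, and the list element names follow the demonstration (name, numbers, matrix).

```r
v1 <- c(1, 2, 3, 4)                   # numeric
v2 <- c("red", "white", "red", NA)    # character, last value missing
v3 <- c(TRUE, TRUE, TRUE, FALSE)      # logical (illustrative values)

# A data frame may hold a different type in every column
d <- data.frame(v1, v2, v3)
str(d)

# A list can hold anything, including a whole matrix
y <- matrix(1:20, 5, 4)
mylist <- list(name = "Fred", numbers = 1:4, matrix = y)

# str() shows a list of 3: a chr, an int vector and an int matrix
str(mylist)
```

In this sketch numbers is created with 1:4, which is why str() reports it as int rather than num.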
A list is kind of a combination of things, and these things can be anything: be it characters, be it vectors, be it matrices, or even lists. I can put a list into the first element of a list, which just makes it more complex.
In R we also have a type called factor. The factor type is the type which is understood to be a categorical variable. This is again because of statistics, because in statistics we have, for example, males and females. Well, nowadays there's more than just males and females, but biologically speaking you have sex, and based on your chromosomes you can be male, XY, or you can be female, XX. And R understands that when I say as.factor. So I say: repeat the word "male" 20 times, then combine this with repeating the word "female" 30 times. So I create a vector of length 50: the first 20 elements are male, the last 30 elements are female. I combine these together, and now I cast this whole vector to this special type, the factor type. Because when I do statistics, and I for example want to do a t-test to compare the males with the females, R now knows that this factor has two levels: it can only be male or it can only be female. And it prevents me from adding a third, random value into the gender vector. It's not gender, it's sex in this case. That's a bad name; let me change that to sex, just not to upset anyone. So factors are categorical variables.
And this thing we've already seen a lot: if you write a hashtag in R, then this is a comment, and comments are always ignored by R. If you use a hashtag, everything after the hashtag is ignored. So when you write scripts, or when you write your answers to something in a script, use the comment symbol, and use comments a lot. So hashtags are comments. Good.
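A minimal sketch of the factor example above. levels() and the assignment at the end are additions for illustration and are not shown in the lecture; suppressWarnings() is only there to keep the script quiet, since R warns when a value outside the known levels is assigned.

```r
# 20 males followed by 30 females, converted to a categorical variable
sex <- as.factor(c(rep("male", 20), rep("female", 30)))

length(sex)    # 50
class(sex)     # "factor"
levels(sex)    # "female" "male" (levels are sorted alphabetically)

# A value that is not one of the levels is not accepted:
# R stores NA instead and gives a warning
sex2 <- sex
suppressWarnings(sex2[1] <- "other")
is.na(sex2[1]) # TRUE
```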
So those are more or less all of the things that we have How am I going to want to do this because I think we should first take a break because I've been talking for like an hour and 10 minutes again Um, so let's go to the second break and then after that we will have a type test So I will show you Something and I will ask you what type does it have in r So prepare for a self test get behind your keyboard get excited and We will see how how well I am able to trick you guys And the r type system is tricky. So we will try and trick you guys and misha. You're not allowed to join Because you already saw it probably last year or during the bioinformatics course. So We're uh, we're going to have to deal with that So, um, yeah Quick break again five to ten minutes. I will run to the toilet do a quick toilet break and then drink a little bit of coffee You guys get uh more animated gives I think this time it is going to be How are they called again the chupacabra big animals the French shaped animals they are called French shaped animals Yeah, yeah, my moderator knows all the answers to French shaped animals they are well, whatever you will see which animals cup be badass. That's it. That's it I've been talking too long. So my my sugar level is getting low Um, I should get a bite of chocolate as well while we're at it. So, um, yeah, I will be right back Um, so see you guys in five to ten minutes and enjoy the chupi cup cup be badass Not chupi cup be badass Good. So be right back. I made it back in time before the end of the music. I had to run through the building. Anyway Good, welcome back everyone who's still there. Um Get a piece of paper or uh use your computer Um, we are going to do a type test. I think if I didn't put in another slide. 
So type test All right, so here I list, um One two three four five six seven eight So put the numbers one two eight on a piece of paper and write down the different types Of the things that are listed here um, so i'm gonna Start in a little bit of music because um, it's eight and you probably want to think a little bit about it. Um, so Let me know when you have written down the types of these things and um, then we will Discuss them and I'm going to see how many of you I got to trick with the r-type system Um, so let's do another little music thingy. Um, Let's do pocket. We haven't done that yet. So Hope I can talk above it that it's not too loud um, so you guys have like Seven ish minutes left to come up with the types and uh Just let me know when you're done. Um, just throw in chat. I'm done. Um, also you can do that, of course in the in the zoom It's nice that I don't have to say anything really loud in my headphones How's it going? Are we still enjoying the r-course? Um, as a warning there will definitely be a question like this on the exam and I will do my best to trick you guys. Um It's only going to be one point of the whole exam Um, the exam will have 42 questions That's kind of how I roll. So there's always 42 questions 42 is the answer to life the universe and everything so Also to the exam 42 is where it's at. It should be in the email that I send around on monday The password for Moodle is r-course 2022. So, um, like this with the r capitalized So that's the key for Moodle both for if you are having a Moodle account The guest for example from a different university and you don't have a Moodle account at the Ha'u then you can use this for being a Guest, but if you weren't in the Moodle, um Drop me an email because then I I need to check and make sure that you're uh that you're in there And it's not professor yet. It's just doctor. So do titles anyway. So but in germany, they're big on titles. So Professor would be right. Can I do exam? 
Uh, Misha, no no not saying that you're too old for it Not saying that isn't this the second time that you're following the course actually crap That's fine. That's fine. Like in the end. It's not about getting the grade. It's just learning how to program that's its own um, it's its own reward right being able to program that uh All right, so around three minutes left. So we're just gonna Sit here for copyright free music. It's actually pretty good that uh, I was expecting that uh It would be much worse, but so anyone done yet because I didn't see any yes or thumbs up or something like that yet. All right first one finished I'm really curious Leonardo. I'm very curious how often I got to trick you Um, because I do my best always with these kind of questions I'm done with not knowing the answers. All right. That's that's that's like That's good as well. Well, it's not good, but it's also not bad, right? Like generally people, um, Especially since we just discussed it, right? Uh, it's it's difficult, but uh, it's gonna be fun. It's gonna be fun Done. All right. Key on way she I'm probably horribly mispronouncing your name. So don't don't blame me for that Moodle, uh, not not the moodle the zoom So there's still a couple of people in zoom didn't get any zoom. Yes, isn't it? I'm gonna do my potatoes All right, Misha. See you next time hungry as well But uh, we have like a couple of slides left after this. So after this, it's like eight slides and then uh, then we're done So it's not going to be much more. I just want to introduce you guys to the type system Any assignments will go and have you practice a little bit with the uh, different functions like the combined function the sec function the repeat function Creating a basic matrix Oh, you can still this. Oh, that's perfect. That's perfect She that's good. That's good. Learn something new today all right 10 nine eight seven five four three two one zero and hands down All right. 
So, first one: who wants to take a shot at it? What is the type of this one? Just throw in chat what you think it is. I know it will take some time, there's a 30 second delay. I don't know why the delay on YouTube is so much worse than the delay on... You are so correct! Character indeed, because of the double quotes. That's why it's a character. So that's one for you guys, zero for me.
Interesting. All right, second one. The type of this one? Just throw it in again. Also character; character with a question mark. Yeah, it's the double quotes, so it's a character. Very good. That's two for you guys, zero for me.
All right, what's the third one? The Zoom came in first. What do the guys and girls on YouTube think? Numeric; numeric with a question mark. "Vector, if that's anything?" No, a vector is of a certain type. It's not that a vector is a type in itself; it's kind of a storage unit, a thing with multiple slots. You don't have to retract your message, no one thinks any less of you because of it. But yeah, indeed, it's a numeric value. It's just written down in scientific notation. That's just the way that R rolls: R understands scientific notation. So this is a one with 11 zeros behind it. Very good.
All right, fourth one: 0x89. What does R make of this? I hope I'm still audible, because YouTube is again throwing all kinds of errors my way, saying that the audio stream bitrate is lower than the recommended bitrate. But I think I'm still audible. On Zoom we get ditto, and "an error, like the x is meaningless there?" This is actually a numeric value. It is a hexadecimal value.
So when you write 0x, it means that you're not using a base 10 system for counting, but a base 16 system. So 0x0F is also valid. Let me show you guys that, because that's actually pretty funny in R. If I write 0x0F, then this is 15. That's the way that hexadecimals roll, so R also understands hexadecimal. And of course, if we ask for the class, it will just tell me that it is a numeric value.
Good. Interesting. You guys are doing very well; no real tricks yet. Next one, this one: what is the type of this? All right, first answer on Zoom is in. You are correct, Shia. Rolando Rivera, unfortunately, you are not. And on YouTube people are also understanding that this is indeed a comment. Anything that starts with a hashtag is a comment. In many other languages this would be a color. In HTML, this would be a color which has full red, no green and half blue, so it's kind of a mixture. But R doesn't recognize colors in this way. So this is indeed a character... not a character, this is indeed a comment. Good, perfect.
All right, next one. This one should be relatively easy. "In my window there is a comment that slow mode is active." No, that's just to prevent someone from spamming links hundreds of times per second. And yes, the hashtag makes it a comment: everything after a hashtag is ignored by R, meaning that you can write whatever you want. After the hashtag, it's a comment. Slow mode being on just means that there is a five second limit, so you can't type message, message, message. Sometimes it happens that someone has a bot and they just spam "visit this website" or "buy bitcoins", and it rolls around the screen a hundred times before I can block them. So it's just to prevent that. You can say anything that you want.
But you can only say it once every five seconds.
All right: logical, logical, logical. Yes, this is a logical value. And then the two that are, I hope, going to be tricky. If not, then it's going to be an easy exam for you guys, because you almost have eight out of eight.
All right, so the next one: as.factor(TRUE). What is the class of this? What is the type? We are now getting interesting answers. We're getting answers like "function"; we didn't do functions yet. Oh, on Zoom we have one which is correct: factor. This is a factor. I take a logical value and I convert it to a factor, because as.factor means: make a factor out of the value that I give it. So as.factor will always return a factor, because that's what the aim of the function is. So: factor. Good.
Last one, and then we have a couple of slides left and then we're done for today. So: is.character("1e+11"). What is the type of this? All right, on Zoom I'm seeing a good answer. That is indeed correct. Leonardo says logical; Shea Vash says character. No, this is a logical, because is.character asks a question. The answer to this question will be TRUE or it will be FALSE, and the type of TRUE or FALSE, no matter what the answer is, is going to be logical. The answer to is.character is in this case TRUE, and the type of TRUE is logical.
All right, so just a quick recap: character, character, numeric, numeric, comment, logical, factor, logical. Interesting, right? It's just something that you have to deal with. The exam won't be that much more difficult, but I will do my best to trick you guys, and that's the only trick question that's going to be in the exam, I promise. The other questions are not going to be trick questions or something like that.
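The recap can be checked in R with class(). This is a sketch: the transcript only preserves some of the eight items literally (1e11 as "a one with 11 zeros", 0x89, as.factor(TRUE) and the is.character call), so the string "hello", the value TRUE and the color code in the comment are stand-ins chosen for illustration.

```r
class("hello")                  # "character": anything in double quotes
class(1e11)                     # "numeric": scientific notation
class(0x89)                     # "numeric": hexadecimal, equal to 137
0x0F                            # 15
# #FF0080 <- everything after the hashtag is a comment and is ignored
class(TRUE)                     # "logical"
class(as.factor(TRUE))          # "factor"
is.character("1e+11")           # TRUE, the quotes make it a character
class(is.character("1e+11"))    # "logical": the answer to a question
```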
I'm not like that. But this one I always try to come up with, being as creative as possible, to make it as hard as possible for you guys to reason about what the type is. But you guys did very, very well. That's like eight out of eight for a lot of people. So yeah, you are paying attention even though it's an online lecture. That's very good. All right, so we talked about the list, right? A list can contain anything; you can put in whatever you want. So I gave you this kind of example where we make a list. We have a character vector of length one with the name name and the content "Fred". Then we have numbers: an element with the name numbers and four numbers in there. We then have another numeric vector of length one, 5.3. And then we have a matrix. So in the last position I now put a matrix, right? This is a two by two matrix with zero, one, one, zero, zero, one in there. So if you want to select things from a list, you always have to use double square brackets, and that is the only time in R that you use the double square brackets. When you are selecting from a list you can either do it by name, right? So if I say w$numbers, then it will know: okay, from w, select the thing that has the name numbers, and then in this case c(2, 3), which means select the second and the third element. If I want to select the first thing, right, if I want to select Fred and I don't want to use the dollar name, I can just say w, two square brackets open, one, two square brackets closed, and then select the first element: w[[1]][1]. Because although it is a single character value, in R it's still stored as a vector, right? So I still have to say: select the first element from it. If I want to select from the matrix element here, then it is w four, right?
In between two square brackets, and then comma one: w[[4]][, 1]. Comma one means select all of the rows from the first column. I can also just use this: w$matrix[1, ]. So when dealing with lists, my advice is to always name the elements of the list. So say something is... yes, give it a name. Say element one is Fred, element two is..., because it just makes it easier to select from a list. Sometimes you run into issues, or you deal with a package and this package doesn't name the elements of a list, and then you are forced to use the double square brackets. So this is the only time in R that you use double square brackets, and that's just difficult. It's the way that it is, I can't change it, but this is the way that you can select or index in a list element in R. All right. So there are some additional functions that I did want to introduce, because they come back in the assignments. We have a function called nrow, right? So nrow, round bracket open, and then when we give it a matrix object, so something which is of type matrix, it will tell you the number of rows. The same thing holds for ncol of a matrix: that will tell you how many columns there are in a matrix. And this will be very important when we start looping, right? So when we start going through every row in the matrix, or when we start going through every column in the matrix, then we are going to use nrow and ncol, because we're not going to hard code it, right? That's bad. We're not going to say for x in 1 to 6. We're going to say for x in 1 to the number of rows of the matrix. If we want to know the row names or the column names of a matrix, we can use the rownames function or the colnames function. And we can also use these functions to assign row names and column names to a matrix. So let me quickly demonstrate that for you guys, because I think we have a matrix, right?
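The list selection and the nrow/ncol looping described above could look roughly like this. It is a sketch under assumptions: the list w is rebuilt from the spoken description, and the four numbers and the matrix values are hypothetical stand-ins for whatever is on the slide.

```r
# Hypothetical list modelled on the example described above
w <- list(name = "Fred",
          numbers = c(1, 2, 3, 4),
          5.3,
          matrix = matrix(c(0, 1, 1, 0), nrow = 2, ncol = 2))

second_third <- w$numbers[c(2, 3)]  # by name: second and third number
first_name   <- w[[1]][1]           # by position: first element of list item 1
first_column <- w[[4]][, 1]         # all rows of the first column of the matrix
first_row    <- w$matrix[1, ]       # by name: first row, all columns

# Loop over the rows without hard coding how many rows there are
m <- w$matrix
for (x in 1:nrow(m)) {
  cat("row", x, "has", ncol(m), "columns and sum", sum(m[x, ]), "\n")
}
```

Because the loop asks the matrix for its size, it keeps working when the matrix later gets more rows.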
We have y, which is our matrix. So I can say, for example, rownames of y is, and then I can say row1, then row2, row3, and then r4, just because I'm lazy, and then r5, right? So now when I print my matrix you can see that it now has row names. I can do the same thing for column names. So let's assign some column names as well. I can say the colnames of y is... repeat? No, not repeat; a sequence from five to ten. It's only four, so it's from five to nine. How do you mean it's not...? What? seq(5, 9). Oh, that's way too much. Five to eight, sorry. Five to eight. All right. So now when I say y, right, now I did something very, very dumb, because now I gave numerical column names to this matrix. So now I can do y and then give me five, and then it will give me the first column. So that's dumb, right? So if you use row names and column names, always make sure that it's logical, right? So not that you start renaming rows, saying that row number one is actually called five and row number six is actually called two, right? That will just make life harder on you. So always when you do something like this, where you say I want to have a sequence from five to eight, you can use things like paste0 to say measurement or something like that, and then combine that with this, right? And now when we look at y, then every column will now have a real name. So when you do names, make sure that the names start with a character, and don't assign numeric names to a matrix.
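The demonstration above can be replayed with a small stand-in matrix. This is a sketch only: the contents of y are hypothetical (the lecture's y is not shown in the transcript), and only its shape, five rows by four columns, is taken from the names that were assigned.

```r
# Hypothetical 5 x 4 matrix standing in for y
y <- matrix(1:20, nrow = 5, ncol = 4)

rownames(y) <- c("row1", "row2", "row3", "r4", "r5")

# Numeric-looking names are allowed, but confusing
colnames(y) <- 5:8
first_col <- y[, "5"]     # the column NAMED 5 is the FIRST column

# Better: names that start with a character
colnames(y) <- paste0("measurement", 5:8)
print(y)

# Names can now be used to select directly from the matrix
y["row1", "measurement5"]
```

Calling rownames(y) or colnames(y) without an assignment simply returns the current names.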
That's just making your life harder than it is But row names and call names really really useful because you can use them to directly select from your matrix And also when you want to combine two matrices together When they have the similar row names, you can just merge them in one go But hey, you can you can use row names to Get the row names, but you can also use row names to assign the row names And then one of these little tricks you have the t function, which is transposing a matrix And this happens a lot Because often you have for example functions external functions, which take a matrix And then the measurements need to be in the rows and the observations or The individuals need to be in the column and the problem is your data is just the other way around So for this and this is a very common problem that your matrix is is is kind of flipped upside down, right? Or flip the other way you can use the transpose function So what the transpose function does it takes the first column and makes it the first row It takes the second column and makes it the second row and it takes the third column and makes it the third row So it flips rows and columns around and that's what the t function is for I just wanted to introduce to you because we're going to deal with it during the assignments But it's a really useful function because it happens a lot that your data is just the flipped version of what you think it should be Or what the what the external function that you are using Expects it to be right if we want to use for example principal component analysis Then the the pca function that we use Expects the rows to be individuals and expects the columns to be the different observations that you did on the individuals and Often your data is just flipped the other way. 
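As a small illustration of the t function described above, here is a sketch with made-up data: the names weight, height, ind1 and so on are hypothetical and only there to make the flip visible.

```r
# Hypothetical data: 2 measurements (rows) on 3 individuals (columns)
measurements <- matrix(1:6, nrow = 2, ncol = 3,
                       dimnames = list(c("weight", "height"),
                                       c("ind1", "ind2", "ind3")))

# t() turns every row into a column and every column into a row
flipped <- t(measurements)

print(dim(measurements))   # 2 rows, 3 columns
print(dim(flipped))        # 3 rows, 2 columns
print(flipped)
```

The row and column names travel with the data, so after the flip the individuals are the row names.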
So transposing Taking row number one making it into column one taking row number two column number two and so it's just flipping it All right a little bit of a word about variables because this will come back So in my mind because I always think about code not so much as code, but it in my mind I have like a A physical picture on how things look and this is different for very for for different people Right, so in my mind if I think about a variable Then a variable is in my mind a box like a black box Right, so I can put something in the box And then I can use the box without knowing what is in it Right, so for example I always say that that variables are like boxes control structures are like conveyor belts and functions are like factories because in my mind That's just the visualization that I have with it. It might not be a useful visualization for you, but in my mind Variables are just boxes where you can put stuff in Right, so and we have already seen a lot of them So we can name a variable anything we can come up with the name so make sure that you select meaningful names So here I'm defining a variable called variables and I put the value 1.5 in Here I'm defining a variable which is called can and I'm putting two logical values in there So a vector with two logical values So variables can have many names and you are responsible for selecting correct names for them So that's that's a difficult task, but that's something that we will grow into And you can put things in boxes using this Arrow thingy, right or you can use the is symbol So is also signs into a variable. 
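Putting things into boxes can be sketched like this. The variable names mm, nn and oo are assumptions based on how they sound in the recording; R also has a right-pointing arrow, which is included here for completeness.

```r
mm <- 5     # arrow: put 5 into mm
nn = 5      # equals sign: same result, the name is always on the left
5 -> oo     # the arrow flipped around: put 5 into oo

# A variable can hold more than one value, for example two logicals
can <- c(TRUE, FALSE)

print(c(mm, nn, oo))
print(can)
```

All three forms create an ordinary variable; which one to use is a matter of style, as long as it is used consistently.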
So these two things are equivalent: if I say mm <- 5, this is equivalent to nn = 5, right? And you can select which one of the two you want. Generally I like to use the pointy arrow thingy. And why do I like the pointy arrow thingy? Because I can flip it around, right, and I can say 5 ->, so put five into, uh, oo. Right, so I can assign the other way around, while the equals sign always assigns to the name at the beginning, right? So here I'm putting five into nn. But using the arrow I can assign five to mm, or I can assign five to oo the other way around. This is just one of these quirks in R. Sometimes it's useful, sometimes it's not, but I always try to use the arrow. All right, so very good. So for the assignments, because we're almost done: I want you guys to make a new directory on your hard drive, and the new directory will hold one script for every assignment, right? So every week create a new file. My file is called assignment1 or answers1.R, and then for the second round of assignments we have answers2.R. So name the files in a logical way. Don't call them a1, a2, a3, a4, right? You can, it's not wrong, but just be diligent, right? Programming is like working in a lab. "What's a directory? A file?" No, a directory is one of these yellow folder thingies in Windows. Let me see. I don't have a... So if you go into Windows and you click somewhere, you can say New, and then you have the option to make a folder or a text document or something like that. Yeah, a folder. Yeah, yeah, yeah. So hey, just create a folder somewhere on your hard drive, or a directory, that's the official name, I think, and just put everything into one, right? So then you can also, at the beginning of your script, use set working directory, setwd, right?
Because you have to move from where you started R to the folder where you have your answers. That's not important for the first assignments, because we're not going to load anything from disk or write anything to disk. But once we get into the later assignments, you will also have files that you need to load, right, and then you need to put those files into a data folder, and then, hey, you get a more complex structure. So when you program, and let me show you Notepad++, always add a header, a comment section, at the beginning of the file. All of my files start more or less like this, where I have a hashtag and then... That is Windows far away. So I always have a hashtag, another hashtag, and then I have the description of the file: what does it do, right? In this case it would be answers to assignment one. And then add a copyright statement, because you are making it. It is your code, and this is important, it's yours, right? So claim ownership of what you are doing. Generally I have the name of the file, the date, the purpose of the file, and then a copyright statement. And use a lot of comments. I think that one of the first assignments is to do something like four plus five. Like this: 4 + 5, right? Of course, you don't have to write a comment here, everyone knows what's happening. But when stuff starts becoming more complex, explain what you do. And generally what I do is write down the answer like this: 4 + 5, answer is 9, right? And then we have the next one, which can be seq(1, 10)... well, that's not a good one. seq(1, 10, 2), so one to ten by two, and then do the sum of this, right? So this is my code. I go to R, I copy the code into R, like this, right? And then it says 25. That is my answer. So I go back to Notepad++ and then I write down hashtag answer 25. Not 225, right?
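A script laid out the way it is described above might look like the sketch below. The header fields follow the spoken description; the file name, the placeholders in angle brackets, and the commented-out path are hypothetical and need to be replaced with your own.

```r
#
# answers1.R
#
# Purpose: answers to assignment 1
# copyright (c) <year> - <your name>
# first written: <date>, last modified: <date>
#

# Move to the folder that holds the answers (hypothetical path, so left
# commented out here; setwd() stops with an error if the folder is missing)
# setwd("D:/R-course/assignments")

# Assignment: 4 + 5
answer_a <- 4 + 5
# Answer: 9

# Assignment: the sum of the sequence from 1 to 10 in steps of 2
answer_b <- sum(seq(1, 10, 2))
# Answer: 25

print(answer_a)
print(answer_b)
```

Writing the answer as a comment under each piece of code keeps the file searchable later on.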
So just to keep things structured and in the beginning this seems like a lot of extra work It is a lot of extra work in the beginning But in the end it will really really benefit you One of the reasons is um, why do I do this? So and why do I Why do I hammer so much on it that I want you guys to to do it as well? is Because in the end it makes your code searchable Right windows can easily search into text files. So this is for example in an Something I wrote like a couple of years ago in 2015 So the purpose of this file is the analysis of hardy Weinberg equilibrium, right? It is copyrighted 2015 by the hau berlin because I work for the hau So all of the code that I write during the time that I'm sitting here behind my desk is not mine It's from the university, right? It is written by me And then I generally add when it was first written and when it was last modified And this is not so much for me, but this is for the person that's going to eventually replace me Because you are writing code But you're not living in a void Like you are standing on the shoulder of giants. So you have to Know who these people were you need to know when they touched the file last right when when the first version of it was written All of my scripts start with a kind of set working directory Right, so I always say at the beginning of my script I want to move to the d drive into the r-course into the assignments So this is where I store all of my answers to the assignments So this is kind of an example But be diligent about it because this will save you in the end so Use a good text editor If you are on windows, I always advise people to use notepad plus plus It is a beautiful editor Um, I love it a lot. Um, it has automatic code highlighting which I like a lot, right? So, um, if I use like sec and sum it recognizes that these are built in function in r And I get this little purplish glow, right? It knows that these are numbers So they get to be orange. It knows that this is a comment. 
So it it makes it green So I can directly visually see what's happening in the file um and Use an editor which has bracket testing This means that when I put my cursor behind the bracket, it highlights the corresponding one, right? So here I'm highlighting the last one and it belongs to this one Here I can highlight put it here, right and I can see okay So this bracket here Which I'm closing here was opened here, right? And if I go one further it see okay, so this one I open It uh, this one I closed it opened here, and this is really important Because when you start building up bigger and bigger scripts, um, let me see if I can find an example of something like this Um, where I am using like a massive massive amount of brackets. Um, just to Confuse the hell out of everyone. Um, let me see if I can find a good good example Um, let's do an example in R um, so something goat ish For example sequencing Then no, that's not a good example Like here right here, so now I can put my cursor here and I can see oh the Curly bracket that I opened here gets closed all the way over here, right? You can directly physically see and this is very important Um, yeah the same thing for something like this. 
Oh, I know that the bracket because in the end A missing bracket will cause a lot of headache It will be a lot of debugging finding exactly where it is and then just being able to put Yeah, so if I would have an additional bracket right here Um, then now it will open it here And now when I can I can scroll through the file and I can see that this bracket is actually closing it here Right, so so now when I go to the first one here Then now it it doesn't highlight it meaning that this one is not closed Right, so so this one here is closed because it closes here after the paste But this one is is close or is opening and then closing the last one here But the one here in the front now has no partner So I mean I know now that I'm missing a closing bracket somewhere Um, and this is this is really important Um, so yeah It if you use windows use notepad plus plus If you are using mac os x use bb edit, which is a free editor for mac And if you're using linux, you can use whatever you want I'm super confused with these programs. Are we seeing r right now or notepad? Okay, so this is notepad plus plus And this is r So r is always everything in red at least in my version, right? And it has this r console here Right, and it has the stop button here to stop computation and um Perhaps it would be better if for notepad. I would just add my directory browser Like this But it it really depends so and in the end you can use any editor that you want Is atom good for r? Yeah, atom is a pretty fine editor as well. 
It it It's In the end like you can use anything you can't use microsoft word, of course I I definitely would have advised you against using notepad But notepad plus plus is a really good editor for code So yeah, joly some this is just something that is Um Is confusing at first, but once you installed both of them, um, then you can easily kind of see the difference But in r And that's why we looked at the look and feel of r also r can look very different from time to time right because this is r but This is r as well So here Let's open up the command window. So this is r as well, right? I can just type in five plus five Or five two five five plus five five plus five like this and then it will tell me ten Right. So the thing to remember if you see this, um Larger than symbol then this is an input line in r if you see something like one Then this is an output line in r. So this is an automatically index vector this vectors of length one It contains one value So it is something that you just have to kind of get used to and it can be different for all of the I think I have no, don't worry. Don't worry. That's why we're here for right? That's that's just to Just to help you kind of get familiar with all of the all of the things And like in the end every beginning is really hard Especially when you have no programming experience and for me. I also Sometimes gloss over these things too quickly. I've been programming since I was four years old So a lot of these things for me are very very logical and I don't I have for me. It's really hard to think like a beginner So if you're saying that no, can you explain it? Then I will gladly take the time to explain it Even if we're sitting here until six, right? That's why we have a four hour lecture Um had just to make sure that you guys come away with a new skill Right. I can already program. So for me, it's not it for me. It's really hard to Think back like what did I not know when I was a beginner? 
Um, but yeah, so Don't feel any burden to ask these kinds of questions. Um, I'm more than willing to explain And next time when we meet In person, I will send around an email about that about when we are going to do the assignments Together. I put them on Moodle. So do start already. Just download the assignments from Moodle and Start doing the assignments one by one If you get stuck either send me an email or wait until we can sit together on probably Thursday next week And then we will sit together for two hours and we will definitely go through all of these things but Will we know if the next lecture is actually in real life at Impoliti Strasse? Yeah, that's that that's still not entirely clear I'm very confused by the corona thing from the hau um Because we we have these emails with these dean stanweizung Right and the big issue was that we have dean stanweizung nine And then we had dean stanweizung 10 and nine was wearing mask and I have to check you guys If you're vaccinated and I have to make sure that if one of you gets covid that I inform everyone um, and then At a certain point they switched to 10 But 10 was not approved by the personnel Rat or personnel up tailung so that invalidated 10. So we're back to nine and it's completely unclear at least for me I I have no idea what I should do To have you guys in the building and have you formally in the building? Especially since I'm a little bit scared about it because we have two people in the building working in the lab Who have immune diseases? So they are either on immune suppressants or they are naturally immune compromised So I don't want to risk them getting sick because for them. It's a much bigger deal than for me Yeah, for me it would probably just be like a short flu or anything But that's the thing but I will inform you guys Speedily once I figured out how we can do this um All right, let me finish up the presentation. We still have like two slides left Um, Rolando Riviera. 
"I have a question, but I think it would be easier if I ask it orally." Sure, hang around in the Zoom after the lecture, we can just chat about it very quickly. All right, so use a good text editor. On Windows I always advise Notepad++. If you want to use Atom, or you want to use Visual Studio Code or another kind of code editor, no worries, we can just do that. You can use it as long as you feel familiar with it, right? For me, I work with Notepad++ and the R GUI. Other people use RStudio; RStudio has a built-in editor. Other people like to use Atom and run R via the command line. But for a good editor the most important things are code highlighting and bracket testing, because that is going to save your ass multiple times over the coming weeks. All right, so remember: clean code is smart code. Here we see the same thing written down in two different ways, right? One of them is just not adhering to indentation at all. It's an unordered list with a list item. So this one, although it takes more space, is nice and clean, right? You can see the structure of the code: this one is within this one. And the same thing holds for R, right? If we go back to Notepad++, hey, then you see also that I take the time to lay out my code, right? So when I do a while, then this is within the while loop, right? Because the bracket starts here and ends here. So indentation is very important. Don't squeeze everything on one line. Often you are allowed to, and you can, but you should not do that, right? So make sure that things are structured in a way that just physically looks good. If you look at code and it looks clean, then it's much easier to reason about what is happening and how it works. And that's it for today. So are there any questions?
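The same point about indentation can be shown in R itself. This is a made-up example, not the one from the slide: both versions add up the odd numbers from 1 to 5, once squeezed onto one line and once laid out.

```r
# Squeezed onto one line: allowed, but hard to read
i <- 1; total <- 0; while (i <= 5) { if (i %% 2 == 1) { total <- total + i }; i <- i + 1 }

# The same logic with one statement per line and indentation per bracket level
i <- 1
total_clean <- 0
while (i <= 5) {
  if (i %% 2 == 1) {                  # odd numbers only
    total_clean <- total_clean + i
  }
  i <- i + 1
}

print(total)
print(total_clean)
```

In the second version the indentation shows at a glance which lines belong to the while loop and which belong to the if.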
Wait, there's one more slide So the assignments are available on Moodle You can get them from here as well if you're Just hanging out on youtube and thinking oh, I want to learn a little bit of r So had the the assignments you can get from here. So r 2022 Those are the new assignments The lectures are in our lectures. Make sure to add the slash to the end. I wrote my own web server Should never do that should use a standard web server, but I wrote my own and I'm using it and I'm sticking with it but You need to add the slash at the end. So r 2022 without the slash is a different page Is a page not found with the slash it will show you an overview Let me show you guys how it looks just because we can do that. So let me go to firefox Let me get the firefox here. So if we go to denny arans.nl, right, then this is my home page And then if we go to r 2022, right without the slash it says Restricted you're not allowed to do this if you add the slash to the end Then you get to my home page and there's the first assignments So this is the first assignment, right? So the first thing is install r Then install the good text editor create a directory a folder on your hard drive Create a directory in this folder called assignments one, which is a little bit overkill But you don't really have to do that create a file called answers So but like make sure that you have a structure that you are comfortable with working in Again the example and then head just typing in these So all right, very good Let me see. I had a couple of words of wisdom The only way to learn coding to learn program is to sit down and code to do it You can listen to me for 40 plus hours But if you don't do the assignments You're not going to learn because I I can talk to you guys and I can explain to you all kinds of things In the end only by doing the assignments and and programming for yourself like taking your own data that you have and Applying the knowledge that you have to this Do you learn how to code? 
So do the assignments if you get stuck send me an email And do it quickly right if you get stuck and you're stuck with something for 30 minutes Then just send me an email. I'm generally very quick at responding to emails when I'm not on holiday So that's that's the way that it works And as a word of wisdom, this is not high school So I won't be checking if you did the assignments in the end. It is up to you to master a new skill I can only do so much. I can tell you How things work I can give you assignments so that you can practice how things work But if you don't do them, then you won't learn how to code And of course clean code. That's Good code. All right, so for me, that's everything. If there's any questions you can ask them now And Rolando had a question as well. So I will Quit the stream So I will see you guys next week and I'll send around an email informing you how we are going to do this Because I do think it would be fun to have you guys here do the assignments together And then I will do the stream And we will have a beamer so you can see me on the beamer or something like that Are we discussing the assignments next week? No, probably the week after Because the way that I'm currently thinking about it is that At before we do lecture number two, we do assignments number one together So then we will be discussing them in person. So I will show you my Answer so hey, you will you will sit there. You will do the assignments and then we will take 10 15 minutes and quickly go through them But I'm not I'm not sure how that's going to work yet So it's it's a little bit awkward since it's kind of a half digital half in person kind of semester thing So for me, I think it would be good if you guys get to sit together to do the assignments But we'll still have to figure out a way All right, Anja a more general question. What is the difference between python and r? brackets Both of them are programming language. 
It's just that the syntax is different. So in R, when you write a block of code, you use the curly brackets. We haven't touched the curly brackets at all yet. But in Python you don't use curly brackets, and in R you use curly brackets to denote a block. Python is a different language. It has a slightly different structure, but it's a perfectly valid language to learn as well. "More general question: is there a difference between Python and R?" Yes and no. No, because they're both programming languages. They're both Turing complete, meaning that you can do the same thing in both languages; you will just have to tell the computer in a slightly different way what to do. One of the things that R has that Python does not have is the built-in understanding of statistics. Python has no factor type. It does not understand that groups are groups, statistically speaking, right? Because in statistics you analyze differences between groups differently from numerical differences. So there is a fundamental difference in how these languages think about statistics. But in theory anything you can do in R you can also do in Python, and anything you can do in Python you can also do in R, because both of them are Turing complete languages. So yeah, we can have a whole discussion about which one is better. I don't think that either of them is better. They're just different. My favorite programming language is actually not R.
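What "understanding groups" means in practice can be sketched with a factor. The data below are invented for illustration; the group labels and weights do not come from the lecture.

```r
# Hypothetical data: four animals in two groups
group  <- factor(c("control", "treated", "control", "treated"))
weight <- c(10, 14, 12, 16)

print(levels(group))      # the distinct groups R found
print(table(group))       # how many observations per group

# Because group is a factor, R can summarise the numbers per group
group_means <- tapply(weight, group, mean)
print(group_means)
```

The factor carries its list of groups with it, which is what the statistical functions in R rely on.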
I have most experience programming in r My favorite language at the moment is actually the d programming language So you can look that up on google if you google day long um, or you just google my name with, um Day long well my forum posts come up as well And that it's a language which is very similar to c or c plus plus But it has some nice cities that I really like Which are not in r it has some very serious drawbacks as well like it also isn't a language for statistics but Yeah, but there's different languages every language has a as an advantage has a disadvantage all right any more questions i'm still here so We still have 37 minutes of questions if you guys want so I'm just glad that we got through all of it Because it's uh, it's a long lecture the first one. Um, I know this because I always try to push in the history to make you guys understand it Uh, Rolando, I can't hear you unfortunately Because I muted zoom so that otherwise you would be on youtube asking your question Why is r called r? Because it's the successor to s Which is a programming language as well So the s language is the is the language before r and r is an implementation of s That's that's that's the correct answer So Yeah, no, there's a programming language called s. It's not really a programming language It's more of a way of thinking about code. So you have s3 and s4 objects and r is the implementation of s um, it's like The language that I like most d right that is the successor to c He didn't see that one coming. No, no one sees that one coming. That's disappointing. Why what what kind of Like it was invented by a pirate Like that's why it's called r No, it's not like that Um, no, no, it's the successor to s Like d is the successor to c and c plus plus is also the successor to c Right, it's c but then better. So plus plus People have weird names when it comes to programming languages. 
There's a whole bunch of esoteric languages out there as well Which are programming languages that you do for fun Things like a programming language called pete based on paintings by pete mondrian Very interesting language very fun to program in But not very useful because you you use paint To make your programs so the programs are pictures and these pictures they are abstract paintings so every shape means is a kind of Instruction for the compiler So, um, yeah, if you're interested in programming languages and how these things relate to each other and stuff then definitely Go to wikipedia and search for the esoteric languages because those are really really fun. Um, there's also a Loll talk language. So that's that's all based on Loll talk, so but that's the way that it is. So good, um, any more questions i'm happy to Talk to you guys about anything And uh, if that's not the case, I have to wait 30 seconds right on youtube At least so that I know that there's no questions. I could just ask people if you have no questions and write no questions in chat that C++ is c on steroids C++ is actually c because when you write c++ You're also writing c right because c is valid c++ So a c program is also a c++ program a c++ program is not a c program So it's it's the one is a super set of the other. Um Which means that c++ inherits a lot of the baggage of of of c but Every language has its own advantages and own disadvantages Good, then thanks guys for being here staying until the end. Uh Pizza for dinner, um, I don't know. I don't know. I have to ask my moderator if we have pizza, but, uh Might be might be normally, uh, donner's talk is a pizza talk. So, uh, that uh That's kind of the rule My house Good, if there's no more questions, then Again, thanks so much for being here. 
Like the of course wouldn't be fun without you guys Thank you for asking questions and not being ashamed of asking questions, which you should never be Um, and I will let you know how we will do it next week and how it will work. Um, I'm definitely going to record it. Um, because With the whole covet thing if people Have to stay home or do isolation or whatever then, um, I think this is one of the ways that people can just continue following the course um, and uh With that, um, thanks so much and I will see you guys next week For sure. So I don't have a button for that I'm going to my finished screen. So Pizza tag. Yeah, thanks. Yeah. No, thank you guys for being there without you. It would not be uh, would not be as fun Good. All right, then See you on the flip side or at least next week