So, welcome everyone, also the people on Moodle who did not attend the stream, to Data Analysis Using R. We will be doing 13, 14 lectures, and I hope that at the end everyone will be a capable R programmer. I sent out an email last week; I hope everyone was able to find the Moodle course. I tried to register people as they came in. I think we have one or two people who are external, and for them it's more difficult to get access to Moodle, but as far as I know nobody mailed me that they can't log in or that they are missing anything. So I hope everyone was able to find the Moodle course and can see it. I will be uploading the lectures to Moodle tomorrow along with the PowerPoints, and the assignments I will probably upload directly after this lecture, depending on how late we finish. Alright, so I chose Twitch; I like it much more. It gives me a little bit of interaction with people, people can ask questions via the chat, and I think it's much more fun to do it like this than to listen to a pre-recorded lecture that gets recycled every year over and over again. But it's up to you guys. So let's just start the lecture. My name is Danny, and welcome everyone. Today we will first do the general course announcements, then we will have a couple of slides giving a basic introduction to R, and there are assignments for you guys to practice. That is very important, because the only way you are going to learn programming is by really doing it yourself, and that is something I always try to hammer on: you really have to practice programming. It's a skill, and the only way you're going to learn a skill is by doing it. You can listen to me talking and follow the lectures, but if you don't practice, you're never going to be a good programmer in the end.
All right, so the general course announcements. The first one I already did: slides will be made available online via Moodle, the system that we use here at Tahau. Check if you can see the course in Moodle; if not, contact me as fast as possible. Is it just me or is the video or audio delayed? Delayed how? Is it not running in sync, like when I say stuff my lips move after the words? Okay, I don't think I can fix it now without stopping the stream; it's not delayed on my end, it works for me as well. All right, yeah, I can look back later and see if I can fix anything. It's a shame. The reason why I use Twitch is actually because it's a streaming platform, so it's made for people talking, for audio and video. Okay, if it's not a big problem then we will just continue. Good. So yeah, please attend the lectures on the Twitch stream. It makes it more fun for me; having questions and people interacting just makes it better. So feel free to use the chat however you want. You can send private messages to each other, but if you just want to say something, just throw it in chat. The nice thing is that anyone can make an account, and you can pick any account name you like. So you are totally anonymous and I have no way of knowing who you are. It's always fun in a way, because the link between the students who sign up and the username they use on Twitch is not there. So you can show up or not show up, but I do like people showing up because it's more fun. It makes it better for me as well. Hey, Skorita, welcome back, welcome back. So the lectures are supported by practical exercises, so spend some time on them, and if you get stuck, send me an email. And the thing that I wanted to ask you guys is: do we want to have a Zoom meeting every week? Because if, hey, Lydia Russel, welcome to the stream. Because if you want to have an hour-long Zoom meeting every week, then we can set something like that up.
Because then you guys can see each other and have more interaction with each other. The Zoom meeting would really be for the assignments, so that you can work on the assignments and I can help you when you get stuck. And I'm a big fan of pair programming. When we did the course in person, people would sit in groups of two or three and kind of help each other. All right, so we have General Gulak: I think that would be very useful. Me, too. Whoa, that's a difficult username. Chrysopidae. Is that a plant name? I think it's a plant name, right? Like a chrysanthemum kind of thing. Difficult usernames this time. So yeah, if you guys want that, then I will make a little voting thing on Moodle, and then we can vote on the best day and time so that most people can join, because I know that people have other lectures as well. So it's a beneficial insect. All right, I will Google it when we're done with the stream; I'm quite curious to see what kind of insect it is. All right, so that's decided then: we will have a one-hour Zoom meeting somewhere during the week, and I will put that on Moodle. So: Moodle, question, when, Zoom. All right, we will discuss next week when we do it. The assignments today are not that difficult, so it should be okay for anyone to do them on their own. But again, if you have any questions, send me a mail. Getting stuck is good, because you Google around a little bit and try different things, but if you're stuck on a question for more than, like, 30 minutes, then just send me an email. I'm generally very fast at replying to emails. So we will do a Zoom meeting, and I will answer emails as questions come in. All right, so at this point the exam date is still unknown. I just put it in here; it's the first lecture, I know.
People don't really want to think about the exam yet, but the exam will probably be an online exam by written examination. People who followed the bioinformatics course, or who already joined the R course last year, know how it's going to be: just a written exam, you take a photo, you send it in to me, and that's fine. So those are the general course announcements. Does anyone have a question so far, or just a general remark? If not, we continue to the next slide. So, it is important to ask questions. Programming is not something that is easy, especially when you don't have a lot of experience; you generally have a lot of questions. So the important thing is: never, ever stop asking questions. That's why I'm here. I'm here to help you guys with any questions you might have or any problems you might run into. All right, so I have some questions for you guys as well. Let's start with the first one: who followed any statistics course? I just want to get a general idea. Generally this works better in a classroom lecture where people can just raise their hands. But is there anyone? Let me invert the question: who did not follow a statistics course, for whom this is the first contact with statistics? All right, so Alexander had a SAS course. Skrita: me, as in: me, I had a course. I think that's it. The chat is a little bit delayed for me compared to the audio, I think there's like a 30 to 60 second delay. That allows me to bleep myself if I say something stupid, which is quite fun to do, but I hope I don't have to; I actually tagged my stream as family friendly. All right, so a SAS course in Rostock, okay, very good. Yeah, two courses. I thought the question was who had no courses. Had a course. Yeah, well, there are like 30 people viewing, so: had a course, had a course. Good, good, good. All right, so I think everyone has some experience with statistics, which is good.
Yeah, all right, so if I ask you guys to do a t-test, you won't be able to do it by hand on paper: calculating means and standard deviations, calculating the t-value, taking your table book and looking up the critical t-value. Lucas Reiter: it's my first course, her first course with statistics. Okay, well, we will ease into statistics. Second question: programming experience. Anyone who has any Java or C or PHP or whatever kind of programming language experience? Because if you have some programming experience, the first couple of lectures are going to be a little bit repetitive, with things that you already know. But, Schiemannsky-Dienstlich, that's an interesting username: not at all. A little bit of Python, a little bit of R, very good. Anyone else who wants to join in? Not really, only VBA; that's Visual Basic scripting, right? NetLogo, well, that's a programming language that I haven't heard about in a long time. That's the same as VBA. LaTeX and a little R. Okay, so that's interesting. Bioinformatics with you plus a little bit of Python, a little bit of R, okay, good. I have hardly any experience. Yeah, well, you don't have to have experience. We'll start from the basics and we will just build up, but for me it's just enjoyable to see that some people have some experience with R. So if we do a Zoom lecture, and people join and say, well, I finished all the assignments, then we can have a breakout room system where the people who already finished can help the people who are still working on it. Because I do think that you learn the most by working together with your peers, right? By just helping people. I've been programming since I was four years old on the Commodore 64. Dontgoforitall, thank you for following. So yeah, I've been programming since I was four years old.
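By the way, this is a good illustration of where we are heading: the t-test that takes a page of calculation by hand is one line in R. A minimal sketch using R's built-in `sleep` data set (my own example, not course material):

```r
# Welch two-sample t-test: does the drug group differ in extra sleep?
# 'sleep' ships with base R: columns 'extra' (hours) and 'group' (1 or 2)
result <- t.test(extra ~ group, data = sleep)
print(result$statistic)  # the t-value
print(result$p.value)    # the p-value
```

We will cover `t.test` properly in a later lecture; the point here is only that R does the means, standard deviations, and table lookup for you.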
So there are a lot of things that you will run into which for me are really basic. It's not that I can't help you with those things, but sometimes things that I take for granted are really advanced for people who are just starting out. So in that sense, I always like it when people can help each other. Fortran, long ago. Oh, that's such a nice language, I love Fortran. Which version, Fortran 77 or one of the newer ones? I love the old Fortran. It's really, really good, and it's really strong for numerical computing, probably even better than R. Although R also allows you to call Fortran, interestingly. All right, NetLogo, all right, I have no idea anymore. Okay, Skorita found the Moodle again, so that's good. All right, so the main issue is: what do you want to learn? I've prepared lectures to teach you guys the basics of programming, how to make an R package, and all of these things. But if there's anything very specific that you want to learn, then just mail me a suggestion, because there are always, like, two lectures which I leave open. So if you have your own data set, or you are currently working on your master's and you think, oh, I'm going to do field research next semester, so I need to know how to analyze Latin square designs, or I need to know more about regression, then we can always do a lecture about your data set or about things you are interested in. I'm here for you guys. So: showing significance in a scientific diagram would be great to learn. You mean adding the stars and the lines to a box plot, for example, showing that two box plots are significantly different? Is that what you mean, David? That would be interesting. I do that a lot and I actually don't have it in the lectures at all. Yeah, yeah, we can do that. Text mining would be really cool. Okay, I'm going to take some notes then: significance in plots, text mining.
So with text mining, you mean web scraping? Going and spidering stuff off Facebook or something like that? I think I actually have a couple of assignments about text mining, not in this lecture, but in the older version of the course that I used to give at the University of Groningen. So I'm going to take some notes. Text mining is something that we can easily do, and significance in scientific diagrams is also something that is definitely worth a lecture. Yes, or just analyze a text, like documents from an institution, to see which words they use. Okay, okay. A couple of years ago I had people from theology following the course, so I do have a couple of assignments about making co-occurrence plots: imagine that you have the New Testament; how often do the words Jesus and Mary occur in the same sentence? These kinds of things, to work through text. Image recognition and computer vision, that is very, very advanced. I can put it on the list and see if I can find something fun in R. All right: image recognition, text mining, significance in plots, graphical representation of experimental data. Yeah, we will do a lot of graphical representation of data; that's what R is strong at. So we will do a lot of box plots and these kinds of things. All right, so let's go to the next slide. This is my idea of what you should be able to do after you follow this course: format your data for R, load in your data, do statistics on your data, interpret the results, and create nice plots, visualizations which are suitable for publication in scientific journals; and finally, create your own analyses. So for example, imagine that you work hard during your PhD and you make a really nice analysis tool or algorithm; you should be able to share it with other people. Yeah, visualizations are kind of key to R.
So there will be a lot about visualizations and how to make nice plots. These are the four points that I want you guys to be able to do, not by the end of this lecture, but by the end of the lecture series. It's not a very far-away goal, and I understand that not everyone will become a supreme programmer; that's also not the idea of this course. The idea is to have a very good understanding of: how do I format data so that a computer can easily read it, how do I prevent the common mistakes in formatting data, how do I do statistics and interpret the results, and how do I create nice plots that are suitable for publication. So that's my idea of what you guys will be able to do. For analyzing your own data, the idea is that you will be able to do box and line plots, so here you see a nice box and line plot; make box plots and add things like arrows or standard deviations; and do things with histograms, like overlaying different distributions on top of your histogram to see which one fits best. The idea is also that you guys can create heat maps. I actually have a very big preference for heat maps. I like looking at heat maps and I like creating them because they are a really good way to visualize three-dimensional data: a heat map has an X axis, a Y axis, and the color encodes a Z axis, so you can visualize a lot of data in a heat map. I like them a lot, and I tend to prefer visualizing data as heat maps when you can do it. Whoa, what happened? That's not what I wanted. Oh, I didn't put in a contour plot. But contour plots are really nice when you're doing things with land and heights and these kinds of things.
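As a small preview, both of these plot types are a few lines of base R. A sketch using the built-in `volcano` elevation matrix (my own illustration; the lectures may use different data):

```r
# 'volcano' is an 87x61 matrix of terrain heights shipped with base R
# Heat map: rows and columns are grid positions, color encodes height (the Z axis)
heatmap(volcano, Rowv = NA, Colv = NA, scale = "none")

# The same matrix as a contour plot: lines connect points of equal height
contour(volcano)
```

Both functions are in base R, so this runs without installing any packages; we will build up to these plots step by step in the visualization lectures.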
So hey, if you're doing field research and you have fields located at different positions on a mountain, then a contour plot allows you to visualize how the terrain is laid out and how the different heights relate to each other. Yeah, come on. All right, introduction to R. So that was the first section. Now I know a little bit more about what you guys want, and of course, if you have a very specific data set, then do contact me and we can make a lecture around it. I once made a really nice lecture, or at least I liked it a lot, around someone who did field research with a kind of broken Latin square design, and we spent the lecture just going through the data: what was well structured, which things could have been structured better. Together with you guys we can go through the data and play with it, which is the thing R is really good at: playing with your data and showing what's going on. All right, so the overview for today: I'm going to talk a little bit about history, the look and feel of R, and using R as a calculator. I'm going to talk about the type system of R a lot. The type system in R is one of the most complex things there is; even as a professional programmer who has worked with R for the last 14, 15 years, I still get tripped up by it. The types in R are complex, and R does automatic upgrading and downgrading of types (coercion), so you have to know what type of data you're working with. This has a massive impact, especially for statistics: whether something is interpreted as a numeric or as a categorical variable makes a massive difference. I will also talk a little bit about variables and about scripts, because I want you guys to work in a structured way, writing scripts in a certain way so that data and research become reproducible. So, why history in an R course?
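Before we get to that history, here is a small taste of the automatic coercion just mentioned, so you can see why it matters. A minimal sketch you can try in the R console (my own example):

```r
# R silently coerces a vector to the most general type it contains
x <- c(1, 2, 3)
class(x)          # "numeric"

y <- c(1, 2, "3") # one string turns the whole vector into characters
class(y)          # "character"

mean(x)           # 2
# mean(y) warns and returns NA: the numbers are no longer numeric
```

One stray quoted value in a data file is enough to turn a whole column of measurements into text, which is exactly the kind of silent type change that breaks a statistical analysis.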
Well, you have to know where you came from to know where you're going. I love this quote from John of Salisbury. Many people know the phrase "standing on the shoulders of giants", but they never know the full version, so I think it's good to know that John of Salisbury is the man credited with this quote, and I will just read it to you because I love it: "We are like dwarfs on the shoulders of giants, so that we can see more than they, and things at a greater distance. We are carried high and raised up by their giant size." And this is definitely true in programming, because computers have a relatively long history, or on the human time scale a relatively short history, but it is a history you have to keep in mind. A computer is a tool, and you have to know how your tool came to be in order to use it effectively. So we will briefly go through the history, and I always say: if I weren't a bioinformatician, I would probably teach history. I love history, and I think you can learn a lot from it; knowing a lot about history also helps you predict where we will be going in the future and what will be important. So we start 2,400 years before Christ. The original version of this lecture was structured in a kind of Sheldon Cooper style; I dropped that because I want a bit more up front, but the history of computers starts more than 4,000 years ago. Back then, people were already doing mathematics. Computers are built on mathematics, and what people wanted to do was multiply, count, add things up. So the first thing you need for that is a structure which allows you to count things and to work with numbers. So, a basic question to you guys: who still used an abacus when they were in elementary school?
My elementary school taught me how to do multiplication and these kinds of things with an abacus, which is really fun. And, I don't know, I'm getting older and older, so the newer generation might not be very accustomed to abacuses anymore. All right, so Schiemann Kindingslich: plus one, plus one. You used an abacus in elementary school still? Wow, that's already three people, more than I had expected. You can see a photo here of a very basic abacus, which allows you to calculate things, but it also helps you to remember things, right? And that's one of the things that is very integral to computers: it's not only the computation, it's also the memory behind it. All right, so the abacus was more or less the most advanced computing tool we had for around a thousand, thirteen hundred years. At that point, in China, differential gears were developed, and the differential gears are what drive the south-pointing chariot. I think people know what a chariot is: a thing with a rider and horses in front. But the south-pointing chariot is a chariot with a figure that always faces southwards, and the way they did this is using differential gears. Differential gears are an upgrade over basic multiplication and division, because with differential gears you can also do integration. So you can integrate a curve, say, the area under x squared. You can approximate that with repeated multiplication as well, but integration, raising to a power, or taking roots is a big step up. In 200 BC, the Chinese abacus was invented, and if you used an abacus in elementary school, this is the one that you used. It's a more complex abacus: it allows you to do multiplication, division, adding, subtracting, and it also allows you to remember things.
So it has, like, a memory cell where you can remember the answers of previous computations. You can use an abacus more or less like a pocket calculator: it can do all the basic stuff and it's a tool that will help you. Then, around 120 BC, we have something which is interesting to note: the Antikythera mechanism. Here is a sketch of how the thing looked; the real object is just a hunk of metal. It sits in a museum, a big hunk of corroded metal, but using X-rays they figured out how it looked, and it is very similar to the differential gear: there are all kinds of little gears and knobs, and it allows you to do, more or less, navigation. That's the idea behind it, that it would be used to navigate on a ship. And ship navigation is very important, because you need to know where you are on the planet, and since the planet is a big ball and not flat, it's really hard to work out exactly where you are. Around 100 AD we had the astrolabe, which is more or less a development of the Antikythera mechanism. The astrolabe was a very commonly used tool, and I think people in nautical training still get taught the astrolabe nowadays. It's something you point at the sun, or at the North Star, the pole star, and using that you can figure out where you are on the planet when you know which time of day it is; so it needs to be used together with a clock. Then, around 800, we had the first real big breakthrough, not so much in computer technology but in frequency analysis. Before 800 AD, if you were a general commanding your troops, you would just send messages in your own language, unencrypted, and that of course is very dangerous, because once your courier gets caught by the enemy, you run a big risk of the enemy knowing your plans and you running into an ambush.
So Alkindus (Al-Kindi), who was originally from Iraq, wrote about cryptography, how to encrypt messages, but he also invented a way to do the opposite: if you have a whole bunch of encrypted messages, can you use them to figure out what the original messages were? Frequency analysis comes from that, and it's based on the fact that some letters are more common in certain languages. So hey, if a certain symbol is the one that occurs most often in a text, then for the English language you can more or less safely assume that it's the letter E or the letter T, because those are the most common letters in English. And if you rank everything and you have enough messages, then using the frequencies of occurrence you can figure out what the original message was. In 1206, we have the castle clock, a real working clock, which I think still exists in Mosul in Iraq. It's a clock like you would expect, but it was the top of science at that point in time. And then in 1694, we get the first kind of computer-like system, the Leibniz wheel. I put it in since I'm in Germany and we're teaching, well, computer science, so you would want some Germans in there. Gottfried Leibniz was a very famous mathematician, and there is this quote of his which I am just going to read to you. I like it a lot, and it is about why we should use computers, or why we should not do manual computation. What he said is: "It is beneath the dignity of excellent men to waste their time in calculation when any peasant could do the work just as accurately with the aid of a machine."
So his idea was that, at that point in time, if you wanted to compute something, you would sit down with pen and paper and do all the computations by hand. And he invented this Leibniz wheel; you can probably find a figure online of how it looks. It is a very basic calculator, similar in spirit to the abacus, but this one uses gears; it doesn't have a real display, but it does computations for you, so you would spend less time doing all the additions and subtractions by hand. All right, then the real computer history begins with Charles Babbage. Charles Babbage is generally credited with being the inventor of the modern computer, and his design for the first real computer is called the Analytical Engine. Charles Babbage lived from 1791 until 1871, and his invention was this Analytical Engine. Unfortunately, the Analytical Engine is something with cog wheels, and in Babbage's time manufacturing techniques were not good enough to build the machine: the iron wasn't of high enough quality and they didn't have CNC machines, so they were not able to make all of the tiny gears that he needed. However, in 1991, using modern techniques, the London Science Museum built Babbage's Difference Engine, and it turns out that his designs really work. His computer is completely analog, based on gears and cog wheels just like the Leibniz wheel, but the Analytical Engine is a full computer: you can program it, give it input, and it will give you output based on the program you put in. The first computer programmer in the world, who wrote algorithms for this Analytical Engine, is Ada Lovelace. Ada Lovelace is a very famous computer programmer, and she is credited with writing the first algorithm intended for a machine. She wrote algorithms on paper for the Analytical Engine invented by Charles Babbage. And it's an interesting story.
There's a love story there as well between these people, and it's just generally interesting history to learn. But remember: Charles Babbage is the original inventor of the computer; Ada Lovelace is the first computer programmer in the history of mankind. Oh, hello, wow, thank you for following. In 1912, Alan Turing was born, and he invented the Turing machine: a machine which allows you to reason about what you can compute and what you cannot compute. So it gives us a mathematical framework to reason about what is computable and what is not. I have seen The Imitation Game. That's the movie with Benedict Cumberbatch, where he plays Alan Turing. That's okay, the English doesn't matter; I was just wondering, because it's a really good movie about Alan Turing. In general history he is better known for breaking the Enigma machine, the encryption that the Germans used for communication with their U-boats. But in computer science he's known for his contributions, and his main contribution was developing this Turing machine. The Turing machine is a hypothetical machine; it can never be built because it uses an infinite tape, but it allows you to reason about whether you can compute something or not. We know, for example, that you can compute prime numbers, but as prime numbers get bigger and bigger, there are fewer and fewer of them, so finding new primes becomes harder and harder. And the way that it becomes harder falls into a certain problem category, like P or NP-complete, and these problem classes are defined in terms of a Turing machine. All computers in the world nowadays are more or less Turing machines, except for the fact that they do not have infinite storage and infinite memory.
So, three famous people who more or less form the basis of the computers we know nowadays. Yes, Turing patterns are also very cool and interesting regarding the development of plants and animals. An absolute genius, yeah, definitely. And it's a big shame that him being gay was a problem at that point in time, which actually led to him killing himself. You also have one in Berlin, actually: the museum which has the airplane on the roof. They don't have the Z3; I think they have a Z1 or something. But the first real digital computer was built in May 1941. It is called the Z3 (of course, there were also a Z1 and a Z2), and this is generally considered the first fully programmable digital computer in the world. It was invented by Konrad Zuse. The reason why the Germans built this machine is that they were studying wing flutter. Wing flutter is the vibration that wings develop when an airplane goes very, very fast, and they needed to study this because the Germans were creating jet engines at this point in time and wanted to make an airplane that was faster than the speed of sound, which they didn't succeed in. The original Z3 was unfortunately destroyed in 1943 during a bombing run on Berlin. There is a Z1 replica in the German Museum of Technology in Berlin, and there is a Z3 in the Deutsches Museum in Munich. And here you see a nice picture of Konrad Zuse working in his lab. Many people dispute this, in a way.
And of course, I always like talking about Konrad Zuse and highlighting the German advances; Germany was, at the beginning of the Second World War, one of the most advanced nations in the world. However, many people, especially Americans, when you ask them what the first computer in the world was, will answer the ENIAC, the Electronic Numerical Integrator and Computer. It has the word computer in the name, so that's kind of a dead giveaway, but it is not the first computer in the world. So hey, if the exam asks what the first computer in the world was, you have to answer the Z3, not the ENIAC. Many Americans will say ENIAC and insist that they were the first to develop a computer; that's not true, the Germans were first, by about three years. So the ENIAC is an electronic general-purpose computer. It is a lot more advanced than the Z3, and it is Turing complete, which means that anything you can theoretically compute on a Turing machine can theoretically be computed on the ENIAC. It was fully electronic, so no moving parts, no cog wheels anymore, and it was capable of being reprogrammed. And that is something the Z3 did not really have: the Z3 was really hard to reprogram, because it was built for a single purpose, right, just to study wing flutter and nothing more. But the ENIAC could be reprogrammed, and it supported things that we will learn about during this lecture series: for loops, branching (if statements), and it even had subroutines, which are what we call functions nowadays. Of course, the reason the Americans were developing the ENIAC is that they wanted to be able to launch missiles anywhere in the world.
So this computer was used to calculate ballistic tables. If you launch a ballistic missile from America towards Russia, you want to know where it will land, and for that you have to calculate how much fuel you need, how much it weighs, air resistance, these kinds of things. So the ENIAC was built for one really specific reason: for the Americans to be able to launch a rocket and have it come down in Moscow. Interestingly enough, the ENIAC was programmed — not so much maintained, but programmed — by women. Betty Jean Jennings and Frances Bilas were two of the main operators of the ENIAC, and you see them here in the picture. This is something that comes back a lot in computer science. People always think of computer science as a very "beta" field, as we say in Holland — a field dominated by males — and that is true nowadays, but at the beginning of computer science this was not the case. Programming a computer was seen as something very creative, and so it was ascribed more to women, because women were deemed to be more creative and more in touch with the right side of their brain, while men were seen as more analytical. "What a nice purpose", someone says in chat. Well, you have to build the thing for a purpose, right? Many technological advances come from war, and war is a good reason for a government to invest insane amounts of money in something — and launching a nuclear missile and having it land in Moscow is of course a very "lofty" goal. But like I said, women played a massively important role in the beginning of computer science, especially in programming. The computers themselves were generally built by men, because that was considered the more analytical task, but running the programs, writing the programs, and managing the machines was generally considered more creative and was generally done by women. 
All right, a little bit more history — we are almost through. In 1945, John von Neumann came up with his big invention, something called the von Neumann architecture. This is the architecture we see nowadays in all computers. If you look at your computer, it basically consists of four different parts. One part is the input device and one is the output device. The input device can be something like a mouse, a keyboard, a joystick, whatever you want. The output device is a screen, a printer, or anything else you can use to produce output. Furthermore, any computer, for it to be a computer, needs two additional parts: a central processing unit, the CPU, and a memory unit. The memory unit communicates with the CPU, the input device communicates with the CPU, and the CPU communicates with the output device. The central processing unit itself is more or less divided into two parts. One of them is the control unit: the control unit controls which operations are going to be executed. Besides that, there is an arithmetic/logic unit, the ALU, which does the actual computation. An arithmetic logic unit is something that can, for example, do Boolean logic: statements like "true AND false" or "true OR false" are evaluated by the arithmetic logic unit, while the control unit says, well, I am going to execute this function, and then I am going to loop until a certain value is reached. Those are two separate parts, but together they form what is called the CPU, which does the computation. So every machine nowadays is a von Neumann architecture. John von Neumann was a very interesting guy, but this is more or less the beginning of really formalizing what a computer is, and of him making a science out of it. 
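The Boolean statements I just mentioned for the ALU are, by the way, exactly what you will later type into R yourself. A quick taste — this is plain base R:

```r
# Boolean logic, the bread and butter of the arithmetic/logic unit,
# expressed directly in R:
TRUE & FALSE   # logical AND -> FALSE
TRUE | FALSE   # logical OR  -> TRUE
!TRUE          # logical NOT -> FALSE
```

So when you write a condition in R later in this course, you are, in a very literal sense, handing a statement to the ALU to evaluate.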
So instead of just wiring things up together, he said: no, you have to have input, output, memory, and a CPU. All right, let us take a little jump, because computers of course kept getting better and better, but nowadays we are really on a threshold in computer technology. In 1976, Roman Stanisław Ingarden developed something called quantum information theory. In a normal computer things are either one or zero, but quantum information theory builds on the quantum-mechanical idea of superposition, which allows a system to be in multiple states at once. If you look at an atom or an electron, it can be not just in plus or in minus, but in all states in between. In 1985, David Deutsch came up with the universal quantum computer: a quantum computer which can do all computations that a normal computer can do, but which uses quantum mechanical elements to do the computation. In 1999 we have the first use of entanglement in secure communication. Quantum mechanics is already used nowadays to secure communication over the big fiber-optic cables that run through the ocean. To make sure no one is listening in, entanglement is used to secure the communication: if someone listens in, it changes the state of the data, so the data at the other end becomes garbled. If you read something nonsensical, you know someone in the middle is listening to your messages, so you should cut the connection and communicate in a different way. The first real quantum computer — well, it is not a universal quantum computer, it is a quantum annealer, so you cannot use it to break encryption — is the D-Wave One. The D-Wave One is a very, very interesting machine. 
It has 128 qubits. That means that instead of having bits, like a normal computer, which can only be zero or one, it has 128 qubits. Dark Spider Four, thank you for following. These 128 qubits can do computations that a massive normal supercomputer can also do, but that would take the supercomputer much longer — there are some examples on the next slide. In 2012 the Quantum Artificial Intelligence Lab was founded by NASA and Google, built around a D-Wave computer with 512 qubits — so again, just scaling up, making it better and better. Last year D-Wave launched its new D-Wave Advantage, a quantum computer which contains over 5,000 qubits. So it is a computer that allows you to tackle problems like the traveling salesman problem: it does not give you an exact answer, but it gives you a relatively optimal answer, and it does this in the order of minutes, where one of the biggest supercomputers in the world would take weeks or even months to analyze the same problem and come up with a very similar answer. Stepping back to the D-Wave One with its 128 qubits: you see a photo of the core here. This is the core, which has to be cooled to just above absolute zero, and it is there to do discrete optimization. So it is not a computer in the normal sense; it can only do one thing, and that is optimize a problem, and the problem it was made to optimize is protein folding. Protein folding is one of the hardest problems in the world: based just on the primary sequence of a protein, predict how that protein will fold when you put it in water. The original D-Wave One cost around 10 million dollars, which is very similar to the ENIAC: the ENIAC back then cost around $500,000, which in today's money is around 6.1 million. 
So the D-Wave system is one of those steps where, in maybe 50 years, if humanity survives, we will say it was as big a step forward as the ENIAC or the Z3 were. If you want to see how this thing looks: this is a D-Wave Advantage. It is just a big black box, with a massive cooling system to cool the core to more or less absolute zero, and nowadays you can actually get access to these things. Companies like Menten AI use a D-Wave to do protein design and protein folding, but Volkswagen is also using a D-Wave Advantage. If you go to the Volkswagen factory, they have a paint shop there where the cars get spray painted, and they use the machine to optimize the order in which to paint the cars, because people order all kinds of colors, right? If you want to spray cars, it is better to do two black ones in a row than to do black, then white, then black: with black, white, black you have to switch colors twice, while with two black cars and then one white car you only have to switch once. So Volkswagen uses it to optimize its paint-shop scheduling. The advantage of using one of these machines is not that a normal computer could not figure this out; it is that the D-Wave Advantage can do it in minutes, while a normal computer would have to calculate for hours and hours to come up with a very similar answer — and that is why Volkswagen is interested in something like this. Save-On-Foods, a company which does grocery delivery in Canada, uses it to do delivery optimization. They collect all the orders that people put in during the day, and when it is four or five o'clock they start delivering. So they have to do a very quick computation: which addresses do we have to deliver to, and in which order? 
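The paint-shop idea is easy to check for yourself. Here is a tiny sketch in R — the function name `count_switches` is purely illustrative, nothing from D-Wave's or Volkswagen's actual systems — that counts how often the sprayer has to change color for a given painting order:

```r
# Hypothetical helper (illustration only): count how often the sprayer
# must change color for a given painting order.
count_switches <- function(order) {
  # compare each car's color with the color of the next car in line
  sum(head(order, -1) != tail(order, -1))
}

count_switches(c("black", "white", "black"))  # 2 color switches
count_switches(c("black", "black", "white"))  # 1 color switch
```

The quantum annealer tackles the same question at scale: out of all possible orderings of hundreds of cars, find one that keeps this count low.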
Of course this is a very time-sensitive problem, because you cannot have a computer calculate for a week and then come up with a proper delivery schedule. No, you want a reasonably optimal answer in something like half an hour; you cannot wait two weeks. That is why they use the D-Wave Advantage. If you have an Amazon account, or if you are interested in using one of these D-Wave Advantage machines, you can actually get free access, because every researcher who claims to be working on COVID-19 can get free access through the Amazon system to one of these machines. So if you are interested in quantum optimization problems, just say that you are a COVID-19 researcher, get an Amazon account, and you can get a couple of minutes or hours on a machine like this to test your quantum algorithm or your optimization algorithm. Again, this is often about optimizing proteins, optimizing protein folding, and looking at structures. So that is the D-Wave Advantage, and this is of course top of the line. In 10 or 15 years we will look back at the introduction of this D-Wave and think: what were people doing, using just ones and zeros for computation, why did they not use all possible quantum states? But of course it is work in progress. All right, now a very short history of programming languages, because programming languages go hand in hand with computers: you need a programming language to work with a computer, and a computer needs a programming language for you to communicate with it. The basic low-level programming languages are assembly and punch cards. Someone in chat asks how much an Advantage costs. I do not know — you could call D-Wave and ask them. I think it is in the same order of magnitude. 
I think it is again around 10 million, something like that. It should not be much more expensive, because at a certain point, for 10 million you can of course also buy a bunch of general-purpose hardware. If you want, I can look it up for you and put it on the slides next time. But it is one of these steps forward where it is becoming cheaper and cheaper to build these machines, and cheaper and cheaper to maintain them. The ENIAC also cost, in today's money, something like 6.1 million, and nowadays you can buy hardware which is much better than an ENIAC for a fraction of the price: you can buy a smartphone for 100 euros, and a smartphone has many, many more capabilities than any computer system used in the 1980s and 1990s. The expectation is that quantum computers will likewise go down in price and, in something like 40 years, become readily available to the normal consumer — so you would have a machine which is part normal computer and part quantum computer for the things that need to be optimized. Quantum computers will probably also play a big role in artificial intelligence, and in things like speech recognition and video recognition, because that is where they really shine. So, low-level programming languages are more or less assembly. Assembly just tells the CPU: load this address from main memory, put it in the CPU, then take this other address, put it in the CPU, and then add these two numbers together. Punch cards are, in a way, very similar. People probably know punch cards: when you went to the hospital, at least when I was younger, you had a little card with all your personal information encoded on it as little holes. They would feed it into a machine, the machine would know who you were, and some guy would get your medical record from a file cabinet. 
Programming languages come in low-level, mid-level, and high-level. The only real mid-level programming language is C — and I put it in the mid level because a lot of people argue about whether C is a low-level or a high-level language, so I put it in the middle. Developed by Dennis Ritchie and Ken Thompson, C is still the most used programming language out there. Every computer built since the 1970s speaks C, or knows how to interpret C. C is nowadays maintained by a consortium, but it is one of those languages that comes back every time. And it is a very easy language: it only has a couple of operators, but it is a very powerful language — just because it is so limited, you can do a lot with it. Then there are the high-level languages. The difference between a low-level and a high-level language is that in a low-level language you specify what you want the CPU to do, while in a high-level language you specify what you want done and then have something like a compiler translate that into what the computer has to do. High-level languages include Plankalkül, the language Konrad Zuse designed for programming the Z3. You have Autocode, developed by Glennie in the 1950s. Algol, also developed in the late 1950s, is still used a lot. These were all there to program the old computers. Lisp is a very interesting language: it was developed in 1958 and is still a leading language when people do AI. Lisp is very suitable for AI programming, and if you go into artificial intelligence, you will probably be forced to learn Lisp. It is based on some concepts which are not available in other programming languages. And the best programming language to learn? Well, you are here to learn R, and learning R is good. 
But if you want to make money, real money, learn COBOL. COBOL was developed in 1959. It is the most horrible language out there; no programmer ever wants to program in COBOL. But it is used a lot by banks and other financial systems. In the 1960s, people learned COBOL and wrote programs for banks. These banks still exist today, but those people have been retiring over the last 20 years, and because COBOL is such a horrible language, nobody wants to learn it. So if you are a reasonably good COBOL programmer, your salary will be at least six figures, if not seven. If you are in it for the money: do not learn R, learn COBOL. But if you are in it for the science, learn R. So, this is a little graphic that I found. It shows how different programming languages are related to each other — a kind of dendrogram, or flowchart, of the evolution of programming languages. And as you can see, there are a lot of programming languages out there. So why would you want to learn R, and not just switch to COBOL and earn 100,000 euros a year? Well, there are some really good reasons to learn R. It is free — I am originally from Holland, I am Dutch, I love free stuff; that is kind of what we are known for. It is open source. It has built-in statistical computing: the language itself understands what a statistical model is. This is very different from languages like C or PHP or Python, where the language itself is not built around statistics. "There are good reasons", someone says in chat — yes, there are good reasons indeed to learn R. One of the nice things about R is that it has built-in graphics, which some programming languages also do not have: languages like C or C++ can do graphics, but it is not part of the core language. R itself is written in R, FORTRAN, and C. 
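To make the built-in statistics and graphics point concrete, here is a minimal sketch — everything below is base R, no add-on packages, and the data are made up for illustration — that fits and plots a linear model:

```r
# Base R only: simulate data, fit a linear model, plot it.
set.seed(42)                      # make the random numbers reproducible
x <- 1:10
y <- 2 * x + rnorm(10, sd = 0.5)  # a noisy straight line, slope ~2

fit <- lm(y ~ x)   # linear modeling is part of the core language
summary(fit)       # coefficients, standard errors, p-values, R-squared

plot(x, y)         # built-in graphics device
abline(fit)        # add the fitted regression line to the plot
```

In C or Python you would reach for an external library for every one of these steps; in R, `lm()`, `summary()`, and `plot()` ship with the language itself.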
So if you are a C programmer, your code can be used from R; if you are a FORTRAN programmer, you can use that code directly from R. R, FORTRAN, and C are very tightly coupled. Someone in chat asks: is COBOL your backup plan? No, no, definitely not. It is a horrible language. COBOL is so horrible that you can only use 80 characters on a single line, and the first seven are for defining what that line is, so you have maybe 73 characters left. Variables in COBOL end up being called x, xx, y, y1, y2 — there is no expressiveness in the language. What do I mean by horrible? It is just a horrible language. If you look at COBOL code, it looks like an unorganized mess, and if you have to maintain code written by other people, that is no fun. I would rather chew off my arm than have to fix a bug in someone else's COBOL code. And that is what you would be doing: if you work for a bank, they say, well, if you put in these numbers, the wrong number comes out — fix it. And then you have to read through 5,000 lines of COBOL code with no formatting, no comments, no nothing. Anyway, why R? R is operating-system agnostic. That means that if you write code in R, it will run and produce the same answer whether you are on Windows, on Linux, or on another platform — on your iPad it will produce the same thing. One of the reasons I love R is that, as a statistician who does a lot of linear modeling, I can use its built-in linear and nonlinear modeling. As a programmer, I like R a lot because it has built-in testing and the help system is really good. Hey Maggie Yamoin, welcome. So, built-in testing and help: every function that you encounter in R has a help file associated with it, and the examples in the help file are actually test cases for that piece of software. So R forces you to write help. 
It forces you to build testing around your functions. And one of the biggest advantages of R is that there are many, many add-on packages available. I say 4,000-plus here, and that is just in CRAN, the main R repository. Besides that, there are repositories geared entirely towards AI, towards finance, and there are repositories geared towards biology and biological problems. So these are the reasons to use R — or at least the reasons that I came up with. So how does it look and feel? This is how R looked on my Windows 7 machine when I made this screenshot. Here we see the console, and here we see the graphics device, and by typing into the console you can make things show up in the graphics device. I just noticed that I have been streaming for slightly over an hour, so I will stop there.