This is a love story. I love mathematics, and I'm doing a PhD in algebra. This entitles me to lecture you on statistics. One of the things that I think is highly related to innovation is statistics: analyzing the numbers and finding patterns in places. Today I'm going to show you a bit of R. R is an open source statistical language and environment, and it's based around a console concept: you type commands and you get results out.
So first I'm just going to draw some data in. I'm going to read in a count of source lines of code in the Linux 2.6 mainline kernel. The data ranges from 2003, when 2.6.0 was committed, through to September 2009. Just out of interest, I got that from this paper. It's also got lines-of-code counts for various other open source projects, so you might just Google for that and you'll probably find it. It's really simple to read in a CSV. You can also connect to ODBC databases and so on; it's got heaps of different ways of importing data. The other thing I'm going to draw in is the All Ordinaries stock index for the same time period. And again, I just got that data straight from this site here, which serves finance data. So my story is not so much a success story as a cautionary tale, if you will.
So let's just plot some stuff. Here's the number of lines of code in Linux: on the left we have 2.6.0 and on the right we've got whatever version that was. And then we can also have a look at the All Ordinaries. Here is the trade volume versus the closing figure, which is a bit of a mess. But we can merge the two; you can think of this as a SQL join or something, it's just joined on the date column. So if we now want a plot of that, say we take source lines of code along the x-axis and the trade volume along the y, it looks quite pretty. If we want to do something even better, say the close, it's just as simple as that.
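The steps described so far (read two CSVs, join them on the date column, plot one column against another) might look roughly like the sketch below. The talk does not show the actual commands, file names, or column names, so `linux_csv`, `allords_csv`, `sloc`, `close`, and `volume` are all hypothetical, and the numbers are invented placeholders rather than real kernel or market data.

```r
# Hypothetical stand-ins for the two CSV files used in the talk.
# All values below are made up purely for illustration.
linux_csv <- "date,sloc
2003-12-18,100
2004-12-24,120
2005-10-28,150
2006-11-29,170"

allords_csv <- "date,close,volume
2003-12-18,3300,500
2004-12-24,4000,650
2005-10-28,4400,700
2007-01-02,5600,900"

# read.csv() normally takes a file path; text= lets the sketch be self-contained.
linux   <- read.csv(text = linux_csv)
allords <- read.csv(text = allords_csv)

# Join on the shared date column, much like an SQL inner join:
# only dates present in both data frames are kept.
merged <- merge(linux, allords, by = "date")

# When run as a script there is no screen device, so discard the plots.
if (!interactive()) pdf(NULL)

# Source lines of code on the x-axis, trade volume (then close) on the y-axis.
plot(merged$sloc, merged$volume, xlab = "SLOC", ylab = "Trade volume")
plot(merged$sloc, merged$close,  xlab = "SLOC", ylab = "Close")
```

With real files, the two `read.csv(text = ...)` calls would simply become `read.csv("some-file.csv")`.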
Now, to do some actual statistics, the most common thing you do in R is a linear model. So we go model, lm, and just for fun we'll try and predict the stock market. So we've got Linux versus the All Ordinaries. Actually, you know, I'm going to do it this way: close. We're going to model that, and we're going to pump in the source lines of code. Our data set is the merge. Okay. Thanks, Pete. And we'll just run that and see what happens. The summary of the model looks like that. Well, it looks like there's not much going on. So instead, let's just pump that up a bit. That was just a straight line fit, which is obviously crap. But if you keep digging, you'll eventually find a correlation. So if you use an orthogonal polynomial of degree five, you've got significance all the way to the fourth power.
And just to show you what the diagnostic plots look like: first, we've got the residuals, and they look pretty nice. I'm not going to explain any of this. This one's supposed to look like a straight line, and it's very close. This is supposed to show you how heteroscedastic the data is, and it's pretty clean as well. And this just shows us that this one data point's kind of iffy, and the rest is fine. But to show you what this really looks like, I'm going to plot the data, and then I'm going to draw the actual fitted model that we got. You can see you just get the fitted values for the model straight out of the model object. And then we want to give it a color. And there we go.
So in other words, if you want the stock market to recover, you go and find that man called Linus, and you bloat the kernel, because the highest order term is going to blow up. So my message is: do not try to find correlations between random things like the quantity of code in the Linux kernel and the stock market. You could probably throw in all sorts of other variables like the phase of the moon and the tides and astrology, whatever. My warning is you can find patterns everywhere.
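As a rough reconstruction of the modelling steps just described, the sketch below fits a straight line and then a degree-five orthogonal polynomial with `lm()` and `poly()`, shows the standard diagnostic plots, and overlays the fitted values. The exact commands typed in the talk are not recorded, so this is only an approximation; the `merged` data frame here is synthetic (generated with a fixed seed) and its column names `sloc` and `close` are assumptions.

```r
# Synthetic stand-in for the merged data set; NOT real kernel or market data.
set.seed(1)
n <- 60
merged <- data.frame(
  sloc  = seq(5, 11, length.out = n),   # made-up, steadily growing code size
  close = 4500 + 1500 * sin(seq(0, 4, length.out = n)) + rnorm(n, sd = 100)
)

# Straight-line fit: close as a function of source lines of code.
fit_line <- lm(close ~ sloc, data = merged)
summary(fit_line)

# "Pump it up": orthogonal polynomial of degree five in sloc.
fit_poly <- lm(close ~ poly(sloc, 5), data = merged)
summary(fit_poly)

# When run as a script there is no screen device, so discard the plots.
if (!interactive()) pdf(NULL)

# The four default diagnostic plots: residuals vs fitted, normal Q-Q,
# scale-location, and residuals vs leverage.
par(mfrow = c(2, 2))
plot(fit_poly)
par(mfrow = c(1, 1))

# Plot the data, then draw the fitted values taken from the model object.
plot(merged$sloc, merged$close, xlab = "SLOC", ylab = "Close")
lines(merged$sloc, fitted(fit_poly), col = "red")

# Extrapolating beyond the observed range is where the highest-order
# term takes over, which is the joke about bloating the kernel.
predict(fit_poly, newdata = data.frame(sloc = c(12, 15)))
```

Because the degree-five model contains the straight line as a special case, it can never fit the sample worse, which is exactly why adding terms until something looks significant proves very little.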
And really the moral of the story, especially when it comes to innovation, is: question everything.
Very good. It's reasoning like this that gives us one in every 20 scientific papers being bullshit. Thank you. I hope that's not the only lightning talk we're going to have. Anybody want to say anything else? Yeah, why not? So John Cruz is going to talk about being a mentor in the Summer of Code. Fantastic.
So, to follow up: I was stuck in another talk and couldn't quite make it here in time for the main Summer of Code, so I'm not sure what you have or have not heard. Speak up if there's anything odd. But I've been involved with Summer of Code as a mentor since it first ran, and I've been involved every year. The first year, unfortunately, my student failed. But I feel okay about it, because I kept telling him he was going to. I was saying before we started: you need to participate. We haven't seen you in the chat room. Come on into our chat room; we discuss a lot. Get on the mailing list. Speak up on the mailing list. Let us see some of your code. Well, he got the basics in, so we accepted him to begin with. But then he just wasn't getting with the group. I recognized that, I tried to do something about it, and I did what I could. And, you know, among other things, the student had other pressures, and keeping up a living as a student is maybe not as easy over the summer. Certain jobs and other things impacted him more than he thought they would. So I would suggest you get involved, but try to get the students on track as early as possible, and share with other mentors what's going to go wrong.
Now, the next year, I wasn't assigned a student, I don't think, but I stepped up and helped somebody else's student when, as we were approaching the evaluation period, midterm or something, the mentor kind of disappeared. The student got worried. Well, we had several people signed up as mentors who didn't end up with students, so we stepped up and helped.
So having more mentors than students can be a good thing for a project. And then the same thing happened to me. In a different year, I was the mentor assigned to a student and I missed things, because I went to Germany over the weekend to participate in a conference, and I had misread the schedule and thought one deadline date was a different deadline date, and I was all confused. But one of the other mentors had my back and got the midterm in, so the student didn't have problems.
And so, especially here in open source land, you've got a lot of projects here at this conference. You've got a lot of people who could mentor. If you've been working in software, not even professionally, if you've just been doing it for over a year, you can help some student. You know something that can help them. And as we found out, there are certain things. First of all, get them involved in your project. Get them into your culture. Each project is a little different; let them know that. Help them fit in. Help them communicate and participate. And then when they have problems, they'll be able to bring them up to you, and you'll be able to catch them before they run off the cliff by themselves. Not all the time, but most of the time. You know, maybe eight or nine times out of ten, you'll be able to help somebody, but sometimes somebody just has to learn the hard way and you can't do anything for them. Possibly they'll learn and come back next year. Possibly they won't. Maybe they'll take it somewhere else. But, you know, if you step up and help, that does make a difference.
And then I've also found in all this, in Inkscape's participation, because we've been in every year since it started, that we have several contributors who became regulars. A lot of people will just show up for Summer of Code, get a feel for it, and move on somewhere else. Some of those end up contributing to other projects.
Some stayed with Inkscape for a year or two and then moved on. Some of our students ended up being mentors themselves later on. And some go back and forth a little; it depends where they are and what they're doing. But overall, you can give some people a boost, let them know what it's going to be like, and help them succeed.
Oh, and among other things, if you haven't really managed anyone in software: when you start to see these proposals coming in, look at them carefully. Make sure you force them to change their proposal so they can succeed. You know, you don't want something like, oh, I'm going to rewrite the Linux kernel from scratch so that it will run 10% faster. Not going to happen. So bring down the expectation. Make sure it's phrased in a way that gives them something they can definitely achieve, then some goals that they might be able to make, just in case, and then some way-out goals that they probably won't be able to get to. But in that odd case, sometimes you get someone who can do that. So, with that there to shoot for, they might reach it. But if not, your only failure is in not trying. But make sure that, as long as they try, they'll hit some level of success and the project can be completed. Then the next step, maybe it could go in, maybe not. We don't know. But it can really, really benefit both you and them, and the whole project too, all at the same time. So, thank you.