Super brief intro: my talk is called Life on My Home Planet. If you read the description and you happen to be particularly cynical, you might have thought, with a talk title and description like this, Giles could be talking about anything, and he might in fact have written the description before he decided what he was going to talk about. Unfortunately, that would be exactly correct. Fortunately, however, the weirdest, most random thing in the talk description is that I said one of the questions I'm going to talk about is: do you need a goat if you're trying to get to the moon? And that random, crazy part is actually the part I'm serious about, the part where I knew what I was going to say. So I'm going to talk to you a little bit about riding a goat to the moon.
Before I do that, I want to explain where it comes from. So this is Mel Gibson and Danny Glover. The concept of riding a goat to the moon actually came up in a Twitter discussion about Mel Gibson, where someone was saying, Mel Gibson's insane, right? And there are a lot of people who agree with this point of view, and, you know, there are many reasons. He has these allegations against him of homophobia, racism, sexism, anti-Semitism, domestic violence, and general lunacy. However, I had the counter-argument that you can actually admire Mel Gibson anyway, because he put together this hit movie which was in Mayan, and he put together this hit movie, The Passion of the Christ, which was in Aramaic, Latin, and Hebrew. Now, making a movie is hard. Making a hit movie is hard. Making a hit movie using languages that are no longer spoken anywhere on the face of the planet is just crazy. It's ridiculous. That's like saying it's not enough that I'm going to fly to the moon; I'm also going to do it riding on the back of a goat, for no reason. So the thing is, Mel Gibson might be crazy. I'm certainly not saying Mel Gibson is not crazy.
I'm not saying he is either, but I'm saying, whether he's crazy or not, you've got to admire the fact that he makes these incredible things happen. And this reminds me of a really good book called Purple Cow. The basic idea of this book is in the tagline, "Transform your business by being remarkable." Basically, the idea of the book is that the only marketing worth a damn is being worth talking about.
And I read this book at about the same time that I read this book: My Job Went to India (And All I Got Was This Lousy Book), by Chad Fowler. I haven't read the new edition, but he rewrote the book and republished it under the title The Passionate Programmer. So if you're like, wow, this is a good talk, I should check out this book, it's actually easier to find this version, because that version is no longer in print. I don't actually get along with this guy in person, but this is a really good book. Some of you know exactly how much I don't get along with him, but it really is a very good book. And it's about how one of the best ways to have a good career as a programmer is to create good stuff that people like.
So I'm going to talk about my goats, which are basically based on a combination of the Chad Fowler thing, which is that if you want to have a good career, create cool stuff, and the Purple Cow thing, which is that you should create really, really interesting stuff. So, my goats.
First goat is Archaeopteryx, which I sometimes just refer to as Arx, because people can figure out how to spell that. Here is a quick demo, if I can find it. This is some music software, right, and this is what we were sound checking earlier. It's a drum machine, basically. So let's see. I'm going to have to do this while looking behind me. Basically, this is what you do. You run it, and you get drum and bass. What you can do is actually change the code. You make it more interesting by changing this to a random number. And you get much sparser, but much more varied, rhythms.
So that's how it works. And let me just come in here. Thank you. Thank you very much. So that's what it is, and the real name is Archaeopteryx. And this software drum machine is actually based on a very, very old model of drum machines. I don't mean an old model as in a model car, but an old conceptual model. This was created in the late 70s, and people are still creating software drum machines based on the same idea.
Let me show you a real basic intro to how this works. If I come back in here and say I'm going to create a really simple drum beat, I'll start here with the bass drum. So you say, I want it to play on the one, and I want it to play on the nine. And that's out of 16 beats, right. So it goes like that. And say you want to have the snare drum play on the five and the 13. So that goes like that. Or you could shake it up a bit like this. And then you'd also want a hi-hat in there. Here's a hi-hat. So you could do like this for the hi-hats. So that's how it works.
Now, the reason I showed you that, in case you're wondering, is to clarify how this model works. This grid at the bottom represents the beats for a particular drum. So when you create this thing, you're adding up these different rows. What you're actually doing is setting up a matrix of three drums by 16 beats. The way this actually works is it has 10 drums, so you're actually creating a matrix of 10 drums by 16 beats. And obviously these are on-off switches, so this is a matrix of booleans. But what my code basically does is change it to a matrix of floats. So here, instead of just yes and no, you can say yes, no, and maybe. And then if you have a random number generator which gives you a number between zero and one, these represent the percentage probabilities: 0.0, 0.5 and 1.0 are effectively equivalent to 0%, 50% and 100%.
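As a rough sketch of the float-matrix idea described above (this is not the Archaeopteryx source; `STEPS`, `row`, `GRID`, `hits_on`, and `measure` are names made up for illustration, and a hash of rows is used here only for readability), each drum gets a row of 16 probabilities, and a random number decides each step:

```ruby
# Hypothetical sketch of a probability grid for a step sequencer.
# None of these names come from Archaeopteryx itself.
STEPS = 16

# Build one row of STEPS floats from a { step_index => probability } hash.
# Steps that are not mentioned get 0.0, meaning "never play".
def row(hits)
  Array.new(STEPS) { |i| hits.fetch(i, 0.0) }
end

# Beats are numbered from 1 in the talk, so "the one and the nine"
# become indices 0 and 8 here.
GRID = {
  bass:  row({ 0 => 1.0, 8 => 1.0 }),   # always on the 1 and the 9
  snare: row({ 4 => 1.0, 12 => 1.0 }),  # always on the 5 and the 13
  hihat: Array.new(STEPS, 0.5)          # "maybe" on every step
}.freeze

# Return the drums that fire on a given step.
# rng.rand yields a float in [0, 1), so 1.0 always fires and 0.0 never does.
def hits_on(step, grid, rng = Random.new)
  grid.select { |_drum, probabilities| rng.rand < probabilities[step] }.keys
end

# Roll the dice for every step of one measure.
def measure(grid, rng = Random.new)
  (0...STEPS).map { |step| hits_on(step, grid, rng) }
end
```

Calling `measure(GRID)` twice will usually give two different hi-hat patterns over the same fixed bass and snare, which is the "yes, no, and maybe" behavior the talk describes.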
So the way Archaeopteryx handles that, I don't know if you can see this with the colors, but it represents it as an array of arrays, because it's a two-dimensional matrix. So an array of arrays is a very intuitive way to do that. And these probabilities are all floats that describe the percentage. So for one of the hi-hats, I want it often to play on, these are eighth-note pairs, I'm sorry, 16th-note pairs. And there are eight of them in a measure of 16 beats. So 85% on the first half, 35% on the second half, means that you're very likely to hear it like that. And then the little ones in between sometimes happen.
So you create this array of arrays, and then you go through the MIDI note numbers. MIDI is basically just a way that you can assign note numbers to particular sounds in your software. And you create drum objects in Archaeopteryx, each of which is basically just a thing that sends a MIDI message to the music software. And you populate it with these. What I'm doing here with the L is basically lambda, which gives you a closure, and it really just allows you to use Ruby as if it were JavaScript, because you can now add a method and put it anywhere you want on the object, because these are just attr_accessors, basically. And then probabilities gets the probabilities for the MIDI note number. So each drum now knows what its probabilities are. So that's just a way to assign this grid of floats.
Now I want to talk for a second about porn. There was this horrible thing that happened in the Ruby community where, a while ago, someone showed a bunch of porn slides in a talk. And there was this controversy, and the guy was like, well, you know, porn doesn't necessarily demean women. And nobody seemed to explain to him that we all knew that, and that the objection was not that porn in theory demeans women, but that the particular porn he showed did.
And I feel as if I have some culpability there, because I have in the past shown pictures of good-looking women in the middle of talks for no reason other than the fact that people, when you see them in images, cause your brain to pay attention. And I frequently exploit that, as well as images of food, images of terrifying animals, and so on and so forth. So I did throw a good-looking woman into the presentation for no reason, and I want to point out she is in fact a scientist who was published in academic journals when she was still in high school. And I also threw in some George Clooney for the ladies.
But to further demonstrate that porn does not necessarily demean women, I'm going to show you some porn which does not. This porn only demeans camels, right? There are no women demeaned by this porn. This porn is demeaning to rhinos and elephants, but not human women. This porn is also demeaning to rhinos and elephants, unfortunately. This porn is demeaning to everyone who sees it, which means it is demeaning to women, but it's demeaning to women and men equally, which I think is important. Now, to get you over the trauma of what you've just witnessed, here's a puppy. Thank you. This is more cuteness, although this is a bunny, so it's kind of horrible, because something that's adorable could still grow up to molest a chicken.
Anyway, so we created a grid of floats, right? And we populate it with these drum objects. We then hand it off to this thing called generate beats, which actually includes a recursive lambda, which is much more prosaic than it probably sounds. So, basically, here's the code. You've got this method and it's called generate beats. Later on, it passes something to the MIDI timer's at, which just says: later on, invoke some code. And the code that it says to invoke is this here, a lambda that calls generate beats. It basically just schedules itself for later replay.
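The self-scheduling pattern described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Archaeopteryx code: `ToyTimer` is a made-up stand-in for a MIDI timer that runs callbacks in time order without actually waiting, and `measures_left` is a stop condition added so the sketch terminates.

```ruby
# Hypothetical stand-in for a MIDI timer: it stores callbacks by time
# and runs them in order, without really sleeping.
class ToyTimer
  def initialize
    @queue = []
  end

  # Ask for `callable` to be invoked at `time`.
  def at(time, callable)
    @queue << [time, callable]
  end

  # Run queued callbacks, earliest first, until none are left.
  def run
    until @queue.empty?
      @queue.sort_by!(&:first)
      time, callable = @queue.shift
      callable.call(time)
    end
  end
end

timer = ToyTimer.new
played = []
measures_left = 3 # assumption: stop after three measures so the demo ends

# The lambda refers to itself through the local variable it is assigned to,
# so each call can schedule the next one.
generate_beats = lambda do |start|
  played << start # stand-in for sending one measure of MIDI notes
  measures_left -= 1
  timer.at(start + 1, generate_beats) if measures_left > 0
end

timer.at(0, generate_beats)
timer.run
```

After `timer.run`, `played` holds the start times of the three measures, `[0, 1, 2]`. A real timer would fire the callbacks against a clock instead of draining a queue.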
I'm hiding something off-screen in this tidy code, because if you look here, you can see that this block of code actually includes a little bit else. And that little bit else is evil timer offset WTF. I'll talk about that later. Anyway, this code is on GitHub.
I want to talk also about another goat. Hopefully I've got time. This goat is named Wheatley, named after a character from a video game called Portal 2, where you play this character and you eventually face Wheatley. Sorry, Wheatley. It's too late now. Hey, guess what else? The cake is a lie. Here he is in the part where he kills you. I actually used Wheatley as the name because the character is an AI, but he's also an idiot. And I like this idea of a robot idiot. And that's basically what Wheatley, the programming thing, is.
So, Wheatley started life as Towelie. Towelie was a library created in 2008. Towelie keeps your code DRY. What it does is analyze legacy code. It goes through Ruby code, and it isolates duplicate methods. It can also do very, very primitive repetition detection. But it is extremely primitive and extremely slow.
Then, just this last year, I think 2011? No, last year, 2010. I got really, really annoyed at these ridiculous blog comment things that interpret any tweet that mentions your blog post as a comment on that blog post, and spam your blog with all these ridiculous tweets that are just retweet, retweet, retweet. And I created a similarity detector for blog comments in JavaScript. And I was like, oh, this is really fast. So I was like, okay, cool. I just kind of discovered it by accident. So I discovered how to make Towelie fast. But I didn't want to rewrite Towelie, because it was an old project. I wanted to start over. And one of the reasons is I didn't want to analyze Ruby at that time. I had a bunch of JavaScript I had to deal with. So Wheatley is a library for automating JavaScript repair.
And if you feed it two pieces of JavaScript which are equal, right, these are function foo return true and function foo return true, it can say, okay, the parse trees of these two functions are identical. And it can give you that parse tree, which is basically what's underneath the language: when Ruby or your JavaScript interpreter goes through your code, it parses it out into a parse tree and then turns that into instructions. And this thing is written in Ruby. It gets the parse tree from a library called Johnson, which ties into SpiderMonkey, which is a JavaScript interpreter written in C. We talked about monkeys earlier. This is a spider monkey riding a goat, which is not what they do all the time, but it's there in case you've forgotten what monkeys look like.
When Wheatley finds that two pieces of JavaScript are identical, it says that they have 100% similarity. Parse tree zero's similarity to parse tree one should equal 100%. And it does. Now, say you've got this, where this has three tokens. No, I'm sorry, four tokens, as a parse tree would understand it: function foo return true, function bar return true. In each of those, there's only one token which is not equal to the tokens in the other one. So therefore, that's 75%. And if you ask Wheatley, it'll say the parse tree's similarity to the other parse tree is 75. It gives you an arbitrary percentage. The code for that is actually incredibly simple. This whole thing's built on arrays. So it just calculates the intersection with the other tree, multiplies it by 100, divides it by the other tree's size, and turns it back into an integer.
Now, you can also get absolute numbers. Going back to purely identical JavaScript here, function foo return true twice, it'll say that the parse tree's token diff with the other parse tree should equal zero.
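The percentage calculation described above (intersection, times 100, divided by the other tree's size, as an integer) can be sketched like this. It is an illustration, not Wheatley's code: `similarity` is a hypothetical name, and each parse tree is assumed to be already flattened into an array of tokens.

```ruby
# Hypothetical sketch: percentage of `other`'s tokens that also appear in `tokens`.
# Array#& is a set-style intersection, so repeated tokens are only counted once;
# a real implementation might need to care about that.
def similarity(tokens, other)
  return(tokens.empty? ? 100 : 0) if other.empty? # avoid dividing by zero
  (tokens & other).size * 100 / other.size        # integer division gives an integer
end

foo = %i[function foo return true]
bar = %i[function bar return true]

similarity(foo, foo) # identical token lists
similarity(foo, bar) # three of the four tokens match
```

With these inputs the two calls return 100 and 75, matching the numbers in the talk.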
And I call it a token diff because when you're dealing with a parse tree, you're dealing with a set of tokens, and you just want to know what the difference is in numbers. So here, obviously, the token diff is one. So when you run the specs, parse tree zero's token diff with parse tree one should equal one. So it can count. And again, the code is actually really, really simple. The code that gets into the JavaScript can be quite hairy, but the actual comparison here is again just arrays against other arrays. So it's really very easy to understand. It's just array subtraction, really.
Now, this is the good shit. You can also extract similar code blocks. Ignore this next picture. If you have this highly similar JavaScript, console log foo and console log bar, you can ask it for echoes. It can tell you that two sections are highly similar to one another. So if you say, show me the parse tree's echoes with a maximum token diff of one, it'll say the first section of the parse tree has this array of echoes, which only contains the second section of the parse tree. And likewise, if you give it something like this, console log foo, bar and baz, it'll say the first section of the parse tree over here, parse tree zero, has this array of echoes, which are parse tree one and parse tree two. And you can do that with an arbitrary number of tokens. I don't remember the percentages version, but it's actually pretty easy to figure out. And I call this echoes because it's like the tree is echoed elsewhere within itself.
Now here's the code for echoes, and you can see it's a bit more complicated than the other methods. That's because once you get in here, you're dealing with some more complicated stuff. I think it might be recursive, but to be honest with you, I forgot how it works. I was in the cafe, putting the presentation together for this, and I looked at that method and I was like, I don't know. So I invite you to find out how my code works and tell me.
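Since the talk leaves the real echoes method unexplained, the following is only one plausible way to get the behavior described above, not a reconstruction of Wheatley's code. `token_diff` and `echoes` are hypothetical names, and each section of the parse tree is assumed to be a flat array of tokens.

```ruby
# Hypothetical sketch: how many tokens in `tokens` are missing from `other`.
# Array#- removes every occurrence of a matching element, so this is a
# set-style count and is not symmetric for lists of different lengths.
def token_diff(tokens, other)
  (tokens - other).size
end

# For each section, list the indices of the other sections whose token diff
# is at most `max_diff`. This is a simple all-pairs comparison.
def echoes(sections, max_diff)
  sections.each_index.map do |i|
    sections.each_index.select do |j|
      i != j && token_diff(sections[i], sections[j]) <= max_diff
    end
  end
end

sections = [
  %i[console log foo],
  %i[console log bar],
  %i[console log baz]
]

echoes(sections, 1)
```

For these three sections the result is `[[1, 2], [0, 2], [0, 1]]`: section zero is echoed by sections one and two, as in the example from the talk.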
You can see it's only like 10, 12 lines. It's actually easy to figure out. It took a while to write, but it's pretty easy to read. Anyway, here's an innocent chipmunk, although he looks like he's getting ready for a date, so he might not be that innocent. And here's a horrified hedgehog. And yes, moving on.
Here is some highly similar JavaScript. So Wheatley can also extract the variant tokens. Let's go back one. Here it has console log foo, console log bar, right? So you can say, what are the variant tokens, and the variant tokens should equal foo and bar. You can also ask for the invariant tokens, which is the tokens in those two parse trees that are identical, and it will give you that as an array.
Now, Wheatley also extends Johnson. Johnson is really cool, but it can only parse JavaScript. It can't reassemble JavaScript, and the parse tree that it gives you cannot actually be edited. So Wheatley re-adds, well, it has a Johnson translator module that gives that functionality to Johnson, which you can then use to actually construct JavaScript parse trees that can be rewritten. Although I have to tell you it's quite hacky, but it works. This enables Wheatley to add function calls, replace literals with variables, and create wrapper functions.
And if we zoom out on this one, Wheatley create wrapper, code, function, should equal refactored, we see this, which is kind of Wheatley's claim to fame. It can do an ultra-simple refactor, namely creating a wrapper function. Here's the code; it says console log foo. Here's the refactored function, which is console log query, and then it calls it with the foo. So it extracts the code, pulls out the literal, and creates a wrapper function. You could use this, for instance, if you had a lot of code which called console.log in production, because various versions of IE will barf and go nuts if you use console.log. So what you do is you run this, and then you fix that wrapper function, but it will automatically create these here.
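The variant and invariant token queries, and the shape of the wrapper refactor, can be sketched as below. These are assumptions for illustration only: the function names are made up, the token helpers work on flat arrays, and `wrap_console_log` is a string-level stand-in, whereas the talk describes Wheatley doing this on an editable parse tree.

```ruby
# Hypothetical sketch: tokens that appear in both sections.
def invariant_tokens(a, b)
  a & b
end

# Hypothetical sketch: tokens that appear in only one of the two sections.
def variant_tokens(a, b)
  (a - b) + (b - a)
end

# String-level stand-in for the wrapper refactor: define a wrapper function
# that takes the literal as a parameter, then point each call site at it.
# A parse-tree version would not be fooled by strings or comments that
# happen to contain "console.log(".
def wrap_console_log(source, wrapper_name = "log")
  wrapper = "function #{wrapper_name}(query) { console.log(query); }"
  rewritten = source.gsub("console.log(", "#{wrapper_name}(")
  [wrapper, rewritten].join("\n")
end

first  = %i[console log foo]
second = %i[console log bar]

variant_tokens(first, second)   # the tokens that differ: foo and bar
invariant_tokens(first, second) # the tokens in common: console and log

puts wrap_console_log('console.log("foo");')
```

Once every call goes through the wrapper, the IE problem mentioned above only has to be fixed in one place.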
Now, this is a proof-of-concept refactoring; it would actually require a bit more work to use a great deal. The code is on GitHub. A little more about this. I only have eight minutes left, so I'm going to kind of hurry. The code which generates new JavaScript is, as I said, quite involved. And if you're looking to get your GitHub on, this is a place to start. It could use some cleaning up. Also, the variable naming is clearly idiotic, as asdf and qwerty are not good variable names. I have not figured out a way to automate variable naming, although I do have a wild-eyed idea to consider. I also want to look at cosine similarity and simhashing, and if I were to do it over, which, this is the second time I've done it, so I might, I might do it in CoffeeScript instead, using Jison. But all told, I'm actually pretty happy with Wheatley. It's a very good goat.
Although these projects probably look extremely experimental, I actually used parts of Wheatley, and a companion project that I haven't open sourced yet, to find easy refactoring points in a legacy JavaScript code base of over 14,000 lines. In fact, it was 14,551. It would take a very, very long time to go through that kind of code by hand, and reading bad code is depressing. It was also able to auto-generate some very simple code fixes, just a tiny bit. And legacy code is tedious and repetitive. Tedious, repetitive tasks are why we have these boxes in the first place. You know, programmers are hard to hire, but certain types of tedious, repetitive programmers are actually easy to write. So I am firmly of the opinion that hackers should not do repetitive tasks. I believe hackers should write new hackers to do those repetitive tasks for them. It is the circle of life.
Now, earlier I mentioned programmer self-promotion, and in the context of this stuff I came up with this whole thing about how to build a goat. Unfortunately, I didn't time the length of this presentation.
So, although I do go into how to build a goat, I'm not actually sure I'll be able to get through it. I will give you the first core of how it works. Building a project like this is a gamble. It's an experiment. You don't know for sure if it's going to work. There are four things you need to know: when to hold 'em, when to fold 'em, when to walk away, and when to run. There's more to it, but really, I've got six minutes left on this timer. So, say what? All right, all right.
Well, the first thing is to do substantial research, okay? This is a list of books and stuff that I read to learn about music. These are only the good ones that I remember. I've got books waiting on my iPad to be read. I got this one here, which is a music theory PhD thesis on Detroit techno. And this was written by a guy who wrote AI code that can write orchestras. In fact, this is his second thing. His first one would write orchestras, I'm sorry, not write orchestras, it can write classical music, in the style of arbitrary composers. And orchestras refused to play his music in the style of various composers, because they were outraged that he could do that, and they found it sacrilegious. So he wrote this new thing, which does orchestral music in a hybrid style of various composers he likes. And they still refuse to play it. But that's another thing. And then again, here's a whole bunch of reading on AI. I'm going to do another version of this talk, like an hour-long version, and put it on my blog, so I can go into more detail there. The long story short is that if you're going to try and do something genuinely remarkable, you're going to have to do your homework, right?
And correspondingly, you need to choose something you care about. It should be obvious, but to a lot of people it isn't.
The obvious reason is you're going to be doing it a lot, so you need to have something you'll have energy to look at. Consider the idea of rock star programmers, right? It's kind of a silly idea in many ways. But one thing that's kind of cool about rock stars is they have to memorize all this stuff with guitars. So a lot of the time you're rehearsing. The other thing is that in the world of programmers, a rock star is a specialist. Like here, this is a great rock star. This is Jeremy Ashkenas. He created CoffeeScript, Backbone.js, and Underscore.js. This is a specialist in making JavaScript suck less. And obviously it takes a lot of energy.
The other thing is you're going to learn a lot about it. You want it to be the kind of information you will be happy to have inside your brain. This is Alan Perlis, one of the great early opinion-havers about programming. Actually, the folder I filed him under was Code Prophets when I was putting this together. And he said this: "You think you know when you can learn, are more sure when you can write, even more when you can teach, but certain when you can program." And, well, here's a story I'll just skip. But if it is true that you're only certain that you know something when you can program a computer to do it, then to automate something is to study it, and indeed to some degree to master it. So you choose something you care about, because you will master it. Building Archaeopteryx certainly made me a better musician. Writing code which analyzes Ruby made me better at analyzing Ruby. Writing code which analyzes JavaScript made me better at JavaScript. And of course, the meta thing that I'm going into here is AI, which is actually the study of decisions and automating decision making.
Oh, many years ago I grew disillusioned with technology. I learned how to do this. This is video graphics stuff. Basically, I wanted to make film music, and I was just very frustrated with code for a while.
So I learned how to do all this stuff, and parts of it are actually done with JavaScript, as surprising as that may be. And later I actually went and did some animation in Ruby. This is SVG animated in Ruby. It just creates a very long series of SVG documents. And then there's some Java that turns these SVGs into JPEGs. You can just import that series of JPEGs into animation software. All the animation is already done. You can find that on GitHub and on my blog.
Let's see. I got a minute and a half. I don't know how good this is going to be. The next step in building a goat is very important: give yourself permission to fuck it up.
So the iPad was basically designed in 1968 at Xerox PARC by this guy, who is holding up a prototype. It was called the Dynabook. Now, when Steve Jobs announced the iPhone, this guy was right there at the conference. And when Jobs got off the stage, he asked him, you know, what do you think? And he said, make the screen 5 inches by 8 inches and you'll rule the world. So that's basically what he did. The interesting thing is Apple was working on this for a long time. Apple knew about the Dynabook. This was made when Steve Jobs was not at Apple, but the engineers were still very interested in creating a Dynabook. And you can actually look at this. This is a notes application running on an Apple Newton, very, very similar to the one running on the iPhone. And I'm sure there was some die-hard engineer at Apple when the iPhone was made who thought, yeah, we've made a half-assed Dynabook. Awesome, right? And it continues. This is actually a long tradition at Apple. Let's see, let's see, okay. So the iPhone... iPad. Yeah, yeah, yeah. All this stuff.
This is a Jim Cameron quote. The guy, the film director. "If you set your goals ridiculously high and it's a failure, you will fail above everyone else's success." This is the guy who made Avatar and Terminator 2.
So if you look at the iPhone or the iPad as a failed Dynabook, it's still an amazing success as a phone. And this is what people were saying about the iPad, right? Limited IO, no USB, no Flash. Why the iPad will fail. There is a whole genre of "why the iPad will fail." Fuck it up. Do it wrong. This is why you should do things wrong: because if you set out to do something awesome and you do it wrong, it's still gonna be awesome.
And in keeping with this, this code is copied from a Flash tutorial and then melded with code from a Ruby book. I did the Flash tutorial wrong and the Ruby tutorial wrong, and it still turned out pretty cool. This here is what I was talking about before, evil timer offset WTF. Does this look like good code? This is not good code, right? Just because I'm standing up here in front of you, if I have a variable named evil timer offset WTF, that's not good code. And we all know what WTF stands for, but it's still not good to see it in there. And if you're going through the code with a WTF stamp, it's just not how it should be.
There's something similar here. Wheatley, replace literal with variable. The actual code for replace literal with variable has this comment. It says TODO: fucked. I find comments like that in my code from time to time. FIXME: stoned. Not actually informative. Or, you know, def WTF, WTF, comment WTF, and you look at the file that you're editing and it's in the folder WTF, and you're like, what was I doing? But the funny thing is, one of the things that made this habit of cursing in variable names worse with me, and encouraged me in this naughtiness, was working with terrific programmers, because they complain about their own code all the freaking time.
And the funny thing about this code is it may have motivated the creation of this gem. This gem is a much better way of handling that, which I may rebuild Archaeopteryx on top of.
Now again, I went all out promoting Archaeopteryx and went to every conference, because I was like, oh, I'm going to promote myself as a programmer, it's important. And also I was like, wow, this is awesome, everybody needs to see it. Now, the funny thing is that may have inspired the creation of that other gem. Someone might have been like, wow, there's all this attention on this thing, I'm going to look at it. Oh my god, how did you get all this promotion out of such crappy code? That may have happened with that gem I just showed you. I know for a fact it happened with this gem. And that is actually awesome. Ben Bleything saw the code for how Archaeopteryx handles MIDI and thought, no, no, it should be much cleaner. And so he made it cleaner, he rebuilt the foundation, and then I just moved Archaeopteryx over to sit on that foundation. And this is the magic of open source. If you create code and share your code and your enthusiasm, people will respond, and that's a great thing.
Here is a kitten cuddling a lizard. We don't know what they got up to, and we're not looking, but it doesn't matter, because they don't have any pictures of that one. I am now out of time. Should I keep going? Yes, okay.
So Archaeopteryx being fucked did not prevent it from being awesome. So the moral of the story is to give yourself permission to fuck it up. Make things bad, because later on they will become good.
The next step, the last step, is to do it over and over again. This is not what I mean. I created Towelie in 2008; that was, of course, code to enforce, you know, don't repeat yourself. This was Wheatley, 2011, also code to enforce don't repeat yourself. I wrote code which enforces don't repeat yourself, and I'm showing you this slide twice, right? Do repeat yourself. Don't repeat yourself is a great rule for code. It's good for computers. It's not good for humans.
If you actually read up on the human brain and how it works, repeating yourself is key to getting good at stuff. Also, when you repeat yourself, you're going to do it slightly differently. If you think back, in the other version of this slide the colors were inverted. With this slide, we're probably not looking at the first motorcycle he ever built, right? You will come up with variations when you rebuild stuff that you're really interested in. And it's worth doing.
Consider Archaeopteryx. I went to all the conferences in 2008 to talk about it. But I actually built an earlier version in 2006, which ran on a completely different framework of continuations and chaos math. And this is actually a very old email from 2006 about how I was going to show it off at the Albuquerque user group, which I did, and it was built on this, which I don't have time to explain.
So: do substantial research, choose something you care about, give yourself permission to fuck it up, and do it over and over and over and over again. Because I might do another one of these in 2012, and it'll probably be better than the first two.
Also, public service announcement: animals are perverts. This is three cats getting it on at the same time. And compared to all the other perverted animal pictures I showed you, they're coming out ahead of everybody, because they're all the same species. Anywho, you don't even want to see the picture I found of the elephant orgy. There are baby elephants there. If it had been people, it would be illegal to even look at the picture. It was unbelievable. I was horrified.
Anywho, this is who I am, Giles Bowkett. Please google me and follow me on Twitter, gilesgoatboy. I actually chose the name long before I came up with the proposal theme, or whatever, the presentation theme, but obviously if you're interested in goats, I'm all about goats, but not that way. Thank you.
Brighter Planet is a cloud-based computation platform. We provide an API for developers to build complex scientific calculations into their applications pretty easily. One big client we have is MasterCard International, and they have a partnership with us. When users of corporate cards charge things like flights and hotel rooms onto their corporate card, all those pieces of data go into a big database. We go in and, using all those details, calculate the environmental impact and put that information back in the database, so the users of these corporate cards can use it for making energy efficiency adjustments and for corporate reporting.