Dear fellow scientist, you may not have heard of me before, but my name is Matthew Robert Mongeau and I am writing to you to talk about the language of the machine. So many layers have been abstracted away from us when talking about the language of the machine. All those zeros and ones seem very innocuous, but in between them lies the very brink of insanity. It is for this reason that a colleague of mine and I set out to write a new language, and we called that language Dagon. Dagon represented the frustrations that we had with the current programming languages. We would see code in the wild designed in ways that Ruby allowed, and we decided to get rid of that. Enticed by this mermaid's call, we set off into the abyss. I would like to share with you the trials and tribulations that we underwent in order to create Dagon and to learn from these experiences.
If you are to write a programming language, you must design first and program later. Otherwise, you will create an unimaginable abomination. When you set out to create your programming language, you must choose a progenitor, a language to start from, and you may be tempted to start with C, but C is a fiend unto itself. The problems you run into with C are sure to test the very brink of your sanity and imagination. One such thing you might encounter is memory management. Memory management, the allocations and the deallocations, is nothing more than a primeval force to bring upon never-ending nightmares. Another danger of starting with C is dealing with pointers. Pointers, unimaginable monstrosities hulking in the darkness, waiting to haunt your nightmares. The problem with this is that if you take this course, your language will perish, and the death of languages is all too common. So, dear scientist, this is my suggestion to you. Use Ruby.
Ruby has so many tools at its disposal that make it easier to create a language that you now get the opportunity to work towards your goals and to validate your assumptions before your language fades into the darkness. But if you are to write a programming language, you must remember a few things. You want to take small steps. Each step towards the creation of your language will push it forward and birth it into this world. I leave the rest up to you, dear reader, to decide how to undertake your language. Sincerely, Matthew Mongeau.
So this talk has an interesting premise in that I am trying to convince a room of people to write programming languages, and I guess what I have to start with is why. Why should you write a programming language? This is a huge sell. So the reason that you should write a programming language in the first place is because you are going to learn a lot more about the programming languages you use. And in many ways, programming to me is a form of art. Writing your own programming language is the best way to express yourself in terms of how you program and what you do.
Now to start, I actually just want to get an idea of my audience and what they are into. How many of you have tried to write a programming language before? Okay. How many of you have heard of lexing before? Oh, quite a few of you. Okay. I assume if you have heard of lexing, you have heard of parsing. So there is that.
When you are writing your programming language, and I will go over this in a cursory manner, you should definitely start off with lexing. It is a fairly easy process. Now if you have not heard of lexing, the analogy in language itself, the English language, is to take a sentence of words, break it up into the individual parts, and label their meaning. So you will label some words noun, some words verb, some words adjective. There is a parallel to this in programming.
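To make that labelling step concrete, here is a minimal hand-written sketch in Ruby using the standard library's StringScanner. The token names and patterns are hypothetical and chosen only for illustration; they are not Dagon's real rules, and a generated lexer would look different.

```ruby
require 'strscan'

# Hypothetical token rules for illustration only (not Dagon's rules).
# Order matters: keywords are tried before general identifiers.
TOKEN_RULES = [
  [:KEYWORD,    /(?:if|else|while)\b/],
  [:NUMBER,     /\d+/],
  [:IDENTIFIER, /[a-z_]\w*/],
  [:OPERATOR,   %r{[-+*/=]}]
].freeze

# Break source text into [type, text] pairs, skipping whitespace.
def lex(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    next if scanner.skip(/\s+/)
    rule = TOKEN_RULES.find { |_type, pattern| scanner.match?(pattern) }
    raise ArgumentError, "unexpected input at position #{scanner.pos}" unless rule
    type, pattern = rule
    tokens << [type, scanner.scan(pattern)]
  end
  tokens
end
```

Calling `lex("while x = 10 + y")` labels `while` as a keyword, `x` and `y` as identifiers, `10` as a number, and `=` and `+` as operators, the same way you would label nouns and verbs in a sentence.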
We are going to name all of the parts of our language and give them identifiers. This is a number. This is a keyword. This is an operator. And that process is not too difficult.
But then you get to the part that I think is difficult, and the most interesting, which is parsing. This is where you are now taking what you understand about the language, the different pieces, and you are going to put them together in ways to make sentences. And these sentences are essentially how you are going to form your language. And this is really difficult because of ambiguity. Ambiguity is horrible. And what I mean by that is you can write your programming language in a way that it will look fine to you, but the computer won't know what you are trying to say. It won't be able to understand what you are trying to do. And you have to teach it. So this is a very difficult process and I will show an example of that later on.
But if you make it past this step, you are pretty much out of the woods. All you have to do now is evaluate your code. And you have essentially got a programming language. And I am kind of summing this up into a small package because it really is this easy. When I set out to write a programming language, my colleague Caleb over there and I pretty much sat down in a single afternoon and we had something that worked. And then you just build from there. And the building experience is immensely rewarding. I don't think there is anything else that I have programmed in Ruby that, when I succeeded, I felt this good about.
Now, if you are going to undertake this, I plan to arm you with some tools to make this easier for you. If you are trying to lex, here are the tools that I suggest: either Rexical or Ragel. And Aaron, wherever he is, Rexical is one of his. And the reason I suggest Ragel is that it basically allows you to define what your lexer looks like. But it has different compilation targets. So it can compile to Ruby.
It can compile to C, Java, Go, Objective-C, Objective-C++ and a bunch of other things. And this is pretty useful because your language probably isn't going to stay in Ruby if you actually want it to become anything later on. Ruby is really slow. So if you are writing a language on top of Ruby, it is going to be very slow. But the value in writing in Ruby is that you are going to be able to validate what you assume about your language much faster. You are going to get to the point of saying, this is the part of the language that I think is really interesting, and I like it, and I want to write this. So I used Ragel in the hopes that maybe one day I would transition to C and then it could be fast again. But who knows when that day will come.
Now the next step is parsing, and the tool I suggest there is Racc. That is RACC, not RACK. And this basically allows you to define your grammar, which is the parsing step. I will show examples of these things after this so you can see what they look like. But these tools together make it really easy. And the nice thing about Racc is that it is modeled after Yacc and Bison. Bison is modeled after Yacc. And these are C tools. So later on if you do decide, hey, I am going to rewrite in C, you can kind of just change the format of your parser and it potentially will work to some degree.
So I actually want to show examples of the language Dagon that I wrote that utilized these tools. So this is roughly what Ragel will look like. I should mirror my screen because this is impossible to see. Yeah, what? You say F1? Yeah, I am going to keep going. Thanks, Caleb. You are the worst. So this is essentially what Ragel looks like. And in it you just define what each piece of your language looks like. This is the first step. You are going to say this is what a plus operator looks like and this is what a minus operator looks like. There are some operators.
And if you are looking at this and you are a little bit confused because there are spaces around the plus operator, this is deliberate. Dagon was designed with the idea that the language is your style guide. And so there is only one way to do anything in the language. There is no ambiguity. So addition is space plus space, not just plus. So we define these things and we build up essentially what is our lexer. And it is going to take code that looks like this. This is an example Dagon program. It is going to take this and it is going to break it up into the individual pieces. Like so. So here you can see we have labeled each part of the language. We have said that greeter is a constant. And then there is a colon and so forth and so forth. And this is all you have to do for lexing. This process is not very difficult to get these sets of data. But this format, an array with two elements in it, is particular to Racc.
And when you get into Racc, you are going to write basically a grammar that looks like this. You break down each thing, and I decided to look at class definition because that is an example we had before. A class definition is a constant followed by a colon followed by a block of code. And so forth. It keeps getting deeper and deeper and deeper.
So I mentioned before that when you do this, this is the hard part because you run into ambiguities. And they look like this. This is called a shift/reduce conflict. How many of you know what a shift/reduce conflict is? All right. Barely anybody. This is the best. Okay. So when you have a grammar and it is working through each of the words, it is going to look at each one individually and then it is going to decide what to do. It is either going to shift that token on and read the next token, or it is going to try to reduce a certain number of tokens on the stack into something it already knows about. So an example would be I had an indent, some code, and a dedent, that is a block.
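The shift and reduce steps can be sketched by hand for that indent, code, dedent example. This is a deliberately naive loop that reduces whenever the top of the stack matches a rule and shifts otherwise; it is not the table-driven LALR parser that Racc generates, and the rule names here are hypothetical rather than taken from Dagon's grammar.

```ruby
# Hypothetical rules, longest pattern first: [tokens on the stack, what they reduce to].
RULES = [
  [[:INDENT, :statements, :DEDENT], :block],
  [[:statements, :CODE],            :statements],
  [[:CODE],                         :statements]
].freeze

# Returns the final stack and a trace of every shift and reduce taken.
def parse(tokens)
  stack = []
  trace = []
  input = tokens.dup
  loop do
    rule = RULES.find { |pattern, _| stack.last(pattern.size) == pattern }
    if rule
      pattern, result = rule
      stack.pop(pattern.size)   # remove the matched tokens...
      stack.push(result)        # ...and replace them with the known thing
      trace << [:reduce, result]
    elsif input.any?
      stack.push(input.shift)   # read the next token onto the stack
      trace << [:shift, stack.last]
    else
      break
    end
  end
  [stack, trace]
end
```

Running `parse([:INDENT, :CODE, :CODE, :DEDENT])` ends with a stack of `[:block]`. A conflict is the situation where a real parser generator finds that both a shift and a reduce are valid at the same point and cannot tell which one you meant; this sketch sidesteps that by always preferring to reduce.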
And when it sees the dedent, it is like, oh, cool. I can reduce that into a block. That is a known thing. So the shift/reduce conflict here is that it didn't know what to do. Does it need to shift the L bracket on, or does it need to reduce using rule 11, which was defined up above somewhere? And when you run into these problems, you have to solve them. And it is a terrible process. I don't envy anybody who has to do this. In this case, the problem was that my lines didn't have a terminator character. And so you essentially could have one line of code followed by another line of code. And that could create conflicts. So tracking that down is really painful. But with practice, you get there and you can start writing actual languages.
So to give another example of the language, I'm going to show everybody's favorite thing to do with a programming language. And that's to write an interpreter. I had to rename this to make this friendly. This is a BF interpreter. How many people know what I mean when I say BF? Okay. There's what? Whitespace? Yes, this is not Whitespace. So, yeah, I can't say the F bomb. But basically the language is brain F. F bomb, yeah. And the idea behind it is that there are eight operators that you have to implement. And once you've implemented those eight operators, you've successfully implemented the language. And the reason why this is important for language designers is because it's technically Turing complete. How many of you know what Turing complete is? Yeah, okay. So if your language is Turing complete, you've basically said that it can compute anything that other languages can compute. So here is my attempt at the brain F bomb interpreter in Dagon. Now, this isn't that interesting, I guess, from the perspective that this isn't Ruby, but Ruby is powering all of this underneath.
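For a sense of how small those eight operators are, here is a sketch of a BF interpreter in plain Ruby. This is not the Dagon program from the slides. It assumes the usual conventions of 8-bit wrapping cells, and it makes its own choice to read 0 when the input runs out; the method name `run_bf` is made up for this example.

```ruby
# Run a BF program and return everything it printed as a string.
def run_bf(program, input = "")
  tape = Hash.new(0)          # unbounded tape; every cell starts at 0
  pointer = 0
  output = String.new
  input_bytes = input.bytes

  # Precompute where each bracket's partner lives.
  jumps = {}
  open_brackets = []
  program.each_char.with_index do |char, index|
    if char == "["
      open_brackets.push(index)
    elsif char == "]"
      start = open_brackets.pop
      raise ArgumentError, "unmatched ] at #{index}" if start.nil?
      jumps[start] = index
      jumps[index] = start
    end
  end
  raise ArgumentError, "unmatched [" unless open_brackets.empty?

  pc = 0
  while pc < program.length
    case program[pc]
    when ">" then pointer += 1
    when "<" then pointer -= 1
    when "+" then tape[pointer] = (tape[pointer] + 1) % 256
    when "-" then tape[pointer] = (tape[pointer] - 1) % 256
    when "." then output << tape[pointer].chr
    when "," then tape[pointer] = input_bytes.shift || 0   # assumption: EOF reads as 0
    when "[" then pc = jumps[pc] if tape[pointer].zero?
    when "]" then pc = jumps[pc] unless tape[pointer].zero?
    end
    pc += 1                   # any other character is ignored as a comment
  end
  output
end
```

For example, `run_bf("++++++++[>++++++++<-]>+.")` builds 65 in the second cell (eight times eight, plus one) and returns `"A"`.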
And I think that's really what I enjoyed about using Ruby to write my language: I was able to do this really quickly, much quicker than I could have done in C. And changing it is really easy.
So because I've had these experiences with this language, I can now apply them to other things. And this is where it gets really neat. Now, because I understand this, I can potentially work on Ruby. How many of you have looked at the parse.y file in Ruby? Okay, it is a nightmare. Okay, I'm going to make this small. This file is over 11,000 lines. And in its defense, it is not just a parser, it is also a lexer. But it is immense, and it would be very difficult for anyone to actually understand all of this. But you know, just having a little bit of experience writing my own language, I understand a lot more about what's going on here. And if I wanted to, say, extend Ruby syntax to have something new, I would be able to do that because I understand this.
How many of you are Rails developers? Okay. So you probably think, this is all meaningless. I will never ever do anything with programming languages in my job. And I thought that too. But you can do some really neat things using these tools. For instance, there's a website that I built where our client had a bunch of data that they needed access to. Tons of data about the real estate market. And the way they had implemented it was a giant form with tons of select boxes and toggles and everything to get to the data that you needed to get to. And it was a very subpar experience. And what we were able to design reduced all of that stuff into a single search box. The idea being you would express what you were looking for in terms of English and it would find it. So I could say properties in Boston over 10k sold within the last three months and it would return that. And the way I designed it was I treated each sentence like a programming language. It pulled out the parts, it put them together and created meaning.
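As a rough idea of what treating a sentence like a programming language can look like, here is a small Ruby sketch that pulls the parts out of that example query and assembles them into a hash of filters. Everything in it is an assumption made for illustration: the phrase patterns, the filter names, reading "10k" as 10,000, and the `parse_query` method itself. It is not the code behind the real site.

```ruby
require 'strscan'

# Hypothetical mapping from spelled-out counts to integers.
NUMBER_WORDS = { "one" => 1, "two" => 2, "three" => 3, "six" => 6, "twelve" => 12 }.freeze

# Recognise each phrase in turn and record what it means as a filter.
def parse_query(text)
  scanner = StringScanner.new(text.strip)
  filters = {}
  until scanner.eos?
    if scanner.scan(/properties\s*/)
      filters[:kind] = :property
    elsif scanner.scan(/in\s+([A-Z][a-z]+)\s*/)
      filters[:city] = scanner[1]
    elsif scanner.scan(/over\s+(\d+)(k?)\s*/)
      multiplier = scanner[2] == "k" ? 1_000 : 1
      filters[:min_price] = scanner[1].to_i * multiplier
    elsif scanner.scan(/sold within the last\s+(\w+)\s+months?\s*/)
      count = scanner[1]
      filters[:sold_within_months] = NUMBER_WORDS.fetch(count) { Integer(count) }
    else
      raise ArgumentError, "cannot understand: #{scanner.rest}"
    end
  end
  filters
end
```

With these assumed rules, `parse_query("properties in Boston over 10k sold within the last three months")` yields a city of `"Boston"`, a minimum price of `10000`, and a three-month sale window, which an application could then turn into a database query.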
And this experience was really fantastic for finding the data that they were trying to find before. It simplified what was 10 minutes of filling out a form into 10 seconds of filling out a single search field. So even if you don't think that you will ever write a programming language for your job, there are some opportunities that can be presented by taking this approach and by using these tools. So at the very least I hope that I can convince you to try. To try a little bit and see how far you can get. That is it. I actually thought my talk was going to be right before lunch and so now I got moved. This was going to be my final slide.