Okay. Hi everyone, I'm Gregor. I will be talking about writing simple DSLs in Ruby. I work for a company called Kaligo Travel Solutions. We are building different products that are related to traveling and to loyalty currencies. So today I'll be talking about something that is closely related to what I've been working on recently.

When I was preparing this talk, I noticed that there were related talks recently here in Singapore. One of them was at the Ruby Meetup, I think two months ago, and it was about writing a DSL which was very different from the DSLs that I will be showing you, but you could check it as well. The second one was at the Red Dot Ruby Conference last month, and it was a talk about metaprogramming, which is very useful for understanding how you can write DSLs in Ruby.

So first of all, I guess that you know what DSLs are. The abbreviation stands for domain-specific language, and that's where the easy part ends, because understanding what a DSL really is is not so easy. When I show you a language, it's not easy to determine if you can call it a DSL or not.

The first example, which is easy, is Make. It's a tool for building programs, for compiling, for executing some programs, and Make itself parses files that are written in a certain language. This language was created only to be parsed by this program and only to be understood by this program. It's definitely a domain-specific language. Then we've got CSS. I don't think that anyone thinks of it as a domain-specific language, but it is a language, and it has just one domain. It has only one application, right? It is supposed to be parsed by a browser and to format the content that the browser displays. The third example is more complicated. Vim, the text editor, has a built-in language that is called Vimscript, and I'm not sure if it's a DSL or not, and I will explain why in a moment.
Then we've got the last two examples. These are languages kind of specific to Ruby. The first one is Gherkin. It's a language that is understood by a library called Cucumber, which is a testing framework. And the last one: I think all of you know what RSpec is, and that RSpec syntax doesn't really feel like Ruby syntax. So it's also a DSL.

So the first example that I mentioned was Make, and I'm sure that whoever of you has ever written a more complex program in C or C++, something more complex than just one file, knows that it's not easy to compile all these files one by one, link them, etc. So we use the tool called Make, and it reads a file called Makefile. This Makefile is written in a DSL that has certain constructs that a language may have. In this case, we've got rules and macros. So it's a very simple language. It has just two basic constructs. There are no more constructs in this language.

Then Vimscript, which I mentioned is more complex, is actually like a full language. It has variables, it has loops, it has the if conditional, etc. You can write a full program in Vimscript. The thing is that you will not use it anywhere outside of Vim. I haven't seen any example of Vimscript being used for anything other than extending the Vim editor. So I'm not really sure if it's a DSL, because it's a full language. You can write any program in Vimscript. But on the other hand, it has just one application.

Now, DSLs can be either external or internal. And this is a huge difference, because an internal DSL is limited by the language in which it is written. So even though the RSpec code that you are writing doesn't feel like Ruby, because you do not directly define any methods or classes, it still needs to be valid Ruby syntax. And the code that you write in this DSL is simply a part of the program that you are executing.
External DSLs, on the other hand, like for example CSS, are parsed and executed by some external program. So CSS doesn't have to be written in the same language as the program using it. And the last example is Gherkin. I think that I have an example of Gherkin here. Yes, I'm not sure if you can see it well. But Gherkin is not valid Ruby syntax. If you try to parse this block on the left with the Ruby parser, it will tell you that it's invalid. On the other hand, you've got a library called CUDA, which is an alternative to Gherkin. So you can use Cucumber with both of them. One of them is an external DSL because it needs its own parser. It cannot be parsed as Ruby code. While CUDA, even though it does the same, is valid Ruby code. And therefore it is easier, because you do not need to write a specific parser for it. But on the other hand, it is not as nice. The Gherkin on the left looks much easier to read, especially for non-programmers.

This is some internal DSL where the first line looks almost like SQL, just a very weird type of SQL, but actually this is valid Lisp code. I think it's written in Common Lisp. Lisp is an extremely flexible language, and it allows you to build a DSL that doesn't really look like Lisp. Well, you probably recognize that it is Lisp because of the parentheses. But it doesn't feel very much like Lisp code. And the second one looks very easy because it's just a loop, which we know from lots of other languages, but Lisp doesn't have a native loop structure. This is actually a DSL that is written in the language itself. I don't think that you could achieve something like this in Ruby, but I don't know, maybe someone tried to implement the for loop without using the native for loop in Ruby.
Okay, so once again, an external DSL is like a separate language. It is not part of the language which is reading it. So it is more difficult to use, because you need to write a parser for this language. You need to write a separate program that will parse your language, but it is extremely flexible. You can define any grammar rules. You can define the language to look whatever way you want it to look. On the other hand, internal DSLs are limited because they need to be valid syntax of the language that hosts the DSL. But because of that, it is much easier to define an internal DSL, because you don't have to write a parser for it. So if you check the RSpec code, it doesn't have any specific parser. RSpec code is just parsed by the same Ruby parser that you use for running your programs.

Now, in Ruby, there are basically three components that you need to understand in order to start building DSLs. The first one is a block, and I think that everyone using Ruby knows what a block is. The second one is executing the block in different contexts. You can use the keyword yield, which just executes the block that you passed to a method. You can use instance_eval, or you can use block.call. I will explain the difference between instance_exec and instance_eval soon. And the last concept is metaprogramming. You need to understand just the very basic components of metaprogramming in Ruby, like define_method, which allows you to define a new method basically from a string. So you define a method from data, not from a piece of code. And the last one is method_missing, which is basically a method that is executed when you try to call a method that doesn't exist in your code. There is, of course, more to metaprogramming in Ruby, and if you don't know how it works, you should definitely check one of the talks that I mentioned earlier.

Okay, now I will show you a few very, very simple DSLs and explain how they work.
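
For readers following along without the slides, here is a minimal sketch of the building blocks just listed: running a block with yield and with block.call, define_method, and method_missing. The names run_with_yield, run_with_call and Greeter are made up for this illustration and are not from the talk.

```ruby
# Running a block: yield and block.call are interchangeable here.
def run_with_yield
  yield 21
end

def run_with_call(&block)
  block.call(21)
end

# Hypothetical class used only to illustrate the two metaprogramming hooks.
class Greeter
  # define_method builds methods from data (a list of words),
  # not from hand-written method definitions.
  %w[hello goodbye].each do |word|
    define_method("say_#{word}") { |name| "#{word.capitalize}, #{name}!" }
  end

  # method_missing runs when a method that does not exist is called.
  # It receives the method name and the arguments of that call.
  def method_missing(name, *args)
    return super unless name.to_s.start_with?("shout_")

    "#{name.to_s.sub('shout_', '').upcase}, #{args.first}!"
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?("shout_") || super
  end
end

puts run_with_yield { |x| x * 2 }    # 42
puts Greeter.new.say_hello("Ruby")   # Hello, Ruby!
puts Greeter.new.shout_hey("Ruby")   # HEY, Ruby!
```

Calling any other undefined method on Greeter still raises NoMethodError, because method_missing falls back to super.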
I will not be writing the DSLs live; I already have the code. Is it big enough? No? Okay, so first I want you to just focus on this. This is my DSL. I want to basically copy Factory Girl, but instead of doing it in 300 or 400 lines of code, I want to do it in 50 lines of code. So basically I want to be able to define factories that will be used later for testing, so that I can easily create a user or some other object. Normally, if I want to create an object for the purpose of my test, I need to do User.new and pass all the parameters that are required. To make it simpler, I just want to have a factory that will already fill this user object with some default values of the parameters. Okay, so I define the factory called user, and later I will do something like Factory.create. And this is my DSL. It just needs to consist of two public methods.

So let's start with how this define method works, and then I will explain how create works. My class Factory has one internal variable called factories, where I keep the collection of created factories. When I define a factory, I take the name and I create a new factory that will have this name, so this one will be called user, and it will have some body. The body here is a block. So I'm passing this block here, all these three lines, to this method under the name block.

And now we have this method called initialize. First I need to know what kind of object this specific factory will be creating. So I'm taking my name, which is the symbol user. I'm changing it to a string. I capitalize it so the first letter is a capital letter. And then I try to get the constant with that name. Of course, if this constant is not defined, then my factory will fail. And now here's where the magic happens. Here's where I need to use instance_eval, one of the methods that I mentioned before.
I basically take this body, the block, and I'm executing it inside my factory. So I'm going line by line, and I'm just calling something like this: I'm calling first_name with the value Michael. Now what is first_name? My factory here, which calls this block, doesn't have a method called first_name. So what happens is that method_missing will be executed. And method_missing tells me the name of the method that I tried to call, and with what parameters I tried to call it. In this case the name will be first_name, and the arguments will be an array with just one element, which will be Michael. So I'm assigning this value Michael to the default attributes hash that I have here, under the key first_name. It's the same with the other ones. I go through last_name with Jordan, and then through email. After this, my default attributes, which started as an empty hash, are filled with all these three values. So it looks like this. These are my default attributes.

And that's it. That's what defining a factory does. Now my factory is ready and it's added to the small storage of factories. So when I call this method here, when I call Factory.create with user, it just checks which factory is registered under the name user and it calls the create method on it. This method is extremely simple. It just creates a new object of the class that I saved before and it initializes this object with the default attributes, or with attributes that I pass to override them. So this is extremely simple. This DSL is just 26 lines of code and it's a very basic version of Factory Girl.

Now I have a similar example here. Let me show them next to each other. This example here doesn't use instance_eval. It uses yield instead. And my factory definition will also look a bit different in this case. As you see, the block in my DSL takes one argument.
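
Since the slides are not part of this transcript, here is a sketch of what the instance_eval version of the factory described above could look like. It is reconstructed from the description only: the User class, the email value and the exact method signatures are assumptions, not the speaker's code.

```ruby
# Hypothetical model class; the talk only says that a User constant exists.
class User
  attr_reader :first_name, :last_name, :email

  def initialize(attributes = {})
    @first_name = attributes[:first_name]
    @last_name  = attributes[:last_name]
    @email      = attributes[:email]
  end
end

class Factory
  @factories = {} # storage of defined factories, keyed by name

  class << self
    # Public method 1: register a factory under a name.
    def define(name, &block)
      @factories[name] = new(name, &block)
    end

    # Public method 2: build an object from a registered factory.
    def create(name, overrides = {})
      @factories.fetch(name).create(overrides)
    end
  end

  def initialize(name, &block)
    # :user -> "user" -> "User" -> the User constant
    @klass = Object.const_get(name.to_s.capitalize)
    @default_attributes = {}
    # Run the block with self set to this factory, so that
    # `first_name "Michael"` ends up in method_missing below.
    instance_eval(&block)
  end

  def create(overrides = {})
    @klass.new(@default_attributes.merge(overrides))
  end

  # Every unknown one-argument call becomes a default attribute.
  def method_missing(name, *args)
    return super unless args.size == 1

    @default_attributes[name] = args.first
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

Factory.define(:user) do
  first_name "Michael"
  last_name  "Jordan"
  email      "michael@example.com" # placeholder value
end

puts Factory.create(:user).first_name                     # Michael
puts Factory.create(:user, last_name: "Scott").last_name  # Scott
```

The catch-all method_missing is what keeps this short, and it is also the fragile part: a typo in an attribute name is silently accepted as a new attribute.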
And now I call first_name, last_name and email on this argument. What is this argument? What am I passing here? I'm passing self. So I'm passing the factory to the block under the name f, and then I call f.first_name, f.last_name and f.email. Now why do I need to do it this way? Let me show you how both of them work and then you will see the difference.

Okay, so I create my user with the factory and it works. And here I have one change. Instead of passing the string here, I want to pass my last name. So I have an instance variable called last_name and, as you see, what happens here is that my last name is nil. Why is it nil? It's nil because I call this thing using instance_eval, and instance_eval executes the block that I passed in the current context, and the current context here is the factory object. So in this block, self is the factory object, and the factory object doesn't have an instance variable called last_name. In this case, using instance_eval, I am not able to use the instance variables from outside of the context of this factory. If instead I had a variable called last_name here, this would work, because then this variable would exist in the context of this factory.

So this is a limitation of instance_eval. Now if I do the same using yield, this works fine. See that the last name is here. That's because this block is still executed in its normal context. So the difference between yield and instance_eval is that yield doesn't change the context of the block. It uses the context in which the block was initially created, while instance_eval kind of overwrites self. So this is one example of a very simple DSL.

Now there is one more method like these two that I mentioned a moment ago, and it is called instance_exec. And instance_exec is basically the same as instance_eval.
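
The nil last name from the demo can be reproduced with a small sketch. EvalBuilder, YieldBuilder and TestCase are hypothetical names invented for this illustration; only the behaviour (instance_eval changes self, yield does not) comes from the talk.

```ruby
# Builder that runs the block with instance_eval: self becomes the builder.
class EvalBuilder
  attr_reader :attributes

  def initialize(&block)
    @attributes = {}
    instance_eval(&block)
  end

  def last_name(value)
    @attributes[:last_name] = value
  end
end

# Builder that yields itself: the block keeps the self it was created with.
class YieldBuilder
  attr_reader :attributes

  def initialize
    @attributes = {}
    yield self
  end

  def last_name(value)
    @attributes[:last_name] = value
  end
end

# Stand-in for the place where the DSL is used, e.g. a test case.
class TestCase
  def initialize
    @last_name = "Jordan"
  end

  def with_instance_eval
    # Here @last_name is looked up on the builder, which has no such variable.
    EvalBuilder.new { last_name @last_name }.attributes
  end

  def with_yield
    # Here @last_name is still looked up on the TestCase instance.
    YieldBuilder.new { |f| f.last_name @last_name }.attributes
  end
end

p TestCase.new.with_instance_eval[:last_name] # nil
p TestCase.new.with_yield[:last_name]         # "Jordan"
```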
Like instance_eval, instance_exec overwrites self, and the only difference is that it allows you to pass arguments to the block. So here yield allows me to pass self as an argument, but I can also pass something else here. I can pass user or whatever I want to pass to this block as an argument. instance_eval takes only one argument, which is the block, but if I use instance_exec then I can pass self as an argument, and only the last argument is the block that I will be executing. So here I can also define it like this. I can have f, which will be the factory, and I can have it like this.

So to sum up this difference: you should use yield when you want to have access to the methods and variables from the original context where the block was created. You should use instance_exec when you want to have access to the current context, when you want to use variables or methods from the object which will be calling the block, not the object that created the block. And instance_eval, well, basically there is no reason to use it because it's the same as instance_exec except that it's limited. It doesn't accept additional arguments.

Before I get to this question, I want to show you another small DSL. So I have this CSV generator. Sometimes at Kaligo we need to generate a lot of different files that we send later to our partners, because lots of companies that we work with do not have REST APIs in their systems. How it works is that you generate the file, you send it to their SFTP, or sometimes you have to upload it manually via a web interface, and then you wait three days and then they upload the handback file to your SFTP, or they don't do that because they forget, because there are lots of manual processes involved there. So we need to generate files, and sometimes these files are in CSV format, sometimes they are in fixed-width format, which is basically a format where each column needs to have the same number of characters in every line.
So here I just want to generate a very basic CSV file. I accept the records, or orders, as the argument to the object, and I have a small DSL that defines the header, what the first line of the CSV file will look like, and I define the body, and the body is different for each row. Each order has its own row, its own line in the file. This DSL is extremely simple and it's just like 50 lines of code. So let's have a look at how it works.

The first thing that you may notice is that I'm including the CSV generator here, and when I include it, I need to enable some instance and class methods. To do it in Ruby, I need to include the instance methods module and I need to extend my current class with the class methods module. So now the class methods that I have here: this is a class method because, as you can see, this code runs in the context of the class. So I've got a header and a body. What header basically does is that it takes a block and passes it to a class called part generator, and it's exactly the same with body. So I just take this block, it's not executed right now, it's just captured, and I pass it somewhere, and I will execute it later when I'm generating the file.

The block that I'm passing to header takes one argument, which will be the header generator, because I don't need any additional information here. I'm only adding the columns with some hard-coded strings. But the second one takes an additional argument, which is the order, because each row will have some information taken from this order. And my CSV generator has one method called generate, so I'm calling something like this.
That method called generate first executes the block that I passed to the header generator and adds the result to the beginning of the CSV file. Then it goes through all the records that I have, and for each of them it calls the block that I passed to the body generator, passing this record there, and it also adds the result to the CSV file.

Now what do the header generator and the body generator do? They are also very simple. My part generator, which will later generate either the header or the body, takes a block and has a collection of values. This collection is initially empty. It needs to be an array because that's how CSV is represented. Each row in the CSV file is just an array of some values that are later joined with a comma or some other character. When I call this generator, I can take any number of arguments here. This is the not so nice part, because some blocks will take zero arguments, some blocks will take two arguments, but I cannot define it here. I need to be able to take any number of arguments, so I don't have any control over that. Then I call the block that I captured before, passing self first, and then I pass the arguments.

And now what I do is that I go through these lines and I basically call generator.column, here generator.column with address, here generator.column with price. What happens is that when I call the column method, it's executed here in this generator. This will give me an array like ID, price, etc. And here I iterate over the records and each of them generates one array of three elements. The first element will be the order ID, the second will be some string, and the third one will be the total amount. Let me show you that it actually works. I've got this variable called orders here. And the output is one string that is just all this.
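
A sketch of the CSV generator as described above, for readers without the slides. The structure (a module with class-level header and body methods, a PartGenerator that captures a block and runs it with block.call, passing itself first) follows the description; the names Order and OrdersCsv, the column values, and the simplification of keeping the instance methods directly in the module are assumptions.

```ruby
module CsvGenerator
  # When the module is included, also add the class-level DSL methods.
  def self.included(base)
    base.extend(ClassMethods)
  end

  # Captures a block now and runs it later, once per generated row.
  class PartGenerator
    def initialize(&block)
      @block = block
    end

    # Accepts any number of arguments and forwards them after self.
    def call(*args)
      @values = []
      @block.call(self, *args)
      @values
    end

    def column(value)
      @values << value
    end
  end

  module ClassMethods
    attr_reader :header_generator, :body_generator

    def header(&block)
      @header_generator = PartGenerator.new(&block)
    end

    def body(&block)
      @body_generator = PartGenerator.new(&block)
    end
  end

  def initialize(records)
    @records = records
  end

  def generate
    rows = [self.class.header_generator.call]
    @records.each { |record| rows << self.class.body_generator.call(record) }
    # Plain join for brevity: no quoting or escaping of values.
    rows.map { |row| row.join(",") }.join("\n")
  end
end

# Hypothetical record type and generator class for the demo.
Order = Struct.new(:id, :address, :total)

class OrdersCsv
  include CsvGenerator

  header do |csv|
    csv.column "ID"
    csv.column "Address"
    csv.column "Price"
  end

  body do |csv, order|
    csv.column order.id
    csv.column order.address
    csv.column order.total
  end
end

orders = [Order.new(1, "Main St", 100), Order.new(2, "Side St", 250)]
puts OrdersCsv.new(orders).generate
```

With these two sample orders the output is three lines: the header, then one line per order.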
This is the first row, here we've got the new line, this is the second row, here we've got a new line once again, and the third row. So I just generated a valid CSV string that I can later save to a file.

Here once again I use block.call. I'm calling the block here, and I use this syntax where I'm passing the builder and I'm calling this method explicitly on the builder. If I call it this way, it's implicit what I will be calling it on, and I might be calling it in some different context. But here I just wanted to explicitly say that this column needs to be called on this builder. So here I just wanted to show that I can actually do a similar thing: I can use yield and pass self instead of doing block.call. Oh, sorry, it's not this one, it's here. Okay, yes, this one. So here I was using block.call, which is the same as yield, and here I'm using instance_exec, and I'm just passing first the arguments that I'm getting and then I'm passing the block. And here the DSL looks a bit different. The block just doesn't take this extra argument.

Okay, and the last example that I want to show you is a config that I often use in applications when I need some configuration that I don't want to keep in the database. Why wouldn't I keep the configuration in the database if I can? First, because sometimes I've got configuration that changes extremely rarely and I just don't need to keep it in the database. The other reason is that changes in the database are not tracked as well as changes in the code. Whenever you change something in the database, it's already there. Whenever you change something in code, you need to push it, you need to have the code reviewed, so that someone can notice that, hey, this change shouldn't be there. So I define very simple configuration files that are just Ruby code.
So this code on the left and the code on the right are exactly the same. I just wrote a very small library, which is just one method, that allows me to write it like this. What it does is that the base class, the class called Application, defines three configuration values. The first one is root_path. The second one is supported_formats. The last one is authentication. What happens here is that root_path doesn't have a default value. Therefore, if you try to call it, it will just raise an error, the same as you see here. The second one, called supported_formats, has a default value, which is an array that consists of two elements. And for the last one, the default value here is a block. So if you try to call it, it will just execute this code here. So by default, when you call authentication on my application, it will return whatever this block returns. But the admin application that inherits from it overrides this value. When you try to call authentication, it will call a service called admin authentication.

Now, how does this one method that I mentioned work? It's basically one piece of very crappy code that should be hidden somewhere and not touched, so that nobody sees how ugly metaprogramming can be and everyone just sees how beautiful it looks when it's being used. It is terrible for many reasons. One of them is that it took me just like 20 minutes to write it. The other is that I wanted to make it look very nice and very easy to use, so because of that, I need to do some ugly hacking here.

So we have just one method called config. When you call this method, it takes the name of the configuration value and it defines a singleton method. define_singleton_method basically defines a class method here, and it defines this method on the class. So if I say config root_path, then my class gets a method called root_path. And if I call this class method root_path, then it will define a method with the same name on the instance.
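
The config method is only described verbally, so the following is a guess at its shape rather than the speaker's code. The module name Configurable, the default values, the path and the symbol returned by the admin block are placeholders; the mechanism (define_singleton_method for the class-level setter, define_method for the instance-level reader) is the one described above.

```ruby
module Configurable
  def config(name, default = nil, &default_block)
    # Class-level setter, e.g. `root_path "/srv/admin"` or `authentication { ... }`.
    # Calling it (re)defines the instance method on the class that called it.
    define_singleton_method(name) do |value = nil, &block|
      define_method(name) { block ? instance_exec(&block) : value }
    end

    # Instance-level reader that is used until a value is set.
    define_method(name) do
      if default_block
        instance_exec(&default_block)
      elsif !default.nil?
        default
      else
        raise "#{name} is not configured"
      end
    end
  end
end

class Application
  extend Configurable

  config :root_path                         # no default: raises when read
  config :supported_formats, %w[json xml]   # placeholder default value
  config(:authentication) { true }          # placeholder default block
end

class AdminApplication < Application
  root_path "/srv/admin"                    # placeholder path
  authentication { :admin_authentication }  # stands in for calling a service
end

app = AdminApplication.new
puts app.root_path                 # /srv/admin
p app.supported_formats            # ["json", "xml"]
p app.authentication               # :admin_authentication
p Application.new.authentication   # true
```

Because the setter calls define_method on whichever class invoked it, the override lands on AdminApplication only and Application keeps its defaults.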
So let me show you how this code works in practice. Oh, no. It doesn't work in practice. Why doesn't it work in practice? So I've got AdminApplication.new, which is my app now. And on app I can call root_path. It should be app. So I basically create an instance of the AdminApplication class, and this instance has access to the root_path method. So if I call root_path in the context of a class, I just define the value, and then instances can use this value.

So this is an extremely simple example of what you can do in Ruby. I'm not even sure if it's a DSL. It's just a small change that doesn't look like Ruby anymore, but this is still working Ruby code. I'm using this just to make my configuration files more readable. I could do it using JSON or YAML, but then I cannot override a value with working code like this block here. If I do it in the database, then other people can change the values and I don't have control over it, unless I add some triggers in the database that will inform me about it, et cetera. So this is how I write config files that are both kept in the code and easily extendable.

Okay. So these were the examples that I wanted to show you today. Another question is whether these examples are really domain-specific languages. I have some examples of comments by random people on the Internet who are bitching about it, saying that, oh, it's not a DSL because it's just syntax abuse and the only real DSLs are at least blah, blah, blah. Here's some more. And one comment that caught my attention was that it is not really writing a DSL. You're not writing a new language, you are just abusing the existing syntax to make your code look like it's not written in this language. And then I thought that actually it's not an insult, that abusing a language can be a beautiful thing. And I wanted to show you some examples.
So this is valid Ruby code. This is the Maybe monad. And it's part of a library, I don't even know how to pronounce its name. But what draws my attention here? What is unexpected is this operator made of the characters greater than, dash, greater than. I tried to create a method like this, and it's impossible. Ruby doesn't allow you to do it. So then I'm thinking, wow, I'm using a method that I cannot define. How do they do that? And it turns out that these guys just created a method called greater than. It's just one character. And then this dash and greater than is the syntax for a lambda. So you see this code here, line four, this line. Now look how else it can be written. It's this. So is it a DSL? I don't know. Is it abusing the language? Definitely. Is it beautiful? I really love it. So it's an example of something that Ruby allows you to do because it's so flexible. Other languages, except for Lisp maybe, not necessarily.

Another abuse of the language syntax, from the same library. I didn't know that Ruby could parse this code. But actually it can. You do not have to write the receiver, the dot and the method name right next to each other. You can put as many spaces between them as you like. And I'll show you that it really works. I read about it just yesterday and I'm still surprised. So let's say that I have a string and I want to capitalize it. And it works. So you can just write it like this.

And you can create methods with names that start with a capital letter. The first thing that you think of when you see a name starting with a capital letter is that it's a class, but it doesn't have to be. Oh no. It doesn't work in this case. Now why? Yes? Oh yes. Okay. Thank you. So if you don't put parentheses explicitly, Ruby first tries to find a constant called SayHello. But in the case of this library, Maybe is both a method and a class. So it's once again an abuse that you probably don't want to do in your code.
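
The three tricks above can be tried with a tiny stand-in. This Maybe class is a hypothetical minimal version written for this transcript, not the library's implementation; it only shows that the apparent operator is a method named > followed by a lambda literal, that a capitalized method can share its name with a class, and that spaces around the dot are allowed.

```ruby
# Hypothetical minimal Maybe, just enough to demonstrate the syntax.
class Maybe
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # A method literally named `>`. It takes any callable, e.g. a lambda,
  # and skips the call when there is no value.
  def >(callable)
    value.nil? ? self : Maybe.new(callable.call(value))
  end
end

# A method whose name starts with a capital letter, same as the class.
# It has to be called with parentheses, otherwise Ruby looks up the constant.
def Maybe(value)
  Maybe.new(value)
end

# Reads like a `>->` operator, but it is `>` applied to `-> x { ... }`.
result = Maybe(20) >-> x { x + 1 } >-> x { x * 2 }
p result.value # 42

# The same line written the boring way.
p Maybe(20).>(->(x) { x + 1 }).>(->(x) { x * 2 }).value # 42

# Whitespace around the dot is allowed.
p "hello"   .   capitalize # "Hello"
```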
But I think it's just fascinating that you can do something like this. So once again, are these examples that I showed you DSLs? Probably not, because they are not real languages. They are just a bunch of random methods tied together to make Ruby code not look like Ruby, to make it more readable. But I think that it's just amazing that we can do something like this.

Now the question is: should you create new DSLs? The answer is no. If you ask yourself, should I create a DSL, then probably the answer is no. If you see that there is a case where you're really convinced that you want to write a DSL, then maybe the answer is yes. I've been writing Ruby for a couple of years. I wrote maybe two or three DSLs that I'm still using. In other cases I just stick to standard methods. Because even though DSLs are very readable for you, the author, they are not readable for new programmers, for your users. We have plenty of DSLs in the Ruby community. We have Factory Girl, we have RSpec and a few more. And some of them are good ideas. But any DSL requires a lot of learning. To understand RSpec, it's not enough to understand Ruby. You have to learn RSpec separately. And actually you can understand RSpec knowing almost no Ruby at all, and you can know Ruby very well but not understand RSpec. So that's why I think that DSLs are fine in certain cases, probably extreme cases, but not for everyday use.

That's it for today. Once again, my name is Grzegorz. I work for a company called Kaligo Travel Solutions. We're hiring. We're writing lots of Ruby code. We're writing lots of Elixir code. We use the Hanami framework that you heard about at the Red Dot Ruby Conference. We use it in production. It's awesome. You can join us and check how it works in production. Yeah, thank you very much for listening.