So thanks, Ming-Ding, and hello everyone, good evening. My name is Daniel, and today I will be giving a talk about what I learned from my first open source contribution, specifically a contribution to a repository named RuboCop.

I'm going to start this talk off with a bit of a hand-raising exercise. Could you raise your hands if you are a developer with one year of experience or less? Okay, that's a few of us. Another one: raise your hands if you are a developer with two years of experience or less, and yes, that includes the guys with one year of experience or less. Okay, this is about half of us. High five, me too. Raise your hands if you've used RuboCop before. That's almost everyone, great. And raise your hands if you've looked into the source code of RuboCop before. Alright. To be honest, I had a genuine fear that everybody in this room would have seen the source code before, and then I would have pressed the next slide, which is like, "Okay, thanks, I'm done with the talk." But okay, cool, thanks. I just wanted to gauge where you guys were at.

So right, my name is Daniel, this is my Twitter handle, and that's my GitHub link. Today my talk will be centered around these two events. Last year, six months ago, in July and August, I managed to merge two pull requests into RuboCop. But first, some context on what I used to do and why I decided to contribute to RuboCop. I wasn't always a programmer; in fact, I made these two contributions within my first year as a programmer. Before that I was a psychology graduate, and after I graduated I was a secondary school teacher in Malaysia, teaching history. So that's me. It was a two-year contract, and after the two years ended I joined one of the coding boot camps, where I learned Ruby on Rails for about three months.
That's also when I got married, and then I moved over to Singapore, where a company called Tinkerbox took me in as an intern and eventually converted me to a full-time programmer. As a new programmer I understood that there was a lot that I didn't understand, and I think that was my constant struggle at Tinkerbox.

One of the things I did at Tinkerbox to improve my programming skills was a lot of code challenges. I made it a thing at Tinkerbox where we get together and discuss code katas on a site called Codewars. It was a good way of learning the different libraries and methods that are available in Ruby, but I still felt dissatisfied with how much I didn't know. I felt that I still couldn't do a lot of things, and I think this was because I was trapped in this thing called the Rails bubble. The Rails bubble, as I describe it, is being exposed only to building apps in Ruby on Rails, so I viewed programming through the lens of Ruby on Rails and I was very bad at novel problems. If a more complex problem was presented, or some algorithms or something, I would completely trip up.

So I had a talk with my CTO at our annual one-on-one, and he pushed me to contribute to RuboCop as my short-term plan; I think it was my six-month plan or something like that. I was like, okay, fine, and that was when I made my entrance into RuboCop. That's why I contributed to RuboCop.

So let's get the cat out of the bag first: what is RuboCop? If you are thinking of this guy, RoboCop, whose name I think is Murphy or something, right? Anyway, RuboCop is a Ruby static code analyzer based on the community Ruby style guide, and it can do a couple of things. First of all, it can help you enforce a particular code style. In this case we have a method, and we have a return in the method, return a, and we know as Rubyists that all method definitions have an implicit return, so
you don't really need to put a return there. If you want to enforce this particular style of not using a return, you can, and then RuboCop will highlight it as an offense.

It can also suggest alternative methods from the Ruby standard library. Here we have a method where someone is trying to iterate through a hash, but he is only using the key and not using the value. RuboCop can flag that as well and tell you, well, you can use a method like Hash#each_key.

And it can also autocorrect for you. When RuboCop catches a particular offense, you can run RuboCop with the rubocop -a option, which stands for autocorrect, and it can correct a few simple mistakes for you. For example, it can get rid of the return, or it can even help you to indent your operators or something like that.

So when I started to look into the RuboCop codebase, I asked myself the obvious question first: how does RuboCop work? How does it do what it does? I looked into the codebase, and these were the apparent, obvious structures in it. I saw a folder named ast, which I knew nothing about, then cop, formatter, and rspec. I had an inclination to go into cop, because that seemed like the most obvious thing; that's probably where it keeps all its good logic. So I opened up one of the cops, and this is the HashEachMethods cop, and I was like, oh my gosh, what is this? I saw a lot of methods that I didn't understand.

By the way, I also want to make a quick diversion into this thing called ctags. Raise your hands if you use ctags. Yeah, okay, cool. So ctags is a tool that creates a file that contains all the methods in a particular directory, and it stores a reference to their definitions. If you have a ctags add-on installed, you can just hit a shortcut key and it pops you over to the method definition. And the cool part is, if you're using it with Ruby, you could run a simple executable so that when
you enter a new Ruby, sorry, Rails directory, or Ruby directory, or anything, you could get it to build the ctags reference file for all the gems in your Gemfile. So now if you see a weird method that you don't understand, you just hit a button and it goes to the method definition. I didn't know this when I was navigating through the RuboCop codebase, but now I use it a lot and it's very useful.

So anyway, I go to that method and I decide to find out what this def_node_matcher is. RuboCop leaves a lot of helpful comments on the methods to tell you what they're doing, so I tried to read the comments. I still don't really know what's going on, but I notice there's the AST acronym over there, and I'm like, okay, cool. Then I scroll to the top of the file and I see even more comments, which is excellent, and again I see this AST word, which I'm unfamiliar with. At this point I'm like, okay, what is this AST? It seems pretty important. So I did what any logical programmer would do: I looked it up on Wikipedia, of course.

So an AST, as I understand it, is a tree that represents the structure of your source code with these nodes. It tries to represent your source code in nodes, and it's very useful for seeing, for example, which pieces of code are arguments being sent to other method calls, et cetera. It gives you some sort of a sequence as to what is calling what, and so on and so forth.

I read this and I dug a little deeper, and I thought, okay, cool, so what does Ruby use the AST for? As it turns out, when you write some Ruby code, it goes through a couple of stages before it finally gets executed. For example, take this simple Ruby code, puts 2 + 2. Ruby first of all has to read it from left to right, character by character. So it reads P, U, T, S, and then it's like, oh, puts, I know that, and it tokenizes it. And then it reads 2
and it's like, oh, I know that too, that's an integer, and then it tokenizes that. That is known as tokenizing. I forgot the puts on the slide, but it looks something like this.

Okay, so now it's got some tokens. But at this point you could have made a mistake in your code and it still wouldn't know, because it hasn't tried to work out the sequence yet; for example, that 2 should be passed as an argument to plus on the receiver 2, so 2 + 2. This is what the next step is doing: it's parsing it, giving it some sort of a sequence. Once you've got this sequence, it gets translated into these instructions, which are called YARV instructions, and this bytecode is, I guess, the last step before your code finally gets executed.

The bit that we are concerned with is here: this is the AST. The tree that we saw just now on Wikipedia can look sort of like this, and I'll go into that in a bit. And that bit there is the YARV instructions.

Ruby ships with a library called Ripper that lets you look at the tree its own parser produces, but RuboCop uses a different gem called parser. So it's a different gem, but it achieves the same thing. You can all do this: you could just run gem install parser, and then you could use ruby-parse, followed by -e for expression, and key in an expression like 2 + 2. It's a bit small, I apologize, but it will generate the AST for you.

So let's look at this. I called ruby-parse twice. It's a very simple expression: I call my_method, passing in the argument 1. At the bottom I call the same thing again with the argument 1, but without the parentheses. We know we can do this in Ruby, and through the AST it's very clear why: the ASTs are identical. It doesn't distinguish between parentheses and no parentheses, and so we can understand, okay, this is why the code executes the same way.

Here's another example. Here is my_method, which returns
a, and let's take a little deeper look into the AST. First, it starts out with a def node, and the way we read this is that the def node has three children. The first is the name, my_method. The second child is the args, which is empty because my_method accepts no arguments. The third child is a return, which in turn has another child, which is a send of a. So we can see here how Ruby is making sense of the sequencing of this piece of code.

Then you try to play around with it to see how the AST changes. I try, okay, what happens if you return two values, like a and b? You notice that the code structure is still the same, except now, as we expect, the return node has two children: a send of a and a send of b. So far so good.

Then I tried something else. I tried a, and then return b, and we notice the AST changes a little bit to deal with the two things that are happening inside the method body. A new begin node is introduced, and the begin node now houses the send of a and the return, which sends b.

So right now you must be thinking, great, what do I use this for? Or more specifically, why does the AST matter in RuboCop? As it turns out, it matters a whole lot. Remember how I said that RuboCop could suggest alternative methods from the Ruby library? In order to do this, it must first of all be aware of what you're calling. It has to know you're calling each on a hash and that you're not using, maybe, a key or a value.

So let's take the same code and generate its AST. We see the def node, which has empty arguments, and then a huge block node, as shown. In the block node you see three children. The first is a send of each, and it is being called on the receiver, which is a send of hash. The second child, the arguments, is basically two children: an argument called key and an argument called
value. And then the third one is just a send of do_something with a local variable, key.

Now, how RuboCop knows what you're doing is that it just tries to pattern match. This you would find in most cops: you would find a def_node_matcher, and then you would put in these things. If you were like me when I first looked into it, I wondered what the lines around the pattern were; I was like, oh, what is that? Don't worry, it's called a heredoc, and you can basically treat it like a string that has new lines inside. It's just like a string literal, so you could just print it.

Okay, so you try to construct a node matcher, and what you're actually trying to match is the shape down there. It starts out with the block, and then you're trying to match a send of each, so you can see that part is similar. Then in the node matcher we are trying to match anything that is not an array; you can see the exclamation mark, so it's not an array, basically. And then you must also have two arguments, a key and a value. In the node matcher you see an _k and an _v, which work like a wildcard: match anything. You could name it something else if you want, like key and value, or index and something; it doesn't matter. As long as it sees these two arguments, it'll match. So when RuboCop matches this, it'll be like, okay, I have a case here, and it then tries to call some methods to either raise an offense or autocorrect your code or something.

Okay, so what about autocorrecting? How does RuboCop autocorrect your code? Here we have a few very simple expressions. We have definition assigned to the instance variable definition, we have foo assigned to the instance variable source, and then we have some unrelated method call. And in RuboCop, okay, so one of
the things that you can do is autocorrect for style as well. The equals operators in this code here, as you see, are not aligned, and RuboCop can help you to align those as well.

So what we can do is, well, we can parse it, but we can also pass in an extra option, the -L, capital L, option. What it does is, on top of providing you the AST, it also gives you the locations of the parts that form the expression. In the first line, definition equals the instance variable definition, you can see that it can correctly identify where the operator is. And this is key.

This is a piece of code that I did not write. It was written by whitequark in one of the articles on his blog. whitequark is the creator of parser, so I don't claim any credit for the code; I'm just showing what is possible. Here on the right is the AST for the three lines of code we saw earlier, and on the left is a class that he wrote that inherits from Parser::Rewriter.

First of all, on line two it defines on_begin, which means it's trying to match a begin node, and this is a begin node, so that matches. Then it says node.children.each, so it's just iterating through all the children. The begin node has three children: ivasgn, which is instance variable assign, lvasgn, which is local variable assign, and then a send of the unrelated method call. It says, okay, if the nodes are assignment nodes, then push them into an array. So that's basically what it's doing: pushing the instance variable assign node and the local variable assign node into the array.

Then it calls align on that array of nodes. What align basically does is it takes the array, iterates through it, and for each node it calls this method loc, which I think is short for location, and then it calls .operator to find where the operator is, and then it finds the column. So you can find out the exact column number
that the operator is in. Now it's got an array of column numbers, and it just picks the maximum one, so it picks the operator that is furthest to the right. What the line at the bottom is doing is just inserting white space before the operators to justify them. So if you run ruby-rewrite, and you put in the script and the expression, you notice that the equals signs are now justified.

Okay, so how does this all relate back to RuboCop? I mean, obviously there's more to it, but essentially that's how all the cops work. Take the HashEachMethods cop again. You have the def_node_matchers, which are just trying to catch particular types of patterns. In particular, it's trying to catch when somebody calls each on a hash with both a key and a value, or when you call keys or values and then each, and it tries to suggest a method for that. And then you can also define on_block, among many other types of nodes, to say, okay, I want you to kick in when you see a block node that looks like this. And that's it; that's pretty much the core of what I learned about RuboCop.

Sorry, a little bit more. So did it change my dissatisfaction? I went through all that; I was dissatisfied before, and now am I less dissatisfied? The short answer is no, not really. You don't just morph into a beast overnight. But it got me started, I think. I started reading this book called Ruby Under a Microscope by Pat Shaughnessy. It's really good, I highly recommend it, and that's what I've been doing for the past few months. The book talks about the steps that Ruby takes to finally execute your code, and it goes through them in detail. I want to share with you one extra thing from this book that happens at the
compile step, sort of the last bit, which I thought was quite intriguing.

So this is a data structure called a stack. I learned about the stack in my first year, and I knew that, okay, a stack is pretty much like an array, but you can only put stuff in from one side and you have to take stuff out from the same side, so you've got a last-in, first-out kind of paradigm going on. And I'm like, okay, cool, that's a stack, but I don't really know what a stack is used for. I don't see how it is implemented and I don't see how it's useful to me.

So I explored this. Here I have a piece of code at the top, 2 + 2 + 2. It's very simple, and that's the AST. You could all do this as well: you could call the RubyVM::InstructionSequence class, compile that code, and it will produce these YARV instructions for you.

When you look at the YARV instructions, you can see what it is going to do. It reads from top to bottom, and it puts the first thing it sees onto the stack. In this case, well, it sees trace, and then it sees putobject 2, so it's going to put 2 onto the stack. Then it sees the next 2, and it's going to put 2 onto the stack. The next one is quite cool: it sees opt_plus. This is actually an optimization, but okay, so it sees opt_plus, and what opt_plus will do is look back at the last two things on the stack. The one immediately before it is the argument, and the one before that is the receiver. So it says, okay, call plus on the receiver 2 with the argument 2, and it executes this, which evaluates to 4. Then it continues: the next thing it sees is 2, and the next thing it sees is plus again, and it repeats itself. So then it's just, okay, receiver, argument, and that's the end.

So now I try to modify the sequencing by putting in some brackets. Now I do 2 +
(2 + 2). And I notice that the YARV instructions are not too different; just one thing changed. Now we see 2, 2, and then 2 again, and then it sees opt_plus. So plus is going to take the two things below it, the receiver and the argument, and that's going to evaluate to 4, which is basically our bracket up there. Then it sees the next plus, and then it returns 6.

I thought that was pretty cool. This was the first time I'd seen an implementation of a stack outside of a coding challenge, which I could not actually relate to at all. And then it led on to other things. For example, I was looking at JavaScript, and I keep hearing that JavaScript is async, JavaScript is async, but JavaScript has a synchronous call stack, so how does it become async? I won't go into that because this is not talk.js, but yeah, it does do something really clever with a queue, and I would not have appreciated it if I had not dived into this and looked at that. So I have a newfound appreciation for it.

Okay, so now I'm at the end of my talk. From this I derive a lot more pleasure from my work. I don't think I've skilled up, but I feel happier, I think, doing my work, so that's good. And I have one last favour to ask of you. I used to be a teacher and I value feedback, so I want one more hand-raising exercise: could you raise your hands for me if you learned one new thing from this talk? Oh, okay, thank you very much. Thanks very much, guys. I'll be taking questions, but please go easy on me.

Well, okay, so my last PR was in August, and then I tried to look at more pull requests, and I realized that at the time I submitted the PRs I didn't have a full understanding of how all of this worked. I kind of just pulled myself through it. So I forced myself to reflect and dig deeper into what was going on, and so, yeah, this culminated
in this talk. So I'll be starting again. I just did one, but I just corrected one piece of documentation, which is like one line of code; I just changed some documentation.

Yes, yeah, so you could call... man, there is a method that you could call on it. What RuboCop does is it creates its own node that inherits from AST::Node, and then it writes some wrapper methods around it to give it some superpowers. One of the things it can do is call out the exact line that was read, so, yeah, then it can spot the parentheses.

I'm very tempted to open the C source files and try to look into them, but at the same time there's a lot of inertia going into it as well. I'm definitely more interested, but I'll have to think about when I want to allocate some time and go into that. I'm definitely interested in looking into coding closer to the metal, yeah. But with that being said, I definitely have a newfound appreciation for Ruby and what it does. Alright, thanks very much, guys.
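
The stack walkthrough from the talk, 2 + 2 + 2 versus 2 + (2 + 2), can be sketched in Ruby roughly as follows. This is a toy model, not how the Ruby VM is implemented: the instruction names putobject and opt_plus mirror the YARV instructions mentioned in the talk, while the run method and the two instruction lists are hypothetical and exist only to show the receiver and argument coming off the stack.

```ruby
# Toy stack evaluator (illustrative only, not the real YARV implementation).
# Each instruction is an array: [name] or [name, operand].
def run(instructions)
  stack = []
  instructions.each do |op, operand|
    case op
    when :putobject
      # Push a literal value onto the stack.
      stack.push(operand)
    when :opt_plus
      # The most recent value is the argument, the one below it the receiver.
      argument = stack.pop
      receiver = stack.pop
      stack.push(receiver + argument)
    else
      raise ArgumentError, "unknown instruction: #{op}"
    end
  end
  stack.pop
end

# 2 + 2 + 2: add the first two values, then add the third.
LEFT_TO_RIGHT = [
  [:putobject, 2], [:putobject, 2], [:opt_plus],
  [:putobject, 2], [:opt_plus]
].freeze

# 2 + (2 + 2): push all three values, then add twice.
BRACKETED = [
  [:putobject, 2], [:putobject, 2], [:putobject, 2],
  [:opt_plus], [:opt_plus]
].freeze

puts run(LEFT_TO_RIGHT)
puts run(BRACKETED)
```

Both lists evaluate to the same result; only the order in which values are pushed and combined differs, which is the one change the talk points out between the two sets of instructions.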