 Good afternoon. My name is Claudio. I'm really excited to be here at RubyConf. As I was saying, this is actually a talk about Ruby, but the Ruby programming language. It's a language that we all love to write. We love to read. It's a language that is a little old, but it's still evolving. As a matter of fact, there is a new release every year, and there are new methods added every year. This is just an overview of some of the methods that I'm going to touch during this talk, and some of them have been in Ruby for many years, and others are new. They were just added in 2.4 released last December. The format that I chose for this talk is a refactoring journey, and I invite you all to come to this journey with me. What we're going to do is we're going to take a problem. It's actually a real problem that I had in my previous job, and then try to find the solution in Ruby with a first pass, and then iterate through the problem using some of these methods to make the code better. Whether you are a beginner or an expert Ruby developer, I hope you will all learn something from this talk. Now that the doors have magically closed, we can get it started. Let me give you some context here, because this is the example, the problem that we're going to use to look at those methods. It has to do with YouTube URLs. In my previous job, I had to deal with YouTube URLs a lot. If you have ever watched a video on YouTube, you have probably seen a URL of the first type that is a YouTube video. The second URL is also a YouTube video. It's just a shorter format. The third and the fourth one are URLs for YouTube channels, and then the last one is for a YouTube playlist. What we have here is different URLs, but really different strings that mean different things. What we're going to try to write is a method that is called parse, and it accepts a string as an input, and it's going to tell us what type of YouTube resource we are looking at. In the first case, that is a YouTube video. 
Ideally, this parse method would tell us that the type is video, and give us the ID of the video. In the second case, that is a YouTube channel, so we expect type channel and the name of the channel. The third one is not a YouTube URL, so it's simply type unknown. Now, because we are Rubyists, we're not going to be happy with just writing code that is correct and complete. We want something more. We're also going to try to write code that is clear, meaning readable, and compact, to follow the philosophy of do not repeat yourself. This is the problem that I'm going to talk about in the next 30 minutes, and before I start looking into that, I'm just going to stop here for a second and give you all a chance to think about this. If this was like a job interview or something like that, and this was the problem, what methods would you use if you had all the methods of the Ruby standard library available to try to solve this? Okay, so let's start, and I'm going to give it a first pass. What I can observe here is that those URLs start in different ways. The first one starts with youtube.com/watch, and I can imagine that is a video. The second one just starts with youtube.com/, and I can imagine that's a channel. The last one doesn't even have YouTube in it, so maybe unknown. So my first attempt is going to be to use the method start_with? from String, and this is straight from the documentation: start_with? returns true if the string starts with one of the prefixes given. So you can ask whether "hello" starts with "hell": true. You can also ask whether "hello" starts with "heaven" or "hell". It's also true, because one of the prefixes is a match. Does "hello" start with "heaven" or "paradise"? That's false. So using this method, we can try to find the type of a URL, and this is what our first pass would look like. So how does this rank for what we're trying to achieve? Well, it's not perfect. I would say it's clear, meaning it's pretty readable.
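The slide with the first pass is not part of this transcript, so the following is only a sketch of what it might look like. The method name parse comes from the talk; the exact prefixes and the shape of the returned hash are assumptions.

```ruby
# Examples in the spirit of the String#start_with? documentation.
"hello".start_with?("hell")               # => true
"hello".start_with?("heaven", "hell")     # => true
"hello".start_with?("heaven", "paradise") # => false

# First pass (assumed reconstruction): guess the type from the prefix only.
def parse(text)
  if text.start_with?("youtube.com/watch")
    { type: :video }
  elsif text.start_with?("youtube.com/")
    { type: :channel }
  else
    { type: :unknown }
  end
end
```

Note that this version reports youtube.com/ itself as a channel, since it only looks at the prefix.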
It reads pretty much like English: if text starts with, and so on. It's not compact, because we have some repetition there. The method start_with? appears twice. It's not complete. We're only trying to match the type. We're not returning the ID or the name. And even worse, it's not correct. The reason why I say that is that if your URL is simply youtube.com/, that is not a channel. That is actually YouTube's home page. And the reason is that YouTube has some strict rules about how the URLs are formed. This is basically the rule. When you have a video on YouTube, every video has a unique ID. This ID is always 11 characters. And it's not any 11 characters. They are all letters, digits, underscores, or hyphens. I don't know if you knew this. Maybe this is something you can take away from this talk. A channel name, also, has to have at least one of these characters. So as you can see, it's not as easy as just saying start_with?. We need something more powerful. We need to deal with regular expressions. So for the next iteration, we can use the method String#match?. This is actually a new method. It was just added in Ruby 2.4. It lets you compare a string with a regular expression, and it's going to tell you whether it's a match or not. In the first case, the string Ruby is a match for R dot dot dot, which means R and any three characters. That's true. But if the pattern only allows R and two more characters for the whole string, that's false because it's missing one character. And if you try P dot dot dot, it's also false because the first letter is an R, it's not a P. So using this method, we can now rewrite our code to look like this. If it's a match for a certain regular expression, then it's a video. Else if it's a match for another one, it's a channel. Otherwise, it's unknown. Now, let me explain what these regular expressions mean. For instance, the first one is youtube.com/watch?v=, which has to match exactly. And then there is this thing in square brackets.
What the backslash w means is match any letter, any digit, and the underscore. So it's all there in that backslash w. The only other thing that we want to match is the hyphen, so we just add it there. And then the number 11 in curly brackets is a quantifier, and it means we need to match exactly 11 occurrences of those characters. So that's exactly what we want. That's what a YouTube video URL looks like. In the second case, we still have the same square brackets, and we have the plus sign. That means it has to match at least one occurrence. So a channel name needs at least one character. And so with this in mind, we now have a solution that is correct. We're now actually matching videos, channels, or unknown. And I would say it's still pretty clear. It still reads pretty much like English. It's not complete. We're only returning the type and not the ID or the name. And it's not compact. As you can see, we have some repetition: text.match?, text.match?. So let's give it another try, and let's try to remove this duplication. To do that, we're going to use the triple equal operator (===). This is what the Ruby documentation says: following a regular expression literal with the triple equal operator allows you to compare against a string. And even more important, it's used in case statements. What that means is that if you want, you can explicitly use the triple equal to compare R dot dot dot to Ruby. That's true. But you don't have to, because if you use a case statement, then internally, Ruby is going to use that operator. So if you say case Ruby, when P dot dot dot, Ruby is actually using that operator. If it's true, then it's going to return starts with P. Otherwise, it tries the next one and returns starts with R. So it's kind of the same as before, but it allows us to remove some duplication because we don't have to repeat match?. So in short, we can take the code we had before and slightly change it to use a case when statement.
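As a rough illustration of the two steps just described, here is a sketch of String#match?, the === operator, and a case/when version of parse. The regular expressions are assumptions based on the rules stated in the talk, and the \A and \z anchors are an addition here so that the whole string has to match.

```ruby
# String#match? (Ruby 2.4+) returns true or false.
"Ruby".match?(/R.../)    # => true
"Ruby".match?(/\AR..\z/) # => false (anchored: the whole string must be R plus two characters)
"Ruby".match?(/P.../)    # => false

# Regexp#=== is what case/when calls behind the scenes.
/R.../ === "Ruby" # => true

# Assumed reconstruction of the case/when iteration: type only, no ID or name yet.
def parse(text)
  case text
  when %r{\Ayoutube\.com/watch\?v=[\w-]{11}\z} then { type: :video }
  when %r{\Ayoutube\.com/[\w-]+\z}             then { type: :channel }
  else { type: :unknown }
  end
end
```

With these patterns, youtube.com/ on its own no longer counts as a channel, because the channel name needs at least one character.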
Case text: when it matches the first regular expression, it's a video, and so on. So the code is still correct. It's still pretty clear, and now it's compact. We're not repeating any method. But it's not complete. We're only returning the type. And if you remember, we wanted to return two things. For instance, when we match a video, we want to return the type and the ID. When we match a channel, we want the type and the name. So we are maybe halfway through it, but we're still missing one part. In short, what we want is, when we have a match, to return the thing that we're matching. So if we look back at the code that we had, those 11 characters that we're matching, we want to capture them, give them a name, ID, and then return that in the result. Same thing with the channel. When we're matching a channel name, we want to return it. So how do we do that? How can we capture and name these groups? We can use another feature of the Ruby standard library called regular expression captures. What it means is that when you write regular expressions in Ruby, you can group a part of the expression in parentheses and you can give it a name. And if you do that, then when you have a match, you can ask for just that part of the match in return. This is an example from the Ruby docs again. Let's say you want to match the string $3.67 and you have a regular expression that is the dollar sign, then this group in parentheses that we call dollars, which is any number of digits, then you have a dot, and then another group called cents. Once you have a match, you can just say, give me just the dollar amount from the last match. And that is what that dollar tilde symbol ($~) means: it's a special variable that holds the last match. So you can just say, give me the dollar amount of the last match, and you're going to get 3. Give me the cent amount of the last match, and you're going to get 67.
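A small sketch of named captures and $~ as described above. The dollars and cents example follows the Ruby documentation; the parse method that uses the same idea is an assumed reconstruction, with patterns and anchors chosen here for illustration.

```ruby
# Named groups: (?<name>...) captures part of the match under a name.
/\$(?<dollars>\d+)\.(?<cents>\d+)/.match('$3.67')
$~[:dollars] # => "3"
$~[:cents]   # => "67"

# The same idea applied to the URLs: capture the ID or the name and return it.
def parse(text)
  case text
  when %r{\Ayoutube\.com/watch\?v=(?<id>[\w-]{11})\z}
    { type: :video, id: $~[:id] }
  when %r{\Ayoutube\.com/(?<name>[\w-]+)\z}
    { type: :channel, name: $~[:name] }
  else
    { type: :unknown }
  end
end
```

This works inside case/when because Regexp#=== sets $~ when it matches, just like Regexp#match does.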
So with this in mind, we can now write our code in a slightly different way. It starts to look a little complicated, but it's not very different from before. We have the same regular expressions, but now we're capturing the thing that we're matching. So in the first case, we're capturing it and calling it ID. And then if it's a match, we're going to take that and return it in the resulting hash as ID. In the second case, we're matching the channel name and then we're returning it. And so finally, we have a solution that is correct and it's also complete. So if we were not writing Ruby, we could just stop here and say, well, this just works. I don't care if it's readable, if it's complex, let's just go to some other problem. But we're writing Ruby, so we want to make things a little better. Specifically in this case, it's still not compact because we're still repeating something here. This ID, this name that we gave, we actually see it three times, and the same with name below. So is there a way not to have this repetition? Well, I'm glad you asked, because there is. It's another method that was just added in Ruby 2.4 and it's called named_captures. The concept is very similar to what we just saw with captures. It's just a more compact and elegant way. Basically, when you're capturing multiple groups, you can just call this single method named_captures and you get back all of them. You don't have to individually specify I want the dollar amount or I want the cent amount. This is the same example as before, so if we just call named_captures on the last match, we're just going to get a hash back that says dollars 3, cents 67. So let's see how we can use this in the code that we have. Let me break it down. For instance, let's say that we're trying to match the URL youtube.com/conflicts. That is the name of a channel. When it's a match, named_captures is just going to return a hash. In this case, name conflicts.
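Here is a sketch of MatchData#named_captures, including one way the type could then be attached to the returned hash. Note that named_captures returns string keys; converting them to symbols with transform_keys is a choice made for this example, not something stated in the talk, and the patterns are assumptions as before.

```ruby
# named_captures returns every named group at once, as a hash with string keys.
/\$(?<dollars>\d+)\.(?<cents>\d+)/.match('$3.67')
$~.named_captures # => {"dollars"=>"3", "cents"=>"67"}

# Assumed reconstruction: take whatever was captured and merge in the type.
def parse(text)
  case text
  when %r{\Ayoutube\.com/watch\?v=(?<id>[\w-]{11})\z}
    $~.named_captures.transform_keys(&:to_sym).merge(type: :video)
  when %r{\Ayoutube\.com/(?<name>[\w-]+)\z}
    $~.named_captures.transform_keys(&:to_sym).merge(type: :channel)
  else
    { type: :unknown }
  end
end
```

The names id and name now appear only once each, inside the regular expressions, but the named_captures call itself is repeated in every branch.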
All we need is just then to add the other part that we want, the fact that the type is a channel, and we have back what we were looking for. So in this case, name conflicts type channel. So the code is becoming a little more dense, but it's doing exactly what we want. And this is how the code looks like. When we're capturing a video URL, then we get the ID back in named captures, and we just attach type video, and we have the result that we want. When we're capturing the channel name, it's the same thing. Now, is this any better than before? Well, I don't know. Maybe we now are just hitting our head against the wall because it feels like we're not really getting anywhere. We still have code that it's not really clear. It's not compact. Now we're repeating something else. So it feels like we're iterating, but we're really not getting there. And this is a feeling that you all might experience when you're trying to make your code better. You think you're following some path that's going to lead somewhere, and then it doesn't. So normally when this happens, I take a coffee break or a lunch break, or I just go home and sleep, and then in the morning I'm just like, wow, now I know what I have to do. I have to use something completely different. So we don't have time to go home and sleep. So I'm just going to skip that part. But it does work like that. When you are looking at a problem too much, you don't see exactly why things are happening. And in this case, this repetition is happening because what we've had so far is either if, else if, else if, or case, when, when, and it's almost in the nature of these structures to have some repetition because you are listing all the options. As I said at the beginning, there are even more options of YouTube URLs. So if we had to do that, we would probably end up with some repetition. So is there a way in Ruby not to do that, and instead to have Ruby itself do that for you? Yes. There is. 
So this is really the biggest jump that we are going to make now. We're going to jump and talk about enumerables. Enumerable is a module, and really the structure that sets Ruby apart from many other languages. What are enumerables? Arrays, hashes, ranges, you know, things that you can iterate through. And Enumerable has this find method that is very powerful. Once again from the docs, if you use find, it passes each entry to the block and it returns the first for which the block is not false. So basically you have a list of objects and Ruby is going to go through all of them one by one and try to see if there's one of them that matches your condition. And as soon as it finds one, it's going to stop and return that one. The first example: if you have all the integers from 18 to 99 and you say find a number that is divisible by 17, internally Ruby is going to say if 18 is divisible by 17, then return it. Else if 19, else if 20, and so on. But you don't have to write that. And as soon as it reaches 34, it's just going to return that for you. Another characteristic of this method is that if it does not find a match, for instance, if you're trying to find a number between 18 and 29 that's divisible by 17, by default, it's just going to return nil. But you can actually override that. You can tell Ruby what to return if there is no match. You do that with that extra argument, ifnone. So in the last case, I'm just saying if you don't find a match, just return the value zero. So it's really powerful, because then you don't have to write if, else if. You just use the power of enumerables. So now jumping back to our problem, this is really the biggest change that we're going to see. We're not dealing with cases anymore. What we are doing here is recognizing that we have a list of patterns. Those are the same patterns that we talked about before. The first one is the regular expression that matches a video ID. And that is going to be of type video.
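The behaviour of find described above can be sketched with the same numbers used in the talk. One detail worth noting: the optional ifnone argument has to be something callable, such as a lambda, rather than a plain value.

```ruby
# Returns the first element for which the block is truthy.
(18..99).find { |i| i % 17 == 0 } # => 34

# No element matches, so the default result is nil.
(18..29).find { |i| i % 17 == 0 } # => nil

# With ifnone, find calls the lambda and returns its result instead of nil.
(18..29).find(-> { 0 }) { |i| i % 17 == 0 } # => 0
```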
The second one, for a channel, is of type channel. So we're basically just giving a name to these things: patterns. And then what we're telling Ruby to do is just go through these patterns one by one. Try to see if the text that you get is a match for any of these regular expressions. As soon as you find a match, just return what we wanted from before: type video, ID 123. So once again, it's pretty powerful, because now we only have eight lines of code and we don't have any method that's being repeated. The only portion that's missing is that gray line. How can we return what we want in that case? Basically, how can we tell this find method that as soon as it finds a match, it has to return something specific, what we wanted to return, and not just the first matching element? To do that, we have to use the break statement. What break does is exactly that: if you have a loop, you can use break to break out of the loop and return a specific result. In this example, in the first case, we're not using break. We're just saying stop as soon as you find the number that's divisible by 17. Find is simply going to return the number itself, 34. But if you want to return something else, for instance, the string found it, then we can just use break. We say break with that value when you find it. And so that's what we're going to do here: we're going to use break. So now this code is becoming very dense, but it's really getting to the core of it. It's saying go through all the patterns. So patterns.find, try to see if there is a match for a regular expression. If there is, stop there. Take what you captured in the match, for instance, ID and the ID of the video, and then just add the type that you matched. So if we're matching a video URL, we're going to get type video and the ID. If we're matching a channel URL, we're going to get type channel and the name.
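A sketch of break inside find, followed by an assumed reconstruction of the patterns version. The constant name PATTERNS, the regular expressions, and the conversion of the captured keys to symbols are assumptions. This sketch uses String#match rather than match?, because match? returns only true or false and gives no MatchData to read the captures from.

```ruby
# Without break, find returns the matching element; with break, any value we choose.
(18..99).find { |i| i % 17 == 0 }                     # => 34
(18..99).find { |i| break "found it" if i % 17 == 0 } # => "found it"

# Each regular expression is paired with the type it stands for.
PATTERNS = {
  %r{\Ayoutube\.com/watch\?v=(?<id>[\w-]{11})\z} => :video,
  %r{\Ayoutube\.com/(?<name>[\w-]+)\z}           => :channel
}

def parse(text)
  PATTERNS.find do |regex, type|
    match = text.match(regex)
    # Stop at the first pattern that matches and return captures plus type.
    break match.named_captures.transform_keys(&:to_sym).merge(type: type) if match
  end
end
```

As written, this version returns nil when no pattern matches; nothing in it produces a type of unknown yet.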
If we're not matching anything, if you remember, we want to just return type unknown, and we can just put it there. We can just use this extra argument of find to just say this is what you can return if there is no match. And so this is getting close to really being very compact. You can actually make it even more compact if you inline the if condition. And this is as compact as it can get, probably. It's still correct. It's still complete. But now we're dealing with clear, like is it readable? That is probably the most subjective of those characteristics. Some people might find it readable. Some people like compact code, very dense. Other people like to be a little more verbose. So I don't have an answer for that. What we can try to do is we can see if we can do a last iteration to make it a little more explicit. So for our last iteration, I'm just going to take a step back and remind ourselves where we wanted to get. We wanted to have a method that is able to parse all these different formats of URLs. Now in the code that we have so far, we're actually only dealing with two of those formats, the video ID and the channel name. If we had to put all of them in this code, it would actually look more like this. And now it starts becoming even not readable because I guess it's a very long hash. And, you know, this pattern structure that we identified, now it doesn't really have a specific meaning. It's including many patterns. We have patterns for videos, patterns for channels, patterns for playlists. So just for ourselves, just to make the code more readable, what is important and what's also one of the hardest things to do is to name things. Since we have identified that some of these are video patterns and some of them are channel patterns and playlist patterns, it actually makes sense to separate them in different constants. Because if we give them a name, then we kind of don't even have to read the regular expression. 
We just say, oh, this is the list of video patterns. If YouTube ever adds a new format, we can just add it there, but we don't have to actually read the regular expressions. Once it's there, it's there. We have channel patterns and we have playlist patterns. So it's very declarative. It's very readable. The only step that's missing is that we want an object that's basically going to go through all of them. First it's going to try all the video patterns one by one and stop as soon as it finds a match. Then it's going to go through all the channel patterns and so on. And so this is the very last method from the Ruby standard library that I'm going to introduce today, and that is Enumerator.new. What this method does is create a new enumerator object, which can be used as an enumerable. So it's an object that you can tell Ruby to iterate through. And this is what we want here. We want an object that basically first goes through all the video patterns, then through all the channel patterns and playlist patterns. And this is how you write that code. You just use Enumerator.new, and once again it's very declarative. Inside the block, you say, well, first for each video pattern, add it to the patterns, and if it's a match, the type is video. Then do the same for all the playlist patterns and for all the channel patterns. And so here we are now with this code that might be too dense. As I said before, some people might prefer another format and I'm happy to discuss this with all of you after this talk. But I think it has some advantages. The first one is that really all the first lines of code are just declarations. We're just saying these are video patterns, channel patterns, playlist patterns, and patterns. So that part really is going to stay as it is. And the bulk of your code is just the last three lines of code. That's where we're doing the real iteration.
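The final shape described here might look roughly like the sketch below. The constant names follow the talk's wording, but the individual URL formats (youtu.be, /channel/, /playlist?list=) and the regular expressions are assumptions, since the slides are not part of the transcript.

```ruby
# Declarations: one list of regular expressions per kind of resource (formats assumed).
VIDEO_PATTERNS = [
  %r{\Ayoutube\.com/watch\?v=(?<id>[\w-]{11})\z},
  %r{\Ayoutu\.be/(?<id>[\w-]{11})\z}
]
CHANNEL_PATTERNS = [
  %r{\Ayoutube\.com/channel/(?<id>[\w-]+)\z},
  %r{\Ayoutube\.com/(?<name>[\w-]+)\z}
]
PLAYLIST_PATTERNS = [
  %r{\Ayoutube\.com/playlist\?list=(?<id>[\w-]+)\z}
]

# An enumerator that yields every pattern together with its type, in order.
PATTERNS = Enumerator.new do |patterns|
  VIDEO_PATTERNS.each    { |regex| patterns << [regex, :video] }
  CHANNEL_PATTERNS.each  { |regex| patterns << [regex, :channel] }
  PLAYLIST_PATTERNS.each { |regex| patterns << [regex, :playlist] }
end

# The actual logic: stop at the first match, or fall back to type unknown.
def parse(text)
  PATTERNS.find(-> { { type: :unknown } }) do |regex, type|
    match = text.match(regex)
    break match.named_captures.transform_keys(&:to_sym).merge(type: type) if match
  end
end
```

For example, parse("youtu.be/gknzFj_0vvY") would give a hash with type video and the ID, while parse("example.com") falls through to type unknown via the ifnone lambda.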
So if the first lines seem a little scary, those are just things that you're going to write once, and they're going to stay like that unless YouTube adds more formats, and then you're just going to add them there. And your real logic is just those three lines at the bottom. So in my opinion, this code is correct and it's complete for sure. It's compact for sure, maybe too much. And in my opinion, it's clear because of this, because it's really defining things as they come. And instead, if you have like an if, else if, or a case statement, you can't escape from it. Like as you're reading, you have to read all the conditions diagonally, in a way. And so this is basically where I'm going to stop. I'm just going to add one very last thing. Those patterns that I typed at the top just say youtube.com. But really, YouTube URLs can also be www.youtube.com. They can be http://youtube.com or https://youtube.com. You don't want to add all of those variants at the top. You can just tell match to use an interpolated regular expression with optional groups, which basically matches every regular expression and also these variants. Whether there is www or not, it's still going to be a match. So I know that was a lot after lunch. But I really wrote this talk to just give you an overview of the power of some methods that are inside the Ruby standard library. And as somebody said yesterday in another talk, Ruby is very powerful and you can do amazing things with it, both in a good way and in a bad way. You can just play around. There are no rules. These are some of the concepts that I introduced. And I think this was a great journey to have, because if you guys have different opinions and if you have seen something that gave you some other ideas, just feel free to talk to me. That's all I have. And once again, the slides are available. And I'm up for questions if you have any. Thank you. The question is, would I consider extracting those into separate classes? Yes, I would.
I tried to keep it compact in this talk. But yeah, that's a good idea. For instance, those video, channel, and playlist patterns can definitely be their own classes. So you can play around with it. I really can't see anything from here. But if you have questions, just come find me. I'm going to be here for the rest of the conference, so just hit me up. Thank you.