Alright everyone, I think we're going to go ahead and get started. This is "The Features of Rails 5 You Haven't Heard About." My alternate title was "Sean Dramatically Reads the Changelog," but I don't have a great dramatic voice, so instead this will be Sean undramatically reading the changelog. Actually, the main focus of this talk is less about the features themselves and more about the stories behind implementing them. We're going to focus on the smaller quality-of-life changes, because those are the things I get excited about every time Rails updates: the things that aren't going to fundamentally change how I build applications, but that are going to make my life just a little bit easier. The smaller tools, the new protections, the performance improvements. A lot of them have very interesting stories about why they were added, or in some cases why they weren't added sooner. So that's what we're going to talk about today. My name is Sean Griffin. I work full time on open source; I'm a committer on Ruby on Rails, and Shopify sponsors me to do open source full time, so thank you, Shopify. As I was saying, every feature has a story: what went into implementing it, why it was added, and in the case of things like Relation#or, why it wasn't added for as long as it wasn't. The first feature I'd like to talk about is one I've spoken a lot about in the past: the typed attributes API. It's actually kind of interesting to include this in a talk about Rails 5, because most of the work that went into implementing it was done in Rails 4.2. In 4.2 we revamped the entirety of how Active Record handles type coercion, in preparation for this public API.
And the API was 90% implemented in 4.2, minus a handful of edge cases. I've talked at length about the API itself in the past, but I don't think I've ever told the story of why this feature was added. There were a lot of things that led up to it, but one particular project stood out as the thing that really convinced me we needed it. It was something I was working on while I was still at thoughtbot. The project had the requirement that all data needed to be encrypted at rest, even if our database credentials were leaked. So if you actually had access to the database, you still needed to be unable to read anything. We started with a gem called attr_encrypted, and this is what using that gem looks like: you call attr_encrypted and give it the name of an attribute. It assumes the database column is called that attribute name plus "_encrypted", defines a reader and a writer for the unencrypted form, and performs the encryption in Ruby. The problem is that it doesn't work with methods like where or find_by. There was an escape hatch at the time, which has since been removed: the dynamically defined finder methods, in the find_by_<attribute> form, would still work. But that isn't how we tend to write Rails applications anymore, and hasn't been since Rails 2.3. So using this gem would mean our entire application would have to know that this field happened to be encrypted in the database. We ended up using gems like Ransack for complex search, and so the attributes API wouldn't even have fixed what this specific project needed, because we needed to be able to do things like "less than" or LIKE, which don't work terribly well when something is encrypted.
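To make the pattern concrete, here is a minimal sketch of the attr_encrypted style of macro described above. This is not the real gem's implementation; the XOR "cipher" and the `SECRET` constant are stand-ins for illustration only.

```ruby
# A sketch of the attr_encrypted pattern: a macro that stores ciphertext in
# `<name>_encrypted` and exposes a plain reader/writer. Illustrative only --
# the XOR cipher here is NOT real encryption.
require "base64"

module AttrEncrypted
  SECRET = "not-a-real-key" # assumption: a real app would load this from config

  def attr_encrypted(name)
    # Reader: decrypt the value held in the `<name>_encrypted` storage
    define_method(name) do
      raw = instance_variable_get(:"@#{name}_encrypted")
      raw && AttrEncrypted.decrypt(raw)
    end
    # Writer: encrypt in Ruby before the value ever reaches the database column
    define_method(:"#{name}=") do |value|
      instance_variable_set(:"@#{name}_encrypted", AttrEncrypted.encrypt(value))
    end
  end

  def self.encrypt(plain)
    Base64.strict_encode64(xor(plain))
  end

  def self.decrypt(encoded)
    xor(Base64.strict_decode64(encoded))
  end

  def self.xor(str)
    key = SECRET.bytes
    str.bytes.each_with_index.map { |b, i| (b ^ key[i % key.size]).chr }.join
  end
end

class User
  extend AttrEncrypted
  attr_encrypted :ssn
end

user = User.new
user.ssn = "123-45-6789"
user.ssn # round-trips, but the stored @ssn_encrypted holds only ciphertext
```

This is exactly why `User.where(ssn: "...")` can't work here: only Ruby knows how to decrypt, so there is no way to build a SQL predicate against the `ssn_encrypted` column.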
So we ended up writing a massive hack using the pgcrypto extension: we would walk the entire Arel AST, look for any binary nodes where the left side was an Arel attribute, and replace it with a node that encrypted or decrypted it. It was some of the dirtiest code I've ever written. Anyway, right around this time, this guy gave a talk at RailsConf. I just feel like that's a picture that needs a "this guy." Ernie Miller was giving a talk about some of the funkier parts of the Active Record internals, and one of the things he mentioned was that it would be really nice if, when Active Record defined the accessor methods to match your database schema, it were just calling some public API for you automatically. And I agreed; it sounded a lot like what I had been looking to do for quite some time, and I really agreed with his sentiment on how it should work under the hood. Fast forward eight months, and I got into Rails to start adding this feature, vastly underestimating the amount of work it would take. We had a working implementation of the API pretty quickly, but here's what bothered me about it: at the time, in Rails 4.1, type casting happened by grabbing the column object associated with a field and calling a type-cast method on it. Types in 4.1 were basically just represented as symbols. When we load the schema from the database, we look at the SQL string that represents the type in the database and create a simplified type from it, represented as a symbol. The column would then have a case statement on that symbol and choose its behavior based on it. The initial implementation got rid of the case statement and replaced it with objects we could inject, but ultimately, in the earliest implementations, type casting still went through columns.
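For context, here is a simplified sketch of that 4.1-era shape: a column carrying a type symbol, with a case statement choosing the casting behavior. The names and the handful of cases are illustrative, not Rails' actual internals.

```ruby
# A simplified sketch of pre-4.2 type casting: the column holds a symbol
# derived from the SQL type string, and a case statement on that symbol picks
# the behavior. Illustrative only -- not Rails' real Column class.
Column = Struct.new(:name, :type) do
  def type_cast(value)
    case type
    when :integer then value.to_i
    when :string  then value.to_s
    when :boolean then %w[1 t true].include?(value.to_s) # abbreviated list
    else value
    end
  end
end

Column.new("age", :integer).type_cast("42") # casts the string to 42
```

The 4.2 refactoring replaced this case statement with injectable type objects, one per type, each owning its own casting behavior.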
And when you called the attributes API at that point, we would actually create a new column object, stick it in the columns hash, and basically lie. That especially bothered me when the API was being used for something that wasn't backed by a database field, because there isn't even a column involved there. There were a lot of conflated concerns about what a column versus an attribute was. So getting the initial API working was actually pretty simple, or fast at least, but then there was another eight months of work to get to an implementation I was really happy with, and it required rewriting just about every part of Active Record; with the exception of the association code, everything was touched by this change in some way. But we have finally gotten to a point where it's a very clean and simple implementation. This is what the final API looks like: you call the class macro attribute, you give it the name of the attribute, and you give it a type object. I like to give examples where we pass an explicit type object, especially one with constructor arguments, because that was one thing I really wanted to go for with what I like to jokingly refer to as my over-engineered, extremely complicated replacement for attr_accessor. I wanted to work with objects here, because the semantics of how you do things like dependency injection are very clear when you're just passing an object, instead of having a magic DSL with symbols and registries. And this is a very universal API. One of the things I tried to do when restructuring the code: we still need to get type information from a column at some point. When you load up your model, we go ask for the database schema and then call this API for you automatically. So at some point we still need to get the type information from the column, which is kind of, sort of, how it worked in 4.1.
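The final API's shape can be re-created outside Rails so it runs standalone. `Money`, `MoneyType`, and the tiny `attribute` macro below are hypothetical stand-ins for the real ActiveRecord::Attributes machinery; the point is the pattern of passing an explicit type object with constructor arguments.

```ruby
# A plain-Ruby sketch of the attributes API's shape: a type object with a
# `cast` method, injected into a class macro. Names are illustrative.
Money = Struct.new(:cents, :currency)

class MoneyType
  def initialize(currency: "USD")
    @currency = currency # constructor arguments: plain dependency injection
  end

  # Cast user input (string, integer, or Money) into our value object
  def cast(value)
    case value
    when Money then value
    else Money.new(value.to_i, @currency)
    end
  end
end

module Attributes
  def attribute(name, type)
    define_method(name) { instance_variable_get(:"@#{name}") }
    define_method(:"#{name}=") do |value|
      instance_variable_set(:"@#{name}", type.cast(value))
    end
  end
end

class Product
  extend Attributes
  # The shape of the real macro: an attribute name plus an explicit type object
  attribute :price_in_cents, MoneyType.new(currency: "USD")
end

product = Product.new
product.price_in_cents = "1000"
product.price_in_cents # => #<struct Money cents=1000, currency="USD">
```

Because the type is just an object you pass in, swapping it for a test double or a differently configured instance requires no registry and no magic DSL.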
And so I really wanted to make sure the code was engineered in such a way that other people getting into the code base wouldn't mistakenly use the column object in that way, and that there was only one way to get type information. One of the big benefits of the attributes API over just defining a reader and a writer is that it works with methods like where. If you have a price column and you're representing it as a money object, you'll probably want to be able to do where(price: ...) and pass it your money object. And this is what the schema inference code looks like: it's exactly as I was describing it. We just loop through all of the columns and call public API for you. define_attribute is also public API. It's a little different from attribute in that it's strict, whereas attribute is lazy: it waits until the columns have been loaded and this code has run, so that you can override what the database said if you ask us to. Anyway, the reason the feature took so long was that I didn't want to ship it while I was unhappy with the implementation. In particular, the fact that we were modifying the column objects was technically visible through public API, and I didn't want to commit to that. But now we're in Rails 5, and this is finally a feature that's available. It's what got me into open source, and it's why I decided we needed another full-timer on Rails, so it's got a special place in my heart. Anyway, I've talked a lot about this in the past and I don't want to spend too much time on it. So let's move on to the single most frequently requested feature since Rails 3. I'd seen this proposed so many times, and I'm sorry this took so long to get in. But I'm really excited about Active Support left pad. Finally, we provide you a way to add padding to the left side of a string.
And we feel this feature is so important that we will actually be shipping it as a separate gem, independently of Rails, and we're really hoping we can get everybody in the Ruby ecosystem to depend on this great new feature. Sorry, I don't know, is it too soon? No, but this probably actually is the most requested feature: ActiveRecord::Relation#or. Finally, six years after Relation was introduced, you can add an OR expression to your where clause. And with a response like that, it raises the question: what took so long? Why didn't we just do this in the first place? There are a number of reasons, but the biggest one is that the right API is actually much less obvious than you might imagine. One thing people often forget when dealing with open source projects is that it's extremely difficult to change an API, or to remove one, after it's been introduced. We can't just ship something that's good enough, or that gives an escape hatch right now, unless we're really confident it won't get in the way later. So we really needed to be confident we had the right solution. This is the API we shipped: or is a method on Relation, and it takes another relation as its argument. And this is the intended use case: if we have two named scopes, one called recent and one called pinned, and we want anything that is recent or pinned to be on the front page of our blog, then you create another scope that does recent.or(pinned). This is the key to why we chose this specific design: we wanted to allow you to reuse named scopes. If something appears as half of an OR, it is likely to also be used independently of the OR. So we really wanted to optimize for named scopes. Or, more importantly, we wanted to optimize for composition and abstraction. We wanted to give you a tool that lets you write code that is easy to reuse and easy to change.
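The composition semantics can be shown with a toy in-memory relation. This is not Active Record code (the real implementation merges SQL where clauses); it only illustrates why taking a whole relation as the argument lets named scopes compose.

```ruby
# A toy in-memory relation illustrating the shape of Rails 5's
# `relation.or(other_relation)`. Illustrative only -- predicates over an
# array stand in for SQL where clauses.
class Relation
  attr_reader :records, :predicate

  def initialize(records, predicate = ->(_) { true })
    @records, @predicate = records, predicate
  end

  # Chained wheres AND together, as in Active Record
  def where(&block)
    Relation.new(records, ->(r) { predicate.call(r) && block.call(r) })
  end

  # `or` takes a whole relation, so existing scopes can be reused as-is
  def or(other)
    Relation.new(records, ->(r) { predicate.call(r) || other.predicate.call(r) })
  end

  def to_a
    records.select(&predicate)
  end
end

Post = Struct.new(:title, :pinned, :age_in_days)
posts = [Post.new("a", true, 30), Post.new("b", false, 1), Post.new("c", false, 90)]
all = Relation.new(posts)

# Two "named scopes"
recent = all.where { |p| p.age_in_days < 7 }
pinned = all.where { |p| p.pinned }

# The intended use case: reuse both scopes as the halves of an OR
front_page = recent.or(pinned)
front_page.to_a.map(&:title) # => ["a", "b"]
```

(Defining a method literally named `or` is legal Ruby, because keywords are allowed as method names after `def` and after an explicit receiver; this is exactly what Active Record does.)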
And named scopes are how we encapsulate queries. There were a bunch of other proposals for this API. The most common one we saw was a plain or method that took a hash and behaved exactly like where. There are a couple of big problems with that, though. The first is that or doesn't imply where at all; WHERE isn't even the only place in SQL an OR can appear, since it can also appear inside your HAVING clause. And since Relation sort of exists in the realm of set theory, it would be entirely reasonable to think that or meant a UNION query. The next most common proposal solved that problem by making it .where.or, the way we have .where.not. Sure, that disambiguates it at the very least, but it didn't solve our goals for the API: you wouldn't be able to easily reuse scopes. There are some drawbacks to what we landed on. Specifically, if you're outside of a named scope, if you're just constructing your OR ad hoc in a controller, it's a little more verbose, because you have to repeat the class name. And it's even worse if you're not using named scopes at all and you're just calling where for a really one-off thing; that gets a little funky. But at the end of the day, what we're encouraging is for you not to have these ad hoc, one-off queries. We're encouraging people to move things into scopes, to give them a name, to put them somewhere where the code can be reused and thought about. And I'm okay with that trade-off. So now let's talk about something completely different. Show of hands: how many of you do this? Set your DATABASE_URL to your production database, and then drop the database. Anyone? Anyone? No? Come on, it happens all the time, right? No, but it actually does happen all the time. This is a very common mistake that I've seen many reports of. I know some people in this room have done it. I have as well.
The way it usually happens is this: someone uses a tool to pull all of the environment variables down from their production server to get their local environment set up. If they're on something like Heroku, that likely includes DATABASE_URL. If they forget to unset it and then run their test suite, and they're using Capybara with a server that runs in another thread, so they use Database Cleaner to delete everything at the end of each test, well, they just got rid of their production database. So we decided this was happening frequently enough that it was time to do something about it. The first thought was: let's just remove DATABASE_URL support. We run database.yml through ERB, so if you wanted to use DATABASE_URL for production, you could just specify that that's how you want to configure it. But when I say that, what I'm actually saying is: let's break Heroku and their entire workflow. Richard Schneeman was the champion of this feature. He got everybody together, and we had a little meeting to try to think of alternate solutions. He put a lot of time and effort into fixing this particular problem in a way that didn't break Heroku, and it's not just Heroku, but anything on container-based infrastructure that wants Rails to just work out of the box. What we ended up doing is a little more complex. In addition to schema_migrations, the table we create for you automatically, we now have a general-purpose metadata table. It just has two columns, key and value, and right now the only entry you'll find in it is the name of the environment that the last action was run against. So you will probably never see this.
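The protection logic can be sketched in a few lines. The class names and message below are illustrative, not Rails' exact internals; the idea is just: record the environment in the metadata table whenever migrations run, and refuse destructive tasks when it doesn't match.

```ruby
# A sketch of the environment protection described above. Names are
# illustrative -- this is the idea, not Rails' real implementation.
class EnvironmentMismatch < StandardError; end

class MetadataStore
  def initialize
    @table = {} # stands in for the key/value metadata table in the database
  end

  # Called whenever migrations are run: remember which environment this is
  def record_environment(env)
    @table["environment"] = env
  end

  # Called before any destructive task (drop, purge, test-suite cleaning)
  def check_protected!(current_env)
    stored = @table["environment"]
    if stored && stored != current_env
      raise EnvironmentMismatch,
            "You are attempting to run a destructive action against your " \
            "'#{stored}' database while running in '#{current_env}'."
    end
  end
end

store = MetadataStore.new
store.record_environment("production")
store.check_protected!("production")    # fine: environments match
# store.check_protected!("development") # would raise EnvironmentMismatch
```

If the stored value is missing entirely (a fresh or just-upgraded database), the sketch lets the action through, which mirrors why Rails asks you to run one command first so it knows what the database is.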
When you upgrade to Rails 5, if you try to do something potentially destructive with your database before running a migration, you'll get an error saying: hey, we actually don't know what this database is; just run this one command and then we'll know. In theory, if the very first thing you do after upgrading to Rails 5 is something that would have dropped your production database, you might still get hit in some strange way. But basically, when we run migrations, we keep track of the environment we know we're in, and if you ever try to do something that could destroy the database and the environment doesn't match up, we error out. And there's an "enable footgun"-style environment variable you can set to override the check if you really need to. This is the message you'll see, if you ever see it at all. But hopefully you'll never notice this feature. This is one of those minor things where, if we've done our job right, you won't be aware it exists until you accidentally try to drop your production database. Rails is now silently protecting you from a surprisingly common mistake, and one that's really catastrophic if it happens and you're not prepared for it. So I think this is actually one of the biggest features of Rails 5. Thank you very much, Richard; I don't know if you're in here, but thank you for working on it. So now let's move on to, again, something completely different: migrations. Here's a question that isn't asked terribly often: can you still run your migrations from two years ago? Try having a new developer come on and, instead of running db:setup, actually run all of the migrations from the beginning. Notice how schema.rb changes when that happens. This has been a problem for a while. Migrations are code that sticks around and doesn't change. And it's not code that you test.
And it's not code where you're going to see the breakage if a new version of Rails has made a breaking change; you're unlikely to ever run it again. For most product companies, we have development and production: there's exactly one production environment and N development environments, and when you write a migration, it's run in production very shortly after the code was written. But we're seeing more and more use cases where that's not the case. Discourse and ManageIQ are both places where the Rails app is a product that's meant to be deployed independently. They actually have a larger number of production environments, an unknown number, and they don't directly control them. So it's entirely possible, much more feasible there, that even aside from the two-year-old-migration problem, if Discourse ships an update with a new migration, and then ships another version that upgrades to Rails 5, the person updating might have skipped a version of Discourse and is now running a migration written for 4.2 against Rails 5. And if we ever change those APIs, we can cause breakage. But the thing is, we actually want to make changes. Things have changed a lot: we've added foreign key support, and we've improved the kinds of indexes we can create. But we can't, for example, create a foreign key by default, because that would be different from what happened when you ran that migration against an older version of Rails. So we generate new migrations with foreign_key: true. belongs_to associations are now required by default, but we can't change the references method in the Migrations API itself, so we generate null: false there. It all adds up, and migrations just have a lot more options on them when you create them now.
And those are some of the more visible examples, but there are deeper changes we would eventually like to make to the structure of the Migrations API that we simply couldn't do, because they would be not just subtle changes in behavior, but really breaking. So when you generate a migration in Rails 5, this is what you'll see: square brackets after the class name, containing the version of Rails you generated it against. If that's not there, we assume 4.2; there's really not much we can do for migrations older than 4.2. But basically, you can be assured that the migrations in your application today, as long as you're just writing them through the Migrations API, and not referencing your models or other code elsewhere in your app that changes (which some people do, and which is usually not a good idea), will continue to work exactly as they do now, until the end of time. That means this feature isn't free, because we are now committed to maintaining every version of the Migrations API in our code base until the end of time. Fixing bugs is going to be harder: we'll have to think about how a fix affects older versions of the Migrations API, if it affects them at all. We'll have to change our test suite to run more and more versions of the migration DSL. And a larger surface area does mean more bugs. One of the things people forget is that open source maintainers aren't magic. We don't have an infinite amount of time, and we don't magically produce code that is free of bugs. The more complex a feature or API we add, the more prone it's going to be to bugs. So this does come with a large maintenance burden.
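The mechanism behind those square brackets can be sketched in plain Ruby: a class-level `[]` method that returns a version-specific (anonymous) subclass. The compatibility behavior pinned below is purely illustrative; the point is the shape of `Migration[version]`, not what Rails' real compatibility shims do.

```ruby
# A sketch of how `Migration[version]` can work: `[]` is a class method that
# returns an anonymous subclass pinning version-specific behavior.
# The `references_defaults` shim is a made-up example, not Rails' real code.
class Migration
  def self.[](version)
    Class.new(self) do
      # Each versioned subclass can freeze whatever behavior that version had
      define_method(:references_defaults) do
        version >= 5.0 ? { foreign_key: true } : {}
      end
    end
  end
end

# What generated migrations then look like:
class AddPostsOld < Migration[4.2]; end
class AddPostsNew < Migration[5.0]; end

AddPostsOld.new.references_defaults # => {}
AddPostsNew.new.references_defaults # => { foreign_key: true }
```

Because `[]` returns a class, the familiar `class Foo < Migration[5.0]` syntax needs no parser tricks at all; it is ordinary inheritance from whatever object the method returns.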
We might decide in the future that we're only going to support migrations up to a certain number of versions back, and at that point we'll probably provide commands to help squash all of your migrations and move them to the new API. But for now, the plan is to maintain every version of the Migrations API until the end of time. Which does mean we're still going to be pretty rigorous about whether a breaking change to the Migrations API is worth it, because each one still adds to the maintenance burden. But there are a lot of use cases where this is a really important feature with no way to work around it, so I think it's worth the trade-off. So, we've got an interesting little method in Active Record now that most people probably haven't seen in many blog posts: accessed_fields. It tells you all of the fields on an Active Record object that were accessed; I think it's well named. You create a new user model, and accessed_fields returns an empty array, because you've accessed nothing (assuming you're not doing anything in callbacks). Then if you call .name, accessed_fields returns an array containing just that field. So you might be asking: why? One of the patterns I've noticed is that we never call select unless we're doing some weird calculation. If we're rendering a view that only ever shows a list of users' names, we still select all however-many columns, and on a table like users, that could approach 100. There is a performance cost there. And one of the big reasons is that, as I've heard people say: "I don't want to have to go look at my views, figure out which fields I'm accessing, and then maintain that." So I figured, all right, let's see if we can make that easier. The workflow I'm envisioning is this: at the end of your controller action, call accessed_fields, take that array, and copy-paste it into select. Bam, you're selecting less data.
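Both the workflow and the implementation trick (described just below) fit in a short sketch. `ToyRecord` is a stand-in model, not Active Record's real code: the point is that casting lazily into an instance variable gives you the "accessed" marker for free.

```ruby
# A toy model sketching `accessed_fields`: a field memoizes into an instance
# variable on first read, and accessed_fields simply reports which of those
# instance variables are defined. Illustrative, not Active Record's code.
class ToyRecord
  def self.field(name)
    (@fields ||= []) << name
    define_method(name) do
      # Cast/memoize lazily on first read -- the ivar doubles as the marker
      unless instance_variable_defined?(:"@#{name}")
        instance_variable_set(:"@#{name}", @row[name])
      end
      instance_variable_get(:"@#{name}")
    end
  end

  def self.fields = @fields

  def initialize(row)
    @row = row # raw values, as if loaded from the database
  end

  # A field has been accessed iff its memoization ivar exists
  def accessed_fields
    self.class.fields.select { |f| instance_variable_defined?(:"@#{f}") }
  end

  field :name
  field :email
end

user = ToyRecord.new(name: "Sean", email: "sean@example.com")
user.accessed_fields # => []
user.name
user.accessed_fields # => [:name]
```

The envisioned workflow then falls out: render the action, call `accessed_fields` at the end, and paste the result into a `select` call.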
If you change your view and try to access a field you didn't select, we give you an exception. So when you get that exception all of a sudden, you delete the call to select, add the call to accessed_fields back, and do it over again. You don't really ever have to maintain it by hand; we can just tell you. At the time, I was also envisioning a Bullet-like gem that you could run in the background, which would warn you whenever a loaded Active Record model used less than, say, 50% of the fields it selected, because that's an indication there might be a serious performance win available. But I forgot about that until the last time I gave this talk, so it hasn't been done yet. Part of the reason I did this was that it was an experiment, but also that it was really, really trivial to implement. Thanks to the refactoring that went into the attributes API, there's one place where type casting happens, and we just assign an instance variable with the result of the cast. So I can tell whether something has been accessed just by checking whether that instance variable is defined. I don't know, I think it's a neat little feature. The fact that it was so easy to implement made me feel much better about the architecture we ended up with in Active Record after the type casting changes. So selecting a little less data is now a little easier to do. Next, let's talk about booleans, everybody's favorite data type. I've got a question for you. Somebody in the front said false. No, I know it's nobody's favorite data type. I like them: they're simple, there are two values, they're so easy to handle. So let's take a look at this code. We have a boolean field and a regex, and we're going to assign the result of whether or not the regex matches a string to the boolean field. In this case, it's very clear that the string will match the regex.
What do you think the value of the field is going to be? No, .match returns a MatchData. Anyway, I know a lot of you are probably thinking "false," because why would I be asking if the answer weren't false? A friend of mine from Albuquerque, Jeff Petrie (this is what he looks like), actually sent me this question. His code used the squiggly operator (=~) instead of .match; the reason I switched the example to .match is that the =~ version has a different caveat, which I'll get to in a minute. But basically, the problem is that in 4.2, and more specifically in 4.1, the boolean type worked like this: we had a constant holding all of the true values. It included the string "1", which is what we get from the form builder; the integer 1, because why have the string and not the integer; the string "t", I'm assuming because that's what Postgres emits and at some point that value passed through untouched, though that hasn't been true for a while; and the string "on", which is what SQLite and, I think, Oracle and a bunch of others use. Everything not in this array was assumed to be false. And here's the problem: .match returns a MatchData object. A MatchData object is not in the array of true values, so it's false. The squiggly operator, meanwhile, returns the index at which the match starts. So, interestingly enough, if the match happened to start at the second character of the string, =~ would return 1 and the field would be true, but every other value is false. So we were looking through this, asking: how on earth do we just get a boolean yes-or-no "does this regex match?" It turns out triple-equals is the answer: === is the only method on Regexp or String that gives you the result as a real boolean. So this was the full list of values that were truthy in Rails 4.1 and earlier. That's not how Ruby works.
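These return values, and the cast on each side of the change, are easy to verify in plain Ruby. The two constants below are abbreviated versions of the real lists; `old_cast`/`new_cast` are sketches of the 4.1 and Rails 5 boolean casts, not Rails' actual methods.

```ruby
# The regex return values at the heart of this story:
"abc".match(/b/)  # returns a MatchData, not a boolean
"abc" =~ /b/      # returns 1, the index where the match starts
"bcd" =~ /b/      # returns 0, a match at the very first character
/b/ === "abc"     # returns true -- the only variant yielding a real boolean

# Sketch of the cast change: 4.1 had a known-true list (everything else
# false); Rails 5 flips it to a known-false list (everything else true).
OLD_TRUE_VALUES  = [true, 1, "1", "t", "T", "true", "TRUE", "on", "ON"]
NEW_FALSE_VALUES = [false, 0, "0", "f", "F", "false", "FALSE", "off", "OFF"]

def old_cast(value)
  OLD_TRUE_VALUES.include?(value) # MatchData isn't in the list => false!
end

def new_cast(value)
  value.nil? ? nil : !NEW_FALSE_VALUES.include?(value)
end

old_cast("abc".match(/b/)) # => false -- the surprising 4.1 behavior
new_cast("abc".match(/b/)) # => true  -- MatchData is truthy, as in plain Ruby
new_cast("bcd" =~ /b/)     # => false -- the remaining 0-index catch
```

Note the last line: a match at index 0 still casts to false under the new list, which is the leftover caveat with `=~`.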
We don't have a known set of true values with everything else false. That's awful, right? The way it works now still isn't exactly how Ruby works either, but we're closer. In Ruby, nil is falsey, false is falsey, and everything else is truthy. So we flipped it around: there is now a known set of falsey values. Ironically, this doesn't quite fix the original squiggly-operator problem, because if the match starts at the first character of the string, =~ returns 0, and 0 is on that list. So there's still a catch in there. Anyway, I've been saying 4.1 because you would have gotten a deprecation warning in 4.2 if you were affected by this. So don't panic; I'm sure somebody is thinking, "oh my god, we're assigning regex matches to the admin field of our user model and now everybody's going to be an admin." No: if you had an ambiguous value in 4.2, you got a deprecation warning. So this is another one of those minor things you will probably never notice, and hopefully now you will definitely never notice it, because if you noticed the behavior before, it was doing something really confusing and really wrong, and I'm glad we're not doing that anymore. Anyway, those are the stories I was hoping to share with you. While I have a captive audience, I'd like to pitch my new ORM written in Rust. It's called Diesel. I have stickers; if you'd like one, come find me afterwards. This is our website. Big thank you to Shopify: they sponsor me to work on open source full time, and they pay for me to come to all of these conferences and speak to you. If you'd like to work with Rafael and me, who both work at Shopify, please let us know. Thank you very much. I will now take questions. Yes, the question was whether you can chain .or calls. You can chain .or, and you can combine it with AND, which is still just .where.where.where. You can chain them. The precedence will be basically exactly what you've passed to us.
The question was whether there's been talk about adding an explicit .and to mirror .or. No. I'm not sure that's something that should be added, but it's an interesting thought; I'd need to look at how it affects the rest of the API. But no, there has not been an explicit discussion about it. Is there a specific reason nil isn't a falsey value? Yes: nil is NULL. If your column is NOT NULL, then nil will still be inserted as NULL, and so it'll blow up. The question was whether there's a plan to put accessed_fields on Relation. No, because the implementation would just be calling it on the first element, and I can't really know that you're going to use every model in the collection the same way; it's very easy for you to do .first.accessed_fields if that's what you need. The question was what the metadata table I mentioned is called, and how to prepare if you're worried it might collide. I don't remember the exact name, but I believe it's something like double-underscore, active_record, underscore, metadata, so I think it's pretty unlikely you'll collide with it. If you do have a table like that, I'm interested in how you got a model name to match up with it. So I don't expect it'll collide with anything, but that concern was raised, and that's why there's the double underscore in front of it. The question was: a common practice is to pull the production database down locally, and how does that interact with our protection against destructive actions? Tools meant to be used with Rails for that sort of thing should just update to exclude the metadata table, similarly to how I imagine they already special-case schema_migrations. Otherwise, what will happen is you'll get that error message, saying we think this is production and you're trying to run this in development.
Please run this command to update it if you're seeing this in error. So you would just need to run that after you pull the database down. The question was whether dropping the metadata table would also get rid of that error. Yes, as long as you run a migration before you try to do something destructive; otherwise you'd get the same error, because we don't know what the database is. The question was how the square brackets on migrations are implemented. We just implement the square-brackets class method on the class object itself and return a new anonymous class. It works very similarly to gems like Equalizer, which implements the equality methods for you: with that gem you do include Equalizer.new(:foo, :bar), and this works very similarly to that. We just generate a new anonymous class. The question was whether I have any ideas for Rails 6 and things that are going to require major changes, and I'm assuming you specifically mean changes in the connection adapters. Yeah, the main thing I want to do is get the third-party connection adapters onto public API, so that there's more stability. Unfortunately, that will have to be a breaking change, but I'm hoping to basically re-implement a lot of the connection adapters on top of an API that we think can be more stable in the long term, and then work with the third-party adapters to get them over to it. The only other thing I'm specifically thinking about for Rails 6 right now is moving the attributes API up to Active Model, which will involve some breaking changes: things like dirty checking will need to go through the attributes API, and won't just work with a regular attr_accessor. I don't think that will have a huge impact, but I need to explore it a little bit more. All right, well, thank you very much, everybody. I will still be around if you have questions.