Hey, everybody. Welcome. I want to say I'm really happy to be starting off the crafting code track of the conference. So let's just dive right in.

Here's an example of how a Rails app starts. It's a typical example of the kind you see in a design talk, and I'm not interested in this example. I want to turn this into the real-world schema that I saw at a client. It's a little more complicated. It's not a mistake or a joke. This is a mature domain model for a complex real-world problem, and this isn't even close to the full thing. You don't even see a user model in there. This is what those pretty little examples look like after a year or two or five of exposure to reality.

I think Rails has succeeded by promoting design conventions: just put your code in these folders, connect your models with Active Record, use MVC. But as our apps have gotten bigger and bigger and older and older, that structure hasn't been enough to support them. Microservices and engines and gems break up our code, and I think they hurt most apps more than they help. I think monoliths are great, and I want to make good ones.

Today, I want to talk to you about what happens next. What happens after you've used rails generate a few dozen times, and after you have a migration that only reverses a previous migration because a feature changed and then changed back? What happens after your test suite creeps above 1 minute, then 10, then 20, and there's no end in sight? What happens after you've had a long argument at lunch about factories and mocks, and you're holding grudges afterwards and nothing has been settled? What happens when a small feature just doesn't seem possible, or when making a small refactoring turns half the test suite red? What happens when MVC isn't enough to organize your code?

So welcome. I'm Peter Harkins. I'm a senior consultant at DevMynd. We're a small, happy consultancy in Chicago, and I've touched dozens of Rails apps in the last couple of years.
I previously worked in Django and PHP at one place with 100 small sites, and at other places with huge Rails apps that are not aging so well. As I mentioned on the title slide, I'm camera shy, so please don't post pictures of me online. It's an uninteresting personal situation. Just do me a little favor. Thanks. And you don't even need to take photos of the screen. I'm going to put up all of the slides with verbatim speaker notes afterwards, in case I get stage fright, so the last slide will have the link to that. You might even want to close your laptop, because if you think I'm speaking fast now, I'm going to keep this pace up, and we're going to move through a lot of code and unfamiliar concepts. I don't think you want to miss out because there's a cat GIF in Slack.

So here's the plan. We're going to talk about the code we have now. I'm going to give two rules for breaking it down in different ways, and we're going to explore what those rules imply for our code. Then I'm going to end with some resources: some tools that you can use, and some links to more places you can see good code.

The techniques we're going to talk about apply to all of our code, but almost all of the examples are going to be from the model layer, because that's where I see most of the problems in code quality. We have skinny controllers, which are great, but then models just grow and grow. That big ball of mud that we grow in our models has some tests, but it's unreliable anyway, and it frustrates our efforts to add features or write good code.

I think ActiveRecord is a really well-written gem with a lot of benefits for making apps. We have an ORM, a query builder, FactoryGirl, and we can easily get at any model, anywhere, any time, whenever we want to retrieve from or persist to the database. Validations are right there in our models, so they're easy to find and change.
And models are just an obvious home to put the related code. But each of these early benefits is mirrored by a long-term drawback. Those amazing tools let us put off painful things until we've painted ourselves deep into a corner.

Without hierarchy, we can have cyclic dependencies, where we have to create objects in an invalid state before we can create the other objects they need to work. There was one in that real-world example slide that you didn't even see, because these things are really hard to spot.

Always accessible means any part of our code can create, retrieve, and destroy models, so functionality drifts off into hooks and triggers. It's fine when we have a whole young code base in our head, but this is the source of that fear you feel when you realize you don't know what other code might edit some model of yours, or when you don't know what's going to be behind that method you call on a model.

Finally, that easily accessible database is sort of the ultimate global variable. It's not only shared between processes, it's shared over time, and maybe even with different apps entirely if you're integrating with a legacy system.

With our validations up in our code, it's really easy to insert invalid data, or to have models confused with tons of conditional validations, or to edit validations and then later discover that records you just retrieved from the database were retroactively made invalid and you can't save them back. That's a real fun bug.

And that home for code keeps getting new wings attached to it. Active Record models already have business logic and validations, and they trigger SQL queries and serialize to JSON. So, I mean, why don't we have them render some HTML, some views, maybe their own form, or send emails, and more and more? They just become a junk drawer for any kind of vaguely related code. And I'm emphatically not saying that Active Record sucks.
I think it's great code, but all of the good features of the Active Record design pattern make it easy for us to write a lot of bad code, and then we get stuck with no vocabulary to fix it. We're using this pattern and asking: why is everything painful after a year or two, and can we solve it with some complicated scheme that leaves our code looking weird and over-designed?

I wanna talk about how to get out of this trap. We're gonna start with some really bad real-world code. It's mine. We'll decompose it according to two explicit rules and talk about how to test the pieces. The improvements we're gonna make are incremental, can be used partially, and can be reversed if you think they're a bad idea. And if a new developer joins and starts writing code without knowing the rules, it's not like your code base is gonna instantly burst into flames.

So let's get into that example project. I wanna show you what it is. Chibrary is a website that archives mailing lists. The name is a portmanteau of Chicago and Library, because I'm from the one and I love the other. So here's the overview for a mailing list about game design and development. You don't need to be able to read it, I know it's tiny, but you can see the shape of it: up top is a table full of numbers, each month's count of how many threads and messages there are, and below is a big text description of the list.

I really like reading mailing list discussions because there's a lot of smart stuff out there, and I started Chibrary to help with that, and also as an excuse to play with different databases and Ruby itself. I'll admit, when I started it I had no idea what I was doing with object-oriented design. It was some of the first OO code I wrote. So if you look back in the Git repo after the talk, it will be really obvious that that's the case. On the other hand, it doesn't look a lot worse than what I see at clients.
I used Chibrary as a proving ground to experiment with OO design for a year, and I'll be sharing the successful results with you today.

So, clicking into one of those individual months, here's a list of all the discussion threads. You can see two of them expanded, with the whole tree of people replying. Then in an individual thread there's that tree repeated at the top, and then just all of the messages. A big feature for me is reading one discussion thread per page, because I don't want to look through a keyhole at one message and then click for one message and then click for one message. I want to see everything. It meant a lot of fun code, though, sorting and parenting the messages back together, or dealing with bad data while trying to do that.

So let's jump into the code for messages. Yeah. I know you can't read it. It's the God object of the system. I'm not too embarrassed, though, because it's only 300 lines, and I've seen models that are four or five times that.

So let's zoom in at the top. This is the start. It doesn't inherit from ActiveRecord::Base, but because Active Record was the only pattern I knew for database access, it's a fine example of what Active Record classes turn into after a while and how to decompose them. It starts off with a bunch of accessors. There are all the sorts of fields you'd expect, and then there's a big regex and a method for normalizing subjects by removing all of the "Re:", "Fwd:", "Fwd: Re:" gunk that builds up when your uncle forwards you that email about Obama fluoridating our water. Then there's a factory method for instantiating messages from a hash.

There's a lot going on so far, so let's just keep scrolling down to the constructor, which is really complicated and confusing. A message can get fetched from the database by passing its database key, or passed in as a big string, or even built from another message object.
It's a real mess, and it should be at least three factory methods like that deserialize one, but we can do even better than that. So let me hide that big distraction.

The call number variable is the unique ID for each message, so it's really important, but it's an optional argument all the way at the end of the argument list. I papered over that by raising an exception down below, to keep the message from living in an unusable invalid state for too long. There's a nice idea at work here, that objects shouldn't be seen by the outside world in an invalid state, and we'll find a better expression of that later.

There at the end it extracts some metadata, like the subject, into instance variables, so let's look at how it does that. It calls this method, extract_metadata, which calls load_subject, and that falls back to a placeholder if it can't find a subject. Then it knows whether the subject looks like a reply by using that big regex, and it has another, instance-level method that calls that class method.

Message kind of goes on and on like this, but we've already seen enough code that I have examples for the rest of the talk. We're gonna extract those clumps of related code that work on the same variables into objects called values.

I promised you two rules, so let's talk about them and apply them to this code. The first one is that values are immutable. When we call a method, we know it'll give us the same answer it did last time. Compare that to the message object, where we could ask it for its subject, then yield to some other code, and when we look at the message again it could have a new subject. This is really closely related to referential transparency, if you know that term, but I'm not gonna try and split that hair.

For example, in the code you write now, integers are immutable. When you have three and you add one, you get a new int, four. You don't update three. We also use dates immutably, adding and subtracting from them to get new dates.
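The integer and date examples can be checked directly in Ruby. This is a minimal sketch rather than code from the talk; the variable name posted_at and the year are assumptions chosen only for illustration.

```ruby
require "date"

three = 3
four = three + 1                 # returns a new Integer; 3 itself is untouched
integers_are_frozen = three.frozen?

posted_at = Date.new(2015, 4, 23)  # hypothetical date
april_23 = posted_at               # a second reference to the same value
posted_at += 1                     # rebinds the variable to a NEW Date

# The variable now points at April 24, but the value April 23 is unchanged.
```

Adding to a date never edits the existing object. It only changes which value the variable refers to.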
We might have a date variable like posted_at that changes, but the date April 23 doesn't turn into some other date. You can't turn Tuesday into Wednesday. The variable and the value in it are separate things.

Code that's immutable, like values, cannot call code that's mutable, because then it can't guarantee it'll give the same answer. If my expiration date were to ask the user what plan they're on, it might end up giving us two different answers because the user changed account types or something.

The second rule is that values don't have side effects. They don't do anything another piece of code might see besides return. They don't read from an API. They don't save to the database. They don't look at files on disk, and they don't update an attribute on another object. They just return. Code that doesn't have side effects cannot call code that has side effects.

To use a mathy term, mutability and side effects are transitive. If a method uses code that mutates or has side effects, then that method is mutable or has side effects. If you've taken a vow of nonviolence and you hire someone else to beat up someone you don't like, that is totally cheating on your vow of nonviolence. If, when we ask a subject for its normalized version, it calls a method that goes and talks to the database, well, that would be the same to us as if the subject were talking to the database directly.

So we're gonna take those two rules and make them real by extracting some values from that message code. Here's the call number. Now it's a proper class. Since it's the unique key for each message in Chibrary, it really matters that these things are valid and correct, and it enforces that validity with an exception. A value is never allowed to be in an invalid state, because if it's immutable it can't ever change to a valid one. And there's another reason too.
If an object can be invalid, every method that uses that object has to wonder which it's getting, real or fake, because they'll act differently. It's sort of a violation of the Liskov substitution principle. Finally, CallNumber uses a great little gem called Adamantium. After the initialize method, the whole object is frozen, so that an exception would be raised if anything tries to mutate a call number. If you've used freeze in Ruby with objects, there are a couple of weird little edge cases, and Adamantium does what you'd expect. Adamantium guarantees our immutability.

So I wanna look at the second value we're gonna take out of Message. Here's the front half of it. Subject starts out really simple. It just has a simple initializer that wraps up a string, and like CallNumber it uses Adamantium to enforce its immutability. It also uses the standard library's Forwardable module to delegate some methods. So if you ask a subject for its length or try to sort it, it just makes the string that it wraps up do all the work. Subject also uses Equalizer, another great little gem, which takes a list of attributes and generates == and other equality methods, as well as hash in case we need to use it as the key in a Hash.

It's kind of weird to me now how few of our classes in default Rails code implement basic Ruby interfaces like equality and hashing. I think maybe that's a sign that we're too framework dependent, but I'm not really sure.

Here's the back half of Subject, which has those methods that we saw, including that regex. All the code related to subjects lives in the Subject object. So you can just ask a subject if it's normalized. If you have a string floating around, you don't have to go call a class method on Message.

So those are values. We started breaking down our Active Record models into less complicated pieces by finding methods that use the same attributes and variables and pulling them out into their own classes, like Subject.
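A rough sketch of the two values described above, using only the standard library. It is not the code from the slides: plain freeze stands in for Adamantium, hand-written equality stands in for Equalizer, and both the call number format (ten letters or digits) and the reply pattern are assumptions made up for this example.

```ruby
require "forwardable"

class CallNumber
  class Invalid < ArgumentError; end

  FORMAT = /\A[A-Za-z0-9]{10}\z/ # assumed format, not Chibrary's real rule

  attr_reader :to_s

  def initialize(str)
    str = str.to_s
    # A value refuses to exist in an invalid state.
    raise Invalid, "invalid call number: #{str.inspect}" unless FORMAT.match?(str)
    @to_s = str.dup.freeze
    freeze # plain-Ruby stand-in for Adamantium's freezing
  end
end

class Subject
  extend Forwardable
  include Comparable # provides ==, <, > and friends from <=>

  REPLY_PREFIX = /\A\s*((re|fwd?)\s*:\s*)+/i # simplified, assumed pattern

  attr_reader :original
  def_delegators :original, :length, :to_s

  def initialize(str)
    @original = str.to_s.dup.freeze
    freeze
  end

  def reply?
    REPLY_PREFIX.match?(original)
  end

  # Returns a new Subject; the receiver is never modified.
  def normalized
    Subject.new(original.sub(REPLY_PREFIX, "").strip)
  end

  def <=>(other)
    other.is_a?(Subject) ? original <=> other.original : nil
  end

  def eql?(other)
    other.is_a?(Subject) && original == other.original
  end

  def hash
    [self.class, original].hash
  end
end
```

With eql? and hash defined, two Subject objects wrapping the same text behave as the same Hash key, which is the property the talk attributes to Equalizer.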
Or, when the model knows an awful lot about an attribute's behavior, like how Message knew what it means for a call number to be valid or not, we pull that out too. This is the Extract Class refactoring, and there's nothing special yet. To make that class into a value, we make it immutable and remove its side effects. That doesn't mean a method can't have a local variable that changes, because that's usually an easy way to write methods, but from the outside you can't even see that it's there. We always get the same answer for the same arguments.

When we extract those values, they'll want the common Ruby methods I talked about on Subject, and maybe some less common ones. If the value can be used in place of a string or integer, the way Subject can be used in place of a string, then it'll want to_str or to_int.

And finally, really early on you'll be tempted to have values call or return Active Record objects, because so much of our code lives there now. Don't do it. A value can't guarantee it's immutable and has no side effects if it's calling methods on another object that might do exactly those things.

The simplest way to use a value from an Active Record model is to write your own getter and setter. Subject can be saved as a string to a varchar column. The spiffy way to do this is to use composed_of in current Rails, or the new attributes API in Rails 5. We're missing Sean Griffin's talk on that right now, which I'm sad about, but it'll be on Confreaks.tv. For complex values that are made up of more values, like a street address value that might have a country value, maybe you'll save it to multiple columns, or maybe to its own table that you reach by a foreign key. You'll just keep modeling your tables the way that makes the most sense to you. Nothing changes at the database layer. This is just OO design.

Then when we test values, we can do without a lot of things, because they're very small and predictable.
If a value is built up out of other values, it's trivial for our tests to integrate down through them. We don't use mocks and stubs, because we don't need to ensure that side effects do or don't take place, and we only write assertions on the result of calling methods, because nothing else will change. Additionally, we can do really cool things like automatically generate property-based tests, though I don't have time to get into that.

So that's values. They're immutable and they have no side effects. They're often quite small. I'm sure you've guessed we're gonna fill in the rest of this chart, but it's gonna go a lot quicker, because we're only loosening those two restrictions.

So this is the halfway point of the talk. I don't know how you're feeling. I kind of threw you in the deep end, like, you know, my cat after his bath. No, but I think generally people have this. I'm not seeing too many confused faces, but this is a good time to just kind of take a breath, and if you have questions you should wave at me, like, right now. Okay, we'll keep rolling.

So here's Email. It's another value object. It decomposes the raw string of an email into other values. We can see three strategies at work. First, message_id just extracts the Message-ID header to build a value, and it's done. That's all. Second, subject extracts the Subject header, or falls back to a placeholder if the header is missing. It also builds a value. And then finally, from extracts the From header and returns a string. I looked through the code and I don't really do anything special with the address, so there isn't any point in making it more than a string. Not everything has to be a value. We extract values because they help, not because it's something we always have to do. This is a talk about practicality, not about religious extremism in our design.

All those values we saw didn't have identities.
If you have two subjects that are equal, it doesn't matter whether you're using the one in variable A or variable B, and that's the same with the email itself. If all the headers and all the body are the same, you have the same value. Whereas with people, if you find another Peter Harkins, it doesn't matter that we happen to have the same name; we have individual identities. Even if we were roommates and had the same address, or other attributes were the same, we'd still have separate identities. And the flip side of that is true: even if I change my name, my address, my phone number, every attribute you keep in your user table, I still have the same thread of constant identity.

Mutability is right there in our definition of identity, so we need something separate from values, and we're gonna call those mutable objects that have identity entities. But like values, entities also don't have side effects.

So let's look at two examples. First we have the new Message class. With all those value objects and more pulled out, Message gets pretty thin. We initialize it with the call number, that unique ID for each message that we extracted to a value; the email, the value we just saw that builds up more values like Subject; and the slug, which is the ID for the mailing list that this message belongs to. We didn't look at that example because it's not really different from call number, but it's used in the URLs and headlines. And we delegate some methods down to values so that outside code isn't violating the Law of Demeter to get at the data it needs. The code that interacts with Message didn't even change much at all.

But Message is not just another value. The email and the slug instance variables are mutable. Email is mutable because sometimes I find a better copy of an email in a new archive and I want to re-import it. Maybe it has better headers, or it fixes a character encoding issue, because those are all over the place.
And slug is mutable because sometimes an email gets CC'd to two mailing lists at once and I file them both in the same list instead of splitting them up. To make really sure it's clear: when I say the email and the slug instance variables can change, that means each can be replaced with a new email or slug, but the email itself never mutates. It's the difference between a variable and the thing that's stored in it.

So, speaking of lists, here's the List entity. It looks pretty similar. The slug is the identity and the various fields are mutable. There's our friend Equalizer again, because it's just handy. One of the interesting things is how little code there tends to be in entities. Their job is to have an identity and wrap up values, not to run a lot of code for us. The decisions about our business rules naturally get pushed down into our values, and the dependencies on which classes we use float up into our entities. We're gonna see more of this later.

You're probably thinking, okay, entity is a fancy name for Active Record model. They're really close, but they're not the same. The biggest difference is that Active Record models have side effects, lots of them. When we call a model, we don't know if it'll trigger a save, or reload an association, or call into some other web of models that have side effects.

But this talk isn't about that academic distinction, so I wanna talk about how to extract entities from our models, or at least make our models look more like entities, because these rules can be applied partially, and the farther we go down this road the better our code looks. We can extract entities by following the steps we've already seen: figure out what their identity is, extract as many immutable values as possible, and drive out side effects to adapters and shells, which are the two things that I'm gonna fill in that chart with.
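To make the entity idea concrete, here is a small sketch of a List-like entity. It is an illustration only: the field names name and description are assumptions, the slug is a plain frozen string rather than a value object, and the hand-written equality stands in for Equalizer. The assumption is that equality follows identity, so two objects with the same slug are the same entity no matter what their other fields hold.

```ruby
# Hypothetical entity: the slug is the identity, the other fields can be replaced.
class List
  attr_reader :slug
  attr_accessor :name, :description

  def initialize(slug, name: nil, description: nil)
    @slug = slug.to_s.dup.freeze
    @name = name
    @description = description
  end

  # Same identity means same entity, even when every other attribute differs.
  def ==(other)
    other.is_a?(List) && slug == other.slug
  end
  alias eql? ==

  def hash
    [self.class, slug].hash
  end
end

list = List.new("gamedev", name: "Game Dev")
list.name = "Game Design and Development" # replaces the value; the old string is not edited
```

Note that the entity has no methods that save, load, or send anything. It only holds an identity and the values attached to it.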
If we're gonna use Active Record models as entities themselves, we have a lot of side effects to deal with, and here comes the part of this talk that's gonna sound ridiculous and scary. Remember, you can follow this rule partially, and even if you don't follow it, I hope the rule helps you understand part of why our models become so hard to work with over time.

I think you shouldn't let your models call their own, or especially other models', query methods, life cycle methods, or custom methods that have side effects. I know this is really kind of out there compared to the standard Active Record design pattern, but it works. If we lift these side effects up to adapters and shells and direct them from outside, which I'll talk about in a minute, we can still use all of Active Record's wonderful features for queries and saving without snarling up our model layer.

When methods have side effects, we really quickly lose the ability to reason about what's happening and why. If all of my Active Record models were littered with calls to global variables, you would scream at that code, but when I call those global variables the database and hide them behind methods named find and create, it just seems totally normal.

So when we test these entities, we use a couple more tools from our toolbox, like factories and stubs, but if we're using them for more than a small convenience, our code is probably poorly structured. The famous example is when we use FactoryGirl to save 10 records to the database just to test one small method on one model, because our model has so many side effects that reach out to that big global variable in the sky. That's the thing I most want to get away from.
The real difference from testing values is that, rather than only assert that a method call returned the right thing, we tend to assert that after calling a method the entity is now in the right state. If I update the user and change my phone number, it's not that that method returns something I care about, it's that now the user has that phone number.

That's entities. They're mutable, and they have no side effects. Immutable values cannot call mutable entities, because then they would effectively mutate too, so there's a natural hierarchy forming. But you wonder, how the hell do I get anything done? If you can't save things to your database or read from the Twitter API, your app never does any work that the customers can see.

So I want to look at the bottom right corner of the grid, where objects are immutable but do have side effects. I call these adapters. Here's our first one. The name comes from Alistair Cockburn's ports and adapters, or hexagonal, pattern. I don't know why he gave it two names, but it seems to have two.

When Chibrary is building those unique call numbers, it incorporates a run ID, a unique incrementing integer. Because Chibrary doesn't store it in a SQL database, I needed some custom code to generate these in a transaction, which involves a bunch of commands to and queries from a Redis database that I left out of this slide because they're not particularly relevant. I'm showing you the run ID service because it's the one I could trim to fit on a slide, and also because there's a caveat on it, which you can guess from all that blank space. Even though it's against the rule that adapters are immutable, it was more convenient for the code that calls this for it to mutate by caching the latest value. This object is not perfectly immutable, and that's okay. The rules tell me where to go, and my code improved by following them, but on occasion it's also improved by bending or breaking them.
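The shape of an adapter like the run ID service can be sketched as follows. Everything here is hypothetical: the slide's real code talks to Redis inside a transaction, while this sketch uses a small in-memory counter store with an invented incr method so that it runs on its own. The class and method names are also assumptions.

```ruby
# In-memory stand-in for the external store (Redis in the talk).
class FakeCounterStore
  def initialize
    @counters = Hash.new(0)
  end

  # Increments and returns the counter stored under key.
  def incr(key)
    @counters[key] += 1
  end
end

# Hypothetical adapter: a thin wrapper whose whole job is the side effect.
class RunIdService
  KEY = "run_id"

  attr_reader :latest

  def initialize(store)
    @store = store
    @latest = nil
  end

  # Asks the store for the next run ID. Caching the result in @latest is the
  # caveat from the talk: it makes the adapter slightly mutable.
  def next!
    @latest = @store.incr(KEY)
  end
end
```

Passing the store in through the constructor keeps the wrapper thin and lets a test swap in a fake, which matches the stub-and-mock testing approach described for adapters.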
Knowing about the rules certainly helped me decompose better objects than just dumping all this into CallNumber or dumping all of that into Message, but maybe I'll revisit this code and remove that last bit of mutability. I don't know what the future will hold, but for now I have the words to describe what makes this object more complicated.

Adapters tend to be really thin wrappers around external services. We're gonna use stubs to fake out the result of calling an external service, and we'll use mocks for the specific purpose of making sure that that call to an external service was well formed. There's not a lot of benefit in checking results, because your test usually looks like this: you set up a mock to return some value, then you call into the adapter, and then you make sure the result you get from that is the value you just mocked.

So that's adapters. Values and entities don't have side effects, so they don't call adapters. Adapters call into values and entities. Otherwise the values and entities would look like they had side effects too. There's a natural hierarchy that forms to help keep us out of the situation where we wanna test one method but need to put several models in the database first.

So we have our code broken up into these three little boxes, so let's look at the last box, which is not this box. That's my other cat. I learned from Aaron Patterson that every RailsConf talk needs some cats in it.

So here's a worker. It's a job that gets queued after the messages have been filed away. It takes a slug, a year, and a month, which it calls sym for short, and it caches the total of how many threads and messages the list had in a month. It fills in that table that we saw in the early screenshot.
The perform method is a little bit cute, but it had to be to fit on the one slide. All of these methods are one-liners, but rather than have one method with a couple of local variables, I broke it up so I could write isolated tests for the parts and then just one integrated smoke test. If you squint, this is function composition. This is Clojure's threading macro or F#'s pipe operator.

Working from right to left, inside out, the first thing it does is deserialize that key. Next it loads up all the threads for that month from the repository, which is the adapter that interfaces with the database. If threads were an Active Record model, the thread repo is the adapter that I would have calling the thread scopes or life cycle methods. We don't have time to dig into that code, but threads are entities, because messages get added to them over time, or even removed if they were misfiled. Then it creates a value from all of those threads, and then it stores that value back in the database.

This is a little procedural, imperative program. It uses the adapters to isolate the parts of the code that need to talk to the database, then it uses the entities to get immutable things, a value, for doing some work, and then it records the result.

Let's look at one more little program. If you haven't seen Sinatra before, this is a route and a controller action all wrapped up in one. When somebody gets the page for a list to display all those month counts, this is the code that runs. It just goes and talks to some adapters, and then it builds up a result, and rather than use another adapter to save things back to the database, it uses Haml to return a web page. But it has almost exactly the same shape: talk to adapters, talk to the things that they give you, and then do a little bit of work at the end.

These programs are called shells. They coordinate those specialized objects to do meaningful work.
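The worker's shape, a shell that wires adapters, entities, and values together, might look roughly like this. All class names, the "slug/year/month" key format, and the in-memory repositories are assumptions made for the sketch; the real worker talks to a database and composes its steps in a single nested expression.

```ruby
# Hypothetical stand-ins, not Chibrary's real classes.
Sym = Struct.new(:slug, :year, :month)        # slug + year + month
MonthCount = Struct.new(:threads, :messages)  # the value the worker computes
DiscussionThread = Struct.new(:messages)      # entity stand-in

class InMemoryThreadRepo # adapter stand-in for the thread repository
  def initialize(threads_by_sym)
    @threads_by_sym = threads_by_sym
  end

  def month(sym)
    @threads_by_sym.fetch(sym, [])
  end
end

class InMemoryMonthCountRepo # adapter stand-in for storing results
  attr_reader :saved

  def initialize
    @saved = {}
  end

  def store(sym, count)
    @saved[sym] = count
  end
end

class MonthCountWorker # the shell
  def initialize(thread_repo, count_repo)
    @thread_repo = thread_repo
    @count_repo = count_repo
  end

  # Reads inside out: deserialize, fetch, count, store.
  def perform(key)
    sym = deserialize(key)
    store(sym, count(fetch_threads(sym)))
  end

  def deserialize(key)
    slug, year, month = key.split("/")
    Sym.new(slug, Integer(year, 10), Integer(month, 10)).freeze
  end

  def fetch_threads(sym)
    @thread_repo.month(sym)
  end

  def count(threads)
    MonthCount.new(threads.size, threads.sum { |t| t.messages.size }).freeze
  end

  def store(sym, month_count)
    @count_repo.store(sym, month_count)
  end
end
```

Because each step is its own small method, deserialize and count can be tested in isolation with no repositories at all, and perform needs only one smoke test.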
It's their job to manage the dependencies, and the classes they use make all of the decisions. We keep them small because they don't have restrictions on mutability and side effects, so they're harder to reason about. A shell might be a single method or a single object, like the month count worker, or it might have several methods, like the web front end, which has half a dozen URLs at least.

When adding new code, it's easiest to first add a first draft to the shell and then extract pieces out into entities, and then values and adapters, according to the rules we've talked about. That's another way of saying that we can take the code we already write and break it down by these rules when it helps us write better programs, not all of the time.

So finally, testing these shells. We're gonna use all of the tools in our toolbox here. We haven't talked much about fixtures, but I only like to use them with real-world input, especially the stuff that's invalid. Otherwise we can end up in a situation where changing one piece of test data can cause more than one test to fail, and we won't know if that test failure was intentional or incidental. We might test on the result we get, like that web page we rendered, or we might check that the right thing is in the database, or that we told SendGrid to send an email.

This is the only place where we write integrated end-to-end tests. I'll write one happy-path test where everything is wonderful, to make sure all the parts are talking to each other, and then as needed I'll write regression tests when I'm fixing a bug I can't isolate. But I wanna think hard about those regression tests and try to break things down to write more isolated code, especially contract and collaboration tests.

And there's one thing missing, which is the question mark that's floating over your head, which is like, this is kind of bullshit, because this looks nothing like my Rails app.
So I have to add one thing to my slide, which is other. Right now almost all of our code is mutable, and all of it has side effects, especially database access. It's not tidy little shells. It's that big ball of mud where it's hard to follow the path of execution, where we fight about how to write just enough tests, where we have all the bad things I opened this talk with. I promised to talk about testing each of the types of objects, so, yeah, I don't know. I mean, every test is integrated. Even the ones that lived in the folder called unit were integrated, and we have to use every trick and tool we have just to have a little bit of confidence our code works.

Dealing with those big buckets of other code is why I took so many notes on great books and talks like the ones on these slides, and then experimented and experimented until I realized I was seeing some rules for decomposing my code, and I had a lot more understanding of why more experienced developers gave the good advice they did. I'm really glad to have shared these rules and how you can use them to decompose your code into reliable, comprehensible pieces. They've worked for me on a non-trivial project, and they've worked for me on client code.

So here are some of the great tools I used along the way. I wish I had time to demo them all. On the right there's some more example code you can look at. The Chibrary app is up on my GitHub now, and TwoFactorAuth is a gem I made over this winter that's a moderate expression of these rules. I don't explicitly follow them, but you can see the outline of the shape as things are decomposed. Cuba and Lotus are alternative web frameworks that are just interesting for being small, and Trailblazer, DataMapper, and Ruby Object Mapper are great for seeing Ruby code that's organized really differently from the Rails we're used to seeing.
There's a Trailblazer talk next up on the fourth floor, so if you stretch your brain you can see how these rules apply and use them to understand other code and your own. And Haskell's down in the corner, hiding. I know it's got a really intimidating reputation, but it's interesting because in it everything is immutable and side effects are explicitly encapsulated in a way that's just not possible to do in Ruby. I think if you work through a beginner course or book, it'll teach you about programming in general, but then also about Ruby.

Lastly, I know I'm not the only person experimenting along these lines. I certainly didn't write all of those nice gems. If you have, or you want to, please get in touch with me. I think there's a lot of room for us to experiment in Ruby and Rails to get everyone's app a lot happier as they get bigger. The experiments that we do expand what it means to be idiomatic Ruby, and they give us new tools and new understandings.

So all these slides, with comprehensive speaker notes, are available right now, and if you thought this was interesting, you can sign up to get some more details and examples once I've had a little time to catch up on sleep. I would really love to hear about your experiments, however they turn out, whether that's here or by email, so there's my contact info. This has been What Comes After MVC, and we've got a few minutes for Q and A, so let me thank you all for your time and your kind attention.