Hi everyone. Thank you so much for coming. My name is Phillip Mendonça-Vieira. You can find me on various social media as phillmv, with two Ls, on Twitter as well as many other places online. I am one of the co-founders of RubySec.com, where we host the Ruby Security Advisory Database. So if there's anyone in the audience who maintains gems, I would love to talk to you afterwards about how you handle security disclosures. But for my day job I run a startup called Appcanary. Appcanary notifies you whenever you are running a vulnerable package on your servers or in your applications. Basically, we spend a lot of time thinking about whether you are running vulnerable software.

So this talk is ultimately about a series of technical decisions we made and implemented while building the service that powers Appcanary. And I feel that in order to best understand our thinking process, we should start our story today at the beginning of our incredible journey, as it were.

So back in 2012 I began a consultancy with a friend of mine, Max. We specialized in doing penetration tests and security code audits, building MVPs and business automation, and advising teams on how to improve their software development process. One benefit of being a consultant is that you have a lot of control over your schedule. So by the time November 2014 rolled around I had just spent two months working 12-hour days on the Toronto mayoral election, and my co-founder Max had just had a magical summer at the Recurse Center. As a consequence I was really burnt out and he was really bored, and we were really looking to do something more challenging. We'd happened upon this market opportunity that could really use our skills, and we said, okay, let's build a product.

We'd had some experience with this problem domain because we'd built this free service called Gemcanary a few years earlier. But in a nutshell, an advisory has many vulnerabilities.
A vulnerability has many packages. A package has many versions. And we need to keep track of all of this in order to tell you if you are running a version that you shouldn't.

And so we looked around, and Max turns to me and says, hey, let's use this thing called Datomic. So if you've never heard about Datomic, Datomic is really cool. Datomic is this key-value store slash graph database where instead of SQL you write Datalog, which is kind of a Prolog kind of language. And on top of all this you get a free point-in-time database, which means that you can roll back to any previous state of your data. Basically, nothing ever gets deleted. It's really cool. And so, thinking about what our struggles had been with the free service that we'd built, we were like, this would be really handy. Let's try using this.

Unfortunately, all the client libraries for it kind of sucked. But it worked really great with Clojure. So let's use Clojure. Clojure is really cool. For those of you who don't know, Clojure is a functional programming language that features immutable data structures and gives you all sorts of asynchronous primitives for free. And it runs on the JVM, which is supposed to be web scale. I don't know. And I mean, shouldn't we all learn a Lisp, really? Isn't that what you're supposed to do? Isn't Lisp supposed to be this profound, enlightening experience that will forever change how you program for the rest of your life?

I mean, for me personally, at this stage I'd been writing Ruby for five years. Most of my career, you know, a dozen apps. And to be real with you, these are all compelling reasons. But one thing that kind of dominated my thinking was this fear, right? Am I stagnating? Because when I jumped into Ruby, it was synonymous with bleeding-edge web tech. And now, not so much anymore. Am I still going to be able to get a job? What actually happens to programmers after the age of 35, right?
Do they vanish or something? And I really didn't want to spend the rest of my life writing JavaScript, right? So the idea that there's this other technology set that I could invest in to broaden my skills was really appealing. So we said, fuck it. We're going to run out of money in six months anyways. Let's do it. So we went out and we built it in Clojure. It was really great.

So for people in the audience who might know Clojure already: I know I'm using the wrong words, because I'm assuming people in the audience are more familiar with Ruby. So just please stay with me.

So Clojure is a bit familiar, right? Both Clojure and Ruby are truthy, where anything that isn't nil or false is good enough. They both spend all their time dealing with hashes of symbols. And they both have all these fun metaprogramming constructs.

So why is Clojure cool? Let's start with the big one, functional programming. In Clojure, functions are the basic semantic building blocks. You're encouraged to write them in the smallest units that are practical. So in this little function here, we're picking out all the even integers by applying the even function to this array. And if you squint, you've seen this before, right? This is not unreasonable to you. And there's this old joke that Ruby is an acceptable Lisp. There's this blog post from like 2005 that makes this argument. But if you think about it, in Ruby even? is a method on Fixnum, and select is defined in Enumerable. If we contrast this with the Clojure code, filter is the one that worries about what kind of data structure it takes. The data structure itself doesn't know; the hash or the map doesn't concern itself with that. And even is just any function, right? This is really cool. It's not a property of the numbers or the things inside of the array. Like, I can do whatever I want in there, right?
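The slide's code is not in the transcript, so here is a rough Ruby sketch of the contrast being described. The names `is_even`, `keep_matching`, and `long_word` are made up for illustration; the second half only imitates the Clojure shape, where the predicate is a free-standing function and the filtering function is the part that knows about the collection.

```ruby
numbers = [1, 2, 3, 4, 5, 6]

# Idiomatic Ruby: even? is a method on Integer (Fixnum in older Rubies),
# and select comes from Enumerable, which Array mixes in.
evens_oo = numbers.select(&:even?)

# A more Clojure-flavoured shape (hypothetical helper names): the predicate
# is just a function, and keep_matching only needs something it can iterate.
is_even = ->(n) { n % 2 == 0 }
keep_matching = ->(pred, coll) { coll.select { |item| pred.call(item) } }
evens_fn = keep_matching.call(is_even, numbers)

# Because the predicate is not tied to the element's class, swapping in a
# different one needs no changes anywhere else.
long_word = ->(s) { s.length > 3 }
long_words = keep_matching.call(long_word, %w[gem ruby lisp clojure])
```

Both styles give the same answer for the integers; the difference is only in where the behavior lives.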
And so, being able to combine bits of behavior like this can be really powerful. Another thing that's really cool about this is that when everything's a function, nine times out of ten you can just select the whole thing, cut it out, create a new function, put it in there, and call that where the thing was originally, and then you're free to just change this new structure. You don't really have to worry about it, right? Everything gets passed in; don't think about it.

And finally, immutability is really, really interesting. Historically, it was an expensive feature to have, until in 2002 there was this breakthrough and a bunch of cool papers got written, and so it became a reasonable thing to put on your computer. In the most trivial sense, immutable means something that can't change. In practice, it means that for every insert, update, or deletion, you end up with a brand new structure. And this is important because, if you think about it, mutating state brings a lot of complexity with it. If you can just stop worrying about things changing when you're not looking, huge numbers of bugs are just gone.

The best way to illustrate this is with a little bit of Ruby code. So suppose that we have this list, just a bunch of strings up in there, and we assign this list to another variable, and then we apply this append method to it, right? Appending, whatever that means, d to the first list. What happens with list two? In Ruby, this question requires a lot of thinking. Are strings immutable? Are arrays immutable? Does Ruby pass by reference or by value? And most importantly, does that append method mutate or copy? Because you could have two things with the same semantic purpose that, in practice, have very different outcomes. So this method will insert something at the very end of the array, right?
And this one will create a new array, right? You don't know that from the top. And as you use them in practice, they have very different implications. And the thing about Ruby is that you can't get away from this. Freezing won't save you. I can freeze this array, and I can still modify the contents of it. I can't add things to the array, so pedantically it's frozen, but for my purposes, some other thing could still mutate my state. Dup won't save you either, because I can dup something and still modify the contents of the thing. In Ruby I've cloned the array, but it's still pointing back to the original objects, which in this case are mutable strings. Literally the only way to guarantee that your objects aren't being messed with is to do some atrocity like this. Now, I have code that does this because I had to, because I had a thing deep in the bowels of Active Record that had to do something, and I was like, why is my thing disappearing? And it turns out it literally consumes the objects. So this is code I've used, right? And I think we can all agree this is kind of ridiculous. But it's Ruby, so there's nothing you can do about it. You have no control over whether something can be mutated or not.

Which is really unfortunate, because once you have immutability, I found, in my experience going through this, the flow of state through your app becomes obvious and predictable. And there's a way to illustrate this. So this next page is a function from our old app. It's just a route that takes in the user and the database, and there are some parameters that we've passed in, and it processes the stuff and tries to parse it out. Because we deal with a lot of text. You send us text, we figure out what packages you have, stuff like that.
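The Ruby slides on aliasing, freeze, and dup are not in the transcript, so the following is a minimal reconstruction of the behavior described, not the speaker's actual code. In particular, the `Marshal` round-trip at the end is only an assumed stand-in for the "atrocity" mentioned; the talk does not say which technique was on the slide.

```ruby
# String.new gives explicitly mutable strings, whatever the file's
# frozen-string-literal setting is.
list1 = [String.new("a"), String.new("b"), String.new("c")]
list2 = list1                       # no copy: both names point at one Array

list1 << String.new("d")            # << mutates the receiver in place
aliased_length = list2.length       # list2 sees the change too

list3 = list1 + [String.new("e")]   # + builds a new Array; list1 is untouched

frozen = [String.new("x"), String.new("y")].freeze
append_blocked =
  begin
    frozen << String.new("z")
    false
  rescue RuntimeError               # FrozenError is a RuntimeError subclass
    true
  end
frozen.first << "!"                 # the Array is frozen, its elements are not

original = [String.new("x")]
shallow  = original.dup             # new Array, same String objects inside
shallow.first << "!"                # also changes original.first

# One blunt way to get a fully independent copy (an assumption, see above):
deep = Marshal.load(Marshal.dump(original))
deep.first << "?"                   # original is unaffected this time
```

The point of the sketch is that `<<` and `+` look similar at the call site but differ in whether the caller's other references change, and that neither `freeze` nor `dup` reaches the objects inside the array.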
So the specifics don't really matter. What's cool about this is that I know from the level of indentation that the stuff at the furthest level of indentation literally can't modify things outside of its lexical scope, because they're just immutable. I can maybe redefine the meaning of a variable, but only if it's in the same scope. If it's in the same scope, I can change the meaning of that variable, but stuff that's outside of it, literally, I can't touch, because it's all immutable. And that's really cool, right? There's this kind of confidence you can get out of it.

I don't have all the time in the world, but I'm going to give you a quick illustration of why Datomic is kind of cool. So your boss comes up to you one day, and she says, hey, so I have this report, it's really cool, but can you just do this with last year's data? In traditional databases, it depends entirely on how you designed your system. Either you specifically designed it so you could do this, you took snapshots of the data, or your report collects the right things, or whatever, or this rather simple request can actually be a nightmare to execute on. What's really cool about Datomic is that I can have a function that, given the database, will give you whatever value you're interested in. And then I can grab that database and say, oh, this is cool, but I want exactly what I had a year ago. And then I just re-run the same report, and it just works. That's cool, right? And I'm not saying that Datomic, as a key-value graph database, is something you want to do reporting on, but in principle this is the kind of stuff you can get away with.

All right, cool. We really want to play with this new technology. It's cool, it's in the right place, we've got a use for it. But all this time, I've spent years building websites.
I have zero interest in figuring out how to do user authentication, and cookie management, and whatever in a new language. I already know how to do that. Let's just do the hard parts in Clojure. And so we ended up with something like this. We have this API that speaks to Datomic, and we have this Rails quote front-end that does boring things like talk to Stripe and deal with users and stuff. Cool, great.

We started working on this full-time in February 2015. In May, much to our shock, we got into Y Combinator, which meant that money was slightly less of a problem. In July we released to production, and then we moved back to Toronto in October, and we just slaved away at this, adding more features. And so we ended up with paying customers, real demands, and real problems, because it was taking weeks to ship and debug simple features, even though we were holed up in our office working our butts off.

In retrospect we had three broad issues. Number one is that we just made a bad architectural assumption. We underestimated the domain complexity, because it turns out that "what is a package, really?" is a deeper question than we'd considered at first, because everyone has a slightly different definition. It meant that we had this messy data model that was hard to modify. Number two, we had these separate deployments that added all this overhead. It turns out there's a high fixed cost to orchestrating all these different features at once. I have more to say about this; we'll come back to it a bit later. But on top of all this, I found myself really struggling with the environment that we were in. It felt like death by a thousand paper cuts: all these different small issues that in and of themselves are not a deal breaker, but they added up. So first of all, it's not clear how to structure large apps.
It's like, if I'm trying to model a chess game, that works pretty well, but then if I have users that come in and they have preferences that have to be stored, where do I put this and this? There isn't a lot of literature on how to structure these things. Rails kind of holds you by the hand, and you get used to that.

Clojure can be really fun to write. A lot of the time you feel really clever writing it. You're like, yeah, yeah, this is computer science right here. But as a consequence, it can be really hard to read. There's this in-between stage where you're like, I understand how this syntax works, I know what most of these functions mean, but I have to think really hard about what all these things are doing as they unpack themselves. It can also be too expressive. You can pack too much meaning into too small a space.

Clojure has all these deep subtleties. When I was writing this, I came to the conclusion that a good analogy would be C++. You can pick up a subset of C++ really quickly and be productive in it. But in order to read anyone else's code, there's this infinity of other stuff that you need to understand. You have templates, you have Boost, you have whatever. It's an analogy. And in Clojure land, you have reducers, transducers, atoms, agents, protocols, reader macros. It just goes on and on and on, because they have all this cool shit that's been bolted on. But it makes it really hard to feel confident about what the hell you're doing.

Another problem is that when everything's an anonymous function, your stack traces are useless. Some of you might have direct experience with this from JavaScript, where you have some complicated nest of JavaScript and it barfs on your lap and you're like, what happened? And it's like, well, somewhere in here, there was a problem. When you're debugging something in production, that's not a good feeling. I'm just like, I hate you.
I need to figure this out. Compounding this, there's this comically terse documentation. It's like, in order to append, you take a list of f's that have x. I didn't bring a good example, but it's tough sometimes to parse out the gotchas that are hidden in there, et cetera. There isn't really a debugger in the system. There's the REPL, which Lisp is famous for. But I can't really attach something and say, aha, here's the data, show me what's happening.

And finally, the Java virtual machine. I mean, our app took 30 seconds to boot, which means that deep integration is required in order for you to not tear your hair out, which in Clojure land means that I sure hope you love Emacs. If you don't, I mean, you can make it work with Java land stuff, but then you're dealing with Java land tools, which maybe is just not me. The deployment story in Java land is kind of complicated: you have to grab all your dependencies and shove them into one object and then push that onto your server, which somehow meant hundreds of megs, which meant that I couldn't fix something from a coffee shop, which coming from Ruby land is like, oh no. And our app took gigabytes of RAM. It literally needed gigabytes in order to boot, and I still don't comprehend why. And finally, just miscellaneous Java land stuff. Fine.

This is all just another way of saying that I'm really fluent in Ruby in a way that I'm not in all this other stuff. It's not necessarily their fault for any of this. Six months or 12 months of Clojure does not compare to five or six years of Ruby. I'd gotten to a certain stage in Ruby where, like, I could tell you the differences between 2.3.11 and 3.2 in Rails land, and I didn't have that same depth of knowledge here. But regardless, I spent a lot of time feeling frustrated because I was fighting the tool set. And I don't know for certain that I'm not stupid, but like, I'm pretty sure I'm not stupid.
And so it wasn't until I found this comment on Hacker News that described the ecosystem as user hostile that I was like, oh, it's not me. Other people feel this way. And everything kind of clicked. In a nutshell, if I had to summarize it, it seems like developer happiness is not a virtue in that community. The best example off the top of my head: I went to this local user group and I'm like, hey guys, I need a debugger. I do debugger-driven development. That's how I live. And they're like, well, you have a console. That should be good enough. What do you need a debugger for? You can just type all of your state into the console and deal with it. So I had this strange feeling of culture shock.

So time passes. It's now May of this year. We knew we had to refactor our application. We needed to clean some things out, because if you're holed up somewhere in suburbia, desperately working, trying to birth this app, you're not going to make good long-term architectural decisions. We needed to do some spring cleaning. I happened to be visiting a friend in California and I'm like, well, you know, I'll just spend two weeks cleaning things up. It'll be great. You know, whatever. And my friend turns to me and deadpans, just fucking rewrite it already. And I'm like, huh. I mean, it's not an original idea. I'd thought about it before. But that was kind of this watershed moment, after which I couldn't really ignore that option anymore.

The problem, of course, is that rewrites are risky. When you engage in a rewrite, you put in a hell of a lot of effort that, if everything goes well, no one will ever notice. That is the success condition. The best-case scenario is that your users log in and they go, oh yeah, this is exactly the thing I had yesterday. But that said, I mean, we were spending all this time fighting the ecosystem. We'd be migrating to a platform we know really, really well.
Frankly, we were not using Datomic properly. Flat out. You shouldn't just be shoving everything in there. You should have specific things you're trying to do with it. And most importantly, we'd be reducing the number of moving parts that we had. Because this is what shipping features looked like. We would add features and tests to the quote API. We would submit a short code review, a.k.a. have the other person review it. We'd deploy it to staging. Then in the web app, we'd get it to talk to the API, to the JSON, test that, code review that, and then manage the deploy of both apps to production simultaneously. And it turns out that if you do something like this with two people, it's a bad idea. Like, don't do that. Because we'd unintentionally built a distributed system, and microservices incur large fixed costs that are really difficult for small teams to pay.

And this is fine if you're a large company, because your service boundaries should roughly mimic your team boundaries. If you can't change the color of a button without having three meetings about it, microservices are fine. That's not the main part of your overhead. You can orchestrate that. But for a two-person startup, we were just spending all this time. Whether we were fighting Clojure (or me personally fighting Clojure; my business partner loves it) or fighting the architecture that we'd set ourselves up with, the end result was that we were spending a lot of time working on things that don't really matter.

And so Dan McKinley, who was this early guy at Etsy and has done a bunch of other stuff, has this somewhat famous talk called Choose Boring Technology. The key takeaway from that, the one that really stuck with me, was that you can work on about three hard things at once, and you should make sure that they're the things that matter to your business. We built ourselves an object-relational mapper for Datomic. No one cares.
No one's going to send me emails like, oh man, I pay you X dollars per month, but that GitHub library release is dope. Not a thing.

So in the end, the way we thought about it, we were reducing our exposure to risk. Which is what you want. We're already doing something really risky with our time, money, and careers. You don't want to do things that don't pan out. As a small operation, we had the luxury of stalling a little bit. If you're like Netscape in 1997, or whenever they did their big rewrite, they had millions of customers and people breathing down their necks. A bit different. And we set a deadline, sort of, eventually. If anyone tries to do something like this without a deadline, they're not serious about it. So that's just a quick pro tip in there.

Cool. So we set about it. We did it over the summer. On October 11th of this year, which is not that long ago, we had 8,300 lines of Clojure and about 3,000 lines of Ruby. And on the following day, we had 1,700 lines of Ruby. We still have 1,500 lines of Go, but that's another story. It took four months to rewrite the thing top to bottom, which is good compared to the year and a half it took us the first time, when we didn't know what we were doing. But working weekends is just not a good way to spend your summer, because it's deeply demoralizing. You're slaving away at something that you can't ship. Meanwhile, emails come in and they're like, so your product's neat, but it would be great if you had this. And you're like, yeah, soon. Soon I'll have everything. And so it was just a really immense relief to ship.

So, results. Point-in-time databases are not that hard to design in Postgres. I highly recommend doing some reading on data warehousing or domain-driven design. I have like 80% of what I need with just a couple of triggers, which is cool. Really recommend that. Despite going back to lowly Ruby and away from web scaleness, our app is actually faster. It's somewhere between 50% and 100% faster.
But we rewrote the data model, so duh. And if you'll indulge me in a brief tangent, there's this whole subgenre of technical blog posts that go like this: so we rewrote our app from boring technology to amazing new shiny technology, and things are 10 times faster. Well, if you took all the lessons you learned the first time, rewrote the thing, and didn't squeeze more performance out of it, like, what were you doing? What were you trying to accomplish?

But my real point, kind of the takeaway from the talk for you, is that when I went back into Ruby land, I found that my Ruby's different now. All that time I spent in Clojure had really crystallized some thoughts I'd been having. I'm going to do a really brief history of Ruby slash Rails over the years. At first it was like, skinny controllers, fat models, and then people were like, oh no, my user.rb is just huge. I don't know about that. And then concerns came out, but concerns are not really a good fix, because you just have the same problem, it's just spread across 50 different files for some reason. So I still have this 500-method user.rb, but spread through all these different files. That's not great. And people have different ideas of how to do service objects, but then they try to make it too functional, so you have to call dot call on everything.

Anyways, so, key takeaways from my experience in Clojure land that helped along existing thoughts that I'd had. One is that I try to be as immutable as Ruby will let me. Ruby won't help you, but you can be disciplined in what you do. One example of this is something people have been calling a value object, and it's really handy to pepper your code with them. These are objects whose sole purpose is to represent a bit of state, which you pass in when you initialize it. And then afterwards you're just not allowed to modify it.
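As a minimal sketch of the value-object idea in plain Ruby, assuming nothing beyond the standard library: the class name `PackageVersion`, its fields, and the `with_version` helper are hypothetical and are not taken from the talk or from any gem.

```ruby
# A hand-rolled value object: state goes in through the constructor once,
# and the object freezes itself so later writes raise instead of succeeding.
class PackageVersion
  attr_reader :name, :version

  def initialize(name:, version:)
    @name = name.dup.freeze        # copy, so the caller's string is untouched
    @version = version.dup.freeze
    freeze                         # no instance variable can be reassigned now
  end

  # "Changing" a field means building a new object.
  def with_version(new_version)
    self.class.new(name: name, version: new_version)
  end

  # Value objects compare by content, not identity.
  def ==(other)
    other.is_a?(self.class) && name == other.name && version == other.version
  end
  alias eql? ==

  def hash
    [self.class, name, version].hash
  end
end

rails_old = PackageVersion.new(name: "rails", version: "4.2.0")
rails_new = rails_old.with_version("4.2.7")
```

Here `rails_old` still reports "4.2.0" after `with_version`, and an attempt such as `rails_old.instance_variable_set(:@version, "x")` raises a frozen-object error.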
You pass it in once, and any time you try to update it, it just says, nope, can't do it. So I don't use this gem, but it's a good example of what's going on. T. Crayford had a little immutable value object gem that you can just install. Basically the way it works is that you initialize it, and then if you try to change it later, it's like, nope, not doing it.

Another thing that really changed in me personally post Clojure land is that I've become much more aware of how state flows through my application. I'd say more that I'm paranoid about it. This is kind of hard to articulate as a good key takeaway, but I'm going to give you an example of a pattern that I've been using lately, something I've been calling, for lack of a better word, a quote manager object. Managers for me are distinct from controllers, which I see as handling input. So briefly, a controller takes input from the user. The model strictly deals only with persistence: either it queries the database or saves to it, and that's it. And then the managers take the input from the user, look inside the database via the models, and kind of mingle it all together. I'm not saying this is the one true way of doing things, nor necessarily that I'm going to stick with this, but this is what a lot of my code has been looking like lately.

So here's an example. Take the package manager. Pretty straightforward, very simple. We have an attr_reader at the top, which means that the one and only time those values ever get set is through the constructor. You pass in all the state you need for things within this class. And then when you add instance methods, you just have to be really careful to never modify your inputs and never set instance variables. So a lot of my code ends up looking like this, where I have something that is responsible for dealing with packages: a package manager will be initialized once per platform.
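The slide with the package manager is not in the transcript, so what follows is only a guess at the shape of such a class. The `advisories` collaborator, the hash layout, and the `vulnerable` method are all invented for illustration; in a real app the advisories would presumably come from a model rather than an in-memory array.

```ruby
# A "manager" in the sense described: all state arrives via the constructor,
# is exposed read-only, and instance methods neither set instance variables
# nor modify their arguments.
class PackageManager
  attr_reader :platform, :advisories

  def initialize(platform:, advisories:)
    @platform = platform
    @advisories = advisories   # stand-in for whatever the models return
  end

  # installed: a Hash of package name => installed version.
  # Returns a new Array of vulnerable package names; inputs are only read.
  def vulnerable(installed)
    matching = advisories.select do |advisory|
      advisory[:platform] == platform &&
        advisory[:versions].include?(installed[advisory[:name]])
    end
    matching.map { |advisory| advisory[:name] }
  end
end

advisories = [
  { platform: "ruby",   name: "rails",   versions: ["4.2.0", "4.2.1"] },
  { platform: "ruby",   name: "rack",    versions: ["1.6.0"] },
  { platform: "ubuntu", name: "openssl", versions: ["1.0.1"] }
]
installed = { "rails" => "4.2.1", "rack" => "2.0.1", "openssl" => "1.0.1" }

ruby_manager = PackageManager.new(platform: "ruby", advisories: advisories)
flagged = ruby_manager.vulnerable(installed)
```

With this data only "rails" is flagged for the Ruby platform: rack is installed at a version that is not listed, and the openssl advisory belongs to a different platform.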
And then anytime I want to find packages or change them, or do anything that mixes a lot of different business logic together, it will look like this. I can pass in an input, but I will never change it. I never set any more state. If there's state that I have to pass along, I create a value object and things operate on that. Because again, in Ruby anything can mutate, but none of your code has to.

So if you walk away with some key takeaways, one is that despite my moaning and complaining, Clojure is worth learning seriously. It's maybe not the most transcendental thing you'll do with your time, but it was neat. I learned a lot from it. I would happily work on someone else's project in it. And while we've personally moved away from relying on it in our business, I'm actually really excited for ClojureScript, once it gets into a state where you can just drop it into some pre-processor and dump it out. I could give a whole talk on the ways that we're avoiding JavaScript in our application, but I'm not going to share my prejudices with you. That's just mean and pointless. But this is all just to say that when the day comes and people are like, these things need to be nicer, well, there's a lot of cool stuff coming out of this. If you've ever screwed around with React, for instance, ClojureScript will just work beautifully with that.

Another key takeaway: avoid building distributed systems for as long as you can. We now have this beautiful monolith where if I update something at one end, I don't have to worry about how it gets to the other. There's only so long you can do this for. Eventually, you will have to have multiple things that talk to each other, and then we can talk about the CAP theorem at that stage and whatever.
But it turns out you can get really far without having to worry about it, because I don't know about you, but I can configure really beefy machines these days. So if you just have the one, you can get away with a lot of traffic before you really have to think about it too much. And this is kind of a corollary: avoid working on problems that don't matter. Does it really matter that you have these containers that you can shove in and out if you don't really have a thousand machines that you have to put them on?

Exhaust the tools that you know before reaching for a new one. That's kind of a more pleasant way of saying it: I don't think these things are boring, they're just something you know really well. So reach the end of the things that you know really well first, at least if you're pressed for time. If you're just doing this as a hobby on a Saturday morning, knock yourself out. But if time's ticking down, you've got things to do.

And finally, I think most importantly, something that I've been thinking a lot about through this experience is that code should make you happy. It was kind of a rude awakening, after spending effectively my whole career in Ruby, that the notion that programming should be fun and painless is not a universal value. This is bewildering to me. The less charitable interpretation is that some people are just happy to feel clever in their programming, and not necessarily happy. So I would say, take it for what you will, I think programming should be fun and painless. And this is all just to say: cherish what we have here in Ruby land, and make sure you bring it, kicking and screaming, to your new communities.

So that about wraps it up. This is the story, roughly, of how we built this service that will happily notify you if you're ever running vulnerable software on your stuff. And that's about it. Thank you very much.