So thank you, everyone, for coming. It's late on Friday, and I really appreciate all of you sticking around. Obviously, Tenderlove is next. So talk louder. Is that better? So before I get started, I just want to do a quick plug, because I saw the session behind me is actually on imposter syndrome. And since you folks are not going to be there to see that, I just wanted to tell you a little bit about it, because it's a really important concept. Essentially, imposter syndrome is when successful people, like software developers, feel like they might not deserve all the success they've gotten. And it's actually very common. When I learned about it a few years ago, it was really helpful for me in my career. So when you go to conferences like these and you see members of the Rails core team or the Ember core team or all the speakers, and you think, wow, I am not worthy: actually, you are. You really do belong here. And people who are successful usually do deserve it. So Google imposter syndrome when you get home, or watch the video when it gets posted. So, to our talk: you will never believe which web framework powers Upworthy. I can't believe the conference organizers made everyone wait all week to find this out. At Upworthy, we aim to please, so we're just going to spill the beans right away. Obviously, the answer is Java Struts. Awesome. Now, to introduce myself: my name is Luigi. I'm the founding engineer at Upworthy; I was the first engineer hired. I've always been a software developer involved in politics and advocacy. I got really into this guy who screamed, Howard Dean, and so I worked for his political campaign and his political action committee. I worked for other campaigns and nonprofits, and then before coming to Upworthy, I worked for the Sunlight Foundation, which is a great nonprofit in DC that focuses on transparency in government and open government data.
I'm Ryan Resella. I'm a senior engineer at Upworthy. Before this, in 2011, I was a Code for America fellow, in the first class of fellows, and then I came on there as a technical lead on the full-time staff. And then in 2012, I was on the Obama for America tech team, working as a software engineer. And I ran out of "for America" organizations to work for, so I joined Upworthy. So at Upworthy, our mission, and this is something we truly, really believe at the company, is to drive massive amounts of attention to the topics that matter most. And that informs the engineering decisions we make as the tech team. So just to give people a little peek at what Upworthy does, for those who aren't too familiar: that might be a bit hard to read, but when we say the topics that matter most, these are the topics we've covered in the last year, the topics that have gotten the most attention. So I'll just read some of them aloud. There are a lot of women's issues, like body image, gender inequality, and standards of beauty; a lot of economic issues, like income inequality and health policy. We also cover a lot of stuff about disability, mental health, bullying, bigotry, racial profiling, and race issues. And when we say that we want to drive massive amounts of attention to these things, what we really mean as web engineers is that we want to drive massive amounts of traffic. So here's a look at our growth for the last two years. We launched a little over two years ago. We started off at around 1.7 million uniques per month; that was in our first ever month, April 2012. And then we went up to 190 million page views in November of 2013. So this has made us probably around a top-40 site in the US, and maybe one of the larger, higher-traffic Rails apps out there. To give you a sense of what kind of traffic we actually deal with, here's a 24-hour graph of what our page view data looks like.
It starts at midnight all the way on the left, and ends at midnight the next day. You can see how, when the daytime happens, when work hours start, we get these spikes, these peaks of traffic. This is essentially the viral cycle being visualized. We have handled at most about 130,000 concurrent visitors; this is a screenshot from Google Analytics during one of the traffic spikes. So we are handling large amounts of traffic in very spiky ways. So here is an example post from Upworthy. This was really popular a few months ago, in the fall. Who here remembers this post? Curious. Cool, a few of you. So this is what Upworthy content looks like. It's really just static content: "See why we have an absolutely ridiculous standard of beauty in just 37 seconds." It's a video about how a model gets photoshopped into a standard of beauty that doesn't really exist. So that's the content angle we were going for. And here you see we have the content, which is basically static, on the left side of the screen. We have this sidebar with recommended content on the right side of the screen. And then scrolling down, we have what we call asks. We also do some testing on different kinds of content and different kinds of headlines; you see that down there with that Jon Stewart video. And then we have asks around: do you want to subscribe to our email list? Do you want to like us on Facebook? We also have pop-ups that happen after you watch a video or after you share, also asking you to do stuff. So those are our technical concerns at Upworthy: we're pretty much a static site, a public site. Then we have a CMS backing that, and then we have this system of dynamic pop-ups and asks around the site that optimize our subscriber base, get us more subscribers, and get folks to share content.
So the topic of this talk will really be about managing the growth of our startup's web app in the face of very high traffic. And I remember, maybe five years ago, sitting at a RailsConf, maybe it was in Baltimore. I was sitting where you're sitting, and it was a talk by the guys from YellowPages.com. And YellowPages.com was and still is one of the larger Rails sites in the world. Obviously YellowPages is such a strong brand; everyone goes to YellowPages. A lot of people still use YellowPages.com to find local businesses and stuff like that. And they were talking about how they scaled their Rails app. And I was sitting there in the audience thinking, well, this is really interesting, but I'm not sure this is ever gonna apply to me. I work on these small things that don't ever really see that kind of scale. But fast forward a few years, and here I am, working on this app that millions of people see every day. So it can really happen to you too. So let's start from the beginning. We launched in early 2012, March 26, 2012 to be exact. And at the time there was one engineer, me, and our CTO, who is a programmer but was not really a Ruby or Rails developer. And we actually launched not on Rails, but on Padrino. Who here is familiar with Padrino? Cool. Who here has production apps in Padrino? No one, okay, that's what I thought. So Padrino bills itself as "the elegant web framework." It is essentially just Sinatra with more Rails-like things on top of it. So who here has used Sinatra? More hands, of course. And who here actually has Sinatra in production? A few hands, yeah. So essentially, Sinatra is a lower-level library, closer to Rack, and Padrino adds the things that you miss from Rails into Sinatra. It also freely borrows really great ideas from Django, the first one being first-class mountable apps.
So in Rails we have engines, but it seems like people don't really use engines that often. You might use one for RailsAdmin, or you might have a larger Rails app and break it up by putting it into separate engines. But with Padrino, all code actually lives in a mountable app; you have to use this mountable app system to use it. It's also middleware-centric. Those of you who are familiar with Rack know there's this concept of middleware, which is also in Rails, where you can write these small bits of Rack-compatible code that sit on the stack that requests and responses pass through on their way into your Rails app or your Sinatra app or any Rack app. And there's also this built-in admin area. And that admin area is actually an app, just another mountable app. This is something that Django has. I know we have RailsAdmin here in the Rails world, but with Padrino, this is a thing that's built into the framework itself. So why Padrino? Why did I use Padrino in the beginning? Essentially, at the time I was a seasoned Rails developer; I had probably been developing on Rails for about five years. And during that time I started to form my own opinions about Rails, and some of those opinions were not compatible with what the Rails way prescribed. And I saw Sinatra and Padrino as a way that I could still write Ruby, which I still loved, but could also make my own choices. And there's this kind of epiphany that seasoned web developers ultimately have, which is: I'm using this web framework, whether it be Rails or Django or Sinatra or Node, and all it's doing at the end is being a web server, because at the very end you're just emitting HTML or CSS or JSON, if it's an API, or JavaScript. That's all you're really doing. And all this talk about TDD and good object-oriented design, they're very important; they help us manage complexity.
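To make the mountable-app idea concrete, here's a hedged sketch of what this setup looks like in a Padrino project's `config/apps.rb`. The app names mirror the Main/Admin split we'll describe in a moment, but this is illustrative, not Upworthy's actual config:

```ruby
# config/apps.rb -- sketch of Padrino's first-class mountable apps.
# Every piece of the site lives in its own mounted app; there is no
# single monolithic controller layer as in a default Rails app.
Padrino.mount("Main",  app_file: Padrino.root("main/app.rb")).to("/")
Padrino.mount("Admin", app_file: Padrino.root("admin/app.rb")).to("/admin")
```

Each mounted app is its own Sinatra-style application with its own routes, views, and middleware, composed together at the paths given to `to`.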
But in the end, what does the framework physically do, in the air-quote "physical" way? It really just takes in HTTP requests and then responds with HTTP responses. So while Rails will give you the full foundation of how to build a skyscraper, Padrino gives you a foundation, but it also lets you choose some of the plumbing and make some choices for yourself. So the good parts about Padrino are that it really is just Ruby, it's just Rack, and if you're a fan of thinking in that mindset, you'll really enjoy it. There is less magic; things are more explicit. It's unopinionated: with the generators, when you generate a Padrino app, you specifically say what you want, ActiveRecord versus DataMapper versus Mongoid, or Sass versus Less versus ERB, whatever you want, you can specify it. I actually enjoy the process of writing middleware. I like thinking about web apps like that; I think it's a much more performant way to think about writing web apps. And Padrino itself, unlike Rails, is light and performant; it's really just Sinatra with a few more libraries on top of it. So this is what our architecture looked like when we launched. The whole big box is the Padrino app, and we had two mounted apps inside it: Main, the public site, so when you visited upworthy.com, this is what the public would see, those content pages; and the admin tool, the built-in admin app in Padrino, which essentially functioned as our CMS. And keep in mind that we were hoping we were gonna launch this thing and get lots of traffic, so we needed to figure out how to scale it right away, right from the get-go. So I kind of devised this idea called explicit caching. I remembered back in the early 2000s, there was this blogging tool called Movable Type, and Movable Type was what all those big early blogs used.
The way Movable Type worked is that it was a CMS, so when you saved your post to the database, Movable Type would see that you saved something, render the HTML right then, and write the HTML to disk. So when people visited your blog that was hosted on Movable Type, they weren't hitting that Perl app and going through the database; they were just hitting these rendered HTML files, CSS, and JavaScript living on the file system of your server. I was drawn to that idea, I liked it, so I kind of remade it a little bit here in Padrino. In the admin app, there was this publisher object, and the publisher object essentially did that: once any content was saved to the CMS, the publisher object would see that, make a request to the Main app, which was rendering the public site, and write that rendered response to Redis. And that Redis cache was a middleware layer, the middleware I talked about earlier, sitting very close to the front of the Padrino app. So when we had a million or so page views in that first month, they were all really being served from Redis. Essentially the website was just in memory in Redis. And so that worked; it scaled well. So around this time, June 2012, we hired our second Rails engineer, Josh French. He joined the team, and then a few weeks later he said, guys, I think we should switch to Rails. And he was probably right, because there were pain points that were not really related to the technical performance of Padrino, but more to the social aspects of it. The first one being that the ecosystem for libraries, which was pretty good because, again, Padrino is just Sinatra, just Rack, was not as strong as Rails'. There are just libraries for everything you wanna do in Rails.
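The explicit-caching setup described a moment ago, with a publisher writing rendered pages into Redis and a front-of-stack middleware serving them, can be sketched as a small Rack middleware. The class and key names here are illustrative, not Upworthy's actual code; the store just needs Redis-like `get`/`set`:

```ruby
# Hedged sketch of the "explicit caching" middleware: serve pre-rendered
# HTML from a Redis-like store before the request ever reaches the app.
class RenderedPageCache
  def initialize(app, store)
    @app = app      # the downstream Rack app (e.g. the Padrino stack)
    @store = store  # anything with Redis-style get/set (e.g. Redis.new)
  end

  def call(env)
    # Only GET requests for public pages are served from the cache.
    return @app.call(env) unless env["REQUEST_METHOD"] == "GET"

    if (html = @store.get("page:#{env['PATH_INFO']}"))
      # Cache hit: straight from memory, no rendering, no database.
      [200, { "Content-Type" => "text/html" }, [html]]
    else
      @app.call(env)
    end
  end
end
```

On the write side, the publisher object would do something like `store.set("page:/some-post", rendered_html)` whenever a curator saved content, so the cache is filled at publish time rather than on a visitor's first request.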
There were many things we could do in Padrino, but the quality of those libraries was not as high. Part of that is because Padrino wasn't that popular, so it wasn't very frequently maintained. The actual admin app was not very pleasant to look at; it had its own HTML and CSS styling. I put a star here because literally the day we moved off of Padrino fully, they put out a new release which had the admin system in Bootstrap, which is what we wanted all along. And there was no community. As a startup, it's actually easier to hire Rails developers, because we can't really go, hey, we know you're a great Rails developer, but you're gonna have to work on this thing called Padrino. That wasn't really a strong sell. So we decided we wanted to move to Rails. But at the same time, we're a growing startup getting lots of traffic. So how do we balance this desire to move our architecture while still having a stable, running app that is serving a growing traffic base? And Ryan's gonna continue. So we started our Rails migration in October 2012. This is a timeline; this is October 2012. And basically the way it started, we generated a new Rails app and mounted it inside the routes.rb; you can do the same thing with Sinatra, since it's all Rack. And then we slowly migrated the models and utilities into the Rails app. And so when I joined, we were in this weird hybrid state. I joined in January of 2013, after taking a nice long siesta from the re-electing-the-president stuff. And so we had to figure out how we could accelerate, get over the final hurdle, move onto Rails completely, and get off Padrino. So the first step was migrating the assets. We activated the Rails asset pipeline, which had been turned off in our app, and migrated the front-end assets and the back-end assets. It took us about a week in February 2013.
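One common way to run that kind of hybrid during a migration is to mount the legacy Rack app from the new Rails app's routes file, since Rails' `mount` accepts any Rack-compatible app. A hedged sketch, with a made-up constant name for the old app:

```ruby
# config/routes.rb in the new Rails app -- sketch of the hybrid state.
# Routes that Rails has absorbed are declared normally; everything else
# falls through to the legacy Padrino (or Sinatra) app, which is just
# another Rack app as far as Rails is concerned.
Rails.application.routes.draw do
  # New Rails-native routes go here as they are migrated, e.g.:
  # resources :posts

  mount LegacyPadrinoApp, at: "/"  # hypothetical constant for the old app
end
```

This lets you move endpoints over one at a time while the site keeps serving traffic, which is exactly the "slowly migrated the models and utilities" approach described above.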
The next step was deciding whether we wanted to do the admin area or our front-end area first. We decided to do the hard part first and do the front end. So we migrated all the front-end code, the views and controllers, which took another two weeks. And then lastly, we did the entire back-end CMS system as a final push, changed it to Bootstrap, and moved everything to Rails controllers. That took us another two weeks. So at this point, the entire migration took about eight months, but it really got accelerated in the last few weeks, just because we wanted to make that final push. And at this point, we're at three Rails developers: myself, Josh, and Luigi. And our CTO goes back to actually doing CTO work, which is great. So now here we are, we're in a monolith, a big, huge monorail. For the entire year of 2013, we were able to do things the Rails way. We were able to increase our velocity and add lots of features. We were able to program the way we wanted to and really get things moving; we didn't have to rebuild helpers that didn't exist. And so we had this one huge monolithic Rails app: the back-end CMS, our front end, and all our Ajax endpoints. But one of the things, when you're looking at this monorail, is how are you scaling for virality? On the campaign, there was a lot of traffic, but it was pretty easy to know that November was gonna be really busy, or that October had a voter deadline; it was very predictable when traffic was gonna hit. In the viral media world, you don't know if your post is gonna be a hit or not. You don't know when it's gonna gain traction. And so you can't have someone sitting there monitoring 24 hours a day, deciding when to scale and when not to scale. So we had to think about how we were gonna do that. A lot of it was just pretty simple, basic stuff. We added action caching.
So we removed the homegrown publisher system and just turned on action caching in Rails, backed by Memcache. So when people hit a page, they would hit the Memcached copy of the page instead of going out, hitting our database, and pulling in the pages. The second part was moving our assets to S3 and CloudFront. Our app is hosted on Heroku, and there's actually this really easy tutorial on Heroku on how to do this: you just set up a CloudFront distribution, point your asset host config at it, and it magically works. It's great. And then the third thing is we had lots of Heroku dynos. At the time we were running up to 40 Heroku dynos, and these were 1X dynos. Mainly these were for our AJAX requests, those asks, those popups and the different things that ask you to do stuff around the site; we literally needed to scale with those. So we ran this for a while, and we had some slowness on the site sometimes. So we tried to figure out what we could do to make sure that our site would be stable and we wouldn't have to worry about these viral traffic spikes and having to scale up and down. So we actually implemented a CDN in front of it. We took some time to figure out which CDN we wanted, because we wanted to do pass-through POSTs and different things like that, and we ended up on Fastly. Fastly is a reverse proxy; it runs on Varnish. You guys should check it out, it's great. We moved all our HTML, CSS, and images to Fastly, and then we turned off the Rails action caching. The way Fastly works is it reads the headers on your page, so you just set the expiry to, say, four hours from now. So our site could literally be down for four hours and Fastly would continue to serve the pages. From there we were able to dial down our Heroku dynos. We switched to the 2X dynos, and we only needed about half as many, because we were only serving AJAX requests off the Heroku dynos.
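In Rails terms, the two caching approaches just described look roughly like this. This is a sketch, not our actual controllers; note that `caches_action` was built into Rails at the time and later moved to the `actionpack-action_caching` gem:

```ruby
class PagesController < ApplicationController
  # Approach 1: Rails action caching backed by Memcache. The rendered
  # page is stored in the cache store and served without re-rendering
  # or hitting the database on subsequent requests.
  caches_action :show

  def show
    @page = Page.find(params[:id])
    # Approach 2 (what we moved to): set standard HTTP cache headers
    # and let Fastly/Varnish serve from the edge. With a four-hour TTL,
    # the origin could be down for four hours and the CDN would keep
    # serving cached pages.
    expires_in 4.hours, public: true
  end
end
```

`expires_in 4.hours, public: true` emits a `Cache-Control: max-age=14400, public` header, which is exactly what a Varnish-based CDN like Fastly reads to decide how long to hold a page at the edge.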
Probably the biggest thing that we learned from Fastly was the mobile performance gains. Fastly has all these different locations around the world. If I'm in California requesting a page from upworthy.com, it's gonna go to the closest point of presence, the CDN location in California, instead of going out to Heroku in Virginia, pulling from the data center, and bringing it back. So the load times went way down, we were able to fully cache the site, and we've had zero downtime since implementing Fastly. It's just been a great performance gain for us. Now, with a big monorail, there are huge pain points that come along with it. We have one app that deals with our www service, upworthy.com, and then we have a back-end CMS that our curators log into to do their work. So we had to figure out the concerns with that, and it's really painful. When there are traffic spikes on the public site, it can basically make our CMS unstable, so a curator would log into our site, try to do their work, and they couldn't navigate, and we'd just have to tell them, come back and do the work later, when the traffic spike dies down. And as our team grew, the code base became very large, the classes got huge, and the front end didn't care about some of the stuff the back end did, so it just got harder and harder to maintain. So of course, what do we do? We break up. Fun fact: this is actually filmed in Chicago. In December 2013, our buddy Josh French has another great idea. He says, hey, I think we should really start splitting up the Rails app. And if you look at this timeline, it's pretty evenly spaced. We didn't just jump into different things and start building them; we took some time on each of these sections and really focused on that narrow piece. So one of the things is, when you're trying to decide how to break up your Rails app into services, how do you do it? There are plenty of different ways you can do it.
This is the way we did it; this is not the perfect prescription for everybody. I think you just have to look at your app and see where the clear dividing lines are. We just chose two for now: we have our dub-dub-dub site and we have our back-end CMS, and there's a clear dividing line between the two. What we ended up doing is cloning the repo into its own separate repository for each app, so we could maintain the Git repo history. And then from there we started breaking everything up: this is what we need for this side, this is what we need for that side, and let's start chopping. Which ended up being a lot of deleting and removing namespaced objects. Once we did that, we deployed each of these apps to separate Heroku applications. The nice thing about Heroku is they have this function called Heroku Fork: it'll take your current app, fork it into another Heroku app, and pull over all your add-ons, everything you need. So we did that: we forked our main app into two separate applications, removed the add-ons that we didn't need on each side, and then pushed out our applications to those Heroku apps. Everything was working great. And all we had to do was switch our Fastly endpoint to point at the new Heroku app as origin, and we were done. Zero downtime really helped there. And then we continued to deduplicate the two apps. We created this core gem that holds a lot of the code shared between the two apps. Our CMS runs on about two 2X Heroku dynos, and our front-end app now runs about 25 to 30 2X dynos. This is pretty much what it looks like: we have an app called dub-dub-dub, we have an app called CMS, and this gem shares the code between them. People hit the Fastly endpoint, which sits in front of the dub-dub-dub app. So what are the real benefits of a service-oriented architecture? I think there are plenty if you look and think about it. One of the big things is the instability we talked about.
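The shared-gem arrangement can be sketched like this; `upworthy-core` and the Git URL are illustrative names, not necessarily the real gem or repo:

```ruby
# Gemfile in both the www app and the CMS app -- hedged sketch.
# Shared models and utilities live in one private gem, pulled straight
# from a Git repo, so the two services stay DRY without sharing a
# codebase. Bundler supports git-sourced gems out of the box.
gem "upworthy-core", git: "git@github.com:upworthy/core.git"
```

Each app then just requires the gem's models and helpers like any other dependency, and a change to shared code becomes a gem bump in both apps, which is also where the "coordinating deploys" pain mentioned later comes from.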
If our curators can't do their work, we can't have articles go out, so it gets annoying. If there's a problem in one app, it's easier to fix and maintain. Each of them has different scaling needs. The interesting thing is our CMS only has about 20 users, so we can run it on two dynos instead of having 30 dynos serving all these different pages. The different scaling needs were really beneficial here. And it also divides our teamwork more naturally, so engineers on the team can decide to work on different parts or features, and we don't have to wait for something to ship or something else to finish. But of course there are a bunch of drawbacks to running an SOA. If you're on the full stack and you wanna make a feature in our CMS that needs to carry all the way through the funnel to the front end, you have to have all three environments running on your system to make sure your change goes all the way through the funnel. Coordinating deploys is also a huge pain. If you need something that touches the core gem plus the CMS plus the www app, that means you have to deploy all three, coordinate them, and make sure they all happen at the same time or go in a certain order; whereas when you're on the monolith, it's really easy to just push one big Rails app out. And then migrating to a fully DRY set of code bases is really tedious; figuring out what needs to go where, and where we put stuff, has just been a really hard thing to do. So some of the future work we're gonna continue to do on the SOA: we're gonna continue to add stuff to our core gem and remove duplication, which is kind of a pain sometimes, figuring out what things go where. And then we're considering breaking up the app even more. Right now we have two apps, but there's this one piece that actually uses almost all our Heroku dynos, so we're thinking about making that its own separate app and service that we can ping.
And then they all communicate with these different data stores. A lot of times an SOA has a service layer, and everything communicates through that service layer, so maybe we should move to that. Cool, so just to wrap up here, some lessons learned. First, when you're working on an app for a company and there are feature requests and all these other things going on, really do wait to make these big technical, architectural changes until everything starts to really, really hurt, until you really feel the pain. And once you do decide to make these changes, it's okay to take a really long time. We're probably gonna take eight months to do the full SOA system that we envisioned back at the beginning of the year, and that's just because we have other feature requests from the rest of the company to fulfill. And luckily, since we're on Ruby, that makes it easier to do; it really made it easier when we were moving from Padrino to Rails. Serve everything you possibly can from a CDN. I don't think people use CDNs enough; they're just hugely beneficial to the performance of your system. They're great for end users, particularly mobile users. At Upworthy, well over 50% of our traffic comes from phones and tablets, and CDNs really, really help there. So remember that at the end of the day you're just dealing with HTML, CSS, JavaScript, and maybe JSON, right? That's all you're serving. So think about how you can most efficiently serve those resources from your app. And if you're doing your own startup, if you're working on your own project, I hope you can also experience this kind of high-traffic growth, because it's been a hugely fun and rewarding experience for both of us. So with that, I'm Luigi, he's Ryan. Our slides and a write-up are already on our GitHub engineering blog, and we'll be happy to answer questions.