All right, the fast version, because we're 10 minutes in. Apparently Macs don't work every single time. Hi, my name is Adam Hitchcock. I work at Disqus, as you just heard. You can follow me on Twitter or something. Shit, okay, cool. Sweet.

Let's just start off the bat: we're hiring, but we're also American. So if you want to leave Europe, come talk to me, but that would be crazy.

So what am I going to talk about? First I'm going to address the issue of why I lied to you last year. Then: what's a SOA? Why should you SOA? What are the different data patterns in a SOA? How does Disqus do it, with both a legacy and a greenfield example? And does that work for us?

So last year my talk was entitled "How Disqus does it when it isn't Django". It was a placeholder for services; it was a talk about a very similar topic, but saying specifically: hey, you don't have to use Django for everything. It's not what the cool kids are doing. But "why do I sit on a throne of lies?" is the main question.

When I got back from EuroPython last year, my CTO basically said: hey, we're going to make a technology choice, and that technology choice is Django. We're going to redouble our investment in it, because we are a huge Django shop, so we're already really good at it. And there are some really great reasons for it. Using standards across a company that are also known outside the company makes it super easy to hire, and there's a huge community. We want to leverage all that in both our legacy application and the platform we're moving to. So I kind of had to challenge my thinking about Django and find some new ways to use it. So here is take two: how Disqus does it, Django edition.

So, if I ask you to raise your hands, who will raise their hands? All right, so about 60%, cool. Who already knows what SOA stands for? Cool. That's like a hundred and twenty percent, because more of you raised your hands. Were you correct?
I guess so. It stands for service-oriented architecture, or if you're British, service-orientated architecture, and I'm going to define what I think that means right now. So, a SOA. I like to pronounce my TLAs, or "tulas", you know, three-letter acronyms.

A SOA is a system that you've architected in a specific way. That system should have discrete software components, which are basically running programs, and those are called services. Those running programs should have simple, well-defined APIs or interfaces, and those are the APIs through which they're going to communicate. Then you can loosely coordinate, or loosely federate, these services to achieve some sort of business goal.

In a SOA there are two primary roles: service providers and service consumers. These are providers and consumers of data, and you can totally do both. That's cool; we're not going to mark you down for that.

And it's not a new idea at all. Doug McIlroy, who I think worked at Bell Labs a long time ago, said this: the Unix philosophy is, write programs that do one thing and do it well, write programs to work together, and write programs to handle text streams, because that is a universal interface. He was talking about Unix pipes and programs talking over file descriptors on the same system. We're talking about network pipes and programs talking over file descriptors over the network. So it's really the exact same philosophy, and I think it still holds true today.

It's also not actually a new idea in web systems. People have been doing this for many years. You're probably already using services in your system. How many of you are Django app developers day to day? Sweet, a lot of you, excellent. You're probably using Postgres if you're doing it right, or MySQL if you're doing it also right.
Or Redis. You're probably using queues in your system: RabbitMQ or Kafka. Kafka is not technically a queue, but it's basically a queue, so I put it on the queues list. Or you're using external APIs. Who knows what Kafka is? Sweet. I will talk about that later, if someone asks and reminds me. It's awesome.

So external APIs are also services. Everything is a service. Single-page JavaScript apps are great examples of consumer services, usually. But if they do a POST back to your server, that's now producing data.

So why should you do a SOA? This is basically my report card for later in the talk. I think that if you've done it right, you can have services written in any language. It doesn't matter what machines those services run on, and if those services have a strong API, it doesn't matter how they store or abstract that data internally. It's just the API that matters. Services should scale independently of each other. They should also give you easier testability and easier deployment. And because you've split your system up into well-defined chunks, it should be easier for individuals to maintain conceptual integrity, or understanding, of the system. So hopefully you can learn how to do all these things in this talk.

When you're saying, okay, sweet, I want to make a SOA... well, how? I think you have to understand the data patterns in your system, and I see two primary data patterns: transactional data and asynchronous data. Transactional is further divided into two areas. If you're doing transactional data against, basically, database models, some sort of straight representation of an object in your system, like a user, then REST works really well.
It's CRUD semantics against a model. If you need to do something where you're combining data and math or logic to say, hey, recommend something to me, or, can I do something, then RPC is a model that works really well, and that's a more procedural thing. If your response is the same every time, then REST might work well. That's kind of the way I think about it.

Then asynchronous data maps really well to queuing systems or pub/sub systems. It's really good when you have high-CPU or long-running workloads that you can't do in a web request cycle.

You also have to pick your APIs. There are two parts here. You need to pick: what are the bytes that I'm going to send to the other person or other service? And how am I going to send those bytes to the other service?

How many people know what Protobuf or Thrift or MessagePack or Avro is? All of those are examples of non-JSON encodings: how do you put an object across the net in a format that other languages can read? Picking something from that list is super important, because in order to have a heterogeneous environment you need an encoding that multiple languages can work with. Pickle is not on this list. Yeah, I just spent the last week removing pickle from something.

You also have to pick your transport protocol. So, HTTP: who knows what that is? Literally every person... this was the lowest number of hands.
Okay, so HTTP stands for Hypertext Transfer Protocol, if you don't know, since no one raised their hands. Or there's Thrift; Thrift provides a network binding as well. I like HTTP and JSON because they're super easy to do, and Django does them really well out of the box.

And HTTP is great if you actually read the spec. Or don't even read the spec; read the Wikipedia page on the spec. You'll learn a lot. Things like the Accept header. You can start with HTTP plus JSON, and if your server respects the Accept header, you can support multiple encodings going forward. Today you can say "I accept application/json" or "text/json"; in the future you can say "I accept application/protobuf", or Protobuf version 2, whatever.

HTTP works great in other ways too. It has things like keep-alive as a connection option. Negotiating that initial connection is a little expensive, but going forward, if you're keeping the connection alive, it costs about the same as any other protocol. If you need to save the eight bytes you get from going from HTTP to Thrift, I mean, sweet, you're Facebook. But if not, it works.

So, transactional data, REST: Django does it really well out of the box. That's basically what it's been doing for the last, I don't know, million years. We used to use things like Django forms, and those suck, and other Django things that make it easy to fall into bad practices. Django REST framework is a relatively recent framework that makes it super easy to keep your code organized and clean while providing RESTful access to your database models. It's really cool. Check it out.
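The Accept header idea from this part of the talk can be sketched in plain Python. This is only an illustration under stated assumptions, not Disqus's code: `parse_accept`, `choose_content_type`, and the `SUPPORTED` list are hypothetical names, `application/protobuf` is just the label used in the talk, and the parser handles only media types and `q` values (no `type/*` wildcards, no other parameters).

```python
def parse_accept(header):
    """Return the media types from an Accept header, most preferred first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        # Sort by descending q, then by the order the client listed them.
        entries.append((-q, position, media_type))
    entries.sort()
    # q=0 means "not acceptable", so those entries are dropped.
    return [media_type for neg_q, _, media_type in entries if neg_q < 0]


# Hypothetical: the encodings this service can produce, in server order.
SUPPORTED = ["application/json"]


def choose_content_type(accept_header, supported=SUPPORTED):
    """Pick the response encoding, or None if nothing matches (HTTP 406)."""
    for wanted in parse_accept(accept_header or "*/*"):
        if wanted == "*/*":
            return supported[0]
        if wanted in supported:
            return wanted
    return None
```

A client that already asks for `application/protobuf, application/json;q=0.9` gets JSON from today's server and would get Protobuf once the server adds it to its supported list, without the client changing.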
There's a Kickstarter campaign. It's awesome. Or you can roll your own API. We definitely hit lots of performance snags with REST framework, but we also got around them very easily. Come talk to me about how I do awesome caching and versioning things in that framework. I'm probably going to write blog posts about it.

Cool. RPC is the other transactional pattern: like I said, logic-heavy APIs, recommendation services, authorization, authentication. I don't like RPC as a first stop because it's prone to overspecialization. When you're building RPC APIs, you make APIs that are really good at doing one thing, and if you're building a platform, you want to make APIs that can support lots of things being built on top of them. Sometimes you need a specialized API, but if you can do it in a generalized way, I think that's better.

Then there are systems like Thrift and ZeroRPC, which are super easy to use, and that's their downfall: they abstract away the fact that you're actually doing a remote procedure call. I've seen code that makes lots of Thrift requests serially, like 50 of them, and people ask, why is it slow? Because you just did 50 network requests, and you waited for each one to finish before starting the next. So yeah, it's slow. With HTTP, you usually know you're using HTTP because you imported httplib or requests or something.

Asynchronous is good for high-CPU, long-running tasks. There are a couple of ways to do this. I think Django has a great entry point here: management commands. These are highly underutilized. On our ad server we have a pattern of "while true, do work" in our management commands, so we can basically do something forever. In our ads system, we have to rank ads basically forever.
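The "while true, do work" pattern can be sketched roughly as follows. This is a guess at the shape, not the actual Disqus command: `run_worker` and its parameters are hypothetical names, and the loop is written as a plain function so it runs without Django. In a real project it would be called from the `handle()` method of a `BaseCommand` subclass under `management/commands/`.

```python
import time


def run_worker(do_work, idle_sleep=1.0, max_iterations=None, sleep=time.sleep):
    """Call do_work() in a loop, pausing briefly whenever it reports no work.

    do_work should return True if it did something and False if it was idle.
    max_iterations exists only so the loop can be stopped in tests; a real
    worker would leave it as None and run until the process is killed.
    """
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        did_work = do_work()
        if not did_work:
            # Avoid spinning at 100% CPU when there is nothing to do.
            sleep(idle_sleep)
        iterations += 1
    return iterations


# Inside a Django management command this would look roughly like:
#
#     class Command(BaseCommand):
#         def handle(self, *args, **options):
#             run_worker(rank_ads)   # rank_ads is a hypothetical function
```

Unlike a cron job, a loop like this has no gap between runs, which is the property the ranking workload needs.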
It's just constant CPU, and we never want to not be doing that. So cron jobs don't really work very well, because there's like a minute in between the jobs.

Celery: how many people know what Celery is? Excellent, I don't have to explain this then. So yeah, Celery is a tasking system. A great pattern for asynchronous work is a post-save hook plus a Celery task. Celery can also use JSON, which supports our heterogeneous requirement. We wrote something called Go Celery (our DevOps lead Matt Robenolt did that) because we need to parse Celery tasks in Go. If you're doing that in pickle, you can't, because nothing else can read pickle. So use JSON, something that's platform independent. And Celery beat is great for periodic tasks that you need to do, like data import.

Django has also been incredibly easy for us to run. It has an incredibly well understood I/O loop, which is: wait for a command, or outsource the I/O loop itself to something else, like uWSGI. It has multiple entry points. There's WSGI, and WSGI can be run in nginx directly with uWSGI, Apache mod_wsgi, Gunicorn, and a million other things. It's great because those things handle the I/O loop, and WSGI just has to respond to "here's a dictionary of specific information". With management commands, you're in control of the I/O loop. Celery tasks and Celery beat are the same; they're just listening on queues to respond to information. So it's really great, because each of these is basically a message-passing interface at the end of the day. They either generate or respond to messages.

So how do we do this at Disqus? First we'll look at Disqus web, which is a legacy product. It's a monolithic Django product. It is this many lines of code: 183,000. And that's just because we deleted a lot of code recently.
It was great, like 67,000 lines or something. So it's over seven years old, and we made a lot of bad decisions along the way, but some good ones too.

When we're deploying this in our service-oriented way, we deploy the entire code base. In its current state it's impossible for us to break it up into multiple code bases, so we have to deploy the entire thing, and that means we end up treating it like a library. We cluster machines by purpose, so at the end of the day those clusters are our services. By doing this we see the CPU patterns emerge, which makes scaling easier. We can say, oh, we need a different kind of computer: high CPU for one thing and high memory for a different purpose.

Then we route to these services, or clusters, based on host name. We use Varnish and HAProxy to route requests, and then route further based on the path to get them to the right machine. That plays into our data location transparency point.

When we're deploying, we do a three-phase deploy, so there are multiple versions of the services out at any given time. It goes: old version, old and new versions, just new version. So we always have to make sure that whenever we upgrade anything, it will play nice along that route.

To change the entry points in these clusters, we use different settings.py files. I haven't seen this done a ton in the wild, so I wanted to highlight it. By using multiple settings.py files, we're able to drastically change the behavior of the Disqus library. We can have different URLs, different middlewares, different template request contexts. Template request contexts are something interesting.
I don't think they're lazy loaded. So if you put something that you want in one of your templates into a global request context, say you're setting that up in a middleware or something, that's going to execute every time. Even if you're using it 80 percent of the time, the other 20 percent of the time that's wasted cycles, wasted I/O. So being able to separate that out per service is useful.

URL resolution is also O(n). Django will go down your URLs list asking: does this match this regular expression? Does it match this regular expression? So if your most commonly used URLs are at the bottom of that list, you're going to have a bad time. In our API service we actually found a 15% CPU saving by reordering those, because our most used URL was at the bottom. It was literally the last one. Simply by moving it and about 15 others, out of a list of several hundred, to the top, we got a huge saving. So do that. Someone should just make a cool project that does that automatically.

So, two examples of services. We have our public API, which is disqus.com/api. It's how we use our own product in the wild, and it's also how anyone who wants to integrate with it uses it. It has a ton of middleware, I think dozens of them. It has over 300 URL routes, and it has several things that automatically get loaded for templates.

Then there's our internal objects API. This is internal to our data center, a very raw model representation. It has no middleware, it only has one URL route, and it basically has nothing else. Its settings.py is like 12 lines long. So there's an enormous difference in speed between these two setups. Unfortunately, I do not have a graph to show that, but I can probably tweet one later.

So did it work for us?
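The URL-ordering point above can be illustrated with a toy resolver. This is a simplified model of first-match-wins regex routing, not Django's actual resolver: `resolve`, `promote`, and the route and view names are all made up for the example.

```python
import re


def resolve(path, routes):
    """Return (view_name, patterns_tried) for the first route that matches.

    routes is an ordered list of (regex, view_name) pairs, checked top to
    bottom, so the cost of a lookup grows with the position of the match.
    """
    for tried, (pattern, view_name) in enumerate(routes, start=1):
        if re.match(pattern, path):
            return view_name, tried
    return None, len(routes)


def promote(routes, hot_view_names):
    """Move the routes for the busiest views to the top, keeping relative order.

    This is only safe when a promoted pattern cannot also match a path that
    an earlier pattern used to catch, because the first match wins.
    """
    hot = [route for route in routes if route[1] in hot_view_names]
    cold = [route for route in routes if route[1] not in hot_view_names]
    return hot + cold


# 300 hypothetical routes; the busiest one happens to be last.
routes = [(r"^route%d/$" % i, "view%d" % i) for i in range(300)]
before = resolve("route299/", routes)
after = resolve("route299/", promote(routes, {"view299"}))
```

Here `before` reports 300 patterns tried and `after` reports 1, which is the kind of per-request work the reordering removes.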
Yeah, you know, it kind of works fine. It was still a large code base, and we still got all of the problems of having a large code base with it. Version conflicts are still very problematic, whether it's a function or method that you're changing, which you still have to integrate over the entire code base, or an external package upgrade. We've had lots of problems with ZooKeeper and Kazoo over the last two years, because there were several incompatibilities there and people were using different versions. So conceptual integrity of this entire thing is basically still hard, because you still have almost 200,000 lines of code.

So on our report card, we did okay. We got our heterogeneous environment, which we realized with the Go Celery tasks, plus data location transparency, independent scalability, and easier deployments. Something I didn't say about those deployments: we can actually deploy the service clusters separately. If we need to deploy something for just one of them in an emergency, we can totally do that. We usually deploy the entire thing at the same time, though.

Now the ad server. This is a product that's about a year and a half old, and it was started as a Flask application, as I talked about in my talk last year, plus other custom stuff. In the last year we've ported it from that to Django, and we made some decisions when we did that. We want to make sure that we use Django apps very well. Django best practices are best practices for a reason; they really help you architect your system in a good way. We want to leverage Django beyond WSGI. We want to use all those entry points, like management commands and Celery tasks. And we want to split up our code base very intentionally. We want to have one code base that can access the database, and I'm talking about model access, like using the ORM, while the other code base can only access that sort of information via the strong API.
So that's just how we separate the services in that regard. But we do have multiple services that can access the database.

These are just some of the services that go into the ad system. We have our data API. This maps really well to REST framework, and it has some minimal RPC endpoints because we need to do mass data export for other services. We basically just need an endpoint that says: give me everything. Then we have our ad-serving API. This is basically a recommendations API, and it's an RPC endpoint. Then we have our asynchronous tasks: scoring and cache warming, keeping caches hot with recent information. These are run using management commands in that "while true, do something" style. And then there's the ads data import service, which is Celery and Celery beat. These are either responding to new objects getting created and having to collect more information about them, or looking for what's changed in other systems on a batch process, like every five minutes: what was created in a different system? Then they import that information for scoring and other purposes.

And this is how we organized it. The two on top have the ads stuff. Because we're leveraging lots of frameworks, we get to have only 11,000 lines of code, compared to a hundred and eighty thousand. Then on the bottom are the warming services and such, and they're very close in lines of code, about 11,000.

And this is what it looks like with boxes and arrows. On the far left you have JavaScript land, then there's the internet, then you have a web serving... oh, I'm supposed to use the mouse. Okay, boom. Now you have a web serving layer here, and then you have all of our back-end services. These are again colored by code base, just so you can see how that goes. And here are the backing technologies for that: Backbone on the front end, then our Django, uWSGI, and nginx layer, and then our backing stuff.

So did this work? I think it was kind of amazing.
Ah, it didn't play more than once. Ah, it's so good. Why is it not looping? Okay.

So did it work? All right, I think it worked really well. The only downside is the inverse downside: we had the problems that small code bases have. It's hard to share code when you have lots of code bases. I hate things like git subtree and stuff; those are just going to hurt you. So we ended up making a third code base for sharing stuff. We use pip internally, and we run our own Cheese Shop, so we just deploy a package out there. That is the ads core package for shared information, and that actually helps us make even better code.

Django best practices helped us a lot in the long term. I wish we had actually been more strict about those. And we definitely made it easy to understand the entire system, because there are so few lines of code and everything is encapsulated so well. It's easy to quickly add new things, as long as they're being added as an individual item. It's easy to test everything.

Integration tests are more important, because you need to integrate with other processes, not just other libraries. So you need to do a really good job of testing your external APIs. The Django test client becomes way more important than I'd like it to be, because it's slow.

Then, service APIs live a long time, so you're going to need to learn how to version them and support them for extended periods of time. I definitely have a couple of days every couple of months that are like: okay, go find everyone else's code where they're using the version of the API I want to delete, and fix it for them.

We get fast deployments. Our deployments are about two minutes, and our testing is like a couple of seconds right now per service, so it's pretty fast in general. And yeah, they scale independently. So we got all of our check marks on our report card. Yay.

Is it a success for Disqus? It's a knockout success. That was really bad.
Okay. Yeah, it's easier to run overall. It's easier to understand and build new systems in this environment. We can hire an employee who never has to see legacy code, and I'm jealous of them. And it's easier to not break existing systems, because you're not touching them. You can't accidentally break something that you're not committing code to. Well, okay, you can, but it's harder.

So as a roundup: do one thing and do it well. You want to examine your data patterns up front, and then, based on that data pattern, you need to make some decisions. Those are going to be API decisions around protocol, transport, and what methodology you're using to access that data. And Django has multiple entry points. You should definitely use them; it's not just a WSGI machine. And then: do one thing and do it well. That is the one line from the talk you should remember.

Here are some links. Go support Django REST framework. It's got a Kickstarter. It's got a ton of money right now; I don't even know if they want more money, but, you know, it's a great thing, support it. Here's our Go Celery stuff. Django best practices: I like Lincoln Loop. They're out of Portland, I think. They're some awesome people, and they are really good thought leaders in how to do Django development in general. And the Unix philosophy: it's a great Wikipedia page.

So if that was interesting to you and you hate Europe, come to San Francisco.

Questions? Yeah, I already have one.

About the three-phase deployment... Oh, yeah. In the case of a service which uses a database directly, like the one you mentioned... Oh, yeah. How do you manage that? Because the database is only at one version, I guess.

Yeah. So we basically roll forward. When we want to do something to upgrade our database, we change the code, we change our Django models, we make the SQL that will do that upgrade, and we run that upgrade. When you're adding a new column, that's okay.
Like, you're adding a new column? Sweet. Django will run with additional columns; it will ignore them. If you're changing a column type, well, you can't do that. We exceeded the size of the ID in one of our tables, so now we're all, like, big ID, or bigint ID, whatever it is. What is it, bigserial? Something in Postgres. I don't know how to SQL. But the point is, you can't do that. You cannot change your models; it's very expensive and basically impossible. But you can totally add to them. So it goes to data planning.

But also, because it's behind the API, you can always abstract that. So we have models where it has its one model, it has its tables, and then it has adjunct models that are meta information for that. And as a consumer of the API, I don't even care, because I don't even know that those exist. It's up to that service to manage all of that.

Right, just a question. What do you do for logging and monitoring? Because you have... Oh, yeah. ...your SOA and lots of external dependencies, and I dread it when something happens in Celery or RabbitMQ and I have to go to the manual or start reading again, and I have no idea what's going on. Yeah. What do you do for that?

So I'm not on the ops team, but I have a little bit of insight into how that works, because we use the same infrastructure to manage our applications. We basically use StatsD for everything. Who knows what StatsD is? We're still in the participation phase. Okay, cool. StatsD is basically a daemon that runs locally, or anywhere, because it's a service, and you can talk UDP to it. Over a period it will aggregate information and then beacon that out to somewhere else, what's usually called a collection service. Then that collection service will take everything it has collected and tell Graphite or something about it. And so we use Graphite.
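"Talking UDP" to StatsD is simple enough to sketch with the standard library. This is an illustration, not the client Disqus uses: the helper names and the metric names are invented, and the host and port assume a StatsD daemon on its conventional default, `127.0.0.1:8125`. The payloads follow the plain-text StatsD line format, `name:value|type`.

```python
import socket


def counter(name, value=1):
    """Build a StatsD counter datagram, e.g. b"ads.served:1|c"."""
    return "{}:{}|c".format(name, value).encode("ascii")


def timer(name, milliseconds):
    """Build a StatsD timing datagram, e.g. b"ads.rank:12|ms"."""
    return "{}:{}|ms".format(name, milliseconds).encode("ascii")


def send(payload, host="127.0.0.1", port=8125):
    """Fire and forget: UDP gives no reply and no delivery guarantee."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()


# Typical use inside application code (metric names are hypothetical):
#     send(counter("ads.served"))
#     send(timer("ads.rank", 12))
```

Because the send never waits for an answer, instrumenting a hot code path this way costs very little, and a dead metrics daemon does not take the application down with it.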
We use... I forget the thing that actually does the alerting. Nagios? I can't remember now. But we basically have Graphite stats, we alert on thresholds on those stats, and we have them for hardware and software. So we have thousands and thousands and thousands of different metrics being tracked, and by aggregating the correct ones you can get a pretty good intuition about the system.

I want to answer one question I was asked last night, because I thought it was particularly interesting. I was talking to Tom Christie, and he asked: what was the most challenging part about designing your new system? I was surprised to find it was actually just the software architecture. It was making sure that we don't write spaghetti code. When you're taking an idea and implementing it, you can spread that idea across your entire application, or you can encapsulate it. So using Django apps in a Django project as their own mini services, with their own APIs at a software level, doing that well was our hardest part.

And his response was: really? I thought you would say designing it so it doesn't go horribly, horribly wrong when it breaks. And that was actually relatively easy, because everything is message passing. When something breaks, it just stops sending messages. So that means everything kind of turns off instead of blowing up. It's still a failure mode, which is bad, but it's a better failure mode. It's not the "oh, we need to bring more hardware online to handle the load" kind; the load just kind of dies.

So those are the two things. I also had questions for you, and we can talk about these outside later. How do you guys like RPC? How do you do maintainable RPC? How do you guys do service discovery? I like DNS, because I also like HTTP. And what do you guys think about the many code bases versus one code base thing?
I know large companies like Google and Facebook love it, and I'm split. So I'm just curious: what are your thoughts?

Okay, I'm sorry, we're out of time. Thank you again, Adam.

Thank you.