Okay, it looks like we're at 1:30. Can you all hear me? Okay, cool. So, who am I? I'm John Billings. I've been working at Yelp for almost five years, and during this time I've done a bunch of work on our service-oriented architecture, splitting our monolith out into a bunch of different services. I've also done a lot of advising different teams on how to do this, and Swagger has formed an important part of that process.
So, just a little bit about Yelp. What we do is connect people with great local businesses. Some statistics: as of the end of Q3 2015, we have about 89 million unique monthly visitors via mobile and more than 90 million reviews contributed since the beginning of the company. About 71% of all searches now come from mobile, both web and app, and we're present in 32 countries.
So, HTTP and JSON are amazing. There's a unicorn, a very happy unicorn. There's so much supporting infrastructure. If you're in Java land, there are frameworks like Dropwizard. If you're in Python land, which I'm more familiar with, there's httplib, there's simplejson for doing JSON decoding, and you've got frameworks like Pyramid. At the infrastructure level, you've got things like Varnish to do your HTTP caching, and you've got HAProxy and nginx to do load balancing, and both of those have good layer 7 support for doing nice things with HTTP. You've got Apache for actually serving things up, and for debugging you can use things like curl and jq. So this is great, so much support.
But if you look carefully, the unicorn's actually crying. Why is this? Why is the unicorn so sad when everything is so great? Well, here is a mythical situation that has never happened to any of us, right? The website is down again. Why is that? Oh, I just pushed the Pet Store service. So the Pet resource, it takes a string, right? Like, I passed it a string. No, no, it's an integer, right? So there's confusion over this API.
You know, curse you and your strings. If only there were a better way to specify our API. We do actually have a few options, and I'm going to walk through some of them.
So you can write specification documents, and here's an example of a spec doc for the Pet Store service. Just in case you don't know, the Pet Store service is an example CRUD web app where you can post new pets, search for pets, and download pet information. This could be an endpoint for the Pet Store app. You see there are some attributes, some required and some optional, and some descriptions of what each one means. Then there's a response, an array of pet objects, defined below.
So, pros and cons. It's really easy to get started with this, and this is what we did at the beginning when we were breaking out our services. If you use something like Google Docs with commenting, then people can just browse those during review time and leave comments, and you can iterate very quickly. And if you're not completely technical, say you're a PM, you can also look at those and try to figure out what's happening. But it's not all great, because the implementation and the specification can certainly drift over time. There's nothing actually keeping them in sync. Maybe at the beginning they are in sync, but then some people add some new endpoints and change some parameters, and after a few months the spec really doesn't reflect reality at all. It's also really easy to be imprecise, because this is just written prose. There's nothing forcing you to actually specify how things are.
Another option is to use an IDL, an interface definition language, like Thrift or Protocol Buffers or Avro. Here's an example of a very simple multiplication service defined in Thrift. You see it looks very IDL-y. There's a multiply function. It takes two ints and returns another int. So what's good about this?
Well, it can be very efficient on the wire. Because you're using an IDL behind the scenes, you can really compress things on the wire, and it's really efficient to decode. We certainly have situations where you send, I don't know, 10 megs of JSON and it takes Python services 100 milliseconds to decode that, and there's not much we can do to speed that up. Whereas if we used Thrift, I think it would probably be more efficient. But there are some downsides. You cannot use nice layer 7 technologies such as HTTP caching. That's out; we're just at TCP now. It's difficult, but not impossible, to debug on the wire. And the quality of support for some of these IDLs can be a bit variable. It really does depend, especially if you're using a less well-supported language.
Another option is to write lots of integration tests between your clients and your servers, and then effectively the tests become the specification. You're basically saying something like: as a client, if I send this request to the service, then I should get back this response. So what's good about this? You should hopefully be doing a little bit of integration testing already, so that you learn about errors before you hit production. But it's not all good, because integration testing is pretty much at the tip of the testing pyramid. You hopefully have just a few integration tests, and they can take a long time to run because you may have to spin up some databases and spin up some services. It all takes time, so the iteration speed is very slow. Ideally you want to find bugs before this point.
Another alternative is to write client libraries. You hand your clients a library in whatever language they're using, they just talk to that library, and behind the scenes the library does all the communication on the wire. So what's good? The client lib is a black box. Consumers really don't need to worry about what's happening there.
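As a rough illustration of the client library approach, here is a minimal sketch of what a hand-written Pet Store client might look like. Everything in it (the PetStoreClient class, the get_pet_by_id method, the fake_transport function and the /pet/<id> path) is a hypothetical stand-in rather than Yelp's actual code, and the transport is faked so the sketch runs without a network.

```python
from typing import Any, Callable, Dict

# Hypothetical transport: takes a URL path and returns the decoded JSON body.
Transport = Callable[[str], Dict[str, Any]]


class PetStoreClient:
    """Hand-written client: callers never see URLs or JSON decoding."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def get_pet_by_id(self, pet_id: int) -> Dict[str, Any]:
        # Manual validation of the argument (bool is excluded on purpose,
        # because bool is a subclass of int in Python).
        if isinstance(pet_id, bool) or not isinstance(pet_id, int):
            raise TypeError("pet_id must be an integer, got %r" % (pet_id,))
        pet = self._transport("/pet/%d" % pet_id)
        # Manual validation of the response.
        if not isinstance(pet.get("id"), int):
            raise ValueError("service returned a non-integer id")
        return pet


def fake_transport(path: str) -> Dict[str, Any]:
    # Stand-in for a real HTTP call so the sketch runs offline.
    return {"id": int(path.rsplit("/", 1)[1]), "name": "doggie"}


client = PetStoreClient(fake_transport)
pet = client.get_pet_by_id(42)
```

Passing the string "42" instead of the integer 42 raises a TypeError here, but only because someone remembered to write that check by hand.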
And from our own experience at Yelp, we have actually done this and it can work. But there are still disadvantages here. If you're writing the client lib, then there's lots of boilerplate, code that doesn't really add anything but that you just need to put in there. You're probably going to be doing manual validation, checking that strings are strings, that ints are ints, that the ranges are okay, that sort of thing. And you still haven't got the specification for the wire protocol. You're not actually forced to write that down at all. So you probably also still need some integration tests between the client lib and the service.
Okay, or is there another way? We could stick with our existing HTTP and JSON infrastructure. Maybe we could invent a machine-readable specification language to declaratively specify what our endpoints and our return types look like. Maybe you can see where I'm going with this. Then we could create a whole bunch of tooling to generate client libraries from these specifications of our endpoints. We could also create some tooling to perform server-side validation against these specifications. Are people sending us the right things? Are we returning the right things back from our server? And then maybe we could also create a really vibrant open source community around all of this.
But what am I talking about? I'm talking about Swagger here, and there is a URL there if you want to look at the Swagger spec. What I'm going to do now is walk you through Swagger. So, the history of Swagger. It's been around since about 2011. That's when version one was released, and there have been a few different versions since then. I think in version 1.2 they formally specified the Swagger specification language. Then version two, which was a moderately big change, was released in 2014, and that got rid of a few of the pain points.
And then just this year, it was renamed to the OpenAPI Specification, and it's supported by Google, Microsoft, IBM, and a few other companies now.
So here is the Pet Store API, and you can actually try hitting this endpoint. If you curl it and specify the ID of a pet, so 42, and then pipe it through jq just to format it nicely, you get back this response. It tells you the ID of the pet, the category, the name of the pet, some URLs, some tags, and the status. For our services at Yelp, we do things that look a little bit similar to this. This isn't completely strange as a response from a service. There's quite a lot of structure there, and as a result it's quite easy to get something wrong when you're dealing with objects like this.
So what would a Swagger specification look like for this? There's a full spec at the URL at the bottom of the slide if you want to take a look at it. But what does the Swagger spec say? Well, it says it's version 2.0, and there's a description, an opaque version string, and a title. The important things are these path objects and definition objects. That's basically where the meat of it is, so I'm going to talk about path objects and definition objects in turn.
So, a path object for the endpoint that we just saw. You see there's a path, /pet, and then this path parameter, which is petId. If you look down, you can see this petId path parameter is bound in the parameters section: the name is petId, and it's in the path. Then there's this description, and we see it's required. What type is it? Well, it's an integer, and there's this format, which is int64. I'm going to talk a little bit about types and formats later, and what choices you have there. Then there's this response type at the bottom. You can specify response types for both success cases and error cases. I'm just showing a success case here.
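To make the path object concrete, here is a minimal sketch of it written as a Python dictionary rather than as the JSON on the slide. The field values follow what is described in the talk; the description text is a placeholder, and the two helper functions are hypothetical illustrations, not part of any Swagger tooling. They check that every name in braces in a path template is bound by a parameter declared as being in the path.

```python
import re

# Path object for GET /pet/{petId}, following the fields described in the talk.
paths = {
    "/pet/{petId}": {
        "get": {
            "parameters": [
                {
                    "name": "petId",
                    "in": "path",
                    "description": "ID of the pet (placeholder wording)",
                    "required": True,
                    "type": "integer",
                    "format": "int64",
                }
            ],
            "responses": {
                "200": {
                    "description": "successful operation",
                    # Reference to the Pet object in the definitions section.
                    "schema": {"$ref": "#/definitions/Pet"},
                }
            },
        }
    }
}


def path_template_names(template):
    """Return the parameter names that appear in braces in a path template."""
    return re.findall(r"\{([^}]+)\}", template)


def unbound_path_params(path_objects):
    """List (path, name) pairs whose template name has no 'in: path' parameter."""
    missing = []
    for template, operations in path_objects.items():
        for operation in operations.values():
            declared = {
                p["name"]
                for p in operation.get("parameters", [])
                if p["in"] == "path"
            }
            for name in path_template_names(template):
                if name not in declared:
                    missing.append((template, name))
    return missing
```

Calling unbound_path_params(paths) on the dictionary above returns an empty list, since petId is declared in the parameters section.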
So if you get a 200, it's a successful operation, and we return this Pet object, which lives in the definitions section. This is just a reference. The nice thing about Swagger is that you can split these definitions out across different files, which can help keep things DRY.
So here's an example of another parameter object. It's not used in the example I just gave you, but it's used in the findByStatus endpoint. So what is this? Now we're saying it's a query parameter, the name is status, it's required, and it's an array. The array contains strings, these strings range over available, pending, or sold, and the default is available. This is pretty powerful. We've got an enum type here, and we're going to be able to check that people are passing in the right enum values.
So let's continue with our getPetById. This is the response type, here on the right, and I've given you the spec for it on the left. You can see that there are a couple of required fields, and then there are a whole bunch of optional fields defined below. There's id, there's category, there's name, there's photoUrls, tags, and status. This makes sense, and hopefully you can see how it all matches up.
What else can you specify in Swagger? Well, quite a common idiom is to return a map from strings to something else, and as we were retrofitting Swagger specs to our services, we kept hitting this case. In Swagger 2 and beyond, you can define these very easily. Here are a couple of examples. For a string-to-string map, you just say the type is object and additionalProperties has type string. You can change that type, so you can have a string-to-Foo map if you've defined a Foo object, and that just works.
So what are these data types and formats? Well, your types range over integers, numbers, strings, and booleans.
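The definition, parameter, and map objects just described can be sketched as Python dictionaries as follows. Several details are assumptions for illustration: which two fields are required, the Category and Tag references, and the types of the individual properties are not stated in the talk. The two small helper functions are hypothetical hand-rolled checks, not a real Swagger validator.

```python
# Pet definition. The choice of required fields and the Category/Tag
# references are assumptions made for this sketch.
pet_definition = {
    "type": "object",
    "required": ["name", "photoUrls"],
    "properties": {
        "id": {"type": "integer", "format": "int64"},
        "category": {"$ref": "#/definitions/Category"},
        "name": {"type": "string"},
        "photoUrls": {"type": "array", "items": {"type": "string"}},
        "tags": {"type": "array", "items": {"$ref": "#/definitions/Tag"}},
        "status": {"type": "string"},
    },
}

# Query parameter for the findByStatus endpoint: a required array of
# strings restricted to three enum values, defaulting to "available".
status_parameter = {
    "name": "status",
    "in": "query",
    "required": True,
    "type": "array",
    "items": {
        "type": "string",
        "enum": ["available", "pending", "sold"],
        "default": "available",
    },
}

# A string-to-string map: an object whose additional properties are strings.
string_to_string_map = {
    "type": "object",
    "additionalProperties": {"type": "string"},
}


def invalid_statuses(values, parameter=status_parameter):
    """Return the values that are not in the parameter's enum."""
    allowed = parameter["items"]["enum"]
    return [v for v in values if v not in allowed]


def is_string_map(value):
    """Check that value is a dict whose keys and values are all strings."""
    return isinstance(value, dict) and all(
        isinstance(k, str) and isinstance(v, str) for k, v in value.items()
    )
```

A real validator does far more than this, but the sketch shows why the enum and additionalProperties declarations are enough information to reject bad input mechanically.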
And then the format basically allows you to put a little bit more structure in the type. You've got int32 and int64 for your integers, and you've got things like date and date-time for your strings. If you specify date-time as a format, Swagger will check that you really do have a date-time there. But you can also have custom formats, and this is where it gets really powerful. If you specify a format that Swagger doesn't know about, Swagger will more or less ignore it, but maybe in your framework you've registered a validator for email addresses or IPv6 addresses, and you can check that the strings are of that format. We've done that quite a few times at Yelp just to provide additional validation.
So you've got a Swagger spec for a service. Where are you going to put it? At Yelp, we check them into the service code base. Why do we do this? Because it minimizes the distance between the spec and the code. You've got some other choices. You could put specs in a single central repo if you wanted. We haven't done that, but it could work.
So, modifying Swagger specs. There really is no magic here. Currently, Swagger will not prevent you from doing something really bad. You need to make sure that all your changes are backwards compatible. If you like living safely, then just add new endpoints. Or if you like living dangerously, then go ahead, change some existing endpoints or remove some endpoints. Bad things could happen. Here be dragons. I wouldn't recommend that path.
So, a brief interlude. What's the best thing about UDP jokes? Anyone? I don't care if you get them. What's the best thing about TCP jokes? I get to keep telling them until you get them. No, I'm not actually going to keep doing this.
Okay, so you've got a Swagger spec. What can you do with it? There are several different things you can do with it. You can review the spec.
Or you can aggregate all the specs across your organization into a single location so everyone can browse them. Or you can generate a client library. Or you can perform some server-side validation. Or you can do some testing. I'm going to talk about each of those now.
So, API reviews. Here's an example from Review Board, which we use at Yelp, and here's a Swagger spec that we've just opened a comment on during review. You really can provide a lot of feedback at this stage. Quite often we do reviews for the spec first and then we do the code reviews. This just looks like regular code review, which everyone should hopefully be used to.
So, browsing specifications. At Yelp we use a slightly modified form of the open source Swagger UI tooling. On the right-hand side you can see a bunch of different services, blurred out just because I don't want to show you all the services that we use internally at Yelp. You can select any one of those services and then you get the API for that service in the main section there. This integrates with our service discovery system, so that when you deploy a new service with a Swagger spec it automatically turns up here. Developers don't need to do any additional work if they've got a Swagger spec.
In the Swagger UI, you can browse all of these endpoints and you can see the schema, which is great. But you can also perform real queries, because this ties into dev versions of our services. You can go ahead and put ID 1 for photo IDs and then hit that "Try it out" button, and it will go ahead, perform that request, and give you a response. So this is incredibly powerful. You've got all of these services and you can interactively play with them from a single UI. And PMs love this, right? If devs have been working on a service, they can go ahead and actually try the service.
They don't need to understand curl or jq or know about the complicated URL structure. This just works.
So, a brief aside: how does Swagger UI work? Well, suppose you have a host which hosts the JSON, the JavaScript, and the HTML, and your browser downloads Swagger UI from that host. The same-origin policy basically says that, in your browser, you can only communicate with the host which you downloaded that JavaScript from. That's a problem, because we want to go and hit the Swagger endpoints of all our services, but we're not allowed to do that. So how do we get around this? You can use cross-origin resource sharing, where your services return a header which specifies who can access them. We thought about doing this at Yelp, but it's really quite invasive if you have to modify all your services to return these additional headers. So we got around that by using trusty nginx. nginx just sits there and proxies all the requests, both to get the Swagger UI and to get the Swagger content from the services. It just sits on the network, and your browser thinks it's always communicating with nginx. This is just something to be aware of if you want to deploy Swagger UI.
So we have a Swagger spec for a service. Let's generate a client library. Here's an example of doing that, and you can run this code yourself if you want. You curl this jar to download it, and then you run it with this generate parameter and specify what language you want; I'm doing Python here. Then, with the -i option, you specify a URL where it can find the Swagger spec for the service, and you say to output into the client lib directory. You run that and it will generate a client library for you, just like that.
Now suppose we have this client library. How do we use it? Well, this is Python, but it would work with Java as well, or any other supported language.
So you import the API client, you import the Pet API, which is defined in the Swagger spec, you construct your client, and then you pass it into your Pet API object. You just say get pet by ID with 42, and it goes off, fetches the response for that, and validates it. Again, this is really powerful. We didn't have to write any of the boilerplate behind this. We didn't have to write any validation code. This just works.
But for Python you can go a little further. Python supports dynamic generation of classes at runtime, so at Yelp we wrote Bravado, a dynamic client lib for Python. You don't have to do that two-phase process where you fetch the spec, generate Python code, load the Python code, and run it. You can do it all in one step. This becomes really powerful because you can interactively discover and browse your endpoints. You see here that in just, I guess, three lines you can fetch the spec and then invoke an endpoint from it, just like that. You get back basically the same response we saw before, but now it's an object, so presumably there's a little bit of typing here as well.
Something else is validation in your service. Here's a project for the Pyramid web app framework for Python; again, there are similar things for Java or whatever language you're using. This is Pyramid Swagger, which we wrote at Yelp. It supports several different Swagger versions, and it does validation both coming into your service and going out of your service. It also has a few facilities for interacting with Swagger UI. So let's just walk through an example. What do you do here?
Well, you add your route as you normally do in these sorts of frameworks, and what Pyramid Swagger does is check that your URL is really there in the Swagger spec. Then, when you're writing your handler code, at the bottom of the slide, you have access to this swagger_data dictionary on your request, and that contains all of the validated data that has come in.
Pyramid Swagger supports those custom formats that I was talking about. Suppose for some reason you wanted your pet IDs to be base64 encoded on the wire. Swagger doesn't natively support this, but what you can do is register this validator, and then whenever Pyramid Swagger sees a base64 thing on the wire it can automatically decode it, and it can encode it going back out. Here we also see there's validation, which just tries doing a base64 decode.
So let's see an example of Pyramid Swagger catching an error. Here we're hitting that endpoint that we saw before, but now we're using the string "42" instead of an integer. Before this even gets into your service, Pyramid Swagger realizes that this string is not an integer and throws an error for you. It says that '42' is not of type integer, and it says what it should be. So that's very powerful. We didn't have to write any of this code. Pyramid Swagger just did it for us.
Or maybe in your Pet Store code you've got this getPet function. You get a pet ID off the wire and then you're going to return an object. This isn't what you'd actually do; it's just a minimal example. But maybe for the ID we return, we make a mistake and put a string there instead of an integer. It's a very easy mistake to make, and you wouldn't necessarily catch it immediately through testing or code review, because you'd need to exercise exactly the right things. But what does Pyramid Swagger do?
Well, when you try to return it, it says this ID really should be an integer and there's been an error, so it catches it on the way out. If you just made a few example requests to your service during testing, you could pretty quickly check that you're doing the right thing, because if you're returning things of the wrong type then you're going to get an error like this.
So let's talk about testing. Here's the situation if you're not using Swagger. Assuming you are using a client library, you've got two interfaces. You've got your client to client lib interface, which is presumably within a single address space on a single machine, and then you've got your client library to service interface, which is presumably going to cross the wire. In each of these interfaces there could be inconsistencies, and we've seen some examples of inconsistencies before.
What does the situation with Swagger look like? Well, you've got your client to client lib interface as before, but you can also imagine your Swagger spec living on the wire, almost, just checking things as they pass to and from the service. It doesn't actually live there, but it's a useful mental model. Our generated client lib is automatically consistent with the Swagger spec, almost by construction, provided there are no bugs in client lib generation and provided you don't go changing the Swagger spec, but we'll just not think about that for now. And then you've got your Swagger spec to service interface.
So let's look at the client to client lib interface. This is a fairly standard testing problem. You're using a third-party library and you want to check that you are conforming to its interface. So what do you do? Well, you can mock out some return types, perhaps, and your type checker can probably help here if you're living in a typed language.
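The mocking approach for the client to client lib interface can be sketched with Python's standard unittest.mock module. The pet_summary function, the get_pet_by_id method name, and the dictionary-shaped return value are all assumptions for illustration; a real generated client may return model objects instead of dictionaries.

```python
from unittest import mock


def pet_summary(pet_api, pet_id):
    """Code under test: builds a one-line summary using a client lib object."""
    pet = pet_api.get_pet_by_id(pet_id)
    return "%s (#%d) is %s" % (pet["name"], pet["id"], pet["status"])


# Stand-in for the client library; no request goes over the wire.
fake_pet_api = mock.Mock()
fake_pet_api.get_pet_by_id.return_value = {
    "id": 42,
    "name": "doggie",
    "status": "available",
}

summary = pet_summary(fake_pet_api, 42)

# Confirm the code under test called the client lib the way we expect.
fake_pet_api.get_pet_by_id.assert_called_once_with(42)
```

The weakness of this style is that the mocked return value is written by hand, so nothing checks that it matches the Swagger spec.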
So what would be really nice would be if Swagger supported running in a testing mode, so you could say to your client lib: hey, I don't actually want you to make requests on the wire, but when a request comes in for this endpoint, please return this data. It doesn't currently support that, but hopefully that will happen sometime soon.
And then we have the Swagger spec to service interface. As part of your testing, given you've got a Swagger spec, you could use an external validator just to check that your requests and your responses are consistent with the spec. Or if you're using something like Pyramid Swagger, well, that's already built in, so just fire some requests at your service and it'll tell you if you're violating the spec.
What else is there? There's SwaggerHub, which is a pretty new online service by SmartBear. Here you can enter your Swagger specs and straight away see the documentation for them, and on the right-hand side you can also generate client and server implementations and just download them. So you don't really have to download any of that tooling; it'll do that stuff for you. You can also share your specs with other collaborators. This is pretty exciting because it's kind of like GitHub for Swagger, and I can see this becoming a very easy way of collaborating on APIs.
Swagger isn't the only way of specifying your HTTP JSON interfaces. There are a few other languages for doing this. One is API Blueprint by Apiary, and the nice thing about this is that it's designed to be more human readable, and behind the scenes it gets compiled into something a little bit similar to Swagger. Here's just an example of a Pet Store-like interface. I haven't used this, but it is an option out there. Another option is I/O Docs by Mashery. This looks very similar to Swagger, I think, and you can see it's just specifying an endpoint, a GET endpoint. So, conclusions.
So Swagger provides a really easy way to define JSON HTTP interfaces, both for new services and for retrofitting onto existing services. Once you have an interface, you get a whole lot of tooling for free. The cost of entry is that you write Swagger specs for your services, but once you're in, there's a lot of value. You can automatically generate client libraries for many different languages, and you can automatically perform validation of requests and responses. So the unicorn is happy now, no more tears, we fixed the unicorn's world. Are there any questions?
Right, so the comment is that the examples I gave you had Swagger specs separate from the code. Can you do Swagger specs inline with code? And I think you've seen an example of it done like that in PHP, where it's actually with the code. Yes, you can do that. At Yelp we found it quite valuable to have the specs separate from the code, because that way you can review specs separately from any implementation, but it depends on what you want. I can also imagine cases where it's a lot easier to inline the spec with your code, because I know that for some of the Java frameworks you can put Java annotations on your methods saying what types they take and what types they return, and it will automatically generate a Swagger spec from those. So yes, that is an option.
So the question is, how do you use Swagger with JSON API, where it envelopes the JSON response? I'm not very familiar with JSON API, but presumably if Swagger can express the JSON API types then that would work. I imagine JSON API uses pretty standard objects and types like strings, booleans, and integers, so I imagine you could code that up as a Swagger spec, but again I don't know all the details there.
Oh, so the question is, is there a way of specifying a base template that you can inherit from in your Swagger specs? There is support for inheritance, but I'm not quite sure how that would work with JSON API.
So yeah, not sure, that's the answer. Any other questions?
Okay, so the question is, how does the workflow work? Does a PM define the spec and then hand it off to a programmer? So yeah, PMs are involved with the details, but we normally have the programmers take the lead on defining the specs, and the PM can work with the programmers. It's not like waterfall, where the PM does this, hands it over to the programmer, and then the programmer does the implementation. It's much more collaborative.
So the question is, are there any other tools that we use for handing off specs from PMs to implementers? I know that we still have some written spec documents for some of our older APIs, so those still exist, and I think there might be a little bit of automated tooling around those. So yeah, spec docs are still used by a few PMs. If I remember correctly, the teams who are still using spec docs do not use Swagger, so they really are just going from the spec doc to the implementation. Any other questions?
So the question is, did we encounter any disadvantages to using Swagger? Before Swagger 2.0, there were some idioms that we could not encode for existing services. Those map types, I don't think we could do those, which prevented some adoption. Other issues, let me think. The validation can be a little slow. It's not really slow, but it can be noticeable, especially if you're validating large data structures. I think one or two teams have noticed these problems, especially in something like Python, which doesn't tend to be quite as fast as Java. I guess the other problem, if you can call it a problem, is that Swagger forces you to write these things up front, and for some teams that can be a bit of a challenge because it keeps you honest. I would say that's a positive, but it's an additional step on the road.
So the question is, are there any problems that we've had?
Right, yes, so are there any problems retrofitting Swagger specs to horribly designed, maybe legacy, applications? There was the map problem that I spoke about before; that problem is now fixed. I think if you use some odd patterns, like an endpoint where if it gets an even number it returns a string and if it gets an odd number it returns an integer, that might be quite hard to type in Swagger. You might say you shouldn't be doing that, but it is what it is.
I believe so, yes. Those are the cases I have come across during retrofitting of Swagger specs. There may be other edge cases, but I haven't seen them.
So, paging for results. You would pass in an offset parameter. If you want to do paging, then you can just pass, as part of your request, probably a query parameter that says here's the offset and here's the number of results for that offset. It's not going to automatically make sure you're doing your paging correctly, but it will check that you're passing integers for your offsets and your window sizes.
That's correct. Out of the box it can do most things that I have seen. Yes.
Right, so the question is, am I familiar with Apiary and how does it compare with Swagger? I have only given it a brief glance, actually, so I can't really give you a full comparison. Certainly what I like about Swagger is that it seems to have a very large, vibrant open source community and the tooling seems to be very good, and that was why I went with Swagger. But I can't really give you a comparison.
Okay, so the question is, have I had any experience using Swagger with client-side stuff? The answer is no. Right now we're using Swagger for not all, but the majority, of our backend services, so I can't really speak to how it would work with Angular or something like that. But presumably it's still a pretty similar game where you're just typing things and checking and validating, so I wouldn't imagine it's hugely different.
Do we integrate HMAC? Sort of authentication type things. Do we integrate HMAC? I am going to punt on security questions. Okay, anyone else? Okay, let's call it a day. Thank you very much.