All right, I guess we'll get started. Thanks, everyone, for coming to The Secret to Brewing Up a Good API. This is going to be an overview of API design etiquette: things you should do, things you should keep in the back of your mind, and some more informational pieces on top of that. A little bit about myself: I'm currently the technical lead for cloud native development at Vultr. That covers API v2, load balancers, our managed Kubernetes offering, and all of Vultr's open source.

So what makes a good API? There are a few things, I would say. The first is documentation, then URI design, consistent types, pagination, informative errors, and then authentication and rate limiting, which I've packaged up together since they go hand in hand.

We'll get started with documentation. I would argue that the most important part of any API is its documentation. It is the source of truth; it is the binding contract between you and the developer. It should be informative, clear, and concise. It should cover everything the developer will need to integrate with your application. It should document every response, good or bad, and it should outline the behavior of query parameters, authentication, headers, anything and everything. Everything should be defined in the documentation so that there are no surprises during implementation.

Now, this example may not look as clear and concise as you would want, but if we go over it, you'll see it's pretty clear what it does.
We have a header, "Create coffee bean"; a description that tells you what this API call does; the specific endpoint, which is /coffee-beans; the method; and the request body with all the fields defined, whether they're required or optional, and their types. Then we also have an example of a request and a response. This is a good starting spot for an API. If we gave this to anyone integrating with it, they could get a clear understanding: this is what my request should look like, these are the required fields if I want them, and here's an example response. We could enhance this further, of course, by adding whether there's authentication and listing other responses such as errors, but overall this gives you a pretty good idea of what this specific endpoint provides.

Now, there are ways to make this documentation better, because this right here is pretty rough to look at. There are a lot of tools around documentation, specifically the OpenAPI Specification, or OAS. It's a Linux Foundation collaborative project, and what it does is this:
It lets you write your API spec in a specific way so that it's portable. There are a lot of tools around this, such as Swagger and Stoplight, and there are a lot of benefits to using them. If you define your API against the OpenAPI Specification, you can then feed it to tools such as Swagger UI or ReDoc, and they will generate HTML for you, so you can clearly and cleanly produce nicely templated API documentation. The other benefit of following this documentation spec is that you can take the specification and, in whatever language you're implementing in, there are tools that will parse the API spec, coordinate it with what your application is doing, and validate requests and responses on the fly. So in this example, where you have your defined fields and whether they're required or not, when you write them into the OpenAPI spec and use these tools to validate your requests and responses against it, mismatches get handled gracefully for you.

So I would say documentation is definitely the most important part of your API, and I would always recommend starting with your documentation, as it is the source of truth. If you start with your documentation, it gives you a lot of leeway in designing and thinking about other portions of your API, such as errors, whether you want pagination, and so on.

The next portion of this is URI design. There are specific patterns you can follow, but the best rule is that your URIs, which are your endpoints, should model a resource.
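To make that spec-driven validation concrete, here's a minimal hand-rolled sketch of what such middleware does with the field definitions. The field names come from the coffee bean example; which fields are required is an assumption, and this is not any particular library's API:

```python
# A hand-rolled sketch of spec-driven request validation. Field names come
# from the coffee bean example; required flags here are assumptions.
SPEC = {
    "type":     {"type": str,  "required": True},
    "region":   {"type": str,  "required": True},
    "roast":    {"type": str,  "required": True},
    "limited":  {"type": bool, "required": False},
    "quantity": {"type": int,  "required": False},
}

def validate(body):
    """Return a list of validation errors; an empty list means valid."""
    errors = []
    for field, rules in SPEC.items():
        if field not in body:
            if rules["required"]:
                errors.append(f"missing required field: {field}")
            continue
        if not isinstance(body[field], rules["type"]):
            errors.append(f"wrong type for field: {field}")
    # Reject fields the spec doesn't define, so there are no surprises.
    errors.extend(f"unknown field: {f}" for f in body if f not in SPEC)
    return errors
```

In practice an OpenAPI-aware tool generates this kind of check from your spec, so the documentation and the running validation can never drift apart.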
If we look back at the documentation example, you'll see we're modeling a coffee bean: you have id, type, region, roast, limited, quantity. Your URIs should always model a resource, not an action. They shouldn't contain verbs, and they should be plural unless they're singleton resources, which I'll cover in a bit. Also, use dashes instead of underscores. These are the conventions the JSON:API spec follows for API design.

So these are designs you should avoid: /create-coffee-bean, /get-coffee-beans. There's no need for those verbs. This URI style may take some getting used to if you've never worked with APIs this way, but the HTTP verb followed by the resource makes it really clear what you're doing with that resource. POST on a resource, in this case /coffee-beans, is always going to be a create; GET on /coffee-beans is going to return a list of coffee beans; and so on. When you're designing your URIs, you always want to model a resource. That way the resource and the HTTP verb give you a clear-cut model of what your URI does.

Now, I mentioned that singleton resources usually aren't plural. This is because they're usually get-or-update only.
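Laid out as a route table, the idea looks something like this. The brew-config paths are illustrative, not from the slides:

```python
# A hypothetical route table for the coffee bean API. The HTTP verb carries
# the action; the URI just names a resource (plural, no verbs, dashes rather
# than underscores).
ROUTES = {
    ("POST",   "/coffee-beans"):      "create a coffee bean",
    ("GET",    "/coffee-beans"):      "list coffee beans",
    ("GET",    "/coffee-beans/{id}"): "get one coffee bean",
    ("PUT",    "/coffee-beans/{id}"): "update a coffee bean",
    ("DELETE", "/coffee-beans/{id}"): "delete a coffee bean",
    # Singleton sub-resource: singular, usually only get/update.
    ("GET",    "/coffee-beans/{id}/brew-config"): "get the brew config",
    ("PATCH",  "/coffee-beans/{id}/brew-config"): "update the brew config",
}

# Verb-in-the-URI styles to avoid -- the method already says what happens:
AVOID = ["/create-coffee-bean", "/get-coffee-beans"]
```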
So in this case, if we look at /coffee-beans followed by {id}, which is the coffee bean ID you get back from your list or get calls, you can get a brew config, or you can update a brew configuration.

That leads into consistent typing. It may seem innocent at first to have some of your fields mismatched, say, the quantity on your request for creating a coffee bean is a string while you return an int. This mostly won't affect you as the API designer, and if you're working in a loosely typed language as the integrator, you won't really feel the headaches of it either. But if you're working in a typed language, that's when things get difficult, because when you're trying to model these types, for example the coffee bean, you can pre-define an object or a class for the request, but then the response comes back with different fields or types, and it takes extra effort as the developer to integrate. There's no predictability.

Going back to the example: with our request and our response, you'll see the types match up. In my request to create a coffee bean, the types are strings, the limited field is a boolean, and the quantity is an int. In my response,
it's the same thing, with the addition of the id field, but that's more of a computed field you get back on the response. So mismatched fields may seem innocent while you're implementing, but they can have greater consequences down the line and cause a lot of headaches for the person integrating with your API.

The same thing can be said for your top-level nodes in your responses. When you get a coffee bean, you get a coffee_bean node followed by the fields; if you're getting a list of coffee beans, it's coffee_beans followed by an array. Whatever the resource is, that's usually what the top-level node should be. So if we were getting the brew configuration, it would return a brew_config node followed by whatever fields are defined there. The biggest thing with consistent typing is making sure your fields, in requests and responses alike, have the same types, and that your top-level nodes model whatever resource you're calling. You will see a lot of APIs just use data as the top-level node, which is also valid. It's maybe preference, but I usually prefer having the resource defined there; that way, if I get a response of coffee_bean, I can tell what call was made.

The next thing is pagination. When you're starting out, whether it's an internal API or a public-facing API, you're usually not going to have a lot of data, so pagination isn't something that comes to mind. But as your API grows and your data set grows, pagination becomes beneficial: it can remove a lot of load from the API, improve response times, alleviate heavy loads, and overall it's just very good practice.
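Going back to the typing point for a second: in a typed language, those matching request and response shapes might be modeled like this. The dataclasses and the envelope helper are a hypothetical sketch, not anything from the slides:

```python
from dataclasses import asdict, dataclass

# Hypothetical typed models for the coffee bean example: the response uses
# the same field names and types as the request, plus the computed id.
@dataclass
class CoffeeBeanRequest:
    type: str
    region: str
    roast: str
    limited: bool
    quantity: int

@dataclass
class CoffeeBean(CoffeeBeanRequest):
    id: int  # computed server-side; the only field the request lacks

def envelope(resource, obj):
    # Top-level node named after the resource, e.g. {"coffee_bean": {...}}
    return {resource: asdict(obj)}
```

Because the response type inherits the request's fields unchanged, a type mismatch between the two becomes impossible by construction.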
So pagination gives you a lot in terms of performance and behavior. This is an example of what you could expect in a pagination request. There are multiple ways to do pagination, and I'll cover them in the next few slides, but here you'll see we're defining a query parameter, per_page=1. What this says is: on this list of coffee beans, only return one coffee bean. What you get back includes an additional meta node, which tells you exactly how many coffee beans there are in the list and also gives you a next and a previous ID. So if you wanted to iterate through this list of coffee beans, you would request the list with per_page=1 and also pass the next field with that value, and that would get you the next coffee bean in the list. This is the cursor method, and now we'll go over the three main types of pagination you'll usually see in an API.

The first one is pages. This is the simplest, most common way you'll see. All you're really doing on your back end is breaking your data set up into pages: page one, page two, page three. All that's really mapping to is a limit on your database, where you're saying, get me everything between 1 and 20, then 21 and 40. There are pros and cons to this, but it's more or less the easiest way to get started with API pagination.

The next one is a key set. This one is different from pages: there's a field in your data set that acts as a delimiter, in this case it's called since_id. When you're creating all of these coffee beans in your back end, each one of them gets a specific ID. With your pagination you're saying: return all products from ID 1 to ID 20, then ID 21 to 40. So
you define a specific delimiter in your list data, and it returns data accordingly.

The next one is the one we saw earlier, and usually the one I like using, but it's a bit more complex to do: the cursor method. It acts as a pointer within your data set. Like in the example I showed, you have your data set and you get back a pointer into the list, and that cursor gets transformed in your back end into a specific SQL clause that gets appended to all of your calls. The pro here is that you usually don't lose any data if new data gets added to your data set. With pages or a key set, there's a chance you won't get that data returned, depending on where it falls in the data set. With the cursor, the data is usually ascending or descending by a specific field, usually a date, which gives you the flexibility to add new data to your set without losing your spot, since the pointer is always updated. If you have 50 items in your list and then add an additional hundred, the cursors aren't a set position in your list, while something like a key set will be, because it's based off your ID. That's pretty much it for cursors.

Moving on: informative errors. As with consistent types and documentation, this is one that usually gets neglected. Any error your API can produce, you should try to document and be transparent about, because errors are valid responses of your API. They should be informative about the issue, and you should treat them like regular responses: returned with a top-level node, usually errors or whatever you want to define there. Make sure you use proper HTTP codes, and another thing you may want to add is mappable error codes, which we'll get into.

Here's a good example: you have your error, you have your status code, you have a message, you have
an error code, and then you have a resource. With this response, you can get an idea of what happened and where it happened, and you can integrate with it in a way that gives your application the flexibility to react accordingly, either retrying or handling the failure in a specific way.

Now, you might be wondering: why have a message of "unable to authorize" with a status code of 401, and then also have an error code like ERR_AUTH? The benefit of the error code is that your message could change, intentionally or unintentionally; it's not really a constant. So if you're integrating with an API, specifically looking for failures, and you're keying off the message, well, it's a pretty large string, and it could get longer, so it's not a good identifier for what happened. An error code, on the other hand, is usually consistent. It's usually defined in your documentation, where there's usually a list of what the codes might be: for example, an ERR_AUTH that comes with a 401 and a message like "user-provided auth failed" in the error body. These give you an easier, clearer way to integrate with errors.

One good example of an API that does this is Twitter. The Twitter API defines all of these informative errors and error codes, and it makes implementation and integration a lot easier. Like I said, if there are any issues, you can key off these because they're constants, while the message may change on you.

Going back to the documentation aspect: in the documentation example I showed, we didn't define all the error codes, but you will want to define them. If your documentation is transparent and carries as much information as possible, it removes headaches for the developer integrating with your application. And to note, this applies whether it's an internal API
or a public API, though more so for a public API: people you don't know are going to be integrating with it, and the more information there is, the easier it will be for them to get started. With an internal API, you know the team that's integrating with it, so you can get a little lazy, but I would urge you to do that extra work, because the documentation is a binding contract between you and that team too. It also makes it a lot easier for them to integrate from your documentation alone and go from there without asking you a thousand questions.

The next, and final, portion of this is authentication and rate limiting. I bundled these two together because they usually go hand in hand, depending on what type of strategy you go with. The first thing: authentication is not the same as authorization. Authentication provides a way for you to validate a person's identity, while authorization then checks whether that account actually has access to, say, create coffee beans or list coffee beans. They go hand in hand here. On an API you'll usually see read, write, or read-and-write on resources, and that's where authorization comes into play.

Now, in terms of authentication, the more common one that I'm sure everyone has seen is an API key; that's usually an OAuth type of setup or a unique key. A lesser-known one that you don't really see, except maybe for an internal application, is username and password. There's a lot to consider when you're going down the authentication route. If you're going with something like OAuth, there's a lot to it.
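Whichever mechanism you pick, the authentication-versus-authorization split looks the same in code. Here's a minimal sketch using unique keys with per-key ACLs; the keys and the in-memory store are made up:

```python
# Hypothetical key store: in a real service the keys would be generated,
# distributed per user, and kept in a database, not a dict.
API_KEYS = {
    "key-abc123": {"coffee-beans": {"read", "write"}},
    "key-def456": {"coffee-beans": {"read"}},
}

def authenticate(api_key):
    """Authentication: is this an identity we recognize at all?"""
    return API_KEYS.get(api_key)

def authorize(acls, resource, action):
    """Authorization: may this identity perform this action on this resource?"""
    return acls is not None and action in acls.get(resource, set())
```

The two checks stay separate on purpose: a request can pass authentication (a valid key) and still fail authorization (no write access on that resource).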
There are a lot of moving pieces in OAuth, but you do get a lot out of it. A unique key is more or less a unique API key that you generate for your API and then distribute. Both the OAuth setup and unique keys usually have ACLs corresponding to those specific keys and their allowed behaviors, so you define your ACLs and go from there. With basic auth, think Apache or nginx: a username and password that you set as headers.

That brings us to rate limiting. This is extremely important for both public and internal APIs. Again, you can maybe get by without it, but it's better to have it than to hate yourself later when you don't. Rate limiting allows you to define how many requests can be sent within a specific time frame, for all users or for a specific user if you're doing something like a unique key per user. You'll usually have a corresponding rate limiter that limits all calls coming in; think nginx blocking requests when a specific IP or a specific key hits too many times. In an OAuth flow, you'll usually see a very large limit, something like 5,000 requests per hour. Rate limiting helps you prevent denial of service and malicious or accidental abuse of the API, and it helps preserve quality of service and uptime. It's the same idea as pagination: there you're limiting how much data you send back; here you're limiting how much can be requested at a given time.
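The per-key limiting just described can be sketched as a fixed-window counter; this is one common strategy among several (the limit and window values are arbitrary):

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """Allow at most `limit` requests per key within each `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.counts = defaultdict(int)  # (key, window index) -> request count

    def allow(self, key, now=None):
        now = time.time() if now is None else now
        bucket = (key, int(now // self.window))
        if self.counts[bucket] >= self.limit:
            return False  # over the limit for this window: reject (e.g. 429)
        self.counts[bucket] += 1
        return True
```

A production limiter would also expire old buckets and share state across servers, but the shape is the same whether it lives in nginx config or application code.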
And like I said, there are multiple ways to do this: on the web server, where you define a rate limit in nginx or Apache per IP or per key, or software-based implementations paired with OAuth or a unique key.

Here are some useful links. The pagination information I pulled from that Medium post; what I covered were just the high-level aspects, and it's a very good post if you want to learn more, so I'd recommend checking it out. There's the JSON:API spec at jsonapi.org if you want more granular detail about how your API's JSON should look. I mentioned a lot of things that I take heavily from the JSON:API spec: your top-level nodes, how your errors should be modeled. There are a lot of different ways to do all of this, and more or less all APIs follow some of them, but I'd recommend checking out jsonapi.org to get your own feel for them. Same thing with the OpenAPI spec and the OpenAPI Initiative; both of those will really help you create stellar documentation that is portable and reusable, and that gives integrators a lot of flexibility in their implementation.

So the key takeaways for good API design: if there's anything you take away from this, I would say it's documentation. Documentation is the source of truth; I can't stress that enough. If you start with your API documentation first, keep it in the back of your mind, always update it, and use it as your initial design document, it will lead you to a lot of edge cases you may not find through plain old implementation.

And that's pretty much it. I don't know if anyone has questions about anything here, anything we talked about, or anything you'd like me to dig into a bit more. Yeah?
So the question was: in terms of pagination, you can run into a case where data is being inserted by one API call while you're pulling data out, right? You usually see that with the page-based strategy; the strategy you go with will define whether or not you run into it. That's why I like the cursor method. It depends on how you implement it, but in my experience you set up your list so that the newest data you just entered lands at the bottom; that way, when you're iterating over it, you never run into that race condition. With the other ones, yes, you will run into it. So it's a matter of which one you choose, because there are pros and cons, but that is quite a big con, and it's why I would usually recommend the cursor method. It's definitely something to consider when you're comparing them, because you can definitely hit race conditions there.

Any other questions? So the question is: you want to protect your data in between calls, so if you make a request, the back end will seal that data, send it back to you, and only you can do anything with it, right? There are strategies for that. One that I've used is an HMAC method. There are a lot of moving pieces to it, and it is a bit more complex to implement, but it takes a lot of things into consideration.
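A rough sketch of that kind of HMAC request signing, using the standard library. The message layout, the fields folded in, and the five-minute window are illustrative choices, and note this authenticates requests rather than encrypting them:

```python
import hashlib
import hmac
import time

SECRET = b"shared-signing-key"  # hypothetical shared secret, known to both sides
MAX_SKEW = 300                  # five-minute validity window

def sign(method, path, body, timestamp):
    # Fold the request details and the timestamp into the MAC, so the same
    # body signed at a different time produces a different signature.
    message = f"{method}\n{path}\n{timestamp}\n{body}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()

def verify(method, path, body, timestamp, signature, now=None):
    now = int(time.time()) if now is None else now
    if abs(now - timestamp) > MAX_SKEW:
        return False  # expired: rejected even though the key is valid
    expected = sign(method, path, body, timestamp)
    return hmac.compare_digest(expected, signature)  # constant-time compare
```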
So: you know how to generate a key and I know how to generate a key, so we both know how to derive our specific signing and verification key. When generating the HMAC, there are a couple of things we look at: we look at the headers, we look at the time, we take a lot of the information from the request and use that to sign the data. That way, once you get the data back, there's only a window of, say, five minutes in which you can do anything with it; once that time expires, even though I have my signing keys, I can't even verify it anymore. So HMAC is one method. There are also ways to implement JWTs with an HMAC, and you'll see JWTs thrown around a lot now; they're popular, but they don't offer any encryption on their own, the payload is just Base64 encoding. So if you want stronger guarantees than that, I would recommend looking into HMAC signing for your APIs. Yep?

Any other questions? I would say if you're looking at public APIs, it's usually OAuth, an OAuth 2 style of implementation. In an OAuth flow there's a lot going on: you have a token, you have to refresh it every so often, and these public APIs usually handle that refreshing for you. Whenever you see an API using this flow, you'll notice a pattern: it usually has read/write type values out of the box, and you can also set an expiration date on the tokens. For example, the GitHub API gives you a lot of this flexibility, where per resource you can define your ACLs, whether you want read or write, and you can also define that after X time this token is invalid: after an hour, after a day, or never. So I would say OAuth authentication is the most common, and probably the one you'd want to go with. You will see others, such as custom solutions with unique tokens, which are just API keys, but that's more of a
homegrown solution, where you still have to implement your own ACLs per resource. So it's a matter of: do you want to go with OAuth, which is a standard, or do you want to go with something homegrown that offers roughly the same thing?

Yeah, so the question is: do all pagination methods give you the same performance across the board? I don't know a hundred percent. I would say there might be extra compute in something like the cursor method, because you're doing extra steps, while the key set and pages methods are just limits or specific delimiters on your SQL calls. The cursor method adds a bit more: you have to take that cursor, translate it to whatever the SQL is, and append it. But at the core they're all doing the same thing; you're just manipulating how the data in SQL gets computed. If there is a difference, I would say it's minimal. The one thing to consider is that with the cursor method you are returning a whole extra meta node, but that should be minimal in terms of the overall compute. So I don't know the answer off the top of my head, but I can't imagine there would be a drastic difference between them. Hopefully that helps.

Any other questions before we wrap up? So the question is: how would you handle the back end when there's a set of three calls you need to make to get the full data set, right? It's a good question; I'm kind of stumped on that one. Usually when you're creating, for example, a coffee bean,
you're not going to have the full data set immediately. For most resources, most of the time you might, but there are a lot of instances where you won't, because, say, the ID hasn't been generated yet. Or say you're deploying servers: when you deploy a server, you're not going to have the full data set immediately, because an IP hasn't been issued and the MAC hasn't been established. So you get a partial response back, but eventually the server finishes that process, and then if you make a get or a list, you'll have the full data set; on that initial call you won't.

To your point about needing follow-up calls to complete the full data set: that really falls onto how you implement and design your back end. The API is more or less just the doorway to your back-end structure. If your data relies on those three calls, it comes down to how you handle that, whether you put the partial data in a temporary data store or something else; there are a lot of ways to do it. In your specific case, if there are three API calls and I never make the third, the data would just be stagnant, right? The way you'd have to do it is piecemeal: the data has to exist in the database or some data store, and if you don't make that third call, you just won't have the full data set; there's not much to do about that. You'd have to really consider how you're designing it. The way I would probably go about it is to have a temporary data store, something like Redis, where incomplete data sits, and you always make calls against that until it's complete, and then you move it into a more permanent data store. That's one way to go about it, right?
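That temporary-store idea can be sketched like this: partial resources accumulate in a pending store (a stand-in for something like Redis here) and get promoted to the permanent store once all the expected calls have happened. The step names and fields are made up for illustration:

```python
# Sketch of the "incomplete data lives in a temporary store" approach.
# Step names and fields are hypothetical.
REQUIRED_STEPS = {"create", "attach-ip", "assign-mac"}
pending, permanent = {}, {}

def record_step(resource_id, step, fields):
    entry = pending.setdefault(resource_id, {"steps": set(), "data": {}})
    entry["steps"].add(step)
    entry["data"].update(fields)
    if entry["steps"] == REQUIRED_STEPS:  # the final call completes the resource
        permanent[resource_id] = pending.pop(resource_id)["data"]

def get(resource_id):
    if resource_id in permanent:
        return permanent[resource_id]
    if resource_id in pending:  # still waiting on a follow-up call
        return {**pending[resource_id]["data"], "status": "incomplete"}
    return None
```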
But that falls more onto how you do the back end. In terms of the API, you could leave the API as is but always run a validation check: does this data exist in my temporary store? If it does, it's incomplete; if it doesn't, you go on with whatever that workflow is. So there are a lot of moving pieces there, but I would say that's more of a technical design question about your back-end structure than anything.

You're welcome. Anyone else have any other questions? All right, I guess if no one has any other questions: the slides are online, I did post them, and if you have any questions feel free to ask me in person, ping me on Twitter, or reach out on LinkedIn. I'll be glad to kick these around a bit more if there's anything more technical or in-depth you'd like to dig into. Other than that, thank you all for coming, and thanks for the questions.