Hi everyone. I came here today from a place far, far away. It's called Geylang, and I literally had to travel through half of the country, 13 minutes by taxi because of the traffic, to tell you that your API is too slow, and to give you this one weird trick that will change everything and make your API super fast. Careful, here is the trick: your API is already fast enough. Now, this might sound like a joke, but I really mean it. I feel that as developers we are often so obsessed with the speed of our applications that we miss the fact that out there in the world, our users do not care about it nearly as much. I mean, there are people who open a Bloomberg or Forbes page that weighs 20 megabytes and takes a minute or more to load, and they still read the articles there, right? So if your API renders a result within 300 milliseconds, come on, it's not even a second, and people wait minutes for websites to load. So this is my real message to you: your API is probably already fast enough. But who would accept a presentation with such a title? You know, I needed to work on the marketing. My name is Grzegorz Witek. I work for a company called Kaligo, where we're building a hotel booking platform. I'm responsible for building the API that powers some websites and a mobile application. Before that, I spent quite some time working for a company called Fiverr, where we also had a web API built in Ruby, and we scaled it up to around half a million requests per minute. So when I think about speeding up APIs, I always ask myself whether speed is a feature, and if not, whether lack of speed is a bug. You might think that one excludes the other, but in my opinion the truth is somewhere in the middle. Depending on your product and on how you promote your application, speed might be a feature, and if your application is extremely slow, that might be considered a bug.
If you work for a client who delivers functional and non-functional requirements, in the latter you will often see that the application must respond within a certain amount of time. Then, of course, if your application is slower, it's a bug, right? It doesn't fulfil the requirement you got from your client. Let's consider two different products here. The first is GitHub. I use the GitHub API on a daily basis, and it's fast; but even if it were way slower, four or five times slower, I would still be using GitHub, because speed is not what made me choose that product over the competitors. GitHub doesn't say, hey, choose us over other applications because we are fast. No, it's their features, the stuff they deliver; different qualities of their product made me choose them over other applications. On the other hand, we've got Algolia, which is search as a service. They have an API that takes your data and returns results as you type. We use it, for example, for destination autocomplete: as soon as I type "n", my website shows me New York, New Delhi and other cities. And Algolia promotes itself as a product that delivers results instantly. So if their API were twice as slow, I would probably choose another application that offers the same functionality but is faster. So whenever you think about the speed of your API or your web application, you need to ask: what is the strength of my product? Am I more like Algolia? Do I promote the speed of my application? If yes, then you should definitely focus on delivering results as quickly as possible. But if your application has other strengths, and speed is not something you want to promote to your users, then maybe you should focus on a bunch of other things.
Now let's assume that you really do have a problem with speed: your users complain, your boss hates you, nobody uses your product because it's so slow. If you have a web application, there are many, many topics to think about, many layers of optimisation, and very often these are totally independent layers. This is such a broad topic that today, instead of talking about everything, I will focus only on optimising the API. So we will not be talking about JavaScript, static content, and so on, just the parts related to rendering JSON or XML on your web server. Whenever a user makes a request to your application, that request goes a long, long way. Okay, maybe not always: if you reach a website hosted here in Singapore, that way is just a few kilometres, right? But if you, for example, ping GitHub, which has servers in the United States, it's a couple of thousand kilometres that the request must travel there and back. If our requests and packets could travel at the full speed of light, a request could go around the Earth seven or eight times within one second. But of course we cannot do that: first, because we do not live in a vacuum, so the speed of the internet is considerably slower than the speed of light, and second, because we can't hit a server on the other side of the world directly; we have to go through many, many devices that slow the request down before it reaches the server. So whenever you think about speeding up your application, your web API, you need to think about where your users are. If your server and your database are in Singapore and your users are nearby in Jakarta, the latency will be around 10 to 20 milliseconds. But if you go further, to Vietnam, it goes up to 100.
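The back-of-the-envelope physics above can be checked with a few lines of Ruby (all figures are rough approximations):

```ruby
# Rough numbers only. Light in a vacuum does about 300,000 km/s and the
# Earth's circumference is about 40,075 km:
SPEED_OF_LIGHT_KM_S    = 299_792
EARTH_CIRCUMFERENCE_KM = 40_075

laps_per_second = SPEED_OF_LIGHT_KM_S.to_f / EARTH_CIRCUMFERENCE_KM
puts laps_per_second.round(1)  # ~7.5 trips around the Earth per second

# Singapore to New York is roughly 15,300 km one way, so even at light
# speed the round trip takes ~102 ms -- the observed ~260 ms ping is the
# extra cost of fibre, routers and indirect routes:
ideal_rtt_ms = 2 * 15_300.0 / SPEED_OF_LIGHT_KM_S * 1000
puts ideal_rtt_ms.round
```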
New York: around 260 milliseconds just for a ping, just for one small, tiny packet to hit a server in New York and come back. So even if your application renders results within 10 milliseconds, the user will have to wait far, far longer before seeing them. And if you have customers in, for example, Shanghai in China, it gets even worse, because of their internet infrastructure and the Great Firewall that slows down all traffic leaving and entering the country. So the first thing you should do, if you have users outside your country, users far away, is to put your application behind a CDN. Initially CDNs were designed, and are still mostly thought of, as a solution for serving static content, but that's not the only case where they're useful. A CDN is not just a server closer to your user that stores your JavaScript file and serves it faster. CDNs actually have a big, powerful infrastructure that will help you serve even dynamic content faster. When a user makes a request, instead of trying to reach your server directly, the request first goes to the closest CDN endpoint. The traffic between that endpoint and the endpoint closest to your server is then much faster than it would be over the normal route, because the routes between the endpoints of that infrastructure are optimised. What's more, certain CDN providers work on special protocols for moving data between their endpoints: sometimes it's just a plain HTTP request from one endpoint to another, but some companies are working on binary protocols that let those requests travel much, much faster. Another thing that lets you take advantage of a CDN is HTTP/2.
HTTP/2 is a new version of the protocol we already know, and while we can't yet take full advantage of what it offers, because of limitations in the software we use, the user's browser probably supports it, and the CDN infrastructure supports it too. So the traffic between the user's device and the nearest CDN endpoint, the last step before your server, can go over HTTP/2, which gets the response to the user faster. The next step, after putting your application behind a CDN, is to think about where your servers are, because even with a CDN, if your server is in Singapore and your users are, let's say, in Saudi Arabia, that is still quite a distance to cover. So instead of one server and one database, you might consider running multiple copies all over the world. This is quite a tricky topic, because you need to decide whether your application is read-heavy or write-heavy. If you mostly read from the database, it's fairly easy: you can have multiple read-only replicas of your database spread around the world. If your application is write-heavy, however, you need to synchronise those databases somehow, and then you're entering quite rocky territory where you may have data-synchronisation issues. Anyway, if your application is read-heavy, you should certainly consider putting application servers and database copies in various locations closer to your users. The second phase, after the request reaches your server, is obviously waiting for the result. That boils down to speeding up your Ruby application, and there are many presentations and many books about how to speed up Ruby code. The very, very first rule I have here is: don't be a smartass.
I used to be a smartass, and it cost me a lot of time, because whenever my application was slow I would say, oh yes, sure, we should cache it. Sometimes it helped, maybe in 50% of cases. But in the other half, I cached the content, deployed it to production, and saw maybe a 3% improvement, and obviously my boss wasn't happy when, after a couple of days of work, he saw a 3% improvement. That's not what you expect. So the first thing to do whenever you want to speed something up is to determine what is actually slow. You shouldn't focus on the parts that are already fast because, well, they're already fast. Focus on the slowest part of your application. To measure the performance of your application you can use many tools; the gems I have here are ruby-prof and rack-mini-profiler. Both will show you which exact line or method takes more time than the others, or which method is called so many, many times that it becomes a problem. The problem might be the database, and in some cases it will be, but not always. Whenever you measure your application, do it in production mode. This is obvious, but I sometimes see people forget about it, and then they see that, oh, my application spends 90% of its time reloading classes or compiling assets. How can that be? Well, of course, in production mode that won't happen; it happens only when you start your application. So you should always test in production mode, and for that I suggest something called a preproduction or beta stage.
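The "find the slow part first, then compare fixes" advice can be sketched with the stdlib Benchmark module. Profilers like ruby-prof or rack-mini-profiler point at the hotspot; a micro-benchmark like the one below then compares alternatives for that one spot (the string-building example is illustrative, not from the talk):

```ruby
require "benchmark"

# Profilers tell you WHERE time goes; Benchmark then compares
# alternative implementations of that one hotspot.
n = 20_000

plus_time = Benchmark.realtime do
  s = ""
  n.times { s += "x" }  # allocates a brand-new, longer string every time
end

shovel_time = Benchmark.realtime do
  s = ""
  n.times { s << "x" }  # appends to the same string in place
end

puts format("+=: %.4fs, <<: %.4fs", plus_time, shovel_time)
```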
This is basically a copy of your production environment that is not available to users, that has these gems installed, and preferably, if possible, uses a read-only replica of your production database, because the time your application spends loading the 10 users you have in your local database and the couple of thousand users you have in production will be very different. One of these two gems, rack-mini-profiler, is actually also meant to be used in the real production environment, but I nevertheless recommend the preproduction stage. Besides these two gems, you've got a bunch of external tools. New Relic and Skylight are both commercial products with free plans, so you can use them. You install a gem that analyses your running application and sends the information to their servers, and then you get a nice web application that lets you learn about your own. And it doesn't work on just a single request: it obviously covers the whole time your application is running, so you can see how much time is spent in a certain part of the code not just for one request but across all your users. And the nice thing about these tools is that they show you not only the average time but also the 75th or 95th percentiles. The average is a terrible measure that people overuse. If my brother is a millionaire and I am broke, on average we are both very rich, but I do not really experience the same life as my brother. So if one of your users sees a result in 10 milliseconds and another in 500, the average is around 250 milliseconds, and that tells you nothing, right? One of your users is very happy and the other is frustrated. Therefore you should use these tools and focus on the 75th or 95th percentiles.
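The average-versus-percentile point can be shown with a small Ruby sketch (the latency numbers are made up, and `percentile` uses the simple nearest-rank method):

```ruby
# Nearest-rank percentile: the value below which pct% of samples fall.
def percentile(samples, pct)
  sorted = samples.sort
  sorted[((pct / 100.0) * sorted.length).ceil - 1]
end

# Made-up response times in milliseconds:
latencies = [10, 12, 15, 20, 22, 25, 30, 45, 180, 500]

puts latencies.sum / latencies.length  # average: 85 -- looks acceptable
puts percentile(latencies, 75)         # p75: 45
puts percentile(latencies, 95)         # p95: 500 -- what unlucky users see
```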
Now, when you measure something, first you profile, which means you determine what the slowest part of your application is and focus on it, and then you benchmark, which means you compare alternative solutions. First profile, then benchmark. Do not benchmark things that don't matter. These two are not alternatives; they come one after another, and that's how you should do it. Once you have measured and determined the slowest part of your application, you start improving, and the first thing I always do, where I think people often say quite the opposite, is to rely on the database. Whenever someone says we need to speed up our application, we start by thinking about caching, because in many cases the time spent in the database is very significant. But I believe we have learned to treat the database as a big black box that we just throw stuff into and take stuff out of. Databases are really powerful systems: they have many functions that can aggregate, analyse and process data, and instead of fetching everything we have in the database and processing it in our Ruby application, we can just let the database do it. I think the problem is that we are so used to ORMs like Sequel or ActiveRecord. These tools teach us to write Ruby syntax and simply fetch the data, instead of writing the SQL that would let us fully use all the database's features. ORMs do not offer you everything the database offers, simply because ORMs usually try to support as many databases as possible, so very often they only support the functions common to multiple databases. A couple of weeks ago, here in Singapore at the Ruby meetup, we had a developer talking about the CUBE functionality in Postgres. It is not supported by any ORM, but it lets you aggregate your data in the database over multiple dimensions at the same time.
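As a sketch of "let the database aggregate" (the `bookings` table, its columns, and the sample rows are all hypothetical): the SQL below, a simpler cousin of CUBE, is what you would send, while the Ruby `group_by` shows the equivalent work your application would otherwise do after fetching every row.

```ruby
# Hypothetical schema: bookings(city, price).
# What you would send to the database -- one query, a handful of rows back:
GROUP_BY_SQL = <<~SQL
  SELECT city, COUNT(*) AS bookings, AVG(price) AS avg_price
  FROM bookings
  GROUP BY city;
SQL

# The same aggregation done in Ruby on sample rows. The database computes
# this without shipping every row over the wire and materialising an
# object per record:
rows = [
  { city: "Singapore", price: 120 },
  { city: "Singapore", price: 180 },
  { city: "Jakarta",   price: 90 },
]

summary = rows.group_by { |r| r[:city] }.map do |city, group|
  prices = group.map { |r| r[:price] }
  { city: city, bookings: group.size, avg_price: prices.sum.to_f / prices.size }
end

p summary
```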
And this is something we will probably never see supported in any ORM, but if you learn some SQL, you can take advantage of it right now. When I was at university, I had two semesters of an Oracle Database course, and after those two semesters my professor told me: hey, now you know the basics, so go and learn more yourself. Of course, I never learned more Oracle later, but during that time I learned that you can write a whole application using just Oracle. So a database is not only a tool to store data; databases are really powerful software, and you should not be afraid of them. Just learn some SQL, and trust and rely on your database. Of course, at some point you will be forced to, or will want to, use a cache, and then you need to consider where to put it. It might sound like a weird question, but you can keep it on a separate machine: you might store information in Redis on a separate server in the same network, which will be fast. You might instead run Redis on the same machine, which will obviously be faster, because we remove the time needed to transfer data from one machine to the other. And the third option is to store the information in your application's memory. Now, if I say that application memory is the fastest, why not store everything there, right? Well, every piece of memory you use for the cache is memory you cannot use to run your server, and we all know that Ruby applications are pretty hungry when it comes to memory: a simple Rails application running on Unicorn can consume around 400 to 500 megabytes. So you need to consider this trade-off: do I want to spend my memory on another server worker, or on a cache? What I usually do is combine the third and the first approach.
Whatever data is accessed on almost every request and doesn't change often, I try to keep in application memory; whatever doesn't fall into that category goes to Redis on a separate machine. Because you need to remember that even though everyone keeps repeating that memory is cheap, cheap doesn't mean free. The next thing about caching is using a multi-layered cache. Whenever a user reaches your server, you try to increase your cache hit ratio, so you want the user to take advantage of the cache. But sometimes pieces of information differ slightly per user. For example, at Kaligo we store static information about hotels. Some of it is the same for every user, like the hotel rating, how good the Wi-Fi is, or how tasty the breakfast was on a scale from one to five. But then we've got the hotel description, which is text, and it will be different for a user in Russia and a user in Singapore, simply because it will be written in a different language. For that case we use Russian-doll, or multi-layered, caching: all users take advantage of the cache for the information that is common to all of them, and they don't share the other part, because it is in different languages. The next part is something I really hate to say, because I'm a huge fan of functional programming. I use Erlang and Elixir on a daily basis, I really enjoyed writing Haskell and Prolog at university, and all these languages are functional and do not allow you to mutate data. Ruby does allow you to, and you should do it pretty often. That's because even though Ruby lets you write functional code, it's not really optimised for it. So imagine you have a hash with a thousand elements, and now you want to add one element to that hash.
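The memory-plus-Redis combination described above might be sketched like this; `TwoLayerCache` and its `fetch` method are illustrative names, not a real API, and a plain Hash stands in for Redis:

```ruby
# Illustrative two-layer cache: layer 1 is process-local memory (fastest,
# but eats your app's RAM), layer 2 stands in for a shared store like
# Redis on another machine.
class TwoLayerCache
  def initialize(remote)
    @local  = {}
    @remote = remote
  end

  # NOTE: ||= is fine for this sketch, but it would re-run the block for
  # cached values that are nil or false.
  def fetch(key)
    @local[key] ||= @remote.fetch(key) { @remote[key] = yield }
  end
end

remote = {}  # a plain Hash playing the role of Redis
cache  = TwoLayerCache.new(remote)

cache.fetch("hotel:42:rating") { 4.5 }                  # computed once
cache.fetch("hotel:42:rating") { raise "never called" } # served from memory
```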
If you call hash.merge with that one extra element, Ruby will copy the whole structure, all thousand keys, to another address in memory, and then add the one new piece there, which makes it a very expensive operation on big data structures. Purely functional languages, languages that do not allow you to mutate data, just create a new piece holding the one element plus a reference to the old address in memory; because that data is immutable and will never change, sharing it is safe. Using merge! (merge with an exclamation mark) instead of merge can be tens of times faster, and if your data is really huge, even hundreds of times faster. So, I really hate to say it, but when you need to, you should mutate your data. It is risky and you should be careful about it, but in many cases it will save you a lot of time. Now, this next one is a pretty cheap technique, but people often forget about it: upgrading your libraries. Even minor or patch releases can have a big impact on your performance. When developers release a new version of a library, they bump the minor or major version whenever there is a big new feature or something that breaks compatibility, but between the big features a lot of open-source work happens at the level of micro-optimisations, and a bunch of them can add up to a pretty big difference. For example, some time ago we used the ROAR library, which is a representer library, and moving from one minor version to another reduced our response time by about 10 to 20 percent, just from upgrading the gem. We didn't have to do anything else. Sometimes it will turn out that you cannot use a library any more, and then you should obviously consider replacing it. In some cases, that is extremely cheap.
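A minimal demonstration of the merge versus merge! difference:

```ruby
big = (1..1_000).to_h { |i| ["key#{i}", i] }

copy = big.merge("extra" => 0)    # copies all 1000 entries into a new hash
same = big.merge!("extra2" => 0)  # adds one entry to the existing hash

puts copy.equal?(big)  # false -- merge allocated a whole new hash
puts same.equal?(big)  # true  -- merge! mutated big in place, no copy
```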
For example, if you have a problem with JSON parsing, you can replace the JSON gem with these two gems: Oj, which stands for Optimized JSON and is a JSON parser written in C, and oj_mimic_json, which basically lets you keep the same JSON syntax you use right now while the new library does the work. If you often parse big JSON documents, it will make a big, big difference. Now it gets pretty difficult: native extensions. I have written maybe one native extension in my life, and it wasn't easy. But you can check out existing native extensions written for Ruby: some are written in C, for MRI, and some in Java, for JRuby. It's not that bad. It may happen that one particular method in your application is called many, many, many times, and maybe just by moving that one function to a faster language you can save a lot of time, and maybe it will not even be that difficult. And if you do not want to use C, you can use Rust, thanks to two libraries that already exist and allow you to write extensions in Rust. The next thing is quite obvious, but people often forget about it: move processing to the background. The most obvious example is sending emails. Usually you can send the email after the user gets the response; you do not need to make the user wait until the email is sent, especially since, over the SMTP protocol, that can be quite slow. But there are many other cases where you can give the user a response immediately and report the status of a transaction. For example, processing a payment may take up to a minute, so instead of making the user wait a minute, you just send back the transaction ID with a status of "in progress", and the user can hit your server again every couple of seconds to check whether the transaction has finished. When you're really, really desperate, you might want to extract part of your application.
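The payment-status pattern might be sketched like this; `JOBS`, `start_payment` and the Thread are all illustrative, and in a real application the work would go to a job queue such as Sidekiq rather than a raw thread:

```ruby
require "securerandom"

JOBS = {}  # in a real app: a database table or Redis, not a global Hash

# Kick off the slow work and respond immediately with an id and a status.
def start_payment(amount)
  id = SecureRandom.uuid
  JOBS[id] = { status: "in_progress" }
  Thread.new do
    sleep 0.1  # stands in for a slow payment gateway
    JOBS[id] = { status: "completed", amount: amount }
  end
  { id: id, status: "in_progress" }
end

response = start_payment(100)
puts response[:status]  # "in_progress" -- the user is not kept waiting

# The client polls something like GET /payments/:id every few seconds:
sleep 0.2
puts JOBS[response[:id]][:status]  # "completed"
```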
And I really mean when you're desperate, and I really mean a part, not rewriting everything in Go. I often read articles like: oh, we rewrote our Ruby application in Go, now we use 10 servers instead of 100. And we spent a couple of thousand man-hours just for that, even though we could afford those servers. So you should be really, really careful with this one. I think it's overused and overhyped. Rewriting an application is not a great thing to do, because you're basically spending a lot of time building the same thing you've already built. There are many more things to talk about that I don't have time to cover, so I will skip to the third part of the request, which is the download. Here we do not have many things left to do. The most obvious is to use gzip and compress your response. If your application is already behind a CDN, that will also make an impact here. And the third thing to consider is that maybe you don't need to send all the data to the user: maybe a big part of the response is only used in certain cases, in 10 or 15 percent of them, and you should move it to another endpoint. So our request has travelled from the user's device, across the world to the server, and back to the user's device. I would like to wrap up my presentation just by telling you: use a CDN; profile and then benchmark, always in that order; always focus on the slowest part; and rewriting is the last thing to consider. Ruby really is fast enough for you, and Sinatra, Rails and all the other frameworks are fast enough. In my previous company we scaled a Ruby application to handle half a million requests per minute before parts of that application started being extracted to other languages. So unless you hit a limit that big, you shouldn't really be considering a rewrite; just focus on the slowest part and gradually improve the speed.
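The effect of gzip on a response body can be shown with Zlib from the Ruby stdlib; in a Rack application you would typically just enable `Rack::Deflater` instead (the payload below is made up):

```ruby
require "json"
require "zlib"

# A made-up, repetitive JSON payload of the kind an API returns:
payload = JSON.generate(
  (1..200).map { |i| { id: i, name: "Hotel #{i}", city: "Singapore" } }
)

compressed = Zlib.gzip(payload)

puts payload.bytesize     # several kilobytes of JSON...
puts compressed.bytesize  # ...shrinks to a small fraction after gzip
```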
My name is Grzegorz Witek, this was a presentation about your API, which is most probably fast enough. Thank you very much.