I'm Hiroshi Nakamura, and my username is nahi on Twitter and GitHub. I work at Appirio and do CRuby and JRuby development in my off time. I'm also a member of Asakusa.rb. We have a weekly meetup, so when you come to Tokyo, Japan, please join us. You can contact us from here.

Today I'll introduce the matrix I created that compares the advantages and disadvantages of Ruby's HTTP client libraries. You can see the whole matrix at this URL. Please take care when you refer to this matrix, because I'm also the author of one of those HTTP clients, HTTPClient.

This is the agenda. First, I will give a brief introduction to net/http internals. Then I'll show the 16 Ruby HTTP client libraries I picked and explain the matrix in detail: API style, compatibility, and supported features. I'll also provide performance comparisons. Finally, I'll show you my recommendations of Ruby HTTP clients for each purpose.

This is the class diagram of net/http. It has the HTTP class that represents the connection to a server, HTTPRequest that represents a request, and HTTPResponse for the response. There are lots of derived classes of request and response, but these three classes implement all of net/http's features. Do you know what Net::HTTP::Proxy is? It was a class, but it's now a method that returns a crafted HTTP object that uses the proxy connection. There is also the net/https library, but all it does is require net/http and require OpenSSL. It exists only for backward compatibility now. This simplicity of class structure and complexity of implementation is, I guess, the root cause of why developers want to write their own Ruby HTTP client libraries.

Here are the 16 HTTP client libraries I picked. There are four groups. The first group, in green, is original implementations. net/http is a standard library, as I said. Excon and HTTPClient are pure Ruby implementations, and em-http-request is an EventMachine-based original implementation.
The second group is net/http wrappers: open-uri, which is also a standard library of Ruby, and HTTParty, Mechanize, rufus-verbs, rest-client, and RESTfulie. The third group is cURL wrappers: curb and Patron. The last group is adapter-based implementations that let developers choose the back-end HTTP engine: Wrest, Weary, Faraday, and HTTPI.

There are many HTTP client libraries I didn't evaluate. These are the ones I could not evaluate: ActiveResource is Rails-specific, http is under development, http_request.rb's tests don't pass, Nestful has no tests, and Typhoeus, one of the cURL wrappers, is under heavy rewriting now. So I could not evaluate these HTTP clients. And these are the ones I didn't evaluate because the gems haven't been updated recently, so I think those libraries are obsolete.

The evaluation axes are project stats, API style, compatibility, and supported features: connection features, basic HTTP features, and so on. I created a test/unit script for checking these supported features, so if you want, you can try the tests from this URL. This script also shows how you can use specific features like posting multipart forms, basic authentication, and proxy authentication. You can read these tests as examples.

First, I'll start with project stats. These are the 16 libraries I evaluated, and here are the maintainers. Is any of them in this room? I think, yeah, Eric is there. I'm the author of HTTPClient, and Eric is the maintainer of Mechanize. We see that rest-client and Wrest have not been updated this year, since 2011. So you may need to find the latest development fork on GitHub if you want to use these libraries. As for the stats at GitHub and RubyGems, there are four libraries, HTTParty, Mechanize, rest-client, and Faraday, that have thousands of GitHub stars, hundreds of GitHub forks, and millions of downloads.

Next is API style.
I summarize the API styles as synchronous, asynchronous, and parallel. All of the HTTP client libraries have a synchronous API. I'll introduce each of the API styles.

First is the synchronous API that uses a client instance. With these gems, the developer needs to instantiate the client from the provided class and can then issue HTTP requests from the created instance. The next one is client class. With these libraries, the developer issues HTTP requests through class methods. But inside, the class method instantiates an internal client instance, so it is no different from the previous API style. The third one is resource. With these libraries, the developer instantiates a resource with a URL and can then issue requests as instance methods of the created resource without passing the whole URL every time. The fourth one is include. With these libraries, developers can create their own customized HTTP client class like this and use the utility methods and features from inside the customized class. The last one is others. open-uri offers an API like this. And Wrest has an interesting, a little bit funny API like this. It starts with the URL string and converts it to a URI, and then the user can issue the HTTP request as an instance method of the URI object.

Next is the asynchronous one. The first style is callback, as in em-http-request. With em-http-request, the developer instantiates the request, but the request is not issued at this time. The developer defines callbacks for the successful request and the erroneous request. Then EventMachine's main loop issues the HTTP request, and when the request finishes, it invokes the user-defined callback. So developers need to implement their logic in this callback style. The next style is polling. With this style, the library offers an asynchronous API. This is a sample from HTTPClient. The developer calls this asynchronous API instead of the synchronous one, and it returns a connection object that the developer can poll.
The developer polls to check whether the request has finished or not, and when the connection is finished, the developer can read the response from the connection object. The last one is the parallel API. curb has a parallel API that issues all of the requests simultaneously. There is a multi object, and the developer can add HTTP requests to the multi object. The requests are issued when this perform method is invoked. So some libraries have an asynchronous API, and curb has a parallel API.

For multi-threading, almost all libraries support it, so the developer doesn't need to care about threads. But net/http and Patron need the developer to instantiate an object per thread, so you need to take care in a multi-threaded environment.

For error handling, almost all raise exceptions when an error happens. But for RESTfulie, the developer can configure it to return an error object instead of raising an exception. em-http-request requires the error callback I showed you before.

Next is compatibility. All of the libraries run fine on CRuby, of course. Except for the C extensions, they run fine on JRuby as well. JRuby has experimental C extension support, but I mark this as no because it's experimental, and I guess it won't exit from experimental status. So if you want to run your application on JRuby as well, you should use another HTTP client. On Rubinius, Patron fails to run my tests. But I think it should work, and it should be fixed easily. I think there's some bug in Patron or Rubinius. I don't know.

Connection features. In HTTP, there are three connection types, so I'll explain them first. The first one is the no-keep-alive connection. With a no-keep-alive connection, the client and server create a socket for each HTTP request. With a keep-alive connection, the client and server can use the same connection for multiple requests, so it's a little more efficient because it avoids socket creation and destruction. With pipelining, the client doesn't need to wait for the response to come before issuing the next request.
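
The difference between the first two connection types can be sketched with Ruby's standard net/http and socket libraries. This is only an illustration under stated assumptions, not code from the talk: `CountingServer`, `fetch_each_on_new_connection`, and `fetch_all_on_one_connection` are hypothetical names, and the tiny local server exists only to count accepted TCP connections.

```ruby
require 'net/http'
require 'socket'

# Minimal local HTTP/1.1 server (illustration only) that counts how many
# TCP connections it accepts. It answers every GET with a two-byte body.
class CountingServer
  attr_reader :port, :connections

  def initialize
    @server = TCPServer.new('127.0.0.1', 0) # port 0: let the OS pick a port
    @port = @server.addr[1]
    @connections = 0
    @acceptor = Thread.new { accept_loop }
  end

  def stop
    @acceptor.kill
    @server.close
  end

  private

  def accept_loop
    loop do
      socket = @server.accept
      @connections += 1
      Thread.new(socket) { |s| serve(s) }
    end
  end

  # Reply once per request; a blank line marks the end of the headers.
  # The loop ends when the client closes the socket (gets returns nil).
  def serve(socket)
    while (line = socket.gets)
      next unless line == "\r\n"
      socket.write("HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    end
  ensure
    socket.close
  end
end

# No keep-alive: open and close a connection for every request.
# The third argument (nil) disables any proxy taken from the environment.
def fetch_each_on_new_connection(port, count)
  count.times do
    Net::HTTP.new('127.0.0.1', port, nil).start { |http| http.get('/') }
  end
end

# Keep-alive: issue every request inside one started session.
def fetch_all_on_one_connection(port, count)
  Net::HTTP.new('127.0.0.1', port, nil).start do |http|
    count.times { http.get('/') }
  end
end
```

With this sketch, three requests through `fetch_each_on_new_connection` should leave the server's `connections` counter at 3, while the same three requests through `fetch_all_on_one_connection` should leave it at 1.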
As you see here, pipelining is very efficient in contrast to keep-alive connections. But I think pipelining is difficult to process, so many web servers like Apache don't support pipelined requests. So when you want to use pipelining from your HTTP client, you should check whether the server supports pipelining or not. Keep-alive is supported by Mechanize, em-http-request, HTTPClient, curb, Patron, and Faraday. Pipelining is supported by em-http-request and curb.

I'll show you keep-alive in em-http-request. For a keep-alive connection, you need to wait for the first response to come before sending the second request. So the second request must be issued from the first callback, and the third request from the second callback. You need to write this kind of code, so I think you don't want to use it. In contrast, pipelining is fine. Yeah. In pipelining, the client doesn't need to wait before issuing the next request, so the client can issue requests like this. And when an error happens, you just handle the error here. So I think it's feasible. I mean, keep-alive in em-http-request is not feasible for general development.

Next is SSL. I need water. This red cell means there's no verification by default. Yes, no verification by default. This is excerpted from a client library. When it tries to connect to an SSL server, it sets the verification mode to none, which doesn't do verification. And if the options are set, it turns on verify peer to do verification. So the developer can configure these libraries to do SSL verification. But if the developer makes a bug in options handling and the options are cleared, this HTTP library sets verify none, connects to the SSL server, and sends the user's data and secrets without warning. That's not what is expected of SSL. If SSL verification is not done and the developer wants to verify, it should stop sending data.
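
The fail-safe behavior argued for here can be sketched with Ruby's standard net/http and openssl libraries. This is a minimal illustration under stated assumptions, not code from any library in the matrix: the helper names `verify_mode_for` and `build_https_client` and the `:insecure` option are hypothetical. No connection is opened; the sketch only configures the object.

```ruby
require 'net/http'
require 'openssl'

# Hypothetical helper: pick the verification mode from user options.
# Verification is skipped only when the caller asks for that explicitly,
# so a missing or accidentally cleared options hash still verifies the peer.
def verify_mode_for(options)
  if options && options[:insecure] == true
    OpenSSL::SSL::VERIFY_NONE
  else
    OpenSSL::SSL::VERIFY_PEER
  end
end

# Hypothetical helper: build (but do not open) an HTTPS connection object.
def build_https_client(host, port = 443, options = nil)
  http = Net::HTTP.new(host, port)
  http.use_ssl = true
  http.verify_mode = verify_mode_for(options)
  http
end
```

The design choice is that the unsafe mode needs a positive, explicit request. A bug that drops the options then results in a failed handshake against an untrusted server rather than data being sent unverified.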
So when you want to connect to an SSL server, you should use Mechanize or HTTPClient, which support certificate revocation checking, or curb, Wrest, or HTTPI.

Proxy. Almost all support proxy, proxy authentication, and basic authentication to the server. Some support digest authentication and Windows NTLM authentication, too.

HTTP features. Almost all support GET, POST, PUT, and DELETE. open-uri doesn't support POST, PUT, and DELETE because it's just for downloading a file. Some libraries support custom HTTP methods like PURGE. It's not a good thing, but some API servers require you to send this kind of non-standard HTTP request. So if you are connecting to such servers, you should use this kind of HTTP library.

IRI. An IRI is an internationalized resource identifier, which can include multibyte characters in a URL. Unfortunately, Ruby's URI doesn't support IRI. The addressable gem supports IRI, but also unfortunately, Addressable::URI is not aimed to be a drop-in replacement for uri.rb. So HTTP client library developers need some code to support Addressable::URI. It's not difficult, so you can ask these library developers to support Addressable::URI.

Response headers. curb returns response headers in a single string, so if you want to extract information from the HTTP headers, you need to parse them yourself.

Cookies. Mechanize and HTTPClient support cookies. But HTTPClient has a cross-site cookie bug: HTTPClient accepts a cookie for the .com domain and sends it to all .com sites. That's not expected, because it can cause a security vulnerability like session fixation. Browsers need to handle which kinds of domain names can store a cookie and be sent a cookie. So if you need proper cookie handling, you must use Mechanize. Mechanize handles these cookies properly, like browsers.

Redirect. An HTTP server can return a redirect response to tell clients to follow the redirection.
Many client libraries support following redirections, but some HTTP client libraries don't have a redirect limit. So if the server returns the same redirection again and again, it causes an infinite loop and crashes your Ruby interpreter. You should take care of this redirection limit if you are going to use these client libraries.

Form URL encoding. With the libraries that support it, you can pass an HTTP query and a POST form as a hash or an array. Without this feature, you need to concatenate the parameters with ampersands and equal signs yourself. Sorry.

Multipart posts. These libraries support multipart file posting and streaming upload and download. This is a sample from Patron. Patron has a file parameter to specify the file to upload to the server. Patron reads the file in chunks: it sends a chunk to the server and then reads the next chunk from the file, without reading the whole file and consuming lots of memory. For download, Patron has the get_file method. You pass the file to write to, and it writes the response into the specified file without reading the whole response into memory. em-http-request also has such a method, and for chunked download, em-http-request offers a stream callback that yields every chunk read from the server.

Compression. Lots of HTTP client libraries support compression and decompression. Also, for the response charset, some libraries set the encoding of the response string according to the Content-Type charset in the HTTP response header, so you don't need to convert the encoding yourself.

Development support. Response stubbing is stubbing a response. This is a sample from HTTPClient. It has the loopback response method and can cache the HTTP response body. When you invoke the next request, this string is returned from the method without accessing the actual server. HTTPClient also has HTTP message stubbing.
This is an example of a redirect to another URL followed by a normal response. When this method is invoked, it follows this redirect and gets this HTTP response without accessing the server. Some libraries have a similar feature, and it helps you debug or test your client. I'll introduce it later, but you can use the WebMock gem for the libraries that don't support this. So you may not need to care about this line if you use the WebMock gem.

Some libraries have a wire dump debug feature. It allows you to dump the actual HTTP request string and response string, or write them to a file. It helps debugging when, for example, an HTTP server seems to return a broken response or a wrong HTTP response header.

IRB-like shell. Some have this feature. I think rest-client is the first client that offered this feature. With these libraries, you can invoke an IRB-like shell and invoke the methods from inside it. And you can, of course, use IRB's history and editing features. rest-client also has an interesting and useful feature, the replayable log. If you set this environment variable and call rest-client methods, it dumps to the file in the format of a Ruby program, so you can take this code into your program after trying to connect to the server like this. It's a nice feature that I want to implement for HTTPClient soon.

The last one is advanced features. Some libraries have request and response hooks that hook the request just before sending it to the server, and the response just after receiving it and before parsing it. It's useful for setting a header like an authentication header, or for tweaking the HTTP response from the server, such as fixing the charset header. And some libraries have a JSON and XML conversion feature. It converts a hash or an array into JSON or XML before sending, and converts the response JSON or XML payload back into a hash object.
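
As a rough illustration of the JSON conversion feature just described, the following sketch does the same two steps by hand with Ruby's standard json and net/http libraries. The names `build_json_post` and `parse_json_body` are hypothetical, and the sketch is not how any particular library in the matrix implements the feature. It only builds the request object and never sends it.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Hypothetical helper: turn a hash or array into a JSON POST request.
# The request is only constructed here; nothing goes over the network.
def build_json_post(url, payload)
  uri = URI.parse(url)
  request = Net::HTTP::Post.new(uri.request_uri)
  request['Content-Type'] = 'application/json'
  request.body = JSON.generate(payload)
  request
end

# Hypothetical helper: turn a JSON response body back into Ruby objects.
def parse_json_body(body)
  JSON.parse(body)
end
```

A library that offers this feature saves the caller from repeating these two conversions around every request.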

Response caching is supported by Mechanize, RESTfulie, rufus-verbs, and Wrest. They cache the server's response, of course, in a proper manner. Mechanize asks the server whether the resource has been updated if the resource is already downloaded. It's very efficient for network bandwidth usage, and it's very nice for a browser-like client. But the reason why I mark this yellow is that, for an API client, it might cause some problems, because an API server may not want to get If-Modified-Since or update-check requests. So for an API client, you should be careful, because this feature is enabled by default in Mechanize. Is that right, Eric? Yeah.

HTML form handling is the shiny feature of Mechanize. It allows you to get the login page, get the login form from it, set the email and password, and submit the form to the proper URL. So it's very useful for testing your web application.

Testing clients. As I said before, WebMock is the library for stubbing and setting expectations on HTTP requests. WebMock supports all 16 libraries I listed, so you can use WebMock even if the HTTP client library doesn't have such features. VCR is for recording your tests' HTTP interactions and replaying them during future test runs with help from WebMock. So these libraries are a must for you when developing your HTTP client.

Performance comparisons. This is my environment for evaluation. The server is on the West Coast, running Ubuntu and Apache 2.2, and the client is on the East Coast. I use CRuby 1.9.3. I did multiple downloads of 200 bytes and of 24 megabytes. But please don't take it too seriously, because it's not a comprehensive benchmark. If you want to evaluate by yourself, you can grab my benchmark script from here.

This is the multiple 200-byte downloads. Let's look at the blue one first. The blue one is 30 downloads by one thread. There are three groups. The first doesn't support keep-alive connections.
The second supports keep-alive connections. When the library supports keep-alive connections, it's really efficient for small file downloads, as you see. The third group is em-http-request and curb with the multi API. The other libraries issue these 30 HTTP requests one by one, but em-http-request and curb multi issue the 30 requests at once, so they're really fast, as you see. Next, the red one is with 10 threads and five downloads for each thread. It's almost the same trend, but threads are efficient for all HTTP libraries except em-http-request and curb multi. As I explained, those libraries issue the requests at once, so threads don't have any effect.

The next one is a comparison of Ruby implementations. The blue one is what I explained on the previous page. Red is JRuby 1.7.0, and yellow is Rubinius 2.0.0-dev. You can see almost the same trend for CRuby, JRuby, and Rubinius. But for some libraries, RESTfulie and Weary, JRuby and Rubinius run much slower than CRuby. So I think there's some problem in JRuby and Rubinius with IO handling, but I'm not sure. The most interesting thing is EventMachine-based em-http-request. On CRuby it's fast, but on JRuby and Rubinius it's not fast, as you see. It's almost the same as HTTPClient and Mechanize on Rubinius. I also think there's some problem in how JRuby and Rubinius handle EventMachine. And as you see, EventMachine is not updated frequently, so it may take some time to fix this problem.

The last graph is the multiple 24-megabyte downloads. The blue one is three downloads by one thread. You'll see almost the same trend for non-keep-alive HTTP clients and keep-alive-supporting HTTP clients. For the 24-megabyte downloads, em-http-request and curb multi are not much faster than the others. So I think it's mostly a network throughput issue.
So keep-alive-supporting HTTP clients can run as fast as em-http-request and curb multi for big data downloads. The next one, in red, is with three threads and one download for each. As you see, there are no notable differences in the result. So as I said before, it's mostly the network throughput that affects this download time.

It's the last slide. Oh, I talked five minutes faster than in practice. My recommendations. If speed is king, you should use em-http-request or curb with the multi API. But if you are going to download a big file, you can use other HTTP client libraries to avoid fighting with the complex APIs of these HTTP client libraries. For HTML operations and cookies, you should use Mechanize. For API clients, you can use Faraday and the other adapter-based implementations. But if you want to connect to an SSL server, you need to take care of the SSL server verification issue I explained in this talk. For SSL and various connectivity, you can check HTTPClient first. If you don't know what server you are going to connect to at development time, you can choose HTTPClient for its various connection features. Please check the matrix before you use the libraries, and please let me know when you find incorrect cells in it. Thank you.

Any questions?

I first created this matrix one and a half years ago for a Japanese conference. It took three months. This time, I just updated the matrix and removed some HTTP clients. It took, I think, almost four weeks. You're welcome.

Hi. You recommended which libraries to use here based on what you've learned. Would you also make any recommendations for developing with the libraries that exist, or for developing a new library? You know, would you say, "You should have done this differently. This feature is missing. This API is not good"?

Sorry, I don't understand the question.

Oh, sorry. So you recommended which libraries to use. Would you make recommendations on the design of libraries? To the authors. Recommendations for the authors.
The features I listed in these slides are somewhat difficult to implement because lots of servers implement the specs improperly, so you need to take care of non-standard HTTP responses. But if you still want to write an HTTP client by yourself, you can define the API first. I do have a recommendation for HTTP client authors: there are a lot of HTTP clients, and some have the same API, which offers a client instance and a get method and defines how you specify the query, how you specify the file, and how you specify the body. The WebMock gem depends on such an interface. So I think you should define the API according to the existing ones and write the internal implementation by yourself.

Recommendations for sample implementation from HTTP developers. Sorry.

Well, there are two ways. You can either answer what could have been done better with net/http, for example, in the core library, or should we scrap it and put in something else? The question is, how should a good library be written? What's the API? Make it easy to use, make it easy to keep to the standards. What features should it have? What did you learn was missing or wrong?

The question is what I learned from the various HTTP client API styles. Is that right? Yeah. I found some interfaces that are not easy to use. Sorry, but all I found is that it's not easy to use, like this one that includes class methods instead of instance methods. So I think developers should care not about the API itself, but about how the developer can pass parameters to the client instance. That's what I learned from the various API clients.

Oh, thank you. It's time.