All right. Thanks, everybody, for bearing with us on the slow start there. Welcome to the discussion on making Drupal the best REST server. It's important for me to note that in this conversation, I'm the outsider. We've got some colleagues who have been doing amazing work on REST integration for Drupal 8, and this talk should not in any way be interpreted as a criticism of anything they've done. In fact, it's the opposite. They're doing amazing work and it's incredibly useful, and I look forward to working with them at the code sprint on Friday. But so far, I've not contributed anything towards the web services initiative. So this is an outsider's perspective. This is what someone who is just starting out with Drupal 8 when it's released, and wants to investigate this web services thing, might find out. That's the experience I've tried to go through over the past few weeks as I've prepared for this talk.
So in my day job recently, we went through a multi-week process of evaluating front-end JavaScript MVC frameworks: things like Ember and Angular and Knockout. I'm not sure that we found out which was the best of those many choices, but what we definitely found out is that there are a lot of good choices. These things are very powerful. Most of them are very easy to dive into with some pretty minimal JavaScript skills. And that just confirmed for me an intuition that had been growing: page reloads on the web are going to become a thing of the past, and they're already rapidly moving in that direction. We're developing a whole new category of web developer that I think we might call client-side back-end. It used to be you'd have JavaScript people who do front-end to make things look pretty, but now we're actually building our applications on the client side. And so the server-side back-end developer is going to become more and more concerned with simply building good APIs and providing performant data servers.
So I think we need to be ready for this. I think Drupal 8 is very aware of that and doing an excellent job. So I've spent my time not looking at the Drupal resources, the conversations in the patch queues about whether we should implement something this way or that way. I've spent my time looking at other things: other things that are implementing REST, other things people have written about REST, studying Roy Fielding's dissertation that introduced the concept of REST, and some of his further commentary on how REST can be leveraged for the future of the web.
One of the other leading options that a lot of people are being attracted to for building web services is Node.js. Microsoft has gotten very much behind promoting Node.js on their Azure cloud, and they're hosting really good free boot camps teaching people how to code JavaScript and Node.js to build server-side applications. And when I talk about Node.js, by the way, throughout this entire talk, not just this slide, I'm really talking about Node.js plus the Express module, which makes it really easy to build web services. It basically gives you your put, get and post methods, and you just attach them to callbacks and you're done. So that's been my experience as I've tested things and benchmarked, and I have some numbers I'm going to show you, but just be aware: it's not just pure Node.js, it's Node.js plus Express.
And the approach there is: here's your interface, what do you want to pass through it? The great thing about Node.js being non-blocking is that you can grab external resources from different places and build up complex documents, complex resources that are a mashup of data, that can be served to the client very, very quickly. Compare that to the Drupal model, where we kind of assume all your data lives in our database, you've built it out with content types and views and the tools that Drupal provides, and we've done all the work for you.
And here's a faucet that you can suck it through. So there are two different approaches, and they're solving two slightly different problems. Maybe it's an open question whether we ought to move in one direction or the other. I think that's something we can talk about, but it's something to be aware of.
So I decided to cast the issue that REST strives to solve as a user story. You can read it there: as an API consumer, I want to be able to discover all of the resources available to me via a small, generalized set of methods, so that the service is self-documented. REST is the solution that implements this user story. I'm going to assume people are basically familiar with REST and the idea of using HTTP methods to access resources. I've also assumed that people are basically familiar with Node.js. Does anybody not know what Node.js is? I guess I should clarify that. Okay, everybody knows what Node.js is.
So we've got the basics down in Drupal 8. The REST server that's there right now, I just pulled it down from the Git repository on Friday. It does the basics and it does them well. So my notes here are the things that we still need to make improvements on, things we need to finish. Maybe these will be motivations for things to work on at the code sprint on Friday, or to contribute towards.
The move to using HAL, HAL+JSON, as a format, compared to JSON-LD, which we were talking about at the last DrupalCon, was a great choice, because I think the best REST APIs give a choice between receiving a resource that simply has links to all the dependent resources and receiving one that actually nests the dependent resources in it. That's what HAL does: it allows you to nest the related resources. So in our case, you have a node. It has some data about the node, a title, a body, but it also has an author, and that's really a reference to another entity that the server can tell you about.
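As a rough illustration of the two shapes being contrasted here, the sketch below builds a linked and an embedded version of the same node in Python. The `_links` and `_embedded` keys are HAL's reserved properties; the field names, the sample data and the `/entity/...` paths are assumptions made up for this example, and this is not a reproduction of what Drupal 8 actually emits.

```python
import json

# Hypothetical data; the paths and field names are illustrative only.
author = {"uid": 26, "name": "example_author"}
node = {"nid": 1, "title": "Hello REST", "body": "Some body text."}


def as_linked(node, author):
    """Representation where the author is only a link the client can follow."""
    doc = dict(node)
    doc["_links"] = {
        "self": {"href": f"/entity/node/{node['nid']}"},
        "author": {"href": f"/entity/user/{author['uid']}"},
    }
    return doc


def as_hal_embedded(node, author):
    """Same links, plus the author resource nested under _embedded."""
    doc = as_linked(node, author)
    embedded_author = dict(author)
    embedded_author["_links"] = {
        "self": {"href": f"/entity/user/{author['uid']}"}
    }
    doc["_embedded"] = {"author": embedded_author}
    return doc


linked = as_linked(node, author)
embedded = as_hal_embedded(node, author)
print(json.dumps(embedded, indent=2))
```

With the linked form the client needs a second request to learn anything about the author; with the embedded form it already has the author's data and can still follow the link later.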
So rather than just providing a URI to say, hey, if you want to know about the author, go to this entity/user/26, we can actually nest that user object in there if the client has requested the HAL+JSON format. But HAL is incomplete as it stands right now, by my judgment. There are some things you wouldn't expect. For example, with files, what you get back is a file object which has a file ID for where to find that file in the files table in the database, but there's no URI for the file itself. So that kind of seems like a major oversight in HAL's handling of files. There's some stuff missing in users. And the URIs that are generated are not actually reusable by the client. I'm guessing that's just an artifact of switching over some of the recent HAL things. If there's some reason, if I'm not understanding the URI format for the resources in HAL, I'd love for someone from the web services team to explain that to me. But my understanding is that with an ideal REST server, the URIs that you get back are simply paths that the client can then call to get that object. And right now, that doesn't work. The URIs you get back aren't even rooted in the entity path that the REST server uses. So that's one thing that we need to fix. But if we do that and we do it well, that'll be an edge on some of the other out-of-the-box REST solutions.
One of the fundamental tenets of REST is that the protocol doesn't actually matter. You can use any protocol you like. But of course, we're concerned with HTTP, and there are a few things that HTTP provides that we could make use of to really add some of Drupal's power into our REST API. First, I know that there was a switch from using PUT to PATCH. I think supporting PATCH is a good thing, but it's not actually in RFC 2616. It's not completely standardized. And PUT is assumed. Every REST client is going to expect to be able to send a PUT request to write back to the server.
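To make the PUT and PATCH distinction concrete, here is a minimal Python sketch, not Drupal's implementation. It assumes a resource stored as a plain dictionary and a merge-style PATCH body (one common convention; the PATCH method itself does not prescribe a patch format). The `internal_notes` field is a hypothetical stand-in for a field the client is not allowed to see.

```python
def apply_put(stored, representation):
    """PUT: the request body replaces the stored resource wholesale."""
    return dict(representation)


def apply_patch(stored, changes):
    """PATCH (merge-style): only the supplied fields are changed."""
    updated = dict(stored)
    updated.update(changes)
    return updated


stored = {
    "title": "Old title",
    "body": "Body text",
    "internal_notes": "editors only",  # hypothetical restricted field
}

# A client that was never shown internal_notes sends back what it knows.
client_view = {"title": "New title", "body": "Body text"}

after_put = apply_put(stored, client_view)
after_patch = apply_patch(stored, {"title": "New title"})

print(sorted(after_put))    # keys that survive a full replacement
print(sorted(after_patch))  # keys that survive a partial update
```

A full replacement drops whatever the client did not send, which is harmless when the client sees the whole resource and awkward when some fields are hidden from it.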
So I think we should add that in there. I realize that's not easy because of field-level permissions handling, so it's going to take some thinking about the best way to do that. But I think we'd be missing a major piece if we didn't have some support for PUT requests that do something sane that a client would expect, because clients aren't going to know to then try PATCH when they get a method not supported response from their PUT request.
Drupal's got some really cool capabilities with batch and queue, and we almost get to the point of being able to do asynchronous processing in some cases. So if we get to the point of wanting to be able to pull data from external sources, we're not quite non-blocking like Node.js can be, but with our batch and queuing system we could queue up jobs to gather assets from remote servers. The way that HTTP supports that is with a 202 Accepted response. That's basically defined there for you. And with that accepted response we can support a POST callback, to tell Drupal: hey, once you've got this data, send it back to me. Don't make me call you back for it. So that could be a really powerful service that we could provide. The other alternative is to do a 201, which is to say that this job resource has been created, and here's a URI that you can poll to check and see when we're done. So it'd be great if we could implement those as well, for some more complex long-running processing and reporting and the kinds of things that you might want to do with the diversity of content and services that Drupal can access.
So when you strip things down past the web and get down to the bare-bones HTTP protocol, we kind of suck at caching. It's time to end the Expires hack with Dries's birthday. We'll put Dries's birthday somewhere else in the code base. You know, that's fine.
But we should be using Cache-Control and Expires the way that they are intended to be used, so that clients that are single-page apps consuming Drupal content, and that request the same node again, don't have to go back to the server for it unless there's a good reason. So entities need to be able to define their own caching structures and set these headers very easily, without kind of bucking the trend of what Drupal is doing out of the box. And yes, I checked the REST server. The headers are still no-cache and the Expires hack. ETags are also coming on the scene as another way of managing caching. There are some complex interactions between these three things, and they can all kind of seem to do the same thing, so we should have conversations about how we want to use them.
The other thing is that one of the best things you can do for Drupal caching is outside of Drupal. Almost everybody doing serious Drupal in production is using Varnish or something like it these days. And this may seem controversial, but I think we should ship a Varnish VCL with Drupal 8. It's kind of a pain in the ass to figure out all the different ways to handle image cache and some of the special exceptions in AJAX callbacks and things, when they can and can't be handled by Varnish and when they should pass through. I've seen some suggested VCLs on the web, specifically for Drupal, that are very broken and will just wreck things, so it's still an effort to set up your Varnish correctly for Drupal. Of course people are going to say, well, come on, we're not going to add a dependency on, or a relation to, a third-party system in Drupal. But actually we've been doing it since the beginning. We ship an .htaccess file, and now we ship a web.config file for Microsoft. Why don't we ship a Varnish VCL? If nothing else, that would be a big win on the caching front.
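As a small sketch of how Cache-Control and ETag can work together for a single resource, here is a framework-free Python example. It is not Drupal code: the `respond` function, the 300-second lifetime and the hash-based ETag are assumptions chosen for illustration, and the `If-None-Match` handling is simplified to a single exact match.

```python
import hashlib
import json


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(resource: dict, request_headers: dict, max_age: int = 300):
    """Return (status, headers, body) for a GET of `resource`."""
    body = json.dumps(resource, sort_keys=True).encode("utf-8")
    etag = make_etag(body)
    headers = {
        # Lets clients and shared caches reuse the response for max_age seconds.
        "Cache-Control": f"public, max-age={max_age}",
        "ETag": etag,
    }
    # Simplification: real If-None-Match values may list several tags or "*".
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""
    headers["Content-Type"] = "application/json"
    return 200, headers, body


node = {"nid": 1, "title": "Hello REST"}

first_status, first_headers, first_body = respond(node, {})
second_status, _, second_body = respond(
    node, {"If-None-Match": first_headers["ETag"]}
)
print(first_status, second_status)
```

Within the `max-age` window a client does not need to contact the server at all; after that, the conditional request lets the server answer with an empty 304 as long as the node has not changed.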
So the next thing that we'll spend some time looking at is some actual benchmarking numbers, because I think the key thing for a data server is that it has to be fast. Not just because people like things to be fast, but because of the nature of single-page apps that are constantly sending requests back to the server. The number of round trips that we're making, the number of requests that we're making, is going to increase, potentially exponentially, as we have resources embedded inside of resources. And even if you just want to do something as simple as the infinite scroll feature we're seeing all over the place, I don't want delays when I'm scrolling. I want that content to be right there.
So I did a number of comparisons of Drupal to some other things. These are not fair comparisons, and all of this should be taken with a grain of salt, but as we get through it there are a couple of interesting things and a couple of things that surprised me. First, I should explain this graph. This is the number of requests served per second, so higher is better: higher is faster, higher is I can handle more traffic more quickly. Both bars are Apache Benchmark runs. The blue bar is a large number of requests with a relatively small concurrency; the orange bar is a smaller number of requests with a higher concurrency.
So on the right here we have the Drupal 8 out-of-the-box REST server, with a view providing a view of titles. In all of these tests, all the way through, all I'm really requesting is node titles, just to keep it simple. So it's returning a JSON object: an array of objects that have one or maybe two properties. The main thing we're looking for is the title; some of them have two because some of the other frameworks also include an ID or a sorting property. So that's Drupal out of the box. The next one is just labeled "module". In that case what I did is build a Drupal 8 module using the YAML routing technique that's new, so not using hook_menu to
define my routes; a very simple controller callback, a naked class with a single method that uses the Symfony JsonResponse. In that I cheated: I used mysql_query to skip over the database layer. I just wanted to see how the routing layer is affecting us, how the overall code load of Drupal is affecting things, skipping the database layer, skipping Views, skipping the REST server and all of that. We get about 50% more requests per second there, so there is room for improvement above the basic layer, in terms of Views and the REST server.
Next we're looking at Node.js, and again this is very similar to what I did with the bare module. Again it's with Express, so it does have routing built in, and I have a route to a single page that does the same query that Views was doing, which is a select of title, ordered by random, from the same MySQL database. In all these cases I was pointing at the Drupal MySQL database. And you can see we're an order of magnitude faster than Drupal out of the box. Now, this isn't necessarily a surprise. We're comparing a very simple single-page script to a full framework, so that's not terrible. In fact, I said to somebody yesterday (he's here, so you can check me on this) that I expected it to be about an order of magnitude faster, and that's basically what it is. We go from around 100 requests a second to just over a thousand requests a second. The surprising thing here is that concurrency takes such a big hit on Node. I did not expect to see that. So that's interesting. I don't know what that means, but that's there.
From there I kept going, because I wanted to compare more things and more ways of accessing data by REST. So the graph just rescaled itself; note we now go to 4,000 over there. What we've added on the left is requests to MongoDB's built-in REST server, so at this point we're not using MySQL. These are the same Drupal numbers that you just saw on the previous chart, but relative to MongoDB they've shrunk quite a bit. Again,
no surprise here: there's no framework at all, this is just pinging the Mongo server. There's no MySQL, and in fact it gets to cheat a lot too, because in MySQL I was using a random sort and Mongo does not support that. So in this particular test, let's see, was I just doing the exact same query? I did come up with a way to cycle through queries for later tests with Mongo. But this number is really kind of irrelevant, and striving for that kind of performance in a full framework, versus just a database that happens to have a REST API, is ridiculous. But I wanted it on the chart for comparison.
So this is where it gets interesting, because of that new bar that we've added. Again, we've just rescaled the graph; those four numbers on the right are all the same ones you just saw. This is bare PHP and MySQL. This is an index.php file that calls mysql_query and again runs the same query that Drupal was running in the beginning. And it's twice as fast as the Mongo REST server and an order of magnitude faster than Node and MySQL. So if anyone says to you, well, we're doing a web service, we need to use Node because it's faster: well, actually, it's not. You could build a much faster REST API, for data in MySQL anyway, by auto-generating an index.php file for each resource and letting Apache do your routing, and be done. It'd be vastly faster. So Node does not automatically win on the speed front, comparing programming language to programming language at the bare-bones level. So that was surprising.
And again we've just rescaled, and now we're using Node and its favorite database, Mongo. The huge difference here between Node with Mongo and Node with MySQL is really interesting, and I wonder if that has to do with the maturity of the two drivers. Everybody loves Mongo; I don't know, with BSON. But at that point we're in a whole new ball game, and Drupal has almost vanished off the chart. So what can we do about this? I mean, should we be trying to compete with this? I don't think that's realistic at
this point. So what can we do? Well, I think first off, with the biggest gain being seen from bringing Mongo in there, we need to make sure that supporting Mongo and other document-oriented databases and key-value stores is easy to do in Drupal 8. I know work is being done on that. One of the key pieces to really make that viable is the EntityFieldQuery backend for Views. There's a module out there that's labeled alpha and says you're crazy if you use it in production, but I've heard other people say no, actually it works pretty well. That's for Drupal 7; I don't know where we're at with Drupal 8. So since this is a core conversation, I don't know if chx is here, or anyone else who knows about the state of that, to maybe give us a quick update on how we're doing with our Mongo support in Drupal 8. But let's keep that as a priority.
And then I think we can partner with Node.js instead of viewing it as a competitor. Because we're returning links in our JSON documents, whether it's pure JSON with just URIs or whether it's HAL with embedded things, there's nothing that says that those URIs have to point back to Drupal. In an ideal case we could do nothing but switch the port. So if you're a client that just cares about pulling down the data, first you get all the data related to the object, and Drupal handles that, but then we can go to a completely different service, or a completely different server running the same service, to retrieve the rest of the data that's related to it. And so then we can get Node-style performance for everything after our first request to a Drupal data API.
I also think, just to prove to the world that Drupal is an awesome platform for building next-generation web apps that don't have page refreshes, we should try to ship Drupal 8 with a single-page theme. I talked to Dries about this at DrupalCon Down Under in Sydney, and he thought it was a great idea, and it's definitely something we can do past code freeze. So if that's something that you're
interested in, maybe we could get a group together on Friday to talk about the pieces that would be needed to make that happen. I'd really like to see a Drupal theme that supports navigating everything in the site. I don't think we need to tackle the administrative experience this way, but from an end-user perspective I want to be able to access all the content, all the blocks, all the menus in my Drupal site without a page reload. That would be pretty cool.
So that's all I've got. I remember the conversations in San Francisco were a lot shorter, and we broke up into groups and discussed details. This is a pretty big group, and we'd have a lot of uncoordinated side conversations, but I welcome anyone with questions or comments to come up to the microphone. Please do, because this is being recorded. Anyone involved in the web services initiative and the Views backend, if you want to give us a quick report on where that stands and how people can help out, please, let's have a conversation.
So, just one explanation of why we removed the PUT support for the entity resource plugin in Drupal core. With entities you have field access, so you have different representations of the same node, because depending on the user that accesses the node, the fields are different. And the specification of PUT in the HTTP protocol says that you should just take over the resource that was sent to your server. If you're a client that is not able to access some of the fields, you cannot really PUT the node there again without those fields, because they would be deleted. That's what PUT is used for. That's why we removed it for the entity resource plugin, but that doesn't mean that you can't use it for other resource plugins. We made the architecture so that we have a plugin in core that covers entities, and we do all the CRUD support for them: create, read, update and delete. So you can write your own plugin that does the data handling for you, not working with entities, for example, or working with entities in a different way, and you still
have the serialization on top of that, which converts all the data that you create down to HAL or JSON, whatever you want.
I'd like to see that plugin architecture be as easy to use as the Express framework in Node.js. It really ought to be: here's my object.put, here's what to do when you get a PUT request.
Exactly, and that's basically how it works in Drupal 8. Corresponding to the HTTP request methods, you just have the methods on that object: one for PATCH, one for GET, one for POST, one for DELETE, for example, for the entity resource plugin. And then you get passed in either the ID or the already deserialized object that was handled by the serializer before, so you don't have to do any serialization. That was the goal for the resource plugins in Drupal 8. And yeah, that's basically it. The architecture isn't that sophisticated.
Yeah, thanks. Is there anyone in the room who's done anything with alternate databases besides MySQL in Drupal 8? Where did they go? Is there a database session going on right now? I'm optimistic about that; I just don't know where it stands. Mongo for Drupal 7 is working pretty well in most cases, at least for accessing entities. Again, Views is an issue.
So I have a question about integrating Node.js. There's the thing that you currently get with the way that Drupal is implemented, even though it's slower. It's kind of what he was saying earlier about having access to certain fields. So if we're going to use Node.js as our back-end REST server, how do we get all of the other permissions, like field permissions, content type permissions and everything, into Node.js so that we don't get security breaches?
So first off, I'll say we may not be able to. I think field-level permissions are not used all the time; there are a lot of sites that don't have them at all. But the other thing is that we could view each variant as a separate resource, so that for each role that has different permissions, maybe we actually have a different collection in Node.js. I
don't know, that's just off the top of my head, but there are ways to deal with that, I think. But yeah, it may just be a case of: if you have field-level permissions, this isn't supported. I think there are enough cases where that's not the issue that it's still worth doing.
I highly recommend Roy Fielding's blog for follow-up information on ways to do REST. Basically, do a Google search for Roy Fielding and REST and you'll find a lot of conversations and rebuttals and approaches, and people defending why "my API really is RESTful even though Roy said it wasn't". So there are a lot of good resources to be found.
What do you see as the use case for single-page websites within Drupal?
I don't know that the use case is any different from a traditional website. It's just about giving them a more responsive, more fluid user experience.
So the issue with single page has always been SEO, and so I've always seen it as more of an application type of thing, which Drupal is used far less for. So I'm just curious. I'm really interested; I've done a lot of work with single-page apps, but not so much sites.
Sure. The great thing about Drupal is that it makes it so easy to provide your site in multiple formats. And again, if we're talking about a single-page theme, we could also enable a traditional theme that we serve to search engines. I think we can still serve the search engines well if we make a point of it, and Drupal makes it easy for us because of the ability to serve multiple themes and the ability to have multiple response formats. I don't see that as a major issue, but I realize a lot of people who have built single-page apps from the ground up kind of left SEO as an afterthought and went "oops". But in the case of Drupal we have multiple output formats from the get-go. One of our output formats is...
I haven't really looked at it at all yet, but I was wondering about using something like a view mode to limit the data instead of using field-level permissions. You're
saying maybe just not support field-level permissions, but instead use something like a view mode to have a limited scope. So just like you use it for different displays of the content as it is, instead of just saying this is all the data for the node, create a limited data set that it returns.
I don't think that helps us any in the Node and Mongo integration style, because Mongo still doesn't know anything about view modes. So again the solution would be exactly the same as before: we have separate collections with separate permissions on the Mongo side. But yeah, I think the question from a security perspective too is that once you've given them access to the object in any form, even if you're only showing them some fields in one view, then you're potentially opening up other ways for them to view the full object. The OWASP project calls that insecure direct object references or something; there's a fancy name for it. The real thing you want to do if you're concerned about the security of those fields is field-level permissions.
But you were saying to not have that at all.
So if you have field-level permissions, then you're going to have to route all your requests through Drupal. But if you have some content types that don't, or some roles that don't, then maybe you can.
So you're saying just handle it through Drupal and not have that sort of second layer with Mongo?
Right.
All right, thanks for coming. Thanks for thinking about REST with me, and go build some awesome single-page apps with Drupal 8. Thank you.