Welcome! Glad you guys could make it. Glad we could make it. Big room, so we have to be extra theatrical. We're here to talk about Node.js. Let's start with a quick introduction.

My name is Justin Randell. beejeebus is my nick on Drupal.org, Twitter, and IRC. I've been doing Drupal for nearly six years. I'm an occasional core hacker and... is that better? All righty. So, occasional core hacker, and I maintain the Chatroom and Node.js modules. I work in Sydney, Australia, for a company that I co-founded called Ampersand Technology, which is a Drupal shop like many others here.

I'm Howard Tyson. I'm tizzo on Drupal.org, IRC, Twitter, and anywhere else I can get it. I'm a senior developer at Zivtech, which is another Drupal shop like so many, out of Philadelphia. I've been working with Drupal for about five years and hack on a whole bunch of different stuff, trying to do more in contrib. And yeah, I think that's pretty much me.

So, Node.js. Hopefully everybody's here to hear about it. So what is this Node.js thing anyway? I think a lot of people are a little fuzzy on exactly what it is and exactly how it can work with Drupal. So we wanted to start off with an introduction and talk you through what Node.js is and why it's cool for working with Drupal. And just before we go on, how many people here are using Node.js or have written code for it? Okay, that's pretty good. Cool.

In a sentence, Node.js is JavaScript on the server. In addition to running on the client side, JavaScript can be run on the server side. And it's not only for web servers: Node.js can be used to build non-HTTP servers too, but our use case here is serving web requests. It's built on Google's V8 engine. V8 was designed, I think, for Chrome, was open-sourced, and has been put to use as the core of the JavaScript handling in Node.js.
And one of the cool things about that is that Google is constantly innovating and making it faster. The dev branch of Node.js right now is tracking upstream V8 as closely as it can, so it just gets faster and better all the time. And apparently a whole bunch of the optimizations that Google put into V8 actually pay off more on the server side, because the V8 engine will hone the compiled code for the objects that are in memory while they're being run. On a single page that doesn't make a big difference, but in a long-running server process it tends to get better the longer it runs, which is kind of cool.

JavaScript is inherently event-driven, and Node.js itself is an inherently evented system. You have a main event loop running, and when a new request comes in, that single-process event loop takes the request, hands the work off to a worker (we'll talk about some of the details in the upcoming parts), and keeps running. So what Node.js is doing there is the reactor pattern. It's similar to Python's Twisted, and from what I understand, it's similar to the Varnish model as well.

The whole idea is that the asynchronous I/O callbacks are performed by C and C++ libraries. The main event loop runs your code, and as soon as it hits something that's I/O, hopefully your I/O is going to be performed by a library, and that library hands off and runs in a separate thread pool so that your main application loop doesn't block. So a single process can handle tens of thousands of concurrent connections. And any time any of those get to any of the blocking parts, which for networking or disk or almost any application is where most of our apps spend most of their time, right? Waiting on I/O.
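A minimal sketch of the hand-off described above, using Node's built-in fs module. The temp file name and the order array are assumptions made for this illustration; the synchronous write is only setup. The point is that fs.readFile returns immediately and the callback fires later, once the read has finished.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Record the order in which things happen so it can be inspected afterwards.
const order = [];

function readWithoutBlocking(file, done) {
  order.push('read requested');
  // fs.readFile starts the read and returns straight away;
  // the callback runs later, when the data is ready.
  fs.readFile(file, 'utf8', (err, text) => {
    if (err) return done(err);
    order.push('read finished');
    done(null, text);
  });
  // This line runs before the file has been read.
  order.push('event loop free again');
}

// Setup only: create a small file to read back (hypothetical file name).
const demoFile = path.join(os.tmpdir(), 'nodejs-talk-demo.txt');
fs.writeFileSync(demoFile, 'hello from disk');

const result = new Promise((resolve, reject) => {
  readWithoutBlocking(demoFile, (err, text) => (err ? reject(err) : resolve(text)));
});
result.then((text) => console.log(order.join(' -> '), '|', text));
```

While the read is in flight, the same process is free to accept other connections, which is where the concurrency claim comes from.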
So the whole idea is we can have this event loop, and every time we're waiting on I/O, we just pass that off somewhere else. When that's finished, it comes back and we respond to that event, firing whatever callback we need to fire when that happens. If you've ever done any JavaScript programming, it's a lot like binding to the click event. When this thing in the DOM gets clicked, I want you to call that function. Until then this function's just hanging out, and at the time that event happens, when we know everything's ready and we have all of our context, then just run my code here. Node.js code will look a lot like that.

And the other thing to say about that is that it's possible to get that really wrong, and you can really pay. It really is a single process. It's multi-threaded, but the event loop isn't. So if you do something dumb in there and you want to handle a whole lot of connections, you're toast. The way they've written it makes it hard to do that, but it's still possible. So when you write applications that want to handle a lot of something at the same time, you have to do it in a certain way so that you don't get stuck, basically.

We sort of ended up talking through this already, but just to reiterate: you've got a single process with the main event loop, which is maintaining all of those connections. That helps you handle a lot more connections than you could with other technologies, because you're not sucking up all that memory for every single connection. And then, as we were just saying, there's a thread pool where the actual worker stuff happens in some C or C++ library. If anyone's watched some of the popular presentations by Ryan Dahl online, in some of them he just says, yeah, we don't want users to be able to do anything that blocks. We just want to put them in an evented jail. He's quite opinionated about it.
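To make the "you're toast" warning concrete, here is a small sketch under stated assumptions: blockFor is a deliberately bad busy-wait standing in for slow synchronous work in a handler, and measureTimerDelay is a helper invented for this example. A timer set for 10 ms cannot fire until the blocking code lets go of the event loop.

```javascript
// Busy-wait for `ms` milliseconds: a stand-in for "something dumb" in a handler.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Spinning here means no other callback can run.
  }
}

// Schedule a 10 ms timer, then block the loop, and report how late the timer fired.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduled = Date.now();
    setTimeout(() => resolve(Date.now() - scheduled), 10);
    blockFor(blockMs);
  });
}

const delayDemo = measureTimerDelay(200);
delayDemo.then((elapsed) => {
  console.log(`10 ms timer actually fired after about ${elapsed} ms`);
});
```

Every other connection on the process would have waited just as long as that timer did, which is why long-running work belongs in asynchronous calls rather than in the loop itself.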
So you don't really have primitives in JavaScript to start your own threads or any of that sort of thing. If you want to do that, you've got to use a C or C++ library and then wrap it. When you're in just JavaScript code, you can't operate on thread primitives, et cetera.

We said this was intermediate, right? We're doing a lot of talking about threads. I promise we're going to get to a higher level. And we should have said at the beginning: if people have questions while we're presenting, just go ahead and ask. We've got a couple of mics down here, so just jump in if you've got a question.

You might be asking: all right, threads, processes, concurrency, why do I care? I installed Drupal and Drupal gives me a website. What is Node.js going to give me other than another acronym and a plus-plus on the Web 3.0 meter? Well, the best way to answer that is probably to talk about some of Apache's shortcomings.

With Apache, you've got one thread per connection, and if you're using PHP and you're not getting into the crazy performance stuff, you probably actually have an entire process per connection. And that process is a pretty fat process. It has all of the PHP libraries. It has a whole bunch of Apache modules. To be smart enough to serve a Drupal page, you need a lot of stuff loaded up, and you have that loaded up for every single process. One process can only serve one connection, and you have some finite number of processes. Maybe you have five. If you've got five concurrent connections, the next ones just have to queue up behind them, and you're not going to get to that next request until you've processed all the ones before it. If you're not careful with the number of children you allow and some of your other Apache settings, you can end up spawning too many of those, and then your server just OOMs and dies. That's bad. So when you've got all these memory-heavy processes, you really can't do things like high concurrency.
You can't have way too many connections hanging out, and you can't have them open too long, because that's going to cause you to have too many, right? All this means persistent connections in Apache are a massive fail. Push technologies are a big part of what we're going to be talking about and a big part of what we like about Node.js. If you've looked at doing push technologies with Apache, it's always a big hack. Doing any kind of Comet or that kind of thing with Apache or with PHP is usually a big bunch of workarounds for a system that really doesn't accommodate it.

And when Howard says big hack, he's referring to my Chatroom module. My Chatroom module is based on the LAMP stack, and a big chunk of the code is just a hack to try and make it perform okay with 10 to 50 concurrent users working on it. It's polling every second. It's doing a lot of stuff in the back end. It's nasty. And that's actually one of the motivations for me personally to get into Node.js, because I just couldn't face porting the module from 6 to 7 and still using the LAMP stack as the back end. That's nasty.

So Node.js is good for lots of things. One of the big ones, and what we're most interested in, is concurrency. Because it has this model of doing asynchronous I/O, what all the words we've been spouting come down to is low overhead for lots of simultaneous connections. The thing is, most of those connections aren't actually doing anything while they're connected. So what do I mean by that? Let's say that Justin and I are both on some kind of a chat, for example. I'm going to connect to the server saying, hey, I'm listening for chat messages, and the server is going to say, all right, you're connected, and then it's just going to wait. So Node.js doesn't actually have to be doing anything with my connection.
It just keeps track of my session ID and waits until it actually has something to tell me about. Justin does the same, and now we have two open connections, but the main event loop is just continuing to run, waiting for a message to come in that needs to go out to one of those clients, or waiting for a message to come back from one of the clients that it needs to send somewhere else. That's awesome, and it's especially awesome for doing push stuff.

Node.js is also really good for any kind of networking application. It's really excellent glue. There are really awesome libraries, and it's really easy to do things like routing. I have a connection coming in from over here, and I know that I need to connect to a worker in the background, so I can do some routing based on the incoming request and then just pipe the connection from the background into the connection in the foreground, if you know what I mean. There are a bunch of back-end workers. The Node.js connection can go out to the back end and then pipe the back-end response out to the front-end request.

Right, and if you look at GitHub's blog, they're now using Node to do things like push out tarballs, and it's actually something that I'm personally experimenting with for Drupal.org. Obviously, with Git and serving a Drupal tarball by calling git archive, how many different hashes can you pass to that? So the challenges you face if you want to try and generate them all up front are quite high. One of the things I'm personally experimenting with in Node is splitting that into two different Node processes that communicate over HTTP, and with Node it's just ridiculously easy. There's a primitive on streams in Node that allows you to just stick together a readable and a writable stream with a pipe, and it's just ridiculously cool. So it's really good glue. If you've got a bunch of services, particularly HTTP-aware ones, that you want to glue together, it's just really cool.
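The pipe primitive mentioned above can be sketched with Node's built-in stream module. Both ends here are in-memory stand-ins invented for the example: backendStream plays the role of a back-end response and collectingSink plays the role of the client connection. In a real proxy the same pipe call would join an HTTP response from a back end to the response going to the browser.

```javascript
const { Readable, Writable } = require('stream');

// A readable source standing in for a back-end response (assumed data).
function backendStream() {
  return Readable.from(['chunk-1 ', 'chunk-2 ', 'chunk-3']);
}

// A writable sink standing in for the client connection.
// It collects everything it is sent and reports it when the stream finishes.
function collectingSink(onFinish) {
  const chunks = [];
  const sink = new Writable({
    write(chunk, encoding, callback) {
      chunks.push(chunk.toString());
      callback();
    },
  });
  sink.on('finish', () => onFinish(chunks.join('')));
  return sink;
}

// One call glues the readable end to the writable end.
const piped = new Promise((resolve) => {
  backendStream().pipe(collectingSink(resolve));
});
piped.then((body) => console.log(body));
```

pipe also handles backpressure, so a slow client does not force the whole back-end response to be buffered in memory.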
And one of the other things that really excited me about Node is the hello world. The hello world, in four lines I think, walks you through creating a server that just spits a string out when a request comes in. That's all it can do. I've shown that to a couple of people and they've given me blank stares: so you can spit out hello world when a request comes in, that's not actually interesting for any particular application. What is interesting about it is this. If you wanted to do a hello world by serving a page with Apache, you'd install Apache, you'd go into /var/www, and you'd drop in an index.html page, so that when a request comes in Apache can serve back that index.html. But it can also do 100,000 other things, because the Apache code itself is this generalized web server that can do a trillion things. In Node.js you write your own web server from scratch. If all you need to do is speak HTTP, that's all you need to do. Your server doesn't even know how to talk to the file system. You have complete control over what's actually going on on that server, which means you can keep it really, really lightweight, and you have a lot of control over exactly what it's doing.

So, lots and lots of low-level stuff about what Node.js is and why we're excited about it as a technology, but again, we're all Drupal developers. So are we going to follow certain other web shops and drop Drupal entirely and go into Node.js? Not me personally. Not me either.

So Justin and I have been working on... I should give Justin credit, Justin wrote the actual working module, the Node.js module. We're going to talk you through how you can use it to augment your Drupal site with this concurrency, and how you can use that to get working WebSockets without killing your servers.
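For reference, the kind of hello world server described above looks roughly like this with Node's built-in http module. The reply text and the choice of port 0 (which asks the operating system for any free port) are assumptions made for this sketch, not details from the talk.

```javascript
const http = require('http');

// The entire "web server": every request gets the same plain-text reply.
// It never touches the file system and knows nothing except HTTP.
const helloServer = http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello World\n');
});

// Port 0 lets the OS choose a free port; a real deployment would pick a fixed one.
helloServer.listen(0, '127.0.0.1', () => {
  console.log(`listening on port ${helloServer.address().port}`);
});
```

Everything the server can do is visible in those few lines, which is the contrast with a general-purpose server like Apache that the speakers are drawing.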
So the Node.js module is at drupal.org/project/nodejs, and what it does is provide a server script and a Drupal module that integrates with it, so that you can have this separate Node.js server that your clients can connect to in addition to the main Drupal server.

And just as a quick overview here: this was an attempt to have my cake and eat it too. There are services where you can drop a JavaScript widget on your site or elsewhere, and the actual writing and persisting of the messages in chats or anything like that happens on their servers, and the integration with Drupal varies wildly. In most cases you can't have something that will really scale and that is also first-class Drupal. Here, all the data is written straight into the Drupal database. You can do Views integration on messages. The code that runs when PHP runs is just standard module code that any other developer out there can hook into, et cetera. So just to put a high-level picture on it, a big motivating factor here was to provide plumbing, so Drupal developers can just write Drupal code and get all this power without having to understand any of the other stuff we mentioned earlier on in the talk.

So how does it work? Here we've got the client, the end user on their shiny new MacBook Pro, our Drupal site, and a Node.js server. The Node.js server can be running on the same server as Drupal, but it doesn't have to be. There's a service, still in beta, with hosted Node.js, and it basically hosts the server for the Drupal Node.js module, right?

So you start out by doing a normal Drupal page load. I just make a request to my Drupal site and Drupal comes back with HTML. In that HTML is the JavaScript to connect to Node.js and the auth token for this specific connection, or for this user's session, I guess, more accurately. Yeah, your token has a life cycle a lot like a normal login/logout session. So then my client connects to Node.js and sends the auth token.
Now, Node.js obviously doesn't know yet whether my client should be allowed to connect, so Node.js makes a request to Drupal. This is going to bootstrap Drupal. Drupal's going to check whether this auth token is appropriate for this user and if so send back... oops, let me go backwards. Fast forward. Node.js talks to Drupal, and Drupal sends back a list of channels this user should be subscribed to. One of the main aspects of the plumbing is that there are different channels. You subscribe users to different channels, and then every time a message comes out you can broadcast it to that set of channels.

A couple of other things to say: nothing from the client is trusted by default. Everything comes from Drupal, and there's a shared secret. Node.js will never do anything like sending a message from a client straight to another client unless you go in and change the default settings in the server and in Drupal, because obviously there's all sorts of pain waiting to happen if you trust stuff coming from the client. So Drupal is really the arbiter of all this, partly to make it more secure, but also because when this request comes back there are hooks that your module can implement, so it's very simple to get involved in the action. If you've got some channels you want to allow people to subscribe to, it's just plain old PHP code: implement a hook and you're in.

So, on successful authentication and receipt of the list of channels, Node.js notifies the client that the auth was successful. There's actually a JavaScript hook that runs, so your extensions can do something with the fact that the socket connection has been created and you're now connected.
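The handshake and channel model just described can be sketched in a few lines. This is an in-memory illustration, not the module's code: askDrupal, ChannelServer, the token strings, and the channel names are all hypothetical, and the lookup table stands in for the HTTP round trip to Drupal (protected by the shared secret in the real module).

```javascript
// Hypothetical stand-in for the request Node.js makes back to Drupal.
// In the module, Drupal validates the token and returns the user's channels.
function askDrupal(authToken) {
  const sessions = {
    'token-howard': { uid: 1, channels: ['content_updates', 'chat_42'] },
    'token-justin': { uid: 2, channels: ['chat_42'] },
  };
  return sessions[authToken] || null;
}

class ChannelServer {
  constructor(verify) {
    this.verify = verify;            // function(token) -> { uid, channels } or null
    this.authenticated = new Map();  // token -> verified session, remembered for reconnects
    this.channels = new Map();       // channel name -> Set of connected sockets
  }

  // A client connects and presents the token it received in the Drupal page.
  connect(socket, authToken) {
    let session = this.authenticated.get(authToken);
    if (!session) {
      session = this.verify(authToken);  // one round trip to Drupal per new token
      if (!session) return false;        // unknown token: refuse the connection
      this.authenticated.set(authToken, session);
    }
    for (const name of session.channels) {
      if (!this.channels.has(name)) this.channels.set(name, new Set());
      this.channels.get(name).add(socket);
    }
    return true;
  }

  // Only Drupal publishes; clients are not trusted to send to each other.
  publish(message) {
    const sockets = this.channels.get(message.channel) || new Set();
    for (const socket of sockets) socket.received.push(message.data);
    return sockets.size;
  }
}

// Usage: two known users and one client with a made-up token.
const channelServer = new ChannelServer(askDrupal);
const howard = { received: [] };
const justin = { received: [] };
const stranger = { received: [] };
channelServer.connect(howard, 'token-howard');
channelServer.connect(justin, 'token-justin');
channelServer.connect(stranger, 'made-up-token');
channelServer.publish({ channel: 'content_updates', data: { nid: 7 } });
channelServer.publish({ channel: 'chat_42', data: 'hi' });
```

Because verified tokens are remembered, a reconnect with the same token does not need another trip to Drupal, which matches the low overhead the speakers go on to describe.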
Then there are subsequent page loads. Drupal does its normal page load, the client gets back its auth token, and Node.js remembers the authenticated token, so when we try to reconnect Node.js just says, okay, you're all good, and reconnects you. There's also a standard hook in the Node.js module that hooks into a user logging in and logging out, and it'll send a message over and kill the active session. Obviously you need that, because otherwise Node.js could be running for weeks or months and you can't just have stuff hanging around. So again, Drupal is the arbiter. Drupal can just reach in, take an authenticated token, and say: no good anymore, you're gone.

We're not going to go into detail in this presentation about all of the things that you can do with the Node.js server, but the Node.js server has this back-end capability to receive messages from Drupal. So you can do things like kick a user without them going through the logoff process. Your module can kick a user, or remove them from a channel, or add them to an additional channel, or get metrics: find out what channels a user is connected to, find out how many users are connected and which users they are, and all this kind of stuff via an HTTP API. Drupal just makes POST requests and Node.js responds.

So at this point your client is connected to the Node.js server. As Justin was saying, by default we don't trust the client, although there is plumbing in there so you can change that if you want to allow clients to send messages to other clients. By default the Drupal server has to send messages to the clients by sending them to some particular channel.

So someone comes along. I come in, I load my page, I connect to my Node.js server, and now my page is loaded and I have a persistent connection. If Justin comes in and creates a new piece of content, and a Drupal module says, okay, I'm hooked in to hook_node_update and I see that an update has happened to a node, I can fire off a
message and broadcast it to Node.js on the channel content updates or something, and then that will automatically go down to any number of clients that are connected. So if there are a hundred clients connected and waiting for one of those updates, Drupal sends a single message. If Justin is logging in and saving a node, we already have a single bootstrap there. Justin is clicking submit and Drupal is processing that request, building him his confirmation page. During that single request, if your module hooks in and sends a message, Drupal will send a single message to Node.js. Node.js will then look at what channel that message was broadcast to and send that message to every user that is subscribed to that channel. So if there are a hundred users, you haven't had a hundred users hit Drupal to load that. You've had one Drupal bootstrap: the single page request that inserted that content.

And it doesn't have to be a page request. Some of the first demos I did were based around Drush, so you can just log on to your server and broadcast a message to everyone who's logged on, and something will just pop up in their browser, which could be kind of fun. Well, I found it fun anyway.

Some people get a little bit confused about the breakdown of exactly how this authentication process works, so I just wanted to put it one other way. The client request hits the Drupal page, be it on Apache, Nginx, whatever. Drupal sends down the HTML with the JavaScript and the auth token. The client sends the auth token that it got from Drupal to Node.js directly. Node.js verifies that with Drupal: hey, I got this auth token, is this guy cool? Drupal sends back the list of channels and Node.js connects. But I wrote the code, so it's more just like, you know, Drupal going "affirmative". I'm not really cool like that. Justin-bot is pleased.

And so subsequent requests don't even need to hit Drupal, so if your client disconnects and then needs to reconnect without reloading
the page, it can just hit Node.js again and Node.js doesn't have to hit Drupal. And if you click through to another page, loading another Drupal page, you don't have to go through another full auth process that's going to add another Drupal bootstrap. Sometimes people say: if we're still hitting Drupal to do the authentication, doesn't that mean we're not really scaling, and this isn't helping us that much with our concurrency, because we still have to hit Drupal? We only have to hit Drupal one additional time. When I load a page, there's essentially one additional bootstrap when Node.js hits Drupal just to verify that that session ID is okay. So the performance overhead of connecting to Node.js is really low. It's essentially one new bootstrap per login.

So how do I integrate my module with Node.js? It's pretty easy, actually. All you have to do is call nodejs_enqueue_message() with the message. Justin likes objects, so the message is supposed to be an object, but I've made it an array here and cast it, just to cut down the number of lines I need. That's actually my code, I think.

You'll see here that the data structure is pretty simple. Broadcast is whether it should go out to absolutely everybody regardless of channel. Channel is which specific channel it should go out to. Even if you're broadcasting you can still list a channel, which is useful so you can have a simple switch in your JavaScript for what to respond to. You'll see how that works in a second. And then data: you can put structured data in there, and whatever data you put in there gets passed through as the contents. Right, and it ends up down the other end as JSON, so in anything that works with that, you get whatever you actually sent down.

The other thing to note is that it says enqueue message, because what we do is register a shutdown function, so you can call that as many times as you like in a request. Then the request is shutting down, usually in Apache, and
we'll send them out. There's also an API to just send a message. Sometimes you don't want to wait and you can just send it, but probably most people should use the enqueue one. Right, so that aggregates all of the messages that need to be sent throughout the entire Drupal page life cycle and then at the end sends all the messages.

If you wanted to do Node.js integration, then in addition to your .info file that says we require Node.js, and wherever your broadcasting of your message needs to go, the only other thing you need to do is write this JavaScript function. What you do is add another callback to Drupal.Nodejs.callbacks. Here we're doing the example module. Very clever, right? Then you give your callback function, and you should probably always switch on the message, because depending on who sent the message you probably can't count on a whole lot about the rest of the contents. So if we're using example, and that's the namespace of our module, then if message.channel is example we should be responding to it. And here our super simple example is just to write out message.data. Oh, I'm missing an equals sign. You're right.

The other thing to say about this pattern is that it's lifted straight from Drupal. You add elements to an object and we'll just iterate over that object and call you. But there's another pattern where you can set a specific callback in the message, and if that callback is defined in JavaScript, that's the only callback that runs. So you can do it either way, depending on what you need.

We've talked a bunch about adding users to channels, so this is how you add a user to a channel. In hook_nodejs_user_channels() you just return a linear array of the channels you want to add someone to, and then they automatically get subscribed to those. So every time I load a Drupal page, a message goes out from Drupal to Node.js informing Node.js of the channels that you would
currently be subscribed to. Right, and that's just called by the Node.js core module whenever you authenticate. Obviously you can also push people into channels and take people out of channels aside from when they authenticate, but that's again probably what most modules will do most of the time. Yeah, I should have been clear on that: it happens when you authenticate, not on every single page load.

So how do you broadcast a message to just a specific channel? You switch broadcast to false, so that it doesn't go to absolutely everybody and just goes to specific channels, and then you specify what channel, and it's going to be routed to the appropriate users. And the Node.js module itself provides a channel per user, named user_ followed by the user ID. Right.

So the Node.js module is basically a suite of modules. Some of them are in a better state of repair than others, and some of them were initially written so I had something that consumed the API, so I could get the hang of it and make sure it wasn't completely stupid. But the core module is really meant to be plumbing. It's really just something for you to use. It doesn't do very much out of the box. The suite of other modules that come with it do more things, like allow you to subscribe to a node and various other simple things. There's a module that Mark Sonnabaum wrote which will update your watchdog page for you in real time, and other things like that which just provide added sugar. Most of what they do is just consume the API. They don't do any of the plumbing.

One last thing to point out here: as I mentioned, you can send structured data through, so that'll show up in your callback as a JavaScript object with the keys something and something else, and then foo and bar.

We're going to have a demo, but before that I just wanted to talk to you guys about where the module stands right now and what's coming soon. Version 1.0 will be out when it's ready, but it is right around the corner. We're on beta 3 right now. Right, and we've actually had a Google
Summer of Code student working on integrating Chatroom for Drupal 7, and another chat module that he maintained before, with Node.js. Obviously it was pens down a couple of days ago, so we'll be wrapping that up with him. He's got a separate branch, and it's not really until we can pull that in and integrate it into both Chatroom and Node.js that we can start to really get down to a 1.0.

Yeah, and the other thing, actually, if people want to comment in discussion, if anyone actually uses Chatroom in Drupal 6: one of the things that's going through my mind is that I'd kind of like to drop all LAMP support from Chatroom. So it wouldn't provide a back end that handles all the persistent connections in Drupal, and you'd have to plug it into something else. Now, I'm not set on that yet, and I'd be interested in what people think. The way that you write code when you have to do polling and the way that you write code when you can rely on stuff coming to you when you need it are quite different, and there's a huge amount of code in Chatroom which is just a straight performance hack. I'd really love to just throw it away. But on the other hand, if there's a whole load of users who go, no, no, no, we actually still want to use it even if it can only handle 10 people at once, I might change my mind. So I'm interested in what people think.

So, new features. This is one I've been pushing and working on a little bit: content channels. It started as per-page channels. When you authenticate, you get added to the set of channels that you should be subscribed to. But then what if I want to send messages out to update a specific page? What if our dashboard just needs live updates, and if you're not on the dashboard I don't want to send you live updates? Or what if you're looking at a node? I don't want to just add you to every single node you ever look at. So how do I make sure that whatever node you're looking at is the one that we update, and only that one? Content channels are the answer. So right now
you stay connected to that channel until you have been kicked or you log out. Content channels are going to be per-socket-connection channels. So when you view the content I'm interested in, I can add you to a per-socket-connection channel, and then when that socket connection closes, by your browser closing or quitting or by you clicking a link and going to another page, that subscription doesn't persist.

This is going to enable Views integration, although if anybody's actually thinking through how that would work, it's going to be a little bit tricky. It's probably going to be on the order of something like Views Content Cache or Views Cache Actions, where you have to manually configure the rules that are going to cause that view to be entirely rebuilt and sent back down.

And one thing that we've just started sorting out is how to do generic entity updating. If I'm just viewing a node, do I update that inline without refreshing the browser, so that you can get live updates across everything? Not quite Google Docs, where I'm editing and you're editing too, but at least if I'm viewing and somebody else is editing, it doesn't blow up. I got bitten by that in a meeting recently. We were sitting in there and I said, but I just copied that from the node, and somebody else said, well, I've just edited it. I was like, oh, I didn't refresh. There's no reason we can't fix that with the Node.js module's plumbing.

And I guess part of the point here, an underlying assumption, is fat clients, and people who might want to move towards shoving down a page and basically expecting someone to be on that page for tens of minutes, half an hour, an hour or longer. What you register then isn't a page, it's a piece of content, and as far as Node is concerned it's really just a hash that identifies it. That's what's used as the channel, and those channels are just associated with a socket and cleaned up if you do refresh the page. There's not going
to be as much use to people who are expecting their users to just do a normal flow through a site, cycling through pages and opening and destroying sockets quite regularly. But if you do want to build a fat client and hone down to arbitrary pieces of content on the page, and whether or not they're visible, and send down updates, that's what this is about.

And: your module here. The whole idea, when we talk about how the Node.js module is just a bunch of plumbing, is that we're really hoping to see other people start chipping in on it, using it, and building stuff out.

So next, a quick demo. How are we doing on time? Are we out of time? 15 minutes, oh, perfect. Can we see this? All right. So we put together a quick demo showing the chat module, kind of hacked together with some stuff that's not totally ready, although all this code is out there in the DrupalCon London branches of Chatroom and Node.js on Drupal.org. Here Justin's logged in in one browser and I'm logged in in another browser. If I click chat, a chat pops up with Justin.

So the point of that is just that there's no polling, there's no refreshing. Chatroom just creates a channel per chat and subscribes you. We just shamelessly stole ideas from other modules on Drupal.org and from Facebook to put this together, but this whole thing took us about, I don't know, maybe 15 hours or something. It wasn't very hard, and I'm hoping that other people can do whatever they like in terms of better front ends than this. Well, and to be fair, a lot of that 15 hours was fighting with old stuff in the Chatroom module and working on some of the theming stuff. The actual Node.js part wasn't that involved. And a big thanks to Dave, my business partner, sitting down there, who really helped us with making it not look like us. But really, again, in writing this it's just so clear that the way you would do this sort of thing when you can rely on push is just so different than when you're doing it
with polling. So again, you know, yell if you still want polling, because I'm quite tempted to just rip it out of Chatroom and assume that you have a back end that doesn't suck. One other thing we didn't even mention that we're using is that the Node.js module has a presence API. Right now it's kind of directly connected to one particular implementation, but it's going to be a generic thing, so that you can have a buddy list identified, and then when those buddies become present, when they log in and log out, they will appear and disappear from your list over here. So one of the challenges with buddy lists in Drupal is which one do you want to use, how many of them work, etc. And I really don't want the Node.js implementations here to care about how you decide who your friends are. So this is just a hack around Flag and Flag Friend, and that's how we decide who your buddies are and who will appear in that list. We're hoping to make it more of something where real buddy list modules or Organic Groups or whatever can just implement a hook and tell the Node.js buddy list module who their friends are. In terms of the server-side stuff with Node.js, it's very low level. It simply gets messages, again from Drupal, saying, okay, who am I allowed to see presence notifications for? Again, all of that is controlled strictly by Drupal and module code. And whenever someone who is in the list of people you're allowed to see comes online or offline, a message is sent. That's it. It doesn't have anything except what the presence event was. So it's really hopefully just going to enable other modules to actually do something with that. And this is just something we whipped up quickly, but again, I'm hoping people will integrate things like Organic Groups or whatever. Whatever people can do in Drupal, you can link in with this and push notifications out. Do we want to take questions?
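The presence flow described above can be sketched roughly as follows, assuming Drupal has already told the server whose presence each user is allowed to see. The class name `PresenceTracker`, its methods, and the `send` callback are hypothetical names for illustration, not the Node.js module's actual API:

```javascript
// Hypothetical sketch of server-side presence bookkeeping.
// Not the Node.js module's real code; all names here are assumptions.
class PresenceTracker {
  constructor(send) {
    this.send = send;          // send(uid, message): assumed delivery callback
    this.online = new Set();   // uids that currently have a connection
    this.watchers = new Map(); // uid -> Set of uids allowed to see that uid
  }

  // Drupal decides who may see whom; the server only stores the answer.
  setVisibleUsers(uid, visibleUids) {
    for (const other of visibleUids) {
      if (!this.watchers.has(other)) this.watchers.set(other, new Set());
      this.watchers.get(other).add(uid);
    }
  }

  userOnline(uid) {
    if (this.online.has(uid)) return; // already online, nothing to announce
    this.online.add(uid);
    this.notify(uid, 'online');
  }

  userOffline(uid) {
    if (!this.online.delete(uid)) return; // was not online
    this.notify(uid, 'offline');
  }

  // The message carries only the presence event and the uid it concerns.
  notify(uid, event) {
    for (const watcher of this.watchers.get(uid) || []) {
      if (this.online.has(watcher)) this.send(watcher, { presence: event, uid });
    }
  }
}

// Usage: user 1 may see users 2 and 3 (as decided by Drupal).
const delivered = [];
const tracker = new PresenceTracker((uid, message) => delivered.push({ to: uid, message }));
tracker.setVisibleUsers(1, [2, 3]);
tracker.userOnline(1);
tracker.userOnline(2);  // user 1 is told that user 2 came online
tracker.userOffline(2); // user 1 is told that user 2 went offline
console.log(delivered);
```

The server holds no opinion about friendship here: the allowed list comes in from outside, which matches the idea that buddy list modules decide who your friends are.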
So we've got enough time for some questions. We've got some mics here. I can see a guy down the front who's got questions. Hello, is that working? Is it possible to connect to one Drupal site and have it pushed to another site, so maybe a central monitoring site that aggregates messages? Right, you could do that, but one thing that we actually haven't talked about a lot, just to try and keep this down to time, is the concept of an open persistent connection, and all the work for that is done for us by Socket.io, because you can just drop some JavaScript in the client and it'll connect. You can keep that open. If you were to have two Drupal sites and some event happened and you want to send a message, you probably don't need to have a persistent connection between the two. You can probably just fire something off. And also, actually, I don't know if there are any PHP implementations of Socket.io's protocol, and they would probably suck if they're PHP. So yeah, I'm not sure you would do that.
You could, as long as the thing that's connecting to the Node.js server can authenticate and can implement the Socket.io protocol so it can just keep it open. Then you could do whatever you like. In this case it just happens to be web browsers, because we get that for free. Yeah, Socket.io will use WebSockets, a Flash socket, or it'll even do XHR multipart or long polling, I think. So basically only certain browsers support WebSockets. Socket.io provides all the plumbing and all the necessary stuff to just make it magically work across an amazing number of browsers. I think it even works with IE6. It just practically works everywhere, mobile as well. And so I think if you're saying get messages from one Drupal site to another Drupal site, I'm not sure. The question is sort of, what are you optimizing for? I don't think having a Node.js server on the other side is really going to help, because you're still dealing with a bootstrap on either side. So yeah, you made something lightweight in the middle, but you're still dealing with something heavy on either end. What you probably want there is a message queue that collects all those messages so they can be processed. The use case I'm thinking of is, you've got 100 sites running, but it's difficult to look across them to see what's going on in the watchdog. And if you're in the watchdog hook, you could fire a message and have a central site, and you could filter down by severity or message type or site, but see them coming through live. That's the kind of idea I had in mind. Right, and that makes sense. And again, there's already a fairly simple implementation based around Drupal's core database logging messages. And I think what you'd really want there is, you would sort out your back ends to talk to your collection server, and then again, it really comes down to the real-time part being your browser with a socket connection open to that one collection site. And that would be a really good use case, and in fact some of the stuff
that Howard's working on with Node.js is just around dashboards that need to get updated from all sorts of different data sources.
Would it be possible to use Node.js as a reverse proxy instead of Varnish, just stacking up content in memory, basically? It certainly would be possible, but probably not worth your while in terms of trade-off, unless you wanted to do something that you couldn't do in VCL. So basically what Node.js has over Varnish is that you can do crazy stuff in terms of how you want to react to requests, etc. So it's way more powerful than Varnish, but if you were to put them side by side on pure performance, I would expect Varnish as a reverse proxy to outperform Node.js. Do you have any idea how much of an outperformance we're talking about here? Nope. The Nodejitsu guys, high-profile guys in the Node.js community, are working on just that. They have built a Node.js proxy, but it doesn't do reverse proxy caching. It doesn't do the caching part, but otherwise it can behave as a central point to load balance and do all that stuff. They were kicking around the idea of using memcache as a data store so that you could do that sort of thing. They were saying that in their initial experiments with it, the heap got too big and pain happened. The default setting for V8 is for a process to only be able to consume a gigabyte, I think, so that's going to put a limit on it, in addition to whatever your application logic is and stuff. So they were talking about doing that, but then you have another HTTP request that's hitting memcache before you can serve the content back out. So you could do it, but I'm not sure it would be a good idea. The benefit there is that you could get around some edge cases that can cause problems with cross-domain requests if you have your Node.js server running on a different port. In our experience it's usually not a problem, because Flash sockets totally make that go away, and the WebSocket
implementations on mobile seem to not care that much about hitting a different port or a different domain. But that was the use case they were looking at: we can proxy it so that you can have Drupal sitting behind it and also Node.js sitting behind it on the same server and the same port.
I just wanted to ask about the security side of Node. How do you secure it? How do you test against cross-site scripting and the general hacking scenario? Is there any documentation on how to secure it, or how secure it is, or the white hat hacking side of it? Right. To be honest, documentation on securing it, not really. Documentation on any of this, not so much. Well, there's documentation in the settings file telling you what each thing is. We actually had talked about shutting down, I don't know if you can pull up the diagram that's got the triangle, not letting you run unless you are securing the connection between Node and Drupal with an auth key, like a shared secret. I guess most people would probably do that at the network layer, though, because that's just kind of a back channel between two servers, but if you can't do that, you can use the shared secret. In terms of XSS and all that sort of stuff, I don't know, XSS is not so much an issue. One thing it does do is derive the auth token from the session cookie, and at least in Drupal you can't get at the session cookie from JavaScript. It's HttpOnly, whereas the auth token that we send down to send back up is obviously explicitly available to JavaScript. But it's pretty short-lived, and unless you can reverse engineer it and get back the session cookie, you can't use it to access Drupal with the session cookie. I'm not sure what else to say. The basic idea in terms of security is that anything that comes up from here to Node, unless we've authed you, just gets dropped on the floor. So that's the basic architecture in terms of securing it. Yeah, I mean, all you could
do, since the default behavior is for Node.js to only accept messages coming from Drupal, all you could do is hear messages on another channel. And again, the token generated to send down is different from the session cookie, and I think it's hashed with the time or something. No? Oh, I guess that wouldn't work. Well, it would only be good for your authentication for your Drupal session, though, because it's derived from the session cookie. So I guess the risk would be someone else connects on your behalf to Node.js. They'd have to sniff it out of your browser and use it before you logged out. The risk is probably pretty low, and I'm not sure we can make it any better than that. If they could do that, they could probably use your Drupal session and log in as you, and there they could do more than just receive messages, presumably. So it doesn't seem like a big concern to me, but maybe we should document that better and think about if we can make it stronger. Anyway, did that kind of answer your question? Yeah.
One question on scalability: would it be possible to run multiple instances of Node.js concurrently on one site, or is there any ceiling in performance you're bound to hit? We've talked about it. It's a very good question. Lots of trade-offs here, and right now, out of the box, if you can't fit it in one process then you're kind of out of luck with the current state of the module. Basically it's just what you can fit in memory in that Node process, and that's it. So to be honest, I haven't tried to find out what the maximum number is. I know that it's in the thousands, but beyond that I don't know where the top is. Based on the benchmarks I've seen on other Node.js apps, my guess is it's in the tens of thousands for most servers. I mean, it's going to depend on how much RAM you have available, but my guess is that before you hit the one gig limit you'd be in the tens of thousands. But nobody's tried it. I know I can say it's in the thousands because I tested that,
but that's it. Beyond that, I don't know. The basic trade-off there is state. State is really easy when you've got a single process: the bookkeeping is all just in memory in that one process, end of story. If you want to share things like the authentication, the fact that I authed you, and you want to make it so that you're not tied to any one particular Node.js process, that state's got to live somewhere. So that adds complexity, because nothing's as quick as just looking up something in memory. It's kind of for later, because frankly I actually don't know how many people are using this, and no one has come and said, "We hit ten thousand and it fell over." But if that starts happening a lot, then we can look at trying to route, and look at trying to share the state so that different processes can get access to it. Are we finished? Thanks, everyone.
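The state trade-off from the scalability answer can be illustrated with a small sketch. It assumes the only shared state is "which auth tokens have been accepted." The names `AuthStore` and `SocketServer` are hypothetical, and the "shared" store here is just one object handed to two servers; in a real multi-process setup it would have to live outside the processes, and each lookup would cost more than reading memory:

```javascript
// Hypothetical sketch of where the "I authed you" bookkeeping lives.
// Names are illustrative assumptions, not the Node.js module's API.
class AuthStore {
  constructor() {
    this.tokens = new Map(); // auth token -> Drupal uid
  }
  remember(token, uid) {
    this.tokens.set(token, uid);
  }
  lookup(token) {
    return this.tokens.has(token) ? this.tokens.get(token) : null;
  }
}

class SocketServer {
  constructor(store) {
    this.store = store;
  }
  authenticate(token, uid) {
    this.store.remember(token, uid);
  }
  accepts(token) {
    return this.store.lookup(token) !== null;
  }
}

// Each process keeps its own bookkeeping: simple and fast,
// but a client authed by one process is unknown to the other.
const first = new SocketServer(new AuthStore());
const second = new SocketServer(new AuthStore());
first.authenticate('example-token', 42);
console.log(first.accepts('example-token'));  // true
console.log(second.accepts('example-token')); // false

// Both processes consult one store: clients are no longer tied
// to a particular process, at the cost of the state living elsewhere.
const shared = new AuthStore();
const third = new SocketServer(shared);
const fourth = new SocketServer(shared);
third.authenticate('example-token', 42);
console.log(fourth.accepts('example-token')); // true
```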