 Some of us know each other already. I work for ActiveSphere, and I'm going to talk about Node.js programming patterns. I know you probably had an overload of Node.js earlier, so I'll try not to keep it too basic. I hope you know the basics; this is a little bit ahead of that: some of the issues we had when we started playing with Node.js.
Who am I? I mostly do Ruby, and sometimes I dabble with Node.js. In one of those long hack nights, we built this thing called ActiveNode, which I'll talk about at some point. I also do a lot of small Node.js projects. Have you used Redback? Anybody? OK, so I'm the only guy. Redback is a Redis data structure provider sort of thing. Everybody knows Redis? OK, how many? Oh, cool. That's awesome. I'm going to assume certain things, so bear with me, and if you have questions, just ask me.
So this is about patterns we learned on a bunch of small and large projects. We'll come to that. About questions: it's really hard to answer things that don't have context, so ask me right away, stop me. If it's not relevant or I don't know the answer, I'll probably say, sorry, I don't know, and time out the conversation, but ask the questions right away.
Mostly I'll be skipping the Node.js basics. But there's a question that's been asked pretty much everywhere: why Node? Have you been convinced by the Node argument? Anybody who's not convinced, or who's still on the borderline? OK. I do web apps; I hardly ever do anything else. All web apps are IO bound. Deal with that. It's like a cardinal rule. IO is everything that you do. Talking to the database, file access: every time you talk to MySQL, it's IO. Every time you're reading a config file, it's file IO.
DNS, resolving a host name to an IP, is IO. Even require 'express' is blocking. Anyway, there's a require async, which does that. But as a general rule of thumb, and this is not some random statistic I made up, this is reality. This is from observing a lot of things, and there are lots of papers that talk about how much CPU actually gets utilized. If you look at the load on your servers, and I run a bunch of big apps, I haven't seen more than 10% CPU. So that means 90% of your time is spent waiting for IO. Returning data, passing the data, stuff like that is where your time is being spent. That sucks. That sucks big time, because you're using just 10% of your beefy server, which you're paying $1 an hour sort of thing for on Amazon. Anyway, it sucks to use only 10% of your brain power as well.
So the only way forward is, can you see it? OK. Yeah, like somebody else said, I think the fonts on my machine look much better than on this. So if you want to scale your servers, you have to go non-blocking. Google, in its data centers, targets 80% CPU utilization. From 10 to 80 is a big jump, and you can't get there by using blocking calls in Ruby or Rails or whatever. So there is a big, big move: anybody who's doing high performance apps is moving towards non-blocking, whether it's Ruby's EventMachine or Node.js.
So this is an interesting tweet that somebody else mentioned in one of the other sessions. If you read this, it's about LinkedIn's mobile site. It was running Rails. They had 16 instances, they replaced it with Node.js, and they could handle 2x the load with just four instances: a quarter of the instances and twice the throughput. Those are the kinds of things that we're seeing.
And there's always a certain amount of hype associated with these things, but we're seeing that on our servers as well. We had Nginx fronting some of our static content, and Nginx is actually really good. It's also a reactor pattern, so it's quite fast. But we could do a lot more, something like 1.5 times more, with Node.js than with plain Nginx. So it's really interesting.
I'm a Ruby dev. How many of you do Ruby or play with Ruby? OK, not many. Python? Wow, OK. I have no idea about Python. I only know enough from CoffeeScript to translate to Python. But I guess EventMachine has some history that comes from Twisted. So Ruby has EventMachine, Python has Twisted, and obviously JavaScript has Node. I'm just going to talk about Ruby, because I don't know enough about Twisted to talk about it. In Ruby, not all gems are non-blocking. If you want to make an HTTP request, you can't just take RestClient and call RestClient.get on something inside an event loop, because if a reactor has blocking bits, it's not scaling anymore. It's still blocking. So the only alternative is to use EM-aware stuff like em-http-request, or to do an EM.defer, and both of them are kind of hacky. You have to replace a lot of code. It's not really built to do that stuff. Or that's my opinion; I think a lot of people will disagree with that, but that's fine. So to me, Node seems like the obvious choice.
Has anybody seen this earlier? This is the classic XKCD about learning curves. The Emacs one is really interesting, because I don't understand what it means. Anybody use Emacs? Well, I use it a little bit, but OK. So this is my version of learning curves. Ruby is very nice and easy. Beginners can start off very easily.
Then they hit the curve where they start following conventions, and they go to the next one, then they start metaprogramming, and it's all nice. C++ is like every day you learn something new. Anyway, get to Node.js. I don't really understand it. It's just confusing async programming, right? Basically that's a fuzzy mess there.
OK, I'm going to take a short detour. Neela, sitting here, and I participated in this thing called Node Knockout. Anybody know about Node Knockout? Some people, yeah, cool. It's a 48-hour hacking contest. You write some application in 48 hours. Don't sleep, do whatever you want, just finish it in that time, like 29th August. And so we built this thing called ActiveNode. I'll show some pictures of it. It's a monitoring application. Has anybody heard of New Relic? Ruby guys? Yeah, so it's sort of trying to do what New Relic does. Except, because Node is so async and very dynamic, we want to do it really, really real time. So you just do that in your code, and it just does magic, almost real-time magic: monitoring what's going on with your website, what kind of browsers connect to you, what the slowest bits are. It's not there yet; it's still in the works. We don't know what we're going to do with it. Maybe release it at some point. We'll see. We're not really sure about that right now.
Some of the learning here is from ActiveNode. When you compress so much learning into 48 hours, when you have to get stuff done and you're learning things as you go, you remember it. You don't forget it, even if you hacked it at night, half asleep.
So this is another tweet that I saw recently. It's really funny, but it's also true. I mean, it's funny because it's true. Can people relate to that? Because I definitely relate to it.
I write all tested, nicely written Ruby. But when it comes to JavaScript, I'm like, oh, WTF. I'm sure some people agree with me on that. It's just not intuitive. JavaScript has so many gotchas that you don't expect.
So this is Pit Luga. Pit Luga is an ex-colleague of ours, and he tweeted about this. They also run a big-ish Node system, which is a payment gateway. Basically he says it creates unmaintainable soup. Yes and no. I think there are some patterns that you start using to deal with it. Some of them are just about getting your domain models right and your abstractions right. So I'm going to talk about some of that. That's what I'm trying to get to. I think this kind of programming is not intuitive. But if you want high performance, and you get so much power with Node, you have to take that responsibility and say, yeah, deal with it.
So what I tend to do is take these one at a time. And the first one is: use CoffeeScript, period. There was a session, I think, in the other room about CoffeeScript. I'm not going to equivocate about it. Just do it. Just use CoffeeScript. If you're starting a new project, just use CoffeeScript. Don't bother with JavaScript. That doesn't mean you don't need to understand JavaScript. CoffeeScript needs you to understand the innards of JavaScript, what's good and what's bad. But don't bother writing JavaScript. It's got all the good parts: no curly braces, no functions, whatever. It's really easy to read. And I think it generates much nicer JavaScript code than I can write. I don't consider myself the alpha JavaScript writer, but I do OK. Still, I think it's really nice.
If you have old projects, then my recommended path is not to do that converter thing, because it sucks. It won't work. Just start all your new code in CoffeeScript. Keep your old JavaScript code.
Do this fancy require of coffee-script and write everything in Coffee. What it does is hook into Node.js and register an extension. It says: anything with a .coffee extension, compile it. And it does it automatically. So all this stuff is really .coffee files. I'll probably show some of them.
So, yes and no. With CoffeeScript, it's kind of hard to find errors, especially when you do the thing I'm recommending. If Node sees a JavaScript file, so if there's a host.coffee and a host.js, the JavaScript takes precedence: it'll load the JavaScript up front and not the Coffee. So at development time, when I'm debugging, I tend to compile it and use the JS, and not check the JS in. That works as well. But yeah, this is a caveat. As long as V8 doesn't natively support CoffeeScript, we'll still have to deal with it. I don't know if there is a better solution to that.
So I'm going to take a step ahead. There's a small problem we had. We embed our application in somebody else's Node.js code, and it sits there monitoring stuff, doing cron jobs to publish information about what your CPU usage is, what your memory usage is, and stuff like that. So there's a small daemon that we call tracker that runs. This is how I first implemented it. By the way, there's a lot of code coming up, so danger ahead. This is really my first implementation. One good thing I did was that I didn't write it inline in the monitor itself, but abstracted it into this thing called host.info, with a method called info. That's the only good thing that I can say about this code. Anything else that you can see that's good here? Probably not.
So this is the nature of the beast. This is the callback mess that we talk about. One, the API is actually quite bad. The API that I used, sorry, I'm going to have to drop it. This is apiinfo.node.getversion. It takes a callback that has a result in it, and that's the result for getversion. Oh, by the way, this is CoffeeScript, so you have to understand CoffeeScript a little bit. Sorry, I didn't warn you earlier: this is CoffeeScript, not JavaScript. So what it's saying is, take a callback which does the rest of this. And ditto: every time you see that arrow, that's a callback. I'm sorry? With the result as an argument. Yes, with the result as an argument. So the rest is pretty much the same. You just have to look in the file. But can I start sticking that information into the data? Clear? Everybody seems to have an answer.
The requirement here is that I can't make a call to my server for each piece of information. Get environment, post to the server. Get version, post to the server. Get memory usage, post to the server. That's really horrible. I want to get all that information and then post once. If I get 15 requests like this in one second, I'm screwed. My server is going to go down. So the idea is to collect all the information and then call this callback right at the bottom. Right? This is the guy who's doing the major chunk of it. Clear, right? Yeah. I don't think anybody understands this code. I don't claim this is good code, but that's fine. Everybody can see this code, right? Cool.
So this is the second, less complicated version. I abstracted that away into a bunch of system calls and do a call or an apply, it doesn't matter which, with this thing, and then do a callback.
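The slide code is not captured in this transcript. As a rough reconstruction in plain JavaScript rather than the CoffeeScript used in the talk, the two versions described so far might look something like the sketch below. The `systemCalls` table, the timer-based stand-ins, and the names `infoNested` and `infoLooped` are assumptions made for illustration, not the speaker's code.

```javascript
// Hypothetical stand-ins for the async system calls; each one calls back
// later with its result, the way a real IO call would.
var systemCalls = {
  environment: function (cb) { setTimeout(function () { cb('production'); }, 10); },
  version: function (cb) { setTimeout(function () { cb(process.version); }, 5); },
  memory: function (cb) { setTimeout(function () { cb(process.memoryUsage().rss); }, 1); }
};

// Version 1: every call is nested inside the previous callback.
function infoNested(callback) {
  var data = {};
  systemCalls.environment(function (environment) {
    data.environment = environment;
    systemCalls.version(function (version) {
      data.version = version;
      systemCalls.memory(function (memory) {
        data.memory = memory;
        callback(data); // a single post with everything collected
      });
    });
  });
}

// Version 2: loop over the calls, then invoke the callback at the bottom.
function infoLooped(callback) {
  var data = {};
  Object.keys(systemCalls).forEach(function (name) {
    systemCalls[name](function (result) { data[name] = result; });
  });
  callback(data);
}

infoNested(function (data) { console.log('nested keys:', Object.keys(data)); });
infoLooped(function (data) { console.log('looped keys:', Object.keys(data)); });
```

Running both side by side and comparing the key lists they print is a quick way to see the difference the talk goes on to discuss.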
Anybody see any issues with this code? This is a classic noob JavaScripter thing, right? This won't work. This doesn't work. This is totally wrong. This is my classic noob mistake. Because it's async, the responses come back much later, and this callback is called with empty data. So I'm basically seeing no data at all. And this is again the asynchronous nature of the beast. You just can't call the callback before getting all the data. The interesting thing, though, is that if you go debug or something, you'll see the data two seconds later. It's there. But what was posted to the server was empty. Very subtle, very stupid bugs.
So this is my favorite part of the slides, by the way. When in doubt, hack. So hack another version of it. By the way, I don't know whether I mentioned the underscore. The underscore is Underscore.js. I guess everybody knows that. So this next version is slightly more complicated but does the right thing. It works. It works because I'm keeping a count. This is trying to synchronize certain operations: every time you get a result, call back here, and there I decrement a count. And when the count shows that everything has come back, I'm done, right? Am I making any sense at all? Maybe, yeah. This is almost a classic pattern in asynchronous design. You see this in EventMachine. You see this in Node.js. You probably see it in Twisted as well. So: count the pending calls.
This case is really simple because I just want the hash, right? If you want to sequence these operations, as in get environment should run first, then get prefix, then get version, then it's tricky. Then you have to clone the array and do stuff. So maybe it's a reader exercise to do the sequenced version of this.
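The counting pattern described above can be sketched as follows. This is plain JavaScript without Underscore.js, and the `systemCalls` table with its timer-based stand-ins is hypothetical; only the idea of decrementing a pending count and firing the callback when it reaches zero comes from the talk.

```javascript
// Hypothetical async system calls, keyed by the field they fill in.
var systemCalls = {
  environment: function (cb) { setTimeout(function () { cb('production'); }, 10); },
  version: function (cb) { setTimeout(function () { cb(process.version); }, 5); },
  memory: function (cb) { setTimeout(function () { cb(process.memoryUsage().rss); }, 1); }
};

// Count the calls still in flight. Whichever response arrives last brings
// the count to zero and triggers the callback, so it sees the full hash.
function info(callback) {
  var data = {};
  var names = Object.keys(systemCalls);
  var pending = names.length;
  if (pending === 0) return callback(data); // nothing to wait for
  names.forEach(function (name) {
    systemCalls[name](function (result) {
      data[name] = result;
      pending -= 1;
      if (pending === 0) callback(data);
    });
  });
}

info(function (data) {
  console.log('collected:', Object.keys(data).sort());
});
```

The calls still run in parallel and finish in any order; the counter only guarantees that the single post to the server happens after the last one returns.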
There are other ways of doing it. nextTick is another way. Instead of doing the callback, you say: on the next tick, when the next reactor loop runs, schedule this, get version, or the next method. So you pop methods off the array and schedule them. That's one version of it.
So as you can see, there's stuff happening in parallel and there's stuff happening in sequence. The collection and sending to the Redis server, or to the server, needs to be sequenced after all the parallel calls have run. You see this pattern a couple of times, and then a new flow control thing is born: I create another flow control module. Abstract this away into some sort of looping mechanism, right? And that's why, I think I did a wc -l for this, I got something like 40 flow control modules that do async Node.js stuff. So drop that idea right away. Don't write a new async module for yourself. Understand the plumbing, then scratch whatever I just said and use this thing called Seq. Has anybody used Seq? Well, I'll show you the code again. Like I said, there are 40-ish async modules out there. You just do a require of seq. It's really simple. Basically you say: wrap my execution flow in this thing called sequence. The for-each is basically parallel. Run all of these in parallel. Don't bother about when they run.
You want to remove these async, right? Yeah. How does the order matter? The exact order of these doesn't matter. What matters is that this callback should be called at the end of all of them. So that's why I say all of this should be parallel. I don't care about the order of execution. It's just this one. So parallel is all of these; sequence is this one.
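Before reaching for a library, the next-tick approach mentioned at the start of this part can be sketched in plain JavaScript: clone the array of steps, pop one off, and queue the following step with `process.nextTick` once the current one calls back. The helper names `runInSequence` and `step` are made up for this sketch, and this is not Seq's API.

```javascript
// Run async steps strictly one after another. Each step is a
// function (next) that calls next(result) when it has finished.
function runInSequence(steps, done) {
  var queue = steps.slice(); // clone so the caller's array is untouched
  var results = [];
  function runNext() {
    var step = queue.shift();
    if (!step) return done(results);
    step(function (result) {
      results.push(result);
      // Queue the following step instead of calling it directly, so the
      // call stack does not grow with the number of steps.
      process.nextTick(runNext);
    });
  }
  runNext();
}

// Hypothetical step factory: calls back with its name after a delay.
function step(name, delay) {
  return function (next) {
    setTimeout(function () { next(name); }, delay);
  };
}

runInSequence(
  [step('environment', 10), step('prefix', 5), step('version', 1)],
  function (results) {
    // Results follow the order of the steps array, not the delays.
    console.log(results);
  }
);
```

This trades the parallelism of the counting version for a guaranteed order, which is the case the talk calls tricky.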
So what it does is quite interesting, because it creates a stack of all the results that each one gets and stores it in this thing. So, just looking at this code, it's really intuitive. There's no compromise, there's no magic.
Is it like the MapReduce concept? Pretty much. So the sequence step is your reduce. But I would not think of it as MapReduce. You can do maps and reduces and everything; there's a method called reduce. But I don't think there's a correlation with the cloud MapReduce. It's almost like recursion: it's waiting for certain things to happen instead of recursing. One of the other things you could do, like the nextTick pattern I gave, right? Actually, recursion is another way of doing it. [unclear]
Yeah, so the other thing you see is this catch. This takes care of your error issues: if you have an error in Node.js, it will crash out, or you'll have to handle it separately. What this does is let you handle each exception at the right place and deal with it however suits your usage. So it builds exception handling in as well. So I think all the code is probably over now.
With the catch, if there's an error in one of the parallel ones, what happens? Do all the parallel ones continue? Anything that is parallel will continue; there's no break or anything. But say a couple of them fail, and you get a couple of errors. Yes. And it still moves on to the sequential step?
Yes. You can choose to throw from here and say, I don't want to deal with this at all. You can also move your catch; this one is a catch on everything. You could move your catch to somewhere here, sorry, just before seq, and say this catch applies only to the parallel requests and not to the sequential one. As it stands, this catch applies to everything that gets executed, unless you rethrow. This basically means I'm eating the error and not doing anything; I'm just logging it. So the sequential step will still happen, with some random junk data or bad data. Cool.
Oh yeah, the other patterns. I'm not a big client-side developer, so I don't do a lot of jQuery, but I've done enough jQuery to know this is a classic jQuery pattern. When you don't want to deal with certain things, you just emit: you trigger an event, and somebody else handles it somewhere else. It's also a classic way of abstracting your code away, as in breaking it into multiple modules. In Node, it's called EventEmitter. Everybody can see the code? Okay, cool. So you do all the boilerplate stuff to set up your event emitter. I could have written all the spaghetti here: if the remote address is this, do this, and that, and so on. But I chose to say this is not the right place; this is just a controller, not the place to handle all the request stuff. I'll push it somewhere else, and somewhere else there's code that says controller on request, do this, right? To me, this is one pattern that I use to break out of that hairy nested callback mess. Obviously it's useful for modularizing your application as well.
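A minimal sketch of that emit-and-handle-elsewhere pattern, using Node's built-in `events` module. The `controller` object, the `handleRequest` function, and the shape of the request object are illustrative assumptions; the slide's actual code is not in the transcript.

```javascript
var EventEmitter = require('events').EventEmitter;

// The controller does no request handling of its own; it only announces
// that a request arrived.
var controller = new EventEmitter();

function handleRequest(request) {
  controller.emit('request', request);
}

// Somewhere else (typically another module) a listener does the real work.
var seenAddresses = [];
controller.on('request', function (request) {
  seenAddresses.push(request.remoteAddress);
});

handleRequest({ remoteAddress: '127.0.0.1', url: '/' });
console.log(seenAddresses);
```

Listeners run synchronously inside `emit`, in the order they were registered, so the controller stays flat no matter how many handlers are attached.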
So if you take one thing away from this, it's that the constructs may be different, but the way you design your Node application is exactly the same as the way you design your Ruby application or anything else. Think of the abstractions and abstract things away. The implementation may be different, maybe using EventEmitter, or maybe using Seq or some other flow control sort of thing, but get to the right abstractions and you don't have to worry about it. It's really that simple.
So that was an example of internal eventing, where the Node application talks to components inside itself. But we also use Redis in the back end to store all the information, and we event with Redis. We don't use pub/sub, though pub/sub is a really cool way of doing it as well; we use MONITOR. Has anybody used MONITOR on Redis? You register a monitor and say: send me everything that's coming to the database; any change that happens to the database, give it to me. And I think the code was here. So you say on monitor, just give me all the stuff. As somebody else is writing data to Redis, I'm streaming that into this part of the code. And all it does is pipe this data to Socket.IO, right to the browser. It doesn't even understand what data it is; it just takes that data and pipes it into Socket.IO.
When you start working with Node, there are lots of options. We could choose to use MySQL, for instance. But personally I think it's not evented in that sense. You tend to choose architectures that are evented: a change happens, I get to know about it. You don't want to put a loop or a polling thing around it that keeps checking for updates from MySQL and, if something happens, does this sort of thing.
So prefer architectures that are evented. That's why Redis kind of fit in for us. CouchDB has a similar infrastructure in that sense: it has the _changes feed, which lets you see what the changes are pretty much in real time. This is my personal opinion, and I don't know whether people will agree with it, but when you're using Node, think about what tools you're using along with it. You can't just take an 18th century thing and stick it alongside Node. Maybe MySQL is not that, but yeah.
One other thing to understand is how to use CommonJS modules. There are a couple of patterns. It's very simple; I'm just throwing it out so that you know how to do it. This is how you require ActiveNode, and it just exports one method called request tracker. That's about it. This is okay when you're exposing one or two methods. Does everybody know what CommonJS is? Okay, CommonJS is this common abstraction for all the JavaScript servers. There are a lot of people who support it, and this module.exports is basically a CommonJS directive. There are a bunch of common tools that they intend to expose. I don't know what shape it is in, but Node kind of half-heartedly accepts it. I know Ryan has a big issue with it, but fine. When you're exposing a whole bunch of methods, objects and stuff, use this sort of syntax.
The other thing that we tended to run into was Node memory issues. Has anybody dealt with Node memory issues at all? How many use Socket.IO? Yes, Socket.IO is known to be a big leak. I'm trying to make a controversial statement here, but having closures and not understanding when things go out of scope leads to a lot of memory leaks.
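Going back to the two CommonJS export styles mentioned above, a self-contained sketch follows. It writes two throwaway module files to a temporary directory so that it can run as a single script; the file names `tracker.js` and `host.js` and their contents are invented for illustration and are not ActiveNode's actual modules.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Throwaway directory so the example can create and load its own modules.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'exports-demo-'));

// Style 1: the module *is* a single function (fine for one entry point).
fs.writeFileSync(path.join(dir, 'tracker.js'),
  'module.exports = function requestTracker() { return "tracking"; };\n');

// Style 2: the module is an object carrying several named exports.
fs.writeFileSync(path.join(dir, 'host.js'),
  'exports.info = function () { return "info"; };\n' +
  'exports.version = function () { return process.version; };\n');

var requestTracker = require(path.join(dir, 'tracker.js'));
var host = require(path.join(dir, 'host.js'));

console.log(typeof requestTracker); // the whole module is the function
console.log(Object.keys(host));     // the module is a bag of named methods
```

In a real project the two modules would simply be files on disk; the temporary directory is only there to keep the sketch in one piece.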
That's my personal view. I don't know whether all the hardcore JavaScript guys agree with it, but Node just has some issues. Socket.IO, definitely. We just ran Socket.IO, and in four days, I think, it crashed, running out of memory without doing anything, without even publishing an event.
Anybody use node-inspector? Yeah, pretty cool. I think this is my best bet. Personally, I don't debug Node too much, but once in a while I run into memory issues, and then I do. It gives you a nice, well, this is Chrome. It probably works with Chrome and Safari. You put breakpoints, you can step in, step out, look at the call stack. You can do lots of things with it, but basically it lets you see what's on your call stack, what's lying around, stuff like that. Do that really early, because we've seen this over and over again with Node: it just runs out of memory and dies. There's this status dashboard thing I work on, which looks at my production servers and says whether this is up and this is up and this is up, but in a few days it dies. I don't even know what to monitor what with. So it's kind of funny. My status dashboard keeps dying, and most of the time the culprit is actually Socket.IO. I can't really blame Node, but keep watching that.
Deploy early. This is going into that agile stuff. Each platform has its own idiosyncrasies. When we were doing ActiveNode, we went to no.de, which is Joyent's Node hosting, and WebSockets didn't work, some of our IP resolution stuff didn't work, and we said, no, we can't use no.de. So we went to [unclear], and there half of the other stuff didn't work, and we didn't know what to do. Some late night debugging helped us fix it, and we went back to no.de.
Yeah, WebSockets and Socket.IO: Socket.IO needs all these ports to be open, and it works, but it can start falling back on long polling. Everybody knows what Socket.IO is, right? It's real-time updates, whatever. It will fall back to long polling if it can't find the ports open, or if it's not Chrome, for example, where WebSockets are not turned on. And if it's long polling, basically you're screwed, so don't bother with long polling and Socket.IO. If you debug these issues right away, it will be useful.
Winding down now. Use one of the Node hosting sites for deployment; don't deploy on your own. It's pretty basic right now. Even the Node guys say to front it with NGINX and not just use the Node.js HTTP server, because it's still not really complete. So don't bother with fronting it with NGINX and doing all that on your own. Node crashes all the time, because if you don't do the right exception handling it will crash, and you'll need something to restart it when it dies. So maybe NGINX, but these guys are really good, so just pick one of them and stick with them. That's pretty much the end of it. Thanks.
So if you have any questions, or whether it was too high level or too low level. [unclear] What's your company, Ruby? Oh, okay, we're into consulting, mostly Ruby. I think 90% of our work is Ruby. Ruby means JavaScript and everything else that goes with it. And we choose to do Node.js when we can. [unclear] Yeah, we worked with PIC; we didn't actually do it ourselves. We worked with PIC to develop that. Mostly our customers tend to be early startups. Cool, thanks.