Hi everyone, sorry about the delay; we'll start. My talk is basically about writing bindings to native libraries for Node.js. So let's say you want to use something like an image processing library, say OpenCV, or you want to port Node.js to a smartphone and you want Node.js applications to be able to access smartphone APIs like the phone, the contact book, and so on. How do you do that? You integrate C++ libraries with Node.js, and that is what the talk is about. If you are interested in generally knowing about the internals of Node, then this is also a good talk for you. Basically, Node is a small wrapper around the V8 JavaScript engine, which makes integrating JavaScript into other applications as easy as something like Lua, and it's quite possible that JavaScript will be the new Lua someday. If you want to follow the slides and the code, all the code is available, so you can try it out as I go along. What we want to do is use C++ libraries in Node.js, exchange data between C++ and JavaScript, and, the main part, do asynchronous I/O with Node. I don't know if I'll have enough time for the async I/O part, but let's see what happens. This talk is very code heavy. You will see a lot of C++ on screen, so if you have any questions at any point, just stop me.
So, the very basic thing, the simplest Node module you can write. How many of you have written Node modules in plain JavaScript, using multiple files, so that you do exports.something? Okay, so three or four people. Just like you have exports in JavaScript, what you do in C++ is that, first of all, you include the V8 and Node header files. Then you define this function called init, and you tell Node, using the NODE_MODULE macro, that my module is called firststep in this case.
And to set up this module, Node should call this function called init, and this init function can then set up the module as you want it. So just like you have the exports object in your JavaScript modules, you have this target object in C++; you can set properties on it and they'll show up in your module. I'll explain Handle and Object and these things as we go along. This is the basic template. Every Node module you ever make will have this. So how do you compile this? Node uses a tool called waf for its compilation, which is basically a Python-based replacement for make. So this is like a simple makefile, I mean, a wscript file, where you say that it's a C++ project, and you have this node_addon thing, which pulls in certain Node-related libraries like V8 and Node itself. And then, at the very end, you tell it to build a module called firststep using the source file called firststep.cc. This is again something that is common. In all the code in my Git repo, you'll find every directory has a build file and so on, so you can just directly compile it. To compile it, you run node-waf, which is a slightly modified form of waf, and you say configure and then build, and hopefully it builds successfully and then you can use it. Just like you require normal JavaScript modules in Node, you can do the same thing with your C++ module. You just do require with the path to the module itself. And if you compile this module, this is what you would get: because you haven't actually done anything in the module, it's just an empty object.
So this is basically what Node is. It's V8, plus this library called libuv, which does async I/O and all those things. It wraps the entire thing and calls it Node.js, and your binding is the one in red. That interacts with V8, and the library part is your native library.
For example, OpenCV or SSH. A lot of libraries in the Node core distribution itself, like the crypto library and zlib and so on, are written exactly as I'm going to describe. V8 itself has good API documentation, but it's not officially hosted on their website, so this is one of the places on the web where you can find it. So now let's go back to the code. You see this target parameter, a handle to an Object? Since V8 has to do garbage collection by itself, it passes everything through this Handle wrapper. You can think of a handle as a smart pointer that keeps track of how many references there are to V8 objects and so on. The wrappers, in a sense, also encode the scope of the variable itself. As long as any V8 object has at least one handle referring to it, the garbage collector knows that it's not supposed to delete it and will keep it in memory. Handles are of two types: local and persistent. As soon as a local handle goes out of scope, the garbage collector is free to delete the object, whereas a persistent handle is not deleted by the garbage collector; you are supposed to tell it manually to delete it. So you'll usually use persistent handles when you want to pass data around through multiple function calls and store it for later. Otherwise you'll stick with local.
So let's get directly to implementing functions. This is a simple JavaScript module, where you would say exports.square equals whatever its implementation is, and then in other Node modules, let's say this one is called square.js, you could do require square and then square.square to call the function. So how do you do this in C++? It's pretty simple, nothing too complex. Just like you do exports.square, you take the target object and do target->Set, the last line of the code.
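As a rough sketch of the plain-JavaScript side described here (the square.js module and how another file would use it), something like the following would do. It is not the code from the slides: `squareExports` is a hypothetical stand-in for the CommonJS `exports` object so that the snippet runs on its own, and the `require` call is shown only as a comment.

```javascript
// Sketch of square.js as described in the talk (not the slide's actual code).
// `squareExports` is a hypothetical stand-in for the CommonJS `exports`
// object; inside a real square.js file you would assign to `exports` directly.
const squareExports = {};

squareExports.square = function (n) {
  return n * n;
};

// A consumer would normally write: const square = require('./square');
// Here we reuse the stand-in object so the snippet is self-contained.
const square = squareExports;

console.log(square.square(7)); // prints 49
```

The C++ binding discussed next aims to expose the same shape: an object with a `square` property that is a callable function.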
You do target->Set with the name of the function and the corresponding function itself. Any C++ function that has to be called from JavaScript has this signature, which is the first line. It returns a Value handle, then there's the function name, and it takes this Arguments object, which is used to transfer arguments from JavaScript to C++. Then in the init function I have created this function template. I'll explain what a function template is in a few slides. Think of it as telling V8 that this C++ function should somehow be magically turned into a JavaScript function, and you use the function template to allow that. So you create a function template, you tell the function template to give you an instance of the function, and then you inject that into the target. This is equivalent to exports.square = function (n) in JavaScript.
Now I'll actually go and implement the function itself. So this is the function implementation. args is something like an array. Just like you get an object named arguments in JavaScript, you can extract the arguments by index in C++. So I'm taking the first argument, args[0], right here, and extracting the integer value from it. Then I square it. This is normal C++. And then I do scope.Close with Integer::New of the square. At this point I would like you to notice two patterns in writing bindings. One is that any time you use a V8 class, you don't use the C++ new operator. You use the static function New, with a capital N. That's because the garbage collector is doing a lot of back-end bookkeeping, and using these static functions allows it to keep track of all that. The other thing is this HandleScope. As I mentioned before, there are local handles and persistent handles. For local handles, V8 maintains a corresponding stack of some sort, recording which variables are currently in scope.
So a handle scope is like adding a new stack frame in V8's garbage collection machinery. You are saying that any new handles created in this scope can all be deleted once that scope is popped off. The only problem is that the return value has to go back to the calling function, and if it were deleted when this function went out of scope, that's not good. So what scope.Close does is access the handle scope that was on the stack before this one, and it makes whatever you pass to Close belong to that parent scope, so it doesn't get deleted. Basically, any time you want to return something, it's best to wrap it in scope.Close.
Excuse me? Yeah. You don't need to instantiate the scope there?
Yeah, you can see the first line, that's HandleScope scope, and that's basically all you need to do. Internally it will handle the rest for you.
So this is the analogy from C++ V8 to what happens in actual JavaScript. FunctionTemplate itself doesn't have a very good JavaScript analogy. It's more of a way to keep the function in some kind of paused state before I create it, so that I can do things with it. Then when I call GetFunction to get an instance, what GetFunction returns is something like what you get when you refer to a function just by its name, without calling it. It's just the function itself. FunctionTemplate provides certain things, InstanceTemplate and PrototypeTemplate being two of them. The next slide is object oriented: how you can create JavaScript objects in C++. When you do that in JavaScript, there's this common pattern. Let's say Inventory is the object. You have a function, and then you create an instance of it with new.
And when you create it with new, you get an object in JavaScript, and you use this.items to set properties of that object. InstanceTemplate is equivalent to this in JavaScript. And if you move on to here, this is the traditional way of adding methods to a JavaScript object: any instance of Inventory will now have addStock and ship methods. PrototypeTemplate is the counterpart of that .prototype. Any questions at this point? Because now we'll go to object oriented. If you actually try running the code called simplefunction in the GitHub repo, you will see the same thing I've shown, with the header files and so on, and you can just run it. If you require the module, it will have a square function ready to go.
So, coming to objects. There's not much difference between creating a function template and creating objects of that kind. Again, I created a function template, and Inventory has the same signature, if you look at the bottom. What I am doing differently is that, since I want this.items to be equal to something, I get the InstanceTemplate and on it I set items to a new Integer with the value 257. Then I inject the function into the target, so I'm doing exports.Inventory = the Inventory function. The interesting part about the Inventory implementation is that it's just one line: return this. Just like in normal JavaScript, where the this object is created by the JavaScript environment itself and is just an empty object, args.This() is the equivalent in V8, and it will again be an empty object, created by V8 before it calls your function. So you can just return that, and that makes it an object.
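The JavaScript pattern that the C++ templates are being compared to can be sketched roughly as follows. This is an illustration rather than the slide's code: the constructor sets `this.items` (the role the talk assigns to InstanceTemplate) and the methods live on `Inventory.prototype` (the PrototypeTemplate role). The initial value 257 comes from the talk; the method bodies and the error message are assumptions based on how addStock and ship are described.

```javascript
// Constructor: every instance gets its own `items` property.
// This is the part the talk maps to InstanceTemplate.
function Inventory() {
  this.items = 257;
}

// Methods shared by all instances via the prototype.
// This is the part the talk maps to PrototypeTemplate.
Inventory.prototype.addStock = function (n) {
  this.items += n;
};

Inventory.prototype.ship = function (n) {
  if (n > this.items) {
    // Hypothetical message; the talk only says an exception is thrown.
    throw new Error('Cannot ship more items than are in stock');
  }
  this.items -= n;
};

const x = new Inventory();
x.addStock(5);        // items is now 262
x.ship(60);           // items is now 202
console.log(x.items); // prints 202
```

Inside `addStock` and `ship`, `this` is `x`, which is the same object the C++ methods reach through `args.This()`.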
Okay, so just like you have String::New, there are symbols. A symbol is something that V8 will internally convert to an integer. We call it interning in JavaScript engines. Instead of having to do an entire string comparison every time it looks something up, it goes by a unique code, and that speeds up the lookup. So for certain frequently used things, you might want to use it. So this was operating on the InstanceTemplate. Every instance of Inventory will have its own items and you can freely modify it.
So how do you translate the method pattern of JavaScript into C++? That is, how do you set methods on the Inventory class? I mean, it's not a class, but we're going to treat it as one. Again, not hard. What I'm trying to show is that V8 has tremendously simplified how you write applications that can interact with JavaScript, and Node adds certain useful things on top, which I think make a lot of difference. So what we're going to do is work with the prototype. Node provides this macro called NODE_SET_PROTOTYPE_METHOD. You just pass it the template itself, then you specify what the method is going to be called in JavaScript, and which C++ function is going to be called when addStock is called. I'll explain a bit later why you don't want to operate directly on the PrototypeTemplate and why you should use this macro; it will become clear as we go on. Again, not much difference here. Just get a function and set it, and you should have addStock and ship as Inventory methods. So let's turn to the implementation, so that we can understand how that object is now internally accessible in C++. Again, I have used args.This().
So let's say I do something like var x = new Inventory() and then x.addStock, and let's say I pass five. In this case, if you were writing JavaScript, the this object in addStock would be x. That's what you get here too: args.This() gets you that this object. Then you just do a Get of items and get its value. If you don't understand this Int32Value and so on, you can just consult the API documentation. Basically, it converts from the V8 representation of an integer into the C++ one. Otherwise, I'm using this object as a simple dictionary of some kind. I'm just setting and getting properties. Very simple. And then I just return Undefined, which is just a convenience function. The ship method is similar. Again, I get the this pointer and get the property. I can throw an exception by calling ThrowException and passing a string. So let's say I'm trying to ship more than I have; then you can throw an exception. As you can see, up till now we have a very one-to-one analogy between the C++ implementation of the module and the JavaScript. What you would actually be doing in ship, if you were using some native implementation of inventory, is calling methods of that native library on the instances and then passing those values back.
That is where we come to what I think is the most important part of writing bindings, which is ObjectWrap, where you associate native C++ objects with JS objects. If you see the diagram, there's a V8 object, let's call it obj. V8 objects have these internal fields, where you can insert arbitrary data, any kind of thing, and V8 won't touch it. You can do what you want with it. Then you have an instance of an ObjectWrap subclass, which you will create. And then you have the native class instance.
So let's say you have some image processing library, and that library has a class called Image. For every image that your JS app uses, you want an instance of the native Image class to be created. So if I do new Image in JavaScript, you want a new Image to be created in your native library as well. What you do in that case is create an ObjectWrap subclass. You could call it Image in another namespace, but let's call it JSImage so we don't get confused. JSImage would contain the native Image and do operations on it, and JSImage would have methods and fields on itself and expose itself to V8. Except, how does V8 know that for a given instance of a normal V8 object in JavaScript space, this is the corresponding instance of the C++ object? To establish that association, we use the internal field. What ObjectWrap does is simplify the internal field manipulation. Wrap is actually a method of ObjectWrap, and since yours is a subclass, you can call it. When you call Wrap on obj, it internally sets the field, so field number zero now points to the instance of your subclass. This way V8 knows the association, and at any point when you get a V8 object, you can extract the JSImage instance by accessing the field. You could ditch ObjectWrap and just operate on the internal fields yourself, starting with something like instance->SetInternalFieldCount(1). You need to do this because V8 by itself will not initialize any internal fields, so you need to tell it whether you're using one field, two fields, and so on. You could do it all manually. The only thing is that ObjectWrap also does some garbage collection handling.
So you see the difference between JavaScript and C++. In C++, with all your manual memory management, you are in full control. But when you are dealing with a garbage collector that is running alongside your program, you have to be careful that the garbage collector doesn't go and reclaim memory that you are using, and that you don't delete memory that the garbage collector thinks is valid. Otherwise you get a mismatch and your program segfaults. This is where ObjectWrap deals with those things for you.
So this is how our example is going to proceed. This is your native C++ library. I'm calling it Library, and it has a class Inventory. All of this is native code. In a real situation, you would probably have a compiled library that implemented this, and you would have only the header file, so think of this as a header file that declares all these things. We do have to set the internal field count manually because Node doesn't know which instance we are dealing with. You create a function template for Inventory, and what you get from that template's InstanceTemplate is the object template for instances. You set its internal field count to one. So you say that every Inventory object in JavaScript has one associated internal field, which you are going to point to your C++ wrapper. In namespace binding, which is my bindings code, I create a subclass of ObjectWrap, and I have this New method in it. Just like V8 has Integer::New and String::New, we use the same convention in our classes so that using them is easier. And notice that it's static.
Any method in a class that you want to associate with V8 has to be static, because only static functions in C++ can be passed around as plain function pointers. Then, internally, you create an actual instance of the same class. This class Inventory and the Inventory here are the same thing. So you create a new instance of it and you wrap it around args.This(). Except I'm not using This, I'm using Holder. So what's the difference between Holder and This? In the first example, when I was just getting and setting items, I was using args.This(), and now I've suddenly switched to using Holder. args.This() is always the this object with reference to the JavaScript scope, and you know that in JavaScript you can always rebind functions to run with other objects as this. The problem comes when you're dealing with internal fields. Say your Inventory has this internal field storing a reference to your subclass, and then I go and change the prototype of my object to point to something completely different. When I then invoke the method, args.This() is actually that different object, which has no internal field, I try to access it, and the program crashes. What args.Holder() does is make sure that the object you're dealing with is a valid ObjectWrap instance, or whatever you stored, so that even if you try to call something like that, it will fail with a more decent message than just crashing. And this is the NODE_SET_PROTOTYPE_METHOD macro that I told you about, when I said don't go and set things directly on the PrototypeTemplate. What NODE_SET_PROTOTYPE_METHOD does is use something called a Signature in V8.
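The rebinding problem can be illustrated in plain JavaScript, as an analogy only; none of this is V8 API. In the sketch, a `WeakMap` named `nativeState` (a hypothetical name) stands in for the internal field, and an explicit receiver check stands in for the safety the talk attributes to Holder and Signature. `WrappedInventory`, `impostor`, and the error messages are likewise made up for illustration.

```javascript
// Analogy only: a WeakMap plays the role of the V8 internal field,
// holding hidden per-object state that ordinary properties cannot reach.
const nativeState = new WeakMap();

function WrappedInventory() {
  nativeState.set(this, { items: 257 });
}

WrappedInventory.prototype.ship = function (n) {
  const state = nativeState.get(this);
  if (state === undefined) {
    // Receiver check: fail with a clear error instead of using missing state.
    throw new TypeError('ship called on an object that is not a WrappedInventory');
  }
  if (n > state.items) {
    throw new Error('Cannot ship more items than are in stock');
  }
  state.items -= n;
  return state.items;
};

const wrapped = new WrappedInventory();
console.log(wrapped.ship(7)); // prints 250

// An object that merely inherits from `wrapped` finds `ship` through the
// prototype chain, but it has no hidden state of its own.
const impostor = Object.create(wrapped);
let message = '';
try {
  impostor.ship(1); // inside ship, `this` is impostor, not wrapped
} catch (err) {
  message = err.message;
}
console.log(message); // prints the TypeError message above
```

Without the `undefined` check, the method would try to read `items` from missing state, which is the JavaScript-level counterpart of reading an internal field that was never set.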
I'm not going to go into the details of that, but basically NODE_SET_PROTOTYPE_METHOD combined with using Holder will ensure that your method gets called on the right object in the prototype chain. So you can always use Holder. I'm not saying use This in certain cases and Holder in others; it's preferable to always use Holder. So, getting back to the wrapper. What I've done now is wrap the this instance in my wrapper so that this link here is established. Then I return the this object itself, because that's our instance. Now, this is where the interesting part comes. Since we have wrapped it, we obviously need to unwrap it at some point to operate on it. When the user does x.ship with five items in JavaScript, the this object is going to be x, which is a V8 object, and that V8 object has the pointer to our internal subclass. So we unwrap it and then do our normal operations. This wrapper itself will internally keep references to the native library classes. In this case, it's called inv. What I'm doing here is actually calling the native method ship, and then I can check the return value and do error handling. So this is the simplest binding you can get from JavaScript to a native library. Any questions up to this point? How much time do we have? Wait a minute. Okay, so we have time to get to the asynchronous I/O part, which I think makes a ton of difference. I mean, that's what Node is all about: you shift operations that block to another thread and you can process lots of other stuff. So what are the concepts involved in getting the...
If you wrap the same native object twice, do you get the same JavaScript object?
Okay, so let me clarify. When you wrap, it's somewhat unintuitive.
When I'm doing wrapper->Wrap(args.Holder()), it's not so much args.Holder() itself that is being modified; it's just that args.Holder()'s internal field is being pointed at the wrapper itself. So if you call it again, it's just going to reset the field. It's an idempotent operation. Of course, if you try to unwrap something and it turns out to be of the wrong type, then it will be null. In all my examples, and even in the code, I've been very relaxed about error checking to keep the code simple, but even accessing args[0] and then just calling Int32Value is a very bad idea. If somebody passes a string, you're going to get a wrong value. So you have to do type checking and so on in production code.
So what is the problem we are dealing with when we want to go async? Most libraries are not designed to be non-blocking. When you do file I/O in normal C, you do open and then you read, and until the read finishes, you wait. Let's say you're reading from a socket: until the socket actually delivers data, read is going to block and your program is going to block. So most native libraries are not non-blocking by design. They are straightforward native implementations. The workaround Node uses to let async Node use synchronous bindings is this. It has an internal pool of threads in the background, and Node itself runs on the main thread. libuv abstracts away the thread pool, so you can give it a synchronous task and it will move it off to the thread pool. It will run the task over there, detect that it has finished, and once it has finished, notify you so you can do the further processing. So internally it uses multithreading to simulate the asynchronous nature. You need to do three steps to have async bindings, that is, to convert a synchronous library to async behavior in Node.
So the first step is to use this function called uv_queue_work. This function takes a couple of things. First of all, you will need to define three functions. Until now we had a one-to-one mapping from the JavaScript ship to the C++ ship. So when somebody calls... okay, let me show the code itself. This is what we're going to do. I've called this the async module. I'm creating an Inventory, and let's say in your company the inventory store becomes very full, and because you want to optimize it, you tell your workers to reshelve the whole thing in some sort of order. Except that your inventory is so big that reshelving it on the main thread would stop people from interacting with the inventory. So you want to make it an async operation, and you want to know when reshelving is done. So what do you think is going to happen if you just read this code in order? First of all, it's going to make the reshelve call, but then it's immediately going to go on and print "After reshelve in source", because we want this call to be non-blocking. Then it's going to keep printing Tick, Tick, Tick. The native version of reshelve in your native library is blocking, and what it does in this simple case is just sleep for five seconds. So here you're going to get output like "After reshelve in source", and then, I think I have the output, yeah. This is what the output is going to be.
So how do you do that? Because it's this three-step thing, you need to define three functions rather than one for every async function you want to expose. The first one is, of course, the direct mapping, like ship went to ship. Except that there, instead of doing the operation itself and then returning, you are going to set things up and put the work on the thread pool.
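The call ordering the speaker walks through for the async module can be sketched in plain JavaScript as follows. This only models the observable behaviour: `reshelve` here is a hypothetical stand-in that uses `setTimeout` in place of the native blocking call on the thread pool, the delay is 30 ms rather than the five seconds in the talk, and the logged strings approximate what the speaker reads out.

```javascript
const log = [];
let ticker = null;

// Hypothetical stand-in for the binding: it returns immediately and invokes
// the callback later. setTimeout plays the role of the blocking native work
// that the real binding would hand to the thread pool.
function reshelve(callback) {
  setTimeout(callback, 30);
}

reshelve(function () {
  log.push('Reshelving done');
  clearInterval(ticker);       // stop ticking so the process can exit
  console.log(log.join('\n')); // show the whole sequence at the end
});

// Runs right away, because reshelve() did not block.
log.push('After reshelve in source');

// The main thread stays free to do other work while reshelving is pending.
ticker = setInterval(function () {
  log.push('Tick');
}, 10);
```

The first logged line is always "After reshelve in source" and the last is "Reshelving done"; how many "Tick" lines appear in between depends on timer scheduling.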
The second function is the one that actually gets run in a separate thread and does the synchronous task. So calling the native version of reshelve would be done in the second function. Once the second function is done, libuv is going to tell you it's done and ask, in effect, what you want to do now. That is handled by the third function. You usually do your cleanup there and call the callback. So how do you transfer data through all three functions? You will have some kind of parameters: if reshelve was called with what to reshelve, then you want to pass that into the native library. We use this concept of a baton to pass data around. This is the baton in this case. I call it ReshelveBaton. As you can see, it's not exactly the easiest thing. Till now we were doing pretty much one-to-one mappings, but getting asynchronicity into the mix is a bit cumbersome. It might be a lot of typing, but it's worth the performance improvement. The first member of the baton always has to be this thing called uv_work_t, which is something internal to libuv. You don't need to worry about it, but it has to be the first member. Then you can keep track of whatever arguments are being passed. In this case, we want to call a callback, so we keep a reference to that callback here. We also want to call reshelve on a certain instance of the object, so we keep track of that here. What is going to happen is that when reshelve is called in JavaScript, our corresponding Reshelve function is called, and you see this code. It goes and creates the baton here. I set the callback as a persistent function. Why have I made it persistent? Because the function passed in through the arguments is a local handle.
And as soon as this Reshelve function exits, it's going to be garbage collected. So I've made a copy of it that is persistent, to keep it around for later. Then I pass the wrapper. This is very basic; I'm just setting the fields of the baton. And then I call the magic function uv_queue_work, which is going to handle all the details. This is the main event loop, so you just get access to that. Then you pass the request along, and you tell libuv that the actual function doing the blocking work is ReshelveAsync, and that once ReshelveAsync is done, I want it to call ReshelveAsyncAfter. I'll get to the implementation of these two, but this is how you do it. Basically you set up, you do the blocking work, and then you clean up. So we come to ReshelveAsync. Remember that this is the blocking function, which runs in a separate thread. So do not, do not, do not do anything with V8 in this function. Do not access any V8-related functions, objects, anything, because with multiple threads all kinds of bad things can happen.
What do you mean by multiple threads?
You know that Node is an event loop, right?
Yes.
But the thing is, a lot of the I/O operations that occur are still blocking. Native I/O operations. So to simulate this single-threadedness...
Yeah, but the actual I/O operation has to run in another... So whatever operations a Node application does in an async block still run on one core only, irrespective of the number of cores?
Sorry?
Say you have multiple CPUs on the machine.
Yeah.
Multiple cores.
Yeah.
And you're running an async operation. Only one of the cores will spike?
Yeah.
Irrespective of...
No, but there's no question of cores spiking. You're doing I/O only.
I'm saying that there will be only one process running. There will be no thread. So if at all we need one, a thread will be a child process.
Yeah.
It will be spawned.
So there's no need for a Node process.
No, you have to understand that the blocking I/O still has to be done. So what's going to happen is, let's say your Node app makes an I/O request. And you are saying it's part of one continuous thread, right?
I agree. I agree.
So what happens internally is that it's not non-blocking. I mean, I/O by its nature is not non-blocking, right?
No.
Okay, so even with a library like libev, if you run it in the same thread, even if you say it's non-blocking, as long as it's in the same thread and you call a blocking function, it's blocked. That thread itself is not going to move. So if, while I'm doing the blocking I/O operation, you want the main thread, which is the V8 thread, to let another network connection come in and be registered, you need to move this blocking I/O somewhere else so that this thread can continue running. So internally, Node and libuv together conspire to create this thread pool, and any blocking I/O operations are moved off to the thread pool, simply so that Node can maintain the illusion of having just one thread.
To visualize this, for me top would be the only way. I can see the spike, I can see that the process has started running. But when you're saying that it stops that operation and picks up a new operation, I thought the CPU cycle has ticked.
No, okay. So I'll just finish up and then we can continue.
Okay, so this is the blocking function where, as I said, you don't interact with V8 and only make your native calls. I get the wrapper and call the reshelve function. Then in the async-after function, which is called once it's done, you can invoke the callback. So this is the output. If you were actually using a native library, you would have to tell waf to link to it when you compile; again, you can just add some flags. Yeah, I covered this. So there are a couple of things I didn't cover which V8 provides, especially accessors.
With accessors, every time a certain variable is set, you can have a callback triggered, and so on. Function signatures, again, I didn't cover because we didn't have time. Nor the details of libuv, or using V8 on its own. What I mean by that is, if you wanted to use V8 in another application, not inside Node, just as a JavaScript scripting engine, how do you do that? There's not much difference between Node and this; you just set up a context and so on, and most of the concepts are the same. These are three good add-ons to start browsing the code of. If you want to write your own add-on and are confused about some things, they are a good point to start from. So yeah, now we can get to the questions.
Okay, so can I just take twenty seconds? I actually wanted to ask you people a question. I'm just doing a kind of survey. What do you think is the one technology or idea or concept that every software developer, irrespective of whether they are a mobile developer or an assembly hacker, should know? If you just think about that, you can shoot me a mail or something with your answer. Okay, so if any of you have different answers, please let me know. I'm just doing this short survey. Thank you.
Thanks, Shikhar.