Hello everybody, welcome to OpenJS World 2021 and our presentation about Node-API. We're going to be talking mostly about how to build modern native add-ons. Node-API has been around for a while and it has enabled everybody to build ABI-stable add-ons, but we've had significant features added recently, and those are the ones we would like to talk about. My name is Gabriel. I'm a member of the Node.js TSC and of the Node-API working group, and I originally contributed some portions of Node-API like promises, exception handling, environment propagation, module loading, wrapping and unwrapping of objects, thread-safe functions, and context-aware add-ons. And I am Kevin Eady, a Node-API team member primarily working on the node-addon-api wrapper; we'll talk about that shortly. I'm focusing mainly on thread-safe functions and asynchronous programming in general. Building native add-ons: when you need to compile your C++ code to run on Node, there exist multiple ways to compile your code. One method is node-gyp. This is the de facto method: whenever you install Node, this build system is installed on your machine as well, so any dependencies that you install don't have to ship a build system of their own in order to compile their add-ons. Another alternative is CMake.js. If you've used C++ build systems in the past, you may be familiar with CMake. This one lets you easily integrate with other C++ projects that are also CMake-based, and easily incorporate the dependency trees that you need. However, all of these require compiling the code every time you install the package. There are additional methods that let you download pre-compiled binaries and use those, so that you do not have to compile every time the package is installed. One method is node-pre-gyp, and another one is prebuild.
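As a concrete reference point, node-gyp is driven by a binding.gyp file at the package root. The sketch below is illustrative only: the target name and source path are made up, and the include_dirs line applies only if you use the node-addon-api C++ wrapper.

```
{
  "targets": [
    {
      "target_name": "my_addon",
      "sources": [ "src/my_addon.cc" ],
      "include_dirs": [
        "<!@(node -p \"require('node-addon-api').include\")"
      ],
      "defines": [ "NAPI_DISABLE_CPP_EXCEPTIONS" ]
    }
  ]
}
```

With a file like this in place, npm install (or node-gyp rebuild) compiles the add-on into build/Release/my_addon.node.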
This one integrates easily with Amazon S3. With these two solutions, when you compile your code as the package maintainer, you can upload the built binaries to a repository, either GitHub or S3 depending on which solution you use, and the installer will download those directly instead of having to compile. Node APIs can only be called from the Node thread, so there exist multiple APIs to facilitate communication between your native code and the Node thread, and multiple different ways to perform asynchronous programming with Node. We have the async worker, which creates a single worker thread that executes your logic; when your logic is completed, it calls back to the Node thread with the result, and in this callback you're able to transform your native C++ data into Node data. On top of the async worker, we have some node-addon-api-specific extensions, the async progress worker and the async progress queue worker; what these allow you to do is report progress updates from your native thread to the Node thread. Then finally we have thread-safe functions. These are full-featured APIs that allow you to create as many threads as you want and call into JavaScript as you find necessary. One crucial difference between the async workers, progress or otherwise, and thread-safe functions is that with async workers the threading implementation is not up to you: you're relying on Node.js to create the threads. With thread-safe functions, Node.js creates only the communication mechanism, and it is up to you to bring your own threading implementation. Back to you, Kevin. All right. The first example we'll talk about is async workers.
The async worker is implemented in both Node-API, which is the C-based ABI-stable API that we have, and node-addon-api, which is the C++ wrapper. As Gabriel mentioned, the async worker creates the worker thread internally, which may be a libuv worker thread, but you as the programmer have no control over what that worker thread is; it is a thread created by the underlying system. When you create this async worker, you provide two callbacks, the execute and the complete callback. The execute callback is what you use to run your native C++ logic outside of the Node thread. When that finishes, the complete callback is called with your C++ result, and there you can transform your native C++ data into Node data. In addition to the normal async worker, like I mentioned, we have the progress worker. This has an additional callback, the on-progress callback, which also executes on the Node thread and allows you to report from your C++ thread into the Node thread. The final variant we have is the async progress queue worker, which guarantees a progress update is delivered every time you send one. Next, we'll talk about thread-safe functions. These are feature-rich APIs that give you full control over how you would like to use Node in any of your multi-threaded applications. Thread-safe functions are backed by a message queue; when your native C++ code wants to communicate with the Node thread, it posts a request on this message queue saying, I would like to execute some code. There are two different ways to place a request on this message queue. There's a blocking call, which will wait for the queue to have space for your request, and there's a non-blocking call, which, if the queue is full, returns a special status saying the queue is full.
With thread-safe functions, at construction you specify some queue configuration. First you have your initial thread count. You also specify your maximum queue size; this goes back to the blocking and non-blocking calls. If you specify an unlimited queue size, then a call will never block, because there is always space to put something in the queue. But if you have, say, a queue size of 100 and the queue is full and you try to make a post, a blocking call will block and wait until there's space, while a non-blocking call will just report that it's full. And you have a couple of callbacks: a JavaScript function, a call-JS callback, and a finalizer callback. The initial thread count, like I said, you provide at the beginning, and it's stored internally. Once this counter reaches zero, the thread-safe function is no longer available for use and will be cleaned up and finalized. And as Kevin pointed out, having things defined statically at the global scope is no longer such a good idea, because nowadays you can have threads running not only on the native side; you can also run JavaScript on multiple threads. Now, this is not the same as native threads, because what you're basically doing on the JavaScript side is creating a whole Node.js world every time you start a new Node.js thread. What that means is that if you require a native add-on from one thread and then require it from another thread, those are going to be two copies of the native add-on. But having variables at the global scope is nevertheless not thread-safe, because you don't load multiple copies of the memory area where these global variables are stored: native add-ons are distributed as shared libraries, and a shared library is not loaded from disk multiple times, it's only loaded once. So what you consider to be a global static variable exists only once in actual memory.
So if you start accessing the same global static variables from multiple threads, you end up shooting yourself in the foot and sooner or later segfaulting. And yes, C++ nowadays has thread_local as a storage-class specifier. However, that is also not recommended, because a JavaScript thread may or may not coincide with a native thread. It is possible to create a JavaScript world without creating a new thread, in which case a native add-on will be loaded into that JavaScript world, but really it's two worlds running in sequence, back and forth, time-sliced and cooperatively multitasking on the same thread. So the best solution is to use self-contained add-ons, where it is guaranteed that multiple instances can be launched. And the cleanup is no longer process cleanup; it's cleanup of this JavaScript world, which may be the process, it may be a thread, or it may be one of multiple JavaScript worlds, which we call environments, running on the same thread. Since you don't know which it is, it's best to use these APIs to store whatever global data you need. Accordingly, we made the environment per add-on instance; it is no longer shared among add-on instances, and this gives us a good hook for attaching native data. May I have the next slide, please? So we have several tools that support this. On the C side, we have NAPI_MODULE_INIT, which is basically a macro that plays the role of a function header, and it gives you the formal parameters env and exports, where env is a Node-API environment and exports is a value, so it's the usual arrangement. Basically, when you use NAPI_MODULE_INIT, it is expected to be followed by a body that is the implementation of a function, which can assume that the variables env and exports exist, and that env is of type napi_env and exports is of type napi_value.
And so here you do the usual initialization, except that now you can be certain that this add-on can be loaded multiple times, from multiple threads and so forth. Whereas with what we had before, if you just use NAPI_MODULE without NAPI_MODULE_INIT, that may or may not be the case. Nowadays NAPI_MODULE works as well, but there was a period when it didn't, and so we introduced this macro, which ensures that it will always work. May I have the next slide, please? So where do you put your data? Just because your module now supports getting loaded multiple times, you still need a place other than a global static to put your data, right? To support that, we introduced the methods napi_get_instance_data and napi_set_instance_data, and they're very simple. They basically associate one void* pointer with an instance of napi_env; set obviously sets it, and get obviously gets it. If you set it twice, it forgets the old one, so be very careful. It's basically a matter of project management to decide who gets the privilege of calling this and setting the data for everybody, and it is then their responsibility to delegate access so that others may attach pointers, if your project structure happens to be such. Otherwise, it is at your sole discretion what you do with these two APIs. Set accepts a cleanup parameter, so the usual napi_finalize mechanism works with this data as well: it basically acts as a destructor when your add-on is being cleaned up, and that way you can align the life cycle of your data with the life cycle of your add-on. Next slide, please. So here's an example. We define a structure called AddonData, and we define a deleter for it, a finalizer. And we use NAPI_MODULE_INIT, because this guarantees that our module will be loadable multiple times and will work correctly when loaded multiple times.
Inside the body, we create an object of type AddonData and we call napi_set_instance_data, passing it this data and the deleter, the finalizer. And if you look on lines 5 through 10, that's where our function gets called from JavaScript, and we can access this global data by using napi_get_instance_data; then you can basically use it in whatever stateful way you would have used a global variable in previous versions of your add-on implementation. So, yeah, I guess that's the most basic usage of set and get instance data. Next slide, please. In node-addon-api, we have the luxury of more convenient C++ abstractions on top of this, and one of the most intuitive ways you can create self-contained add-ons and make use of napi_set_instance_data and napi_get_instance_data is by using the Addon class. The Addon class works almost exactly the same way as ObjectWrap, meaning that you take a C++ class and basically expose it to JavaScript, only in this case the life cycle of an instance of this class coincides with the life cycle of an add-on instance. So the methods that you add to this Addon class using the InstanceMethod API are basically your bindings. They are not associated with a class that you might expose; they are associated with the add-on itself. And in the next slide, which I'm now kindly asking for, you will see how this all hangs together. So this is the definition; this is a complete and functioning node-addon-api add-on. The class MyAddon is a subclass of the Addon class, and in the constructor you define all the methods that you would like your add-on to have, using DefineAddon. In this case, we define a very simple add-on which just increments a value and basically doesn't do anything else.
So we expose that method to JavaScript as the increment method, and when the method gets called, it increments a value called x. You'll notice that the Increment method on lines 8 through 10 is basically the C++ version of the method in the previous example, with the napi_get_instance_data call built in. The Increment method is an instance method of class MyAddon, not a static method, and the instance it receives was stored on the environment using napi_set_instance_data. So even though Increment is not a JavaScript object instance method, it is a C++ instance method, and so you can easily access this state without storing any global variables. In older versions of this API, int x = 0 would have been declared at the global scope, but now you don't have to do that, and the construction and destruction of these instances is handled by node-addon-api itself. You declare that you wish your add-on to be represented by class MyAddon by using a new macro called NODE_API_ADDON. We have NODE_API_MODULE, which creates an old-fashioned module where you have to use the instance data yourself; this macro additionally gives you the convenience of not having to do that anymore. The methods of the MyAddon class will retrieve the instance data from the environment for you, and so you can just treat everything as an instance variable. Next slide, please. So what else do we have besides multi-threading and context-sensitive add-ons which can be loaded multiple times? We also added several convenience methods, because they proved very popular. One of them is the Date object; obviously it's used extensively in JavaScript, and very often it gets transferred to the native side. We also added BigInt support: as it rolled out in the engine, we rolled it out into our API as well.
We added support for detaching ArrayBuffers, which basically means you promise that you will no longer use the ArrayBuffer; it turns into a sort of stub object and cannot be used anymore. Next slide, please. We also added an API to freeze and seal JavaScript objects. This is very useful for Electron. It renders a JavaScript object variously unable to be extended and so forth, and with freezing the properties cannot be changed anymore. And we also added an API to tag a JavaScript object for type checking. What this means is that once an object is tagged, you can later check what the value of the tag is and be certain that the native data you associated with that object is indeed a pointer of the correct type. The story behind this is that we realized that, using prototype manipulation, it is possible to fool a native add-on into thinking that a JavaScript object is an instance of a certain class while in reality it is an instance of a different class. What this means is that if both of these JavaScript classes were created using ObjectWrap, meaning that their methods are backed by native methods, then by prototype manipulation it is possible to retrieve a pointer that refers to a native structure of a different type and cast it to the wrong type of structure. This is especially true if you have multiple native add-ons interacting with one another, with objects from one native add-on being passed into the other and so forth. To avoid that, we are now making it possible to unambiguously tag an object with something like a UUID, which is basically stored as two consecutive 64-bit integers. You can basically paint the object with the UUID, and then later on retrieve the UUID and compare it with one that is stored in your add-on. And yes, you can store this tag in a global static way, because it's read-only: you store it once, it's const, and so you can read it from any number of threads.
You cannot write to it, so these tags are unaffected by context sensitivity. Next slide, please. All right. So these are all the modern tools that are available to you. Where do you get started? One of the best ways to get started really is to just pick something that you want to do. What do you wish Node.js had that you had when you were still doing C++? What kind of cool library that does image processing, encryption, some network protocol? You pick one of those libraries and you start writing bindings for it. You can use all the tools that we discussed, and you can look at these resources to see all the basics of writing add-ons. For example, the Node-API documentation is a great place to start. It's a C API, so if that's not to your liking, then you can jump straight to the C++ wrapper, which will result in code that essentially ends up calling the C API in certain well-known and often-used patterns. You can think of the C++ wrapper as basically a really, really fancy set of macros, so you do not sacrifice ABI stability if you use the C++ wrapper; that's probably the most important takeaway. We have a great many tutorials. We have examples of simple add-ons and simple calling patterns, and of how you get from JavaScript to native and back to JavaScript synchronously, asynchronously, and using thread-safe functions; all those examples are at the link you can see. We have a workshop that you can watch, and we even have a conversion tool which uses regular expressions to take your existing project: if you've already ported something to Node and you've already written bindings for it, however you wrote them, in NAN or directly in V8, then it will try to do a decent job of catching all the places where it can do a one-to-one mapping.
Expect many compilation errors afterwards, because obviously it's not a perfect tool, but it should give you a good start on getting to Node-API, and to ABI stability as a result. Then we have a generator, which basically creates a skeleton module for you that you can then flesh out and add bindings to; we have two different versions of this. GenAP is another option you might have. And if you're not a big fan of C and C++, we have bindings for Rust via Neon, and we also have napi-rs: you can write your add-on essentially in Rust, and use the Node-API implementation provided by Node.js via the Rust language rather than directly using C or C++. Finally, for the things Kevin talked about at the beginning, about packaging your add-on once you have it, prebuild is a great solution. One thing I might add about all of these solutions is that they all fall back to compiling the code if you happen not to have provided a pre-compiled binary for that specific platform or that specific architecture. They all try to compile the code if they fail to download a prebuilt binary. That solves a lot of your problems, and you may get fewer bug reports, because people figure out that if they have a compiler, it'll still work; it's a good fallback to have. Next slide, please. These are links to the other tools that Kevin was talking about. prebuildify is another tool that allows you to pre-package and build add-ons; node-pre-gyp likewise, with a slightly different flavor. CMake.js is not such a package; however, it is a way to avoid using node-gyp, and it integrates better. In fact, node-gyp is sort of unmaintained now: we're using it for Node, and we're maintaining it only insofar as Node.js needs it, so it's actually a good idea to migrate to a more widely used build system if you can. And we have a real-life example from MongoDB of how one specific add-on was migrated from NAN to Node-API.
You can read up on the ins and outs, gotchas, and so forth from people who have gathered real-life experience in this area; that may also help you with your own porting. Next slide, please. And finally, if you would like to contribute to our effort, you can read about our work at abi-stable-node. That's where we keep the design discussions and the milestones that we wish to achieve. Of course, if you find issues and you wish to contribute to Node-API or any other part of Node, you can just open the repo and file an issue or a PR. And if you wish to contribute to the wrappers, then know that node-addon-api is the place to go; that's where we maintain the pull requests and issues that people find while using our C++ wrappers. And of course, you're always welcome to join our meeting. It's weekly on Fridays at 8 a.m. Pacific time, and we're always happy to see new faces. So by all means, if you're interested in getting up to speed with contributing, or if you just found a typo and you feel like you need to share it with the group, come to the meeting and we'll be happy to receive you. That was it, folks. Thank you for listening. Thank you. Bye.