Hello, and welcome to Testing ECMAScript Modules in Node.js. With me, David Mark Clements. In the bottom right (mistakes are staying in, by the way), in the bottom right of the slide, there's a URL. You can use that short URL, if you don't have access to the link through some chat mechanism or otherwise, to load these slides and follow along if you so choose. So we're going to be talking about ECMAScript Modules, and actually testing ECMAScript Modules, which are relatively new. Well, they're not that new, but they're relatively new in Node. Speaking of Node, Node now has two module systems. The original is referred to as CommonJS. It's a slight misnomer. At the time when Node was going through a rapid phase of innovation and coming together, there was no module specification for JavaScript. ECMAScript, by the way, is the official name of JavaScript, if you didn't know that. But there were some experimental specifications for something that was broadly referred to as CommonJS. RequireJS in the browser was related to that. But whereas RequireJS, the front-end RequireJS, worked asynchronously, the CommonJS implementation that Node was inspired by, let's say, is synchronous. So when a dependency loads a dependency, say require('foo'), the file is synchronously loaded from the file system. And what that means is that execution blocks until that file is loaded, then evaluated, and then whatever that file exports on the module.exports object is returned from the require function. The object that gets returned is mutable, because it's really just evaluating a file, wrapping a function wrapper around it, and then seeing what that function wrapper does to the module.exports object. It's manipulating an object, essentially, and then returning that result. If a file has already been loaded, then the cache is used. So it's not going to reload files that represent a dependency over and over again. They come from the require.cache object.
So CommonJS is the original module system of Node, and it can only load other CommonJS modules at initialization time. You can use the dynamic import function to asynchronously load an ESM module from a CJS module, but only after initialization. So it can't be part of that initial dependency tree; it has to be loaded later and then used. ECMAScript modules are an official specification for the module system in Node or in the browser. I don't think it's unfair to say that the specification was primarily browser-focused. What it's intended for is kind of a replacement, or let's say an analog, of the script tag in the browser. If you think of a script tag, it's HTML that gets parsed, and then whatever source you specify is then loaded by the browser. And the browser can choose, to a certain extent, how it wants to do that. The ECMAScript modules' import syntax is supposed to be, and is, statically analyzable in a similar way to script tags in the browser. Let's make sure that we differentiate here, because if you use TypeScript or if you use Babel, you might think that you already use ESM modules, because an ESM module essentially sort of looks like this, and you do see that in code all around the place. But it's not the same as native ESM modules. I refer to these as faux ESM, because, if we just focus on Node for a second, if you're using TypeScript or Babel, that syntax is essentially being transpiled, or compiled in the case of TypeScript I suppose, down to CommonJS syntax. So when you're using faux ESM, you're really using CommonJS. For the most part; I'm sure things will evolve slowly over time. But when I'm talking about ECMAScript modules here, I'm talking about the natively implemented specification that is present in some browsers (I think Chrome has support, probably some others as well; I've not paid too much attention), and in Node they're in version 12 and version 14 and version 16.
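The CJS-to-ESM escape hatch mentioned above looks like this in practice. A core module is imported here just so the sketch is self-contained; the same shape works for loading an .mjs file:

```javascript
'use strict';
// Sketch: a CommonJS file reaching other modules lazily via dynamic
// import(). This happens after initialization, so it can never be part
// of the synchronously-built initial dependency tree.
async function main() {
  const os = await import('os');   // resolves to a module namespace object
  console.log(typeof os.platform); // function
  return os.platform();
}
const done = main();
```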
Something that is a very large differentiation between CommonJS modules and native ECMAScript modules is that ESM loads asynchronously. So the file, for instance, could be loaded asynchronously from the file system. The thread wouldn't necessarily then be blocked, and you could potentially load multiple files at once. There's advantages to that. Along with it, in Node 14 and Node 16, but not in Node 12, there's top-level await, so in a Node 14 or Node 16 ESM module, you can do await foo. And because the modules are loaded asynchronously, this means that you can do some asynchronous initialization in your module if you need to. Unlike CommonJS, which is essentially just returning an object that you create, the module namespaces, as and when you import them, are immutable. They're read-only, so they can't be manipulated. Kind of an advantage, I think, over CJS. Then there's the invisible function wrapper that you have with CommonJS. In a CommonJS situation, you say const foo = require('foo'), but what Node actually does behind the scenes is wrap the file like this in a function, and the arguments are exports, require, module, __filename and __dirname. Essentially that's how a Node CommonJS module works. It's wrapped, and then the arguments are passed in from the Node initialization machinery. And so when you modify module.exports, for instance, that is the thing that's then taken and returned. So that's kind of how CommonJS works, but ESM is an entirely new specified module scope. It behaves differently; it behaves more as if, with a function, you had the use strict pragma.
There are some minor differences, but essentially it will be in use strict mode, which means that you can't use things like with, and eval works a little differently, and other things along those lines. The import statements are statically parsed and then loaded based on that parse. There is also a dynamic import function, which is why top-level await can be quite useful: if I, for whatever reason, needed to do a dynamic import, I can say await import. The exporting in ESM is syntax-based instead of modifying an object. So export const meow = 'cat', for instance, or you also have the export default, say export default function. So you have this slightly different thing with the default export, which is its own special thing, which would sort of be the equivalent of what module.exports is, but then you can do named exports as well, which is kind of similar to how you would have an object with properties on it, but it's codified slightly differently. Very crucial here: CommonJS has require.cache, which holds a cache of all of the loaded modules and which you can sort of mess around with a little bit. ESM does not expose a module cache, so you can't inject anything into it, you can't read anything from it; it's black-boxed. ESM modules can load both CommonJS and ECMAScript modules. So in terms of the ecosystem, the ecosystem is very CommonJS-heavy, and CommonJS can't load ESM, so that is definitely a friction point. So before you move to ESM, like if you were to publish a module for consumption, do consider that a lot of deployed projects are still just written in CommonJS. One other thing that's not on this slide is that there's no way for Node to know, just by parsing the content, whether a module dependency file is CommonJS or ESM.
It has to go either by the file extension (if it's .js or .cjs, then by default it's CommonJS; if it's .mjs, it's ESM) or by a type field in the package.json that sets the type to module, in which case .js files will be interpreted as ECMAScript modules. It's complicated, no doubt about that. So I wanted to see if I could build some stuff with ESM, see where the pain points were, and the first problem I hit was that testing it is very difficult at the minute, primarily because if you can't manipulate the cache, then you can't override dependencies, which means you can't mock dependencies. So I went about finding a different way to do that for ESM. But before we talk about that, I wanted to kind of contextualize my approach, why I've taken the approach that I've taken and the things that I feel are important with the testing strategy. So to be clear, we're talking about unit testing here. For me, and we can get into semantics, but for me you have to think about what the unit is; particularly in other languages, the unit can actually be a file in the lib folder or something like that. For me, the unit is the API boundaries of the thing that you're building. If you're building a large monolith, it's something different. So if you're building a small module that's to be consumed, then my perspective is that you should write tests that test the module from the outside and not try and test any of the module internals. Just make sure you hit those internals in your test coverage. And if you can't find a way to hit those internals, any internal logic, then remove that logic, because there's not a way for it to be reached. A microservice, I think, should be defined as the unit, in which case you test the edges of that service. So if it's HTTP-based, say RESTful, then you hit endpoints of that service. Some people at that point might be saying, well, those are integration tests. Call it whatever you want.
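A minimal sketch of the package.json signal described above, for a hypothetical package. With the type field set to module, .js files in the package are treated as ESM (and .cjs files stay CommonJS); without it, .js files are CommonJS and only .mjs files are ESM:

```json
{
  "name": "my-esm-package",
  "version": "1.0.0",
  "type": "module",
  "main": "./index.js"
}
```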
This is the testing strategy that I use for whatever I'm building. If I'm building a front-end application and I'm dealing with, say, a React component, then in that case the React component is actually the unit. So it kind of depends on what you're working on and the methodology. If it's a monolith like a React application, or a small library, or a small service, those are different things. So I don't want to test internals. I want to test the edges via the exposed interfaces. I don't want to mock any internals, because you just end up in a situation where, if you start mocking internals, a lot of the time it's just a very elaborate way to test whether true is true. And those aren't really tests. But dependencies, I'm all for mocking. There's always exceptions, but if you want a dependency to behave in a certain way and you don't want to figure out how to make that happen with an integration environment and databases and setting up everything, all of which is unreliable once you start working across the network and makes your tests inherently unreliable, then make the library that you're using that integrates with those things behave how you want it to behave for that test scenario by mocking it. This is black box testing. And as I say, name it as you wish. Let's talk about test libraries and test frameworks, or what I'm calling test libraries and test frameworks. There's really two kinds. There's ones where you have implicit globals, like Mocha and Jest: Mocha has describe and it functions and some other stuff; Jest has an implicit test function. Those are what we call frameworks. But also what I would call frameworks are test code that needs a test runner, e.g. a separate CLI tool that runs that test code. And the reason it needs that runner is because you have things like these implicit globals and possibly other implicit behavior. That's a test framework as I define it. A test library is something that can run independently.
It doesn't have implicit globals. You have to import or require any of the API surface of that test library that you use. And it should be able to be executed directly with Node, although a test runner isn't out of the question. You can still have a test runner for it, but it's not required. It can be used directly with Node or it can be used with a test runner. So examples of test libraries would be tap and tape. I prefer test libraries over test frameworks personally, because if you don't have implicit globals, you don't have the magic, and I think less magic is a good thing. Any time you have things that are implicit, there's linters and other things that do static analysis, and syntax highlighting and editors and all of that kind of stuff, that you increase the burden on of knowing these things. It just, for me, places an unnecessary burden on other parts of the ecosystem. The other thing is when you're trying to debug. Like, if you're writing tests, it's because you're trying to apply a rigor to your code, and sometimes that can expose problems that you come across while you're writing your tests. So being able to debug is quite important. Debugging through the test runners of test frameworks and the implicit logic and the wrapping that they do, it's not the end of the world, but it does increase the complexity of the debugging exercise. If you can run things directly with Node, you know exactly what flags are applied, you know exactly what code is being executed from initialization onwards. For me it simplifies that process. The way that I write tests is I break it up by file, oftentimes. So a test directory may have five, six, seven, eight, however many files, and then running npm test will execute all of the tests in all of those files. But if one test in one file breaks, I don't want to run all of the tests again.
So if you have a test runner, what you can do is you can install it globally and then you can use that to run your tests, and that's fine. You can use that to run just one file, and granted, you can do that. But when you're switching Node versions, now you've got a new problem. You've got to reinstall your test runner, or you've got to import it over from where it's installed elsewhere. If you're using nvm, for instance; I mean, it gets complicated no matter what. If you can just run it directly with Node, it's fine; you just run it directly with Node. So you can just isolate that one file with the failing test and run it straight. It's good to isolate your problem in tests, is what I'm saying, and it's good to do that in the least noisy and lowest-overhead way. Okay, cool. So let's talk about mocking dependencies. So in CommonJS, no matter what library you use, there's manipulation of require.cache going on. proxyquire is a very popular library for mocking things. I think Jest has its own mocking thing as part of it. But generally the concept is that you supply stubs, what I'm calling the mock subject here. And you specify essentially the requireable namespace, whether that's a native module like fs, or an ecosystem module like open, or a path to a specific file in your project. You set them up and you supply the things that you actually want to override, and you implement the behavior that you want. And then you just pass the file that you actually want to test, say the entry point for your module, or the path to your React component, I suppose, if you're doing sort of server-side testing of React components. And then that will be loaded. But as it's loaded, the dependencies that it's looking for will use the mocks that you've defined, if there's dependencies that correspond with mocks that you've defined, right? And it does that by basically overriding require.cache, because require.cache is checked by require before it's going to load a file.
So say your path-to-file-being-tested.js file loads the ./path/to/file.js file as a dependency. If you override that namespace in require.cache, the require function is just going to return whatever's in require.cache. So, as I mentioned, the ESM loading algorithm does not expose any equivalent to require.cache. It's also more complicated, the way that the dependency trees are loaded, so require.cache wouldn't even work; it would have to be something more substantial that gets exposed. I still think we could really do with something like that in Node, but I think we're some way off from that, and I'm not sure if it will actually happen. Just to be clear, we're not talking about ESM-like code that transpiles down to CJS. We're talking about native ESM, not transpiled code. So there is a way, and it's with the loaders API. If I click this link, we can see the description of loaders, but we can also see that the stability is experimental. While ECMAScript modules themselves are not experimental anymore, the loader API is still experimental. So caveat emptor, as it were: if you want to use ECMAScript modules and you want to test those ECMAScript modules, then you have to use this experimental loader API if you're into mocking dependencies, basically. So an ESM loader is specified by passing a --loader flag to the node binary. You can only specify one loader. So if you have two reasons that you need loaders, now you've got to figure out a way to solve both of those reasons with one loader. This is why the API is experimental. I'm hoping that there will be efforts to allow for multiple loaders in future. It's likely to change as well. Whatever I'm discussing here could break soon or it could last. We'll see. That's the nature of experimental APIs.
This is the only way that, well, it's the only way that I can think of, and after talking to various people involved in the Node project, I'm pretty sure it's the only way you can modify ESM module loading. So what we can do then is use a loader to modify the loading algorithm of the ESM module system and then somehow have some sort of API that allows us to provide mocks for our tests. But that means that if you want to run your test directly with Node, every time you do that you're going to have to specify the loader flag, which, you know, is not ideal. I know, who cares about typing a few more things, but it's really about the cognitive overhead and having to remember to type out this fairly lengthy thing every time: --loader equals whatever the thing is I'm loading. So what you could do is you could wrap a CLI around Node that then uses child process to spawn with that loader flag preset. But now you're back to test runners, and I'm trying to avoid this scenario where I have a test framework. I personally want to keep this idea of having a test library. There's also worker threads. Worker threads is another fairly new Node API where you can spawn another instance of a JavaScript environment with Node around it. It is slightly different to the main thread of Node, but it's pretty much good enough; we'll talk about the discrepancies later. And when you spawn a worker thread, which again, it's a thread, it's not a process, you can still set the loader flag when you create the worker thread. Bear that in mind; we're going to come back to it. So with all of that information that I've laid before you, now I'm going to talk about my attempts, and, you know, they work, to have some way of mocking ESM dependencies. So the first one is called Lazaretto. Lazaretto was built for a slightly different purpose, but it was part of my personal journey into figuring out, you know, how to make this work.
Side topic: I'm the technical lead and primary author of the OpenJS certifications, one the Node.js Application Developer and the other the Node.js Services Developer. I'm also the author of the training courses for the certification. And also, if you look at point two C there, there's a free course that you can take. You can also get certified on that free course, but you have to pay for that. But you can take the free course to get acquainted with Node.js in general. And then there's two other courses that talk about specializing in services development, which is, you know, building Node services and HTTP and security, and Node application development, which is pretty much everything else. So I've told you all of that because I had to build Lazaretto when we upgraded the examination environment, I think it was last October, so that we could support ECMAScript modules. If candidates now want to use ECMAScript modules in their answers, they can. Because ECMAScript modules were made official and are no longer experimental, we need to be able to support them. So because the exams are actually automatically graded, I needed to be able to load a candidate's answer, which may be CJS or may be ESM, and then sometimes mock dependencies. So that was the primary purpose of Lazaretto. So the way it works is it accepts a mock subject similar to the one that proxyquire takes, which I've gone through. You pass it an entry point, and then it loads a worker thread with the loader flag. Because it runs in a new thread, what it initially does is it has to take the functions on the mock object (the mock object is actually slightly different to the proxyquire mock object, because each mocked namespace is assigned to a function, and that function returns the actual mock). Those functions get stringified, basically, and included in the worker thread and evaluated as part of that.
This isn't ideal, because it means you lose access to the closure scope. But what this is good for is that, because it's running in a worker thread and because you can pass messages between the main thread and the worker thread, it allowed me to do some white-box sort of testing strategy as well. So within the closure scope of the candidate's executed code, I can also run expressions and check for things. Let's take a quick look at the API of that, because that's a lot to take in. So say we have a mock object like this. We have a function, and that function can be an async function, and then it returns the actual mock. We get a sandbox by awaiting Lazaretto, passing in our stuff. The sandbox can actually call and evaluate code within the worker thread context as well. So take a look at that. It's interesting. I wouldn't recommend using this other than for what I use it for right now, or ever, because it is very specific to the use case, but it's part of that work of getting to that place. So Lazaretto, as I built it, can mock both CJS and ESM modules, because someone could be implementing using ESM but then consuming CJS modules. It can be directly executed with node; no test runner required. There's a context object that's used to pass state between the main thread and the worker thread, and that's a way I can receive state back that I can assert against in the main thread, but it's limited by the cloning algorithm that the worker thread API relies on, which, yeah, there's things that just won't clone. Yeah, the white-box testing, cool. Okay, so that was the first pass. Then I started building a CLI tool, and I wanted to try and build it in ESM, and so I wrote moccalicious to do that, and I wanted this to work differently. I didn't want to have code that gets evaluated, and then inject mocks that have stringified functions that then get all concatenated together and run in a really esoteric way.
I wanted to see if I could improve on the approach. So unlike Lazaretto, which is for a very specific use case, moccalicious is more for a generic use case, just the general unit testing approach that I've described already. So it doesn't need to evaluate code in a separate thread to the actual test code. It can still use a worker thread in certain scenarios, which I'll explain in a second, but the actual tests that are running also run in the worker thread, which solves the how-do-we-get-a-mocked-thing-from-one-thread-into-the-other problem. It can also be used for both ESM and CJS from an ESM or CJS parent. Lazaretto tends to be fit for purpose for what it's doing, but when you get into sort of more in-depth implementations, you can get into situations where the dependencies sort of fork out with ESM. Long story short, moccalicious also had to introduce cache busting, to make sure that a file that's already been loaded doesn't block our mocks. If the file is loaded before our mock is inserted, there's no way to mock it anymore, so moccalicious solves for that edge case as well. By cache busting, I mean: if you've ever worked in the browser and you couldn't get a page to load the latest version of a script, you'd stick a question mark on for the query string and you'd put some random hash on there or something, and then you were always getting that latest script. We essentially have to use the same approach in moccalicious, because it's dealing with file URLs rather than just paths, and because they're file URLs we can jam that query string on the end. There's a lot of weird stuff in all of this. So this is what using moccalicious would look like, which is fairly similar to the proxyquire thing. We just pass in the entry point file, that's the path to the file being tested, and then we pass it a mock subject, and yeah, pretty much the same. proxyquire has a sort of fall-through mechanism by default where it uses the underlying module.
moccalicious doesn't have that fall-through mechanism currently, so you have to kind of do that part yourself, but it's geared towards mocking ESM as well. So this default property maps to the default export. If you don't provide that in an object and you just have, say, a function, like in this case, then that becomes the default export. It doesn't violate any expectations, basically; it just does what you think it's going to do. So moccalicious has an autoload mode and a manual load mode. The first time I said that out loud I didn't realize what a sort of weird thing that is to say. So if we go with the manual load mode, you can just specify the loader flag, so --loader=moccalicious/loader.mjs and then whatever it is that you're running, so maybe test.js would be better than app.js there. If the loader flag is omitted, though, what moccalicious does is it starts a worker thread automatically with the loader flag preset on that worker thread, and then, crucially, it blocks the main thread. So if you initialize moccalicious at the top of some tests, before you run any of the tests, the main thread will just halt right there if you haven't used the loader flag, and then all of the tests will be reloaded in the worker thread. The worker thread will then have the loader flag set in such a way that the loader can replace dependencies with the mock specified in each test, and that just happens in some sort of piece of global state. But the worker thread part is really just about making that loader automatically present, and again, that's about just being able to run a file with node. I acknowledge it's a ridiculous amount of effort to just run a file with node, but it's just a really important thing to me for some reason. So just to sort of talk through that: the way that it works is we set up a loader, call moccalicious and pass in import.meta.url, which is basically the same as __filename in CJS.
It's just referring to itself. So if our test file was, like, ./test/index.test.js, then import.meta.url would be the full absolute file URL of that file being executed. So that tells moccalicious: load this file. If this code here was executed without the loader flag, moccalicious detects there's no loader flag set, and then it will execute the same file in a worker thread with the loader flag set, and it will block at this point in the main thread. Then in the worker thread it will run this test, output the results, and so on and so forth, and then when the worker thread exits, this will force an exit from the process, so this test will never run in the main thread. And that was one of the key parts of moccalicious. However, some things don't work in worker threads. It was good enough for me in my use cases, and I knew how to dance around some of these pieces, but yeah, worker threads don't support process.chdir, for instance. So if you do a process.chdir in your test, it's going to break in the worker thread, and it will break explicitly and give you advice on what to do, but yeah, there's some other stuff like that. Now, moccalicious also does a bunch of stuff where it sort of patches some of these things. For instance, well, I'll show you: if we're in the moccalicious repo, in worker threads there's these methods on process.stdout that aren't present, so it recreates them. There's other stuff that it recreates as well, all around this area. It recreates process.stdin as well, in a way that's compatible with our implementation. Let's not get too much into the weeds on that, though. So if you are doing any of those things that cause issues, then you have to just go back to using the loader flag, but it will work. So the autoload mode is tentative at best, let's say, but if you have to specify the loader flag, it's kind of a... I don't know, I don't like it. I don't like having to do that.
So then I wrote a thing called tapx, which is just a thin wrapper around tap (node-tap), and it basically makes developing with ESM, or writing tests for ESM I should say, easy. It pulls in moccalicious itself and exposes it from tapx, and the test runner for tap is also wrapped by tapx in a way where it just adds the loader flag for you. So you can just run it over your tests and it will work as a runner. This is like a scenario I've created where it's 90% good enough, and if I get myself into a situation where I can't run the thing with node anymore, then I either change that situation or I just go, right, I'll just run it with the tap runner. The other thing is that ESM modules (and this is another thing if you start using ESM and you have to test them) are currently incompatible with nyc, which is a thing built on top of Istanbul. You might be familiar with Istanbul, which is the code coverage reporter. I'm sure they'll fix that at some point, but luckily V8 now has its own code coverage instrumentation in the actual JavaScript engine. V8 is the JavaScript engine that Node uses. There's a coverage reporter called c8 that can create reports based on V8 instrumentation output. So tapx also replaces that coverage reporter, so you can do proper coverage reporting with ESM. It doesn't work in Node 12, though, because the coverage stuff is too buggy there. So that leads me to this question: is ESM ready? I don't think so. You've seen how much I've had to do just to make the testing work, and yes, okay, maybe I had a couple of pretty strict red lines, but those things are important to my workflow and I stand by the effort. But it's still very rough, and as I say, the methods I've used to try and make it work are good enough for me; are they good enough for everyone? I don't know.
I think we're about 80% there with moccalicious, with another 20% to go, but it's going to take, like, you know, 80% more effort on top to get that final 20% piece, to make moccalicious actually work for everyone easily. Well, at least in terms of running the file directly with Node. The other problem is, I talked about debuggability of running a test file directly with Node. Totally true, but with moccalicious, because it's using the worker thread approach, and it's not very easy to debug worker threads right now, we've got to wait for that to catch up as well. Loader APIs are experimental, so, you know, all of this work to get the testing to work could also be wasted, or it may need some serious updating, or maybe it'll be okay. But until that's stable, you're really limited on the use cases. I mean, putting some ESM code into production without having any way to instrument it other than via an experimental API means that things like APMs (New Relic and so forth) are going to struggle to support that, because the primitives aren't there to support it, and the ones that are there are experimental. So there's a lot more that needs to happen yet, I think. Current tooling, that is code editors, linters, test frameworks, build tooling: there's this whole issue of faux ESM, and a lot of that tooling is based around the way that transpiled ECMAScript modules work and not around how native ESM works. So that all has to catch up, and also, how do you differentiate? ESM modules also break compatibility with CJS, because CJS can't load ESM in the init phase, and that's already causing and will cause a lot more ecosystem friction.
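One mitigation for that CJS/ESM friction is Node's conditional exports, which let a package ship both formats and have Node pick the right one per consumer. A minimal sketch of a dual package.json (the file paths are hypothetical):

```json
{
  "name": "my-dual-package",
  "version": "1.0.0",
  "main": "./index.cjs",
  "exports": {
    ".": {
      "import": "./index.mjs",
      "require": "./index.cjs"
    }
  }
}
```

With this shape, `import 'my-dual-package'` resolves to index.mjs while `require('my-dual-package')` resolves to index.cjs, and the top-level main field keeps older Node versions working.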
There's a thing called conditional exports; you can find this in the Node docs, and I would say look that up, because you can use it to provide support for both ESM and CJS. So I would say, if you're going to create modules that you want people to be able to consume, then either make them CJS or make them both CJS and ESM, and check out conditional exports in the docs on how to do that. So my conclusion is, I don't think that ESM is ready for most cases. The one case I've used it for is building a CLI, a little CLI tool, because I'm not so worried about application performance monitoring there, and I solved the testing part. In terms of when ESM will be ready, the timeline is unclear to me, but I would say at the very least we need to have the loaders API stable, and then you need to have everyone's implementations on top of that (by everyone I mean the APMs, mostly). Okay, here's some things we discussed and some links. I hope this is valuable to you; I know it was quite intense in places. I'll be in the chat, whenever I'm going to be in the chat, to help answer questions and stuff. I would love to have gone deeper, but I'm really just trying to give you a broad overview, such as it is. Thanks very much for watching. Follow me on Twitter. Thanks.