Hello, my name is Benjamin Coe, and I'm going to be giving a talk today entitled "Shiver Me Timbers: Migrating yargs to ECMAScript Modules." This talk is essentially a discussion of what the process was like migrating yargs to a form that works for both ECMAScript modules and CommonJS modules.

But first, a little bit about me. As I said, my name is Benjamin Coe. I'm an engineering manager at Google, where I lead a team focused on automation, mainly in support of GCP client libraries. I was also the third employee at npm, Inc., and this is where I became fairly involved in the Node.js open source community. In the process of doing so, I began maintaining the library yargs, which we're going to talk about a little today. I've also collaborated on various other open source projects, such as Node.js and the V8 JavaScript engine.

I will be outlining an approach that allows you to migrate your library to ECMAScript modules in such a way that you don't break your existing user base of CommonJS dependents. This is an approach that works well if you have a library like yargs with a large number of dependents, and it might not be the best approach if you're writing a brand new library today.

Given that this talk is from the perspective of migrating yargs to ECMAScript modules, I just wanted to make sure that, if you'd never heard of yargs before, you have a passing understanding of what it does. Basically, yargs is a library that takes the string arguments passed into a command line program and parses them in such a way that they're more easily consumed by the person writing a command line application. So rather than getting these as the raw strings passed in argv, you get them as a JavaScript object that you can grab properties off of. Here's an example of the parsed object you get for a really simple command line program where you're setting the property hello equal to world.
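To make that parsed-object example concrete, here's a toy sketch of the kind of transformation yargs performs. This is not yargs itself, just a minimal stand-in illustrating how raw argv strings become an object:

```javascript
// Toy sketch of argv parsing (not the real yargs library):
// turn ['--hello', 'world'] into { _: [], hello: 'world' }.
function toyParse(argv) {
  const parsed = { _: [] }; // yargs collects positionals under "_"
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg.startsWith('--')) {
      const key = arg.slice(2);
      const next = argv[i + 1];
      if (next !== undefined && !next.startsWith('--')) {
        parsed[key] = next; // flag with a value: --hello world
        i++; // skip the consumed value
      } else {
        parsed[key] = true; // bare flag: --verbose
      }
    } else {
      parsed._.push(arg); // positional argument
    }
  }
  return parsed;
}

console.log(toyParse(['--hello', 'world']));
// { _: [], hello: 'world' }
```

The real library handles far more (aliases, defaults, coercion, commands), but the shape of the output is the same: properties you can grab directly off a JavaScript object.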
Once you've defined the properties that your command line program accepts, yargs will also provide a pretty help output for you. Here's an example from the test runner Mocha, one of yargs' many dependents.

Now that you hopefully have an understanding of what yargs is, the majority of this talk is going to discuss our motivation for migrating yargs to ECMAScript modules and what the experience was like being an early adopter of ECMAScript modules within the Node.js ecosystem. So let's look through the agenda. I'm going to begin by talking a little bit about the history of module systems in JavaScript, which is actually a fairly fascinating topic and has been many years in the making. We're then going to discuss the benefits of ESM, or ECMAScript modules, and why we should be excited about this shift in the community. Assuming we've sold you, we're going to speak to some of the options available to library authors for supporting ECMAScript modules, and then to the specific approach taken by yargs to support both ECMAScript modules and CommonJS modules. Having shown you the approach we took, I'm going to show off some of the neat benefits we get in our library as a result of the migration. I'll conclude by discussing both the challenges we faced and the benefits we gained from this migration, and I'll share the links and a bit of a bibliography from all the research we did for this talk.

So what is a module, exactly? At its essence, a module is just a reusable snippet of code that may or may not be published to a registry such as npm. Other ways you might reuse a module would be between files in the same code base, or perhaps between libraries in a monorepo. What is important, if you wish to share a module with other folks in a community like npm, is that you have an agreed-upon way to export and import code from that module.
Most of the folks attending this talk are probably familiar with syntax that looks something like this: the use of a require statement to pull in either the default export from a library or, in the case of yargs/helpers, to pull in a submodule from within that library. This is an example of CommonJS.

CommonJS was one of the first module systems proposed in the JavaScript community. It began its life in 2009, initially called ServerJS, when it was proposed by then-Mozilla engineer Kevin Dangoor. This was essentially an effort to get the community to rally behind a set way to define their exports and a syntax for how requiring would work for fetching in those exports. The approach gained some popularity, and to indicate that it was applicable both to the web and to server-side JavaScript, it was renamed from ServerJS to CommonJS midway through the year. Later that same year, Ryan Dahl presented Node.js at the JSConf EU conference. It was presented as server-side JavaScript built on V8, using the CommonJS module system. As we know, Node.js became quite popular and, along with it, the CommonJS module system.

Despite its emerging popularity on platforms like Node.js, CommonJS was not adopted by web browsers. One reason for this is that CommonJS assumed modules would be loaded one after another, synchronously. This isn't a problem when you're loading from disk, because that's very fast, but on a website, where you're trying to load multiple scripts from an HTTP endpoint, it can become prohibitively slow not to load multiple scripts at the same time. In an effort to address this problem, Kris Zyp proposed the Asynchronous Module Definition spec, or AMD, in September of 2010, the idea being that this would be an extension to CommonJS that addressed the problem of loading multiple scripts in parallel.
Ultimately, this was not adopted by CommonJS, but AMD in a standalone form was adopted by popular frameworks such as the Dojo Toolkit and RequireJS. Despite solving the problem of loading scripts asynchronously, one of the problems facing web browser adoption of CommonJS, ultimately web browsers did not adopt AMD either.

In parallel with the development of CommonJS and the proposal of AMD, in September of 2009, Ihab Awad and Kris Kowal presented an early draft of the concept of ECMAScript modules. This was the idea of getting modules built into the JavaScript language itself. Ihab worked on the Caja team, a team at Google concerned with sandboxing JavaScript so that untrusted JavaScript could run in a web page. So this initial pitch for ECMAScript modules didn't just address issues like asynchronous loading; it addressed the topic of how you would use modules in a secure manner on a website. This proposal evolved, and by February of 2012 the V8 JavaScript engine, the engine used by Chromium, had adopted some of the initial proposals for ECMAScript modules. By July 2017, Node.js had shipped its own experimental ECMAScript module implementation, but this was behind the --experimental-modules flag, so your users couldn't use it without setting a flag.

One of the reasons ECMAScript modules are so topical today, and why this is an important talk, is that it wasn't until May 2021, so quite recently, that ECMAScript modules became available unflagged on all supported versions of Node.js. So for the first time in the history of module systems in JavaScript, with ECMAScript modules, module authors can now write code that will work both on Node.js and on all common browsers such as Chromium, Firefox, and Safari, along with newer platforms like Deno, which have themselves adopted ECMAScript modules. So as a library author, what are some of the benefits of adopting ECMAScript modules?
As we've already discussed a little, ECMAScript modules perform asynchronous loading when you load a dependency. Here's an example of an ESM module loading in the web browser. If you look at the timeline, you'll notice that many of the dependencies load in parallel, making the overall load time much shorter.

The next benefit I think is worth calling out is that, if you're using Node.js, ESM modules support top-level await. In the past, to write code like this, where we have an await at the top level (you can see it awaiting yargs there), we would have had to wrap it in a function and then invoke that function. Having top-level await is much more elegant.

The fact that ECMAScript modules are a standard, developed by the same standardization body that standardizes the JavaScript language itself, is a huge benefit. It guarantees that code we write in one place can work in another place, as long as both are standards-compliant. Here's an example from The Simpsons of one of the pitfalls of not standardizing things, in which Patty attempts to plug a razor into an outlet that has not followed a standard. With the standard in place, we can start to write JavaScript that we know will run on multiple platforms: JavaScript that will work on Node.js, Deno, and all modern browsers with minimal changes to the underlying code.

At this point you're hopefully saying, "All right, I'm sold. I'd love to start using ECMAScript modules inside my projects. But how do I do so?" In this part of the talk, we're going to outline the two main paths one can go down to adopt ESM inside their libraries. The easiest approach you can take to start adopting pure ESM is quite simply to switch to using ECMAScript module syntax inside your libraries.
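The top-level await contrast described above can be sketched as follows; the resolved promise here is just a stand-in for whatever asynchronous work (such as awaiting a yargs parse) your program does:

```javascript
// Before top-level await (and still required in CommonJS): await is
// only legal inside an async function, so the common workaround was
// to wrap your program in an async IIFE and invoke it immediately.
(async () => {
  const argv = await Promise.resolve({ hello: 'world' }); // stand-in for async work
  console.log(argv.hello); // world
})();

// With ESM (e.g. in a .mjs file), the wrapper disappears and you can
// await at the top level of the module:
//
//   const argv = await Promise.resolve({ hello: 'world' });
//   console.log(argv.hello);
```

The ESM version is shown in a comment because top-level await is a syntax error outside of a module context.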
Just switch to a .mjs extension rather than the .js extension on your library files, or alternatively set "type": "module" in your package.json (or, in the case of the web, type="module" on your script include), and that's it. You begin writing import and export statements using the new syntax, and you'll have code that pretty much works in modern Node.js versions, on the web, and on platforms like Deno. There are some caveats, however. This will only work in newer versions of Node.js, so versions newer than 12.x, and it forces the hand of your dependents: if they want to adopt the newest version of your library, they'll also need to use ESM, or they'll have to use the somewhat clunky dynamic import syntax.

Another option available to library authors, rather than writing pure ESM, is to write a dual ESM/CommonJS module. What this entails is the library author writing their library in either CommonJS or ECMAScript modules, and then using a build tool such as Rollup or Babel to target the other flavor. So if you've written CommonJS, the build tool will create ECMAScript modules for you, or if you've written ECMAScript modules, you'll use a build tool to target CommonJS. In your package.json, you then use what's called an exports map to indicate which file should come in when you use a require statement versus which file should come in when you use an import statement. Some challenges and caveats that come up with this approach are that configuring the compiler to do this can be complicated, configuring the exports map itself can be complicated, and it increases the size of your module, because you now have two versions of your library in the same package. What's nice about taking this approach is that we can ship a version of the library that dependents can consume regardless of whether they themselves have adopted ECMAScript modules yet or are still using CommonJS.
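For the pure-ESM path, the package.json change really is just the type field. A minimal sketch (the name and entry point here are placeholders, not from the talk):

```json
{
  "name": "my-lib",
  "version": "1.0.0",
  "type": "module",
  "main": "./index.js"
}
```

With "type": "module" set, .js files in the package are treated as ESM, and CommonJS consumers can only reach the library through the dynamic import() syntax.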
Given the option of migrating fully to ECMAScript modules or trying to create a module that targets both CommonJS and ECMAScript modules, yargs has opted for this dual-module approach. Our reasoning was as follows: yargs is one of the more depended-upon libraries in the Node.js ecosystem, with over 24,000 packages using it. We didn't want to force the hand of all these dependents by making them move to ECMAScript modules to use the latest and greatest version of yargs. Rather, we'd like them to be able to continue using CommonJS and still get our latest features, but as they themselves move, we provide a paved path for them to use the ECMAScript version of yargs, which with this dual approach is available within the exact same library version.

Having said this, because computers are hard, there are actually multiple approaches one can take to creating one of these dual-mode modules. One approach, if you're using pure ECMAScript module format in your library or pure CommonJS format, is to use a tool like Rollup to target the alternative. As an example, if you've written your library using ECMAScript modules, you have a Rollup build step that creates a CommonJS version of the library for you. Mikeal Rogers has written a tool called IPJS, which is meant to codify and simplify this process, and it's worth checking out if you're considering this approach. Another approach, if you're writing your library in TypeScript, is to take advantage of the fact that TypeScript itself can compile to ECMAScript modules. In this case, you have TypeScript compile the ECMAScript module, and then again use Rollup to compile the CommonJS version for you. A third approach to consider is that the JavaScript compiler Babel has been adding functionality similar to Rollup's, and additional compiler options meant to make it easier to create dual-mode ECMAScript/CommonJS modules. There's a blog post linked here on this topic.
Given that yargs already used TypeScript in its code base, we opted to go with the approach of using TypeScript in combination with Rollup, along with an exports map (which is needed for all of these approaches), to support a dual-mode ECMAScript/CommonJS module. The approach we developed has the following steps in the build process. First, we rely on a plugin called @wessberg/rollup-plugin-ts, which teaches Rollup about TypeScript syntax. So when compiling yargs, we added an additional step that, using this plugin, takes a TypeScript entry point and uses Rollup to build a bundle that's a single CommonJS file representing the entire code base of yargs: basically a combined file that has all of our library files in it. Second, along with adding this compilation step, we updated our TypeScript configuration so that its module field targets es2020, which represents ECMAScript modules, rather than its default CommonJS output. Third, we defined an exports map inside our package.json that explains to someone relying on our library which file should come in as a result of import versus which file should come in as a result of require. It's worth mentioning that adding an exports map to your library should be considered a breaking change, because it locks down which files within your library are accessible.

Let's take a look at each of our configuration files. The first example here is the config for our Rollup compilation step. Things worth mentioning here are that our output format is CommonJS, and that we're using this Rollup TypeScript plugin to actually perform the compilation. The third interesting thing is that we have this commonjs.ts entry file that we created: just a tiny little file meant to make the bundle Rollup creates match prior releases of yargs as closely as possible. The next file shown here is our TypeScript configuration file.
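A Rollup config along these lines looks roughly like the sketch below. The plugin name comes from the talk, but the input and output paths are illustrative, not yargs' exact configuration:

```javascript
// rollup.config.js — illustrative sketch, not yargs' exact config.
import ts from '@wessberg/rollup-plugin-ts';

export default {
  // A small CommonJS-specific entry point, so the bundle's shape
  // matches prior releases of the library as closely as possible.
  input: './lib/commonjs.ts',
  output: {
    file: './build/index.cjs',
    format: 'cjs' // emit a single bundled CommonJS file
  },
  // Teaches Rollup to compile the TypeScript source.
  plugins: [ts()]
};
```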
The only real change that's important here is that we've switched the module field to es2020 from its default CommonJS setting. Finally, here's our package.json with an example of an exports map. We can see here that if you import the root of the library, we're going to give you index.mjs, and if you require the library, we're going to give you index.cjs. Similarly, if you require yargs/helpers, you're going to get helpers/index.js, and if you import yargs/helpers, you're going to get helpers/helpers.mjs. I've actually trimmed down our exports map a little bit to fit it on a slide.

It's worth mentioning that this was definitely the trickiest part of getting our build process to work. For instance, we had to put the exports map entries in an array rather than an object to make it work on some versions of Node 13, and we even had to remove the extension from some of our files to trick Node into loading them as CommonJS rather than as ESM, which is the default because we set "type": "module" in our package.json.

These caveats aside, with this new build process in place, we were able to ship a version of yargs that pretty much worked both in CommonJS and in ESM, as you'd expect. So someone who happened to be writing a hello.mjs file could just import yargs as you'd expect and then use yargs by invoking it with the argv. And if someone was still using CommonJS in their module, they could create a hello.cjs or a hello.js and instead require yargs.

So now I've got a little bit of a show and tell to demonstrate some of the neat benefits we get now that yargs has been migrated to supporting both ECMAScript modules and CommonJS modules. Here you can see I've implemented a command line interface in a web browser, and it works just the same as if you were typing into a command line in your terminal: I can type help and get help output, and I have a command alert, which will pop up an alert inside the web browser.
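An exports map along the lines just described looks roughly like this. It's a reconstruction from the description above (including the array workaround for some Node 13 versions), not yargs' published package.json verbatim:

```json
{
  "name": "yargs",
  "type": "module",
  "exports": {
    ".": [
      {
        "import": "./index.mjs",
        "require": "./index.cjs"
      },
      "./index.cjs"
    ],
    "./helpers": {
      "import": "./helpers/helpers.mjs",
      "require": "./helpers/index.js"
    }
  }
}
```

With this in place, a hello.cjs that calls require('yargs') resolves to index.cjs, while a hello.mjs that imports yargs resolves to index.mjs, all from the same published version.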
If we take a look now at the actual source code that's allowing yargs to run in the web browser, what's so neat is that we just import yargs the same way we would import it into a normal Node.js module, and we do so inside a module script tag in the browser. So in some 40 lines of code, we're able to have yargs work inside the web browser identically to how it works in Node.js. If we look at the files loading when we actually load this page, we see that they're identical to the files we see when running it in Node.js. Unlike in the past, where you might have had a webpack build step, there's no build step required: we're able to just load all of the yargs ESM dependencies right in the browser.

So in conclusion, let's discuss some of the things that went well in this migration to add ECMAScript module and CommonJS module support to yargs, and some of the things that didn't go well. In the good column, we're successfully able to target multiple platforms with yargs now, without a build step. Yargs works in Deno now, and, as I showed in that demonstration, yargs works in modern web browsers. I think it's really exciting that we're able to target a wider number of platforms with yargs without having to resort to a build process. We're also seeing 10 million downloads a week now with these new versions that support ECMAScript modules, and we've been getting minimal bugs related specifically to this migration, so I think it shows that the migration has been successful. We've also, and I think most importantly, created a path forward for yargs' 25,000 dependents to gradually adopt ECMAScript modules without having to stop using the latest and greatest version of yargs.

In the bad column, the complexity of our build process has increased significantly. We now have a Rollup build step as well as a TypeScript build step, and we have to worry about how these things interact with each other.
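The browser setup described above looks something like the sketch below. The import path, entry-point filename, and command definition are illustrative stand-ins, not the demo's exact source:

```html
<!-- index.html: yargs running natively in the browser, no bundler. -->
<script type="module">
  // Illustrative path: import yargs' ESM browser entry point from
  // wherever it is served alongside the page.
  import yargs from './node_modules/yargs/browser.mjs';

  // Define an "alert" command, just like you would in Node.js, then
  // parse a command string instead of process.argv.
  yargs('alert hello')
    .command('alert <message>', 'pop up an alert', {}, (argv) => {
      window.alert(argv.message);
    })
    .parse();
</script>
```

Because the browser resolves the module graph itself, every ESM dependency of yargs loads directly over HTTP, which is why the network panel shows the same files Node.js would load from disk.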
The size of the yargs library has also increased, because we now ship two versions of it: one that supports CommonJS and one that supports ECMAScript modules. So there's just a larger number of files that end up getting published to npm.

In the ugly category, it was very finicky and difficult to get our exports map working appropriately. The core of the problem is that we're setting the "type": "module" field inside our package.json, which indicates that files with a .js extension should be treated as ESM. This in turn meant we had to do some weird shenanigans, like dropping extensions from files, to get them to load as CommonJS in some contexts. The reason we had to do these backflips falls under the other ugly problem we ran into, which is that TypeScript does not yet support .mjs extensions. That meant that for our TypeScript build step we had to use .js extensions, which complicated our exports map significantly. The final ugly thing, I would say, is that I was hopeful that, now that we supported the web browser natively without a build step, it would be easier for webpack and libraries of that sort to create bundles of yargs. We've actually been finding, so far, that there have been quite a few bugs related to trying to webpack the new version of yargs.

So when all is said and done, was it worth putting in all this extra work to make yargs support both CommonJS and ECMAScript modules? My opinion is yes. Yargs is depended upon by many libraries, and I think it's useful that we're able to help steward people towards ECMAScript modules without them having to make the swap immediately. My opinion is different if you're writing a brand new module. As I said, this dual-mode build process adds a lot of complexity. If I were writing a brand new library today, I would probably consider writing it as a pure ECMAScript module, and then listen to the community and see if folks push you to have a CommonJS version.
So I wouldn't start immediately with this approach that yargs has taken. I myself am going to be keeping an eye on how the Node.js ecosystem adopts ECMAScript modules, and I'm going to be looking for ways to reduce yargs' complexity over time. So if, a year from now, it looks like 80% of folks have adopted ECMAScript modules, I'm going to consider removing the CommonJS build step from yargs.

I've included on this last slide some of the interesting reading I did when preparing this talk. There's the great early-2009 post that kicked off CommonJS, "What Server Side JavaScript Needs." There's Ryan Dahl's original Node.js talk, where he presents Node.js and talks about how it uses the CommonJS module system. My colleague Jan pulled together a great timeline of what was happening over the years as ECMAScript modules evolved, and this timeline was part of my reading. Sindre Sorhus wrote a really good post on how he's migrating all of his modules over to ESM, which I think is also a great read. There's the blog post by Babel about how they're trying to make better ESM-CJS interop. There's a blog post I wrote recently about how it's possible to build this kind of dual-mode module, so that the ECMAScript module migration can be a little less disruptive for the community. And finally, RequireJS has a history of CommonJS on their website, which is an interesting read.

Thank you so much for listening to me today. It was a pleasure to be at OpenJS World 2021. I'll be available for Q&A; please come with any additional questions you have for me. I'm excited to talk to folks today, so take care.