So thanks for joining today. I know it's day two, so I'm sure a lot of folks were out having dinner late last night like us. I'm going to be talking today about code coverage, which is a topic I'm really passionate about. Specifically, I'm going to be talking about code coverage as it relates to the Node.js project itself, which is something I've worked on quite a bit within the project.

My name is Benjamin Coe. I'm a DPE at Google, which stands for Developer Programs Engineer. This is basically what you get if you smash together a developer evangelist with an engineer: you get to do some talking like this, which is really fun, but you also spend a lot of time engineering, which I enjoy. So it's good for folks who enjoy both things.

A little bit more detail about me. I was the third employee at npm, Inc., so I've been involved in the Node.js and npm community for quite a few years. I currently work on Node.js client libraries at Google. This is a really neat role where my team works with teams within Google, mostly on Google Cloud, to help them make Node.js client libraries that are idiomatic to the community and feel like a normal npm package, even though not all the teams at Google are necessarily part of the Node.js community. In the world of open source, I contribute to yargs, which I've been working on for quite a few years. I wrote the tool nyc, which I'll be talking about in this presentation and which is a tool for collecting code coverage, and I contribute to Node.js. I've actually done a lot of work on coverage in Node.js, which is why I'm talking about it today. Another thing I really like is that I edit the specification Conventional Commits (conventionalcommits.org), which is a way to add a conventional structure to your commit messages so that you can parse them and automate things with them. I'm Canadian, actually, but I live in California. No one's perfect.

To give you a bit more detail about what I'm going to be talking about today: I'm going to talk about what exactly code coverage is and give you a little bit of the history of code coverage, which, when I started researching this talk, turned out to be surprisingly interesting. I'm going to talk about what it's useful for, so why you should want to use it on your projects, and how we use it on the Node.js project ourselves and how it benefits the project. For folks who are curious like me, I'm going to go into detail about how it actually works behind the scenes. And I'm going to talk about how you can use it on your own projects, if you don't currently use code coverage as one of the tools for your open source projects, and how you can use Node.js's own coverage as a way to start contributing to the Node.js project. I'm going to show you what our coverage looks like.

So what is code coverage? The simplest explanation is probably that it's a way of tracking the execution of branches in your program. So if you have an if and an else statement, you can run your program with coverage, and it will keep track of whether the if was run, say, three times and the else was run four times. It gives you nice analytics about how your program executed during its runtime.

As you probably guessed, code coverage was invented in a top-secret army biological laboratory. This is a joke. You probably didn't guess that, but it's true.
It was proposed in a paper by Joan C. Miller, Systematic Mistake Analysis of Digital Computer Programs, written in 1962 and declassified in 1971. What's kind of interesting is that, reading the paper, she's actually proposing an approach to using coverage that's really similar to what a lot of people think of when they think of coverage today, which is using it to better expand the test suite they were using inside that lab, which I think is really neat. What the paper suggests, basically, is that you take a program, you create a graph of the different paths through that program, you enumerate the exit conditions of that program, and then you use this chart you've made to create more tests, to help make sure you've covered the program's exit conditions and all the branches through it. Interestingly, the paper doesn't talk about automating this; it mostly talks about how you would draw it out yourself. And if you think about 1962, there's a good chance the program was on punch cards. You can't use a lot of the methodologies I'm going to talk about in this talk on punch cards, or at least it would be much harder to do, so you can see why someone took the approach of doing it externally on a piece of paper.

So what's happened since 1962? Well, I did a little more of a literature review. It turns out IBM has a patent from 1989 for a system for determining the code coverage of tested programs based upon static and dynamic analysis of recordings. This is the first reference I saw to taking that 1962 paper and starting to automate the process. Then, by 1996, the tool gcov came out, which is part of the GCC suite. It allows you to collect coverage in your C code, and it has a lot of the features we're used to today: branch coverage, line coverage, and pretty reports out the other end. And luckily, they weren't sued by IBM, because I had no clue IBM held this patent, which luckily ran out in 2006, so the tools I've worked on, I don't think, are infringing on it right now. Around the same time, Java 1.1 was also released with coverage. So this has been a tool in the Java and C communities for quite a few years.

Jumping to the Node.js community, we were actually a little late to the game. Our first really popular tool for collecting coverage was called YUI Coverage and was part of the YUI tool suite. This apparently did not come out until Esprima came out. Esprima is a parser for JavaScript that allows you to do your own parsing, and it's actually what enabled a lot of these JavaScript-based coverage collectors. Prior to this, people would actually compile down to Java and then use something like JCov to collect coverage for their JavaScript. What's interesting about YUI Coverage is that it has a direct lineage to Istanbul, which is a popular coverage tool used in the Node community, which in turn is what's used by Jest. So this YUI Coverage tool actually inspired a lot of the coverage tools that exist today, including nyc, which is something I wrote while I was working at npm, actually with Ryan, who's in the audience here. We wrote it at npm to collect coverage specifically for the npm client, which spawns lots of subprocesses, something that not a lot of tools could handle at the time. nyc also uses Istanbul. So most of these coverage tools actually follow from the same set of tools.
That was the case until 2017, when coverage was actually introduced into the V8 JavaScript engine itself, the JavaScript engine used by Node.js. I'm going to talk about this in detail later in the talk, so I'm not going to dive into it here. Around the same time, 2017, 2018, I began working with some folks on the Node.js team to expose this functionality in Node.js, so that in the Node.js project itself we could also take advantage of the fact that we had coverage built into the V8 engine.

So, unlike that manually drawn chart I showed earlier in the talk, this is closer to what you might think of as a coverage report today. It's nice and human readable, and it shows you the code that ran. Just to explain it: the code that wasn't run by my test suite, or by some run of my processes, is in red. The code that's in yellow is a branch that wasn't executed, so the line might have run, but one of the branches on the line didn't execute. What's interesting about this is that it shows a problem in the Node code base, which I'll talk about later: we historically collected coverage on Linux but not on Windows, so a lot of the time we'd have false negatives related to our Windows coverage. The other interesting thing about this report is that it's not just showing you which lines are covered; it's showing you how many times they were executed. That's kind of interesting, because you can actually start to think about it as a way of doing performance analysis.

So this is great, this is a little bit of the history of coverage and what exactly it is, but what is it actually useful for? I'm going to start by showing a little video here that shows what a lot of people think of when they think of test coverage. What's kind of interesting is that it's also really close to the approach being proposed by that 1962 paper I showed at the start of the talk, which is using coverage as a way to better expand your test suite. So imagine you have a program. You can add nyc to your project to run your test suite with coverage being collected. Then, when we run our tests, we actually see that we've only hit 50% of the lines of code here. We can then go back, add an additional test, run our program again, and see if we've actually exercised the lines of code we were trying to exercise. And sure enough, we've hit 100% test coverage now. So it gives us this nice feedback loop as we're trying to expand our program and make sure that we've hit all the lines of code we want to hit with our testing.

But this is just one use case. Another use case I really like is that if you have a long-lived open-source project like yargs or Node.js, and you get your code base close to 100% coverage, it's easier to identify whether a contribution to the project has been appropriately tested. So imagine that I'm working on an open-source project and I get a pull request that looks something like this. One of the first things that jumps out at me is that there are no new tests added in this PR. So even though there look to be some problems with this meaning-of-life method (it looks like it might throw an exception, because it's calling a method on a number that doesn't exist), my existing test suite is actually going to pass. This is because it's completely new functionality with no tests written for it; there's no way for the tests to fail and tell me that the code has issues.
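To make that scenario concrete, here's a hypothetical sketch of the kind of change being described. The actual code from the slide isn't in the transcript, so the function name and body here are made up for illustration.

    // Hypothetical, untested code added in a PR. The existing test suite never
    // calls it, so the suite stays green even though this will throw at runtime.
    function meaningOfLife () {
      const answer = 42
      // toUpperCase() doesn't exist on numbers, so this throws a TypeError,
      // but only a new test exercising meaningOfLife() would ever reveal that.
      return answer.toUpperCase()
    }

    module.exports = { meaningOfLife }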
But if we start to actually enforce a certain coverage threshold on our code base, we do see a big red X that says this new code has dropped our coverage a little bit. We can use that as a way of keeping a certain level of coverage inside an open-source project and, in an automated manner, help encourage folks to write more tests as they submit PRs (there's a small sketch below of how you might configure a threshold like this with nyc).

Another interesting thing that coverage is good for in a project, and I've seen this used in the Node.js project quite a bit, is that it can give someone an idea of where they can contribute. For the Node.js project, we actually publish nightly coverage reports to coverage.nodejs.org, and these reports, which look something like this, can be a really good place for someone to start contributing to the project. This is an actual snapshot of a night of coverage, and what you can see here is that our coverage is pretty good across a lot of the code base. But what happens a lot of the time is that a new feature gets added. In this case, we can see rimraf actually has quite a bit lower coverage than other parts of the code base. So if someone were looking to come in and contribute to the Node.js project, I would happily accept patches for more coverage on rimraf. I don't know if a lot of people even knew we have rimraf functionality in Node.js, so this is kind of a call-out to that feature too: rimraf is a way to delete a directory recursively, instead of just a single file or an empty directory. But these coverage reports are a great place to start contributing to the project.

I found an actual example from earlier in the week, and I thought it was a really good, canonical one. This is an individual coming in who has written a few tests for our DNS functionality inside the Node.js code base, and they actually reference those nightly coverage reports as a way of saying, look, I've written these tests, and I'm specifically trying to target these lines in the coverage report. We were very excited to get this pull request, and it got three approvals really quickly. So it's a great way to start contributing to the project.

I would like to point out, too, that code coverage is not just about testing. There are other interesting things you can use it for. One is that you can use it to ask: what parts of my program should I think about optimizing? This is, again, from the full test suite run of the Node.js project. We can see that a few of the lines of code are hit around 22,000 times during that test run, and a few of the lines of code are hit twice. If I were coming in and saying, I need to optimize this method and make it run a little faster, I'd probably want to put more work into optimizing readPackageScope, which gets executed 22,300 times, than into optimizing the ERR_REQUIRE_ESM path, which only gets called two times, in kind of an exceptional case. It's not perfect: this only tells you the execution count of the things you actually exercise in tests. But it's a neat indicator of some of the really hot points in your program, so it can be a good place to start when you're thinking about optimization.

So we have a little bit of the history of code coverage, and we've talked about some of the things I think it's useful for. Now, for the folks who are curious like me and like to know how things actually work, I want to talk a bit about how coverage actually does work.
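Going back to the threshold idea from a moment ago, here's a minimal sketch of one way to enforce it using nyc's check-coverage options in package.json. The percentages are just for illustration, and mocha here stands in for whatever test command your project uses.

    {
      "scripts": {
        "test": "nyc mocha"
      },
      "nyc": {
        "check-coverage": true,
        "lines": 95,
        "branches": 90,
        "functions": 95,
        "statements": 95
      }
    }

With something like this in place, nyc exits non-zero when coverage falls below the thresholds, which is what turns into that red X on a pull request when the tests run in CI.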
There are two main approaches I can think of for collecting coverage in the JavaScript world. There's the compilation-based approach, where you take a program, you transpile it with something like Babel or Esprima, and you add a bunch of counters that hopefully don't change the behavior of the program. The other interesting approach is at the VM level, where you're not actually changing the initial JavaScript itself; the virtual machine that's running your JavaScript at the end of the day adds additional instructions that collect counters as your program runs. I'm going to talk about both of those approaches.

So let's take this really simple program here. It's an if statement with an else statement in it, and it doesn't look like that else statement is ever going to run. All this program is going to do is print Hello World in that if block. Imagine we wanted to instrument this for coverage. How might we do that? Well, what basically happens today, just trimmed down a little, is that something like Istanbul or nyc takes your program, rewrites it, and adds all these counters throughout your entire program. If we take a look at the true block here, if true, we see that we're incrementing branch 0 and statement 0 if we happen to go through there, and then doing the same console.info we did before. If we were to go into that else block somehow, we'd increment the branch counter in that else block. And if we went into the ternary operator, we'd also be incrementing branches and statements inside of that ternary operator. At the end of the program's execution, we can then dump the report we've been collecting the entire time. That's how coverage has traditionally worked in the JavaScript world.

Obviously, you don't want to do this manually, so there are tools available that do it for you. One I like is called nyc. You can just type npm i nyc --save-dev, and then run your program the same way you'd usually run it with the node bin; you just put nyc in front of it. What it's actually doing behind the scenes is capturing your program's runtime, looking at when you require files, and doing a bunch of other interesting things, but ultimately it's just taking your program, rewriting your thousands of lines of code, putting these counters in all over the place, and hopefully not breaking your program while it does that. So nyc is a tool that automatically does this for you.

The other approach is to use the virtual machine to collect coverage for you. I'm not going to go into a ton of detail about how V8 bytecode works; frankly, I'm not an expert on it. But I do link to two blog posts that I think are worth reading on what's actually happening technically behind the scenes: Understanding V8's Bytecode by my colleague Franziska Hinkelmann, and JavaScript code coverage by Jakob Gruber. I did put my slides up just before this talk, so these should actually be links you can click if you want to read those posts. Basically, what's happening is that when you're running JavaScript, your JavaScript code gets turned into a tree, that tree gets turned into bytecode, and that bytecode runs through a virtual machine, much like when you're running Java code. With the V8-based approach, the counters are added at that virtual-machine bytecode level rather than at the source code level. There are some advantages to this. There's no instrumentation step necessary.
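To make that counter-rewriting step concrete, here's a heavily simplified sketch of the idea. Real Istanbul/nyc instrumentation generates a much more elaborate structure keyed by file, so the counter names below are illustrative only.

    // The original program:
    if (true) {
      console.info('Hello World')
    } else {
      console.info('this branch never runs')
    }

    // Roughly what an instrumenter produces (simplified):
    const cov = { b: [0, 0], s: [0, 0] } // branch and statement counters
    if (true) {
      cov.b[0]++
      cov.s[0]++
      console.info('Hello World')
    } else {
      cov.b[1]++
      cov.s[1]++
      console.info('this branch never runs')
    }

    // When the process exits, the counters are dumped so a reporter
    // can turn them into the familiar red/yellow/green report.
    process.on('exit', () => console.error(JSON.stringify(cov)))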
So in the case of the Node.js project, and my colleague Anna actually set this up originally, what we used to have to do was take all the JavaScript inside the Node.js project, transpile it with these counters, and compile it into the Node executable. Then you could do one run of test coverage to see what your coverage was like. If you hadn't actually written the test that hit the line of code you were hoping to hit, you had to undo the instrumentation, make some changes to your code, re-instrument it, rebuild the Node executable, and then maybe you'd hit a few more lines of coverage. This created a really slow feedback loop for coverage on the Node project. What's nice now is that we don't have to do this transpilation step; we can just run the Node project like we would normally, with coverage enabled, and use that for our coverage collection.

Furthermore, it was significantly faster for the Node.js project: it ran, I'd say, three or four times as fast using the V8-based approach rather than transpiling all the code. I put a little asterisk here because nyc is pretty fast, because it aggressively caches; it's a really mature project. So it's not always going to be the case that the V8-based approach is faster for you. I do feel, however, that it's less error prone. As I said, the way coverage has worked with Istanbul or nyc or a lot of these tools is that it takes your big computer program and tries to make a new computer program that just has thousands of counters thrown into it. There have been a variety of bugs throughout the years of working on Istanbul where we put a counter in the wrong place and your program behaved differently. It's hard to get it perfect all the time. Shockingly, we get few bug reports, but that's the result of years of debugging weird edge cases around adding these counters into programs, and it's harder than one might think.

Enabling this coverage in the VM is really easy. We've actually built it into Node.js itself. So if you take a recent version of Node, Node 14 off GitHub or Node 13, set the environment variable NODE_V8_COVERAGE, give it an output folder, and then run your program as you usually would, you'll actually end up getting a bunch of inspector coverage information written to that output folder. So it's built in: raw JSON files get output that tell you what's been executed. These JSON files are hard to consume, and they don't give you a pretty visual report, so I've written a tool, linked here, called c8. It literally just turns on this variable at the top here and, having turned it on, gives you pretty reports out the other side, so you can see something more human readable from those raw files.

So we've gone through a bit of the history, and we've talked about how coverage works to a certain degree. Why do I like coverage so much? Why is it something I've taken the time to contribute to the Node.js project? Well, for one, it helps me find parts of my code bases that could use more testing. Even if you were the best engineer in the world, you're not perfect, and I definitely find that the parts of my code base I haven't written tests for are always broken in every imaginable way. I think I'm working with numbers, they're actually strings; just everything's wrong with it.
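As a concrete sketch of that built-in workflow (the output file names and the JSON fields shown are from my recollection of the V8 inspector format, so double-check them against the current Node.js documentation):

    # Collect raw V8 coverage with nothing but Node itself:
    NODE_V8_COVERAGE=./coverage-tmp node app.js

    # ./coverage-tmp now contains files along the lines of
    # coverage-<pid>-<timestamp>-<n>.json, each holding entries shaped like:
    #   { "result": [ { "url": "file:///path/to/app.js",
    #                   "functions": [ { "functionName": "main",
    #                                    "isBlockCoverage": true,
    #                                    "ranges": [ { "startOffset": 0,
    #                                                  "endOffset": 120,
    #                                                  "count": 1 } ] } ] } ] }

    # Or let c8 set the variable for you and render a human-readable report:
    npx c8 node app.js
    npx c8 --reporter=html node app.js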
I find that using coverage to find those untested parts of my program just helps me write more stable applications. It also helps me be more trusting of contributions being made to my projects. For a project like yargs, I've actually taken the time to gradually get it up to 100% coverage over the years. That helps me know that new features coming in keep that same baseline: we've at least demonstrated that every new feature added runs as claimed. It doesn't make your program perfect, and 100% test coverage is not 100% bug-free code, but it's a better indicator than just throwing code out into the wild with no testing. The other really cool thing, and I love this for the Node.js project, is that it helps folks find ways to contribute to the project. I'm going to link to it again at the end of this talk, but the coverage.nodejs.org site is a really cool way to find places to contribute.

So hopefully I've also started to get you a little bit excited about coverage and Node.js, and you're probably asking, how can I actually set sail on this coverage train? Well, there are a couple of tools available that I've already referenced a few times in this talk, but if you haven't taken a picture or written them down yet, do so now.

There's the tool called nyc. It can be found at github.com/istanbuljs/nyc, and it takes the approach I was talking about, where it takes your code and rewrites it with counters automatically, completely invisibly to you, and can then use that to collect coverage. What's kind of neat about this instrumentation approach is that once you've rewritten your code with those counters, you can run it anywhere: in a web browser, on your server, in a light bulb if you're doing internet-of-things stuff. So it's neat, it's very portable. Once you've done that, you can just run your program like you usually would, with the nyc bin in front of it, and it magically gives you this coverage information.

The other tool, which I don't think too many folks know about yet, and which we use on the Node project itself and I use for a lot of my open source right now, is bcoe/c8. I'll probably move it into the istanbuljs org eventually. All it literally does is turn on Node.js's V8 coverage collection environment variable. It gives you the exact same pretty reports as nyc; it literally uses the same part of nyc's code base that does the reporting. And instead of typing nyc my-program, you actually get to save a character and just type c8 my-program, so it's even more efficient in that regard. If you really want to write your own reports, or you're curious about what's actually happening behind the scenes, you can just set the environment variable like I said earlier, give it an output folder, and then run your program, and this will just work with modern versions of Node today. This is actually what we do: we run with the environment variable set and then use c8 to make reports out of the other end. So c8 can just take that raw output and give you reports. We use the bottom approach for Node.

So now, getting back to the topic of using coverage to contribute to Node.js. This is a snapshot of coverage.nodejs.org. An exciting thing I've been working on recently with Rich, who I see sitting in the corner there, I'm going to call him out, is that we now have this nightly code coverage report we put up.
And what's neat about this is that it actually takes the combined coverage of our Windows and our Linux test builds, so if you go to this top report, you're less likely to get those false negatives where it looks like we haven't covered any of the Windows code, when actually we did cover it in the Windows tests, just not in our Linux tests. So if you're thinking about contributing to the project, I would start by looking at the coverage in this combined report. We also have a nightly Linux branch coverage report. This is just a more detailed report that gives you a lot more specifics about the branches that have executed and the execution counts; you see the left and the right of a ternary operator, so it's a little more detailed. We also have our C++ coverage reports, which are linked here. As kind of a throwback to the history of coverage, these actually do use gcov, so that hasn't changed much since 1996, except that we use a tool called gcovr, which I believe gives a nicer report at the other end and makes it easier to instrument your code base. So there's that.

So if you do want to contribute to the Node.js project, and you want to look at these coverage reports and maybe increase rimraf's coverage, I'd very much appreciate that. Feel free to flag me on GitHub, you can find me at github.com/bcoe, and I'm also always open to talking to folks on Twitter, at twitter.com/benjamincoe. I'm always happy to answer questions or just talk with people. I'll also be here for the rest of the day, so if you prefer in real life, I will be on the stage for the next few minutes. I don't think I was too fast, so I'll be here for five minutes, and feel free to come talk to me. Thank you very much. I hope this gave people a good background on what coverage is. Yeah.