So, hello everybody, welcome to DevConf 2022. This is the first session in this track today, and the first talk will be about navigating the npm ecosystem in the enterprise. It will be presented by Bethany Griggs, who is a senior software engineer at Red Hat. So the stage is yours.

Hello DevConf, welcome to my talk on navigating the npm ecosystem in the enterprise. This will be my second DevConf, which means, unfortunately, I've not actually got to go to one in person just yet, but hopefully maybe next year.

To start with, a little bit about me. My name is Bethany Griggs. I'm a senior software engineer at Red Hat, where I work on the Node.js runtime team. I've been at Red Hat just over a year now, but before that I was doing a very similar role at IBM. I'm a member of the Node.js Technical Steering Committee, so I'm quite heavily involved with the upstream community, and I also spent some of lockdown writing a new edition of the Node Cookbook. My job role is really quite a big mix of open source, engineering and advocacy.

So back to my talk topic. Today my talk is titled navigating the npm ecosystem in the enterprise, and what I want to do is share a little bit about a recent approach we've been taking at Red Hat and IBM to help our internal teams and clients navigate the vast npm ecosystem.

For quite a long time, probably since the adoption of Node, the npm ecosystem has been recognized as a source of friction in the enterprise. When the Package Maintenance working group was kicked off in the Node community, there was a statement that said: Node.js has been growing rapidly, and there are aspects of the module ecosystem which act as a source of friction to adoption, particularly in the enterprise.

So why is there friction? Well, there are over 1.7 million modules in the npm registry, which is many, many more than in any of the other runtimes' and languages' registries. That means there's a lot of choice out there, and while there are a lot of good choices, there are some potentially bad choices too. Also, when you look at a typical Node.js application, the large majority of the code that makes up your application is not your own. It's quite common in Node.js for this to be the case, but unfortunately you're responsible for all of it in production. You can't just ignore it or take it for granted; you're just as responsible for the code you didn't write as for the code you did.

And I always like to start with a few horror stories, because they always attract a bit of attention. For example, way back in 2016 there was the left-pad situation. This was a quite heavily depended-on module that the author unpublished, and it broke loads of dependency chains. And to show this is still happening, we get the odd case every year: a similar thing happened just in January with the faker module, where the author published what you could describe as an infinite loop to the latest version, which broke a lot of projects. This is not a new problem, and it's definitely not specific to the Node.js or JavaScript ecosystems, but what I do find is that the npm ecosystem in particular surfaces these issues much more visibly, mostly because our dependency trees in Node apps are typically much wider and deeper than in other language runtimes. And to show it's not only an npm issue: obviously we have Log4j, which I'm sure you've all heard about.
In the organizations I've worked with, IBM and Red Hat, we've seen many different approaches over the years to try to tackle this problem and alleviate these concerns. Going back to one of my first ever conference talks, I spoke about one of the early approaches we took at IBM. It included defining some metrics, or facets, of the dependencies or npm modules you're using that really help developers make an informed choice. It was all about providing guidance so people could make a good decision on which dependencies to pull into a project.

The types of metrics we called out at the time were things like security: does it have a good security policy, and are they patching in a timely manner? Licensing: is the license compatible with how you want to license your application? Maintenance: are you fairly confident the dependency is well maintained and will continue to be maintained for the lifetime of your product? Then there's breaking changes: what's the history of breaking changes in the dependency, and are they cutting major versions so frequently that you won't be able to keep up? And then there's compatibility: obviously you need to check that the dependency is compatible with the platforms and runtime versions you're targeting.

So this was all about providing metrics to help people make an informed choice about which modules they're choosing. And this is not a unique approach; you've got companies like Snyk and npms.io doing a very similar thing, taking metrics from these modules and packages and formulating health scores so people can make a slightly more informed decision.
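As a concrete aside (this sketch is mine, not from the talk): much of this metadata is available from the public npm registry, so a first pass at a few of these facets can be scripted. The sketch below assumes Node 18 or later for the built-in fetch, and that the registry document has its usual shape (dist-tags, versions, time).

```js
// Hedged sketch: pull a few dependency "facets" from the public npm registry.
// Assumes Node 18+ (built-in fetch). Not part of the original talk.
async function packageFacets(name) {
  const res = await fetch(`https://registry.npmjs.org/${name}`);
  const meta = await res.json();
  const latest = meta['dist-tags'].latest;

  return {
    license: meta.versions[latest].license,          // license compatibility
    lastPublished: meta.time[latest],                // maintenance signal
    majorReleases: new Set(                          // breaking-change history
      Object.keys(meta.versions).map((v) => v.split('.')[0])
    ).size,
    engines: meta.versions[latest].engines ?? null,  // runtime compatibility
  };
}

packageFacets('express').then(console.log);
```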
But metrics really don't tell you the whole story. Some of the most downloaded modules in the npm ecosystem are probably not a good choice for building an enterprise-grade Node.js application today. And one of the key things I like to highlight is governance; governance is crucially important. I see one of the key benefits of the npm ecosystem being that when you hit a bug or need a new feature, you're able to invest and contribute the fix or the new feature yourself. But this does mean that when you're choosing a module, you need to have confidence that if you did need to fix your own bug and help yourself, which I believe we all should be doing, the dependency will actually allow you to contribute. And that might not be the case, particularly if, say, it's maintained by just one individual, or maybe it's controlled by another corporation. Is that something you really want to build your product on top of, or would you rather build on top of a community-owned framework?

One other approach we've taken in the npm and Node.js space is forming the Node.js Package Maintenance working group. This is a group of people from various companies, like Netflix, GitHub and many other enterprises, getting together every couple of weeks to provide guidance for package maintenance and to promote responsible consumption. They're also working on tooling to help improve things. So it's nice to see a concentrated effort on this problem, bringing enterprise concerns to the table.

So, on to the new approach we've been taking at Red Hat, which we were actually already taking at IBM just before our team moved over. This new approach sits alongside the other approaches; it's additive in nature, we're not doing this instead of them.

What we set out to do to help enterprises and our clients navigate the npm ecosystem is to build a reference architecture. What we call a reference architecture, as we define it, is the team's opinion on which components our customers and internal teams should use when building Node.js applications, plus additional guidance for how to be successful in production with those components. It's essentially a set of good default choices in the npm ecosystem to use when you're creating and deploying enterprise Node.js deployments.

And I just want to be clear here: the purpose of this talk is to share how we're approaching the problem. We're not saying the recommendations we arrive at are right for everyone, and we're very conscious of this even internally. We're not trying to say "always use this module", categorically. It's more: this is a good default, we have tried and tested experience with it in the organization, we may have contributors to that module, so if you have no reason to use anything else, start with this.

Personally, this reminds me of my opinions on Lego. When I was a child I had lots of Lego bricks, just a tub of assorted bricks, and if someone had come along and said "you're using these bricks, and this is how you're going to do it", I probably wouldn't have been particularly happy about that. So it's not about picking winners. It's not saying this module or npm package is categorically better than another; it's saying this is a sensible and good default. And it's not about personal opinion either; it's the opinion of the group as a whole, based on evidence and experience.

On the other hand, the whole restricted-usage approach, being told "you are using these modules and these tools", is not unfamiliar to a lot of our enterprise customers and clients, particularly in the financial services and government industries. You will be told: this is what you're doing, this is what you can use, and these are the allowed dependencies. And there are benefits to that approach for those organizations in terms of more efficient due diligence: if they have to do things like license checking, then by focusing on just a small set of modules it's more efficient to do that. There's also more efficient risk management.

So we set out to build this reference architecture. How are we actually doing this? Well, step one is to bring together all our internal teams who have experience with Node.js, and also anyone who's worked with customers who have experience with Node.js, bearing in mind there are over 300,000 employees between the two companies and we've got many, many customers using the Node.js runtime. We set out to find everyone that's creating or deploying Node.js applications, and so far we've built up a good group of probably at least 30 or 40 folks coming to our meetings over the period of a year. Our colleagues in these meetings range from people who run internal Node.js deployments for our companies to people who are working with clients on their Node.js deployments, and more. And the kinds of things we're doing include actually interviewing our customers and learning how they're using Node.js, really to verify that our recommendations are working for them.

This is somewhat like an inner-source approach. If you're not familiar with the term, inner source is essentially applying open source principles and practices within internal organizations.
And there are lots of benefits to this type of approach. You get to draw on the expertise of developers across the whole organization, so rather than just having your own team, you have a much larger pool of expertise. You also get independent peer review of recommendations by others in the developer community, and developers can identify which areas of the project they feel they can contribute to best.

So once we've got this group together, our next step is to define all the components that form an enterprise Node.js deployment. And initially we set out to define this based on... sorry, my monitor has just turned itself off, so I'm just going to move my laptop over. Okay, sorry.

Initially we've grouped all of these components into three categories: development, functional components and operations. The types of things we've defined as development components are building in containers, code quality and consistency, how you keep your modules and applications up to date, and some specifics around npm proxies and publishing. Under functional components we have things like accessibility, API definition, data caching, databases and message queuing, and it's actually within the functional components that we surface many of the module recommendations. So there might be a very specific module that we've got a lot of experience with in the organization for doing things like authentication and authorization. And then we have operations. Obviously operations is of heightened importance for our types of enterprise customers, because they're very likely to have massive Node.js deployments across many clusters, so looking at their concerns and trying to form good defaults for them is very beneficial. Under operations we've got things like how to do tracing, how to handle failures, and what to use for health checks and logging. So we've defined this initial set of components. It's not an exhaustive list, not at all yet; it's just the initial list we started with from the people in the group. And then for each of the components, we talk through and share experience within the group and try to form an opinion.

So I'll do a worked example here: logging. Logging was a topic we discussed quite early on. We brought together lots of folks from IBM teams, Red Hat, even some of our acquisition teams, and we talked about: what are you using for logging, any recommendations, best practices, what's worked for you? After talking in those groups for a few sessions, we eventually converged on the fact that Pino, which is a logging framework for Node.js, is a good, sensible default. Some of the justification for this included that it has good performance and that it's structured logging by default, which works very well when you're deploying to clusters, because you can feed the logs easily into a collection system. Generally everyone had a good user experience when using the module, and it's also well maintained. So we converged on this recommendation: hey, if you need to do some logging in a Node.js application, Pino is a good default to choose.
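To make that recommendation concrete, here is a minimal sketch of structured logging with Pino. The example is mine rather than from the talk; it assumes the standard pino package from npm (installed with `npm install pino`).

```js
// Minimal Pino sketch (assumes `npm install pino`; not from the talk).
const pino = require('pino');

const logger = pino({ level: 'info' });

// Pino emits each entry as a single line of JSON, which is what makes it
// easy to ship to a centralized log-collection system from many pods.
logger.info({ requestId: 'req-42', path: '/orders' }, 'request received');
logger.error({ err: new Error('upstream timeout') }, 'request failed');
```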
And actually, even more beneficial from this approach is that when we're having these discussions, we're able to surface even better guidance for our internal teams and clients. For example, in the context of logging, we talked about how many of the teams were initially storing their logs in a fast-access location, which is often more expensive, and then after maybe seven days migrating all of their logging data to a lower-cost storage location. So what I really enjoyed about these sessions is that we're able to surface not just "hey, you should use this module", but also "when you're using these modules and building a Node.js application, these are some good approaches to take as well". And from our perspective, once we've converged on a recommendation, it's really helpful to be able to converge all of our examples, all of our demo apps and all of our tutorials on that recommendation.

So there have been quite a lot of benefits to the approach we've taken so far. The first one is simply making connections. Our team in particular has a lot of people working in the upstream community, and opening up that communication between the product teams and ourselves is really mutually beneficial: if they have a bug in Node, we can work in the upstream to fix it, and we're getting feedback from the people actually using the runtime. There's been really good knowledge sharing; we've seen teams helping each other and suggesting improvements to each other's practices. We've also got recommendations on hand for our clients, so if our clients are stuck and say "we just don't know what to do about logging", we can say "hey, take a look at this, it's a good place to start". We also get those benefits around more efficient due diligence.

And one of my favourites is more focused open source contributions. In many organizations, not necessarily our own, developers struggle to get management backing to contribute to open source, because obviously it takes work time. If there's a restricted, or at least smaller, set of modules to focus on, and the whole organization is using the same module for, say, logging, that justification is easier: hey, our whole organization is relying on this module, we're going to contribute to it. It's just more focused, and it may help folks in smaller organizations that can't justify contributing to every module out there.

So really, our reference architecture approach has been to bring together the internal teams and customer engagements with Node.js-related experience, and to define all of the components that form a Node.js application or deployment. Then, for each component, we discuss the experiences within the group and form an opinion on the default. Once we've done that, we document all our recommendations on GitHub, which gives us benefits because everything goes through a proper PR process where people can give their thoughts and feedback. And we're continuously evaluating our recommendations, because what is a good choice today may not necessarily be a good choice in the future.

We've even gone as far as trying to predict some future trends, particularly in the web framework space. Today, from all our internal teams and clients, we're still hearing a lot about Express. Everyone's starting with Express. We've got numerous success stories internally, from the IBM Cloud UI and some Red Hat deployments using Express. Then we also have the IBM Cloud Garage, which, if you're not familiar with it, is kind of a hub for clients to come in and build applications, and from our colleagues there we learned that clients are still asking to use Express.
So generally we're happy today that, if you don't have any reason to use anything else, using Express to build a web application is probably a good choice. It has high downloads, it's well known, and it's tried and tested. But there are concerns raised in these sessions. The Express project is generally considered to be in a maintenance phase; contributions have been tailing off for quite some time, and up until, I think, this month, there hadn't been a release for a couple of years. So there are concerns about the slow release cadence. If I need a bug to be fixed and I've got to wait two years for the fix to be released, that's going to be a problem. There are also concerns that, with such a slow release cadence, it won't keep up with the evolutions in the runtime. So while today we're saying Express is a good default, we're already thinking it might not be in, say, five years' time.

So, to try to identify what might be next and where we should maybe invest some of our time, we looked at pulling some metrics. First of all, I just built a spreadsheet of some very high-level metrics for many of the popular web frameworks. We purposely didn't compare only like-for-like replacements for Express: in the Node space five years ago, to build an app you would probably have defaulted to Express, but there's a lot more choice and a lot more specialist frameworks now, so you might choose something slightly different, like Next.js, to build your application. As we expected, looking at downloads, Express is still massive, something like 22 million versus around two million for the Next framework, but it was useful context to see which frameworks are most used and most downloaded today.

Download stats aren't a very good signal on their own, though. Obviously they're impacted by CI and build systems, and the fact that Express was effectively the tutorial framework for the first five years means there are still a lot of tutorials out there using it, so the downloads may not reflect what's happening in real deployments. Also, the fact that registry downloads as a whole are increasing year on year by quite a large amount means you really want to look at the relative increases and decreases for a module. So we pulled some stats to look at each framework's share of total registry downloads per year. What's interesting is that even though Express downloads are increasing every single year, its share of the downloads is tailing off. There are many reasons that could be the case; it's not necessarily that Express is losing favour, but the fact is it's not gaining share. And what we really want to look at are the modules at the bottom here: they have very low downloads, but a lot of them are growing in share. So these modules are growing in share while Express is losing it, and they're definitely ones we want to keep an eye on and have discussions with our clients and customers about.
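For reference, here is a small sketch, again mine rather than from the talk, of pulling the raw download counts behind this kind of comparison from npm's public downloads API. It assumes Node 18+ for the built-in fetch.

```js
// Hedged sketch: fetch weekly download counts for a few web frameworks
// from npm's public downloads API. Assumes Node 18+ (built-in fetch).
const frameworks = ['express', 'next', 'fastify', 'koa'];

async function weeklyDownloads(pkg) {
  const res = await fetch(
    `https://api.npmjs.org/downloads/point/last-week/${pkg}`
  );
  const { downloads } = await res.json();
  return downloads;
}

(async () => {
  const counts = await Promise.all(frameworks.map(weeklyDownloads));
  // Report each framework's share of the combined total, since relative
  // share is more informative than raw counts.
  const total = counts.reduce((a, b) => a + b, 0);
  frameworks.forEach((pkg, i) => {
    const share = ((counts[i] / total) * 100).toFixed(1);
    console.log(`${pkg}: ${counts[i]} (${share}%)`);
  });
})();
```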
Excuse me for interrupting; just for information, we have two minutes left.

Thanks. The other metrics we looked at were the OpenSSF criticality scores. The OpenSSF has a tool that uses some fairly sophisticated metrics to work out how critical a package is to the ecosystem, and what was interesting there is that we see Next and Fastify at the top, with Express further down. We were also looking at contribution activity. Many of the frameworks you probably know from the Node space show the trend in the top graph: a peak of activity, then tailing off. But some of the frameworks we're watching, the ones growing in share, are showing relatively active contribution today.

One of my favourite metrics is contributor share. These charts are generated from the number of commits per user for each of the web frameworks. One of the leading web frameworks is framework one, and you can see it's actually dominated by one person. So you have to think, as an enterprise, would we want to invest there? Or would we prefer to invest in a framework like framework three, where you can see the contributions are more spread out? That can indicate it's more of a community-owned framework, which may alleviate some of the concerns around ongoing maintenance, because in an obvious hit-by-a-bus scenario on framework one, what are you going to do?

When will we change our recommendation? Really, just when we see usage internally migrating away from Express and we have numerous tried-and-tested experiences with another framework. And we're continuously evaluating, because we find it useful to predict what's happening next.

I'm sorry about that, but we need to finish, because it's already five minutes till the end. So thank you. Do you want to say some words? Okay, if not, thank you very much for the talk. We don't have any more time for questions, so feel free to go to WorkAdventure, which is an 8-bit-style virtual platform. It's really fun, so you can go there and discuss anything related to this topic if you want. So thank you again, Beth, and the next session starts in a few minutes.