Good day, I'm Jack. Until last week, due to mass layoffs, I was a software developer at Shopify. Because that's changed, it should be understood that nothing I say should be construed as speaking on behalf of Shopify. I am just an independent person who happens to know some interesting facts. And of course, this talk was authorized before that event occurred. One more thing: I'll be using "our" and "we" and "is" and "was" interchangeably, just out of habit. So the problem that Shopify had, and that I no longer have, was typosquatting. I beg your pardon: typosquatting, which is numerically the most frequent attack on supply chains, mostly because it's so easy to do. And it's very annoying, right? You're running a large company, you have thousands of developers, and it's very easy for one of them to fat-finger something, and then you're in trouble. So what did we say? We said, basically, we're going to allow-list our dependencies. This is still a work in progress, but as a general class of solution, if you allow-list your dependencies, it becomes impossible for developers to do the fat-fingering. It'll just say, no, that's not on the list. Maybe things aren't lining up for you, but too bad, you have to live with it. But okay, how much work is that going to be, right? There are a lot of dependencies. So we had some questions. First of all, how many dependencies do we have right now in RubyGems, not speaking of all the other things we have? What is the velocity, that is, at what pace do new dependencies get added to the estate? And, as a bonus question, how quickly do we react to vulnerabilities? That's probably a thing that would be nice to know. So what did we do? Basically, we scraped the hell out of GitHub. What this diagram does not include is the amount of logic that's necessary to do that successfully. GitHub, quite understandably, is not super keen on being scraped.
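To make the allow-list idea concrete, here is a minimal sketch in Ruby. The gem names, the `APPROVED_GEMS` constant, and the parsing shortcut are all invented for illustration, not Shopify's actual tooling; a real implementation would parse the lockfile with `Bundler::LockfileParser`, but a four-space-indent regex is enough to show the shape of the check.

```ruby
require "set"

# Hypothetical allowlist; a real one would live in reviewed config.
APPROVED_GEMS = Set["rails", "actionpack", "pg"].freeze

# Pull top-level gem names out of the GEM specs section of a Gemfile.lock.
# Direct spec lines are indented exactly four spaces: "    rails (7.0.4)";
# transitive dependencies are indented six, so they don't match here.
def locked_gems(lockfile_text)
  lockfile_text.scan(/^    ([A-Za-z0-9_.\-]+) \(/).flatten.uniq
end

# Anything in the lockfile that is not on the allowlist gets flagged.
def unapproved_gems(lockfile_text, approved = APPROVED_GEMS)
  locked_gems(lockfile_text) - approved.to_a
end
```

The point is that the approval decision becomes a set-membership test: the expensive part is curating the list, not enforcing it.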
So you need a lot of logic that does things like wait a certain amount of time and react to the different kinds of errors that you'll get. Sometimes GitHub tells you that you're scraping too much. Sometimes they just drop the connection on you. That's a lot of fun. Basically we did that. We went through each commit that had been made to a selection of repositories and found all of the ones that touched Gemfile.lock. We pulled all the Gemfile.lock commits from those repositories, parsed them, wrote that information into SQLite, and then we were able to join that with vulnerability data. And then a bunch of pretty plots came out of it. I'll give you a few seconds. If you're interested in the vulnerability data, it comes from the Ruby Advisory Database, from RubySec. It's a volunteer effort, and they do a fantastic job of keeping up with the vulnerabilities. And here are a few things we learned very quickly. First of all, the numbers: about 1,300 repos. It should be noted that Shopify is kind of like a monolith, plus a couple of large services, and then this sort of dark matter of small services that do a little bit of this and a little bit of that. Out of these, we scraped 230,000 commits made to Gemfile.lock since 2011. Thousands of distinct gems; unfortunately I can't give you the exact number, but thousands, and not single-digit thousands. It's interesting to note that three quarters of those gems are still in use; about 25% of gems that have been used in the past have since been abandoned. And about a fifth of those are used in the monolith. Shopify is famous for having the monolith. And this is a lot of fun. And this is the punchline: we got dozens of new gems coming online every month.
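The wait-and-retry logic described above might look roughly like this. Everything here is illustrative rather than Shopify's actual scraper: the status codes, the delay schedule, and the injected fetcher (which stands in for a real HTTP call) are assumptions chosen so the policy can be exercised without hitting GitHub.

```ruby
# Retry a fetch with exponential backoff on the throttling responses
# GitHub tends to send (403/429), plus raised errors from dropped
# connections. `fetch` is any callable returning [status, body];
# injecting it (and the sleeper) keeps the policy testable offline.
def fetch_with_backoff(url, fetch:, max_tries: 5, base_delay: 1,
                       sleeper: ->(s) { sleep s })
  tries = 0
  begin
    tries += 1
    status, body = fetch.call(url)
    raise "throttled (#{status})" if [403, 429].include?(status)
    body
  rescue => e
    raise if tries >= max_tries
    sleeper.call(base_delay * 2**(tries - 1)) # 1s, 2s, 4s, ...
    retry
  end
end
```

A production version would also honor the `Retry-After` header GitHub sends on secondary rate limits, but the backoff-and-retry skeleton is the part that keeps a long scrape alive.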
So every month, across the Shopify monolith, across the various services that serve the monolith, and across the many other services that serve many other purposes, there are dozens of new dependencies showing up. And that would be the workload if we, for example, decided that you had to be approved to adopt a new dependency. This is the pretty diagram, which is not fitting properly. Let me start scrolling it. There we go. This is the diagram. I had a lot of fun setting this up. It shows clearly that the data does not go back to the origin of Shopify: it goes back to 2011, and Shopify is about 15 years old, so that would be 2008, I think. In particular, Rails doesn't show up first, and that's probably partly because Shopify is so old that Rails was a zip file that got sent to the founder. There's the first appearance of the monolith in the data, which again suggests that, yeah, the data doesn't go all the way back. But I really enjoyed seeing that the AWS SDK pulls in the universe. If you've ever dealt with that gem, it pulls in every other AWS SDK gem as a dependency, so that's why you see that sudden bump. What about our responsiveness? Well, we did get better at updating gems. I'm glad I can brag about that. Huge shout-out to Dependabot. I can't talk about exactly what led to what and how, but yes, Dependabot made a visible impact. There is also a lot of custom automation. I would have liked to point you to a blog post or a talk; there isn't one yet, but the team responsible for the main automation would like to write one. Well, those who are left, anyway. And this is the important thing, right? We had pretty bad remediation times. I suspect that 2014 bump is when they went, oh, we're not actually remediating things, and so a whole bunch of stuff got closed that year. But the most important thing is the 2022 plot, which you can see has gone down humongously.
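Once lockfile history sits next to advisory data, the remediation-time question reduces to date arithmetic. This sketch uses invented data shapes (a list of advisories and a gem-to-fix-date map); the real join was done in SQLite, but the core calculation is just the gap in days between an advisory being published and the affected gem being bumped.

```ruby
require "date"

# advisories: array of { gem:, published: Date } records.
# fixed_on:   map from gem name to the Date of the commit that bumped
#             it past the patched version. Gems with no fix yet are
#             simply skipped, so the result covers remediated advisories.
def remediation_days(advisories, fixed_on)
  advisories.filter_map do |adv|
    fixed = fixed_on[adv[:gem]]
    (fixed - adv[:published]).to_i if fixed
  end
end
```

Grouping those gaps by the advisory's year and taking a median per year is what produces a responsiveness plot like the one described, where 2022 drops sharply.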
And again, big props to Dependabot for making that look so good. So what can you do if you're looking at this and going, I want a cool plot like that too? Well, the first thing you can do is start collecting the dependency data now. There's no end of tools emerging: open-source projects, and vendors who will sell you stuff. You can also roll your own, which I wouldn't recommend; I think the vendors are getting pretty good and the open-source projects are getting pretty good. But none of that is going to tell you about the stuff you had in the past. I haven't seen any vendors or tools that actually look up historical data in the way that we did, so you're going to have to write some scraping logic. That little diagram I gave in the slides on the site gives you the basic process that you have to follow. I really do recommend using SQLite; it's magnificent when you have little projects like this. Normalize your data, dammit. I will die on this hill. And plot, plot things. Once you've got stuff in a normalized format, you can ask arbitrary questions. I made about 20 different plots that were circulated internally, about a bunch of things that were quite sensitive and that I can't share, but it was interesting. Once we had the data, we could ask it questions, we could interrogate it and find out things. And so often the thing that really moves minds is a plot. And that's it. Thank you very much for coming to my talk. Do we have time for questions, or is it just floating around? One minute, okay, one question. Who's got a question? Or did I do too good a job? I think I did. Thank you all so much for coming.