So thank you all for coming to the session. My name is Gil Yehuda. I run the open source program office at Verizon Media. And this is strange, because this isn't my presentation. This is a presentation by somebody else on my team who was unable to be here. So I have the joy of giving somebody else's presentation. There are some other interesting elements of this presentation that I think will be different, because I think I'm going to ask more questions than I answer. And because it's Friday, and because you had puppies earlier and you're going to have lunch later, I kind of wanted to keep this a lot more casual and interactive than a traditional session. So for those of you who are, I don't know, doing your work, awesome, don't let me get in your way. But for those of you who are interested in this topic, I'm going to explore why you're interested in this topic, because I'm interested in why you are. And then, I don't know, play around with some ideas. So let me start with the acronym OSPO, Open Source Program Office. I work for a large company. It's called Verizon Media. It is effectively better known as Yahoo. So Yahoo was acquired by Verizon. It was merged with AOL. Yahoo and AOL merged together to form a company called Oath, which was actually the name of the company for a while, and then rebranded to be called Verizon Media. But it's effectively Yahoo and AOL. And if you think about Yahoo and AOL, they're effectively 50 companies each that were merged together, because each of those two companies had acquired a whole bunch of other companies. So it's like Huffington Post, and well, until last week, Tumblr, and TechCrunch, and Engadget, and RYOT Studios, and a whole bunch of advertising products, and a whole bunch of tech companies of various sizes that have merged together.
Given that these are two large companies that have had a very deep history in open source over decades, 20, 25 years in each case of being around, there are deep open source legacies. And in the legacy Yahoo case, we made quite a splash a little over a decade ago when we open sourced Hadoop and created the big data ecosystem by open sourcing a whole bunch of other technologies around Hadoop, like ZooKeeper and things like that. And in other cases, we identified really cool candidates for open source, like Spark and Storm, things that were emerging at the time that we invested in to help bring them into the Apache ecosystem. So we have a lot of depth there. And because of that, we run an open source program office, which is a programmatic approach to the way we as a corporation deal with open source. So primarily it's about legal compliance. At least that's like the first conversation. Why do you even hire somebody? Well, because you need to have somebody to make sure that we're not going to get sued by doing something wrong with respect to open source licenses. That's always like the fear-based thing. But mature open source program offices extend way past that to like, we have people who publish code. And whether or not we have a publication process, if we don't, employees will publish code on their own, and we'll just call that a leak. And we don't want a leak, we actually want a review. We want to make sure that it was published properly. So we want a process, because the publications happen anyway, so we might as well do it right. There's a security element to it, which is going to be the focus of this talk, which I think is kind of an interesting thing because you've got to ask yourself what exactly the security element is. From a corporation perspective, there's extending the life of your technical products. If you think about it, you have an engineering team who solves a problem.
They build something really cool, and then, I don't know, the main engineer leaves to another group, to another company. And now you're stuck with this really cool technology and what you have is like this impending debt. You know that there's going to be a problem. And one of the things you can do to sort of allay that debt is to proactively open source the project and say, hey, we built something really cool, but it's not exclusive to us. It's not something we monetize. It's just something that we needed to have because we encountered a problem that no one else had yet solved in open source. So we figured we'd do it first. But we want to proactively open source it to make sure that it's out there that other people can use. And this way we can protect ourselves from the situation that somebody leaves the company and now we're stuck holding the bag and saying, now what do we do? We don't really understand this code. So we want to make sure that our code is understandable and other people can use it. So that's that. We want to improve our project visibility. So in the cases where we've published something, we want to make sure that people see it and love it. And so there's a little bit of branding that ends at the bottom. We want to hire people. There's a talent war. We want really talented people to work in our company and we want them to recognize that we love open source. We're far more than just, hey, let's make sure we don't get sued. We actually want to show the love and really be awesome open source citizens. And so our engineers can recognize that, yeah, this is something that you can do. So this is sort of like the ethos of our OSPO. Just in terms of like scope or size just to give you a sense of what it is, we have about 330 active projects currently. We publish projects on a rate of about once a week, which is crazy, but we literally published 14 projects two weeks ago. 
This massive, wonderful collection of projects that are related to internet advertising and sort of like a reference architecture for internet advertising. But we have projects that kind of span. Like for instance, Screwdriver is a CI/CD system, a pure open source CI/CD system that works at scale, and you can go to screwdriver.cd to find out more. Vespa, which is probably our most significant open source publication since Hadoop, is a very, very big, deep, powerful, AI-based vertical search technology for anyone interested in search. It does things that are quite impressive. Athenz is role-based security and authentication for containers. So we do that. We have Denali, which is actually a very recent publication. It's a UI language. It's a design language, CSS strips and icons for open source projects, because open source projects usually don't look good. Like they operate really well, but they look wonky. And we're like, hey, we have a design language that we've open sourced so that other open source projects can use it. And as it turns out, Screwdriver, that open source project, uses Denali as its UI. So we use our own thing. We say, hey, we just open sourced it because you guys can use this too in your own thing. So we have that. We have about 6,000 engineers, and about 600 of them participate in the open source program. We operate a whole bunch of tickets. So you get a sense of the kind of company we are. We publish a lot, but we also archive a lot. The OSPO itself, and I'm gonna soon pivot to the security part. I know you guys are here for the security conversation, but just to give you context, the OSPO itself does a whole bunch of things around managing open source at scale for a large enterprise. So for instance, anytime an individual wants to use open source and bring it into the company, they may have questions about licenses or how we use it: does this thing go into our grid, into our backend system?
Is it into a product that we deliver, if there's license implications in that? So we do that. People wanna contribute to projects. They ask us questions, can I contribute? Is this part of my work? Have I granted patent rights if I contribute to something? Is that a problem? Can I contribute on my own time, or do I do it on my corporate account? How does that work? Compliance management. We have partnerships. So my team is the team that manages our relationship with Linux Foundation as a foundation and with Apache Software Foundation. So we have that centralized as part of our tech group for all of our open source initiatives. We also do community management, where I wanna tell people, so we have the marketing and tech branding part. And then at the very end, surprise, surprise, we have these things related to things that are the opposite of open source, or that you don't think about when you think about open source, which is one, the depublication of code that shouldn't have been published. So it's like, oh, there's stuff on GitHub, that's ours, but we didn't want it to be there. That's a problem, let's unpublish that. So that, the security folks come to us and say, hey, this should not have been published, it was not authorized, or this was authorized, but it wasn't carefully reviewed and there's something in it that shouldn't be there. And then there's the good old fashioned security alerts, which is, GitHub says, hey, this project is really cool, but did you know that it's dependent upon something other, this other thing, and that thing is no longer cool. So when anyone uses your project, they're accidentally using something else that is less cool than your project attempts to be, and that's a problem. Well, we think that's a problem. And here lies the question. And this is the question that we faced. When we publish code and put it on GitHub and say, here's this code, you can use it. And that code has a dependency on something. 
And that something has a vulnerability. Should we care? Should we care? On the one hand, the license says, limit of warranty. Just because we put the code on GitHub doesn't mean that it does anything useful. If you happen to use the code and in so doing break your leg, don't sue us for breaking your leg. It wasn't our fault, right? On the other hand, since we want to build our program to establish some sort of tech reputation, we wanna make sure that when you use our code, you have a certain sense of confidence that it's not garbage, that it's not like you're taking this risk. Oh, I'm gonna use this code. I wonder if anything bad is there, or I wonder if anything they connected to is connected to something bad. By way of interaction, let me just ask folks in the room. Between sort of the two poles of the debate, would you say, by show of hands, a publisher really shouldn't have to care about a dependency being vulnerable, because at the end of the day, it's code, and the person who takes the code is ultimately responsible to ensure that the code does what they need it to do? If you believe that, please raise your hand and say, you know what, it's really not the OSPO's, the publisher's, problem; the user is ultimately responsible to make sure they're using good stuff. Do you think that's the case? Some of you do. If, however, your bias is on the other side, and you say no, if you're a publisher, you kinda really take some responsibility to make sure that not only is your code solid, but the code that your code brings in as a dependency, you need to make sure that that stuff is good too. If you are on that side of the divide, raise your hand. A few more people are on that side of the divide, okay? So because of that, we asked this question, like what is information security with respect to an OSPO? And we thought about this. We worked with our Paranoids. So we have a team called the Paranoids. That's the Yahoo name for our information security folks.
It is a loving name. We call them Paranoids because we love them, and they remind us that as engineers, and especially as engineers who love open source, we also have to be paranoid. So it turns out we run a bug bounty program. I believe this is the case, but don't quote me, but I think we're the largest bug bounty program on the HackerOne platform. I think we are. So it's pretty large, and we pay out nicely. And it's because we really, really care about security. And one of the reasons we really, really care about security is because we were not always positively affected by security incidents in our past. And there are very few, very few things as awesome as a post-breach security team. Because a post-breach security team is finally empowered to do the things they always needed to do and should have done, or should have been empowered to do prior to the breach. But when you're a post-breach kind of company, information security is an awesome place to be. So our InfoSec folks, our Paranoids, basically said, we've got the bug bounty stuff covered, but your open source code isn't running in production. It's just a publication on GitHub. So no one is gonna come with a bug bounty report saying this code is a problem. And we have code scanning in place for static code analysis and all those kinds of things. Again, for our production code. And we have a red team and blue team kind of setup for our Paranoids. So if it happens to be that you're really interested in information security and you wanted to work at a company like Verizon Media and work on all types of awesome technology and the network, there's a place for you there on the Paranoids team because we have it all. And it's an awesome team, but it doesn't really focus on open source. We're talking about the kind of vulnerabilities that are found in a piece of published code. And initially we thought, well, a piece of published code is like a blog post.
If I wrote a blog post a year ago and that blog post said something that now, a year later, I think is maybe incorrect, I don't really feel the need to go back to that blog post and fix it. I could say that's what I thought a year ago and you can read it, and if I gave you bad advice, well, it's just free advice on a blog post. And maybe my open source code is like that. And then as we thought about it more, we realized that that's really not the case. We do believe that OSPOs need to care about security in their published code. And fortunately, GitHub in particular has been really good about giving us alerts. So here's the agenda. Wow, slide number nine, we're getting to the agenda. What GitHub does to help you, where we think that kind of falls short in terms of helping an open source program office, and then my call for help to see if we can make this better. So the good news is that GitHub helps. The not as great news is that I think that their help still leaves a gap, and they are closing that gap. And the call is that I think we can do more. So what does GitHub do? GitHub provides security alerts, which is really kind of cool, because if you publish code there and everything is cool and you have some dependency and then there's something in the dependency that's bad, GitHub will say, hey, we found this vulnerability. And we get this. As a maintainer of a project, I get an email all the time saying, hey, one of your dependencies has a vulnerability. Does anyone else, wait, how familiar are you guys with this? Does anyone see this? Do you guys get that? Okay, what do you do with it? Right, so depending on the issue, you might fix it. Depending on the issue, you might fix it. Okay, so severity, ability. Okay, right. I mean, the good news is that in some cases you can fix it and they kind of make it easy.
I mean, they'll give you like a little pull request template, and most of the time it's just a matter of upgrading to a new version. I mean, most of the time, what we've seen is that it's pretty straightforward. It's like, oh, you're using version 2.1 and that has this known vulnerability, use version 2.2 or later and you should be okay. And then a few months later they say, oh, you're using 2.2 and there's a vulnerability. You should upgrade to 2.3 and then it's okay. So it's a little bit of a whack-a-mole kind of thing, which you can do if it's a direct dependency, and it's harder when there's a transitive dependency, because now you're dealing with somebody else who needs to upgrade. Right. Now, does anyone else have any other sort of thing that you guys do? Any other option? I mean, the other option is to ignore it, right? Now, in those cases, those are your own projects that you're getting this for. Okay. What if it's somebody else's project in your company? Right, so therein lies the problem for the OSPO. So, well, GitHub does provide us a security alert. So if you are a maintainer of a project and you care about your project, then when there's a problem, you will get an email saying there's a problem with your project. And if you're a diligent maintainer, you will try to do something about it, and if you can, you will do something about it. And if you can't, you'll at least make some sort of attempt to do something about it and hope that the problem goes away. And all that is actually pretty good. You get the alert and it tells you what to do and how to do it. So kudos to them for creating the tooling to make this possible. The challenge from an open source program office perspective is that I have, as I told you, over 330 projects.
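The "you're on 2.1, use 2.2 or later" check described above can be sketched in a few lines. This is a minimal illustration, not how GitHub computes it: the function names are hypothetical, and it assumes plain dotted numeric versions (no pre-release tags or ecosystem-specific rules).

```python
def parse_version(text):
    """Turn a dotted version such as '2.1' or '2.10.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def needs_upgrade(current, first_patched):
    """Return True when the version in use is older than the first patched one.

    Both tuples are padded with zeros so that '2.2' and '2.2.0' compare equal.
    """
    cur, fix = parse_version(current), parse_version(first_patched)
    width = max(len(cur), len(fix))
    cur += (0,) * (width - len(cur))
    fix += (0,) * (width - len(fix))
    return cur < fix


if __name__ == "__main__":
    # Hypothetical dependency pinned at 2.1 with a fix first shipped in 2.2.
    print(needs_upgrade("2.1", "2.2"))
```

Comparing integer tuples rather than strings avoids the classic mistake of treating 2.10 as older than 2.9. A check like this only helps with direct dependencies; for a transitive one, the upgrade has to happen in somebody else's project first.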
And I'm getting these alerts as the sort of master admin of all of our GitHub orgs, and some of the maintainers are getting these alerts, and some of them are diligent in doing something about it and others of them are not, right? They're just ignoring it. So from an OSPO perspective, it's challenging. Now the good news is that in our experience, in terms of our data, 81% of our vulnerabilities are fixed by just moving to a new version. So it's mostly an easy problem in terms of actually having to do something about it. But then it becomes like the sociologically hard problem, because for many of our repos, it's a publish and, not publish and forget, but a publish and I have to do my work. I'm really working on my internal version and I'm trying to maintain this external version as best as I can, but I have a deadline and this is my internal version. And I got this alert and it came in the mail and there were only 10 of them, and there's actually more, I have more than 10 vulnerabilities, but there's a limit to how many the mail shows. So there's like an automation opportunity, right? We get the email which says, hey, there's this high severity alert, and we say, oh, awesome. I need to make sure that somebody sees that email and does something with it. I hope they do, and if they don't, then I can check if they did, right? So the problem for an OSPO: some of our repos are private, in which case you have to opt in. So for public repos, we're okay, but for private repos, we had to opt in, and I think that's still the case. The API that GitHub had provided does not allow us to automate turning on the notifications, which would have been awesome if they did, and they might; we've asked them. So we've shared this with them. And this was six months ago. So some of it may have changed, which is why I said this is somebody else's presentation that I'm presenting on their behalf.
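A figure like the 81% above, and a per-repository count that is not capped the way the email is, can both be derived once the alerts are collected in one place. The sketch below is only an illustration: the record layout (`repo`, `first_patched`) and the function name are assumptions, not the schema the team actually used.

```python
from collections import Counter


def summarize(alerts):
    """Count open alerts per repo and the share that an upgrade would fix.

    Each alert is a dict with a 'repo' name and a 'first_patched' version,
    which is None when no fixed release exists yet (hypothetical layout).
    """
    per_repo = Counter(alert["repo"] for alert in alerts)
    fixable = sum(1 for alert in alerts if alert.get("first_patched"))
    share = fixable / len(alerts) if alerts else 0.0
    return per_repo, share


if __name__ == "__main__":
    sample = [
        {"repo": "project-a", "first_patched": "2.2"},
        {"repo": "project-a", "first_patched": None},
        {"repo": "project-b", "first_patched": "1.4.1"},
    ]
    counts, share = summarize(sample)
    print(counts.most_common(), round(share, 2))
```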
So, but we did share this a few months ago saying, hey, give us an opportunity, give us a way through an API to programmatically turn these on or off per repo. The email itself is limited. So the email is, again, awesome if you're the maintainer and you're an active maintainer; it works great. If you're an OSPO and you have hundreds of projects, of which some have great and diligent maintainers and many do not, then it kind of falls short. Cause it's like, oh, now what do I do with this email? Am I gonna forward it? I mean, there's still opportunity for GitHub to do more, and they are doing more. There's no central dashboard that I can go to. And I think that they recently addressed this in part. And again, there's a lack of automation with respect to the non-GitHub workflow. So if you're living in GitHub, you can create a pull request or an issue for these things. But we manage a lot of things in Jira. So there's this GitHub to Jira jump that we need to make, which is, okay, your publication has a problem. I need to assign it to somebody in the company and then track to see that they're doing it. And again, this is the OSPO take on this problem. So looking at the dashboard, we found that sometimes people ignore these problems. And that to us is indicative of a project that once was awesome and the maintainers have stopped playing with it. And maybe we need to change the project status to archived and then change the README on the project to say, this was once an awesome project. By the way, our maintainers have left and are not maintaining it. If you want to be a maintainer, let us know, because there are some things that can use some fixing and then the project can become awesome again. But until such time that happens, buyer beware, here's a less than awesome open source project that we happen to still have on our repo.
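The "ignored alerts suggest an unmaintained project" signal described above can be sketched as follows. The 180-day threshold, the data layout, and the notice wording are all assumptions made for illustration; the talk does not say what cutoff, if any, the team used.

```python
from datetime import date


def archive_candidates(open_alerts, today, max_age_days=180):
    """Return repos whose oldest open alert has sat untouched too long.

    open_alerts maps a repo name to the dates its still-open alerts were
    raised. The 180-day default is an arbitrary, hypothetical threshold.
    """
    candidates = []
    for repo, opened_dates in sorted(open_alerts.items()):
        if not opened_dates:
            continue  # no open alerts, nothing to judge
        oldest_age = max((today - opened).days for opened in opened_dates)
        if oldest_age > max_age_days:
            candidates.append(repo)
    return candidates


def readme_notice(repo):
    """Draft the 'buyer beware' text to place at the top of the README."""
    return (
        "NOTE: {} is no longer actively maintained and has unresolved "
        "security alerts. If you would like to become a maintainer, "
        "let us know."
    ).format(repo)


if __name__ == "__main__":
    alerts = {"once-awesome": [date(2019, 1, 1)], "active": [date(2019, 12, 1)]}
    for name in archive_candidates(alerts, today=date(2020, 1, 1)):
        print(readme_notice(name))
```

A list like this is best treated as a prompt for a human conversation with the maintainers rather than an automatic archive trigger.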
And again, one of the things that OSPOs have to do at scale in corporations is look at the portfolio of the dozens of projects that have been published over the years and say, okay, all these new things that we publish are awesome. What about all those old things that are no longer as awesome? So again, the good news is that looking at the dashboard, I can say, oh, Mendel, that was an awesome project. But you know what? It's not being maintained. I need to somehow communicate that to people who want to use it. Beware, this is in the diminishing, sunsetting phase of its awesomeness. So I'll tell you what we did, and I'll tell you what the call for support is. What we did is we just looked at the GraphQL API to see what GitHub could give us, and then we started automating things to get things from GraphQL into a Jira queue so that we can manage it the way we do things internally, sort of assigning it and tracking it in Jira. Again, corporate perspective on this. And we set up this thing where we've got a cron job running through Screwdriver, which by the way is our CI/CD system. So it also allows us to do this. Bring the data into this database, set it up so that it either sends a Slack message to the team that's been identified in the database as the maintainer team, or an email to the person, or a Jira ticket through the Jira API. We set up this architecture and said, wow, this is really kind of interesting, and yet we don't really have a tools engineer on the team who can do this. Let's see if we can find somebody who has some spare time to build this out. We found somebody who could, and we said, this isn't just a problem that Verizon Media has. This is a problem that every single company that publishes open source code has. So we went to our friends at Amazon and then to Microsoft and a couple of other companies and said surely you have this problem in your company.
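The flow described above (pull alerts from GraphQL, then route each one to Slack, email, or Jira according to a maintainer table) might look roughly like this. It is a sketch under stated assumptions: the response shape is modeled on what a query against the repository `vulnerabilityAlerts` connection is understood to return and should be checked against GitHub's current schema reference; the maintainer table, channel names, and the `"OSPO"` fallback queue are hypothetical; and nothing is actually sent.

```python
def flatten_alerts(repo_name, response):
    """Turn one repository's alert response into flat records.

    Assumes a response shaped like
    data.repository.vulnerabilityAlerts.nodes[].securityVulnerability
    (an assumption to verify against the GitHub GraphQL reference).
    """
    nodes = response["data"]["repository"]["vulnerabilityAlerts"]["nodes"]
    records = []
    for node in nodes:
        vuln = node["securityVulnerability"]
        patched = vuln.get("firstPatchedVersion") or {}
        records.append({
            "repo": repo_name,
            "package": vuln["package"]["name"],
            "severity": vuln["severity"],
            "first_patched": patched.get("identifier"),
        })
    return records


def route(record, maintainers):
    """Decide where one alert should go; delivery itself is out of scope.

    maintainers maps repo name -> (channel, target). Repos with no entry
    fall back to a hypothetical Jira queue owned by the OSPO.
    """
    channel, target = maintainers.get(record["repo"], ("jira", "OSPO"))
    summary = "[{severity}] {repo}: {package}".format(**record)
    if record["first_patched"]:
        summary += " (upgrade to {} or later)".format(record["first_patched"])
    return {"channel": channel, "target": target, "summary": summary}
```

In a scheduled job, each routed message would then be handed to the matching client (a Slack webhook, a mail sender, or the Jira API), and the database row updated so the OSPO can later check whether anyone acted on it.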
I mean surely we're not the only company in the world that publishes code, gets all these security alerts and wonders whether the maintainers are actually doing anything about it, and every so often audits to find that some of them are really well maintained and some of them aren't, and then wonders about our reputation as publishing garbage on GitHub when we're really trying to promote being good stewards of open source. Surely you can give us some tools engineers to help out and to build something cool together, right? And the answer is yeah. We were able to pull together some folks that looked through the events and created like a little bit of a database schema so that we can start automating this and look at specifics and, well, slides are out of order, okay. What we did is we put it in a GitHub repo called GitHub Security Alerts Workflow. We did this six months ago. I checked this morning. It hasn't been updated in six months. Which sucks, right? Because that tells me that we were thinking about something really good and that we worked with a couple of really cool people that helped us build something that was almost there, and then we went out to one of these conferences and said hey, we have a problem that we think is a shared problem. We think you can help us. Please go to this repo and see if you can help us. And no one did, right? And we still think that this is a shared problem. We do recognize that GitHub is working on things that would help, because they get it and they recognize that supporting individual projects and individual maintainers is good, but supporting organizational projects and groups of maintainers at a corporate level requires a different way of looking at the problem. It's more of a dashboardy thing. It's more of like the super admin of all of the projects versus the admin of any one project, right?
And it's that problem that traditionally GitHub tooling wasn't really focused on because it was all about the individual maintainer and the individual team. But again, the corporate perspective is yeah but some of my maintainers are good and some bad so let's figure this out. So this call, really I guess the message that I wanted to convey here is to help us. And I can think of like, I don't know, two awesome ways of helping. One is to go to the repo and see if you know, take a look at it, implement it, see what works and what doesn't and start like adding to it because if you're an open source person in your corporation especially if your corporation has an OSPO then you probably have this problem and you probably want it solved too. Or somebody in the OSPO wants it solved too. The other way of helping solve this problem is if this is interesting to you and you actually wanted a full time job working in an OSPO as like a tools wonk and this is the kind of like problem that you would love to solve for a living then come talk to me right when I'm done in a few minutes and tell me who you are because I'd like to hire somebody. Not just to do this, I mean, because this project will be done in I don't know, two months and then we're good. But to do this plus a whole bunch of other projects like this that automate the kind of workflows that larger OSPOs need that individual maintainers are already cool with. I get an email, I get a pull request, I either handle it or I ignore it. But corporations can't afford to ignore it because it deals with the reputation. So this is a pitch for people who are interested in this to help. 
Again, one of the ways is to like apply for this job that I have to do this kind of work, and you would be working on something like this: hey, let's parse through the GraphQL, automate something that goes into Jira, put a report up to see how well we're doing, and if somebody is totally ignoring their thing, automate the archival of it, turn off the notification, put in some sort of README flag that says this was cool, knock yourself out if you wanna use it, tell us if you wanna fix it, we will have you join. For those people who have this notion that security and open source don't really work well together, and I especially speak to like the corporate crowd in the more traditional context that I deal with, where people say, oh, open source is less secure, I remind them that the question isn't whether open source is more secure or less secure, because that's not a useful fact. That's not a fact that you can gather. Rather, open source is more securable. It has the potential to be more secure than closed source because more people have access to the code, so you have more people who are able to fix it, whereas with closed source you fundamentally have a smaller number of people who can fix it, and therefore if they're diverted because they don't have budget or they've left the company or whatever, they can't fix it. So whether or not a particular piece of open source is more secure depends on that particular piece of open source, but categorically every open source project can be more secure than its closed source counterpart because anyone can come and help it become more secure. So you have to take advantage of its potential to be more secure in order to achieve open source's dominant security posture over closed source. Again, I think that open source is more securable; it has the potential to always be more secure.
So with that I wanted to thank the people who worked on the project that we put together because it was sort of a cross company collaboration but it's completely not done and could certainly use some help from some other folks and to do that I would like you to reach out to me. My name is Gil Yehuda, I run the open source program at Verizon Media. Ashley is on my team but she wasn't able to be here but she put together a lot of the code and the architecture and really owns this problem. You can find us on Twitter, you can find us on LinkedIn, you can say hello to me right after the talk and I'm glad to take any questions from the audience. Have I found any of these to be false positives? You know, I don't have data to say yes. I can say that a lot of the people, we assign these to somebody and say, hey, can you fix it and they come back and say, yeah, no problem, whatever. It's like it wasn't a big deal but what it was is like sometimes we pin to a particular version and sometimes we set it up so that it's just like the latest version, depending on the build system and how we build the software. Sometimes they say, you know what? The fact that there was this problem reminded me of like this configuration or showed me that there was a configuration error in the way that we did things. So it was like the security thing wasn't the thing that scared us but the alert indicated there was a different problem. So we fixed the other problem. So does that mean false positive? No, it means that it's an indicator that it's, you know, it canary died in the minefield. Was it because of, you know, because of the gas? I don't know, but the canary died. So we looked and we said, hey, maybe something else is wrong, let's see. Yeah. 
Right, right, so sometimes like, you know, back to sort of like the larger arc: as an OSPO, just as much as I care about the security of any one project, I also care about the reputation of our program, so that the community perceives that the open source code that we publish is code we care about and is relatively good. So if there's a whole bunch of security alerts on my code, even if they're like false positives, I want to take care of those. Like I don't want you to go to my code and say, oh, there's a whole bunch of security alerts. It must not be good. Because most people aren't going to go through the diligence to say, oh, they're all false positives. I'm sure all that code from Yahoo is awesome and has no bugs in it. Like, no, you're not going to think that. You're going to think, oh, I'm sure it's terrible. Right, so I need to allay the perception as much as the reality. So again, from a programmatic perspective, I want to burn those down. I want to take care of those and make sure that we don't have any alerts, even if they're false positives. Yeah, it is, but it's a different alert. Right, so we do get alerts from GitHub, and I think they just announced last week or earlier this week that they've expanded their token scanning. So GitHub does scan for certain tokens, like if you have an AWS key or whatever. GitHub will say, hey, we found something in your code that you probably didn't want in your code. And by the way, with those, there are false positives too. We have code with fake social security numbers because it's a social security demo that processes something, and there's a fake SSN in it, and the alert says, you've got an SSN. Yes, we know it's a fake one. So we do scan for that. We try to identify that before we publish code. Obviously that's one of the reasons we have an OSPO: to scan and to make sure that we're not publishing stuff that we shouldn't.
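One common way to handle the fake-SSN kind of false positive in a pre-publication scan is an allowlist of known demo values. The sketch below is a generic illustration, not GitHub's token scanning and not the team's actual scanner: the pattern, the allowlisted value, and the function name are assumptions.

```python
import re

# Matches the NNN-NN-NNNN layout only; it does not validate real SSN rules.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical fixture values that reviewers have already confirmed as fake.
KNOWN_FAKE = {"123-45-6789"}


def unreviewed_ssn_like(text, allowlist=KNOWN_FAKE):
    """Return SSN-shaped strings in text that are not on the allowlist."""
    return [match for match in SSN_PATTERN.findall(text) if match not in allowlist]


if __name__ == "__main__":
    sample = "demo_ssn = '123-45-6789'  # fixture\nother = '000-12-3456'"
    print(unreviewed_ssn_like(sample))
```

Keeping the allowlist explicit and reviewed means a new SSN-shaped value still gets flagged, while the known demo data stops generating noise.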
And we also have a bug bounty to find out if we publish something that shouldn't have been there. In some cases it's a, hey, there's like, there's something that shouldn't be there. So let's take that down. That's part of reality. And that's all, and currently that's not yet automated. So again, that's manual. And to run at scale, that becomes a difficult thing. Yet again, why I want to automate this, which is why I'm shopping for a tools engineer who can say, this is a really awesome problem. I would love to be paid to solve it. You're probably, yeah, I think you, so thank you for the correction. You're right. It's hard to see, you know, when you see something and you don't see what other people see, it's like, if I look at this, oh my God, there's all these things, it's gonna look terrible, let's fix these. But if other people don't see it, I guess? Wow, phew, that's much better. And then I was like, no, that's much worse. Actually, maybe I kind of want other people to see that because I kind of, like, I don't know, I'm kind of, I have mixed feelings. I wouldn't mind a feature where GitHub shames me, right, and says, you didn't do the, like, I'm kind of okay with having GitHub shame me if I'm not doing a diligent job of providing you with awesome software. And I think that might help fix the problem. So you may be correct, but I wouldn't mind the ability to turn that on and make it public. To me, that's the finding of modern languages. So all these modern languages like Node and Go and Ruby and all that, when you write this and then you build it, and suddenly you have all these things that came along for the ride, you're like, having somebody say, hey, did you know that your code depends on all this other crap that you never even looked at before? Like, oh, I didn't know that. And maybe I need to carve that out a little better because, right, so you're right, it may not be a security flaw, but it does tell me more about what's in my code. 
And at the end of the day, I feel responsible to make sure that when I'm giving you code, I'm giving you code that you trust. So I want to do a diligent job of earning that trust, but in order to do that at scale, I think there's a little more work we need to do. And I'm grateful to GitHub for the work that they have done so far, and I'm hopeful and encouraged that they are gonna continue to do this work, but I wanted to share with you what I think is the state of affairs, which is we're pretty good, but we have more to go. And with that, I'm gonna let you guys go because it's time, but if you have any questions, like, feel free to.