Hi, I'm Eric Sorensen. I'm the product manager for the OSPO at GitHub. I've been a Hubber since last August. Come on in, folks. You're just about to beat the rush and get the good seats. Feel free to find a spot anywhere. Prior to starting at GitHub last year, I worked at Puppet for about eight and a half years. And prior to that, I spent most of my career as a systems administrator and an SRE with a focus on what's now known as DevOps, but at the time was just called configuration management. I worked in the open source community there for a long time, and it's really great to be back amongst my people here. Today, I'm going to talk about open source at GitHub. I'm the product manager for our open source program office, and GitHub has a unique position, I think, with respect to the open source community. Software development generally is deeply collaborative and sort of the world's largest team sport. Open source software is... I don't have to belabor the point for this audience, but it's about enabling organizations and individuals to collaborate on software that we can all use together to build better solutions. There are great advantages to doing this, and at GitHub we've seen some tremendous growth, particularly in the past few years, where things seem to be catching on in the mainstream in a way that previously we'd only really dreamt of. GitHub, as a company, is committed to helping our customers make a better world through open source, and in this presentation, I'll talk about that through the perspective of our open source program office. So we truly believe that open source is a differentiator for enterprises, and our mission is to enable organizations and individuals to achieve more. This is a quote from Thomas Dohmke, our CEO. He's spent much of his career as a developer, and I have to honestly say it's been wild and refreshing to have people at the highest levels of the executive chain at GitHub who really, truly get it about open source.
We're not having to constantly justify the rationale and the investments behind why we do what we do. They really get it and have continued to pervade the DNA of the company with a deep commitment to open source. There's sort of an outdated stereotype of open source as the realm of hardcore hackers and fringe hobbyists. It's really completely invalid now. It's squarely in the mainstream, to the extent that if you're at an enterprise that isn't establishing an open source brand (a phrase which I hate, but there's no better option), you're probably losing out to a competitor who is. And just on the merits, open source in the enterprise means that your developers can focus on building differentiated value instead of toiling away on commodity or utility components. You're not in the container scheduling business, or if you are, you should probably rethink that plan. You're not in the logging framework business or any one of the hundreds of necessary but already invented wheels that go into a complex application. Being able to use an existing library really accelerates your development cycle. It lets you leverage the power of the community and enables you to focus on what matters to the business. But as Scott McNealy said a long time ago, open source is free like a puppy is free. It's not free like free pizza. It comes with a set of responsibilities and different risks that many enterprises are not used to. And it's particularly relevant if you care about doing open source right and being a good open source citizen, which is the phrase that I've heard several times over the past couple of days, and really engaging in the community in an authentic way instead of sort of plundering the labor of others and not contributing back and becoming part of the community. So at GitHub we don't inherently do everything right, but I do think we have a pretty good perspective on open source in the enterprise.
And first and foremost, we are a software company just like any other, and we've gone through our own journey with multiple twists and turns with open source. So I'll start by talking about that. We have a strong open source program that encourages contributions, respects license obligations, and allows engineers to use open source with ease, to work out in the open, and to release their own projects while still maintaining security and compliance. At GitHub we see that our developers are using open source in order to get their jobs done more effectively. We're using something like 45,000 different open source components, and the number's probably higher now, across all of the software that GitHub builds and ships. And this allows us to focus on innovating, as I mentioned before, to eliminate a bunch of that toil and focus on building things that really matter. One interesting thing that happens, too, when employees are allowed to contribute freely to open source is that the distance between developers and the customer shrinks. They don't need to go through support, sales, or product management necessarily to learn how their software is being used and what the customers want. They directly interact with people using it, and that's satisfying to developers as well as to customers, who get their needs heard directly. And this goes back to that point about establishing a brand. I've heard from a lot of the OSPOs that I talk to amongst our customers that it's really important in a hiring market to have established credibility as an organization that participates in the open source ecosystem, to be more attractive to job seekers. So a big part of the depth and the breadth of this adoption is cultural. We have what's called a balanced intellectual property employment agreement. It says that employees are welcome to work on open source outside of work.
So if you have a passion project working on Arduino stuff or Raspberry Pis or whatever, there are no inherent approvals that you need to go through in order to work on those kinds of things. Additionally, for things which are work related, there are really just some very minimal impediments to contributing upstream and to working out in the open. And for projects which are developed internally and which we want to release out into the open source world, I'll talk about that in more detail in a minute, but we have a pretty streamlined process for a developer who's built something and wants to release it out in the world to make that happen. I'll talk more about each of these points, but the idea here is that we want to help the projects that we work on, share and maintain our own stuff, and work in the community, while still maintaining some of the same kind of compliance and intellectual property protections that every other organization has. So we're heavily involved in upstream projects, to the extent that there are engineers and whole teams at GitHub that are dedicated to working upstream and in ecosystems that are critical to the business. These are just a few of them. Git, obviously: we have core maintainers on the Git project, and the programming languages and frameworks that we use most frequently also get upstream involvement. From an ecosystem standpoint, these are more open source platforms for hosting artifacts and building a marketplace that GitHub's also involved in. There are a few larger projects that started off as independent open source, like Dependabot or npm, and things that were created by GitHub initially, like the Desktop and CLI clients, that started off as open source from the very beginning.
Foundation-led open source is increasingly the way work gets done out in the community, and that's been an interesting shift for me personally, because the infrastructure tools where I started off in my open source career, like Puppet, Chef, Docker, and Terraform, were coming out about 10 years ago. They were nominally open source but were really dominated by a single vendor in a lot of ways, or they were an open core project that was tied into a larger commercial thing. So there's been a shift over the past few years towards foundation-led open source. Obviously the CNCF and the Linux Foundation, which is why we're here, are sort of leading that charge. Clearly there are companies that are built around CNCF projects, but it's very difficult to walk in with a thing that you've built as a single vendor and find success in those communities. It's more about working with the special interest groups and building collaborative relationships with people that are also working on the same problem space, and then moving on into implementation. So again, we want all of this to succeed, because more open source equals more good, and we want to continuously improve what we can do as the platform where so much of this discussion, design, development, support, and community interaction actually happens. Culturally, we try to work in the open as much as possible. I'll talk a little bit more about the mechanisms behind that, but I think it's really important to model the community interactions that we want to improve. So these are a few things that we're pretty adamant about publishing and keeping in the open. The entire docs system for docs.github.com is open source. Primer is our design system. I had a really good conversation with a couple of folks yesterday about open source in design and how people are reinventing things over and over again, like icon sets and that sort of thing.
It's not really differentiated value. Obviously design is really important, but making a pixel-by-pixel representation of a save-file icon is not something that people need to keep reinventing. Primer is what we use for everything that's on GitHub and in our products. It's also an open source site at primer.github.com that you can go and check out. The roadmap is interesting. There's a repository at github.com that has all of the things that are flagged as being public in the internal roadmap project. It's a great way to get community and customer feedback on issues and to be as transparent as possible about the things that we're trying to accomplish in the future. That said, there are exceptions, and some restrictions may apply towards working in the open. Obviously not everything is open source, and I think that's true for every company. In some cases the tools and the projects that I'll talk about here are not open source because they just aren't open yet, but they could be. And in some cases the team made a decision not to open something, even if it was potentially a good candidate, because there's a non-zero cost to doing that. Once you put it out in the world, like the Scott McNealy quote from a minute ago, it's sort of like being given a free puppy. You do have to continue to feed and care for it over the lifetime of the project, and so sometimes we'll make a strategic decision not to open source something. So, a bit more about that roadmap item, if you want to check this out. It's amazing the level of detail that you can get into, and if there's a particular problem that you're having with GitHub, I would encourage you to look here first and see if there's already an issue that you could comment on and add your voice or use case to.
Some things are obviously not on there. Copilot, for example, wouldn't show up on this until it was actually launched. But for in-market features we really do encourage teams to keep the public roadmap updated and to get real-time feedback from customers and the community that way. Under the hood there's a private repository with a superset of all these issues, and the ones that are tagged as being public roadmap periodically get synced out to the public repo. So maybe that's something to think about for your organization. Obviously we heavily use GitHub Issues, so it's easy to implement that sort of automation, but I think the idea holds in general. There are plenty of other companies that have opened up their product roadmaps. GitLab, for example, has taken it to the theoretical extreme and has literally everything on their roadmap going out a couple of years. It's amazing to see what people are interested in and to get that near-real-time feedback about what's important to your customers and your community. So though we've been working in open source for a long time, in 2021 GitHub established a formal OSPO program to centralize governance and coordinate activity around managing our own open source projects and working upstream. This is the group that I'm in, and I'll talk a little bit more about how we work and what we're doing. Our mission overall is, as it says, to help individuals innovate more through open source. But we're in the interesting position of having sort of a dual role, where we maintain our own open source projects and work on things inside of GitHub, as well as help everybody out in our customer base and in the wider community who is using GitHub as the central point for their open source activity to do so more effectively. We think of this in terms of the program side and the product side: programs being internal-facing work that helps us primarily, and products being outward-facing work.
I'll dive into each of these in a little more detail, but just to give you a quick overview: from a program standpoint, we're really concerned with the problem of license compliance. We're also working on durable ownership. At CHAOSScon, Sylvia mentioned this concept of sustainability, and differentiating point-in-time project health from the question of how sustainable a project is over the long term. Resilience was a term that came up as a more descriptive phrase for that. We're using the term durability. That just means that the projects that are out there in open source have a defined maintainer, that we know who is responsible for them, and that they have sort of an SLO in the same way that the software that runs internally does. And then there's the release process. I mentioned that a moment ago and we'll get into more detail. On the product side, personally I think this is really interesting. We're building things into GitHub that help people work at that intersection of large organizations, large enterprises, and the open source communities that they work in. So our primary user here is the OSPO manager, of which I think there are several here. I'm really concerned with the problem of how you use GitHub to maintain and manage the open source that you have generated internally and have maintainership over, as well as working upstream in projects and communities that are outside of your purview. We have a dashboard that shows organizational health metrics, and I'll show an example of that. The open OSPO project is something that I'm really keenly interested in hearing your responses about. We have open sourced a bunch of the policies and procedures, as well as some tools and guides that we use internally, and are trying to see if those are helpful for people who are bootstrapping their own open source program office, just to have a template or a starting point for things like a contribution policy for your employees to work on open source.
And the last one is sort of a catch-all bucket of friction fixes: cross-cutting concerns that are primarily affecting OSPO managers but may also be painful for maintainers of large open source projects, things that wouldn't necessarily bubble up to the top of any one product team's backlog. In the OSPO we can look broadly across the platform and ask: can we help out people who are experiencing pain in this particular area? Can we dive into this part of the code base and help improve that for the customers that we care about the most? So I'll talk a little bit more about these programs. As I mentioned earlier, we use a ton of upstream projects to build GitHub. There are lots of dependencies across thousands of repos, and many of them got pulled in back in the distant misty past, when there wasn't good governance around their usage. So one of the first projects that we undertook was to get a handle on what those dependencies were, to mitigate the risk of unapproved licenses, particularly in software that we distribute and ship out to customers. The first idea is to implement a "get clean" workflow, that is, to get to the point where we understand the current state of things, we have everything nicely bucketed, and we know where there are exceptions and what those exceptions are. The second is to be minimally annoying and not bug people unnecessarily. A lot of the time, if the license data is incomplete in a repo, its owners are the only people who can go in and really understand what the ramifications of fixing that are. But alerting people unnecessarily and causing a bunch of chaos amongst the development teams would not be a successful way to go about this, so we want it to be minimally obtrusive. And lastly, we want to do this with an eye towards potential future productization, to think about what this would look like if we were to make it available for customers.
And just to get into a little bit more detail about this, there are some cool open source components of this project. The overall thing itself is, again, one of those things I mentioned at the beginning that the team made a decision to implement in a particular way and not to open source as a whole, but there are definitely big pieces of it which are reusable and which are worked on in the open. The biggest of those is a project called ClearlyDefined, a service now under the purview of the OSI that holds license information for tens of thousands of packages, along with the provenance of those packages. We use that as a source of truth for finding out license information. We've also written and open sourced a Go library called go-spdx. We've seen a couple of talks in the past couple of days about SPDX. This library allows you to express, in the SPDX language, a list of permissible licenses. The service compares that policy against the ClearlyDefined results for the packages that it has scanned from repositories, and opens up issues if there's something that's on the disallowed list.
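[Editor's illustration] The comparison step described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not GitHub's implementation: it does not use go-spdx or the ClearlyDefined API, the allowlist and package names are hypothetical, the license data is hard-coded, and it only handles flat SPDX-style expressions with AND/OR and no parentheses.

```python
from dataclasses import dataclass

# Hypothetical policy: SPDX identifiers this organization permits.
ALLOWED = {"MIT", "Apache-2.0", "BSD-3-Clause"}


@dataclass(frozen=True)
class Package:
    name: str
    license_expr: str  # e.g. "MIT" or "MIT OR GPL-3.0-only"; "" if unknown


def is_permitted(license_expr: str, allowed: set[str]) -> bool:
    """Evaluate a flat SPDX-style expression (no parentheses).

    "A OR B" passes if either alternative passes; "A AND B" needs both.
    AND binds tighter than OR, so we split on OR first.
    """
    if not license_expr.strip():
        return False  # missing license data is flagged, not silently passed
    return any(
        all(term.strip() in allowed for term in alternative.split(" AND "))
        for alternative in license_expr.split(" OR ")
    )


def find_violations(packages, allowed=ALLOWED):
    """Return the names of packages whose license fails the policy."""
    return [p.name for p in packages if not is_permitted(p.license_expr, allowed)]


# Hypothetical scan results standing in for data from a license database.
packages = [
    Package("example-a", "MIT"),
    Package("example-b", "GPL-3.0-only"),
    Package("example-c", "MIT OR GPL-3.0-only"),
    Package("example-d", "MIT AND GPL-3.0-only"),
    Package("example-e", ""),
]

print(find_violations(packages))  # ['example-b', 'example-d', 'example-e']
```

In a real pipeline each name returned here would become an issue in the owning repository. Treating an empty license as a violation mirrors the point made in the talk that incomplete data is itself something to surface and fix.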
It's a big undertaking. It's been a big project, and we've learned some interesting lessons as we've gone through it. The first one is that it was not a huge set of repositories that had problems. You look at the number of them and the state of things and worry that it's going to be tens of thousands of alerts and a massive undertaking to try to fix them. It was definitely a relatively small number of repositories that were affected, and about a thousand total issues that initially came up. After we cut down the initial data quality problems, even amongst those 350, most were just bad data, like a license file that wasn't in the right place or had terminology that the software couldn't parse correctly. Very, very few of those required actual code changes. The absolute last step in the chain of having to fix something is this: you have included a dependency, the dependency has an incompatible license, there's no way around that, and so we have to change the code that we've written in order to use a different library or a different implementation. Of those 350, I think there have been zero of those cases. We haven't found anything where we've had to actually go in and retool in order to fix an irrevocable problem. But despite all that, it's still pretty annoying. We've definitely gotten some feedback from the developers, as the issues were opened up in their repositories, that the documentation needs to be clearer and that the chatbot interactions available in the issues are maybe not as intuitive as they would have liked. So we're still trying to work on the developer experience and the UX around making that less obnoxious. Ultimately, more and better documentation and more automation to go through and fix those things would be great. Additional curation of the ClearlyDefined data always helps, and being able to send in a pull request to the upstream source of truth for that data means
that everybody who is affected by a bad license, or bad license information, gets the benefit of it. Additionally, as we expand the scope, more dry runs and just eliminating those spurious alerts will be a huge boon to the end users. As I mentioned, you can see the go-spdx library that's available. ClearlyDefined itself runs as a service that you can communicate with over an API, but the service is itself open source. So, as I mentioned, we're also really interested in this problem of durability and durable ownership. We use the term durability around internal software components to indicate that a component has human owners who agree that they are maintainers. That's an important part of it: they're not just a name written in a file somewhere, they have also agreed to take on that responsibility. It also means that the component has an SLO that's appropriate to its criticality, and that the users of that component have a path to getting support for it. So last year we started a project to extend this to open source projects under the GitHub organizational umbrella. There's a bunch of work here involving getting an inventory of these organizations, migrating them to a new enterprise, separating managed repositories from those with external collaborators, bringing the unmanaged organizations into our access control system, and then adding the ones that were actually live into a service catalog that you can see here, so that they live alongside the internal software and have a clear path of ownership and all of the good things that come along with that. I thought this one was going to build out, too. Some lessons learned from here: if you are in a similar situation in your organization, where you're trying to wrangle a large mass of open source that has been written over the course of several years and maybe has unclear provenance, maybe you can learn from the work that we've been doing. The first one is to go invent a time machine and go back in time before all that stuff is created and prevent
it from going out the door in the first place without having this kind of ownership set. In the absence of that, try to get a handle on it as early as possible, because the longer you delay, the more chaotic it becomes and the harder it becomes to wrangle. It's important to backstop those written policies with automation and tools. It's one thing to say that you can't create new organizations; it's another thing to actually physically prevent that from happening inside of your tooling. Ideally you want to make it easy to do the right thing. I think most developers want to do the right thing, but it can often be confusing or difficult to know what that is. If you also make it so that there are checks and balances or protections upstream of the point at which they could go wrong, that helps keep them on the paved path and makes it easier to stay clean over time. And this last one is around providing incentives and not just deterrents. I think this is true broadly speaking, but in this case we want to show people that even though there's some work that they have to do, or there might be some changes they have to make, there are benefits at the end of it. In the case of, say, this migration into another organization, it meant that they didn't have to go through an onboarding process to add external collaborators to an open source project. They could just go ahead and add them ad hoc as they wanted to, because there was no longer a risk of associating them with an enterprise. So that was appealing, and we wanted to highlight that and make it as easy as possible for folks. The last program piece that I want to talk about is our open source release process. This policy and the process itself are available in that github-ospo repo that I mentioned. You can use the templates that are there to start up a repository for yourself that has some of the same setup that we use at GitHub. It starts off with a written policy around how you can release a piece of
software you've written internally as open source. It's obviously something that needs to get buy-in from across your organization, from your legal department and from the other stakeholders that are involved. But once it's established and published, you're done, and you can just point back at the policy for people that are working on software. The policy points at the implementation, which is in a repository that has those issue templates that I mentioned. Users can go in, make a new issue, and fill out a pretty simple form. It has maybe gotten some procedural scar tissue over time, as all process does, but it's still pretty compact. It describes what the software is that they want to release, whether there are any burdensome intellectual property concerns that we might want to be concerned about, any cryptography-related things, those kinds of things, and a short checklist of things that they can do to make sure that the repository is in good shape to be open sourced. That is, does it have a clear code of conduct file, a clear license, a clear set of maintainers? Once the issues are in, we have a periodic triage amongst the OSPO team where we go through and review all the issues that have come in since the last time, as well as ones that we've been shepherding along the process but that aren't quite through. We have some back and forth with the folks that are responsible for the software, and we have open office hours where they can drop in. Frequently we'll just work through one person's problem during office hours and get their code released at the end of the session, so that works out really well. And then once it's out, the question of sustainability and maintenance comes in for those releases. Hopefully, if they've gone through the process correctly, they are now set up for success with respect to, you know, the list of maintainers and some external collaborators who can help out with issues and pull requests and those kinds of
things. And again, I'll put the URL up in a minute, but this is all available for you to use as a starting point for your own organization. It works really well. It's, I think, one of the more satisfying parts of the job that I do. It's really great to work with the developers internally and to help them take something that they've written, and have maybe been trying to get traction on for a while, and actually get it out into the world. It's a pretty fun process. So I've been talking a lot. I'm going to pause for just a moment here. If there are any questions about any of those program pieces I just talked about, I'm happy to take them now and then move on to the next section. Are the issues with ClearlyDefined resolved now? Is that the question? I believe so. It's definitely had some additional attention put on it in the recent two to three months, and I think the transition out of being purely a Microsoft project and into the purview of the OSI means there's a dedicated person who's responsible for it and whose job it is to care about and maintain the service itself. So I think we're in better shape than we were several months ago and on the right trajectory. The other question is how recursive the checking for license compliance is. It goes through the full dependency tree, so if you have a package.json that pulls in files, it will recurse all the way through there and check those. Oh my god, I knew that was going to happen. I'm so sorry. I'm a sound guy myself, and I just put my water next to the amplifier, and I'm mortified that I did that. Sorry. Okay, I'm going to keep going. The "drinking our own champagne" section: what parts of GitHub do we use to foster collaboration and open source culture? Sponsorships. If you were in the room earlier, Mike and Gerald from Stripe gave a great talk about how they use GitHub Sponsors at Stripe to build a sustainable set of open source dependencies that they contribute monetarily towards.
Sponsorships really do move the needle for maintainers, particularly sponsorships from organizations. A single organization sponsor can make a project more sustainable than 100 individual sponsorships. Steph Lakin, who runs the program, wanted me to mention that the average sponsorship from an organization tends to be about 14 times what an individual sponsorship is, so it can really help out projects that are underfunded and are trying to find a path towards sustainability and to scale up their work. So we've been focused on removing friction so that organizations can sponsor at scale, like adding invoice payments and a dashboard for insights. These features are now available to all organizations, and we've used sponsorships ourselves to help fund projects that we rely on. At GitHub we heavily use Discussions internally. I chatted with somebody the other day who wanted to encourage more collaboration amongst the developers in their organization, but because everything was centered in issues, they felt like it was a high bar for people who weren't necessarily technically involved with that code base to get involved and to participate in a meaningful way. At GitHub we use Discussions to solve exactly that problem. There's sort of an internal philosophy that everything should have a URL, and having a discussion about a question or a problem gives that problem a distinct URL that people can go to and contribute to. They can chime in on the discussion with what feels like a lower level of commitment, or a lower cognitive barrier, than having to go and make a pull request or create an issue. They can just chime in on a discussion and have their voice heard that way. This is heavily used in engineering and company-wide communications, and individual teams have really adopted this workflow for talking about things that are in the early design phases and haven't quite yet moved into a more formal kind of
engineering plan. I talked a minute ago about the metrics dashboard. This is a product piece that we in OSPO engineering have built and made available to users on an opt-in beta basis. We really focused on surfacing community standards, so that as an OSPO manager, or as somebody who is responsible for a large number of open source repos in an organization, you could find ones that were missing a README or a code of conduct, or that had a license that was incompatible with your policy. We also wanted to surface some of the contribution data and project activity, so you could see projects that were becoming stale, where the activity levels were tapering off and they weren't getting as much interaction with the community as they had been, say, three or four months ago. And there's a quick snapshot of it. We're at a place with this where we've finished up the beta and are looking at what we need to move on to the next phase. I had some great conversations at CHAOSScon with folks, and I'd love to talk with any of you if you're interested in this sort of metric and this question of getting your arms around the state of the repositories across your organizations, and finding problem spots where a maintainer is burned out or maintainers have left and there's a backlog of pull requests that aren't being addressed, or conversely, projects that have unexpectedly gotten popular and need more love and attention in order to break through to the next level. So I'd love to chat with you more about that if you're interested. Come find me afterwards. As I've mentioned a couple of times, many of these policies that I've talked about are available in this open OSPO project, in the repository at github/github-ospo. We're going to keep adding more resources over time, but the contribution policy and the open source release process are up there today, as well as the templates for new repos
that you might make in your organization, and some more guide and how-to kinds of docs. So check it out. Let me know if there's anything on there that you find useful for your organization, or more types of things that you'd like to see, either other things I've talked about today or things that would be helpful to you to expand the open source practices inside your organization.
So we have a community that we started up. I mentioned Discussions earlier. There's an OSPO-focused discussion area underneath community/OSPO. This initially started off as a way for the beta users to provide feedback on the dashboard, but we've gotten discussions started from folks asking for feature requests in other parts of GitHub. There's a great discussion with Jordan and with Tierney around two-factor authentication and how we could improve the workflow for rolling out 2FA. There's the github-ospo repo that I mentioned, and just a shout-out to the TODO Group ospology repo, which has a ton of great information, particularly if you're getting started and trying to figure out how you can establish an OSPO or establish an open source practice inside your organization and you want help doing that. That's all I've got. Thanks for your time, and I'm happy to take questions. Thank you.
I honestly can't believe I spilled water all over the receiver. I'm so sorry. Go ahead. Yeah, sure. So the question is about the dashboard beta: what are the next steps now that the beta is closed? The first one is that we are exploring options for a different back-end data store. Hitting our target of organizations that we wanted to onboard coincided, unfortunately, with the rapid sunset of the platform that we're using on the back end there, so we've got to rethink that a little bit. But part of it is making the data that people find most valuable available in the GitHub APIs, so you can incorporate that into your own dashboards and, you know, incorporate it with your own
data sets. We've had requests for wanting to overlay things like PR close rate over an SLO, which is an internal bit of data that we don't have insight into. If we make that available through the API, that becomes a lot easier for you to build on your own. So yeah: investigating a new back end, making that stuff available through the API, and figuring out whether we can enable it for organizations on demand instead of it having to go through kind of a manual onboarding process. That's sort of what's up next for it. And I'm sorry about the timing there; it's unfortunate. Any other questions? Yeah, go ahead.
So the question is: when the license compliance scanner finds a problem with a license, what's next? Are we making recommendations about what they ought to do? Again, it really depends on what the nature of the problem is, and we have a few steps. To the later point of your question, how can we make that information available: certainly the process and the decision tree that we've built is a good candidate to add into the open source repo and make available, because it's implementation-agnostic and it might be helpful to answer questions like this. But it really is a sort of flowchart of decision making. Is the problem that the license data is incomplete or inaccurate? If so, can we fix that in ClearlyDefined, in the upstream source of truth, or do we need to fix it in the repository? If we need to fix it in the repository, is it something like a typo, where they put, you know, A-P-A-C-H-W instead of Apache, and we can just fix that typo and improve the quality of it? A surprising number of things like that are really easy to fix. If not, if there's really a substantive problem with it, in the sense that the software really is licensed under something that's incompatible
with the policy, which, to a point of your question, we did not explicitly set. We worked in conjunction with our legal organization to determine what categories of licenses were okay, broadly: permissive licenses, more restrictive licenses, very restrictive licenses. We categorized the licenses that we found into those buckets so that we can make policy determinations without having to go back and make the attorneys read every single open source license in the world. So we helped shape that with them. But if there ends up being a problem, we'll work with the developers that are responsible for it to go through and fix it. But again, so far we haven't found anybody that actually had a serious problem, in the sense that there was a completely incompatible license that was against policy and deeply embedded into the software, so that we had to go out and find a replacement that was under a better license but had equivalent functionality to the one that was not allowed. There haven't been any of those cases yet, but if there were, we would definitely want to work with them on it, because that's the big fear: that you would have to redo a ton of engineering work, not because of anything that's wrong on a technical level, but purely because the policy says that you can't use it. But so far it hasn't been a real problem. Any questions? All right, thanks very much. I'll be around, out at the booth, probably this afternoon, so stop by the GitHub booth and chat and grab some stickers, and I'm happy to talk more about any of this stuff. Thank you very much.
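The license decision flow described in the Q&A could be sketched roughly as follows. This is a minimal illustration and not GitHub's actual tooling: the function name `triage`, the bucket assignments, the typo table, and the choice of which buckets count as allowed are all assumptions made for the example. Only the three broad categories and the order of the questions come from the talk.

```python
from enum import Enum
from typing import Optional


class Category(Enum):
    PERMISSIVE = "permissive"
    MORE_RESTRICTIVE = "more restrictive"
    VERY_RESTRICTIVE = "very restrictive"


# Hypothetical bucket assignments, for illustration only. In the talk,
# the real buckets were worked out together with the legal organization.
LICENSE_CATEGORIES = {
    "MIT": Category.PERMISSIVE,
    "Apache-2.0": Category.PERMISSIVE,
    "MPL-2.0": Category.MORE_RESTRICTIVE,
    "AGPL-3.0-only": Category.VERY_RESTRICTIVE,
}

# Assumed policy: which buckets are acceptable without further review.
ALLOWED = {Category.PERMISSIVE, Category.MORE_RESTRICTIVE}

# Hypothetical table of known misspellings, keyed in lowercase.
TYPO_FIXES = {"apachw-2.0": "Apache-2.0"}


def triage(declared: Optional[str], upstream_can_fix: bool = False) -> str:
    """Return the next step for one dependency's declared license."""
    license_id = (declared or "").strip()
    if license_id not in LICENSE_CATEGORIES:
        # The license data is incomplete or inaccurate.
        if upstream_can_fix:
            return "fix license data upstream"
        corrected = TYPO_FIXES.get(license_id.lower())
        if corrected:
            return f"fix typo in repository: {corrected}"
        return "fix license data in repository"
    if LICENSE_CATEGORIES[license_id] in ALLOWED:
        return "ok"
    # A substantive problem: the license really is against policy.
    return "work with the owning team"
```

For example, `triage("Apachw-2.0")` lands on the typo branch, while `triage("AGPL-3.0-only")` falls through to the case where the team that owns the code has to get involved.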