Hi everyone, thanks for joining this talk on managing the GitHub organization, where I'll go over some tools, tips, and best practices that were useful for us when we got started. My name is Mark Matias. I'm a software engineer at Qualcomm Technologies Inc., where I work in the software content compliance team. We provide tooling and support around open source compliance and distribution.

Here are some of the topics I'll cover today: a quick intro and how we got started; the process and workflow we developed; documentation of best practices and guidelines for internal employees to use; some tooling that we leveraged along the way; build automation and publishing; branding and a landing page; and metrics and analytics for your org.

So how did Qualcomm Innovation Center (QuIC) get started on GitHub? We started working on github.com/quic about 18 months ago. Engineering teams working on things like machine learning demonstrated interest in engaging with the open source community on GitHub. Around the same time, we also had partners that wanted to collaborate on GitHub.
So we realized we needed a process around this. QuIC has been contributing to open source projects for a long time, but we never had a central location for projects that we own, so this was a new effort for us.

One thing that actually helped quite a bit was identifying a "champion" project that helped move things along. It helped push us through hurdles and get buy-in from leadership and legal. For us it was the AI Model Efficiency Toolkit, which became our first project on GitHub. And as the saying goes, if you build it, they will come: once we got that one project on there, other engineers started to learn about it, we started to get a lot more requests, and it just grew organically from there.

So, easy, right? Just create a GitHub org and you're done. Well, to do this right you're going to need resources and time, and you're going to need help to maintain it. So the first thing we did was develop a proposal and plan and circulate that around to get feedback and buy-in at different levels. Here are some of the things that we had in there: use cases and driving factors (why are we doing this?); what the process and workflow would be for engineers and legal; documentation of all the management tasks that we anticipated having to support; what kind of tooling is available for us to use; and then LOEs and resources: what's our level of effort, how much time and effort is it going to take
to stand this up and then maintain it in the long term.

So the first thing we did was discuss and set up a framework with legal on how to process GitHub contribution requests. This will most likely be the long pole in the tent of the whole process of getting this set up. But once you get legal on board and they've developed their own internal process for handling these kinds of requests, everything should fall into place much quicker.

Here is a workflow diagram that describes the overall process. Employees submit a ticket for legal, business, and marketing review. Upon review and approval, the employee is given guidance on project scope, repo names, committer IDs, and open source licenses and notices. The employee is directed to review guidelines for project structure and governance. We have an internal repository that has best practices, guidelines, and helper tooling to help them get started. Once all the review is complete, the ticket is handed off to the GitHub ops team, and then we can go ahead and create the repo and team, add users, all that.

In an internal GitHub repo, we documented all the operational tasks and requirements to be met when actioning requests. We set up a dedicated ticketing queue to track requests and communication with engineers. When requests come in, we vet them. The requests are linked to a legal open source request, so we can go check it out and make sure that it's approved and within the guidelines. We also do some checks, like ensuring the users have GitHub accounts and that they have the correct committer IDs per the guidelines. It's a fairly manual process for us right now, but I think that's something we want to further automate, with a tighter integration between the two or three different systems.

Like I mentioned, we have an internal GitHub repo with resources to aid engineers in preparing their project for open source. The repo describes the overall
process for open-sourcing repos on GitHub, but also provides requirements and guidance on how to prepare the project. It lists best practices, for example files to include in the repo, and we also have commit guidelines. If an internal project is being released, you have to take some care to make sure that the Git history is also compliant. We provide some scripts and information on squash-merging the history, or potentially rewriting some of the history, to make sure the right committer IDs are in there, and that sort of thing.

Here's a sample of some of the requirements that we have for ensuring best practices. A license file is required, with a well-formed, OSI-compliant license. A README file is required that references this license file and also has instructions on how to build and test the software, or at least a link to a contributing file, which explains how to contribute and includes the DCO sign-off, which we'll go over in an upcoming slide. A code of conduct file is required as well. We also require copyright and license headers to be present in every source file, including SPDX short identifiers. We recommend not having any executable binary artifacts, and also having a test suite and some sort of continuous integration using GitHub Actions or equivalent.

So I mentioned the DCO, which stands for Developer Certificate of Origin. If you're not familiar, the DCO is a lightweight alternative to the contributor license agreement, or CLA. It's just a way that contributors can certify that they wrote the code, or that they otherwise have the right to submit the code that they're contributing. It requires all commit messages to be signed off. If you look on the right in the screenshot, you can see an example commit message. There's just a Signed-off-by line at the bottom of your commit that has your name and your email address, and you can add this by simply
passing the -s flag to your git commit command.

To enforce this, we use the DCO GitHub App, which is installed globally at the org level, so it is applied to all repos in the org. It requires commit messages to contain the Signed-off-by line and ensures that the email address there matches the commit author. This has worked fairly well for us, though we've run into a couple of issues. One is that squash merges swallow up the Signed-off-by line, so we've disabled squash merging in our repos. Furthermore, the DCO is not a first-class citizen in GitHub: when merging or committing from the UI, you can't specify the DCO, so you could introduce a commit that's missing it. That feedback has been provided back to GitHub.

Another tool that we use is called Repolinter. It's a project hosted by the TODO Group from the Linux Foundation, and it lints open source repos for common issues. As you can see in the example, it checks for things like license files, README files, and contributing files. It's fully customizable: you can provide your own rule set and have it check for your own types of patterns. For example, we have it check all the source files for a specific copyright pattern and SPDX license identifier. And we have a GitHub Action that runs on PRs. New Relic created the GitHub Action for Repolinter, and it was recently forked into the TODO Group's org, which is now the official version. So thanks to New Relic for getting that started; it works really well.

What we did is add an org-wide GitHub workflow template. You can create these templates in a special repo called .github, which you place in your org. When you go to a repo's Actions tab, as you can see in the screenshot, you get a suggestion from GitHub like, hey, do you want to install this action?
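To make the DCO check described earlier a bit more concrete, here is a minimal Python sketch of the rule as stated in the talk: the commit message must contain a Signed-off-by line whose email matches the commit author. This is only an illustration, not the DCO GitHub App's actual implementation; the function name has_valid_signoff and the case-insensitive email comparison are assumptions made for the example.

```python
import re

# Matches a trailer line such as:
#   Signed-off-by: Jane Doe <jane@example.com>
SIGNOFF_RE = re.compile(
    r"^Signed-off-by:[ \t]*.+?[ \t]*<(?P<email>[^<>\s]+)>[ \t]*$",
    re.MULTILINE,
)


def has_valid_signoff(message: str, author_email: str) -> bool:
    """Return True if some Signed-off-by line carries the author's email.

    Hypothetical helper: the comparison ignores case and surrounding
    whitespace, which is an assumption of this sketch.
    """
    wanted = author_email.strip().lower()
    return any(
        match.group("email").lower() == wanted
        for match in SIGNOFF_RE.finditer(message)
    )


if __name__ == "__main__":
    msg = "Fix typo in README\n\nSigned-off-by: Jane Doe <jane@example.com>\n"
    print(has_valid_signoff(msg, "jane@example.com"))
    print(has_valid_signoff(msg, "john@example.com"))
```

A check like this could run over every commit in a pull request; a commit created without the -s flag would simply have no matching line and fail.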
The workflow template allows for an easy one-click install from the Actions tab. We only added a little bit on top: essentially, we have an action that wraps the New Relic action (the one that's in the TODO Group's org now), but it checks for a local repolint.json rule set, and if one is not present it uses our default. We want to allow for some customization: depending on the programming language you're using or the nature of the project, we may need to make some exceptions or tweak the rule sets. So we check for a local rule set first, and if there isn't one we use the global default that we provide.

So, managing GitHub. We had some goals initially. We obviously wanted it to be efficient. We didn't want to have to do everything by hand; we wanted to avoid having to click around a web UI to create repos, add teams, and that sort of thing. If you want to apply a label to all repos, or, for example, disable squash merging, which we had to do recently across all orgs, you don't want to have to do that for every single repo.
It's a lot of clicks. We also wanted a workflow that allows us to review proposed changes beforehand. So if we're creating a new repo and granting, say, five contributors access, we wanted to have a review process there so we have some level of accountability and verification. We wanted to be able to track these changes so we can easily revert them or, if we needed to, audit them. We want to leave a paper trail so we can investigate problems and that sort of thing. We also wanted a way to view all the repos, memberships, and employees easily: one place to go look at all of it, without having to click around into each individual repo. And if possible, we wanted to do this with tooling that was easy to use.

So our buddy Kevin at Bloomberg introduced us to a tool called Terraform, an open source tool by HashiCorp for managing infrastructure as code using config files. It's generally used to manage infrastructure for cloud operations. However, there is a GitHub provider as well, which we've been leveraging, and it's pretty awesome. You configure everything in files using the Terraform language, which is similar to JSON and YAML.
It's pretty straightforward. We configure and manage this in a GitHub repo, so we get a history of changes for free, and we also get a nice workflow with pull requests. Terraform has a CLI that allows you to preview and apply proposed changes, but there's also a GitHub Action, which means you don't even have to use the CLI at all.

Here we have an example from our Terraform configuration. On the left, you can see a repo that we've defined, with attributes such as topics, description, the repo name, that sort of thing. On the right side, you can see a team that we've added, a team-repository relationship, and then a membership. We organize files by resource type, but you could organize by project or repo. We found that organizing by resource type works a little bit better for us: when you're introducing a new contributor to several teams, for example, we only have to update one file. It's also common to have the same team apply to multiple repos, so this kind of structure worked better for us.

We try to use resources by reference to keep the config files DRY and minimize duplication, which makes them easier to maintain. When we have to update something, like if a repo name changes or somebody's GitHub username changes, we only have to update one file, which is where it's defined. You can see here that we try to use IDs, usernames, and things like that by reference.

The Terraform command-line interface has a couple of useful commands, mainly plan and apply. You use terraform plan to preview the changes that your new config would introduce. You can see in this example that it's going to create a new membership, here at the bottom, and at the top, on the left, it's going to update a repo in place.
It's going to update a couple of attributes and the description. Once you've reviewed the proposed changes and they meet your expectations (it's not destroying anything or recreating something from scratch, for example), you can go ahead and run terraform apply, which will then apply your changes using Terraform, which leverages the GitHub API.

Terraform allows for values to be declared in variable files that you can then use by reference. That's how we've been managing users: we map internal user information to their external GitHub information in a dictionary. This has been useful for a couple of reasons. One is that GitHub usernames can be changed by the user, or they may decide to create a new account and use that for work. Having this central file means that we just have to change their username in one place and not in every membership that they may be referenced in.

Another benefit of this file is that it allows us to run scripts and checks on it. For example, when an employee leaves the company, we probably want to remove their write access, though maybe not in all cases. So we can have a periodic task that runs nightly, pulls the repo down, parses this file, and looks up every employee in another system like LDAP, and if they're no longer employed, we can then open a ticket for ops to follow up on.

There is a Python parser for the Terraform language; I think other languages are supported as well. This means you could load your configuration into a dictionary in memory and then read from it, write to it, that sort of thing. That's something we haven't really explored, but we want to do more of in the future.

I mentioned Terraform has a GitHub Action.
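Going back for a moment to the nightly offboarding check on the user mapping: the core of it could look something like the Python sketch below. It assumes the mapping has already been loaded into a plain dictionary of internal ID to GitHub username, and it takes the directory lookup as a callable, since the real LDAP query and ticketing call are specific to each company. The names find_departed_users and is_employed are hypothetical, made up for this illustration.

```python
from typing import Callable, Dict, List, Tuple


def find_departed_users(
    user_map: Dict[str, str],
    is_employed: Callable[[str], bool],
) -> List[Tuple[str, str]]:
    """Return (internal_id, github_login) pairs that need follow-up.

    user_map:    internal user ID -> GitHub username (assumed already
                 parsed from the variable file).
    is_employed: stand-in for a directory lookup such as LDAP.
    """
    departed = []
    # Sort so the report is stable from one nightly run to the next.
    for internal_id, github_login in sorted(user_map.items()):
        if not is_employed(internal_id):
            departed.append((internal_id, github_login))
    return departed


if __name__ == "__main__":
    # Made-up data for illustration only.
    users = {"alice": "alice-gh", "bob": "bob-codes"}
    still_employed = {"alice"}
    for internal_id, login in find_departed_users(
        users, lambda uid: uid in still_employed
    ):
        # A real job would open a ticket for ops here instead of printing.
        print(f"follow up: {internal_id} ({login}) not found in directory")
```

Keeping the lookup as a parameter means the same function can be exercised with a fake directory in tests and with the real one in the nightly job.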
The Terraform GitHub Action has worked really well for us. It helped automate and streamline our workflow. Once we get a request to, for example, create a new repo, we go ahead and update the config file, add the new repo with whatever attributes, description, and topics the user provided, and then submit a pull request with those changes. The GitHub Action will run terraform plan for us and add the output of that into a comment on the PR. You can see in the top screenshot there's a little "Show Plan" section that you can expand, and it has the output of that. Once the reviewer approves, they can merge it, and then terraform apply is run by the action, GitHub is updated, and the local Terraform state is also updated. So that has worked very well for us.

Teams on github.com/quic want to make artifacts such as Docker images and code packages available to the open source community. These things help a lot of developers get started quicker, but they can also be used to speed up downstream automation. If future PR checks depend on some build that can take a long time, you may want to have a Docker image to speed that up. So we leverage Docker actions to provide some of these workflows, and you can run them on merge, release, and several other events.

Here's a diagram to illustrate an example. A QuIC engineer publishes a new release on a QuIC repo, and that triggers a build. There's a script or Dockerfile there that will pull in third-party dependencies and build a new image, a tarball, or a package. Once that's built, it'll publish it to GitHub Packages, or as a GitHub release asset itself, or to some other artifact server like JFrog's Artifactory, whatever it may be. And then the artifacts can be consumed by developers and any downstream automation that you may have.
In terms of branding, you can't really do much on the GitHub org page. We added a logo and the URL, but the engineering and marketing teams really wanted a landing page where they can highlight relevant recent projects and provide more information, resources, and other links. Having to support this, I wanted something easy and lightweight. Having used Jekyll in the past, using GitHub Pages in combination with Jekyll came to mind. Jekyll is a Ruby-based static site generator. You can have layouts, and you can have your content in Markdown files, so there's some formatting there, and once you run it, it generates a complete static website for you. Using that with GitHub Pages, which is free for public repos, works really well, as long as you don't need to change the default domain. We just use the default domain, so we have quic.github.io. However, you can have a custom domain and have it work with GitHub Pages. All this lives in a GitHub repo, so again we can leverage the PR workflow, which allows for easy updating and review of changes. Here are a couple of screenshots of our quic.github.io landing page, where we can showcase our branding, some descriptions and information about Qualcomm Innovation Center, and also highlight some new projects.

Healthy open source projects are important to keep the community engaged. We wanted to avoid having stale projects and to ensure responsiveness to issues and pull requests that come in. But how do you measure this? How do you act on it? That's where we turned to CHAOSS, which is the Community Health Analytics Open Source Software project, a Linux Foundation project. What they do is establish standard metrics for measuring community health.
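As one small illustration of what a health metric of this kind can look like, the Python sketch below computes the median time from opening to closing across a set of issues or pull requests. It is not the formal CHAOSS metric definition or how any particular tool calculates it; the function name median_hours_to_close and the choice of the median over closed items only are assumptions for the example.

```python
from datetime import datetime
from statistics import median
from typing import Iterable, Optional, Tuple

# Each item is (opened_at, closed_at); closed_at is None while still open.
Item = Tuple[datetime, Optional[datetime]]


def median_hours_to_close(items: Iterable[Item]) -> Optional[float]:
    """Median hours between opening and closing, ignoring open items.

    Returns None when nothing has been closed yet.
    """
    durations = [
        (closed - opened).total_seconds() / 3600.0
        for opened, closed in items
        if closed is not None
    ]
    return median(durations) if durations else None


if __name__ == "__main__":
    # Made-up timestamps for illustration only.
    sample = [
        (datetime(2021, 1, 1, 9, 0), datetime(2021, 1, 1, 11, 0)),
        (datetime(2021, 1, 2, 9, 0), datetime(2021, 1, 2, 13, 0)),
        (datetime(2021, 1, 3, 9, 0), None),
    ]
    print(median_hours_to_close(sample))
```

Tracking a number like this over time gives maintainers a simple signal for whether responsiveness is improving or slipping.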
CHAOSS also produces open source tooling for analyzing community development. One tool specifically that we've been leveraging is GrimoireLab, which is a tool set built on top of Kibana that allows us to look at software development analytics. It collects, aggregates, and visualizes data from open source projects within our org. Here are a couple of sample metrics that I like to look at. Activity: pull request and issue creation trends over time. Time to respond: how long does it take between the creation and closing of issues and pull requests? And growth and retention: are we attracting new contributors and retaining existing ones? There are several different metrics that you can configure and use, and we're still in the process of evaluating what's important for us, but I think these three are good ones to take a look at.

So what's next for QuIC in the future? We want to further simplify the process for our users, adding self-service options where we can, and eliminate some of those manual cross-checks that our ops team is currently performing. We want to allow new contributors to be added more easily, for example via a UI or something similar. We want to make project metrics more accessible to project maintainers and provide some actionable feedback there. And of course we want to onboard more projects. We have about 30 projects right now, and we keep getting requests every week for new ones.

Here are some resources and links. The TODO Group provides some awesome documentation, and so do HashiCorp, on Terraform, and the CHAOSS community.

So that's all I've got. Thanks for attending. Feel free to shoot me an email if you have any questions or comments, and I'll be around to answer questions and talk to people in a Slack group at a predetermined time after the talk. Thanks a lot. Bye.