All right, thank you all for coming, folks. Our next talk is from Jeremy Garcia. You're the head of open source at Datadog, yeah? I am. Excellent. He is the head of open source at Datadog, and he is talking to us about a new open source project that helps your project track metrics beyond just code. So thank you for being here, Jeremy.

Thank you. So, as you said, welcome to "Your open source community metrics should be tracking more than code." Just out of curiosity, how many people here are either maintainers of, or responsible for, an open source repository? OK, quite a few across the room. What I'm going to do is give a brief presentation and a quick demo, and I'll leave time for questions before and after the demo, but if you have a burning question during the talk, feel free to ask.

So who am I, and why should you even listen to me about community? I'm a community ambassador for opensource.com. I'm a co-presenter... louder? I'm a co-presenter on a podcast called Bad Voltage, for those familiar with that. I'm the founder of LinuxQuestions.org, which is the largest non-company Linux community on the web. And as was just said, I am the lead of the open source team at Datadog.

So, metrics. I talk with folks about metrics and communities in open source a lot. Realistically, probably more than is healthy, but it's something that I really, really enjoy. So if I put a statement like this up here, "Using metrics to make data-driven decisions improves your ability to build sustainable open source communities," I don't think it's a controversial statement. I think data-driven decisions just produce better results, and that's something most people would agree on. So if you have an open source community and you know what metrics are, you might just jump right in and say, "Well, what do I want to measure?" I think that's what most people do. I think it's the wrong approach.
What you really want to do is figure out what you want to accomplish, so you know what metrics you actually need. So what you should be asking yourself are things like: What outcomes am I looking for? What specific questions do I want these metrics to answer? And really, what are my end goals here? Then you can get actionable information that will help you answer those questions and actually make some changes, which in the end is what you really want. You don't want metrics for the sake of metrics. You want metrics to make things better.

So, as I said, as the open source lead at Datadog, I'm responsible for the repositories there. What kind of questions do I want to answer? I want to better understand how responsive we're being to our community. And that means I want to explicitly know, with data. I don't want gut feeling. I don't want anecdata. I don't want to be like, "Ah, it seems like we're doing fine." I want data. And I want to know not just how the repositories are doing in terms of how quickly things are being merged, but how quickly we are merging community contributions. If I have employees who are not able to get their code merged, that's a different problem with a different solution than if I have community members who can't get their code merged.

And I want to better quantify the value that our community provides. I want to be super clear here: it's not because I want to put a number on the community, and it's not stack-ranking community members. It's nothing of that nature. What I really want to do, though, is prove the value, so that I can do things like fly a good contributor out to a summit, or send a bunch of people shirts, or do things like that. Or if you have a company contributing a lot of code and you have a new feature coming out, maybe you want to have them be part of the discussion while you're building it, and give them private beta access, things like that.
And having real numbers to facilitate that kind of thing means you're able to argue up the chain that it's worthwhile.

I also want to understand where to focus. Raise your hand here if you work at a technology company and you have extra engineering resources. I should note here, Datadog is hiring; my team has a bunch of positions open, and across the org we have hundreds of positions open. If you raised your hand, I should be asking where you work and seeing whether I can come work for you, because no one has extra engineering resources. So with the limited engineering resources that I have on my team, I want to be able to know where I can focus that energy to get the most bang for the buck.

I want to be able to track interactions, because my memory is not very good. I want to be able to note, "I ran into so-and-so at an event, and this contributor is having a problem." That way you can build patterns and see, again, where you should focus that energy.

And I want to better understand the community at a personal level. A good friend of mine, Jono Bacon, is not here, but he once said, "Everything in life starts with people." I really believe that. I think we need to focus more on people, on relationships. Because in the end, code's great, but that code is created by people, and you should focus on that.

I want to tell a couple of stories that I think are illustrative of this. Let's say you have a long-time contributor, contributing at a very consistent rate over a pretty long period of time, and he suddenly disappears. I want a tool that lets me know that, so I can reach out. Maybe he just had a child and I should send him a Datadog onesie. Or maybe he's having a problem with the product, and I want to know that so I can help him. Or maybe he's having a personal issue, he's been a great contributor, and maybe I can help him with that. Or maybe you have a contributor who worked at Company X, which used your product. She moves on to Company Y.
Once again, they use your product; great contributor. And now you know she's moving on to Company Z. You want to reach out and say, "Is Company Z using our product?" If they are, how can I enable that contributor to succeed there? And if they're not, how can I enable her to maybe get that product into the organization? Or a third thing: unlike the first story, where it was one contributor who stopped contributing, let's say you have a large organization where dozens of people contribute very regularly to your product or project. If all of them stop contributing at the same time, you probably have a problem on your hands, and it's something you're going to want to look into before you lose that trust.

A couple of other things to keep in mind here. Understand that this is always going to change. As you get more data, you're going to want more data, things are going to evolve, and you need a tool that's flexible and will evolve with you. And it's going to be different by community. What's right for me is not going to be what's right for you. Every community is different, and embracing and understanding that is really, really important. You should also understand whether you're running a project or a product. They're two very different things, but I think flexible tooling should be able to address both. And it's also really important, and I think undervalued, to understand what you don't want to track. Be very careful what you incentivize, because the law of unintended consequences will get you.

So with those requirements in mind, I set out looking for some open source projects. I'm an open source person who never wants to build something new if I don't have to. There's a ton of great software out there. So the first thing I did was look at what existed, to see if I could build on top of it, or if something already did exactly what I wanted, which would have been great.
So here's a list of the current options in this space. For those of you not familiar with Grimoire from Bitergia: absolutely awesome product. Absolutely awesome project, rather. Awesome team, awesome company. For many of you, that's honestly going to be the right answer. It does pretty much everything. It pulls things in from CI/CD, from Git, from GitHub, from pretty much anything you're using, from bug trackers to forums to mailing lists. It's going to pull that data in, and it does really great analytics. It has a bit of a steep learning curve, and because of the products it's built on top of, it was difficult for me to extend it in the way that I wanted. So that's why we didn't use it as a base.

OSS dashboard from Amazon is actually really good, but because of the way it runs in phases, I didn't think it was appropriate to build on either. OSS tracker from Netflix was the most encouraging at the beginning, because I think philosophically it was the closest to what I wanted. Unfortunately, I couldn't even get it to run after quite a bit of trying, and it's written in Scala, which I don't know. So it didn't seem like a good base for those reasons. Gator from PayPal is also really awesome. We used to use it. It's languishing a little bit now and isn't as maintained as it used to be, so it didn't seem like a good base for that reason. And there are a bunch of others, but like I said, none of them are really extensible in the direction I wanted to go.

So, as I said, I talk about community analytics a lot. I was talking to this gentleman, for those of you who don't know him, Stuart Langridge. He was not able to be here today, but he is watching the live stream. So everyone say hi to Stuart. They said hi to you. We were going back and forth and the idea coalesced. And one day in the pub he said, "Haha, I can build this." And he started. So what we have is something called Measure. I don't love the name.
It turns out naming things really is hard. So if you have any better ideas, I'm certainly open to changing the name. But I'm happy to announce we tagged version 0.1 yesterday. What that really means is it's going to be pretty stable, reasonably easy to install, and mostly work as expected. It's definitely still beta-quality software. It does seem to work. It does need some work. The documentation especially needs some work, and the install needs to get a little bit better. Those are all things we're aiming at for version 0.2. And things are going to evolve, so I want feedback on the idea, the implementation, pretty much everything. I'd love to get your feedback on it.

At its core, for lack of a better term, and I don't love the term, it's a contributor relationship management system. It's metrics that focus not just on the code, although it does focus a lot on code, but also on the contributors. Basically, what I want to understand is how people as individuals, and people as organizations, are interacting with my projects.

I'd also like to take a brief second to thank the Linux Fund, who made 0.1 of this possible. And note, I abstained from voting on this one, but I am on the board of the Linux Fund. So if you have an open source project that you think needs funding, feel free to track me down and let's chat.

A few quick product philosophies. It should be simple. We're willing, to be frank, to trade some flexibility for simplicity. We want this to be really easy, with very little learning curve. We want you to be able to install it, jump in, and get right into it. It should look good. I think this is sometimes undervalued in the open source community, although recently that seems to be changing, which is really awesome. It should be pretty opinionated out of the box, but really extensible. Like I said, what's right for my community is not going to be what's right for your community.
So if you don't like the decisions I've made, you should be able to very easily change them. As I mentioned before, I think your employees not being able to get their code merged is a very different problem, with very different solutions, than your community not being able to get code merged. So it should very explicitly and easily be able to separate those out, but also aggregate them, obviously, because you want to know your overall health. And it should treat people as first-class citizens. The concept of a contributor should be baked into the product at a very fundamental level.

Just a few implementation details. Of course, it's 100% open source, under the Apache 2 license. It uses ghcrawler from Microsoft. We found a few bugs in there and submitted them upstream. Microsoft was surprisingly receptive and merged them quicker than I would have anticipated, to be honest. I thought that was interesting. We've also contributed to a couple of other upstream projects, Chart.js for example. It takes the concept of widgets and reports, and then it builds those into dashboards. It's written in Node, and there's the GitHub URL for those who want to check it out. Like I said, we tagged what we'll call the first pre-release yesterday.

So here is a quick screenshot. Like I said, I am going to do a demo. But it takes the Unix philosophy here: every widget does one thing, and one thing only, does it really well, and you can pipe them together to do pretty much anything you want. Like I said, we wanted it to look good, so I really want to thank Michael as well, a friend of Stuart's. He did the design work on this, and I think he did a really good job. I'll walk you through some of the widgets in the demo. Unless anyone has any questions first, I'll get into the demo. Questions? The resolution on this thing is not what I thought it would be, so this should be interesting.
I can't even see those right here. So here's the overview. This is an aggregated view of every repo that you have in the system. One of the widget types is just data, one is line graphs, one is stacked charts, things like that. So you get a very quick overview of how the code is progressing: things like open issues, new contributors, old contributors, contributors who are not in any organization. And then you can also drill down by repository and get that same information. This is just a fictitious setup for the demo. So if you go into Mongo, you can see where your open issues are, your open PRs, and, at the bottom here, which contributors are leaving and which contributors are arriving. So you can say, here's a list of new people contributing to the project, and here are people who are leaving. Yes?

The title says it should be tracking more than code, but these metrics all seem to be code related.

Yeah. So there's a bunch that are contributor related. There's an entire contributor panel, and there's an organizational panel as well.

What's the metric, the process of making code?

They do come from the process of making code, but there's the concept of notes, and of when this person last left, when this person last arrived, things like velocity at a contributor level and velocity changes. Things of that nature are where the focus on people comes in. Like I said, I want to be notified when people who are regular contributors leave, or when their velocity of contribution changes. So that's where I think you need to focus on the people. And that's something this tool enables you to do that I don't think the other tools do.
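As a rough illustration of the "regular contributor goes quiet" alert described above, here is a minimal sketch. It is not Measure's implementation: the input shape, the function name `lapsed_contributors`, and the thresholds `min_events` and `quiet_days` are all assumptions made up for this example.

```python
from datetime import date, timedelta


def lapsed_contributors(contributions, today, min_events=5, quiet_days=60):
    """Return logins of regular contributors with no recent activity.

    `contributions` maps a login to the list of dates on which that person
    contributed. This shape is hypothetical, not Measure's data model.
    """
    cutoff = today - timedelta(days=quiet_days)
    lapsed = []
    for login, dates in contributions.items():
        if len(dates) < min_events:
            continue  # too little history to count as a regular contributor
        if max(dates) < cutoff:
            lapsed.append(login)  # regular contributor, nothing since cutoff
    return sorted(lapsed)


# Made-up sample data.
history = {
    "alice": [date(2024, 1, d) for d in (2, 9, 16, 23, 30)],
    "bob": [date(2024, 5, 1), date(2024, 5, 20)],
    "carol": [date(2024, 4, d) for d in (1, 8, 15, 22, 29)] + [date(2024, 5, 28)],
}

print(lapsed_contributors(history, today=date(2024, 6, 1)))  # ['alice']
```

Here `alice` is flagged because she was regular and then stopped, `bob` is skipped because he has too little history, and `carol` is still active. A real tool would more likely compare contribution rates over time windows than use a fixed cutoff.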
In the other tools, the contributor is really just "they checked in this much code," whereas here, if you go into an organization, it lists the people who are in that organization, and you can click on them individually and see what they've done. Like I said, you can add notes: "Ran into them at FOSDEM, had a problem on this date." The notes right now aren't structured, because we're still building that out. But once they're structured, you'll be able to do alerting and things of that nature. Or you'll be able to put in a note for a future date saying, "This contributor will be at FOSDEM, remind me the day before." So that's where I think the focus on people is going to come in.

So here's the contributor dashboard. You can see how many open PRs they have, open issues, a bunch of data about them, the rate at which their PRs are being accepted or rejected, things like that.

And then there's the concept of teams. You're going to have a bunch of repos, obviously, but you might have one team responsible for six of them. A team just aggregates the data across those. These are arbitrary, user-defined groupings.

And then reports. You might have an engineering lead who doesn't want to go into this tool, get a bunch of numbers, and look at graphs. They want one very specific data point. Reports allow you to build that one thing, and you can have it emailed to them or just give them the URL, either way, but it's just that specific data. So if you want, say, response times by repository, it's just that data with nothing else there.

So that's the demo. I'm happy to take questions from there.

Sometimes you have, you know that, and sometimes there is a lot to show them.

So a repository can be part of multiple teams, and a person can be on multiple teams. And we're building out sub-teams as well, but that's not in 0.1.
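The team concept above (an arbitrary, user-defined group of repositories, where one repository may belong to several teams) can be sketched as a simple roll-up. The data shapes, field names, and the function `aggregate_by_team` are hypothetical and only illustrate the idea; they are not taken from Measure.

```python
def aggregate_by_team(teams, repo_stats):
    """Sum per-repository counters into per-team totals.

    `teams` maps a team name to the repositories it is responsible for.
    A repository may appear under several teams, in which case its numbers
    count toward each of them. Both shapes are assumptions for this sketch.
    """
    totals = {}
    for team, repos in teams.items():
        totals[team] = {
            "open_issues": sum(repo_stats[r]["open_issues"] for r in repos),
            "open_prs": sum(repo_stats[r]["open_prs"] for r in repos),
        }
    return totals


# Made-up sample data: "docs" is shared by both teams.
repo_stats = {
    "agent": {"open_issues": 12, "open_prs": 4},
    "docs": {"open_issues": 3, "open_prs": 1},
    "client": {"open_issues": 7, "open_prs": 2},
}
teams = {
    "core": ["agent", "docs"],
    "sdk": ["client", "docs"],
}

print(aggregate_by_team(teams, repo_stats))
```

Because `docs` is counted under both teams, team totals should not be added together to get an overall number; the overall view has to be computed from the repositories directly.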
So the data right now is all in ghcrawler, which is basically a MongoDB store of everything: it has one collection for issues, one collection for PRs, and so on. So you can also query against Mongo directly if you'd like, but the widgets query against Mongo now.

So no, we don't do that, but that is a good idea. Thank you.

So the inside/outside org distinction right now is not a GitHub definition; it's what you define it as. The one downside of the tool, I would say, and I don't know a way to fix this and I'm happy to hear answers, is that people use GitHub orgs in very different ways. So we didn't want to base it on GitHub orgs. Right now you actually place people in orgs manually. There's an API to do it programmatically. So you would say they're in the Microsoft org, and they can be in sub-orgs and multiple orgs, and they can come and go from orgs. Sometimes a contributor is going to become an employee. You want to track those two time frames separately, which the tool currently does, so you would put them in those different orgs. Right now you can only have one inside org, though.

We should be the biggest man. Okay.

One thing I should have shown: if you do go into a repository, right here is where it's split into inside and outside the org. So you can see everything is going to change based on that. Yeah?

Are you thinking in terms of relationships between contributors, like having office hours, and the participation of these people in office hours, and any other activities besides code contributions?

So the way the structured notes will be built out, something like office hours would be appropriate for a note. We eventually will track more than just code, certainly, things like mailing lists. To be frank, it will never track all the things there are to track, because I don't think it's ever going to be a comprehensive tool. But we realize that code is not the only way to contribute.
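To make the inside/outside split and the dated org membership concrete, here is a minimal sketch that classifies each merged pull request by whether its author was in the "inside" org on the day the PR was opened, then reports the median days to merge for each group. Everything here is an assumption for illustration: the membership tuples, the PR field names, and both function names are hypothetical, and none of it reflects Measure's or ghcrawler's actual schema.

```python
from datetime import date
from statistics import median


def in_org_on(memberships, login, day):
    """True if `login` has a membership period that covers `day`.

    `memberships` maps a login to (start, end) date pairs; an `end` of None
    means the person is still in the org.
    """
    return any(
        start <= day and (end is None or day <= end)
        for start, end in memberships.get(login, [])
    )


def median_days_to_merge(pull_requests, memberships):
    """Median days from open to merge, split into inside and outside the org."""
    buckets = {"inside": [], "outside": []}
    for pr in pull_requests:
        inside = in_org_on(memberships, pr["author"], pr["opened"])
        key = "inside" if inside else "outside"
        buckets[key].append((pr["merged"] - pr["opened"]).days)
    return {k: (median(v) if v else None) for k, v in buckets.items()}


# Made-up sample data: "dana" contributes from the community, then is hired.
memberships = {"dana": [(date(2024, 3, 1), None)]}
pull_requests = [
    {"author": "dana", "opened": date(2024, 1, 10), "merged": date(2024, 1, 20)},
    {"author": "dana", "opened": date(2024, 4, 1), "merged": date(2024, 4, 3)},
    {"author": "eli", "opened": date(2024, 2, 1), "merged": date(2024, 2, 15)},
]

print(median_days_to_merge(pull_requests, memberships))
```

Keeping membership as dated periods rather than a single flag is what lets the same person count as a community contributor for earlier pull requests and as an employee for later ones.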
So we will pull in other things, probably after version 1.0. We want to get this experience correct first and then build out other data points for you. Oh, sorry? Don't worry.

I'd like to ask about extending it to not only GitHub, but other sources like wikis. Of course, people in projects are also participating by writing documentation, writing FAQs, whatever. Or even writing policy, or whatever else you can think of that you need for running your project or your open source software development. Is there a plan to integrate other sources like this?

Yeah. So for version 0.1, we wanted just GitHub. Version 0.2 will probably add others like GitLab and maybe Bitbucket or something. But after version 1.0, we would definitely want to take in wikis, mailing lists, Slack, things like that. Any kind of participation point that we can get, we'd pull in. Absolutely, yeah.

Because to get the whole community in here, we need to expand it beyond just the coding platform.

Absolutely, right. And that's where it's really about people. If you have someone who's really awesome in Slack but doesn't contribute code, that's still an awesome contributor. We want to know about that.

So reminders are going to be in the next version. Yes, there's an API. Most of the things you can do in the UI, you can already do through the API, so you can write automated tooling from there. We haven't written any yet, though.

Gamification or anything?

Sorry, let me just repeat the question. He was asking if there'll be gamification built into the platform. Probably not, because I believe you have to be careful what you incentivize. I would be reticent to do that. If there's a way to do it that I'm happy with, I'd be open to it. But my gut says probably not.

Is this something just for the project maintainers to work with? Not the individual contributors? The contributors don't see this, so it's just for the owner of the project?
So I would imagine that some projects will absolutely make this data open to their community. I would imagine that some companies would not. Right now, the authentication model is one of three things. It can be completely unauthenticated, which means anyone can view, and anyone can add notes and change orgs. We have an authentication model where everyone can see, but you need to log in to edit. And then an authentication model where you have to log in to see anything. So we kept it pretty simple for now, but those are the three models.

Is this a project or a product?

Project, very much. Deliberately. So this was something that the company wanted to have for their repositories. So they don't sponsor the project in any way. They do allow, and as I said, we're hiring, they do allow our engineers to do cool things. I have to be over time, right? I don't want to...

No, you're not. Actually, you still have three minutes, and then we have a five-minute transition for community carryover, which will be me standing at the front of the room embarrassing myself until other people want you too.

Sorry, thanks. So, on the roadmap, he asked about AI impacting the tool. Things like sentiment analysis are definitely part of that plan. I think there are some very cool machine learning things that would enable us to make more accessible data, which is something we want. Definitely, yes. If you want to contribute that. Anything else?

Thank you, Jeremy. Oh. Wait, wait. Just under the wire. Buzzer beater.

So you're saying it doesn't do GitLab?

Right now it doesn't. We'll have to add that to ghcrawler upstream, but yeah, happy to do that. It should obviously support more than GitHub in the end, yes.

There's a talk this afternoon on analyzing developer networks in the community. Is it somehow related to your project?

No. Thanks. Yeah.