But thank you all for showing up today at KubeCon. I'm quite tired, but I'm still here, still standing, and thank you all for showing up. I'm here to talk about burnout and metrics that could help to signal burnout. For those of you I haven't met, my name is Sophie Vargas. I am a program manager focusing on research and analytics within Google's open source programs office, and also an active member of the CHAOSS community, where we talk about and define metrics related to project health and sustainability. Now, as you can see, there are metrics in the title of this talk. We will be sharing metrics and research talking about burnout and experiences in open source. But this is also kind of a live testing environment for further research, in that I'm attempting to show proactive metrics about communities that might be represented today in this room. So the hope is that we can explore these ideas together and test whether or not these experiences that we assume by looking at the data are actually reflected in your own experience. So I'm really excited to have discussions with folks after the conversation to see: did these things resonate with you, or do you think we're missing something in our basic assumptions that makes it impossible to make the comparison with your own community or project? I'll be around at the conference for the next day or so, and you can find me on Twitter and on LinkedIn if you have any comments or feedback; all is welcome. To get started, I wanted to begin with the common definition of burnout. I pulled this from various medical journals, outlets, and overviews, and it focuses on this idea of mental, physical, and emotional exhaustion, often brought on by prolonged stress. I also found a number of other definitions related to burnout at work that were characterized by a lack of meaning in work, or decreased performance or productivity, and also by exhaustion.
I will be shamelessly using photos of my own cats in my own home to illustrate these points. This is my cat Moby, who sometimes is just overwhelmed by the world; he likes to retreat into what we call the cave as a way to reduce some of his sensory overload, and this helps him calm down. We see a number of studies that suggest that the rate of burnout is actually increasing. This was a particular study in 2021 from Haystack Analytics noting that over 80% of software developers that they surveyed felt burnout in their work, an increase since the onset of the pandemic. And so we have to imagine that this is a present and growing phenomenon, and the pandemic most certainly has had an impact on it in some way, shape, or form. There are also a number of very personal stories about open source burnout that I've been either reading about in public posts or listening to at other presentations. Those aren't my stories to tell, so I didn't want to put them on a slide, but I highly encourage folks to go out and read them and really learn about some of the experiences other folks are having in these spaces, because they are very real and they are very raw, and I think we can only really learn from each other's experiences, and that can help us bridge the gap in understanding each other. On the other hand, we saw reports during the pandemic, like the Octoverse report that GitHub publishes annually, say that they actually saw open source project activity spike on holidays and weekends, suggesting that, as a behavior distinct from work, people were turning to open source as a way to connect and engage and learn with people when they were isolated in their own homes.
So we have two kind of similar but conflicting stories, where the pandemic is certainly changing behavior: it could be causing burnout at work, but it could also be an outlet for people from work. So we're not really sure how to handle the impact of this event and how it's still impacting our lives, because clearly it changed something. To help provide another perspective, I was able to run a study last year focusing on open source software developers, contributors, and maintainers, including both people who were currently employed and some who were students, to get a sense of where this population identified, and a number of questions related to this topic. So first, for context, we were looking at folks that were contributing to open source software projects in both personal and professional time, and as you can see, over 50% of these respondents were doing so in both contexts. So for them, open source work didn't really split neatly between personal or professional time, and many in this room probably feel that in their work as well. I know I personally am able to work on open source as part of my job, but I will volunteer for events, for hallway tracks, for things that are outside of that description, in my own personal time, because I am personally interested in supporting those spaces. So I was curious to know: in this particular segment, could we distinguish burnout experienced at work versus the rate of burnout working on open source? So we asked a question, and here I have it cut by contributors versus maintainers, which were distinct populations in the study: 43% of contributors said they had felt burnout at work, and 21% said they had felt burnout working on open source, and the rates were slightly higher for maintainers.
So 43 versus 83; remember the slide from a few slides ago. This seems perhaps a little bit lower in this population, but this was also taken a year later, so maybe either sentiments have changed, or the population that we reached was distinct enough that we were looking at much lower rates of burnout. Now, hindsight is often the best sight, and later on I looked at this question and thought: wait, how are we defining work? We didn't explicitly define this, because I wanted to be more general; we had folks that were working at companies as well as contractors or people working for themselves, and so I realized that work wasn't a clean distinction here. And it gets even muddier when we again look at those folks that were working on open source in both personal and professional contexts. If open source is part of their work and they're experiencing burnout at work, then are they also experiencing burnout in their work in open source? I did like to throw this in at the end, because whenever you look at survey data, I generally like to include a few slides about the methodology and the sample itself. We're looking at an international population; I handpicked about seven different countries, and I wanted to gauge their relative experience in contributing to open source projects. Here you can see it split by years of experience, and note that over half of them were at three years or less in their tenure of contributing to open source projects, which for some might skew toward the newer, but we did have a percentage of the older, I shouldn't say older, the more experienced open source contributing population as well. I do think this is a slight skew toward less maturity, especially when we think about, from a few slides ago, the effect of the pandemic: one to three years ago is squarely in that bucket. How many people started working on open source as part of that, and how much has that colored their initial experience of it?
So always, always acknowledge your sample bias any time you look at survey data. I'm going to reference the survey a few more times, but now you have a better sense of who we actually sampled in it. Now let's get to a population a little closer to home. I was thinking about how work on CNCF and CNCF-related projects might skew toward either paid, employed work versus volunteers, and I personally know folks that sit across that entire spectrum: they are primarily volunteers, or they're primarily paid employees, or they are somewhere in the middle, like myself, doing some in both contexts. So I tried to estimate this by looking at the DevStats page for all projects under the CNCF, and only saw 3% labeled as independent, unaffiliated individuals, and over 8,000 distinct companies and organizations listed as contributors to the CNCF projects. But I also know for a fact that some of those people that are affiliated with companies are not actually working on these projects on behalf of their companies, so we can't really estimate it with the current data set as is, but we do know that there's a mix of both paid contributors and volunteers working on all of these projects. So we're coming down to the hypothesis of the session: can we actually use metrics to proactively signal collective burnout? Notice that I'm using the term collective burnout versus individual burnout, because we're looking at larger populations versus the experience of any one individual. So we expect a range of experiences, but are there indicators that the community itself is starting to suffer, or starting to feel the impact of something like this? So I have an idea about four segments we could look at this through, or sorry, four focus areas. First, work distribution: who's doing how much. Then backlog and build-up: what things are slowing down, what things are piling up.
Next, and I'm thinking about Don's presentation this morning about how to nurture and grow new contributors in the community, because we have to think about that in relation to burnout: how easy is it to take a break, which is the other framing I had for this one, but that's not really something that's easily measurable, versus looking at the influx of new contributors and maintainers over time. And the last section is really how people feel about their open source work, because if we think about that definition again, emotional exhaustion, that's entirely related to how we feel, and how we feel doing this work. I'm about to show a lot of data, predominantly from GitHub logs, and to make the visuals less of an eyesore I started throwing in acronyms, and then realized that instead of putting a legend on every slide, I should just tell you what they all are up front and try to continually repeat them as we go, so you're not ambushed with a bunch of letters you don't understand. PR: pull request. OPR: opened pull requests. I also introduced a combined pull request review comment number: if you're familiar with GitHub event types, there are two separate event streams, coming from pull request review events and pull request review comment events; the first one was introduced in 2020, and the latter had been around as long as I've been looking at the project in question. So to make things simpler, I treated this as an or: either of those events contributed to this pull request review event stream, and a reviewer was someone who created one of those types of events.
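As a rough illustration of that combined stream, here is a minimal sketch. The event-type names follow GitHub's event taxonomy; the flat dict shape is a hypothetical simplification of the real payloads.

```python
# Sketch: merge the two review-related GitHub event types into one
# "PR review activity" stream, and treat anyone who created either
# event type as a reviewer. Event dicts here are simplified.

REVIEW_EVENT_TYPES = {"PullRequestReviewEvent", "PullRequestReviewCommentEvent"}

def review_events(events):
    """Keep any event that counts toward the combined review stream."""
    return [e for e in events if e["type"] in REVIEW_EVENT_TYPES]

def reviewers(events):
    """A reviewer is anyone who created at least one review-type event."""
    return {e["actor"] for e in review_events(events)}
```

In practice the two streams would be pulled from the raw event logs and unioned this way before any counting.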
I'm also going to be looking predominantly at people-based activities, in which case I have removed bot accounts; thank you to Lukasz and the DevStats folks for providing a list of bots that I could filter out, or filter in, depending on the conversation. And if I'm looking at any aggregate statistics, I've generally removed watch events, which is when someone goes in and stars a repository, as this doesn't really reflect engagement in the community, even though it is somewhat engagement with the project. As a base project I picked kubernetes/kubernetes, the repository, a subset of the broader Kubernetes project organization. I had mixed feelings about this up front, because I know something like maintainer burnout is more acutely felt in much smaller communities, but if you're like me and you've tried to create metrics and analyze very small populations, the data can sometimes be nonsensical. Especially if you have one contributor and, say, they didn't do anything last month, you don't have anything to measure, so it was much harder to test this case against smaller populations. Something like Kubernetes is large enough that we can even cut it up and start to look at sub-populations with enough statistics. The other side of this is that I'm hoping there are some of you in the room that have actually worked on this repository, so that can hopefully inform the data that you're looking at, because you have a personal experience to go along with the metrics that we see.
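That filtering step can be sketched roughly like this. The bot names below are only an illustrative subset; the real list came from the DevStats maintainers.

```python
# Sketch: keep only people-based activity by dropping known bot accounts
# and WatchEvents (stars), as described above. KNOWN_BOTS is illustrative;
# in practice the full list came from DevStats.

KNOWN_BOTS = {"k8s-ci-robot", "fejta-bot"}  # illustrative subset

def human_activity(events, bots=KNOWN_BOTS):
    return [
        e for e in events
        if e["actor"] not in bots                # drop listed bot accounts
        and not e["actor"].endswith("[bot]")     # drop GitHub-App-style bots
        and e["type"] != "WatchEvent"            # stars aren't engagement here
    ]
```

Inverting the first two conditions gives the "filter in" view used later when looking at bot behavior on its own.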
So let's first start with work distribution. To set the stage before I really get to the metric: this isn't quite the metric, but sometimes you have to look at more things to understand the numbers you're looking at, say the total population size that we considered for this type of analysis. As you can see, I'm looking at 2015 to 2022. I took a really broad scale here because, again, I wasn't really sure how to account for the pandemic in the middle of all of this, so I figured let's look at the entire baseline to see how things are growing, changing, shifting. And you can see natural growth, and a slight decline at the end, in terms of the total population of PR openers and PR reviewers, with the explicit numbers for 2022 listed on the left. The last bullet is about the individual approvers, which is by sub-directory in the project; there are actually maybe a hundred or so, but for the individual experience of submitting a pull request, it might have to go through two to ten people, depending on where in the project you're submitting. In terms of overall pull requests opened over this time scale, you can see the total number in blue, and I started looking at the top 50 individuals by pull requests year over year; as you can see, they represent a fairly large chunk of work. I'll have the specific numbers later; I mostly just wanted to show you what the totals were for both opened pull requests and pull request review comments, again that double event type, which you can see spikes after 2020 because now there are two events being counted, not one. But in terms of the individuals, say the top 35, that ratio is holding somewhat consistently even with the added event type. Why did I pick 35? Because it's the same ratio relative to the total population of each of these different event types.
So this is the actual metric I wanted to get to: how much work is done by the top X percent in this particular repository? For opened pull requests, the top 50 were submitting 40 to 50 percent of all opened pull requests in any given year, and when we look at just the reviews of pull requests, that gets even larger, to between 53 and 59 percent. You'll see 2015 was a bit of an anomaly, because it was the first year in the CNCF, so I'm not counting that one as much, but we can see it reaching stabilization between 53 and 59 percent. Now, what's potentially even more telling is the overlap between these populations. We've gone down from 900-something people opening pull requests, and over 700 people reviewing them, but between the top 35 and top 50, there are 21 individuals that were in the top 35 reviewers that were also in the top 50 pull request openers. So we've narrowed down this population to a core set of 20 to 40 people, depending on the type of work that we're looking at. The other question that I had, again because of the nature of the Kubernetes community and the incredible growth that it's had over time, and I was looking at the slides this morning in the keynote that demonstrate the immense size and growth trajectory of this community, was how things were scaling in terms of the number of people that were opening pull requests versus the people that were actually taking them on and reviewing them. We can see that it's kind of stable, if anything going down a little bit over time. So to me this was fairly promising: we actually were scaling proportionately for these types of responsibilities in the project. So, next category: what's slowing down, what's building up. One of the most popular metrics to look at here is how much backlog we have in something like issues.
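Before digging into backlog: the work-share and overlap numbers just described can be sketched like this. The per-author counts here are made up for illustration; in practice they would come from the filtered event stream.

```python
# Sketch: share of total activity done by the top n authors, plus the
# overlap between top PR openers and top PR reviewers. Counts are
# illustrative, not real Kubernetes numbers.
from collections import Counter

def top_share(counts, n):
    """Fraction of total activity done by the n most active authors,
    and the set of those authors. `counts` maps author -> count."""
    total = sum(counts.values())
    top = Counter(counts).most_common(n)
    return sum(c for _, c in top) / total, {a for a, _ in top}

opened_by = {"alice": 120, "bob": 80, "carol": 10, "dan": 5}
reviews_by = {"alice": 300, "erin": 150, "bob": 20}

share_open, top_openers = top_share(opened_by, 2)
share_rev, top_reviewers = top_share(reviews_by, 2)
core = top_openers & top_reviewers  # people in both top groups
```

The `core` intersection is the analogue of the 21 individuals who sat in both the top 50 openers and the top 35 reviewers.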
How many are being closed over time, or how many open issues are you carrying month over month that are just getting stagnant and unresolved? Looking at this over time, you can see a lovely outage right in the middle there, and quite a few others actually; this is somewhat of a spotty pipeline, but that's a conversation for another day. In 2022, I saw 64 percent of issues opened in that year also get closed. So that's a pretty good number, and looking at the ratio of opened to closed over time, it seems to be holding consistently, and as the volume has gone down, that ratio is staying about the same. But if you're familiar with this particular repository, or how the broader Kubernetes organization likes to run these things, you might be aware of some of the bots that are acting in this community that are impacting these metrics directly. Here we're just looking at the bots. In fact, when I first built the prior slide looking only at humans, I was seeing only 30-something percent of issues being closed, because I wasn't counting all of the bots that were closing things. So you can see how bots are actively skewing this metric, and maybe it's not the best indicator for this particular project. In fact, just looking at the Kubernetes CI robot, it closed 78 percent of issues opened in that year, and it also closes pull requests. So we didn't look at closed pull requests or merges, but this is also going to impact those values. Now, in further conversations with folks that work in this repository, I was able to clarify that this isn't all robots: if you submit /close, the bot will see that and go in and close it, and it will also label things as stale over time.
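A sketch of that close-rate calculation, with the option to ignore closes performed by bots. The field names are hypothetical; real data would carry richer issue metadata.

```python
# Sketch: share of issues opened in a year that were also closed,
# optionally not counting closes performed by bot accounts.
# Issue dicts are a hypothetical simplification.

def close_rate(issues, year, bots=frozenset()):
    opened = [i for i in issues if i["opened"] == year]
    closed = [i for i in opened
              if i.get("closed_by") and i["closed_by"] not in bots]
    return len(closed) / len(opened) if opened else 0.0
```

Comparing the result with and without the bot set is exactly the comparison that exposed how strongly bots skew this metric on kubernetes/kubernetes.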
So because of that, potentially a better marker is something like issue velocity, which is already tracked in the DevStats dashboard. It takes those labels and says how many issues are labeled as inactive; that's the block of carryover or backlog that we're looking at, because we can't rely on close rate when the bot will eventually just flag an issue to be closed because it's too stale. So close rate is not a good marker here, but this might be a better one, given the process and nature of how this community treats issues. The other metric I wanted to look at was daily reviewers. Now, I had a lot of conversations with myself about whether or not I wanted to include PR velocity here. I ended up not including it, and I have some thoughts on that which I'll get to later. It was a metric that I considered, but if we're focused on burnout, a metric that you track against, one that encourages faster response times, might be a good indicator for contributor experience, but maybe not for people that are being overworked and expected to be always on, if you think about what metrics are actually encouraging in terms of specific types of behavior. So looking at daily reviewers was a better measure of whether people are simply showing up on a day-to-day basis to help move the project forward. And the reason this metric was flagged by one of my colleagues was because, if you look at 2019 in blue and 2020 in red, we can see that there was a drop at the beginning of the pandemic, and that was felt acutely by some of the maintainers of this project when we were starting to feel the strain of the pandemic and changing our behavior in this space. So this was flagged and discussed, and as you can see, as the year went on, we got a little bit better at working in this style and adjusting to the pandemic. Now, this is a little bit of a messy chart, and you can see that by the end of 2022 we're actually at a lower rate again; it kind of adjusted down. Also note the other outage, in October of 2021.
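The daily-reviewer count itself is simple bookkeeping; here is a minimal sketch, with dates as ISO strings and events as hypothetical dicts from the combined review stream.

```python
# Sketch: distinct reviewers per calendar day, from review-type events.
# Averaging these per month, year over year, gives the comparison
# described above (e.g. 2019 vs 2020).
from collections import defaultdict

def daily_reviewers(events):
    per_day = defaultdict(set)
    for e in events:
        per_day[e["date"]].add(e["actor"])
    return {day: len(actors) for day, actors in per_day.items()}
```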
I saw a 13% decline if we average this out month by month, year over year, but we also noted that there was a decline in opened pull requests. So to me this isn't actually a sign to be worried about, because we know the overall volume has gone down, so the number of daily reviewers is most likely going down because the work itself is going down. The next section is around contributor and maintainer growth, and being able to replace and onboard new maintainers. Some of the most basic things to look at are overall attrition and replacement, and in this case I was looking at reviewers, given that reviewing is often the responsibility of maintainers. If we had the ability to tag maintainers, then we could do this more explicitly, but in this case I'm again just looking at that reviewer bucket. As you can see, over time there's a bit of growth and then a little bit of decline; we saw that number earlier in the total population. And I'm looking at very basic metrics, like how many new people submitted reviews in a given year, and how many people stopped submitting reviews after that year, and taking the difference of those numbers as the replacement rate: are we bringing in as many people as we're losing? As you can see, not exactly; it's going down slowly over time. So maybe we're seeing, not to say attrition, because this isn't actually a measure of attrition, but a replacement rate that is declining a little bit in the project. So maybe a little bit concerning, but at the same time, we know the overall work might be sloping down, and maybe the project is achieving another level of sustained maintenance versus growth.
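A sketch of that replacement-rate bookkeeping, using per-year sets of reviewer logins. Note the caveat baked into the definition: "lost" means active this year but never seen again, which necessarily overstates losses in the final year of data.

```python
# Sketch: per-year new vs. lost reviewers, given sets of reviewer logins
# active in each year. "New" were never active before; "lost" never
# appear again (so the last year overstates losses).

def replacement_rates(by_year):
    years = sorted(by_year)
    seen_before, out = set(), {}
    for i, y in enumerate(years):
        later = (set().union(*[by_year[z] for z in years[i + 1:]])
                 if i + 1 < len(years) else set())
        new = by_year[y] - seen_before
        lost = by_year[y] - later
        out[y] = (len(new), len(lost), len(new) - len(lost))
        seen_before |= by_year[y]
    return out
```

A positive net value means the project is bringing in more reviewers than it is losing; a slowly sinking net is the pattern described above.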
However, when I looked at that top 35 again, now kind of flipping the chart around, what we're looking at is the 122 individuals that, during this period of eight years, at some point ranked in the top 35 by pull request review comments and reviews. As you can see, these folks really skew toward the longer tenured; in fact, 22 individuals have been providing reviews for pull requests in this repository for all eight years. And so, to me, what actually ends up being more concerning than the replacement rate on its own: when we think about the expertise and institutional knowledge that these folks have from being around for so long, that's much harder to replace. We can onboard and teach folks to be productive contributors and grow them into leadership roles, but what are we losing when those long-tenured folks start dropping out? So the hope would be that this sort of distribution carries over time, but only time will tell if we're able to keep bringing people into this much larger role of responsibility and experience. The last section is around how people are feeling, and feelings are really hard to measure, so one of the simplest ways to do it is with a survey. I know there is a user survey and a contributor survey that go around, but in this case I tested the metric put forth by the CHAOSS community to help understand burnout. As you can see, we're asking, on a kind of Likert scale, how regularly people feel a certain way or never feel that way, with questions geared to understand: is your work on open source generally giving you energy, is it exciting, is it keeping you curious and engaged, or is it depleting you, are you feeling drained, do you feel like you need to take a break? This is a way to help indicate that maybe you don't feel burnt out yet, but you could be heading that way, if you're generally starting to feel like your work is draining you versus it being something that you're excited to do and excited to take on. In general, in this population in our survey, we saw fairly positive associations with open source work, and part of me thinks about the distribution of experience in open source: these are relatively newer folks to open source, so how much is that impacting these values, because they're still approaching this with rose-colored glasses? Another thing that I wanted to call out here is a little bit more anecdotal. This was based on a conversation with individuals in a workshop hosted by the CHAOSS community, focused on burnout, at GitHub Maintainer Month last year. We had a number of amazing conversations and stories shared by these individuals about how they experienced burnout, and how they were able to detect it in themselves and others, and I wanted to share that because I thought it was more applicable to this discussion than any generalized discussion of burnout, because it's coming from people that are in the community, that understand what it means to maintain an open source project. The top thing that came out in this discussion was losing your patience and being generally disagreeable, maybe noticing that your emotions are taking over. The example here, from my cat Moby again: if we don't feed him on time and we leave the toilet paper unattended, this is a potential consequence of that. And so, in order to help him manage his aggression, we either have to feed him on time or protect the things that might be shredded. But in other cases we might not have as much physical evidence that our emotions are taking over, and so we need to be able to listen to each other and notice those subtleties in each other. Next, being always on, or too available. This kind of comes back to that pull request velocity metric, which is a really popular one for operational productivity; I have seen a number of projects use it as a way to ask, are we being productive, are we responding to people in a timely fashion? And I think it's generally a good metric, especially for things like the external contributor experience: how quickly are we responding to people in these spaces? Because if we take forever to get back to them, that's not a good experience in the community. But by putting a goal against it, you could be encouraging this always-on behavior, and that's why I'm a little hesitant to suggest it in this focus around burnout. And then, it's always possible that you're just not interested anymore, and maybe this is a sign that you need to take a step back, but maybe it's also a sign that you need to move on, because this is no longer interesting to you, or no longer striking your curiosity. So what are the ways that we can help to reduce burnout? I tested a number of options in our survey, again split by contributors and maintainers, and as you can see there was a split in schools of thought here: some folks thought one thing and maintainers thought another, and again, they have different experiences. But the top things that came out to me were: increasing variety and changing it up, so people aren't feeling stuck in the same things and things stay interesting; task delegation, being able to hand things off to people so you can take a break; and increasing the pipeline of new contributors or maintainers. I really liked Don's point this morning about providing real milestones to help define what it means to mature into these roles, what's expected of you, because not only does that allow more people to achieve it, but it might give them a goal, an incentive to work toward something, versus just flailing around hoping someone will notice. And from the perspective of leadership, they just want to make sure they don't get stuck. A lot of the personal stories I heard were from people that felt like they couldn't take a break; they were just always accountable for those thousands of
users downloading their package, and they're the only one maintaining it, and they felt like they couldn't take a break from it. So, closing out with our recommendations and thoughts here. Boundaries are vital, because at the end of the day, if you are not healthy, then that will show up in your work and in your performance, so please make space for yourself. I really like how some projects take breaks or expect slowdowns; here we're just looking at those daily reviewers over time again, and it's a bit of a spiky chart, but you can see natural dips around the holidays that signal people are actually stepping back and taking personal time. I always like to close with some caveats, because we did focus a lot on things we can measure, but there are certainly things that are not captured in what we looked at. So what are we missing here? First, how do we know what we should even be counting? Again, I thought about this while I was looking at the entire Kubernetes population and all these various repositories under four-plus different organizations on GitHub. kubernetes/kubernetes was the largest repository, but again, it doesn't reflect all experiences in the community. In general, I think this project might be too large to look at as an aggregate population, because there are so many sub-experiences that looking at the entire population will probably mask them. So I like how DevStats has started looking at SIGs, for example, or sub-directories, as a way to see what's happening in these small pockets of the project, because that's actually going to be closer to individuals' experience in that place, and it can also signal when there are potentially waning populations in those sub-directories. Next, choosing what time frame to look at. We looked at a really big time frame, because I was struggling with how to handle this pandemic effect on baselines, so I wanted to start from a high level, from a macro level, to understand what our baselines are and how those are changing over time, so that when we take a look at the current state, we have a better sense of how that informs what's actually happening in a given moment. But most likely you're going to want to look at current state with a reference to what those baselines are at any point in time. Another glaring problem with my analysis is that a better measure over time is really release over release, not month over month, because that's actually aligned to the project's experience and cadence. Full disclosure: this particular repository has had over 40 releases, and annotating the data to that was just more than I wanted to take on for the moment, so I didn't, and it was simpler to have these broader views of the project. But if you were going to do this in earnest, I would suggest looking at it release over release, because that will give you a better sense of the natural breaks in those cadences. There's also the work that extends beyond code. There is an incredible amount of work that's often not captured anywhere; when we think about the time that you're spending in the project and we're only looking at logs on GitHub, we're missing a lot of stuff. So the general suggestion here is, if you can, keep a record of these things as well. If the same person is planning your event or meetup year over year, then everything is dependent on that one person; if they take a step back, or they want to hand it off to someone else and just vacate, then someone else has to recreate all the work that they've done. So part of this is understanding all the other places that you're contributing, and understanding how work is being distributed in those areas as well. The other side of this slide comes back to that work versus personal context: what do we choose to do versus what are we tasked with? Because if you think about our motivation and our potential experience of burnout, we might be more excited or energized by work we want to do versus work that we have to do. And this is kind of the problem I have with all of this analysis: I'm unable to draw the boundary between why people are doing what, and when. The GitHub report attempted to do it by looking at hours worked on weekends or holidays, but that's maybe not the best indicator, as some people work on the weekends too. So this is an ongoing problem that I would love to hear thoughts on, if you have an idea of how to separate or understand it, because when we look at motivation and participation, we have to take that into account. And there's always the question of how many people are working alone. We looked at a massive project, but we know there are a lot of single-maintainer repositories out there. This is not a great measure for a number of reasons, as it doesn't actually equal the number of open source projects on GitHub, but there are a lot of things out there that only have a handful of people or fewer working on them, acknowledging that there is more data than GitHub, and we actually have data on that now thanks to some researchers at UVM. So I wanted to close with recommendations focused on the community aspect. I was thinking about writing these in two ways, one from the perspective of the community and one from the perspective of your company, because again, I see those as two different motivating factors, but to make this more inclusive, I focused on the community aspects. Schedule and take breaks; don't be on call on the holiday like I was during Log4j, that was not a great experience. Document your work so you can hand it off; document your milestones so you can encourage people to work toward them. Possibly automate tasks that are highly mundane and repetitive, and always adjust the metrics to account for that. And at the end of the day, the most important thing is communication. This comes on two levels: being able to listen to people, and developing relationships with people, because when you can build trust and respect in your community, people will be more likely to share their experience with you, and you'll be more likely to listen to them and want to help them. We won't make it without communication, and we even have data to back that up: a couple of years ago, some researchers at UC Davis looked at the success rate of Apache Incubator projects in graduating that program, and they looked at a whole slew of metrics, and what they found was that communication was a larger determinant of success than any coding-related activity. And I want to close with a final reminder from another piece of research looking at why people work on open source: people are predominantly here to learn and to have fun, and if it's no longer fun, they're not going to stick around. So I just don't want to break it, for lack of a better description, and I want us to remember that this is why we're all here: because we care, because we're interested, and because we're here to have fun with each other. I don't know if I have time for questions given that I started really late, but I want to thank you all for coming, and for sticking through all of the technical difficulties, and please come find me if you have any thoughts about this afterward.