Hi everyone, thank you for attending this talk. Today we will be presenting a talk on growing communities and eliminating barriers to contribution. I have Daniel here with me. So Daniel, if you want to give an introduction.
Sure. Thank you, everyone, for joining us today for this talk. My name is Daniel Izquierdo. I am one of the founders of Bitergia, currently holding the position of CEO, and an active member of the CHAOSS community and the InnerSource Commons.
Perfect. I'll go ahead with a quick introduction. My name is Mariam Guizani. I'm a PhD student at Oregon State University, and I just recently finished my internship at Microsoft Research. I've been working with Daniel for the past year and a half on this research, and I'm very happy to be presenting our results here.
So why are we here? As a reminder, we are at the OSPO track, the open source program office track. So the question we had on the table when preparing this was: OK, we've done our research, we have certain numbers, so now what? How can this be used by others in the open source community? What we discovered in the research was that there are barriers to contribution, and also a set of mitigation strategies provided by the community. And at the very end, what we have is a framework that we can all reuse, a model that we can apply in each of our projects. So from this OSPO perspective, if we are part of an organization that is willing either to invest in open source technology or to increase the number of contributors to its open source projects, then inclusion, being welcoming, is one of the key things to have in mind. As part of the TODO Group, you are probably aware of this: there are specific guidelines and chapters discussing diversity and inclusion. What we bring here as something new for all of us is this framework.
There are a couple of ways of thinking that we can consider. One of them is the outdoors perspective, from the company towards the rest of the open source ecosystem. Here the questions are: how can I lower the barriers to contribution to my projects? Are there barriers? Are we aware of the existing ones? Initially, what we are interested in is raising awareness within the organization that this is happening, that there are barriers. In this case, the analysis and the results we'll present are the application of this framework to the ASF community. But the point is that this can easily be extended to any other open source foundation, project, or organization ecosystem that you are part of. The goal would also be to find certain key indicators that we can track over time; we are giving some of them for the case of the ASF. And from the indoors point of view, there are other things that we can discuss.
So specifically in the case of the outdoors, or the public relations of our corporation, probably one of the first questions is: are there barriers, and what are the usual ones? Because once we understand this, we can think of specific policies to put in place. In the case of the ASF, we discovered 88 different barriers, which the research framework splits into three main categories. We'll see more about this later. The other point I mentioned is that we need to raise awareness that there are different challenges, and we want to characterize them. What the framework also provides, as part of the results of the ASF research, is the agent that can facilitate the mitigation. Specifically in the case of the ASF, the framework we produced splits challenges into process, technical, and social as main categories, but others can be added later. And finally, the key indicators.
One of the goals of the research in this case was to look for the top barriers to contribution. But given that every open source project and every foundation is different, the barriers that we found at the ASF might differ from those elsewhere. So a good thing to have in mind is to collect the diversity numbers, keep track of them over time, and look for correlations. This is part of the usual research in the open source industry. There are certain barriers that we see once and again, such as language and gender, and education is another relevant factor that we have to keep in mind.
Then there is the indoor perspective: how can we think of these barriers and be part of the solution from inside the corporation? We can work on the outside, but we can also work inside with our developers, our management team, et cetera. For this, the main things to have in mind are participation guidelines. It's not only about understanding the technologies or the workflow. It's about actively training teams and looking at the contributor funnel: understanding how underrepresented communities are going through our contributor funnel, or what we sometimes call the onion analysis. Who are the core developers and the regular developers? How easy is it for people, and for underrepresented communities, to advance in the funnel, for a contributor to maybe become a core contributor at some point? And have a certain focus on volunteers. This is something that should be done at the corporation from the start when working in open source projects, whether you are consuming or producing them.
Then, address the challenges. In the previous slide, we discussed whether there were barriers, and we saw that at the ASF we are talking about 88 different ones. But then you need to address the challenges.
So what are the mitigation strategies that you will put in place? In the research that we are presenting today, we provide different mitigation strategies, all of them suggested by the ASF contributors themselves. And finally, probably one of the key aspects of the open source program office at any corporation is to become a trusted department, a trusted partner that others can turn to when dealing with open source. Diversity and inclusion is another important topic in that role: how to create assets and trainings that others can consume, look for good practices, et cetera. So this is part of the motivation for all of this. Summarizing, what we wanted to bring today is not only the results, but why this is another useful tool that you can apply at your OSPO.
For the framework, what we did was split these 88 barriers into three main categories, and these are at the same time split into different levels. So basically we have nine options, let's say, where you can position each of the barriers. This is useful to characterize the existing barriers, and then we can add new ones, move them from one place to another, and so on. But Mariam will provide more details on this.
Yeah. So like you said, Daniel, we split these into categories and levels. This really came ground up from the data, and we'll cover later how to replicate this. As we see here, this is the distribution of the challenges we found according to the ASF data. Process, technical, and social are the types of challenges; they answer the question of what the challenge is. Is it a challenge related to how contributors are contributing, the process of contributing? Is it a challenge related to the technical aspects, like maybe not being familiar with a certain language, or setting up the environment, et cetera?
Or is it a challenge related to social interaction and communication? From the levels perspective, it's all about agency: who can fix this? That is the question we ask when we're trying to categorize a challenge into a specific level. If the challenge is about the person, for example if I'm not familiar with Java and the project is in Java, then that's one of my individual challenges: to learn more about it and be able to contribute. A project challenge would be a challenge in maybe how the project is presenting its information or its architecture, and that's something the project could fix to make things better. For foundation-level challenges, in Apache for example, a lot of what came up was that some aspects of the Apache Way might be a little confusing for certain people, especially if they're new. So this would be something that the foundation could fix. So this level is really about agency.
Yeah, agency in fixing.
Yeah, and to wrap it up: the level is who has the agency to fix, and the types of challenge are process, technical, and social. This makes the framework a little more actionable in terms of making interventions or implementing solutions, because you know who can implement the solution and which type of solution you're trying to implement.
For our specific research, we looked at data from the Apache Software Foundation, from both a large survey and interviews. This involved a lot of iterative work on the data, which ended up resulting in 88 challenges and 48 strategies organized within this framework. And this is our team of amazing people who helped make this happen.
Looking back at the data from these surveys, we asked: who is our average ASF contributor? What's the average profile of contributors?
And this is where D&I comes up a little more. It's mainly a 40-year-old man who is confident in English, was born and lives in the US, has a bachelor's degree, is volunteering, so no compensation, has approximately one to two hours of volunteering, has been in the community for five years, did not have a mentor, and faced no challenges. And this really brings up the question of diversifying, of having a diverse community, and how important lowering the barriers is to make that happen.
So some key insights from our research, going back to those levels and types: because we looked at Apache, which is a big open source foundation, the majority of the challenges occurred at the foundation and project levels rather than the individual level, on which there is already a lot of prior work. The majority of challenges at the foundation level were related to process, followed by social interaction and then technical. This is also interesting because a lot of the challenges we found are ongoing challenges. People are experienced, but they still face challenges related to the process, the governance model, and the social interactions.
And this is a really interesting thing to consider because, depending on where we work, we tend to think that if someone is adopting the technology of this open source project that I'm donating to the community, the main issues will be at the technical level: how people will understand this new programming language, or the architecture, the documentation, and everything.
But if you think about the findings that we have at the ASF, the majority of those challenges take place at the foundation or project level. They are not so much related to a lack of expertise or skills at the individual level, but to the internal processes of how projects work or how the foundation works, which was really surprising for me.
Yeah, exactly. And it does bring up the question of how we can make those better and what interventions we can provide. Just one example here. This conceptual model, this framework, is long; you can use the link to look at the paper and some of the documents we have for a more complete picture. But this is an example of challenges related to process, so we took this snapshot of them. For the Apache Way, for example, one of our survey participants mentioned that "it's also not super clear how the idea of rough consensus works and how to proceed if rough consensus cannot be reached." So the voting process and how it works was mentioned as a challenge for people to navigate, and if they're new to Apache or even to the project, it could be even trickier. Another interesting quote, just as an example, is "switching from one-man coding to a community approach is sometimes hard." This is particularly relevant when projects in industry move to open source: there's a shift in paradigm. You go from working on your project on your own to a community where coding happens as a community, so that takes a bit of a shift in the way we do things.
And some strategies for the process challenges that we've noticed: a lot of it is about providing clear guidance on the governance process, like the voting process, and being able to provide guidelines and some explanations about it. Providing regular and updated training is also helpful; that way people who want to be trained in certain things can be, and if they're already familiar with it, then that's great. A more specific suggestion we got is to provide training and a best-practices tool for reviewers, to make the reviewing process a little more streamlined, a little more equitable, and easier to navigate.
As for the social challenges, one example is this quote, where a contributor says: "I was happy being a committer, but once I saw all the member discussion in the list I was overwhelmed and have never commented on the members list for this reason." This brings up a good point: reading through a mailing list could be overwhelming for someone, or the way the communication goes back and forth could become a challenge that keeps someone from participating. So it's really about making sure these communications are welcoming and inclusive, to help people get that extra courage to start, especially if they've just been added as a new committer or newly added to the list. Daniel, you mentioned language as a factor earlier, and another quote was: "it is still hard to understand phrases, slangs or irony from native speakers on operational lists." This could go unnoticed if these people are not integrated into the D&I discussion, because somebody who uses certain phrases or slang might not be aware that somebody else might not understand what they meant. So this is an interesting thing to keep an eye out for, just as an example.
And to link right to that last one, it's really about making sure to include minority groups in the D&I discussion. If you're discussing diversity and inclusion, try to include as many diverse committers and diverse people as you can, because you'll get their insights and learn about challenges that you might not be aware of. That's a very important step to keep moving this forward and to have their thoughts on the state of D&I. Another strategy is promoting minority-focused online groups: being open to creating a minority-focused group, which could give contributors someone to look up to who's similar to them, who can maybe help with advice on how to be successful, who to communicate with, or who to ask for help. So this could be really important. And another one would be to leverage both public and private channels and disclose their visibility: having those two options and being able to work with them, but also making sure the other party knows what visibility they're at.
Yeah, so this is a summary of the insights we got from running the ASF research, and the specific recommendations that came from analyzing the community and, together with the community, all of the mitigation strategies. These are specific to the ASF use case. In your use case they might be slightly different, so it's a matter of applying the framework. There are some slides on replication, on how you can run the surveys, the interviews, the interventions, and the quantitative approach, that we have left here in the slides so you can have a look at them. We preferred to focus on the results themselves, but just to be sure that you have the complete picture, we have left the phases of the survey design, the response rate, the number of people involved, the researchers, and the total time that the analysis took in the case of the survey.
And the same for the interview design: how long it took and the number of people involved. So you have an idea of the effort, the timeline of the analysis, and everything. I don't know, Mariam, if you'd like to highlight something specifically.
Yeah, no, this is perfect. Just to say that this data came ground up: it came from our interviews and surveys, from what the contributors were mentioning, and from building that into categories and challenges. All of this, including the survey questions and the data, can be found in the additional links.
Yeah, and with respect to the quantitative approach, what we did was analyze 12 of the most mentioned projects in the survey. For these we used GrimoireLab, which is a CHAOSS project, part of the Linux Foundation. Well, we specifically used Bitergia Analytics, which is the commercial service, let's say, of GrimoireLab. But this is the architecture: you gather all of the data from the repositories, whether Git repositories, Jira and GitHub tickets, GitLab, or any other kind of data source. And finally you produce certain numbers around gender. We know there are limitations to the study, but this was useful to illustrate the problem from the very beginning and say: this is happening. So we can see at least some numbers, even given the limitations that we are all aware of.
Extra ball here: part of the discussion is taking place in a panel that is again pre-recorded, but I suggest that you go to the talk or to the Bitergia booth, where you will find Mariam or Georg. Please ask them questions, really tough ones. You can ask questions right now. A bit about Bitergia: we provide OSPO services specifically, so come by the booth if you are interested in those. Just to say thank you all for your time, and thank you to all of the participants here.
Yeah, thank you. Thank you all. Thank you, Daniel.
Yeah.
Yeah, it was a pleasure, Mariam. Thank you.
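Editorial sketch: the framework described in the talk places each challenge in one of nine cells, crossing three challenge types (process, technical, social) with three levels that say who has the agency to fix it (individual, project, foundation). A minimal way to represent that in Python might look like the following. The class and function names are hypothetical and are not taken from the paper or from any tool mentioned in the talk. The three sample challenges paraphrase examples from the talk, and the level assigned to the mailing-list example is an assumption.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class ChallengeType(Enum):
    """What kind of challenge it is."""
    PROCESS = "process"
    TECHNICAL = "technical"
    SOCIAL = "social"


class Level(Enum):
    """Who has the agency to fix the challenge."""
    INDIVIDUAL = "individual"
    PROJECT = "project"
    FOUNDATION = "foundation"


@dataclass(frozen=True)
class Challenge:
    description: str
    kind: ChallengeType
    level: Level


def count_by_cell(challenges):
    """Count challenges in each (type, level) cell of the 3 x 3 grid."""
    counts = Counter((c.kind, c.level) for c in challenges)
    # Report empty cells too, so all nine positions are always present.
    return {(k, lv): counts.get((k, lv), 0) for k in ChallengeType for lv in Level}


# Illustrative entries only; not the 88 challenges from the study.
EXAMPLES = [
    Challenge("Unclear how rough consensus works",
              ChallengeType.PROCESS, Level.FOUNDATION),
    Challenge("Not familiar with the project's programming language",
              ChallengeType.TECHNICAL, Level.INDIVIDUAL),
    # Level is an assumption: the talk gives the type (social) but not the level.
    Challenge("Members mailing list discussions feel overwhelming",
              ChallengeType.SOCIAL, Level.FOUNDATION),
]

if __name__ == "__main__":
    for (kind, level), n in count_by_cell(EXAMPLES).items():
        print(f"{kind.value:<10} {level.value:<11} {n}")
```

Keeping the type and the level as separate fields mirrors the point made in the talk: the level tells you who can implement a mitigation, and the type tells you what sort of mitigation to look for.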