Welcome everyone to our panel discussion on eliminating common barriers to contribution for more inclusive communities. My name is Georg Link. I'm the director of sales at Bitergia. My interest in barriers to open source communities started with my first engagement back in high school, when I joined my first open source community. Today we have the CHAOSS project, where I'm one of the co-founders, and we're always thinking about how we can make it more welcoming and more inclusive and bring in new people. So that is my background and how I'm approaching this topic. We have three experts on this panel today, Anita, Mariam, and Daniel, so why don't we go in that order.
Thank you, Georg. Hi, I'm Anita Sarma. I'm a professor at Oregon State University. I got into open source research because one of my passions in research is understanding coordination: how teams work together and the collaboration involved. And open source is this mega, mega large collaboration across, you know, completely geographically distributed people, so how does it even work? The more I started looking at it, the more I realized how awesomely complicated the software people are making is, and that this coordination is happening through all these online channels. But I also realized that there are so many barriers and challenges that newcomers face in trying to contribute. And as we have progressed through the last 10, 20 years, 10 years of my research and almost 20 years of open source getting more into the mainstream, more and more people are looking to open source for skill development and career progression. One of the things we noticed was that there is a diversity gap, especially if you look at one diversity dimension, gender. Women are poorly represented in open source projects in general. And this became a cause of concern.
One, because, as I said before, open source can help with skill development and career progression, and we are leaving behind the women who are not participating here. The second problem is that research has shown that diversity in thought, which comes from diversity in teams, leads to more innovation and more productivity. And if we have this diversity gap in open source, it means the products would be even more innovative, with more innovative teams, if we had diversity. So that's what got me interested. I have been looking at the onboarding challenges that newcomers face, as well as the problems with DEI, diversity, equity, and inclusion, in open source and what we can do to improve, and that's what got me to this project.
Hey, hi everyone. My name is Mariam Guizani. I'm a PhD student at Oregon State University. Anita here is my advisor. I recently completed my internship at Microsoft Research this summer. My research topic is improving diversity and inclusion in open source. In general, I'm very interested in diversity and inclusion as a field, but more specifically in open source, where research has found that there's lower gender diversity and that there are some barriers and challenges that could be improved. So in my research, I try both to understand the state of diversity and inclusion and to come up with interventions to make things better. I've had the pleasure of working with this amazing team for the past year and a half. We've been working with the Apache Software Foundation to understand their state of diversity and inclusion. And I'm very glad to be here sharing our results, which have also been accepted as a paper at the upcoming CSCW. So, yeah.
And finally, this is Daniel Izquierdo. How are you doing? Unfortunately, I cannot be there, but Mariam and Georg will be around. So my suggestion is that you ask them everything about this research to make the most of the trip.
I started in open source back when I did my PhD, right, so my background is computer science, and my PhD was related to free software engineering and empirical software engineering. And then, well, I'm one of the founders of Bitergia, currently holding the position of CEO. In this journey with Bitergia, I was also part of the first steps of the CHAOSS community. And in the CHAOSS community, back in 2017, 2018, that was the time when we had the opportunity to start digging into this topic, the diversity and inclusion topic. Specifically, at that point in time we had a project together with the OpenStack Foundation, sponsored or co-sponsored by Intel, called the OpenStack Gender Diversity Report, in case you are interested. At that point we were trying to bring numbers to illustrate the situation, whether we were discussing code-related contributions, non-code-related contributions, leadership, and so on. And I can say that this was kind of the seed for the CHAOSS diversity and inclusion working group, which is doing great work nowadays. That was my first time learning about this whole topic, and I'm still learning, so I'm a learner here, I'm merely a listener. This time it is with the ASF project. As you said, we've been working together for a year and a half. More or less a couple of years ago we had the pleasure of meeting Griselda Cuevas, who at that point in time was creating, together with other ASF members, the diversity and inclusion committee. This then moved into the VP of diversity and inclusion, and in this case Gris was the one holding the position, and this is where all of this research started.
Today we want to talk about barriers to contribution and what we can do about them.
But I think it's critical that we take a step back and think about, okay, how did we learn what those barriers are, and how did we arrive at those recommendations? So maybe we can take a few minutes to talk about the process that this research project went through and, to make it more interesting, talk about the challenges that we faced and how we addressed them.
The process was in different phases, really. We can see here that we had three different phases in our method, and the idea was to have a rigorous process that also involved the stakeholders. We wanted to have that feedback loop as much as possible. So, for example, phase one started with this large survey. There was the phase of designing the survey, deciding what questions we were going to ask, and getting feedback from the community to make sure that we were covering all the questions we wanted answers to.
I think I will interject and say that even before the community feedback, one of the things that Daniel had mentioned already was that Griselda, or Gris, was really interested in working with us. I actually met Gris at OSS Summit in 2019. I had done past work on D&I in open source, and Mariam and I were together, so we talked to her, and she was really interested in getting a scientific understanding of the state of D&I in the ASF, and then a scientific understanding of the challenges. As a first step towards that scientific inquiry, we wanted to create survey questions that were rooted in academic science as well as in best practices from past research. So I went through examples of survey questions on this topic that had been presented at academic conferences, and we pulled out the survey questions that were relevant. And then we also looked at past survey questions from a survey that the ASF itself had done in 2016.
We looked at the Stack Overflow survey and the GitHub survey, as well as the best practices that CHAOSS had recommended about D&I questions that should be asked. We took all those questions and compiled a huge list. Then Daniel, Mariam, and I went back and forth on which questions would make sense. We had a short list, and then we presented these questions to the DEI community. And I remember there was a lot of useful feedback and engagement, even at the wording level, right? Something that was really interesting was how international the community is. One of the questions we had was about education background. And, you know, I had put college and undergraduate, using the American parlance, since I'm from the USA, Oregon State. And then lots of people brought up that in other parts of the world what you mean by college or higher education might be different. There was a lot of interesting perspective in taking a step back and thinking that, yes, as I said, this is a worldwide collaboration. In other words, the vocabulary we use, the technical jargon, or even, you know, this is not technical per se, but the educational jargon: how would it translate to other parts of the world? So that was very interesting, and I valued having that kind of discussion. If anyone is interested in the survey, those questions are all there, and you can replicate the research.
Now, after spending so much effort on arriving at a really high-quality survey, how did you go about collecting the data and engaging the community to get the feedback that you were seeking?
So just to add really quickly: this is research on open source, so we were very careful to also use an open source tool for the survey. We used LimeSurvey, in case anyone is curious.
After all this work on designing it, we obviously wanted to publish the survey but also communicate that the survey was there, and that was through the mailing list. Anita, maybe you can interject here about the other means by which we encouraged people to participate.
So we did two things. There was the mailing list, and since this was ASF sponsored, we actually got the email addresses of ASF contributors and we emailed them. We emailed 7000-odd contributors, but we also broadcast it on Twitter, on our accounts as well as the business account, to get it as much publicity as possible. One of the problems with empirical research, that is, data-related research and surveys, is getting the response rate right. If you have a very small response rate, 1% or 2%, it means only a few people in the community have answered, so the data you're getting might not be reflective or representative of the broader population. We really wanted to see how much breadth we could attain in getting the responses. We got 8.5%. It looks low, but actually, across all our past research, software engineering surveys of different kinds get response rates from 7% to about 15%. So 8.5% is pretty decent by those standards.
I know that you spent a lot of effort on analyzing this, but I want to move on to the second part, the interview phase. Can we talk about why the second phase was so important in generating the barriers and the recommendations that we are talking about today?
So in the first phase, the survey, we got 624 participants responding, and we took that data and analyzed it, but that gave us a broad view of what the challenges are.
So just to go back really quickly, there was a lot of qualitative coding there, like tagging their responses. But once we got that data we wanted to dig deeper, so we wanted to have more of a one-on-one interaction with these contributors, follow up with them, and really ask them more pertinent questions about these challenges. That's where phase two comes in, and it's the interviews. The most important part was that there were a lot of meetings involved where we really went through this data again and again. So it is a long process, but we were also trying to arrive at this final conceptual model that we could later refer to. So this was the phase of the interviews.
One thing I want to add to that is, again, this is a scientific method that we wanted to follow, and we did not want to come into the report or the analysis with preconceived ideas. So we let the data speak; this was grounded in the data. That's why there was a lot of looking at the data, the interview transcripts and the open-ended text responses, seeing what the community had responded, and bringing it up. The data speaks here.
Indeed. While you were discussing this, I was checking the report, and I remember we had some big numbers here: more than half of the respondents that we had in the survey faced challenges. Exactly 52% of the respondents to the survey have faced challenges, and this is a really big number. This is more than half of the population we had answering the survey, which is something to think about.
And one tiny thing: there has been at least academic research, and I have heard talks at industry conferences too, saying that it's really hard to make it in, right, to become a contributor.
But what we found was that this holds even for existing contributors. This 54% includes even those people who have three to five years of experience, right, even people who are contributors and are making changes. They are still facing ongoing challenges, not just entry barriers. There are enough problems even once you enter.
Exactly, and just to add to that: when you progress as a contributor and you get more responsibility, there are additional challenges that you start facing that you might not have faced before. So this was really interesting in terms of insight into challenges that come up at the beginning, at the more medium experience level, and even at the very high level of being a member at Apache.
Yeah. I think we have established now that what we are presenting today is based on really good quality research with a big data set of just contributors from the Apache Software Foundation, and that barriers are really affecting everyone, regardless of how experienced you are or where you are in your journey in open source. So let's start talking about the actual barriers and the findings.
So in this model that we built, we now have this framework for categorizing these challenges. These are higher-level categories that came up from the data, where we take a challenge and we ask, well, what type of challenge is it? Is it related to the process of contributing, to the whole process of being in an open source project? Is it something that's technical, maybe knowing a certain language that the project is using, or knowing how to make a pull request, or just setting up your environment? Is it something that is related to social interactions and communication? You can categorize them by type, what type of challenge it is, but also by the level at which they occur, because some challenges are specific to the individual.
Like if I'm not familiar with Java and I'm trying to contribute to a project that is using Java, that is a challenge for me. That is an individual challenge, where I'm not familiar enough with the language. Then there are challenges that are relative to the project, maybe the way the project is set up, whether it has a CONTRIBUTING.md, the way the README is set up.
Or even the way the project communication is being done, right, which tools they're using. So it is basically the tools, technology, and processes that the project has set up for its contributors where the problems arose.
Exactly. And the third one would be the foundation level, which covers challenges that occur more at the Apache foundation level. And here you might ask how we would know, especially for the levels. Some participants specifically said what level it was at: they would say "in my project", name the project, and say what the problem was, or they would mention that it's a problem at the foundation level. If they did not, what we did was think about who could fix this, who has the agency to fix this. And that's how we determined what level it's at. So this is kind of the distribution across the categories and levels, and we ended up with 88 challenges, a conceptual model of 88 challenges.
Yeah. So one quick note about this. It was about one third each, all right, this is pretty well distributed. But you can see that at the foundation level, the process challenges were the ones most people faced, right? How does each project interact with the foundation, the Apache Way, just understanding how everything works together at the foundation level, the kind of guidance and documentation that exists. And if you look at the social part, that's where a lot of communication, coordination, and collaboration happens. The majority of the social interaction problems are arising at the project level.
So that's again where the project can do something to fix some of the problems that we will be raising here.
I wanted to highlight here that we have all of these types of challenges, and we've characterized all of the barriers into these different levels and so on. But at the same time, as part of the interviews, the interviewees were providing specific mitigation strategies that they thought might be useful to share with the community, and those are part of the report as well. I wanted to mention specifically, as we can see here in the picture, something that was really surprising to me: the Apache Way. People were facing issues with the Apache Way while contributing. With the Apache Software Foundation, my specific experience, and this is totally me talking about my own experience, is that it's clear how they work. But it seems that people are still facing challenges understanding the Apache Way. Maybe this is because they are used to working in the open, but the Apache Software Foundation has its specific ways of working, that Apache Way. I was reading across the mitigation strategies, and I would say that most of the ones related to process can be somehow mitigated. The strategies that were provided were mostly related to extra documentation, having a modern way of introducing the foundation to newcomers, whether they are volunteers or coming from a corporation, specific training, or maybe even providing clear guidance on the governance process. We characterized this as part of the foundation level. But if we think a bit more about this, I would say that this is an issue that should be addressed by the organizations as well, the ones that are willing to invest or contribute or participate in ASF projects.
Probably one of the main things that we should all do as a company or any other entity is to help our own developers understand how these open source foundations work. And it doesn't matter if this is the ASF, which is the research we are talking about today; there are other foundations that have a different way of working, a different idiosyncrasy, right? So that's my point: this is not only a set of problems that we want to highlight here today. There are a bunch of mitigation strategies with which we can all help to advance towards, you know, lowering the barriers to being part of open source communities.
So, let me add one thing, since you mentioned the Apache Way, right. What was interesting, and this is where, you know, whether projects exist by themselves or within a foundation, is what the governing principle is. The Apache Way as a governing principle is very nice, very principled and lofty, saying that there is no one way, right? The Apache Way is all about democracy: every way is fine as long as it's open. But the ASF is seeing more and more commercial companies, this is a hybrid contribution model, and commercial companies are used to working a certain way. Understanding what the philosophy of the Apache Way is, figuring out what it is you need to do if you're a company trying to get into it, is hard, because every which way is right. So one of the comments was like: we looked at it, we interpreted it, we did something, but then an ASF member came and said, no, this is not how it is done. They were right, but the company felt they were right too, because they had interpreted it. So while the ASF way is nice in that it is open and democratic, it is causing problems by its very openness and the fuzziness of what is expected. I think this is something we will see more and more as we have more hybrid models coming in, with more commercial companies trying to contribute to open source.
Yeah, and just to add to that, it's a tricky balance, because you want to keep that flexibility but you also want to give guidance to people who need guidance. From either our interviews or our surveys, I remember some recommendations along the lines of: yes, but how about mentioning some specific projects that were successful, so that we can follow in their footsteps if we need more explicit guidance, while keeping the overall governance flexible.
Examples. I always love templates, right? If you give me a template, at least I know what a successful template would look like. And that's something that could be done to fix this; it is not a very big challenge to get some successful projects to showcase what they did.
So I know we have identified more than 88 barriers or challenges and possible mitigation strategies from this research. Maybe we can at least provide some examples beyond what you've already mentioned.
So these are some examples. We saw more of a tree diagram earlier, and you can go through the research paper, where we have the more complete, bigger conceptual model. But as an example of the recommendations we found, first off, for the process category we just saw: providing ongoing trainings and best practices for reviewing. So just making sure that you're providing these regular trainings for people who need them or want to freshen up, and that you have these updated trainings available. What people also mentioned is the process of reviewing: really being able to explain how you are supposed to review these pull requests and what the best ways are to review them, in a more systematic way, or having some sort of guidance available. So that was something about the process. Then, for example, there are the technical challenges people faced.
A lot of people suggested using automated tools whenever possible, so you want to have that part taken care of. Again, open source takes a lot of time, and a lot of people are volunteering. So we want to really make sure we're aware of that by providing automation whenever possible, but also by incorporating those new technologies that just make life easier.
So before we go into the D&I, the social parts, right, these two are process and technical examples. It's kind of interesting. I think one of the interviewees mentioned: oh, the project I'm contributing to is still using SVN, why can't we just move to Git, right? Because some of the projects are so mature and have been in production for so long, they are still using tools, JIRA for example, that were new and the latest when the project started but have kind of become dated. And this is a problem. I think this person was also a commercial contributor, that is, an employee of a company who is contributing to open source as part of that employment. So he's a paid developer. And the problem arising is that if his regular company and other projects are using different technology and tools, another barrier for this particular contributor is to also learn, keep up with, and figure out all the nuances and idiosyncrasies of different tools. So that just adds more overhead for the contributor, but it can also lead to more problems, like technical hurdles or mistakes made, for the project itself. And maintainers are so busy that it is so difficult for them to find the time to approve this. And there's a trade-off, right? Sometimes it's just easier for me as a contributor to sit through it and just do it manually versus spending the time and the investment to update to a new technology that maybe the project doesn't know and the project has to approve. So there's this trade-off in seeing how and where the time investment would pay off later in the future.
So that's a conversation even my small team has about the newest and latest technology that keeps popping up nowadays.
There was some discussion as well by the interviewees related to the time to get a response, to get a reply. How to get, let's say, some acknowledgement that the reviewer or the committer received the PR, or that there is someone on the other side of the communication channel who is going to do something with the work I've done in the last hours, right? There were some barriers in that specific area about, you know, not receiving a proper answer, or, even without the word proper, a delayed answer, or maybe not having actionable and specific steps after you've reviewed something or you've been reviewed by others. So there are some reception issues around this that are worth mentioning here, all related to what you mentioned. And this delay is usually a way of saying that maybe we have, you know, too much work and we don't manage to answer everyone. Open source communities each have their own pace of development, and this is definitely tacit knowledge that you learn by simply staying in the communities, but by at least.
But actually, Daniel, there is past academic research on newcomer onboarding that has shown that the biggest challenge newcomers faced, or the biggest reason people stopped contributing or could not make it, was basically no response. The maintainers are busy, but what if someone has put in a request and they don't get any response, you know?
That means the community or the project doesn't care about that particular person's contribution, or worse, someone else might actually make that fix or make the contribution, and this person's contribution becomes obsolete, right? So this is a challenge. At the same time, the maintainers have to respond to incoming pull requests, but non-responses mean we are losing people who could make a contribution.
Yeah, I agree with you, but there might be this discussion about showing others how the community works, and sometimes the point is: this takes time, please don't get upset. So that's something that may happen if you are part of a community. You don't get a response, and that's okay under certain circumstances.
But Daniel, instead of not having any response, maybe a right way is, you know how with customer service, when you call, they say it might take 20 minutes to get a response? Maybe there should be something that says how long it might take to respond, because otherwise it's very disheartening for newcomers.
Yeah, it's all about being open and having that open communication, and just acknowledging, which could take a few minutes, where you're saying: we've received your pull request and we'll get back to you in two weeks, or we're in the process of reviewing it. Just having that feedback, especially for someone who's new, might encourage them to, you know, not drop out of the project completely.
So as the facilitator, I'm going to encourage us to move the conversation forward. I think this was really great, and, you know, on a panel that's the kind of conversation that we like to get. Maybe we can move on to the next item here.
Yeah. So to loop back to communication and open communication: this is important for open source, but it is also important for industry. It's just important in general to include minority groups in the D&I discussion.
And this was emphasized a lot, because if you're discussing diversity and inclusion, you want to include these members. You want that rich diversity of thought, like you say, coming in, so you really need those voices to make sure you're going in the right direction. You can't discuss something without having those experts, the people who are closest to the issue, be there to give their voice and their opinions.
And one quick thing I want to add is that we did get some responses saying that diversity is important and everyone seems to be talking about gender diversity, but what about English fluency, or people who are in China? I mean, how are they able to communicate and be part of it? So one of the things we have to consider is that diversity is not just one thing; you need to mention the different dimensions of diversity and how to include these minority groups in these kinds of discussions.
Exactly. And these are insights you get when you do include these people. You can get insights about differences in cultures, about differences in languages, et cetera. Another way is to promote minority-focused online meetings. It's really empowering when you're able to have these connections and these online meetings. You can build friendships, you can network, and you can also have someone to look up to who is maybe similar to you, so that they can give you advice and you can discuss things with them. So these could be really important in open source communities, but also just in general: having online meetings, maybe having people start their own online meetings, and having that community come together.
Yeah. I really appreciate everything that you have shared with us today on the panel. If anyone is interested in reading the research, please look up the research that was published. If you're looking at the slide deck, we also have a summary slide, so feel free to download the slides for yourself.
Now we're going to do one last round of summarizing our personal takeaways from this panel conversation.
So, we know diversity is a challenge. We know there are barriers to contributing, right, and they are at different levels. So there are a lot of problems, but there are also solutions, right, because the community was so responsive. The people who we, you know, interviewed were very nice. They had ideas of how to change things, or how they individually are making changes. So there are a lot of possible solutions out there that we should bring up. I think one of the things we need to do is figure out how to get some best practices on the ground. The code of conduct is a fantastic first start, but how can we have such, you know, best practices on how to do good reviewing, right, the kind of practices and resources that are common across different projects and foundations, so that we can all work together to remove these barriers, you know, one barrier at a time.
I would add that my takeaway on this is that open source is a really, really great way for individuals to just, you know, it's out there. It's a way to learn; it's very much an informal type of learning. It's a way to build a community. It's a way to network. It has so much to offer. And I feel like improving D&I in open source could have this gigantic ripple effect on so many different levels. Economically, it could help people get jobs and help people get skills. It could improve the project, because there are more people with different thoughts and different backgrounds bringing their own perspective. So it's really impacting so much, and sometimes it's just small changes. And one more thing I want to add is to thank all the contributors who answered the survey and all the people we interviewed for how responsive and excited they were about this.
And I really think that moving on we will be able to implement some really interesting interventions to just move things a little bit further and start that ripple effect.
Yeah, that's indeed a very good point. So I'd like to say thank you to all of the people involved in the research and all of the community that provided feedback, at the beginning for the survey, during the interviews, and during the whole process, in the readouts as well. This has been really great community work and a really great experience. And something that I'd like to highlight from today's conversation is that open source is all about community; it is all about the people. If you as an organization, company, or individual are willing to be part of a community, to let others, you know, be aware of the kind of things you are doing, to be transparent, to play by these kinds of rules, I would suggest that you have a look at all of this research we've done with the goal of lowering the barriers to contribution, because you will accelerate the process of adopting the technology you are open sourcing, or maybe improve or speed up the development process of other projects that matter to you for any reason, whether because they are part of your tech stack, because you are running a hospital, or because you are adopting the technology internally in your development. So it's all about people. So take care of them.
So thank you so much for joining us on the panel today. Thank you to Georg for moderating and keeping us on time and on track. And again, thank you to this community. It was a pleasure working with Bitergia.