Thanks for coming, and welcome. This presentation has its origin at the Tokyo summit, where a colleague from Madrid at Bitergia and I were discussing with Nidia some ideas about how to analyze this kind of activity. The idea is that, at that point, the OpenStack people had announced that something like 11 or 12% of the people coming to the summit or to the Foundation were women. They then raised the question of how many of those people are actually contributing. So we started to think about all this, and then we said that maybe we could try to have a look at it. Well, we have some numbers, and that is one of the things we want to present to you.
This is more or less the outline for today: a bit more introduction about the background, and some context that we found around big companies, such as Google, about some of their internal gender analyses. Then some first steps, in the sense of the architecture of all this, just in case you want to understand how it works and maybe apply it to another project. Then some numbers, some metrics. If I have a good internet connection, I would like to play a bit with a dashboard that I have prepared, and, well, some conclusions.
About me: even though I am not involved in the Women of OpenStack group, this kind of analysis is one of my interests. I work at Bitergia, a startup that we started in July 2012, and basically we provide open source metrics: performance, communities, activity, those things. I am involved in the OpenStack activity part, so you can go to activity.openstack.org and see what we produce. I am also involved in the quarterly reports, which we started together with Stefano Maffulli when he was community manager at the Foundation.
Why this study? We will see more of the context in a moment. Basically, the question we have is how open source contributions by women have evolved over time in the OpenStack Foundation.
From my personal point of view, this is probably all about transparency. We are all working under the same umbrella. There are many organizations and companies with commercial interests, but we are all in the same boat and we have the same mission. So having this transparency with all this information, all this development, I would say is good for us and good for everybody.
So, what we have so far: some resources around OpenStack. There is a LinkedIn group, Women of OpenStack, which has around 600 members. Well, I applied, but I am not in yet, I don't know. Yes, maybe. There are about 140 discussions. There is a mailing list for Women of OpenStack with 400 emails and threads, and 100 participants. There are links of interest, with resources related to women. And there is something extra.
There is a survey that was done in 2013 by Libresoft: 11% of the people who answered were women. Those are some initial numbers. Then I looked at other sources, and in the Economist I saw a figure of 5% women across 80 companies. There is a figure of 21% women in one case and 32% in another. Those are some companies. The analysis that has specific groups within the workforce shows a 90%. There is a 20%. In the case of Google, there is an 80%. Well, quite clear. There are some numbers for Facebook: 32% women in the company as a whole, and 16% women in engineering. Then Dropbox: I do not have the broken-down numbers, but they say there is a 34% in the company. The data, for me, came in the first place from l'espai.com. And at the end, in case you want to follow up, there are some links with much more of this.
Of course, these numbers are not representative of the whole industry. But in the case of these companies, we are talking about 30%, 40% women in the workforce, and between 10 and 20% in the technical teams.
What happens in OpenStack? That is the second question. First of all, what is a technical contribution, which is part of the title of this talk? We are talking about commits, we are talking about patch uploads, and we are talking about reviews. Why did I choose these? First, the commit is the basic piece of information that a developer produces in the community. And that commit has to be uploaded first, which is, maybe, another technical contribution. You can have several iterations, and then other people have to review you: okay, maybe you have to improve this, maybe not. This analysis is based on these three: commits, uploads, and reviews. But there could be other approaches, other metrics, for example doing it per company, looking at fairness in the code review process, comparing women with other people, or comparing across the different organizations. Definitely we are talking about transparency. If there are some not so good points for the project, maybe we have to improve them, but we have to do it going forward. In any case, this is very sensitive information.
So, when I started with this, I was checking the internet again, which is a great place, by the way. Then I found that there's an API named genderize.io: you give them a name, and they return something like, okay, you gave me this name, and I give you this probability of being male or female. Oh, that seems to work. And then I said, okay, let's go for it. So, we are talking about 10,000 different identities that I found in Git and Gerrit. We'll go for this later. And on top of this, those needed some manual analysis, okay? I also focused on main developers.
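The name lookup described above can be sketched roughly as follows. This is a minimal illustration, not the speaker's actual code: it assumes the genderize.io response is a JSON object with `name`, `gender` and `probability` fields, and the `overrides` table (standing in for the manual analysis) and the `min_probability` cut-off are hypothetical names introduced here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://api.genderize.io"


def build_url(first_name):
    """Build the lookup URL for one first name."""
    return f"{API_URL}?{urlencode({'name': first_name})}"


def fetch_guess(first_name):
    """Query the service over the network (not exercised in this sketch)."""
    with urlopen(build_url(first_name), timeout=10) as response:
        return json.load(response)


def classify(guess, overrides=None, min_probability=0.5):
    """Turn one API answer into 'male', 'female' or 'unknown'.

    overrides: hypothetical dict of lowercase first name -> gender,
    representing corrections made by hand.
    """
    name = (guess.get("name") or "").lower()
    if overrides and name in overrides:
        return overrides[name]
    gender = guess.get("gender")
    # Assumption: probability may arrive as a number, a string or null.
    probability = float(guess.get("probability") or 0.0)
    if gender is None or probability < min_probability:
        return "unknown"
    return gender
```

Keeping the manual corrections in a separate table means the automated guesses can be regenerated at any time without losing the hand-checked entries.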
We will see some percentages, but I don't have 100% of the population, right?
The architecture of this: we are parsing Git and Gerrit repositories. We have some mining tools coming from the Metrics Grimoire toolset, and I am a developer there, so, yeah, I know them. We are talking about CVSAnalY, which is for Git repositories, Bicho, which is for Gerrit, and SortingHat for managing identities. Then there's some process for enriching information, like adding the gender, adding the company, et cetera. And then some visualization, which in this case is based on Elasticsearch and Kibana.
Some numbers that we have here. This analysis was based on the YAML file provided by the Foundation, which is something like 450 repositories. We are talking about 400,000 commits, or a bit less if we remove bots and merges and all of this. We are talking about a quarter of a million changesets, close to one million patch uploads and more than one million patch reviews, right? Just to give you some context here.
The mining tools, as I mentioned: CVSAnalY is focused on Git, Bicho is focused on Gerrit, so we build some MySQL databases and then we start massaging this, okay? The CVSAnalY and Bicho databases are available in the activity board that I mentioned at the beginning. So, if you have some time and you would like to play with the data, you can go there. SortingHat has the organization information, the gender and everything. It's a bit more sensitive, so you should ask for it. Okay, well, we are now migrating to a new platform. Its name is GrimoireLab, more schema-less, focused on Elasticsearch and so on.
So, we have MySQL databases, and I need to aggregate the information. Okay, let's go. So, I use pandas, I have everything there. We are talking about something like three million events: one million and a half from each developer touching each file, which is kind of a lot. And the same for the reviews.
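The enrichment step (adding gender and company to each event, and dropping bots and merges) could look something like the sketch below. It uses plain dictionaries rather than the MySQL databases or pandas mentioned in the talk, and every field name (`author`, `is_merge`, `gender`, `company`) is a hypothetical choice for illustration, not the actual SortingHat or CVSAnalY schema.

```python
def enrich(events, identities, bots=frozenset()):
    """Attach gender and company to each event; skip bots and merges.

    events: list of dicts with at least an 'author' key (hypothetical layout).
    identities: dict mapping author id -> {'gender': ..., 'company': ...}.
    bots: set of author ids known to be automated accounts.
    """
    enriched = []
    for event in events:
        if event["author"] in bots or event.get("is_merge", False):
            continue  # bots and merge commits are excluded from the counts
        profile = identities.get(event["author"], {})
        enriched.append({
            **event,
            "gender": profile.get("gender", "unknown"),
            "company": profile.get("company", "unknown"),
        })
    return enriched
```

Authors missing from the identities table fall into an explicit "unknown" bucket instead of being dropped, so the size of that bucket stays visible in later charts.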
We are talking about another one million and a half. Then I started to play with the data, and with a lot of coffee and some manual work, I had some numbers. And well, Elasticsearch and Kibana are kind of a good team together, and you should try them if you have not tried them. It's really good.
Okay, some validation, as I mentioned. I wanted to be sure that at least the people listed in the Women of OpenStack wiki were right, so at least they are right. Then I went for the main projects and the main companies, and then I was looking at the top developers, so we cover most of the work. Just some numbers: something like 80% of the work is done by 20% of the community. So in terms of amount of activity, that means that we are covering kind of a big share. I had a really hard time checking Asian names. I'm really bad at this, so any help or any clue is appreciated.
And well, some numbers. So I will show now some demo, if the internet allows me to have it. Okay. This is the dashboard. On the top left, some big numbers. This is for the last five years, as we can see on the top right here. We have something like more than 200,000 commits, 5,000 developers, 44 or 45 project teams. In terms of the last five years, we are talking that males are something like 86% of the total population of OpenStack, right? Well, sorry, they produce 86% of the total activity in OpenStack, measuring commits. And in the case of women, they produce something like 7.16% of this. There's some unknown activity there, which is around 6%, a bit less than 7%. This is a bit different from the population. So if we go to the population, men are about 65% of it. So that means that the 65% of developers who are male are doing something like 86% of the total activity, right? Women, around 11% of the population, are committing something like 7% of the activity, right?
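The difference between share of activity and share of population can be made concrete with a small sketch. This is an illustration using only the standard library (the talk mentions pandas for the real aggregation); the input layout, one `(developer_id, gender)` pair per commit, is an assumption.

```python
from collections import Counter


def shares(commits):
    """Compare each gender's share of commits with its share of developers.

    commits: iterable of (developer_id, gender) pairs, one per commit.
    Returns {gender: {'activity_pct': ..., 'population_pct': ...}}.
    """
    activity = Counter()  # commits per gender
    people = {}           # developer -> gender, one entry per person
    for developer, gender in commits:
        activity[gender] += 1
        people[developer] = gender
    if not people:
        return {}
    population = Counter(people.values())
    total_commits = sum(activity.values())
    total_people = len(people)
    return {
        gender: {
            "activity_pct": 100.0 * activity[gender] / total_commits,
            "population_pct": 100.0 * population[gender] / total_people,
        }
        for gender in activity
    }
```

A group whose `activity_pct` is well below its `population_pct` is contributing fewer commits per person than average, which is the pattern described for both the female and the unknown groups.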
Then the top projects. In this case, we have this chart here, this table: in general, we have infrastructure, Neutron, Nova, documentation, quality assurance, okay? This is the top 10 projects. And we also have the type of contribution here at the bottom. Just have a look at this big part here, which is the code, okay? Python, shell, whatever. So we are talking that something like 52% of the activity in OpenStack is related to code.
Let's go for the pure numbers for female, which is clicking here. Okay. It's interesting to see a couple of things. If we go for women, the top projects are a bit different. Before, we had infrastructure, Nova, etc. The first one now is documentation. So we have documentation, infrastructure, Neutron, and Horizon. It's interesting to see this chart here, the bar chart, because this shows kind of a jump in the activity for women, right? There's a difference between 2015 and 2014, which is around 300 commits per month, while here we are talking about 600, 400, 400, 900. So that's a lot, right? In general, we had something like 55% on average of the activity in OpenStack being code. In the case of women, we are talking about a bit less, 43%, right?
We can also compare something like, let's go specifically for last year's activity, right? We have similar numbers for code activity, and we are talking that we had something like 338 female developers that were active in the community, right? They participated in close to 6,000 commits, and they participated also in around 40 project teams. I also have information about the people here, but, well, they were anonymized. I didn't feel in the mood of providing information like "this developer has this gender", so it's quite sensitive information, right?
If we go for male, which is, I click here, and then I click here. As I mentioned, the code activity is a bit higher, which is something like 55% of activity. It's a bit different.
We are talking about infrastructure in first place, Neutron, documentation, and then Nova. Okay, so we have this. Okay, sorry.
So this dashboard here is focused on Git activity and Git population, right? I also prepared another one for Gerrit activity, which is again for the last five years. Okay, main numbers. We have the big numbers, like we mentioned before: close to 1 million reviews, something like 200,000 changes, more than 4,000 developers, and around 45 project teams. This chart on the top right is showing the evolution of this activity. This is specifically code review activity, right? We are talking about that in the OpenStack project maybe in March, in the first of March of 2014, they had something like 28,000 different code reviews. +1, -1, +2, -2, right? Okay, we have this chart there.
And then let's focus on this pie chart. I don't like pie charts, but let's go for the pie chart. Okay, the one in the center is providing information about the code reviews. So we have +2, which is this one, which has 400,000. Then we have +1, which is close to 300,000. There's something like 200,000 -1, and then -2, around 10,000. So the OpenStack community seems to be really polite. You don't really like to give -1 and -2, right? In any case, let's focus on... well, the circle on the outside is providing the gender. So we are talking that 42% of the plus... oh, sorry, 89% of the +2 were done by men, while this activity in the case of female is around 8%. Then we have, again, some unknown activity here. But anyway, in the case of +1, we have slightly different numbers. So instead of having 89% by male, we have something like 76%. In the case of female, we have something like 10%, 11%.
But these are code reviews. What's going on with the core reviewers, which are those that can do -2 and +2? Please remember this chart on the top right.
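The two rings of the pie chart (vote value inside, gender outside) amount to a cross-tabulation of review votes by gender. A minimal sketch, assuming each review is available as a `(vote, gender)` pair with `vote` being one of -2, -1, 1 or 2; this layout is a simplification for illustration, not the Bicho database schema.

```python
from collections import Counter, defaultdict


def vote_breakdown(reviews):
    """For each vote value, give the percentage of votes cast by each gender.

    reviews: iterable of (vote, gender) pairs.
    Returns {vote: {gender: percentage rounded to one decimal}}.
    """
    counts = defaultdict(Counter)
    for vote, gender in reviews:
        counts[vote][gender] += 1
    breakdown = {}
    for vote, by_gender in counts.items():
        total = sum(by_gender.values())
        breakdown[vote] = {
            gender: round(100.0 * n / total, 1)
            for gender, n in by_gender.items()
        }
    return breakdown
```

Restricting the input to +2 and -2 votes approximates the core reviewer view, since those are the votes only core reviewers can cast.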
There are something like close to 30,000 code reviews. Let's go for the +2, which is running. Okay. There are still some similar numbers. We are close to 15,000. You can see that female is kind of increasing, right? It's a bit bigger than in the previous charts. And it's quite interesting, if we click on female, that we have a big jump. We even have a doubling here. We are talking about 1,000 something, and we are talking close to 2,000 here. So that means that during the last year the women's activity, at least the core review activity, has increased a lot. Okay. Some other numbers that I can show you. Well, in terms of activity, it's pretty similar to the previous numbers that we had. Okay. This is related to code review.
And then I wanted to show you something about the demographics of OpenStack, right? Okay. This pie chart is, again, displaying this idea of the population by gender. So we have male, female and unknown, right? This chart, the top bar chart, is providing information per quarter about the number of new developers that the OpenStack community had. Great. So in the second quarter of 2013 they had something like 300 new developers. Awesome. If you go to the bottom chart, which is this one, this is, again, providing information per quarter, but this is aggregating by when a developer did the last commit. Okay. So we can say things like, during the last quarter the OpenStack community had something like 1,500 developers active. Okay. And the interesting point about this is, if we click here, we see on the top when the developers were born, so when they arrived. We have some developers from the very, very beginning that are still there. We have something like nine developers that entered in May 2010. We have some developers that entered in May 2013. Well, the big bunch of developers came during the last quarter. This is quite useful.
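The two bar charts (newcomers per quarter on top, last commit per quarter at the bottom) can be derived from nothing more than each developer's first and last commit date. A minimal sketch under that assumption; the `(developer_id, date)` input layout is hypothetical.

```python
from collections import Counter
from datetime import date


def quarter(day):
    """Label a date with its calendar quarter, e.g. '2013-Q2'."""
    return f"{day.year}-Q{(day.month - 1) // 3 + 1}"


def demographics(commits):
    """Count developers by quarter of first commit and of last commit.

    commits: iterable of (developer_id, datetime.date) pairs.
    Returns (joined, last_seen), two Counters keyed by quarter label.
    """
    first, last = {}, {}
    for developer, day in commits:
        if developer not in first or day < first[developer]:
            first[developer] = day
        if developer not in last or day > last[developer]:
            last[developer] = day
    joined = Counter(quarter(d) for d in first.values())
    last_seen = Counter(quarter(d) for d in last.values())
    return joined, last_seen
```

Developers whose first and last commit fall in the same quarter are the one-off contributors; filtering `first` by a given quarter and then looking at `last` for the same people gives the drill-down shown in the demo.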
So we have in OpenStack kind of a long tail of really small contributions. One, two commits. Most of the people came, did a commit, and that's all. But if we are interested in all of this, as we have the table of top developers, we can click here and say, okay, given this list of developers (again, it's anonymized), we can know when they came. Okay, let me remove this. Let me remove this.
An interesting case here, and hopefully this is useful for the Women of OpenStack working group, is the attraction of new female developers to the community. By the way, there are some jumps that may be of interest, like May 2013, February 2014, August 2014, August 2015. They seem to be somehow related to the summits, but I'm not really sure. But they seem to be. There is a similar trend in OpenStack in general, so this is not something really special for female developers. And the point here is, okay, let's go for this May 2013. We have these female developers and when they did their last commit. So we know that from these 34 here, eight of them left the community right in that quarter. But something like 10 of them are still contributing to the community. The point here is, in the wiki of OpenStack, in the Women of OpenStack page, there's a list of 20 or 40 developers. But we are talking here about a population of 500. So if there are people that are leaving and they have the knowledge about how to contribute to OpenStack, maybe you can send an email saying, hey, we miss you, how are you, we would love it if you came around.
Okay, so this is more or less the kind of tooling I wanted to present to you, right? Let me go back to the slides. All of these numbers are in the slides, so don't worry about this.
Let's go for the analysis. This is some extra bonus for this presentation, as, yes, I still have time. Is Outreachy helpful? It is helpful for a lot of things, but specifically, is Outreachy helping with the gender gap in OpenStack?
So, yeah. Well, the idea of Outreachy, this is from the website: Outreachy helps people from groups underrepresented in free and open source software get involved. Great. How is the community performing at retaining those developers? So there's a developer that came to the community; how good is OpenStack at retaining them? I think it's a little bit different. I'm going to start with a small analysis of the periods starting in 2013, so we have four of them. Well, if we say that developers are active when they contributed during the last year, the 2015 Outreachy rounds are not that interesting. Well, we can compare this Outreachy retention with the general retention of the community.
So let's go for some numbers. The ones still contributing are in orange. If we go for the first round, as far as I know the first one that OpenStack took part in, they attracted one developer through the Outreachy program. That developer is not contributing anymore. If we go for the second, it's February 2014. Four developers were attracted by the community. One is still contributing. So 25% retention. Then we have two developers in August 2014. No developer is still contributing. And we have something like six developers that were attracted in 2015 through Outreachy. One developer is still contributing.
Let's go for the top right chart here. So this is for those months where I had a look at Outreachy. I went for those and I looked specifically for the women that entered at that moment and are still contributing to the project or not. So in blue again, the attracted developers; in orange, the ones that are still contributing. So in the first period, around 15 of them entered as newcomers to the community and we have five of them still contributing. A similar number in February 2014, around 16. I think in this case it's one or two developers still contributing. Then there's a high increase in August 2014, something like 22. And we have something close to nine developers that are still contributing.
And for the last period we have something like 20 developers that were attracted, and four of them are still contributing. So it seems that the numbers show that there's a better retention. I mean, if we compare both, it seems that having new women coming through the companies is a bit better than having women coming through Outreachy, in terms of retention. Let me clarify that. This is the comparison of the retention rates. The blue chart is showing the retention rate for women in the general OpenStack community. And the orange one is the retention rate for the Outreachy program. So in the first period we have a retention rate for women coming from any place of something like 30-something percent, and we have zero in the case of Outreachy. If we go for 2014 we have something like, interesting by the way, 5% in the case of the companies and the general organizations, and we have 25% for Outreachy. In the case of August 2014 we are talking about 40%, but again zero for Outreachy. And then finally we have approximately a 20% retention rate for 2015. And Outreachy is quite similar in this case.
So some conclusions that I may have from this small analysis. More women are attracted and retained, also in relative numbers, thanks to companies in OpenStack. So we have big companies coming, claiming something between 10% and 20%, and they bring more than the outreach programs, even though those are specifically focused on this problem of decreasing the gender gap, right? Even so, I would like to say that we have big companies claiming that they are between 10% and 20%, and mostly close to 19%, but hey, we are having 11% in OpenStack, so there's still room for improvement. What's going on here?
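The retention comparison is simple arithmetic: developers still contributing divided by developers attracted, per period. The sketch below plugs in the approximate counts quoted in the talk. The period labels are placeholders, the "one or two" case is taken as one, and the function name is hypothetical.

```python
def retention_rate(attracted, still_contributing):
    """Percentage of attracted developers who are still contributing.

    Returns None when nobody was attracted, to avoid dividing by zero.
    """
    if attracted == 0:
        return None
    return 100.0 * still_contributing / attracted


# Approximate (attracted, still contributing) counts as quoted in the talk,
# in chronological order of the four periods analysed.
outreach = [(1, 0), (4, 1), (2, 0), (6, 1)]
general = [(15, 5), (16, 1), (22, 9), (20, 4)]

for period, (o, g) in enumerate(zip(outreach, general), start=1):
    print(f"period {period}: outreach {retention_rate(*o):.0f}%, "
          f"general {retention_rate(*g):.0f}%")
```

With cohorts this small, one developer staying or leaving moves the rate by 15 to 100 percentage points, so the comparison is better read as an indication than as a firm result.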
And then a couple of questions that I would like to raise. Is it worth exploring another kind of investment? So we have the outreach programs, which are great. Can we go to companies and say, hey, let's say we are kindly pushing you to have a more diverse group here? We would love to have it and we would love to hear that from you. Because we have the numbers indeed and we can show them to them. Maybe, since we are focused on degree programs, what about going to high schools? I don't know about the USA or other countries; I am from Spain, so I know what's going on there but not here. So those are some ideas that we have.
So some conclusions. First, decisions should be based on data. We may have the perception that something is going well, but then the data tells you no, that's not true. The point is that we need data, right? So this kind of dashboard, tooling and all of this could be useful somehow, I would say.
Some answers. There's a continuous increase in the attraction of women to OpenStack, and it's growing. We are talking about 11% during the last year. Compared to the numbers from the last summits, I think they said that it was 13% of attendees. There's a great increase in the core reviews, women doing +2 and -2, which is great. I would say that most of the women are coming with new organizations joining the Foundation. So this is probably a good path to explore, maybe. I would say that having some tooling is good to have some numbers and to have some comparison and finally to make decisions, right?
As I mentioned, we are having something like 25% of the population with unknown gender, but they are only doing less than 7% of the activity. So there's room for improvement of the data set. I also found some false positives, but that was done in my spare time, so I have two hands only. But at least this provides some numbers about the initial status of all of this, right? I don't know, I guess.
Some open paths. Well, if you remember, at the very beginning I mentioned that we had a probability given a name. We are talking about 550 female developers for any kind of probability, well, more than 50%. But if we go for 100%, those numbers drop to something like 200. If we play with the dashboard as I did before, for that probability, we have similar activity. Similar activity and population. Numbers are pretty similar in that case.
So, as we have the names and emails and everything, well, we can talk to any of them and let them know that we are here and that it's open to participate in everything. So maybe as a new developer comes in and the system says hello, maybe there's a way of welcoming them to the community, I don't know. Or maybe any other diverse group.
So, further work. As I mentioned, this is sensitive info, well, at least from my point of view. I don't know yours. So the dashboard is still private. We may go for something public. Based on this, and this is kind of the work of Bitergia, we can provide any kind of analysis. So we can go for something like time to merge fairness, so whether the code review is fair between men and women or things like this. We can go for percentages of which are the most diverse project teams or the most diverse companies in OpenStack. So this may help to say, hey, they are doing great, so maybe the others can have a look at this. Quarterly reports, or maybe analysis like this: we define some policy, we have data from before, we have data from after, and we can compare the data and check what is going on.
So, hopefully this is useful to have a better picture of the Women of OpenStack activity. Well, I also have to say, we are looking for a sponsor for this. It's quite heavy, time consuming. So, thank you very much. I don't know if we have any questions. Thank you. Oh, please, if someone has one, I think you should stand up and go to the microphone.
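The sensitivity to the probability cut-off mentioned at the start of this part (every guess above 50% versus only fully certain guesses) can be checked with a small sketch. The `(gender, probability)` input layout and the function name are assumptions for illustration.

```python
def count_by_threshold(guesses, gender, thresholds):
    """Count developers classified as `gender` at each probability cut-off.

    guesses: iterable of (gender, probability) pairs, one per developer.
    thresholds: iterable of minimum probabilities, e.g. [0.5, 0.9, 1.0].
    Returns {threshold: number of developers at or above it}.
    """
    guesses = list(guesses)  # allow several passes over a generator
    return {
        threshold: sum(
            1 for guessed, probability in guesses
            if guessed == gender and probability >= threshold
        )
        for threshold in thresholds
    }
```

Running the same activity and population percentages at each threshold shows whether the conclusions depend on the uncertain names or hold for the confidently classified subset as well.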
Is there a specific call to action to help you get more data corrected for the name identification? Asian names are really hard for this.
Yeah, I know that.
Is there anything specific we could patch, do, help with?
Well, you can go to each of the companies asking for the specific data, so they can provide more accurate data sets.
So, is it a call out to the individuals to identify?
That would be great, if they identify the country and things like that. Oh, please, go ahead.
You may not know, but one of the suggestions would be to integrate the OpenStack ID with Gerrit, so that Gerrit uses the same ID that is provided when you log in as a Foundation member. There you specify your gender however you like. So, in the Foundation I think that there is 40% coverage in there for new members, and there could be a push from the Foundation. So, if we end up using the same database, we can collect that information much more precisely.
That would be really helpful, thank you. Probably we need something from the Foundation.
Daniel, this is absolutely fantastic, because it was just a casual conversation that we had in Tokyo and I did not realize how much work you guys have put in and, you know, data speaks louder than words. So, it's really good, and in fact in the breakfast meeting of the Women of OpenStack some of us were saying we want to see metrics, because that's the true measure of progress, right? And the actual numbers are changing, especially coding numbers and Gerrit numbers and things like that. So, this is great and we'd like to see how we can take this further. So, I'll follow up with you on that. And I also want to give a shout out to Lana and to Anne Gentle. The reason the documentation project is doing so extremely well is that they have very inclusive leaders such as Anne and Lana as PTLs, right?
Given that the projects grow and shrink rapidly, it would be very interesting to see: clearly there are more and more women participating, but it's not clear whether proportionately more and more women are participating. Does that make sense?
Yeah, well, I don't have the numbers right here.
Yeah, I understand you don't, but...
It's doable, I mean, yeah.
And the other thing about your presentation, which was great, and that has my head spinning: I'm peripherally involved with Wikipedia, and people there have spent a lot of time trying to correlate these little micro interactions that people have, that are largely specific to Wikipedia, like edit reversions, for instance, and how these positive or negative interactions discourage people and cause them to drop out of the project. And of course Gerrit is a perfect Petri dish of measurable interactions, right? I mean, presumably a minus one isn't necessarily a negative interaction. I feel like there are great depths to plumb there in terms of what process results in developers being discouraged or encouraged or welcomed when they arrive and so forth.
Yeah, well, first I'd say that in the activity board we are aggregating information from many more data sources, like IRC channels, mailing lists and all of this. So we don't really have to focus on Gerrit or Git, because there are a lot of interactions around. And probably some social network analysis would be great, because probably if the documentation people are doing a great job there, it's because they are good at some social stuff, for sure. This is people, this is a lot of people in the end. So probably measuring these kinds of things would be great, I guess.
Thank you for the great presentation. Thank you. It was really good.
One of the things in your recommendations or suggestions which I liked was high school. I really liked that idea of bringing more awareness and also giving opportunities to high school students. I have not seen many high school students even being aware of OpenStack or of how to contribute. And extending that, even with undergrads, when I was giving some of the tech talks in the universities, not everybody is aware of OpenStack and how to contribute and everything. So I don't have any suggestion, but I'm just saying I'd love to help however I can. I'm going to definitely help with that. But if there is another way, we could all have a forum or something so that we can say, hey, go and join this, and this is a quick cheat sheet, or you'll have mentors and everything. I would love to have any kind of resources for that.
Yeah, that's it. I agree with you.
Just wanted to say thanks for putting together the data. But I was curious if there are any ideas or plans to address the people outside of the gender binary, the non-conforming and genderqueer. A lot of those people are in the unknowns, or maybe some people that have been marked male or female don't actually identify that way.
Yeah, I agree with you. Well, I tried to do this in a semi-automated way, so that's the idea I had about the first names, and I went with this. Then, if we need more qualitative data, we should probably go, as Stefano mentioned, with this idea of a gender field where you can specify something else. Having this automated, if you have any idea about this, would be great, but I don't have a clue so far.
So, more comments or questions? No? Thank you very much for your time. Thank you.