All right, welcome, everyone. Good morning, good afternoon, good evening. I am delighted to welcome you all here. I'm Cassidy Sugimoto. I'm the chair of the School of Public Policy at Georgia Tech. And I am absolutely delighted to be joined today by our talented presenters to discuss disrupting the status quo or perpetuating inequalities. When we were designing MetaScience 2021, it was clear that we wanted to have a session that would focus on diversity, equity, and inclusion. And recognizing the methodological angle of the conference, we decided to go with a session that would present some of the state-of-the-art techniques for incorporating DEI variables in large-scale studies. I want to make a note that these talks are not meant to be exhaustive or even comprehensive. Rather, the goal is to present a particular approach that allows us to center our discussion on advances in research in our field and where we can go in future directions. I want to mention a few things. We take as our starting assumption that contemporary scientific practices are not value-free and perpetuate several systemic biases in science and society. We also acknowledge that our data remain embedded in categorizations and taxonomies that reinforce problematic racial and gender categories that may be exclusionary. What we hope to do today is to demonstrate some empirical studies in this area to begin to interrogate our current approaches and to create some new ones. So to do that, we call upon you as our audience to engage. We've organized this event to allow ample time for you to question, to critique, and to imagine. We will do short presentations with focused questions from you after each and then turn to a general panel and discussion. While I have some questions prepared, I would ideally like to focus entirely on your questions. So use the chat function early and use it often. Side conversations are not only allowable, they are encouraged.
So with that preamble, it is my great pleasure to introduce our first speaker, Dani Bassett, a named professor of bioengineering with a secondary appointment in physics and astronomy at the University of Pennsylvania. They received a BS in physics from Penn State and a PhD in physics from the University of Cambridge as a Churchill Scholar and as an NIH Health Sciences Scholar. They then did a postdoc at UC Santa Barbara and were a junior research fellow at the Sage Center for the Study of the Mind. The American Psychological Association called them a rising star in 2012, and they have certainly lived up to that label, becoming a Sloan Research Fellow, a MacArthur Fellow, an Office of Naval Research Young Investigator, a CAREER awardee, and more recently an Erdős-Rényi Prize awardee. Their publication record is an expression of their industry and insatiable curiosity. So I'm delighted that they brought their skills as a neural and systems engineer to study disparities in science. Dr. Bassett, the floor is yours. Thank you so much for the very kind introduction. I'm really excited to be with you all here today, and I also just wanted to underscore what Cassidy said about being excited about the conversation and how to engage in this work even better in the future. So the title of my talk today is racial, ethnic, and gender imbalances in reference lists in scientific papers. And as we all know, gender and racial inequalities are pervasive in academia as well as in industry. They have been reported in compensation, in grant funding, in credit for collaborative work, in teaching evaluations, in hiring and promotion, and in productivity and authorship. What's really interesting about the conversation around these inequalities is that most of it still focuses largely on the role of people in positions of power, such as journal editors, grant reviewers and agencies, department chairs, or society presidents.
And that's interesting because many of the imbalances are in fact caused and perpetuated by researchers at all levels. And this particular systemic imbalance is something that holds true for citations, which is going to be the focus of my talk today. So citations are a really interesting and multifaceted object. Sara Ahmed is a fantastic scholar who focuses on diversity work across academia, and she refers to citations as academic bricks. And there are at least two ways in which citations are academic bricks. First, they're the basic building blocks of academic careers. So they are the building blocks of success, of compensation, of promotion, of grant and other funding awards, of collaborative opportunities, and of speaking invitations. Even if we would hope that citations were not so influential, the fact is that they very much are. But citations are also the basic building blocks of whole fields of inquiry. So citations map scholarly fields. They define spaces of inquiry and spaces of non-inquiry. They determine the scope of questions considered and the scope of questions that are not considered. And they record a history of scientific ideas. And I am careful to say not the history, not necessarily the true history, but a history of scientific ideas. So as academic bricks, then, citations can build a more diverse scientific community or they can erect walls of exclusion. So we wanted to study citations and specifically understand whether we cite equitably or instead with clear preferences for a given gender, race, or ethnicity. In the first study, we collaborated with a range of scientists: statisticians, a physicist, and a professor who works in gender and race theory. Our approach was to examine the authors and reference lists of 61,000 articles published since 1995 in five top neuroscience journals. Physics is coming shortly. These journals were reported by the Web of Science to have the highest Eigenfactor scores in the field.
What we did next was to assign the term man or the term woman to each author if their name had a probability greater than or equal to 0.7 of belonging to someone labeled as a man or a woman in the Social Security Administration database or in the Gender API. There are some important limitations in our analysis, and I want to put them up front here. First, the term gender in our analysis really refers to the genderedness of a name. It does not directly refer to the sex of the author as assigned at birth or chosen later, nor does it refer directly to the gender of the author as socially assigned or as self-chosen. Second, I want to mention that this binary man/woman gender assignment is not well accommodated to intersex, transgender, and/or non-binary identities. We have some ideas of ways to engage with those communities further, and I'd be happy to talk more about that in the question and answer period. So first, just to set the stage, the genders of authors over time: here on the left side, we have the proportions of papers written by different author categories in 1995 and then up through to 2018. What we found is that the proportion of articles with a woman as first or last author significantly increased between 1995 and 2018, at a rate of roughly 0.6 percentage points per year. Now with that data, we wanted to test three hypotheses. The first hypothesis is a general under-citation of women. So for each of 31,418 citing papers between 2009 and 2018, we took the subset of its citations that had been published in one of those top five journals since 1995 and determined the predicted gender of the cited first and last authors. These two author positions are quite important in neuroscience. Note that we also removed self-citations, defined as cited papers for which either the first or last author of the citing paper was also the first or last author.
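To make the labeling rule concrete, here is a minimal sketch of the thresholded gender assignment just described. The probability table and names are hypothetical stand-ins; the actual study drew its probabilities from the Social Security Administration data and the Gender API.

```python
# Minimal sketch of the thresholded gender-assignment step.
# Probabilities below are illustrative, not real reference-data values.

THRESHOLD = 0.7  # minimum probability required to assign a label

# P(name belongs to someone labeled "woman") -- made-up example values
p_woman = {"maria": 0.98, "james": 0.01, "alex": 0.45}

def assign_gender(first_name, table, threshold=THRESHOLD):
    p = table.get(first_name.lower())
    if p is None:
        return "unknown"          # name absent from the reference data
    if p >= threshold:
        return "woman"
    if (1 - p) >= threshold:      # equivalently, P(man) >= threshold
        return "man"
    return "unknown"              # ambiguous names stay unlabeled

print(assign_gender("Maria", p_woman))  # woman
print(assign_gender("James", p_woman))  # man
print(assign_gender("Alex", p_woman))   # unknown
```

Names whose probability falls between 0.3 and 0.7 in either direction remain "unknown", which is the group discussed in the Q&A below.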
That allowed us to calculate the number of cited papers that fell into each of the four first-author and last-author categories: man-man, woman-man, man-woman, and woman-woman. Next, in order to determine whether any particular group is being over-cited or under-cited, we have to have an understanding of how many citations a paper is expected to receive. To obtain that number of expected citations, we used at first a random-draws approach, where we calculated the gender proportions among all papers published prior to the citing paper. That represents the proportions among the whole pool of papers that the authors could have cited, and we multiply that by the number of papers that the authors cited. With this definition of the expected proportion, we can then determine over- and under-citation as the observed percent minus the expected percent, divided by the expected percent. What we find is that the man-man papers are over-cited by 11% and the woman-woman papers are under-cited by 30.2%, for a difference of over 40 percentage points. Now, you might say that when you place references inside of your citation list, you don't randomly draw from the literature, and that is absolutely true. So we followed up that initial analysis with a second assessment of expected citation counts that accounts for additional paper characteristics: the year of publication; the journal in which it was published, since some journals have higher impact factors and papers in those journals will perhaps tend to be cited more; the number of authors on the paper; whether the paper was a review article or an empirical paper, because review articles are typically cited more frequently than empirical papers; and fifth, the seniority of the paper's first and last authors. Now, I'll note that we don't have precise ages for the authors in our study, so we used as a proxy the number of papers that that person had previously published.
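As a rough illustration of the random-draws measure, the sketch below computes (observed minus expected) divided by expected for each author category. All counts and proportions here are invented for the example; in the study, the pool proportions come from the gender makeup of all papers published before the citing paper.

```python
# Illustrative sketch of the random-draws over/under-citation measure.
# pool_props: gender-category shares among all citable (earlier) papers.
# observed: citation counts in a hypothetical reference list,
# self-citations already removed. All numbers are made up.

pool_props = {"MM": 0.55, "WM": 0.20, "MW": 0.15, "WW": 0.10}
observed = {"MM": 30, "WM": 8, "MW": 7, "WW": 3}

def over_under_citation(observed, pool_props):
    """(observed - expected) / expected, where expected = pool share * total cited."""
    total = sum(observed.values())
    return {cat: (obs - pool_props[cat] * total) / (pool_props[cat] * total)
            for cat, obs in observed.items()}

for cat, v in over_under_citation(observed, pool_props).items():
    print(f"{cat}: {v:+.1%}")
```

A positive value means the category is cited more often than random draws from the pool would predict; a negative value means it is under-cited.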
So that sort of productivity mark was an estimate of their seniority. Then we specified a generalized additive model on the multinomial outcome of paper authorship in those same four author categories. And we found again that the man-man papers are over-cited, by 5.2%, and the woman-woman papers are under-cited, by 13.9%, for a total difference of about 20 percentage points. Our next hypothesis was that the under-citation would be driven by the majority group. And you'll see this here for gender and later for race and ethnicity, where the majority group is white. So here what you can see is the citing practices of the man-man teams. Along the y-axis is percent over- and under-citation, and along the x-axis are the four author categories. What you can see is that the man-man teams tend to over-cite other man-man papers and under-cite the woman-woman papers. Whereas if you look at the citing practices of papers that have a woman in either the first or the last author position, or both, you see that they cite very close to the zero line, so very close to the expected rates. Now, the third hypothesis that we wanted to test was that these effects should be decreasing with time. We're optimists here, and we expected that, given the growing diversity in academia, the under-citation of minority groups should decrease with time. But what we actually found is that the imbalance within these reference lists is increasing with time, contrary to our hypothesis. And that's particularly true in the papers authored with a man in the first and last positions. So here you see, along the y-axis, the percent over- and under-citation for papers that have a man in the first and last positions. You can see they increasingly cite man-man papers over time and decreasingly cite woman-woman papers over time.
As a forecast: in a few minutes you're going to see the same effect in race, where the majority race is increasingly citing its own race. And on the right-hand side here, you see the same plot, but now for the citing practices of papers that have a woman in either the first or the last author position. Here you can see that there's a slight widening of the citation gap in this group, but it's significantly less than what you see in the man-man papers. So now we can ask the question: in what other dimensions of difference might these kinds of disparities arise? And the obvious next one to check is race and ethnicity. I'll note that the first group of individuals who did the gender study was a relatively small and not racially or ethnically diverse group. So in addressing this second question, we broadened our team significantly, and we're very grateful for all of the individuals who provided their additional expertise and their experiences in crafting the narrative around these results. For race and ethnicity, what we did was assign the author race and ethnicity using, again, publicly available probabilistic databases and a deep neural network that learns the relationship between names and racial or ethnic categories in voter registration data, US census data, and Wikipedia entries. Note that the voter registration and census data are US-centric, whereas Wikipedia is quite international. The approach then allows us to estimate the probability distribution across four racial and ethnic categories (Asian, black, Hispanic, and white) based on each author's first and last names. What I'll show you for the next few slides, though, collapses all of that data into either white authors or authors of color. Across the 63,000 articles, the proportion of articles with a person of color as either first or last author significantly increased from 1995 to 2019, at a rate of roughly 0.49 percentage points per year. So that's the good news.
The bad news is that authors of color are consistently under-cited in comparison to both the random-draws model and the paper-characteristics model. Here's the random-draws model, just so that I can parallel what I showed you in the gender study. Along the y-axis is the percent over- and under-citation, and along the x-axis are the four author categories: white-white, white-color, color-white, and author of color-author of color. The expected citation counts are in the light, unsaturated violin plots, and the true data are in the saturated, colored violin plots. You can see that white-white papers are being significantly over-cited, by about 8%, and author of color-author of color papers are being under-cited, by 17.2%, for a difference of about 25 percentage points. Note that the white citers are driving the majority of this effect. So white people like me are over-citing other white-white papers by about 12% and under-citing author of color papers by 24%. Citers of color, on the other hand, are citing much closer to the zero line, so much closer to parity, or equality: they are over-citing white papers by 4.3% and under-citing author of color papers by 8%. Now, moving beyond the random-draws model and accounting for these additional characteristics of papers that might be important or might be explanatory separate from race or ethnicity, we included the year of publication, the journal of publication, the number of authors, whether it was a research article or a review article, the first- and last-author seniority, and in this case a sixth factor, which is the location of the author's institution. Now, according to this model, again, you see the same effects. Over everybody, the white-white papers are being over-cited by 5%, and the author of color-author of color papers are being under-cited by 9.3%.
And the majority of that effect is being driven by white citers, who over-cite other white-white papers by 7% and under-cite author of color papers by 14%. Again, citers of color are citing closer to the zero line. Now, is the racial and ethnic imbalance in citations increasing or decreasing with time? What we see is that this also is increasing with time, although at a slower rate than we saw for gender. Here, along the y-axis, is the percent over- and under-citation again, and the x-axis is time. What you can see is that white citers are over-citing white-white papers increasingly with time, so this purple slope is sloping upwards, whereas they're citing author of color papers less and less with time, and that's the burnt orange line going down. Citers of color, as you can see, again are citing closer to the zero line, but you can still see a widening gap, although to a much lesser extent than what you see in the white citers. So just as a quick pictorial summary, well, it's actually a data summary: it's a summary of intersectionality, which is what happens when we account for not just the predicted race or ethnicity of the person based on their name, but also the gender of the person based on their name. Here, along the y-axis, is the first author's race, ethnicity, and gender, and along the x-axis is the same information for the last author. The color indicates over-citation in red and under-citation in blue. A quick bird's-eye view of this figure shows you that there's a clear demarcation by gender, with the man-man papers, in the top left quadrant, being in general over-cited and the woman-woman papers, in the bottom right quadrant, being in general under-cited. But even within those gender bins, you can see a parameterization by race. And I wanted to call out in particular the endpoints: black women being under-cited by 47% and white men being over-cited by 24%, for a difference of about 70 percentage points.
And again, these effects are being largely driven by the majority race and the majority gender, and they are increasing with time. So obviously that's a bit of a downer, but I think it's important to think about what we can do. And I think what we can do is attempt to do science better. I like to quote Maya Angelou in saying, "The truth is, no one of us can be free until everybody is free." And I think all of us want everybody to be free. So one of the things that we have chosen to do, and encourage others to do, is to just check and fix your own reference lists as you're writing papers, so that none of us contributes to an ongoing imbalance in these citation practices. And there's code available that we've developed. Feel free to use it; we'd love to know about bugs, et cetera. Number two, append a citation diversity statement to your paper. That is a way of increasing awareness about this disparity and also of holding ourselves accountable to one another. And if you'd like to read a perspective paper on that, we published one in 2020. And lastly, consider contributing to this field by bringing more inequality to light and developing more mitigation tools. For example, there is a citation transparency extension available from the Chrome Web Store, and tools like that can be very useful for helping us to further this work. So with that, I'd like to say thanks. I think we have maybe a few minutes for questions before turning to the next speaker, and Cassidy can say... Thank you so much, Dani. I am so impressed with the robustness of your work, and it is just very important work, as noted by people in the chat already who have utilized it both in their research and in translations to practice. There are a few questions, so I will move through those.
The first comes from Rose Branson, who asked: what impact, if any, do you think the underrepresentation and undercounting of people of color in voter records and census data has on the neural network learning names, and, resultingly, on the analyses of the names and citations? Yeah, that's a really, really great question. So I think that for the voter registration and the census data, there is not an equal representation of people from these racial and ethnic categories. For the Wikipedia data, though, I would say that the representation is much better. I will note that there is one group of individuals for whom the predictions are not particularly good, and those are black individuals in the United States who have Anglo names that have come through the history of slavery, from the people by whom they were enslaved. So that's a group of individuals for whom the predictions are poor, and that's something that we all have to live with in the history of that country, my country, unfortunately. And related to that, but in the gender space, one attendee asked about unisex names, noting that this applies to many names. Can you address this, and how the unknowns in gender might be unequally distributed across the population? Yeah, that's a really great question. So there is a percentage of names to which we cannot assign a gender, in that they don't pass the 0.7 mark; they would be called unknown in our data. At the moment, we just focus on the names that are known, in the sense that they are gendered enough to cross that threshold. But I think that means that the results we find are really name-based biases that are evident in the languages and communities in which names are quite gendered.
For languages and communities in which names are less gendered, I think there need to be different sorts of tools and different sorts of analyses to uncover these same kinds of discriminatory patterns. And a follow-up question: what is the distribution of unknown-gender names across races? Is this limitation noted in the paper? That's easy to find. It's not in the paper, and we have not computed it, but that's easy to do, so I'd be happy to follow up on that. And maybe one last question before we move on, from Dominique Roche, who wrote that these results are really interesting and worrying, and I agree. Is there any experimental evidence that authors actively investigate the gender of the authors of the papers they cite? For example, first names are not always indicated on PDFs or even on journal webpages, so determining an author's gender, or inferring an author's gender, requires an extra step of looking them up online. Could it be that men tend to cite other men because they know them and are familiar with them, whereas women tend to cite authors with whom they're not necessarily familiar? That is such a good question. So after 2006, many journals switched to using first names, and so those are more and more available for most journals in the newer work. But before 2006, it was not as common. So predicting genders for those older papers probably came from knowing that person or from other people suggesting that work for you to read. But for everything past 2006, most of the journals are including first names now. The second question is: do people look each other up online? I think that's certainly something that we are doing to increase our knowledge about, and our ability to cite, other individuals who may not identify as a woman or as a man, or who identify as a trans woman or a trans man, who may not be in the cis categories that we would be thinking about more typically.
So that's something that we are actively using as a positive action that we can take. Another attendee followed up and said: also curious about the mechanisms. Can we know, effectively, that this is some form of implicit bias, or are there other correlative effects that may be driving some of these things, such as networks, relationships, et cetera, that we might be able to control for? That's another really good question. Some of the data that I was not able to go through today because of time, but that is included in both of those papers, shows that social networks do play a role and do account for some of the variance in these patterns. There's still variance left unexplained, but there is a significant amount of variance that is explained by co-authorship patterns. So if there is a person nearby in your co-authorship network, you tend to cite them. There's also significant gender- and race-based homophily in co-authorship networks, and so it's possible for somebody to be in a group that is mostly a majority gender and mostly a majority race and to cite those individuals. So I do think that driving changes in social networks and in co-authorship networks could be a great potential mitigation tool. And I said one more, but I'm going to ask you one more. David Bernard said: are you able to see any difference in trends pre and post the journals' policy change in 2006? Right now, we just see a change in the number of authors to whom we can assign a predicted gender; that's all we have. Fantastic. All right, I'd like everyone to join me in a proverbial hand-clapping and gratitude for Dani being here and sharing their expertise. We're going to come back to you, Dani, but for a moment we're going to turn over to Diego. Diego Kozlowski is a PhD candidate at the doctoral training unit in data-driven computational modeling at the University of Luxembourg.
His work focuses on implementing computational methods to answer questions in social science, particularly in the science of science. He's an economist by training, with a degree from the University of Buenos Aires and a master's degree from the same institution in data mining and knowledge discovery. He's been taking his skills in data mining to study inequities, which he will present here. Diego, take it away. Thanks, Cassidy, for the introduction, and thanks, Dani, for the presentation; I think it was really, really great. I'm really happy to be here with all of you. So today I'm going to present a research project called Intersectional Inequalities in Science. First, I'm going to talk a little bit about name-based racial inference, as Dani did, but with a different approach. Then I'm going to use those conclusions to apply this disambiguation to authors in the U.S., and, if time allows, a little case example. So Zuberi says that the racialization of data is an artifact of both the struggles to preserve and to destroy racial stratification. In our study, we want to understand how the cultural construct of race in the U.S. influences U.S. academia and generates inequalities. But our bibliometric databases, namely the Web of Science, don't have information on self-identified race, so we first have to make an inference based on the information that we do have, which in this case is names, as Dani was explaining. But, as was mentioned before, this can introduce new biases: the algorithms for racial inference, in trying to assign a racial category, can generate new biases and can underestimate populations. So first we want to focus on how we can make the least biased methods possible for this. Since 2008, most authors in the Web of Science have a given name and a family name.
And each of these names can be assigned a probability distribution over the categories from the census, where given names in our case come from information on mortgage applications and family names from the U.S. census. If our goal is to assign a single label, meaning a single racial category, to an author, then we have two things to consider. First, which of these probabilities, or which combination of them, to use. And second, an assignment mechanism, for example thresholding. So let's say we have a Juan Lee, who has a certain distribution for his given name and for his family name, and we want to use a 90% threshold on given names. We would say that Juan is probably a Latinx author, because there is more than a 90% probability that someone with the given name Juan self-identifies as Latinx. Whereas if we want to use family names, the family name Lee is partially associated with the Asian population, partially with the white population, and partially with the black population, and therefore we won't be able to assign it to any of these categories. So with this general framework, we define several different models and then compare them to see which ones are more or less biased. First, we use family names, then given names, then a mixture of both distributions. And also, as I was saying before, the information on given names comes from mortgage application data, which has a very different underlying distribution and is particularly biased towards the white population. Therefore, we also built these same models using a normalized version of the distribution of given names that matches, at the aggregate level, the one in the census.
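The Juan Lee example can be sketched as follows. The probability distributions below are illustrative stand-ins, not real mortgage-application or census values.

```python
# Sketch of the thresholding assignment mechanism described above.
# Both distributions are made up for illustration.

given_name_probs = {
    "juan": {"asian": 0.01, "black": 0.02, "latinx": 0.93, "white": 0.04},
}
family_name_probs = {
    "lee": {"asian": 0.42, "black": 0.17, "latinx": 0.02, "white": 0.39},
}

def assign_category(dist, threshold=0.9):
    """Return the single category whose probability clears the threshold,
    or None if no category does."""
    for category, p in dist.items():
        if p >= threshold:
            return category
    return None

# Using given names, Juan clears the 90% threshold as Latinx...
print(assign_category(given_name_probs["juan"]))   # latinx
# ...but using family names, Lee cannot be assigned to any single category.
print(assign_category(family_name_probs["lee"]))   # None
```

The unassignable case is exactly what motivates the fractional-counting alternative discussed next.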
But we can also abandon our original idea of assigning a single category and instead use fractional counting, which means that we are not going to uniquely identify an author with a single group, but rather use the full probability distribution. At the aggregate level, we then sum each of those probabilities. In this way, we get the least biased result, specifically if we use the family names that come from the census information. So in this example here, we are using a 90% threshold on authors in the Web of Science. The first column is the fractional counting, which doesn't use any threshold at all and uses the information from the census; that's what we consider our gold standard, because it's the best information that we have available. All the rest are the different combinations of models that we tried. We can see that in all cases, the black population is heavily underestimated, and for almost all the models, the Latinx population is also underestimated. And this is something that Dani was explaining before: these models perform particularly badly for the black population, and that is related to the history of slavery and how naming practices in the US were formed. If we use a different threshold, we see more or less the same results. In this case, we are showing the ratio between the fractional counting and each of the different models, so a value of one means the same numbers we would get with the fractional counting. We can see that all models heavily underestimate the black population, and almost all models heavily underestimate the Latinx population. On a partially different note: in the Web of Science, even when we restrict our analysis to the US, there are still many names that are missing from the census data, so we want to impute them somehow to avoid losing that information.
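A minimal sketch of fractional counting, assuming we already have a per-author probability distribution over categories (the values below are invented): instead of thresholding, each author's full distribution is summed into the aggregate.

```python
# Fractional counting: no author is forced into a single category;
# the aggregate count for each group is the sum of per-author probabilities.
# All distributions below are illustrative, not real census values.

authors = [
    {"asian": 0.42, "black": 0.17, "latinx": 0.02, "white": 0.39},  # e.g. Lee
    {"asian": 0.01, "black": 0.02, "latinx": 0.93, "white": 0.04},  # e.g. Juan
    {"asian": 0.05, "black": 0.30, "latinx": 0.05, "white": 0.60},
]

def fractional_counts(distributions):
    """Aggregate expected group sizes by summing per-author probabilities."""
    totals = {}
    for dist in distributions:
        for category, p in dist.items():
            totals[category] = totals.get(category, 0.0) + p
    return totals

print(fractional_counts(authors))
```

Because every author contributes total probability one, the group totals always sum to the number of authors, which is why no population is dropped the way it is under thresholding.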
And there we find ourselves with different possibilities. One is to use the US census aggregate, meaning the general distribution in the census. The other possibility is to use the special category that appears in the family-name data, "all other names." But when we compare these with the fractional counting performed on US Web of Science authors for those names that we do know, we can see that the underlying distribution is very different, because the distribution of US authors is very different from the distribution in the census, and therefore we would be introducing other types of biases, in this case underestimating the Asian population. So the conclusion for us is not to use this special category from the census. In summary, our recommendations, or our conclusions, are that we should use family names instead of given names, especially if the given names come from mortgage data. We didn't try the information from voter registration; maybe there is a less biased source of information there. Also, fractional counting is a better approximation than thresholding, and we should impute from our own data. But I think the most important consideration here is that we should always consider the historical context of our data, because these racial categories that I'm presenting here, and that we are using in our work, only make sense in contemporary US society, because they are a product of that society. Taking a category like Latinx authors to Latin America, for example, would not make any sense, because nearly everybody living there would be Latinx; it just doesn't make sense. And naming practices are also a product of society, and, as Dani was mentioning before, this goes all the way back to the times of slavery in the US.
So if we want to extend this type of model to a different country, it is very important to understand the historical context, how naming practices arose in that country, and their relation to the racial categories that exist in that place, in order to find the potential biases in our algorithms. And using a full distribution instead of simply assigning a label might, we understand, generate complications and make subsequent analysis more difficult, but it is the only way we found to get an unbiased analysis. So now I am going to present the use of this fractional counting on US authors in the Web of Science between 2008 and 2018. First, we look at the aggregate distributions. In this figure, we are showing the census information by race and gender; we are also using a gender disambiguation algorithm from previous work. We also show the distribution in the Web of Science, and, using the NSF data on PhD graduates and their nationality and residence status, something like a proxy for US residents. What we can see overall is that there is an overrepresentation of white and Asian men. But in the third plot, we can also see that a large portion of Asian authors are not US residents, so the census would not be the perfect benchmark for defining overrepresentation for this specific group. We can also see that women are underrepresented, that black and Latinx authors are underrepresented, and that at the intersection of race and gender, black women and Latinx women are the most underrepresented groups. If we look at the distribution over disciplines, we can see in this figure the relative under- and overrepresentation of groups. Latinx, black, and white women tend to have highly correlated distributions over fields: they tend to be more present in topics like health or psychology, and less present in topics like physics, mathematics, or engineering.
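The under- or overrepresentation just described is, at bottom, a ratio between a group's share among authors and its share in a benchmark population such as the census. A minimal sketch, with entirely hypothetical shares:

```python
# Representation ratio: a group's share among authors divided by its
# share in a benchmark population (e.g. the US census). All numbers
# below are invented for illustration.

def representation_ratio(author_share, benchmark_share):
    """> 1 means overrepresented relative to the benchmark, < 1 under."""
    return author_share / benchmark_share

author_shares = {"white men": 0.42, "black women": 0.02}
census_shares = {"white men": 0.30, "black women": 0.07}

for group in author_shares:
    r = representation_ratio(author_shares[group], census_shares[group])
    label = "over" if r > 1 else "under"
    print(f"{group}: ratio {r:.2f} ({label}represented)")

# Note: the benchmark choice matters. As discussed above, census shares
# are a poor benchmark for Asian authors, many of whom are not US
# residents; a residents-based benchmark would give a different ratio.
```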
We can also see, in the figure on the top, the distribution of citations by race and gender. In blue are the regular citations, the traditional average citations by group, and we can see that there is a big citation gap, as Danny was explaining before. We can also see, in the figure on the margin, that the distribution of citations varies a lot by field. So we also use a normalized citation average, shown in red, and we can see that the gap shrinks but is still present. Our goal now is to go deeper into the relation between the distribution over fields and race and gender, going all the way down to research topics, which are something like micro-fields. For this we focus on the discipline of health, and we define 200 specific topics using a topic model, LDA. We then analyze the distribution by race and gender of the average participation in each of these specific topics. This figure shows on the vertical axis the proportion of women in each topic, where each dot is a topic, and on the horizontal axis, for each of the four plots, the participation of each racial group. First, we can see in every panel that women tend to publish more in topics related to nursing, pregnancy, and education. In the top right panel, we can see that black authors tend to focus more on African-American studies and racial-disparity studies, while Latinx authors, in the bottom left panel, tend to focus more on Mexican and Latinx body studies, but also on English-Spanish relations and language issues. Finally, we can see in the top left panel that Asian authors have a specific topic related to China, and in the bottom right panel that white authors have no specific topic in which they are more present. And the question now is: how does this relate to the citation gaps?
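The labeling step just described, taking a topic's highest-weight words as its label, can be sketched as follows. The topic-word weights here are invented for illustration; in the actual work they would come from an LDA model fit on Web of Science abstracts:

```python
# Hypothetical topic -> word-weight tables, standing in for the
# word distributions an LDA model would produce.
topic_word_weights = {
    0: {"china": 0.09, "cohort": 0.04, "survey": 0.03, "risk": 0.02},
    1: {"pregnancy": 0.08, "nursing": 0.06, "maternal": 0.05, "care": 0.03},
}

def top_words(topic_id, k=3):
    """Return the k highest-weight words for a topic, used as its label."""
    words = topic_word_weights[topic_id]
    return [w for w, _ in sorted(words.items(), key=lambda kv: -kv[1])[:k]]

for tid in topic_word_weights:
    print(tid, "->", "/".join(top_words(tid)))
```

This mirrors how a topic whose single most distinctive word is "china" ends up labeled that way, even if two or three less distinctive words follow it.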
In this figure, we are again showing under- and overrepresentation, in this case for each of the 200 research topics, and we are sorting these topics vertically by the participation of white men. At the bottom we have the topics where white men are less present, and at the top the topics where they are overrepresented. On the margin, we can see the distribution of the average citations for each of these topics, and there is a positive correlation between the participation of white men and the average number of citations: white men tend to do research on more highly cited topics. We can also see a clear gender pattern across research topics, in the same way we showed before for disciplines, but now at this micro level within health. So does this imply that the citation gap is just a matter of distribution across research topics? Not exactly. In this next figure, we show each of these topics sorted by the average number of citations on the horizontal axis: on the right, the highly cited topics, and on the left, the low-cited topics. Each dot represents a specific race-gender group and the average citations they get in each topic, and we smooth these distributions using a LOESS function. What we can see is that all the lines at the top represent the different racial categories of men, while all the lines at the bottom represent the different groups of women. This implies that men tend to be more cited in both highly cited and less cited topics, while women tend to be less cited in both. So there is both an inter-topic and an intra-topic bias. In conclusion, there is an underrepresentation of marginalized groups at the intersection of race and gender, and these groups have specific research interests.
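The intra-topic gap described above, where men's and women's smoothed citation curves separate even within a topic, can be illustrated with a toy smoother. `loess_like` below is a simplified tricube-weighted local mean standing in for a real LOESS fit, and all the citation numbers are hypothetical:

```python
def loess_like(xs, ys, frac=0.5):
    """Tricube-weighted local mean over the nearest neighbours of each
    point; a rough stand-in for LOESS smoothing."""
    n = len(xs)
    k = max(2, int(frac * n))
    smoothed = []
    for x0 in xs:
        # indices of the k nearest neighbours of x0 (including x0 itself)
        nearest = sorted(range(n), key=lambda i: abs(xs[i] - x0))[:k]
        d_max = max(abs(xs[i] - x0) for i in nearest) or 1.0
        weights = [(1 - (abs(xs[i] - x0) / d_max) ** 3) ** 3 for i in nearest]
        total = sum(weights) or 1.0
        smoothed.append(sum(w * ys[i] for w, i in zip(weights, nearest)) / total)
    return smoothed

# Hypothetical per-topic average citations: topics ordered from
# low-cited to high-cited, one series per gender group.
topic_avg = [1, 2, 3, 4, 5, 6]
men       = [3.0, 3.5, 4.1, 5.0, 6.2, 7.1]
women     = [2.4, 2.8, 3.3, 4.1, 5.0, 5.9]

men_curve = loess_like(topic_avg, men)
women_curve = loess_like(topic_avg, women)
```

With identical weights applied to both series, the men's smoothed curve stays above the women's at every topic: a within-topic gap on top of the between-topic distribution, which is the pattern the figure shows.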
Therefore the specific research questions that mostly affect them are understudied by science. We also saw that women tend to be less cited, and that this is due both to the field and topic distribution and to within-topic bias. I also have a case example. Cassidy, if we are fine with time, maybe I can continue. Yes, please. Okay. So, as I was saying, there are relevant understudied topics in science, and these topics mainly affect marginalized groups. This can appear in the form of missing data sets. I really like a quote from the authors of Data Feminism that says that the things that we do not or cannot collect data about are very often perceived to be things that do not exist at all. So the question is: how can we make evidence-based policy on these topics if we are not studying them, if the data with which we could understand them is not there? For that, I am going to present a case example of how we can do it, using data on abortion under clandestine conditions in Argentina. I think this is a relevant case because we are currently living, specifically in the US, through an increasing persecution of abortion, and this raises the question of how we can understand this practice based on evidence rather than on personal beliefs. To give a little context, Argentina legalized abortion last year. During that debate, one of the main arguments against legalization was that abortion causes permanent trauma and distress in the people who undergo it. But for this claim there was no proof, and because abortion was illegal, no studies had been conducted to establish whether this was real or just a myth. So to bring data into the discussion, two feminist organizations joined forces. The first is a grassroots organization of female doctors who help people have the safest possible conditions when undergoing abortions, called La Revuelta.
The second is a feminist organization that, among other things, does data analysis, called Economía Feminista. I think this is really interesting because these are organizations that go beyond traditional research institutions: neither is a lab, a university, or a research center; they are grassroots organizations. In this case that is particularly important, because in a situation of clandestinity, the only way to collect information about this practice is by building trust, and a grassroots organization that helps people going through this situation is the only one that can build the necessary trust to gather this information. So La Revuelta conducted more than 400 interviews with people they had accompanied, and one of the questions they asked was which were the principal emotions that people felt after the abortion. The data shows that by far the most common emotion was actually relief, not anguish or guilt or sadness. So this evidence goes against the more or less generalized belief that abortion produces anguish and permanent trauma, and we now realize that this was just an idea popularized by anti-abortion organizations, which in the case of Argentina are most of the time also against science itself. So, to conclude for real: understudied research topics also appear in the form of missing data, and this missing data is necessary for public policy. In order to move towards a more inclusive science, we also need to include these grassroots organizations in our research. To bring this to the US context, I think that the rise in laws that persecute organizations that help people obtain abortions will not only create a more unsafe environment for the people who need this help, but will also restrict the possibility of further research on the topic. So thank you very much, and I am waiting for your questions. Thank you so much, Diego.
All right, turning to our questions; feel free to throw them in the chat or the Q&A. The first question concerns the identification of China as a prominent topic associated with the Asian classification in the dataset. The question was: is Asian inclusive of South Asia? If so, why do you suppose we are not seeing articles that relate to Pakistan, India, Bangladesh, Sri Lanka, or West Asia? So can you say a little bit about how to interpret what you found, which is the dominance of China rather than more inclusive topics across Asia? Maybe I was not detailed enough on that part; thanks for the question. The labels that I was showing in this figure are labels that we assigned based on the model that automatically generates those topics. It is not something we define a priori; this is a text-mining model from natural language processing that takes all the words in the abstracts of papers in the Web of Science and automatically generates these 200 topics, where each topic is defined by the most common words that appear in it. What we did was take the top five or top ten words of each topic to assign a label, and we only did this for the few topics that we display in the figure. So we put China, but actually there are maybe two or three more words. As far as I remember, I did not see anything related to India or Pakistan or other countries; the most distinctive word of the topic was clearly China, and that is how we labeled it. Thank you. Well, I have a question while we are waiting for other ones. Both you and Danny talked about the difficulty of accurately assigning names to black scholars, and particularly that surnames in the black community are often symbols of erasure, of ownership, even of rape. Can you talk a little bit about how we do this with sensitivity?
Going back to the title of our talk: are we perpetuating some of these problematic issues by a focus on names, and how do we avoid that and improve the validity of these studies? Maybe Diego first, and then I will bring in Danny as well. Okay, yes, I think that is a very important part of our analysis. Actually, when we started the research, our first and most intuitive idea was: let's use thresholding, take all those names that are clearly from one group, assign that label, and discard the rest because we cannot assign them. Then, when we were doing the sensitivity analysis, we realized that because of the history of naming practices and of slavery in the US, and how slave owners assigned their names to the enslaved African-American population, and because there is a higher proportion of white population, most of the black population was going to be either discarded or assigned to the white population. We were going to heavily underestimate this population, and we realized that the only way to partially solve this issue is to use the full distribution. It is not a perfect solution; clearly, this is a problem that may be impossible to solve perfectly without a survey in which people self-identify with a specific group. But our first idea was to use thresholding, which would have made this part of the population completely disappear, and I was very happy to see that in Danny's work they were not using thresholding either. Danny, do you want to jump in on that? Yeah, I think it is a really important point, and it makes me go back to one of the earlier questions, which is: what are the different ways in which bias can creep in?
One of the ways bias can creep in is through the name we see when we search on Google Scholar for a paper on a particular topic: we see a name, it looks like a particular race or ethnicity, and we say, oh, because it is that race or ethnicity I am going to read the paper, and if it were a different one I would not. Obviously I hope I never do that, but that is the way bias can creep in according to name. In that case, somebody who has a very clearly African name will be more impacted by name-based bias than somebody who does not. However, bias comes in in other ways: in who we see at conferences, in how somebody appears. When it is those kinds of effects, then obviously the estimates that we have from names are not going to be enough to capture them. So when I think about the data that we showed, I think there is definitely a group of black people who end up in the white group. And that means that the actual true citation gap is probably a lot bigger than what we estimated; the story is probably worse and more upsetting than what we showed. So I think what we have is an underestimation of what is actually going on rather than a true estimation. I would love to be able to do it better, but I do think this should be a close-to-accurate assessment of name-based bias; it is not going to be an accurate assessment of the bias that comes in once we perceive one another. Following up on that question, and both of you referenced it in your answers: the use of a fractional rather than a threshold approach. Could you say a little more about that, both how we conceptualize it and how we interpret it? To be more specific, many times when talking about this work, people will say, "but you classified me wrong," taking it to the individual level.
Can you say a little about what a policymaker can do when interpreting fractional results and fractional reporting over these data? Danny, do you want to start, and then Diego? Yes, sure, absolutely. So the goal is to report global, large-scale effects that are going to be true on average and not necessarily true for a single individual. So I think policies that focus on these broad effects, and mitigation strategies that are global rather than specific, are going to be important. Your more specific question is how we think about the fractional nature of the predictions. Particularly when you are using both first name and last name, you can have a last name with one predicted ethnicity or race and a first name with a very different one, and that can actually be indicative of the fact that the person comes from a multiracial family; this is actually their heritage, they have more than one. So that is, I think, a benefit of the probabilistic assignments and the fractional nature of the prediction: we can go beyond a single label and get a much better understanding of the more complex racial and ethnic history of the individual based on their names. But Diego, do you have other thoughts? Yes, I think this is a really good question, and in this context there are two different issues. One is related to the previous question: because there is a lot of overlap in the names of the black and white populations, when we use fractional counting we are going to blur these groups a lot. That is maybe why some results look really similar when actually it is just a sort of misclassification, or, as Danny was saying, things are probably even more segregated and worse. But the second thing, specifically related to individual classification, I think is rather good.
Actually, I think it is a benefit not to be able to have an individual-level classification. Maybe it is a little more complex when we always have to deal with a distribution on top of other distributions rather than a single category, but I think it is good. The other day I was contacted by someone from a firm, I do not fully understand these company names, but they were doing something like debt assessment. This person approached me saying: okay, we want to de-bias our models; he seemed to be trying to do something positive. But then he asked me how they could use this model, because it cannot work at an individual level. And I was really happy to say: yes, if you want unbiased results, then you can use this to assess your system and see the level of segregation and bias in your results and how much you are complicating the lives of people who are black, Latinx, or Asian. But you are not going to be able to assign the individual whose debt you want to set an interest rate on to a specific racial group. And that is really good, because they should not do that. They should be able to account for the bias in their models, but they should not be able to use this as a parameter of the models, because that would be really dangerous and really awful. So if we are doing research that develops these tools, and this is published and perhaps also used by private companies or other people, it is very good that our conclusion is that you will never be able to do individual-level classification. That is a fantastic story, Diego. Danny, you used the word global.
So I would like to go to Mila Kishko's question about white names being synonymous with Anglo names, and maybe break this question open a bit more to talk about the many other ethnicity-classification algorithms that are out there on the market and being used in research right now. Of course, both of you were focusing within a US context, with the classification set by the US census, where white includes Middle Eastern people and people from British colonies. There is a level of inclusiveness in "white" in this country that does not necessarily apply to every country. But can you say a little bit about the global application of these kinds of algorithms? How might we, or should we, do ethnicity disambiguation beyond the confines of a single country and its census classifications? Yes. So in the work that we did, white does not necessarily just mean Anglo; it does include other European nationalities. But on the broader question of how to do ethnicity disambiguation across a wider group of cultures beyond the particular focus that we had, I do think Wikipedia remains a really good place for building some of these algorithms, and I think that motivates an increasing commitment to expanding Wikipedia as a place for generating the data on which these kinds of algorithms could be trained appropriately. My feeling is that that is an opportunity, a place for the work to go. I don't know, Diego, what do you think? I have a slight conflict with that, honestly, because I do not think racial categories make sense uniformly across countries. They are really a cultural construct that is country-dependent, and they do not transfer. Maybe white works as a more or less general category because white people invaded almost all of the world, but beyond that, as Cassidy was saying, white in the US is not the same as white in other countries.
And specifically, in the case of the Latinx population, it is very clear that in Latin America it would not make sense as a racial category. So I do not think we can move to a more generalist approach; rather, we could do something like a multinational case analysis and study each country by its historical development and its historical construct of race. We have to rethink the assignment of racial categories in each country in particular, or maybe by region. Maybe Europe has a similar concept and construct of race, and maybe Latin America has a similar construct of what race is, where we definitely have to include the Native American population, which, by the way, was a limitation of our work: we could not include it in the research I was showing because of statistical issues. For example, Brazil and Bolivia are countries where a large proportion of the population is Native American, and this necessarily has to be a category, and the same goes for other regions of the world. We have, first of all, to be very critical and really study the history of these places, how the construct of race is built in each place, and how naming practices arose in each place. From there, we can start to build a more global analysis. Thank you. I want to return to a comment that Carol Lee made in the discussion, quoting Danny's Nature Neuroscience paper about the limitations in these tools, particularly around the binary classification of gender. In your talk and your opening slide, you talked about hoping to expand the analysis to be more inclusive of more genders. Can you say a little about this line of research and how we could go about that? Yes, absolutely. So one way we could do this is to collaborate more with institutions that have access to self-identification data.
For example, I know that Jon Freeman at New York University has been a really strong advocate and activist for letting people self-identify their gender and sexual orientation, and several other kinds of variables, when they apply for National Science Foundation grants. He argues, rightly I think, that if we do not collect that kind of data, we will not be able to evaluate where bias and discrimination are occurring. It would obviously be optional, but he would like people to have the opportunity to self-identify if they wish, and that would support this kind of work. The second thing I wanted to mention concerns our approach not to doing the science, but to addressing this kind of bias in our own reference lists. We have taken the very slow approach, which is to look up every single paper that we cite: look up the first author and the last author if we do not already know them, and look up what pronouns we see on their website. That allows us to include people who are non-binary, at least those who are out publicly. The second thing that we do is use the website of 500 Queer Scientists, a wonderful organization that hosts life stories of individuals from the LGBTQ+ community in the sciences; we also look up each person's name in that database to see whether they are out as trans or non-binary there. And lastly, we take it upon ourselves to take the responsibility of knowing more people: noticing the flags they fly on Twitter, for example, and then incorporating that into our memories instead of forgetting it immediately, understanding the kind of work that they do, and asking ourselves frequently whether that work is relevant to what we are doing and deserves to be cited. So those are more the personal mitigation strategies rather than the science part, but I think that is equally important here.
To push on that a little more, I want to ask two questions. Diego talked about statistical necessities leading to the exclusion of certain populations from the research. As meta-scientists, those of us who use very large-scale or quantitative approaches to analyzing science often revert to tools that force us to exclude certain populations when they do not meet a size threshold. So, Danny, given your comments about including more genders, two questions: how do we avoid collecting those data only to exclude them, given the kinds of tools that we use, or including them at the risk of violating identification and privacy and the other human-ethics responsibilities that we have as scientists? How do we make visible those who are invisible without making them more vulnerable? Yes, I think this is so incredibly important. All of the work that we have been doing is in collaboration; on our team there is a trans scholar and a non-binary scholar, and we have been in conversation about how to do this well. Not that we know, and not that we are a voice for everyone, but it is something that is very present in our conversations. And there has been a little bit of pushback, even, to say: well, even if you go onto web pages and see somebody's pronouns, those might be the pronouns they feel safe using there, and they may use a very different set of pronouns elsewhere. Or a particular trans person may not want to be known as trans, but as a man or a woman. And that is up to them; it should not really be anybody's business to do otherwise.
So I think it is something that needs to be done with a lot of care, a lot of respect, and a lot of collaboration with the communities that are involved, really understanding what is important to them and how to do the research in a way that supports them and is valued by them, so that they feel they get something out of it. I think this motivates community-partnership-based research. It is not one group studying another; it is asking what kind of work we could do together that would help, that would make everybody happier, that really has something to offer them, not just as objects of study. Diego, do you want to add anything? I just want to reinforce the last part. I think working with communities is the most important thing, and organizations need to be involved in this type of analysis specifically. Maybe it is not as simple as when we do large-scale bibliometric analysis and just use an algorithm to infer things; it is rather survey-based work, which is also very valuable. And when we work with this smaller-scale data, we always have to be careful to anonymize the information; but when we work with communities, with grassroots organizations, with people who have an interest in the improvement of these groups, I think it can always be for their benefit. Perfect. Let's turn to a couple of questions on citations themselves. First I want to ask you, Danny, a general question, and then we will go to Elizabeth Butler's question. You talked about moving from a mythology of the history, as evidenced by citations, to the history. So how do we get from "a" to "the", or is that a mythology that we should do away with completely? And on that note, a mythology of the meritocracy of citations: both of you have utilized citations in your work. So how do we both critique citations and utilize them in these studies?
That is sort of one question, and the second part is this: one of your recommended policy suggestions is to go and look up and understand how you are citing. So, as devil's advocate: does that not introduce more bias, being hyper-aware of race and gender when citing? Does that lead to even more pervasive kinds of inequities? Maybe your last question first. Does looking up individuals create greater inequities because it foregrounds gender and race in our heads? I think the answer is no, because we are all already coming from a place of bias. That is our default, and we need to overcome it within each of us; it is part of our culture, part of society, part of who we have been exposed to in textbooks and at conferences. It is everywhere. So I think we should be very, very committed to re-educating ourselves. I do not see that as a negative activity; I see it as a necessary activity given where we are. Maybe in some future state it will be different, but not now. Then the question of citations as a meritocracy: shouldn't it not be about citations? Yes, it should absolutely not be about citations. People should not be promoted because of their citations. However, it happens all the time, and because it happens, if you want to address the now of society, then you need to be able to do work that shows that this method of evaluating the merit of a scholar is wrong; it is not an accurate reflection of the merit of their work. So I think you have to answer today's society with the way today's society is working, still with the hope, obviously, of changing to a new society in the future. But then your first question was about, oh gosh, working memory, three questions back. A history to the history. Thank you, thank you. That was an impressive recall for the first two, frankly. So, a history to the history.
I have actually just begun reading a book by Patricia Hill Collins, a fantastic scholar who just retired from the University of Maryland, called Intersectionality as Critical Social Theory. In that book she has an amazing take on resistant knowledges: the kinds of knowledge that oppressed or subjugated people create over history, and how that knowledge is important and not available elsewhere. I think about that a lot in the context of citations. I think there are whole areas of science that have been underutilized in the direction of scientific progress simply because of the demographics of the scholars involved. And that makes me want a different future for us: a future in which all questions are valued, in which science is not constantly narrowed by the demographics or the seniority or the citation counts of the people involved, but is freed from those and really explores the space of questions that are available. I think we could make so much faster progress; I think discoveries would happen so much more quickly if we freed inquiry from these blinders. So yes, I want a new history. Fantastic. Diego? Two things. First, with respect to citations and how much they influence people's careers, I think this type of work is very important for demystifying the meritocracy myth around citations, because if we can clearly show that citations are biased by race and gender, then the idea that "we are just promoting the people who get more citations" is clearly in trouble. Even though in the bibliometrics community the idea that more citations means better work is perhaps not so common nowadays, and we have discussed that a lot, I think in the general community of science it is still there, and still pretty strong.
And I think that's also why it's so important to show this type of analysis: to show that citations are actually full of other stuff, a history full of discrimination, full of sexism. And with respect to whether we should check the race and gender, or gender identity, in our references, I think it's a really interesting thing to do. I think I will do it for my own papers, to see what I'm doing. It's a really interesting self-reflection on our own biases. I would not be so positive about having an automatic way of doing this that shows the results to everybody, because that might reinforce biases in people who are not reflecting on these decisions. But if there is a tool, I will really, really want to check out that package. And for people who are concerned about their own biases and how we can deconstruct them, I think it's great. But if we turn this into a large-scale automatic thing, then I think it can bring more problems than it helps. Thank you. Now from Elizabeth Butler: do you have a sense of whether the increasing under-citation reflects a larger number of papers being written by women while the same small group keeps being cited? Or is it a stronger and stronger tendency toward, for example, internalized bias? Yeah, that's a really great question. In our data, what we find is that people seem to be citing like it's 1995. It's a great title for a song, right? But yeah, they're citing like it's 1995, and they've done that since 1995. What's happened, obviously, in the last 25 years is that the demographics of the academy have changed significantly and the number of papers written by marginalized groups has increased. And so that means the citation gap grows. All right, let's turn to what we do now.
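The kind of reference-list self-audit Diego describes here can be done by hand, but a rough script makes the idea concrete. The sketch below is purely illustrative and is not the package mentioned in the discussion: the tiny name-to-gender dictionary, the function names, and the assumed BibTeX author format (`Last, First and Last, First`) are all assumptions of mine. A real audit would use a large name database, report probabilities rather than hard labels, and handle many more edge cases.

```python
import re
from collections import Counter

# Tiny illustrative lookup table; a real tool would query a large
# name database and return probabilistic estimates, not hard labels.
NAME_GENDER = {
    "alice": "woman", "maria": "woman", "susan": "woman",
    "john": "man", "david": "man",
}

def first_author_given_names(bibtex: str) -> list:
    """Extract the given name of the first author of each entry.

    Assumes entries use the 'author = {Last, First and Last, First}'
    BibTeX convention; other formats would need extra handling.
    """
    names = []
    for match in re.finditer(r'author\s*=\s*[{"]([^}"]+)[}"]', bibtex):
        first_author = match.group(1).split(" and ")[0]
        if "," in first_author:
            given = first_author.split(",")[1].strip()
        else:
            given = first_author.split()[0]
        names.append(given.split()[0])  # drop middle names/initials
    return names

def audit(bibtex: str) -> Counter:
    """Count inferred gender labels across first authors of a .bib file."""
    counts = Counter()
    for given in first_author_given_names(bibtex):
        counts[NAME_GENDER.get(given.lower(), "unknown")] += 1
    return counts
```

On a small example, `audit('@article{a, author = {Smith, John and Doe, Alice}}')` would count one "man" first author, which is the kind of private, reflective summary Diego has in mind rather than an automatic public score.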
So we see a message from Carol Lee saying: I'd love to have the opportunity to self-identify when updating my ORCID iD information, or some other cross-organization database associated with publishing. Are there any efforts to make progress on this front? I had not thought of ORCID; that's a great idea. I don't know. I could ping again; I feel like John Freeman is the one I would ask whether something's been done in that space yet. I can check. It's a great idea. And then, more generally: how do we get from these descriptions to science policy? How do we change, and at what levels should we change? What is the lowest-hanging fruit of what we can do next? And how do we get to that future we want, that new history that Danny wants to write? Diego, do you want to go first? Okay. From our research, I would say there are two main ideas I can offer as conclusions, and they relate to research topics. We find these highly cited research topics that are more popular and where white men are over-represented. They are always the majority, there is a general over-representation, but in those, let's say, hot topics, they are even more over-represented. For those, I think the policy should be to try to diversify them and to include marginalized groups, but always taking into account that these can also be more hostile environments; it's not just a matter of throwing people into a place of hostility and violence, and that should be taken into account when diversifying these places. And I think the most impactful conclusion is that there are understudied topics, and they are consistently the topics most relevant to marginalized groups. We need to fund them. We need to promote research on those marginalized topics. And this is not only for marginalized populations but for everybody who wants to do research, because that's kind of the urgent thing to do.
We have less knowledge than we should, as a society, on several topics, and there are several examples of this in health. So that's a really urgent thing to do as well. Thank you. Danny? Yeah, I think it's definitely important to increase representation, but I also think it's important to increase engagement. What we see in our data is that even though there's increased representation over time, both in terms of race and in terms of gender, there is decreasing engagement. So something needs to happen in terms of engagement. And to the degree that we notice this is related to social networks, I feel like a lot of it could be addressed by providing more places, maybe more funding, for diverse teams to grow those social networks, those collaboration networks. The demographics are what they are right now; even taking them as they are, that could change engagement, I think. But maybe the last piece of data, which I didn't show but which we have, is that for journals that publish more papers from diverse groups, so whose author pool is diverse, the journal's citation practices are also more diverse. So I wonder, in terms of policy at a kind of lower tier, journal policy, whether there is more that journals could do to ensure the diversity of their author pool, maybe through invited pieces or other avenues available to them, and potentially that would turn things around a little. Perfect. Well, I am going to ask for your concluding thoughts, and I'll give one anchoring question, but feel free to share what you think are the key takeaways you would like this community to have.
But for this community of meta-science researchers, what would you give as one key takeaway or piece of advice, and any last concluding words on this topic? Maybe we'll start with Danny and then Diego. Yeah, I think I would probably underscore the importance of collaborating with people across different domains of expertise in this space: people who have expertise in gender theory and race theory, people who have expertise in intersectionality and who have worked on the history of these problems for so long. Combining those perspectives and that knowledge with science means you have to grapple with a lot of hard problems at that intersection, but I think that's really where we'll be able to do the work that is most meaningful to society, most carefully crafted, and most carefully interpreted. So I guess that's what I would emphasize: I've certainly learned so much through this process, mostly through collaboration with people very far from the scientific fields. Fantastic. Diego? So at the beginning of my presentation I quoted Zuberi, and I would like to rephrase that quote a little. I think our research in meta-science can be an artifact either of the struggle to preserve society as it is, or of the struggle to transform society in a positive way. And which one it is depends on how much we are considering the historical context of our analysis, how much we are considering the missing data, the missing research topics, the marginalized groups, the people with whom we are working, and all of these factors. This doesn't guarantee anything, but I think it really increases the chances of our work being meaningful, of having a positive impact on society rather than a harmful one.
And in meta-science we also have this struggle over what our work is going to point toward, and we must consider that our research is not only a matter of our free will: it is shaped by the race in academia for academic capital, the impact-factor race, the impact-metrics race. All of us are following that, and our research is partly determined by it. So it is partly a matter of us taking on the role of understanding the context in which we are doing research, but also trying to improve the institutions of research so that all these tendencies, all these conditions on our research, push things in a positive direction. Thank you, wonderful last words. I just want to thank our presenters for their labor here today. I know this was a very long panel, and you withstood all of the questions very well, and I thank our audience for engaging so robustly. I really do hope this serves as a catalyst for future conversations. I know that both of these authors are very open in their scientific practices, so if you have questions about data, approaches, methods, or tools, please do contact them, and let's keep this conversation going. Now, it's three minutes till midnight here and I'm going to turn into a pumpkin, so I think it's a good time to say goodbye. Thank you, and have a good night. Thank you. Thank you very much. Bye.