 I'm going to welcome up our next speaker, Suman Maiti, who will be speaking about citation imbalance and gender citation practices. So welcome, and take it away.

Hello, everyone. Thanks for sticking around till the end of the conference. Today I'm going to talk about citation imbalance and gender citation practices. This is joint work with Maika Altman, Roger Levy, Jordan Dworkin and Danny Bussard. It is still a work in progress, so any comments or criticism are most welcome.

Science has a gender problem, and prior research has found gender imbalances along various measures of academic inclusion and success. Such inequalities have been found in compensation, grant funding, credit for collaborative work, teaching evaluations, hiring and promotions, authorship, and citations. Because of the potential downstream effects of such inequitable engagement, the study of authorship and citation dynamics is critical for understanding inequities in science. Here we seek to understand the prevalence of gender imbalances in citations.

Towards this objective, we collected data from the OpenAlex dataset, which was formerly known as Microsoft Academic Graph. The whole dataset consists of around 207 million papers, and these papers are categorized into 19 top-level fields, ranging from physics, chemistry, mathematics, computer science, medicine and biology to fields like arts, sociology, psychology and business.

To understand citation imbalance, we first need to understand authorship imbalance. For this we constructed four gender categories, MM, MW, WM and WW, based on the gender of the first and last author. MM refers to the first and last author both being men. MW refers to the category where the first author is a man and the last author is a woman. WM refers to the reverse case, and WW refers to the one where the first and last author are both women.
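The four categories described here can be sketched in a few lines of Python. This is only an illustration under assumptions: the talk does not say how author gender is determined or how single-author papers and unknown genders are handled, so the labels "M" and "W", the function name `gender_category`, and the choice to return `None` when a label is missing are all hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def gender_category(author_genders: Sequence[Optional[str]]) -> Optional[str]:
    """Return 'MM', 'MW', 'WM' or 'WW' from the first and last author labels.

    Assumption: each author carries a label 'M' or 'W' (or None if unknown),
    listed in byline order. A single-author paper is treated as having the
    same person in first and last position, which is also an assumption.
    """
    if not author_genders:
        return None
    first, last = author_genders[0], author_genders[-1]
    if first not in ("M", "W") or last not in ("M", "W"):
        return None  # unknown gender: the paper cannot be categorized
    return first + last


# Toy bylines, invented purely for illustration.
papers = [
    ["M", "W", "M"],  # first and last author are men
    ["M", "W"],       # first author a man, last author a woman
    ["W", "M", "M"],  # first author a woman, last author a man
    ["W", "W"],       # first and last author are women
    ["M", None],      # last author's gender unknown
]
category_counts = Counter(gender_category(p) for p in papers)
print(category_counts)
```

Counting the categories per field and per year in this way would give the authorship proportions that the citation comparison later relies on.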
Then we look into a field, for example computer science, and look at the participation of authors in this field. What we found is that men participate more than women. Over the years there has been increasing participation of women, but there is still a wide gap between men's and women's participation. That's computer science. What about other fields of science? We see more or less the same trend: in recent years there is an increasing trend in women's participation in science, with the notable exception of psychology and sociology, where women's participation tends to catch up with men's participation and is sometimes doing better. So that is very good news.

To quantify citation imbalance, we need to take this prior authorship imbalance into account. Based on prior literature, we have considered two models: the random draws model and the relevant characteristic model. In the random draws model, the gender proportions of the reference list are compared against the gender proportions of the existing literature. So we are basically comparing against the prior literature. This is just a scenario where you are randomly picking up some article and citing it. But that's not realistic, right? To have a more realistic selection of papers, we use the relevant characteristic model, where the gender proportions of the papers in the reference list are compared against the gender proportions of articles that could have been cited because they are similar to the ones actually cited.

Armed with this model, we shall look into citation practices. What we'll do first is segregate the citation behavior based on the gender categories, so we'll have these four categories of citation behavior.
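The random draws comparison can be sketched as follows. The talk does not give a formula, so the relative difference used here (observed proportion minus expected proportion, divided by expected proportion) is an assumption, as are the function names and the toy data. The relevant characteristic model would replace the literature-wide proportions with proportions among papers similar to the cited ones, which this sketch does not attempt.

```python
from collections import Counter

CATEGORIES = ("MM", "MW", "WM", "WW")


def proportions(labels):
    """Share of each gender category among the labelled papers."""
    counts = Counter(labels)
    total = sum(counts[c] for c in CATEGORIES)
    if total == 0:
        raise ValueError("no papers with a known gender category")
    return {c: counts[c] / total for c in CATEGORIES}


def over_under_citation(reference_labels, literature_labels):
    """Relative over/under-citation per category under random draws.

    Positive values mean a category appears in the reference list more
    often than random draws from the literature would predict; negative
    values mean it appears less often. Categories absent from the
    literature are skipped to avoid dividing by zero.
    """
    observed = proportions(reference_labels)
    expected = proportions(literature_labels)
    return {
        c: (observed[c] - expected[c]) / expected[c]
        for c in CATEGORIES
        if expected[c] > 0
    }


# Toy data, invented for illustration only.
literature = ["MM"] * 50 + ["MW"] * 20 + ["WM"] * 20 + ["WW"] * 10
references = ["MM"] * 6 + ["MW"] * 2 + ["WM"] * 1 + ["WW"] * 1

imbalance = over_under_citation(references, literature)
for category in CATEGORIES:
    print(category, round(imbalance[category], 3))
```

In this toy case the reference list contains MM papers more often than the literature would predict and WM papers less often, which is the kind of signal the talk calls over-citation and under-citation.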
What we found here is an interesting pattern: MM teams tend to have higher citation preferences towards MM teams and lower citation preferences towards other teams. Similarly, WW teams tend to have higher citation preferences towards WW teams and lower citation preferences towards other teams. So we see that there is homophily in citation practices for the men-led and women-led teams.

Now, to understand whether this generalizes across all the subfields of computer science, we looked across all the subfields as categorized by Microsoft Academic Graph. There are roughly 32 fields, which we are showing here. For simplicity and for a clearer picture, we have combined WW, MW and WM into a single category, W or W, or non-MM teams. So basically MM versus non-MM. What we found, similarly, is that there is consistent over-citation by men. There is also homophily in citation practices, but it is not so universal: for some fields it's there, for some fields it's not.

So that's computer science. What about other fields? We looked into other fields as well; because of time constraints I have put only some of the examples here. This is physics. For chemistry as well you can see it. In all the cases you see that there is consistent over-citation by men. Same for biology and medicine. Same for sociology and psychology. So across all these fields we see a similar, universal over-citation by men.

Now we want to understand whether field-level characteristics are associated with this kind of gender imbalance. For this we'll look at two characteristics: the percentage of women participating in the field, and the popularity of the field. And we'll try to see how these influence gender citation. First we look at the percentage of women participating in the field. What we find here is that with increased participation of women authors, MM teams increasingly cite non-MM teams.
That seems to be good news. But MM teams also cite more MM teams with increased participation of women authors. That seems like bad news. We also observe that non-MM teams cite MM teams less with increased women's participation, and cite non-MM teams more. That's also good news.

Now we look at the popularity of the field. The popularity of a field is defined with two metrics: the number of papers published in that field, and the total citations in that field, that is, the total citations of all the papers that have been published in that field. Based on these two metrics as well, we see a consistent pattern: MM teams increasingly cite non-MM teams, or women authors, with increased popularity of the field. So as the field becomes more popular, the under-citation becomes more balanced. But we don't see much good evidence of that for MM citing MM. So there still seems to be a good deal of over-citation by MM teams.

To conclude, we see that there is a prevalence of gender imbalances in citations. We observe consistent over-citation by men across a variety of fields of science. We looked at almost all fields of science, and that is a very consistent pattern. We also observe a homophilic citation pattern, but it is not so universal: the over-citation is there on the men's side, but not always on the women's side. And there are field-level characteristics which are strongly associated with citation behaviors. Thank you. I would like to take any questions you might have.

Hi, thank you very much for a great presentation. I'm Basu Mahkouz from UCL. I just wanted to ask, or maybe suggest, that you look at how this changes when you look at quality indicators of papers.
So the same thing, the same analysis, but just look at top-tier papers published in top-tier journals, or leading papers in their field, however you want to define it, and see how that changes.

I see, okay. That's a good suggestion. There are prior studies that looked at a small pool of papers, mostly Nature-related or Nature- and Science-related papers, and they also find a similar kind of over-citation by men and under-citation by women, yeah.

I wonder if you've looked at citation patterns such as people self-citing their previous work, in relation to the homophily effect. What's the effect of that?

Yeah, so this study does not include self-citations. We have discarded the self-citations. So this is, yeah, we don't say that.

Any other questions? We still have time for one more question. Okay, thank you so much, Hime. Thank you.