 Okay, cool. Thanks so much for having me here today. I am thrilled to be here. I've never visited Berkeley before. I've never visited a Center for Data Science before. So I hope you'll be interested in what I'll talk about. The culture at the Santa Fe Institute is very much like... There's no point in being here if you don't understand what I'm talking about or if I'm unclear about something. If you have questions about the bigger picture, we can talk about those at the end. But if there's something that I've said that just doesn't make sense, please definitely interrupt and make sure that we're all on the same page. Okay, so I'm interested in gender prestige and productivity in both academic hiring and also what happens after you get hired in career trajectories and productivity. But I spend a lot of my time with mathematicians and physicists and computer scientists, and if you have walked through the halls of those departments and most universities in the U.S., you know that gender inequality is a big issue. So in computer science, it's particularly steep. The only field that's worse is physics. One in six computer scientists in the academy is female. And this is important, I think, because of two reasons regardless of which one you particularly subscribe to. So equal opportunity is important. And if we're not making opportunities equal for everybody who is interested in being a computer scientist, then that's a problem. But if you're not swayed by that argument and you prefer a principle that's more based on just sort of productivity and the ability of teams to solve problems, there's a lot of research that shows that more diverse teams are more creative, especially when the problems get hard. One of the problems in studying this is that the literature is confusing and also conflicting. So here are two studies that illustrate this well. On the one hand, you can see a 2-to-1 preference for female applicants. 
This was a study where three CVs with essentially controlled content were sent to faculty acting as a hypothetical hiring committee, and they had to decide which of these three people they would hire. And of course, the names were altered so that the researchers could control different variables and manipulate gender. And this revealed a 2-to-1 preference for the female applicants to be followed up for an interview. And then, you know, within three years of that there have been other studies that use different techniques that show subtle gender bias favoring male students. So what are we to believe? I mean, what we have here are two different studies that use two different methodologies, and not only are they showing different directions of the effect, one favoring women, one favoring men, but they're also showing different magnitudes. A 2-to-1 preference is substantial. A subtle difference is small. So I'm coming at this from the point of view that broad conclusions drawn from very specific experiments and sort of univariate analyses are probably not that helpful, and potentially dangerous, especially if you're interested in coming up with some sort of policy that's going to address some of the inequalities that you think are important to address. So my belief is that these analyses need to, whenever possible, really be confronted with the complexity of actual hiring. And so the idea here is that we're going to be focusing on outcomes that are revealed in hiring data and in publishing data. So this is the outline of the talk. I'll first talk about identifying large-scale patterns in academic faculty hiring. Then we'll talk about how gender, prestige, and other factors affect the placement of PhDs into faculty jobs. And then talk a little bit about diversity and productivity trajectories in the years after people are hired. So here's the first section, and I can't help but criticize my own field, which is network science. This was last year's NetSci.
And if it looks like you have a nearly all-male panel or all-male set of invited speakers and plenary speakers, yeah. Ginestra Bianconi is a fantastic statistical physicist, and she was the only female representation at NetSci. Things are a bit better this year. Okay. So let me tell you about the data that we collected and used for this study. We collected a lot of data by hand, and it was time-consuming and expensive to make sure that the error rates were low and that we really got a snapshot. What we did is we got all U.S. and Canadian tenure-track faculty in computer science, business, and history in PhD-granting departments. So if there was a PhD-granting department anywhere in the U.S. and Canada in one of these three fields, we basically got the faculty list of tenure-track professors and then got as much CV data as we could about them. And because the data was unstructured, there was no clear way to easily scrape everything. So we paid grad students an hourly wage, I think $15 an hour, and it took a long time. But this kind of gives you an idea of what we ended up with: 205 computer science departments, 112 business departments, and 144 history departments, and the numbers of tenure-track faculty add up to around 19,000 records. So the average size of a CS department in the U.S. and Canada is 25; for business it's much larger, 83; and for history it's 32. And you can see that the breakdown of full professors, associate and assistant, or whatever their named equivalents were in their home departments, is about the same across fields, with a bit under half being full professors and the rest distributed among associate and assistant professors. These are the fractions of those professors who were female. And I just want to say something about coding. What we did was ask our data collectors to tell us what the gender of the professor was. So this is not the gender that the professor expresses.
This is what our graduate students recorded based on evaluating the name and the photograph. And we made no attempt to try and do anything more sophisticated there. And this reliance on the perceptions of others is what we thought might be useful when we're trying to predict how people are actually treated by the system. So I hope that makes sense. One of the questions is: how many of the people who are faculty actually got a PhD from one of the other universities within the sample? If you did your PhD at La Sapienza in Rome, then you would be out of the sample and out of this network. But as you can see, around 85 to 90% of the PhDs were in sample. And these were snapshot data. So over these collection periods in 2011 through 2013, if you were a sitting faculty member then, you would be in the data set. If you retired in 2010, you would not be in the data set. And if you were hired after 2013, then you would also not be in our data set. So that's one data limitation that we should talk about in the Q&A after. From these data sets, what we did was create directed, integer-weighted networks. And the idea is that from a CV, the trajectories of an individual can form a network. So you can move from your bachelor's degree to a master's or a PhD. And if you stay at the same institution, this could be a self-loop in the network. Then you move on to maybe a postdoc, maybe another postdoc, maybe another postdoc. And then eventually you get a faculty job. And even at the faculty level, people make moves from assistant to assistant, from assistant to associate, to associate with tenure, without tenure. So there are all kinds of different moves. And so what we did was try to respect that kind of diversity of possible trajectories within CVs. But what we focused on was the transition from PhD to faculty. That includes professors between universities? Yeah, it does include moves of professors between universities.
So what we focused on was the doctorate and the eventual final faculty placement. So if you were a full professor in the data set who had made multiple moves before then, we were mostly interested in reducing this more complex trajectory into just a move from PhD to wherever you are now. A question in the back? Is that a question or a statement? Yeah, so, yeah. Because we were only collecting CVs from folks who were sitting faculty at the time of the snapshot, we have no information about those who left academia, who were lured away or forced away for whatever reason. So that's another data limitation to talk about, which is that all of our observations are conditional on people being faculty at the time of the data collection. So what we assemble are faculty hiring networks. And this is computer science data. These are the top ten institutions, and you can see a nice little visualization here. I've highlighted the flows coming out of MIT just so you get an idea of how crowded and tangled these are. So our premise in assembling these is that faculty hiring reveals collective preference. You have a hiring committee, and they receive, let's say, 100 to 400 applications for a particular job, and they need to make a decision about whom to hire or whom to interview. And in doing so, not everybody's an expert in whatever that applicant's field is, and so signaling theory tells us that the prestige of the doctorate is going to affect people's beliefs about who they should actually hire. So under the premises that faculty hiring reveals some of these preferences, and that hiring committees actually want to hire the best, that the system is an attempt at a meritocracy, the collective hiring patterns show us how the institutions view each other, how the departments are actually viewing the training that the graduates are receiving. So we can take this and create a ranking out of it.
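Concretely, a hiring network like this is just a weighted edge list over PhD-to-faculty transitions. Here is a minimal sketch; the institution pairs below are hypothetical placeholders, not the study's actual data:

```python
# Sketch: assembling a directed, integer-weighted faculty hiring network from
# (PhD institution, faculty institution) pairs. Records are made up for
# illustration only.
from collections import Counter

records = [
    ("MIT", "Stanford"),
    ("MIT", "Colorado"),
    ("Stanford", "MIT"),
    ("Stanford", "Stanford"),  # self-loop: hired where trained
    ("Colorado", "Montana State"),
]

# Edge weight = number of people who made that PhD -> faculty transition.
edges = Counter(records)

# Out-strength = how many faculty an institution placed anywhere.
placed = Counter(phd for phd, _ in records)
```

Everything downstream, the rankings, the flow directions, the core-periphery measures, operates on this kind of weighted directed structure.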
Okay, so there are lots of different ways to do rankings, and the way that we chose is what's called a minimum violation ranking. The algorithm is really, really simple. You just take all the institutions and order them arbitrarily; an arbitrary ordering should place about 50% of the flow up the hierarchy and 50% of the flow down the hierarchy. And on the assumption that what you want to do is have your prestigious graduates placed into other prestigious institutions, what we're going to do is just propose to swap two institutions if it puts more of the flow down the hierarchy and less of the flow up. And we just continue with this zero-temperature sorting process until eventually there are no more moves we can make that would put more of the flow down the hierarchy and less of the flow up. At this point we think that we've arrived at a nice local solution. You can repeat this algorithm many times. What you end up with is a ranking of the institutions according to how they hire each other's graduates. So this is just the top 10, but this is what the rest of the data look like. If you like interactive visualizations and D3 and stuff like that, please play with the data. It's kind of fun. You can hover over and see the different flows. What I did here is color these arcs red or blue depending on the direction of the flow. If the flow is down the prestige hierarchy, from a more prestigious PhD into a less prestigious faculty job, it's blue. And if the flow is up the hierarchy, it's red. And if you look at this and you say, it's definitely blue, you're right. So three brief observations about what we see from these collective patterns. One is that this is systematic: about 90% of the hiring movement is down the hierarchy. Which means that if you're graduating with a PhD and you're wondering about your fate, one in ten will move up the hierarchy against the current and nine out of ten will move down.
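That swap-based, zero-temperature search can be sketched in a few lines. This is a toy three-institution network with made-up edge weights, not the published algorithm's implementation, just the idea:

```python
# Sketch of a minimum violation ranking: start from an arbitrary ordering and
# greedily swap pairs of institutions whenever the swap moves more hiring flow
# down the hierarchy. Toy edge weights, not real data.
edges = {("A", "B"): 5, ("B", "A"): 1, ("B", "C"): 4, ("C", "A"): 1}

def up_flow(order):
    """Total edge weight flowing up the hierarchy (lower- to higher-ranked)."""
    rank = {u: i for i, u in enumerate(order)}  # index 0 = most prestigious
    return sum(w for (u, v), w in edges.items() if rank[u] > rank[v])

order = ["B", "A", "C"]  # arbitrary starting order
improved = True
while improved:  # zero temperature: accept only strict improvements
    improved = False
    for i in range(len(order)):
        for j in range(i + 1, len(order)):
            trial = order[:]
            trial[i], trial[j] = trial[j], trial[i]
            if up_flow(trial) < up_flow(order):
                order, improved = trial, True
# Here the search settles on A > B > C, with 9 of 11 units of flow going down.
```

On real data you would restart from many random orderings, since this greedy search can get stuck in local optima.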
It's also pretty steep. Fewer than 7% of faculty have a PhD from the lower 75% of universities, meaning that 93% come from the top quarter. It's also biased: the median rank change for women is about three ranks worse than the change for men. We'll revisit this in just a moment. But there are other aspects of this beyond the large-scale behavior. There's also a very clear core-periphery structure within these networks. What I'm showing you here is a plot that goes from the highest prestige down to the lowest. And a brief note: when I say prestige, I'm not trying to disambiguate at all between prestige that comes from the name only and prestige that comes from the actual quality of the candidate. I think prestige mixes those things, and that's what we're observing here. But there's a clear core-periphery structure in the network. What I'm showing in these little diagrams are minimum spanning trees where the node corresponding to a particular university is pinned right at the center. So if you look at Stanford, the minimum spanning tree is essentially hierarchical. You can spread easily from one institution to the others. And if we go down to the other end and look at a university like Montana State, the minimum spanning tree is much more thread-like, and so to find a trajectory that goes from Montana State to all the other universities requires a lot more hops. The plot itself is showing the mean geodesic distance from that particular institution to all the others, versus the diameter. So in a sense, Stanford is closer to all the other institutions through these hiring moves than Montana State is. The implications of this are interesting if you're thinking about the spread of ideas, culture, and norms through the process of exchanging graduates between departments. So some comments about this. Universities in the core are, sort of by definition, very close to all the other core universities.
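The mean-geodesic-distance idea can be illustrated with a toy graph and breadth-first search; the institutions and links below are fictitious, just to show why a hub sits "closer" to the rest of the network than a peripheral node:

```python
# Sketch: mean geodesic (shortest-path) distance from each institution to all
# others in a small undirected toy graph. "Core U" is a hub; "Edge U" hangs
# off a chain. Names and edges are invented for illustration.
from collections import deque

graph = {
    "Core U": ["A", "B", "C"],
    "A": ["Core U"],
    "B": ["Core U"],
    "C": ["Core U", "D"],
    "D": ["C", "Edge U"],
    "Edge U": ["D"],
}

def mean_distance(source):
    """Breadth-first search, then average distance to every other node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    others = [d for node, d in dist.items() if node != source]
    return sum(others) / len(others)
```

In this toy network the hub averages 1.6 hops to everyone else while the peripheral node averages 2.8, the same qualitative contrast as Stanford versus Montana State in the real data.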
But this core position enables, we think, some substantial influence over research agendas, communities, and departmental norms. The things you learn when you're a PhD student aren't necessarily just techniques and the material taught in courses. You learn from your advisors and the other people in the department what a research seminar is like. When is it okay to ask questions? Can you ask rude questions? Can you tell the speaker you think what they're saying is bull, or not? So those institutions that are placing far more faculty into other departments are able to spread these ideas more easily through the system, regardless of their merit. If you are trained with a certain set of premises at a particular institution that is, again, able to spread these cultural or ideological seeds more easily, then that's not a merit-based process. It's just facilitated by the fact that you're placing more faculty. Yeah. This is an assumption. Speculation. You could test that. Just some other numbers here: 68-88% of faculty at the top 15% of institutions come from within the top 15%. When we talk about rich clubs in networks, or core-periphery structure, this is one example of that. Again, only 4-7% receive a doctorate from the lower 75%. When I'm showing ranges here, it's because there's variability depending on whether you're looking at history, business, or computer science. Other interesting bits are clear and significant differences in how women and men place in computer science and business, but we didn't see any evidence of differences in history in our data. We think that might be related to the fraction of men in history, 64%, as opposed to business at 78% and computer science at 85%. Again, we don't have a mechanism that could explain this, so that's maybe something interesting to follow up on. On the other hand, in history the prestige hierarchy was steepest.
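One generic way to check whether women's and men's placement outcomes differ significantly, not necessarily the test used in the study, is a permutation test on the rank changes. All the numbers below are fabricated purely for illustration:

```python
# Sketch: a one-sided permutation test for whether the median placement-rank
# change differs between women and men. Values are fabricated; more negative
# means placed further below one's PhD institution.
import random
import statistics

men = [-20, -26, -31, -18, -24, -29, -22]
women = [-24, -30, -35, -21, -27, -33]

observed = statistics.median(women) - statistics.median(men)

random.seed(0)
pooled = women + men
more_extreme = 0
trials = 2000
for _ in range(trials):
    random.shuffle(pooled)  # break any real association with gender
    diff = (statistics.median(pooled[:len(women)])
            - statistics.median(pooled[len(women):]))
    if diff <= observed:  # shuffled gap at least as unfavorable to women
        more_extreme += 1
p_value = more_extreme / trials
```

The appeal of a permutation test here is that it makes no distributional assumptions about rank changes, which are bounded and skewed.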
If you look at the top eight departments, the most productive in terms of placement, they account for over 50% of faculty. Which means that if you go to a PhD-granting institution in the U.S. and pick a faculty member at random, there's a 50-50 chance that they're going to be from one of eight universities versus the rest. In terms of placement, on average faculty are going to place 27-47 ranks below their doctorate. The median is 21-35, so there's a sizable downward skew. And if you think about this from the perspective of network science, where we know things like, for example, that your friends are on average more popular than you are, what this means in this case, due to the magnitude of the inequality, is that a typical professor can expect to supervise 2-4 times fewer grad students who become professors than their own advisor did. One thing that I think is really fun here is that we're not creating a ranking per se; we're letting the faculty hiring process sort of cast votes for different departments. I think this has implications if you're interested in non-gameable systems, or systems that you can sort of fix or use to your advantage.
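Given a prestige ranking and a set of placements, the summary statistics quoted above, the share of hires moving down the hierarchy and the typical rank change, are straightforward to compute. A small sketch with invented numbers:

```python
# Sketch: share of hires moving down the prestige hierarchy and the median
# rank change, given a ranking (1 = most prestigious) and PhD -> faculty
# placements. Toy data for illustration only.
import statistics

rank = {"U1": 1, "U2": 2, "U3": 3, "U4": 4}
placements = [("U1", "U2"), ("U1", "U4"), ("U2", "U3"),
              ("U3", "U4"), ("U4", "U2")]

# Positive change = faculty job is ranked below the PhD institution.
changes = [rank[fac] - rank[phd] for phd, fac in placements]
frac_down = sum(c > 0 for c in changes) / len(changes)
median_change = statistics.median(changes)
```

In the real data this fraction comes out near 0.9, and the median drop is the 21-35 ranks mentioned above.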
My undergrad was at WashU, and when I get phone calls from WashU, like alumni development or whatever they call it, they'll say things like: if you could only give $5, that would be fine. Because one of the things that U.S. News uses when they calculate these rankings is the participation rate in fundraising, which is very explicit, and fine, I'll give $5. But the system itself, when it starts to be gamed, loses meaning. And what's interesting about this system is that for a university to move up in a ranking based on faculty placement, what you would essentially have to do is convince other departments to hire your graduates, which I think is a real test that would take either a substantial increase in quality or prestige, or some amount of collusion. So just to summarize this part: I've talked about PhD placement networks, in which there's a directed edge from one university to another if you got your PhD at the first place and now you're faculty at the other. I'm calling it a near-perfect prestige hierarchy, meaning that only 9 to 14 percent move up against the stream. There's dramatic inequality in PhD production, which I didn't talk about, but in terms of how many PhDs you actually supply to the system, there's a huge inequality there, and whether or not that's based on meritocracy I think is up for debate; we could talk about that at the end as well. We're able to predict placement better than a U.S. News ranking using this sort of ranking, and that makes sense: if you derive the ranking from the data that you're trying to predict, you should be better at predicting that data. But I want to talk about predicting placement. The reason is that what we saw here was that there was a large amount of structure explained by hierarchy and by prestige, but I also mentioned that there's a difference between how women and men place in the system, which suggests that we're not capturing everything when we talk about prestige. There's a lot of pattern here that is not well modeled by
prestige alone. So the idea here is to model the assembly of hiring networks. Because we have data on lots of different hiring years, depending on what year somebody actually came into the system, we're going to model the annual matching of candidates to openings. So here's a diagram: we put the candidates on the left, we put the openings that they actually placed into on the right, and we're going to cut them all apart and try to model how we would rematch them. So each year t, I have some set of candidate stubs, or half-edges, and some set of opening stubs. Given some pair of opening and candidate, the probability that there's a match is going to depend on that pair's features. So what are the features? I'll get to that in a moment. We're going to select first the hiring institution, with probability proportional to the departmental prestige. This is basically encoding a belief that the more prestigious the institution, the earlier the dibs you get. We'll score each potential match into that opening: we have all the different candidate stubs on the left, and we're going to see what their scores ought to be in terms of which one is likely to place into that job. And then we select a candidate with probability proportional to their score. The way that we're scoring these is a function of the features of both the candidate and the opening, and also the weights that tell us which features matter. And as you're guessing and reading ahead, what we're going to be doing is fitting the weights to the data so that we can understand which things are important and which are not. So we'll repeat this for the remaining candidates and jobs, sequentially placing people in. And this addresses one of the difficulties in modeling this data, which is the non-independence of hires: if I take a job, I can't then take another job, and if I take a job, they can't also hire somebody else into the same faculty line. And so what this does is
it kind of forces us into this situation where we need to model things sequentially, and it makes the math a little more difficult because we can't make the nice independence assumptions that would really make the code fly fast, for example. So what are the factors? I've already talked about the actual hiring data, but on the right here are the covariates and features that we included in the model. First, the applicant's gender, which we again coded as female or male. Applicant postdoctoral training is also included on CVs, so that's just a binary 0 or 1. Another non-meritocratic factor that could be interesting is geography, so we included the geographic closeness of the doctoral and hiring institutions, just by U.S. Census region, with Canada treated as a separate region. We included the prestige of the hiring institution, as well as the difference in prestige between the doctoral and hiring institutions. And finally, since we know that publications are important, we included scholarly productivity: how many papers somebody had written. Since we're looking at computer science, we could do this using DBLP, which is essentially a database, like Google Scholar but only for computer science journals and conferences. So we generated for each person a list of papers published up to one year after they got hired, and the idea was that you go on the job market with some number of papers, so we should also include that extra year. And we made no attempt to evaluate which venues were prestigious and good and which were garbage and pay-to-play; we thought that if we got into that sort of rabbit hole, we'd never emerge. So it's very coarse, and it's just by quantity. We also made no attempt to allocate credit: if you are the fourth author on a 17-author paper, that counted for as much as if you wrote a monograph. Again, these are assumptions that we made, and whether or not they're good assumptions we could talk about. One of the issues that we ran into is that publication rates vary by
subfield. I had a nice dinner with a theoretician last night who talked about writing an entire paper in a day, because he had an idea and wrote it up and then submitted it. But if you're studying data, like people in the Institute for Data Science are, then you know that this is impossible. So what we tried to do was subfield detection. We aggregated each individual's list of titles from DBLP into a single document and then used just off-the-shelf latent Dirichlet allocation to try to find ten different topics among all of the different people's documents, in terms of the subfields that they tend to publish in. Here are some examples. You can see my colleague Aaron Clauset at Colorado works in machine learning, data, some theory, a little bit of human-computer interaction. Some people are sort of distributed across disciplines; other people, like Ahud Charlene, are basically working in only one field. And so based on this, we thought that people should be judged by hiring committees based on the different subfields that they represent, so we could use this to account for the fact that people are going to be evaluated differently by people in different fields. So this is pretty straightforward: we have a model, we have these features and covariates associated with not only individuals but also potential pairs, matchings of people into jobs, and the idea was that we're just going to try to learn the weights by minimizing the placement error. So we added covariates one by one in a greedy fashion, and then saved gender for last. This is showing the error reduction relative to a baseline where we're not using any model at all. Including the rank difference was the most important thing, followed by productivity, which explained another 2% of the error. If we then include the rank of the hiring institution by itself, that again is a significant although not necessarily substantial reduction in error. If you then include postdoc experience and geography, those together
give you a significant additional decrease in error. But then, when we included gender, there was no significant difference. And our interpretation of this is that gender bias is not uniformly and systematically affecting all hires. Instead, what we think is that gender is essentially encoded in these other variables. So if you instead include only gender, as the only feature that you're willing to consider, then of course you're going to see a significant effect, a significant reduction in your modeling error. But if we account for all of these other factors, these other features in the data, and then include gender, we don't see a significant difference. I think this is a point worth making, because when I say that it's not uniformly and systematically affecting all hires, I think this flies somewhat in the face of these experiments that claim a 2-to-1 preference for women, or a substantial or significant bias in favor of men. To say that this is a complex system, and that things are subtle and correlated with each other, is for me a more honest treatment of the problem. And if you're going to be prescribing solutions, knowing that complexity will hopefully help in coming up with them. But another thing that we can do is look at institution-level results. Instead of looking at the global picture, since we have information about who was hired when, and we know which candidates were potentially on the market in each of those years, we can go back through and simulate, using this model, the distribution of the number of women that could have been hired into a department versus the actual number. Each green line here represents a single simulation of the cumulative number of female hires from 1970, when our data set essentially starts, to 2010. And this is data from Berkeley; the observed count is in black, and you can see this distribution. So we can look at lots of different institutions, and Berkeley sort of falls right
where you would expect. Princeton seems to have hired more women than you would expect, and Brigham Young fewer women than you would expect. What we can do is then place all of these, where we're taking the actual number of women hired, as seen through our data set, minus the expected, so we can see whether you're hiring more women than you would expect or fewer, and then order these by rank. When we did this, we saw something interesting, which to me looks like an oscillation: hiring more women than you'd expect in the top 10, fewer in the next 10, more again, and then fewer again. So here's one possible interpretation, and you can tell me whether you think I'm making this up or whether it's real. It looks like an interference effect, which is to say that it looks like the top 20 institutions are essentially competing over the elite candidates. They want to hire the best, and they also want to hire somebody who's a woman, and so they are essentially competing for these very, very best candidates. And what that means is that if you have a stronger, more prestigious department, you essentially have more attractive power to actually get those people to come to your institution, whereas between 10 and 20 you might not have the same name recognition, the same resources, the same ability to attract those candidates. So if you make an assumption of a finite hiring pool, then this interference effect could be real. But the fact that it flips again here maybe means that there are two distinct candidate pools: you have the people who are applying broadly to the very top institutions, and those who are casting a wider net. I don't know if this is real; we're not sure exactly how we would test this. But one thing that we're keen on doing is expanding this to other fields to see if this sort of pattern repeats itself there; that would be much more convincing, I think. So some conclusions about this section are that the best predictors in our model were
doctoral prestige and productivity, and accounting for gender does not help predict faculty placement when we add it at the end. There are three possible explanations. One could be that the effects of gender are modeled unrealistically. One could be that gender is simply an irrelevant feature in faculty hiring. And the third is that the effects of gender are accounted for by the other covariates that we've already included. I already mentioned this, but I think there's more evidence for number three. So three points in particular. We found significant differences in the effects of publishing and postdoctoral training for men and for women. By going back and visiting the weights of the fitted model, what you can see is that there's essentially an exchange rate between gender and productivity. Specifically, what we found is that for an equally qualified man and woman, the woman would have to write one additional paper to be placed the same way, to be treated the same way by the model, as the man. And these are CS candidates entering the market with, let's say, between 12 and 15 papers. So a difference of one paper out of 15: it's a lot, but it may be different depending on the field that you look at; that exchange rate may differ. We also saw significant differences in men and women who move up the prestige hierarchy, which I didn't talk about, and we see what we believe are these interference effects between higher and slightly lower prestige institutions. Future directions. Here's one, just out of the trends in the data: if you project ahead based on the percentage of hired faculty who are women and men, it looks like we'll reach hiring parity in 2075. So that's if nothing changes, mathematically. It would be interesting to look at likelihood-based models, instead of trying to fit something with a loss function based on placement error; we may be in a situation where we're chasing our own tail in terms of using prestige to model prestige. It would
be interesting to look at different fields and other underrepresented groups, to see if some of these patterns that we see when we look at women in computer science are consistent. And finally, we're looking at a project now where we're revisiting a lot of the faculty in the data set, now six years later, to learn things about retention in addition to the hiring process. So this brings us all the way up to hiring, and I want to talk now about productivity post-hire. There's this canonical productivity trajectory, which says that early in your career you rise to an early peak in your productivity, and then there's some amount of decline or flattening. Here I'm showing data from a 1986 paper; this is about a thousand North American academic psychologists, binned into these roughly 10-year intervals, and you can see that the number of papers per year kind of rises and then declines. So it's been observed in psychology. This is another one; we don't have quite the same binning, but this is age versus creative production rate, for Russians, in science and math, and this was in 1954. This paper is kind of fun because they break it down by countries: here the English, here the French, the Italians, the Germans, the Americans, and then everybody all together. So science and math in general. This one I thought was interesting; this was from the anthropology literature, from a review in 2000, on the Ache and the Hiwi hunter-gatherer groups. Productivity here was measured in calories per day hunted or gathered, and as you can see, by age there's a peak around, let's say, 35, 40, 45, and then a decline after that. My favorite was this book, published in 1835, and I don't speak French, so if somebody objects to the translation I'm about to provide, sorry. This was just a statistical review of all the different things that people do, in 1835, in France, and also a little bit in the Americas. And this is a table of crimes. So here you have murder, infanticide,
rebellion, I don't know what that means exactly assassination and the column headings are under 16 years of age 16 to 21, 21 to 25 and there's this nice table here and if you end up plotting these data you end up with these kind of conclusions so let me cite my source here this is page 242 and I just want to read this the fatal inclination appears to be developed on account of the intensity of the physical strength and the passions of man it reaches its peak at the age of 25 when physical development is nearly complete the intellectual and moral development which takes place more slowly then cushions the penitent to crime which diminishes even later by weakening of physical strength and passions so I think this is really fun because we've talked about psychologists and now we've moved on to criminals and it's interesting that these observations come out of very different literatures but the other thing that's cool about this is that the author whose name I'm going to butcher and therefore not say is talking about the mechanisms what are the mechanisms responsible for the passions of a young man to go out and murder somebody or whatever this is interesting so he looked at French and Philadelphia criminals and also in French artists so the artistic production of the artists in terms of just the number of paintings you again see this rise and decline okay and many others and if you look in the literature there are lots of these examples and you see plots like this all over so what about computer scientists so I know what papers you've written I know the year that you were hired and it's interesting to see that we could maybe assess this so one of the issues that we ran into when we started to do this is that DBLP's coverage of your publications isn't perfect and it's also improving so what we did is for 10% of the faculty in the sample this was a few hundred we downloaded their CVs directly from their website and then we compared the number of publications on their 
CVs under the peer reviewed section to DBLP and as you can see DBLP's coverage has been growing over time so these systematic changes we think are related to indexing like what DBLP is actually crawling to get their data accessibility which conferences have actually been put online and which journals are now digital or have digital representations and then the third one which is sort of fun and interesting is like what is or was computer science like what counts as computer science professor CV now and in the past so for the non-CV 90% we adjusted their past counts to account for this coverage trend but what was fun is that even after we did that publication rates have been rising dramatically over time so the rate of additional growth is pretty consistent you have one additional yearly publication per decade so let me explain this plot on the right we know that productivity is different depending on where you are in your career like it's maybe a little bit lower when you start out and then it peaks coincidentally around 10 year time and then goes down after that so depending on which years you compare you may get sort of different growth rates but if we normalize these by the number of publications in 2011 what you see is that if you look at the first 5 years versus year 5 in particular years 5, 10 and 15 you end up with a very consistent collapse of the data and so the growth rate in publications over time is pretty consistent now speculation about why this could be so one thing could be the minimum publishable unit may be decreasing so it's easier to write little papers another thing could be that in computer science there's a strong tradition of publishing at conferences and conference proceedings are yearly and there are more and more conferences so these conference deadlines approach and you send in whatever you have but what that meant for us was that we had to adjust for coverage on the previous slide and now adjust for growth the reason we're going to 
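As a rough sketch of those two corrections — all the numbers below are hypothetical placeholders, not the study's estimates — the coverage and growth adjustments might look like:

```python
import numpy as np

# Illustrative coverage ratios (DBLP count / CV count), as might be
# estimated from a CV sample; coverage improves toward the present.
coverage = {1990: 0.60, 2000: 0.80, 2011: 0.95}

def adjust_for_coverage(observed, year):
    # Interpolate a coverage ratio for the year, then scale the count up.
    yrs = sorted(coverage)
    ratio = np.interp(year, yrs, [coverage[y] for y in yrs])
    return observed / ratio

def adjust_for_growth(count, year, base_rate_2011=4.0, growth_per_year=0.1):
    # "One additional yearly publication per decade" = 0.1 papers/year/year.
    # Convert a count from `year` into 2011-equivalent papers.
    field_rate = base_rate_2011 - growth_per_year * (2011 - year)
    return count * (base_rate_2011 / field_rate)
```

So six papers indexed in 2000 at 80% coverage become 7.5 estimated papers, and those are then inflated further to compare against 2011-era publishing norms.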
This I thought was interesting: this is what you see in the first 10 years post-hire, with prestige increasing to the right, so Berkeley is over here on the right. The difference is really substantial: in the first decade, per 100 ranks of difference, it's 33 publications on average. We were also surprised that the public or private status of the institution was not significant. The orange dots are public and the black dots are private, and we had thought that private status and prestige might be culturally correlated as a signal, because the private institutions over here on the left tend to be very big names — but we don't think about the fact that there are lots of other private institutions as well. So the data revealed that the apparent signal of increased productivity from public versus private status is unreliable once you account for prestige. This is what things look like for all the computer scientists, now broken apart into the top 50 institutions, the next 50, and those with a rank over 100. The shape is consistent with what we've seen before, and it seems to scale with prestige. What's cool about this is that it's longitudinal data for individuals: we're tracking a single person through a career, instead of binning a lot of contemporaries by age at one time point. So I've added computer scientists to this list, along with the criminals and the psychologists. We got interested in this because this is an average, and if you study systems by looking only at the average, but you're interested in the different trajectories that people actually take, you have to ask whether there are any individuals who are actually well described by the average. The average looks like this, but are there any people who actually look that way? If so, that's interesting; if not, that's also interesting. So instead of averaging over thousands of people, we decided to model individual trajectories. Here's what we did. If there's a rapid rise to an early peak and then a decline or flattening, the simplest model we could write down is a plain piecewise-linear function that's continuous: it has one slope, then another slope, some change point t*, and an intercept, so it's a four-parameter model. The canonical trajectory — rapid rise to an early peak, then decline or flattening — translates into four mathematical conditions on the parameters: "rise" means the first slope must be positive; "early peak" we took to mean t* happens in the first half of the career; "decline" means the second slope must be non-positive; and "rapid" we interpreted as the slope on the way up being bigger in magnitude than the slope on the way down. Then we fit this four-parameter model to each individual's productivity trajectory and check whether those four conditions are met. Just a note here, and this is editorializing: calculus really does work. We have a simple four-parameter model, and if you're fitting it by least squares, one thing you could do is write down the error function and fit the model with off-the-shelf techniques, searching a four-dimensional parameter space. Or you can go back to your statistics course, or wherever you learned this, take partial derivatives of the error with respect to the parameters, and fit the intercept and the two slopes analytically with linear algebra.
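That closed-form-plus-scan idea can be sketched as follows — this is my own minimal reimplementation for illustration, not the authors' actual code:

```python
import numpy as np

def fit_piecewise(t, y):
    """Least-squares fit of a continuous two-segment linear model.

    For each candidate change point t*, the model is linear in the
    remaining three parameters (intercept, first slope, slope change),
    so they solve in closed form; only t* needs a 1D scan.
    """
    t, y = np.asarray(t, float), np.asarray(y, float)
    best = None
    for ts in t[1:-1]:                          # candidate change points
        # Basis: intercept, t, and a hinge that is zero before t*.
        # The hinge makes the two segments meet continuously at t*.
        X = np.column_stack([np.ones_like(t), t, np.maximum(t - ts, 0.0)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        err = np.sum((X @ beta - y) ** 2)
        if best is None or err < best[0]:
            b, m1, dm = beta
            best = (err, b, m1, m1 + dm, ts)    # second slope m2 = m1 + dm
    _, b, m1, m2, ts = best
    return b, m1, m2, ts
```

Fitting a synthetic tent — say, slope 2 up to year 5 and slope −1 afterward — recovers exactly those parameters, and each candidate t* costs only one small `lstsq` solve.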
You've just reduced the search over the 4D space to a one-dimensional line search, and the code goes much, much faster. Anyway, just a little side note: in all of these problems, even when you're hitting a lot of data with every tool you can think of, it's still worth a little bit of whiteboard time before you start writing a little bit — or a lot — of code. Okay, so this is what things look like if I plot all of the people in our data set by their slope parameter at the beginning, on the horizontal axis, and their slope parameter after t*, on the vertical axis. From the four quadrants I've pulled out four examples. You can see the canonical trajectory lives down here: a positive first slope and a negative second slope. Then the opposite — the people that drop and then go back up — are in the upper left; down-and-down is over here, and up-and-up is over there. In some sense we shouldn't be surprised that there's nobody in the bottom left and nobody in the upper right: if you start publishing at one rate, then drop, and then drop again, you're probably not going to be in our data set, and there are limits to what is possible regardless of how many grants you have and how giant an empire you've been building, so you can't consistently increase at too high a rate. I don't know if you can see the shading back there, this kind of triangle: on that side, the slope is steeper in the second half of the career. 62.1% of people look roughly like a tent, but those that match the canonical tent — satisfying all four conditions — come to 32%, so about a third of people are robustly within this regime. When I say robustly, you have to worry about overfitting in this situation: any time you've written a paper, it could come out in 2017, but if the review process
went really well, maybe it would have come out in 2016, and if review went horribly, maybe it comes out in 2018. So what we did is add a little bit of noise to each individual's papers: we said each paper could have appeared in an adjacent year with some small probability. When we add that noise, we can ask whether people are consistently found in the same region of this parameter space. It turns out that 25% of professors are unstable, which is interesting: even if there seems to be a trend in your career in terms of paper productivity, it may be susceptible to reinterpretation in the presence of noise. So let's talk about gender for a moment. Men and women were statistically indistinguishable across these quadrants, as well as within the canonical triangle. There were some differences by prestige, however: the initial slope was higher for faculty at prestigious universities, 1.21 papers per year versus 0.75 papers per year. But, on the principle that what goes up must come down, if the slope in the beginning is higher, the slope afterwards also tends to be more negative. We did notice significant effects of PhD prestige on slope, and the prestige of your PhD, your faculty institution, and your postdoc were all related to the value of the intercept — the b parameter — suggesting some amount of continuity with what you were doing before: if you were very productive as a postdoc, you're likely to remain productive once you start a faculty job. I don't think that's particularly surprising. There's an interesting question here about causality: does being at a prestigious place like Berkeley make you more productive, or are you here because you were productive? What we're hoping to do — and you can probably think of good ideas that I would never think of — is find natural experiments, where people change institutions or institutions themselves change, to see whether or not there's a causal relationship.
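Returning to that noise check for a moment: a minimal version might look like the following, with crude half-career slopes standing in for the full piecewise fit (all of this is illustrative, not the study's code):

```python
import numpy as np

rng = np.random.default_rng(42)

def slopes(paper_years, career_len=10):
    """Crude rise/fall summary: slope of yearly counts in each half-career."""
    counts = np.bincount(paper_years, minlength=career_len)[:career_len]
    t = np.arange(career_len)
    half = career_len // 2
    m1 = np.polyfit(t[:half], counts[:half], 1)[0]
    m2 = np.polyfit(t[half:], counts[half:], 1)[0]
    return m1, m2

def stability(paper_years, trials=200, p=0.1, career_len=10):
    """Fraction of jittered refits that keep the same slope-sign quadrant.

    Each paper moves to an adjacent year with probability p, mimicking
    the arbitrariness of when a paper happens to clear review.
    """
    paper_years = np.asarray(paper_years)
    base = tuple(np.sign(slopes(paper_years, career_len)))
    same = 0
    for _ in range(trials):
        shift = rng.choice([-1, 0, 1], size=len(paper_years),
                           p=[p / 2, 1 - p, p / 2])
        jittered = np.clip(paper_years + shift, 0, career_len - 1)
        if tuple(np.sign(slopes(jittered, career_len))) == base:
            same += 1
    return same / trials
```

A strongly tent-shaped record stays in its quadrant under almost every jitter; a nearly flat one flips signs frequently, which is the "unstable" behavior described above.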
That causality question is to be continued. Let me briefly talk about career change points. If we look at the value of t* — the change point in the model (not an inflection point, the change point) — what you can see is that the change points have a mode, even if you add a little bit of noise, between 3 and 7 years. Is it surprising that the time when most people are putting together a tenure case is also when there's a change point? I think that's probably not coincidental. I want to note that we only included those with what I'm going to call a meaningful t*: we're fitting a model with four parameters to data, but it's not necessarily the case that we should interpret t* if somebody is better modeled by a straight line. So we used AIC to decide between the four-parameter model and a line, threw out all the people who are better modeled by just a line, and interpreted only those t* values that AIC tells us are meaningful. What's kind of interesting is looking at the actual data for peak productivity — the year in which you wrote the most papers. We place a dot for each individual based on the length of the career and the most productive year, and whether you include the early-career faculty, in gray, or just the orange, the modal year for peak productivity is once again right around year five. But there's a lot of diversity: all the people across the bottom here are people whose first year on the job was their most productive year, and for all the people along the diagonal, their most productive year was 2011, when we collected data about them — and there are some people who have been in these jobs for three or four decades whose most productive year was this most recent one. So okay: yes, there are patterns, and yes, we can tell a story about tenure, but there are lots of different ways to have a career.
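The AIC screen for a "meaningful t*" can be sketched like this — a hypothetical minimal version, using the fact that for least-squares fits with Gaussian errors, AIC reduces (up to a constant) to n·ln(RSS/n) + 2k:

```python
import numpy as np

def aic(rss, n, k):
    """AIC for a least-squares fit with Gaussian errors (up to a constant)."""
    return n * np.log(rss / n) + 2 * k

def prefers_changepoint(t, y):
    """True if AIC favors the 4-parameter piecewise model over a 2-parameter line."""
    t, y = np.asarray(t, float), np.asarray(y, float)
    n = len(t)
    # Straight line: k = 2 (slope, intercept).
    rss_line = np.sum((np.polyval(np.polyfit(t, y, 1), t) - y) ** 2)
    # Piecewise-linear: scan candidate change points with a hinge basis;
    # k = 4 (intercept, two slopes, and the change point itself).
    rss_pw = np.inf
    for ts in t[1:-1]:
        X = np.column_stack([np.ones(n), t, np.maximum(t - ts, 0.0)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss_pw = min(rss_pw, np.sum((X @ beta - y) ** 2))
    return aic(rss_pw, n, 4) < aic(rss_line, n, 2)
```

For a genuinely tent-shaped trajectory, the drop in residual error dwarfs the +2-parameter penalty, so the change point is kept; for a near-linear record, the line wins and t* is discarded.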
We also looked at author-order transitions, because it's not just about which papers you write but also what your role is on those papers. What you can see is that the proportion of all publications that are last-author publications grows over time, while first-author publications decrease over time. That's split by prestige here — the orange are the top 50 institutions, the black are the others — and you can see the rate at which you move from writing first-author publications to last-author publications changes; the missing fraction, of course, is the middle-author publications. One thing we had to worry about here is that CS theory is like math: people order the authors alphabetically, so we needed to remove those folks from this data set. You can write down a permutation model: if there are three authors on a paper, there's a 1-over-3-factorial chance that they'll be alphabetical at random, and we discarded the venues where author lists tended to be alphabetical. So this is interesting: the transition to last author happens about two years earlier for faculty at the top 50, and there's also a different steady state for the LAP rate — the last-author publication rate — but all faculty have the same FAP rate, the first-author publication rate. I'm not sure how to interpret that: why should there be a different fraction of publications that are last-author versus middle-author, while the first-author fraction is the same? We were about to submit the paper when we realized that we were falling into the same trap we had been criticizing in some of the literature: here I'm averaging over lots of people. So what if we look at actual individuals and their own transitions from first-author publication to last-author publication? These plots are a little bit messy: individuals are dots, with a kernel density estimate shown in the field behind, and the top 50 are on the left. What I'm showing is the first-author publication rate early on against the first-author publication rate in years 3 through 5.
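That permutation screen for alphabetical-ordering venues might be sketched as follows — illustrative code, where the last-name comparison and the 5× threshold are my simplifications, not the study's actual criterion:

```python
from math import factorial

def alphabetical_prob(n_authors):
    """Chance that n authors appear in alphabetical order purely at random."""
    return 1 / factorial(n_authors)

def is_alphabetical(authors):
    # Compare by the last token of each name -- a crude stand-in for
    # real name parsing.
    last = [a.split()[-1].lower() for a in authors]
    return last == sorted(last)

def venue_looks_alphabetical(author_lists, ratio=5.0):
    """Flag a venue whose alphabetical fraction far exceeds the chance rate."""
    observed = sum(map(is_alphabetical, author_lists)) / len(author_lists)
    chance = sum(alphabetical_prob(len(a)) for a in author_lists) / len(author_lists)
    return observed > ratio * chance
```

For three-author papers the chance rate is 1/3! ≈ 0.17, so a theory venue where nearly every author list is sorted stands out immediately and can be dropped from the author-order analysis.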
If you're going from writing a lot of first-author papers to fewer first-author papers, you'll be below the diagonal, and what you can see is, again, a huge amount of diversity: there's no one way that people write papers over the course of a career. So, some conclusions. The canonical narrative appears to describe only about one in three faculty, and patterns established during the doctorate, in terms of productivity, may persist post-hire — and that includes gender. If gender affected how a PhD went — how many publications you were able to write, the support you got from advisors, and so on — those things that happened during the doctorate are going to persist through the postdoc and post-hire. But I want to stress that our conclusions here are a little bit limited. There are lots of ways to be productive that don't involve peer-reviewed publications, and anybody who has been in academia for longer than a couple of years knows it: you can review a lot of papers and make a contribution that way, you can write a lot of grants, you can chair a department, you can start the Berkeley Institute for Data Science — you can do all kinds of different things. So this is only looking at publication rates, and the conclusions should be narrowly interpreted in that light. But there are some fun future questions. We're only looking at the survivors of the academy as of 2011; what about those who left? There was a question in the beginning about the people who have gone to industry, so it will be very interesting to look at that data now, in 2016 — the people who were originally in the data set in 2011 — to see what's happened to them and what they've done. I also think it's a fun question to ask whether prestige causes productivity: does being at a prestigious institution somehow cause productivity? Is there something about creative environments that actually draws out of you something that would not have been drawn out at a different place? If you look at the architecture of this room, it's really designed with that kind of intention: BIDS is designed with open office space so that people bump into each other, and when particles collide, there are different energy releases. Another thing I think would be cool to look at is how much structure a model of productivity actually needs. If we go back to the French writer in the 1800s, he's talking about the passions of man to commit crimes, until eventually the ethics kick in — that supposes there are mechanistic factors actually causing some of these growth-and-decline patterns. But given the diversity of the trajectories we see, alongside the consistent averages across all sorts of different fields, you have to wonder: what mechanisms could be consistent across fields, and do you even need mechanisms at all? So one of the things we're experimenting with is this: what if writing papers is just a random walk in paper space, and you impose a threshold right around tenure time that people try to make their way over? Can you use that to get both the stochasticity of the different individual patterns and the consistency of the averages? This would be a little bit of one of those annoying physics papers that says, well, you don't really need a mechanism here — anyway, we'll see how that goes. I want to acknowledge my collaborators. Allison Morgan and Sam Way are grad students in computer science at Colorado; they're great, and if you ever get a chance to meet them at a conference, they're super nice. Aaron Clauset is a long-time collaborator of mine and just a pleasure to work with. There are three papers here if you're interested: the productivity work is on the arXiv and under review now; the work on gender came out at the WWW conference in 2016; and the Science Advances paper in 2015 was the prestige work. It's 2017, and things need to be open and available, so if you want any of the data or any of the code, it's all on my website or on Aaron's website. We need to acknowledge the Kauffman Foundation, who paid for everybody to hand-collect the data for the original prestige study, and I'm here because of the Santa Fe Institute and the NSF. I'd love to answer any questions that you have. Thanks for your attention. Thank you.