So thank you everyone again for coming out on this steamy August day. I guarantee you will not be disappointed. At this point, I want to turn the program over to Sarah to open up and talk to us a little bit about the significance and importance of data as a tool for racial justice. Sarah?

Thanks so much, Valerie. Thanks everyone for having me here. I'm looking forward to getting to know everyone. I don't know most of you, but I do know your organizations. We rely on your work a lot at CLU, so I'm looking forward to that. So as Valerie said, my name is Sarah Jimenez. I work with Community Labor United in Boston. I'm a senior researcher there. CLU convenes labor and community-based organizations. With those coalitions, we drive issue campaigns. And our value add to the coalitions that we convene is to facilitate, to provide research and policy development support, and to direct legal and communications resources. So that's the intro done. Is everyone having a good summer? Okay, great. Earlier this summer, I was up in Maine. I was watching my sister walk across the stage. And as she walked, the audience was quiet. We were listening. The emcee read several lines of her work. And then we clapped. That's the tradition for her program in creative writing. So she has a degree in creative writing. And I have a sister with a degree in creative writing. Which is great, because I've been picking her brain on storytelling for a few years now, and that's important because storytelling is really essential to the research and the campaign work that CLU does. So here's a story. How many people here study childcare? Okay, great. So most of you are new to it, and you can tell me later if the story made any sense to you. The story begins with two announcements. Back in April, there was a press release out of the Massachusetts EEC, which is our state Department of Early Education and Care. They're launching a new program.
It's called StrongStart, both words capitalized, smushed into one word. EEC has been overhauling its QRIS, or its quality rating and improvement system, and StrongStart is a major part of that. A lot of states have similar QRIS systems. They're specific to early education and childcare. Typically, it's a rating scale for childcare programs, and then there are resources to help programs move up in the rating scale. The EEC is very excited about this. They even said that word, excited, in their press release. So as for the second announcement, I can't tell you when it was made, because in our state, childcare advocates waited and waited and waited for this announcement, and it never came. A few years ago, to everyone's surprise, federal block grant funding for childcare nearly doubled. It was discretionary, so it might have only been temporary. But almost immediately, there were statements from all these different states about how they would spend it. So what is Governor Baker doing with the 58 million that our state got? He's not saying. But in this case, no message is the message. All these other states, they were happy to share their plans, because it made them look good. They were going to fund more vouchers for low-income families, and they were going to raise reimbursement rates for care providers, and so on. Those are things that our state strongly needs, too, with some of the highest childcare costs in the nation and some of the lowest care provider rates. But if the Baker administration were going to do something about that, we bet they would have said so. So let me introduce you now to someone who didn't notice either one of those announcements or non-announcements. Her name is Rosanna. Rosanna lives outside Codman Square in Boston, and one day in the square, there's a woman with flyers. "Hospitality training," the flyers read. Most hotels in Boston are union, and the wages are decent.
In fact, Rosanna vaguely remembers seeing hordes of red t-shirts marching around the city recently, shouting something that seemed both common sense and also totally utopian: "One job should be enough." So Rosanna turns to her friend Ava, who's also a mother, and pushes the flyer at her. "Let's go in together," she says. Ava looks at it, but she's not smiling. "I know this program," she says. "They said that first-year hotel workers get the graveyard shifts, and we need to have overnight childcare in place before we even start this program." So Rosanna wonders briefly if her mom could come over nights to watch little Joselito, and then just as quickly she dismisses the idea, because her mother hasn't been well, and how can she take care of someone when she needs someone at home taking care of her? So Rosanna stuffs the flyer into her handbag, and that's where it'll stay until the next time she empties the whole thing out to clean. The flyer is still in there that evening when she's home, and she's trying to enjoy a few hours with her husband before he leaves for the night shift. And I say trying, because she's struggling not to be upset that he overdrafted their account, and he's struggling not to be upset that she didn't fill up the gas tank before she got home. It's like this a lot of nights. They try not to fight because their apartment is small, and little Joselito would be able to hear them through the walls. Community Labor United was paying close attention to these two announcements. At CLU, we've been doing research for our coalition on childcare for about two years now. The coalition finally has a name, Care That Works, and we've made progress on our issue analysis, which helped us to interpret those two announcements probably a little bit differently from most people. So what kind of research is happening? The word radical means relating to the root of something. Radical change means getting to the root of a problem or a system.
To target root-level change, you need to start with a root cause analysis. And for big, unwieldy systems like childcare, those roots usually go way back in history. So research is the foundation of all of our campaigns, and it's never complete without looking at our histories. We look at the events, we look at the statistics, and we try to make sense of them and turn them into stories. Rosanna probably doesn't think about this often, but in 1935, when her grandmother was still a young girl, a bill was passed called the National Labor Relations Act. People probably know about this one. When our country moved to protect the welfare of workers, a huge population of workers was deliberately left out of it. This included domestic workers, and that includes childcare workers employed in private homes, largely poor women of color. And their being left out was not an accident. In 1946, when Rosanna's grandmother was a teenager, the federal government ended its first and last ever universal childcare program. The war was over, the men were coming back, and the women who had been working had to go home. And with women at home, there was no more need for government-funded childcare. Decades later, after Rosanna's mother had been born, there was another stab at universal childcare. In 1972, the Comprehensive Child Development Bill passed the House, and it passed the Senate, and then it died on President Nixon's desk. Two decades later, in the late 80s, all we got instead was this block grant, the Child Care and Development Block Grant. So in the mid 90s, when Rosanna was in the first grade, we got welfare reform, and with it were work requirements, which were very popular because people didn't like the visual image of poor mothers of color staying at home with their children and receiving help. The childcare system we have today has its roots in all these historical moments.
With hindsight, we see these decisions justified with convoluted and contradictory logics, the logics of white supremacy, of patriarchy and misogyny, and class prejudice. These forces still shape our world today and are poised to shape our future. So we need to know this history, and we need to tell it, and we can't do that without the historical data. When we look at the landscape of our childcare today, the disaggregated data helps us trace the legacies of our history to our current circumstances. Given CLU's model of bridging community and labor organizations, when it comes to childcare, we focus on both families and care providers. This is especially important because when there isn't enough money in the system, these two groups are so easily pitted against one another. So take Rosanna. Her son, Joselito, is two years old. Like most other families, Rosanna relies on a multitude of arrangements to care for him. She uses family, friend, and neighbor care, an arrangement which is heavily used by families across race, ethnicity, and income. She probably wouldn't be able to access a childcare center without a voucher, especially not at the rates for infants and toddlers. And she definitely wouldn't be able to afford a full-time nanny. But sometimes she does rely on family childcare providers, who in our state are disproportionately women of color, especially older women of color. Unfortunately, these family childcare providers have been going out of business recently. The numbers have been falling steadily in our state over the last decade. The childcare landscape is so, so fragmented and so segregated. Disaggregated data has helped us see in particular how women of different races and incomes are sorted into different care settings, both as parents and as care workers. Low-income women of color struggle to find care that they can afford during the hours that they need it.
And care providers have some of the lowest wages in our state, despite increasing credentialing requirements. So who remembers what those two announcements were? There was the one that the state was going all in on quality measurement, training, and credentialing. And then there was the one that was mum on the latest opportunity to help families afford care and increase care provider compensation. And CLU's analysis in a very abbreviated form is this: that policy priorities are ignoring the childcare challenges of low-income families of color by focusing almost exclusively on education metrics rather than basic access. That policy implementation is driving vulnerable caregivers out of the system by ramping up credentialing requirements without sufficient compensation. And that policy frameworks overall are ignoring the role of family stability in child development, ignoring the need for childcare to be a genuine work support so that parents can be at their best for their kids. And then that framework only reinforces the focus on education metrics and credentialing, because then the caregivers and early educators become the scapegoats for poor child development outcomes. So by the way, Rosanna and Ava entered the hospitality training program after all. A pilot program was launched in Boston. It organized single mothers to go through these training programs together with a firm job opportunity at the end. And it organized family childcare providers to work together to figure out how to offer care for non-standard schedules. Because Ava had been a prospective trainee before, the hospitality training program had reached back out to her when this new childcare program was launched, and she pulled Rosanna in too. So as part of this program, Rosanna and Ava were invited to a special event. Other women in their cohort were there. The care providers were there too, including the ones who were caring for Rosanna's and Ava's children.
There was music and food and socializing and also history and statistics. This is the kind of gathering that Community Labor United organizes to engage our partner members in coalition work. CLU uses conversations, co-learning and creative curricula to share the stories that we learned from our research, the stories about the past and the stories about what's happening today. So now Rosanna and Ava and their care providers are in a room together. And maybe they never realized before how they had absorbed the message that life is hard because of their unworthiness or their bad decisions or their random misfortune. But now they see themselves and their own lives reflected in the numbers and the stories, and they're starting to get the idea that they aren't alone. This kind of gathering is a key space for power building. Making change at the root level takes power, enough power to challenge the existing power structure that has control over so much of our daily lives, and enough power to ultimately wrest that control back from those who hold it. So a lot of groups run issue campaigns and advocate for progressive change. What makes CLU different is our emphasis on power building, specifically the grassroots power of those who are most affected, like Rosanna and Ava. And sharing information in all directions, including all those histories and statistics, is a key way to start activating that power. This kind of gathering is also a key space for campaign development. Through our own research, guided by the leaders in our coalition, we can develop ideas and options for campaigns. But without the grassroots members, there is no campaign, so the ultimate direction needs to come from them. CLU hasn't finished this process for our Care That Works coalition yet, so there isn't a clear campaign that I can share with you today, but I can share two currents of interest.
First of all, the program that Rosanna and Ava found is something that we're working on developing now. It's called the Independent Women Project, where we want to model the provision of licensed childcare for families with non-standard schedules. It's something that can meet our people where they're at, it addresses an immediate need, and it's a vehicle for organizing. And down the road, if the model is successful, it can be a tool for our campaigns to shift care policy at the state level. So if we can show them that we're doing it, they can't say that it can't be done, or that it can only be done with for-profit corporate partners. Relatedly, we want to revive the framing of childcare as a work support and as a public good. Right now, the policy landscape is dominated by the frame of childcare as an educational setting, which it is, but it is so much more than that too. And establishing a system of care that works, there you go, is going to take more than just a QRIS overhaul. Another reason for this framing is that we know that corporations are the beneficiaries of the unfair scheduling policies and the low wages that make it hard for families to juggle work and care or to afford care at all. And big corporations are the beneficiaries of the low taxes that diminish the public revenue that could fill in the gap between what families can afford and what care providers should be earning. So we want to push this framing of childcare as a work support and a public good, and therefore push the idea that big corporations, as employers and taxpayers, should be accountable for childcare too, and that they need to pony up. So there's so much more about our campaign to be said, but I want to save some time for the takeaways. The prompt for this introduction, by the way, was: how do we use data in our campaigns? So it's in there, but in case you missed it, I took the takeaways and made bullet points.
We use data to help us understand an issue from the perspective of frontline communities and to develop a root cause analysis that will point us towards the kind of root-level change that we want to move. And that research involves tracing historical conditions up to today, including system-shaping injustices in our past. That same historical research can also inform a campaign strategy by suggesting targets. That is, the groups and the organizations that have benefited from injustice are likely to stand in our way if we move to change things. We use data to critique existing policy priorities and to shape our own policy agenda so that the interests of frontline communities that are usually ignored will be pushed to the front and center. We use data to support information sharing, organizing, and leadership development with the members of our partners so that we can build grassroots power that will confront existing power for control over our lives. I'll add that when we're in active campaign mode, we deploy data strategically to advance our vision, our values and narrative, and to influence the public and other decision-makers and targets. So that's probably the most visible and well-known use for data in campaigning. And one last takeaway is that root-level change is a major proposition, and no one organization or one coalition is going to do it by themselves. So we all have a role. And my organization's role is to take existing research that's out there and turn it into stories that can move people. So that means that we rely heavily on all of you, all the policy shops, to analyze that data and make it available. So thank you for your role and thank you for listening. Part two, we'll talk about that.

So thank you, Sarah, for sharing that story.
And I just want to remind you all, one of the ongoing themes of these workshops is making our research more useful and more practical to folks that are actually moving campaigns and trying to influence policy in their local areas. And the Center for Popular Democracy is one of the co-hosting organizations for this workshop series. Thank you, Sarah, for being a part of that. And I hope that you will keep some of those things in mind as we go through the day and get towards the second half of our program today, as we begin to think about what kinds of information would be useful in terms of helping them to move campaigns like the one that Sarah has mentioned. We will have time for Q&A after we break for lunch. So if you have questions, jot them down. We'll get back to those. At this point, I want to turn the program over to Trevon Logan, who's going to talk to us about race and ethnicity in empirical analysis.

So thank you. I'm excited to be here to talk about what may seem like a boring topic, but to a historian is actually quite exciting. So to give a brief outline of what we're going to talk about today, it occurred to me when Valerie and I were talking about this particular presentation that many people, as secondary data users, use race a lot in the data that they use, but they don't necessarily know how race is coded or how that has changed over time. So we're going to start with some terms and definitions, which themselves are very interesting, and we'll talk about why they have changed so dramatically over time. And then we'll talk about how they become variables in the analysis that we do, with some very simple examples. I'm not going to teach any econometrics today. I guarantee you. And then we'll talk about how we interpret that data, also with some examples, particularly from social media, and then think about the meaning of race itself in empirical research.
And so some of these last two are blurred together, because there's a lot of discussion about race and ethnicity today, in particular in the United States, and the data that we use to talk about it is itself not always as organized as we might believe. So some definitions. There are racial classifications that come from the U.S. census, but more specifically they come from the Office of Management and Budget, and they are requirements. So these racial classifications are the following. They're white, black or African American, Asian, American Indian or Alaska Native, Native Hawaiian or other Pacific Islander. And then there is another category of some other race, and I'll talk about that in a moment. And there are also two ethnicity classifications, Hispanic and non-Hispanic. And so everyone is a member of both a race and an ethnicity, and usually when we're talking about some of these, there are some groupings that people use canonically as races but which actually are not. I'll talk about that in a second. So how are these races defined and where do they come from in terms of policy? The Office of Management and Budget requires that race data be collected for a minimum of five groups: white, black or African American, American Indian or Alaska Native, Asian, and Native Hawaiian or other Pacific Islander. They permit the Census Bureau to use a sixth category, which is some other race. And respondents may now, post the 1990 census, report more than one race. People then identify their origin as Hispanic, Latino or Spanish, and they may be of any race. So ethnicity is about your origin and it's not about your race. Ethnic designations are completely separate from racial classifications, and that's important when we talk about race and ethnicity. So what are these definitions? What is the definition of white? What is a white person in any data that you may use in a secondary data analysis?
Does anyone know this definition off the top of their head? I did not, and I talk about white people a lot, so I was really interested to know: who are white people? Okay, a white person is a person having origins in any of the original peoples of Europe, the Middle East or North Africa. So it is not confined to those derived from the Caucasus Mountains. It extends a little bit further than that and includes essentially everyone in the Middle East and North Africa. By the census and OMB definitions, those are all white people. It includes people who indicate their race as white itself, but also people who would indicate their race as, say, Irish or German or Italian or Lebanese, Arab, Moroccan or Caucasian. And these are in quotes because this is right from the census website. It is very important to note that in every racial classification I could find on the census website, white comes first. That is not lexicographic, but it comes first. And so I'll talk about it first. And so this link will be on the website and you can see that from all of the classifications. So those are white people. I'm going to do these definitions in the order that they appear on the census website. So who are black or African American people? This definition will be a little bit incongruent with the first one, because it says: a person having origins in any of the black racial groups of Africa. So we are now going once again back to the African continent, but now we're explicitly saying that there are whites in Africa, who are from North Africa, and everyone else in Africa would be from the black racial groups of Africa. So this is confining them to being of Sub-Saharan Africa. It includes people who say that their race is black or African American, but also entries such as African American, Kenyan, Nigerian, or Haitian, which was not in this grouping, which was confining itself to people of the black racial groups in Africa.
So those who are from, say, Caribbean islands, who would classify themselves as Haitian, are also recorded as black. But it's not clear that that person would necessarily be, given Haiti's history, black or white. So there's some commingling of ancestry here in a pretty unclear way. And black always comes second. Always comes second. I've never seen the racial classifications on the census website start, say, with Asians, which would be the sort of alphabetical way of defining race. Now, who is American Indian or Alaska Native? So I will say Alaska Native should be pretty self-explanatory by what it says. But who are American Indians? Once again, something that we use but don't necessarily know the definition of, right? It is a person having origins in any of the original peoples of North and South America, including Central America, and who maintains tribal affiliation or community attachment to that group. It includes people who indicate their race as the category itself but also those who would give an explicit tribal designation. All of those people are American Indian and Alaska Native. And the census website always puts American Indian and Alaska Native third. Asians, and this is going to become a little bit tricky because we have another category with which there is some overlap. But Asians are persons having origins in any of the original peoples of the Far East, because remember, in the Middle East they are white. Asians come from the Far East, Southeast Asia, or the Indian subcontinent. Geographically this is of course the widest area of coverage, and so these will be people from everywhere from Cambodia and China to India, Japan, Korea, Malaysia, Pakistan, the Philippine Islands, Thailand and Vietnam.
All of those groups fall under the Asian racial category, and so people who report themselves as Filipino, Korean, Vietnamese, or other Asian are always going to be in the Asian racial category. Because this is self-disclosed, you may consider yourself Asian, you may consider yourself white. You couldn't be American Indian or Alaska Native, right, so some categories are excluded, but where you would actually fall is your own self-report. So it's very important to note that this is all now self-reported, but I'll talk about what this meant historically. And this category is always fourth. And now we have Native Hawaiian or other Pacific Islander. But note, the Philippine Islands did fall into the definition of Asian. So there will be once again a little bit of incongruence here. And these are people having origins in any of the original peoples of Hawaii, Guam, Samoa or other Pacific Islands, and so Fijian, Marshallese, Native Hawaiian, Samoan, Tongan and other Pacific Islanders are all in this racial classification. And then we have ethnicity, which is recorded separately from race. And then there's the category some other race, which means: of all of these groupings of the world, I fall outside of them by my own disclosure, and I am a member of some other race. And so ethnicity is disclosed independent of race. So one marks their race and marks their ethnicity separately. What does it mean then to be Hispanic or Latino, because the other category is not Hispanic or Latino? So if this applies to you, you are of Hispanic ethnicity. It's a person of Cuban, Mexican, Puerto Rican, South or Central American or other Spanish culture or origin, regardless of race. Which is a bit interesting, because Brazil is in South America, and Brazilians would not be of Spanish origin in historical terms. So the term Spanish origin can be used in addition to Hispanic or Latino.
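To make the "race and ethnicity are marked separately" point concrete, here is a minimal Python sketch. It assumes a hypothetical `Respondent` record with one race field and one Hispanic flag; the records are made up for illustration, and multi-race reporting is left out to keep it short. It is not how any census file is actually laid out.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Race(Enum):
    # Listed in the order the talk says the census website uses.
    WHITE = "White"
    BLACK = "Black or African American"
    AIAN = "American Indian or Alaska Native"
    ASIAN = "Asian"
    NHPI = "Native Hawaiian or Other Pacific Islander"
    OTHER = "Some Other Race"


@dataclass(frozen=True)
class Respondent:
    """Hypothetical record: one self-reported race, one ethnicity flag."""
    race: Race
    hispanic: bool  # ethnicity, recorded independently of race


def crosstab(respondents):
    """Count respondents in every (race, hispanic) cell."""
    return Counter((r.race, r.hispanic) for r in respondents)


# Made-up records, for illustration only.
sample = [
    Respondent(Race.WHITE, True),
    Respondent(Race.WHITE, False),
    Respondent(Race.BLACK, True),
    Respondent(Race.BLACK, False),
    Respondent(Race.OTHER, True),
]

cells = crosstab(sample)
# Hispanic respondents appear in several race rows at once.
hispanic_total = sum(n for (_, hisp), n in cells.items() if hisp)
# The Black row contains both Hispanic and non-Hispanic respondents.
black_total = sum(n for (race, _), n in cells.items() if race is Race.BLACK)
print(hispanic_total, black_total)
```

Because the two fields are independent, any "Hispanic" total cuts across the race rows, and any race total mixes Hispanic and non-Hispanic respondents unless the analyst splits them out.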
So there's some mapping once again of geography, but this is not a racial classification, this is one of origin. It's critical to note that this ethnic classification has come to be used as a racial classification in a lot of empirical analysis. In the 2010 census, more than a third of those claiming Hispanic ethnicity also claimed to be of some other race. So they did not fall into any racial designation. They didn't want to name any racial designation. So there is some discussion of whether Hispanic should also be a racial classification or whether it should remain, as it's used currently, an ethnic classification. Okay, so these categories are the most current definitions that are used by OMB and by the census. But obviously these categories have changed over time, and it's very important, because until the 1960 census, race was determined, say in census records or even in many surveys, by the surveyor. So the surveyor would simply look at you and say, this is what I believe your race to be. And that was also a time period in which there were fewer racial classifications than there are today. So categories until 1850 were white and non-white. And so historically we tend to assume that all of the non-whites are black. Then in 1860 Native American was recorded, and it's important to note that they would record Native Americans as Indians, and then they would separately record Indians on reservations and Indians not on reservations, and so once again there's a geographic split. By 1870 Chinese was added. That is not the Asian category. The category literally was Chinese, and then Japanese was added in 1890. Filipino, Korean and Hindu were added in 1920, all of which are now in the Asian category, and then Mexican was added in 1930. You can see all of this on the Pew website, which has a great historical diagram of what the racial classifications were and what ethnic groups fell under them over time.
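The timeline just described can be sketched as a small lookup. This is only an illustration in Python: the years and category names are the ones given in the talk, not checked against the questionnaires themselves, the function names are hypothetical, and the sketch tracks additions only, not renamed or dropped categories.

```python
# Categories the talk says were in use "until 1850".
BASE_CATEGORIES = ["White", "Non-white"]

# (census year, category added), as listed in the talk.
CATEGORY_ADDED = [
    (1860, "Native American"),
    (1870, "Chinese"),
    (1890, "Japanese"),
    (1920, "Filipino"),
    (1920, "Korean"),
    (1920, "Hindu"),
    (1930, "Mexican"),
]


def categories_available(year):
    """Categories that could be recorded in a given census year."""
    return BASE_CATEGORIES + [
        name for added, name in CATEGORY_ADDED if added <= year
    ]


def shared_categories(years):
    """Categories present in every listed year.

    Since this sketch only ever adds categories, that is simply
    the list for the earliest year.
    """
    return categories_available(min(years))


print(categories_available(1900))
print(shared_categories([1880, 1930]))
```

A lookup like this makes it easy to see that a series running from, say, 1880 to 1930 can only be compared on the categories that already existed in 1880.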
So when you see a lot of analysis, say before 1960, extending racial analysis over long periods of time, it is commingling self-reported race with surveyor-assessed race, and so there may be some significant measurement error in any sort of longitudinal analysis. So even allowing for multiple categories to be checked, as we now have it, the race measures that we use now are all self-reported. They do not easily map to nativity, and they can apply to very different groups, which vary by geographic concentration and other measures, which makes the empirical analysis quite difficult. So someone asked me recently about the controversy of, say, Rachel Dolezal, who considered herself black. In any survey, say a census survey, if she checks African American, there is no supplementary check of sort of correcting that, and that would be recorded as a black person, so anyone could be whatever they self-report. Nativity is supposed to be in the Hispanic ethnicity designation, but it's very much either you're in the box or out of the box, and there isn't any other way in these definitions. So it varies, because some of the definitions tie original peoples to a geographic area and some of the definitions do not. So these categories are certainly problematic, but they're also systematic, and so not having a racial categorization, even if it is flawed, might not necessarily be an improvement over what we currently have. So to do comparative analysis over time and between groups, it's important to have some sort of consistent measure of race that is in some way time invariant, so that people of a particular classification can always be noted as being such, right?
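One way to act on the measurement-error warning is to flag any time series that mixes the two ways race was recorded. The Python sketch below is a hedged illustration: the 1960 cutoff is an assumption taken from the talk's "before 1960" phrasing, and the function names are hypothetical.

```python
# Assumption: censuses before this year used surveyor-assessed race,
# per the talk's "before 1960" remark.
SELF_REPORT_START = 1960


def assessment_mode(census_year):
    """How race was recorded in a given census year (per the talk)."""
    if census_year >= SELF_REPORT_START:
        return "self-reported"
    return "surveyor-assessed"


def mixes_modes(census_years):
    """True if a series of years combines both recording modes."""
    return len({assessment_mode(year) for year in census_years}) > 1


print(mixes_modes([1940, 1950]))        # same mode throughout
print(mixes_modes([1950, 1970, 1990]))  # spans the change in mode
```

A check like this does not fix the measurement error; it only makes visible which comparisons cross the change in how race was recorded.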
So what happens with these definitions is that, like many things that come out of federal surveys, they're used in almost every other survey and nearly always mapped one to one. So we take our racial classifications in a lot of other data outside of the census using these census definitions of race. But imagine if we recorded race separately, or allowed for complete self-categorization of race. In other words, you left a race box completely blank and had people write in their own responses for race. It would really be very hard to have generalizable claims across disparate data sets, or even over time, about racial classification. So there are advantages and disadvantages to these definitions and it's important to note that. So what about race and ethnicity, right? Using these definitions obviously leads to several problems. A salient example is the discussion of research on the Hispanic or Latino population, which commonly, if you're on social media or even in the way that it's talked about in academia, is coded as a racial designation, but technically it's an ethnic designation. So when you're talking about Hispanic populations in analysis, you're typically talking about white Hispanics as a separate group. Typically in the data, not all Hispanics are lumped together if they have a non-white racial designation. So black Hispanics, for example, who might be someone, say, from a Caribbean island, would just be classified with all the other blacks, and that might be problematic for analyzing the Hispanic population. So it's always really important to think about and check how populations are defined in any racial analysis that you see. So how are blacks defined? That's a racial category, but that black group will include Hispanics and non-Hispanics. Are those actually separately broken out? Because some of the empirical trends that we might see may be explained by ethnic
differences as opposed to purely racial ones. So race and ethnicity are most often used in empirical analysis, and I'm speaking as a social scientist broadly, as explanatory variables. And so we are typically looking to do one of two things with a racial variable, conditional on these classifications. First, we're looking to see the average of some measure by race or ethnicity itself. In other words, we have some other outcome, some other Y, and we're simply looking to see what that Y-bar is equal to conditional on race, right? So if we're looking at the fraction, say, of African Americans who are unemployed, we're looking at, given a racial classification, the fraction of people who are unemployed. Or looking at wages for Asians, we now have all of the wages reported by those who self-report themselves to be Asian, and we take the average of those, and that's the average wage for those who are Asian. The other thing that we're interested in is typically the proportion of the population belonging to some category that is a particular race or ethnicity, right? So that would be, for example, the share of the school-aged population that is Native American, or the share of the school-aged population that is Asian. So that's then, conditional on some other variable, how is this distributed over racial classifications? And those are the two primary ways in which we use or think about race in empirical analysis. So even in more sophisticated analysis, race is still fundamentally telling us about the average for people in that existing classification. That's essentially all you can get out of these race variables. So if I'm running a linear regression model, and I have some outcome variable Y, and A is my intercept, and B1 is the coefficient on my race variable, and I discretely categorize it for all of these races, and I control for all of these other sorts of things that go into X, what I'm estimating with B1 is the average level effect, an intercept shifter, for the race variable
conditional on all the other controls that I'm looking at. And what this really tells me then is just a race-specific intercept. That's what I'm getting out of any race effect that I would see in any type of racial analysis, and I'm going to have a diagram that shows what I'm talking about in a second. This is still to this day the state of the art, basically, in race analysis. Any sort of paper I review, any other paper I see, papers I write, look at race in this way. It's just going to be essentially an intercept shifter. You could be a little bit more sophisticated by trying to think about the distribution. So what you do, instead of just having an intercept, is you might want to see whether there's a slope difference. I can now estimate this equation, and so this is really, really sophisticated race analysis. It's taking some other variable M and then interacting it with race to see not only do I have a race-specific intercept, but do I have differences in the slope by race itself, right? So what you're estimating here in B2 is the slope effect for a race, conditional once again on all my controls and allowing for a different intercept by race. Okay, so what would that look like in analysis? This is coming from the Raj Chetty project, which is looking at income mobility over time, and what's great here is we have these race-specific intercepts and these race-specific slopes, right? So the intercepts are, of course, different by race, but as you'll see in this example, Hispanics and whites have essentially the same intercept. They're blue and yellow, but they have different slopes. So if I was estimating this equation, I would see similar intercept effects for the white racial category, my white B1, and my Hispanic B1, but I would have differences in the slope, my B2s, for whites and Hispanics. And that is what racial analysis empirically is telling us: differences in intercepts and differences in slopes for members
of these racial categories as defined by OMB. So they're averages, right? The average is useful information. It's obviously the first summary statistic that we would want to turn to, but we might not only care about the average. Averages, of course, have lots of problems. They can be skewed by outliers, for example, and they might not necessarily get us the answers to the questions that we want answered. So we're not saying anything, typically, about the distribution itself by race and ethnicity when we do this sort of analysis, because we're simply looking at differences in slopes and differences in intercepts. The distribution could actually be far more important, depending on the question that you're asking, and we're not doing any distributional analysis with the way that we typically encode race in our empirical work. Differences in variance between racial and ethnic groups could be very important as well. So typically, if I'm running a wage equation, and I'll return to this when I talk about discrimination in a second, and I have a racial difference, what I'm looking at is the difference in the average wage, say, by race. What I might really care about is the black or white or Asian or Native Hawaiian or other Pacific Islander income distribution, which could be fundamentally different. Some of that will be captured by the average, but the average wouldn't tell me anything about, say, kurtosis in one distribution relative to another, whether one has really, really thick tails or really thin tails relative to another. And that might be really important for policy, particularly for policies that will target people at one end of the income distribution, which could of course vary by race. So by definition, the average is estimated over everyone that's assigned to this category, and that's important because the group difference effect is literally the unweighted average over all of the members of the group. And so when we're doing this racial analysis, we really are counting every
person in there sort of equally, unless you have data that uses population weights, but all those are trying to do is get you to count everyone in the data as they exist in the population, and that may or may not be what you'd actually want an answer to. And so this is an aggregation of individuals, and we typically interpret this as a group effect, but is it actually a group effect? I pose that to the group because I'm still thinking about whether it actually is a group effect or not. It is in the statistical sense, but when we start to interpret this in terms of policy, it may or may not be appropriate to think of it in that way. So what is missing in these racial categories, now that we've sort of talked about how we define them? They don't tell us very much, or anything at all, about heterogeneity in these racial experiences or classifications, and especially what I mean by that is heterogeneity among those who would have the same racial classification or self-report, right? So all white people are lumped together, and as you saw from the definition of this white category, it includes people who are from Europe, it includes people who are from North Africa, and it includes people who are from the Middle East, right? So they are all together in the white category, but I would argue that they would have very different experiences as white people, right, conditional, say, on the question that we're asking. They would even have, for example, different phenotypes, or appear differently in the population. They might be assumed to have different interactions, say, with law enforcement, conditional on their skin shade, and the skin shade in that category will vary considerably, right? We'll have a lot of heterogeneity among all of the people who are white by this census classification. None of that is reflected in these sorts of averages over all of the white people, and so that heterogeneity could speak to significant differences for members of
the same racial category, and that might be important for policy discussions, because what we might want to do is think about these racial differences and then think about the heterogeneity in these racial differences as well, which gets back to this issue about the distributions among those who are in the same racial classification. So the real problem with these race variables is that they're either on or off, right? You're either self-reporting in the racial classification or you're not. And so it's very difficult many times to think about cumulative effects, even in panel data that follows the same person over time, because racial categories can carry different and salient experiences at different points in the life cycle. So if where people end up at the time of the survey is a multi-generational process itself, or if it's a cumulative process, we may miss some of those critical windows when race was actually really important and could be something that we want to identify and have policy action on, and we will miss that if we're just looking at the outcome now. Say, in the labor market, the experience of someone is clearly, from any basic economic model, a cumulative experience, right, of everything from conception and their fetal origins, which might give them differences in human capital, all the way through their schooling experiences, parental resources, etc. And now I'm measuring everything at the time of the labor market, and those differences that might manifest themselves as a racial effect could actually be a cumulative effect, where race is salient at some points more than at other points, and where policy could have been particularly useful in ameliorating any of those racial inequalities that we see. So one way of advancing beyond the current definitions is to use subcategories to expand the analysis, and that gets us closer to the sources of this heterogeneity, right? So in other words, I would like to break these categories of white and black and
Asian apart, because that would help me to say something more about what is going on and what would be appropriate and important. This, though, requires two things, I think. Number one, it requires more data, purely to do that sort of analysis. But the second is you need to have some historical and contemporary knowledge about the racial formations and how they've changed over time and space, right? So it's very important that before you think about moving and advancing in this area, you understand the population that you're trying to do some analysis for. So here's one example, and this example is just coming from me as someone native to the Twin Cities in Minnesota. If I was going to analyze Asians in the Minneapolis-St. Paul metropolitan statistical area, that would almost be useless for me to analyze, because it contains a wide, heterogeneous group. A significant portion of that group will be Hmong. We have one of the largest Hmong populations in the United States in the Twin Cities, and this Hmong population is quite distinct from others who would also have this Asian classification. Researchers found very high poverty rates among the Hmong population specifically in the Minneapolis-St.
Paul metropolitan statistical area. Around a third of the Hmong in the most recent census data are found to be in poverty, and that's two times the national average, which itself is higher than the Asian national average for those in poverty. So taking Asian in the census definition would combine Hmong immigrants to the United States, who came from Laos under very different historical and political circumstances, with those who are of South Asian and East Asian descent. All of these are Asian by the census definitions, but they have very heterogeneous immigration experiences in and of themselves, which may lead to differences in their rates of poverty, human capital attainment, et cetera, that we would see in census data. So for the Asian classification in this particular MSA, I would need more information to say something. What you could end up with is essentially a spurious result, where you might say Asians do very well, say, in the Minneapolis-St. Paul MSA, but that's ignoring a very large fraction who may be particularly impoverished relative to other Asians. So this is not comparing them to any other group, but simply comparing them to other Asians in the same group. And so recent research, for example on racial wealth inequality, has investigated the ways in which classic racial definitions can obscure these differences within races. Another example of this, going back to this Asian racial category: the Asian population has high degrees of geographic dissimilarity over particular cities, and so if Asians are geographically concentrated in areas where, for example, housing prices are relatively high, they will appear to be wealthier simply for geographic reasons, and not for other reasons that we might want to explain, say, in economic models of wealth formation. So you'd want to take into account these geographic distributions, particularly when you're looking at national data, to factor them out in explaining these racial differences, because not all racial groups
are in all parts of the United States in equal proportions. And another example is that recent African immigrants may have better or worse economic outcomes than other blacks, but this will be hidden by analysis that just looks at everyone in that category as all being black. The National Asset Scorecard for Communities of Color was an ambitious project which sought to analyze racial wealth disparities within and between racial groups in a number of American cities. I'm going to go through some of the slides of this project because you're going to see a lot of heterogeneity within racial classifications. The reason that project is still important in informing our analysis of racial wealth inequality is not necessarily the analysis between races, which we have from other standard sources, but the significant amount of heterogeneity within races, which is important, and which also varies over cities. DC comes up in this example as well. So if you look at this from the SIPP data, and I'm using this by permission, I want to be very clear, this is all from Sandy Darity and Darrick Hamilton, and I have their permission to use this. I want to assure you this is all coming from SIPP data. The SIPP data has racial classifications similar to the census racial classifications that we were just talking about. If you look at median net worth and the relative holdings of these racial and ethnic classifications relative to whites, this is what the national data would show you in the Survey of Income and Program Participation: for a dollar of white wealth, the average black person has nine cents, right, in 2005, and in 2011 has six cents. The average Asian had a dollar and 24 cents for every dollar of white wealth, but by 2011 had 83 cents. And for Hispanics, it went from 14 cents down to 7 cents. But if you go and look at the ancestral origin distributions, which is what they were using, within these racial classifications they were
breaking apart these sub-categories. So blacks became not just this black category but now became much more specific. You could be a U.S. black descendant; you could be a Caribbean black, which is someone who is native to an island in the Caribbean; you could be Cape Verdean, which is particular to the Boston area, which has a very large Cape Verdean population; or an African black, which would be a black African immigrant. Those sub-categories are going to be important because they're going to look at this racial wealth data within those sub-categories and also vary them over cities in the United States. They do not break up the white category. That is by design, because they're looking at this among communities of color relative to the white population. But you could similarly divide the white population in subsequent work into those who are from the Middle East or North Africa as opposed to those who would have ancestral origins in Europe, to see if there are differences in wealth there for that group. If you look at the Asian sub-total, we're now breaking down all of these groups into Chinese, Japanese, Korean, Filipino, Vietnamese, Asian Indian, or other Asian. And the Latino designation is now broken into Mexican, Cuban, Puerto Rican, Dominican, South American Latino, Central American Latino excluding the Mexican category, or other Latino. And the Native American sub-total is those who have tribal enrollment and those who don't. So what is this data telling us? If you're looking at the median household income for these groups within these racial categories, there's a significant amount of heterogeneity. So when you get to, for example, Boston, U.S. black descendants have significantly lower household income than those who are Caribbean black in the same metropolitan statistical area. And similarly, you'll see that those who are South American Latino have much more than those who are Cuban in Miami. So all of this would be collapsed into the average, right, if we're just
looking at this by racial or ethnic classification, but this heterogeneity is obviously telling us something very important. If you're looking, for example, at the earnings of those who have tribal origins, and you're looking in Tulsa, those who have Cherokee tribal enrollment earn significantly more than those who do not have a formal tribal affiliation. And then if you look at this heterogeneity in Los Angeles, which has a large Asian population, broken down by ethnicity, there's a huge difference between Asian Indians, those from the Indian subcontinent, and, for example, those who are Korean, and all of this would be subsumed or assumed away in the average of these groups. This is important when you look even at things like home ownership, right? Once again, there are significant differences for U.S. black descendants versus those who are Caribbean black in home ownership in Miami, for example, or even in the Boston area, and even in DC, where these flip: relative to African blacks, the home ownership rate of U.S. black descendants in the Washington, DC area is significantly higher. If you're looking at the home ownership rates among the racial and ethnic groups, it's actually lower in the Los Angeles area for those who are Asian Indian and much higher for those who are Chinese. And then when you get to wealth, there are these really interesting results. Probably the most salient punchline from this study is that the average black U.S. descendant has $8 of wealth in Boston, whereas those who are Caribbean black have around $18,000. That's a significant wealth difference within the same racial category. And so having some knowledge about the racial formation in the Boston area gets us much better information about racial wealth inequality, but also intra-racial wealth inequality itself. Similarly, you'll see significant differences in Miami for those who are Cuban, who have around $22,000 of wealth, versus those who are South American Latino, who
have $1,200 of wealth. So all of these would fall into the white Hispanic or Hispanic category that you would see in the analysis. And there are significant differences in wealth: Koreans, for example, in Los Angeles do significantly worse than Chinese or Asian Indians. A big part of this difference, if you look at these wealth numbers here for these Asian groups, and you see that they're not as populous in these other MSAs, which don't have data on them, will be in areas that have, of course, high housing prices. So a lot of this wealth is tied to real estate markets at very high prices, both in Los Angeles and, I don't have to tell this audience about high housing prices, in DC. So this example really shows you that there are many instances in which social science theory really does not square with the empirical analysis of race. In economics, and I'm just going to go there because this is what I know best, traditional models are egocentric. Although economists do a lot of racial analysis, there's nothing theoretically that we have in economics that really is a theory of race. It's sort of tacked on in empirical work, but there's very little work theoretically about members of racial groups. If you take a standard microeconomic theory course, you're not going to see any model written down about race. It just isn't there. People have preferences, they have preferences between different groups, but everybody is their own little person running around as their own little atom. They might form households and do some bargaining within those households, but those households themselves don't have any racial classifications, right? Analysis, though, proceeds, and this happens not only in economics. I don't want to pick on economics. Sociologists do this as well, political scientists do this, other social scientists do this, even though there's been very little theoretical development about what that race variable would
actually mean. This influences, though, how we interpret those racial variables. I'll talk about that in a second. So what can we do about these conceptual issues? Go back to those regressions and those coefficients that I was talking about. Typically we're looking at individual outcomes, so say that this Y was earnings and this was my regression, right? So how should we think about this beta-one coefficient? If I'm looking at the earnings of individuals, and I'm taking together all of the individuals who have a specific racial classification and looking at their average level difference relative to all of the other groups, what does it actually mean? Is this for the average black person, even though we have these controls? And then what does that actually tell us about race? Economists have interpreted these coefficients, these beta ones, very differently over time, and it's changed without any accompanying theoretical development, unfortunately. In most traditional economic analysis, theory would predict that the beta-one coefficient should be close to zero, right? In other words, once I've controlled for all of the choice variables and inputs, say, into a human capital production function, that would explain all the differences in wages that I would observe, and there wouldn't be any residual racial effect itself. And it really should be that way, because race isn't even in the model that you're writing down, so it had better be equal to zero, so you can just sort of wave your hand at it. Yet I have yet to see any empirical analysis where that race effect is actually zero. So you would think the first thing we would do is knock on the door of the theorist down the hall and say, can you get to work on this race thing? And they're too busy playing games, literally. So we have very little theory to tell us why this egocentric framework is going to work to explain these group differences
in outcomes, right? So how is this usually interpreted? A typical, sort of old-school way of detecting discrimination would be to take the beta-one coefficient as evidence of discrimination. In other words, I've now looked at all of the measurable human capital inputs, and if this beta one, say for those who are Asian, is negative, that would be evidence of a wage penalty for Asians, which is most consistent with discrimination, because all of these other observables are controlled for, so there's something different here. The controls themselves are the choices that people would actually make, and the characteristics that should not be related to labor market outcomes shouldn't matter to labor market outcomes if there's no discrimination in the market. So that beta one should be zero, and if it's not zero, that is, if it's statistically distinguishable from zero, then I have some evidence of labor market discrimination. Now, if you regress, for example, wages on age, sex, education, work experience, geographic location, controls for unemployment rates, controls for the quality of schooling, etc., a kitchen-sink regression, I should not be able to detect any difference for race by itself, right? It should not have an influence on wages that is independent of all of these other effects. These have fallen out of favor almost completely in empirical analysis, at least in economics, because we now believe that there are many factors related to wages, which would be in any theoretical model someone's marginal productivity, that are not included in those regressions because they're not observed by the econometrician, right? And the thought is that these omitted factors could be correlated with race or ethnicity and drive that level effect that we're seeing by race or ethnicity. So these could be non-cognitive skills, for example, someone's teamwork, their persistence, their drive, their independence, their personality, etc., or omitted skills, so cognitive skills, schooling, education quality,
communication skills, etc. There's a problem with this, because we're running individual-level regressions, and what you then have to argue is that these non-cognitive or cognitive skills vary by group, because you've estimated these with an individual regression and you're looking at a group effect. So the only way that these omitted factors can work to explain these racial differences is if they are correlated with the group itself. That's extremely important, and that has to be true statistically. But wouldn't you think that if there were differences in, for example, school quality by race, we would want to talk about differences in school quality between races, even in these egocentric models, in explaining this difference by race? So we have a difficult time in economics thinking about racial inequality, which should explain some of these unobserved factors themselves. What we would think about doing is parsing out, say, that B1 coefficient into factors that are related to racial inequality, which give rise to racial inequality, and those which are not. That typically is not the way economists proceed, because they'd like for the market to have none of these structural factors. The market should compete them all away. So this example from discrimination shows that the issue hinges on how we interpret this race coefficient. Is it racial inequality, is it discrimination, or is it actually some omitted factor? It could be seen as evidence of discrimination or disparate impact, or it could reflect those omitted factors, but those factors would have to be correlated with race or ethnicity to drive the result. So let's think about these effects that vary by race. Typically we do not talk about the fact that whatever is omitted, and whatever one would want to critique about these models, itself has to be strongly correlated with race or ethnicity to drive the analysis. So that would be my take-home message, at least from this part of the talk: that has to be part of the way that we
discuss these empirical results. So what about some deeper conceptual issues? One new approach has been to think of these outcomes by race not as the average of all the individuals who belong to a specific racial category, but as the outcome of a process that is group-specific by nature. So thinking about and putting people theoretically into groups, in a way that would marry the way that we do empirical analysis with the theoretical development that we should use for that. In economics this is referred to as stratification economics. It's not a coincidence that I was using the work of William Darity and Darrick Hamilton, who really pioneered stratification economics. It seeks to look at racial and ethnic outcomes as arising from group conflict over resources, and that squares the circle that we currently have as a problem in empirical economics, because it now explicitly gives a role for racial inequality to be related to these omitted factors that we think drive the race coefficient. In stratification economics they exist because race is a salient point of policy and resource allocation in the way that we make decisions about policy in the United States. So the reason why the race variable matters is because race is important for how we divide resources and how we develop policy currently. This helps guide and motivate the group analysis, because now it gives you a reason why you might expect a group effect even if you have all of these individual controls in your empirical analysis. So thinking along the lines of groups, and groups competing over resources, resolves two issues in the racial analysis, right? It gives you differences in the cumulative effects that one might expect, which may play out at different stages of the life cycle, and then we can begin discussing policy that has disparate impact by group, rather than the outcome being an average due to choices made by all of
the members of the group themselves, right? When you think about the way that economists would typically interpret these things, everyone in the group is making these similar decisions, but we really don't have any theoretical way of thinking about why they'd be in this group and all make these similar decisions. If there were all of these different returns, for example, in the market, they probably shouldn't exist. But in stratification economics, the group actually is driving a key source of those differences. And so this offers us a new way of thinking about the dynamics, and frankly the structural impact, of a lot of different outcomes, whether we're thinking about segregation, or physical separation itself, both over place and over time, and I've done a lot of work now that looks at historical segregation having contemporary impacts, and who was moving and aggregating and what that actually means; discrimination, which is about power between groups and control of resources; or resource allocation itself, the distribution of resources. There's some exciting work now in political science which looks at policy that becomes coded by race, and how political support for policies becomes very different when it's coded by race or not. A co-author of mine has a great paper that shows it is not the case, for example, that the South has strong preferences for low levels of resource allocation, say for aid to mothers and children. For example, Southern states were quite generous in their Confederate widows' pensions, because those are race-specific. There are no black Confederate widows, and so you can have a very generous welfare benefit as long as it is racially exclusive. That's an important way of thinking about how resource allocation might map to race and give rise to these racial differences. Then investment, and this could be community-level investment given geographic segregation in areas, redlining, and there's a lot of research now on redlining, for
example. Then returns on investment: given that there might be differences in market returns to human capital investment that would vary by race, optimal decisions by those in different racial categories should be different. And then thinking about inequality more broadly will be one more way of thinking about these differences. So, questions to ask, and I will stop after this slide. I think I'm the last thing standing between everyone and lunch. Okay, so you're getting ready to eat. The first question to ask is how race is defined in this discussion. Whenever you see any racial analysis that is out there, the first thing to ask, given these definitions, is how are they defining race? It might seem like an odd question, but you really want to ask: who are the white people, who are the black people, who are the Asians, who is Hispanic, because that would be an ethnic designation, and what races are they including in that Hispanic designation, who are the Pacific Islanders, etc. And then is there any thinking about heterogeneity within that racial classification, or are they only reporting average differences between members of the groups? Typically they're reporting average differences between members of the groups, which is fine, but a question that you might want to ask for policy is about the underlying variance or the underlying distribution within those racial categories themselves. And next, are we assuming a group framework or basically an average-of-individuals framework, right? You shouldn't run any empirical analysis without some idea or some theoretical model behind it. Are you running an egocentric model? When economists talk about this, it's almost always going to be an egocentric model. Others are thinking about relations between groups, and it's important to think about those frameworks specifically as we're thinking about policy and the role for policy. And then how do the answers to those questions inform the interpretation that we
would have of the results? So: who are the people defined in this group, and how are they defining group? Are we thinking about heterogeneity within that group, are we thinking about inequality? Are we assuming an ego-centric model or a group model? And then how does that actually help inform the interpretation that I'm giving to whatever averages might be reported in the data? We have definitely been given a lot of food for thought. As we're chewing on that, hopefully we have some good things for you to chew on in reality. We're going to take a break. Are we good, Nina? Are we good? Good. Okay, good. Before I send you out there, I want to make sure there's something there. We're going to take a break for everyone to grab lunch. Again, as with the last time, I would like to give our speakers a chance to go and get served first so that they can complete that before we start with the Q&A. But we're going to take a break now. We should resume about 10 minutes before one to start with the Q&A, if you have specific questions or comments from either of our presentations this morning. And then after that we will get into the more practical part of the workshop, where we look at some specific examples and work through them. I think we're going to work through that together, actually. Yeah, okay. So we will break and let our speakers step out and get something. If you need a restroom, you can go out this door, the glass door, then go out the white door, and the restrooms are right down that hall. So we'll meet back here, ready to start the next part of the program, at 12:50. Thanks. So there are two microphones. I have one, Jacoba has the other. So if you have a question, raise your hand, or a comment even, raise your hand and we will get you the microphone. So: questions, thoughts, feedback on what we heard? I know, Trevon, you said you weren't going to talk about econometrics. I'll still ask you an econometrics question, because I totally buy what you're saying: a lot of research has not thought theoretically about these things. But there
are some methods that could get at it. So maybe the solution is just to have best practices: don't just control for race; do decompositions, or run regressions separately by race, so you can look at the differences by race. So is that part of the solution? And then I have a second, really specific question about not just looking at how much worse non-white people are doing, but also looking at a premium that white people receive. I know it's a little bit problematic, because you would compare it to, like, an average for the whole population, but I'm just curious about your thoughts on a method like that. So I'll respond to them in order. On the first question, I do think that some best practices exist. This talk isn't specifically about gender, but I think this also includes bringing gender into this as well, so that we're not thinking about all blacks, and we want to think about gender dimensions within races as another source of heterogeneity. And so if you go back to even this Chetty mobility work, we find significant differences in mobility between black men and black women, and we would never know that if we were just looking at black mobility overall. So this average for all black people obscures a big difference between two different groups of black people, and that has now become an issue of policy importance. Doing that sort of subsequent analysis is very significant. But I also think there's a role for qualitative research in economics with respect to race as well. That's not a component that we've thought of a lot as economists. So one standard question, and we were discussing this at lunch, is: what do people think when they're checking the box for race, and how are they defining race? Rarely do we see this extensive definition when you check a box on a survey, but certainly you've responded recently, probably, to a Qualtrics survey or some other sort of survey that's asked you your race or
ethnicity. And what were you thinking when you checked the box? Does the box that you checked map into these definitions, or maybe it doesn't? And so the first thing to understand is: do people really understand race the way that we've defined it administratively, to the point where we can actually use it in data? Your second question was about... oh yeah, yeah. So I think when we go back to this regression construction, we sort of think about differences between averages, and this goes to the interpretation of those results. So typically, say you see a negative coefficient, say, on Hispanics, however they've been defined, and we think of that as a penalty. But this depends on the omitted group that you're leaving out, and it is equally valid statistically to say that there's a premium for the other groups. And so I think, in terms of a policy discussion, rather than posing all of these as penalties for being, say, black or Hispanic or Asian, thinking about premiums to being white or male might be another way of phrasing those discussions that would have, I think, significant policy importance. Other questions? This is a very practical question. We constantly come up against problems of sample size, you know, even in basic occupational analysis, and I was looking at your wealth data and thinking, oh, where does he get the sample from? But, you know, what are the ways to handle that? Okay, you can interpret that and say, okay, we have Asians, and this includes a broader variety, but it doesn't really add that much to knowing what happens to Asians. So what strategies do people use around this, apart from pushing data sets for new sample sizes? Well, I would say to this, and certainly about the data that I showed from the project with Sandy and Darrick, the first thing they would tell me is that it's expensive. Collecting this sort of data is incredibly expensive because it is a survey that has to go out into the field, and so most people involved in policy analysis simply don't have the resources to do these
sorts of surveys. And then it's very difficult, because you'd have a small sample size, and what is the extent to which you could extrapolate from that? Even with this National Asset Scorecard, to what extent can we extrapolate from that to a nationally representative sample? And this really is where I think the policy community in general needs to push on the federal government, which is really the largest mover in this area and actually has the resources to collect this information in a wide number of administrative data sets that would allow us to see if these differences are there. You do have a power problem with small sample sizes for any sort of analysis, and so you really do need to either combine this with other data that may have been gathered in similar ways, or expand research networks so that we can have this data systematically gathered over a large number of different surveys; that will be one way of getting the scale that you need to have this operate. But another is simply petitioning the Census Bureau and other federal data collectors to have this administrative data collected in their samples. I'm thinking particularly in terms of education. This would be very, very valuable information to have, because of the statistics that we gather continuously and annually on education. If we had those broken out by much finer segments of the population, we would have sufficient sample sizes in those data to be able to say something significant. So if I could just add on to that: I'm Kilolo Kijakazi, and I funded the National Asset Scorecard study when I was at the Ford Foundation. It is expensive, but we were determined to try to get this disaggregation by not only race and ethnicity but also national origin and tribal affiliation. It's expensive, but it's also difficult to find the sample. So there really was a process of creating a method for identifying the sample members, by geographic area, by cell phone, regular phone, just the whole range of approaches
that were used in order to even just get the communities that we wanted in sufficient sizes. We also worked with the Federal Reserve Board in trying to get them to include within their surveys modules from the National Asset Scorecard survey, because I agree this really should be done at a federal level in order to get the sample size that you need, but also the frequency in terms of the collection. So I really appreciated your comments about not only disaggregating race categories but also about the potential difference that race makes over the course of a life cycle, so I'm going to situate this question in that area. We know, based on a lot of the work that Kilolo funded, that there's a huge racial wealth gap, and normally the way we think about that is at a point in time, where we compare black wealth to white wealth, or Hispanic wealth to white wealth. When we make those comparisons, we always have a sort of running commentary, which is that, well, we know the historical discrimination against black people partly accounts for these differentials in wealth. So my question is, when we think about comparing wealth, and we think about the desynchronous nature of wealth accumulation in certain communities, right? So at some point you might have held wealth, then it was taken away from you, then you rebuilt it, then it was taken away from you, et cetera. We could say the same thing for Native American communities. And then we think about how, in some white families and in some white groups, that wealth has been held for generations and accumulated over time. Okay, so in that scenario, I just want you to comment generally on how you think about that theoretically, because the way we think about it now is we do the measurement and then we have the running commentary, right? So that means that the measurement isn't theoretically grounded. So I just want you to kind of comment on that. There's a new round of research in economic
history which has started to look at some of this. So some recent work by Hoyt Bleakley and others has looked at Georgia's land lottery in the early 19th century, which was land from the Cherokee and other nations, given by treaty to the United States and then literally lotteried off to white males. They look at the effects on those who won the lottery and those who didn't, and there weren't these large ratios; there weren't any differences between those who won the lottery and those who didn't. Leah Boustan and some co-authors have looked at white wealth before and after the Civil War with linked data and find no differences for those who held many slaves, which of course were a source of wealth before emancipation, and those who come after. So that work has found no long-standing effects of these wealth distributions, which you can use to argue that we shouldn't have reparations, or that we shouldn't think about wealth as being something that has, as you said in this running commentary, these sort of cumulative effects. One critique of that work is that behind all of that are a couple of other things that I think we have to bring into our analysis. First, the federal government historically engaged in huge wealth redistribution efforts that were racially restrictive. So you could just squat and just get a ton of land, right, if you were of the right race historically, and that becomes a store of wealth that you can use for many different purposes. And wealth actually is something that allows you to have other opportunities, which we know are important. So I think a part of this is that the running commentary is largely incomplete, because it's looking at accumulation of resources, but it's not looking directly at the role of policy in allowing those groups to accumulate resources, but also to protect the resources that they've accumulated historically. So one of the reasons we might find there being very little effect of destruction of wealth, say after the Civil War, is that the political
institutions don't transform, or permanently transform, which is one thing that keeps those in a sort of oligarchy in power through the political system as well. So I think we always have thought about this as wealth separate from the political process, but I think they're actually much more engaged with each other historically. So to do that analysis and think about these over a long time period, you have to bring in a role for active federal intervention in the transmission of wealth, which has happened historically, primarily for white Americans, and then think secondly about the ways in which political institutions protect that wealth or receive and allow further transfers of wealth. I think it's safe to say that most of us who do policy research in Washington have backgrounds in social science, and I was struck, Sarah, when you were speaking, by how important historical research seemed to be in informing what you were doing. And I think that's also true, obviously, of so much of the work that you've done, Trevon, which has a historical background, a historical connection. I'm just wondering if either of you, or both of you, have some comments on guidelines or suggestions for how public policy researchers can bring in historical context a little bit better. That's a great question. So first of all, I wouldn't strictly call myself a policy researcher, and so I'm in this position, I think, of not knowing a lot of what you do on a day-to-day basis. I mostly take your results and apply them to the field. So, that said, I think this was actually going to be a topic of a little bit of an exercise that I was going to have folks here do. I think people often don't think to look at history. It's not automatic. Every year the newest round of census data, or other data, comes out, and I know Mass Budget, for example, releases an analysis every year of the most recent data on workers in our state, and we just have
a little bit of a tunnel vision about what's happening right now. And so it is a little bit of a practice of discipline and grounding to even remember that everything that's happening right now is a product of so many things that have happened before, and to seek out that information. And, you know, this is not something that all organizations in my field remember to do either. So I think it's just a practice that we have to build much more over time. Great. I think, to add to that, there is one thing that we can do to bring in some historical analysis, and this is something we currently see, I think, in a lot of policy debates. We think about policy analysis as giving us very precise estimates, if it's done correctly, if we use difference-in-differences or some other approach to identifying the causal effect of something. But many things that are very interesting to us about historical trends are the trends themselves. As much as economists believe that they have the best techniques, none of us are working with the best data, and so we have to think very hard about that. You know, we just went over these racial categories, and we've been using these in our work for many years, but we really haven't thought about them deeply. So we're using racial analysis, we have trends in racial data, but have we really thought about what they mean, conditional on these definitions, which may change over time? So I think we have to bring history in to inform the way that we think about measurement more generally, which is quite important for policy: what questions we're asking, how people are responding to those questions, and what they think that they're responding to, versus how we use and interpret the results. That itself changes over time, because these definitions change as society changes. And our data is always, in some sense, backward-looking. It's reactionary. The data that we collect is always in reaction to something
that is going on, or a definition that has changed, right? The way that we define families has changed as families change, right? So it's always something that is ex post. So we have to then think about all of the things that we'd like to put into a long time span as being conditional on changes that are happening on the ground. So how do we actually anticipate those, and how do those actually inform policy, I think, is important. Chye-Ching from the Center on Budget. Thanks so much for both of your presentations. All the way back here. In terms of trying to push federal agencies further on data collection and disaggregation and more enriched data, do you see any sort of positive opportunities that are immediate? Because, like in a lot of spaces, everything feels quite defensive right now: the census, protecting it from the immigration question; funding cuts across a lot of different agencies; and now some concerns about actually reduced access to microdata because of privacy concerns. Are there any positive things that you can suggest that we focus on or engage in, or is it mostly just building for the long term? Are you asking an economist to be optimistic? 
To go back to a point that was raised earlier: they're not Obi-Wan Kenobi, but really our last hope at this time would be the Federal Reserve, simply because they have resources, significant resources. They're a bit constrained in the questions that they might ask, but they might be very important, and they would have the resources to underwrite this if the board could be convinced that this data is very important, because they have more control over their resources than other institutions. And I do agree that many of them are under assault just to collect the basic information that they do. So this is not the time to, say, turn to the Bureau of Labor Statistics for new data collection efforts, when many of their existing data collection efforts are being subjected to funding cuts. So it is very difficult to get that data. Some states, however, are much more amenable to collecting this data or allowing this data to be shared. States have a wide variety of administrative data, and some states now are sharing more of their administrative data than ever before. So we may not have to necessarily reinvent the wheel as much as form cooperative efforts to dig around and resurrect the wheel, or construct it from the parts that currently exist. And so there are some great resources in states that are much more open to researchers working with de-identified individual-level data that might have some of those possibilities. But the larger the state and the more heterogeneous, the better it is going to be, so I wouldn't necessarily put states that are relatively homogeneous at the top of the list. It is sort of a perfect storm, in that states that are more heterogeneous tend to have larger data collection efforts but are also much more willing to let social scientists in the policy community analyze that data. I am actually interested in speculating on that question from a campaign perspective. 
I am not familiar with the existing campaigns around data availability and data collection and usage, but it strikes me that one very good angle that could be a unifying campaign motivation is this: at the same time that we are losing the capacity to collect more and more meaningful data about ourselves that is publicly available, and therefore available for use for the public good, there is an increasing amount of data that is being collected on us, against our will in some cases or without our knowledge, and is being deployed in many ways against us, to extract our consumer dollars or to discriminate against us in various ways. So that is something that, among the different things people are struggling with right now, might not float to the top, but there is a way, I think, to tell that story that could be very motivational and maybe build some movement around it. I will take one more question and then we will get to our next... Okay, I hope it is a good question. I am nervous now. Courtney Sanders. My question is actually on the point you were making about homogeneous states. I work with states like Maine and Nebraska, and they always tell me, well, Courtney, we have this black population, but we don't want to just say it is black people, because we know there are a lot of immigrants in this population, all these different things, and it is not statistically significant. And I was hoping from this conversation that we could talk a little bit more about how we push back on that statistical significance piece, on why it is important to talk about these very, very small 2%, 3% populations of color in these very homogeneous states, and how we do that. And I really like that you didn't use as many data points as possible, but I understood your story, and I understood the course of the story through the historical lens, and obviously that is something we offer. But if you had any comment on that, I would appreciate it. Could you repeat the
question part of that? Sorry. And it is for both of you: how do we communicate and talk about the data in states that might be homogeneous, like all-white states, Maine and Nebraska? They have very small populations of color, but we know that the programs and services that the state wants to provide are trying to be targeted to those populations, even though they are so small. I would say, I am from Massachusetts, so this is not my experience, but I would speculate that in that case, if part of your premise is that the data simply isn't there or isn't really credible because of sample sizes, that is a very good opportunity for, you know, grassroots kinds of organizing and research activities themselves. So one mode is to rely on authoritative institutions to tell stories about us, and the other mode is for us to be able to tell our own stories. And this is where participatory action research comes into play, which is not something that I suggest lightly, because it is very expensive and very hard to do well. But in this process, people who are affected by this particular issue come together to gather data about themselves and their own circumstances, and they do so with the support and guidance of people who are trained in research methods, so as to lend it some credibility for external observers. And then the action part of it comes in the fact that, as actors in their own lives, they are in a position to act immediately upon seeing this information that they are producing themselves, thereby setting themselves up to be able to measure their circumstances again after the action and then proceed from there. So, you know, communities can be organized to generate their own data, right? And there certainly might be some people who will pooh-pooh it because it isn't sort of, you know, official authoritative data,
but if it is information and it is actionable and relevant, then, you know, that would be my recommendation. I think, to piggyback off what you're saying, if the group is that small, they should be highly likely to be targetable. You could really target them, because you'd be able to count all of them, right? If the numbers are really, truly finite, then you really should be able to target them. And I think that one of the things that you might hear is really this conflation of statistical significance with general importance, and those are two very different sorts of things. So in a place that is relatively small, you won't have a large n, but that is the n that exists, right? Statistical significance only matters if you're running a statistical test, and so if you're thinking about policy, it might not have any say in the policy discussion. If you want to target, say, an immigrant population in Nebraska, and you're thinking about the need for specific resources, it wouldn't matter. So I think that we are sort of confusing statistical power for measuring effects with targeting policies at a particular population. On the one hand, to measure effects, it's actually a disadvantage to have a small n, but to actually target a population, to be effective in terms of reaching that population, a small n is actually to your advantage. And so this is where policy actually has two different areas of need: if you know who you want to reach, you can reach them relatively easily. It doesn't necessarily mean you're going to have a measurable effect in terms of the actual outcome. But if you're actually able to meet those people, as you're saying, they generate their own data, which can be very useful for a targeted population. 
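The contrast drawn here between statistical power and policy targeting can be illustrated with a rough calculation. The sketch below is not from the talk: it assumes a hypothetical standardized gap of 0.3 between two groups and uses a normal approximation to a two-sided, two-sample test, and `approx_power` is an illustrative helper name. It shows how the chance of detecting the same underlying gap falls as the number of people per group shrinks, even though the gap itself is unchanged.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is a standardized mean difference between two groups
    (a hypothetical value chosen for illustration, not an estimate
    from any data discussed in the talk).
    """
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the gap is real.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of landing in either rejection tail.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


if __name__ == "__main__":
    # Same hypothetical gap, different group sizes.
    for n in (20, 50, 200, 1000):
        print(f"n per group = {n:4d}  approx. power = {approx_power(0.3, n):.2f}")
```

Under these assumptions, power is low at 20 people per group and close to 1 at 1,000, which is the sense in which a small n limits measurement without making the population any less important to reach.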
Thanks. So now we will move into the part of our workshop where we try to apply some of the things that we've talked about and heard about today. Specifically, I'm sure that many of us, maybe all of us, are engaged in social media in some way or another and see these various conversations and threads that come up specifically related to race. Some may be useful, others maybe not so much. But we had a few examples of things that we wanted to share, and we'll work through these specific questions that Trevon identified. So we're going to do that for part of the time, and then the other part of the time we're going to have Sarah lead us in thinking about how we as researchers in the Washington, DC community can help to compile and analyze data and information in a way that is more directly useful and applicable to various campaigns and grassroots organizations across the country. So we're going to try to get both of those things done in our remaining time, and given that, I'm going to be quiet and turn it over to our facilitators. So I'll give you a moment to write these questions down, because we're going to work through just some examples as an open-ended discussion, answering each of these questions for some things that we've captured specifically from Twitter, and think about our answers to these four questions. So the first is: how is race defined? We're going to see these tweets where people are talking about an issue that certainly has a racial component, but how is race defined? Is there any thinking about heterogeneity within the racial classification? Are we assuming a group or an average of individuals? Is what's implicitly behind this some sort of ego-centric model, or are we thinking about group relations? And then how does that inform the interpretation we would have about this outcome? So we'll leave that up for another minute before we get to the slides. Everyone ready? 
Okay, so Marianne Williamson said: if you did the math of the 40 acres and a mule, given that there were 4 to 5 million slaves at the end of the Civil War, and they were all promised 40 acres and a mule for every family of four, if you did the math today it would be trillions of dollars. Reparations, big. So who is she talking about in this tweet? Black people. But now we go back to our census definition of black people. She's not just talking about black people, right? So when we think about, say, the work that Darity and Hamilton were doing, she's putting this in the U.S. descendants category strictly, right? But that certainly is not all black people. Is that heterogeneity in this black population taken into account here in this discussion? She's talking about this dollar bill, so she is saying all black people are equivalent here in this discussion about reparations. And third, is this ego-centric, or is this actually a group? She's certainly applying a group analysis to them, because she's putting them all in this group. Now, she's putting people who are the descendants of slaves in a group with people who might be very recent immigrants to the United States. So then how does that inform the interpretation that we're giving to how she's structuring her argument about reparations? There are no right or wrong answers, but... you're exactly... yeah, the millions of dollars. The follow-up question would be: who does that apply to? Now we immediately have to go back to a question of race. We've certainly decided that she's not talking about Asians or other groups; she's talking about enslaved Africans. But then who would be aiming to claim these reparations as this dollar value? So, right, to your point, she does not name black people here at all, but this discussion will certainly be framed in our popular media discussions in terms of race, and then will that heterogeneity in race
actually be important? Next we have: student debt cancellation as proposed by Elizabeth Warren and Bernie Sanders reduces racial wealth inequality. Right, so the more debt cancellation there is, the more racial wealth inequality is reduced, with this analysis of student debt and the racial wealth gap. Yeah, right. So how is race defined here? Explicitly, we don't know. Implicitly, if we're using data on student debt, that is largely going to be administrative data, so race would typically be defined the way that we have been talking about it in the discussion that we've had so far. So these racial differences would not necessarily give rise to discussion of the heterogeneity that might exist among racial groups themselves in terms of the racial wealth gap; this is framing the racial wealth gap as not a heterogeneous difference within races. And then are we thinking about groups, or are we thinking about ego-centric models? This is a bit important. These get commingled here, because student debt is certainly observed at the individual level, and these differences in student debt would be group averages of individuals belonging to that group, so it might be a function of structural inequality. All of those things are here. These tweets aren't necessarily a whole lot of characters, but thinking about race in them becomes very complicated, just answering this very short set of questions about the implicit assumptions that are made there. And so when we think about this, if I tell you this is going to eliminate racial wealth inequality, it's going to eliminate racial wealth inequality to the extent that I make some other assumptions about the groups that would or would not have higher levels of student debt, and who those people would be, because cancelling that debt may or may not exacerbate differences within the group. So all of these things are going on when we have this analysis. These are the examples that came to me. I have some
of my own, but all of these, I think, are really interesting. But this gets back to talking about economic research that then informs people in the policy space. So: economists Chetty and Hendren came out with a new study this year. It underlines what so many people knew intuitively: race is more important than class when determining futures. Even with economic status, racism still lingers. The two are inextricably linked in housing. So once again this explicitly says race, but it doesn't define racial groups. But we know that Chetty and Hendren are using administrative data, so we're back to this traditional racial classification, which does not take account of heterogeneity within racial groups themselves. And then when we say race and class, are we thinking about groups? Saying race and class typically gets us thinking about groups, but remember that the Chetty and Hendren papers are literally all individual-level, right? They're linking parents to children and then aggregating that. This really is truly the aggregation of individuals belonging to specific categories. And then how do we want to think about the interpretation of this, which then looks at housing policy as a source of intergenerational mobility? So all of that is in this discussion of these results. And the New York Times and these great interactive maps, that we can all go and play with and look at where we were born and see the map of mobility, have a whole lot of implicit assumptions about race that we were not necessarily actively thinking about, and we're sort of thinking about them currently. And so this is another one: Democrats like to talk about black Americans, and so here are some facts. The black unemployment rate has been at or below 7% for 17 straight months. The black poverty rate is at the lowest level in history. The black unemployment rate hit the lowest level ever recorded under Donald Trump. Right, so we certainly know the racial group being talked about here is black Americans. We're not
thinking about heterogeneity within this group, and these are citing facts about black economic conditions at present. But is there something missing from this discussion about race? We don't know about this, but these are all sort of facts. And so without that context about other races, we don't know how to even think about this. These all sound like great results until I say, 7%, what is the unemployment rate for any other group? Which gets us back to how we traditionally would do racial analysis. There's no discussion of the black rate versus any other group, which then decontextualizes the black results, as if they would be typical, or as if these results don't reflect any historical cumulative effects of sort of racial disadvantage, if they might presume. So this is one function of that. So this is the last example: black home ownership fell to 43% in 2017, virtually erasing all of the gains made since the passage of the Fair Housing Act in 1968, landmark legislation outlawing housing discrimination to increase black home ownership. So again we have a black group where there's no discussion of heterogeneity within that group. Home ownership is an individual outcome, but there's not a discussion of differences in home ownership rate tied to the places where geographically African Americans are more likely to be concentrated. That could be partly endogenous: if African Americans are in places with generally lower home ownership rates, they would have lower home ownership rates. But it is contextualizing African American home ownership relative to itself, in terms of 43% and going back to 1968. But similar to the previous tweet, it doesn't contextualize the home ownership rate of African Americans relative to any other group, whether that rate of 43% is high or low compared to other groups. It turns out to be low, but it isn't contextualized here. So in many of these discussions about race, when you go through that list of four questions, the interpretation that we have to whatever
statistic and whatever political argument someone is trying to make hinges on the way that sort of race has been coded, but, I think most important from these examples, hinges on a way in which we sort of instinctively react to racial discussions. So no one goes through, I would never tweet myself, here's the outcome for the income, well, we all remember black is, and give you the long definition, because no one would want to read it, or for any other racial group. So the way that we talk about race typically is going off of essentially our own sort of intuitive knowledge about what race is, and that may or may not be the appropriate way for thinking about analytical descriptions of race and data that actually contains racial classification. So I think it's important for the policy community to think about that really deeply when we disseminate results, but also when we want to place those results in some sort of context. So this result has a little bit more context than the previous slide, but both of them lack context in thinking about racial inequalities, right? So they're trying to say something about racial inequality, but it's very difficult to say that without any comparison group, whether we're talking about low unemployment rates for African Americans or low home ownership rates for African Americans. And as a result of positioning these two tweets together, are things good for African Americans or actually bad for African Americans in the economy? So low home ownership rates, but low levels of unemployment. So how we actually want to think about that really depends on what outcomes we're looking at, and there are multiple ways of thinking about race. So recent events of course make these examples relatively small, but they're part of the way that race is going to be discussed day to day, particularly on social media, as results come out. And so the way we discuss race day to day really influences how we discuss race in moments that are particularly salient, right? And that would be moments, for example, that
we've experienced in the last week or so. And so is there a difference in the way that we might discuss race in a policy discussion versus the way that we might discuss, say, race when there's been racial targeting or violence? And then what do those things imply for how we're thinking about race? Right, I think that's actually some of the most important stuff we can get out of that. Well, my thought for this exercise was that when I go to these great panels and events, I usually have an item in my work plan which is like, read my notes and follow up and try to integrate into my practice. And then sometimes it happens, and more often it does not happen. So I thought one thing we could do here is maybe kind of take some of those first steps towards just thinking about how some of these things show up in the work that you are going to go back to today and tomorrow and the rest of your life. So the format here is that I have kind of a series of questions. Hopefully folks are like taking notes. So as I'm reading out the question, just kind of start jotting down your thoughts. If you don't know how to answer the question, that's a really good flag. And I'll get to the end of these questions, and then we'll have some time for you to kind of turn to your partners and maybe share what your thoughts are coming out of answering those questions, and then maybe some details. And then at the end we'll just have some like report backs from some of the tables, so if you want to share with the whole group, you can. So what is your area or your areas of research and specialty? What do you focus on? Is it housing? Is it energy policy? Is it multiple? And that's an easy one, I hope you all know that. What are the data sets that you work with regularly, on a day-to-day basis? So maybe you have written down a few. Now how far back do they go in years? What's the earliest year? Do you know what the earliest year is that data was collected? Quick show of hands if you know the answer to that question. Can
you raise your hand? You know how far back, let's go. Okay, now keep your hands up. Now do you also know whether, so if, now, sorry. So hands are up if you know how far back it goes. Second hand up if it goes back more than a hundred years. Probably very few. So now for those of you who, probably for everyone actually, if you have to go back beyond when your data started to be collected, do you know what other data is available, a different data set, another area where information was collected? So switching tracks, do you know the major historical moments that kind of changed that within your issue? What are the pivot points in history over the last hundred years, say? Back to writing things down if you know the answer. And finally, are you familiar with the full range of opportunities and limitations in this whole kind of set of data, and our data sets and multiple data sets, that would enable us to reflect on the history of your issue? Yes. So are you familiar with the range of opportunities and limitations in using this data to reflect on that history within your issue, and especially reflecting on the multiple histories of that issue for different communities? A couple of folks are still writing. Alright, so you can turn to your tables and other folks, and if there are things you want to share, maybe you know the answers to all these questions and you want to show off a bit, or maybe you're wondering, why don't I know this? I'll stop you after maybe 90 seconds or two minutes. One mic, so we can talk. We can talk in code. Have you ever compiled like a curriculum or resources or individuals kind of leading the work in data disaggregation? I'd be really curious to kind of see who those folks are. What do you guys think about it? It's kind of a guard, isn't it? You know, Sandy, and they have done it in the field, but a lot of people, I was just, I'm just saying the
data that exists, we really are just data users. Even our macroeconomic, like GDP, we have data. The people doing the disaggregation are always landing on new data. To think about anything historical, and to do that, it's just really, really hard, because people want to link people back to historical, since it started, who links and self-function being able to be found. We can't make these things, we can't do a burden in this danger. But I think particularly there are people in social science thinking about, say, segregation and other issues. We've been doing a lot of work in. A couple seconds, we can come back. I know, once you let them go. Alright, I'm going to have to resort to organizer. If you can hear me, clap once. If you can hear me, clap twice. Okay, we don't have to go through time. Okay, so I hope you're having great conversations about what you know and what you wish you knew. Does anyone want to share? Did anything great come up in your group that you feel like you ought to share with everybody? Even if you don't want to. I shared some of my information. So I look at racial and economic disparities. I use data from the census, IPUMS, which I love. If you want to know about it, please come and talk to me. It's like the best data set ever. I also use a data set from Project HAL. The census data set starts in 1790. I don't know how I can get data before 1790. Historical events that happened, I think it was like the 1, 2, 3, 4, 5th question, my response was lynchings and the Civil War. So again, I look at lynchings and voting, so those are the two historical events that have happened. And then limitations is that I wish that my lynching data set started earlier, so it starts in 1882 and ends in 1930. It would be great if I just had lynchings from the first slave being brought over here, that would be great. Also it would be great if I had voting data between like the 1870s and 1970, just to kind of fill in those links. Am I still on here?
Yeah, I think so. Others want to share? So actually I didn't get to share this with the group because we didn't have time. So I do, well, I shared some. I work on economic security issues and retirement security, usually with the racial and ethnic lens, and I focus on the racial wealth gap in particular. I typically co-author documents. I do the historical research part, and so I'm not focusing on a specific data set. My co-authors use PSID, CPS, SIPP. With the historical data that I use, I mean, I just research extensively until I can find the information. So for example, looking at the racial wealth gap in Washington, DC, I went to the inception of DC and read historical documents on how policies and institutional practices created wealth for whites and created barriers or stripped wealth from African Americans. So there's one last pair of questions. So for those of you who did know the answers to all those questions, that's great, you don't have any homework. For those who don't, I mean, I can't make you do anything, obviously, but I hope that you'll go back and kind of take a look at, you know, what are some of the resources that will help tell more histories. So the second pair of questions, for our last kind of small group discussion, is just kind of a little bit of a reflection on two sides of this sort of partnership, and, you know, this role in making change. Which is, from your organization, or you personally, what are the practices that you have in trying to make the research that you do, your work, actionable and accessible for people who are doing work on the ground on that issue today? And the second question is, from your perspective, what are the things that you wish that those groups on the ground, like my group, like campaign or movement groups, what do you wish that they would do to ask for support in a way that will help you? Little time to think about it. Alright, we can wrap up your thought. Couple seconds. So I feel like these last two questions that I posed
could really deserve their own workshop, or like a whole day, or just a lot more time than we have, unfortunately. So if there's anyone who has something like really juicy that you want to share right now, we can maybe do one or two. Otherwise, I hope that if you have something that you want to share, you'll, like, email it to me. I do compile these things and try to act on them, you know, as someone who is often trying to get work from people. So does anyone have something you want to share, just quickly? Others? Maybe one more. Earlier in the process. I think a lot of times some groups come to us and they want a very specific number, and they want it tomorrow. And the tomorrow is hard enough, but you've gone off in the specific number. Instead of the specific number, what do you get in that, what is the argument you're trying to make more broadly? Because you can almost never deliver the exact number that is the dream. For the argument, you can deliver something really close. And you don't ever want to say no, but if the demand is for a specific number, then that's much harder. So I think getting in early, so we can have an understanding of the distribution of the campaign and figure out what we can actually deliver that will help it, rather than just sort of like a very specific job title. Awesome, great. Well, so thanks everybody. This notion of how do we work better together to make things happen is very, very near and dear to the realm of organizers and the realm of researchers. So if you are interested in this, then please come by me after. Let's go ahead and give them a hand. So I want to thank everyone for sticking around with us today. I hope that this was useful and productive. This actually is like the perfect bridge to our next workshop, which will be on building effective partnerships between research organizations and, I forget the specific title, but generally groups focused on racial justice. So that will be the entire focus of our next
workshop, which I believe is scheduled for September 26th, which is a Thursday. So the next time we'll be meeting on a Thursday instead of a Wednesday. And as you know, I'll send around a reminder and also will include links to the videos for today's workshop as well as the previous workshop. Does anyone have any final questions, thoughts, or comments before we adjourn? Yes, absolutely, I can share that as well. Alright, well, thank you.