Thank you, everybody, for coming. It's exciting to see everyone. This is a great group of people, some of our old friends and some new friends too. My name is Linda Kellam. Welcome to the NCLA Government Resources Section's "Help! I'm an Accidental Government Information Librarian" webinar series, or "Help!" for short. Thank you for coming, and after I hand over to Jeremy, I'll give some links to resources for learning more about NCLA and also our YouTube channel. Today we have a webinar that we're very excited about. Jeremy has presented for us a couple of times at this point, at least twice, on election topics, and it's always fun; we always learn so much from his webinars, so I'm really excited to have him here. This one is called "What Were They Thinking? Exploring America's Voting Preferences and Attitudes Using the American National Election Study." Jeremy is the Politics Librarian at Princeton University Library. He has a BA in international studies from Brigham Young University and graduate degrees in library and information science from the University of Washington and in political science from UC Berkeley. He is a past chair of the Politics, Policy and International Relations Section of ACRL, and he recently received the Marta Lange/SAGE-CQ Press Award for distinguished contributions to librarianship in political science. Congratulations!

Thank you. I just got my plaque this week.

Oh, sweet. I wish we could have had some champagne in person, but we'll find a way to celebrate. Jeremy is happily married and the proud father of four children.

All right, welcome, everybody. I'm glad to do this, and just to follow up on what Linda was saying a minute ago: if you don't volunteer, she's also totally comfortable just calling on you and asking you to do something. That's how I end up getting roped into these things. I think every single time, Linda has just said, you know, we need you to do a webinar.
And I say, okay. Anyway, this is a good topic. I do a lot of work covering elections, and especially in election years I get lots of questions about elections and about public opinion related to elections and politics. I get questions from people all around the country and sometimes even from around the world. It's a fun topic. There's a lot going on with it, especially in the very polarized environment we've been in for at least a couple of electoral cycles, and depending on how you read the research, increasingly for the last 20 or 30 years. I think it's useful for us to try to explore a bit more deeply what drives people's voting preferences and what we can learn from that. So we're going to talk a little bit about that today.

I will just say that I'm not an expert in public opinion polling. There are political scientists whose whole career is working on public opinion methodology, doing very sophisticated work on lots and lots of interesting questions related to this. But I do get a lot of questions about it, as may some of you in reference work, so I'm taking a broad view here on public opinion polling. It matters because elections can tell us who people choose to vote for, but they don't tell us why. That's the big question that's always left unanswered, and social scientists, and just the interested public, want to know why people voted the way they did.
You can vote for somebody as an affirmation. You can vote for somebody as a protest against somebody else or against some other kind of system. You can vote just because of a single issue that really drives you. There are lots and lots of reasons, and because of the secret ballot we don't know anything about who votes the way they do and why, so we have to rely on survey data to help us figure that out.

The genesis of public opinion polling, especially political opinion polling, in the United States really takes off in the late 1930s and especially into the 1940s with Gallup and Elmo Roper. They pioneered a lot of this, and it's become more sophisticated over time. The methods have improved a lot; some of the surveys in those early decades suffered from methodological problems that we only subsequently learned about, involving sampling and how to properly control for particular issues that might introduce error or bias into a survey. We'll talk a little more about what some of those things are in a couple of minutes.

If you want to know what people think, the ideal would be to figure out what every single person in the United States thinks, but with roughly 255 million adults in the United States, asking everybody is just a little bit cost prohibitive. So that's why we use surveys, and surveys, in order to be representative, have to use some kind of random sample of the total population. So I'm going to talk really briefly, and in general terms, about some of the key terms related to public opinion polling. I'm not going to go into depth, but if people have questions, we can certainly take some of those.
I'm sure many of you are already familiar with some of these things. The general idea is that if we have a population of interest, like the entire adult population of the United States, all 255 million people, we can't just stand on a street corner and ask a thousand people what they think about certain things, because that's going to be a really skewed sample. We're only going to see a certain segment of the population depending on which corner we choose to stand on, which city, what part of the country, even what time of day. So in order to make sure that the surveys we do are at least fairly representative of general adult opinion in the United States, we have to provide some way of randomly sampling the population, taking just a few people to represent the whole.

Going a little deeper into sampling, a couple of things are really important. The randomness is really important: if it's not random, we've introduced a whole bunch of bias directly into the survey, and we can't rely at all on the results being representative of the general population. When we say random, we just mean that everybody in the total universe (sometimes they call it a universe, sometimes just the total population you're trying to represent) has the same chance of being sampled. Now, once you start thinking about that, there can be real challenges to achieving it in the United States, as in many countries, and we'll talk a bit more about that in a minute. But that's the idea in principle: everybody should have the same chance of falling into the sample.
In practice, surveys usually employ something we typically call a stratified probability sample. Instead of just a random sample of the entire adult population, if we want to make sure the survey is representative of key groups whose opinions we're interested in, we need to make sure the sampling frame has some way of capturing at least a representative sample of those subpopulations: for example, different racial categories like Blacks or Hispanics, registered voters versus non-registered voters, rural people versus urban people, people who live in the South versus the Northeast. All of those are things we'd want to take into account when building our sample. So all we mean by a stratified probability sample is that if we know we want to represent geographic regions of the country, then we'd better make sure we don't just take a random sample of all 255 million people, but instead specify, say, that 300 people in our sample will come from the South, 300 from the West, 300 from the Northeast, 300 from the Midwest, and so on.

Now, the methods for doing this can vary. The different organizations that do public opinion work may use slightly different sampling methods, but most of the differences tend to follow from the mode of administration of the survey instrument itself. Typically there are three main ways. One is interviewing people face to face: you send a trained interviewer to somebody's house, and they interview somebody in that household. Those surveys will usually use some kind of area probability sampling.
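The stratified design just described can be sketched in a few lines of Python. This is a toy illustration, not any survey firm's actual procedure: the region names and the quota of 300 per stratum come from the example above, and the fake population is generated randomly rather than built from a real sampling frame.

```python
import random

random.seed(42)

# Invented population: (person_id, region) pairs. A real sampling frame
# would come from census geography or address lists, not random labels.
regions = ["South", "West", "Northeast", "Midwest"]
population = [(i, random.choice(regions)) for i in range(100_000)]

def stratified_sample(population, per_stratum=300):
    """Draw a simple random sample of fixed size within each region."""
    by_region = {}
    for person, region in population:
        by_region.setdefault(region, []).append(person)
    # Within each stratum, every member has an equal chance of selection.
    return {region: random.sample(members, per_stratum)
            for region, members in by_region.items()}

sample = stratified_sample(population)
for region, members in sorted(sample.items()):
    print(region, len(members))  # 300 drawn from each stratum
```

Real designs then weight each stratum back to its true population share, since fixed quotas deliberately over-sample small groups.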
So you basically take a map of the United States, divide it up into the segments you want, use census figures to represent different subpopulations, make sure you get enough people to match those figures, and then within those subgroupings you take a random sample.

Phone surveys were for many decades one of the most common ways of running a public opinion poll, simply because they're much, much cheaper: sending somebody in person to many parts of the country is really costly, so the phone is much easier. In the past, when everybody had a landline, firms relied on random digit dialing. Because area codes represent different areas of the United States, and the first three digits of the phone number itself then represent different parts of a city or even a region, you could use random digit dialing, with variations on the method, to target certain areas, so you'd know part of your sample came from the place you had targeted. That's become a lot more difficult over the last decade or two with the rise of cell phones. Your cell phone has an area code, but you don't have to live there anymore; you can take your phone with you when you move. And one of the other big issues is that the FCC has regulations that mean you can't auto-dial cell phone numbers from phone banks; you can't have computers auto-dial cell phones.
It's against the law, and you can incur really big fines, so survey firms have to hand-dial those numbers, which is really tedious and very costly. So over time many surveys have moved to more hybrid methods, combining different administration modes and increasingly using the internet. You can do that in different ways. In many cases you can use postal addresses to create your sample and send out an invitation, maybe with some kind of incentive like 20 bucks in cash: go take our survey, and if you fill it out and finish it, we'll give you another 20 or 30 dollars. Many survey firms are now also increasingly using opt-in panels; we'll talk a little more about that in a minute. Obviously those aren't going to be representative, because they're not sampled in a way where everybody has the same chance of being included, but firms have ways of correcting after the fact, adjusting people's responses based on what share of the population they represent.

All right, it looks like somebody has a question here. Yes, we're going to get to that in just a second, the question about cell phones and whether people answer them or not. That falls under this next general category: survey error and bias. It's important to understand that there are lots of different ways that errors or bias can be introduced into surveys. There are a couple of major categories, and I'm just going to go through them briefly. The first is sampling error. This derives from the fact that you interview only a sample rather than the entire population.
This is actually a statistical feature. We know we're not interviewing all 255 million adults in the United States; we're going to interview just a small subset. So necessarily there's going to be some amount of error introduced by the fact that we're not interviewing every single person. Of all the types of error introduced into surveys, this is really the only one we can actually quantify and know statistically how large it is, and it's important to pay attention to that. In surveys it's usually referred to as the margin of error. I'm going to pop out of the slides real quick and show you: there are various calculators online; SurveyMonkey has a nice one where you can see the effects of this. There's math behind it, and most of us look at the formula and think, I don't want to know anything about that, I don't understand what that means. But the general idea is that statistical methods can help us gauge what the margin of error is going to be, based on the total population size you're working with, the confidence level you want (we'll talk more about that in just a second), and the desired margin of error. So, like I said, there are about 255 million adults in the United States. At a conventional confidence level of 95 percent, if you target different levels of error in your survey, say a 3 percent margin of error, you'd need only about 1,100 people to pretty accurately represent the adult population of the United States. That's actually pretty remarkable when you think about it.
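The arithmetic behind calculators like SurveyMonkey's is short. Here's a minimal sketch in Python using the standard worst-case formula for a proportion (p = 0.5) at 95 percent confidence (z = 1.96); with a population of 255 million the finite-population correction is negligible, so it's omitted.

```python
import math

def margin_of_error(n, z=1.96, p=0.5):
    """Worst-case margin of error for a proportion, as a fraction."""
    return z * math.sqrt(p * (1 - p) / n)

def sample_size(moe, z=1.96, p=0.5):
    """Respondents needed to hit a target margin of error."""
    return math.ceil(z ** 2 * p * (1 - p) / moe ** 2)

print(round(margin_of_error(1100), 3))  # 0.03 -> about 3%
print(sample_size(0.03))                # 1068 respondents for +/-3%
print(sample_size(0.02))                # 2401 -> more than double for +/-2%
```

The last two lines show the diminishing returns discussed here: shaving one point off the margin of error more than doubles the required sample.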
If we're willing to tolerate a higher margin of error, though still not grossly high, you'd only need to interview about 400 people to roughly represent the United States adult population. If we want to go smaller, say to 2 percent, you can see the sample size increases, and you get diminishing returns: enlarging your sample doesn't continue to decrease your margin of error at the same rate. So most surveys tend to run around 2 or 3 percent, because that's a fairly acceptable margin of error. There's not a big difference between 2 and 3 percent, and the smaller sample sizes make the survey much, much cheaper. To go from 3 percent to 2 percent you have to more than double your sample size, so most of us say, well, we can tolerate 3 percent.

To see what that means in practice, look at an example. Here's a recent survey; I was just looking at this yesterday or the day before. YouGov, a polling firm, just put out another poll, a horse-race kind of poll about the general election, showing registered voters supporting Biden over Trump, 49 to 40. A lot of us will see that headline, 49 to 40, and think, well, Biden's got a big nine-point lead. It's a little more complicated than that, because we have to take the margin of error into consideration. If you look in the details of the actual survey, there's a report on it.
They'll mention that the sample size was about 1,200 people in this case, with a margin of error of about 3.3 percent at the 95 percent confidence level. What that means is that if you sampled the same population, all registered voters in the United States, a hundred times, then on about 95 of those draws the sample would put the total population's preference for Biden somewhere in the range of 45.7 percent to 52.3 percent. That's taking 49, subtracting 3.3 as your lower threshold, and adding 3.3 as your upper threshold. Same thing with Trump: he could range anywhere from a low of 36.7 to a high of 43.3. So, taking the margin of error into account, Biden has a statistical lead over Trump; it's not washed out by the margin of error, because even at his lowest, 45.7 percent, he still leads Trump at his highest within that margin of error.

Now, the important thing to remember is that this is a statistical feature of trying to represent the entire adult population. It means that about five of those draws, five times out of 100, might give us something outside that margin of error, something not representative of the population. So even accounting for the margin of error, it could be that this poll is just wrong, that we happened to get a random group of people who aren't actually that representative of the adult population. That's a possibility. It's not a super high possibility; it's fairly low.
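The interval arithmetic just walked through can be checked in a few lines; the figures below are the ones quoted from the YouGov poll above.

```python
def confidence_interval(point, moe):
    """Return (low, high) bounds around a poll estimate, in percent."""
    return (point - moe, point + moe)

# Biden 49%, Trump 40%, margin of error 3.3 points at 95% confidence
biden_lo, biden_hi = confidence_interval(49.0, 3.3)
trump_lo, trump_hi = confidence_interval(40.0, 3.3)

print(f"Biden: {biden_lo:.1f} to {biden_hi:.1f}")  # 45.7 to 52.3
print(f"Trump: {trump_lo:.1f} to {trump_hi:.1f}")  # 36.7 to 43.3

# The lead is outside the margin of error when the intervals don't overlap:
print(biden_lo > trump_hi)  # True
```

If the two intervals had overlapped, the race would be "within the margin of error" and the headline gap couldn't be called a statistical lead.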
That's why we use this 95 percent confidence interval. But the possibility is always there, and you can't really get rid of it until you ask every single person, in this case every registered voter in the United States, what their opinion is.

Now, outside of sampling error, which as I said is really the only thing you can quantify, there are a bunch of other types of error that might creep into our surveys. One is coverage error, which is simply that not everybody actually has the opportunity to be included in the survey. There are various populations that are just hard to reach. They might be homeless; they could be military personnel serving overseas whom we can't contact; they could be people in prison or in other institutional settings like hospitals; or they could simply be on vacation. Those are all ways we might fail to reach people, and so we could miss people who would change the answers on our surveys. This is usually not that big an issue unless we have reason to believe that some of these people have preferences that align with the very reason they're not covered in the survey. For example, homeless people might have attitudes that correlate with the fact that they experience homelessness.
And if we don't have any of those people in our survey, we're going to miss some set of opinions or attitudes that just aren't going to be represented the same way. So that's one kind of error.

Another kind of error, a much broader category, and the kind of thing we typically caution people about, like students creating their own surveys, is measurement error. This comes from all the different ways we might not actually measure what we think we're measuring. It could come from problems with the way we word our questions: leading questions, biased questions, or certain key phrases that might be problematic. It could have to do with the way we order our questions; placing certain questions in a certain sequence can affect people's responses, and there's a lot of really interesting political science and social science research about that. It could be interviewer bias or simple mistakes: interviewers fail to accurately write down or code somebody's response, or they're badly trained and lead people in the way they ask questions. Respondents themselves might lack candor; heaven forbid somebody not tell the truth, but that's a possibility. And when we think about elections, if you ask people who they voted for in the last election, most of us would think, well, it's really hard to forget whether I voted in the last presidential election, and who I voted for. But that's actually not true. There's a lot of research showing that people can have fairly faulty memories on this.
The further away you get from an event, the more likely you are to misremember, even something big and salient like a presidential election. It might be that you didn't vote: you had every intention of voting, and you're generally a Republican or a Democrat, so you're pretty sure you voted for Trump in the last election, and he won. You may not have actually shown up, but because you have an allegiance to that party, you favored that candidate, and you intended to vote, you might just say, oh, and he won. All those factors can move together to prime your memory: well, yes, of course I voted, and I voted for the winner. Those kinds of things happen as well.

Another category of error is non-response error, which gets back to the question somebody asked about cell phones earlier. This is a serious problem. If you create a sample, you want as many of the people you've randomly sampled as possible to respond, because if they don't respond, you're potentially missing people in that sample who represent part of the larger population you're trying to understand. If people don't answer, if they don't respond simply because they don't want to take a survey or don't answer their cell phone, that's a problem, because it depresses your response rates, which makes your error rate likely to be higher, and that can be really hard to estimate. But, as with coverage error, another issue is that if people refuse to participate in surveys because they have certain attitudes they just don't want to reveal, then that's a systematic error.
That's going to be really problematic. Maybe a lot of Trump voters, and this was something a lot of people argued in the run-up to the 2016 election, were what they called "shy" voters, meaning they weren't willing to reveal their preferences in polls. If that happens and it's systematic, then we're going to miss something big when we try to project who the winner of an election will be.

And finally, there can still be yet other errors. These might be data processing mistakes: you record something wrong, you process something wrong, you miscode a variable. Because humans are involved, there are other ways we might just mess something up and introduce some error into the survey. Now, obviously there are a lot of different potential types of error, and it's worth mentioning that this doesn't mean surveys aren't valuable. They're the best we have; in many cases, they're the only way to figure out some of these things. So we do the best we can to cope with both known errors and unknown but suspected errors. But it is important to remember that none of these things are perfect. We can't have 100 percent reliability, but we can be fairly reliable.
And the more we can quantify the likely known errors, the better off we're going to be. There's a lot of sophisticated work on this in the social sciences generally. There are whole journals; there's a big journal in American political science called Public Opinion Quarterly where this kind of work is constantly being published: people examining different survey methods, figuring out how to correct for different kinds of errors, and coming up with new ways of getting around some of these types of bias.

Like I mentioned earlier, because of issues like non-response error, many survey firms are moving toward opt-in panels. To deal with the non-representativeness, they do what's called post-stratification adjustment: weighting the survey responses based on how representative somebody is. So suppose somebody opts into a panel who's like me, a white male in his 40s, within a certain income category, with a certain education; they know all these things about you because you're on the panel. They can look at the population they're trying to represent and say, okay, this person is about the right age demographic, about the right racial demographic, but he's got too much income or too little income, so we're going to adjust his response. We're going to weight it slightly lower, so the panel is more representative of the population we're trying to capture.
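As a toy illustration of that post-stratification idea, here is a sketch in Python that re-weights an invented four-person panel so its education mix matches made-up population shares. Real firms weight on many variables simultaneously (often by raking), but the one-variable case shows the logic: each respondent's weight is their group's population share divided by their group's sample share.

```python
from collections import Counter

# Invented panel: college graduates are over-represented (3 of 4 = 75%)
panel = [
    {"id": 1, "educ": "college"},
    {"id": 2, "educ": "college"},
    {"id": 3, "educ": "college"},
    {"id": 4, "educ": "no_college"},
]
# Made-up "known" population shares for the illustration
population_share = {"college": 0.35, "no_college": 0.65}

counts = Counter(r["educ"] for r in panel)
n = len(panel)
for r in panel:
    sample_share = counts[r["educ"]] / n
    # Over-represented groups get weights below 1, under-represented above 1
    r["weight"] = population_share[r["educ"]] / sample_share

for r in panel:
    print(r["id"], r["educ"], round(r["weight"], 2))
```

Here each college graduate ends up weighted 0.35 / 0.75 ≈ 0.47, and the lone non-graduate 0.65 / 0.25 = 2.6, so the weighted panel matches the target shares.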
We'll talk a little more about that in a minute when we look at some examples from the ANES, in terms of weighting and how it can affect responses.

Just to briefly review some different types of public opinion sources: major media outlets run polls, and you're familiar with lots of those. Exit polls are another major category. These are the polls run on election day as people leave the voting booths. They're used to try to project who the winner is going to be that day; that's how TV networks and other media may call the election before all the vote returns are in, and those calls are based on exit polls. They're also used to get a first glance at which demographics seem to correlate with vote choice, since we don't know that about people in the voting booth. We only know whether somebody showed up to vote or not. We don't know whether they're male or female; usually that's not recorded in voter registration. Usually race isn't recorded in voter registration, certainly not income, and many other things aren't recorded either. So exit polls help us get a first glance at that.

Election studies fill a unique niche here. These are designed specifically by social scientists who want to study elections, especially elections and change over time. They've been happening in the United States for a long time, and many other countries, especially in Western Europe and other parts of the world, have national election studies that usually run in conjunction with national or parliamentary elections.
So the timing can vary in different places; here they happen every two years. Many of them have both a pre-election and a post-election wave: they interview a bunch of people before the election and record who they anticipate voting for, then interview them right after the election and ask, okay, who did you actually vote for, along with additional questions, so we can track attitude change over the course of the election cycle and the campaign itself. And they have a very broad and deep set of political questions. Lots of media polls include some political questions, but these election studies go into much more depth, which allows us to investigate lots of really interesting questions, and I have a guide with some different examples.

Looking at the American National Election Study: like I said, this started back in 1948, so there's a long running history; it's been conducted every two years since 1956. Each year has an individual file associated with it, and we're going to look today at the 2016 version. But there's also a cumulative file that runs from basically 1948 to the present and pulls together all the questions that have been asked, I think, at least three times across different years of the ANES. Looking at the 2016 study, you can see it's a pretty large sample, which lets us get a good take on the U.S. adult population's political attitudes with a pretty low margin of error, roughly around one and a half to two percent. It's split between face-to-face and online interviews, with pretty good response rates considering that response rates for lots of surveys these days are down in the teens. And if we look at the codebook, and I'll show you that briefly, the codebook for the ANES is 2,200 pages long. There's a ton of stuff going on here.
It's a huge survey; many, many things are being asked and investigated. That said, we're going to look at a simplified version today, just to make it a little easier to navigate, and I'll send the links out with this later. UC Berkeley has a great program for doing online survey analysis, which is what we're going to use today. You don't have to know a ton about statistics to run some cross-tabulations and start to dig into the details. On the Berkeley site they have a number of datasets, including the ANES: the most recent several years' worth of individual presidential-year files as well as the cumulative file. We're not going to use that here, though, because what I'm going to show you instead is ICPSR. I should clarify: the Berkeley site, the SDA site, is totally free and available to anybody, while ICPSR is a membership organization, so if your university belongs, you can get access. Every year since the 1970s, ICPSR has had this great series called SETUPS, which stands for Supplementary Empirical Teaching Units in Political Science. To start teaching students how to use survey software and understand some basics of statistics, they use the election-year ANES to guide students through a beginning analysis. This is a great source because it makes for easier navigation. So if I show you the SDA version of the 2016 file, this is generally what it looks like, and when you look over at the questions, if we expand the pre-election questions, you can see all these different categories. There's a ton of stuff, and within any one of these categories,
there are quite a few different questions. So when you get into this, it can be really overwhelming, because there's a lot going on. That's why the SETUPS version is nice: some political science professors have basically taken the full study and slimmed it down a little to make it more manageable for students. When you look at the slimmed-down version, what you get is nicely categorized. The full ANES on Berkeley's site is a little more unwieldy; in this one they've done a nice job of trimming it down to the key things people are probably going to want to look at, and then organizing them into categories like voting behavior, candidate image, ideology, and social and moral issues. It makes it a little bit easier to navigate, so we're going to spend some time going through that today.

Okay, so what we're going to do here is... oh, that's nice to know, Linda, I didn't know that; actually, I think I might have seen it. In this SDA tool, essentially what we're going to do is build tables to understand what's going on. If you look at some of these categories, you can see something like, let's say we want to understand the presidential vote, so we're going to stick that in our row. Essentially, we're going to think about the row as our dependent variable, the thing we want to understand: how people's choices differ. And then we're going to look at some kind of independent variable, which we place in our column; this is the thing we think might be driving some of those attitudes.
So let's start by looking at a really basic one: vote choice by people's party identification. If you click the view button on any one of these items, it pops open the full codebook (let me make this a little smaller), where you can see the question text: "Generally speaking, do you consider yourself a Democrat, a Republican, an independent, or what?" followed by whether that preference is strong or not so strong. This is the typical party identification scale; it runs from one to seven, from "I think of myself as a really strong Democrat" through the middle to "a really strong Republican."

When you run this table, the other thing to pay attention to, once you've put in your row and your column, is the weight. By default, the software here doesn't apply a weight, and we want to apply the weights. As I mentioned earlier, weighting is a slight statistical adjustment reflecting the fact that not everybody who ends up in a sample should be counted the same way, because we want to make sure the sample is representative of all the different subpopulations we might be interested in. To keep the results broadly representative, the study creates these statistical weights. So whenever you're running these tables, make sure the weights are applied. Now I'm going to do some other cleanup here. You can include some charts; mostly we're going to look at tables.
So I'm going to get rid of the chart just to clean up the page. Down here, the decimal-places setting is not super important, but since we're talking about the number of people represented in the survey (you'll see this in a minute: the table shows how many respondents fall in each category as well as their percentages), I'm going to change it to zero, just because it's easier to think of people as whole people and not as 1.3 people. And now I'm going to run my table.

What you see is that it shows us up here what we just did: we have vote choice on the row and party identification on the column, and these all sum within columns; you can see that each column sums to 100, so the table shows percentages within each column. If you look across the rows, you see the total vote choice: 49 percent of the people here say they voted for Clinton, 44 percent say they voted for Trump, and 7.1 percent say they voted for somebody else, a third-party candidate. Then you see party identification, and some obvious differences: not surprisingly, people who identify as Democrats tend to vote for the Democrats, and people who identify as Republicans vote pretty strongly Republican.

What I'm going to do now is clean this up to make it a little more obvious. You can do what's called recoding variables. To make this easier to see, I don't want to include the third parties here; let's just be interested in the two parties.
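Under the hood, the table SDA just built is a weighted cross-tabulation with column percentages. Here's a rough sketch of that arithmetic in plain Python, using invented respondents and weights (none of these numbers come from the actual data):

```python
def weighted_column_percents(rows):
    """rows: (row_value, column_value, weight) triples.
    Returns {(row, col): percent}, with each column summing to 100."""
    col_totals, cells = {}, {}
    for r, c, w in rows:
        col_totals[c] = col_totals.get(c, 0.0) + w
        cells[(r, c)] = cells.get((r, c), 0.0) + w
    return {(r, c): 100.0 * w / col_totals[c] for (r, c), w in cells.items()}

# Hypothetical respondents: (vote choice, party ID, survey weight)
sample = [
    ("Clinton", "Democrat", 1.2), ("Clinton", "Democrat", 0.9),
    ("Trump", "Democrat", 0.9), ("Trump", "Republican", 1.1),
    ("Trump", "Republican", 0.8), ("Clinton", "Republican", 0.6),
]
table = weighted_column_percents(sample)
print(round(table[("Clinton", "Democrat")]))  # Clinton's share of weighted Democrats → 70
```

Each respondent contributes their weight, rather than a flat count of one, to both the cell and the column total, which is exactly why toggling the weight option changes the percentages.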
That's going to make this a little bit easier to navigate. Also, there's a lot of talk in political science research about who the actual independents are, or whether independents are even really a thing; some people don't believe there are actually any independents. So there are different ways of collapsing the seven categories we were seeing here into a smaller number. We can say: let's count as Democrats all those who say they identify with the Democrats, even those who only lean, weakly or strongly, from independent toward the Democrats; the same thing on the Republican side; and we'll count as independents only those people who say they don't lean to one party or the other. Because when you look at this, you can see that people who lean one way or the other tend to vote pretty strongly with the party they say they lean towards.

So let's clean that up. You can do this a couple of different ways. You can do it on the fly here by recoding and saying we want everybody in categories one to three to show up as Democrats (whoops), everybody in category four (the true independents, with no lean) as independents, and people in categories five to seven, the Republicans plus those who lean that way, as Republicans. If we run that table again, we get a much cleaner table, which makes it easier to see the differences based on which party people identify with. You can do the same thing with the vote and keep just the people who voted for the two parties; this drops out everybody else, codes them as missing data, and leaves them out of the calculations. Oops, I did that incorrectly. What did I do?
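The recode itself is just a mapping from the original codes onto collapsed categories. A sketch of what that on-the-fly recode is doing, assuming the usual 1–7 party ID coding (1 = strong Democrat through 7 = strong Republican):

```python
def collapse_party_id(code):
    """Collapse the 7-point party ID scale into three groups,
    folding leaners in with their party."""
    if code in (1, 2, 3):
        return "Democrat"      # identifiers plus Democratic leaners
    if code == 4:
        return "Independent"   # true independents, no lean
    if code in (5, 6, 7):
        return "Republican"    # identifiers plus Republican leaners
    return None                # anything else treated as missing data

print([collapse_party_id(c) for c in (1, 4, 6, 9)])
# → ['Democrat', 'Independent', 'Republican', None]
```

Returning `None` for out-of-range codes mirrors the "code as missing data" behavior described above: those respondents simply drop out of the tabulation.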
The other thing I'll say: besides doing it on the fly, you can go to recode variables (whoops, that was the help; I meant to go up here, under create variables, to recode variables). It takes you to a little form where you can fill in basically what I was doing there, which might be a clearer way to do it. You can then save variables so that you can reuse them later. You can see I've already done this for some of these things: I created a two-party vote variable, coding for Clinton and Trump and dropping third-party candidates, and you can do the same thing with the party identification categories. I've done some of that ahead of time just to make this a little quicker.

So if we go back to our analysis (you have to redo all this stuff because it knocks it out)... you do that (oops, wrong one): we want the vote choice by this new collapsed party identification. And what you see, looking at just the two-party vote by party identification, is that, obviously, people who have a partisanship very strongly support their party. This is one of the most overwhelming influences we know of in determining people's vote choices. True independents are much more evenly split, and you can see they tended to break a little more towards Trump in the 2016 election overall.

Okay, just for the sake of time, I'm not going to run all the remaining tables individually; I've got them here on my screen. So we looked at party identification; that's a pretty strong influence. But what about people that don't vote?
We were just looking at people who say they did vote, and if you notice, the total number here is only about 2,500 people, while about 3,650 people completed both the pre- and the post-election survey we're looking at. So there are quite a few people who didn't vote who don't show up in this table. What about them? If we rerun this and look at everybody in relation to their party identification, without dropping out that missing data, we still see a pretty strong correlation between partisanship and vote. But what you'll notice is that about 22 percent of those who say they identify as Democrats, and about 20 percent of those who say they identify as Republicans, didn't vote in the 2016 election. But look how many independents, people who say they're true independents and don't lean one way or the other: more than half of them didn't bother to vote. That's a pretty strong effect. It's pretty remarkable that we're missing a pretty large group of people who are clearly turned off by the partisan polarization in the country and consequently aren't voting. And thanks, Chris, I see what I did wrong earlier; that's helpful. Hopefully I won't be doing too much more of that. Okay, so moving on from party identification, let's look at another example.
If we look over here, there are lots of issue categories that ask about people's attitudes on things: government regulation, environmental protection, and so on. Let's look at something like government spending on Social Security. Now, this wasn't a huge issue in the 2016 election, but Democrats have generally favored increasing spending on Social Security, while Republicans have tended to favor either the status quo or even reducing spending; Trump was kind of ambiguous on this during the campaign. They just ask the question: should government spending on Social Security be increased, kept the same, or decreased? You can see the responses, though note these are not weighted. If you ever want to see the weighted values, you can run just this variable in the row of the table: about 60 percent of people say we should increase it, 34 percent say keep it the same, and 6 percent want to decrease Social Security spending.

So this wasn't necessarily a huge issue, but it's a useful one for thinking about how attitudes affect vote choices. If we look at it in terms of the two-party vote, we see that at least there's a correlation: people who wanted to increase Social Security spending tended to vote for Clinton, and those who liked the status quo or wanted to reduce it tended to vote Republican. (And yes, Rachel, we will definitely send these out afterwards.) So that's pretty interesting. But what if we modify this? What we really want to know is: does that actually drive people's vote choice?
Is it that Social Security really mattered to a lot of people, and that's why they voted for Clinton over Trump? Well, one way to investigate that would be to control for the Social Security attitude on the basis of people's party identification. So we go back to the table, and actually we might recode this first to make it a little clearer: let's just compare "increase" against everybody else, those who said keep it the same or decrease. Then we control for this on the basis of party identification, and we get a different look. This is a little clearer, but I'm going to go back to the slide because it's a little easier to see horizontally than vertically. What you'll notice is that it doesn't seem to be driving people's choices: for partisans, people who identify as Democrats, whether they favored increasing or decreasing spending made essentially no difference in whether they voted for Clinton. Same thing over here with Republicans; you hardly see any difference in these cells. Down here you do see a difference for the independents, but it goes the opposite of the way you might expect: people who wanted the status quo or wanted to decrease funding seem to have voted for Clinton more than for Trump. That seems counterintuitive. So we can reject that hypothesis: this issue doesn't really seem to be driving people's vote in this particular case. These are examples of the kinds of things you can do when you start to dig in a little more. So, shifting gears a little bit.
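Adding a control variable in SDA amounts to splitting the sample by the control and running the same column-percent table within each slice. A toy, unweighted sketch with invented respondents (the real tool also applies the survey weights):

```python
def crosstab_by_control(rows):
    """rows: (row_val, col_val, control_val) triples.
    Returns {control: {(row, col): percent}}, one table per control group."""
    slices = {}
    for r, c, ctrl in rows:
        slices.setdefault(ctrl, []).append((r, c))

    def column_percents(pairs):
        totals, cells = {}, {}
        for r, c in pairs:
            totals[c] = totals.get(c, 0) + 1
            cells[(r, c)] = cells.get((r, c), 0) + 1
        return {(r, c): 100.0 * n / totals[c] for (r, c), n in cells.items()}

    return {ctrl: column_percents(pairs) for ctrl, pairs in slices.items()}

# Hypothetical respondents: (vote, spending attitude, party ID)
data = [
    ("Clinton", "increase", "Democrat"), ("Clinton", "same/decrease", "Democrat"),
    ("Trump", "increase", "Republican"), ("Trump", "same/decrease", "Republican"),
    ("Clinton", "increase", "Independent"), ("Trump", "same/decrease", "Independent"),
]
tables = crosstab_by_control(data)
print(tables["Democrat"][("Clinton", "increase")])  # → 100.0
```

If the issue attitude still moves the vote within each party slice, it plausibly has an independent effect; if the slices look flat, as with Social Security here, partisanship is doing the work.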
Let's look at something like gender and the vote. In the most recent surveys, the ANES actually now includes a non-binary gender category; there are not very many people in it, so to keep this a little cleaner I've dropped them, because they're a very, very small percentage of the total. You could include them if you do this on your own. Looking at males and females and their preferences in the two-party vote, you see a really pretty stark difference: females preferred Clinton by a fairly substantial margin, while males were more evenly split. You might think, well, maybe this is because Hillary was the first female major-party nominee for president, but actually, if you go look at the cumulative file in the Berkeley SDA archive across multiple years, you'll see this has existed for a while, though it's fluctuated over time. This chart is the vote for president by year of the study, males up top and females down on the bottom, and you can see that in '92 basically everybody voted for Clinton, but by '96 you start to see this divide between males, who are fairly evenly split, and females, who are now pretty predominantly voting for the Democrats. And this has persisted through many election cycles, now for 20 years. So that leads to an interesting question: what's driving that? There's actually a lot of research on this, but we can do some investigation on our own using the ANES. So we could come back in and think about, say, abortion; maybe it's abortion attitudes. This might be what we would call a confounding variable.
Or, maybe a better way to say it, an intervening variable: it might be that women hold certain abortion attitudes, different from men's, we might hypothesize, and that subsequently leads them to vote for the party they think will be more supportive of abortion rights. So you can look at this. There's an item here on abortion attitudes, and it has four categories. To make this simpler, we might collapse them into two: roughly, "abortion should generally not be permitted" and "there should be few if any restrictions on abortion." If we rerun the previous table on gender and the vote, but use abortion attitudes as a control, we see something slightly different: now the results are broken out between those who don't support abortion rights and those who do and think there should be few limits, if any at all. And what you notice is, here's our gender split from before, up in the corner; it's about a five-and-a-half-point difference between females and males, and it's still roughly the same proportion in both of these categories. So it looks like abortion attitudes, while clearly important in driving people's vote choices (those who don't support abortion rights voted much more strongly for Trump; those who do voted much more strongly for Clinton), don't seem to account for this gender disparity. So you'd have to keep fishing to try and figure out.
What might be driving that? I'll leave that for you to explore on your own. How are we doing on time? All right, I'm going to end up having to cut some of this, but I'll leave all the slides in so you can look at them later.

Looking at another one: how about candidate image? Think about how people assessed the two candidates, Clinton and Trump: how honest did people think they were? You ask whether the label "honest" describes the candidate extremely well, not well at all, or somewhere in that range. You can see the breakdowns. This one is for Clinton: it varies, but a large plurality, almost a majority, of people say they don't think she's honest. Same thing for Trump; it's a little less, maybe because he was a little less well known a quantity at that point, but still a large plurality say he's not honest either. If you run these as separate tables and look at the effect of thinking Trump is honest or not on the vote for president: not surprisingly, people who think candidate Trump is not very honest are highly, highly likely to vote for Clinton, and people who think he is honest are highly likely to vote for Trump. The same goes for Clinton; run the same table and it just flips perspectives. So the more interesting question becomes: what about people who perceived the candidates differently? People who thought Trump was honest and Hillary was dishonest, or who thought they were both basically honest, or both basically dishonest. How does that affect people's vote choice?
We can do that by adding one of these as a column and the other as a control to see the variations. If we do that, we see something kind of interesting. This first table is for people who thought Donald Trump was honest; this one is for people who thought he was dishonest; and up here, DO4 is the Clinton honesty assessment. So for people who thought both Hillary and Trump were honest, if you look at the frequencies down here, it's only about 119 people out of the sample: a very small number of people actually thought they were both basically honest, which is not terribly surprising. I've recoded the honesty items so that one to three counts as "honest" and four to five as "dishonest." You can see that not very many people think they're both honest, a much larger group thinks they're both basically dishonest, and then a fairly substantial portion think one is honest and the other dishonest. Looking at that, not terribly surprisingly, people who thought Trump was honest but Clinton dishonest overwhelmingly voted for Trump, and the same thing vice versa. It's interesting, though, that people who thought both candidates were basically honest broke for Clinton much more strongly, and people who thought they were both basically dishonest also broke for Clinton, at roughly similar margins.

Now, what about... oops, sorry, I'm falling behind on my notes here. All right, shifting gears a little more (I'm running out of time): let's look at immigration. This was a pretty salient issue in the election. There's a question on here, K11, that asks people whether we should increase
the number of immigrants; sorry, in terms of the desirable immigration level, should it go up, should it stay the same, should it decrease a little, or decrease a lot? If you look at its influence on people's vote for president, not surprisingly, people who favored the status quo or wanted immigration to go up voted for Clinton, while, given all his anti-immigrant rhetoric, Trump garnered the people who wanted to reduce immigration. And if we collapse this, recoding so that one and two here are basically "keep it the same or increase it" and the rest are "decrease," you can see a pretty strong effect: about 74 percent in each category voting the way you would expect.

However, if we start to ask why, is that what's driving people's vote, there could be a couple of reasons. One might be immigration's hypothesized effect on jobs: maybe there's a segment of the population that thinks immigration is likely to reduce jobs for Americans, so consequently they want to reduce immigration and vote for somebody who champions that platform. There's a question here on whether immigration reduces jobs, and you can see responses fairly evenly split across the four categories, from "yes, it's very likely to reduce American jobs" to "no, it's not at all likely to reduce American jobs." If we add that as a control on our previous table, we're now looking, in this table, at people who think it's somewhat or very likely to reduce American jobs, and in this one, at people who think it's somewhat or very unlikely to do so. How did that affect their vote choice?
People who think immigration is likely to reduce American jobs and who consequently want to decrease immigration voted very strongly for Trump: in comparison to that 74 percent, it's a pretty sizable jump of about six and a half percent or so. People who think it's likely to reduce American jobs but just want the status quo to stay were much more evenly split. But for people who think immigration's effect on American jobs is actually not that big, who don't think it's likely to reduce American jobs (immigrants likely fill jobs that people don't want), it's a bit more of a mixed picture. Those who still want to decrease immigration still voted overwhelmingly for Trump, but at a noticeably lower rate, about seven, actually almost ten, percent lower. Whereas those who want the status quo, or want to increase immigration because they don't think it has an effect on jobs, were much more likely to vote for Clinton.

All right, so it's interesting: it doesn't tell us exactly why, but we can make some hypotheses, and this is where you'd bring your theoretical expectations in. Maybe people who think immigration is likely to reduce American jobs want to reduce immigration levels because they feel a personal threat, whereas people who think it does reduce jobs but are fine with the current level of immigration maybe don't feel a personal threat; they just see a general one, and consequently they're less likely to be motivated to vote on this issue. That's one possible explanation; we wouldn't know for sure without some additional investigation.
But that's another possibility. We could also look at this slightly differently. There's a variable on here where they've taken multiple questions about how people feel about immigrants and created an index of tolerance towards immigrants. If we come back over here and look at that variable, you can see it ranges from low to high, from very low tolerance to high tolerance, and it's built from various questions, things like "immigrants are generally good for America's economy" or "America's culture is harmed by immigrants." There are, I think, four questions used to create the index. If we run that and look at its effect on vote choice, we see that people who are very tolerant of immigrants tended to vote for Clinton, and those who are less tolerant tended to vote for Trump; not terribly surprising, given the candidates' positions.

But let's problematize this a little: what about the effect of party identification? We know party identification has a really strong effect, so how does it interact with this tolerance towards immigrants? I've run that here with party ID as a control, and now we have three tables: the original table up in the corner, and down here, Democrats, Republicans, and independents separately. The most tolerant Democrats still voted at very high rates for Clinton, as you would expect based on their partisanship: 92 percent of Democrats voted for Clinton, so no big surprise there. But you'll notice she had some defections amongst Democrats who were less tolerant of immigrants.
They were less likely to vote for Clinton and more likely to vote for Trump. You see a flip of that with Republicans: those who were most tolerant of immigrants were less enthusiastic about Trump, by almost 20 points in comparison to Republicans in general; it's a pretty substantial drop, about 15 percent, in support for the Republican candidate, Donald Trump. Independents were a bit more mixed; this is an issue that seemed to really split them. High-tolerance independents were much more likely to vote for Clinton; low-tolerance independents were much more likely to break for Trump.

We're probably just about out of time, so I'm going to skip ahead a little. The next couple of slides are actually pretty interesting; they're about education. I'll skip those and let you look at them afterwards. But I do want to end with this one, about race and the vote. If we look initially at party identification, one thing you notice when you look at race as it relates to how people identify with parties is that Republicans are overwhelmingly white, roughly 85/15, while the Democrats are much more diverse, much more evenly split. There's another one of these indexes that combines multiple questions, this one about support for blacks; there's been a lot of talk about how racial attitudes affected people's votes in this election. I've collapsed it into broadly supportive and broadly unsupportive of blacks. This is over here; if you look at this MO6, you can get a sense of it. Actually, let me pull up the codebook: it's built from statements like "Irish, Italians, Jewish, and many other minorities overcame prejudice and worked their way up."
"Blacks should do the same without any special favors," and other statements like "blacks have gotten less than they deserve." They scale these things and combine the attitudes into one index, from low support to high support. I recoded those so that (I think) one to three is low support and four and five is high support. Looking at its effect on vote choice, you can see it's pretty strong: voting for Clinton correlated with very high support for blacks, and voting for Trump with very low support.

So what about when we look at this alongside party identification? I've actually done a couple of things here, and this is a little more complicated, so let me go back to the online tool and show you. There's also this notion of a selection filter. There's a lot of talk, obviously, about race, about white voters and what they thought in the election and how their attitudes towards race may have impacted their vote. So what I'm doing here is looking at this black-support index related to people's vote and their party identification, but now just for whites, just white voters. You can see the selection filter here, right here. If we want to filter on people's race, which, if you look down in the demographics, is this RO2, we'd say we want to look at just those who are white.
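The selection filter is equivalent to subsetting the respondents before tabulating anything. A small sketch with invented records (the field names and values here are made up for readability; the real file uses coded variables like RO2 rather than strings):

```python
# Hypothetical respondent records (illustrative only).
respondents = [
    {"race": "white", "party": "Democrat", "vote": "Clinton"},
    {"race": "white", "party": "Democrat", "vote": "Trump"},
    {"race": "white", "party": "Republican", "vote": "Trump"},
    {"race": "other", "party": "Democrat", "vote": "Clinton"},
]

# Apply the filter first, then tabulate only the remaining rows.
whites = [r for r in respondents if r["race"] == "white"]
clinton_share = 100 * sum(r["vote"] == "Clinton" for r in whites) / len(whites)
print(len(whites), round(clinton_share))  # → 3 33
```

Every table built afterward (including any row, column, or control breakdowns) then describes only the filtered subgroup, which is exactly what restricting the analysis to white voters does here.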
There are multiple categories under race, and I've recoded them, collapsing to white and everybody who's not white, because there aren't as many people in the other categories; obviously you could investigate this more, but right now we're just interested in white attitudes, so I've selected just whites. When you look at that, the white respondents are now broken up between those who have low support for blacks and those who have high support, and what you see is that whites with low support for blacks very predominantly supported Republicans, while those with high support very predominantly voted for Democrats; not terribly surprising, given some of the things we've heard. What's more interesting is the split among the Democrats: Clinton had some major defections in the white vote. This is the total white vote based on party identification: 87 percent of white Democrats say they voted for Clinton, but you see almost a 20-point drop among those with low support for blacks. So there seems to be a pretty strong effect of racial attitudes driving this difference in support within the parties themselves. You can even see this with Republicans.
It's a little bit less stark, but you can see that Republicans with high support for Blacks were a little less likely to support Trump — from about 93 percent, roughly a 10-point drop — though not as significant as the one you see among Democrats. That may give some credence to the talk about Democratic defections among whites in 2016. The slides I didn't have a chance to show you had to do with education, and one of the things they show is that among lower-educated whites — whites without college degrees — Trump over-performed in a pretty strong way.

Anyway, I know that was super fast, but if there are any questions, I'm happy to take them. I hope that at least whetted your appetite and made you want to get in and play around with this, because there's so much interesting stuff in here — you would barely scratch the surface. And this is just a simplified version of the ANES, like I said. The full ANES has many, many more items you can look at, many more variables about the respondents. You could look at things like income and how that correlates with attitudes, and many, many other things besides.

Thank you very much, Jeremy. This is awesome. I haven't played with the ANES quite this much. And you mentioned this, I think, at the beginning, but the voting behavior module has some exercises you can go through.

Yes — in fact, some of the things I put up here come directly from the exercises. If you have access to ICPSR, you can come through here, and there's a whole series of exercises. These are great to do on your own.
They'll walk you through it and help you get familiar with running the data analysis, and they cover different types of categories just to get you oriented. Once you've done those, you can certainly play around more on your own. They're also great if you want to use them as examples with students. This is a great way of getting students interested both in looking at public opinion data and in starting to think in terms of statistical relationships — these have great potential in the teaching environment as well.

Yeah, and I know at UNCG this is actually how they get political science students started doing statistics, through the voting behavior module. Yeah, it's a great resource. Awesome, thank you so much. And there are a lot of people saying thank you. Yes, definitely much easier than using SPSS.

Yeah, that's one thing I didn't say, which is that all of this data is available for download, right? So if you want to play around with it, and you know a statistical software package like R or Stata or SPSS, you can download the full datasets, play around to your heart's content, and do much more sophisticated stuff than what we're doing here. But I wanted to keep this simple enough that people who aren't real familiar with those packages can still do some pretty interesting things and get a good sense of what's going on with attitudes, short of having a high-powered knowledge of statistics.

And yes — Phyllis was asking about the YouTube channel. I just put a link in the chat. We will send out the recording to everyone, and I'll send my slides too.
So we'll send the slides with the presentation. We won't send the chat, just because it's identifiable, but we'll send the slides along with the YouTube link, so you'll have all of that.

And feel free — like I said, I'm not the expert on this, but if you have questions just about political opinion polls and such in general, I'm always happy to field those. I handle a lot of those kinds of questions, so don't hesitate to contact me. I don't know that I put my email address in the presentation; I'll make sure it's there when I send the slides to Linda.

Yes, that's awesome. Yeah, we'll definitely get that in there. Thank you so much, everybody, and thank you for coming. It was a great turnout, and thank you so much, Jeremy, for a very rich presentation. It's exciting — I haven't played around with this in a while, so it's good to see, especially as we go into the fall. Get ready for that. Awesome. All right, well, thank you, everybody. Have a wonderful rest of your Thursday and a great Friday, and we will see you in September. All right, thanks. Thank you, Jeremy.