Great. So, as Mohammed said, my paper is called "Emotion and Reason in Political Language." This is joint work with Gloria Gennaro, who is a postdoc at ETH Zurich, where I'm an assistant professor. We have one paper ready to go and out as a working paper, but I hope you'll see that this method, and this substantive quantity of emotionality, of emotion and logic in language, is something that could be applied in many domains, and that some of our machinery and our approach could be useful in some of your work.

This is an old issue. We have the classic Aristotle quote that an emotional speaker makes his audience feel with him even when there is nothing in his arguments, which is why many speakers try to overwhelm their audience by mere noise. That's the classic pairing of pathos and logos as the two sources of rhetoric for persuading people. More recently, Drew Westen says that in politics, when reason and emotion collide, emotion invariably wins. If that were true in some basic sense, then politicians shouldn't use any logic at all, right? They should use pure emotion. But we'll see that they use both.

This is part of a broader literature in psychology and political science. There's been a lot of psychology work looking at emotions when people make political decisions, but until very recently there wasn't work measuring emotions in text, or really trying to analyze emotional intensity in the wild, because there hasn't been a dataset for doing so. Positive sentiment, positive versus negative language, is widely studied, but emotional intensity, in the sense of "how emotionally expressive am I being right now?", is newer. One of the closest papers is Dietrich et al., who look at emotional intensity in audio, so they measure it in sound. Also, I didn't update the slides for this, but just a couple of days ago a paper came out in the American Political Science Review, one of the top political science journals, doing something quite similar to us on recent data from the United Kingdom, which I can talk about if you're interested. And separately, we're contributing to the literature in natural language processing and computational social science that uses these newer NLP models to analyze social attitudes, stereotypes, and other cultural dimensions in language.

So what are we doing in this paper? First we ask: how can we detect emotion and reason in political speech? We're going to assume there's a single dimension, which, as we'll see in a human validation, is actually a pretty good assumption. We combine a dictionary- or lexicon-based approach, where we have a group of emotive words and a group of cognitive words, with word embeddings, which are a spatial representation of language. We use that to trace a direction in language corresponding to reason, or cognition, or logos on one end, and emotion, or affect, or pathos on the other end. And then we apply this to the US Congress.
And this is just a nice setting for economic history and historical social science, because we have 150 years of speeches about politics in the US, and we can trace emotional expression over time and across topics, including some policy topics that are going to be interesting to economists, such as fiscal policy. We relate it to individual and institutional characteristics, and I have time today to show you the next paper we want to work on, which performs a causal analysis of the effect of politicians' transparency, or visibility, on the amount of emotion they express in their speeches.

Before the method, I should mention: I know that for a technical presentation like this, you might not want to wait for the Q&A to ask questions. We have plenty of time, so feel free to just raise your hand or unmute yourself if you have a question about anything I say.

The corpus, as I mentioned, is 150 years of speeches from the US Congress, almost 10 million speeches. We perform some standard preprocessing from the text-as-data literature. We extract only nouns, adjectives, and verbs; these words are relatively informative for getting at emotional content, and it serves to drop function words that won't be informative about emotional expression. After that we drop punctuation, capitalization, and numbers, and we strip word endings, so "taxes" and "tax" would both become "tax," for example. We also drop rare words. The final vocabulary is about 63,000 words. Then we train word2vec, a word embedding model, which I'll talk about more in a second.

Word embeddings are a relatively new technology. They feel new in social science, but the first paper on this system is from 2013; the NLP literature is just changing really fast. Word embeddings are now a fundamental, foundational technology in natural language processing and computational linguistics, in which words and phrases, and also documents, take a spatial representation. The space is high-dimensional, around 300 dimensions, but it's very low-dimensional relative to the vocabulary, which has about 63,000 items in it. Every word gets a point in this space. The first characteristic of these vectors is that words related to each other, such as "student" and "pupil," are going to be close to each other in the space: the objective function for learning these vectors encourages words that appear in similar contexts, that appear in the same types of sentences, to have similar vectors, with a high dot product.

When you represent words with this dot-product objective, this linear objective, one of the outcomes is that the vectors not only put similar words near each other, they encode analogical relations. Directions in the space encode meaning in a very direct sense, such that if you take the vector for "man," the direction that moves from "man" to "woman" in the 300-dimensional space is almost identical to the translation between "king" and "queen." Similarly, the space will encode verb tense, or capitals of countries.
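To make that concrete, here is a minimal sketch of this kind of training step in Python with gensim. The file name, the hyperparameters, and the assumption of one preprocessed speech per line are illustrative, not the authors' exact setup.

```python
# Minimal sketch: train a word2vec model on preprocessed speeches with gensim.
# Assumes speeches.txt holds one preprocessed speech per line (nouns, adjectives,
# and verbs only, lowercased, stemmed) -- an illustrative setup, not the paper's exact one.
from gensim.models import Word2Vec

with open("speeches.txt") as f:
    speeches = [line.split() for line in f]

model = Word2Vec(
    speeches,
    vector_size=300,  # roughly 300-dimensional space, as described in the talk
    window=5,         # conventional context-window default
    min_count=10,     # drop rare words
    workers=4,
)

# The classic analogy check: man is to woman as king is to ...?
print(model.wv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))
```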
There's a set of standard analogy tasks that people give to word embedding models, and word embeddings can solve these analogies almost as well as people. In our case, you can imagine there's a dimension here running from emotion to cognition, and we're going to try to construct this dimension in the space.

Our starting point is a seed dictionary, or seed lexicon, of emotional and logical words from LIWC, pronounced "Luke." This is a dictionary developed by researchers at the University of Texas, who are linguistic psychologists, and they validate it in a number of ways. For our purposes, it gives us two pretty big dictionaries of words. There are 800 tokens for what they call cognitive processing, which is what you would call logical reasoning, including causation, comparison, probability, inclusion, and exclusion; I'll show you some examples. And there's affective processing, which is emotions and moods; it has 1,400 tokens, with examples including anxiety, anger, sadness, positive, negative, and things like this. We do some preprocessing on these to keep only nouns, adjectives, and verbs, and the resulting lexicon has 359 cognition tokens and 848 emotion tokens. I should mention that we read through these and dropped any words that were false positives in our context. An example is the wildcard expression for "admire," which is considered emotional and is supposed to match "admire," "admired," "admiring," and the like. But in our case it also matches "admiral," the leader of a navy. So we read through all of those and made sure they weren't included.

The output of this process, combining the dictionary with our word embeddings, is an affective centroid and a cognitive centroid: the average embedding, the average vector, for each of these constellations of emotional and logical words.

This is what the emotion and cognition centroids look like. The emotional words are here on the left; cognition, or logic or reason, is over here on the right. We also split them up by positive and negative valence, just to show that this isn't capturing sentiment: here are the positively valenced words, and here are the negatively valenced words. Looking at these word clouds, I think you can guess what the measure is capturing. The more emotional words over here include gaiety, smile, thrill, serene, but also frightened, disgusted, stupid, angry, vile; on the right-hand side you have characteristic, discern, analytic, infer, contradictory, imply, vague. So there are both positively and negatively valenced words for each centroid. We did measure sentiment, and it turns out that the sentiment dimension is not that correlated with emotionality.
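As a rough sketch in the same hypothetical pipeline, each centroid is just the average of the seed-word vectors. The seed lists below are tiny illustrative stand-ins for the cleaned LIWC lexicons, and `model` is the word2vec model from the sketch above.

```python
import numpy as np

# Tiny illustrative stand-ins for the cleaned LIWC affect and cognition lexicons.
affect_seeds = ["anxiety", "anger", "sadness", "smile", "thrill"]
cognition_seeds = ["discern", "analytic", "infer", "imply", "vague"]

def centroid(words, wv):
    """Average embedding of the seed words present in the model's vocabulary."""
    vecs = [wv[w] for w in words if w in wv]
    return np.mean(vecs, axis=0)

affect_centroid = centroid(affect_seeds, model.wv)
cognition_centroid = centroid(cognition_seeds, model.wv)
```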
So how do we go from these emotion and cognition centroids to scoring speeches in the Congressional Record? The first step is to represent each document, each speech by a member of Congress, in the same space. Recall that we have these average vectors representing the two poles, and we want to take some speech and put it on this scale. The way we do that is to take the average of the word embeddings in that speech: there's a set of words in each speech, and they all have a vector. The only slight wrinkle is that we add a weighting term, following an NLP paper, that downweights very frequent words, so more informative or distinctive words get upweighted in the document vector. We did some validation, and this turns out not to be critical; you don't need to do the weighting.

So now we have a direction in the space going from affect to cognition, and basically, for a given document vector, we want the cosine of the angle. If a document is relatively close to the affect centroid and relatively far from the cognition centroid, they will have a small angle, which results in a high cosine similarity; and similarly if a document is closer to the cognition vector. So this is just some linear algebra, taking advantage of the geometric representation of language, to prepare this measure. We also use a simple ratio of similarities, which gives really similar results.

Here are some examples of the most emotive and most logical sentences in our corpus. You see, for example, "the key to whether or not we are going to be successful in ending what is a national disgrace is those of you who are watching this program today and others," or "let's do our part, my fellow Americans, and make this a better country today before we go to bed tonight as a tribute to our brave men and women who are fighting for us around the clock." These are really emotionally intense, and I think they're useful examples because they're not saying "I'm happy" or "I'm angry"; they're expressing emotion, which shows what the measure is capturing. The cognitive sentences are talking about procedural issues or, say, appropriation requests for Indian irrigation projects. These also make sense.

We don't just trust this handful of examples, though. We sampled random sentences and ran a crowdsourced human validation task, where native English speakers on Mechanical Turk looked at pairs of sentences, and we asked them which of the two sentences is more emotive. Then we validated our measure by asking how often the ranking provided by our score matches human judgment. So this is roughly what they would see: two sentences, and they would guess which one was more emotional, or say "I don't understand." The results are quite good, or we thought so: in the whole sample, our score matches human judgment 87% of the time, and in the set of sentences where two coders agreed, it's even more accurate. For the purposes of historical analysis, it's important that this high accuracy holds across all the decades of our dataset: back in the 1860s, 150 years ago, our emotion measure still properly ranks the sentences, even that far back in time.
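Here is a sketch of that scoring step under the same assumptions as the snippets above: a frequency-based weighting in the spirit of the downweighting just described, then cosine similarities to the two centroids, combined as a ratio shifted to stay positive. The exact weighting scheme and functional form in the paper may differ.

```python
import numpy as np

def doc_vector(tokens, wv, word_freq, a=1e-3):
    """Frequency-weighted average of word vectors; very frequent words are downweighted."""
    vecs, weights = [], []
    for w in tokens:
        if w in wv:
            vecs.append(wv[w])
            weights.append(a / (a + word_freq.get(w, 0.0)))  # word_freq: relative frequencies
    return np.average(vecs, axis=0, weights=weights)

def cos_sim(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def emotionality(tokens, wv, word_freq):
    d = doc_vector(tokens, wv, word_freq)
    # Ratio of shifted similarities: higher values mean closer to the affect pole.
    return (1 + cos_sim(d, affect_centroid)) / (1 + cos_sim(d, cognition_centroid))
```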
Also, just as an aside: we also produced rankings for this human validation using the same method as that American Political Science Review paper that just came out, which is basically this method, and our method is significantly more accurate at replicating human judgment.

Okay. So that's the measure of emotionality, and I have about 11 minutes left, which is good: there's time to report the results. First, here's emotionality by chamber over time. In green is the House, and in red is the Senate. For students of US politics, the fact that the House is more emotional than the Senate is actually pretty intuitive; the Senate is the more deliberative body. And we see some interesting initial regularities: emotion tends to go up during wars, so around World War I and World War II there are spikes at the times the US is joining those wars.

We also saw something we weren't sure about; we thought there was maybe a problem with our measure, because there's this unexpected, really big increase starting around 1978, and we wondered what was going on there. There are a few different things, of course, but something I think is really salient for our case is that this is when C-SPAN comes on. C-SPAN is the television network that put the floor speeches on cable TV, and it was right around that time that emotionality went up.

[Audience member:] I have a question, actually, about this. How do you explain the fall in this period?

[Speaker:] Yeah, that's striking to us as well, right? I don't have a good explanation for that, because this is right around the late 1960s and early 1970s; this was during Vietnam, a very divisive time. So it's kind of striking that emotionality would be going down. It's an interesting puzzle that we should think about; I don't have a good idea about why that would be.

[Audience member:] No, I was just thinking about that bit. I did not understand why; maybe they were trying to counterbalance it.

[Speaker:] Yeah, I mean, this is around Watergate and things like that; maybe it's some of the legal discussion around then. I'm not sure, but we should check that.

I just wanted to mention that these trends are not explained by sentiment, so it's not Congress becoming more positive or negative. They're also not explained by readability or textual sophistication, so it's not that they're using simpler language, which you might also expect; you can see that readability has a very different trend. And it's not driven by something happening in the larger society: if we look at Google Ngrams, which is basically Google Books, there's a much different trend, where emotionality has actually just been going down rather than up.
So, this effect of C-SPAN: you wouldn't trust just this time series, right? But we actually have some initial causal evidence that it's a causal effect, because we followed the method in Martin and Yurukoglu's 2017 AER paper, where you instrument viewership of C-SPAN with its channel position, in the sense that when C-SPAN has a lower channel position, people watch it more, because it's easier to access. And it turns out that in places where C-SPAN has a lower channel position, people watch it more, and there's higher emotionality among the representatives of those congressional districts. This is very new, and we haven't done all of the checks yet, but I'm pretty excited about the idea that C-SPAN could be this driving force shaping the rhetorical choices of members of Congress.

A few more descriptive results, and we can come back to the C-SPAN thing at the end if you want. Here's emotionality by topic. We trained LDA, a standard topic model that learns clusters of related words from the text; there's a minimal sketch of this step below. You can see that most topics are pretty low, especially procedure, which is when they're talking about things like who should vote now. It's a nice placebo test that procedure has actually stayed low the whole time. What has increased a lot is this national-narrative topic, which is the American dream, saluting the troops, things like this.

Something I think is interesting for economists: we ranked the topics by how partisan they are in terms of their emotionality, that is, the ratio of Republican to Democratic emotionality. Over here are the topics where Republicans are more emotional, versus the topics where Democrats are more emotional. You can see that fiscal policy is the most partisan topic in terms of its emotionality, which I think is very interesting, because Republicans have had to come up with all of these arguments and narratives to defend extremely high inequality in the US, and you see that in their emotional defenses of a regressive fiscal policy. On the other hand, Democrats get emotional about social issues, like discrimination and abortion, and also about economic policy, like labor unions.

We also looked at the level of emotionality by the minority or majority status of the political parties. You might think that when parties are out of power, they have to attack the incumbents, and so they might be more emotional, and we see that. This is a graph we were pretty happy with: whenever there's a change in power (blue means a Democratic majority, red means a Republican majority), the other party becomes more emotional. It's really striking: when the parties trade places as the majority, they switch to using more or less emotional rhetoric. You can see it switches here, and it switches again here. And this isn't driven by the topics, or by using more or fewer procedural words: it holds when you control for topic fixed effects.
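The LDA step mentioned above could look something like the following in the same hypothetical gensim pipeline; the number of topics and passes are illustrative choices, not the paper's.

```python
from gensim import corpora
from gensim.models import LdaModel

# Build a bag-of-words corpus from the preprocessed speeches used earlier.
dictionary = corpora.Dictionary(speeches)
bow_corpus = [dictionary.doc2bow(doc) for doc in speeches]

# Fit a standard LDA topic model.
lda = LdaModel(bow_corpus, num_topics=30, id2word=dictionary, passes=5)

# Inspect a few topics as lists of top words
# (e.g., a "procedure" or "national narrative" cluster).
for topic_id, words in lda.print_topics(num_topics=5, num_words=8):
    print(topic_id, words)
```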
[Audience member:] I have a quick question on that. Is it something about composition effects, or does it have nothing to do with that? I guess there is turnover in those houses. I would like to characterize each member of Congress by the emotion score and see whether this is driven by changes in the composition.

[Speaker:] It's not, because these regressions have speaker fixed effects. So even within the same person, and that's a great question, within the same person, they use more or less emotion depending on whether they're in the minority or the majority. And we control for whether it's divided government, the length of the speech as a kind of sophistication measure, and also sentiment, and it's still there.

[Audience member:] I see. It seems to be a rhetorical strategy for the minority. But there's the fact that between columns 2 and 3, if I understand correctly, you have this kind of difference.

[Speaker:] Yes, these columns show you that the topic fixed effects make a difference, so the kinds of topics they're talking about do make a difference. And there does also seem to be a composition effect. That's quite interesting; I had not noticed that before, but we should mention it in the draft. Thank you, Victor.

Also somewhat thematically related to being in the minority, to being under pressure and being more emotional: when you look at DW-NOMINATE scores, which are measures of policy polarization based on roll-call votes, the representatives and senators who are higher or lower on DW-NOMINATE are also more emotional. Basically, these are the members over here who are really right-wing, and these over here are really left-wing, like Bernie Sanders, and they tend to use more emotional language relative to their colleagues toward the middle of DW-NOMINATE. This also holds with the different fixed effects.

Finally, some last descriptive evidence. There could be many reasons for this, of course, but when you control for all the other observables, Democrats are more emotional. Female, Black, and Hispanic members of Congress are more emotional; Asian members are less emotional; Catholics are more emotional; and there's no effect for Jewish members.

To sum up, we have this new measure of emotive versus cognitive speech that combines dictionary methods with word embeddings. We have a package on my GitHub that's already pretty usable; you're welcome to check it out, and if you have any requests, we will add them. We show that this emotionality measure is intuitive, and it has helped us start to answer some interesting substantive questions, such as: what is the impact of television on emotionality? Are there links to polarization? We have also already tried it on Twitter, and you get some pretty nice results there too. Among the most emotional tweets, for example: "it is heartbreaking that our children and their families are facing such hateful rhetoric and unprecedented targeting." So we can already show that the measure seems to work pretty well in other corpora. And that's my presentation. Thank you.