I'm the founder of Goodly Labs, and I'm a research fellow at the Berkeley Institute for Data Science. What I'm going to present to you today (and I'm going to talk fast) is TextThresher, also known as crowd-sourced content analysis, sometimes confused with Obamacare. This is a tool that allows researchers like me to enlist crowds and citizen scientists in annotating large amounts of text according to concepts that I care about as a research scientist. I've gotten help from a lot of people, including through the Hypothesis Open Annotation Fund, so if you like TextThresher, you should thank Dan and the Hypothesis team, and Jerry, and X-Ting, and everyone.

Let me talk about this from the standpoint of a researcher. I want to understand the Occupy movement: every little action that happened, how those actions are interlinked, how those actions were shaped by the events in which they occurred, how those events were interlinked across a whole campaign in a particular city, and then how all of this played out across 184 cities that had their own Occupy movements. To get all the variables I need to start doing statistical analysis, to understand, for instance, why certain sequences of behavior lead to violence or to negotiation, I need a lot of really well-structured data, and I need to gather it all from newspaper articles. I had over 8,000 newspaper articles. I want to understand everything that happened during protester-initiated events, everything that happened during police-initiated events, everything happening at the city level that the civilian government was up to, and everything happening in the Occupy camps themselves. That's well over 120 different variables I want to extract from those 8,000 articles.

Existing text algorithms can't really do this, and we mostly trust humans to do it because we understand natural language so well. In the sciences, and the social sciences in particular, we've been using a method called content analysis, and it's basically what people have been doing for millennia, ever since we've had written text: they write their annotations as margin notes, and they start using different colors that mean different things. In the last decade or so we've moved this into a computer environment, but it's still an esoteric process, and it doesn't really scale. The big costs are demonstrated well by the Dynamics of Collective Action project at Stanford. It took ten years to get 26 variables, not 120-some variables but 26, out of a set of newspaper articles describing movements in New York state. The big bottleneck from the research standpoint is that undergraduates learn how to do the task, then they graduate and go off to play ultimate frisbee, and you have to train a new wave every time. So I wanted to figure out how I could get this work done by hundreds or thousands of people over the internet.

So again, here's the problem: 8,000 news articles and all these different variables. What I realized is that there are different sets of variables that pertain to different types of text I can find in the news articles. So right now I can point to an article and say that the blue text is about a police-initiated event, the orange text is about a protester-initiated event, and the brick-colored text is about what's going on at the camp.
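To make that first stage concrete, here is a minimal sketch of how those color-coded text units might be represented. The names and structure are my own illustration, not TextThresher's actual schema:

```python
from dataclasses import dataclass
from enum import Enum

class UnitType(Enum):
    """High-level categories a reader can highlight in an article."""
    POLICE_EVENT = "police-initiated event"        # blue highlighter
    PROTESTER_EVENT = "protester-initiated event"  # orange highlighter
    CAMP = "occupy camp"                           # brick-colored highlighter

@dataclass
class TextUnit:
    """One contiguous highlighted span within a news article."""
    article_id: int
    start: int         # character offset where the highlight begins
    end: int           # character offset where the highlight ends
    unit_type: UnitType
    annotator_id: str  # which crowd worker produced this highlight

# Example: a worker marks characters 120-340 of article 57
# as describing a police-initiated event.
unit = TextUnit(article_id=57, start=120, end=340,
                unit_type=UnitType.POLICE_EVENT, annotator_id="w-0042")
```

Each article can then be routed to the second stage as a bundle of typed text units, with the unit type determining which question set it receives.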
I had my team go through and do this by hand, but I created TextThresher to make this process work at scale. Once you've got all of those high-level highlights, the next thing you do is ask people to perform a basic reading comprehension task that we've all been doing since middle school. We serve up one of those little pieces of text, ask a few questions about it, and have people highlight the words that justify their answers. That's all TextThresher does. It's pretty simple. Let me show you.

Here is that high-level highlighting interface. What I'm highlighting here is all the information about the protester-initiated event, and I'm using this red highlighter to do it. I go to the next article. This one might have information about the protest; it might have some information here about what's going on at the camp; it might have some information here about something the police did. So you can imagine hundreds or thousands of people doing that across a whole set of articles. And then the next step is that we retrieve those completed tasks back from the internet. We're doing all of this with CrowdCrafting and PyBossa, Daniel Lombraña's platform; his company is now called SciFabric, and he's a part of this community. We're using a lot of his tools.

And then the next thing I do is I send those high-level highlights... is that not quite right? Thank you, that's much better. I send those high-level highlights to the next interface of TextThresher. Let me make sure that worked. And now I have a different interface. I'm going to... what did you do? I didn't hit the quotes. Oh no, this is so embarrassing; I've not had a fail yet. I'll show this to people later, because it looks like we'll have some time.

But basically, what you would see in this interface is a display of the same text units I just highlighted for you, and we have people answer questions. As they answer a question, they get a highlighter, and they highlight the text that justifies their answer. We apply dozens and dozens of different highlighter colors to just that one small text unit. In the process, you get very fine-grained, highly granular annotation on each type of text unit, whether it describes a protester-initiated event, a police-initiated event, what's going on at the camp, or whatever. And then all of these aggregate up to the level of the article, so we have incredibly fine-grained annotations on all of this material.

We're using this on the Occupy newspaper data. We're also using it with a different conceptual schema to look at how people commit argumentation fallacies or inferential mistakes in the news media, and we're using that right now in a project called Public Editor that we're actually pitching to the Knight Foundation today. So I wish I could have shown you exactly how it works, but I will show you later if you want to come and see. Thank you.
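As a rough sketch of that question-answering and roll-up step (again with hypothetical names; the real TextThresher/PyBossa pipeline surely differs in detail), imagine each crowd answer carrying the text unit it belongs to, the question asked, the chosen answer, and the span highlighted as justification, then aggregating by majority vote up to the article level:

```python
from collections import Counter, defaultdict

# Each tuple: (article_id, unit_id, question_id, answer, justification_span).
# Here, three workers answered the same question about the same text unit.
answers = [
    (57, 3, "arrests_made", "yes", (145, 198)),
    (57, 3, "arrests_made", "yes", (140, 201)),
    (57, 3, "arrests_made", "no",  (150, 160)),
]

def majority_vote(answers):
    """For each (article, unit, question), keep the most common answer
    and every highlighted span that justified the winning answer."""
    by_question = defaultdict(list)
    for article, unit, question, answer, span in answers:
        by_question[(article, unit, question)].append((answer, span))
    results = {}
    for key, votes in by_question.items():
        winner, _ = Counter(a for a, _ in votes).most_common(1)[0]
        spans = [s for a, s in votes if a == winner]
        results[key] = (winner, spans)
    return results

print(majority_vote(answers))
# {(57, 3, 'arrests_made'): ('yes', [(145, 198), (140, 201)])}
```

Majority vote is just the simplest possible aggregation; weighting workers by past accuracy, or requiring overlap between the justifying spans of agreeing answers, would be natural refinements.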