My name is Chris Emery and I'm a PhD student and a lecturer at the Department of Cognitive Science and Artificial Intelligence at Tilburg University, and I will be presenting some of my work on what I call invasive artificial intelligence. Now, before I do that, I was asked to open my talk by explaining why you're listening to me and not seeing me in this call. I had written a very nice, if I say so myself, TED-style talk about this, about big tech and privacy, in which I also tried to convince you that I'm not a full-blown tech skeptic, but I was urged to leave time for discussing my own research. So I will forgo the friendly intro and keep it maximally brief. Here we go.

It is my belief, and I'm sure that of many of my peers, that over the last two-ish decades we have fallen victim to the data-hoarding practices of what we assume are quote-unquote benevolent tech companies, and all of this for the sake of free and very convenient applications. We seem to happily invite these invasive, artificial-intelligence-driven systems into our homes and pockets, and we pay with vast amounts of personal information for very little in return. Meanwhile, these systems seem to have a much stronger impact on our lives than the mere services they offer, and a negative impact, I might add, on our moods, connections, life satisfaction and even safety. Unfortunately, many of these systems are so ingrained in our personal lives that it's just very difficult to part with them. But with new services, I think, come new opportunities to be critical about them. And there are, I admit that they're unpopular, but there are enough secure, open-source alternatives, so you can actually make a choice: you can vote with your data. The video platform that we're currently using is, to me, simply a new big player that fairly quickly ran its trustworthiness into the ground after it became circumstantially popular and everyone happily hopped on. So I decided to vote with my data. In doing so, I think I sacrificed a lot of contact with my colleagues in an already quite isolated time, and I had to send many apologies for all the inconvenience I'm causing. But given the present state of our digital lives, I really believe that this is the right thing to do.

Now, given this gloomy intro, it should come as no surprise by now that my research mainly involves privacy. To be more specific, I'm primarily interested in how AI, or rather machine learning algorithms, contribute to the degradation of our privacy and can potentially influence our behavior. But more importantly, I'm interested in how we can employ those same techniques to either gain insight into these systems, or even, you know, strike back at them with privacy-preserving algorithms. Now, when we talk about personal data, we typically think of data whose collection we actively consent to, and we're typically also aware that we are sharing them. My research relates to information people are typically not aware they're sharing: the kind of personal information that's encoded in the online text that you write, so messages or social media posts, et cetera. It therefore pertains to language, and my work lies at the intersection of natural language processing, machine learning, privacy and even security. Now, I hear you think: if I don't want a system to know personal things about me, I just don't talk about them, right?
And you're absolutely correct. But there are things in your writing that you're much less aware of sharing: gender, age, personality, political opinions, even mental health issues. Those are certainly captured in content words, so the literal words that you're sharing, but the signals are much more varied than you'd expect. It's not only the concrete mention of such an attribute; what is far more characteristic is your unique style of writing, how your style differs from other people's and overlaps with that of people I might, for example, know demographic information about. That is what facilitates automatic prediction of those attributes. All such analyses are part of computational stylometry, and specifically author profiling.

Typically, such work needs vast amounts of hand-labeled data, obtained for example by going through people's social media accounts and looking at names, pictures and cues in their writing. It's a very time-intensive task, and that also restricts the application of those techniques, which is generally good, I'd say. Unfortunately, however, in my research I showed that we don't even need that kind of structured information. We just need people to talk about themselves: say that they turn 30, mention that they're a woman, for example, or post on particular fora related to an attribute we're interested in, self-help fora, for example. That particular research came about because I was wondering about the pervasiveness of those self-reports. How common are they? And more importantly, when they are pervasive, can they provide high-quality labels for classification, despite the fact that using them as labeling heuristics is quite noisy? People might not tell the truth, or they might have meant something different from what we're literally interpreting with the queries we use to retrieve the data. If they do turn out to be effective, it would be an excellent example of people oversharing and thereby compromising the privacy of others. And yeah, it turns out they are quite effective.

So we trained author profiling classifiers, specifically for binary gender classification, a task that I didn't invent, so please don't shoot the messenger. If we train them on profiles labeled with the gender the users essentially self-reported (obviously we removed the reports while training), these models actually perform very close to models trained on expensive manual labels. What I thought was even scarier is that the pipeline I set up, collecting data using this heuristic (even taking into account Twitter's rate limits, which don't allow very fast data collection), training a fairly simple linear model and then inferring these labels, finished entirely within 24 hours on a very simple four-core laptop. And unfortunately, this distant collection of labels, as it is called, was also shown in previous work to work for a very large range of personal attributes, which implies that collecting this data and making inferences about such attributes is incredibly easy; scarily easy; easy for individuals who are beyond any regulation.
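To give a concrete sense of how lightweight such a pipeline can be, here is a minimal sketch of distant labeling from self-reports followed by a simple linear profiling classifier. This is not the speaker's actual implementation: the regular expressions, the label set, and the scikit-learn model choice are illustrative assumptions.

```python
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Illustrative self-report patterns (assumptions, not the actual queries used).
SELF_REPORTS = {
    "woman": re.compile(r"\bI(?:'m| am) a (?:woman|girl)\b", re.IGNORECASE),
    "man":   re.compile(r"\bI(?:'m| am) a (?:man|guy)\b", re.IGNORECASE),
}

def distant_label(posts):
    """Assign a noisy label to an account based on self-reports in its posts."""
    for label, pattern in SELF_REPORTS.items():
        if any(pattern.search(p) for p in posts):
            # Remove the self-report itself so the classifier cannot simply memorize it.
            cleaned = [pattern.sub("", p) for p in posts]
            return label, " ".join(cleaned)
    return None, None

def train_profiler(accounts):
    """accounts: iterable of lists of posts (one list per account)."""
    texts, labels = [], []
    for posts in accounts:
        label, text = distant_label(posts)
        if label is not None:
            texts.append(text)
            labels.append(label)
    # A simple linear model over character n-grams is cheap to train on a laptop.
    model = make_pipeline(
        TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
        LogisticRegression(max_iter=1000),
    )
    model.fit(texts, labels)
    return model
```

The point of the sketch is not the exact features or model, but that nothing in it requires special access or hardware: a handful of patterns and an off-the-shelf linear classifier suffice.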
And I find this very troublesome, all the more so because it facilitates invasive predictions that are particularly harmful to people in vulnerable situations, regarding, for example, mental health, race or gender. Self-censorship, so not engaging in this online discourse at all, really seems like the only way out of it; encryption is not going to help here, and anything facing the public, so a group or maybe an employer, is potentially compromised.

Another field of study is adversarial stylometry, which intends to break such classifiers by changing things about your writing so that these author profiling classifiers cannot infer those attributes anymore. That could be done either with you in the loop or completely algorithmically. It's a highly complex and very interesting problem, because it requires these algorithms to maintain the original meaning of the text and not be too obvious, so that it's not easy to spot that changes were made to the original text.

Now, my first attempt at this was trying to do things completely end-to-end using, of course, deep learning. The problem here is that you can't really tell a neural network what semantically consistent output is in order to construct a loss; there's no good metric for preserving meaning the way that, for example, maximum likelihood with cross-entropy tells you something about classification. There does exist some reinforcement learning work using metrics that seem to correlate to some extent with human judgment, but with very limited and varying success. In machine translation, this kind of semantic problem is tackled by the source and target languages always containing the same information; they can therefore be seen as semantic mirrors. Unfortunately, there's not much data around in which people of different demographic categories write exactly the same text, and to be honest, it seems like a nightmare to set up that collection anyway. So we solved this by training autoencoders that are actively trained not to encode anything about a particular style, using a gradient reversal layer. This layer feeds a linear predictor that tries to classify the style based on the hidden layer, or the context vector, and the gradient of this particular predictor is inverted, which implies that when we run backprop, the encoding of the words has to minimize the performance of this classifier. Hopefully, when we decode back to text in this sequence-to-sequence model, the style is gone (a minimal sketch of such a reversal layer follows below). We managed to get the performance of a target classifier close to chance in our experiments, but training the model on noisy social media data unfortunately proved very difficult.

This is why, for my latest research, we backed off to more controlled, local changes through lexical substitution. Lexical substitution involves determining which words or sections of text are important to the author profiling classifier's decisions, and then replacing those in a way that both fools the classifier and maintains the meaning of the text. If you have access to the model in any way, that is very doable, but in reality we don't actually have access to these systems, right? So I focused on the deployment of these techniques under realistic criteria: can an average internet user run such software on their own systems, can they collect data to train the systems and inform them, and do these locally trained attacks successfully transfer to actual author profiling algorithms?
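To make the gradient reversal trick mentioned above concrete, here is a minimal sketch of such a layer and an attached linear style predictor in PyTorch. The module names, dimensions and the scaling factor are illustrative assumptions, not the speaker's actual architecture.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips (and scales) the gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd=1.0):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse the gradient flowing back into the encoder.
        return -ctx.lambd * grad_output, None

class StyleAdversary(nn.Module):
    """Linear style classifier fed through a gradient reversal layer.

    When its classification loss is added to the autoencoder's reconstruction
    loss, the reversed gradient pushes the encoder to *remove* style
    information from the context vector, while the classifier itself still
    tries to recover it.
    """
    def __init__(self, hidden_dim, n_styles, lambd=1.0):
        super().__init__()
        self.lambd = lambd
        self.clf = nn.Linear(hidden_dim, n_styles)

    def forward(self, context_vector):
        reversed_h = GradReverse.apply(context_vector, self.lambd)
        return self.clf(reversed_h)
```

In training, one would simply sum the sequence-to-sequence reconstruction loss and this adversary's cross-entropy loss; the reversal makes the encoder and the style classifier pull in opposite directions.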
All of this needs to be fast, very easy to use, and also work very decently at the same time. For this we employed several methods: we proposed a few transformer-based techniques that both suggest candidates to substitute these important words with and rank those candidates according to their contextual fit in the sentence, so: does a substitute make sense given the surrounding words and their semantic information? (A rough sketch of this kind of candidate generation and ranking follows at the end.) Now, this all turned out to be effective, but the changes were not completely inconspicuous to humans, for the simple fact that the output of these rewriting techniques is just not completely semantically correct, which is to be expected given the difficult nature of the task. I'm, however, confident that with a human in the loop who has access to the behavior of such classifiers and can choose which words to fool them with, this is actually very promising for privacy-preserving natural language processing techniques.

So, on that note, you are more than invited to check out my work. It's typically all open source, open access, well documented, if I say so myself, and reproducible, because I think that if you're doing science for the benefit of at least a subset of humanity, you'd better make sure they can use it.
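As a rough illustration of the kind of transformer-based substitution and ranking described above (not the speaker's exact method), one can mask a word the profiling classifier relies on, let a pretrained masked language model propose replacements, and rank them by how well they fit the context. The Hugging Face fill-mask pipeline and the roberta-base model below are assumptions made for the sake of the example.

```python
from transformers import pipeline

# Pretrained masked language model used both to propose and to score substitutes.
fill_mask = pipeline("fill-mask", model="roberta-base")

def substitution_candidates(sentence, target_word, top_k=10):
    """Mask the target word and let the MLM propose in-context replacements,
    ranked by the model's probability of each candidate in that slot."""
    masked = sentence.replace(target_word, fill_mask.tokenizer.mask_token, 1)
    candidates = fill_mask(masked, top_k=top_k)
    # Each candidate comes with a score reflecting its contextual fit.
    return [(c["token_str"].strip(), c["score"]) for c in candidates
            if c["token_str"].strip().lower() != target_word.lower()]

# Hypothetical example: suggest replacements for a stylistically revealing word.
print(substitution_candidates("I absolutely adore this lovely little café.", "adore"))
```

In a real attack one would additionally check that the chosen substitute flips or weakens the profiling classifier's prediction while keeping the sentence's meaning intact, which is exactly the tension the talk describes.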