Welcome to this talk on Chaos Zone TV: "The Rise and Fall of Social Bot Research", presented by Florian Gallwitz. For the German viewers and listeners: this talk will be translated, so look for the translated stream. I will now switch back to English. This is a talk by Florian Gallwitz from the Nuremberg Institute of Technology, and he will talk about social bots. In recent years, we have all observed the phenomenon of social bots accompanying various media events, and the media have run wave after wave of stories about social bot activity on social media. Four years ago, Michael Kreil already found significant shortcomings in the research on social bots, and Florian will now show us in more depth why the current research in this field is flawed, or even deeply flawed. So have a lot of fun with the talk "The Rise and Fall of Social Bot Research".

Thank you very much for the introduction. So, unless you have been living under a rock somewhere in the desert, you have seen headlines like these; they are just a couple of headlines randomly selected via Google News. We have been told that bots on Twitter are amplifying conspiracy theories, that they are spreading disinformation before elections, that even half of the accounts tweeting about the coronavirus are likely bots, that bots are a major source of climate disinformation and that there are hundreds of thousands of them, that they interfere in Canadian politics and in British politics, that Donald Trump has millions of bots supporting him, that they are a danger to democracy, that they are poisoning democracy and damaging democracy, that they supported Boris Johnson, that they are spreading false claims about the bushfires in Australia, and that they may even be trying to influence elections in Germany. You will find hundreds of headlines like these if you just search for the keyword "bots" in Google News.
Behind many of these headlines there is academic research, most famously the paper by Ferrara and colleagues called "The Rise of Social Bots", which inspired the title of this talk. There are papers that claim to prove some influence of social bots on the US presidential elections, and papers on the spread of fake news by social bots. And this is one of many papers by Philip Howard, who claimed that bots had interfered in the Brexit referendum. But many other issues are allegedly influenced by social bots. In health communication, for example, there is the vaccine discussion: this is a paper from 2018, from before the COVID pandemic, and this is a more recent paper; both claim that bots are interfering in the discourse about vaccines. Bots are also allegedly interfering in the discourse about climate, and generally, any low-credibility content on Twitter is allegedly spread by social bots. By the end of this talk, you will hopefully have learned that the low-credibility content is actually inside these papers.

So what is a social bot? If we look at one of the many slightly different definitions, this one is by Ferrara. He is the one who basically created the social bot hype.
So we should believe him. If we listen to him, social bots are "automated accounts that use artificial intelligence to steer discussions and promote specific ideas or products on social media such as Twitter and Facebook. To the typical social media user browsing their feeds, social bots may go unnoticed, as they are designed to resemble the appearance of human users, and they behave online in a manner similar to humans." That is a pretty clear definition of what social bots are supposed to be, and if you look at all the definitions of social bots in academic papers, it gives a pretty precise picture of the general consensus. Typically, social bots are assumed to be political accounts influencing political discussions; they have to be automated; and they are accounts with fake human profiles, pretending to be humans and somehow behaving in a human fashion.

Obviously, at the intersection between political accounts and fake accounts you would find paid human trolls, for example: people who are paid to take part in political discussions. You can also easily find automated political accounts: the Fox News feed, for example, is political and somehow automated, since newly published articles on the website automatically get tweeted. And you can at least suspect that a lot of automated accounts with fake human profiles are used for porn or crypto scams, basically to scam individual users on Twitter. But social bots are supposed to combine all three properties; that is the general idea of social bots. And social bots are attributed with some kind of autonomy: people believe that they are able to produce content on their own.
So they are supposedly somehow autonomous, and very often the words "artificial intelligence" are mentioned in connection with social bots. But if Twitter is crowded with social bots, where the hell is one? That was the starting point for my interest in social bots. I felt kind of gaslighted: the media were telling me that there are millions of social bots on Twitter, yet I had spent almost a decade on Twitter and had never seen one of these social bots. That is what got me interested. I have a background in AI and pattern recognition, and I have been building conversational dialogue systems, which today we would call chatbots, for many years. I never really believed that, at the current state of the art, you could build political bots that could interfere in current political discussions; that seemed strange and implausible to me. So if these things exist, how do they work? I wanted to find one, to take a look at an actual social bot, and I started searching for one in late 2018, three years ago. Since then, I have studied dozens of scientific papers and media reports about alleged social bots.
Basically, I read every newspaper article and media report about social bots that Google News could come up with, in various countries, and I analyzed hundreds of Twitter accounts that had been accused of being social bots, for example in newspapers where specific alleged social bot accounts were named. Of course, I also asked a number of social bot researchers for an example of an actual social bot, and I performed experiments using the Twitter API involving hundreds of thousands of accounts, some of them together with Michael Kreil, who was mentioned in the introduction and who has been working on this problem one or two years longer than me. The total number of social bots I have found so far, using all these different approaches, is zero. I could not find a single social bot that fits Emilio Ferrara's definition. That seems kind of strange. So where does this apparent mismatch come from?

To understand this, you have to look at the methodological basis of all these scientific papers, and basically it boils down to two different methods. The first is the so-called Oxford criterion, named after the University of Oxford, where Philip Howard defined "heavy automation" as accounts that post at least 50 times a day. He simply counted the tweeting frequency of accounts: every account that tweeted more than 50 times a day was defined as heavily automated and referred to as a bot. That seems kind of strange. The second and nowadays more common approach is an automated tool called Botometer. There is a public website where you can enter a Twitter account, press a button, and get a score between zero and five, or between zero and one, depending on the scale you use. Researchers, however, typically use the API.
They feed a long list of Twitter accounts into Botometer, receive a bot score for each account, and then apply a threshold, typically 50 percent, i.e. 0.5 on a scale from zero to one. If the bot score is larger than 0.5, they simply assume the account is a bot. There are similar tools that work in basically the same manner, but Botometer is by far the most popular tool of this kind. In both cases, no manual checks are performed on whether the accounts identified by these rather crude methods are actually bots. And even more interestingly, the names or user IDs of these alleged bots are not published; you will not find a single actual bot in all these dozens of research papers. If you ask the authors for the actual bots, the data is routinely withheld. They will give you flimsy excuses about data protection laws and privacy, or they accidentally deleted the data, or they never stored it, or whatever; pretty much "the dog ate my homework" type of excuses.

Okay, let's take a slightly closer look at the way these researchers produce headlines. There is basically a perfect recipe for creating a headline in the New York Times. You decide which political issue you are interested in; really, you can come up with any topic. Let's say you are interested in gaming, and you want to create a headline about gaming and social bots in the New York Times. What do you do?
You look for gaming-related keywords on Twitter and identify accounts that tweet about gaming. You come up with a long list of user IDs of Twitter accounts that tweet about gaming, you feed them into Botometer, you use your threshold, 0.5 for example, and you get a list of bots and a list of humans. And then the New York Times will publish a headline: "Researchers: nearly half of accounts tweeting about gaming are bots". Now, skeptical people like me might want to take a look at the actual bots to verify your claims, so obviously it is more useful to hide this data. But journalists don't seem to be interested in the actual bots; they have never questioned the validity of these claims, even without ever looking at a single actual bot. So they will believe you, and they will publish this in the New York Times.

The first criterion used for this type of paper is the Oxford criterion, the 50-tweets-per-day criterion. It is pretty easy to show that it is not very useful. You can find very large Twitter accounts of celebrities who tweet more than 50 times per day, such as the journalist Glenn Greenwald, or Cory Doctorow, a blogger and author with a very large following, or Johannes Kahrs, a German member of parliament who tweeted up to 300 times per day. Donald Trump himself tweeted more than 50 times per day on six days in 2020, so at least for a short period he fulfilled the criterion. And political activists, who have more time than presidents or members of parliament, obviously tweet a lot more. My favorite account on Twitter actually belongs to a guy called Erie Glucksack.
He is a Canadian engineer and retired college instructor who has been tweeting at rates above 300 tweets per day over a period of several months, always ranting about liberals in Canada; he hates Justin Trudeau. Another interesting experiment I performed concerned the extremely popular K-pop band BTS. A German radio reporter made some derogatory remarks about the band, and the BTS fans were extremely angry about this: they created more than three million tweets within four days under the hashtag #Bayern3Racist, Bayern 3 being the radio program involved. 500 different accounts each fulfilled the Oxford criterion, with a maximum of 344 tweets per day on average. I looked at the top-ranking accounts in this list, and none of them showed any signs of automation. They were real people: they tweeted pictures of themselves, everything was consistent, and they came from different countries. Those are real people, not bots.

The second criterion often used in the search for bots is Botometer. You can try it; the link is given at the bottom of the page. You simply enter the name of a Twitter account, and Botometer returns a score. These accounts, for example, are all bots according to Botometer: Tim Cook is a bot, the Pope is a bot, the new German chancellor Olaf Scholz is a bot, former chess world champion Garry Kasparov is a bot, the Auschwitz Museum is a bot, Joe Biden is a bot, the Brazilian president Jair Bolsonaro is a bot, and the highly respected German virologist Christian Drosten is a bot, but so is his counterpart, a retired economist who basically does not believe in the coronavirus; he is also a bot.
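To make the two criteria concrete, here is a minimal sketch in Python. All account names, tweet counts, and scores are invented for illustration; the scores merely stand in for Botometer output on a 0-to-1 scale, and nothing here calls the real Botometer API:

```python
# Toy account records: (name, tweets in the last 30 days, bot-style score).
# All values are invented for illustration only.
accounts = [
    ("kpop_fan", 9300, 0.31),    # ~310 tweets/day: a very active human fan
    ("party_feed", 2100, 0.81),  # an automated news feed
    ("quiet_user", 45, 0.62),    # barely tweets, yet scores above 0.5
]

def oxford_criterion(tweets_last_30_days, per_day=50):
    """Howard's 'heavy automation' rule: at least 50 tweets/day on average."""
    return tweets_last_30_days / 30 >= per_day

def botometer_style(score, threshold=0.5):
    """The common Botometer recipe: score above the threshold means 'bot'."""
    return score >= threshold

for name, n_tweets, score in accounts:
    print(f"{name}: oxford={oxford_criterion(n_tweets)}, "
          f"score-based={botometer_style(score)}")
```

Note that the very active human fan is flagged by the Oxford criterion and the nearly inactive account by the score threshold; neither check ever looks at the account itself.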
Basically, if you just randomly enter account names, chances are very high that you will run into "bots" according to Botometer. A similar analysis was performed by Michael Kreil in 2018. He fed the Twitter accounts of all members of the US Congress into Botometer, and you can see that it basically behaves like a random number generator, with a roughly normal distribution centered pretty much on the threshold that is usually used to discriminate between bots and humans.

If we use the same approach that social bot researchers use to identify bots, we can find all kinds of interesting stuff. We could find UFOs; we could find anything we want to find. I want to show how we could find unicorns. Let's say we want to find unicorns in Africa. We would train a kind of photo trap for unicorns and deploy it in the Serengeti to find actual unicorns. For training, we have the same problem the social bot researchers have: we don't have actual unicorns to train our classifier on. So we would use unicorn-like animals, a white horse for example, or toy unicorns, as our training set for unicorns, and we would use some other animals, a cow, a pig, a cat in the dark, as non-unicorns. Then we would train our classifier on this data, move it to Africa, and find lots of unicorns. In this case, for example, a zebra would get a 75 percent unicorn score, because it looks quite similar to the white horse, and this white egret would also get a very high unicorn score, because it has a pointy beak and it is white, so it looks a lot like a unicorn. Now, if you want to convince people that you actually found lots of unicorns in Africa, some people might want to take a look at your pictures, but obviously that would be bad for your claims.
So obviously you would want to hide the actual unicorns you found and simply rely on the results of your unicorn classifier, and to people who complain you would say: why, we have an almost 100 percent recognition rate on our training data, so believe us. That is exactly the way the Botometer people argue.

Let's take a closer look at how this kind of classifier works. Statistical classifiers, neural networks, or, in the case of Botometer, a random forest classifier: any of these basically places each observation in a feature space. The dimensions might be, say, the whiteness and the pointiness of the animals. The unicorns and the non-unicorns ideally form clusters in this feature space, and training the classifier creates a class boundary that can perfectly discriminate between unicorns and non-unicorns on your training data, and on very similar data. But if you now deploy your classifier to Africa, to the real world, it will label lots of animals as unicorns and lots as non-unicorns, depending on the class boundary chosen during training. It would still produce a 100 percent recognition rate on your training data, but it seems far-fetched to believe that this animal, or that one, is actually a unicorn. If you remove the training data from the picture, you see that you simply get some split into "unicorns" and "non-unicorns" depending on the class boundary you trained. It seems outright ridiculous to believe that the animals labeled as unicorns are actual unicorns, or even that the relative share of "unicorns" resulting from this approach can serve as an approximation of the true prevalence of unicorns. But that is exactly the way those bot researchers argue.

And there is another problem. If you think that what I have told you so far is bad for social bot researchers, it gets even worse, but now I have to explain some fundamental ideas of pattern recognition. Let's first look at the approach people use to come up with claims like "50 percent" or "15 percent of the accounts are bots". They feed a list of accounts into the classifier and get, for example, five bots and eight non-bots. Then they calculate the relative share of bots, 38.5 percent in this case, and claim that 38.5 percent of the accounts are bots. This is the so-called a priori probability of bots, or the prevalence of bots. The fun part is that this probability is already part of the decision process used by the classifier. A random forest classifier like Botometer will learn this probability from the training data. So if the Botometer people use 50 percent bots and 50 percent non-bots to train Botometer, the classifier will have incorporated the information that 50 percent of the accounts are bots. And the optimal classifier with the lowest possible error rate, which is what the training process tries to achieve, will always adjust its threshold so that it conforms to this equation, in order to produce good results on the training data or on similar data. But this is circular reasoning: to estimate the share of bots using this approach, you would have to know it beforehand. It simply does not make sense. It is impossible to use a statistical classifier this way to determine the prevalence of bots, or of unicorns, or of anything else. In reality, with Botometer, you typically get a curve like this.
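A curve of this kind is easy to reproduce. The sketch below draws simulated scores from a bell curve centered on the usual 0.5 threshold (mimicking the near-random score distribution mentioned earlier; the distribution and all numbers are assumptions made for illustration) and shows how the reported "bot share" depends entirely on the chosen threshold:

```python
import random

random.seed(7)
# 10,000 simulated bot scores, roughly normally distributed around 0.5
# and clipped to the 0..1 scale (an assumed, illustrative distribution).
scores = [min(1.0, max(0.0, random.gauss(0.5, 0.15))) for _ in range(10_000)]

def bot_share(scores, threshold):
    """Fraction of accounts that would be reported as bots."""
    return sum(s >= threshold for s in scores) / len(scores)

for t in (0.3, 0.5, 0.7):
    print(f"threshold {t:.1f} -> {bot_share(scores, t):.0%} 'bots'")
```

The same set of accounts yields wildly different headline numbers depending only on where the threshold happens to be placed.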
Depending on the threshold you choose, you get a certain percentage of accounts classified as bots. The standard threshold many people use is 50 percent. But if the resulting number of bots is too high to be credible, the researchers typically choose a higher threshold, which gives a lower percentage of bots. If you want more bots, you choose a lower threshold; if you want fewer bots, you choose a higher threshold. So basically, the results the researchers present are based on their guess of what the prevalence of bots might be. It does not really make sense.

So far, we have falsified the methodology used by social bot researchers using examples of human accounts fed into the different methods, and we have also shown that it cannot even work in theory. One secret remains: what do the bots that these researchers claim to find in their studies actually look like? These are the accounts they prefer to hide when we ask them for the lists, and we spent quite a lot of time trying to get behind this question and find out what the flagged accounts actually look like. The first study we took a closer look at was about the German federal election in 2017. The authors claimed that the share of social bots among the followers of seven German parties increased from 7.1 percent to 9.9 percent during the election campaign. Basically, they claimed that roughly ten percent of 800,000 accounts following the largest German parties were social bots, so that more than 80,000 accounts following German parties were social bots. That's interesting.
So I asked the first author for examples of actual social bots, and he gave me four accounts which did not remotely look like social bots. That seemed kind of strange, and he also seemed to be fully aware that there were no social bots in his sample. So we tried to replicate this study a couple of months later. We downloaded all the followers of the German parties, fed them through Botometer, and used the same threshold these authors had used, and for the first time we had a look at the actual "bots" that show up in this type of study. We found 260,000 "bots" and took a random sample, in this case 109 accounts, and I spent a lot of time looking at those accounts very carefully. They are ranked by bot score, so this is close to the maximum value Botometer can produce, and those are all the accounts; you can check them yourself if you believe any of them are bots. We could not find any bots. For example, there was a dance café in Erfurt, or this guy, or a forest in southern Bavaria, or this guy complaining about the noise produced by motorbikes in Gelsenkirchen. A quick Google search showed that exactly on that day there was a motorcycle event at the amphitheater in Gelsenkirchen. These are real people with real ears complaining about real noise. Many of these accounts are very inactive, basically the opposite of what social bot researchers make you believe about the properties of social bots. We could not find a single social bot in this list.
Here are some more of these accounts. For example, Kaira Miller: probably the only tweet she ever produced with this account was a question asking Microsoft for help, and it resulted in a bot score of 4.56, close to the maximum of five. A blogger, a doctor, a politician, a former regional chairwoman of the political party Die Linke with a bot score of 4.4 on a scale of zero to five. No bots.

The second study we looked at closely claimed to have found a very large number of social bots spreading vaccine-critical information, around 200,000 social bots. I asked Adam Dunn, the first author, for the list of his bots, and he actually gave it to me. For the very first time, we had the precise list of accounts that was used to produce this kind of paper. Again, we took a random sample, in this case 121 of the 200,000 "bots", and I took a very close look at these 121 accounts, and we could not find a single social bot. These are just nine of the accounts I analyzed. You can see these are doctors: a very high-ranking doctor at the World Health Organization, a medical student from Rwanda, a professor from Texas, someone working at a medical technology company in Nigeria, a medical intern from Saudi Arabia, a health professional from California, a postdoctoral researcher working on RNA viruses, a data operator and IT professional from India. People from across the world, many tweeting under their real names, many with impressive professional credentials, and these social bot researchers simply claim they are all bots. That seems kind of strange. So let's take a closer look at this list of 121 accounts; you can check them, they are all in our paper.
For example, the World Association for Medical Law was rated a bot, and this pastor from Boston was rated a bot, but none of them seemed remotely automated. This is one of my favorites, Taylor and Timothy: tweets by a girl called Taylor who is seriously in love with a guy called Timothy, to whom she is engaged. These are two of her tweets, and the bot score is 4.39, close to the maximum. Or this Laura James: she even tweeted lots of pictures of herself, she tweets a lot about UNICEF, and she was really rated a bot.

Okay, there were some problems with the slides, but they seem to be solved now. So, in 2021, two other interesting papers were published, in which researchers used Botometer and actually checked the resulting list of bots, much like we did. In the first paper, they took a very close look at 500 accounts: they fed them into Botometer and also manually checked whether they could be automated, or whether they looked automated. It turned out that Botometer labeled roughly six percent of the users as social bots, but none of those accounts looked even remotely automated, so the authors simply discarded the Botometer labels as rubbish. Another, very similar paper was published as a preprint a few weeks ago. They also used Botometer with the standard threshold of 0.5, and 86 accounts were labeled as bots, but not one of these was a social bot. Only one of them was automated at all.
It automatically shared articles from an external personal blog; so it may have been automated, but that does not make it a social bot. So it basically turned out, once again, that the Botometer scores are absolutely unreliable and have nothing to do with reality.

So now we have seen that we can ignore the existing social bot research based on the Oxford criterion, and that we can ignore the research based on Botometer. But there are some other methods people use to claim to find social bots. This is a headline from the Times of London: an "army of Twitter bots" follows top politicians such as Nicola Sturgeon and a few others. And when you ask the people behind it how they know these are bots, it comes down to this guy. (Sorry, there are still some problems with the slide stream, but I think I won't change anything now.) Okay, so this guy explained on Twitter how it worked: "We downloaded the latest 1k followers of seven politicians and checked numbers of users with exactly eight digits in usernames." So these people assume that accounts with usernames ending in eight digits, like Mike26481564, are bots. Many people believe this, but it is actually complete bullshit, because names like these have been automatically assigned to newly created accounts by Twitter, at least since 2017. If you create a new account with a common name, you automatically get assigned a name with eight digits attached. If you are familiar with Twitter, you know it is possible to change the name of your Twitter account, but most people are not aware of this feature.
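As an aside, the Times heuristic amounts to a one-line pattern match. The following is only my reconstruction (the actual analysis code was never published, so the regex and the example handles are invented for illustration):

```python
import re

# Flag handles that end in exactly eight digits, as the Times check did.
# The \D before \d{8} ensures the digit run is exactly eight long.
EIGHT_DIGITS_AT_END = re.compile(r"\D\d{8}$")

def flagged_as_bot(handle):
    """True if the handle matches the 'eight digits at the end' heuristic."""
    return EIGHT_DIGITS_AT_END.search(handle) is not None

# Twitter auto-assigns handles of exactly this shape to new accounts,
# so the check mostly flags ordinary users who kept the default name.
print(flagged_as_bot("Mike26481564"))    # default-style handle
print(flagged_as_bot("GlennGreenwald")) # ordinary handle
```

In other words, the "bot army" criterion flags precisely the accounts that never bothered to change Twitter's auto-generated default name.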
So, by this criterion, they simply look like bots for the rest of their days on Twitter. What about accounts posting the same stuff? Very often, claims about bots are made with a screenshot like this one: different accounts posting more or less the same message. Typically, these are so-called copypastas, where people ironically make fun of a tweet they don't like by posting it over and over again. Sometimes they post it over and over again unironically, or different accounts post the same message. It is pretty hard to understand if you are not familiar with the meme culture on Twitter. This guy, for example: all these accounts are on the same side of the discussion as him, but he doesn't get the joke, so he believes they are bots. Actually, those are his friends in this political issue.

And then you find reports like the one published yearly by the Computational Propaganda Research Project at the Oxford Internet Institute. They give you colorful lists of countries where bots have supposedly been identified during the year, and if you take a close look, the whole list is not based on actual bots but on news reports about alleged bots. This is a process that will continue forever, because newspapers will report about this report, and the reports about the report will lead to further bot claims in the newspapers.
There is a very nice remark by a German journalist, made in a very different context, that describes what is going on: "If I founded an institute to combat public health hazards caused by hedgehog bites, and provided this institute with staff positions, then I would get a study every year warning about the growing dangers posed by aggressive hedgehogs. Anything else would be rather stupid on the part of the institute's staff." That pretty much describes what is going on with all these research institutes involved in investigating social bots.

To conclude: social bot research is fundamentally based on the misclassification of human users as social bots. Malicious, interactive social bots do not seem to exist at all. After years of studying the output of social bot research, we have yet to see a single credible example of a social bot. And the researchers involved in this type of research violate basic academic standards: they make claims that are not supported by data, and they withhold the data which would allow us to verify their claims. The public has been misled over the non-issue of social bots for years. Michael Kreil and I wrote all of this down in a paper called "The Rise and Fall of Social Bot Research"; you can download it at the link on the slide. It has all the lists of "bots" in the appendix, so you can take a closer look and try to find social bots in those lists if you like.
Thank you very much for your attention.

Well, thanks so much, Florian, for your talk. It seems to me that the field of social bot research is indeed in a very problematic state, and, just for the sake of realistic thinking and discussion, the work you and Michael Kreil are doing is really important for all of us. We, and maybe you, or scientists in general, need to provide a clear basis for discussion and media coverage, so that these topics are actually talked about reasonably. We collected some questions from our audience, and maybe I will go through them one by one. The first one: you talked about what is and what is not a social bot, but shouldn't the question rather be about the effect of such accounts, regardless of whether they are click workers, people with too much time on their hands, paid influencers, or automated accounts? Isn't being a bot and acting like a bot in those political ways effectively the same, in that the effect is what matters? Would you agree with that?

Well, we are not the ones who made the claim that these automated bots exist. Before we talk about the effects of something, we first have to ask whether it exists at all. There are many directions social bot research could take, but before we can talk about effects rather than existence, we first have to root out this belief, which I think is a conspiracy theory, a fairy tale: the belief that these automated social bot accounts exist. Many people actually believe it, and typically, discussions on Twitter go "you're a bot", "no, you're a bot", "this is your Botometer score", "no, I'm not a bot", and so on; these discussions keep going, and they are based on this belief, on this ridiculous conspiracy theory.
So first of all, Michael and I are trying to let people have a realistic view of what's actually going on on Twitter. Now, the question whether accounts are maybe paid by some malicious actors to influence discussions is a very different question. We have been looking at automation. I'm a computer scientist, so I'm not the right person to analyze the motivations why people might be tweeting a lot on Twitter. But my impression is that the very active accounts on Twitter, like the Canadian account I mentioned in the talk, Erie Gluck: I don't believe he's a paid actor. He's a retired professor, an engineer, he has a lot of money, and he's certainly not paid for being politically very active on Twitter. He is acting out of intrinsic motivation. The same holds for all those 500 accounts that tweeted hundreds of times about the K-pop band. They are not paid actors. They are people who are genuinely angry about things, or people who are really trying to change things by being very active on Twitter. So my general impression is that people have a tendency to overestimate the problem of malicious fake accounts. The default assumption should be that these are real people with real intrinsic motivation.

Another question we have: in your slides you had this circular-reasoning example, this kind of fundamental flaw of circular reasoning. The question from this person is: couldn't it be, rather than circular reasoning, a kind of resonance, which would be a valid method to get a result, in the sense that the results resonate with the real world, so that in the end you would confirm that something resonates with your findings? It's a bit hard to ask this question, but maybe you got it.

I believe I got the gist of the question.
So maybe the idea would be: to start using this method, you would have to have a guess for the prior probability of bots, and then you could apply it iteratively and hopefully end up at the real value. But actually, I believe it will either explode and go to 100 percent, or it will go to zero. From my, let's say, 25 years of experience in pattern recognition, I don't think this approach is remotely capable of estimating the real probability by applying it iteratively or anything like that. So I don't think that will work. The interesting part about this equation is that any kind of classifier uses it; even humans use it. If my task is to label bots in a list of accounts, and I have some expectation of the real prevalence of bots, I will include this information. So basically the problem is: if we don't know the prevalence, how can we even estimate it? The only way would be if the two class-conditional distributions, which basically tell us how bot-like the features look, for example the tweeting rate, are very different. For example, if bots tweet 10,000 times a day and humans tweet only 10 times a day, then we can keep the two classes apart, and the two prior probabilities, which we don't know, become rather unimportant. But if the two distributions are somewhat similar, we run into trouble.
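The instability the speaker describes can be sketched numerically. The following is a minimal, hypothetical illustration with entirely made-up numbers (not from the talk or the paper): a single Gaussian feature that is N(0, 1) for humans and N(0.5, 1) for "bots", with a true bot prevalence of 30 percent. Iterating "MAP-classify with the current prior, then recount the prevalence" on such heavily overlapping distributions does not converge to the true value; it collapses to an extreme.

```python
import math

def norm_sf(x):
    """Survival function of the standard normal: P(X > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

# Hypothetical setup: one feature (say, a normalized tweeting rate),
# distributed N(0, 1) for humans and N(0.5, 1) for "bots".
TRUE_PREVALENCE = 0.3  # the value the iteration should ideally recover

def classify_and_count(p):
    """One iteration: MAP-classify with prior p, then recount prevalence.

    With these two densities, "bot" is the MAP decision exactly when
    x > 0.25 - 2 * log(p / (1 - p)); we compute the expected fraction
    of the whole population that lands above that threshold.
    """
    threshold = 0.25 - 2 * math.log(p / (1 - p))
    frac_from_bots = TRUE_PREVALENCE * norm_sf(threshold - 0.5)
    frac_from_humans = (1 - TRUE_PREVALENCE) * norm_sf(threshold)
    new_p = frac_from_bots + frac_from_humans
    return min(max(new_p, 1e-12), 1 - 1e-12)  # keep log() well-defined

p = 0.5  # initial guess for the bot prevalence
for _ in range(25):
    p = classify_and_count(p)

print(round(p, 6))  # collapses toward 0 instead of recovering 0.3
```

The mechanism is a positive feedback loop: a lower prior pushes the decision threshold up, so fewer accounts are classified as bots, which lowers the recounted prevalence further on the next round. With well-separated distributions (e.g. 10,000 vs. 10 tweets a day) the threshold barely matters and the count is stable; with overlapping ones it runs away.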
Think of trying to estimate whether somebody is a man or a woman based on height. If somebody is 1.75 meters tall, it can be a male or a female; it's very hard to discriminate. These distributions overlap, and if they overlap, we simply cannot count males and females without knowing in advance what the two class distributions are. So there is no way to use this approach iteratively, unless we have a very clear identifier for one of the classes. For example, if each of the bots carried a label saying it's a bot, then we could discriminate the two classes.

Another question, maybe similar to the effect question we had in the beginning: are you not worried that your research will be abused by those employing rooms full of humans to do bot-like posting, to claim there are no problems at all? Using it as an excuse, saying: look at those researchers, they claim there are no bots at all, so what we do is fine?

Well, first of all, I'm a researcher, so I'm not interested in what people might make of the results I have. I'm not trying to produce results that might be useful to somebody, and I'm not afraid of producing results that might come in handy for parties in political discussions. Bot research very often involves highly polarized political discussions. Sometimes people like me because I tell the right-wing people they are not bots, they're real people; and in the next discussion I tell the left-wing people they're not bots, and they like me because I tell them they're not bots. The moment I start to think about which effects of my research some people might like, I would no longer be a scientist. But obviously we're not talking about people getting paid for tweeting.
We are only interested in the claims about social bots, that is, automated accounts interfering in political discussions using fake profiles. As for the paid trolls, I don't claim they don't exist. I know they exist, especially in Venezuela, for example, or in Mexico, or in Russia. There are very credible reports about paid trolls, even insider reports by people who have worked in these kinds of setups. So I'm certainly not claiming that there are no humans trying to interfere in political discussions.

Maybe a follow-up from another attendee: how can we incentivize this replication work within the scientific community, for the studies you've shown and tried to replicate, or at least got some information out of? Is there something the general audience can do, or something that researchers need to do, to put more focus on replication in this field, which seems to be flawed and in need of a lot of replication?

Well, I think the replication issue is a big issue in all scientific fields, even in computer science. A lot of research, for example new topologies designed for neural networks, works in certain setups but no longer works in slightly different setups. So replication is a big issue everywhere, but I think it's a very big issue in the social sciences. I'm a computer scientist; I come from a technical field, and we're used to very strict standards. Nobody would dare to publish something without being ready to publish the raw data as well. The idea of publishing results without being able to hand out the raw data is basically scientific misconduct in our field. In the social sciences, people obviously seem to be a little more relaxed about reality and about truthfulness, and I have a feeling there's a kind of cultural mismatch.
It turns out that a lot of studies in the social sciences are not replicable, and the problem seems to be extremely prevalent there.

Maybe a last question; a couple of people asked variations on the same question. In your opinion, is the lack of evidence for social bots sufficient to prove that social bots do not exist, or at least do not exist in any meaningful amounts? That we didn't find them doesn't mean they don't exist; we could even build one as a proof that they can exist. Do you think the lack of evidence is proof that they do not exist at all, or do not exist in large quantities?

Well, maybe you're familiar with Russell's teapot. He used the idea of a teapot flying around the sun in an orbit beyond Mars. The claim that such a teapot exists is hard to disprove: you can take a very close look at the orbit and still not be able to disprove it. And obviously, we have the technology to send a teapot into such an orbit. We can build rockets, we can build teapots, we can send them into an orbit beyond Mars, and it would still be very hard to disprove these claims. It's very much like social bots. Now, I believe it's very unlikely that somebody has already launched a rocket with a teapot into such an orbit, and I also believe it's pretty unlikely that people would try to build automated political social bots, simply based on my understanding of the technology. Building chatbots is extremely cumbersome, it's expensive, and the results are really, really bad. If you've ever interacted with a real chatbot, you know that those bots are very limited at the current state of the art. They can't even handle negation.
If you tell Siri, "Do not switch on the light," it will switch on the light. Even simple things like negation are still beyond what natural language processing can do at the current state of the art. So the idea of rapidly coming up with a chatbot-like system that can interfere in current political discussions, which usually only go on for a couple of days on Twitter, and then building or buying a new one for the next discussion, seems far beyond what's possible from a technological perspective. So I'm pretty sure these kinds of bots do not exist, but of course I cannot prove it.

All right, thanks. I think we got some more questions, but we cannot really fit them into this slot. Again, thanks so much, Florian. Keep up the discussion and keep up your work. After this talk, we will continue here on chaos zone TV with a talk from big alex about software engineering, I guess, and we will see you soon after a short break. Thank you.

Thank you.