session. It's my pleasure to introduce Professor Animesh Mukherjee from IIT Kharagpur. He is an expert in computational social science, and he will tell us about the analysis of hate speech on social media. Please, Animesh, are you there? Yes, I'm here. Okay, can you share your screen? Okay, thank you. You can go ahead. Am I visible? Is my screen visible? Yes, yes, we can see it. Okay, thank you very much, Mateo, for inviting me to this workshop, and a very good morning, good afternoon, and good evening to all of you, depending on your time zones. So today I'm going to talk a little bit about our research on the analysis of hateful content in social media. There is a small disclaimer that I wish to give at the beginning: this presentation might contain some offensive words; however, this cannot be avoided given the nature of the work that we are doing. I have tried to suitably mask them wherever possible. So let's begin. I think social media needs no introduction these days. These days, whenever I go to give a talk somewhere, I ask young people how many social media platforms they are members of, rather than asking whether they are members of a social media platform or not. It is defining our social dimensions, and I find that some young people have social media accounts on more than three platforms; it is interesting to understand how they manage so many accounts. Recently, social media has also become one of the primary sources of news consumption. Of course, although there are a lot of positive things that social media has brought with it, there is an equally large number of negative consequences, among them polarization, abuse, and hate speech, which is what I'm going to talk a little bit about today. Many of us are aware of certain quite unpleasant offline events that have happened in the last few years, like the Rohingya genocide, the Pittsburgh shooting, the Christchurch shooting, and the Bulandshahr violence. Many of these seem to have been triggered by certain online events: these real-life offline events seem to have an online trigger. And this trigger can come from different social media platforms. It could be Twitter. It could be WhatsApp; this is a WhatsApp message with offensive and hateful content directed toward the Muslim community in India. And it could also be some very new websites coming up, like Gab, which I'll talk about a little more in today's lecture. So our efforts in this area can be broadly divided into studying the spread of hate speech, studying the temporal dynamics of hate speech, detecting hate speech, misogyny detection, and how to handle or counter hate speech. In today's talk, I'll be mostly concentrating on the first and the last topics. This work is a joint collaboration with one of my colleagues here at IIT Kharagpur, Pawan Goyal, and three PhD students, Binny Mathew, Punyajoy Saha, and Mithun Das. So the first work that I'm going to talk about is on the spread of hate speech in online social media. This was published last year at the ACM Web Science conference. So I'll introduce you to an upcoming social media platform, which is called Gab. This platform promotes itself as a champion of free speech, and it has typically been criticized as an echo chamber for alt-right users. Most of its features are like Twitter's, but with much less moderation than Twitter.
So Gab says that it promotes free speech, but under that guise it seems to host a lot of hateful speech, which could have many ill repercussions. We curated a massive dataset of around 21 million posts and 343,000 users by crawling the Gab API. This crawl contains some basic information like the username, the posts made by the user, and the followers and followings of a user, and you will see what we do with this. The first task in order to study the spread of hate is to identify hateful users. To do that, we create a seed set of hateful users, those who have posted ten or more hateful posts. We then create a repost network of users, and from there we build a belief network. We initialize the seed set of hateful users with a score of one, and all other users with a score of zero. Then we run a very simple diffusion model, and the final belief scores tell us whether a particular user is hateful or non-hateful. If the belief score is between 0.75 and 1, we call the user hateful; if it is between 0 and 0.25, we call the user non-hateful. As you see, there is a large number of non-hateful users compared to a very small set of hateful users. I will now detail each of these steps in the next few slides for more clarity. How do we construct the seed set? We create a high-precision lexicon of 45 keywords. This lexicon contains various racial slurs, some of which are written here. These are high-precision because the presence of any one or more of these slurs on the Gab platform, which is a reasonably unmoderated platform, is almost surely indicative of hate content. So we label all those users who have ten or more posts containing one or more of these high-precision keywords as hateful users. From this, we construct a repost network. What is a repost network? Let us assume that we have a small set of users, A, B, and C, and consider the node C. The node C makes 10 posts of its own, denoted by the self-loop here, and it reposts 5 posts from user A, denoted by the directed edge. In this way, user A makes 17 posts of its own but does not repost anything from any of its neighbors, whereas B does not post anything of its own but only reposts 9 of A's posts. Now, from this repost network, we construct something called a belief network. Again, consider the user C. The user C, as you have seen, makes 10 posts of its own and reposts 5 posts from the user A. Therefore, the belief of user C in itself is defined as 10 divided by (10 + 5), the total number of messages posted by C, which is 0.67, whereas the belief of C in A is 5 divided by (10 + 5), which is 0.33. So you see that from the repost network we construct a belief network with edges in the opposite direction. This is how you construct a belief network from a repost network. On this, we run a very simple DeGroot model. Let us say that through our hate lexicon we have annotated user A as a hateful user, because it has ten or more hateful posts containing one or more of those hateful keywords. Once you run this diffusion algorithm for some time, it converges, and each of these nodes gets a score.
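To make the pipeline concrete, here is a minimal sketch of how the belief network and the DeGroot-style diffusion described above might be implemented. This is not the authors' code: the toy repost counts mirror the A, B, C example, and the variable names and the iteration count are illustrative assumptions.

```python
# reposts[u][v] = number of v's posts that u reposted;
# reposts[u][u] = number of original posts by u (the self-loop).
reposts = {
    "A": {"A": 17},          # A only posts its own content
    "B": {"A": 9},           # B only reposts 9 of A's posts
    "C": {"C": 10, "A": 5},  # C posts 10 of its own, reposts 5 from A
}

# Belief weights: belief[u][v] = share of u's timeline that came from v,
# e.g. belief["C"]["C"] = 10/(10+5) = 0.67 and belief["C"]["A"] = 5/15 = 0.33.
belief = {}
for u, counts in reposts.items():
    total = sum(counts.values())
    belief[u] = {v: k / total for v, k in counts.items()}

# DeGroot-style diffusion: seed users (ten or more posts with high-precision
# hate keywords) start at 1, everyone else at 0; in each round a user's score
# becomes the belief-weighted average of its neighbours' scores.
seeds = {"A"}
score = {u: 1.0 if u in seeds else 0.0 for u in reposts}
for _ in range(100):  # a fixed number of rounds, assumed enough to converge
    score = {u: sum(w * score[v] for v, w in belief[u].items())
             for u in reposts}

# Threshold the converged scores into user classes.
for u, s in sorted(score.items()):
    label = "hateful" if s >= 0.75 else "non-hateful" if s <= 0.25 else "unlabelled"
    print(u, round(s, 2), label)
```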
Now, this score will be something between 0 and 1. If the score of a node is between 0.75 and 1, we call that node a hateful user, and if the score is between 0.0 and 0.25, we call the node non-hateful. In between there is a big gap, which is like a confusion zone, and we do not wish to comment on that; we only take the cases where we are almost sure that the users are either hateful or non-hateful. However, once we do this, we also do some additional... Hello, Animesh? We cannot hear you anymore. Okay, let me go back to the previous slide. So once we have the score for every node, we denote the nodes with scores between 0.75 and 1 as hateful users (KH, for known hateful) and those with scores between 0.0 and 0.25 as non-hateful users (NH). Now we do some additional checks to understand whether our labels are actually correct. We asked two annotators to annotate 100 such users, 50 from the hateful group and 50 from the non-hateful group, and we asked them whether the users that we annotated as hateful are indeed hateful and the users that we annotated as non-hateful are indeed non-hateful. We observed that the annotators have high agreement with our model predictions, and the inter-annotator agreement is quite acceptable, with Kappa values somewhere between 0.7 and 0.87. A few more checks: Gab provides topics under which you can post your messages, so we look at the typical topics under which the hateful users post their messages and the topics under which the non-hateful users post theirs. You clearly see that the hateful users post their messages under certain topics that are indicative of strong or intense hateful content. Similarly, to gather more evidence, we look at the different URLs that the hateful users typically share in their posts, and we see that many of the URLs shared by the hateful users point to content that is extreme-right in nature, which is not true for the non-hateful users. With all this evidence, we know that the labels obtained from the DeGroot model can be considered reliable annotations of hateful and non-hateful users.
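As a small illustration of the agreement check mentioned above, the sketch below computes Cohen's Kappa between two annotators and against the model's labels. The label vectors here are invented placeholders, not the study's data; sklearn's cohen_kappa_score is one assumed way to compute the statistic.

```python
from sklearn.metrics import cohen_kappa_score

# The model's labels for the 100 sampled users (50 per class).
model = ["hateful"] * 50 + ["non-hateful"] * 50

# Hypothetical human judgements: each annotator disagrees on a few users.
flip = lambda lab: "non-hateful" if lab == "hateful" else "hateful"
ann1 = [flip(l) if i in {3, 61, 92} else l for i, l in enumerate(model)]
ann2 = [flip(l) if i in {3, 15, 61, 77} else l for i, l in enumerate(model)]

print("annotator 1 vs annotator 2:", cohen_kappa_score(ann1, ann2))
print("annotator 1 vs model:      ", cohen_kappa_score(ann1, model))
print("annotator 2 vs model:      ", cohen_kappa_score(ann2, model))
```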
So once we have done this, we try to study cascades, that is, influence paths. A cascade is the path traced by a post as it is reposted by different users. Now, it is usually difficult to trace the exact path of influence, so we use a heuristic called the least recent influencer model to construct a DAG that traces the path. I'll again explain this with an example. Let us say we have the same graph that we saw last time, with the users A, B, C, D, and E, but now the edges that you see are followership edges; if you remember, our dataset also has the followership information. So this graph represents that C follows A, C follows B, B follows A, and so on, and the number on a node indicates the time at which that user posted the message. Let us say A posted the message at time 0, B at time 100, C at time 300, D at time 500, and E at time 400. Now the question is: whom do you assume C's influencer to be? Has C seen the message from A, or has C seen it from B? A posted it at time 0 and B posted it at time 100, and C, since it follows both A and B, might have observed the message from either A or B. In order to resolve this ambiguity, we use the least recent influencer model. Least recent means the earliest poster, so we assume that A is the person from whom C received the message, and we consider A to be the influencer of C. You could also use a most recent influencer model, where you would consider B to be the person from whom C received the message. We have tried both cases, and the results do not differ much. Once we resolve this ambiguity, from this graph, which can have cycles, we obtain a directed acyclic graph, and from the DAG we know the exact influence paths. We then construct the influence paths for all the users marked hateful by the DeGroot model and for all the users marked non-hateful. We study some very standard cascade properties: size, which is the number of unique users; depth, which is the length of the longest path in the cascade; average depth, which is the average distance from the root node; breadth, which is how broad the cascade is at any level; and structural virality, which is the average of the pairwise distances between nodes. We measure these properties on the original posts of the KH and NH users, on posts containing media, that is, visual or audio content, so multimodal content, and on posts belonging to different topics. What we observe is that for all the different metrics, size, depth, breadth, average depth, and structural virality, the cascades of the KH users have larger values than those of the NH users; that is, the cascades of the hateful users are typically much larger than those of the non-hateful users, and all these results are significant. Significance tests are done using the Kolmogorov-Smirnov test, and the observations, point-wise, are as follows: the posts of hateful users, as you have seen from the previous slide, reach a larger audience, spread wider and deeper into the network, and are typically more viral. This difference gets more pronounced when you look at multimodal messages: the posts in the previous slide were only text messages, and when you include the multimodal messages, the differences become more pronounced, and so is the case when you study the KH and NH cascades across the different topics under which they are built. So we have seen that hate content spreads quite fast and quite deep into the network, and this could have many strong, intense, and adverse consequences. The question is: can one think of designing platforms that could slow the spread of such hateful messages? Is there a way, a mechanism, to do that? Some of the obvious things one can think of are to block or suspend the hateful message, or the account of the user itself. Indeed, several governments have established severe hate speech laws in order to prevent the spread.
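Below is a hedged sketch, again not the paper's implementation, of the least recent influencer heuristic and the cascade metrics just described. The follower edges and timestamps mirror the toy example above, and networkx is an assumed convenience.

```python
import networkx as nx

# follows[u] = users that u follows; post_time[u] = when u (re)posted.
follows = {"B": ["A"], "C": ["A", "B"], "D": ["B", "C"], "E": ["C"]}
post_time = {"A": 0, "B": 100, "C": 300, "D": 500, "E": 400}

# Least recent influencer: among the followees who posted *before* u,
# credit the earliest poster (swap min() for max() to get the
# most recent influencer variant).
dag = nx.DiGraph()
dag.add_nodes_from(post_time)
for u, followees in follows.items():
    earlier = [v for v in followees if post_time[v] < post_time[u]]
    if earlier:
        dag.add_edge(min(earlier, key=post_time.get), u)

# Standard cascade metrics on the resulting DAG.
root = "A"
depths = nx.shortest_path_length(dag, root)          # hops from the root
size = dag.number_of_nodes()                         # unique users
depth = max(depths.values())                         # longest path length
avg_depth = sum(depths.values()) / size
breadth = max(list(depths.values()).count(d) for d in set(depths.values()))

# Structural virality: mean pairwise distance in the undirected cascade.
und = dag.to_undirected()
dist = dict(nx.all_pairs_shortest_path_length(und))
sv = sum(dist[u][v] for u in und for v in und if u != v) / (size * (size - 1))

print(size, depth, avg_depth, breadth, round(sv, 2))
```

The Kolmogorov-Smirnov significance check mentioned above could then be run on the KH and NH metric distributions with something like scipy.stats.ks_2samp.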
Many social media websites, including Facebook and Twitter, have come up with strict actions against hate speech. But the problem with all of these very harsh and extreme steps could be that, people might argue, they curb the freedom of speech; they could violate the basic premise of free speech. The question, therefore, is: what could be an alternative solution? One of the most interesting alternative solutions is probably countering hateful messages; that is, against more hate speech you have more speech, to actually counter the hate speech. Counter speech is being used by various NGOs, and Facebook has declared that counter speech is going to be one of the most effective ways in the future to stop the spread of hate speech. So I'll quickly cover the next work. Here we study the effects of counter speech, but now considering another social media platform, YouTube, and specifically the comments on YouTube videos. For us, the definition of counter speech is as follows: it is a direct response or comment that counters the hateful or harmful speech. Taking YouTube videos, we study counter speech for three different target communities: Jews, African-Americans, and the LGBT community. We prepare a large dataset and do some first-level analysis; the dataset is available for further research. We scrape the comments from 31 hateful videos available on YouTube; some examples are cited on this slide. Now, there can be various different types of counter speech: presenting facts to correct misrepresentations or misperceptions, pointing out hypocrisy or contradictions, warning of online or offline consequences, showing affiliation, denouncing hateful or dangerous speech, humor and sarcasm, positive tone, or hostile language. We collect the data using the following two-stage procedure. We scrape the YouTube comments for the 31 videos, and in the first stage we annotate these comments as counter speech or not. Roughly, we have a total of 14,000 comments, of which around 7,000 are counter speech and 7,000 are non-counter speech, and the inter-annotator agreement is around 0.8. In the second stage, we ask the annotators to further classify the counter speech into the different types I listed earlier, like presenting facts, pointing out hypocrisy, and so on. Here the inter-annotator agreement is 0.87. After this exercise, we observed the following distribution. One of the immediate observations is that most of the hate speech in these YouTube comments is countered by more hate speech: there is a lot of hostile language being used to counter the hate speech. Of course, this is another problem, another challenge, that hate speech in itself generates more hate speech as a counter measure, and we are presently looking into ways of stopping this. But the good part is the other things we see in the table. For instance, for the Jewish community, we see that messages with a positive tone work very well in countering the hate speech against that community. For the Black community, warning of online or offline consequences is the best measure to counter the hate speech. And for the LGBT community, pointing out hypocrisy or contradictions, or expressing humour as a counter measure, seem to be the best ways of countering hate speech. These are all apart from the hostile language.
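As a small, assumed-format illustration of the second-stage analysis, the sketch below tabulates counterspeech types per target community. The rows are invented placeholders, not the released dataset, and pandas is an assumed convenience.

```python
import pandas as pd

# (target community, annotated counterspeech type) for a few toy comments.
rows = [
    ("Jews", "positive tone"), ("Jews", "hostile language"),
    ("Blacks", "warning of consequences"), ("Blacks", "hostile language"),
    ("LGBT", "humour"), ("LGBT", "pointing out hypocrisy"),
]
df = pd.DataFrame(rows, columns=["community", "cs_type"])

# Share of each counterspeech type within each community.
dist = pd.crosstab(df["community"], df["cs_type"], normalize="index")
print(dist.round(2))
```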
We also study some very specific YouTube metrics, like the number of likes gathered by different comments. What we observe is that for the African-American community, as I have already said, warning of consequences and denouncing hateful messages are the best ways of countering hateful content; some examples are cited below. Similarly, for the Jewish community, showing affiliation works well: saying, okay, I am a Muslim and I stand beside the Jews. This is how you show your affiliation to one particular religion and subscribe to the views of another; this is what is called affiliation. We see that for the Jewish community such affiliation-based counter speech works very well to fight the hate speech. For the LGBT community, as I already said, pointing out contradictions as well as humour are the best counter speech to fight the hate speech; some examples are cited here. We also did a classification exercise with very standard off-the-shelf machine learning, where, given some training data of counter speech versus non-counter speech, we use a very standard bag-of-words model to predict whether a message is counter speech or not. We also tried to do the second-level classification: if something is counter speech, can we classify it into one of the different types? The question could be, why do we need such a classification? The basic need would be if you want to automatically generate counter speech. As you understand, it is a difficult exercise to have people or NGO workers actually try and generate counter speech to stop the spread of hateful messages; it would be very helpful if computers could automatically, computationally, generate counter speech. That is the aim, but the first stage towards it is to have an automatic classification system like the one we have developed. So with this I would like to end my talk. I think I am a little bit over my time, I'm sorry for that. All our work can be found on this website, and thank you very much for listening to my talk. Thanks a lot.
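A hedged sketch of the off-the-shelf, bag-of-words classification described above: a two-stage TF-IDF plus logistic regression setup. The tiny training set, the label names, and the exact model choice are illustrative assumptions, not the paper's reported configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Stage 1: counter speech vs. non-counter speech.
texts = [
    "these claims are factually wrong, here is a source",
    "you will get banned for posting this",
    "this is hilarious coming from you, pure hypocrisy",
    "totally agree with the video",
]
labels = ["counter", "counter", "counter", "non-counter"]

stage1 = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                       LogisticRegression(max_iter=1000))
stage1.fit(texts, labels)

# Stage 2: among counter speech only, predict the counterspeech type
# (presenting facts, warning of consequences, hypocrisy, humour, ...).
cs_texts = [t for t, l in zip(texts, labels) if l == "counter"]
cs_types = ["presenting facts", "warning of consequences",
            "pointing out hypocrisy"]
stage2 = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                       LogisticRegression(max_iter=1000))
stage2.fit(cs_texts, cs_types)

comment = ["that statistic is simply false, read the report"]
if stage1.predict(comment)[0] == "counter":
    print(stage2.predict(comment)[0])
```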
Yes, thank you very much, Animesh, that was really very interesting. We have a question in the Q&A: could you give some idea about the typical cluster size of the sentiment-driven avalanches that you mentioned in the beginning? You can read the question yourself in the Q&A box. I'm trying to read the questions. The typical cluster size of the sentiment-driven avalanches mentioned in the beginning... I'm not sure we have done a sentiment analysis as such, and I'm not sure what is actually meant by sentiment-driven avalanches; we have not identified any clusters as such. Okay, there is one more question: how do you measure the slowing down of a hate speech cascade? Okay, this is a very nice question. So, in a more recent study, we have tried to study these cascades over time. Whatever I showed you here is a single-snapshot analysis of the cascades. Now, if you keep measuring these cascades over time, the different metrics that I talked about, like the size, the average depth, the depth, and so on, will change, and if over time their values seem to decline, then you have a way of knowing that the spread of hate speech is slowing down. Does that answer your question? Yes, I think so. I had another related question: how do you measure the effectiveness of counter speech? In other words, if the counter speech messages were not posted, would the hateful posts have gone on for longer? I mean, do you have a measure of this? That's again a very good question. We did not measure it; I'll come to the measurement part a little later, but we looked into our data and saw a few instances where the original hateful speaker actually agreed with the counter speaker and expressed apologies. So this has happened. There is no direct way unless you do an argument analysis: you would have to somehow do some sort of argument mining and identify whether the chain of replies has stopped, or whether the chain of replies no longer includes hateful messages. Any further? Okay, thank you very much. I think we are running a little bit late, so I'd like to introduce