I'm going to talk about the ethical framework for a web immunization score on Twitter. We defined web immunization as individual or group susceptibility to misinformation on social media. The machine learning element is only one part of our project, and I will try to focus on this part and analyze its ethical dimensions.

But at the beginning, of course, I have to make a disclaimer and disclosure. I have a conflict of interest. It is not a financial conflict of interest; it is due to my double role in this project. On the one hand, I am the project leader, so my goal is to steer our research project to a fruitful end, and I would like to avoid all possible problems. On the other hand, I am a bioethicist interested in research ethics, with a background in philosophy, and I would like to analyze all the important ethical problems posed by our research project. That is why we invited you to take part in this seminar, and why we also invited some external experts: to avoid our blind spots and really take advantage of an outside perspective.

I'm going to analyze our project using an ethical framework elaborated by the Norwegian National Ethics Committee and presented in its guide to internet research ethics. This framework consists of four dimensions that should be covered by every ethical analysis: first, the accessibility of the public sphere; second, interaction with participants; third, the sensitivity of the information collected during the research project; and fourth, the vulnerability of participants.

So let's get started with the first dimension, the accessibility of the public sphere. As Elizabeth already mentioned, Twitter is usually considered to be a public sphere. The goal of our project is to build machine learning models that will predict individual and group web immunization scores.
That is, individual and group susceptibility to misinformation, based on activity on social media. In our project we chose to collect data from one social medium, Twitter. We will use the Twitter API to collect a massive amount of data and estimate individual and group susceptibility to misinformation.

The first question that can be asked is: do we, as researchers, have a right to collect identifiable data, and how does this supposed right relate to users' expectations? Data on Twitter is very difficult to de-identify or anonymize, because every single tweet is connected to a whole conversation. It is not only the content of a tweet that matters, but all the metadata associated with and linked to it. To really understand a tweet, we have to connect it to other tweets, to the whole conversation: is it a stand-alone tweet, a reply to someone, a retweet, a retweet with a quote? Who sent it? Mapping a single tweet into this whole context makes it almost impossible to de-identify, because if we de-identify a tweet, it loses its research potential; it becomes useless from a researcher's perspective.

Generally, as Elizabeth mentioned, from the regulatory point of view we are allowed to do this. Both the terms of service and the developer agreement of Twitter allow us to collect the data, and Twitter's terms of service say very explicitly that Twitter disseminates the content that users post. However, the developer agreement puts some restrictions on how this data may be used; I will discuss this a little later. From the formal regulatory point of view, both from the European perspective of the GDPR and from the US perspective of the Common Rule, this data is considered to be more or less public and available to researchers.
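To make the de-identification problem concrete, here is a minimal sketch. The field names follow the Twitter API v2 tweet object (`conversation_id`, `referenced_tweets`, `author_id`), but the sample data and the helper function are invented for illustration.

```python
# Minimal sketch of why a single tweet resists de-identification:
# each record only makes sense when linked to its whole conversation.
# Field names follow the Twitter API v2 tweet object; the data is invented.

def conversation_context(tweets, tweet_id):
    """Return every tweet that shares a conversation with `tweet_id`."""
    by_id = {t["id"]: t for t in tweets}
    conv = by_id[tweet_id]["conversation_id"]
    return [t for t in tweets if t["conversation_id"] == conv]

sample = [
    {"id": "1", "conversation_id": "1", "author_id": "u1",
     "referenced_tweets": []},                                   # root tweet
    {"id": "2", "conversation_id": "1", "author_id": "u2",
     "referenced_tweets": [{"type": "replied_to", "id": "1"}]},  # reply
    {"id": "3", "conversation_id": "1", "author_id": "u3",
     "referenced_tweets": [{"type": "quoted", "id": "2"}]},      # quote tweet
]

# Stripping the author from tweet "3" alone is not enough: the reply chain
# still exposes whom it answers, so the context can re-identify participants.
thread = conversation_context(sample, "3")
print(sorted(t["author_id"] for t in thread))
```

The same linkage that makes the data analytically useful (reply chains, quotes, authorship metadata) is exactly what defeats anonymization.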
For instance, Article 9 of the GDPR does not prohibit processing data which are manifestly made public, and one can reasonably argue that data on Twitter are manifestly made public. However, as Nicolas Gault observes, Twitter imposes certain restrictions on researchers and others who want to use its data. For instance, Twitter forbids reusing deleted content, and that is why Gault says we should not consider data on Twitter to be public data, but rather private data on public display.

We can also see that the majority of Twitter users are not even aware that researchers use Twitter for research. A quite substantial percentage of Twitter users would not opt out of research if that were possible; they would still participate. But about one third of Twitter users, if it were possible, would opt out and would not provide their data to researchers.

So we are in a situation where we have to balance the discourse of data ownership, which is mostly present in the US context, against the European context, where we rather think about data as an inalienable individual possession that can be controlled by individuals and the political community. And we have to balance these two aspects with public benefit. On the one hand, this is private data on public display, so we should respect this element of individual control; on the other hand, it has to be weighed against possible public benefit.

When we turn to another aspect of the framework, interaction with participants: at this stage of our project we are not going to interact with participants. Sensitivity of information, however, is very important in this context, and especially important is the concept of group privacy. Usually we think about privacy in terms of individual or group rights.
We tend to think of well-defined and self-proclaimed groups, such as families, ethnic minorities, or groups of patients diagnosed with a specific condition, as having certain rights which they can claim, seeking justice before a court. To give an example of such a group claim, we can think of the Havasupai tribe, a case that is quite often discussed in the bioethical context. Researchers violated the privacy of this ethnic group: they used its members' blood samples without community consent to assess the risk of mental disorders such as schizophrenia, and of alcoholism, and they also used the genetic material to study the tribe's history and genetic origin, thereby undermining the tribe's beliefs about its own identity. The tribe recognized this as a violation of their group privacy and sued the university.

But when we think about group privacy in the context of machine learning, we cannot use this concept of a well-defined group with legal representation. The groups formed in the process of machine learning arise when we discover certain characteristics and can say that an individual belongs to a group; one individual, by the way, can belong to many different groups. These individuals are not even aware of this fact. They don't know that they belong to one or many such groups, they don't have any kind of representation, and yet they can still be subject to algorithmic intervention. Because they have no knowledge of this intervention, they cannot seek redress before the courts; they are not recognized by the legal system.

One more thing has to be mentioned: the concept of group privacy is very closely related to profiling, and the Twitter developer agreement explicitly prohibits profiling.
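A toy sketch of what such algorithmically formed groups look like. The group names, features, and thresholds below are invented for illustration and are not our actual model; the point is only that membership is derived from behavior, can overlap, and is invisible to the users themselves.

```python
# Invented illustration of algorithmically formed groups: no user ever
# sees these labels, yet each could become a target of intervention.

def latent_groups(user):
    """Assign a user to zero or more algorithmically defined groups."""
    groups = set()
    if user["shares_per_day"] > 20:
        groups.add("high-volume sharers")
    if user["low_credibility_share"] > 0.3:
        groups.add("misinformation-susceptible")
    if user["night_activity"] > 0.5:
        groups.add("night-active users")
    return groups

user = {"shares_per_day": 35, "low_credibility_share": 0.4,
        "night_activity": 0.1}
# One individual, several simultaneous group memberships, no awareness.
print(sorted(latent_groups(user)))
```

Unlike the Havasupai tribe, the members of these groups share no self-identity and no representative who could sue on their behalf.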
The Twitter developer agreement says that targeting, segmenting, or profiling individuals based on sensitive personal information, such as health or a negative financial situation, is not permitted for those who use the Twitter API. So right now we are facing three ethical questions. First, how should we treat the Twitter developer agreement: is it legally or ethically binding for us? Second, does our research project really meet the definition of profiling? And third, how are we going to protect group privacy?

On the first question, how important is the developer agreement from an ethical and a legal perspective? From the legal perspective, there is already some case law indicating that this kind of agreement is recognized, at least by the American justice system. However, I think there are quite strong ethical arguments that in certain circumstances this agreement could be violated or overridden. Companies such as Twitter, Google, and Facebook have a very strong influence on our politics, not only on elections but also on discourse and political discussion, and in my opinion a democratic society has to have some instruments to oversee their actions. Researchers are probably best situated to put a check on these companies and examine their activities. Of course, this kind of research project should be carefully overseen and reviewed by external ethics committees and should have legal support from research institutions. And right now I want to stress and emphasize that we are not going to violate the Twitter agreement; this is not what we think we are doing in our research project.

So what is the definition of profiling?
Profiling is a technique to automatically process personal and non-personal data, aimed at developing predictive knowledge that is subsequently applied as a basis for decision making. I think we have to draw attention to these two elements: on the one hand predictive knowledge, on the other hand decision making. Our project's aim is to create predictive knowledge, but we are not going to make any intervention concerning the individuals who provide us with the data. We are very aware that this data could be used in that way; however, we will not perform any kind of intervention at this stage of the project, and every future intervention would be coupled with an informed consent process.

What about the vulnerability of participants, the last dimension of our analysis? Vulnerability, in the context of biomedical research, usually refers to the moment when we involve participants in a research project. People who do not have sufficient cognitive capability, or who lack legal capacity, are usually unable to make informed decisions about themselves, and they are recognized as vulnerable in the context of biomedical research. I think this concept of vulnerability doesn't really apply to our research project, also because we are not going to obtain informed consent from our participants, the Twitter users. However, I think that some of our participants are vulnerable in a different sense, because susceptibility to misinformation can itself be understood in terms of vulnerability. People who have a diminished ability to make autonomous decisions in the information environment of social media are vulnerable in these circumstances; those who cannot recognize misinformation, and who spread it, are vulnerable in the information environment.

So what about protection?
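To illustrate the distinction between the two elements of profiling, here is a minimal sketch with invented features and weights. The model stops at a predictive score; the decision-making step that would turn prediction into profiling in the regulatory sense is deliberately absent.

```python
import math

# Invented illustration: profiling = prediction + a decision applied to
# the person. The sketch below produces only the first element.

WEIGHTS = {"low_credibility_share": 2.0, "fact_check_engagement": -1.5}

def immunization_score(features):
    """Predictive knowledge only: a susceptibility estimate in (0, 1)."""
    z = sum(WEIGHTS[k] * features[k] for k in WEIGHTS)
    return 1 / (1 + math.exp(-z))  # logistic squashing

score = immunization_score({"low_credibility_share": 0.8,
                            "fact_check_engagement": 0.1})

# Profiling in the regulatory sense would add a decision step such as:
#   if score > 0.7: restrict_account(user)   # we do NOT do this
print(round(score, 3))
```

Any future step that applied such a score to an individual would, on our view, require an informed consent process.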
As I said, informed consent in our project would be very impractical, and from the regulatory perspective it is not required. However, Neal Dickert and his colleagues, in a very interesting article published in the American Journal of Bioethics, recognize that informed consent is a procedure with many different functions. They actually distinguish seven different functions of informed consent, and these functions can also be realized by other procedures or actions. For instance, one function of informed consent is to make the research process transparent. By being present on social media and disseminating information about our research project, we are going to try to fulfill this function: we would like to inform the Twittersphere that we are conducting this kind of research and what it means for Twitter users.

Another form of protection we are thinking about is limited data sharing, especially for the data set that contains data from Twitter and for the model that will allow us to predict the web immunization scores of individuals and groups, because we don't want this model to be used by bad actors, for instance, to target susceptible individuals and groups with misinformation. So we are thinking about some kind of data access committee that will limit and vet data requests. We will also not share the model itself, but only a surrogate model, which allows others to validate our research but does not allow them, for instance, to replicate it and target vulnerable individuals.

Generally, although our research project involves a lot of ethically sensitive issues, I think that one of the ethical laws of information ethics formulated by Floridi, which says that entropy ought not to be caused in the infosphere, is a justification for our project.
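A minimal sketch of the surrogate-model idea, with an invented "true" model: only a coarsened lookup table derived from the private model's outputs would be shared, which is enough to validate overall trends but not to reconstruct individual-level scoring.

```python
# Invented illustration of a shareable surrogate. The private model stays
# internal; what is released is only a coarse table of its predictions.

def true_model(x):
    """Private model: fine-grained susceptibility score (kept internal)."""
    return 0.6 * x + 0.2

def fit_surrogate(xs):
    """Build a shareable surrogate: a coarsened table of predictions."""
    table = {round(x, 1): round(true_model(x), 1) for x in xs}
    def surrogate(x):
        # Look up the nearest tabulated input; no access to true_model.
        key = min(table, key=lambda k: abs(k - x))
        return table[key]
    return surrogate

xs = [i / 10 for i in range(11)]
surrogate = fit_surrogate(xs)
# Coarse enough to validate the trend, too coarse to target individuals.
print(surrogate(0.0) < surrogate(1.0))
```

In practice the vetting would of course be institutional (the data access committee), with the surrogate as a technical complement.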
We want to limit misinformation, and this is also how we serve the public interest in doing this research. We are not doing it just for fun, or out of pure curiosity. Okay, thank you very much.