Okay, I will start now. Welcome again to the second webinar on public interest AI. I'm Theresa Züger, lead of the AI & Society Lab, which has been part of the Humboldt Institute for Internet and Society since 2020, so it has now been running for about three years. What I would like to do today in this short webinar is give you an impression of who we are, what we do, and why we do it. I will run you through the process of our research, tell you what ideas we started with, what obstacles and questions we encountered, and how we decided to deal with them. That means I will touch upon the definition of public interest and public interest AI, speak a little about data on the issue, and also about the ecosystem around public interest AI. I assume you are all in some way part of this ecosystem, and that this is maybe why you're here. I will also speak a little about our future goals. Since this webinar is only a short hour, we won't have the chance to get to know each other and introduce ourselves, so if you would like more conversation or contact, please reach out to me afterwards. I will try to speak for about half an hour, maybe a little more, and then we can get into a conversation, and you're invited to tell us about your research and your questions if you want to speak. Other than that, definitely reach out if you see any good chances for collaboration or a connection.

So, who we are and what we do. This is the little research group. We are funded by the BMBF, the Federal Ministry of Education and Research in Germany, and we are a very interdisciplinary group. I have a background in media studies and political theory, while our postdoc, Hadi Asghari, has a background in computer science and public policy; he also researched cybersecurity for a long time. And we have three PhD researchers in the project.
Two of them are Sami Nenno and Freya Hewett: Sami is a data scientist and philosopher, Freya is a computational linguist, and they're both doing projects related to natural language processing, which I will introduce in a second. The third PhD researcher is doing a different type of project: with a background in design and communication studies, they research design patterns for pluralistic data governance. So the question is: if many stakeholders have a share in data, how can you find solutions and patterns for governing those data sets well?

Overall, we started with the goal of setting a theoretical frame for what public interest AI is. Coming from a background in political theory, I wanted to ground this whole debate in the theories of public interest that have long existed in the philosophical debate. At the same time, we wanted to build a very practical bridge to what that means for building AI. So we wanted to look at the process of developing and governing AI systems, and that's why we also develop prototypes ourselves, learn while doing it what problems you may encounter, and try to include that in our research and our learning.

These are a few of the milestones we have achieved so far. One focus, coming from Freya's PhD, is simple language and the translation from standard German into simple-language versions of German, which is an NLP project. One paper we wrote together under Hadi's lead looked at the landscape of simple language on the web. We built a classifier and examined pretty much the whole web, as far as you can, to understand where simple language occurs. There is an EU directive that public-service websites in particular need to have a simple-language version, and we wanted to understand how they actually realize that, whether it is used by people, and whether it is helpful.
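To make the idea of such a classifier concrete: one could, in a very reduced sketch, score pages by surface readability features such as sentence and word length. This is an illustrative heuristic with invented thresholds, not the actual model from the paper.

```python
import re

def readability_features(text: str) -> dict:
    """Compute simple surface features often used to detect simple language."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-zÄÖÜäöüß]+", text)
    if not sentences or not words:
        return {"avg_sentence_len": 0.0, "avg_word_len": 0.0}
    return {
        "avg_sentence_len": len(words) / len(sentences),
        "avg_word_len": sum(len(w) for w in words) / len(words),
    }

def looks_like_simple_language(text: str,
                               max_sentence_len: float = 9.0,
                               max_word_len: float = 6.0) -> bool:
    """Heuristic: simple-language texts use short sentences and short words.

    The threshold values here are made up for illustration.
    """
    f = readability_features(text)
    return (f["avg_sentence_len"] <= max_sentence_len
            and f["avg_word_len"] <= max_word_len)

simple = "Das ist ein Haus. Es ist gross. Wir wohnen hier."
standard = ("Die Umsetzung der Barrierefreiheitsanforderungen erfolgt "
            "gemaess der einschlaegigen europaeischen Richtlinien.")
```

A real system would use a trained model over many more features, but the contrast between the two example texts shows why even surface features carry signal.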
We wrote a paper on that and were able to present it at the ACM Web Science conference this year. From that project we also developed a prototype, and this is very important for us: that whatever research we do, something comes out of it that is useful for citizens and tackles a real-life problem, so that it is not only research but also builds something genuinely helpful. We built the Simba text assistant, a browser plugin. In the version you can download right now from the Firefox store, you get a simplification of the text on any given website, in the form of highlights, and you also get an extractive summarization of the text. But it doesn't work as well as we wished it would, so we decided to take a slightly different path and build something for the group of language learners with the Simba text assistant. We will change what the tool does so that it produces a simpler version of a standard German text and explains complicated words. This is meant to help people who are not at the beginning but in the middle of the language-learning process and who benefit from a simpler text version.

This project is particularly interesting because when we started, we had this idea of producing something that really outputs simple text, or simple German. But there is a lot of competition in this field. We reached out to many competitors; we actually wanted to collaborate, but it was interesting to see that when you come with a public interest project that could also have a business case, people don't necessarily want to collaborate. That is also a learning: depending on what you would like to produce, others might intend to build a commercial product.

The last project aims to assist fact-checkers. It's also a natural language processing project.
Sami Nenno is doing this project, and it's about claim detection. We're trying to automate the step of understanding which claims are check-worthy for fact-checkers. He's collaborating with different teams of fact-checkers to produce something that is really helpful for them and very applied. There was an interesting study which found that fact-checkers would want a tool that helps them determine which claims are check-worthy in the first place, because right now they're pretty much scrolling through Telegram groups to find claims they might want to check. With the masses of information out there, they have a hard time going through everything that might be check-worthy.

Coming from the who-we-are-and-what-we-do part to what we understand the public interest to be: this is of course something we started with. We wrote a paper, published in the AI & Society journal, about this idea that we want to go back to public interest theories. I think this is a very important move, because the concept of the public interest has been around for hundreds of years. It is well connected to, and tested in, democracies and in the legal system: there have been many lawsuits and other legal processes, even constitutions, where legal philosophy has dealt with the public interest and what it is, and societies have actually negotiated, case by case, what might be in the public interest in a specific situation. Digging into that research, we found the book on the public interest by Barry Bozeman very helpful. He goes back to the theories of Dewey, who had a very pragmatic approach to democracy. Bozeman explains that for him, a working definition of the public interest is those outcomes best serving the long-run survival and well-being of a social collective construed as a public.
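To make the claim-detection step described above concrete: a check-worthiness filter could, in a toy sketch, flag sentences that contain cues of verifiable factual claims. The cue patterns below are invented for illustration; real claim detection, including Sami's project, uses trained classifiers rather than rule lists.

```python
import re

# Cues suggesting a sentence makes a verifiable factual claim
# (illustrative only; not taken from any real fact-checking tool).
CLAIM_CUES = [
    r"\d",                                  # numbers and statistics
    r"\bpercent\b|%",                       # percentages
    r"\b(increase|decrease|rose|fell)\b",   # quantitative change verbs
    r"\b(according to|study|report)\b",     # attribution to sources
]

def is_check_worthy(sentence: str) -> bool:
    """Flag sentences that contain at least one factual-claim cue."""
    s = sentence.lower()
    return any(re.search(pattern, s) for pattern in CLAIM_CUES)

def filter_claims(sentences: list[str]) -> list[str]:
    """Keep only the sentences a fact-checker might want to look at."""
    return [s for s in sentences if is_check_worthy(s)]
```

Even this crude filter illustrates the value of the step: opinions and small talk drop out, and only sentences asserting something checkable remain for human review.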
Bozeman made it very clear, going back to Dewey, that for him this is an idealistic and at the same time pragmatic and procedural understanding of the public interest: there is a shared ideal, even though it is not very substantive, but the main part of the public interest is really construed through a process of deliberation among people. There needs to be this process of dispute and communication about shared interests, and only this negotiation defines what the public interest in a specific case is. So it is never universal; it always needs to be negotiated among citizens. And it never comes from people's private interests: only when they step forward as citizens, speaking about their shared interests in a democracy, do they start talking about the public interest. The value of equality in particular is something we learned about from the theories of Feintuck, who looked at the legal theories of the public interest and at legal cases. He discovered that equality is the baseline, the core value, central to public interest theory. We included that in our thinking about what public interest AI might mean, and I will speak about that in a moment.

Coming from that theory, we were then asked to do a more empirical study and help a network built on the initiative of three German ministries, which got together to create the Civic Coding network. It has existed since the beginning of this year as an office in Germany that is supposed to support public interest AI projects. They wanted us to do a background research report asking: what are the actual needs, and what actually is public interest AI? What do these projects encounter? What problems and challenges do they have? What makes them successful? And what risks and potentials do they perceive?
You can find the study, sadly only in German, along with a shorter policy paper we published in which we also give some recommendations. In the study we talked to 20 experts in the field and did 10 case studies, all very different. Let me introduce three of the projects we looked at. One is Quantified Trees, a project by CityLAB Berlin. It's connected to a very down-to-earth, in the true sense of the words, project called "Gieß den Kiez", i.e. "water your neighborhood", which is about watering the trees around the city. In summer, due to climate change, hot weather, and little green space in some parts of Berlin, these trees might not get enough water. Quantified Trees uses data about the trees to predict which tree might need watering most, and can alert citizens that certain trees need more water, because there is knowledge about the type of tree, its position, and how the weather has been. So there can be a good prediction of which trees might need water in the coming days.
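A prediction of this kind can be sketched, in toy form, as a water-need score per tree combining recent rainfall, temperature, and species. The field names, species factors, and weights below are invented for illustration and are not taken from the Quantified Trees model.

```python
from dataclasses import dataclass

# Invented species thirst factors, for illustration only.
SPECIES_NEED = {"birch": 1.3, "oak": 0.8, "linden": 1.0}

@dataclass
class Tree:
    tree_id: str
    species: str
    rainfall_mm_7d: float   # rain at this location over the last week
    avg_temp_c: float       # average temperature over the same period

def water_need(tree: Tree) -> float:
    """Higher score = more urgent: hot, dry weeks and thirsty species rank first."""
    dryness = max(0.0, 20.0 - tree.rainfall_mm_7d)   # shortfall vs. ~20 mm/week
    heat = max(0.0, tree.avg_temp_c - 15.0)          # degrees above mild weather
    return SPECIES_NEED.get(tree.species, 1.0) * (dryness + heat)

def watering_priority(trees: list[Tree]) -> list[str]:
    """Return tree IDs sorted from most to least in need of watering."""
    return [t.tree_id for t in sorted(trees, key=water_need, reverse=True)]
```

The real system predicts soil moisture from many more inputs, but the output shape is the same: a ranking that tells citizens which trees to water first.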
Another really interesting project, from which we learned a lot, is Common Voice. You might know it; it's from the Mozilla Foundation. It's a speech data set in many, many languages of the world for which there had been no or very little speech data available to build tools on. It works by donation: people donate their voices by reading particular sentences, thereby contributing that data to the data set and making it possible for technologies to build on it. As far as I know, it's the second-biggest speech data set available. For us it was particularly interesting to look at how that data is governed: how they actually built boards and collaborated with speech communities, in order not to go against those communities' wishes but to build the data set together with them. Maybe another interesting project is iCaptain, a technology that allows people to navigate apps with their eye movements. It's an open-source project, and any app can include the code, so that the app can then be navigated with eye movements instead of finger movements.
Overall, in our research we encountered the problem that you hear about "AI for social good" projects all the time; it's a buzzword these days. But beside the problem I already touched upon, that the discourse often offers very little definition of what the good, or the public interest, is, and that we need to speak about that from a theoretical point of view, there is also no good data set on what these projects actually do: who is funding them, what team is working on them, which technologies they use, and many other questions we asked ourselves. Usually you only get very superficial documentation of these cases. So we started researching and compiled a large data set ourselves, of cases we felt somehow belong to this range of potential projects. But we also wanted to ask the projects themselves to contribute to more data and a better knowledge base on what public interest projects define themselves to be and how they understand the public interest. Since it is never universal and never defined from one perspective, we believe it is very important that they bring to the table how they understand it and what they actually do.

It is also interesting that many of these projects run for a specific time and then cease to exist, a problem we learned much about in our Civic Coding study, because the reusability of the technology is not yet very good. Projects are not well documented; they're not available on, for instance, GitHub; and so it's very hard for other projects to find out which projects existed, even related ones, who was responsible, and how to maybe build on an existing technology. We called it a graveyard of AI projects, because they're often funded for only two years or so, and after the funding period you can't find anything about them. So we also wanted to learn
how long projects actually exist, how long they run, and how long we can find them. That's why we initiated publicinterest.ai, which we see as an interface for the community talking and thinking about public interest AI. We try to offer tools there for mapping and sharing knowledge on public interest AI, and we are hosting a map which gives people the opportunity to take the survey I was talking about. I will go to the map shortly, where you can add your project, and this is the moment where I definitely encourage you: if you have a project in this area, please fill out the survey. This data will also be open to others. Everything people fill out in their profile (right now 30 projects have done so) becomes available as a profile: if you go to the website and click on one of the dots, you see that there is a quite extensive profile. We will also share the data set with other researchers, once it's a little bigger (we're still hoping for more contributions), through GESIS, the social science data-sharing network. If you know people in this field, it would definitely be helpful if you sent them the survey and asked them to fill it out, to give these projects better visibility, enable exchange among them, and allow better research on what public interest projects actually do and how they work.

Another thing we started here, which is hopefully also helpful, is the stakeholder index. We collected not only projects in the field of public interest AI but also stakeholders, and we felt this might help people understand which stakeholders are funding public interest AI, which are interested from a political perspective, which think tanks work on this, and which research institutes work on this. This is an open list, so if somebody is missing, please send them to us,
and we're happy to include more people globally. We are also expanding our research on this as we go, but we're happy to receive recommendations, because this is supposed to be really helpful for people researching in this field who want to understand who else is part of the ecosystem.

Maybe one other thing I can mention, something not yet very visible on the website, is that we started a fellowship for public interest AI, because we realized there is a problem on both sides we are connected with. On the one hand, academic institutions have students who want practical experience, and they mainly go into industry; we wanted to offer an alternative to that playing field. On the other hand, NGOs working with AI and data science have a hard time finding new talent. So we thought: why don't we make the connection? That's what we're starting to do right now. We have a pilot with three organizations: Amnesty International, Forensic Architecture, and an anti-corruption data lab. Each will have one to two fellows, data science and computer science students, right now from Berlin or Cologne. If you are at a computer science department and would like your students to take part in this program, maybe reach out to me, because we are hoping to expand this and create a network of exchange, also among those NGOs, which are all now exploring data science and computer science and all said they would love to exchange with other organizations about how they do it. (Oops, sorry, there we are again; we try out a different glitch every time, just for entertainment.) This is pretty much the publicinterest.ai website, but maybe another important part of it that I should speak about: if you scroll down a little, we see this also as a discussion space. As I said, there needs to be
deliberation about what the public interest is. Based on our research, we try to explain which ideas we developed about what public interest AI should look like and what it needs to be aware of, and we speak about six criteria we think are important as part of this discussion.

The first is justification: the question of why a team thinks it needs to do something in the public interest by using AI, so why this technology? If it's in the public interest, there needs to be some kind of conversation and the possibility for others to argue with it. So there needs to be something like a public justification, which could simply mean a website explaining what the project does and wants to achieve, to give people a chance to contest it, or to agree with it and help it.

The next point comes from the strong value of equality and argues that equity is a core value for public interest AI systems. By that we mean that, at the least, systems should not discriminate or harm equity, but should in fact somehow strengthen it. That is also our idea, for instance, with the simple-language tools: we try to strengthen the position of people who, as language learners, might need this additional accessibility in order to participate.

The next part is participatory design and deliberation. This might look very different in every tool and application, but generally we believe there needs to be a thought process about who is affected by this technology, how those people can be included, how participation can be offered, and how it is designed to be meaningful, because it's a real problem when user testing happens at some point but then doesn't really affect decisions. We believe this definitely makes technology development harder and more complicated, and we encounter in our own projects that it is not easy, but we think
it's very necessary and also beneficial to public interest AI projects.

Of course, technical standards are an important point: safety of systems, but also robustness. Does the system actually have the accuracy we want, and if not, might a simpler technology be more robust and give us a better result? That goes along with the idea of openness for validation and scrutiny. In some cases the data cannot actually be made open. For the training data set for Freya's simple-language project, it was really hard to find an open data set, because there just is no publicly available data set big enough to train something on standard and simple German. So we got a data set from APA in Austria, and even though we tried to convince them to make it open data, it isn't; only a different version is open now, and not as well annotated as we hoped. That is maybe understandable from their perspective, but for us it was very important, so we negotiated that we can at least show what we did and how we trained the model, and make enough data public to explain our work and open it for validation by others. We will share that on GESIS as well, provide a portion of the data, show how we annotated it and trained the model, and try to document it so well that others can actually work with it.

Last but not least, we believe sustainability is very important. Others speaking about public interest AI have conceptualized it as non-malicious or non-harmful technology, but we thought about that and felt it might not be enough for the public interest. Rather, a project needs not only to be aware of the footprint that AI itself leaves (we can speak about that later if you're interested) but also to work towards a more sustainable world, in terms of social sustainability as well as ecological
and economic sustainability. That is a challenge, but we believe those are the things systems should be mindful of and work towards.

Now I'm reaching the moment to speak a little about our future goals and what we want to achieve in the next year. We want more discussions and also more shared standards for public interest AI. We believe we now really see an ecosystem emerging of people working on this; in the US, too, more and more people are working on the same topics, and there is a tradition of work on public interest technology. I believe a lot of these conversations are actually about the same things, whether you are speaking about AI projects or other tech projects. So these conversations need to come together and produce shared standards that people can hold on to if they want to do something along the lines of public interest AI. For that, we as an ecosystem need to become better at openness: at speaking about learnings, but also at open standards, open data sharing, and the reusability of systems, experiences, and data.

This brings me to the point that we're always open for exchange, be it with very practical projects or other research teams, and also for collaboration. That's also how this fellowship happened: we learned that there are specific needs we can maybe help with. We would really like to improve the ecosystem for public interest AI in any way we can. Right now that sometimes happens in very informal clinics, which we would like to formalize a little better, where people come to us who want to do a project along these lines and want to have a conversation about it, because they are maybe in an early phase of their project and there are some obstacles ahead. Sometimes actors, specifically from the civic field, want to understand whether they would benefit from working
with AI but have a lot of questions and uncertainties, and there it can sometimes be very helpful to just have a conversation about our learnings: what they might want to watch out for, or where they might want to get additional help. That's maybe it for now. I'm very happy to answer any questions you might have. You can ask them right away, just raise your hand and introduce yourself if you want to, or post them in the chat, and my colleague Daniel, who is with me, will read them to me or point me to them, so we can start having a conversation. Thank you so much for your attention.