Welcome to our Open Talk series. My name is Theresa Züger and I'm the research lead of the AI and Society Lab. I'm leading an interdisciplinary research group that wants to find out how AI can serve the public interest. Pretty early in our research we understood that this is such a big and complex question that we cannot solve it completely by ourselves. So in this series of conversations we speak to people who bring in their experiences and their research to find out the limits and the potential of AI to serve the public interest. We really hope you enjoy these conversations.

Just to introduce myself real quick: right now I'm a technology policy researcher. I'm a former data scientist in think tanks and local government in the United States, and I'm pretty invested in the data-science-for-social-good education space. I want to talk about algorithmic services in the German federal government, but really what I'm trying to show you is where public interest AI, in this really narrow formulation of algorithmic services, falls within all of this.

So what's going on in this space? In 2021 you get a federal data strategy which, a little more concretely and a little more recently, has been driving decisions. It calls for a data atlas, which would share data between all the ministries and is intended to be a big data register. It calls for every ministry to get a data lab and a chief data officer. Some of this has to happen because they promised it and allocated 239 million euros to do it, and that includes the ministry data labs and the chief data scientists. So I think we can reasonably count on it happening, and it seems to be happening: some of these labs exist, for instance one in the chancellor's office, one in the foreign office, and a few others. And in a lot of ways I would say algorithmic services are kind of an afterthought.
There's so much more core infrastructural work happening in the federal government that if you cut out the AI services, most people wouldn't notice. So when we do talk about algorithmic services, how do we know what's going on? We actually have some brand-new information, and the reason we have it is that a member of the Bundestag, Anke Domscheit-Berg, sent a request to the Ministry of Education and Research and said: I would like to know about all the deployed AI systems in the German federal government, and I would like to know about all of the pilot programs and research projects on AI in the German federal government. The response to her is, as far as I understand it, by a long shot the most thorough documentation of algorithms making meaningful decisions in the German federal government.

So I think internal versus external development matters, the amount of money matters, and then: what standards or processes are people going through before deploying these? And the answer is none. There are no meaningful risk assessments. There is no consistent process for stakeholder engagement. There are no obvious post-deployment monitoring or auditing standards. In her blog, Anke writes about how many of the ministries fundamentally didn't understand the question "are you doing an algorithmic risk assessment?" If we're wondering whether we've successfully communicated the range of challenges of AI in government services, we may have more to do there.

So this is really interesting; this is a big development. It amounts to an algorithmic registry, which in fact most countries don't have. Given this, I have some recommendations. You could take that list from the BMBF and put it on a website rather than just in a PDF. You could formalize it and expand it over time to keep it updated. You could keep a change log, and I think that's important to keep, so you have: OK, here's what it does, and here's what we're saying about it now.
And here's what we used to say about it, right? A little bit of public transparency on how you're portraying these systems. You could create a metadata standard for documenting them, rather than having to do a survey where you go and ask everybody questions. You come up with a format and you say: if you're building an algorithmic process, that's particularly important; we would like you to write down these 20 things and then just tell us. Then you have a consistent process of reporting. And then you can adopt formal risk assessment processes, again in advance of deploying these systems, and you can imagine extending this into monitoring and evaluation.

I really think we want to be routinely advocating for internal expertise. The reason is that these algorithmic services already make important decisions, and it's not like using a word processor: it's the words themselves, right? It's the actual government policies that you're shoving into code. And I just fundamentally think the government needs to be responsible and held accountable for this. The other reason to do it is function: if you think this is going to be important, then you need to build this competence sooner or later, and you're better off just doing it up front.

And that leads me to my last point, which is that this stuff matters, and the sheer amount of activity and the sheer number of institutions involved is high enough and complicated enough that it would actually be great to have a comprehensive plan for the government's use of data, and probably also its technology processes. I haven't been able to find a single consolidated document that says: this is what the digital service for government is going to do, this is what the open data team is going to do, this is what the interior ministry is going to do.
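Picking up the registry recommendations above, a minimal sketch of what one entry under such a metadata standard could look like, with a built-in change log. The field names, class name, and example values here are invented for illustration; they are not drawn from the BMBF list or any existing German federal standard:

```python
from dataclasses import dataclass, field
from datetime import date

# Hypothetical schema for one registry entry; field names are illustrative only.
@dataclass
class AlgorithmRegistryEntry:
    name: str
    ministry: str
    purpose: str
    decision_role: str                  # e.g. "fully automated" or "decision support"
    risk_assessment_done: bool
    changelog: list = field(default_factory=list)   # (date, note) pairs

    def log_change(self, note, when=None):
        """Record how the public description of the system evolves over time."""
        self.changelog.append((when or date.today(), note))

# Example entry (fictional system, illustrative values).
entry = AlgorithmRegistryEntry(
    name="Example document-triage model",
    ministry="BMBF",
    purpose="Prioritise incoming applications for manual review",
    decision_role="decision support",
    risk_assessment_done=False,
)
entry.log_change("Initial public registration")
```

With a fixed set of fields like this, ministries can report consistently instead of answering ad hoc surveys, and the change log preserves how each system was described at each point in time.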
If no one does this, you can just guarantee there is going to be overlapping confusion, some people doing the same thing, and other people not growing into certain roles. Having just spent four months looking at this, I'm still quite confused. So I imagine there are some other people out there who might share that lack of a holistic understanding of precisely what the intent of all these organizations is. Certainly there are some criticisms, but also, coming from the US: everybody's a mess, right? Nobody's doing this really well.

Maybe an interesting first question, since you have this very important international comparison: where would you say there is some good potential and some strength to actually build something like public interest AI, and where are the biggest obstacles that need to be overcome, at least from the government side?

So in terms of talent, which is really hard to just manifest, Germany has that. And so finding the human resources barriers and the bureaucratic barriers that are preventing that talent from coming into government and then doing its job effectively is probably the more immediate challenge.

I understand that, but you seem to be advocating for a lot of people being hired: not just a central office setting top-level strategy, but actually a lot of people making the things. And I guess my question is: do we have evidence that governments can actually do this?

The second you get into stuff that really requires understanding the data, you're into the type of things I think about as data science: combining a technical understanding of algorithms and statistics with domain knowledge, right? And that domain knowledge is critical. And those teams really should be working not only within the domain, but also in the ministries with the people who work on the front lines, right?
Both because they'll be better at building the services and because there's a meaningful exchange between those data labs and services and the people doing the jobs of the ministry. Can governments do this? Yeah, hopefully; the alternative is pretty grim. Just to give you a sense of what happens if governments can't do this, and we're already in this situation in some places: you're stuck with totally analog, slow services. You have to show up at a physical location to sign a form and hand it in, right? If government really can't deliver services while the quality of the private sector ones continues to get better, that is a problem for democratic legitimacy, and I think a pretty big one. It's hard to quantify. You could imagine a government that functioned well but outsourced most of its algorithmic services through procurement. I can imagine that working, but it makes me very, very nervous, and I am broadly opposed to it for a couple of reasons, the most prominent being this accountability problem. It is a good question; building that governmental capacity has been very hard, but I think it's absolutely worth it.

I wanted to agree with you on one point, and that is the focus on data instead of algorithms. I think it's very important to first recognize the importance of data, of collecting it and understanding it, and also of understanding the responsibility that ministries have for the data: that this infrastructure is actually the biggest part to build, and the algorithmic services are really just the layer on top. So I really agree with that. And one thing you said made me think a little bit about where we could provide maybe a part of the vision that you are trying to encourage, and that is the question of how to actually involve people in this process and how to create participation. The answer that I think we would have so far is that it really depends on what the service is supposed to do.
There could be many, many ways to design this participatory process; in different models, and in different cases where different civil rights are somehow touched, very different methods might be the right way to go. But also in this gathering and understanding of data, I think there already should be some kind of board bringing in participatory, deliberative voices from other stakeholders, to have checks and balances in that process as well. So along the whole way, you could think about how to build a system with data labs that already include expertise from civil society, from very vulnerable groups and their representatives, because all of that is a sensitive topic. So, yeah, I think participation along the whole way would be an interesting issue to think about.

I'll just echo that. I think that is one of the least well developed areas of thought here, and guidance on it would probably be pretty valuable. I don't think I have that answer. But I do see the outcome where it doesn't happen: civil society groups say, we discovered this algorithm and it's terrible, and that seems to be happening somewhat frequently. And I wonder if we could maybe interrupt that process a little earlier and not have, say, the government of the Netherlands resign every time this blows up, right? Which is a real story that happened recently, in part because of that. You could imagine a stakeholder process having completely prevented it.

That could be one of the main topics for this central institution: to provide guidance on how to incorporate different forms of participation in the process, to come up with different models, and also to advise on and maybe moderate this process for the different ministries and the different data labs. So that could be one of its responsibilities, and it would be a very high responsibility.
Yeah, and I think one that is not a nice-to-have on the side. Right now, participation is very often this "oh yeah, we did a workshop, and there was a hackathon, so there was some kind of participation." But this can still backfire really, really badly. So to have that at the core, to make it really important, and to have methods that can be learned from and best practices that can be shared among all the data labs that hopefully will evolve at some point, could be a really essential thing.

And I was wondering if there are governments which are generally more transparent, or if there's maybe an example where more transparency came with that bigger topic of digitalization, where they became more transparent with that move towards visibility.

For algorithms specifically, I'm not sure there's an amazingly high standard where someone is racing ahead and everyone else is behind. The cities have done a good job: Amsterdam and Helsinki and New York City have all released pretty comprehensive algorithmic registries, and in some ways they're probably the leaders in transparency. At some point soon, the US will have a big public release. Maybe the countries with established open data and established transparency processes are a little faster to this, but there's not a huge disparity in the amount of transparency, if that makes sense. Rather, I think everyone's kind of figuring this out. And I don't even think Germany is behind here so much as it hasn't formalized its process.

Maybe to pin you down a little bit more: since you have the very broad overview, can you give us one really good example where you would say, this is public interest AI in action, in your understanding?

I do love the Chicago lead paint example, which is one of the classics.
The city of Chicago has a problem where many of its homes have lead paint in the walls; like three layers of paint back, there's lead paint in the walls. Over time the paint chips off and ends up on the floor. It's not a big deal until young kids crawl around on the floor, get the lead paint on their hands, and then put their hands in their mouths, because they're babies. And then they get very, very sick; lead is very, very dangerous. So the city has inspectors who go around to the houses and look, but there are way too many houses and the inspections are random, so their ability to find lead paint is really, really low.

So the Center for Data Science and Public Policy, led by Rayid Ghani, says: I think we can predict whether or not there's going to be exposed lead paint in these houses better than random inspections. And it turns out that they can. They have a historical database of whether or not there's lead paint in the wall. They have how old the building is. They have whether or not it's been derelict, or whether it hasn't paid its property taxes. None of this tells you directly whether there's lead paint in the walls, but it correlates enough to indicate that some houses are much more likely than others to have exposed lead paint.

One of the things I really like about this is that it's such a good story. They managed to build a thing and it worked, right? They could predict the houses most likely to have lead paint with dramatically higher accuracy than random inspection. And you're like: great, you're done. No. The implementation and deployment process took years, right? You have a couple of problems. You have to time your predictions not just randomly, but for when young kids are going to be in the house. So you actually have to run this model when parents are in the hospital having a child, because that is actually how you find out this is happening, to prioritize that house for inspection.
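The core prediction-and-prioritization idea just described can be sketched very roughly like this. The features echo the talk (building age, past dereliction, unpaid taxes), but the weights, the logistic form, and the addresses are invented for illustration; the real DSaPP model was far richer than this:

```python
import math

# Toy risk model: weights and offset are invented for this sketch,
# not taken from the actual Chicago lead-paint model.
def lead_paint_risk(building_age, derelict, unpaid_taxes):
    """Logistic score from correlates of exposed lead paint (not proof of it)."""
    z = 0.04 * building_age + 1.5 * derelict + 0.8 * unpaid_taxes - 4.0
    return 1 / (1 + math.exp(-z))

# Fictional houses, scored on the illustrative features.
houses = {
    "12 Oak St":  lead_paint_risk(95, derelict=1, unpaid_taxes=1),
    "34 Elm Ave": lead_paint_risk(40, derelict=0, unpaid_taxes=0),
    "56 Pine Rd": lead_paint_risk(80, derelict=0, unpaid_taxes=1),
}

# Inspect highest-risk houses first instead of sampling at random.
for address, risk in sorted(houses.items(), key=lambda kv: -kv[1]):
    print(f"{address}: {risk:.2f}")
```

The point of the sketch is the ranking step: the model doesn't have to prove lead paint is present, it only has to order houses well enough that targeted inspections beat random ones.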
So they had to integrate this into the University of Chicago hospital system, essentially. And I think it's useful to really pay attention to examples where the use case is clear and the algorithm is doable. That first part, they probably could have done in a couple of months, right? Get the data, make the algorithm: yeah, this works. But public interest AI typically involves a really complicated, long, drawn-out deployment process. Other problems: you've got to get the inspectors to buy in. If they don't care about the problem, if it doesn't make their jobs easier, if they don't get more credit for it, are you sure they're even going to take up your idea? There are a couple of examples where you see great algorithms that do the thing well, and then the stakeholders ignored them because it didn't make their jobs easier. So I think that's a really good example of a really high-impact, really powerful algorithm that is now working. But you can see by looking at the work where the challenge really lies, which is often in the deployment: in the shaping of people's incentives, in making sure it's making a prediction at a meaningful time, and then making sure it's going to keep working.

Our Open Talks are open for collaboration; contact us to get involved.