New America would like to welcome you to our virtual event. The program will begin momentarily. While we are waiting, I want to review a few housekeeping notes. As the event is being recorded, a recording will be posted to the New America events page within 48 hours after the event. Attendees will be in listen-only mode, and you will not be able to be seen or heard by your fellow attendees or panelists. Therefore, we ask that you submit your questions via Slido. Slido is the box that you can see to the right of your screen. Should you have any questions about submitting questions via Slido, please send them to events@newamerica.org. Thank you for joining us.

Hi, everyone. Thank you so much for joining us today. My name is Lauren Sarkeesian, and I'm a Senior Policy Counsel here at OTI, focusing on government surveillance and privacy issues as well as AI. For the past few years, OTI, led by my colleague, Spandi Singh, has been looking into how internet platforms develop and deploy AI systems and machine learning-based tools to curate the content we see online. OTI has also done work related to government uses of artificial intelligence, including in the facial recognition and procurement contexts. We're very excited to have released a new report last week, which explores and compares the different mechanisms that internet platforms and governments can use to promote fairness, accountability, and transparency around high-risk AI systems. We tend to talk about many of these accountability mechanisms in a silo, so today's panel will focus on exploring some of these approaches and if and how they can work together to generate greater accountability around high-risk AI systems. We'll have some time at the end of the panel for questions, so audience members, please submit questions via Slido.

Now I'd like to briefly introduce our panelists. First, Professor Catherine Sharkey, who is the Segal Family Professor of Regulatory Law and Policy at NYU School of Law. Professor Sharkey is one of the nation's leading authorities on the economic loss rule, punitive damages, and federal preemption. Last year, she co-authored the Government by Algorithm report, on how federal administrative agencies are and should be using artificial intelligence, for the Administrative Conference of the United States, or ACUS for short. Next, we have Dr. Christine Custis, who is the head of About ML, a program at the Partnership on AI. Her work focuses on bringing together a diverse range of perspectives to develop, test, and implement machine learning system documentation practices at scale. Christine also leads fairness, transparency, and accountability work at PAI. Finally, we have Spandi Singh, who is a policy analyst at OTI. Spandi's work focuses on content moderation, disinformation, algorithmic accountability, government surveillance, and privacy.

In OTI's recent report, we covered a broad range of mechanisms that can be used to promote fairness, accountability, and transparency around high-risk algorithmic systems. These include machine learning documentation frameworks, transparency reports, government procurement processes, audits, impact assessments, and more. I wanna start off by homing in on some of these approaches and exploring what advantages they may have to offer, as well as what limitations they may have, when we think about promoting fairness, accountability, and transparency around AI. So, Christine, we're gonna start with you.
About ML is a multi-stakeholder initiative that is trying to push for the implementation of machine learning documentation practices at scale. Could you explain a bit about what the About ML program aims to do and why you think this approach could be useful when we're talking about promoting transparency around high-risk AI systems?

Absolutely. Thank you so much for the question. So, About ML is actually an acronym, and I'm gonna read it because it's so long and I wanna get it right: Annotation and Benchmarking on Understanding and Transparency of Machine Learning Lifecycles. And in short, this is work that we believe is imperative because it is how we think we can operationalize transparency at scale through documentation. It's the way we believe we can action responsible AI, through the documentation of the different phases of the machine learning life cycle. And so what we hope to do is set a new industry norm of documenting all ML systems as they're being built and as they're being deployed, changing this entire practice at scale. A lot of times when you hear about the software development life cycle, there is documentation inherent in the life cycle that's well-received and well-known. Somehow in the machine learning life cycle, that's not so much the case. So we're trying to bring that to bear in that life cycle through research-based, multi-pronged initiatives, soliciting feedback from stakeholders and baking ethics right into the process.

Thank you. Thanks for the explanation. Catherine, the Government by Algorithm report had some really interesting insights around government procurement of AI tools, noting that just under half of the algorithms government agencies use right now are procured externally. Can you talk about the instances in which government procurement of external algorithmic systems could be beneficial to promoting fairness, accountability, and transparency in these systems, and also discuss some of the concerns that arise from relying on just procurement processes?

Sure. And thank you, thanks to Spandi for the invitation to join this panel. So by way of very brief background, I was part of a very large collaborative study that engaged NYU and Stanford law students and computer scientists. We brought together the lawyers and the technologists; I had colleagues at Stanford on the project. It was commissioned by ACUS, the Administrative Conference of the United States. That's an independent agency that tries to be a kind of fertile ground for the production of useful empirical information that agencies might be able to look to. So in a nutshell, we were struck by the fact that numerous commentators and scholars were very engaged in the question of how AI should be regulated, with much attention on how government might try to regulate the private sector but actually less attention on governmental use, and yet there was very little information about actual uses within government, whether fledgling or more well-developed. So we tried several different takes in this report. We did a canvass of the largest federal agencies, measured by number of employees, 142 in all; we researched and canvassed them. And of those, one of the interesting findings was that nearly half had what we would call existing uses of machine learning slash artificial intelligence. Popular opinion sometimes holds that nothing's happening within government, or that government's way behind the private sector. So that was one finding.
And then, Lauren, as you alluded to, we tried to look at not only whether they are using these tools; we tried to classify the uses by level of sophistication and also by whether they were coming primarily from in-house or through outside procurement. And our findings were actually that 53% were coming from in-house, so we can come back to that. But that's interesting in and of itself, because again, the procurement piece is an important angle, but I think often missed is what's happening with respect to internal capacity development within federal agencies themselves, maybe thinking about why this is happening, maybe thinking about some of the things that Christine alluded to in terms of getting some kind of accountability frameworks ensconced within federal agencies. But 33% were being contracted from the private sector. And you're right, there's another 14% that makes up the rest of that external share, which consists of non-commercial collaborations. And that's important too, because government is collaborating with nonprofits and with academic entities, but roughly one third are being contracted from the private sector.

And so again, you raise a very broad question; I'll try to be somewhat succinct. There are advantages and disadvantages. Maybe I'll start with some of the disadvantages. Sometimes we have to be careful about off-the-shelf private sector tools being imported for governmental use, right? What I would focus on is the need to get the lawyers, policy people, and technologists in the room together while developing these tools. Oftentimes within government, there are very nuanced, complex tasks that are very policy-dependent, and it's a little too late, kind of ex post, to ask, do these tools meet our objectives for accountability and the like? So that's one piece that I would emphasize: sometimes there are gonna be costs with respect to ensuring accountability and ensuring that these tools sufficiently meet all the nuanced policy goals within the agency. There can be costs of monitoring such tools and the like as well.

On the plus side, there's a kind of advantage, and this was one of the features that I particularly liked; there was much to commend in this Cracking Open the Black Box report. But one thing that I liked was a kind of theme, or an emphasis, on how government potentially could use this procurement process to force some accountability. And so I do think that there are very interesting ways that, through the government procurement process, the government can demand certain types of accountability standards. I also think there's a lot of discussion about how it will never be possible because of all sorts of barriers in the private sector with respect to trade secrets and patents and protections. And my response to that would be, well, with respect to dealing with the government, maybe as a lever the government's gonna be able to demand a certain type of access, to be able to ensure the kind of accountability that we need for these types of tools. So I do think it's very worthwhile to think about the ways in which this process can be a kind of carrot, maybe sometimes stick, but a carrot for enforcing some of these accountability mechanisms.

Yeah, thanks for that. And for going back a little and providing the background about the report, which was really just so crucial and provided the most insight by far into how the government is using AI. So anyway, yes, I jumped ahead a little bit with that question, but I appreciate the background.
Spandi, next over to you. In OTI's recent report, you wrote about how mechanisms such as algorithmic audits, risk assessments, and impact assessments can be used to rein in high-risk algorithmic systems. Can you talk about what value you think these kinds of assessments have in promoting fairness, accountability, and transparency (FAT for short, we'll say) around high-risk AI systems, and whether you see any limitations?

Thanks, Lauren, and thanks to all of our panelists for joining us today. So I think if conducted properly and transparently, audits, impact assessments, and other methods of evaluation can be valuable mechanisms for promoting FAT around high-risk AI, both in the corporate and government sectors. Some of the benefits of these approaches: they can shed light on the opaque nature of high-risk AI systems. They can help evaluate specific variables that we're often concerned about, like privacy, bias, fairness, and human rights. They can examine unintended harms of a system and help an entity create a roadmap for mitigating these issues. They can enable investigation of certain concerns by external stakeholders such as impacted communities or civil rights organizations. And they can also help an entity determine whether its system is in line with its own internal policies and/or external standards or regulations.

But I think in order for these assessments to deliver these benefits, we need multi-stakeholder consensus on definitions, we need standards for these evaluations, and so on. So I think it's been really interesting and great to see over the last few years how stakeholders from civil society, government, and platforms have been debating and encouraging the use of these methods. But I think the key limitation I see in the space is that we haven't answered the fundamental who, what, when, where, why, and how questions, which are really important if these evaluations are going to be effectively deployed.

So in terms of who: who is conducting these evaluations for maximum accountability? Is it a third-party entity? Is it the companies or government agencies themselves? Is it a combination of both? There's a lot of really great literature, and we talk about this in the report as well, questioning whether an entity that is grading its own homework is really generating the kind of accountability that we need. The second one: what are we evaluating? Are we talking about auditing for privacy, or doing an impact assessment on risks to free expression or bias? You know, not all high-risk AI systems have the same objectives or designs, so we might even need multiple different frameworks depending on the context in which a system is being deployed. The when question is: when during the AI lifecycle should these evaluations take place? As we talk about in the report, many stakeholders have suggested that algorithmic impact assessments should occur before a system is deployed, to identify potential harms. Others have suggested that audits should take place pre- and post-deployment, and that they should be taking place continuously to ensure that systems are still functioning as intended. The where question is: where are the results of the audits communicated, and where is the oversight coming from? And then the why is: what goal are you trying to achieve with these assessments? So I think we need clear answers to these questions in order for these assessment methods to be valuable in promoting FAT. Right now, the use of these evaluations is all voluntary.
So, you know, vague and unclear guidance is generally a disincentive for an entity to participate in them. And I think a lack of standards and clarity also undermines credibility. There have been a number of really interesting audits, for example, that have been conducted by journalists and researchers, but since there is no consensus around the methodology and the correct approach, the applicability of these findings is generally limited.

Thank you. And switching gears a little bit, since there are so many different accountability mechanisms to cover here: Christine, we're gonna turn back to you and talk about documentation frameworks. So there are numerous existing frameworks for data and systems documentation in the machine learning context. These include Microsoft's Datasheets for Datasets framework, IBM's FactSheets framework, and the Model Cards framework that Google initially championed. These approaches share some similarities and differences. But can you talk about if and how these frameworks could be applied to a high-risk AI system, and how impactful they can be?

Yes, absolutely. So I'll just do a little bit of a pitch here: I wanna encourage all of the folks that are participating today to take a look at the Partnership on AI's website. It's partnershiponai.org. We have a reference document, we have a resource library, and lots of great tools out there. And we'd love to hear from you if you ever use them, about your experience. But I will say that throughout this machine learning lifecycle, there are so many opportunities for record keeping and documentation: throughout data specification and curation; data integration, one source with another; the maintenance of that data, keeping it fresh; and throughout model specification, training, evaluation, integration, and maintenance. There are all these opportunities to keep track of what is happening, who did it, and what the outcomes and impacts are, to the system, to the users, to the impacted non-users, right? And so there's opportunity throughout the lifecycle to do this type of record keeping and documentation, which is beneficial for a high-risk AI system, because if we don't get it right, we at least wanna be able to go back and trace the error. We wanna be able to figure out the source of the harm. So we give ourselves this opportunity throughout the lifecycle, keeping track of all sorts of administrative minutiae so that we can go back and check it.

And I will say, you mentioned all the different types of tools and methodologies that are out there. Right now, we're doing a meta-analysis of tools, and we've found 400 so far. There are so many different ways to document an ML system, and then there are ways that we may not even have discovered. So the way you choose is not necessarily as important as your consistency, that cultural shift into record keeping and being transparent through your documentation. And then I'll also say that not every way is right for every organization. For instance, a small startup dealing in patient care in the UK may not wanna use model cards. It's very bulky; you won't have one card, you'll have multiple. You have GDPR to consider. You have privacy and security of personally identifiable information for patients.
So it's a really great opportunity to get out there and see what's available, figure out the specific use case for the problem you're trying to solve, and just find ways to mix and match, cafeteria style, different approaches to documentation throughout the entire life cycle of your machine learning system. I hope I answered your question.

Yeah, that's helpful. Thanks for that. Spandi, back over to you. A number of legislative proposals in both the EU and the US have included provisions that would require internet platforms and government agencies to conduct audits or assessments of their algorithmic systems. What are the key elements that such legislation needs in order to appropriately address the risks and harms associated with high-risk AI systems?

Thanks, Lauren. So as you mentioned, we've seen legislators everywhere from the EU to the US to Canada and beyond put forth proposals that include audits and impact assessments and so on. So it's good to see that policymakers are thinking through these important issues, but I think a lot of these proposals generally lack teeth, for three reasons.

First, there's a lack of clear, consensus-based definitions for terms like automated decision systems, high-risk AI, amplification, and so on. And I think in a civil society context, perhaps we may have a general understanding of what these terms mean, but when you're trying to write them into law, they need to be a little more granular. This lack of clarity makes it practically confusing and difficult for companies or agencies to implement these evaluations. It also makes it difficult for external stakeholders to research and understand the scope of these methods.

The second reason is that a lot of these proposals lack clear guidance on how these evaluations should be structured and implemented. Some policymakers have proposed drawing on existing guidance around assessments from sectors like the financial sector, where auditing is a very common practice. Auditing for discrimination in housing is also common, and sectors like the environmental and aerospace industries also have pretty robust evaluation practices. I think this is a good first step, but in order for us to really talk about the auditing or assessment of high-risk AI, we need to contextualize any best practices we get from other industries and really get more granular.

And then the third reason is that these proposals don't really address the intermediate needs, such as the need for funding to establish an external third-party auditing or assessment landscape that is well resourced and legitimate. Without this, I think we could see a lot of different scenarios play out, but one of the most concerning is that we could see the communities most affected by high-risk AI systems have to take on this external review function, and they may not have the necessary resources to do this, and they also won't be compensated for this work. And in that way, this can further exacerbate the existing inequities that these technologies can create.

In terms of what good legislation should have, one element I think is really critical: good legislation should propose clear guidance around transparency of results. So at the very minimum, I think entities subject to evaluation should be required to publish a summary of their findings and of how they've addressed the issues that were identified.
In cases where audits or assessments are being conducted externally, the evaluators may need access to sensitive data, and I think legislation can help establish these data-sharing structures. Right now, there are many conversations happening about, for example, researchers accessing internet platform data. And I think that these sorts of conversations, around how structures can be set up to assuage platform concerns around trade secrets and privacy, can inform this broader AI evaluation conversation. And then lastly, I think legislation also needs to address the broader ecosystem that enables high-risk AI systems to be so potentially harmful. So for example, in the US, we need comprehensive privacy legislation to rein in data collection practices; a lot of this data goes into training new models. And I think we really need to institute clear and strong limits, and to encourage platforms to promote user controls as well, so that we're not just focusing narrowly but also looking at the larger ecosystem of things that could impact FAT.

Thanks, Spandi. Okay, Catherine, over to you next. We've all been talking a lot, in various contexts, about transparency. In the ACUS report, you talked about how transparency mechanisms need to generate actionable and interpretable transparency in order to promote accountability. The report also mentions that administrative law works in tandem with an array of data and disclosure laws that in their current form can sharply limit transparency. Do you think that these kinds of laws need to be reformed? And if so, how?

Great. So I wanna go back, if you don't mind, just for a minute to contextualize the question, and go back actually, Spandi, to some of your second-to-last remarks about the various tools, the audits, the impact assessments, and then you came back to it here with respect to legislative proposals, because I'm very struck. I'll give a very quick analogy. A prior ACUS field study I did went into federal agencies and looked at federal agencies that had gotten very aggressive about preempting or ousting state tort law. So they were coming up with federal standards, and as part of those standards, they would oust state tort law. Now, it turned out in administrative law that agencies were supposed to be filling out what are called federalism impact statements. So you alluded to the environmental impact statements and some audits in the financial sector, but I actually think that federalism impact statements are a very close analogy here, because it's a very policy-inflected area. We get into disputes over what is federalism, what would a federalism impact be, how are states affected by what the federal agencies are doing?

And I'll never forget, as a personal footnote: I went in to see a very high-level rule-writing person and policy leader within a federal agency, and I started talking about these federalism impact statements, and how could it be that a federal agency, for example, had just written a rule to entirely oust state law and yet said there were no federalism impacts, right? It was written into the rule. And he turned around and showed me binders. And he said, do you know what those binders are? Each one was like this thick. He said, those are all of the required impact assessments that we're supposed to do every time we're doing rulemaking. So you wanna add a new binder? There's an existing binder for federalism, and it's like this size.
And you wanna put in all this academic intellectual content about what standards there should be, who should review it, where's the oversight, et cetera. And so one might at first, at least within the governmental context, be a little bit humbled: how are we gonna do this with AI? How are we gonna have not just federalism accountability, which I've given a lot of thought to, but, now let's switch it and say, AI accountability? And I guess the first note that I would make is that notwithstanding how important I think federalism is, and I argued then and still believe that it should have its right place in review of agency regulations, we all, I think, could agree that algorithmic governance is a sea change. So this is something really significant, really important. It should challenge us: those of us working in the real world, and those of us in academia like myself who are constantly thinking about administrative law and accountability frameworks. This isn't just some side area; this is a sea change. I believe federal agencies, let's just take the FDA, which regulates medical devices and drugs, are having a sea change with respect to how they have to regulate the use of AI within medical devices and drugs, what's happening internally, how their rulemaking will be affected, and, looking way down the future, what will happen when we have virtual clinical trials instead of actual ones. When we think about federal agencies, they are increasingly becoming data aggregators themselves. So Spandi's exactly right that we have to think about a lot of our typical frameworks, which I think can be imported into this space, but we have to think about them in a kind of transformative way.

So having said that, I would go back to some basics. I do think that there are some administrative law frameworks here. Again, to take the work that I've done in federalism accountability, there are frameworks that we can set up that force agencies, for example, to conduct these impact assessments with respect to AI, and that don't just leave it there; there are mechanisms of control. If we think about the Office of Information and Regulatory Affairs, OIRA, which does centralized review of agency rules and regulations, they typically focus on significant ones, sometimes defined as financially significant, but there are ways that we could refashion the use of machine learning algorithmic tools within rulemaking to constitute significance. Because, and here's the good part of the story, these administrative law frameworks that already exist focus on things like accountability, transparency, and reason-giving. Now, what they mean in this context, Spandi, you're exactly right: there's all sorts of lack of consensus and need for further thinking. But let's again go back to simple, basic administrative law. Let's assume an agency starts deploying these tools within rulemaking. They have to set things forward in a notice-and-comment rulemaking. They have to put out to the public things like the assessments. They have to receive input from all of these entities, stakeholders, random law professors, whomever is interested, to debate these kinds of concepts. And I think that's really, really important.
And I think that, and I'm not trying to necessarily dodge the question, but before we get to what other types of laws have to be reformed, we should first think about how this existing administrative law framework could work, which, by the way, not only could force agencies to conduct this internally and have some kind of executive review through OIRA, but includes judicial review too. And so the whole notion of judges, at least in this rulemaking context, getting up to speed, so to speak, and thinking about what counts as acceptable reason-giving for the deployment of these tools will go a long way. Sometimes I feel like we get stymied by thinking that the barriers are insuperable. So just to come back again to something very concrete and something I know quite well: the FDA is not only sort of the most powerful ex ante federal regulator in the entire world, but it has the most data, right? It doesn't just review tertiary analyses of data. And the capability of now getting more real-time ex post data with respect to, say, medical devices that are out in the real world, et cetera, leads us, I think, to the point that government at least already has the capacity to be imposing these kinds of accountability constraints on its own use and deployment of this data: sharing with the public through the notice-and-comment process, having internal executive branch review and external judicial review. And I think a lot could be learned through that process. I fear sometimes the idea of a kind of wait-and-see approach.

The final thing I'll say is, Spandi, you raised, and this question also raises, really difficult issues about the interplay between ensuring transparency with respect to reason-giving in this administrative law context and things like data privacy and also security. And the only comment I'll make on that is that it is an issue that is gonna cut across the entire federal administrative state. So right now, each of our agencies has its own mission and its own ways that it's starting to think about the deployment of machine learning and AI tools, but algorithmic accountability is one issue that's going to be cross-cutting across the entire federal administrative state. And I think a lot more thought has to be put into how we're gonna have coordination among federal agencies. So things like this panel are somewhat of a help, as are things like efforts by ACUS, which is trying to share information between federal agencies. Anything, I believe, that puts in the same room both lawyers and technologists to discuss this issue, and that also puts in the same room people from government and the private sector and gets them working together, thinking, as Christine was arguing earlier, about the life cycle at the very earliest stages, right? I've been on too many panels where some of the technologists will talk about how they'll go about developing, and then down the line administrative law, or the law, will say thumbs up or thumbs down. That is not the way this should operate, and hopefully we can collectively change the trajectory of thinking along those lines.

Yeah, that's really helpful. Thank you. And it sounds like our federal government is gonna need a lot of smart people embedded across it who know their stuff, and, as you said, both lawyers and technologists working together, as the ACUS report really emphasized.
So thank you. So I guess somewhat related: I think a lot of you brought up various points related to high-risk AI systems. In all of this, we're talking about maybe not using the strictest accountability mechanisms to regulate every single algorithm; we're often mainly focused on the highest-priority algorithms. So high-risk AI is often the term we're using, and there's this ongoing debate around how to define high-risk AI. I wanna pose the question to all of our panelists here: how do you think high-risk AI should be defined, and how can we as a community, I guess between civil society and academics and our private sector partners, work to reach consensus around a definition? Christine, looks like you're going first.

Yeah, yeah, yeah. I'll jump in as the first voice in the space. I wanna push back just a little bit on the categorization of risk. We do have, at the Partnership on AI, an incident database, and it's sort of a repository of harms, where we keep track of any harm that AI has caused once it's out in production, out in the world, and has already impacted human beings. And these range from "it misspelled your name on an application" to "an autonomous vehicle had an impact with a human being." So we have a repository of harms that we keep. It is not subjective; it's just a curation of news articles and other write-ups that have already been made publicly available. The reason I wanna push back just a little bit on which AI is high-risk AI is because of the cascading effect of any machine learning system or AI system, and the impact that goes well beyond deployment. We talked a little bit in the beginning about how AI can impact users, the people who log in and actually make use of the system, and can impact the engineers or developers who are involved in creating the system. But there's also a very important demographic that we often leave out, which is the impacted non-user. I like to use this example: you go to the bank and you apply for a mortgage and you get rejected; probably no one in that bank can tell you what the algorithm was that made that decision. And that's an issue, especially if you don't have insight as a stakeholder, as an impacted non-user of some algorithm or some mechanism or some collection of models; you don't have insight into how that information was used and is kept. So I think from one vantage point, it's all high risk, because there's so much happening in, not to use the term in the title of your paper, this black box. We just need more visibility and transparency. So I would say the risk, whether it's high or low or medium, is not necessarily the way we would categorize the AI; I'd say it's the way we should categorize our visibility into it. So it's high risk if we can't tell you what the heck is going on, and low risk if we know exactly what's happening but there are still some outliers, some things that could occur. So that's just what I'll add to the conversation in terms of the way we categorize the risk. But of course, I do believe that documentation is at least part of the solution.

Yeah, I'll just jump in. So I'm fascinated, Christine, by this metaphor of the cascading effect of AI. And I feel I'm torn here, because on the one hand, categorizing things as high risk is reminiscent, for example, to come back to the FDA, of having class III medical devices that are given the most scrutiny, because those are devices that potentially affect life, versus class I, which are like Band-Aids.
So in the governmental sector, we do need to have kind of different tiers; I think it's appropriate to have a kind of risk-management type of strategy. And yet I'm convinced, listening to you, and this gets back to the heart of the question, that we might be better served otherwise. I mean, you proposed one metric, in terms of how opaque or how explainable a system is. In some sense you would have a metric of where we are; there's a new emerging field of explainable AI. So we might have a continuum, and if you are on the end of more explainable AI, then you put that into lower risk. Another way to do it would be to be really focused, and again, I'm gonna come at this from the governmental uses, on the type of use. We might have, and I think we should have, very different demands with respect to the uses: if this is being deployed in a way that could, for example, take away someone's individual benefits or affect someone's right to a type of service, versus some kind of mechanism for policing a new emergent health and safety risk. So we might think about high risk with respect to the type of use. We might think differently about government uses that are enforcement uses, or uses in, let's say, the adjudication of individual benefits, and then things like the rulemaking process that I was alluding to earlier.

The final note I'll make brings me back a little bit. I hadn't used this cascading metaphor, but I worry in my space about governmental uses, where too often scholars will draw a kind of line in the sand and say that if it's just being used for a supportive use, we don't have to worry about the whole administrative law accountability framework I was talking about before, but if it's being used for a determinative use, like it's making the actual determination decision, then we do. And there's a lot of talk too about a human in the loop: so long as there's a human in the loop, we wouldn't be as worried. I guess I worry, and I've seen, and maybe the cascading metaphor is a good one here, that there are some uses, initial deployments of these technologies, that look like they're very mundane, just supportive, not really affecting decision-making whatsoever. And then, as they learn and develop, they start crossing that line. And so I worry about the idea of these lines in the sand. I worry very much about cases where there's a human in the loop, but the AI has actually been making the determination and the human really has no force to go against it. So I think you've put your finger on something that's very important. I would stress that we should define these concepts along a continuum, so that we don't inadvertently create safe harbors that allow certain types of deployment of these tools to evade the kind of scrutiny that I talked about earlier.

And I can just jump in there. I think this kind of conversation is actually, you know, a really great example of the broader conversations we should be having in the space. The question of how to define high-risk AI, how to frame it, is something we grappled a lot with when writing the Cracking Open the Black Box report. And an interesting discussion I came across was one that was happening in the EU, where many stakeholders are starting to define high-risk AI as a system that poses threats or harms to fundamental rights.
And I think that ties in a little bit of what Christine was talking about and a little bit of what Catherine was talking about. So, you know, some examples that were circulated were algorithms that determine creditworthiness and algorithms that determine whether an individual is sentenced to jail, but then also algorithms that, as Catherine was mentioning, are not just making determinations but could pose an issue down the line. So, algorithms that are used in police technology, where, depending on who's interacting with that technology, a harmful outcome can be produced. Just to tie together some of the threads: I think it's a really complex question, how to define something that will have such a critical impact on how we move forward in this space. And I think an overly broad definition will likely require companies and governments to conduct evaluations of systems that perhaps are not of significant concern, and then use valuable resources that could have been used elsewhere. But to Catherine's point, you can't always clearly define and understand the risk right off the bat: systems change, situations change, contexts change. And an overly narrow definition could also exclude systems that do make critical and high-risk decisions, or that could in the future. And so I think that we definitely need to continue these conversations. That's not to say that this conversation is easy at all or can be resolved, you know, over drinks one night. But the longer we put this off, the longer entities that are perhaps using systems that we find concerning are able to define the term on their own, and they essentially can set the terms, write the questions, and grade their own homework. And so I think it's really critical for promoting accountability around these systems that we start to move forward and, you know, put forth a couple more concrete proposals, and try to test them out perhaps. Because right now, it's really a grading-their-own-homework situation.

Thank you all. Yes, I think it's really a challenge, but this is a great conversation, as Spandi said, and sort of an example of the bigger conversation that we need to be having; it's difficult to come to consensus on this. But first of all, I'd like to give the audience a reminder to please use Slido to submit your questions. Slido, again, is the box located on the right of the video. And if you have any issues with that at all, you can contact us at events@newamerica.org; you can send your questions there. And so we will be looking for those. But in the meantime, we certainly have more things we can discuss.

So again, I'm gonna pose another question to everyone to weigh in on here. As we've been talking about throughout, and as I sort of started with, we tend to talk about many of these mechanisms for promoting fairness, accountability, and transparency in a siloed fashion. We talk about ML documentation practices very separately from how we talk about audits, and they're often not part of the same conversation. And then we talk about transparency reports entirely separately. I'm wondering how we can connect those conversations, and what the challenges are. We'll start there, and then I have a sort of follow-up.

I don't mind jumping in. Did somebody else... Catherine, you wanna go?
Yeah, I just had a very short response, which is that I was motivated, in particular really piqued, by the report. The Cracking Open the Black Box report does a real service, I think, by surveying these mechanisms and putting them all together in a document. And as I mentioned in my longer remarks, which I won't repeat here, I was really struck by how a kind of administrative law accountability framework, similar to one that I designed with respect to federalism accountability, could really work here. And what it did is it gave me food for thought with respect to bringing in things like, particularly, the impact assessment, because that's very analogous to the federalism impact one. But Christine's made me think a little bit too about the documentation requirements. And we had occasion, I'll let her speak for herself, but she told me a little bit offline the other day about some discussions with GSA about how they might incorporate some of those things with respect to their procurement process. So to me, there are some very easy ways, if we bring some of these tools into an administrative law framework and publish them, et cetera, to get these people talking together about it. So the report does a service by bringing it all together in one place. And I guess the answer is for each of us, from our own individual vantage point, to be challenged to think about ways in our own work that we can try to show, not tell, the need to do this.

That's awesome. Show, not tell; I like that. Yeah, absolutely. Catherine, we're talking with some folks over in GSA, because they create those templates that government agencies use in order to procure, what's that percentage from your study? Was it the 33%? Yeah, 33%. To procure that 33% of machine learning and AI assets, we're talking about, well, what language needs to go in that contract? What do you need to require upon delivery? So that's gonna be a really interesting and exciting possibility for actioning transparency.

But to the question originally posed by Lauren about how we connect these conversations, I'm gonna quote Peter Drucker: culture eats strategy for breakfast. Or maybe it's lunch; for at least one meal of the day, culture eats strategy. And I feel like it's not about having these amazing tools that are super important, or even this strategy or methodology or approach for documentation, which is really important. It's about a shift, a paradigm shift, in the culture of the way we develop and deploy these machine learning systems. We all know from experience the way technology becomes pervasive; it changes the way we do life. Like, imagine going around without your phone for a month. Oh, how would we do that? It just shifts the way we engage with the world. So think of how, as an organization, we can focus on a culture shift where documentation happens, whatever the cost. If you've gotta have a whole new department of generalists who, cross-model, cross-department, cross-algorithmic-function, are in charge of doing this life cycle documentation, it can then inform at an enterprise level what you're seeing and what records are being kept. That's an amazing thing to do, but it takes lots of time. It's hard. Culture shift, I believe, is difficult especially in the federal government, and especially in a very hierarchical space like maybe the Department of Defense or some military groups. So it's just a tough thing to do, but it's at the heart of how we make this change and how we operationalize responsible AI and transparency.
Yeah, and I would just build on that. I think part of that culture shift that we need more broadly in the space also needs to include trust and transparency. I see a lot of times that stakeholders from different sides of the space want to talk about a certain type of algorithmic system, for example, but that trust doesn't exist between the relevant parties, and as a result, there isn't any meaningful communication or transparency. What we mostly get from each side are talking points: this is what we think is happening versus this is what's actually happening. And I think if we're really trying to push forward meaningful multi-stakeholder dialogues that are actually trying to promote accountability in all aspects of the space, then we need to build trusting relationships with companies and with government agencies, and vice versa, to engender this notion that we're working towards the same goal.

Yeah, that makes a lot of sense. Thank you all. So, again building on that and working to bring together all these conversations: we've obviously talked, just within this past hour, about a whole constellation of different methods. I'm wondering if there's any sort of gold-standard mechanism that you all would be willing to put forth, or put your finger on the scale for, that should be used across contexts. Again, when we're at least talking about high-risk algorithms, is there one method of promoting fairness, accountability, and transparency that reigns supreme? Do we need audits, or external audits, in every circumstance, or in most circumstances, or anything like that? We've talked about how there's probably a need for a combination of different mechanisms, and it is very context-specific, I think, especially as Christine was saying. But I'm wondering if you all would put forth the idea that there is one that needs to be there all the time.

I'm happy to jump in there, maybe just to be a little contrary: no. I think, like you mentioned, and as we've discussed during this panel, there are benefits and disadvantages to each approach. And one of the things that my coauthor and I really tried to drive home through the report is that many of these approaches can supplement one another if we bring them together into a holistic approach for promoting FAT. That's not to say that one approach could not be championed, but, especially as we've seen during this conversation, procurement has so many advantages that maybe machine learning documentation doesn't even touch on, that maybe audits don't even touch on, and vice versa. And they also have different limitations that the other approaches perhaps could fill. And so I think that there are best practices we can learn from each side of the conversation, but I'm not sure that I would say that one of them necessarily reigns supreme.

So I'll go out on a limb. It's really apparent from my prior comments that within the governmental context, I'm most drawn to the idea of requiring a kind of algorithmic impact assessment, and to how that could fit into an administrative law mechanism, particularly with respect to the importation of AI into the rulemaking context. So I'm going bold by choosing and defending one, but I'm narrowing the context.
But what I like about that again, and this reflects the thinking that I've done today about the analogies with federalism, is that it will force a working-through of some of these definitional controversies, kind of in real time, with differing views backed hopefully by some empirical data and some study and learning. And I agree with Spandi: we don't wanna let those definitional controversies paralyze us. So again, people can vehemently disagree about whether an agency, for example, should set a standard and should preempt all state tort law. But I think we all could agree that they can't do that and say there are no federalism impacts. And that was happening, and then we had an administrative law framework that assessed it. So there's some low-hanging fruit that we wanna make sure we don't miss. We shouldn't allow governmental uses in the rulemaking context to go forward without such an assessment, and then we have tools already existing within that process to scrutinize them. Now, we'll get into all sorts of debates about courts' review: should it be hard-look review, what should judges demand, or should it be a soft touch because these are technological areas? It's not gonna solve those problems, and we have to continue to debate them. But that one to me looks like a very powerful tool, and what I also like about it is that we can put it into existing administrative law and judicial review frameworks, for the governmental uses at least in rulemaking.

I'll second that. I'll go for impact assessment. You know, that's super important: thinking about the risks, how you might mitigate risks that show up, and being bold in the way we distinguish between issues and risks, issues being things that have already happened. They've already occurred; it's already real, it's already live, it's already impacting people or systems. So I think that's it. But yeah, it's tough to name just one thing, right? Because they're so interrelated. I think about my time as a consultant and as an enterprise architect, and the work we had to do to show agencies like OMB why we deserved this funding. And we'd have to show all these architectural diagrams and how one system impacts another, and we'd have to show all these requirements and traceability and how one affects another, and then, on top of that, any type of budget information and the cost, and how the cost could double or triple, or what if COVID happens. So there are so many things that affect so many other things. But I think impact assessment is a nice one. If we had to require one thing, especially across federal agencies, that would be a nice thing to consider, just asking those questions upfront.

Thanks. Yeah, I know it's a tough question, especially since, again, as many of you have said, it's often very context-specific what mechanism we need, and many of them should probably be working together. But thanks for humoring me a little there. And so again, building on that: Spandi, you mentioned that there may not be one tool for every circumstance, but that there are best practices for promoting FAT around high-risk AI. I'm wondering, and sorry, this question's for everyone: are there best practices that you've seen internet platforms use that you think could maybe be brought over into the government context, or vice versa? Are there some practices that the government's using that haven't really been explored by a lot of private entities or internet platforms? Yeah, anyone wanna?
I'll give a real quick example. There's a site called designethically.com. They have sort of a monitoring checklist. And I would say the really helpful thing there is the idea that even though you have successfully developed and even successfully deployed, it doesn't end there: that whole concept of watching, being aware, making note, monitoring, checking in with users. It's kind of like kids: it doesn't matter how old they are, at any point they come and ask you for money, right? You're gonna always have this model to babysit and to monitor and to watch. And in that active monitoring, there's this opportunity for improvement throughout the life of that model, and of any other model that is built based on the tenets of that model.

I can jump in. So at OTI we do a lot of work around platform transparency efforts, and we definitely encourage platforms to always be more transparent about their operations and policies. I think over the last few years, I've noticed that some platforms have started releasing more public information explaining how their algorithmic systems work, in a more digestible way, so that the average user, or perhaps a researcher, can understand the structure of these systems. As someone who spends most of their time reading about this, that's been really helpful. And I think the ACUS report was really interesting because it was the first comparable sort of breakdown that I saw on the government side. And of course there are different sensitivities when it comes to the government and how much you can share, but I do think that transparency to raise awareness and generate common understanding, in an explainable fashion, around when systems are being deployed, how they're being deployed, and what impact they can have is a first-step best practice that should be applied holistically.

And I'll just add, and it was alluded to in your report: it's always good when there are bodies whose efforts are ongoing and haven't reached their end; one can be quite hopeful. The effort by NIST, the National Institute of Standards and Technology: they were energized by certain executive orders, but as your report talks about, they really have a desire to collaborate with both the private sector and academia with respect to establishing standards in this area. I guess I'm curious, if we're allowed to ask questions of our panelists: should I dampen my enthusiasm for NIST? I mean, this is a true question. I don't know whether either of you has had much opportunity to collaborate or work with them, but it seems like, as an entity, particularly with respect to having the technological expertise, and if they're open to an infusion of legal and policy input, et cetera, along the development of these standards, it seems like that could be a very fruitful model. So should I continue to be so enthusiastic and hopeful?

Yes. We answered the request for information and talked about some practical approaches to developing this template, this guide, this framework, and also the uses and the audiences that might benefit from it. So I say yes, just 'cause the conversation has started, right? And they're a very pivotal organization within the federal government. So you get points for starting the conversation, so yes.

Yeah, I would plus-one that. We also submitted comments as part of NIST's recent consultation on AI bias.
And I believe a number of other civil society organizations did as well. So it definitely seems like that infusion is happening. And yeah, like Christine said, we're pretty early on in this conversation, so I would hope that it's not too late and that we haven't lost hope. And I think that there could definitely be some meaningful work that NIST can do.

Yeah, thanks, everyone. So we've got kind of a tough audience question that just came in. I'll go ahead and throw it to you all. Do you think the increasing use of AI in government and business seriously threatens our democracy?

I mean, if Twitter can almost take us down, then anything's possible, right? So, you know, when I think about a question like that, I think about whether we want to forfeit the evolution of technology, or forfeit all the benefit that we could possibly get from some breakthrough, just because of the risk, the very real risk, of that technology or that breakthrough being used for harm. And the answer is no. We've just got to be thinking about risk. We've got to be thinking about mitigation. We've got to be thinking about all scenarios, not just the good ones, not just the ones we put forward to get the budget we wanted, but the ones that are bad and that could turn it on its side. And so, no. I mean, it's possible, but that shouldn't be a showstopper for us. Catherine will agree with me: we'll put stuff in place to make it possible to still move forward.

Yes, you've intuited; I do agree. I think it's very worthwhile, like you say, to think about it. I often think about it in terms of the promise and the peril of the use of these technologies. And it's definitely worthwhile to try to come at this in a kind of level-headed way. I also would second the idea that, as with all prior technological developments, we need to take a kind of risk-management approach. I teach a variety of different things in the law school, and sometimes people come in with this idea, to go back to the FDA again, that its job should be to reduce the risk of medical devices and drugs to zero. And we all know that would mean no life-saving medical devices or drugs, so that's just not feasible. We cannot hold algorithmic accountability to some standard like that. At the same time, it challenges us, because some people come up with metrics saying technologies are acceptable as long as they're better than human biases and errors. Again, that was alluded to: Christine, you mentioned that in, I don't want to call it the catalog of horrors, the catalog of harm instances, you have an autonomous vehicle incident. Well, if we compared that to automobile accidents, something like 98 to 99% of which are caused by human error, we would start to be very enthusiastic about autonomous vehicles. Now, that doesn't mean, in my opinion, that we throw up our hands and not worry about these kinds of issues, but it's really, really difficult to figure out, in a level-headed, risk-management kind of way, how we should go forward. But absolutely, we shouldn't stymie the development, because the promise side is real. And the way I like to think about it is that, as with all kinds of risky technologies, we need to think hard about oversight; sometimes these things are quantifiable and sometimes they're values-driven, and we have to bring all of that into the conversation. So it won't be easy.
I would plus one that. I would also say that a lot of the time, the harms we grow concerned about from these technologies are unforeseeable in the moment, accidents we didn't predict, but sometimes these technologies are also exacerbating pre-existing harms and pre-existing inequities. So when we think about how these tools can interface with our democratic structures, I think we need to have a little foresight into the ways they could mess things up, but also recognize that our society is not perfect either. We need to be thinking about how we can improve those societal structures as well, to ensure that democratic principles remain strong and resilient as we introduce new technology.

And Catherine, I'm going to tell our team that we should change the name of the incident database to the catalog of horrors. I think that's much better branding.

I do like that.

So that was a broader question; we zoomed out a bit there with the audience question, but to zoom back into some specifics: Catherine, in the ACUS report you all outlined some key next steps and recommendations. I was wondering if you could give us an overview of what you recommended and what needs to happen next for the government.

Yeah, a big piece that I haven't hit on as much in my remarks, and I don't want to get too sidetracked, is really the building of internal capacity. There are some challenges there, right? There are both financial and recruitment challenges that the government faces. But to return to our empirical finding: of the 142 major agencies we canvassed, nearly half were deploying some form of machine learning or AI, and of those use cases, a little over half were being built in-house. That suggests internal capacity is an interesting piece of this. And I think building internal capacity is only going to help with the kinds of efforts Christine and I were talking about back and forth: the idea that those within government are going to be able to interface with outside groups concerned about these issues, getting technologists and lawyers into the same room, and the like. Practically speaking, we encountered several examples. Again, the FDA comes to mind, because they were really out on the vanguard of trying to build these often non-commercial, sometimes commercial, collaborations with academia in particular and with nonprofits. The FDA offers various research-fellowship-type positions to bring people doing cutting-edge work in academia, and sometimes even in the private sector, into the agency, and I think we need to be open to those kinds of things. My own view is that while the procurement piece is important for all of the reasons we've talked about, by and large the agencies, given the sophistication and the nuanced, contextual work they're doing, are going to have to develop internal capacity, and I think that's going to be a positive development. So thinking about ways to encourage that is a key theme there.

Thanks for that. And I think this next question is related, so I'll add it on here, which means it's likely coming back to you, Catherine. We just had an audience question.
Could you address the role of lawsuits in cases in which government entities deploy flawed AI? So that's an interesting one, and one that we maybe haven't spent as much time talking or thinking about. Do you think the concern about being sued will lead to more careful deployment?

So it's interesting. As I said, I alluded earlier, outside of the AI context, to the administrative law accountability framework, and I mentioned judicial review. Just to be clear, there I was talking about judicial review when people challenge rules. So let's take, for example, an agency that decides to issue a rule doing a retrospective review of its prior regulations and deploys an AI technology to help it map out which rules might be overly burdensome or might warrant revision. They come up with this rule, and let's imagine someone challenges it. That's what I was talking about when I was talking about judicial review. Academics differ about whether the threat of judicial review will induce agencies to do things differently; I have written about this before, and it's hard to gather empirics, though there is a lot of strong anecdotal evidence. My own view is that, by and large, yes, it does incentivize agencies to think about things like federalism impacts, or, here, about algorithmic impact assessments, and to do the best job they can: defining those terms, thinking about the impacts, and putting it out there, because there will be scrutiny down the road.

Now, the question raises a different angle, which is actually suing the government. There are other groups and organizations that have focused on that piece, and I haven't in my own work in this area. I'm not averse; there's no inherent bias there, other than the fact that I was, and continue to be, very interested in exploring the existing ways agencies are deploying these techniques. There are entities that come at this from a much more adversarial or advocacy perspective, either hostile to or supportive of the agency, and given the work I'm doing, I haven't engaged in the former type. But as a general matter, and as a scholar who teaches tort law and thinks very much about the deterrent effect of lawsuits, I do believe that whether we're thinking about government or, increasingly, private entities deploying these kinds of technologies, they're thinking about potential liability, and that does give them incentives; autonomous vehicle manufacturers, for example, are going to get added incentives toward safety. It would be an interesting piece of this whole puzzle, and I'm sure maybe Spandi or even Christine know of organizations that are cataloging these suits. I've read about them as a consumer of the secondary literature; my own research just hasn't focused on the angle of suing the government over the use of these technologies.

Yeah, I can't say I'm a consumer of that content either. Maybe I should be. But on something you mentioned about incentives: there is a discussion in the AI accountability space about whether developers of AI should be given subsidies or some other benefit to develop more robust tools that match up with FAT, that is, fairness, accountability, and transparency, expectations.
We talk about this in the report a little bit, but I think that raises the question: do we think that kind of accountability is something that needs to be incentivized in that way, or do we think it's something that should just be an expectation from the ground up, something developers should be doing anyway?

Yeah, that's an interesting question. In my mind, it comes back a little bit to an earlier question, Lauren, that you asked me about the positives and negatives of procuring from the private sector versus building in-house. I do think that a benefit within government, at least with building in-house, is that agencies are already primed to be thinking about these kinds of accountability mechanisms, given the administrative law framework that surrounds all of this. In the private sector, again, maybe that would become, as you said, just a baseline norm, so long as these expectations are brought in at the earlier stages. My worry, to come back to the catalog of horrors, Christine, the example I have is a governmental entity using an AI system for a determinative purpose that might deny someone a benefit, taking an off-the-shelf tool that was developed for some other purpose, that the agency knows nothing about, and deploying it. That's the example of horror. It's not to say that there couldn't be collaborations earlier on; I personally think it takes having some people within the agency with the technological sophistication to sit in the same room. As I said, I'm a law professor, so maybe this won't surprise anyone, but I was really blown away when I did this piloting with the Stanford and NYU students. We did a cross-country Zoom, and this was pre-pandemic, before everyone became much more facile with these technologies. We had PhD computer scientists, law students, and professors all trying to talk together about the legal and the technological issues, reading the same documents and asking the same kinds of questions, and it was very productive. Now, we had a selection effect: we had law students interested in the technological side and tech people interested in the legal policy realm. But for me, that's a little bit of a microcosm of a model for the future, a way in which we can have the cross-fertilization I think each of us has been talking about.

Well, thanks for all of that, everyone. And Catherine, as a follow-up to that exact comment, I think it's not just the government that should be concerned about the possibility of lawsuits if they're deploying high-risk algorithms in a non-transparent, unaccountable manner that creates harm. But anyway, I think that was actually our last audience question, so we may wrap up a few minutes early, unless anyone wants to offer any final thoughts on this very wide-ranging conversation. Okay, well, I just wanted to give you all one last chance to add anything you didn't have the chance to say. But yes, I'd first like to thank our panelists.
We really appreciate all your insights today. I know I learned a lot, and I'm sure our audience feels the same, so thank you all. Audience, thank you for joining us as always and for the great questions, and thank you to our New America Events team for helping facilitate and put this on. This was a great conversation, and I'm sure these themes and discussions will be ongoing. It was nice to hear from you all in the chat today. Have a great afternoon, everyone. Thank you.