So hi, welcome to Responsibility in Depth: Layering, Licensing, Regulation, and More. Or: there are no magic wands, and so we need every slice of Swiss cheese that we can get. Do not worry, I will explain that later, probably. So why do I hope you're here? Let me tell you about that, and then if you're here for other reasons, no offense taken, go ahead and get some more coffee. So one, you want responsible AI, but maybe you feel a little overwhelmed. The good news is everybody is a little overwhelmed right now. The bad news is I can't tell you what it means to be responsible. I wish I could answer this question. It's an important one. When I said something to my seven-year-old about this last night, he said, well, why can't that just be a yes or no question? And I was like, well, how much time you got, kid, before bedtime? The answer to what is responsible changes regularly as we change, as our societies change, and of course, as our technologies change. So I really can't answer this one. I also can't tell you what open means in AI. There was a panel for that yesterday. I assume that they solved all those problems and I'll report out when I get the notes. As I'll discuss more later, we as engineering-minded people really liked the simple "is it open?" test that licensing gave us. But I'm afraid it's not ever gonna be that simple again. The problems are too complex. The impact on society is too real. And so merely looking at the license isn't gonna be enough, and in fact, as I was just saying to somebody in the audience, it might actually be a distraction. I also can't give you solutions. I just sat through the previous talk here, and there were lots of nice branded boxes and flow charts, like, if you just click the right boxes in Microsoft Azure, it's gonna solve your problems. And I'm afraid there are no boxes in Microsoft Azure or any other platform, that's not a knock on Azure, that will tell you the solutions to the problems we've got today. What I can do, and what I do hope to do, is to give you a mental toolkit to help you build responsible governance, one that you can use to help you frame your own choices in building towards your own vision of responsibility that's appropriate for your communities and your organizations. Or to put it another way, I'd like you to leave this talk with a sense of what levers you can pull to make the open AI world a better place. For at least some of you, because I definitely see some experts in the audience, individual pieces of this talk will seem old hat. I see some very experienced deep thinkers here. For you, I apologize, some of this is gonna seem basic in some spots, but I hope that the overall framework, the layering we're gonna discuss, will still be a useful piece in your toolkit, and perhaps some of the folks in the audience will be new eager recruits to help solve insoluble problems. When I put it that way, it sounds really exciting. I'm sure you're all just eager to sign up. Quick note, why should you care what I think? Hi, my name's Luis. I have a political theory degree. I have a law degree. I have a CS degree. I've been involved in open source for way too long, starting with GNOME in the year of the Linux desktop. We all know how that came out. I drafted the Mozilla Public License version 2.0. Unfortunately, we know what's happened to Mozilla's market share since then. You'll see a sort of key theme here, which is that I do a lot of licensing, maybe only sometimes for lost causes.
So again, I wanna say that just because I talk a lot about licenses in my career, I wanna make sure that we're not focused on that today. So the first thing I wanna talk about today is open source as we know it right now. Is it responsible, and if so, how did we make it responsible? To get to that, I wanna talk briefly about what goals we thought we were governing towards back in the beginning-ish. How did we define what it meant to be responsible? And this is where I'm gonna really show my age, besides the gray hairs: in 1997, fighting Microsoft really was the responsible thing to do. 1997 was the year I got into open source, though we hadn't coined that phrase yet. And I realize it may seem quaint now, but Microsoft really was the big bad guy. I can't stress enough really how bad they were. They weren't just an abusive monopoly, though they were so bad the federal government actually went after them for that. They didn't just fight and undermine other people's innovation. Their software also was just really bad. I first installed Linux because Microsoft Word crashed, I kid you not, six times in one hour while I was writing a one-page paper. So I went next door to the other CompSci major and was like, I hear you have the good stuff. I hear you have the Linux floppies. And sure enough, he did. Twenty-five years later, here I am. So fighting Microsoft genuinely was the core of at least a responsible vision of software, of an ethical vision of software. And in 1997, collaboration really was a way to fight Microsoft. Again, I realize this seems a little quaint, a little naive, but collaboration for software hadn't really been tried at this scale. The idea that you could just put people on the internet together and an operating system, a viable, important operating system, would pop out the other end seemed honestly a little bit ridiculous. But we thought that it might help take down Microsoft and so we tried. And it turns out we weren't entirely wrong. One other key fact about 1997 was that licensing really was the biggest legal barrier to collaboration and to reuse. We had a lot of people saying, well, I can't just use this thing I found on the internet. And so we created these open source licenses in part to solve that problem. In the best engineering fashion, we identified a problem: copyright, and to a lesser extent patents. We knew that was stopping us from collaborating and from being reused. And that barrier to collaboration was keeping us from the primary technical and ethical challenge of our time, which was, again, seems a little ridiculous, but genuinely true at the moment, taking back users from Microsoft. So we solved the problem, all right? Those of you who are nodding along because you were there and you remember, give yourself a pat on the back, we like genuinely made the world a better place. Unfortunately, it also worked a little too well, because we told an entire generation of programmers that licenses were the key tool to use to build responsible software. I wanna stress again, this was not unreasonable. We really did have an ethical problem and we really did, to some extent or another, solve it with a licensing hack. The problem is not that we did that in 1997. The problem is what we have or haven't learned in the 25 years since then. So, interlude number one: there are no magic wands. As I said at the beginning, this talk is gonna be in part about magic wands and how we lack them. So let's take a detour in this history to talk about that.
In the 1990s, all the way through to the 2010s, more or less, licenses played on easy mode. As I've suggested, licensing did a pretty good job in this first age of open. I think that's because we had a very specific need for our licenses, and that specific and simple need was to share things on the internet with as few barriers as possible. So our licenses had one simple job. They did that simple job well and we all benefited. What happens when we ask our licenses to do the much more complex job of ensuring responsibility in AI? Ding, ding, ding. For those watching along at home, there was a vigorous nod, a "nope," in the audience. Again, you can just leave now, you've learned the important thing. Friends don't let friends do licensing. When we try to do this, when we try to ask this simple tool of licensing to do something much more complex, it has holes, it has gaps that it cannot cover. To see how that works in more detail, let's turn to a regrettable example of an attempt at responsible licensing in AI that's going on right now. A quick case study. Look, the AI community did not invent the notion of ethical or responsible licensing. Indeed, lots of traditional proprietary software contracts include terms like "don't do illegal things." And more recently, the ethical software community has tried to use open source licenses, or licenses like open source, to do a well-intentioned job of enforcing human rights, enforcing the law, and preventing usage by things like the military. Inspired by that, the responsible AI license community, some of you may have heard this shorthanded as the RAIL community, has tried to write licenses that bring responsibility to AI through those licensing mechanisms. One of the members of the RAIL license family was adopted by Stability AI. Probably most of you know about Stability, but just in case, they publish a large AI model for image generation. You feed it a prompt and it spits out an image. The CEO of Stability has trumpeted RAIL as an open license, but for our purposes, the question of open or not open is less interesting. What's interesting to us here today is whether the license is effective or not effective in reaching its stated goal of responsible AI. Unfortunately, the answer is: not responsible. In particular, in the past few weeks, two media outlets have reported on the use of Stability AI models to create non-consensual pornography. Let's walk through why the license didn't work here. The good news is that the license signaled that the creators of the model dislike bad things. Trying to shame bad actors genuinely is more than nothing, and we'll talk about that a little bit later. The bad news is just about everything else fell through a hole in the license. Let's go through some of those things in order. First, there's the question of definition. What is responsible anyway? I already told you that is a super hard problem and I won't go into the messy details here, but suffice to say that the RAIL licenses try to cram global human rights law and global criminal law into one page, and that doesn't work very well. So it's not entirely clear if non-consensual adult pornography is always gonna be a license violation. It's gonna depend in part upon what part of the world you're talking about. Child pornography, thankfully, is more clear, so at least there's that. The second hole in licensing as a construct here is simply knowledge. If you allow the entire world to download your open source software, it's hard to know who's using it.
In open source, that was a sort of minor distraction, because the cost to me as a distributor was fairly small. But if I don't know who's using my stuff and I face potential criminal liability for non-consensual pornography, knowing who is using my stuff, knowing who downloaded it, is often gonna be very important, because otherwise it's gonna be very hard to enforce your rules. Which brings us to the third hole, enforcement. A license with no plan for enforcement is less of a legal document and more of a statement of values. Again, a statement of values is not a bad thing, but you shouldn't pretend it's a legal strategy either. In this specific case, the authors of RAIL had some very nice intentions when they wrote the license. But as far as I can tell, and as far as the media can tell, Stability does not appear to have any budget, any legal team for enforcement. So if you publish a model that says, hey, we're gonna force you to do these things, and then take zero steps towards forcing anyone to do these things, if the tree falls in a licensing forest, where are you at? That brings us to the final quick hole that I'll discuss today, jurisdiction. Licenses and enforcement only work in places that respect the rule of law and when you can find the people who are at issue. No surprise, both of the relevant porn services covered in these recent media reports, their about pages don't have a location, don't say where the company is based, don't say who the people behind it are. That makes enforcement not technically impossible, but pretty hard. Okay, so licenses are not a magic wand. They have a lot of holes through which irresponsible AI can fall. Do we have to give up and go home? I think the answer is no, so let's talk about cheese. Those of you who've been paying close attention, or at least came in on time, and welcome to the folks who came in late, the critical thing you need to know is I promised people cheese and now I'm gonna give it to you. So I just gave you a long list of holes, and that of course means that I'm gonna talk specifically about Swiss cheese and the Swiss cheese model of safety. The notion of the Swiss cheese model of safety comes from James Reason's classic book, Managing the Risks of Organizational Accidents, which is one of these lovely books that does exactly what it says on the tin. It tells you how to manage the risks of organizational accidents. And on page eight of a 200-page book, to give you some sense of the context and how important this is, he says, look, no one thing that you can put in is going to cover every single threat to your organization. You must put in multiple layers of defense. In the security world, which I know some of you are from, that often gets expressed as defense in depth. But I wanna call it today the Swiss cheese model, because so many of our examples are unfortunately gonna focus on the holes. So how can we line up our cheeses so that, even though we know each one is imperfect, hopefully only a few of these scary-looking red arrows make it through to pierce our invisible mice? I'm not quite sure what's on the other side of that diagram. Some of you may be familiar with this model from discussions of COVID, where we talked about layering defenses like masking, ventilation and vaccination. None of them are perfect, but together they're hopefully capable of protecting us. Similarly today, we're gonna talk about layers of imperfect defenses.
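To make the layering arithmetic concrete, here's a back-of-the-envelope sketch. It is not from the talk itself: the layer names and the miss rates below are purely illustrative assumptions, and the layers are treated as independent, which real defenses rarely are. The only point is that the chance of something bad slipping through all the layers is roughly the product of each layer's misses, so even very leaky layers add up to real protection when stacked.

```python
# Illustrative only: hypothetical layers and miss rates, assumed independent.
layers = {
    "license terms": 0.8,            # assume a license alone misses 80% of misuse
    "community norms": 0.6,
    "testing / red teaming": 0.5,
    "regulation & litigation": 0.4,
}

slip_through = 1.0
for name, miss_rate in layers.items():
    slip_through *= miss_rate
    print(f"after {name:<24} chance of slipping through: {slip_through:.0%}")

# With these made-up numbers, roughly 10% slips through all four layers --
# far better than any single imperfect layer manages on its own.
```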
The rest of this talk, having already touched on licensing, will focus on the other pieces of Swiss cheese. What imperfect, holey pieces can we stack up together to end up with something like responsibly governed AI? At this point, some of you may be thinking, okay, why me? What do I have to do with this? Why do I care about cheese? There are two big reasons why we as technologists, as leaders of technologists, as managers of technologists, and, as I see at least a few known lawyers, as counselors of technologists, need to think about the Swiss cheese model of safety. The first is pretty obvious: we'd like to get the right amount of responsible AI, right? I mean, this is sort of trivial, but I think it's worth saying, if you're actually just here because you really want Skynet, again, the coffee's outside. The second challenge is gonna be nearly as hard. We need to get the right amount of open. Open is gonna be challenged here from two sides. On the one hand, open AI is gonna be challenged by people who will use AI to do obviously bad things, like non-consensual pornography or racist policing. And on the other hand, open AI is gonna be challenged by industrialists who would love to have very profitable monopolies on AI, and who really don't want to provide transparency into how their AIs work, because that might involve answering some very hard questions. We as leaders of open are gonna have to figure out how to walk those tightropes, providing AIs that are responsible while still capturing the values of open we love, like transparency and innovation. Licensing did have the nice virtue that it was simple to evaluate, relatively speaking. We could look at one document and say, this document is open or not. A lot of these gray hairs are from arguing exactly that question, so it's a little simpler than I'm gonna let on today, a little harder than I'm gonna let on today, but it is there. The new layers of Swiss cheese are not gonna be so simple to evaluate. Among other things, they'll often be negotiated in part by national governments, and with billions of dollars on the line. So they're gonna have compromises, they're gonna cut corners, they're gonna do things that we, as occasional open source purists, are not gonna like. And they're also gonna be trying to solve fiendishly complex problems with similarly complex solutions. We're gonna have to learn together to evaluate these policies and these governance options outside of licensing. Part of how I got into open machine learning was friends asking, okay, but is this model open? That's hardly even a question that makes sense. So to some extent, this talk is two years of answering that question. The obvious standards we can use are things like: do they work? Do they actually promote safety? And do they have an impact? The less obvious one, the one that's gonna be harder for our community, in part because we're gonna be the only ones asking this question, is: do these rules, do these layers of Swiss cheese, do these layers of defense promote openness? Are the compromises made tolerable? None of these is gonna be easy, but they're gonna be important anyway. So, part two: a quick tour of responsible open AI governance layers. I'm gonna do this in order roughly from most to least under our control. Don't get too hung up on that. Each of these is gonna be context specific, so sometimes they may be easier or harder to do, but hopefully that gives us a through line. Let's first talk about who we are.
I keep sort of casually saying we, we, we. Half of you don't know me from Adam. Some of you have had me on podcasts. Some of you have had me to your homes. The key thing here is that everything that we build, if it's meaningfully open, brings with it a set of people. When we choose the sort of people that we wanna work with, we help to choose the values of our software. Traditionally in open source, we've done this internally with two things: simply by saying we love sharing, which does turn away some sorts of people, and with codes of conduct. In AI, we may need to think more creatively about how we do these, including perhaps shaming or ostracizing people whose values we don't share. Since I see some of you squinting at the paper on the right-hand side, let me give you the short version. It's a great paper by David Widder and Dawn Nafus, David's at NYU, Dawn's at Intel, on how an early deepfake open source community felt constrained by the GPL. They said, well, this code, which we didn't write, is under the GPL, so we can't tell you "don't use it for porn." That's an additional restriction. Instead, they said, anyone who comes to our forum and asks questions that even smell the tiniest little bit of non-consensual porn, we will kick you fiercely and immediately out of the forum. The paper's a really good read, and a short read, for anyone thinking about how to do open communities in AI. This is not a perfect tool. Community, who's in, who's out, is not a perfect tool. It's one more piece of Swiss cheese. And it's in fact easy to build such a small perfectionist community that you spend more time talking about values than actually talking about impactful code. But if we're gonna be responsible, it's one of those pieces in the toolkit. Another tool that we don't talk about enough in technologists' communities is the power of our labor. While tech labor does not have the power it had even a few months ago, those who are lucky enough to work in the space still have a lot of influence. Open source software has long benefited from this, and responsible AI could, in theory, as well. Tech executives often hate this. I admit I'm a tech executive. Again, that's another talk. We can look to recent examples for this, not just in tech. Our peers in LA have recently reminded us that employees can impact and regulate responsible AI in part by striking. Of course, just like every other theory of responsible governance I'll present today, labor power has holes. American labor has a long history of racism and has been all too often happy to pull up the economic bridge behind it. Nevertheless, when we have responsible goals we wanna reach, labor power is an option that we should remember and consider. And if you're an executive, you didn't hear that from me. Again, coffee in the hall. Testing and QA. As a former QA guy, I'm contractually bound to point out, at least somewhere in the talk, that knowing what the hell is in your software and how the hell it works is an important part of governing that software. Whole volumes are going to be written about the recent White House executive order. But one of the things I really liked about it was the emphasis on red teaming. For those who are not in security, this is the idea that we should actually have attacks on our tools, to understand better how they work, from people who are outside the normal workflow of creating those tools. Similarly, just this week, Meta became the latest in a long line of groups to release open source AI analysis tools.
No one tool is gonna be perfect. No one tool is going to solve all the problems. But if you're doing open AI and you don't have a testing strategy, you're missing one of the layers of defense that can help you build a responsible community. In fact, you can have a whole red teaming QA community in parallel, which is a source of strength and vibrancy. But that's gonna be more work. Media scrutiny. So one thing that was very disappointing in the White House executive order was how little transparency it demanded of AI. This weakens a core tool of responsible governance, which is our media. The media should be an area where the core values of open source and the core values of responsible AI could be perhaps most aligned. Responsible open AI advocates should be working tightly with our media to show them that better models are possible, both holding ourselves to account and holding proprietary, opaque AI to account as well. Again, there's some good news here. The Allen Institute has worked with the Washington Post to give some examples of what responsible openness could look like in interaction with the media. And EleutherAI continues to be a good counterexample to the media narrative that all this stuff can only be done by giant for-profit companies. But we need to get that message out there. The things I've already mentioned, community, labor, testing, these are things that individual responsible software communities can mostly choose to do for themselves. Now we're starting to climb the difficulty scale. Data governance is an area that open has relatively little experience in. For example, something like five to ten percent of the Linux Foundation's projects are data-centric. But there are some big, impactful open data projects out there. Wikidata, who spoke yesterday, OpenStreetMap, which has a relationship with the Linux Foundation, Flickr Commons, Stack Overflow, Reddit, Wikipedia, the Internet Archive, these are all genuine communities of people who have come together to create data sets. Great data sets, critical data sets. And if they chose to limit those data sets, they could have some incredible downstream impact on responsible AI. This isn't something we have a ton of direct experience with. Again, this is one of the areas where we're gonna need to be creative if open wants to have impact on responsible AI. But there are some cool relevant experiments, including in science, where Creative Commons is working with the Chan Zuckerberg Initiative to make preprints more available, and in medicine, where data trusts and things like that are becoming normalized in the UK and elsewhere. So I'm optimistic there are some models here. But again, they're only gonna do so much. Public benefit institutions. This is a chart of the recently famous, or infamous, OpenAI governance model, where you have at the top a nonprofit, and down at the bottom you have a big label, unfortunately a little bit obscured on this projector, that just says "money." Open and governance are tricky things, right? There are multiple categories here. Our friends at the Linux Foundation are one, primarily dominated by and created for the industry. That's not necessarily a bad thing, but it's one approach. OpenAI thought that they were not dominated by the industry. Whoops. There are also B Corps and co-ops. Co-ops have a long, quiet tradition in open source. I think we need to start experimenting with that again. Eclipse has moved to Europe to take advantage of non-American nonprofit models. And they're doing interesting work, by the way, around data infrastructure.
So, to return to our theme of holes, even public benefit nonprofits can be deeply imperfect creatures. If I had more hours in the day, I would be calling California's attorney general every day and saying, what are you doing to regulate this nonprofit, theoretically regulated by you, that happens to be in San Francisco? But since he wants to be governor, I suspect he will not pick up the phone for me. Getting serious about enforcement. I already touched on this one a little bit in the Stability AI example, but quite simply, if a community has a policy towards responsibility, it needs to provide resources towards that policy. This could be as simple as making sure there's a standing group of volunteers to name and shame people who violate a code of conduct, and one of our audience members has very deep opinions about that and you should check them out afterwards, or it could be choosing a licensing structure that has economic incentives for lawyers to pursue it. I'll also point you to an attorney in the audience if that's something you want to talk about. However you do it, don't call yourself responsible if you only have words on paper. If you only remember like two things from this talk, that's the second one. The holes in this particular piece of Swiss cheese should be pretty obvious: running out of patience, volunteers or other resources. I'm gonna start skimming through because we are dearly in need of time, but as I already mentioned, one way to think about improving enforceability of licenses is to move away from public licenses and towards more traditional contracts. If you've actually signed and filed a document, you know who the other side is, you can verify they exist, and you can estimate whether they're likely to comply. The EU does this a lot in privacy contexts. Talk to your local GDPR lawyer if you're interested in knowing more. There are many reasons we don't do this in traditional open. It's costly, it requires an institution rather than just a maintainer, and honestly it discriminates against many creative, innovative folks who can't afford legal advice or wouldn't be seen as reliable contractual partners. But this is a great example, exactly because of that, of where reliability and the creativity of open are in tension, and we need to think about how we do that. In fact, in some of the recent OSI discussions about what is open source, I specifically told them to remove the word license, because I think we're gonna see a lot of non-license agreements taking place in this space. Impacted communities. Unfortunately, again in the interest of time, I'm gonna be quick here, but we are used to getting input from the communities who build things. We need to start thinking about input from the communities who are impacted by these things. I am a runner in San Francisco, so I am impacted by autonomous vehicles. I am frankly more likely to be literally impacted by human drivers, but nevertheless I am impacted by autonomous vehicles. I'm a brown person who likes to wear a hoodie, and it's not just the neighborhood watch that I'm worried about, it is AI, right? I'm pretty privileged. I can probably get myself out of jail, but that is real, right? The holes around this are gonna be pretty obvious. It's very hard even for experts to understand this technology. So it's hard for non-expert communities to bring useful feedback to the table, but we're already, again, seeing very cool experimentation in this. My favorite experiment so far is the recent people's panels in the UK funded by Mozilla.
They used deliberative polling techniques. They sat normal folks, selected more or less at random, together for a whole week to learn, listen, and at the end say, hey, now that we know a little bit more about this, here's what we'd like to see come out of it. And by the way, when I put this up online, I'll have links to all these kinds of things in the talk. Legislation. Look, it's happening. The EU announced that their AI Act is more or less done. I don't think open folks were very much at the table for that, but the good news is that for the EU Cyber Resilience Act, there were open folks at the table in a way that I think is fairly unprecedented. So again, these skill sets are maturing. They're imperfect, and sitting in those talks gave me a whole lot more gray hairs, but if communities you're part of, like the Python Software Foundation or Eclipse, participated in that, please give them kudos. Call their EDs and say it's awesome that you spoke to the EU in Brussels on my behalf. They need to hear that you value that. Finally, litigation. This is the last and actually one of the oldest techniques here. To put it bluntly, to get responsible AI, we're gonna have to sue companies and governments who get it wrong. This is different from the license enforcement I already mentioned, because it's often gonna be done by activist groups rather than by developers ourselves. But developers will have a role to play here, and their role is often gonna be writing them a check. The EFF does this; the EFF's been litigating about racist algorithms and opaque algorithms for years now. La Quadrature du Net, which is a French equivalent to the EFF, recently did some amazing work on racist algorithms in French government social systems. Again, not perfect, big holes. This moves slow, right? The biggest, most impactful litigation in open source took a freaking decade, all right? But it will give us another layer, another piece of cheese, in our attempt to build responsible AI. Down to the last bit. The last but not least thing that I want you to hear from me today is that we need imagination to use these layers to build solutions. Look, I told you at the beginning I can't give you solutions here. These problems are complex, the answers are early. We are all facing perhaps the biggest challenge of our careers. We have a mix of genuinely awe-inspiring new technology. It's so cool. And anybody who tells you it's not cool hasn't used it enough. Also, it's terrifying. Anybody who tells you it's not terrifying also hasn't used it enough. The bad guys are not shying away from using their imaginations. Science fiction is so often serving as a how-to manual that it's become a meme, right? We are all building, or not building, the Torment Nexus, depending on what day of the week it is, right? And this imagination of scary ends is also serving as a disciplining tool. We're told that if we limit American AI companies in even the tiniest way, Terminator will not only come, it will come from China, right? This is ludicrous on a bunch of levels, but the use of imagination to inspire fear is an old tactic that's not gonna go away. Here's the thing I worry about most: that we, those of us who are trying to do good, might not match those feats of imagination. If you take away any one thing from this talk, it's that the open tools that we have fallen back on over and over again were really quite amazing for their time, but they are no longer enough.
We need to learn from them, but they cannot be the only tools in our toolkit. If we rely just on licenses, just on codes of conduct, we're gonna fail. We're gonna get AI that is neither open nor responsible. So we have to use our imagination to build new tools, new layers in the Swiss cheese model. Complex, sorry. Ambiguous, sorry. Things we as engineers don't like, but we're gonna have to do it anyway, 'cause that's the only way we're gonna build a more democratic technological future. So, hey, let's create some glorious new governance hacks together. Thank you, everybody. For those who wanna hear more, my newsletter on open and machine learning issues is at openml.fyi. On social these days, I'm mostly on the Fediverse at social.coop. And thank you for bearing with me through that blitz. I hope someday that each of these slides will get its own conference, and I hope to see you all there at those conferences. Thank you.