Thank you and good morning. Thanks for coming out in this rainy weather. As a bit of a possibly hard pivot from the previous few talks: as mentioned, my name is Dr. Rumman Chowdhury, and I'm the Global Lead for Responsible AI at Accenture Applied Intelligence. In my job, I work with clients, with academics, and with folks like yourselves, thinking through the ethical implications of the technology we launch. At the moment I focus specifically on artificial intelligence, but similar questions are being raised about blockchain, quantum, and a lot of the other emerging technologies that are going to significantly shape society. So I'm here to talk a bit about open source, the future of ethical use, and a question that a lot of us in the industry are grappling with right now.

This innocuous-seeming blog post is by OpenAI. If you're unfamiliar, OpenAI is a previously non-profit, now capped-profit organization in Silicon Valley, ultimately dedicated to the problem of achieving artificial general intelligence. The blog post is about something they created called GPT-2. It was posted in February, and when it went up, it was met with a lot of controversy. Why? GPT-2 is a natural language text-generation model that, given only a couple of sentences, can produce a very realistic narrative. It could be used to create fake news, fake media, fake blog posts, fake articles, and all you had to do was feed it one or two sentences.

And this is one of the questions we run into in artificial intelligence all the time: what we build is a multi-purpose tool. Something like GPT-2 can be used for fun. For example, The Economist held an essay contest, and somebody submitted a GPT-2-generated answer. It didn't win, but it was interesting to see how realistic it looked. Journalists have been playing around with it for similar reasons; it creates these coherent paragraphs. But at the same time, we can see how it could be used, say, in the upcoming elections in the United States, to create fake media and fake news.

So what OpenAI decided to do was hold back the rollout. They did not open source this model the way they do with everything else they make. They said: we created this thing, but oh my gosh, it's so scary, so we're just going to tell you that it exists, and we're not releasing it to the world. And this was not met with a lot of positivity in the community. As you all know, our community is built on democratization; it's built on sharing of materials. And in a few articles, we really saw that pushback. In the upper corner of the slide are some quotes from a professor at UC Berkeley, who said they've got a lot of money and a lot of parlor tricks, and called what they did hyperbolic: very extreme, meant to drive media coverage. The bottom post is by Dr. Britt Paris of Rutgers University, who titled her blog post "From Panic to Profit" and argued, basically, that this was a big move to generate more hype around what they do, because hype is a big fear in this industry: what is real, what isn't real? And what Britt was saying, which is a valid critique, is: you're going to tell us it's bad, you're not going to tell us what it is, and you're going to claim that you get to control who uses the technology and how they use it.
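To make the capability itself concrete before going further: here is a minimal sketch of what "feed it one or two sentences" looks like in practice, assuming the since-released public GPT-2 weights and the Hugging Face transformers library. This is an illustration, not OpenAI's original tooling.

```python
# Minimal sketch: prompt-based generation with the public GPT-2 weights,
# assuming the Hugging Face "transformers" library is installed.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Feed it a sentence or two (this paraphrases the famous unicorn prompt
# from OpenAI's announcement post)...
prompt = ("In a shocking finding, scientists discovered a herd of unicorns "
          "living in a remote, previously unexplored valley.")

# ...and it returns a fluent continuation.
output = generator(prompt, max_length=100, num_return_sequences=1)
print(output[0]["generated_text"])
```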
Britt also pointed out, by the way, that this is how some of the biggest companies of the Cold War era got their massive, lucrative contracts: by creating this mystique, this hype, and getting the government, DARPA, et cetera, to fund them. So what she's calling out is what could be a brilliant marketing move. But at the same time, what this really surfaces is a problem we have in industry: we actually don't know how to address these issues.

GPT-2 aside, what we saw earlier this year was open-source deepfake technology being used to create DeepNude. If you didn't hear about this: an entrepreneurial individual decided to charge $50 for the ability to take any photograph of any woman (interestingly, not men, only women, because, as he claimed, there were simply more pictures of naked women online) and pretty much undress her in the photo. The original story was broken by Samantha Cole at Vice, and she points out that you could just download it and run it like a Windows application. It was super easy. You didn't need advanced programming knowledge; you didn't have to do a massive install. You just paid this dude $50, got this thing on your computer, put in whatever picture you wanted of any woman you knew, and got out a fairly realistic nude of her.

Now, Sam has been chasing deepfakes since about 2017, and sadly but presciently, what she had always said was that deepfakes are not actually used that much for fake media. What they are used for is fake revenge porn. The number one use of deepfakes is exactly this: people going in and creating images of their ex-girlfriends, of women they don't like, of their bosses, to shame them publicly and embarrass them in the most fundamental way possible. And this app made that readily available. And this individual was able to build it on something that was open source. So clearly we have an issue.

So, critique aside, what OpenAI was trying to do was answer a problem that we have in industry today. We create these multi-purpose tools, and these tools have very specific, very significant societal impacts that we cannot ignore. Even the individual who created DeepNude didn't really have overtly malicious intent; he didn't create it with the intent of doing harm, I suppose. He seemed, I don't know, naive about what it might do and the impact it might have. But then how do we reconcile this?

As I said, the reason OpenAI's handling of GPT-2 was met with so much criticism is that the culture of open source revolutionized how we adopt and use technology. I myself used to work at a boot camp teaching data science before I joined Accenture, and every single one of my students had gone through Coursera classes or some sort of free online class, which are only possible because of access to repos on GitHub that make things freely and widely available. Careers are built off of Kaggle, right? Individuals use the tools they've got, on the laptops they have, to establish themselves as a presence. It's been revolutionary, because we live in a society where privilege, money, and power often buy you access: the ability to pay for the best schools or to have the right internship. The democratization of these technologies breaks down a lot of those barriers. And I don't have to tell people in this room that, right?
The power of democratized tools helps deconstruct the centralization of power, and that is a beautiful thing. This rapid democratization is what can lead to a level playing field. But these things are not without consequences. So how do we stop malicious actors? Or can we, and is it our responsibility?

I'm not going to launch into the "is it our responsibility" question; that's a whole other talk I give (if you were at the Linux Foundation summit in Half Moon Bay, you saw it) on something I call moral outsourcing: what happens to our sense of individual responsibility when we push off so much of the work onto automation. Let's just say we've all agreed that it is our responsibility. We build these products and we put them out in the world. How do we stop these malicious actors? Can we?

So there is some precedent. In Britt's blog post that I mentioned earlier, she touches on a few communities we can draw from. This is not a new and novel problem. We have, in the past, built technologies that can be used for harm as well as for good, and worried about them falling into the wrong hands. People talk about the biomedical industry: think about cloning, genetic research. Nuclear is the obvious one: nuclear research, the atom bomb, et cetera. And there's talk about the way people work in the infosec community, in particular the idea of responsible disclosure: identifying harms and flaws in what you've built and making those flaws public. There are models like bug bounties that are currently being explored in the AI security community today.

But what are the shortcomings? Let's take them one by one. In the biomedical industry, and in nuclear, what you have is a massive barrier to entry. You can't just do genetic testing in your backyard; you can't create nuclear weapons in your garage. You have to have access. And even if you did have a lot of money and did have access, what they do is trace and track. In controlling nuclear proliferation, for example, they trace and track the rare materials that are used to create these weapons. That's not really possible with what we're talking about: our industry, our world, is built on making things cheaply, freely, readily available. So we can't adopt those approaches wholesale. And ultimately, regulation leads to re-centralization. If we say, well, we need to regulate, then guess what we're doing? We're fighting the culture of what we have built and re-centralizing the nodes of power, which is exactly what we're trying not to do, because our industry culture is built on tools and software that are easy to adopt, easy to share, easy to learn, and easy to distribute.

So there's this notion going around of responsible release, as distinct from responsible disclosure. Responsible disclosure assumes that you can find all the bugs: a straightforward piece of software has N outputs, and for those N outputs you can work out how things might go wrong. That's not the case when we're talking about artificial intelligence: multi-purpose tools, and a technology that technically, quote, "learns." So what is responsible release? It's the idea behind what OpenAI was trying to do:
how do I create something I recognize as potentially harmful, and release it responsibly? That's actually a big discussion in our community right now. So what does it look like? We have no clue. We have zero idea, and we are trying to sort it out at the moment.

Here are some thoughts on how to start. There's a really great paper called "Law and Adversarial Machine Learning." It's specifically about adversarial machine learning, or AML: essentially, you have an AI that tries to fight another AI, which is why it's called adversarial. As one gets better at, say, tricking the other, the other gets better at catching it. The example I used to give when I taught data science is a detective and a counterfeiter: the counterfeiter makes fake money, the detective gets better at spotting fake money, and so the counterfeiter gets better and better at making it. It's a constant push and pull. The thing about AML is that it's a really effective way to hack AI systems. So it's a wonderful paper, because it addresses not just the social impact but the role of law in this space, and it makes three recommendations.

Number one is benchmarking attacks and defenses. This is something akin to responsible disclosure: thinking through all the different ways in which things might go wrong, writing them down somewhere, and, where possible, giving ways to defend against them. That is a task to be done; one cannot just release and say, oops, sorry.

Second is to architect for forensics. In other words (and I'll talk in a minute about some initiatives going on here), create the right kind of paperwork so that if something unexpected does go wrong, you can trace back how it went wrong. If anyone in this room dabbles in data science, you know that we don't really have standards here. We have standards for how our code should look, but if you create an end-to-end pipeline, we have no standards for data provenance, data storage, data usage, or potential impact. We don't track how we tune our parameters or how we choose our variables. We just kind of wing it.

And the last recommendation, which is very interesting, is to take civil liberties into account. When I call something a multi-purpose tool, it means not just that some random actor can pick it up; it means a government can pick it up. There are a few examples I'll get to when I go a little deeper into this.

So, benchmarking. One great thing is that we can draw from the infosec community. If you're familiar with Adam Shostack's work on threat modeling, or with the DREAD framework, there's a lot to learn from the threat modeling space. It's necessary, but I would say insufficient, because in the AI space we have two things to contend with, unintended consequences and malicious actors, combined with the fact that we're talking about a technology that's meant to evolve, learn, and be context-specific, unlike, say, conventional software. If you go through the DREAD framework, the way to benchmark is to assess the amount of damage something could cause, the reproducibility of the attack, the ease with which an attacker can exploit or launch the attack, the scope of affected users, and the ease with which an attacker can discover the attack in the first place. So: how easy is it to find? How bad could it be? Who's going to be impacted?
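As a toy illustration of what DREAD-style scoring could look like for an AI system: the threat, the numbers, and the equal weighting below are all hypothetical, invented for this sketch rather than taken from the framework or the paper.

```python
# Hypothetical sketch of DREAD-style scoring for an AI-specific threat.
# The threat, scores, and equal weighting are invented for illustration.
from dataclasses import dataclass

@dataclass
class Threat:
    name: str
    damage: int           # how much damage could an attack cause? (1-10)
    reproducibility: int  # how reliably does the attack work? (1-10)
    exploitability: int   # how easy is it to launch? (1-10)
    affected_users: int   # how broad is the scope of affected users? (1-10)
    discoverability: int  # how easy is the attack to find? (1-10)

    def score(self) -> float:
        # Simple unweighted average across the five DREAD categories.
        return (self.damage + self.reproducibility + self.exploitability
                + self.affected_users + self.discoverability) / 5

# Example: mass generation of synthetic disinformation from a text model.
threat = Threat("synthetic disinformation at scale",
                damage=8, reproducibility=9, exploitability=7,
                affected_users=9, discoverability=6)
print(f"{threat.name}: {threat.score():.1f} / 10")
```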
It all seems pretty sensible, right? It is, but it's actually quite difficult to do. One thing that I personally am working on is something I call critical data science: the art, the pedagogy, of critiquing data science. It's not something data scientists are explicitly taught how to do.

Then, forensics: again, how do you build in documentation so you can trace back to potential errors and vulnerabilities? There are some really great papers out there, not yet much used in practice, but very solid. A couple came out of Google: one called "Datasheets for Datasets," another called "Model Cards for Model Reporting." Each essentially creates a standardized template for tracking particular things about your data and your model. This has led to an initiative at a group called the Partnership on AI, an industry-started but broadly attended group working to understand the social implications of artificial intelligence. It's not just industry; it's civil society groups and nonprofits, so when we get together we have this amazing, wide range of people, all interested in the implications of artificial intelligence. I'm on the steering committee for the project called ABOUT ML, and the goal of ABOUT ML is to create this sort of standardized documentation. There's also the language often used in the legal and government space: the European Commission, for example, talks about explainability, auditability, traceability, and the government of Canada just came out with an Algorithmic Impact Assessment, an attempt to audit any model used by the Canadian government. So there are these early attempts at forensics happening right now.
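To give a flavor of what that kind of standardized documentation could look like, here's a toy model card, loosely in the spirit of the "Model Cards for Model Reporting" paper. Every field and value is hypothetical; this is not a template from Google, the Partnership on AI, or ABOUT ML.

```python
# Toy model card, loosely in the spirit of "Model Cards for Model Reporting".
# Every field and value here is hypothetical, for illustration only.
from dataclasses import dataclass

@dataclass
class ModelCard:
    model_name: str
    intended_use: str              # what the model is for
    out_of_scope_uses: list[str]   # uses the builders explicitly disclaim
    training_data: str             # data provenance
    evaluation: str                # how performance was measured, and on whom
    known_limitations: list[str]   # failure modes and caveats
    civil_liberties_notes: str     # potential impact on rights if misused

card = ModelCard(
    model_name="toy-text-generator-v1",
    intended_use="Autocomplete suggestions while drafting emails.",
    out_of_scope_uses=["generating news articles", "impersonating real people"],
    training_data="Public-domain English text; no user data.",
    evaluation="Perplexity on a held-out split; human review of 500 samples.",
    known_limitations=["reproduces biases in training text",
                       "fluent but frequently unfactual"],
    civil_liberties_notes="Could be repurposed for disinformation at scale.",
)
print(card.model_name, "-", card.intended_use)
```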
And then finally, this notion of civil liberties. I found this one the most interesting, because it's the hardest for us technical folks to grasp: it's not something programmable or codable; you've got to read things, you've got to read law. But it's really important, because we've already seen this happen. If you've heard of the infamous "gaydar" paper that came out of Stanford, that was one of the first conversations we had, probably about three years ago, about how this technology can, A, be poorly designed, and B, be used maliciously by a government actor. What these researchers at Stanford did was scrape photos from a dating site, take people's self-expressed sexuality, and train a machine learning model to say: I can look at your face and tell you whether you're straight or gay. There are so many problems with that; in the three and a half minutes I have left, I wouldn't have enough time, even with an hour, to cover everything wrong with what they built and how they built it. But here's what was scary: there are countries in this world where it is literally illegal to be gay, where you could be killed for it. And they created a tool that one of those governments could pick up and very easily use, thereby stifling civil liberties. So we have to think through our legal protections. The UN actually maintains a list of the rights we are owed as human beings, and we should be asking whether what we're building violates those rights.

Another example is facial recognition. We're seeing more and more protest movements all around the world, in Hong Kong, in Lebanon, and frankly, I think that's an amazing thing. But as people cover their faces, what governments are doing, as they did in Hong Kong, is ban the ability to cover your face, so that they can track you and identify who you are using facial recognition technologies, thus stifling dissent. And it's not just something as complicated as facial recognition: surveillance-state technologies are used in tandem to stifle free speech and the ability to congregate. We are stifling civil liberties with these technologies, and that has to be a consideration when we build and create the tools that we do.

So we're in an era of massive impact, massive growth, and massive potential. I don't want to diminish the value of what we're building. The reason we haven't stopped building these technologies is that we are creating new and amazing things. We are changing our paradigms; we're revolutionizing everything from education to transportation to the notion of government and the notion of citizenship. It's phenomenal. But at the same time, we can't just look toward the positive. We have to confront the potential negative consequences, because the worst thing we could possibly do is recreate a world that continues the inequality that exists today. The goal of what we build is to democratize, to make things accessible, to break down centralization of power, and what we don't want is to enable malicious actors along the way. So as we think about the technologies we build, my encouragement is to think through those three things. How do we create traceability in what we build? How do we think about the civil liberties impact? And how do we make what we've built understandable and accessible, so that we can be critiqued and can build better products? Thank you.