Hi everyone, welcome to my talk. I'm really thrilled to be here to talk to you about building trustworthy AI, lessons from open source. First off, huge thanks to all the organizers and to all of you watching. I really appreciate your support.

Hi, my name is Abby, Abby Cabunoc Mayes. I lead Mozilla's developer-focused strategy around trustworthy AI and open source, including around MozFest. Before this, I founded and led Mozilla Open Leaders, which has worked with over 600 projects globally. I really want to take the lessons from working with so many open projects and see how we can apply them to trustworthy AI and building machine learning technology today. So let me get my slides up. You can also see my handles there: I'm @abbycabs on Twitter and acabunoc on GitHub. If you have any questions at all during this talk, feel free to tweet me there, or if you're watching a recording, you can follow up with me afterwards.

So, let's get started. I want to talk about three things today. First, how AI influences our lives. I think this is obvious to many of us, but I just want to recap it a little. Second, that code is power. And third, that open source practices can shift power, and I think that will be the bulk of my talk today.

So first up, AI influences our lives. Let's take a look at some AI today. If you've been on the internet at all over the summer, you may have seen GPT-3, a language model released by OpenAI. That first example on the left really is mind-blowing. A developer built a layout generator where he just describes the layout he wants, I think he's typing something like "something that says welcome to my newsletter," and GPT-3 generates working JSX code that creates that website. It really blows my mind. You can watch him just editing what he types there, and the code gets generated, almost magically. It's pretty amazing.

As a complement to that, I brought in this other tweet by Janelle Shane. She blogs at AIweirdness.com, which I highly recommend. She input the part in bold, "Janelle Shane stood at her computer screen," and GPT-3 generated the rest of this, which is pretty amazing. It says: "It was filled with 300 lines of carefully written Python code. It was the best code she'd ever written. The best code anyone had ever written. It was way better than her old code, which was better than her supervisor's code, which was better than her coworker's code. It was better than any code she'd ever read, better than any code she'd ever heard about. She stared at it for a long time, then she deleted it." I thought that was so poetic, and it's a nice spectrum of GPT-3 being able to generate both working code and beautiful poetry. That poetry really spoke to me as someone who's written some beautiful code in the past and then had to delete it all.

Obviously, I think there's probably a bit of cherry-picking in these examples, but it does raise a few risks. I will say that OpenAI has been pretty good at regulating the production apps using GPT-3 to help prevent some of these risks, but the risks are still there. This is an article that was actually written last year in response to GPT-2, the precursor to GPT-3, and in it they interview Jeremy Howard, the co-founder of fast.ai.
He says, "We have the technology to totally fill Twitter, email, and the web up with reasonable-sounding, context-appropriate prose, which would drown out all other speech and be impossible to filter." It's a little scary to think about Twitter being filled with this auto-generated content and what that could mean.

Then Rachel Thomas, the other co-founder of fast.ai and the director of the Center for Applied Data Ethics at USF, talks about how in 2017 the FCC received over a million fake pro-repeal net neutrality comments, so anti-net neutrality comments, that had a kind of mad-lib structure to them. If you look at the examples, I don't know if it's big enough for you to see here, but in all the green pieces it's just replacing "Americans" with "individual citizens" or with "people like me." It's just mail-merge-style replacements across all of these. So it was pretty easy to figure out that these were fake, but consider how much more sophisticated they would be with GPT-3 powering them. It would be really hard to detect, and it has the potential to really sway things: it can look like public opinion is moving one way when in reality it's just one person with GPT-3 generating it all.

So that's GPT-3 and some of its risks. Another one that's gotten a lot of press is YouTube's recommendation system. This is a great article by Zeynep Tufekci. She's a researcher in social movements, but she's also been writing a lot about COVID-19, so you're probably familiar with her work there. In this article, back from 2018, she talks about how this algorithm just continually recommends and auto-plays more and more radical content, because that's what keeps people on YouTube longer. It's led to the radicalization of a lot of people and contributed to the rise of the anti-vaccine movement, white supremacy, and more. It's a little scary how an AI, a simple recommendation engine, has affected society so much.

So those are just two examples of how AI influences our lives. I didn't want this to be too much of a downer, so I'm going to leave it at those two; I want this talk to be mostly positive and about the lessons we can learn from open source.

So we'll move on to "code is power." I think it's easy to feel like you're just a lowly engineer, that you don't really have a say in what the product is doing or how it's used. But you have so much power as someone who's building this technology that people use, and I think it's important to recognize the power that you have. And if Marvel has taught me anything, it's that with great power comes great responsibility. I do really love this picture, super cute.

But yeah, we do have a lot of power. This next slide is from Jamelle Watson-Daniels; I caught her talk at the Participatory Approaches to Machine Learning workshop over the summer, and I loved the slide so much I copied it right into this talk. She talks about these power imbalances in machine learning. As a technical community, you and I have control over what data is collected, what data is used for training, how much to reveal about training data sets, choosing models to be applied to data, interpreting models and model outputs, assessment and verification of models, and deployment of algorithms based on models.
And I think even past this, every time you're tweaking a variable or weighting something a little differently, that has real effects downstream that you might not know about. So I think this slide really highlights this power imbalance: how people in the technical community have so much power in shaping what's being used and consumed by the rest of the world.

This next slide might seem like a bit of a non sequitur, but over the summer I joined a book club where we read How to Be an Antiracist by Ibram X. Kendi. One of the big takeaways from this book for me was the idea that the opposite of racist isn't "not racist," it's anti-racist. The idea is that if you just go through life being neutral and simply not being a racist, you're still implicitly supporting the existing structures and existing bias in society. By doing nothing, you're still supporting racism. So he calls on us to be anti-racist and to look for ways we can shift power and change those power structures and dynamics.

I bring this up because I'm seeing a similar through line with AI. Over the summer, Pratyusha Kalluri, from Stanford and a co-creator of the Radical AI Network, wrote a Nature commentary piece where she writes that many researchers think that AI is neutral and often beneficial, marred only by biased data drawn from an unfair society; in reality, an indifferent field serves the powerful. I thought that was a really powerful idea. A lot of times we think of technology and AI as being neutral, just neutral technology. But by being neutral, it's amplifying the bias that's already there and making it even worse, like we saw with the YouTube recommendation engine. So she calls on us to ask how AI shifts power rather than ask if it's good or fair.

All right, and this brings me to the last point: open source practices can shift power. If you're keeping time, I think the majority of my talk will be on this section. I do want to start with a story and look back at the history of open source. It starts with the Mosaic browser, and Netscape Navigator after that. This was a time when millions of people were discovering this new resource, the internet, for the first time. And pretty quickly, Microsoft used Windows to turn Internet Explorer into a monopoly. If you look at the early 2000s, Internet Explorer had almost 100% of browser usage, and this gave Microsoft a ton of power over the web.

So here's a quote from Mitchell Baker on what she saw at the time. She's the founder and CEO of Mozilla. She said that the internet was going to be a stack of Microsoft products, from Windows to Internet Explorer to Office to servers to file formats to protocols. That's almost the entire stack, and there was a real risk that Microsoft was going to move the web in its own direction, away from the open building blocks that we've come to know as the open web.

So Netscape did something that was pretty radical at the time: they publicly released the code behind their browser for anyone to use, copy, remix, and share. This was actually the first time the term "open source" was used, coined in reaction to this, which I thought was pretty cool, a nice piece of open source history. So with the code out there in the wild, people started to band together and call themselves Mozilla.
It was this informal community of designers, engineers, writers, and community organizers that really wanted to take this open source code and build something together, something better than they could on their own. And they did: they released Firefox a few years later. Mozilla actually took out an ad in the New York Times where, on the left-hand side, all those tiny words behind the logo are the names of every single contributor to Firefox. It really shows that this was a grassroots effort, a community of people that came together. It wasn't just a Microsoft or a Netscape, but a big group that wanted to build something no one else had.

And it was a huge hit. People loved it: it was fast, there was pop-up blocking. It obviously didn't put Microsoft out of business, Microsoft is still here, but it did break up their monopoly over the web and really gave us the web we have today, where we still have these open building blocks. And it really sparked this new way of thinking around openness in modern life.

This isn't the only time we've seen a story like this. I'm a little biased in picking Firefox, but I really could have picked any of these examples, and it's exciting to see how open source has democratized a lot of the technology we have today. I do want to point out, at the bottom, that there are organizations like the Free Software Foundation, the Open Source Initiative, and even Creative Commons that are vetting and stewarding these open licenses so that everyone can do the same thing that Mozilla did. Everyone can start an open project. Today, if you create a GitHub project, there's just a drop-down menu of open licenses. It's so easy to pick a license and then build something great with it.

So that was a fun story about open source, but I do want to pull out a few lessons from it. The first is that it was a legal mechanism that sparked it all off, and that enabled lesson two, the collective action that improved innovation. And all of this was reproducible through, number three, reusable structures. I'm going to go through each of these individually.

So first is the legal mechanism. When Netscape set their source code free, they actually wrote a license, the Mozilla Public License 1.0, that enabled anyone to use, remix, and distribute that work. In terms of shifting power, this mechanism protected user rights and made it possible for the public to use and help shape that code.

That brings us to the second point, the collective action where the grassroots community came together. That happened because of step one, and they did it by working openly. My favorite definition of working open is from the Mozilla Wiki, where it talks about being both public and participatory. This requires structuring efforts so that outsiders can meaningfully participate and become insiders as appropriate. This shifts power by allowing others to co-create with you; it allows outsiders to have power and shape what you're making in the end. So I think this is a great example of democratizing technology and using that collective action to shift power. And then the third one is the reusable structure.
We've seen this happen time and time again because of groups like the Free Software Foundation, the Open Source Initiative, and Creative Commons, who are vetting and standardizing these open licenses and making it so easy to start an open project today. I don't need a lawyer; I can just pick a license. I don't even need to sign anything. It's really easy. These reusable structures gave everyone the power to start an open project, which really shifted power to users. Anyone can start an open project.

All right, so there's the summary: a legal mechanism enabled collective action that improved innovation, and this is all reproducible through reusable structures. Now I want to apply these to AI and today's AI landscape. There are two ways to do this that I'm going to talk about, at least. The first is around data stewardship. The open source license is great for opening up code, but a lot of the power in machine learning comes from data, so what can we do around data to help shift power? And the second is participatory ML: how can we build machine learning in a way that brings in others, that takes a lot of these lessons from open source to build something collectively and shift power?

So, back to data stewardship, and back to that first question: do we need a new legal mechanism today for the trustworthy AI landscape? If you look at the internet today, eight companies wield enormous power over it. I'm sure you're familiar with many of these logos, and every internet user interacts with at least one of these companies on a daily basis. Often it's really hard to understand their revenue model, at least for some of them. A lot of them have so much power because they've spent years collecting user data, and that data is helping them create machine learning technologies, or AI, that gives them even more power on the internet.

Mozilla has done a lot of research around alternative data governance approaches in machine learning. There are a lot of words on this slide and I'm not going to talk through all of them, but they've surfaced a bunch of different approaches, from data commons to data cooperatives to data trusts to data marketplaces. I do want to talk about two of them right now.

The first is the data trust, which I think is a great example of a legal mechanism that leads to collective action. A data trust (here's my rudimentary diagram) is a legal mechanism that acts as an independent intermediary, sitting between the data subjects, the people and users creating the data, and the data collectors, the companies that collect the data. The data trust is loyal to its subjects, but it negotiates data use with the companies according to the terms set by the trust. A lot of people see this as: instead of logging into Google or Facebook, you would log into your data trust, and then your trust would send data to Google or Facebook on your behalf, according to the terms you've set. This gives you a lot more power, and as a group, as members of the data trust, you have a lot more collective action in how your data is being used and what it's used for. The one thing is that it doesn't quite have a reusable structure yet. If I wanted to start a data trust today, I would definitely need a lawyer to help me out. It's not quite the drop-down on GitHub yet, but people are working on that.
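Just to make that flow concrete, here's a minimal, purely illustrative sketch of the data trust idea in Python. This is not any real data trust implementation or API; the class names, the terms, and the "research" versus "ad-targeting" purposes are all made-up assumptions, just a mental model of an intermediary that only releases data within the terms its members set.

```python
# A purely illustrative mental model of the data-trust flow described above.
# Nothing here is a real data trust or API; every name is a made-up assumption.
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Terms:
    """Terms a data subject sets for how their data may be used."""
    allowed_purposes: set[str] = field(default_factory=set)


@dataclass
class DataTrust:
    """Independent intermediary, loyal to its members, that mediates data use."""
    members: dict[str, Terms] = field(default_factory=dict)

    def join(self, member_id: str, terms: Terms) -> None:
        """A data subject joins the trust and sets their own terms."""
        self.members[member_id] = terms

    def request(self, collector: str, member_id: str, purpose: str, data: dict) -> dict | None:
        """A data collector (e.g. a platform) asks for a member's data for a stated purpose.

        The trust only releases the data when the member's terms allow that purpose.
        """
        terms = self.members.get(member_id)
        if terms is not None and purpose in terms.allowed_purposes:
            return data  # released on the member's behalf, within their terms
        return None      # denied: the stated purpose falls outside the member's terms


# Usage sketch: a member allows "research" but not "ad-targeting".
trust = DataTrust()
trust.join("alice", Terms(allowed_purposes={"research"}))
print(trust.request("example-platform", "alice", "research", {"age_band": "30-39"}))      # released
print(trust.request("example-platform", "alice", "ad-targeting", {"age_band": "30-39"}))  # None
```

Obviously a real data trust is a legal and governance structure, not a Python class, but the key move is the same: the intermediary, not the platform, decides what flows and under which terms.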
A great example of a data trust in practice is the UK Biobank, a charitable company with trustees that manages genetic data from half a million people. The trust negotiates with researchers on how they'll use the data from the different patients who donated their genetic data.

The other example I want to talk about is the data commons, which is data collected and shared as a common resource. The classic example of this is Wikipedia, where crowdsourced articles are made and shared with everyone, and Wikipedia has changed how a lot of us use the internet. But the example I want to talk about is Common Voice, which is a Mozilla project. If you look at voice technologies today, a lot of the big players are Alexa, Siri, Google Home, and a lot of them are able to create these technologies because they've been collecting our voice data for years. They have so much voice data that they're able to build these personal assistants, and it's really hard for an outsider to come in and compete with that, because they haven't collected all this data; they just don't have the data set. So Mozilla started Common Voice, where people can donate their voice data and anyone can use this public domain data set to build technologies. If you're interested in donating your voice, please go to Common Voice; we're always looking for more gender diversity, more global diversity, and more languages. It's a great way to help shift power away from big tech by enabling others to create more voice technology.

So those are just two of the many different data governance types out there in the world. Recently Mozilla published its data futures research, which is research on shifting power through data governance. You can go to that URL if you're interested; they go over seven different data governance models, along with some examples of how these are happening in the world today. I recommend you check it out.

All right, and this last section is participatory ML: where can we apply working open today? I think we're starting to see more people talk about participatory ML and realize that there's so much power with the technical community building this. How can you open up different parts of it so that you're co-designing with some of the end users and understand some of the impacts down the line? I do think working open is a great way to shift power. Back to that definition I love so much: working open is public and participatory, and this requires structuring efforts so that outsiders can meaningfully participate and become insiders as appropriate. I do want to say that if you're intentionally including those traditionally excluded from shaping tech and helping them become insiders, working open is great at shifting power. However, if you're only inviting those who already have power, working open doesn't shift power; you're just defending the status quo. So working open can shift power, but I don't think it always does. You can see that in the open source community today, which has done a great job of democratizing technology and giving so many people access, but still has its share of diversity issues. You have to be intentional about the way you work open so that it can shift power.

So I wanted to give you a bit of a framework to think through. This is a framework I use when I'm teaching open source.
These are five open source practices that you can implement in your own work, in three categories: gifting, where you're just giving something away for free; listening, either passively or actively; and collaborating, either with a team or through partnerships. This is based on research from the Copenhagen Institute of Interaction Design and Mozilla Open Innovation. I'm going to go through each of these and talk about what they are and how you can use them to shift power.

All right, the first one is gifting: the no-strings-attached giving of valued products and services. The example from the study is Google's Android, which gifted a development platform to encourage new uses by developers. This incentivizes adoption, which is a great advantage; people will use your thing if you just gift it to them for free. It can help drive a standard: if more people adopt your work, you can use that to drive a standard and generate interoperability. It also leads to improved products and services; if more people are using it, you know where to start improving it and you can build a better product that way. If you're thinking of using gifting as a way to shift power, first I'd advise you to make sure it's accessible to as many people as possible, that you're not just gifting to people who already have power, but gifting to people who really need this and don't have power now. A question to ask yourself: who does not have access to your work who should? Maybe try to gift it intentionally to them.

Next is soliciting ideas; we're in the listening section now. This is using a community to generate ideas and solutions. I really like this example, which I actually didn't know about before I read the study: the LEGO Ideas platform allows anyone to propose a kit idea, which can then be voted into production. The open advantage here is that it helps you understand your community, and you can generate additional offerings from the community. In terms of shifting power: can you solicit ideas specifically from marginalized groups that may be affected by your work? And are there experts from other fields, like social science or race and gender studies, that you can listen to? I think a lot of times when we're building responsible or trustworthy tech, we're not listening to the people who've been studying this for a long time, like the social sciences, which have been looking at society and technology for ages.

Okay, on the other side of listening, there's learning through use: collecting and analyzing activity to improve products or services. This is the passive side of listening. The example I have here is Spotify's Discover Weekly, which learns a user's taste and then creates playlists; it's a feature I actually really like, a little playlist created just for me. The open advantage here is that it helps you understand your users, you can improve your products, and you can fail fast; you can tell when something you're making isn't working. In terms of shifting power, really look at how marginalized groups are using your software. Is there something you can do to optimize for them, so that they're getting some power in the design of your work?

And then, moving on to the collaborating section, there's creating together, the very traditional open source thing you think about.
This is sharing the tasks and costs of achieving a pre-established goal. The example is Local Motors, which invites designers to use an online shared database of parts to co-develop products. The advantages here: you get a better product, you lower operating costs, and you really give ownership to the community; they feel like they've helped build this and that part of it is theirs. In terms of shifting power, really think about who should be a part of building this but isn't, and whether you can invite experts from other fields to build it with you. And of these community interactions, I do think this one takes the most effort on your end, because you have to create a structure so that others can co-create with you. It's really hard; it's much easier to give something away for free or even ask for ideas. So be aware of that when you're choosing which one to do. I do think this is a great one, and it's probably the best way to fully shift power, but you can still shift power in smaller ways through gifting, soliciting ideas, and so on.

And then the final one is networking common interests: coordinating to ensure that individual activities achieve more towards a shared mission. The example here is Ashoka, which serves as a platform for innovators who share an overall objective, but each sets their own project. This is definitely more about partnerships: if you're running a project, is there another aligned project you can partner with, so you can do more together than you would on your own? As an open advantage, this advances a common playing field, enables separate groups to help each other at lower operating costs, and you can improve products by learning from your partners. In terms of shifting power: are there aligned projects you can partner with and share power?

So, back to the overview. These are five open source practices that can shift power. Some of them are weaker ways to shift power, weaker forms of participation; others are much stronger. I'd recommend you go as strong as you can if you really do want to shift power, but I understand that not all projects have the resources to do that. I hope this helps you think through your own work and ways you can open things up, at least a little.

After talking through these practices, I want to ask two questions. Where can you include others and share power in your work? And who will you include? Really think about that power dynamic: who traditionally doesn't have power in what you're building, and who is most affected by it? In what ways can you start to include them in the design of your work? You can think about that after this talk.

So, I do work at Mozilla. Our mission is to ensure that the internet is a global public resource, open and accessible to all. Right now we are really focused on trustworthy AI: AI that's demonstrably worthy of trust, where privacy, transparency, and human well-being are key considerations, and where there are mechanisms for accountability. If you're at all interested in this, I co-chair the Building Trustworthy AI Working Group. This is actually MozFest's pilot working group, and we've recently selected six projects that we're going to be working on leading up to the festival, which is in March 2021.
These projects range from defining best practices around building trustworthy AI, to including more diverse stakeholders in the creation of this technology, to creating the building blocks needed to build more trustworthy AI sustainably. So if that interests you at all, join us. I think, as of the time you're watching this, our next meeting will be the very next morning, so come join us. You can hear about the projects; we're just starting them now, so now's a great time to jump in and start working towards better AI.

And that leads us to MozFest, where we'll be working on these projects all the way up to March 2021. MozFest is Mozilla's annual festival, really a celebration of the internet, and I love our tagline: come with an idea, leave with a community. We are virtual this year, like many other conferences. As for our call for proposals, I don't have a URL because I'm recording this earlier, but as of when you're watching this, it opened yesterday. The whole theme of the festival is trustworthy AI, and we want to look at it from the developer's perspective, the policy perspective, the consumer's perspective, really any way we can get to more trustworthy AI, so if you're working on anything related to this at all, we're really interested in hearing from you. I think MozFest is a great way to come and find others with aligned ideas, and you really do leave with a community. So check out MozFest.

And I just want to close with a huge thank you to the many people who bounced ideas around and shared slides with me, and thanks to the many of you who did the actual work that I'm just showing off in this presentation. With that, thank you so much for watching. If you're watching the recording and you have any questions, you can tweet me and I'll be sure to follow up; I think that's a good way to get in touch. But I believe I'll be here after this to answer some of your questions live. So thanks, everyone, and I'll see you there.