Okay, welcome, everyone. Thank you for coming. I was saying that the fewer people in the room, the more power each individual has, so it's a powerful room.
So this is a session on helping to define open source AI, and it's going to be an interactive session. I will start by introducing myself. I'm Mer Joyce, my pronouns are she and her, and I own a company called Do Big Good that designs and facilitates consultative decision-making processes. The Open Source Initiative, which is the standard-setting body for open source in software, has hired my company to design and implement a consultative process on what open source should mean for AI, and this session is part of that consultative process. I'm going to be co-facilitating with Ruth Suehle, whose pronouns are she and her, and who is the executive vice president at the Apache Software Foundation and director of open source at SAS.
I also will make a disclaimer, which is that, as I just said, my expertise is in collaborative decision-making and co-design. I'm not a content-level expert on AI. So if I make a mistake and it's just a whoopsie-daisy, then we'll keep going. If I make a mistake and it's like, "Oh, what we do after this statement is going to be wrong," then, you know, let me know.
Yes, and there are many topics we may not have time to resolve today. So there will be times where you're like, "Oh, I disagree," and it's like, "That's going to be in the parking lot," because we have a lot to get through. So I'm actually going to start with the land acknowledgement.
I think people probably know what that is. It's a way of acknowledging the historical context of where we are in the work that we're doing. So we acknowledge that we gather on land that was stolen from the Muwekma people, who are the true stewards of this land. We acknowledge the sacrifice of indigenous Africans whose stolen labor was used to work this land. And we acknowledge immigrant communities whose contributions continue to deepen and diversify the cultural fabric of the nation that occupies this land.
So what we're doing today, specifically: OSI has been convening a global conversation to find the definition of open source AI for almost two years now, and you are here to help us do some very practical work on that, which is coming up with the criteria for evaluating whether a license for an AI system can be defined as an open source license. This is basically the primary use case for the open source definition of AI: OSI has a committee that looks at licenses and says either, "Huzzah, according to our definition, this can be used as an open source license," or, "Sorry, according to our checklist, this is not a valid open source license." Right now we're getting to the point of actually saying, okay, what are the checklist criteria that the committee is going to work with? Ruth is going to give us more details on that, but that's what we'll be working on today. By the end of the workshop we hope to begin to populate this checklist with criteria for defining different types of AI systems as open, and we'll see how easy or difficult that is. That is part of what we're learning today.
Someone requested that this be recorded, and it's actually being video recorded, so audio is also fine because video is already happening. I see the little red light in the back of the room.
So, as I stated, this will be interactive. After a context intro from Ruth, we're going to break into small groups, and each group will focus on defining how they think a particular type of AI system should be licensed as open. Different perspectives and experiences matter. If you're up for participating but feel like a novice, please stay. I know there are two conferences happening, and some people may have more experience in AI than others, and that's fine.
So, just to start off with the interactivity, which has not occurred yet: raise your hand if it took you less than an hour to travel to this conference. Okay, just one. Between one and ten hours to travel to this conference? Okay. More than ten hours to travel? Oh my goodness. Just for shits and giggles, where are the people who took more than ten hours? Where'd you come from? Tanya, and then you're Cyprus, of course. Cyprus. And what did you say? London. London, okay, lovely.
All right, so one other piece of hand-raising. Part of this consultative process is to come up with a list of stakeholders, of who is a stakeholder in open source AI. We have a prototype version of that list, and I'm going to see if it covers the people in this room as a little test of the list. These categories are not mutually exclusive, so you can raise your hand more than once.
So who in this room would consider themselves to be a system creator, meaning they make AI systems or components that could be studied, used, modified, or shared through an open source license, such as an ML researcher in academia or industry? Who considers themselves a system creator? One, two, three, four, five, but some hesitant. Hesitant is fine. Okay.
Who considers themselves to be a license creator, someone who writes or edits the open source license to be applied to the system, such as an IP lawyer?
Okay, one. Okay.
Regulator: writes or edits rules governing licenses and systems, such as a government policymaker. Obviously there are multiple geographies to which this could apply. Does anyone consider themselves to be a regulator of AI, or someone who might be called upon to regulate it? Okay.
There are just three more. Licensee: you seek to study, use, modify, or share an open source AI system, such as being an AI engineer or a health or education researcher. One, two, three, four, four and a half, four and two halves. I'm going to call it five.
End user: consumes a system output, but does not seek to study, use, modify, or share the system. You're going to be hearing those words a lot. Such as a student using a chatbot to write a report, or an artist creating an image using a generative system. End user: consuming but not seeking to study, use, modify, or share, such as using a chatbot. Okay, so this is maybe some controversy. Okay.
And the final one is a subject: someone affected by a system output without interacting with it intentionally, or an advocate for that group. For example, someone whose loan application to a bank is being evaluated by an AI system owned or used by the bank, or a photographer who finds that their photograph is in a training data set that they didn't know about. Do you consider yourself a subject of AI? So, one. Okay, yes, that was my assumption as well, that everyone would consider themselves a subject, but I didn't want to push that on people.
So the next question is: are there any other connections to AI in this room that are not covered by those categories? Please describe it. Yes. Okay. So, OS, what would you say your connection is to AI? Okay. Within OS, within open source, okay. What? Investors, yep. Mm-hmm. Cool.
Thank you. All right, so now Ruth will give some more details on the project, and then we'll break up into small groups and do some work on the definition.
I'm actually going to give light details. I have the whole slide deck, but I know or am familiar with just about everybody in this room, and I'm pretty sure you could all work through this slide deck too, so I don't think we need to go deeply into detail. I did promise Stefano we would ask: how many folks are pretty familiar with the OSI, the open source definition, and the four software freedoms? And how many of you are like, "Do not cite the deep magic to me. I was there when it was written"? Yes, several of you.
So most of you are familiar with the OSI, and most of you are familiar with me and know that I am not the OSI, but I am a friend of the community and offered to do this. We had done one at All Things Open in Raleigh, and I worked on that project, so here I am with you.
OSI is this 25-year-old nonprofit organization that currently maintains and defends the open source definition, among many other things that are represented here. If you have questions about any of those, I'm happy to chat with you about them later. But I think that this is a really good group to work on the actual hands-on workshop part, so I would like to get to that, and I'm pretty sure most of you know what all these things are anyway. Deep Dive AI is what has led to all of this.
We all know that technology evolves, and this is where we are now. In fact, I think most of you either were on the panel about an hour ago about open source AI or were in that room. That was an excellent intro to what we're going to work on here. I've heard Stefano say a few times, "We had 20 years to work on the open source definition. We don't have 20 years this time."
It's now. There's this sense of urgency in multiple ways, with all of the policy and regulation work that's happening around the world right now. If people are throwing around this term "open source AI" without a common understanding of what that means, we're going to end up with different interpretations, and we're going to end up in a very bad place. There's already a lot of confusion in the market around components, models, and pieces that are already being released using references to "open" or "open source," using licenses that may not even apply to what they are. It's kind of a hot mess, and it's time to all agree on what exactly we're doing here. We're already seeing some of the juggernauts do some gatekeeping, and that is not where we want to be. Everything is moving very fast, and so that's why we're doing this now. I think I heard earlier that officially the plan is to announce the open source AI definition in October of 2024. Fingers crossed, we'll see where it goes. Did you have a follow-up? That's the goal. Move fast. September? You just moved up the date. The audience has moved up the deadline. Do I hear June? June?
So I'm sure you are all familiar with this typical machine learning pipeline, and like I said, I just really want to blitz through this so that you guys have a lot of time to work. We all know here that the legal landscape is pretty complicated right now. Does a data set have copyright, and what does that mean in different countries? Some countries have even more than that: privacy laws, database rights. In the EU, database rights last for 15 years, until somebody makes a change and a new set of rights is created for that database. It's super complicated right now.
Love this slide. So, what is the copyright on this image?
We asked for a picture of an anthropomorphic mouse. Does everyone in this room have the same opinion on the copyright of this image as a certain entity that may be known for its litigious nature? Yes, and thank you for further evidence that things are changing quickly and we must move fast.
The issues are complicated. It is not simply, "Let's just make a new open source AI definition." There is a lot to think about: the data, the amount of data that is required, the issues around finding it and curating it, the bias that may be therein, and the conditions for using that data. There are the models themselves: what's in there, and what legal frameworks apply to those. There's the knowledge to set up clusters and to test and fix those models. There's the hardware, which relies largely on a proprietary ecosystem, which is not always our preference. And there's how these systems are currently being deployed, with a lot of big promises that aren't necessarily being met, maybe with some disappointing results in some cases.
There is also an aspect of community. Roman talked about the Apache concept of community over code, and the AI community doesn't have those same social-norm concepts in place yet.
So there's no unanimous consensus on what acceptable behavior is around, maybe, scraping the web for images, around publishing papers, around models. There is a lot of harm that could potentially come out of that, and so that's how we got to this discussion about defining open source AI.
That conversation has been going on since June. I think it started with some much looser, less concrete conversations around what this might look like, and it has evolved through workshops like what you're about to do, although you're going to try a slightly different version, and I'm very excited to see how it goes. It has the objective of a shared understanding. We're talking to not just open source experts like all of you, but experts in multiple fields and disciplines around the world, not just the same voices that are going to say the same things. We can't just produce this sacred new text and present it to the world. That's not how we do things. In the way that open source is done collaboratively, we're doing this definition collaboratively.
This is normally where Stefano would retrace the history of open source for you, but again, you were all there when the text was written, so let's just go right through that. I do want to bring up the topic of self-sovereignty, which is the reason that field-of-use restrictions are not generally what we do in open source. That is important to keep in mind when we're talking about these definition ideas.
If we think about the golden rule of the GNU Manifesto and replace the word "program" with "AI system," that is kind of the fundamental idea of the current state of the definition.
Oh, as a side note: what is an AI system? If I asked you all to write down a definition, are we all going to come out with the same definition of what an AI system is? Probably not. In fact, absolutely not.
There are as many definitions as there are people in this room. But the generally agreed-upon one from the OECD, which is cited here and was updated, I think, just a month ago or less, is the one that we're going with.
We also would like to keep in mind that, as an open source community, we have a certain set of benefits in mind for open source and what that means. We are noticing that policymakers around the world maybe have some different priorities in mind: some overlapping ones, but also some different ones. We have to recognize that and deal with those things, or this concept of open source AI will not be successfully adopted. So that means making sure that it can be transparent and explainable, that the definition has all of these attributes, and that we're considering needs that are not part of the way we all natively approach the world. I know it's very easy to have spent years, and decades even for a lot of you, in this open source world, and to come with a certain mindset and forget that there are a lot of people in this world who just woke up in the last 12 to 18 months and were like, "I'm sorry, where does the software come from?" And those are the people who are going to be making these kinds of decisions.
So how are we going to do this? We're going to go back to those four basic freedoms, the four software freedoms, and start from there.
So: study, use, modify, share. This is the current state of the definition that you're going to work from when you start working, and I promise I will put it back up. I think we even have little handouts that people can look at. So this is where you'll be starting from: to be open source, an AI system needs to make its components available under licenses that individually grant the users of the system these freedoms, study, use, modify, and share, that you are all intimately familiar with.
But today we're actually going to break the system down into its components, and we're going to look at each of those freedoms within the context of the code, the model, and the data. So, actually, if you want to hold up the little sheet... there. We will over-explain this repeatedly and help you through it, and there'll be handouts, but each group, not each individual, is going to get one of these sheets so that you can discuss each of those aspects in those contexts. Now, if your group spends all the time in one of these boxes, that's totally okay. If you don't fill in every square on the sheet, that's fine; we don't have that kind of time, and I'm almost certain you will not fill in every square on the sheet. Whatever your group has the interest and expertise to discuss.
That's what we want to hear. So this is a replication of that, and you will see this during your time to work. I will put this slide back up; it breaks down how the definition fits into each of those categories, and you will have it on these little handouts. The chunk of this that we're looking at is the four freedoms. Mer mentioned that we're working on a license checklist and what that will look like for this. If you go to opensource.org/deepdive/drafts, you can see all of this and the context around it. You can also see the current draft and add comments, you can see other people's comments, you can have a whole conversation, and you can send it to your friends who aren't here and invite them to join that conversation there. But that chunk of the four freedoms is what we're looking at today.
So let's make it happen. The first thing that we're going to do is break up into groups, and you are going to spend a few minutes introducing yourselves to each other, if you don't already know each other. Give a little context about where you're coming from to this conversation, what you're bringing to it, and how you interact with AI. Then the bulk of the time is going to be the brainstorming and filling in the sheets. You should appoint someone to come back at the end and present what you have found. And that is how the day is going to go.
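For readers following along without the handout, the worksheet described in the session can be pictured as a small grid: the four freedoms (study, use, modify, share) crossed with the three components (code, model, data). The sketch below is only an illustration of that grid, not OSI's actual checklist format. The function names `blank_worksheet`, `add_criterion`, and `unfilled_cells` are hypothetical, and the sample note is a placeholder rather than a real criterion.

```python
from itertools import product

# The four freedoms and three components named in the session.
FREEDOMS = ("study", "use", "modify", "share")
COMPONENTS = ("code", "model", "data")


def blank_worksheet():
    """Return an empty grid with one cell per (component, freedom) pair."""
    return {cell: [] for cell in product(COMPONENTS, FREEDOMS)}


def add_criterion(sheet, component, freedom, note):
    """Record one group's note in the matching cell (hypothetical helper)."""
    cell = (component, freedom)
    if cell not in sheet:
        raise KeyError(f"unknown worksheet cell: {cell}")
    sheet[cell].append(note)


def unfilled_cells(sheet):
    """List the cells no group has written in yet."""
    return [cell for cell, notes in sheet.items() if not notes]


if __name__ == "__main__":
    sheet = blank_worksheet()
    # Placeholder text only; real criteria come out of the group discussion.
    add_criterion(sheet, "data", "study", "placeholder note from a group")
    print(len(sheet), "cells,", len(unfilled_cells(sheet)), "still empty")
```

Leaving cells empty is expected here: as the facilitators say, a group may spend all of its time in a single box.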