So let's start. Today we are having the panel discussion on the definition of open source AI. As we heard, some people even think that it's too early to create that kind of definition, so that's also a point of view we're going to discuss today. We have three amazing panelists. We don't want to spend too much time on introductions, so go read the bios if you want, but I'll just outline the angle we're going for. We have Sal Kimmich, who is going to represent the developer community here. They have their own startup, have been in the machine learning community for many years, and also focus on the security of the open source supply chain. Then we have Nithya Ruff, who is with Amazon, where she is the head of the open source program office, and Amazon is not the first company where she's done that, so she really knows the corporate side of things. Then we have Roman Shaposhnik, who, as he explained today on another panel, has multiple personality disorder; for this particular panel, we asked him to wear the hat of the Apache Software Foundation and represent the foundation's angle. And my name is Tanya Dadasheva. I have an AI startup, but for many years I was at a venture capital firm, so I'll do what venture capitalists actually know how to do: ask hard questions.

So let's start with the first one. Everybody here is talking about how these new things threaten the open source we're used to, and how it's all going to pan out. So my first question is: what is it about open source that, if it were threatened or disappeared for some reason, you would miss the most? What is the most important part that you want to keep going?

Sure, I'll take that on. I think what's important for me moving forward is that this really is a new generation of developers, and a lot of them are corporate developers first, meaning they don't have any previous open source experience. If you were to get into AI/ML ten years ago, you would have come out of a PhD in which you would have done open science; that would have been your best background for the field. But now we've got a whole new community of developers who are familiar with open source licenses as a sticker that you put on a product, and not as a representation of a contractual understanding of IP sharing. And I want that understanding to be maintained.

I agree. I think in the early days of open source, we really delved into understanding the risks, the opportunities, what a license means, and very intentionally used and understood the culture of open source and its benefits. But now we take it for granted. It's just there, and we just use it. We need to go back and understand the fundamentals of open source again, the benefits of open source, and how to apply it going forward to other forms of innovation, like AI and data, et cetera. So I'm very, very interested in that. There's no other form of collaborative innovation that's more powerful and more inclusive than open source.

Yeah, and I would like to drill down a little more on the point you just brought up: community. I wear an Apache hat today; I wear an Apache hat a lot of the time. And at Apache we have a saying, community over code, meaning that if you have a great, cohesive community, the code will come.
But if all you have is code, some amazing code base that got created, and maybe it's the best thing ever, but you don't really have a community around it, it doesn't matter how good the code base is. You will always get out-innovated and outrun very quickly, because it takes a village to raise an open source child, and it definitely takes a village to raise an AI child. So to me, the most important bit is the ethos of community, and how we build these really transparent communities that span borders and governments and institutions, and allow people from all sorts of different backgrounds, a lot of times underprivileged backgrounds, to do the kind of work that young kids often think you can only do if you join Google. No, actually, if you join an open source community, you can do exactly the kind of work that Google engineers are doing. You can even be working with Google engineers online for free, until they pay you, hopefully.

Yeah, so I want to come back to that. I think it's really worth touching on the importance of the culture and community that the Apache Software Foundation is specifically seeking to encourage. Did any of you, in the last Cassandra keynote, scan the code and get your free Cassandra credits? Because you should have; that's useful. If you did, you would have immediately been prompted: what's your name, information, company, and have you read and understood the community guidelines of the Apache Software Foundation license you are just about to pull from? You're pulling from IP that is entrusted within that boundary. What does that mean? There are a number of different open source licenses, but this one is really meant for business-to-business collaboration. So there are a couple of things in the Apache 2.0 license that are important for AI. All of these open source licenses will allow you to do something like this: if I made a robot, a physical robot, it would mean that anyone could play with the robot and make changes to it, as long as they're willing to play with it and make those changes out on the street. Now in AI, if you then want to take those models and work with them independently, if you want to take them back into your house and not show anyone the changes you made to the cool robot the community is working on, then we'll dual-license you, and you'll have to be on a paid license. So there's never a situation where you necessarily have to have a fully closed or a fully open source version; it depends on the audience you're aiming at. But the last point here is that there are three conditions of the Apache 2.0 license that are important. Number one, as soon as you read those terms and conditions, you understand that if you make any changes to a file, you have to prominently state that you changed it. Number two, it is not copyleft, so you are allowed to use it for whatever ends you wish, even if the previous work was not intended for that use; if it's intended for a different use, that's valuable. And number three, really, really importantly, is attribution: you need to retain the notices that say who wrote it and where it came from. So with Apache 2.0, when I am pulling from something that sits under that license, those releases are way more reliable to me, because there's a trust chain to the developers behind it. I know I've spoken a lot, but community is really a misnomer here: it is a contractual understanding, once you've accepted the community agreement.
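For reference, the three conditions described above correspond to the redistribution terms in Section 4 of the Apache License 2.0. Here is a minimal sketch of what they look like in practice, using the standard per-file header; the names and the modification note are hypothetical examples, not from any real project:

```
Copyright 2024 Original Author
Modifications copyright 2024 Example Contributor

NOTE: This file was modified from the original.
(Section 4(b): modified files must carry prominent notices
stating that you changed them.)

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

(Section 4(c)-(d): retain the existing copyright, patent, trademark,
and attribution notices, and carry any NOTICE file forward.)
```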
So actually, that's a very good jumping-off point into the second question, which is: what is different about AI? The open source community has been around for decades, and it has developed a set of foundations, rules, and licenses that people rely on, including definitions that people rely on. But with open source AI, it's different: nobody even knows what open source AI is. People argue about what it is, while OpenAI is using that name. A lot of questions are still open, and people are not sure whether we should be discussing them now or wait for some time. So, wearing your hats from different sides of the market, what is your view? Is it too early, too late? Do we need new organizations, or can we work with the current ones?

I'll just make one very quick note and let the other distinguished speakers elaborate. My quick note would be that in the open source community, and I've been in this community for a long, long time, we got lucky, because the guidelines that were put in place were put in place by a small number of people for whom it was easy to agree. Now that we have the OSI definition of what open source is, we all keep forgetting that that definition was literally hammered out by just a few folks in a room, and everybody said, yeah, that makes sense. But even though it was simple, what it allowed us to have is a clear delineation of what it is and what it isn't to be open source. Now, don't get me wrong, we all want to have as big a tent as possible, but certain things are just non-negotiable. In my book, one of the things that OSI got just right in that room was not having any restrictions on the field of use. That's just one example; it's actually a short document, go ahead and read it. But being able to agree on those types of things is non-negotiable for making sure that AI truly stays open. To me, this is what's new about this whole AI movement: it is way more loud and way bigger, which is all good, but it makes it so much more difficult to agree on even a common definition of what we all want. Speaking of which, by the way, we have a workshop at 4:30, I believe. 4:20, yes. So come to the workshop and we will all just figure it out, and that will be the room where we all later say that's where it all got decided.

I think if you remember the early days of open source, there was so much chaos and confusion, and there were millions of licenses. Everybody was creating their own custom license and playing with that. It didn't create speed of innovation, and it created mistrust, especially from corporate organizations, about using it. It wasn't until OSI started applying the open source definition and its rule book to licenses, codifying them and saying, here are licenses that are OSI-approved, that it started creating more trust. It started creating more confidence, and you saw open source really blossom. I think AI is somewhat in that same position, where there's a lot of confusion about what open AI is, and whether the model, the data set, the weights, everything have to be open in order for it to be open. And in the absence of a definition, you have companies coming out and saying they're open, open-washing, open-ish, and it's not creating confidence, I would say. I know we've all established that we absolutely need open innovation in this space.
And I'm very glad that OSI and LF AI & Data and others are working on establishing a global and consistent definition. We really need it. And what's different, to your question, Tanya, is that, A, there are so many more components than software had to deal with, and B, we have to embrace the data community, the AI community, the machine learning community as part of this definition process. We can do it as open source, as a traditional open source organization.

Like the old school way?

Yeah, the old school way. Well, it's new school open source, if we can get it to work.

Yeah. Coming back to figuring out the license and the license requirements around this, it is interesting. OSI sent out an email, maybe yesterday or today, saying they will be announcing their open source AI definition in October of 2024. So we have, I mean, we've got months of potential lawsuits ahead. That's just the honest truth, because you're going to be working in a space where, and I'm talking to IP lawyers, I'm talking to all of these different sectors, the people who would be doing the regulation don't understand the technology well enough to regulate it. So it's a dangerous place to be in open source for about nine or ten months, until we do have a shared understanding. Because all that open source licensing is supposed to ensure, if you are on a standard open source license, is that I am giving you a one-way derivative relationship to my IP: you are free to take it, and I am free to do nothing else, so that I can move on with my life. I've seen a lot of regulatory requests around open source projects that quite literally don't understand what they are. This is why these systems are different: this is not going to sort itself out, and this is not just software. Every single AI system that we're working with, and that we're seriously attempting to regulate, has a software and a hardware component, and to some degree it's got a database behind it. So no longer are we looking at something which is only going to have one license on it. You quite literally can't put an OSI license on a database. So if you want to have a standard stack of licenses, if you want your head of policy to be comfortable with your deployments, you now have to have a license for the database, for the model potentially, for the weights if there's something you're borrowing. That's a composite license. It's the first time we've seen a composite license, and that's where I want us to go.
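To make the composite idea concrete, here is a hypothetical sketch of one AI release whose components each carry their own terms. The component names and license choices below are illustrative assumptions, not any real project's stack:

```python
# Hypothetical "composite license" manifest for a single AI release:
# each component carries its own terms.
AI_RELEASE = {
    "source_code":   "Apache-2.0",           # OSI-approved software license
    "model_weights": "OpenRAIL-M",           # carries use restrictions, so not OSI open source
    "training_data": "CC-BY-4.0",            # content license; OSI software licenses don't fit data
    "eval_database": "CDLA-Permissive-2.0",  # a license written specifically for data sharing
}

# Rough check in the spirit of the panel's point: the release as a whole
# only counts as "open source" in the classic sense if every component
# clears the bar. (Small illustrative subset of OSI-approved licenses.)
OSI_APPROVED = {"Apache-2.0", "MIT", "BSD-3-Clause", "MPL-2.0"}

fully_open = all(lic in OSI_APPROVED for lic in AI_RELEASE.values())
print(f"Fully OSI-open release? {fully_open}")  # False: weights and data fall outside
```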
And a license may not even be the right construct, maybe.

Exactly. We don't know what that construct needs to be, and frankly, there are so many other considerations that go into what makes AI open. And that's such a good point, because all of us keep forgetting that even the notion of copyleft was a hack. We all look at it as something given to us on stone tablets, but it was basically a hack done by a very clever group of people to circumvent a certain set of restrictions and make software more available. It's not like it was developed by a government agency. So I feel like we are required to do that same level of clever hacking to come up with the best tools to protect the openness and freedom of AI.

So actually, to that point: I noticed that at this conference there are not too many use cases so far. There is a lot of how people are starting to use AI, but it's very bottom-up. People play with it, they show what it can do, but there are not many real-life use cases. So my question is: have you already run into problems, things that stopped innovation? For example, did your company not allow you to do something you wanted, or not allow people you wanted to join your project? Or is it mostly theoretical at this point? What have been the biggest challenges so far?

As a company, we have to examine every single model and every single data set to see what we can and cannot do with it, sadly, because unlike open source software, where there's codification and you can say, okay, Apache 2.0, that's good to use, it's on the allow list, and it just gets used, here we still have to work with the legal team to examine each and every thing. For example, when Llama came out, people thought it was open source, but when we looked into the terms, it was not to be used by a large company like us, it had other field of use restrictions, the data set wasn't open, et cetera. So we are running into those types of issues; we have to do a lot of manual work, is what I'm saying, without having a better definition there.
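To make that manual triage concrete, here is a minimal sketch, assuming the model is hosted on the Hugging Face Hub, which exposes a declared license as a repo tag. The allow list is a hypothetical internal policy, and a declared tag is no substitute for the legal team actually reading the terms (the Llama example above is exactly why):

```python
# Illustrative sketch: read the *declared* license tag of a Hub-hosted
# model before use. The allow list is a hypothetical internal policy.
from huggingface_hub import model_info

ALLOW_LIST = {"apache-2.0", "mit", "bsd-3-clause"}  # hypothetical policy

def triage(repo_id: str) -> str:
    info = model_info(repo_id)
    # The Hub surfaces the declared license as a "license:<id>" tag.
    declared = {t.split(":", 1)[1] for t in info.tags if t.startswith("license:")}
    if declared & ALLOW_LIST:
        return f"{repo_id}: {sorted(declared)} -> OK under internal policy"
    return f"{repo_id}: {sorted(declared) or 'no license tag'} -> send to legal review"

print(triage("mistralai/Mistral-7B-v0.1"))  # tagged apache-2.0 -> OK
print(triage("bigscience/bloom"))           # RAIL-licensed -> legal review
```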
Actually, interestingly enough, and maybe it's easier for me to talk about this because I'm not a company insider, but at a lot of companies the size of Amazon, people reach out to me and complain a little, saying that companies are really not set up to even have GenAI policies yet. As a developer, am I allowed to use this or that or that other thing? Again, Hugging Face is a great tool, but all this "innovation" in click-through licenses, try to get that past your legal department. It's an amazing zoo, and yet people are doing it every single day, because that's what is required to make your business scale. So we're in this really weird situation where it's damned if you do and damned if you don't, and most people actually opt for damned if you do, because that's how business operates: you never go to your legal department asking for permission, you typically come asking for forgiveness. But that can only go on for so long, I feel.

Yeah, exactly. And as companies, we are looking to the Apache Software Foundation and the Linux Foundation to say, what are the norms in the community? Should we or should we not accept code generated by code assistants? Everybody's trying to figure it out, but at the same time, we're all moving at warp speed. So it's a tough situation, trying to figure out whether you're in the right or not as you're speeding through space. And I think what we're also saying to OSI and others is: gosh, we need the definition yesterday. I'm sure you're hearing that all the time, and we need to help you get there, and work with you to make that happen.

I agree with that. I always keep trying to bring AI back to aerospace and the way we ended up handling that, because I think it's incredibly relevant. The fact of the matter is that, legally, in the US at least, I can strap anything I want, a jet pack, onto my back, and as long as I'm in Class D airspace, that's fine, that's my own liability. I cannot strap a blow dryer to my back and fly over Los Angeles today, because that's Class A airspace. So as we look at this and understand that it is also an innovation space, you've got to parse it into three parts. You've got the data sets that you're pulling from, which may or may not be proprietary or accessible. You've got the models and the weights, which should be at least somewhat transparent, or you just should not touch them. And then you've got your algorithms, whatever you're building on top to create some kind of attention network within that. All three of those have to have their own licenses. And I can tell you for a fact, and this is probably the talk we're going to have to give on a panel next year: how do you want to handle all those databases? Do you want to inculcate any ethics, any other standards, any toxicity rules? That's where you're going to have that opportunity, but don't try to have that discussion around open source licenses; it doesn't fit.

That's a really good point. Access to those components that make up AI is one thing, and then taking into account privacy, security, toxicity, ethics, et cetera, almost has to be another construct, I think, to say how to deal with those things alongside these components. And another thing that I'm learning: open source, by the way, has been very pragmatic and practical, and doesn't say you have to have everything purely open source or purely proprietary. We consistently work in stacks which are partly proprietary and partly open. So in AI too, we cannot be religious and say only open, or all the components have to be open, or all the components have to be closed. I think the LF AI & Data community is trying to say there could be applications for which the data may be closed, applications for which maybe the weights are closed, and so on and so forth. So, gosh, it's a nuanced place.

Yeah, and when we're talking about all these policies and ethics and even licenses, these days there's also the question of different countries and different regulations. I think not only I, but a lot of people have noticed that unlike in software, where the regulators joined the discussion late, in AI it's actually very early: basically nobody in the community knows what it is yet, but there are already a lot of regulatory voices involved. This is also different; we've never really dealt with anything like this before. And here's another question. Open source was by definition global. What happens to it now? Does it become a bunch of smaller markets that develop their own versions of it? Will they merge at some point in the future? Do we need to start working on compatibility right now? I really like the example of space regulations, because one thing they figured out is that whatever goes into space has to be compatible; it has to connect. Even if you build your own infrastructure, your own rules, whatever you want in specific countries, in space it still has to connect. So are there any forums? We already mentioned the OSI initiative that is working on the definition for AI, but are there any other forums that you engage with, that you want to highlight, or do we need something that is currently lacking? What's your view on this angle?

Yeah, I actually feel that in the open source community, that's one of the bits that we took for granted.
Again, the early open source community was very much predicated on this, I would even call it libertarian, idea; in the 90s it really felt like we were just a few days away from abolishing all the governments, and it would be one huge community of human beings on the planet. It really doesn't feel like that today, and I think AI reflects that. Open source grew up in that culture, and AI is growing up in what I would describe, and I think Jim Zemlin also describes it this way, as techno-nationalism. It's not even the governments, because the governments, at least most of the democratic ones, reflect the will of the constituency; it's the constituency that really wants to isolate. And I would actually expect that to be one of the big, big challenges to open AI, because a lot of the pressure will be to own it at the country level or the geopolitical level. How can we resist? I honestly have no idea and no solution, but I feel that that will definitely be a challenge we all have to grapple with.

Yeah. There's what Tanya said, where regulations are happening even before people understand what the technology is, and then there's the fear, uncertainty, and doubt between the two camps. One side says AI is bad, it's going to destroy humanity, and it needs to be locked up. I'm more on the side that we should democratize AI and make it transparent and safe and secure, just as we did with open source. But I am nervous about the fact that there are so many divides: everybody's trying to create their own regulation, everybody's trying to create their own definition. We really need to work together across the world to come up with a common definition and a common way of working with AI. And the sooner we work together on a definition and get everybody to agree to it, the better.

Yeah, it's kind of a turbulent time.

Well, it's an emerging technology. There has never been regulation that looks like this before, because nothing like this has existed before. But again, you've got to come back to the thing that's different about this: we're just about a year away from the Linux Foundation, specifically, really getting that this has hardware in it. The supply chain for AI is totally different from your little CPU software world. All of your cybersecurity concerns are suddenly in the hardware supply chain, not really in the software supply chain, although statistically they're still there too, sorry. It's a completely different world, a completely different game, and that's important to understand. You can have hyper-globalization with a non-physical product; with hardware, you immediately lose that. It is not a loss of globalization, it's just the reality of the way money and regulation move so that you can get the safest chip possible into your hands. It's for your benefit.

Okay, so we're actually running out of time. Any questions from the audience?

First of all, great panel. So basically, I'm a chair of the Generative AI Commons, and Anthony is head of the AI Alliance, so we have to grapple with these questions too. And I just wonder, I think Nithya voiced this: OSI needs help defining this, right? Because the traditional, OG open source culture has never dealt with this massive big data, right?
Big data is an artifact that has enormous influence and brought in an influx of marketing and so forth; it's something the open source community never dealt with. So if we all agree that OSI needs help, how can we formalize that? From community bodies such as the Generative AI Commons, the AI Alliance, the Apache Software Foundation, what would be the mechanism by which we can help OSI come to the definition?

You want to take that? I don't mean to put you on the spot.

So, I'm a consultant who's been hired by OSI to coordinate the co-design process for the definition, and we're specifically working on a license review checklist now. So yes, we do need help. Yes, we are interested in moving more quickly. I feel like I'm probably not the one to give you a definition, but I'd love to connect and have a broader conversation, and with Anthony as well. There is a workshop on exactly that happening this afternoon at 4:30, so if you join it, I think you will understand how to get involved in the process. It's a very transparent and open process, where you can sign up to be part of the mailing list and part of the workshop series. We do need a lot of voices in the process, but we don't want to slow it down either. So if you could, please, yeah.

Well, the last thing we want, sorry, the last thing we want is to make a definition which excludes edge cases that are next year's scalable technology. So we want as many voices in the room as possible to say: do we have an edge case or an emerging sector here? That's it.

And it's definitely not a cabal. Every one of you can join: corporations, individuals, school kids, anybody.

Thank you, capitalist. Absolutely. So thank you to all the speakers for the panel. If anybody comes up with more questions, we can discuss them in the hallway track or at the workshop, so bring your opinions there. Yes, thank you. Thank you for moderating. Yeah, thank you, Daniel. Thank you.