Yeah, hey, good afternoon. My name is Paul Keller. I am co-director of an Amsterdam-based think tank that works on digital policy issues, focusing generally on openness in the digital domain. So I'm here to give a policy presentation at a developers conference. I saw the person before me showing GitHub screenshots, and I hope I'll still have something to offer you as the audience. I want to thank Blender for the opportunity to speak here. I'm going to speak about generative AI and its relationship with copyright and creativity, a topic which has increasingly dominated much of the discussion we at Open Future have had over probably the past one and a half years, though it has really become a policy issue only since the beginning of this year. The talk is based on a blog post I wrote in June of this year with the same title. After I published it, Ton approached me and said this might be a good idea to submit as a talk for the Blender Conference, so here I am. Actually, while I work on policy, I have a bit of a history with Blender. I have known Blender since around 2006 or 2007, when I worked at an organization called Kennisland, which at the time ran a small funding scheme called Digital Pioneers, and we co-funded a number of the Blender open movies that were released here.
You see here a photo from the premiere of Big Buck Bunny in 2008. So I have known the Blender model, the community, and the way the foundation works for quite a while now, and in subsequent jobs I have followed the work of Blender and its community, mainly out of an interest in openness, as the name of our current organization suggests. At that time my main focus in all of this was Creative Commons, and Blender was one of the examples that really ran with the idea of combining openness across the software stack, the way the movies were produced, and the way content was used and shared. For us, the Blender community and the Blender Foundation have always been an enormous inspiration in showing how you can leverage the concept of openness on a number of levels, and it has been very inspiring to see Blender as an organization, as a community, and as software grow into what it is today. That is why we have always had a keen interest in Blender, in seeing how communities produce goods that are available in the open, and in how we can preserve these kinds of characteristics.

Now, 15 years after 2008, we find ourselves in a situation where some fairly significant technological developments, maybe not directly related to animation or 3D rendering software, but generally to content production, have come over us, as it feels for many people who haven't really been involved in the technology, very suddenly, since about last summer.
This is dominating a lot of the discussion about creative production, about the role of creators, the balance between human creators and machine creators, the role of the commons, the future of the internet: a lot of things that we care about. So we see it as our role at Open Future to go into these discussions and try, to some degree, to understand where the policy is going and to influence it, but also to understand community norms and preferences in this entire space. It is therefore very interesting for me to talk to you here about this today. In the remainder of this talk I want to share some of the thinking we have developed around these issues over roughly the last 12 months, and maybe hear a little bit back from you. The format isn't really geared towards that, so this is also an invitation to come to me afterwards and tell me what you think.

Generally, the emergence of these generative AI systems over the last 15 months or so has resulted in a number of fairly strong positions from creators, from people involved in the creative industries, from rights holder organizations, from commercial rights holders, from people in this space, and roughly it falls into two different camps. For the one camp, you see here a quote from an open letter that was sent earlier this year. As people came to understand how this technology works, not in detail, but that a big part of it is that it requires enormous amounts of training data, copyrighted works as training data in the case of large language models or image generators, and that these are generally scraped up from the internet, this camp has very clearly framed it in terms of theft and copyright infringement, and I guess you get the gist from the quote here. On the other side, we have also seen probably
a much smaller contingent, but generally also creators, maybe more technology-forward ones who work more with technology, who have come out in support of this technology. They say this is something we need to incorporate into our artistic practices, that it empowers us to do things we couldn't do before, and they are generally concerned that locking this technology down through copyright, and maybe through other regulation, is in the end detrimental to creation and to creators. So we have these two camps in this discussion. I think the first camp is clearly more dominant in public discourse, but the other position is there as well.

Now I was interested in a little bit of a show of hands. Who of you would identify rather with that first position, the "AI is theft" camp? Can you raise your hands? Okay, and by comparison, the other camp, the "this is a technological opportunity that we should embrace" camp? A little bit more in this room, which maybe isn't surprising, because as I said, we generally see that position coming from people working in the arts with technology, and I would assume that describes many of you.

So the question that we need to answer for ourselves, as a society, as artists, as policymakers, is how we should deal with the fact that machines can now consume all of human creativity,
reassemble it, and spit out synthetic content that resembles the human creativity that was previously produced, content that previously only humans were able to produce. I don't think it's productive, and I don't think many people think it's productive, to question this. This is a technological reality, a technology which in all likelihood will improve over the next couple of years. There are also reasonable doubts: maybe it will run into a wall at some point and not necessarily become artificial general intelligence, as some people hope and other people fear. But that it is technologically possible to reassemble creativity at this large scale and spit it out is a given that we will need to deal with.

At the same time, and I guess this is at the heart of the discussion, many observers feel that this is a situation that copyright should somehow be able to solve. Generally, when we have problems in the creative sector, people look at copyright as the first solution. Now, we would argue, and I don't want to bore you with the details too much, that copyright is entirely unfit to solve this issue. This starts with very simple things. Copyright generally attaches to the concepts of copying and of making something available, and those of you who know how these generative AI models get trained will know that you basically make a copy once, very early in the process. Then the model learns from this copy and there is no more copying going on. The work isn't necessarily in there; abstract concepts of that work are embedded in the model, but every time it spits something out, there is no copying going on in the traditional sense, and the original work is not being made available either. So by the traditional concepts and mechanical hooks of copyright, copyright just doesn't apply at some stage after the model has been trained, and
a model may have been trained in the past and will be used for many, many years going forward.

Another problem, obviously, is that copyright somehow assumes a direct relationship between the original work and the output. But here, in the space of image generators for example, we have models that are trained on multiple billions of works, all feeding in there with roughly the same importance to the model, and it is extremely difficult to trace the output of these systems, where you might feel this should somehow be regulated, back to any of the individual works in there. You might argue that some of the more well-known works have more of an influence, but we also know from the literature that removing individual works from the training data doesn't really have an effect on the output. So this is not about copying. And then, if there are billions of works in there, how do you do what copyright usually does, which is assume some flow of revenue back to the original creators? For a model that is trained on five billion works, how do you identify revenue flows back to individual artists? So copyright is really running into some of its conceptual boundaries with the way this technology works. Sometimes you could even think this technology is almost designed to elegantly run around copyright.

But there is another effect, and here I come back to Naomi Klein. In an article she wrote, she says, and I think this is right, that what we are witnessing is the wealthiest companies in history
unilaterally seizing the sum total of human knowledge that exists in digital, scrapable form and walling it off inside proprietary models that they then exploit. So there is also this problem: we have created this sum total of human knowledge, which some people might call the digital commons, or just the public internet, and suddenly these companies come in, scrape everything up, put it into their proprietary models, and try to benefit from that. The business models aren't very well evolved at the moment, but we can be pretty sure that this will in some form be a lucrative business, which also raises the question of how we can ensure the sustainability of the digital commons as a whole. Those of you who follow tech news a bit will have noticed things like Reddit closing down its API. That was a direct reaction: Reddit realized they don't want machine crawlers coming in that feed the AI models and take everything. The fact that usage of Stack Overflow has declined enormously is a similar thing. You could argue that some of the decisions in the demise of Twitter, closing it for unregistered users for example, have also been an answer to AI crawlers. So we see these resources that were the public internet, that were a shared common good, often privately owned but available to the general public, free to do more or less whatever you wanted with them, suddenly closing down. How we can ensure the sustainability of these commons resources is one of the questions we are asking ourselves.

So we have come up with broadly two lines of argument, of inquiry, where we say, okay, this is what needs to happen.
On the one side, we think creators need the ability to control how their works, and also their artistic identity and their style, things that are traditionally not protected by copyright because copyright applies only to the expression, are being used by commercial AI systems: give them some control. Interestingly, in EU copyright for example, there are already rules that allow this kind of opt-out for commercial AI training while still enabling, for example, scientific research; scientific research remains free. So on the one hand, we need to give creators some kind of agency. Creators need to be able to say: no, I don't want to be part of this; or, I don't want to be part of this for a while, while I develop my own stuff, maybe later. There are lots of different reasons. On the other hand, we need mechanisms to ensure that a portion of the surplus from training AI on humanity's collective creativity flows back to the commons. This is not about AI being trained only on commercially produced works, or on works produced by professional artists whom you could identify via collecting societies and give money back to.
This is also everything on Wikipedia. This is everything all of us write on the internet. This is probably a lot of the output from the Blender community that is shared online. That material is being trained on without these people having representation as professional artists, where you would have some mechanism for giving back to artists. You hear the professional artists say this should be licensed, that we need to get money out of this, but the issue is broader than just the use of professional creativity by people who are members of collecting societies.

So in the last five minutes I want to walk you through a slightly more elaborate version of these two general principles. This is something we have started to develop with communities on a global level. We discussed it at a Creative Commons summit a couple of weeks ago, where we came up with these principles, and we are now testing them with various communities to see how they work and how people feel about them. We have come up with seven of these principles.

The first one is that, whatever we do, even if there is harm being done to creators and to the commons, we need to ensure as a prerequisite that people have the ability to study and analyze existing works in order to create new works, either by themselves or through machines. This is how humans learn. This is how probably all of you have learned to use software, build software, create, and so on.
And this is a fundamental principle that has always been outside the scope of, for example, copyright. If you have a book, you can do whatever you want with it. You can read it and ignore it, you can put it on your shelf, but you learn from it, and no law governs what you can do with the knowledge you have access to.

The second principle is to ensure that all copyrighted works can be used for training public interest and non-commercial AI systems. We are afraid of creating a situation where every type of use depends on licensing payments, because that benefits those large corporations who have the resources to license everything, who may also license everything on an exclusive basis and build themselves a massive data advantage over everybody else. So research and non-commercial uses need to be freely possible for everyone, and we really think that the problem we need to address, at least, is the commercial uses of these systems. That's the second principle.

The third principle is that we need to ensure that creators have the right to opt out from the use of their works for training commercial AI systems. This is essentially the one I mentioned earlier. We think legal systems need to give creators the right to say no to this, but only in this limited context of commercial applications. You shouldn't be able to say no to people experimenting, and you shouldn't be able to say no to scientific research, because this technology is not only used for large language models or image generators; scientific method, in medical science and elsewhere, increasingly builds on the training of AI systems, and we don't want to cut that off. This is again something which the current EU legal framework, for example, already reflects.
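To make this opt-out concrete: under the EU framework, a reservation against commercial text-and-data mining has to be expressed in machine-readable form, and one convention that has emerged in practice is for websites to disallow known AI training crawlers in their robots.txt. The sketch below checks such a policy with Python's standard-library `robotparser`. GPTBot and CCBot are real crawler user-agent names (OpenAI and Common Crawl respectively); the site and the exact policy are hypothetical, and observance by crawlers is voluntary, so treat this as an illustration of the convention rather than an enforcement mechanism.

```python
from urllib import robotparser

# Hypothetical robots.txt: opt out of known AI training crawlers
# while leaving the site open to everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

parser = robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# AI training crawlers are asked to stay away...
print(parser.can_fetch("GPTBot", "https://example.org/gallery/"))        # False
# ...while ordinary visitors and search engines remain welcome.
print(parser.can_fetch("SomeOtherBot", "https://example.org/gallery/"))  # True
```

The asymmetry is the whole point of the principle: the same work stays available for browsing, research, and experimentation, while commercial training use can be refused.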
There is a nice distinction there between using copyrighted works for scientific research and, on the other hand, using them for training all other types of systems.

The fourth principle is a bit of a global one. This is a global conversation, and there are concerns among people in the Global South, in the majority world, about traditional, community-based knowledge, which isn't necessarily well framed in terms of copyright. This needs to be protected from people coming in and training machines on traditional knowledge resources. The idea is that these types of knowledge are often stewarded by communities in some form, or by specific community stewards, and we think this needs additional protection as well.

The fifth principle is that everybody, so not only creators, although this is probably again most relevant for creators, has the right to exclude personal data, their likeness, their identity, their artistic style, from being mimicked, from being trained on in these systems. The way an actor acts, the way a singer sings: these things are not necessarily protected by our current notions of copyright, but with systems that can mimic human creativity they suddenly become transferable, and people need to have the ability to say, we don't want this. It is also obviously key that personal data shouldn't just be thrown into these systems.

Then, sixth, we also need to ensure that the economic benefits of training AI systems on publicly available data are shared back with the commons. This is the other point I mentioned at the top. We really think we need some form of levy or compensation system that applies to commercial systems trained on broadly available open data and that gives something back to the commons. Now, it's very difficult to imagine
how you give something back to the abstract commons, to the open internet as a community, because it obviously doesn't have a bank account. So we tie this a little bit to our seventh and final principle, which is that we also need to invest in public compute infrastructure and in public data sets that are governed as commons. For a lot of universities, for a lot of smaller companies, for a lot of individuals experimenting with this, these are real bottlenecks: access to capable compute, and access to well-structured data sets that are cleared of legal doubts, and maybe of personal data. We see, to some degree, a role for the public sector, for public money, to step in here and build some of these infrastructures, so that we do not become increasingly dependent on a few big tech companies who can afford to build proprietary compute infrastructures and proprietary data sets that they use for themselves. We think this is a fairly fundamental technology, and access to the resources to work with it and build on top of it needs to be guaranteed, as public digital infrastructure if you will. You can tie this to the previous principle: a tax or levy on commercial systems trained on public data could actually fund some of these investments.

So this is where we are at the moment with answering some of these questions. Again, I'd be curious to hear what you think about this after the talk. But for now, that's it. So thank you.