Welcome, everyone, to the panel "Image credits in Wikipedia: can we do better?" I'm going to briefly introduce the four speakers who will take part in the panel, and then each speaker can say a bit more about themselves if they want. We are going to talk with Isaac, who started the whole conversation on Facebook, thanks to which we have this session; Asaf Bartov, a long-time Wikipedian; and Dominic, who is going to talk about his work, which is also related to this problem of image crediting. And I am a Ukrainian Wikipedian and also an illustrator of the Ukrainian Wikipedia, and I share these concerns. The outline of the panel: we are going to present slides, talk about the issues and problems that we see and about possible solutions, and talk about a pilot project that could be started as a result of this talk. We are also going to take questions or discussion points if anyone from the audience would like to add something. With that, I'm going to give the floor to Isaac.

Thank you very much, Nat, and good morning from Nigeria. Thank you again for providing the additional context that this whole discussion actually started on the Wikipedia Weekly Facebook page. I'm really excited to join this conversation today. Let me start by saying that most users of images from Wikipedia believe it is safe and that they do not have to worry about any risk of criminal liability or legal jeopardy. But in the real sense, are images on Wikipedia really free and safe for users? Yes, they are free and safe, provided the user can comply with the licensing conditions of the image. But imagine a scenario in which an academic or a company, for instance, reuses an image from Wikipedia and credits it to Wikipedia because they found the image on Wikipedia, and a week later receives a letter from a lawyer saying that they have infringed the copyright of the image they used from Wikipedia.
The academic or company is told that the image belongs to a Nigerian photographer called Isaac Olatunde and is asked to pay damages and legal fees. This is the kind of risk we are prone to subjecting users of images on Wikipedia to. So let's look at our current practice in how we provide this information to users. Our current practice provides the licensing information on an external website, mostly Wikimedia Commons, with the insertion of a clickable link to the original file page that contains all the relevant information. The current practice also assumes that re-users will click on the invisible link to find the image license. The question is: is our current practice really helpful enough for our re-users? Do they have sufficient information, and is this current practice sufficient for our users to satisfy the licensing requirements of the images they use from Wikipedia? We just provide that link and expect them to click on it. We don't tell them that they must click on the link to find that information; we just put that invisible link there. But that comes with its own problems, so we need to look at the common problems associated with the current practice. One of them is misattribution by re-users, or no attribution at all, because most of the time when re-users take this content from Wikipedia, they just credit it to Wikipedia. Sometimes they don't even provide attribution. And by extension, this puts re-users at risk of criminal liability.

This is production, one second. Isaac, can you hear us? He can't hear. All right, I put him into the waiting room, and we are actually starting now. Sorry, there was a discrepancy here. So we start from the beginning? Yes, I'll let you know when. Apologies. Hello, can you hear me?
It looks like the system locked me out. Isaac, can you hear us? You were not able to hear us before. Yes, I can hear you. We tried to call you, but you didn't hear us for some reason. We need to start from the beginning, sorry. Okay, good to go. I'll start again, and I'll tell you when you can come in. Sorry, everyone in the room; we'll have to do the beginning again.

Welcome, everyone, to the session "Image credits in Wikipedia: can we do better?" We want to talk a bit about our own practices of attributing, or not attributing, the media and images used in Wikipedia articles. This panel is going to have four speakers: Isaac, Asaf Bartov, Dominic and myself. We all come from different Wikipedias and different projects, and we have different backgrounds, but the topic of this panel is relevant for all of us. The idea of this talk started with a post by Isaac in the Wikipedia Weekly group on Facebook. We are going to talk about our practices, about problems, and about possible ways to resolve them. There is also a way to join a pilot project if anyone from the audience is interested. And we are going to have a Q&A session. You can post your questions in the chat; we will take them along the way if we see them and they fit, or later in the Q&A session. With that, I'm going to give the floor to Isaac.

Thank you very much, Nat. I'm happy to be here. And thanks for providing the additional context that this conversation started on Wikipedia Weekly on Facebook.
So I would like to say that most re-users of images from Wikipedia strongly believe it's safe, and that they do not have to worry about any risk of criminal liability or legal jeopardy for using any of that content from Wikipedia. But are images from Wikipedia actually free and safe for re-users? The answer is yes. They are free and safe, provided the re-users can comply with the licensing conditions of the image. And one of the most important requirements is that the copyright holder, which is the photographer in the case of a photograph, must be credited. But imagine a scenario in which a business or an academic institution reuses an image from Wikipedia and credits the image to Wikipedia simply because they found it on Wikipedia, and a few days later receives a letter from a lawyer saying that they have infringed the copyright of the image they reused. The lawyer claims the image belongs to one Isaac Olatunde, a Nigerian photographer, and not to Wikipedia, and demands that the academic institution or the company pay damages and legal fees. This is a classic example of the risk we are subjecting re-users of images from Wikipedia to. That such cases don't get reported in the media does not mean they don't happen; some of these things happen almost all the time. But is our current practice really helpful enough for our re-users to satisfy this legal requirement? Let's review our current practice. Our current practice provides licensing information on an external website, usually Wikimedia Commons, and inserts an almost invisible clickable link to the original file page containing all the relevant information. This same practice assumes that re-users will click on the invisible link to find the image license and every other piece of information pertaining to that image. The question is: is this current practice sufficient?
Is it helpful enough? Is it enough to guide re-users on how to reuse content from Wikipedia? It is not entirely bad to present it this way, but it comes with its own problems. So let's look at some of the problems associated with this sort of guidance. One of the common problems is misattribution by re-users, or no attribution at all. Misattribution in the sense that many people who reuse content from Wikipedia will credit that content to Wikipedia itself, or provide no attribution at all, because they cannot see who the copyright holder is right from the page. By extension, this puts these re-users at risk of criminal liability or legal jeopardy. And this has serious implications for our project, for the credibility and reliability of information on Wikipedia. But why should we even care about crediting the copyright holder in Wikipedia? Why should it matter? Why should we spend volunteers' time trying to provide credit? First of all, it's a legal requirement that we must satisfy, and we must strive as much as possible to satisfy it. There is no excuse: that volunteers' time would be wasted is not a valid reason to circumvent that requirement. So why should we care? One, it minimizes, or maybe even significantly reduces, misattribution: if you make the credit line as clear and visible as possible, people will hardly misattribute the image to Wikipedia, and they will hardly be able to provide no credit line at all. They will see who the copyright holder is, and they will be able to provide appropriate credit. And this way we protect copyright holders' works from misuse.
This gives them confidence in our project and the impression that we are reliable and can actually protect their interests, and it may result in them releasing more content for use. It also protects users from criminal liability or legal jeopardy after reuse. Remember the scenario I painted? Imagine that I receive a letter from a lawyer saying that I violated copyright because I misattributed an image. It is not my fault, because I don't have a sound knowledge of how copyright works, I believed the content of Wikipedia had been released under a free license, and I still get that kind of letter. It could change my orientation, my thoughts, my beliefs about Wikipedia's reliability and credibility. So all of this has serious implications for our project. What I'm saying, in the net sense, is that when we care about all of this, the picture we paint out there is that Wikipedia can be trusted, that it is credible and can be considered a reliable source of information. So what should we do to improve our current practices? How should we go about it? Who are the people that should be involved? Are there any steps that should be taken as an emergency intervention? What do we do with the ones that have been done wrongly in the past? All of this is central to the discussion we have today. So I'm going to hand over to my colleague, Nat, to take it from here. Nat, over to you. Thank you.

Thank you, Isaac. I just wanted to remind everyone about the requirement of the license, the most common license that we now use for media: that we must give appropriate credit. Now consider this invisible link, where we expect people to somehow guess and know that all the information about the author is going to be provided on another website.
Or sometimes not, if we are talking about non-free media. This is, I would argue, not appropriate credit to the authors. So even though Creative Commons explicitly says that our practice is legal, there are real issues with it, which Isaac referred to. We lose a way to advertise free licenses, even though the projects rely on them. We do not clearly mark non-free media, misguiding people: inside the articles, they can just take screenshots and be unaware that the pictures inside the articles can have different licenses and authors. Also, we ourselves do not set a good example: we expect others to give credit for what they reuse when we do not give appropriate credit ourselves. The English Wikipedia has already discussed this in the past, and I will go through their arguments for not including image credits, because they are relevant to this discussion; you can find the discussion at this link. For example, one of the arguments was that there is no need to clutter articles with this information: credit is already provided for the majority of images on the file description page, which includes authorship, licensing and more. That is partially true, but again we come back to this link being invisible and unclear. Meanwhile, we do clutter articles with templates explaining to people that this or that text fragment was taken from this or that source, even if that source is under a free license or even in the public domain. And that raises the question: why don't we have a similar notice for the reader, included in the article as a template, saying that the images used in this article are in the public domain, published under a free license, or non-free media used for educational purposes, and that you can get this information from the file description pages if you click on each file in the article?
Or maybe we should at least have it in the terms at the bottom of the page, since our projects are not exclusively text: the terms do mention that text is available under a free license, but there is no mention of the images used. So we do not expect people to work out for themselves where text fragments come from. We do not expect people to just believe that the text was taken under a free license. But we do expect people to somehow click on an invisible link, and we also believe that we would be cluttering the articles if we gave a citation, a mention or an explanation for the images. So a mention in the terms would be warranted, taking into account that we do that for the text. But we also mention the source of the text in the article itself, not only in the terms, so it seems that a proper citation or attribution for every image used can also be warranted. We can easily have an article with 400 footnotes about every piece of text information used; we would have all those sources, and we would also have general sources and further reading, just because we want people to understand and read more. And with all of that, we are still going to have a template saying that the included text comes from a source under a free license or in the public domain, because we do not expect people just to guess that, and we are also going to have that mentioned on the page in the terms of use. We also already know that media helps articles and articles help promote media. What I mean is that media files used in an article help the reader understand the article better, as illustrations for the particular portions of the article that need them. And Wikipedia articles are viewed by a lot of people, so the images in them are going to be reused more, and some media outlets are going to use pictures that are prominently featured in the articles if they want to illustrate some points. One of the other arguments on the English Wikipedia in that discussion was that maintaining these in-article credits would be a significant maintenance burden on other editors.
But there are ways to have it automated, so let's talk about that. With this I'm going to go to Dominic to give us some background and practical ways to automate showing this information in the articles, based on his work. Dominic?

Great. Thanks so much, and thanks for inviting me. You can go to the next slide. First I just want to talk a little bit about how I came to this issue. Over the last two years I've been working with the Digital Public Library of America, which is the national cultural heritage aggregator in the United States. Since we launched our program in 2020, we've uploaded about 3 million files to Wikimedia Commons from over 200 individual contributing institutions, becoming the largest contributor to Wikimedia Commons. Next slide. You can find our category there on Commons. We're doing this essentially to provide our member institutions with a simple activity that they can easily be trained to perform, putting their images into Wikipedia articles, for which the outcome can be easily measured. This is an example of an image from the Indiana State Library in the United States that was uploaded by DPLA and is used in a Wikipedia article. In one case, a single institution staff member working to add images to over 100 articles over the course of a few months generated 45 million page views for their institution, which is a huge impact. Next slide. In addition to the uploads themselves, we actually contribute metadata in the form of structured data on Commons now. This is just showing what our uploads actually look like in practice. For every image we add about 10 or more structured data statements to our uploads. And we can do that because we have all the institutions' descriptive metadata; that's what we're using to do our uploads.
So we're adding statements like title, creator, copyright, description, all those sorts of things, and this has also allowed us to regularly sync and update the metadata for all of our past uploads whenever an institution makes changes to its data. So it's not just a static snapshot of the uploads as they looked when we did them. Next slide.

Just briefly, I wanted to say that the success of our program at DPLA is really based on these two philosophies. First, we need to get institutions on board with participating, and that means tasks that busy professionals can perform successfully with minimal new skills; if the cost of getting involved is too high, they won't be interested. Part of buy-in is that cultural institutions want to know that we're going to be responsible stewards of their information, and that means accurately reflecting the metadata that they invested their own expertise and intellectual labor in, as well as actually attributing their institution for the work that they've done and the objects they hold in their collections that we're uploading and displaying on Wikipedia.
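
As a rough illustration of why statements like title, creator and copyright matter for crediting, the sketch below builds a one-line credit from such statements. It is not DPLA's tooling or any Commons API: the `ImageStatements` class, its field names, the sample values and the credit layout are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class ImageStatements:
    """Hypothetical stand-in for a file's structured data statements."""
    title: str
    creator: str
    institution: str
    license_name: str


def credit_line(s: ImageStatements) -> str:
    """Format a one-line credit; the layout is illustrative, not a standard."""
    creator = f"by {s.creator}" if s.creator else "creator unknown"
    parts = [f'"{s.title}"', creator, s.institution, s.license_name]
    # Skip any empty statement so the credit never shows dangling commas.
    return ", ".join(p for p in parts if p)


# Made-up sample values, used only to show the output shape.
example = ImageStatements(
    title="Main Street, 1910",
    creator="Jane Doe",
    institution="Example State Library",
    license_name="CC BY-SA 4.0",
)
print(credit_line(example))
```

Because the credit is derived from the statements rather than typed by hand, regenerating it after an institution corrects its metadata would pick up the change, which is the property the regular sync described above relies on.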
The second part is that once we have institutions that are interested in contributing to Wikimedia projects, we want to make sure that they're actually successful and impactful. That's why we designed the program the way we did: we've solved the problem of bulk upload centrally, so the institutions don't have to figure that out, and it's built around this idea of a small, discrete, easily trainable task of adding an image to an article using VisualEditor and tools on Commons like CropTool. I can train someone to do the task of adding an image to an article, not writing articles on Wikipedia, in a few minutes. Next slide.

So I'm coming to this with a goal of both improving the integrity of the data in Wikipedia and improving the usability of our image workflow for editors, and I think that with this proposal we can do both of those things. Next slide.

I want to talk a little bit about the landscape on English Wikipedia. I'm not going to be able to go over everything in detail, but here are a few links to some existing proposals. Like Nat mentioned, this is in the perennial proposals section on English Wikipedia. But just to show what the most relevant policy actually is: this is the Manual of Style on the English Wikipedia, which explicitly says not to credit the image author or copyright holder in the article unless relevant, whatever that means; it's unclear. Next slide.

This contrasts pretty sharply with what cultural institutions and academic institutions especially are used to. This is taken from a book off my bookshelf, and it is what you would typically find in an academic work, where a historical image is itself considered informational content, just like the text, and is cited in the text as such. Next slide, please.

I can't go through all of these; I just wanted to give a few links that people can look up. Oh, sorry, if you can go back one, just to give people a few seconds if they
want to look those up. There are a few different places where we have mock-ups on the wiki. I might regret this, but this is the one article I'm aware of on English Wikipedia that has an image citation in the form of a footnote. If you want to look at it live, it's the top link here, Charles Robinson Rockwood, unless someone comes along and removes that. This is already possible to accomplish in Wikipedia by using the existing ref tag: you can use the group function to group references, so as long as you put the citation inside the caption of an image and add a separate group for images, there's nothing technically stopping anyone from doing this right now. Next slide, please.

This is just a screenshot of that. It shows an article with an image, and there's the footnote, which goes to the references section, and there's a separate group within the references section for images. Next slide, please.

What I also want to show is what we've been working on at DPLA, which has essentially been enabled by the fact that we've added structured data to all of our images, and which will hopefully become increasingly possible through more adoption of structured data for all images on Wikimedia Commons. This is something that's possible right now only on Commons, as a mock-up, but what I created is this template, which you can go look at live: the template embed DPLA on Wikimedia Commons. It's coded in such a way that the template has only the name of the file as a parameter, and all of the data for a caption using citation-type information can be generated automatically. All of the things that are in the caption in the screenshot here are already in the structured data of the image: the title, the creator, the institution it comes from. This is just a hypothetical citation format, and it could be formatted however you like, but the idea is that a user could, just by
including the image, have all the data necessary for a caption or citation auto-populated, in the same way that a user expects to be able to add a citation in the text by just providing an ISBN or something and having all the rest auto-populated. Next slide.

We have some issues to resolve to make that technical part happen. First, it's not actually possible to do this right now within English Wikipedia, for example, because structured data on Commons can't currently be pulled into other projects using Lua in the same way that Wikidata can. What we would expect to see some day is that once structured data on Commons statements can travel to other wikis, we could build templates like this one. Right now it's just a mock-up on Commons, although we could do something that pulls the data in the same way that Citoid does, as static text rather than being dynamically generated from the Wikimedia Commons data. There would still need to be some infrastructure built around that. The other thing, which would of course be the most important part, is actual support from the Wikipedia community, so that if people started adding these willy-nilly they wouldn't get reverted. That's part of why I want to have these community discussions. So that's it for me. If you're interested in learning a little more about the DPLA projects and what we're working on with structured data on Commons, here are some links to our grants, and thank you to the Wikimedia Foundation for funding them.

Thank you, Dominic. Sorry, I was muted. So overall, what can be done about this? We can change how we do it; after all, it is our own practice. I'm also going to show you a few mock-ups. They are in Ukrainian, but you will be able to see that there is a picture in the first article where we prominently say that this picture is a picture by, and then
the name of the author and also the license. In a gallery it can also look like that. For some articles it's really important to include not just a lot of pictures but pictures that have meaning for the article, maybe the outer or inner appearance of the object, and then it's important not to include the picture randomly but to include it because it illustrates some point. It can also be done through a list of illustrations used in the article, which I also showed, like the upper one. It can also be done through references, which was also mentioned in the chat. And it can be done with a tiny visible link for photo credits near the pictures: if we don't want to go further than that, we can think about including a photo credits link that goes to the original file. I would argue that this might be useful for some people, because some of the pictures on Commons do not actually have one author; somebody can change, crop or otherwise edit the file, and then only the file page itself can show you all the contributors to the file. Another argument in the same English Wikipedia discussion was that people could spam articles, including a lot of images in an attempt to use Wikipedia for free advertisement. But Wikipedia is not an indiscriminate collection of information, and editors need to include only those images that matter anyway. This proposal does not change that; if anything, it makes people go an extra mile to prove that a specific media file is needed. With that, I'm going to pass the floor to Asaf.

Thank you. What I'd like to add to the previous speakers is to re-emphasize the point that there are at least two separate issues in our current practice: the missing license type and the missing attribution. These are at least ostensibly separate issues, and they are in principle separately solvable. We could, for example, add the explicit mention of the license type while still sidestepping
all the issues related to usernames. For example, one argument that has been raised, and has not been mentioned here today yet, is that some people might have offensive usernames, the mention of which in an article may be distracting or offensive. So even though the offensive username issue is also in principle solvable, we can completely sidestep it and still at least solve the problem of mentioning an individual piece of media's license type. So I'm pointing out that these are in principle separately solvable. And like I said, the offensive username argument is also solvable: we could decide to have just a media credits link under each image, in other words making the link explicit rather than implicit on the image, and still hide the username. Or we could determine that there is some acceptable username policy, and either rename non-compliant usernames or include only acceptable usernames on the page and hide unacceptable usernames behind the link. I'm just bringing it up as a sketch of an answer, to make the point that we could solve this any number of ways. I think the goal of this panel is to convince people who are yet to be convinced that there is a problem, and to encourage people to commit to solving it. If we are convinced, we can go ahead and brainstorm the actual solutions. But what I'd like to move the conversation forward to is this: as has been mentioned, this has been a perennial proposal. It has been discussed in the past, and we tend to discuss ourselves into exhaustion seeking consensus, and that's why this is a perennial proposal. But I suggest to you that we could cut through this exhaustion by running a well-designed experiment. Next slide, please. An experiment could actually help us get the evidence. For example, one of the claims is that this missing attribution is causing re-users to not properly attribute images, and that would be easier to prove if we actually conduct
some experiment and show that providing proper attribution does in fact help re-users attribute. So such an experiment can really help determine whether, and how much of, a problem this is. Next slide, please. I suggest to you that an experiment would help us make progress before the end times, when we manage to convince absolutely everybody on all Wikipedias that this change can happen. We could design an experiment, and I'm not pretending that we have already designed it; work needs to be done to create a good, valid experiment to test the hypothesis that changing our image crediting practices would be a good change. We could, for example, identify some limited scope, maybe a particular WikiProject on a large wiki, or a small wiki, or just a set of articles on which we experiment. We may even have natural experiments in articles or pages where there has already been this kind of mock-up, whether or not that's a large enough sample. But the point is we could create a sample, and a control group of course, then implement explicit attribution and explicit licensing on those articles, and then measure several things. We could measure awareness of the licensing terms among the people reading the articles; we could do that with a little invitation at the top of the articles to fill out a form or survey. And we could also try to measure the reuse of those images in the wild, using an image search to find out whether the images in the sample group are in fact better attributed than images from the control group. Again, this is just a sketch of an experiment, and I think we would benefit from having professional researchers co-design it. But my point is that an experiment could help us break through the cycle of "is it a problem, is it not a problem, is there a solution to every edge case or not". An experiment can help us move forward, and I'd like to encourage those of you who agree that this is a problem to join us in thinking about how to move forward with an
experiment. Next slide. So if you're interested in thinking about it with us, please mention your interest on the talk page of our Wikimania talk, and we will get in touch with you and see how you can help with your community, with your WikiProject, wherever you are. Maybe you can help with mock-ups, maybe you can help with translation. And we will try to report on this, probably next year, maybe at the next Wikimania, if we're able to actually run some kind of experiment.

Thank you, Asaf. We actually started late, so we don't have a chance now to have a Q&A session. I've saved the chat and the things that you discussed here, so we will be able to follow up with you if we can find you. If you put your names in the Etherpad, we will be able to find you even if your real name does not correspond to your username. I have also put here useful links that relate to this discussion, and of course there will be a link to this presentation from our submission and from the Etherpad. And I want to say thank you, in our four languages, for your time and for the interesting ideas and discussion points that you mentioned in the chat. Thank you, and thank you to the speakers and the support team.
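
The comparison at the heart of the experiment sketched in this panel, whether reused images from the sample group are better attributed than those from the control group, could be analyzed along the following lines. This is only a minimal sketch under assumptions: the panel did not specify any statistical method, a pooled two-proportion z-test is just one common choice, and the function names and the counts in the example are hypothetical.

```python
import math


def attribution_rate(correct: int, total: int) -> float:
    """Share of found reuses that credit the copyright holder correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


def two_proportion_z(correct_a: int, total_a: int,
                     correct_b: int, total_b: int) -> float:
    """Pooled two-proportion z statistic for group A versus group B."""
    p_a = attribution_rate(correct_a, total_a)
    p_b = attribution_rate(correct_b, total_b)
    pooled = (correct_a + correct_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        raise ValueError("no variation in outcomes; z is undefined")
    return (p_a - p_b) / se


# Hypothetical counts: reuses found by image search, and how many of them
# were correctly attributed, for explicit-credit articles and for controls.
z = two_proportion_z(60, 100, 40, 100)
print(f"z = {z:.2f}")
```

A large positive value would suggest that explicit credits go along with better attribution by re-users. Sample sizes, how "correct attribution" is judged, and how articles are assigned to the two groups are exactly the design questions the panel proposes to work out with professional researchers.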