Hello, hello everybody at Wikimania. Welcome to this session about decolonizing structured data. I am Mariana, I am with Zim, Kelly and Erika, and let me share my screen, full screen. So welcome to this conversation. This is an invitation from Whose Knowledge?, Wiki Movimento Brasil and Wikimedia Deutschland to join a conversation that started last year. I am Mariana Fosati, the Decolonizing Wikimedia program coordinator at Whose Knowledge?. I am with Erika Aselini, community manager at Wiki Movimento Brasil; Zim Pietersmann, project manager for Knowledge Justice at Wikimedia Deutschland; and Kelly Foster, my colleague at Whose Knowledge?, the program coordinator for Who's Digital Archives. This conversation started in October 2021, when over 40 participants from around the world began talking about decolonizing the internet's structured data, as part of a broader conversation about decolonizing the internet. These 40 participants were mostly female-identifying, in or from the Global South. They were mostly Indigenous, Black, or people of color. They came together at the pre-conference of WikidataCon, which last year was held in Brazil by Wiki Movimento Brasil. This pre-conference was organized jointly by Whose Knowledge?, Wikimedia Deutschland and Wiki Movimento Brasil. But to start, or restart, this conversation with you at Wikimania, it's important to talk a bit about what structured data means, especially for newcomers, because this Wikimania is especially focused on welcoming newcomers. So let me just say, without too much technical detail, that structured data are pieces of information that can be easily read and understood not only by humans but by machines, especially by machines.
Humans give the dictionaries, the vocabularies, the ontologies to machines so that they can understand us: understand the meaning of the objects, events, places, people and relationships that are part of a structured data system. These systems are used in countless apps, tools and platforms on the internet that are built upon structured data systems, from Google to Wikidata. For instance, think of the infoboxes that you see when you do a search on Google: the Google Knowledge Graph, or Google Knowledge Panel, is made using data from different sources, structured in a certain way and ready to answer your questions when you search online. Wikidata and the Wikimedia projects are important parts of the kind of knowledge panels that you see when you search on Google, but there are other ways of using structured data too. For instance, just yesterday at Wikimania we heard about a database in Wikibase that can be used for creating a chatbot in the Quechua language for Quechua speakers. So there are a lot of different ways in which structured data can be used, and one way in which you can see structured data on Wikimedia projects is Structured Data on Commons. I don't know if everybody knows about this project, maybe the newcomers don't, but it is about how the millions of media files we have stored in Wikimedia Commons can be connected with concepts that are represented as data on Wikidata, in a multilingual way.
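As a rough illustration of the idea just described, the sketch below models a media file linked to a concept that carries labels in several languages, and a search that works in any of them. It is a toy, in-memory model and not the Wikidata or Commons API: the class names, the `search` helper, the file title and the item identifier `Q999999999` are all hypothetical placeholders chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Item:
    """A Wikidata-style concept: one language-neutral ID, many labels."""
    qid: str  # hypothetical placeholder ID, not a real Wikidata item
    labels: dict[str, str] = field(default_factory=dict)

    def label(self, lang: str, fallback: str = "en") -> str:
        # Prefer the requested language, then the fallback, then the bare ID.
        return self.labels.get(lang) or self.labels.get(fallback, self.qid)


@dataclass
class MediaFile:
    """A Commons-style file that points at the concepts it shows."""
    title: str
    depicts: list[str] = field(default_factory=list)  # list of item IDs


def search(files: list[MediaFile], items: dict[str, Item],
           query: str, lang: str) -> list[str]:
    """Return titles of files depicting an item labelled `query` in `lang`."""
    wanted = query.casefold()
    hits = []
    for media in files:
        for qid in media.depicts:
            item = items.get(qid)
            if item and item.labels.get(lang, "").casefold() == wanted:
                hits.append(media.title)
                break  # one matching concept is enough for this file
    return hits


# Illustrative data only: one concept, three labels, one file.
BOOK = Item("Q999999999", {"en": "book", "es": "libro", "pt": "livro"})
ITEMS = {BOOK.qid: BOOK}
FILES = [MediaFile("File:Example reading room.jpg", depicts=[BOOK.qid])]

if __name__ == "__main__":
    print(search(FILES, ITEMS, "libro", "es"))
    print(search(FILES, ITEMS, "livro", "pt"))
```

The point of the sketch is that the file stores only the language-neutral identifier, so adding a label in one more language makes every linked file findable in that language without touching the files themselves.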
This is possible because every item, every entity, every object that you can find on Wikidata, which is a huge database, is identified by a code, the letter Q followed by a number, that identifies this unique entity, this unique object, like a book or a light bulb or a computer or even a feminist strike, any event that you can find on Wikidata. It also works in a multilingual way, because every entity on Wikidata can be described in many different languages. Wikidata is a collaborative database, Commons is a collaborative media project, and both together make it easy for people to, for instance, search for images in different languages, and they let the search on Wikimedia Commons find images in a more contextualised way, because the images are related to data in different languages. This is a way in which you can experience structured data directly, and you can also contribute by adding descriptions and linking the images with the data using Wikimedia Commons. But beyond the technical, this session is about the politics of epistemology in these technical developments. So why does structured data need knowledge justice?
That's our question today, and there are different reasons, at least these four important ones. Structured data is at the core of how the internet works nowadays. As I said before, you can find structured data when you search online, when you ask questions to a voice assistant, or when you look for a translation. You can find structured data everywhere, especially because artificial intelligence is deeply based on datasets that are structured in specific ways. But these systems are far from neutral, because, and this is something that we at Whose Knowledge? bring to the conversation all the time, the knowledge that you can find online, on which structured data systems are based and which feeds these systems, is not mainly created for and by women, people of color, LGBTQIA+ folks, Indigenous communities and peoples in and from the Global South. On the contrary, we are the ones who are most impacted by how structured data is used, or even abused. So there is an urgent need for centering those who are often marginalized in the building, the development, the processing, and the uses and abuses of structured data online.

To have this deeply political conversation at the WikidataCon pre-conference, we based the conversation on the guiding principles of love, respect and solidarity, because we wanted participants to be aware of their positionalities and privileges and to be able to be their full, multiple selves during the whole session. That's why we grounded the session in these principles of love, respect and solidarity. We also made a commitment to the privacy, safety and well-being of the participants, and to language justice. In this session we had simultaneous interpretation between Spanish, Portuguese and English, because the pre-conference was very focused on Latin America, since WikidataCon was held in Latin America last year, but also because we wanted to bring at least a bit of language justice. This is so basic, because language is
a proxy for knowledge, especially online, so that's why we were paying special attention to that. So, with those principles in mind and in practice, what did we do and how did we do it in that session? We organized the session in three parts: a panel called Perspectives and Provocations, with the words of special guests and their deep provocations; then a small-group session called Imaginations and Implementations, with all the participants split into smaller groups; and finally a plenary called Listening and Learning, where each group reported back to the plenary. That was the organization of the whole session.

The provocations of the panelists were about some specific questions. What does structured data mean, and why is it important that we talk about it? What does it mean to have multiple knowledge frames, multiple epistemic frames or epistemologies, at the heart of structured data? How can we reimagine structured data, especially from a feminist and anti-colonial lens? What is one thing you would like to see done differently in structured data today that will help us come to that space of emancipation and liberation? I want to invite you to keep these provocations in mind and share your thoughts in the chat, especially by the end of the session today, and especially on the last question.

After these provocations and the small-group work, key insights emerged from the conversation, so let me share a bit about these topics. One of the topics was access to and control of data: who controls the data, who governs it, and who is excluded from data governance online. We identified here the need to grant full access to knowledge and tools, especially to the majority of the world that is marginalized from data governance, so that they can participate in governance, create and develop structured data systems, and use, reuse and process data. But also, and this is very important, we identified here a conflict between the actors that use structured data in commercial ways
and for profit, and some human rights issues that can be in conflict with such profitable uses of structured data online. Another topic was agency and engagement. Here people talked about how there should be no excuses for people not to engage with structured data, bringing the diversity of contexts and epistemologies that exist; if we create knowledge resources and a full ecosystem based on this diversity, there is no excuse not to engage. But it is also important to acknowledge that some specific knowledge, especially in Indigenous communities, maybe shouldn't be online if the community doesn't give its consent, and that there is a right to refuse datafication, especially for Indigenous communities.

Another topic was distributed data as a smarter solution, in opposition to big data: maybe smaller, connected datasets governed by marginalized communities are a much better solution than a big centralized database. This is related to another topic, the planet-centered redesign of structured data: the environmental impacts of data infrastructure, and questioning the purpose of every new development, every new dataset, every new engine, every new artificial intelligence application, keeping in mind an assessment of these environmental impacts, especially the impacts of big data. And finally, the importance of the plurality of data, and of creating models based on different knowledges and the glorious complexities of our communities; the importance of going beyond text, because not all knowledges can be encoded as text, and there are images, sounds, signs and other ways; and also the importance of local specificities and the critical need for listening. Decolonizing structured data, decolonizing the internet, is a process. Decolonization is a process that needs time and effort, intentional time and intentional effort, so listening is critical here. So before talking about what is next, I want to invite my colleagues to share some thoughts
after this presentation. I will stop sharing, and I want to invite Erika, Zin and Kelly. Maybe, Erika, can you go first?

Yes, sure. Thank you very much for the invitation to this conversation. For me this is a very special moment, because it's been almost a year now since we did this event last year, and I think I can start from the beginning. We were organizing WikidataCon in partnership with Wikimedia Deutschland, and we thought that this shouldn't be a conference just for people who are already in the Wikidata universe and heavily focused on their contributions there. We wanted to take the opportunity to bring more people into the conversation, and we also didn't want this to be only a technical conference. We wanted to focus on the social side of Wikidata, because Wikidata is constantly growing, it's scaling up, as some say, and still there are a lot of people who are not there. This has implications for how structured data on Wikidata is being organized at this moment: who is not there, and so which knowledge is not there. So this conversation we had was very important, not only because of the content itself but, I think, because of the methodology that we used for the event. The way that we selected people to attend the session, and the way that they were invited to share their thoughts, is very important not only for engagement but for the way they feel that their ideas were received, welcomed and valued at the same time, because this is not something very intuitive in other spaces. So the way that we organize this sort of conversation may be as important as the outcomes of the conversations, because it helps us build stronger connections with people in communities who have been marginalized from such processes. So this is my perspective on everything that we did. I feel that, more than the outcomes, which we'll see in the long run, not in the short run of course, how we got this done is very important as well. I'm really proud to be part of this, and thank you
so much for being here. I'm really curious to see what my colleagues here think about what happens in the near future as well. Thank you so much.

Thank you, Erika, thank you so much, and I will pass the word to Sine.

I hope you can hear me.

Yes. Oh, sorry, Sine, I can't hear you now, I don't know why. Okay, again. No, now I can't hear.

It's audio... I'm kind of human kind for forever, but yeah, I joined Wikimedia Germany actually two months ago, so I wasn't present for your conference last year, but I still witnessed it as a person outside of Wikimedia. So I'm super happy to be here, and I'm honored to be working on knowledge equity with you and other communities. I myself have a background in tech and in intersectional educational approaches to it, and I would totally agree that we have to get to the point where more people learn and know that tech and structured data are not neutral and not objective. They are created and formed by people and their beliefs and perspectives, and as of now by a very specific kind and group of people. So we definitely have to change that, and also let people know about it, because most people just don't know. They think it's like a physical science, but tech and data are not like that; they are so influenced by people. So I think this would definitely be a first step, and a lot of people still have to learn. And yeah, we are building up our team at Wikimedia Germany, that's where I joined two months ago, along with other colleagues, because Wikimedia Germany has a lot of resources to share in the Wikimedia universe, and that's what we're here for. We want to engage with marginalized communities, get their perspectives, and of course share our resources and support them in their work. So thank you very much. I only speak a little Spanish, but I can say thank you very much for all of your work. I hope this was right. So thank you so much for your work.

Thank you very much, Mariana, and hello to the other panelists,
my name is Kelly Foster, and I attended both of the workshops that were held to produce the Decolonizing the Internet's Structured Data report, and I was part of the programming team for WikidataCon last year as well. So, a couple of reflections, really all to emphasize some of the points that Mariana brought up in the presentation. One of the key themes that I remember from the workshop was around the right to refuse datafication, and the right to opacity, as it's called by the Martinican philosopher Édouard Glissant: the right not to be understood. Definitely in the sessions that I was in, that was a strong theme: how can we ensure that we respect the right not to be understood and the right to refuse datafication, as the report put it?

The other thing, reflecting on the conference last October, in discussing decolonizing the internet or decolonizing structured data: one of the things that is less able to be communicated by a report, or by the recordings of some of the sessions, was the very palpable pain that came along with the discussions that were being had, especially the discussions around data modeling and taxonomies that reflect and reinscribe colonial violence. I think it's partly because a lot of the structured data that we are importing into the Wikimedia projects, especially on Commons as well as Wikidata, uses datasets from colonial institutions, namely museums and other types of institutions that were established to categorize people and cultures, and often that categorization comes with the colonial violence of alienation and disenfranchisement. So, as I said, the pain that comes along with confronting those realities of the datasets as we come across them is something that is perhaps more difficult to communicate through the reports and other documentation of the events.

And then finally, as always when I'm thinking through these things, Mariana's presentation concluded by emphasizing the need for plurality and
creativity, really, in how we think about the datasets that we work with on Wikidata and on Commons. When I think and speak about plurality, I'm thinking not only about the language that we use, not only about the semantics of the machine-readable data, but also about the kind of ontological structures that underlie and undergird the databases as well. Currently, in my opinion, Wikidata is imposing the ontological structure of the Western encyclopedia, but there is potential for more plurality, more multivocality, on a database like Wikidata. Perhaps there is scope to do some more experimentation in thinking about how linked data can provide, or work towards having, these multiple or pluralistic views of the world and of being in the world. There was a really interesting conversation as well at the Wikidata conference about the potential of using decolonial English as an alternative language option, again because so much of that taxonomical classification, language and understanding has been inscribed and imposed by colonial violence. So these are just some potentials, or perhaps unfulfilled potential, that lie ahead of us as we're thinking about ways to bring more ethical considerations into structured data in the Wikimedia projects.

Thank you, Kelly. I think that maybe this connects with a question we have in the chat from Jan, about which data we are talking about access to and control of.
If we are talking about Wikidata: in general, Wikidata is of course a reference here. It is one of the projects we are thinking of when we talk about structured data, but there are other projects too, some even interconnected with Wikidata. Wikibase, for instance, is another project about which we can ask the same question. I just remember the talk yesterday by Elwin Waman about the use of Wikibase for Quechua. This is a small project, but it's connected with another big project, so we are talking about different projects that can be interconnected. And when we talk about access and control, we are talking about different configurations. For instance, Wikidata is a crowdsourced database controlled by the community, with almost the same rules we know and practice in the Wikimedia community. But even when it is community-organized and community-led, structures of power and privilege that predate the creation and management of the databases themselves are influencing the process. So when we talk about access and control, we are talking about access to tools for using this database, knowledge of how to use it, and knowledge of how to create new things based on this data and these tools. So the meaning of control and access is broader: it goes beyond the data infrastructure itself, and it is rooted in social structures of power and privilege that influence the whole process. That's more or less our understanding of this issue.

And I see other questions: "Could you share your thoughts on what we as individual Wikimedians and content uploaders could do to work on this important issue?" Thanks, Michelle. This is a question for you and for everybody in this session, for the people who are listening. So please, this is our question for you, and we would love to see your suggestions, your thoughts, your ideas in the chat or in the Etherpad. And another question from Jan: what can or should Wikidata editors who don't do
mass imports, but just regular editing, do differently or better when they get back to editing after Wikimania? Another super important question, and we really would love to hear from you, from your perspectives. This is also an invitation to create more spaces and more opportunities for continuing the conversation. Let me share my screen. The question is what's next, because we started the conversation in October last year, and of course these questions and these problems existed before the conference and the session in October. One of the main conclusions was that we need to create and convene more opportunities to radically reimagine and redesign structured data through a feminist, anti-colonial and anti-racist lens, and for that we need more. So that's the question for you in the Wikimedia community, for us, for everybody: what can we do as Wikimedians, what would you like to see happen next to move from conversation to action, and how would you like to contribute in different ways? So please, if you have talks, if you have links, if you have resources, if you know of more conferences or spaces in which we can continue the conversation and go deeper, please share, because this remains open for us and for everybody in this conversation. We are looking for ways to imagine radical possibilities, to stay connected around these topics, to take concrete steps towards emancipatory practices in structured data, to join more collective spaces, to connect even small or individual projects to other projects, and to create a space for this conversation and a space for practice and for experimentation too. If you wish to read more, to learn more, you can download the report of the session from the Whose Knowledge? web page; it is available in English, Spanish and Portuguese so far, and you can see the faces of the diverse participants that joined that first open conversation. So let me see if there are more comments or questions, and to my colleagues, the
questions or comments, please feel free: Erika, Cyn, Kelly. I am just looking at the chat.

Mariana, I think Ian had a question earlier on, if there's time to at least address it. And also to say, I will be in the networking session for 10 minutes or so after this session, and hopefully some of the other panelists will join me there, so you can join us to talk a bit more. So Ian was asking: can you say a little bit more about decolonial English? Arguably this could be decolonial any language. The issue is that, for example, the taxonomic language around nationhood and citizenship on Wikidata, both the language and the data modeling, as far as I can tell resists the complicated ways in which nationality, citizenship, tribal citizenship or even ethnicity can be expressed. Some examples: are we calling North America Turtle Island, or are we calling it America? Are we calling the people of the Navajo Nation Navajo, or does Wikidata recognize that they call themselves Diné? Are we labeling someone as an object, or are we recognizing that they are in an enslaved status, that this is their social and legal status? These are just some examples, and I'm sure there are many others. How do we model that in the data, but also how do the ways that languages are modeled on Wikidata reflect these differences between this conventional but colonial way of classifying and ordering people and how people themselves identify, the language that they use? So hopefully that gives you a bit more insight into the conversations that would be had around decolonial English, but it could be decolonial any language, for that matter.

Thank you, Kelly. And another question, maybe you addressed this, Kelly: can you share some specific examples of the data structures we should try to change? I don't know if any of you want to add more about this, about more
examples, but I think it is about recognizing, acknowledging, that there are not only different sources of knowledge. It's not only about thinking of different communities as a source of knowledge to fill the gaps in knowledge or correct the biases we have; this conversation is also about ontologies. This is information science, but it is also philosophy: the philosophical systems on which different datasets are based. For instance, it is also about the relationships between the different entities, people, places, objects and so on. In different territories, for different communities, there are different ontologies, and I think that we need to deconstruct the idea that there is one ontology, higher in the hierarchy, that can dominate every other ontology. On the contrary, it is about how those different ontologies can talk and interact in a transformative way, not only coexist as separate things. If we can create this space for a conversation in which there is no one dominant ontology, we can move forward to a transformative way of doing structured data, with a lot of potential for knowledge justice and for justice itself. So I think that we don't have more time. Thank you, everybody, and I hope we can continue the conversation on many more occasions. Bye bye, thank you everyone, bye.