Good evening. Everything is fiction until it isn't. Any AI enthusiast will tell you that. Thank you to everyone for coming this evening and making the Center for AI Futures a reality. The topic for tonight's talk is how anthropology and AI might work together in the social sciences. Now, as any enlightened anthropologist will also tell you, what we write might be termed faction: we take facts and we put a story and a narrative on them. Following the same tradition, I'll quickly tell you a faction for the Center for AI Futures, which might help you to contextualize how SOAS, Heidelberg, sorry, Helsinki (I did work in Heidelberg; that's the first faux pas of the evening), Helsinki and Quilt came to work together. The three founding members of the Center, Matti Pohjonen, Angad Chowdhry and I, along with our dear colleague Meenu Gaur, enrolled in a PhD program at SOAS in 2004. We walked the intellectual terrain of our PhD years working very closely together, and, having at the very onset decided that we would enjoy the process, made films. We formed a collective called sacredmediacow.com. Together we edited books, wrote papers, made films and organized conferences. All good things end. We finished our PhDs and went our separate ways. Meenu is today a renowned filmmaker. Angad went on to make his fortune in big data analysis. Matti literally climbed mountains and moved on with his academic career. Our connections loosened somewhat. It was during the pandemic years, quite bored out of our skins, that we decided to get in touch again on Zoom. Meenu was making a film in Pakistan and couldn't join us. In that meeting, we talked of finding ways to collaborate and work together again. All of us were keen to rekindle the intellectual curiosity of the PhD years and push back at what we felt was a cynicism that was creeping in. The Center for AI Futures had its genesis in that conversation. At SOAS, I took the idea to my line manager, Scott Newton, who immediately gave his approval.
Our former pro-director research, Professor Andrea Cornwall, was equally enthusiastic and pushed it through the various committees. And finally, our director, Professor Adam Habib, smoothed the final hurdles. I'm very grateful to Professor Arshin Adib-Moghaddam and Dr. Fabio Gygi, who kindly consented to share the chairing of the Center with me. Before I finish, I would like to mention an absent friend. Professor Annabelle Sreberny was our PhD supervisor. Not only that, she was a mentor and friend to all of us at the Center and a member of the managing committee. She was supposed to be sitting here amongst us. Sadly, Annabelle passed away after a sudden illness on the 30th of December last year. Her spirit and her wonderful sense of curiosity guide us. Thank you. Adam, may I hand it over to you? Colleagues, friends, it's lovely to have all of you here at SOAS. I'm only meant to make a few opening remarks, and I do need to apologize up front, because I've just made a few opening remarks at another event 15 minutes ago, I have to go to two dinners in rapid succession, and I'm trying to figure out how I'm going to manage this evening. So please forgive me if I have to walk out in 10 or 15 minutes. I wanted to say a couple of things. This week at the executive board meeting, my colleagues and I had a long conversation on ChatGPT and the implications and consequences thereof. It had just made its way through one of the exams at INSEAD, and another somewhere else, which it undertook and actually passed very well by itself. And we're trying to understand both the implications thereof for teaching and learning, but also the potentials thereof, because there's a real conversation on whether this innovation can be mobilized in particular ways that make it something useful for learning and teaching and research in our world.
And what that kind of shows is that the world of AI is transforming our society in quite fundamental ways. The world of the digital is fundamentally transforming our society. And frankly, the world of data is fundamentally transforming our society. The amount of data we have available today allows us to navigate it and find solutions to challenges in ways that were not possible decades ago. The same is true of social media and of how the digital has transformed our daily lives by enabling access, by opening up all kinds of possibilities; and the world of AI is, if you like, in many ways a culmination of all of those innovations that started 30, 40 years ago. But as much as that world creates enormous possibilities, it creates enormous threats. I sit on Twitter every now and then, and you will not believe the stuff that plays out on Twitter. I'm utterly astonished at what people think they can say and articulate on Twitter, and at the consequences that has for the world and for other people. Big data is a fantastic opportunity, as is AI. But anybody who has worked in the field of critical race theory, who looks at the world of identities, will tell you there are all kinds of racial and patriarchal assumptions written into the formal coding of these innovations, assumptions that have consequences for our world, for the kinds of debates that get amplified in one context and not in another, for the choices that are made, and for the problems that are addressed and resolved in particular ways. And so what is required is a real, radical interrogation of this digital world, this world of artificial intelligence, this world of data: how it's collected, how it's brought together, how it's categorized, and how it's deployed for the use of some people and not others.
And that is in many ways what this conversation tonight is about, bringing the conversation of artificial intelligence and anthropology together. But it is also fundamentally the purpose of our entire initiative in this center itself. And so it gives us, as SOAS, an enormous amount of pleasure to be part of this initiative, to be partnering with the University of Helsinki and Quilt and others around how we think through these questions. So welcome to this initiative; we're particularly excited by this. I should say perhaps one other thing about why we're excited about this. SOAS has undergone quite a radical strategic reorientation over the last 18 months, and part of that has to do with our research agendas and our partnerships. Our argument has been that we live in a moment where all of our challenges have become global, and what we need to address these global challenges is an integration and interaction of knowledge systems. But we also require an interaction between institutions across national boundaries, so that we begin to pull together the collective intellectual capacities and resources to understand the challenges of our world from the multiplicity of perspectives from which they need to be understood. And so in some senses, the fact that we have SOAS, Quilt and the University of Helsinki in this partnership is particularly exciting, because it breaches these institutional boundaries and the sectoral boundaries that are so important for how we reimagine our teaching, our research and our engagement with the world. I will caution two things. We are quite insistent that these partnerships must breach not only national boundaries but continental boundaries, and what we would like to see is institutions in Africa, Asia and the Middle East being part of this network in powerful ways.
And it's important that we breach those boundaries as well, so that those perspectives, those thoughts and those knowledge systems interact with ours, because you can't truly be universal from the perspective of Western Europe, the United States and the United Kingdom. You have to be universal by bringing the perspectives of the South, the perspectives of Africa, Asia, the Middle East and Latin America, into this conversation about what we wanna do and where we are. And then finally, we do push the boundaries of what is equitable. We do push the boundaries of how we share resources. We do push the boundaries of how we structure the fees, if there are any fees associated with this. And we do want to make sure that whatever resource flows come to such a project are not only sitting in Bloomsbury in London or in Helsinki, but are also getting into the institutions in Delhi and Mumbai and Johannesburg and Kampala and Beirut. And that's gonna be a difficult conversation, because we operate on a business model of higher education that makes that difficult. But that doesn't mean we balk at it; we try to confront it, we try to grapple with it, we try to innovate around it, so we can get these resources where they need to be, enabling the kinds of thought and research and teaching that need to be happening. So I'm gonna stop there. I tend to go on at this time of day; forgive me for that. I do want to say to each one of you, thank you for joining us. We are particularly excited about this incredible initiative. We look forward to participating in it with all of our energy, our collective energy, and may all of us go from strength to strength, and may we begin to think not of big data and AI as simply a neutral technological resource, but as a resource that can be deployed for advancing a better world.
For enhancing social justice, for bridging the divides in our world, and for addressing the very, very deep inequalities that our world has been plunged into for many years and that we haven't truly addressed. I'll stop there, thank you very, very much. I really should know how to use this a little better. So, Quilt.ai has been active for about five years now, and we have worked across countries and languages and cultures. We can of course go as deep as you would like into some of the machine learning, data sources and equity kinds of conversations after the presentation, but today what I'd like to do, off the back of some of Adam's remarks, is give you a sense of why we do what we do, how we all came together, where our technology and our product are evolving, and how institutions, anthropologists, linguists and professionals can all work together with technology. So the two of us, this gentleman here, Anurag, and I, met in a coffee shop a few years ago. It was a serendipitous meeting in Delhi. Anurag is a serial entrepreneur, he has scaled multiple businesses, and you will get to hear from him after I walk away from this mic eventually. After I graduated from SOAS, I worked in social and commercial research before setting up a purely digital research company that looked at Twitter and Instagram and Facebook back in the day, to write observations about people for brands who would pay for it. When I met Anurag, his perspective was: if we can automate data collection, why can't we automate data interpretation? Now think about that for a second. If you're automating all of the erstwhile fieldwork elements, looking at social media, looking at search engine data, why can't you automate the extraction of signal, which is the job of the researcher and the anthropologist? Why can't you scale that up? The anthropologist in me was very resistant to the idea. It was terrifying.
Five years later, we have made some amount of progress here, and I hope to share some of that with you today: we've built not only a data collection platform but a machine learning and interpretation platform across geographies and linguistic groups. The biggest challenge we face with AI and machine learning technology in particular is: how do we not lose the story around the data that we extract? A lot of the online data interpretation that you will see, a lot of the reports that use machine learning, will do one of three things. First, they'll convert an image or a sentence or a word into a mathematical representation. This mathematical representation will be unique, contextual and relative to your data set, and machines will be able to manipulate it. And the fundamental point there is that you have taken reality, which could be traffic lights, could be social media posts, could be your romantic love letter to ChatGPT, and reduced it into mathematical notation in multiple dimensions so you can manipulate it. So the first job is converting words, images, videos and sounds into dimensions. That's the first thing they do. The second thing they do is take images, texts and faces and classify them into predetermined categories. And this is where a lot of the discourse in the newspapers around AI bias and racism emerges, because you have predetermined categories, which means you have presupposed certain things about how reality works and how it's categorized. And that's where all of your conversations about power and systemic bias, et cetera, emerge. So that's the second trick that machine learning does. And the third one, which seems the most innocuous but also has interesting challenges, is a pure descriptive move: it'll look at an image or a text and say, OK, these are the top five keywords, here are the top six objects, here are the top seven colors inside this image.
So, three things: convert reality into math, pre-fit what you see into predetermined categories, and describe what's inside an image or a piece of text. Now, as researchers in the field, I think everyone here knows that there's so much more information than those three things inside any encounter or any database. And we will cover a lot of this through a very dramatic example I'm going to get to next. But a good example here is this piece. I have no idea what this is, by the way. One of our partners is a Japanese client that uses our platform. What they did was download a hundred thousand photographs from different prefectures in Japan and run color analysis on all of those Instagram uploads. And they could really tell the story of Japanese self-representation on Instagram by looking at how colors varied by region, by season, et cetera. I have no idea what this means. I can see some people pointing, and maybe you can tell me later, but the whole report is online somewhere. But that's the cool part, right? You build technology that allows people to suddenly tear open the fabric of reality and find things that you would never even conceive of looking for. Now, the good part. A key part of addressing systemic and structural injustices and inequalities inside machine learning (I wasn't really going to double down on this, but I think Adam set the stage, so I have to) is not to replace biased models with unbiased models. And that's a very controversial statement to make, I understand, in this room; I've had a few encounters with the students earlier this afternoon as well. Because constructing an unbiased model is a very hard knot to untangle, to be perfectly frank. The labor and the effort that go into it are very challenging.
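Those three moves, embedding, classification into preset categories, and pure description, can be sketched in a few lines of Python. This is a purely illustrative toy, not Quilt.ai's actual pipeline: a bag-of-words count stands in for real learned embeddings, and the categories are hypothetical labels chosen by the analyst in advance.

```python
# Toy sketch of the three machine learning "moves" described above.
from collections import Counter
import math

def embed(text):
    """Move 1: convert a sentence into a mathematical representation
    (here, a simple word-count vector)."""
    return Counter(text.lower().split())

def cosine(a, b):
    # Similarity between two count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(text, categories):
    """Move 2: force the text into the nearest predetermined category,
    the step where presupposed categories smuggle in bias."""
    vec = embed(text)
    return max(categories, key=lambda c: cosine(vec, embed(categories[c])))

def describe(text, k=3):
    """Move 3: the purely descriptive move, top-k keywords."""
    return [w for w, _ in embed(text).most_common(k)]

# Hypothetical categories, fixed by the analyst before seeing the data.
categories = {
    "fitness": "gym training running marathon lifting",
    "politics": "election vote party government flag",
}
post = "training for the marathon again, running every morning"
print(classify(post, categories))  # → fitness
print(describe(post))
```

Whatever the post "really" means to its author, `classify` can only ever answer in the analyst's vocabulary, which is the bias problem in miniature.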
I would propose looking at it in a slightly different way, which is not to approach it from the perspective of justice, but from the perspective of ethnography: to bring an ethnographic imagination to the field, not an activist imagination. And the first way of doing that is to see how the groups and people you're studying represent themselves, without your categories, without your math, without your classifications. You know, I see so many people wearing badges with pronouns; it's an act of self-representation. And I think listening to that is a big first step that we can take. So the images above here might seem random to you, but they are from 2018, just after the election in America, just at the foundation moment of Quilt. They're from Chicago, and it's one of our earliest pieces of work. The reason I'm showing you this is because what's happening in this example really sets the agenda for how we as a business think about data and machine learning. I want everyone to look at these 1,000 images and develop some hypotheses about what's going on there. The best-in-class identification technology that existed in 2018 classified about 1,000 individuals into three sets of people. It grouped one set into what I would call a nationalist cluster. These are, you know, good old MAGA people; it's got army people, survivalists and deadlifters. And honestly, in the course of a day, I'm a little bit over all of those, so I was a bit surprised by that classification. It grouped another set under the label the machine gave, 'gangs': inner-city children with weapons, people holding up gang signs, people posing next to memorials and gravesites, et cetera. And it grouped a third cluster as 'cops', based on uploads that were happening in police stations or with police uniforms or inside police cars.
Now, none of these interpretations is false; let's get that out of the way. They might be biased, they might be one-dimensional, but they're not false. But they are missing a lot of the ethnographic imagination. And this ethnographic imagination is what I signaled at the start of this conversation: how do people self-represent? Does a person say, I'm a nationalist, and that's my category forever? That self-representation, that tagging of information based on what people themselves are saying, is how you de-center some of the problems within this system. So what we did was allow the machine to remove previous knowledge and classification systems. We said, OK, forget your nationalist/gang/cop type of classification and really, truly listen to what the people themselves are saying under these posts. And you can see the grouping has now changed materially. You've got different clusters emerging. Earlier you had the nationalists, the cops and, you know, 'gangs' in quotation marks; suddenly the information is grouped differently, based on the ways in which these people were describing themselves and the types of things they were saying. This grouping becomes a slightly more sophisticated second-level interpretation of the data. You can see some of the nationalists and the gang members are together, some of the cops and the people classified as gang members are together, et cetera. And this gives us a slightly different set of groupings. So we had to peer inside (again, this is 2018) to understand how the machine would group people if you removed the categories. Group one was individuals speaking to ideas of self-reliance, from being gun owners to being police officers, to being able to stand on their own feet; it's a big American ideal. The second set was individuals yearning for self-improvement and mastery.
Now, self-improvement could be fitness, shooting guns, climbing mountains, but it's not 'gang', right? It's not 'nationalist'. Groups three and four were fun, because they were different styles of pedagogy: the ways in which these people were speaking about how they were teaching the next generation, whether to be amoral and fight for every little crumb, or to be moral through martial arts and control over violence. And the fifth group, where you found some of the army folks and some of the gang members, was around the use of memory to self-describe and create an architecture of the tribe you belong to: people who have fallen along the way in wars, in gang violence, in police shootings, and even amongst police themselves. So, the reason I set this story up, just to remind you: our first step was a bunch of photos classified into three groups by the machine, predefined categories; then us saying, no, we don't want your categories, we want you to listen to what the people who generated this data are saying themselves, take that self-description, remove your sense of 'the cops are in this group, the nationalists are in that group', and tell us what the story is. And that sets the agenda for how Quilt.ai looks at information: it's the ethnographic imagination, and it's ground-up labeling and modeling of data. Now, this was 2018. I cringe showing you these slides because it was so long ago, but it's important to show you our genesis so you understand our philosophy and our general direction. I'm gonna pause and switch gears. There's been a lot of work done in the field of the digital humanities; friends at Helsinki are very, very good at this, especially in digital ethnography and academia.
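The ground-up move in the Chicago story, dropping the machine's predefined labels and grouping posts purely by the words people use about themselves, can be sketched as a toy in Python. The captions, the word-overlap similarity and the threshold are all invented for illustration; a real system would use learned embeddings rather than raw vocabulary overlap.

```python
# Toy sketch of label-free, "ground-up" grouping of posts by the
# users' own self-descriptions (invented captions, not real data).
def jaccard(a, b):
    # Similarity = overlap between the two posts' own vocabularies.
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b)

def cluster(posts, threshold=0.15):
    """Greedy single-pass clustering: a post joins the first cluster
    whose exemplar it resembles, else it starts a new cluster.
    No analyst-supplied categories anywhere."""
    clusters = []
    for p in posts:
        for c in clusters:
            if jaccard(p, c[0]) >= threshold:
                c.append(p)
                break
        else:
            clusters.append([p])
    return clusters

captions = [
    "standing on my own two feet, proud gun owner",
    "proud police officer, standing on my own",
    "teaching my son discipline through martial arts",
    "teaching the next generation discipline and control",
]
for group in cluster(captions):
    print(group)
```

Note how the hypothetical gun owner and police officer end up together, grouped by a shared language of self-reliance rather than by a 'nationalist' or 'cops' label imposed from outside.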
And I know my colleagues here, and I'm looking at Matti because I'm familiar with his work, have done work on misinformation, hate speech, social network analysis and content across geographies. A large part of this work requires some amount of technical competency, and our core product has been designed to remove that friction. Right now, if someone in the music department wanted to study music subcultures on Reddit, the path from where they are right now to getting access to their first data set is at least a year's worth of work. We want to make that an instant experience, and a lot of the product and technology innovations we'll be showing you are about how anyone can go from sitting here in this conference to logging in and starting to play with data and information. Now, these numbers are, again, propaganda, and they're all estimates, but the scale of information being generated by individuals encountering the internet today must give us pause, right? Again, these numbers are from this morning, and we really don't have access to fully transparent numbers from the platforms, but from villages to metropolitan cities across the world, all people are connected to some type of network. A lot of this information is not useful for researchers; an ethnomusicologist doesn't really care how many downloads or searches there are in an hour. But some of it is extremely insightful. When we think about online research, and again I point to our friends there, we think about Twitter, because their APIs have historically been researcher-friendly. And this is another little nuance on conditions of production, because if Meta's APIs were friendly, then Meta would be the core academic platform for research right now. But the internet is also not just Twitter: it's TikTok and Instagram and Google search and YouTube and Yandex, the Russian search engine, which is one of the best image recognition platforms you can ever use.
So please go to image.yandex.ru, I think. There's Naver, which is the Korean search engine. But regardless of the platforms you may have access to, the way you want to think about internet data, from a researcher's perspective, is that it has three categories. The first category, which is what everyone will see when they follow me on Instagram after this talk, is uploads. Uploads are a very dynamic and rich category; we all know what an upload is, right? It's what individuals create, express and share. It's what you open up on your phone and glance through. It's the noise of human interaction and argument. It's what Adam was speaking about, what he experiences on Twitter, what some of us experience on 4chan or Reddit. It's just stuff: your pets, your dogs, your cats, your politics, your hot takes, all of that. Think of that as your uploads. It's a very one-dimensional piece, and all researchers are obsessed with uploads, which is fine, but it's one part of the internet research piece. The second piece is search. It's not on the slide, so I'm not looking at the slides, I'm just riffing. Search is what all of us type into YouTube, Google, Bing, Baidu, Yandex. It's what you do when nobody is looking. And as people who have access to search data can tell you, it's essentially a database full of typos and spelling mistakes, half-formed thoughts, stream-of-consciousness questions: why are my knees hurting, where is SOAS, why is it so cold, am I fat. It's just the noise of your brain. And Anurag and I have a joke, actually it's more Anurag's joke than mine, that his uploads are always about him running marathons and his searches are always about arthritis. We're all friends here.
So, to use a very basic metaphor, uploads are like your conscious mind, and search is like the babble of your unconscious thoughts. Both are very, very helpful when you're telling a story and trying to learn about people. The third category is called metadata, which is the information surrounding the information. How many searches? What's the seasonality of the searches? What's the volume of the uploads? How many likes did this post get? How many views did this video get? What's the likes-to-engagement ratio? There are so many metrics you can work with. And while maths is not a SOAS skill set, as my co-founder tells me every time he looks at my numbers, maths is interesting here, because you have all this noise: if you have a hundred million keywords being searched for, which ones are interesting? Volume, engagement, likes, all of that gives you some amount of perspective and allows you to distill stuff down. So even without AI, just by getting a hundred tweets, a hundred Instagram photos, a hundred searches and some amount of math on top of those, you have a lot of insight into people. In fact, you have more insight, faster, than you could ever possibly imagine. But then what happens, right? That's the data collection pyramid, but so what? How do you ensure that the search that comes from London and the search for arthritis that happens in Chennai carry the context of London and Chennai? Because in London you have the NHS, you have the cold weather, you have all sorts of contextual things; in Chennai you have insurance issues, you have the diet, you have all sorts of things. There is that contextual information, and you can't lose it just because you have lots and lots of data. And I think that's where a lot of this stuff becomes important.
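The metadata layer described above can be illustrated with a toy Python example (the queries and monthly volumes are invented): even two months of raw counts are enough to separate the steady background noise from what is actually rising.

```python
# Toy sketch of distilling search metadata: invented queries and volumes.
searches = {
    "weather today":       {"this_month": 5000, "last_month": 5100},
    "arthritis knee pain": {"this_month": 900,  "last_month": 300},
    "is soas open":        {"this_month": 120,  "last_month": 50},
}

def growth(v):
    # Month-on-month growth ratio: > 1 means the query is rising.
    return v["this_month"] / v["last_month"]

# Ranking by raw volume surfaces the steady giant; ranking by growth
# surfaces the fast-rising niche query instead.
by_volume = sorted(searches, key=lambda q: searches[q]["this_month"], reverse=True)
rising = sorted(searches, key=lambda q: growth(searches[q]), reverse=True)
print(by_volume[0])  # → weather today
print(rising[0])     # → arthritis knee pain
```

This is the "even without AI" point: a handful of counts and one ratio already tell you where to look, though, as noted above, the numbers alone carry none of the London-versus-Chennai context.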
So to achieve this, to ensure that we get uploads, we get search, we get the metadata, and we keep the context, we built a fairly sophisticated technology stack that any researcher with a non-technical background can actually use. And it's important to say that, because you already have researchers coming into SOAS learning how to do fieldwork; you can't expect them to become computer scientists. It's not fair. If they wanted to be computer scientists, they would be in computer science, right? So the ability of a non-technical person to use this is a key part of how we've imagined it. This allows researchers from different backgrounds and different geographies to run projects, download data, run analysis on it, and even, and this is very important, build customized machine learning models. It's a really powerful tool. It's built by researchers for researchers, and that's its essence. It's not built for the man, right? It's not built for acquisition. It's not built for, like, Google. It's built by a bunch of nerds sitting in Singapore and in Boston who want to do research projects. And when Anurag takes you through some of our case studies, you'll really understand how we've been deploying it constantly. We can of course do a bunch of demos after this, but this is just to give you a sense of what the platform looks like. There are raw data sources you can select, analysis models you can create and run, and of course raw data that you can eventually download. And given that most of us are super excited about having lots and lots of data, we're constantly adding new data sources. Each data source provides us with a different degree of insight and observation. So if you look at this slide, and I'm sorry about the font size, there are a few things here.
Twitter is the most famous one, but search engines and YouTube can provide you with texture and color on what people are anxious about and aspiring towards. We have a data source there called the General Open Web, which can give you insight into the millions of blogs and forums that exist on all manner of subjects. There's an e-commerce data source, which could be Amazon, the Chinese data sources, et cetera, that gives you insight into what people are buying and selling. There's TikTok and Instagram and other visual platforms that give you a rich perspective on self-presentation and even the meaning of entertainment across cultures and geographies. And of course we have unique data sources, given that every country is unique: we have Weibo in China, we have Walmart in the US, and every country has its own data logic. So, I'm sorry, I realize this slide is a little intimidating, but what we're trying to say is that it doesn't matter where your customer, or the person you're studying, lives or what she's into. What matters is that if they have a fingerprint on the internet, an identity fingerprint of them expressing themselves or being passionate about a hobby, and it is available in a GDPR-compliant way, we have it on the platform. Once researchers find the data sources they want to study, they can do lots and lots of really fun things. I'm not going to go very deep into this, but let's say you run a project on, say, Reddit, r/SOAS events, and you download the first thousand posts about SOAS events and a thousand images from SOAS's Instagram handle. You want to understand what's happening. So you get text extraction, you get emotion extraction, you get sentiment analysis, you get all of that basic stuff. But then someone from SOAS would say: actually, the issues at SOAS are not about sentiment, they're about these three political movements.
And then you can train a machine learning model to detect those political movements and tensions inside the dataset. So it's a very, very sophisticated system. Similarly with images, we can do the same thing. You have all the SOAS images, and you can run labellers and classifiers that say: I want to find Guinness drinkers, I want to find cigar smokers in this dataset, and it will zoom in and find them for you. Or you can train it on a dataset so that every time the color red is shown, it means X, and it learns that. So essentially it's a very powerful toolkit that allows our researchers to answer very, very complex questions. I realize this is not very helpful, so I'll simplify it even more. All the data in the world that you need around an issue or a subject, in any language, can be pulled into one system. Any type of analysis you want to do on it, whether it's network analysis, keyword extraction or deeper machine learning modeling that says, OK, I want to classify this into these three groups, can happen, and it can be output as a Google Sheet, a word cloud, a chart, a graph, a PowerPoint, whatever you want. So I'll just close up my remarks now. Keeping country- and region-specific differences in place when you do research across the world is a massive challenge. We have worked in the geographies you see on this map, and we have some amount of mastery that computers allow us to achieve, so that's not an issue for us. I think with the expertise of SOAS and the Helsinki group, what more can we actually achieve? With this new center, what new types of research are possible, both from a commercial perspective and from a student project perspective? I've shared technology, I've shared philosophy, but I would like Anurag to take over now and share some of the case studies that are relevant to you.
So maybe it'll spark some thoughts and give you a sense of what all of this actually allows us to do. So I'd like to invite Anurag to the stage. I hope his arthritis is fine. Thanks, Angad. I have massive imposter syndrome, probably the most unqualified person here, but thank you for being here. So just setting context on the idea behind Quilt: Angad mentioned the idea of me being a serial entrepreneur, which is essentially how you stay unemployed for a really long time, is what that means. So I'm sitting down at our house and my then seven-year-old daughter says, oh, daddy, new startup, what's this going to be about? So I start talking about artificial intelligence and technology, and you can see her zoning out and basically heading out the door. Then she looks at Angad, and Angad says, who's your best friend? And Anika, my daughter, says, this girl called Sienna, who's our Italian next-door neighbour. I wonder how she is now. And Angad says, well, daddy wants to know why she's your best friend. And so Anika says, because I know everything about her. And he says, well, that's the point. Once you know everything about somebody, like it or lump it, you will know them well, right? And that's where the word empathy comes in. So the whole construct of empathy at scale comes from there. And this is the esoteric vision that we have, but the idea is that if I know you, I will probably have civic discourse with you. And given the fundamental breakdown that's happened in civic discourse in the world, the idea becomes: how do we get to know each other better? And the internet becomes a conduit to help accomplish that. So again, propaganda slide, so I won't spend much time on that. Another propaganda slide. But the reason this slide is up here is important. Another anecdote, I'm full of anecdotes, including arthritis. When I met Angad in that coffee shop, I was in the middle of a multi-year process of adopting my son.
And I showed up at Angad's house, and adoption is a harrowing process. So you show up having lost a lot of faith in humanity, and you walk into Angad's house and he has 18 or 19 dogs, seven or eight cats, all rescues. And I said, this might be a good guy to work with, right? But the premise there is that you create a model from the very beginning. A lot of capitalistic models work like this: you make a lot of money and then you do a foundation, right? Which is great and important, and we need them in the world. So Quilt started with the idea that we would always be 50-50. We have some partners in the room here. So 50% of our work effort and time goes into helping amazing commercial organizations do great things. And 50% of our time goes into helping large non-profits, philanthropies, bilaterals, multilaterals, small NGOs, and there are some partners here, do amazing things with the same data. And that allows us to attract, if the Quilt team is online, hello, that allows us to attract great talent and build with that. So the reason this slide is interesting is because it has this smattering of combined forces. We can build great technology for an Amazon and then apply it to gender violence in the middle of Africa. So that's the propaganda part there. I'm going to walk you through a few case studies fairly crisply. This is vaccine hesitancy. So there are some of you in this room, and that's okay, I'm teasing. It was obviously this complicated topic that ran amok for the last two and a half years. And having researched, I think, over 41 or 42 projects for various philanthropies and governments, we found that vaccine hesitancy tends to live in different spaces. And when you read the self-expressed information online, it actually does not take left and right viewpoints. It's much more nuanced, much more complex, much more human than that, right? So there are people on both sides of the spectrum.
There are people with different educational levels, people in different hubs, who become vaccine hesitant in some shape or form. And we probably have the world's largest set of human study on what vaccine hesitancy looks like. And the way we cut the data is so that people can then action it. Because we analyzed the communication of all the entities that spoke on vaccine hesitancy, and a large part of it is very unidimensional, right? You should take the vaccine, otherwise you are not very smart. That's the statement, if I had to thematically, unidimensionally summarize it. But I have friends who are vaccine fatigued, very tired of it, who have not taken the third booster or the second booster. I have amazing athletes, some of them in this room, who are immunity confident and don't want to take it. I have vaccine sceptics who are like, you know, I didn't get the whole deal with this, it wasn't a hundred percent effective. And then I have people who had terrible side effects, right? Like, oh my goodness, that destroyed me for three days. And there are some of you in this room who were probably destroyed for a couple of days at least. So there are multiple nuances to us being human. And part of the reason we're doing this Centre for AI is that technology and data often, not sometimes, often miss out that human strain, and tend to clump things in ways that are convenient or easy to do. So I'm going to give you, this is a set of work that we do in our non-commercial research. And this is important because it's going to lead to what Angad and I would love to do with Mati and Som and this centre. We've worked with some large organizations and some small organizations. And the reason we've done that is, I think 5% of our revenue every month goes to charity. We do between two and five low bono projects every month.
We do two pro bono projects every month. Some of these you may not know as organizations, Science Splash or the smaller ones here, but there are some, like ICRW, a phenomenal organization with a small budget but a very, very fantastic policy organization on gender rights, based in Washington DC. So we worked with a spectrum of them. And the idea there is to do work like this. So, misogyny. As I tell everybody, no man is born a misogynist; you often become one based on the conditions you're exposed to. But what is misogyny? How do you study misogyny in data? If I asked every one of you today, in a blinded model, to talk about misogyny, the overlaps actually would not be that much. So how do you build a model that detects misogyny in conversation online? You bring a group of social scientists like all of you together, you spend a ridiculous amount of time coding data, building an AI model, then applying that model onto everything that comes in, and then reiterating and improving that model for context. So we built this for AWARE, which is an incredible NGO out of Singapore focusing on everything from employment to women's rights. And we coded millions of comments, millions of posts, against that and ran it through a bunch of cycles. And it's a big case study on AWARE's website, if you're keen, AWARE.SG, to go take a look at how they're now applying it to understand the level of misogyny in the daily discourse that exists online. Now that we've built the model, and as Angad was saying, it's mathematically represented, you can take that same model and drop it into different languages. We retrained and iterated so it becomes native, because misogyny, as many of you know, manifests differently in different environments and structures. So that's an interesting example of a potential collaboration.
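The coding-then-iterating loop described here, label a seed set, fit a model, apply it, have a human review and correct its mistakes, then refit, can be sketched in miniature. This is a deliberately naive illustration of the loop, not AWARE's or Quilt's model; the word-scoring "classifier" and all the example data are invented.

```python
# Toy sketch of a human-in-the-loop coding cycle. The "model" is just a
# signed word-count table; real systems use proper classifiers, but the
# iterate-and-refit structure is the point being illustrated.

def fit(coded):
    """'Train' by scoring words: +1 if seen in positive-coded text, -1 if negative."""
    vocab = {}
    for text, label in coded:
        for word in text.lower().split():
            vocab[word] = vocab.get(word, 0) + (1 if label else -1)
    return vocab

def predict(vocab, text):
    """Flag text as positive if its summed word score is positive."""
    return sum(vocab.get(w, 0) for w in text.lower().split()) > 0

# Initial hand-coded seed set (invented examples).
coded = [("women belong in the kitchen", True),
         ("great match last night", False)]
model = fit(coded)

# Apply to new data; a human reviewer corrects errors and the corrections
# are folded back into the training set -- the "reiterate and improve" step.
review_queue = [("she should stay in the kitchen", True),
                ("the kitchen needs renovating", False)]
for text, true_label in review_queue:
    if predict(model, text) != true_label:
        coded.append((text, true_label))  # human correction added
        model = fit(coded)                # refit with the extra context
```

Before the correction, the toy model wrongly flags "the kitchen needs renovating"; after one review pass it no longer does, which is the kind of context-sensitivity the iteration cycles are meant to buy.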
And as I proceed through the next five, seven, eight minutes, I want to make sure that we get to that zone where we think about what tools we can apply to solve some of the problems that plague us. This is a favourite. There are some parents in this audience, I would assume; I am one of two. And the battle always is sugar and advertising, and some great brands that I shall not name, but they're compelling in the ask. You walk down the supermarket aisle, you turn on the TV, if it's linear TV and not Netflix, there's advertising happening. And how do you make sure that not just false promises, but even vague promises, aren't made? To do that, you have to study thousands and thousands and thousands of hours of video assets as these ads are being made. How do you code an ad as being something that's slightly troublesome? It's not blatant, it's not crossing any boundaries, but the messaging that's implicit, even if not explicit, is teasing kids, is inciting kids to go ask for that extra processed food piece. So, another fascinating model to build. Once you code all these ads, then you know what enticing a kid looks like with certain kinds of products. So what's in a Lay's chips ad? And I love chips, or crisps, as you call them in this country. But there's a messaging in that that's challenging. And how do you tweak that? And while it's within the letter of the law, is it within the spirit of how we should talk to children? So, another model: climate. I think there are a couple of climate folks in this room right now. Again, on what makes an effective climate campaign work, right? And climate is a fascinating space, because I don't know if you saw the piece about Greta being harassed the other day, and Greta is a really funny, funny person, right? But there's a lot of vitriol that comes her way, misinformation, disinformation. So how do you manage campaigns that work effectively to counter that? And what does that successful campaign look like?
And what do you learn from these campaigns? Because a large part of the way the climate change activism group works is the idea, again, that the world is ending and get on board or you're a clown. I'm summarizing in a very bad way, I don't have the elegance of my academic colleagues, but that's the premise. But there are nuances. There are people who are allies. So we did this beautiful piece of work where we segmented into eight segments across multiple countries. And there were people you needed to nudge just a little bit for them to become an ally or to say something in a particular way. And then there are people you never would. And so where do you expend your energy? Do you tweak a certain thing for a certain set of people to move them along the continuum, or do you fight the radical person who's never going to get committed to climate change? So: finding an effective campaign, analyzing that, so donor funds are used well and climate activists can do more with their time and money. This is a big one we did. We did, gosh, I want to say about 21 or 22 projects during COVID on the increase in gender violence. I mean, you've read the headlines on it. But it's staggering, it's obscene, it's extremely, extremely challenging. And a lot of this work was done in the global South, in Asia and Africa, finding how it comes through and how it manifests and what the local resources are, so that you could match donor resources and service resources to the need that existed. And to be able to map that on an ongoing, real-time basis is tremendous and important. And I think we open-sourced some of this data, so it sits out there. If anyone's interested, hit Angad and me up and we'll gladly share it. So I'm doing a whistle-stop tour here of the work we do. I'm clearly passionate about it, which brings me to why we are here intruding on your evening.
So there are a few places where, so Mati and his colleagues, Angad and I, talked about working together, and this isn't fully formed. So this can be fluid, and I'm definitely going to look forward to chatting with Adam a little later tonight and seeing where and how we structure this. But this is where we need input from you. Because the internet is the largest repository of human representation and human information, it becomes the largest data field, and it has the ability to be, well, let's leave bias alone for a second, it has the ability to be comprehensive, because it comes from the people at a scale that is unbelievable. And there are access challenges, there are some villages that don't have it, and having built startups previously in sort of complicated bottom-of-the-pyramid structures, I understand that better than most people. But because the internet is this unifying, lifting force, we have a set of data that's coming at us in a large way. The challenge is interpreting that, and we have spent tens of millions of dollars over the last five years trying to build this structure. It is not perfect. We love it, obviously, but that's because it's ours. So how do we take that structure, that interpretation capability, and offer it in a low bono, pro bono way to amplify and increase the field of research, the knowledge that exists? Now, in my adoption journey, I was flying to a city called Patna in India, and I was sitting next to a lady from the Gates Foundation, and she was going to go study the use of contraception by teenagers. She was from Seattle, she had a translator, and I said, no 15-year-old kid is going to talk to you about sex. That is not happening. She said, no, I have a great translator. I'm like, that is not happening. So Angad and I spoke and we said, well, you know, we should just check out: what are they doing online?
What are the YouTube videos they're watching? What are they sharing? What are they talking about? How can we garner this information in a privacy-compliant, GDPR-respecting way? Gather that and create insights that are more useful, because she was going to take that and deploy 50, 60, 80 million dollars' worth of programs on the back of two focus groups in two villages over four days. Yes. So instead, instead, you use the information generated by these people to help them and meet them where they are. So, three quick things that we'd like to at least kick the conversation off with, and this is the beginning in so many ways. There are ways for us to help students research, and we need to create a construct that is as frictionless as possible for students at SOAS or Helsinki or anywhere, for that matter, to Adam's point: why should we restrict it to Bloomsbury? Then take that and maybe combine it with grants from other large organizations that might want to study the same populations in some shape or form. And then the whole idea, for us at least, is that it should be open source. It should be a public good. Anything that comes out should then be available to other researchers. A large challenge I have is that some of the work we do often isn't shared as widely as it should be. We've probably gotten the same brief for studying violence against women during COVID from multiple agencies, and we'll send them the piece of work, but it's: just do it again for us. So I'll take the money, I'm a capitalist, but that's really silly, right? And then for senior academics, we'd like to reimagine the field: what can we be doing differently? How do we scale our knowledge and our learning on that? And then the next piece, which sort of sits across all of this, is the piece on multiple regional departments. It should not be restricted to one. We made the slide before Adam said what he did, but I'm glad he said what he did, because it then syncs up.
How can we democratize, it's an overused word, but truly, how can you authentically democratize the knowledge that we get, the ability that we've built, to be used by multiple institutions across the world? And there is a licensing model and other things in there, I'm sure, but the idea should be to broad-base it, push it as much as is possible, and eventually to build deep domain knowledge. The idea of being able to build your own model, your own interpretation layer, on a set of information is so much fun, because it is not like taking a survey, right? Where you've got a set of data and you can only cut it so many different ways. But if you suddenly have access to 55 TikTok downloads, or a million of them, you can study kids having fun doing a particular thing. Or, we're going to a health conference in a couple of days: 33% of all Gen Z go to TikTok to find out health information, as opposed to trusting their doctor, right? 55% of contraception conversation is on TikTok today, vis-a-vis Google. There are cities and countries where TikTok search is bigger than Google search. I don't know how many of you are on TikTok as observers or posters, but the world is changing at a rapid pace, and information is being generated in a really democratic way. It's an extraordinarily exciting time for us as researchers to study that, and I've had the good fortune of building this tool, and we'd love to see how it gets used in some shape or form. I'm going to pause there. I realize we are hopelessly over time, so Som's been nodding at me now. I'm nodding at you, what do you say? But with that, I think the idea is: what are the other models we build? How do we give access? What does access look like? What would be some problems we look at? How do we solve them collaboratively? How do we allocate resources from our side? We have a team that gives between half a day and one day a week, across a couple of hundred people.
That's a sturdy-sized organization that is focused on solving some of these problems. So with that, I want to thank Som and Mati for hosting us. I'm very excited to have kicked this off, but this is just a soft entrance, right? There's real work to be done from here. Thank you, folks.

We have got 20 minutes for questions and answers. Are you happy to take it, instead of me moderating? Yeah. Do I just... Do you want to turn it over to me? As you're asking questions, it is good to give some context, or a brief introduction of yourself, whatever you're comfortable sharing. Let's go.

Thank you for picking me. Thank you for presenting this information. My name is Marcus, I work in the School of Arts. I may as well get this one out of the way, because I'm probably not the only one asking it. I guess I need some reassurance, really. There was a little sort of passing mention of cynicism about AI at the beginning which I wanted to follow up on. Obviously, I think for a lot of people in the room, we know that AI discourse, and the kinds of conversations happening around AI at the moment, tend to entail a lot of fantasizing, really, about the possibility that all human experience and politics and everything can be contained within the axis of engineering and mathematics, and misogyny in data, quite frankly, is an example of that, I think. So I suppose what I'm hearing is a lot of faith, and I'm looking for a bit more reassurance about awareness of the limitations of that as well. I mean, for example, maybe people are going on TikTok for health information because they can't get appointments in the NHS at the moment. We can study what they're doing on TikTok, but we also have to... there's a kind of sociological element to that as well. And so I suppose, yes, technology can be the solution for something, but it isn't always the whole solution. And it may not even always give us a complete picture.
It may obscure something for us as well as revealing other things. So I guess I'm just looking for a little bit of reassurance from you all about what you see the role of AI being, and what you see the role of the centre being, because as far as I'm concerned, I'm in here for the good guys.

I'm going to use that. I'm going to use that. I'm going to put Peter in the middle of that; he's thinking of questions, so maybe you can speak to that.

Yeah, I'm happy to. Look, it's an easy question to ask, isn't it? It is, right? To ask is to answer, right? And I'm being intentionally provocative there, because there is such a lot of conversation around this. I think of AI as Excel, as math, right? And I think there are probably four questions in that question you asked; I'm trying to parse each one of them. But the first question was the idea that you need reassurance. And I think it's like human beings: what we put into the Excel is what the Excel will give us, eventually. So I think it's important to understand that AI by itself is so far away from this all-knowing, all-being thing that the press makes it out to be.

Well, you need to apply a similar lens. I mean, it sounds machine-learning specific, I'm speaking to you now, but it's that kind of stuff that I'm actually talking about. So what is the reassurance, indeed? Well, perhaps the fact that you can't hear the reassurance there is maybe part of the issue. But I think there's a kind of concern, really, about the idea that all of human experience and politics in society can be made sense of purely through a mathematical and engineering axis. Which, as far as I'm concerned, has got to be not the only axis of insight.

Oh, but I don't think we would say that. I'm hearing only that. So that's why I need reassurance on that.
Okay. So the reason I have a visual anthropologist at my side, and our team is cultural analysts and human scientists and social scientists and semioticians, is because the biggest flaw in the way machine learning is done today is that it's done by engineers. And engineers are awesome, I have many friends who are engineers. But if you don't have a domain application of how a machine learning model is built, then that machine learning model is inadequate. And the machine learning model, as I said in my piece, needs to be continually refined and re-created. That's one piece. Second, from a reassurance perspective, the reason we're here is to enlist broader support in teaching machine learning models to do better. And it's no different from machine learning models that work on, sort of, death row judgments: they don't do any better, because the training set reflects how the death row judgments are being made. So the training set needs to be tweaked, and it honestly is not. So the reassurance that you should ask for is: how involved will you, and others, be in training the machine learning models? That is the answer, I think, right? Because the machine learning models have been trained in that particular context. The other piece you had, which I think is the toughest question in that set, was around: are people going to TikTok because they can't get the NHS? I can't solve for that, that's well above my pay grade. But imagine if you didn't have access to that TikTok information at all, right? So the venue that these people have to have a conversation in is important. Humanity needs to have conversations, and shared learnings across them, which let you learn more from that construct. So there are two separate questions there, but I hope the reassurance piece is reassuring.

Thank you so much. It's really exciting to see people from the Centre for Global Media coming together in the Centre for AI.
So I'm asking my question, I think, in relation to the question that was asked around the assurances. I think it's about research agendas, and about what you hope to do, in a sense. Because when we think about media studies, and the development of the internet, and the development of media studies, we have always been looking at, you know, the semantics and understanding what we do with media. So in a sense, the kind of model you describe provides us with a way of looking at big data, helping us as a research tool. But in a sense, we are still asking the same questions. And I think what we at SOAS want to do, or would like to do, is continue to look into the questions that human beings are interested in. So, in a sense: why do people want to use TikTok? Why do people want to use Twitter, and so on and so forth? The thematics that came up in several of your slides, through the interpretation through machine learning, are thematics that we have been engaging with for a long time. So what is different? Is it big data, things like that? And the other question is: what happens to those people who are the outliers? How do you manage to research them as well?

Yeah, I mean, I can take a first shot, and Angad can give the better answer. Nothing has changed except the vast, vast, vast quantities of data that are available, the ability to interpret that, and the joy in studying it, right? Because your previous processes would be, you know, you do fieldwork and speak to people and do ethnography, which, you know, I've done, and it's deeply rewarding and fun. But, you know, you leave people out in that equation too, right? You create a sample and you extrapolate from there. There are some people that are not online today whom we don't get to capture; there's thickness and tenderness of data we miss. There are so many limitations to this piece, but the study is still the same.
How do we understand human beings better? So you're 100% accurate on that.

Sir, ma'am, sir.

I'm not reassured.

I agree.

I wonder if you think that the more data you have, the better answers you get; that if only we had more and more and more data, you'd get more and more accurate answers. And I don't think that's the point. I think the point should be responsibility, if you look at it from a philosophical point of view of human flourishing and what good is happening. I'm concerned, and my simple question to you would be: what is not captured in that data that is important to us as humans? I'm interested in your answer to that question.

What is not captured? I mean, so much, but that's true for any data set, right?

But then what's the value of what we're doing if it's missing so much?

It is missing some. No, no, I think you're making a leap there. What is not captured in any data set should be the question, right? So which data set is the most comprehensive, in your opinion?

When you say a data set, you're talking about people with internet use.

Yeah. So what method would you use to understand humans better?

If you look at situated practice, talking to you one-on-one, I think I'd find out a lot more about you than I would if I looked at a million of your searches on the internet.

Actually incorrect, because you would never know that I searched for the things that I searched for, right? I would maybe never express that, right?

But you see my point; that does not count. You're only capturing one side.

But you're missing that in a situated one-on-one conversation too, right? Yeah. Like, do you know everything? Let's have a chat. Let's spend the next 10 minutes chatting, and I'll give you my search history, and let's see if they link up. They may be very different, right? Yes. And so then, should it be complementary, or should it be exclusive?
So my point is that what you're doing is generalising: you're taking a big set of data and generalising across the process.

First, what I'm saying is there is no universal truth that we're looking for.

For sure. So no disagreement on that. And I can't speak for the gentleman next to me, because there are real issues. From an institute, I'm surprised to be listening to what is a pitch for a surveillance, a data surveillance, system. That's what I'm hearing. I'm shocked.

I mean, I don't know how to assuage that shock, then. I apologize. I may have come under false pretenses. I'm not pretending to be anything further, by any means. So I could be in the wrong room. But what would you change about this, right? Should we stop doing this? Should we stop the Centre for AI? Like, what is the solution you're trying to propose?

I don't know enough about the Centre for AI.

What should I stop doing in my work?

So, just to be clear, I'm not here to tell you what to stop doing. I'll tell you what my concern is.

You accused me of giving you a pitch, right?

No, you used the word pitch, so you want to be mindful of that. Yes. So I feel I'm experiencing a pitch, I must be using your language. I feel I'm being pitched to, which is just the truth.

There is no pitch here. There is a request to see if we can build models in a more holistic, inclusive way. And I think the question around... is that a bad thing?

So, you're looking for good and bad. My question is: how responsible is it, in the sense of, how much do you think, for research purposes, this genuinely captures the essence of what it is to be human by looking at a child's use of TikTok? I'm not saying it's invalid, that it doesn't tell you something. I'm just concerned that it doesn't tell you...

I didn't say it tells you everything. So I think there's a couple of things, right? I don't know if it's right, but...
No, no, I see a few hands up, but I feel a little attacked, I have to confess. So I reacted very strongly to the word pitch, because the first thing was, you made a pitch, and then you changed it to, I felt like I was pitched to, so there's a nuance change there. So I want to make sure I address that. But I came here to see if we could build a holistic, democratic, inclusive model structure together, right? And I have not made a claim that this is comprehensive or perfect. It is a way to study humans that gives you scale. It has, as I said to the lady before, multiple flaws and limitations, which we are intensely familiar with, right? It's not better than your model of studying me one-on-one; it's different. And if you're comfortable with those two things that I just said, one, can we build things together, and two, can we assume that we each learn something different, then I think we're aligned, and that's the only ask I've come with.

The lady at the back. I'm just kidding.

I don't know if that's the proper research method, but I guess I have a lot of questions, because, I think... So Twitter's API was acceptable at some point, but it's no longer accessible; you have to call Mr. Musk for that. I mean, I've had issues with using sort of commercialized platforms to do social research, including issues like informed consent, and the type of data that researchers collect. Social science researchers are really quite intentional; we don't collect all data about everything. So I want to ask: as a platform that is sitting somewhere between commercial and research, how are you building in these questions of research ethics, which are a lot further off from the rest of the world?

That's a great question. What's your name? Nancy? You just followed me on Twitter.

Yeah, I just saw that. There's no API.

So the funny thing is that we've got two of our colleagues who are applying to the Oxford Institute. So I've put in a good word for them. Please, please accept them.
On our website, because we have to go through a lot of regulatory processes, our engineers have, just today, released our paper. I think today, or this week, or tomorrow. Is it tomorrow? Is it out? It's out. It's out. We released a paper on how our engineers are solving for PII at scale. As a commercial enterprise, we have commercial relationships with data providers, and they have their own terms. So when I get the Twitter API payload, or a newspaper article payload, or whatever I'm getting, there is personal information coming in there, even though the API providers claim that they're scrubbing it. They're not: phone numbers, addresses, locations, all sorts of stuff. And one thing to keep in mind, I'm going to take a detour to answer, is that a lot of the way in which we interpret our data, a lot of the way the data enters the system, is interpreted at an aggregate level. So we're not interested in Nancy's Twitter; we're interested in people who are researching the internet and posting, as an aggregate, right? So the PII doesn't really technically matter, because we're scrubbing all of that, but it's still getting into our systems, and it's still creating noise, and it's still a massive compliance issue, even though the providers are giving it to us on commercial terms. So we've built a system for removing PII, which we've released a paper on, which is what we've just worked on. In terms of PII, it comes in many forms. The first form is, of course, your handle, which is @Nancy, whatever, right? That's your identity, and Twitter will give me that, so I have to delete that. Then there are the things that you've said, which might be innocuous, but I can put them into a system and reverse-engineer your handle immediately, so I have to scramble that. Then there's the time and location you've posted; Twitter will give me that happily; I have to remove that.
Then there's your phone number, your address, your email, your handle, your solicitations for collaborations, your affiliations — I have to remove all of that. So by the end of it, I get very little. But we have to build systems that follow not just the letter of the law but the spirit of the law as well, even where the law doesn't actually require us to remove everything. So the PII challenge is huge. Image recognition is where it becomes really tough, because with TikToks and videos, kids talking about healthcare is fine, but kids posting their locations — a determined researcher can reverse engineer where you are in a second. And technically that's not a PII violation; it's a metadata inference on public data that wasn't explicitly there. So we have to figure out ways to manage for that, and we do it by aggregating machine tags and only presenting the tags to our researchers. This is a massive rabbit hole. So PII is not only about protecting personal identity, but about protecting derived identity signals as well, and all sorts of things. You have to have a bit of a hacker imagination when you're designing PII protection systems. We have lost a lot of money as an organization by refusing to research certain populations, by refusing to work with certain institutions, by refusing even to study certain ages and genders. On that children's advertising communication project, for example, we were originally tasked, with regulatory approval, to study the actual kids and their reactions to the communications, but we took a decision to study just the metadata of the advertising in order to reach those conclusions. So it's a fight we fight every day. And we don't do politics — that's probably the call that he and I get about once a day, from both the left and the right. Some reassurance, hopefully. I think that lady was there — sorry, sir, after you.
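The layered scrubbing the speaker describes — deleting handles, redacting direct identifiers like emails and phone numbers in the text, and dropping time and location fields that enable re-identification — can be sketched roughly as follows. This is a minimal illustrative pipeline, not the company's actual system; the field names and patterns are assumptions, and real PII removal (as the speaker notes) has to handle far more, including derived signals.

```python
import re

# Regexes for direct identifiers in free text (a simplified, illustrative subset).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")
HANDLE_RE = re.compile(r"@\w+")

# Fields that enable re-identification even without a name (hypothetical names).
QUASI_IDENTIFIERS = {"handle", "user_id", "timestamp", "location", "geo"}

def scrub(post: dict) -> dict:
    """Return an aggregate-safe copy of a post payload."""
    text = post.get("text", "")
    # Redact direct identifiers embedded in the text (email before handle,
    # so the handle pattern doesn't partially match an email address).
    for pattern in (EMAIL_RE, PHONE_RE, HANDLE_RE):
        text = pattern.sub("[REDACTED]", text)
    # Keep only non-identifying fields; drop time/location/identity outright.
    kept = {k: v for k, v in post.items()
            if k not in QUASI_IDENTIFIERS and k != "text"}
    kept["text"] = text
    return kept

post = {"handle": "@nancy",
        "timestamp": "2023-02-01T19:04:00Z",
        "location": "London",
        "likes": 12,
        "text": "Email me at nancy@example.org or call +44 20 7946 0958"}
print(scrub(post))  # only "likes" and the redacted text survive
```

The point of the sketch is the two distinct moves the speaker separates: structured fields get dropped wholesale, while free text has to be pattern-scanned, because innocuous-looking text can carry identifiers of its own.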
So I guess what I wanted to ask you is: how open are you to critique of your approach and your system, when making the platform available for academic research and offering teaching as part of this partnership? Because, as has been said, many of us see a logic of pathological normalcy in AI, and many of us are very skeptical about that. And as much as you say we need to include more people, we need to make sure that usage increases so that the AI is trained accordingly and reflects these different perspectives, that kind of suggests that eventually you reach a rough average that reflects one majority perspective. And there is clearly a fear, for those of us studying minority voices or even particular cases, that those voices would be drowned out, not heard and not reflected. So how open are you to having this dialogue with academics and then coming back to your original model and potentially rethinking it entirely? That's the key question. And along with that, another question. You said researchers are there, they're interested in those topics, but they potentially don't have access because they're not computer scientists, so you want to provide a user-friendly tool so that they'll be invited in and have access to AI somehow. The question is, how willing are you to deconstruct the mechanism and the process of that tool for academic research from a more critical perspective? Both of those are great questions. I think what I would do is frame it slightly differently: I would really like you to use it and understand how it works before having a perspective on it.
Because once you use it, you'll realize how actually stupid it is, right? Machine learning is actually a fairly dumb content analysis and filtering mechanism, if I were to be blunt. So then why are you scared of it? And to answer your second question, I appreciate any feedback as long as it's detailed feedback. Feedback at the abstract level that AI is scary is not helpful for me, because I can't do anything with it. That's great. Yeah. I think we're open to — we would like to have critique and feedback. We may not be able to change everything; some things may not be commercially reasonable to change, and I want to be transparent about that. But I think we have built models with feedback from many nonprofits, and we encourage you to do the same. So, sir at the back, then you, and then back to you. Okay, all right. How many questions are left? We're over time. Five, six? Okay, all right, we'll be quick. And we're going to have drinks, and someone can catch me there too, right? Yes, all right. So, I'm a co-founder at Electric Club Boardings. During your talk you mentioned that your data is sourced within a regulatory framework, right? So this builds on the question the previous gentleman asked, regarding surveillance capitalism. I think a lot of people are particularly concerned with surveillance capitalism, where data is rendered as a financial asset. This is particularly evident in products like Google Nest, where bigger corporations monetize data in ways that shape the data itself, so to speak — even Amazon recommendations, for example. So how can we be sure that the behavioral data going into your AI model distinguishes what is clearly influenced by a large corporation or an external organization from what is self-motivated?
I don't think we can do that, because when I go on Amazon, I don't know when I'm self-motivated to buy a pair of sneakers and when I'm responding to an ad, right? So what is my authentic behavior there? Well, so I'm also questioning the overall aggregate data at the moment. If you look at Instagram, a lot of the creative work you see is generated with tools like Midjourney, and it becomes a very complicated question what is authentic — created by humans — and what is created by AI, right? There was recently a bit of a Reddit controversy where an artist uploaded artwork that resembled something Midjourney might have produced. So how would you be able to differentiate what is generated by AI and what is actually generated by humans? Over time, we wouldn't. But that artist was taught by a bunch of amazing people — so is his art truly his, or was he influenced by those people, right? So what counts as authentic is going to change and shift. I think your point is a really good one. We don't run into that today, but it's something we will run into, and I don't have an answer to it. That's an outstanding question, though. Yeah. Hi — have you considered, you know, the wrong people getting their hands on this piece of work? You talked about needing the mindset of a hacker. What about people with the mindset of a serial killer or a terrorist? It would be a very expensive way to fund serial killing, to be perfectly frank. But yes, we have thought about it, and I'd love to discuss it offline. So just let me group the questions. There's the "oh, this is an evil empire" type of question — I think we've addressed all of those, so I'm not going to answer any more of them. If you have any productive questions, that would be super helpful. I don't have a question. So my name is Jake. Lovely to have you guys. I'm not a student here.
I just graduated from the University of York with a Masters in International Relations, and I have a bit of an AI question to try out on you. Some of the tension I'm seeing here is between qualitative and quantitative research, because regarding the dataset question and the general rejection of the idea: every PhD researcher in quantitative research builds out a dataset and makes reasonable guesses and assumptions based on that dataset. It's only as complete as they can make it with interviews and expertise, so the fears the gentleman raised seem a little misguided. They're going to build out a dataset, and the machine learning is just going to accelerate and facilitate the efficiency of the guesses — if I've said that correctly. Very well said, yes. So the AI is only as good as the data it's given. But this is kind of what we do in quantitative research. My question is: what are you excited about in this space at this point in time? And then, what are some roadblocks you think you'll come across? So I think one of the issues we're having right now is that of storytelling. There's a lot of data in the world, and the data is available to anyone at any given point in time. Telling narratives that are meaningful and impactful, that are both qualitative and quantitative in nature, is a space I think we haven't solved. The roadblock is extracting signal from noise, right? Because there is so much noise. Data is actually extremely noisy. Say you have — what's a small number? 1,000? 10,000? — say you have 10,000 Instagram posts and 10,000 TikTok videos with hashtag SOAS, right? That's a 20,000-item dataset, half images, half video. Each of those images and videos has metadata around it: the likes, the comments, the shares, all of that. So that's another layer of information.
They have the time they're posted and the location they're posted from — that's another set of information. Then you have the comments and conversations around each of those posts. Then each of the videos has multiple frames — think of it as an aggregate of 10 or 20 images — and each of those frames has different objects and information in it. So a small number like 10,000, or let's say even just 100 images, very quickly becomes a gigabyte of data. We deal with about four and a half petabytes of data every hour. So it's a lot of information, right? Now, a lot of it is junk, because the likes, the format, the engagement, the comments, the sentiment don't tell a story beyond a certain point, right? So you have to figure out a way to make it meaningful, to find signal within that noise. And that's one of our biggest challenges and always has been; there's no mathematical way we can get out of it. That's where perspective, nuance and context are key. What do you say? Yeah. I'm genuinely excited about building democratic models, right? I know we should definitely pause now, but I think if you can build large-scale human understanding models with lots of people's input, then that's a step in the right direction. Otherwise, human understanding sits in rarefied atmospheres where opinions are disseminated from on high, and it doesn't speak to normal people like me. It's your turn. Well, one of the takeaways of this discussion is that there isn't a last word, right? There's not one way of looking at this, not one way of analyzing data. I'll come back to that in a few minutes, but I just wanted to say that I found this evening really fascinating — great to see alumni come back to the institution to engage, to carry on conversations and friendships that were started within these walls and to take them into new places.
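The fan-out the speaker walked through — posts carrying engagement metadata, surrounding comments, and per-frame objects for video — can be made concrete with a back-of-envelope count. All the per-post multipliers below are illustrative assumptions, not the speaker's figures; only the 20,000-post starting point and the 10-to-20-frames-per-video estimate come from the talk.

```python
# Layers described in the talk: posts -> engagement metadata -> comments
# -> video frames -> objects per frame. Multipliers are assumptions.
POSTS = 20_000            # half images, half videos
COMMENTS_PER_POST = 30    # assumed average conversation size
FRAMES_PER_VIDEO = 20     # "think of it as an aggregate of 10 or 20 images"
OBJECTS_PER_FRAME = 5     # assumed detections per frame

videos = POSTS // 2
records = (
    POSTS                                             # the posts themselves
    + POSTS                                           # likes/shares/sentiment rows
    + POSTS * COMMENTS_PER_POST                       # surrounding conversation
    + videos * FRAMES_PER_VIDEO                       # each frame is an image
    + videos * FRAMES_PER_VIDEO * OBJECTS_PER_FRAME   # per-frame objects
)
print(f"{records:,} records from a 'small' {POSTS:,}-post sample")
```

Even with these modest multipliers the 20,000-post sample balloons to nearly two million records, which is the point being made: most of that volume is low-signal metadata, and the analytical problem is separating signal from that noise, not acquiring data.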
That's really exciting to see, I think — and to engage with a new generation of students and staff, and all that kind of thing. But I'm also an anthropologist, and I've been thinking throughout this evening about how, as an anthropologist, I engage with this, because I'm a very different kind of anthropologist. I do qualitative research, and I've done some quantitative research as well, but I've always felt that anthropology is, at its essence, a process of translation of some sort: explaining, looking, trying to get a partial vision of what you think you're seeing and trying to explain it to someone else. I also work in the development studies space. For me, that has involved a lot of talking to people at a very local level, and then speaking to people at more of a policy level about what they would see if they could spend the time that I've been lucky enough to spend at that local level. So that's a translation process. And in some ways, what you're doing is also a translation process: sifting through the noise to try to find some meaning. I think that is a really interesting and valuable tool. I don't think anyone has claimed tonight that it is the beginning and the end of the story; it's one tool to add to a number of different tools. And maybe the question of how you tell the story, how you humanize that aspect, is where you then reach out to other tools and other methods in interesting ways. I think the center provides some interesting opportunities to do that. Welcome back to SOAS — you've now seen that there is a vibrant culture of debate going on here. And that will be part of the center. It has to be part of the center.
It has to be part of how you interrogate not just the method and the product that you have, but also the joint enterprise that we all share together: how we want to think about the ethics of it, what kinds of questions we can perhaps tackle together, and which ones we perhaps don't want to, or want to use different tools in the service of answering. That's what this collaboration will be about if it's at its best and its strongest, and I expect that that's what it can be. So that's really important. Adam started us off tonight by talking about social justice and narrowing divides. And then we said, well, maybe that's not our starting point, but maybe it's our end point. We want to take an ethnographic approach; we want to understand the terms in which people describe their own lives, and that's part of this mining of AI. But then, hopefully, we come back to that question about democratization, about narrowing inequalities, about what social good can come from some of this research. And I hope, again, that that's a theme that runs through the center; if it is successful, in its best sense, that will be what it's about. So I leave this evening really encouraged by the level of debate, but also by the possibilities of collaboration. As a kind of interdisciplinary anthropologist — I'm classically trained; I lived in a grass hut for two years in northwestern Ethiopia with no one else around who wasn't from that place, but I've worked at lots of other levels too — I think we should, as much as possible, try to explore these other tools and think about the ways in which, as a broad enterprise of understanding people in their own terms, they can open new pathways for us.
And so some will find this very useful and some may not, and that's absolutely fine as well, but I'll be really interested to watch this space and see what happens from it. So thank you very much. Thank you. Thanks.