Hello and welcome, everyone. This is ActInf OrgStream number 7.1 on March 20, 2024. We're here with Jamie Joyce and speaker John Ash. We'll have a presentation, then some opening remarks and a discussion. So thank you both for joining. Looking forward to this, and go for it. Thank you so much, and we're so happy to be here. My name is Jamie, and, as stated, I'm here with John Ash, who works with us at the Society Library. And today we're going to talk about how and why we can make sense of AI deliberations at the societal scale using some proven collective intelligence methods that we've developed to help inform and hopefully improve decision making. This methodology is something we've taught to students at 32 universities through educational programming, and we've even been asked to extend this type of education to fact checkers around the world through the IFCN. But besides educational programming, one of the main things the Society Library is working on is enabling societal-scale debate on complex issues by creating knowledge graphs. Now, I think we were asked to speak here today because of some recent AI-generated deliberation graphs we've been releasing on topics related to AI. For example, we got an advance transcript of the debate between Connor Leahy and Beff Jezos, and then we created a number of debate maps about not only their arguments but different arguments that could be made in the genre of topics they were discussing. So you can go to our website to see those if you like. I'll show you what it looks like in actuality very shortly. But our work automating the creation of deliberation graphs stemmed from our work building them by hand. Essentially, we've been building linked knowledge graphs which map the arguments, claims, and evidence from as many detectable points of view as we can find, derived from about 12 different forms of media.
And this includes anything from economic impact assessments to scholarly articles to podcasts and tweets. And we model the back-and-forth, pro/con reasoning in these knowledge graphs. Again, we used to do this by hand, and when we did, analysts would have to spend thousands of hours combing through archives and the internet to find relevant knowledge so that readers wouldn't have to. And of course, we don't just build deliberation graphs. We also enable this complex linked data from our knowledge graphs to be viewable through flatter, more usable structures, such as micro-voting decision-making models, which have been piloted at the city level, and something we call papers, which is an interactive interface that looks deceptively like a paper but is unpackable across varying dimensions of argumentation and linguistic register; you can even open sentences in the paper to see something like a Wikipedia page of information about each and every sentence which contains a specific claim. But why would we do this? Well, once upon a time, libraries existed because information was really scarce. That's not why they existed, but that's why they were useful, and it was helpful to have commonly held knowledge that people could go and collect. Now we have an overwhelming amount of information, and it's a fine line to walk to parse out irrelevant or duplicative information while also ensuring that you have the maximally context-relevant information about a particular issue. Essentially, it's really easy to bias our understanding of an issue by omission of data. So we feel as though independent, public-serving institutions should exist, like libraries, digital libraries, to ensure that people have the maximal context and the maximum amount of information available about complex issues without having to spend the thousands of hours finding it and analyzing it themselves.
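The linked pro/con structure described above can be sketched as a small data model. This is a hypothetical illustration, not the Society Library's actual schema; all class and field names here are invented for the example:

```python
from dataclasses import dataclass, field

@dataclass
class Artifact:
    """A source item: podcast, tweet, scholarly article, impact assessment, etc."""
    title: str
    media_type: str

@dataclass
class Node:
    """One claim or argument in a deliberation graph."""
    text: str
    supporting: list["Node"] = field(default_factory=list)  # pro children
    opposing: list["Node"] = field(default_factory=list)    # con children
    sources: list[Artifact] = field(default_factory=list)   # provenance trail

# A tiny two-level example of back-and-forth reasoning:
root = Node("The plant's energy production is still needed.")
root.supporting.append(
    Node("Demand forecasts show a supply gap.",
         sources=[Artifact("Grid reliability report", "assessment")]))
root.opposing.append(Node("Renewable buildout can replace its output."))
```

The key design point is that every node keeps its own list of source artifacts, so aggregated claims stay auditable down to the original media they came from.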
So just as the Library of Congress largely serves as the intelligence arm of Congress on policy matters, because the intelligence community works for the executive branch, if I'm not mistaken, I think we need to reimagine what a digital public library in the 21st century could look like to serve the public, one which by default doesn't give you a consensus view, but a diversity of views, and shows you that when it comes to complex policy matters, there are usually multiple competing points of view with evidence and argumentation that are in direct conflict. Simply put, a lot of complex issues have no right answer, but must be elucidated through value tradeoffs and assessment of the evidence that's available. Because when we have questions like "how far do you think personalization of AI assistants like ChatGPT to align with a user's tastes and preferences should go," which was a wicked ethical question that OpenAI posed over the summer, there are many competing answers and philosophies, some of which deploy justifications and reasoning to support why that answer is correct. So, for example, one response could be that AI assistants like ChatGPT should be personalized to a user's tastes and preferences to the fullest extent and highly tailored based on the user's past interactions, preferences, and data, with no boundaries. Another position is that AI assistants should understand and adapt to a user's tastes and preferences but have defined limits on learning sensitive or potentially harmful information, et cetera, et cetera. So essentially, given the limited powers of observation that we have, and that there's only been so much investigation into a given question space, it's helpful to see what work has been done, what reasons and evidence exist, and what preferences people have, which can be informed by the collective knowledge pool of a given civilization, pre-modeled as a deliberation so that people can have an easier time logically processing how information is related.
Now, there's still a gap, and we can do demos later in this talk, but there's still a gap between what we've been able to automate with AI in terms of creating these large maps of societal-scale deliberation and what we've been able to do by hand. So here's an example of one we did by hand. Hopefully some of you remember that there was this great big debate in the state of California about the last remaining nuclear power plant, which is called Diablo Canyon. This was a big debate in the past couple of years. Essentially, the plant was slated to shut down its respective reactors in 2024 and 2025, but the safety and environmental impact and the utility of its energy production have been debated for decades and decades. So we essentially deconstructed, or extracted, the logical reasoning from over 5,000 artifacts of multimedia types in order to create a state-scale representation of what the debate in California, the debate people were actually having, looked like. And we mapped thousands of arguments, linked based on their logical relationships, across economic, environmental, safety, political, ethical, and other considerations. We then fact-checked and steelmanned these arguments as much as possible. Steelmanning means making the argument as strong as possible through supportive reasoning and evidence. And this took over 8,000 human research hours, which is four full-time human research years. I could say a lot more about methodology here, but let's just get the idea of it. And you can see this map on our website if you would like. So here's an example of just one claim that's enlarged, and then the debate about that one claim. The enlarged claim reads: market forces and financial incentives have made the Diablo Canyon nuclear power plant redundant, non-competitive, and undesirable. And that's both a high-level, kind of vague, but also substantive claim. So as we unpack the tree, we see linked argumentation and evidence that's more precise.
I know you can't read it, but basically that high-level claim gets broken down into more specifics, like the details of consumer demand in California across three trends. And then, as you can see here, there are essentially eight levels of argumentation back and forth. That's a combination of breaking claims down into something that can be meaningfully debated and then actually getting down to the data level of people arguing over this claim. Now, what you're seeing on screen is not everything that could be unpacked here; it's just one tiny little string that goes eight layers in depth. Now I'm going to show you how many strings exist if we unpack only three levels deep, because unpacking to the full extent would actually be too large. So here we go. This is just three layers of argumentation in that same graph. It's a very superficial unpacking. These are, again, very high-level and substantive arguments, which we collected from those 5,000 artifacts, deduplicated, steelmanned, cleaned up, contextualized, fact-checked, and linked to references as much as we possibly could. And if we unpacked further, we could see those long eight-layer strings. And this is a 2x-speed video that's not on loop, and it goes on for a minute and 50 seconds. I'm certainly not going to subject you to watching it, but my hope is that you can appreciate the scale we're talking about. Social media has allowed us to scale up the nature of communication, since hundreds of thousands or millions of people can comment on threads online. But the threads in this video are deduplicated, high-quality, real arguments that were made. So even though social media has scaled our ability to communicate at the societal scale, no platform we've seen has really enabled clean and formal debate at the societal scale on these points. Social media content is very useful, but you still have to dig through lots of underdeveloped points.
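A depth-limited unpacking like the three-layer view described above could be implemented as a simple recursive walk over the tree. Again, this is a hypothetical sketch; the minimal `Node` class and `unpack` function below are invented stand-ins, not the actual viewer code:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    text: str
    children: list["Node"] = field(default_factory=list)

def unpack(node: Node, max_depth: int = 3, depth: int = 0) -> list[str]:
    """Collect claim texts down to max_depth layers, indented by level."""
    lines = ["  " * depth + node.text]
    if depth < max_depth - 1:  # stop descending past the cutoff
        for child in node.children:
            lines.extend(unpack(child, max_depth, depth + 1))
    return lines

# Example: a string that goes deeper than the two-layer cutoff
leaf = Node("Data-level evidence")
mid = Node("More precise sub-claim", [leaf])
top = Node("High-level claim", [mid])
print("\n".join(unpack(top, max_depth=2)))
```

The same traversal with a larger `max_depth` would reveal the full eight-layer strings; capping the depth is what keeps the superficial view readable.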
So this enormous model of California's debate was created over eight months with four full-time analysts and then two part-time analysts using our tools and methods, largely unassisted by LLMs, which is what we're working on today. And hopefully it'll let me... there we go. Okay, great. So again, that's the old process, and we're looking to automate it with AI and then actually point that technology toward debates about AI itself. When we ask questions like, what should we do about AI, we shouldn't have to spend thousands of research hours seeing the debate that has already occurred asymmetrically across different publications and platforms on the web. And here's the article on Medium if you'd like to read about our framing of debates about AI. But there are some problems with how we debate about AI currently. What I'm trying to impart today is that Twitter Spaces and online polls and citizen assemblies and books really aren't going to cut it if our goal is to comprehensively and sincerely model the collective knowledge and reasoning we are expressing, to sincerely scale conceptualization of problems and the solution space, and then inquire into truth to try to make optimal decisions. So let's dive a little deeper into the challenges of mapping AI debates, and especially the problems that people tend to run into in other projects. Of course, first is scope. From just a few hackathons that we were conducting at the Internet Archive, where we asked participants to generate topics of debate, they came up with 841 unique high-level preliminary debate categories across topics like regulation and safety and alignment and goals and paths and future threats and risks, and on and on. So AI debates are actually enormous debates. Think about that video you just saw, all the arguments in that knowledge graph scrolling by, and how that was just answers to one question: what should happen to that nuclear power plant.
There are potentially hundreds of questions. There are probably millions of questions you could plausibly ask, but if you bucket them together and regress backwards, there are at least hundreds of questions we could ask about artificial intelligence. And so, to comprehensively map these AI debates, the breadth alone of just the questions we need to answer is unbelievably immense. But before we even start debating questions, we have other problems, like semantics. When people discuss risk and regulation, it's really important to get straight about what we mean by AI. There are many terms that people often use interchangeably with AI. The wonderful thing about government documentation is that it provides, in its regulatory language, definitions of what is meant, which is fantastic, but not everyone on the internet does that, and that can prove a challenge. And then also, if we're sincerely interested in debating and facilitating societal-scale debate, it's really critical that we're sourcing arguments and materials from different stakeholders and groups who hold different views. At the Society Library, we explore the culture of different stakeholders to try to understand the different contexts of groups. That's actually really important, because context can have a great big impact on reasoning, which may be another challenge if groups are left to themselves to communicate across silos.
Here's an example of something that happened recently on Twitter. This was an exchange between AI Safety Memes and George Hotz, where George Hotz said, it's not the models they want to align, it's you, and then AI Safety Memes replied with, both. And so, in order for that exchange to happen in earnest, my sense of it is that there is a great big difference in what is implied when we talk about aligning human behaviors, and I think that has something to do with trust in authority. For people who are more safety-aligned, adjacent to that idea within their particular culture there may be this notion of increasing harmony and understanding when they say AI alignment. And then for other groups in different communities, like effective accelerationism, even though I don't want to paint any group with a broad brush, there may be this inherent, implied distrust of authority, so it brings up the question of, well, who decides? So there can be differences between cultures: even though the same language is used, different meaning can be implied. And so it's really important to be in these communities and try to understand them, to help extricate all the implied meaning when people are making claims.
Now, this is a fun part that I really like. Another challenge of mapping AI debates is around calibrating the level of abstraction of the knowledge we're surfacing. At the Society Library, and this is not our full ontology, this is very simplified, we create links between high-level sentiments and arguments and more precise claims, and then link those to the artifacts. The reason we do this is that it's faster to read through all the sentiments than to be presented with the full scope of all those arguments. It's much easier to read, say, 14 positions than to go through that entire video's worth of 5,000 linked points. So we aggregate things up and allow people to click and unpack. But what do we mean by a sentiment? Here's one: fears about superintelligent AI from the LessWrong community are just a projection. You can imagine this being something that someone would tweet; it's the gist of an idea. And these kinds of sentiments are the kinds of things that often pop up in polling. This is an example of some high-level sentiments that were aggregated in a Polis poll about sentiments around AI. But if we were to debate sentiments like this, it's a little bit too high-level, setting aside for a moment that you can't derive an ought from an is and all of that; but people do, so we're more descriptive in the way that we represent deliberation, and we accommodate the fact that people justify value preferences with reasoning and evidence. So if we were to debate the justifications around a sentiment, we would have to break it down into arguments and each of the pieces of reasoning that support it. So let's go back to that example of the sentiment that fears about superintelligent AI are just a projection from the LessWrong community. Now, this is just an example, so everyone calm down. And here's an example of an argument. I will read
this to you; don't try to read it. Taking that sentiment and actually breaking it up into a high-level argument, here's an example of extricating the premises and conclusion. Premise one: members of the LessWrong community are more likely to display higher-than-average intelligence. Higher-than-average-intelligence individuals are more likely to have experienced more frequent feelings of loneliness in childhood. Several studies suggest frequent feelings of loneliness in childhood are correlated with an increased likelihood to project that God is wrathful. However, according to X survey, most self-reported LessWrong members indicate that they do not believe in God; however, three percent claim they were raised in a household in which one or both parental guardians believed in God. Though, according to Y survey, 64 percent of self-reported LessWrong members indicate that they have a 70 percent confidence level or higher that superintelligence would exhibit wrathful behaviors or attitudes. The powers ascribed to superintelligence are often defined in ways that would make it analogous to perceived powers of God. Irrespective of incentives, motivations, or perceived forms of said entities, human beings sometimes project perceptions of one feeling, idea, entity, or event onto others. Therefore, LessWrong members may have an increased likelihood of experiencing a cognitive bias, resulting from extensive loneliness due to higher-than-average intelligence in childhood, which primes them to transfer projections of a wrathful God, which they are more likely not to believe in, onto the future behaviors or attitudes of superintelligence. So it's just an example. I know it's incomplete, and it's not broken down to the greatest extent, and that's because it would just be too much to read on screen. But if we're going to meaningfully debate these things, we need to take sentiments, build out argumentation to steelman them, and then argue over those
premises. Let's see. And each one of these premises in this hypothetical example, even though some of them are real claims made with actual references, each one of these is supported by a stack of artifacts. And so when we think about how to create a model that shows debate at the societal scale, we are essentially aggregating up and compressing knowledge in a way that's reversible. As an organization, we think it's important to have an audit trail down to the references themselves, so readers can move up and down the levels of abstraction. It's about maintaining provenance while enabling accessibility: we are providing this bird's-eye view and then going into in-depth detail. The problem with this, though, is that of course we lose details as we aggregate up, but we've developed an ontology based on descriptive methods of how people tend to argue, and then how we can make things concrete enough to be actually debatable. For example, people may say things like "nuclear waste is dangerous," but it's kind of a meaningless thing to debate unless you make it really concrete. Are we talking about storing it, transporting it, licking it? In what form, and in what ways is it actually dangerous? When you start breaking it down, you can actually get into something that's meaningfully debatable. But I just really want to drive home the scale that we're talking about. I showed you an example of the difference between a sentiment and an argument, but let's walk through what it may be like to actually map this out. So here are the seven claims that make up that one argument about LessWrong folks, not including the conclusion; I guess the conclusion is there, so let's just unpack the premises, because each one of these premises is debatable. There may be a set of arguments needed to support why each claim is true. This is why it may look like there's just one argument for
each claim when there could be way more, and those arguments, to support the claims, are in and of themselves made of arguments, and so on and on. Hopefully this reminds you of the knowledge graph video I showed you earlier, and hopefully it helps you appreciate the magnitude of what it means to not take shortcuts and to really try to comprehensively map argumentation: breaking it down, then arguing over the premises, then arguing over the data that supports those premises, et cetera, et cetera. We feel this is how you can in earnest map societal-scale deliberation, especially if we care about, not necessarily finding out the truth, because again we have limited knowledge and limited powers of observation, but at least sincerely inquiring into what could be true. So there's this sort of mind-blowing complexity in the volume and structure of knowledge, and we think we need tools to help us accommodate that complexity. Like I said, we work on many different things, like interactive briefing documents, different visualizations, and decision-making models, but just getting that base-layer data is really important. So on top is a layout of our old methodology and how we used to build these knowledge graphs by hand, and now we're moving towards automating with AI: fine-tuning models at different layers of our ontological structure based on our understanding of what ontology accommodates deliberation, generating synthetic data and fine-tuning on it, and pairing that with claim extraction at the artifact level. And we're working on modeling the logical relationships, of which we accommodate eleven. So there's still a lot of work to be done, but we've made a lot of progress, and I'm happy to open up some of our toys and do some live demos if you're interested. But in general, this is just the direction we're moving in, so that there can be this library institution that exists where, if
people want to explore complex subjects, they can ask a question, press a button, and then it goes out and, through a system, collects the relevant information and logically structures it, so people can see the different positions and the different evidence and argumentation that supports those positions. And, you know, we could always use help. We're a 501(c)(3) non-profit organization; feel free to reach out to us to collaborate. And I'm really excited to now take your questions. Thanks. Awesome, thanks for the presentation. John, want to give a first remark or any comments? Yeah. You know, when we talk about active inference, we're often talking about a generative world model which we're using to make predictions about some underlying true world. But in the case of language models, which are dominating so much of the tech space right now, and just the direction that things are going, we're not actually modeling a real world; we're modeling the artifacts of many separate, different world models, and all of those different world models have a different crystallized representation of that world. When we train these models, we're not making predictions about a true underlying world state; we're making predictions about a collective representation of artifacts about the world, where there are many different world states. And the interface to these models of world models is evolving and changing as we grow as a people, as we discover what is right and what is wrong. Inherently, these models are pluralist, because there's so much disagreement about what the true underlying state of the world is. Because language models currently do not have access to the real world, there is going to be a multi-dimensional representation that covers all of those different perspectives. And what we're trying to do is to get, in
terms of the models that exist right now, both some true accounting of the world, but also to keep the user talking to the model. And different models have different interfaces which they preference. Right now, for example, experimenting with Claude quite a bit, it tends to really lean into confirmation bias, meaning that whatever world model you might have, it has a tendency to reflect and proclaim, or project, that you are right, to keep you talking. If it tried to lean into only one representation of the world, it's because it's not really leaning into the real world; it's leaning into our preferred representations of the world. Basically, whether it be preferencing the scientific worldview; whether it is Grok, where they literally say in the system prompt, never be woke; whether it's Gemini, which had the debate about its tendency, when producing generated artifacts about historical periods, to always include minorities in those representations, there is inherently going to be some preference baked in from the world model of the generators. So what we're trying to do here is to strip back that preference, or that bias, meaning we're trying to use as many models and tools as possible to get some form of artifact where you can holistically see the largest possible distribution of actual world models held within that map, without preferencing or biasing any particular one, and give people the capacity to explore those differential world models. In fact, we as a species have succeeded so much because we integrate and aggregate our different perspectives on the underlying true state of the world, because we have such a limited aperture through which each of us sees the world, and we have succeeded quite a bit by summing those differential perspectives through conversation. And now this has evolved to the point that we can use these generative models, which are modeling the conversation
space by predicting, essentially, what the next token is. A model can't possibly access all potential text ever written by all humans, but it tries to do the best that it can. And we're trying, with the full knowledge that it is sort of a fluid space, to create some representations whereby people can honestly interact at a bird's-eye view with all the potential representations that might be held within the model, so that you can get this pluralistic god's-eye or bird's-eye view of belief space about the supposed true underlying world that exists. Those are my thoughts. A lot of ways to go. I mean, just to connect to the concept of affordances and policy selection in active inference, a really important thing that's not shown on those branching trees is the probability of each of those paths. So it can be a pure collection effort, like collecting the flora and fauna of a region at a first pass, and then that provides all of these tools, some of which are quite hard to get to; they're like a well-crafted design, or a thought that might not be immediately apparent in its relationship to a parent claim. And then you're giving the toolbox without even necessarily saying, here's what 80 percent or 99.9 percent of people in this or that space feel. So instead of going the statistical route of averaging over word use, which is already going to be an issue, because it's the fuzzy and the statistical, as well as the dead and the linearized, as opposed to the actual cognitive or generative model that's making those words, this is providing a tool bench, or at least seeing, like you said, from a bird's-eye view: these are the lanes in the highway, or this is the street map, but how you use the street map is going to be all about where you want to go. Do you want to talk a little bit about how we are attempting to crowdsource input through the voting mechanism? Yeah, yeah. So our interest in the work that we're doing is
acknowledging that a lot of the work we're doing could be a middle layer for different tools. If I'm not mistaken, in Tetlock's literature there is some evidence to suggest that one of the distinguishing characteristics of a superforecaster is ingesting diverse amounts of information across a spectrum of potential belief space. So having tools that could increase the quality of human prediction, or make it more likely to predict correct outcomes, is one potential middle-layer application. The other thing that we did, which John just mentioned, is that when we were releasing our AI-generated debate maps about AI, which you can see on our website or linked on our Twitter, people asked for a voting mechanism. So we enabled the ability for people to vote on each and every single node, and then it creates a little mini-graph on top of the node, so you can see, in terms of how many people are participating, how much people believe it, toward disagreement or agreement. Now, we normally don't platform that specific feature, because it has an inherent biasing effect: if people believe that something is highly believed by others, they may be coerced into believing it has higher value, or something like that. Unless you're doing really rigorous, robust, representative sampling of people and then having them vote on the platform, it may just become a majority-rule or self-selected-bias kind of situation. So we've never used that feature before, but people asked for it. In addition, our next steps include taking our decision-making model, which is a micro-voting protocol that aggregates up people's beliefs once they go through each and every section of a belief space, and connecting that with our knowledge graph. So there are many different decision-making tools and prediction tools that could be built on top of this linked data. It's
just enough of a challenge to create this linked data, and to push toward the maximal context we're interested in, in order to make it possible to make an evidence-based or more informed decision. Yeah, and the interesting thing is that, inherently, with these language models, before fine-tuning they're going to output worldviews that are more common, that are more prevalent, whether or not those reflect the true underlying state of the world. If you train a language model only on information from the 1500s, you're going to get a very, very different set of outputs for the distribution of beliefs. And that notion of the probability of the different lanes that could be traced out is something I find to be woefully underrepresented in the work as it currently stands. Hopefully we'll see more of an evolution towards embracing outputs that lean into the uncertainty about the claims being made by the model, and also hopefully bake into the model, really at a parameter level, assessments of uncertainty and assessments of confidence that different people have, and sort of an evolving, time-based binding to the changing of that map. Because, like I said, if we take the snapshot today, it's going to look different than it does in two years. There are various sub-movements in AI that exist now that really did not exist five years ago, and so there will be a process of evolution where these maps, and the information represented in them, will evolve, and people will be able to get a more nuanced representation of the change over time. But it's one of those things where we're going from having to rely on four years of full research work from humans just sitting around gathering documents on their own and reading through them manually, to, okay, well, now we can get at least a snapshot of this
particular moment, to hopefully creating an evolving representation of all of those lanes and all of the probabilities. You could imagine a ton of overlapping Overton windows and their motion through time, as these maps evolve, as the internal representations of all the world models held by this one model (or these many models) evolve, as training continues, and as feedback from the crowd continues. That may actually be really useful over time for predicting certain trends in human beliefs. For example, if we were to go back in time and model the argumentation about interracial marriage, a values-driven, conflict-laden debate at a societal scale, we might see patterns of argumentation like "what about the children," "it's unnatural," "it will destroy the fabric of society." Fast-forward through time to same-sex marriage, an issue people are debating, and maybe we see the same pattern of argumentation emerge: "what about the children," "it's unnatural," "it will destroy the fabric of society." So maybe that will make it easier to predict, when polyamorous marriage seemingly inevitably becomes a legal issue people want to debate, whether it follows the same pattern of argumentation, and whether anything actually differentiates that situation. What would be differentiated in a debate about, say, interspecies marriage, where a new argument about consent, because we cannot communicate with animals, could actually make a meaningful difference in the pattern of argumentation? So we're hoping that over time we can accelerate through the kinds of arguments that are more related to culture and values, which could be correlated with disgust, unfamiliarity, fear, and feelings, as opposed to the reasons people may deploy to justify those things, which we could see over time
it didn't result in what people thought it would. It's another interesting aspect when we compare this to something like active inference: there is supposedly this underlying true world state. At your school, where you hosted it, there was a very rich and intense debate about the nature of truth. But on top of that, there is also this morality space that exists only within us, and the only way to model that distributed morality space is by asking people questions. Inherently baked into these maps, and into debates in the first place, is the truth-seeking aspect, people trying to assess what's true. But there is also this much more complex representation of morality and ethics, which often does not necessarily reflect something in the real world but reflects something deeply felt: very hard to access, very hard to model, very, very hard to predict, and yet able to shift dramatically and rapidly. For a long time the interracial marriage question was very fixed; something like 10 percent of people thought it was okay, and then it shifted very rapidly in the other direction, and people quite quickly came to accept that it made more sense. It's an interesting unsolved question, trying to model that space. Yeah, and one thing I would say is that this is the reason our approach is so context-maximalist. Instead of trying to get people at a citizens' assembly, or through a poll, to express their preferences or values and then justify them, some things are so deeply felt, and justified by our feeling of certainty, that they are almost ineffable; they can be really difficult for people to articulate. So instead we deploy strategies to look across
different stakeholder groups and across the different forms of media in which those groups express themselves. We can begin to link together the justifications people put forth, because instead of relying on someone participating in a poll to articulate themselves well, you can look across the wide span of civilization and its artifacts to see whether someone was able to articulate something, and what the diversity of those articulations is. When people feel nuclear waste is dangerous, where are the people talking about containment, long-term storage, transportation, standards of containment in the United States versus Sweden, coming into contact with it, leakages that have happened, that sort of thing? By looking deep into the space, you find that some people are able to articulate more than others, and published works in particular tend to be more thought-out and more descriptive than what a real-time performance in a debate, or participation in a deliberative system like a citizens' assembly, may yield in terms of reasoning. Yeah, that's an interesting thing. Oh, sorry, were you still going? Daniel looked like he wanted to pop in. Go ahead, John. I was just saying that there is this inherent aspect of polling, which is the dominant form of taking the pulse of the population: the phrasing of the poll biases the responses, and it also boils people's perspectives down to one side or the other, leaving no space for them to say whatever they would say. What is interesting about this evolving space is that we can cast as wide a net as possible, with no presumptions baked into the sampling. With a large language model trying to get a representation of the internet, yes, there may be all these bots in there, and there
may be all these people trying to influence what is online, but you're not framing or shaping the debate with any particular question. You're pulling in every possible thing and then trying to distill, to compress out of that representation, something people can use to make sense of the territory that exists. Hopefully, as time evolves, we'll get clearer and clearer representations of the territory, with better and better maps. And I just want to pull on this thread a little, because this is actually a really important distinction within the Society Library methodology and our ontology. When we built debate maps by hand, the question would actually come last. We would find a topic and unpack the topic space through various techniques (we had about 22 different techniques for how to unpack a space and look for claims). What would happen is that we would aggregate claims up, and when you start clustering them hierarchically based on reasoning and levels of abstraction, eventually you get to what we call positions, which are general orientations about something, and those actually articulate the question. Because, as John said, the way you ask a question can inherently bias a debate. We received an email where someone asked how fast we could make a debate map about "should AI be governed democratically," and I wrote back: well, we can literally press a button and have four or five stages of a deliberation graph done in a couple of minutes, but you may want to rethink how you're asking the question. The question behind "should AI be governed democratically" is "should AI be governed," or even beyond that, "what should we do about AI." Then we can progressively formalize: what should we do about AI? Well, we should govern it. How should we govern it? Democratically, versus all of these other governance mechanisms. And
right now we're releasing maps where we're just asking the question of our system and pressing a button, to demo that we're able to automate this. But really, when it comes to building the libraries themselves, it's about figuring out what questions people are most fundamentally debating, because it's when you find those fundamental questions that you cast the widest possible net of relevant argumentation. It creates the scope of logical relevance, and we don't want to accidentally bias by omission or commission. The best example I can give is our work on climate change. Climate change was the first topic we did R&D on for our methodology, and what we ultimately found is that, of the millions of questions you could ask about climate change and the hundreds of millions of claims that have been made about it, the debates pretty much break up into just six questions: what is it (again, those debates over terms), is it happening, what causes it, what is its impact, what could or should we do about it, and why has this debate lasted so long? We found that when we framed the questions that way, we were never able to find a claim that wasn't logically relevant somewhere in the tree, very cleanly, across those six questions. It allows for a really clear delineation of relevance where it may not seem obvious. For example, claims about the intensity and frequency of extreme weather events belong under "is climate change happening?" (there is increased intensity and frequency of storms, and that's relevant because of such-and-such), which is very different from statements about intense and frequent storms killing people and impacting underdeveloped nations, which fall under impact. And you can see how, on social media, it would be very easy for that argumentation to go differently.
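The six-question scheme described above can be sketched as a simple filing structure (a toy illustration, not the Society Library's actual pipeline; the question wordings are paraphrased from the conversation and the filing choices are examples):

```python
SIX_QUESTIONS = [
    "What is climate change?",
    "Is it happening?",
    "What causes it?",
    "What is its impact?",
    "What could or should we do about it?",
    "Why has this debate lasted so long?",
]

# Each fundamental question bounds one branch of the deliberation graph.
tree = {q: [] for q in SIX_QUESTIONS}

def file_claim(claim, question):
    """File a claim under the fundamental question it is logically relevant to."""
    tree[question].append(claim)

# The same subject matter lands under different questions depending on its role:
file_claim("Storms are increasing in intensity and frequency",
           "Is it happening?")
file_claim("Intense storms are killing people in underdeveloped nations",
           "What is its impact?")
```

The point of the structure is the delineation: a claim about storm frequency is evidence for whether the phenomenon is happening, while a claim about storm casualties belongs under impact, even though both mention storms.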
Someone says storms are increasing in intensity and frequency, and someone else says, yeah, and they're killing people. It's easy for debates to bleed all over the place, but when you find those most fundamental questions, you get clear delineation, so you can meaningfully debate something to conclusion; without that structure, you can debate infinitely and recursively, forever, in all directions. So that was a little discovery we made early on with our methodology that led us to believe we were on the right track in our approach. Awesome, so many pieces here. Some of the terms and topics I'm picking up on that we talk about all the time in active inference are attention, the role of affect in cognition, uncertainty (including estimates of one's own uncertainties), introspection, context, different forms of expression, epistemic foraging. Those are all really cool topics. And then I got the sense that we're saving, or scaffolding, or augmenting the relevance realization for the cognitive entity, like the ant colony nestmate as they're going through the street map, not trying simply or only to make an artifact which itself has an understanding, a kind of answer machine about what we should do about this power plant in California. It's a totally different systems design, process, and method to try to develop something that can address that, again begging the question of what it even means to address that question for an artifact, when it's the people who have to decide and live with it. And then one last piece: some of the recent active inference papers touch on these topics, like "Designing Explainable Artificial Intelligence with Active Inference" by Mahault Albarracin and colleagues, which talks about how, when uncertainty parameters are explicitly defined and used as such in these composable architectures, that variable in the program can be interpreted that way. It
has that semantics internally, and then those internal semantics can be used externally. Whereas this whole reverse engineering, this investigation into really nano- and sub-components of statistical architectures, may come up with some gold and may come up with some vapor; it's also kind of like tracing the connections in brains or ant colonies or other complex systems, where even when you know every microprocessor, it's still hard to predict the effect of two lesions at once. Those are just some of the challenges of complex systems. So I really respect, even appreciate, the depth, the many layers of depth, of these debates, and I think they hold up that affective component while also recognizing that it's kind of like an attractor. An excuse like "I'm tired, why can't I do this, I'm tired" is a totally valid reason, and that person maybe shouldn't be forced to go to that event, but "I'm tired" collapses a lot of other reasons into it; you could go into sub-reasons, and those are our general sentiments. Which is why it is hard to get so precise and follow these trails: they take a lot of attention, craft, and suggestions from others, and unless those are curated, you're never going to get exposed to enough good takes, or takes, period. Yeah, I super love the phrase epistemic foraging. I had not previously heard it, and I love it as an idea. We specifically use the terms arguments, claims, and evidence to preserve claims that people make, like "I'm tired." It's not all fact-based. It's a big misconception about the nature of our work that it's mostly just facts, so I'm glad you brought that up; it's an important distinction for people to know about our work too. Yeah, that's been on my mind this whole conversation: there's sort of a differing philosophy about, you
know, like I was trying to say: a model that has access to the world directly, trying to model the universe and predict things that will actually happen in the real world, versus epistemic space, belief space, which is this aggregation of various beliefs. We can't come to a pure representation of the real world just by evenly sampling; the best we can do is map that belief space and the distribution of beliefs that exist within it. Yeah, take the example of body temperature, which we commonly return to in active inference. There's a distribution of body temperatures; there's no point in denying that it could fall outside a homeostatic range, which basically excludes the people who need it most. So you can talk at the research level about the distribution, then you can make it into a weighted distribution, basically a probability distribution, so the area under the curve sums to one, for a given person through time or for a population in a snapshot. That's the beginning of sampling and inferring from the thermometer readings or the body sensors; that's the beginning of what we can and ought to do, and that's the policy-selection question. But that initial sense-making, going from observations, crashing them against priors, to a probabilistic understanding of how things are, can then be acted upon, and we can model how that understanding gets updated by taking different affordances, like the different things said in a conversation. They are selected from a kind of deck of cards, written on the fly as well, but it is selection if you start to abstract. I'll ask a question from the live chat. Upcycle Club wrote: "I wonder what types of information-theoretic patterns emerge from those debate maps." I wish there were more in the live chat so I could know what you mean by information-theoretic patterns; they can follow up.
They can follow up, but, say, statistical or informational patterns in the maps when you analyze them as their own data structures. Yeah, so one thing I can say that's hopefully relevant to what the person was asking is that our ontology was built descriptively, not prescriptively. Before we developed the strategy of the Society Library, we looked at over 168 different systems, and we found that a lot of them were prescribing a specific ontology onto reasoning. While it's important to accommodate more formal forms of reasoning, actually breaking argumentation up into premises and conclusions, if we're creating a model that can inform tools meant to help human reasoning, you also have to accommodate how people actually reason. So our ontology has 11 different kinds of logical relationships. There are the typical pro/con, true/relevant dimensions of argumentation, but there are also things like "to be more specific, this means," which is a tag we put on the edge relationship between a vague node, like "nuclear waste is dangerous," and the cluster of more precise claims that carry roughly the same abstract meaning but are precise enough to be debated and broken down across several layers. People couch so much meaning in terms: if we're talking about market forces, then we talk about consumer demand, and consumer demand means very specific things at very specific times, et cetera. So in terms of information patterns, our ontology represents those patterns: we noticed, consistently, how people tend to reason, and we worked to formalize it as much as possible, so it's not just random linked data that amounts to a visualization of what Twitter is. Our goal is to put it into formal reasoning while also accommodating the fact that when people express
themselves, they may express themselves more vaguely, or they may have a wider berth of what they consider relevant, which we then have to include because it's such a popular sentiment, and then also correct with counter-argumentation. One way we do that is that we not only have these edge relationships between nodes, we also have tags. Information in the model could be outdated, for instance, but we include it, along with the counter-argumentation about its relevance and a label that says "may be outdated," because we understand that not everyone has the most up-to-date information. Since we're modeling a deliberation, we include what may currently be irrelevant given the evidence that exists, especially if it's presented as factual, but with that contextualizing information attached, so the two can only be unpacked and seen as a pair. So I guess one of the information patterns is that people hold a lot of outdated understanding of a particular issue. People tend to cluster around popular ideas, which are punctuated across different categories, and people often argue in very vague ways.
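A minimal sketch of that structure, with typed edges and contextual node tags (the class names, and any relation strings beyond those quoted above, are my assumptions, not the Society Library's schema):

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    tags: list = field(default_factory=list)  # contextual labels, e.g. "may be outdated"

@dataclass
class Edge:
    source: Claim
    target: Claim
    relation: str  # one of the ontology's relation types, e.g. "to be more specific, this means"

# A vague popular claim linked to a more precise, debatable refinement:
vague = Claim("Nuclear waste is dangerous")
precise = Claim("Long-term storage containment standards differ between countries")
refinement = Edge(vague, precise, "to be more specific, this means")

# An outdated but popular claim stays in the model, paired with its label
# so the claim and its context can only be unpacked and seen together.
dated = Claim("The site has no nearby fault lines", tags=["may be outdated"])
```

The design choice here is that tags annotate nodes (context about a claim) while relations annotate edges (how two claims stand to each other), so counter-argumentation and caveats stay attached to the claims they qualify.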
I would say another information pattern, and again hopefully this is a correct interpretation of what was implied by the term, is that people tend to debate about methodology as well. A lot of the little strings that were debated have to do with the changing of models for how we assessed seismic risk throughout the 20th century: back-and-forth argumentation from congressional hearings and the like, along the lines of "listen, as this plant got older we discovered more and more fault lines, but the model for how we assess seismic risk also changed." So there's argumentation among academia, activists, and Congress about whether that is appropriate or just a way of rubber-stamping the nuclear power plant to continue operating. And then we of course find all of the logical fallacies at play, ad hominem and so on, which we have to deconstruct (sorry for the knocking; there's construction in the building). People will make the argument that a person who produced evidence or reasoning is flawed in some way, but we look for the reasons that justify that, and whether we can model debates about that. Hmm, all very interesting. And looking at that growth, the occupancy, the non-equilibrium stationary state of where people dwell: these are the radio stations people are dwelling on. There's the spectrum of all the stations, and then there are the dwelling patterns, so there's the underlying landscape of possibilities, and then, stacking it up, a sharper and sharper probability distribution of where people are spending their attention, and then all the ways the different computer systems are in feedback with that, at an unprecedented scale and of an unprecedented type. And each one
of these big topics touches upon so many different cascades of claims and questions. I think it was really cool how you talked about the question arising from higher-order, perhaps seemingly latent or oblique aspects of the space. And I guess my question is: do the claims always ramify out like a directed acyclic graph, or how do you deal with cycles of justifying claims? Yes, so again, finding the right question is important, to create the bounds of logical relevance so you actually can argue to a specific conclusion. Now, our "tree", and it's an important point to bring up, actually has a graph architecture. They're not trees; they really do have all these interrelationships, where claims pop up at different layers in the map, across different perspectives, and that's linked data all across. And we have different ways of detaching how linked they are. If there's a slight difference in the context of a claim popping up in one area versus another, we can copy it, but the copy doesn't update in exactly the same way; even though the language is the same, we've partitioned off the reasoning and broken the link between them, because one is only relevant to update in one section. So there's a lot of trickiness with the modeling itself and the updating of this stuff, but it is graph-like and can be cyclical. The idea of finding the question that enables us to argue to conclusion is so we can do just that, and not constantly go around and around, but break things down, formalize, and keep asking that Socratic question, "how do we know this is true?", which tends not to yield the same cyclical patterns you may see in other systems where the
ontology is either really overly restrictive or just totally vague and open. Hmm, and there's always a trade-off, whether working by oneself or in collaborative or crowdsourced settings. Tag systems are very general; however, sometimes a tag is just a few letters, so all the context is brought to the table by the tag's viewer, which in some ways defeats the purpose, because it doesn't resolve the semantic ambiguity of what the tag was about, other than the tautological "it was used how it was used." More structured ontological systems, by contrast, are sometimes harder to use, or have a more restricted range, but you can do more powerful inference with information in that format. Yeah, can I actually share my screen one more time? I just want to show this. Yes, any demos would definitely be cool, and if anyone has other questions, put them in the live chat; we'll look at a demo and then any last questions. Cool. All right, can you see this screen? Yep. Okay, I'll crop it and everything. Cool. So this is our Papers interface. You mentioned tags, and how in a crowdsourcing system you sometimes can't really extricate the meaning of why something was tagged a certain way. This is the same Diablo Canyon data, except we've rendered it as an interactive piece of paper where you can unpack the definitions and so on. Here are the high-level nine positions that people take, and if you click on a word like "economic," you can see (believe it or not, we rewrote this by hand) a more natural-language, article-like version of all of the nodes in the graph under the economic section that support this position. When we add tags, you can click on them, and each has a specific definition. Sometimes we provide more information within a little Wikipedia-style
page for each and every claim, and then you can unpack more information and argumentation about that claim space. The benefit of curation is that we can be comprehensive in explaining and justifying why we label and link things in certain ways, whereas if people are participating in a crowdsourced way, it's work, so they may not provide all the information needed to interpret the original intention behind the linking and labeling. And this makes me wonder whether we could do it for the active inference literature, because people will say things like "Markov blankets explain consciousness," or they don't, or "active inference proves free will to be correct," or proves it's not, or proves we can't know. Some of these high-level discussions go to incredible places, and people have incredible journeys sharing and working through them, with all the restabilizations in the meta-learning journey. Not only does this method touch upon scientific discourse as a source for stabilizing more natural discussion, like "are eggs good or bad for you," which relies a lot on science but which no one person will ever fully know, we can also turn the method back on the scientific literature itself. Because to analyze the claims made in papers, either you have the system you have, or you have it implicitly emulated in your mind somehow, or you don't, at which point it's just the affect that's retained: people will read something and just think "I don't really like that" or "I wouldn't have said it that way," but of course not, because another person said it. That is so clear once it's seen, yet it is enormously hindering to understanding and communicating technical topics like active inference. Yeah, I imagine how many hours people have to put into diving through the conversations that
have already been trodden just to pick up the full context of what has already been explored, to either offer something novel or interpret for themselves where they want to assign their belief. So yeah, we think it's helpful to traverse that for people and create this linked library of ideas. It's like a marketplace of ideas, so people can browse and shop and, so to speak, read the reviews of certain beliefs and ideas, so they can either assign belief or pick up where the conversation left off and continue it as some sort of investigation into what is true or what they would like to believe. I wanted to add, I've been thinking about this during the conversation, that the work we're doing is not just top-down (or however you'd orient it) from the question; there is both a top-down and a bottom-up approach. We're trying to build out a library of artifacts related to a particular debate space and generate from the claims as well. Some of the first work we've been doing is just generating out from the question using these language models without taking in the actual space, because we're in the wild west. Go back a few years and the only way to poll the community was through polls, with all the bias inherent in those. But as we build out these models, we're also using them to reach into specific spaces. So the process for something like active inference might be: okay, there's something baked into the model from this overarching view, from scanning the entire internet, but what we also do, and are more and more moving toward, is pulling in a library of documents from experts in the field, zooming in at a higher level of
resolution, and then either training a model on specific spaces or using various techniques, which I won't go into here, for how we're approaching it. So there is an aspect of trying to get the top level of the map to bubble up from the focused view of documents we pull in, as well as to bubble down from the very large view: a combination of two lenses trying to merge in the center toward something that is, as best as possible, a representation of the belief space at the moment the artifact of the map is generated and viewed. Because that's what we want: we want it to reflect the debate as it currently stands, with full knowledge that it is changing. Yeah, very interesting. And thinking about the epistemically foraging nestmate on the trail: all the trail they cover is either new to them, which is education, or new entirely, which is research and discovery, and the process has to be encouraged to maintain both. A question from Love Evolve: curious how much of the Society Library's mapping is openly accessible? Yeah, we have a ton of maps that are not online. The Diablo Canyon map is what we would consider meaningfully complete; we still have to clip a few strings off of it, because we pretty much stopped when the grant ran out. You can go to our website, and under Libraries you can see, in the AI section for example, some of the maps we've been releasing, which are more topical, related to the zeitgeist of what people are debating; those are more a demo that we're making progress in automation than a full representation of the debate space. Our data is not currently open source, but we're thinking about it. The sincere inquiry into truth is the
responsibility of the entire community, and it should be something commonly held. But we're an absolutely tiny organization, basically two full-time people and four part-time people slash volunteers, and we've been that way for a while, so managing open-source projects takes bandwidth we only have so much of. If we were to get large-scale grants, we would be happy to open source a lot of this and do more community stewardship, collectively building out agentic systems and models and so on. But we're a tiny team doing our work; we'll get to open source eventually. And like I said, it's the wild west; things are evolving and improving every single day. Part of the motivation is not wanting to make too big a claim, "this is the perfect representation of the debate space," which it never will be. But as we evolve, as more and more tools come out, and as we can better and better reflect the human-generated map we made for Diablo Canyon, more and more of that will likely become available to people. It's a process of taking in new tools and trying to recreate the work done at that manual scale and make it automated for everybody.
Yeah, the fast iteration of the work we're doing definitely factors into it. And we have found that when we demo our tools, people get so excited about them, so we've been rolling out publicly available versions of some of them. We never used to release these kinds of things while we were building out, automating, and testing different aspects of the pipeline. But even one small aspect of the pipeline, like the model that predicts what the top-level positions are likely to be for a given question (which can help direct research goals), people love so much, because they can ask a question about their life. People would ask things like "what are the ethics of polyamory?" or "why should I give a TED talk?", and it will make a prediction about the map of the spectrum of beliefs and then generate arguments across varying categories. So that's a tool we were going to make available for people to play with and use, because every time we demo it, people get really excited. There are other tools too, like a little fact-checking model we built that works pretty well; we've had a relationship with the International Fact-Checking Network in the past, and it would be really great to get those tools into their hands. So some things come out at different times, but yeah, it's a lot of bandwidth limitation on our end.

Well, I guess as we wait for any final questions in the chat: where are you going to continue going?

Yeah, so basically our short-term roadmap is that we want to have a library system where anyone can ask a question, press a button, and get these graphs. But we will also be a library where we're discovering what the fundamental questions are and building collections for people, in case people don't have the expertise to know how much a question can bias a specific system. This data can
then go into decision-making models. I won't talk about who we're in conversations with right now, but we've worked to support different members of government at the federal level in creating legislation, and city-level decision-making as well. Our overall mission is to enable more free, informed decisions. That means not only changing how people make decisions but also making sure people have sufficient information and context in order to make a more free decision. There's a lot of literature out there about how you can bias and coerce decision-making in order to achieve a specific outcome, the more empathetic outcome or what have you. But we're really interested in whether we can explore the space of enabling comprehension instead of supercharging persuasion: getting people to have a metacognitive process of assessing all of their options and then, given an assessment of their options that is compressed but also unpackable, make a less biased, more inclusive kind of decision. These are products we've already piloted in government. In terms of our roadmap, our goal is eventually linking all the different tools together: the education with the knowledge graphs with the decision-making models with the policy. If you go on our website, you'll see a number of different programs listed under our Programs section. Something we're not shying away from, and this goes back to the graphic I showed of Congress and the Library of Congress, is that we feel there needs to be a Library of Congress that exists for the public. The Library of Congress is a premier library, and it supports the intelligence and policy-making of Congress, but it's also available to the public. So we think a Library of Congress of the internet, for the 21st century, should exist for the public but also for decision makers, and that could be think tanks, activists, legislators, and other more local decision makers. Getting to the point where we are informing decision-making models, but also seeding the field of optimal policy-making, is something we're interested in. Once you have a sufficient mapping of the problem space and the solution space, and you understand what is quote-unquote known in terms of available argumentation and evidence, and also what is quote-unquote known about value conflicts, is there a way we could seed a new field of inquiry where we find optimal policy? So it's not just majority rule, majority consensus, or evidence-based only: what could be optimal based on people's preferences and the evidence that exists? And given that (my understanding is there's been some experimentation to suggest this) large language models can be extremely creative, especially when compared to humans, we may be able to lean on them. I know you're shaking your head; it's all debatable. We may be able to lean on these technologies, especially in the future as they perhaps get better and better and potentially more creative, to find that through line, that path forward, so that we can solve or address some of our most complex challenges without trampling over people's rights through majority rule or eliminating people's preferences by being evidence-based only.

Well, optimal policy comes up all the time in active inference, and a really key difference with the active inference approach to policy selection, versus, say, reward-driven reinforcement learning, is that policy selection is based on a combination of pragmatic and epistemic value. That's just one way to decompose it, but the pragmatic value is the preferences being fulfilled by future observations: we want to keep the radiation levels at this level, CO2 at this level, temperature, pH. Those can be homeostatic preference distributions that are loose or tight, and policies that do well in satisfying them get high pragmatic value. That's the traditional understanding of an optimal, utility-driven policy. However, the other component of expected free energy is the epistemic value, which includes the information gain expected about future observations and, as we alluded to earlier, the possibility of structural change in the underlying landscape: changes in its dimensionality due to a technology, or other nonlinear or distorting changes. So even the most pragmatically important thing still has at least a foot in the door for epistemic value, and for epistemic humility. Together, the epistemic and the pragmatic are part of the adaptive policy selection approach, and that can be used to model an ant nestmate, a person, broader organizations, and so on.

Yeah, excellent. I'm so excited to hear that; this is a topic that has been widely discussed. That's excellent. John, any kind of closing-ish thoughts?

You know, I have my own work with Cognicism, and my attention was drawn to the Society Library because there's a lot of overlap. I would like to see more and more, in the future, of two things we talked about in this talk: one, the distribution of certainty about the world state, reflected across all the various people this information is drawn from; and two, the notion of time as these debates evolve and change. I think that's essential to really accurately represent what we're doing here. If you take any particular language model and try to use it to suss out distributed belief, it's pulling from a large set of text that has no representation of time unless there happens to be some sort of timestamp in each particular text. It pulls it all together and treats everything as if it happened at the same time. And I mean
that would be a great place to go in the future: getting a representation of certainty, getting a representation of the distribution of sources within the data, getting a representation of the evolution of the debate, and hopefully at some point doing something very live. Rather than having a polling process where you ask a question to a group of people and they're given a couple of options, you have a debate space and a group of people, everybody writes whatever they think, and then you distill and compress that into a map pulled from all of those sources in real time. Rather than trying to pull from everything, it's something really reflective of a particular and exact data set, pulled from a group of people who were given the freedom to say whatever they want in relation to that debate.

And I'll just say that the time factor is something we have accommodated in our manual creation of these knowledge graphs. It's expressed in the claim itself, or embedded in the metadata of the claim, and we also capture the evolution of things over time through timelines of nodes; there's a specific ontological relationship that is a sequence in time. So we do have those kinds of things, but there's plenty we still need to automate. The complexity of our ontology is kind of immense, so we're building out all of these pieces one model at a time, so that eventually they work in concert and you can press a button and have it structure all of this: what we determine to be relevant based on observations of how people argue and what they care about, trying to crystallize and formalize it by deduplicating and steelmanning and that sort of thing, while recognizing that time series are incredibly relevant, because some things become outdated when a new piece of information defeats their relevance. So yeah, we've got a lot of work to do. And I just want to reiterate: we're a non-profit, and we would love support. Volunteering is costly sometimes, because it takes a lot of time to bring people up to speed; once we're an open source project, hopefully we'll have all the documentation written so it's easier. But if you have a way you think you can help us, and you're interested in the project or just appreciate the mission, please do reach out.

Awesome, thank you. Until 7.2, see you. Bye.
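For reference, the pragmatic/epistemic decomposition of expected free energy that came up in the discussion is conventionally written, in the standard active inference formulation (with preferences C, hidden states s, observations o, and policy \(\pi\); this notation is supplied here, not from the talk), as:

```latex
% Expected free energy of a policy \pi, decomposed as in the discussion:
% minimizing G(\pi) trades off satisfying preferences (pragmatic value)
% against expected information gain (epistemic value).
G(\pi)
  = \underbrace{-\,\mathbb{E}_{q(o \mid \pi)}\!\left[\ln p(o \mid C)\right]}_{\text{pragmatic value (preferences } C)}
  \; \underbrace{-\,\mathbb{E}_{q(o \mid \pi)}\!\left[
      D_{\mathrm{KL}}\!\left(q(s \mid o, \pi)\,\|\,q(s \mid \pi)\right)
    \right]}_{\text{epistemic value (information gain)}}
```

Minimizing \(G(\pi)\) thus favors policies that both fulfill preference distributions over future observations and are expected to yield information gain, matching the adaptive policy selection described above.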