Hello and welcome. This is Actinth guest stream number 32.1. It's January 10th, 2023. We're here with Adam Pease, and we will be hearing a presentation followed by a discussion. So Adam, thank you very much for joining today. Looking forward to this, and over to you. Okay, thanks very much for having me. Looking forward to presenting a few bits of initial research, as well as some background on long-term ongoing research into understanding language and trying to mimic some elements of human thought. Everything I do is open source, so I've got some links for you there. I also have a pretty active YouTube channel, although it's been a little quiet lately. There's a lot of material here that I won't be able to go into in depth, so I'm eager to answer questions when we get to the question period, and I'd also encourage folks to take a look at that channel, where there are longer video-format presentations about a whole host of topics that we'll probably touch on today. So I have a large ontology (I'll explain in a minute what I mean by an ontology) that's called SUMO. And I'm also going to reference some work that Facebook has done called their bAbI corpus. So that's the background for the cartoons that I'm showing you. The main topic I want to cover today is a notion of common-sense autoformalization. My objective is to be able to make statements in human natural language, just in English at the moment, ask questions also in natural language, and get answers. But get answers with explanation, something that the modern question-answering systems we hear about each day in the popular press are generally not capable of doing. You might get an answer, but you don't know why. And knowing why, understanding and explanation, is very important.
So we're not talking about just information retrieval. Not talking about Google, where you ask a question and you get a list of documents in which there might be an answer, or even Google's occasional capabilities for very specific sorts of questions where they give you an answer but never tell you why. It's more like, you know, you ask, what's the weather today? You get a nice concise link to the weather. You won't get an explanation for why the weather is what it is. So, what do I mean by ontology? Are you meaning to share? We see the first slide. You're still only seeing the first slide. Okay, I'm on slide three. I'm not sure why that's not showing up, but... Good, just wanted to check; it looks good now. We'll stick with this format then instead of the slide presentation format. Thank you, it looks good. Okay, so here's what I just talked about. Let me give you my definition for what an ontology is. Because this is a word that at a certain point in recent history, maybe a decade or two ago, got so overloaded that just about anything is now called an ontology. And so, to distinguish ontology from things we already had words for, like semantic network or schema, I consider an ontology to be essentially a dictionary for computers to read. And that also requires a bit of unpacking. What is a dictionary for computers to read? Well, for me, it means minimally we've got to have terms, a set of terms that label things in our world. But most importantly, we have to have a set of definitions stated in a computable language. Just stating definitions for words in English or French, well, that's a dictionary, it's not an ontology. And if we don't have definitions in some sort of mathematical language or programming language, then our machines are only going to have the barest abilities to manipulate or do anything with those natural language definitions.
So I think they have to be in a formal language to nail down meanings so that we have shared meaning. We have shared meaning between people who agree to use a particular ontology, and maybe more importantly, we have shared meaning between people and computers, so that the same constraints, the same inferences that people would be able to make with these terms, the computer is also able to make because of these mathematically expressed definitions. So that's what I've spent the last 22 years trying to build up. And this is work that draws on linguistics, philosophy and computer science, as well as logic as a sub-discipline of probably all three. Now, artificial intelligence has become almost synonymous with machine learning nowadays, so I need to explain a little bit why I'm doing this. This is sort of old-fashioned AI, writing down lots of rules. Why do we still need to do this? Well, I'm a big fan of Daniel Kahneman's work. This is just one justification of many, but because it's very well articulated in his work, hopefully it can be at least the start of a justification for some of you who might be unfamiliar with the symbolic approach that used to be the primary approach in artificial intelligence. I think he makes a good case that humans have two modes of thought. We have an instinctive, intuitive, sensory, pattern-matching kind of process that's good for face or voice recognition, or for making a snap judgment about what to do in a particular situation, learned from the presentation of many, many similar situations. And that mode of thought fits very well with statistically based machine learning: neural networks, deep neural networks and so forth. That's an important part of intelligence, and it's led to a lot of important commercial activity.
But there is another mode of intelligence that Professor Kahneman, I think, illustrates quite clearly: a slower, more deliberative, deductive approach where people can explain what they do, as opposed to just giving a reaction. And it seems like large parts of the AI community are starting to recognize that, yes, there is still a set of things we require our computers to do that they're not doing very well with the current neural, numeric approach, and that we need another set of tools for. And so we're starting to see this keyword, the neurosymbolic approach, of trying to balance statistical machine learning with some sort of explicit representation in symbolic form. It's this latter bit that I've been doing a lot of work on, and I think it's the combination of these two main approaches that's going to lead us to some more exciting capability. That's just my assessment of the state of the field: there is some emerging evidence that, by integrating these two approaches, we have the potential to handle a broader range of things that would look a lot more like human intelligence. Along with this grand goal of doing AI, I've also been applying this work of creating a large dictionary to some far more prosaic stuff in data integration. The bulk of the commercial consulting work that I've been doing over the past 10 years is really a software engineering methodology of trying to capture the formal meaning of things that we use in computer science: databases, spreadsheets and so forth. You know, anytime you have a spreadsheet, you usually have a set of columns across the top with some labels, and it's very unusual for people to have any sort of explicit definition of what those labels actually mean.
It's usually just assumed that people, by looking at those labels, because they're in the same community or can ask for an explanation, get a good sense of the intended semantics of a particular column based on its label. And what I've found is that it might work for a spreadsheet where you've got a few dozen columns and you're working closely with longtime colleagues. But as soon as you get to a larger organization with a larger set of symbols and labels, where people may be working at different locations, may never even have met each other, where interactions occur over the space of years in large multinational corporations, people's initial assumptions about how clear their labels are are usually completely wrong. And so people wind up using data in dramatically different ways, inconsistent with its original intent, precisely because nobody's taken the time to actually define things. And even in the rare cases of organizations that do have data dictionaries that define the intended meaning of their terms, they tend to define them in very informal ways, using natural language definitions, which means the machines can do very little to help ensure that people's intended specifications of semantics are consistent across a large body of terms. You might be able to have a human being keep a data dictionary of a hundred terms consistent and well organized over a long period of time. But once you get to thousands or tens of thousands of terms, it's well beyond human capacity to keep such a product consistent. You need automation. And unless you have definitions that are explicit and in a computable language, the computer can do very little to help keep those definitions consistent. So that's been one of the big byproducts of all this work so far.
And I use techniques like automated theorem proving, which I don't have time to talk about now but would love to discuss in detail with anyone who's interested, to keep these definitions consistent. So I have a logical approach to the semantics of terms, lexical semantics, the semantics of the words we use. And this extends also to computer science constructs like labels for tables or columns or database elements and so forth. And this is useful pretty much anywhere. Anyone who has a spreadsheet, anyone who has a database: this is something people don't realize they could make use of. People seem to accept that they're going to need a lot of human-to-human communication to explain the meanings of terms. And there's just another way, right? That's what I've developed, and that's what I'm basing this work on language understanding on. There's a common belief that this is just too hard: there's too much of it, meanings change, and so forth, or alternatively that machines should be able to come up with these things automatically. But once you take the perspective that there is a need for precise semantics for our labels, and that this precise semantics is something like a definition in a dictionary but in a formal language, not just a graph or a list of relationships or likelihoods or probabilities, then it starts to be hard to imagine how a machine would just come up with this sort of thing automatically. And of course, we have another venue in which we take this harder approach and don't expect machines to just do everything automatically, and that's programming, right? We have many, many millions of programmers in the world now. They're expensive, and programming is a time-consuming process, but we invest in it because there simply isn't another way to do it. And we make this enterprise of programming feasible through a couple of things.
One is having very rich languages, so we can say everything we want to say; we don't want to be restricted in the sorts of programs we can write. Similarly, we need to make sure we're not restricted in the sorts of definitions we can write, and the way we make that practical is through reuse. If you're a modern programmer delivering some software, you're writing maybe 1% of the code that's delivered. Most of what you deliver is going to be a set of large libraries and reusable components: an operating system, a database, a web server and any number of other things, such as Java collection classes or NumPy libraries if you're doing Python. The way in which modern software development is practical is through reuse. And that's the other thing I've been working to provide: this thing called SUMO, which is a library of definitions that can be reused, and which takes this problem that many people would otherwise consider completely intractable, the manual creation of knowledge, and makes it eminently practical, because you're actually reusing most of your semantics by virtue of using this library. So SUMO, like the sumo wrestler, is the Suggested Upper Merged Ontology, started in the year 2000. It started as just an upper ontology, so the acronym is a bit of a misnomer, a bit of an anachronism if you will, because it's now really a comprehensive ontology, not just an upper ontology. It's grown to about 20,000 terms and 80,000 handwritten statements in an expressive higher-order mathematical logic. It includes links to vast fact bases like the YAGO system, and lots of translations from its authored format into some of the standard representations used in the automated theorem proving community. And as I mentioned, it's all open source and online, and every version has been open source; there's nothing held back for commercial reasons. Let me mention WordNet.
So another common misconception in ontology work is that we have to use words and follow the definitions of words. As we should all know, words are ambiguous and words are polysemous: the same word can have multiple meanings. So we've created a very clear division, but also a relationship, between the senses, the concepts we have in the world that have formal definitions, and the labels that we as humans put on those concepts to communicate them to each other. This also makes it easy to ensure that our ontology is not tied to any one particular language, because anyone who's multilingual knows that an English word and a Tagalog word can have the same underlying semantics, the same underlying real-world referent, while we use different labels. So we need to make sure that we can have vocabularies that are mapped to one common formal ontology that tries to explain and define the concepts in our world, and not let language, labels and intuitions about those labels creep into our formal definitions. So we reused Princeton's WordNet, a project that's been going on for about 30 years now, I think. And it has projections into multiple languages: Polish WordNet and Tagalog WordNet, for example. There's continuing work on that, which I'm very excited about, because I started the Tagalog WordNet back in the 2000s at De La Salle University, and it's great to see that sort of effort continuing. There are WordNets for Arabic and Chinese and many other languages. So we keep those labels, those human dictionaries, distinct from but related to the formal inventory of mathematically expressed concepts. Back in the early 2000s, we took on an effort to go through all 117,000 synsets in WordNet, all 117,000 linguistically labeled concepts, and map them all to SUMO one at a time, by hand. It was an enormous effort that we couldn't really have done automatically; it required human inspection.
We invested that effort, and it's since paid off for the linguistic work we do with this ontology. Why do we need both? Well, the formal definitions are things that just aren't in WordNet or any other dictionary. You might have the word "earlier" and a natural language definition of what it means for one event to be earlier than another, but we need a precise mathematical, logical definition of what earlier means. And SUMO has this, and many, many other axioms that individually are trivial but together add up to an understanding of the world that machines just don't have without it. We need a machine to know that if one event is earlier than another, then the end point of the first event is before the beginning point of the second event. So we've got a formal mathematical definition of earlier, which you just don't have in a dictionary. This allows a machine to do logical inference, automated inference, to give you conclusions and to explain those conclusions. That's not a criticism of WordNet; they're just different products with different needs, and we need both. Also, I get a lot of questions about what kind of logic we need. Do we really need a very complicated, expressive logic? It's very popular nowadays for people to use taxonomies or knowledge graphs. Well, knowledge graphs are essentially semantic networks, a knowledge representation that was created back in the 1960s. And so it does surprise me that what's old is new again in artificial intelligence somehow, and that knowledge graphs are a big thing now, because it's a knowledge representation technology that leaves out a lot of the things we would need to say about the real world in order to define the terms we use. So natural language definitions aren't enough, and if we believe that defining our terms, our concepts, is good, which particular language do we choose? And I've had people often say, usually without any real backup, oh, those expressive constructs are the rare cases.
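The "earlier" axiom just described can be sketched in a few lines. To be clear, this is an illustrative Python model, not SUMO's actual SUO-KIF encoding; the Interval class and function names here are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Interval:
    """A time interval with a begin and end point (hours, for illustration)."""
    begin: float
    end: float

def earlier(a: Interval, b: Interval) -> bool:
    # Paraphrasing the SUMO axiom: a is earlier than b exactly when the
    # end point of a is before the begin point of b.
    return a.end < b.begin

breakfast = Interval(begin=8.0, end=8.5)
lunch = Interval(begin=12.0, end=13.0)
print(earlier(breakfast, lunch))  # True: breakfast ends before lunch begins
```

A trivial fact on its own, but it is exactly the kind of constraint a theorem prover can chain with thousands of others to reach, and explain, a conclusion.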
Well, I got frustrated with that. So I did a paper recently, well, now almost two years old, trying to quantify how often we use constructs in language that require more than just a graph of relationships. And I came up with some statistics. I'm not going to go into this paper in great detail, but if you're interested, I'd love to follow up with you. I looked at two large corpora. The Brown Corpus is kind of old and kind of small by modern standards, but I also used the Corpus of Contemporary American English. I took a random sample of sentences and created an automated system for detecting various features in language whose translation requires an expressive logic, such as modal expressions (can, may, should, might), expressions of authorship (said, wrote) and epistemics (knows, believes). These are words in language that trigger the need for a more expressive representation. I could give you a lot of detail on why that's directly entailed; this is not just my opinion advocating a more expressive logic. These are strict requirements that are well known in linguistics. But at least now it's well quantified that about 50% of statements taken from a random sample of a large corpus do actually require these more expressive semantics. So I think I've made a strong case that we can't just get away with graphs unless we want to leave out about half of the things that people say to each other. And that tells me that we really do need this more expressive logic. Okay, so that's all the preamble. That was the library of semantics, of meaning, that I already have. Now how can I use that to try to create a system? And this is research, right? It's relatively experimental, even at the moment.
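A toy version of that corpus study can be sketched as a trigger-word scan. The word lists below are illustrative, not the paper's actual feature inventory, and a real detector would work over parses rather than raw tokens.

```python
# Flag sentences containing trigger words whose semantics exceed a plain
# relation graph: modals, authorship verbs, epistemic verbs.
TRIGGERS = {
    "modal": {"can", "may", "should", "might", "must"},
    "authorship": {"said", "wrote", "stated"},
    "epistemic": {"knows", "believes", "doubts"},
}

def needs_expressive_logic(sentence: str) -> bool:
    words = {w.strip(".,!?").lower() for w in sentence.split()}
    # True if any trigger category intersects the sentence's words
    return any(words & trigger_set for trigger_set in TRIGGERS.values())

print(needs_expressive_logic("Mary believes the office is empty."))  # True
print(needs_expressive_logic("Mary went to the office."))            # False
```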
The attempt is to create a system where the machine can translate automatically from English sentences into a formal, precise semantics using these terms out of the ontology, terms that are themselves precisely defined. So I have, at the starting point, English statements or questions. I translate these into logic. I want to be able to send them to a powerful theorem prover that handles at least first-order logic plus equality, but preferably higher-order logic so I can handle some of these modal expressions. There are a few such provers. And I want it to engage in its theorem proving not only with the statements and the question that a particular user has provided at one time, but also with this rich background that describes the world, so that the machine is not just doing simple matching, even with these powerful tools, but is really taking into account background knowledge, world knowledge, in the same way a person would. And then, when it's got an answer, I want it to be able to explain what it did. Theorem provers have this capability already; it's built in. They were designed to provide proofs, because a lot of the work in theorem proving is in advanced mathematics. There's a quiet but significant community of mathematicians and computer scientists who are using automated theorem proving to prove novel theorems in mathematics. And because they're in mathematics, they have a very strict standard of what constitutes a result: you have to show a proof, just like almost all of us probably experienced in high school geometry, proving one triangle is congruent to another through, say, the side-angle-side process. Machines can now do this, and much more significant and complicated and sophisticated sorts of proofs, and provide a detailed breakdown of how they reached their conclusions. And I take this to be something that's required.
If you ask another person, what's the answer to question X, and they give you Y as an answer, then you want to know: what are the details? How did you come to that conclusion? It's not enough to say, well, I told you so, so believe me. Sometimes you might get away with that, but not all the time, and especially not for anything complicated. So our machines should be able to do that as well. I've made a number of forays into this grand goal over the course of my career. I started out with something called Controlled English to Logic Translation, CELT, where I had a hand-built grammar of a small subset of English that was designed to be unambiguous. It was all written in Prolog, and I was able to create expressions using SUMO terms in its logic, which was mostly first-order logic back then in the early 2000s. It was somewhat successful, but it was a fairly restricted subset of English, and so the parser would break and say "syntax error" all too often. And that was kind of frustrating. It was like what one experiences today with something like Alexa or Google, where you ask a question and it says, I'm sorry, I don't understand, right? And so people wind up governing their own utterances to adapt to the machine. People were able to do that with my CELT system, but it's still kind of frustrating. I'd like to be able to handle text that was written not for that particular system, and not dumbed down on the fly by a human, but rather any arbitrary sort of text. So then I started working with Stanford's CoreNLP system, using dependency parse transformations and trying to turn dependency parses into logic. I did have some success with that, but I found that even as good as machine-learning-based statistical parsers are, there are still a lot of cases where they break down. In fact, they report a lot of metrics that look very encouraging, usually on a per-word or per-symbol basis.
They'll say 97% correct on unlabeled attachment score, which is one metric used in the dependency parsing community. And that sounds like a very high percentage, but your average sentence in English, depending on who you talk to and how you collect the statistics, is about 24 words. Well, if you look at 97% raised to the 24th power, then you have a problem: actually fewer than 50% of your sentences are going to be parsed completely correctly. And again, that's unlabeled attachment score for dependency parses, which merely means that two words are found to be in a dependency relation, without a commitment to what exactly that relation is. Really you need the labeled attachment score, which of course is lower. And you need it to be correct for all of the words to have a coherent understanding of a sentence, if you're going to do any sort of automatic translation. So although the systems are good, and they've certainly gotten better since 2015 (Stanford's Stanza is yet another few percentage points better than CoreNLP in terms of its dependency parsing), it's still not reliable enough, not consistent, not stable enough for me to use as a basis for translation. And then there was the whole problem of how to actually take that output and reliably create logic from it. At that time I was doing a hand-built rule base, and that just didn't work in the limit. It was encouraging; I think there's potential there with enough people working on it, a bit more research, but it wasn't the way I thought I could reach success. So I'm doing something different now, something that I guess is more prevalent in machine learning these days: instead of a pipeline of operations, trying to do everything all at once with extraordinarily large corpora, training a system to do autoformalization of language, to go straight from language to logic without any intermediate representation like dependency parsing.
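The per-sentence arithmetic just mentioned is easy to verify, assuming roughly independent per-token errors:

```python
# 97% per-token unlabeled attachment score, compounded over a typical
# 24-word sentence (assuming roughly independent errors per token):
per_token = 0.97
sentence_length = 24
per_sentence = per_token ** sentence_length
print(round(per_sentence, 2))  # 0.48: fewer than half of sentences fully correct
```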
This is based on some of the success of my colleague, Josef Urban, who's been doing this sort of work for mathematics. One of the challenges of using automated theorem provers for mathematics is that, as formal as mathematics is, it's not really quite formal enough. In a mathematics textbook there's still a lot of text, and there are still a lot of things that aren't themselves precisely defined. There's also a lot of mathematics that's not in a common, computable format. So for a long time there's been this problem of how to take mathematical textbooks and turn them into extraordinarily detailed formalizations that an automated theorem prover can handle, in first-order logic plus equality or some other kind of mathematical logic. Josef has had considerable success at this, and we decided to team up on the even harder problem of looking at unrestricted text, not just mathematics. They've been using Google's neural machine translation system, which is built on top of TensorFlow. At the moment we're using a relatively old version of it; we tried a newer version, and it turned out to actually be slower for our purposes. But we need to do a lot more experimentation and make this work more public, so people can tell us where we're going wrong; maybe there are better machine learning libraries we could use for this. So this is the architecture. Now I want to tell you a bit more about how we actually come up with this set of natural language and logic pairs, because like other work of this sort, the library of input-output pairs you're trying to train on has to be really, really big. Orders of magnitude bigger than for any sort of pipeline approach, because you're trying to tackle the whole problem end to end and there's a lot more variability. Your statistical significance for the appearance of any given feature correlated with any other given feature is really low.
There's a very long tail in language of things that are not seen very often, so your corpus has to be very big. How do we get that corpus? It's certainly not feasible to translate a lot of English by hand into logic. So how do we get this thing to begin with? Well, we were aware of this project done at Facebook called the bAbI corpus, in part because Tomas Mikolov has returned to his own country, the Czech Republic, and now works with Josef Urban at CTU Prague. So we had the benefit of getting some of his perspective on this work. It was basically an effort to try to teach machines how to do simple inferences, and we'll look at some of those simple inferences in a moment. In order to do that, they tried to generate a very large synthetic dataset of simple natural language sentences and simple natural language inferences that they wanted the machines to be trained up to handle. Ultimately they used only a few rules or patterns; I think Tomas said there were something like 20 different patterns. I've looked at the code; it's written in Lua. It wouldn't be hard to rewrite in a more common language. It's not particularly complicated code, not a huge code base: a lot of simple patterns. A great idea, but I think we can do better at it, given the resources that SUMO provides for generation as well as for understanding and anchoring meaning, right? Their knowledge representation, such as it is, was things like: you have a bedroom, a thing, a location, actors existing in this world. It harks back to the blocks world of the 1970s and SHRDLU, Terry Winograd's system, which did operations in a world of blocks. A constrained world: there were actors, there were a few things that could be manipulated, and the system was able to understand very simple utterances.
Well, they've kind of turned this on its head: they created a knowledge representation for a certain world and then tried to generate a lot of sentences from it, to create this synthetic model of things that a machine should be able to do inferences about. So here's some of the content of their questions. "Fred is either in the school or the park. Mary went back to the office. Is Mary in the office?" The first sentence is just a confusion element of the corpus; really, if Mary went back to the office, then yes, she's in the office. "Bill is either in the kitchen or the park" is another confusion sentence. "Fred moved to the cinema. Is Fred in the park?" Well, no, he's not in the park, but "park" has been seen as a token earlier, and so they wanted to make sure that the automated learning system isn't just going on proximity, which is a good thing to include in a dataset. So, lots of stuff like this, and more complicated things too; I've presented only some of the simplest examples. But it's all generated from a very simple knowledge model, with concepts that weren't really defined, and a massive presentation of mind-numbingly tiny variations across a wide variety of sentences. So we started out thinking, well, let's just cover the stuff that's in the bAbI corpus, so we understand what they were doing. And the big advantage we thought we could bring to this is that we don't have to learn logic. In a way, they had a simple corpus that was trying to accomplish something very hard: to teach machines, just through massive examples, how to do logical inferences, or a certain subset of logical inferences. We have an easier problem. If we use a theorem prover, we don't have to teach the system to do logical inference; we already have that. We just need to get the knowledge representation from English into logic and then let the theorem provers do their thing. So again, it's a very simple world.
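The location questions above can be reproduced with a toy tracker: remember each actor's last stated location and answer yes, no or unknown. The story format here is a simplification for illustration; in the approach described in this talk, a theorem prover draws the same conclusion from explicit SUMO axioms instead of a hand-coded loop.

```python
def answer(story, who, where):
    """Answer a bAbI-style 'Is X in Y?' question from a list of moves."""
    location = {}
    for actor, place in story:   # each fact: actor moved to place
        location[actor] = place  # later facts supersede earlier ones
    if who not in location:
        return "unknown"
    return "yes" if location[who] == where else "no"

story = [("Mary", "office"), ("Fred", "cinema")]
print(answer(story, "Mary", "office"))  # yes
print(answer(story, "Fred", "park"))    # no
```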
They had objects, locations, actors, states, actions, and a few other decorations. And we tried to figure out what they really meant, which was hard in a way, because they didn't have a distinction between instances and classes. In SUMO we can say that in an action, one thing is an object transferred from one location to another. They have this notion of being "gettable", which is really a modal. So we had to figure out what they meant. They didn't really mean a modal; they meant something that can be transferred, and then there has to be some kind of transfer example. We wound up interpreting a lot of their symbols not as classes. The notion of an apple is a class, but the apple that gets transferred in a particular sentence is an instance. This is the difference between existential and universal quantification. So we'd create a term like Apple1, which is an instance of the class Apple, and it's a gettable thing, and it's located in a particular area, et cetera. I don't have a lot of time to go into knowledge representation, but if you want a full course on SUMO, I can do that at some point, and I know there are a lot of videos online as well. So maybe I'll skip through a little bit of this. We wanted to make sure we covered all of their stuff. They referred to things like chocolate and milk and garden and container and box, and we actually had all of this in SUMO. So, getting back to this notion: if you had to define all this stuff from scratch, you'd be tempted, just like the Facebook folks were, to just use the labels and assume everybody means the same thing. But we didn't have to do that. We already had these terms precisely defined in SUMO, and we were able to just reuse them. So we made a big table.
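The instance-versus-class move can be sketched as a tiny generator: mint a fresh instance term for each mention and emit SUO-KIF-style assertions as strings. The relation names `instance` and `located` are real SUMO relations, but the helper itself is a hypothetical illustration, not code from the actual system.

```python
import itertools

_counter = itertools.count(1)

def mint_instance(cls, location):
    """Mint a fresh instance of a class and assert where it is."""
    inst = f"{cls}{next(_counter)}"          # e.g. Apple1 for class Apple
    return [
        f"(instance {inst} {cls})",          # this particular apple
        f"(located {inst} {location})",      # a fact about the instance
    ]

print(mint_instance("Apple", "Kitchen1"))
# ['(instance Apple1 Apple)', '(located Apple1 Kitchen1)']
```

The point is that each sentence's apple is existentially quantified (this apple), while the class Apple carries the universally quantified definitions.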
This is just a portion of that table, covering all the different things they referred to that we already had, or in a few cases didn't have and needed to add. But even adding the new concepts and their definitions was easy, because the new concepts had definitions based in turn on other terms we already had; we just had to combine them in the right ways. We didn't take for granted that we were just using the first meaning of things. For example, they had the word "pajamas" in their database. Well, "pajamas" has a couple of meanings in the dictionary. I wasn't all that familiar with the second meaning, but it does exist, and we want to make sure we're using the right one. So we also did mappings to WordNet for the terms we defined, so that we had disambiguation potential for our natural language. There were a number of things, because of this small and artificial world, where they had inferences like: if you're hungry, then essentially you go to the kitchen and eat something that's in the kitchen. That doesn't hold in the real world, obviously. We're not in a fantasy, constrained world where hunger immediately entails that you go to the kitchen and eat an apple. So we had to create some bogus rules in SUMO, very specific to this corpus, to try to replicate the inferences they were doing. Here's the rule that gives the right answer for this corpus. It gets a little complicated to state, but we can state it in SUMO: if there's a cognitive agent that's hungry at a particular time, then there's a translocation event desired by that agent, and that translocation has a destination of the kitchen, where there's a getting of an apple. It's a rule that doesn't belong in SUMO on any permanent basis, but it's what's needed to cover the semantics of this particular corpus. And we looked at things like being bored and tired, which they had.
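For readers without the slide, the corpus-specific hunger rule just described might look roughly like this in SUO-KIF. This is a reconstruction, not the slide's exact axiom; the term names `Hungry`, `Translocation`, and `Getting`, and the exact structure, are assumptions here.

```
;; Reconstruction of the bAbI-specific rule: a hungry cognitive agent
;; desires a translocation to a kitchen where an apple gets gotten.
;; Sketch only -- not the axiom actually shown on the slide.
(=>
  (and
    (instance ?A CognitiveAgent)
    (attribute ?A Hungry))
  (exists (?T ?K ?G ?F)
    (and
      (instance ?T Translocation)
      (desires ?A ?T)
      (instance ?K Kitchen)
      (destination ?T ?K)
      (instance ?G Getting)
      (instance ?F Apple)
      (patient ?G ?F)
      (located ?F ?K))))
```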
They basically say: if you're bored and tired, you go to sleep. I wish that were true for all of us hardworking engineers, that we had that option at all times. We don't, but that was what covered the semantics of the corpus, so we had to do similar things for bored and tired. They don't really have weights and measures and units in the bAbI world. SUMO, of course, has a detailed ontology of weights and measures, including all the SI units. And we had to figure out how we were going to translate the hacks in this corpus, where things have a numerical size of one, two, or three, something like that. How were we going to represent that? So we figured out a similar hack to cover it. All right, so we did this exploration to make sure that we could cover one of these sample worlds. But what we really want to do is generate a much larger corpus, with a more interesting, more varied set of things that are referred to in the real world. So we wanted to do synthetic text generation as our first step, to seed the system long-term. We know that we need to go beyond synthetic text; we're not there yet. We're still in the process of creating this very large set of synthetic sentences, because we can start with an idea, a knowledge representation of what we want to express about the world, and generate syntactically and semantically valid logic expressions from it. And at the same time, because we've created the state of the world artificially in a program, we can also create a sentence that models that state of the world. So we go in both directions, to logic and to language, from a very ad hoc sort of expression in some Java code about what's true in the world.
And that allows us to do a lot of variation, but it's never going to cover the full expressivity of natural language. We think it will cover enough, though, to give us a head start, where we can then start seeing where our automatic language-to-logic system breaks down. What are the sentences where it fails, where it generates something that's not even syntactically correct, or that violates constraints we know from SUMO about how the world works? Then we try to add those as new patterns to our generation system. So we can find one sentence that causes a problem, look at the core issue behind it, and look at the things that are variable in the sentence. We might have a problematic sentence with some subject and an object, where the subject and object each have a wide variety of things that could be substituted in, and we can turn that into a sort of template to cover another pattern, if you will, in natural language. One of our possibilities, I think, will be to look at FrameNet, which is exactly such a large inventory of these things. I've wanted to do a projection of FrameNet into SUMO for several decades now. I did some early work with the team there, and we published some work about it in the mid-2000s, but we didn't do a comprehensive inventory of all their different frames. So I think there's a good project there: a more detailed exploration of all the frames in FrameNet and a translation of them into SUMO, because the frames in FrameNet don't have a precise logical semantics, which is one of the challenges people have found with using that resource. It's not anchored in the way that it needs to be, I think, to be used more productively. So there's an interesting project there that would have a very practical outcome. But for now I'm just going to talk about the synthetic text, because we're not even into that phase yet; the natural text is the tough stuff.
We just started this work in mid-fall, so that gives you a perspective on where we are; this is going to be a long-term thing. I tried to collect all of the things that I thought I could detail both linguistically and in SUMO: things like propositional attitudes, modals, tenses, names and roles, and basic forms of transitive, intransitive, and ditransitive verbs in the subject-verb-object basic grammar of English. Let me show what some of these things are. I'm going to give you an example, a real example of stuff we can generate, and it looks kind of funny. I'll tell you why it looks funny and what we can do to fix it and other examples like it. I'm giving you, warts and all, this extremely early research; even though I think it's very promising, I'm going to tell you what's wrong so far, and the examples show stuff that looks kind of funny. "An internet user traps a crustacean." That's a pretty funny sentence. It's syntactically valid, and I think you can imagine an interpretation of it that's legitimate in the real world: I'm an internet user, and I go down to the beach and I catch a crab. You can imagine that actually happening, but I'm not sure you'd ever find that sentence uttered by a person. We think we can do better, ultimately constraining our language generation by using large language models; in principle it should be fairly easy to do, so that we generate stuff that's just a little more reasonable, a little more common, a little more expected. But this is what we've got right now, because we're just creating these templates and substituting in all of the legitimate possibilities, where "legitimate" just means it meets the syntactic and type constraints specified by SUMO. And it is possible for an internet user to trap a crustacean. So we generate this sort of thing, among many other possibilities, and at the same time we generate the logic expression that you see on the left.
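Since the slide isn't reproduced in this transcript, here is a rough SUO-KIF sketch of the kind of logic expression being described; variable names and exact structure are assumptions, not the slide's formula.

```
;; "An internet user traps a crustacean" -- untensed first-order sketch.
;; Ambush is SUMO's identifier for the concept of trapping.
(exists (?U ?EV ?C)
  (and
    (instance ?U InternetUser)
    (instance ?EV Ambush)
    (agent ?EV ?U)
    (instance ?C Crustacean)
    (patient ?EV ?C)))
```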
I'm sorry, I'm throwing code at you, and I don't have time here to go into the full syntax of this expression, but at the moment this is just first-order logic. It's an existential quantifier saying there exists an internet user and a trapping, or "Ambush," which is the term used in SUMO. Again, remember that labels and definitions are different. We used the particular software identifier Ambush, but its definition fits what you would expect of trapping. The word we happen to use in English is "trapping"; the programming-language identifier we use in the SUMO library is Ambush. You shouldn't expect those to be identical, because we may have different labels for things that we've used consistently in terms of their formal semantics. So we have an agent that does something, and that agent is the internet user; the object of the action is something, and that something is a crustacean. A very simple sentence, a very simple logic formalization. And we generate tons of sentences like this for different objects, different agents, and different actions. This is your most simple subject-verb-object sentence. We have lots of different objects that can play roles, different agents that can play roles, and different processes that are already defined in SUMO. So we just go through the entire inventory, and we also generate probabilistically: each of these symbols in SUMO has a connection to WordNet, and WordNet has word frequencies for each of those concepts. By looking into WordNet, we can see which formally defined concepts in SUMO are more likely to occur in free text individually. So we prioritize our generation, using a randomizer in Java and a frequency table, to make it more likely that we generate sentences with things that are more commonly seen.
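The frequency-weighted generation step can be sketched in a few lines. The real system is written in Java and reads frequencies via the SUMO-WordNet mappings; the table and numbers below are hypothetical, just to show the idea.

```python
import random

# Hypothetical SUMO-term -> corpus-frequency table. In the real system
# these numbers come from WordNet sense frequencies mapped to SUMO concepts.
freq = {"Human": 950, "InternetUser": 120, "Crustacean": 8, "Ambush": 5}

def sample_term(freq_table, rng):
    # Weighted choice: terms with higher corpus frequency are generated
    # more often, so common words dominate the synthetic corpus.
    terms = list(freq_table)
    weights = [freq_table[t] for t in terms]
    return rng.choices(terms, weights=weights, k=1)[0]

rng = random.Random(0)  # seeded for reproducibility
counts = {t: 0 for t in freq}
for _ in range(10_000):
    counts[sample_term(freq, rng)] += 1
print(counts["Human"] > counts["Crustacean"])  # frequent terms dominate: True
```

The same weighted draw is then applied per slot (agent, process, object) when filling a sentence template.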
But as we can see, even though "internet user" may be fairly common in natural language text, you wouldn't likely see it paired with "trapping" or "crustacean." It's these pairwise selectional restrictions that we don't have right now, which is why you get these funny combinations. We can generate tenses: the internet user trapped a crustacean, or traps, or will trap, or is trapping, each with its formal temporal-logic expression. One thing you often find in simpler knowledge representations is verbs used as predicates without tense. So in a knowledge graph you might have "internet user," a relation like "trapping" or "traps" of some sort, and then "crustacean." That would be the more typical expression in a graph, and it doesn't really capture what the entire sentence is saying. The sentence says there is a temporal qualification: what is happening in the sentence is happening now. So there's a BeginFn of this activity and an EndFn of this activity, and "now" is between those two. You have a deictic, as we say in linguistics: something being referred to that gives semantics to the sentence that's only meaningful in the context of the speaker and the listener. Here, there, now, et cetera. There are a lot of these things, and we've got to be able to handle them in order to capture the semantics of the sentence. It's not enough to say there's a trapping relation between the internet user and the crustacean. We really have to go to a temporal modal logic, beyond first-order logic with equality, to capture this information, and capture it in a way that's going to work not just for this one isolated case but for all possible cases of these sorts of temporal expressions. It's also common for people who don't have a background in modal logic to assume that there's some kind of fudge for this in a graph representation. There isn't. And this is why logicians work on this stuff.
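The present-progressive reading described above, with BeginFn, EndFn, and the indexical Now, might be sketched like this in SUO-KIF; again, a reconstruction rather than the slide's formula.

```
;; "An internet user is trapping a crustacean": the event's time
;; interval contains the deictic Now. Sketch only.
(exists (?U ?EV ?C)
  (and
    (instance ?U InternetUser)
    (instance ?EV Ambush)
    (agent ?EV ?U)
    (instance ?C Crustacean)
    (patient ?EV ?C)
    (before (BeginFn (WhenFn ?EV)) Now)
    (before Now (EndFn (WhenFn ?EV)))))
```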
It's hard; getting it right has taken many, many decades of effort. You can fudge it, but somewhere down the line you're going to get completely wrong, nonsensical conclusions, because you don't have the same strength of semantics in your representation. That's why we need this difficult and expressive logic. We have negation. Negation is another big problem for graphs: you've got to be able to negate an arbitrary conjunction or disjunction of other statements. To do that, you've got to have a construct that goes beyond a particular relation, a sort of contextualized relation that can take scope over a whole expression. Logic already gives you this, and getting negation right is a necessity in logic or in any other representation you try to create. Sometimes people say, well, I can do that in my graph, or I can do that in my description logic. Maybe you can for one point case, but if you're doing this in an ad hoc representation that's not a formal expressive logic, you're going to wind up syntactically recreating what logicians have already done many years prior, and you're going to discover these things on a point-by-point basis instead of just getting them right from the start. So I encourage people to use a representation that can handle the full complexity of the semantics of English, and only a higher-order logic will give you that. Here's the case where the internet user doesn't trap a crustacean: we're negating the entire complex, existentially quantified expression that includes a temporal modal-logic expression within it. What if I have an agent: Patrick desires that the internet user trap a crustacean? That's pretty common in news text. Kevin McCarthy said he believes that the next vote for speaker will be successful. Bolsonaro desires that he be the leader of the country again. These are very common expressions in natural language.
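A sentence like "Patrick desires that the internet user trap a crustacean" might be rendered roughly as a propositional attitude wrapping a quantified formula. This is a sketch; the key point is that in SUO-KIF a relation like `desires` can take a whole formula as an argument, which is exactly what pushes this beyond first-order logic.

```
;; Propositional attitude: the second argument of desires is itself
;; a quantified formula. Hypothetical sketch, not the slide's formula.
(desires Patrick
  (exists (?U ?EV ?C)
    (and
      (instance ?U InternetUser)
      (instance ?EV Ambush)
      (agent ?EV ?U)
      (instance ?C Crustacean)
      (patient ?EV ?C))))
```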
They're not infrequent, they're difficult, and they require this sort of expressive logic; that's the sort of thing we're handling. We're able to generate this stuff and have a representation that handles it, and then, if we're lucky, a theorem prover that can handle it. Higher-order logic theorem proving is still difficult and somewhat experimental, so that's a parallel line of research. But we're not hand-waving this problem away. We're not imagining it doesn't exist, or that we can handle all this complexity in language without this additional computational support, which we believe is in fact required. So here, making the logic and the sentence more and more difficult: the plumber knows that Patrick desires that the internet user trap a crustacean. A pretty artificial-sounding sentence, admittedly. But not all such constructs are that rare. In a news report you do get: "The New York Times reported that the speaker desires that this bill pass without further amendment." That would not be a particularly unusual sentence, and yet it follows exactly this pattern. So here's one of the things we need to do, and if anyone is at all intrigued by this work and thinking, well, how can I contribute, which would be wonderful, here is a great way to do it. It's something that needs to be added to SUMO and yet is fairly well constrained in terms of the complexity of the underlying axioms that need to be expressed, so it's possible to make these fairly formulaic. In fact, the use that my Java code makes of them when generating logic-language pairs is very formulaic: it looks only for axioms of a very particular restricted form in order to restrict which sentences get generated. We have this notion of capability, a capability relation in SUMO, which says that some kind of thing is capable of performing some kind of role in some kind of action.
So ears are capable of hearing; guns are capable of shooting. And we have to talk about things that might happen in order to restrict the sentences we generate. We'd like to be able to say that John can't be eating a boat dock. We might generate something like that now, but we want to restrict eating so that the object transferred, or the resource, is only food, and ideally food for that particular animal. We want termites to be able to eat boat docks, but we don't want Bob the plumber to be able to eat a boat dock. We have a lot of processes and a lot of objects in SUMO, so creating an inventory of these capability expressions would eliminate, I think, a lot of the funny-sounding sentences we generate. Now, that said, I think it's not catastrophic that we have some of these funny sentences, because remember that ultimately we're trying to go from organically occurring natural language text in the wild into arbitrary logic expressions. The fact that we have other sentences with a similar form that don't make a lot of common sense may not be a big issue: as long as we have enough examples of things that are reasonable, that cover the vocabulary we care about and have reasonable selectional restrictions, that will still enable us to translate the good stuff that occurs, the sensible sentences, into sensible logic expressions. The fact that our corpus contains translations where the translation is reasonable but the source sentence is kind of bogus may just be background noise. We may be able to get away with that, but even if it proves not to be a problem, it would still be better to have all these capability expressions in SUMO, just to add to our knowledge of how the world works. So if anybody is intrigued by this, I'd love to help you get started, because I think it would not be too hard to do and would be of great value to the project.
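Hypothetical capability expressions of the restricted form being described might look like the following; the exact argument order and usage of SUMO's `capability` relation should be checked against the ontology itself, so treat this as a sketch.

```
;; "Ears are capable of hearing; guns are capable of shooting."
;; Sketch only -- verify the capability signature against SUMO.
(=>
  (instance ?E Ear)
  (capability Hearing instrument ?E))
(=>
  (instance ?G Gun)
  (capability Shooting instrument ?G))
;; Restricting eating so the resource must be food:
(=>
  (capability Eating resource ?F)
  (instance ?F Food))
```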
So, as I've said a couple of times now, we have a lot of processes and a lot of objects. What do I mean by a lot? Here's just a segment of the natural language comments on a portion of the hierarchy, starting from the top. Here are the process types in SUMO: we have 1,300 of them at last count, and here are a number of them. In a lot of sentences, we don't restrict which processes can hold for a given sentence, and we'd like to have, again, these selectional restrictions based on capability expressions, covering which types of objects can participate in which types of processes, so that we don't get silly stuff. For each of these processes, we have mappings to verbs in WordNet. That's what allows us to iterate through all the process types and still generate a sentence, not using the labels in SUMO, but rather using the legitimate words in the WordNet dictionary. So, for example, Ambush is mapped to a particular numbered synset, a collection of synonymous words in WordNet that linguistically identifies a sense in English. We have Ambush mapped to synset blah-blah-blah-926; it has a certain definition and a certain set of synonyms: ambuscade, ambush, lying in wait, and trap. And that's why we saw trapping earlier, in "the internet user traps a crustacean": that was exactly the synset equivalent to Ambush. We have lots of case roles. These are the ways in which entities can participate in events. There are some simple systems, including some simple systems in linguistics; I think VerbNet has a set of restrictions like this. Agent, patient, instrument, resource: these are very common, and you'll also sometimes see agent-zero and agent-one as labels in semi-formal linguistics for the roles that entities play in activities.
Occasionally I've gotten this question, which I think is funny: well, why isn't it enough to have just agent, patient, and instrument? Yes, that does cover, maybe exhaustively at a high level, the roles that things can play. James Pustejovsky also has his set of qualia relations. This is a common approach in linguistics: trying to explain language through a small number of covering predicates or abstractions. And while that might be fine for abstract linguistics, in practice there's a lot more we can say about the world, about how things interact in the events that happen. So we've come up with 67 of them, and I think that's still orders of magnitude too few, but it's a good start; it's the stuff we've found useful over the years. So: invading virus, catalyst, broker, enjoys, reagent. Many of these relations are sub-relations of things like agent, patient, instrument, and resource, but they're more specific. They entail additional knowledge when they're used, knowledge that fits with our knowledge of the real world. We have lots of objects. Here are just the corpuscular objects, things that are not substances and that have parts; we have 930 of them. A lot of them are content-bearing objects, things that may have writing or symbols on them that are common in our world, as well as organic objects like organisms. There are a lot of these, and they all play different roles in different activities. Here are some of the roles that human beings, or agents, can play. You can have a certain profession: you can be a plumber or a programmer, and you can switch freely from one to the other. This past weekend my toilet broke and I had to be a plumber; today I'm a programmer; maybe tonight I'll be teaching a course, and so I'll be a lecturer. These different roles are free and flexible and numerous, and they're also temporally qualified.
You can play different roles at different times. So it's not enough to say, as WordNet unfortunately does as a linguistic product, that plumber is a subtype of human. You can be a plumber at one time and not a plumber at another, and that doesn't mean you cease to be a human. It's a bad relation to have formally, although it's a reasonable relation to have linguistically, because you can do the substitution test that linguists do, substituting say a superclass for a subclass, and it works in a lot of sentences; but it doesn't work formally, doesn't work in logic. So we need to separate out these roles and say that they're transient, temporally qualified roles that people can have. And we can generate, again, bogus stuff: "Robert kills a boat dock" doesn't really make any sense. It makes sense syntactically, but we need these capability restrictions to handle such cases. A lot of times we do have restrictions, like the rule below, but I can't afford to do theorem proving while I'm doing generation. If I need to generate a one-terabyte corpus of language-logic pairs, I can't prove the truth of every sentence; that would take many years just to do one run of the generation system. The way I've found to optimize this is just to look at capability expressions. The rule I'm showing here says that a killing has to have an object that's an organism, and a boat dock is not an organism, so this would be caught if I could do theorem proving on everything I generate, but I just don't have the time. I need to use a more restrictive form that I can turn into a sort of lookup table and make run much faster. So I need the following kind of expression: for a killing event, capability says that the patient has to be an organism. If something is the patient, the object, of a killing event, it must be an organism. I've also had the question of how much is too much.
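The lookup-table idea just described can be sketched as follows. The actual generator is Java; this is a hypothetical Python sketch in which capability triples are precomputed into a dictionary so candidate sentences can be filtered without invoking a theorem prover.

```python
# Hypothetical capability table: (process class, case role) -> required class.
# In SUMO these would be axioms of the restricted form the talk describes;
# here they're flattened into a plain dict for fast lookup.
CAPABILITY = {
    ("Killing", "patient"): "Organism",
    ("Eating", "resource"): "Food",
}

# Toy subclass hierarchy: child -> parent (None at the root).
SUBCLASS = {"Human": "Organism", "Crustacean": "Organism",
            "Apple": "Food", "BoatDock": "Artifact"}

def is_subclass(cls, ancestor):
    # Walk up the hierarchy until we hit the ancestor or run out.
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS.get(cls)
    return False

def allowed(process, role, obj_class):
    required = CAPABILITY.get((process, role))
    # No restriction recorded -> permit (and risk generating nonsense).
    return required is None or is_subclass(obj_class, required)

print(allowed("Killing", "patient", "Human"))     # True: humans are organisms
print(allowed("Killing", "patient", "BoatDock"))  # False: filtered out
```

A dictionary lookup like this runs in constant time per candidate sentence, versus seconds to minutes for a theorem-prover query, which is what makes terabyte-scale generation feasible.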
Right now I'm able to generate, say, a terabyte corpus in the span of a couple of hours and then train on it. Training takes about 20 hours on a fairly beefy multi-GPU server. How much is too much? I don't have a sense of that. How much is too little? We're getting good fidelity: we break things up in the standard train-test split and we get good metrics. We get a perplexity score down around 1.01, 1.02, which is one reasonable measure of the fidelity of a deep neural network training approach. But is that good enough? I don't know yet; we just need to do a lot more experiments. So we have about 100 names that we use, from a corpus of common human names. We have these roughly 300 roles. We have lots of different propositional attitudes, each of which can be negated. We have modals like can, may, and should, which are also formally expressed in our modal logic; those in turn can be negated. We have 400 different sorts of roles like plumber and teacher, and 1,300 processes, which can also be negated, with past, present, and future plus the progressive tense for each. All of those possibilities. We've got 930 direct corpuscular objects, and then we've got substances, which I haven't listed here; lots of substances like water or gasoline. Sixty-seven case roles. And again, those same 930 corpuscular objects can be indirect objects. Then I've also added a bunch of politeness expressions: can I, may I, should I do the following. A lot of these modal expressions can become politeness forms: please, may I, et cetera. So we can handle the kind of thing very commonly seen in chatbot or conversational settings; I did an inventory of the sentence patterns a lot of the chatbots use to make sure we cover them. But a lot of these combinations are nonsense.
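The scale of that combination space can be estimated with back-of-envelope arithmetic, treating the rough counts just listed as independent multipliers. This overestimates, since type constraints rule many combinations out, but it shows why nonsense pruning matters for training cost.

```python
# Rough combinatorial estimate from the counts mentioned in the talk.
# Independence is assumed here, so this is an upper-bound sketch.
names      = 100        # common human names
processes  = 1300 * 2   # process types, each optionally negated
tenses     = 4          # past, present, future, progressive
objects    = 930        # direct corpuscular objects
case_roles = 67
indirect   = 930        # same objects as indirect objects

total = names * processes * tenses * objects * case_roles * indirect
print(f"{total:.2e}")  # -> 6.03e+13 simple sentence frames
```

Tens of trillions of frames before modals, attitudes, substances, and politeness forms are even factored in, which is why the generator samples by frequency rather than enumerating.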
So I've got to somehow pare it down. Even if it's not a problem in terms of creating a lot of junk that can otherwise get ignored when I'm translating a sentence, at the very least it slows down training. As I make the corpus larger, I have to pare down the nonsense just so that my training phase remains feasible, because I don't have the resources of OpenAI to spend $3 million a month running ChatGPT for the world; I've got to run on one server that my friend in Prague has access to. There's lots more I can still do. Units and measures: I've only begun to do that; I've got a little bit. I can say "Bob puts three birds in his wagon" or "three pounds of steak in his jar," something of that sort, because there are, I think, about 60 different Système International units and SUMO has them all. And of course values can be within all sorts of different ranges. There are logical combinations of sentences that I could generate: "Bob goes to the store and Bob buys cookies." We can temporally qualify all of these things, and I've got a little of that already: "Bob goes to the store on Tuesday," "Bob goes to the store on January 27th of 2003." We have a bit of an issue that a lot of these case roles require different prepositions, and I don't have those working correctly yet. I need to do a whole set of patterns, a little fragment of FrameNet, if you will, for different action types and which particular prepositions are conventionally used. You get in the car, you get on the boat, you go to the station; I don't handle all of those correctly. There are things we could do with more than one agent; right now agents are always singular in my examples. We can certainly have Jack and Jill go up the hill, not just Jack. And I have just implemented quotes for "said," so we can handle this common form for news reports or novels or what have you.
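The preposition gaps mentioned above could eventually be handled with a small lookup keyed on verb and destination class, a tiny FrameNet-like fragment. All names here are hypothetical illustrations, not the project's actual data.

```python
# Hypothetical preposition patterns: which preposition a motion verb
# conventionally takes for a given destination class.
PREPOSITION = {
    ("get", "Car"):     "in",   # "get in the car"
    ("get", "Boat"):    "on",   # "get on the boat"
    ("go",  "Station"): "to",   # "go to the station"
}

def render(verb, dest_class, dest_word):
    # Fall back to "to" when no conventional pattern is recorded.
    prep = PREPOSITION.get((verb, dest_class), "to")
    return f"{verb} {prep} the {dest_word}"

print(render("get", "Boat", "boat"))       # get on the boat
print(render("go", "Station", "station"))  # go to the station
```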
There are lots more things that could be done to make this a more comprehensive corpus; long before I run out of stuff to do, I can cover a much larger subset of English. Here's just some detail on how we run the model; it's just a script. It's not too hard, it just takes a long time. So, future work: once I'm done exhausting the set of things I can think of to do, we need to start seeing where it fails on natural texts. A big thing to do is to use either word2vec or some large language model as a way to pare down the number of nonsensical combinations, along with my preferred approach of having a lot more capability expressions, which would also add to the competence of this core model regardless of whether we're doing language understanding or some other kind of project. And I really want to do a comprehensive test on the bAbI corpus, to see if I can do better by not teaching the machine learning system how to do inference but instead using a system that already knows how to do inference. I think I should be able to get better scores, but that remains an empirical question. If I can do that, I think maybe people will pay more attention to this work; it would be a nice demo. Okay, that's it. Thanks for learning about this early work, as well as the more mature model it's based on, which is already being applied commercially. And I would love it if somebody on this call got so excited that you wanted to do some work together. Excellent. Thank you for the presentation, Adam. You can unshare or leave it up, and we will begin with Dave providing some initial comments and questions. Yes, one idea. Can you hear me now? I can. Great. One idea that occurred to me at several points during your presentation was whether you could put part of the pruning inside the model by creating persistent agents with goals and beliefs.
Bob isn't just a transient pattern of activations; Bob is something that stays around for a while, and perhaps Bob's desires evolve more slowly than Bob's beliefs, including Bob's beliefs about Carol and Carol's relevance, and whether Carol believes correct things about what he's interested in right now. Have you gotten around to something as charming as that? Well, that's what the bAbI folks did, right? They've got a world model, and they generate sentences and variations of those sentences based on that world model. We could certainly re-implement the simple system that they've done in Lua in a more complicated blocks world, if you will, that uses all the entities and objects that are in SUMO. That's something I'd like to do. I can only... Okay, reconnecting here. We've got Daniel back. Okay. Maybe he's having some internet issues. Yep, I had to move to a backup source for a strange reason, but let us continue the conversation with no video. The last thing you added was capabilities and the opportunity to contribute there. Please feel free to pick up there, or I can pick up and take it in a different direction. Yeah, Dave and I were just talking a little further about some other work that's going on in psychology and ontologies of emotions, and I was commenting that SUMO already has an extensive ontology of emotions, but of course it has ontologies of many other things. So anybody who wanted to study a representation of emotion, for use in psychotherapy or any other reason, would kind of get for free all of this other stuff, a defined model of the rest, which I think would make that work have a greater impact over time and be more readily adoptable in different contexts, because it has links to a whole host of other semantically defined entities. Yeah, one of the things that would make it a lot easier to participate across groups is if it were easier to actually run Sigma.
Several of us have tried to install and run it; I don't know that anybody in the active inference group has ever succeeded. You do, for the appropriate environment, have some really nice videos, and I've tried to follow those, but in every case it broke down with some warning like: if you keep going, your Windows system may self-destruct, why don't you stop now? So it's pretty scary. We even tried to go to a vendor, a provider of cloud services; I don't know if you've ever tried to do that. But I think it would be a lot easier to do, because systems like SUMO provide rigor in how those different state spaces get translated and mapped to each other: short text to long text, long text to short text, short text to mouse movement. All those different mappings are implausible in multiple ways without a formal semantics. And I've heard a question about uncertain reasoning, because this is an area where I think a lot of people who are used to a probabilistic representation often have misconceptions about what's possible in logic, usually because the logic most people are exposed to academically is first-order logic, where you do have just binary truth values: something is either true or false. People want to be able to represent things that are uncertain or likely or unlikely, and therefore assume that logic can't handle that. Well, it can; you just have to go beyond first-order logic, and that's exactly why we're in the realm of higher-order logic in SUMO and have modal expressions. We want to be able to say, and can say in the logic, that something is likely, or that one thing is more likely than another, or that because of such-and-such an event something has become more likely, so we have to make a projection that includes some temporal reasoning as well. So this logic, you could call it a non-classical logic, right?
I find that a little uncomfortable, because most non-classical logics are things like four-valued logics or things that don't just have true and false. Well, modal logics do still just have true and false, but they are able to describe an entire situation, sort of with all its possibilities and conditionals, things like possible worlds, which allow us to capture some of these same intuitions that a probabilistic representation like Bayesian reasoning can. And I think there's, to my mind, a clear advantage of the symbolic representation. It is possible, of course, to have large libraries of empirical data with which to have realistic probability assessments and say that such and such an event is 73% likely in this context. But a lot of times when I see people using probabilistic or Bayesian representations, they're sort of using mathematics that is precise while the source data is often not precise. People are making a guess: I guess that it's probably 70% likely that the following will happen, and they encode that in their system. And I think those things are better represented as symbolic preferences. This is more likely than that, but we don't know that it's 73% likely versus 56% likely. An additional problem, I think, often, maybe not in your case, but certainly in many cases with people that are doing numeric probabilities, is that the numeric probabilities force a total order, essentially. And you've got this appearance of precision where 73% is more likely than 56%. But in reality, if that comes from anything less than statistically rigorous empirical data, then it's just sort of a confidence factor, like was used in a lot of old-style expert systems, that doesn't give you reliable results. So none of those criticisms may actually apply to your situation. I don't have enough detail about what you're doing, but I'd ask you to consider those points. Awesome, I'll give a thought on that, and then Dave, happy to hear from you.
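The point about numeric probabilities forcing a total order can be made concrete with a small sketch. This is purely illustrative (the function names and the example events are hypothetical, not from Sumo or any system discussed): likelihood comparisons are kept symbolic, so pairs with no stated or derivable relationship honestly come back as incomparable instead of being forced into a numeric ranking.

```python
# Illustrative sketch: qualitative likelihood as a partial order.
# Statements like "A is more likely than B" are kept symbolically;
# numbers would force every pair of events to be comparable, which
# the underlying evidence may not support.

def transitive_closure(pairs):
    """All (a, b) such that a is more likely than b, directly or via chains."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def compare(closure, a, b):
    if (a, b) in closure:
        return "more likely"
    if (b, a) in closure:
        return "less likely"
    return "incomparable"   # honest answer: no total order is forced

# Asserted preferences only -- no invented percentages.
facts = {("rain", "snow"), ("snow", "hail")}
order = transitive_closure(facts)

print(compare(order, "rain", "hail"))      # "more likely" (by transitivity)
print(compare(order, "rain", "sunshine"))  # "incomparable" (no data)
```

The design choice mirrors the argument in the talk: where a Bayesian encoding would demand a number like 73%, the symbolic encoding only commits to what was actually asserted.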
So for any given model architecture symbolically represented, like a symbolic regression or a Bayesian graph, we're constraining on that model structure, which in the variational inference world we just call Q. It's just the variational distribution, the one where we are getting to choose the family that we're parameterizing it from. So it's like our designed symbolic model, and then the data come in. So then free energy, which is the imperative for minimization, whether it's the real-time variational free energy happening right now or it's the planned or anticipated expected free energy. In either of those cases, for a given generative model, given Q distribution and given set of data, you might have a very high confidence in something that was very inadequate. And so in that way, it's like there are distributions of scatter plots that can confuse a least squares regression, or you get some sort of misleading model outcome, and it may even be invisible. You may be able to look at diagnostics of the linear regression and tell that some sort of error or aberration is happening, but you also may not, like a kind of unknown unknowns question. And in the case of cognitive modeling, these state spaces can be very large and they're often trained with sparse data. So there's this problem of overfitting and of conditional and dynamic relationships, so that something can be true and trained upon for a hundred video frames, and then something changes, and then there's a non-linear change in what can happen next or how things unfold through time. So that's kind of like the complexity of action. And perhaps could you speak to how action is represented or treated, or how the concepts of action selection, planning, strategy, and implementation... how did these features develop in Sumo through time? I think there was more to that question. It sounds like Daniel maybe is still having some connectivity issues. That was it. Just how did the concept of action evolve through time?
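For reference, the variational free energy and expected free energy mentioned in the question are standard quantities in the variational inference and active inference literature. In one common notation (the symbols here are generic ones from that literature, not anything specific to this discussion):

```latex
% Variational free energy for beliefs Q(s) about hidden states s,
% given observations o and a generative model P(o, s):
F = \mathbb{E}_{Q(s)}\!\left[\ln Q(s) - \ln P(o, s)\right]
  = \underbrace{D_{\mathrm{KL}}\!\left[Q(s)\,\|\,P(s \mid o)\right]}_{\ge 0} \;-\; \ln P(o)

% Expected free energy of a policy \pi, the quantity used for planning:
G(\pi) = \mathbb{E}_{Q(o, s \mid \pi)}\!\left[\ln Q(s \mid \pi) - \ln P(o, s)\right]
```

The decomposition in the first line makes the questioner's point visible: minimizing F only brings Q close to the true posterior under the chosen model family, so a badly chosen family can yield high confidence in an inadequate answer.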
How does the concept, yeah. Let's see. There was a lot packed into your question. I'm not sure I grasped most of it, but let me just maybe give myself a related question, which is: how do we handle the fact that sometimes in machine learning systems the mathematics has such nonlinearities that we get nonsensical conclusions, right? I think that's a fairly common problem. So I think that the interaction between these two models of the world, the statistically, machine-learning-derived model and the symbolic model, is really important so that our machines don't go off the rails, as it were. I'm using a term I call cognitive guard rails. And I think this applies well to people. People have instinctive reactions to many things. And we have at the same time a deliberative cognitive mind that constrains those reactions. We might react emotionally and react badly to something that sort of triggers us in some way. But then we have another process in our head that says, well, we're in a social situation, and so no, I don't wanna punch this person that's offended me, even though I might be tempted to do so. If we have a self-driving car that sees an object in front of it, the initial reaction might be to plow through it, but the cost of doing so might be very high. So maybe I wanna swerve or brake gradually. And having this interaction between these two models, I think, can address some of this problem that we see with nonlinearities of modern machine learning systems that do work really well in a lot of common cases and then sometimes just veer off into infinity, as it were, and do something completely ridiculous. Yes, I appreciate that response. I think it's exactly like you described with the overfitting to the meaning, and going off the rails from a cognitive perspective is, there's like the known unknowns where you're very...
For example, there's a University of Wyoming group that, I think, did a nice paper maybe quite a while back now, probably at least five years back, where a system would say, yes, it's a cat or a dog, because it was just training on sort of a coincidental feature of pixels that maybe approximated the fur of the animal. It recognized a test pattern as being a cat, and it couldn't recognize a simple cartoon drawing as a cat, because it'd never seen anything like that and wasn't looking at the essential features of the world. And so machines are really great at finding correlations, but that doesn't mean they're the right correlations. And I think that's at the root of a lot of things that we call overtraining or overfitting, but we never know really whether it's fit correctly or overfit. I mean, sure, there are some numeric measures that at times can tell you, okay, we're bouncing out of this local minimum and our fidelity has gone down numerically, but we don't know what that really means with respect to the common sense larger world that is represented by something that's much larger than any given test set that we might be training on. So I think that having a sort of completely different representation of the world can help guard against some of this. And so I've thought of using machine learning and symbolic systems together, where the symbolic system addresses problems in the input data, right? Because faulty input data, erroneous input data is always a big problem. How do we know we're training on the right stuff? How do we know that if we're training a system to diagnose check fraud, for example, or credit card fraud, that we don't have some bogus data that's crept in here and there that might have a spurious correlation? We've got, say, zeroed-out fields or a field of 99 that's really a don't-care field, but it's in the data and there's no easy way for us to discover it if we've got terabytes of data.
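The "don't-care field" problem described here can be illustrated with a minimal sketch. This is hypothetical code, not from any system in the discussion; the field names, the sentinel set, and the 20% threshold are all made-up assumptions chosen just to show the idea of a fast automated sanity pass over tabular input.

```python
# Hypothetical sketch: scan tabular input for sentinel codes
# (99, 999, 0, -1, "") that masquerade as real data.

SENTINELS = {-1, 99, 999, ""}

def flag_suspect_fields(rows, threshold=0.2):
    """Return field names where sentinel values make up more than
    `threshold` of all records -- a hint that don't-care codes like 99
    are polluting the training data."""
    counts, totals = {}, {}
    for row in rows:
        for field, value in row.items():
            totals[field] = totals.get(field, 0) + 1
            if value in SENTINELS:
                counts[field] = counts.get(field, 0) + 1
    return sorted(f for f in totals
                  if counts.get(f, 0) / totals[f] > threshold)

# Made-up credit card records: merchant_code is mostly the 99 filler.
records = [
    {"amount": 120.5, "merchant_code": 99},
    {"amount": 33.0,  "merchant_code": 99},
    {"amount": 45.0,  "merchant_code": 4812},
    {"amount": 57.2,  "merchant_code": 99},
]
print(flag_suspect_fields(records))  # ['merchant_code']
```

A real "intern at warp speed" would of course need domain knowledge from an ontology to decide which values are implausible, not just a fixed sentinel list; this only shows the cheapest layer of such a check.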
Wouldn't it be nice to have a common sense system that could act like an intern, a human intern working at warp speed, looking over all of our input data to make sure we've not included any inputs that are at odds with what we know about the real world. And then on the output side, making a few inferences in our symbolic system could help the overall computational system not make really dumb conclusions like, oh, we don't know what that thing is, so let's run over it. That might be the conclusion that a statistical system reaches, but we'd like to have our combined agent be able to trap things like that, in the same way that a person presumably wouldn't punch somebody else who's offended him, because they know that that's both illegal and immoral. Thanks. Dave, do you wanna add or ask something? Yeah, a couple of the other approaches sometimes simplify exactly these same kinds of problems that Daniel and Adam have both been cutting through. One is fuzzy logic. Sometimes a fuzzy solution and a probabilistic solution agree numerically, but in other situations they don't, and in many of those cases the fuzzy answer turns out to be better. Way back when, Lotfi Zadeh was out classifying and reacting to situations, and it wasn't until he started calling the logic fuzzy that people were beating on him. He was just coming up with great results. Another approach that's been impressing me is David Deutsch and Chiara Marletto's work in constructor theory, where they start by asking what can be done and what can't be done? And what does it mean to do something? What does it mean to be able to construct? They generalized the notion of catalyst, for instance, as a general concept in systems theory, and they just identified a lot of things that people spend a lot of brain power on as simply being nonsensical. You can't construct that. Therefore, you're wasting your time if you try to work out how it would be done, what the consequences were if it could be done. It just can't be.
So go on to something interesting. Well, I'm not familiar with the latter work, but I am a bit familiar with fuzzy logic, and maybe I should just mention that this has been another thing where people that are familiar with fuzzy logic wonder how you could do this in a non-fuzzy logic. And the challenge then comes down to, again, people that aren't familiar with logics beyond first order. Take something we might say, like: John is tall, okay? How do we compute whether John is tall? In a traditional logic system, we would have the problem that, okay, if he's above six feet, he's tall; if he's below six feet, he's not tall. Therefore, if he's five feet, eleven and seven-eighths inches, he's not tall. And of course we all know from a common sense standpoint that's ridiculous. And so we'd wanna use fuzzy logic. But in fact, what we should be able to say is that he's tall with respect to a particular group. And if we can make these more complex statements, generally involving sets of things and conditionals that are beyond first order logic, we can express these things perfectly well in this more expressive logic. We don't always have to go to fuzzy logic. Because one of the challenges that I think you have with fuzzy logic is, again, a lot of the people that did work in this sort of world back almost 30 years ago came up with things that were functioning well empirically as sort of hacks, because they didn't have data about what the actual statistical value curve of tallness is. It was just somebody's guess about what the curve of tallness membership should be, for example. So I like the notion of fuzzy logic intuitively. I'm not sure that I care for it so much on a theoretical basis, and on a practical basis there's simply another way to do it. Very interesting. Dave, wanna add anything?
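The "tall with respect to a particular group" idea can be sketched in a few lines. This is an illustrative toy (the function, the quartile cutoff, and the example groups are all made up, and Sumo itself would express this axiomatically rather than procedurally): tallness becomes a two-place relation whose cutoff comes from the reference group, with no guessed fuzzy membership curve.

```python
# Illustrative sketch: "tall" as a relation between an individual and a
# reference group, instead of a fixed threshold or a fuzzy curve.

def is_tall(height_cm, group_heights, top_fraction=0.25):
    """True if height_cm falls in the top `top_fraction` of the group.
    The cutoff is derived from the group itself, not from a guessed
    membership function."""
    ranked = sorted(group_heights)
    cutoff = ranked[int(len(ranked) * (1 - top_fraction))]
    return height_cm >= cutoff

# Made-up example groups (heights in cm).
jockeys   = [152, 155, 158, 160, 163, 165, 168, 170]
nba_squad = [190, 193, 198, 201, 203, 206, 208, 211]

print(is_tall(180, jockeys))    # True:  tall among jockeys
print(is_tall(180, nba_squad))  # False: not tall among NBA players
```

The same 180 cm person is tall in one context and not in the other, which is the common-sense behavior the six-foot threshold example fails to capture.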
Yeah, now this is, if we have time for this, a much deeper question about the way the Sumo rules are written. Do we have time to go into a lot of depth in this venue? Depth, not necessarily breadth, not necessarily a big answer, but okay, I'll just ask this. Reading through the rules, a number of rules have conditions that very specifically ask about the first participant of a relation: if the following holds, then the following logic applies. If a different set of conditions on the recipient of the action, the indirect object or direct object of a process, holds, then you can apply the following logic. And you know, I look at that and I say, wait a minute, these different rules are splitting off very different kinds of real world situations within the same Sumo class. Well, two things have to be true in order to do that, in order to find these differentiating rules within a single class. First, you call those by the same name in whatever natural languages you were thinking in when you wrote the rules. And second, the formal, literally the abstract algebra of these various cases has to all be analogous. They all have to be isomorphic. And my name for that situation is that you're covering analogical cases in the same Sumo class. And in that sense, the Sumo class itself is an analogy. It's an analogy among these various cases, which show up in different domains. Am I just not thinking discerningly enough about what a Sumo class represents, or can a Sumo class actually be a higher-order analogy? It's hard for me to know, because there's no analogical reasoning happening. So I'm not sure you're talking about analogies, and there's no natural language happening once you're in the formal system. So I'm not sure why you mentioned natural language. I think we'd have to look at a very specific example, and then we could run that example on a theorem prover, and you'd see exactly how things behave. It's not something about classes per se.
You just mentioned how the rules work in a logic. You've got preconditions for the antecedent of a rule, and you've got the consequent of a rule. And yes, indeed, the symbols in that formal system have to match; that's how logic works. Great, I'll get some specific examples of an analogy. Okay, sounds great. In our last minutes here, and also apologies for the internet disruption, this is part of our long-term journey, learning about this and applying it to active inference. So we're gonna keep this thread alive and be able to bring some more informed areas of intersection and application. But one that I wanted to mention was to return to your type one, type two thinking, and the way that there's a faster, reflexive or habitual, statistical type one inference and then a slower type two, what you refer to as deliberate and deductive, that involves symbolic logic. And in chapter five of the active inference textbook there's a schematic of a dopaminergic neural circuit in the central and peripheral nervous system, where there's a dopamine-related parameter, computationally, that's like a dial in the model that is controlling the balance between replicating actions according to the prior, like just the habit. So it's some distribution of capacities, affordances. And then if there's three options, it's 0.8, 0.1, 0.1. So then under the most habitual setting, like what we might even call non-agentic or subconscious, that vector of affordances is recapitulated in action selection. And then in a different dopaminergic regime, there is increasingly fine-scale and potentially overfit application of the free energy, and that can reflect, like, a symbolically informed deep generative model of the world, which captures semantics of the world like nested time scales and temporal priority effects. So it's interesting that in this area, deep hierarchical generative models often are described as if they have semantic value.
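The precision "dial" described here can be sketched numerically. This is a hypothetical illustration, not the textbook's exact formulation: I assume a softmax over the log habit prior minus a precision-weighted expected free energy, with made-up free energy values, just to show how one parameter moves the agent between recapitulating the 0.8/0.1/0.1 habit vector and deliberative selection.

```python
# Hypothetical sketch of a precision parameter gamma balancing a habit
# prior against expected free energies G (made-up values).
import math

def select_policy(prior, G, gamma):
    """P(a) proportional to prior[a] * exp(-gamma * G[a]).
    gamma = 0 reproduces the habit prior exactly; large gamma lets the
    expected free energies dominate (the deliberative regime)."""
    logits = [math.log(p) - gamma * g for p, g in zip(prior, G)]
    m = max(logits)                          # subtract max for stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

habit = [0.8, 0.1, 0.1]   # the 0.8 / 0.1 / 0.1 prior from the discussion
G     = [2.0, 0.5, 1.5]   # made-up expected free energies per option

print(select_policy(habit, G, gamma=0.0))   # habitual: returns the prior
print(select_policy(habit, G, gamma=10.0))  # deliberative: option 2 dominates
```

At gamma = 0 the habit vector is recapitulated verbatim in action selection; as gamma grows, the option with lowest expected free energy wins even against a strong habit.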
However, that is usually just sort of stipulated, or only explored in a few limited cases. So integrating and working at that mapping between the top-down symbolic and the bottom-up data-driven is gonna be a really important, interesting area. So as a closing round or question, what do you see in the ecosystem, and/or what directions, if you didn't state them earlier, are you working on this year? Well, yeah, the generative AI stuff certainly got a huge amount of press. It's fascinating as an intellectual exercise, certainly. And it's fascinating to have systems that can complete sentences and write stories and always have an answer to a question. But as we're seeing with a lot of newer things that are being written as follow-ups to all the excitement, people are showing how well these systems will give you an answer, but it's often a complete hallucination, right? One recently said, well, Hillary Clinton was the 43rd president. Well, it came up with that somehow by merging together various texts that other people have written, but it doesn't make sense together. It's not true. And so I think these systems are interesting in the way that a parrot can have a very limited conversation. You can teach your parrot how to say, Polly wanna cracker, and to say thank you when you give it a cracker. That doesn't mean it actually understands anything about what it's doing. It's just a stimulus response. Now, there is a philosophical question: if the stimulus-response pairing is complicated enough, aren't human beings also just collections of stimulus and response? That may be true, but I don't think we're at that level of complexity yet for our machines. And I think some of the things that we can get these generative models to do that are so obviously nonsensical to a human are good evidence that they're certainly not sentient, and they're not even smart, and they're certainly not very reliable if we have very challenging or high-criticality tasks to worry about.
I think they'll be useful in the same way that Google search is useful. It doesn't always get the right answer. You'll invariably get many wrong answers, but it's a very powerful and useful tool when you have a human being available to filter out all of the nonsense, and just use that as a filter for what they'd otherwise have to do. It's not feasible for a human being to answer a question by reading all the books in the library, but if Google can give you 10 books and 10 pages in those 10 books where your answer may be found, it's done you a great service, even though it's given you Winnie the Pooh along with the handbook of analytical organic chemistry or something where your actual answer is found. So in that same way, ChatGPT might be very useful in breaking your writer's block, to help write the summary for a proposal or a student essay, but you certainly don't want to have it write the whole essay for you and hand it in, because you'll probably fail. So I'm not quite as worried about high school or college essays as some commentators have been. I think it'll be a useful partner, and that's about all, and we've got to understand these limitations and ideally pair it with a symbolic system that really knows something about the real world, so that they can work together. Awesome. Dave first and then Adam, what are your closing thoughts or reflections? I've enjoyed this very much. We've got lots of work to plow forward and do. I'm glad there's gonna be common work that we can become affordances for one another on. Thank you, gentlemen. Well, thank you, Dave, for the good questions and good comments, and Daniel as well, and for setting this up. I guess I would encourage what I've maybe encouraged in the past: a lot of the things that we've talked about in these several conversations we've had are general, right? We're saying, what's an overarching direction for the field, or something?
And I think that if we got down into some real specific examples and actually did some representation together, then a lot of these more philosophical notions would go away, and we'd get down to doing something that has practical utility. That's why I'd also encourage people to do, say, some of the Sumo exercises together, because I think, Dave, some of the questions that you were having over what a rule is, how it's working, I can answer all those questions and give you some very precise experience that would give you a real, actual understanding of how logic works and how Sumo works, if we took the time together to go through those exercises and do some work with a theorem prover. And it would clear up, I think, a lot of maybe misunderstandings or misconceptions and give us a real path to do something concrete together. Because although the talking is great fun, it's the doing, and writing a practical computational system, that hopefully we'll all want to get to. Yes, and as they say, let us descend into the particular. Yes, that's great, I love it. Great. Ending the stream.