Hello and welcome, everyone. It is June 28th, 2022, and we are here in ActInf Lab GuestStream #024.1. Today we're here with Professor Stephen Grossberg, and the agenda will be as follows. First, Ali will provide a short introduction. We will then play a 45-minute pre-recorded video, followed by a Q&A. So thanks, everyone, for joining, and Professor Grossberg, we really appreciate you joining. I'll pass to Ali for the introduction.

Hello and welcome. I'm Ali, an independent researcher from Iran. I'm very happy and excited to be here and to be able to speak with Professor Grossberg today, so I'd like to thank him for joining us. Stephen Grossberg is the Wang Professor of Cognitive and Neural Systems and a Professor Emeritus of Mathematics and Statistics, Psychological and Brain Sciences, and Biomedical Engineering at Boston University. For more than 50 years, he has led pioneering research in discovering and developing neural design principles for autonomous adaptive intelligence based on biological and machine learning. His neural network models have been applied to many large-scale problems in engineering and technology, including the design of increasingly autonomous adaptive algorithms and mobile agents. In fact, this is what Karl Friston says about him: "Whenever you claim to be the first to do this or that in artificial intelligence, it is customary and correct to add 'with the exception of Stephen Grossberg.' Quite simply, Stephen is a living giant and foundational architect of the field." Professor Grossberg is the recipient of the 2015 Norman Anderson Lifetime Achievement Award of the Society of Experimental Psychologists, the 2017 Frank Rosenblatt Award of the IEEE Computational Intelligence Society, and the 2019 Donald O. Hebb Award of the International Neural Network Society.
His latest book, Conscious Mind, Resonant Brain, a culmination of his decades-long research written in a rather non-technical and conversational style, was published in 2021 by Oxford University Press and is the winner of the Association of American Publishers' 2022 PROSE Award for the best book of the year in neuroscience. Now I'll pass it to Professor Grossberg, and then we'll continue with the 45-minute pre-recorded lecture. If you'd like to say hi; otherwise I'll begin the video.

I just saw my face frozen on the screen. Well, I'm delighted to be here, and I hope you find some points of interest in the lecture, and I'll look forward to the Q&A. Ali has prepared a series of questions that I've thought about and have some prepared, sketched answers for, and then after that, if you're still interested, I'm happy to do live Q&A about anything related to the topics of the day. Okay. On to the main course. I will play the video now; you won't hear anything from me on the live stream, and the audio should be coming through, as far as I know.

Hello. I'm delighted to be able to speak to you today about a topic concerning artificial intelligence, which, as you know, is very much in the news these days, and I'll be contrasting two very different approaches to artificial intelligence. But to do that, I need to pull up my PowerPoint slides and share them with you, and let me maximize them and minimize my face. So my topic today is explainable and reliable AI: comparing deep learning with adaptive resonance. This lecture is based on the following article from this year, which is both open access and on my webpage. The article summarizes core problems of deep learning, such as its untrustworthiness, because it's unexplainable, and its unreliability, because it experiences catastrophic forgetting. The article explains how adaptive resonance overcomes these problems, indeed overcomes 17 problems of deep learning, and outlines a blueprint for achieving autonomous adaptive intelligence.
The article is part of a Frontiers in Neurorobotics special issue about explainable AI, whose editors wrote, and I quote: "Though deep learning is the main pillar of current AI techniques and is ubiquitous in basic science and real-world applications, it is also flagged by AI researchers for its black box problem. It is easy to fool, and it also cannot explain how it makes a prediction or decision." In other words, deep learning is not trustworthy. No life-or-death decision, such as a medical or financial decision, can confidently be made based upon a deep learning prediction.

Deep learning uses the backpropagation algorithm for learning how to predict output vectors in response to input vectors. Backpropagation was based on perceptron learning principles that Frank Rosenblatt started to introduce in the 1950s. It has a complicated history, which Jürgen Schmidhuber beautifully reviewed in an article from this year. Major contributors include Shun-ichi Amari, Paul Werbos, and David Parker. Perhaps one would say that it reached its modern form, with simulated applications, in Werbos's 1974 thesis, before being popularized 12 years later by Rumelhart, Hinton, and Williams.

Here's a schematic of a backpropagation circuit, reprinted from a survey article by Gail Carpenter on neural network models. In it, information flows feedforward from an input stage to an output stage. Learning is supervised by an external teacher who, on each trial, defines a target or desired output. The teaching signal is the error, or mismatch, between the actual and the target outputs. The teaching signal is computed at level F3, but there is no network pathway whereby it can reach the adaptive weights at level F2 within the algorithm. So the algorithm uses an artifice called weight transport, which physically lifts the weights from here and moves them there so that they can be used to control learning. Well, this is clearly a non-local operation, as well as being clearly non-biological.
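To make the teaching-signal idea concrete, here is a toy sketch (my own illustration, not from the lecture) of supervised, error-driven learning in a single-layer feedforward net. With only one layer there is no hidden level, so the weight-transport problem described above does not yet arise; it appears once errors must be propagated backward through intermediate layers.

```python
# Toy sketch of supervised error-driven learning in a one-layer net.
# The external "teacher" supplies a target; the teaching signal is the
# error (target - actual), which drives the weight updates.

def step(w, x, target, lr=0.5):
    actual = sum(wi * xi for wi, xi in zip(w, x))
    error = target - actual          # the external teaching signal
    return [wi + lr * error * xi for wi, xi in zip(w, x)]

w = [0.0, 0.0]
for _ in range(20):                  # slow learning: many repetitions
    w = step(w, x=[1.0, 0.0], target=1.0)

print(round(w[0], 3))                # converges toward 1.0 over many trials
```

The point of the sketch is only that the error signal is computed outside the network and must somehow reach the weights; in a multilayer net, backpropagation needs weight transport to do that.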
Backpropagation learns through slow learning, which means that the adaptive weights change just a little to reduce error on each learning trial. That requires many trials, that is to say many repetitions of the whole database, to learn, possibly hundreds or thousands of trials. This is to be contrasted with fast learning, where adaptive weights can zero the error signals on each trial, just as we can learn a face that we see just once and remember it for a long time. If backprop tried to use fast learning, it would become wildly unstable. Catastrophic forgetting also occurs in backprop, so that during any learning trial an unpredictable part of its learned memory can unexpectedly collapse. Deep learning, which is based on backpropagation, is thus neither reliable nor trustworthy.

But why is this? One reason is that all inputs are processed by a shared set of learned weights. The algorithm cannot selectively buffer learned weights that are still predictively useful; in particular, there's no attention mechanism. This problem occurs in any learning algorithm whose shared weight updates follow the gradient of the error in response to the current batch of data points while ignoring past batches.

There have been multiple efforts to fix backpropagation. One is to selectively slow learning on the weights important for past learning by optimizing parameters using Bayes' rule, as Kirkpatrick et al. suggested a few years ago. But that assumes an omniscient observer who can discover and alter the important weights, as well as non-local computations, such as the Bayesian computation. The same problem occurs with evolutionary algorithms, diffusion-based neuromodulation, and other approaches that try to fix backprop. These efforts to overcome catastrophic forgetting created additional conceptual and computational problems.
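The shared-weights problem just described can be seen in a deliberately tiny example (my own illustration, not from the lecture): a single linear unit trained by gradient descent on one task and then on another, with nothing protecting the weights the first task depends on.

```python
# Toy illustration of catastrophic forgetting in a shared-weight model
# trained by gradient descent, one task at a time.
# A single linear unit y = w1*x1 + w2*x2 with squared loss.

def loss(w, x, t):
    y = w[0] * x[0] + w[1] * x[1]
    return (y - t) ** 2

def train(w, batch, lr=0.1, steps=200):
    for _ in range(steps):
        for x, t in batch:
            y = w[0] * x[0] + w[1] * x[1]
            err = y - t
            # Gradient step on the SHARED weights -- nothing buffers
            # weights that an earlier task still depends on.
            w[0] -= lr * 2 * err * x[0]
            w[1] -= lr * 2 * err * x[1]
    return w

task_a = [((1.0, 0.0), 1.0)]   # first learn: input (1,0) -> 1
task_b = [((1.0, 1.0), 0.0)]   # then learn: input (1,1) -> 0

w = [0.0, 0.0]
w = train(w, task_a)
loss_a_before = loss(w, *task_a[0])   # essentially zero after training on A
w = train(w, task_b)
loss_a_after = loss(w, *task_a[0])    # task A's memory has been overwritten

print(loss_a_before, loss_a_after)
```

Training on task B silently degrades the task A mapping, because the gradient for the current batch ignores what the shared weights previously encoded.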
I view them as adding epicycles to ameliorate a fundamental flaw in the model, which to me is reminiscent of adding epicycles to correct problems in the Ptolemaic model of the solar system; as we all know, the Copernican model that we now accept didn't require epicycles. Perhaps this is why Geoffrey Hinton, who played a key role in developing both backprop and deep learning, said in an Axios interview a few years ago that he's "deeply suspicious" of backpropagation: "I don't think it's how the brain works. We clearly don't need all the labeled data... My view is throw it all away and start over."

I would claim we don't have to start over, because these problems were solved in the 1970s and 1980s. In particular, in the first issue of the journal Neural Networks in 1988, I had an article that listed 17 problems of backpropagation that are overcome by adaptive resonance, and here they are. With regard to not needing all the labeled data, I noted in the third item here that self-organized unsupervised or supervised learning frees us from needing labels all the time. As to slow learning, I noted that in ART you can have fast or slow learning; indeed, ART can learn to classify an entire database using fast learning on a single learning trial, as Gail Carpenter and I showed in the 1980s. ART overcomes all 17 problems of backpropagation without epicycles. Furthermore, all the core ART predictions have been supported by subsequent psychological and neurobiological data. Indeed, ART is a principled biological and technological theory, unlike backprop and deep learning, which are just algorithms. ART has explained data from hundreds of experiments, and it's made scores of predictions that have subsequently received experimental support.

Well, why has ART been so successful? There are a number of reasons, but one of them is that ART can be derived from a thought experiment about a universal problem in error correction that I published 40 years ago in Psychological Review.
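The fast-versus-slow distinction can also be illustrated with a toy sketch (mine, not from the lecture): slow learning nudges the weights toward a pattern over many repetitions, while fast learning lets the weights zero the error on a single trial.

```python
# Toy contrast between slow and fast learning of one input pattern.
target = [0.3, 0.9, 0.1]

# Slow learning: small steps toward the pattern; needs many trials.
w_slow = [0.0, 0.0, 0.0]
trials = 0
while max(abs(t - w) for t, w in zip(target, w_slow)) > 0.01:
    w_slow = [w + 0.05 * (t - w) for t, w in zip(target, w_slow)]
    trials += 1

# Fast learning: the weights match the pattern after a single trial,
# like learning a face seen only once.
w_fast = list(target)

print(trials)                # many repetitions needed for slow learning
print(w_fast == target)      # fast learning is done after one trial
```

The hard part, of course, is doing fast learning stably; that is what the ART matching rule, discussed below, is for.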
The thought experiment asks the question: how can a coding error be corrected if no individual cell knows that one has occurred? Let me quote from my paper: the importance of this issue becomes clear when we realize that erroneous cues can accidentally be incorporated into a code when our interactions with the environment are simple, and will only become evident when our environmental expectations become more demanding; and even if our code perfectly matched the given environment, we would certainly make errors as the environment itself fluctuates. So I was talking about autonomous local learning in a changing world. A purely logical inquiry into error correction is translated at every step of the thought experiment into processes learning autonomously in real time with only locally computed quantities. Moreover, the thought experiment uses familiar environmental facts about how we learn as its hypotheses, and ART circuits naturally emerge. These facts are familiar because they're ubiquitous environmental constraints on the evolution of our brains, and since we're living with them all the time, they become familiar.

Because of this universality, ART circuits may in some form be embodied in all future truly autonomous adaptive intelligent devices, whether biological or artificial. ART has, probably for this reason, already been used in many large-scale engineering and technological applications. In fact, almost immediately after ART was introduced, it began being used, because it succeeded in benchmark studies against machine learning, backpropagation, statistical methods, and genetic algorithms, getting either much better accuracy, or much faster training speed, or both. It's also used in applications where other algorithms totally fail, such as the Boeing company's parts design retrieval and inventory compression application. That's just one of many large-scale applications in engineering and technology, some of which can be found on our tech lab web page at bu.edu.
The Boeing parts design retrieval system, in particular, was used to help design the Boeing 777, and to do that you needed fast learning and stable memory to learn and search a huge and continually growing non-stationary parts inventory. At the time of this application, there were already 16 million one-million-dimensional vectors that were used to describe the parts, and you have to be able to quickly search the inventory if you want to find a part to use in a new plane, especially if the inventory might already hold a part similar to one in your new design. Finding it and slightly modifying its design could save millions of dollars in fabrication costs.

Satellite remote sensing is another large-scale application that ART was used for very soon, and Gail Carpenter and her colleagues took the lead here. For example, using a very small number of pixels of ground truth for 17 vegetation classes, they used ART to automatically complete these maps using remote sensing data. ART did it in a day, rapidly and automatically; it gave a confidence map for each pixel, and the pixels were 30 meters in scale, which was small enough to see roads. This contrasted with an AI expert system, which took a whole year to do it and had to derive ad-hoc rules from experts. You had to correct upwards of a quarter of a million site labels, and even so, the pixel size was an order of magnitude larger.

Gail went on with her colleagues to study information fusion in remote sensing. Let's say you have multiple observers, each of them maybe using different labels. Their labels may also be incomplete, or missing, or even incorrect, and the task was to derive consistent knowledge from potentially inconsistent data, to automatically learn and stably store one-to-many mappings. Along the way, Gail and a colleague showed how to self-organize a hierarchy of cognitive rules, including confidence measures between the different levels of the hierarchy.
There's been continual work on ART. Some more recent work was summarized in a special issue of Neural Networks in December 2019 that was edited by Donald Wunsch, who started the special issue with a general overview of neural network models that I and my colleagues developed, and then went on, in a long and detailed article with several collaborators, to provide a survey of adaptive resonance theory neural network models for engineering applications up to the present time.

So backpropagation and deep learning are a feedforward adaptive filter, but ART is more than that. In fact, ART is an explainable, self-organizing production system in a non-stationary world. What do these words mean? ART is self-organizing because it can autonomously carry out arbitrary combinations of unsupervised or supervised learning trials with the world as its only teacher. It's a production system because it uses hypothesis testing to discover and learn rules by a top-down matching process that focuses attention on critical feature patterns, the patterns that predict behavioral success, while suppressing irrelevant features. ART is explainable using both its activities, or short-term memory (STM) traces, and its adaptive weights, or long-term memory (LTM) traces; that is, both its activation dynamics and its learning dynamics. Observing the STM traces in a critical feature pattern explains what recognition categories will learn to code and what features predict goal-oriented actions. In particular, the long-term memory traces in the fuzzy ARTMAP algorithm translate into explicit fuzzy IF-THEN rules that code what combinations of critical features, in what numerical ranges, effectively control predictions, thereby illustrating one of many examples where neural networks can learn rule-based behaviors.
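As an illustration of how LTM traces can be read off as rules, here is a toy sketch (my own; the variable names are hypothetical) of the standard geometric reading of fuzzy ART weights: with complement coding, a category's weight vector encodes, for each feature, a numerical interval, and the category then reads as a fuzzy IF-THEN rule over those intervals.

```python
# Toy sketch: reading a complement-coded fuzzy ART category weight vector
# as per-feature intervals, i.e., as an explicit fuzzy IF-THEN rule.
# With complement coding, w = [u1..un, 1-v1..1-vn] encodes the hyperbox
# [u_i, v_i] along each feature dimension i.

def weights_to_rule(w, labels):
    """Return {feature_name: (low, high)} for a complement-coded weight."""
    n = len(w) // 2
    return {labels[i]: (w[i], 1 - w[n + i]) for i in range(n)}

# A hypothetical learned category over two features:
w = [0.2, 0.6, 0.5, 0.1]          # encodes x1 in [0.2, 0.5], x2 in [0.6, 0.9]
rule = weights_to_rule(w, ["x1", "x2"])
print(rule)
```

Read aloud, this category says roughly: IF x1 is in [0.2, 0.5] AND x2 is in [0.6, 0.9], THEN predict this category's output, which is the sense in which the learned weights are directly explainable.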
ART includes a bottom-up adaptive filter, or feedforward neural network, as I've observed already, but that's supplemented by top-down learned expectations and two types of recurrent inhibitory feedback interactions that help to choose the recognition categories and the critical features. Notably, top-down expectations use what Gail Carpenter and I call the ART matching rule to learn how to focus attention on critical features that control predictive success. The ART matching rule is another way of talking computationally about the process of object attention, how we pay attention to salient objects in the world, and we show how it stabilizes learning and thereby avoids catastrophic forgetting. Remarkably, and this has been supported by many data, the ART matching rule can be realized by a top-down, modulatory on-center, off-surround network. Well, what does this mean? Let's say we have bottom-up inputs from external features to feature-selective cells that get stored in short-term memory. Let's say we activate a recognition category which has previously been learned, and it tries to read out its learned excitatory prototype. Well, it can't fully do so, because it also reads out an inhibitory off-surround that's broader than the prototype, so this is approximately one excitatory input against one inhibitory input.
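Putting the bottom-up adaptive filter, the top-down matching rule, and mismatch reset together, here is a compact sketch in the spirit of ART 1 for binary inputs with fast learning. This is a simplified, illustrative rendering, not the published algorithm verbatim; the class name, parameter values, and variable names are mine.

```python
# Compact sketch of an ART 1-style hypothesis-testing-and-learning cycle
# for binary inputs, with fast learning (simplified illustration).

def norm(v):                       # |v| = number of active features
    return sum(v)

def AND(a, b):                     # binary intersection: the matching rule's
    return [x & y for x, y in zip(a, b)]   # "two-against-one" survivors

class ART1:
    def __init__(self, rho=0.7, alpha=0.001):
        self.rho = rho             # vigilance parameter
        self.alpha = alpha         # small choice (tie-breaking) parameter
        self.w = []                # one learned prototype per category

    def present(self, I):
        # Bottom-up filter: rank categories by the choice function T_j.
        order = sorted(range(len(self.w)),
                       key=lambda j: -norm(AND(I, self.w[j]))
                                      / (self.alpha + norm(self.w[j])))
        for j in order:            # search cycle: resonance or reset
            match = norm(AND(I, self.w[j])) / norm(I)
            if match >= self.rho:  # orienting system stays quiet: resonate
                self.w[j] = AND(I, self.w[j])   # fast learning on this trial
                return j
            # else: mismatch reset -- category j stays off, search continues
        self.w.append(list(I))     # no acceptable match: learn a new category
        return len(self.w) - 1

net = ART1(rho=0.7)
c1 = net.present([1, 1, 1, 0, 0])
c2 = net.present([0, 0, 1, 1, 1])  # poor match with c1 -> reset, new category
c3 = net.present([1, 1, 1, 0, 0])  # direct access back to the first category
print(c1, c2, c3)                  # 0 1 0
```

The `match >= rho` test plays the role of the orienting system's excitation-inhibition balance, and re-presenting a learned pattern selects its category without search, the direct-access property discussed later in the lecture.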
The top-down expectation alone can only give you a modulatory on-center. But if you have both bottom-up inputs and the top-down expectation simultaneously active, then within the bounds of the prototype, where you also have a bottom-up feature, you have two excitatory inputs against one inhibitory input, and those features can be selected, gain-amplified, and synchronized to start focusing attention on this critical feature pattern, while outlier features, the ones that aren't within the prototype and only have one excitation against one inhibition, are suppressed.

In 1999, I was able to begin to understand how laminar cortical circuits carry out object attention. In particular, layer 6 of a higher cortical area can activate layer 6 of a lower cortical area, either directly or via layer 5, and then it can fold back up into layer 4 to modulate an on-center and to inhibit an off-surround. So attention acts via a top-down, modulatory on-center, off-surround network, via folded feedback within laminar neocortex. This is one example of the paradigm of laminar computing that I introduced, which asks: why are all neocortical circuits organized in layers, and how do laminar circuits give rise to all kinds of biological intelligence?

Adaptive resonance completes the story, because attended feature clusters reactivate their bottom-up pathways, and activated categories reactivate their top-down pathways, closing an excitatory feedback loop between features and categories and giving rise to a feature-category resonance that synchronizes, amplifies, and prolongs the system response between the attended critical features and the category to which they are bound. It's this resonance that triggers fast learning in the bottom-up and top-down adaptive weights, which is why I have called the theory adaptive resonance theory. Moreover, I've done a lot of work since then showing that all conscious states are resonant states, and these feature-category resonances are one example of that, one that supports conscious recognition of visual objects and scenes. There's a
lot of data support for ART's predictions. It's well known that attention does have an on-center, off-surround circuit behind it, and that attention can facilitate matched bottom-up signals; there are many other data as well. So now we can say more about why ART is explainable, or trustworthy. In short-term memory, it's because the critical feature patterns determine the attentional focus that controls information processing, and you can just read off what those features are. In long-term memory, again it's the critical feature patterns that determine the adaptive weights learned by the bottom-up adaptive filter and the top-down learned expectation, so you also know what those weights are encoding. ART is reliable and avoids catastrophic forgetting because outlier features that are not in the critical feature pattern are suppressed, so that only the predictive features are processed and coded.

ART is a production system because it carries out a kind of hypothesis testing, and this is nicely illustrated in the simplest ART model, called ART 1, that Gail Carpenter and I published in 1987. ART 1 has an attentional system that does all the category learning, the expectation learning, and the paying attention; it interacts with an orienting system which is activated when there are big enough mismatches in the attentional system and thereby drives reset and search for novel or better-matching categories.

Here's a schematic of the ART hypothesis-testing and learning cycle. Let's say you have a bottom-up feature pattern coming in. There may be many, many active bottom-up features, but I'll draw just one arrow here for simplicity. That vector of input features can activate a distributed pattern of feature-detector cells: some may be very active, some not so active, some not active at all. As this is happening, each of these active pathways is trying to turn on the orienting system, so there might be quite a few inputs converging there, but as the features are activated, each of them tries to inhibit the
orienting system, and there are as many inhibitory features as there are excitatory inputs, so this excitation and inhibition are in balance, keeping the orienting system quiet as the feature pattern goes through the adaptive filter and chooses a category. That category reads out a learned top-down expectation that obeys the ART matching rule, which can suppress some mismatched features, thereby reducing the amount of inhibition on the orienting system, and raising the question: when you have too little inhibition and too much excitation, how big a mismatch will activate the orienting system and cause reset? That ratio is determined by what's called vigilance, which I'll say more about soon.

If you don't have enough inhibition, then the orienting system gets activated. It equally activates all the cells in the category layer, because it doesn't know which cell may be active or not, so it causes a novelty-sensitive, non-specific burst of arousal; novel events are arousing. That selectively shuts off the active category, eliminating its top-down expectation and unmasking the original feature pattern, which can again go through the adaptive filter. However, now this previously disconfirmed category remains off, and the category level is renormalized, so it responds to the same input pattern with a new category. You go through this cycle of resonance and reset until you get a good enough match to either learn a new category or select a previously learned category. And it's a theorem that, as categories are learned through this matching process, search automatically disengages, leading to direct access, without search, to the globally best-matching category. This explains, for example, how we can quickly recognize familiar objects like your mother's face, even if, as we get older, we store enormous numbers of additional memories. You don't have to search your whole repertoire: when you see Mom, you get direct access and quickly say, "Hi, Mom."

There's a lot of support for this hypothesis-testing cycle. One source of support is from
event-related potentials, also called human scalp potentials, which show correlated sequences of three different evoked potentials during oddball learning tasks, an experiment that Jean-Paul Banquet and I reported in the 80s: you'll get a P120 for a mismatch, an N200 for the arousal activated by the orienting system, and a P300 for the short-term memory reset of the category layer, thereby supporting the processing stages of the search cycle. There were also physiological data from inferotemporal cortex, where categories are learned early on, from the lab of Robert Desimone, who showed an active matching process that's reset between trials during this kind of event. There are also classical data about hippocampal mismatch dynamics: it's known from numerous experiments that novelty potentials subside as learning proceeds; this is as the orienting system is disengaged. And there are more recent data using multiple-electrode studies from the lab of Earl Miller, recording from prefrontal cortex with simultaneous recordings in hippocampus, and they show that rapid object-associative learning may occur in prefrontal cortex, which is a projection of inferotemporal cortex, one of the stages of category learning, while the hippocampus may guide neocortical plasticity by signaling success or failure. Well, this is just what happens when the attentional system interacts with the orienting system.

There's also complementary computing in ART. In particular, the attentional and orienting systems are complementary, as manifested by the fact that two event-related potentials are complementary: the processing negativity, or PN, and the N200. The processing negativity is activated when there's a top-down match in the attentional system; the N200, as I just noted, is activated when there's a mismatch that activates the orienting system. And you can just look across these four rows and see that these two kinds of ERP potentials are manifestly complementary, illustrating the complementarity of the attentional and orienting systems. So this leads us
to discuss another paradigm I introduced, which I call complementary computing, and which asks: what is the nature of brain specialization? Complementary computing introduces new principles of uncertainty and complementarity that clarify why there are multiple parallel processing streams, with multiple processing stages, in our brains. A beautiful example of that is this famous image of the macrocircuit of the visual system from David Van Essen and his colleagues, where you can see the multiple parallel processing streams and the multiple stages needed to achieve what I call hierarchical resolution of uncertainty. But what are complementary properties? They're analogous to how a key fits into a lock, or how puzzle pieces fit together. In words: computing one set of properties at a processing stage prevents that stage from computing a complementary set of properties. These complementary parallel processing streams are balanced against one another, a very yin-yang kind of situation, and interactions between the streams overcome their complementary weaknesses. In fact, there are many complementary processes that are known in the brain and that have been modeled; here are just five of them, and there are many more. So this is a basic principle of brain organization.

In summary so far: backpropagation and deep learning do not have short-term memory activation patterns, including critical feature patterns, so they can't pay attention; indeed, they don't have any fast information processing. Nor do they have long-term memory top-down learned expectations, so they can't carry out hypothesis testing using interactions between short-term and long-term memory traces. Indeed, there's no neural architecture; there's just an algorithm. This is in really great contrast with complementary computing, which discusses the global organization of our brains. From the very start it was shown how easy it is to get catastrophic forgetting, and Carpenter and I showed it in ART: when we would shut down the ART matching rule, then we
demonstrated you could get catastrophic forgetting with just four input vectors, A, B, C, and D, presented in the order ABCAD, ABCAD, and so on, if they obey very simple subset relationships. Here's a computer simulation of that. Here you don't have the ART matching rule; here's ABCAD, ABCAD, and you see A is coded by category 1 here, by category 2 here, by category 1 here, 2 here; it never settles down. But as soon as you impose the ART matching rule, learning is complete by the second trial, and after that point you get direct access to the globally best-matching category.

Well, let's say a little more about vigilance. Vigilance determines what features are learned in the critical feature pattern. It clarifies how our brains learn concrete knowledge for some tasks and abstract knowledge for others. In particular, high vigilance leads to learning of narrow, concrete categories, like a category that fires selectively to a frontal view of your mother's face; low vigilance leads to learning of broad and abstract categories, like "everyone has a face." It should be emphasized that critical feature patterns are explainable at every level of vigilance. It's known from physiological experiments, by Desimone's lab again, that there is vigilance control in inferotemporal cortex, which they showed by studying easy versus difficult discriminations in monkeys; in the difficult condition, which you'd assume would give you high vigilance, as expected you had enhancement of the responses and sharpened selectivity to the attended stimuli.

How is vigilance computed? Well, let's say an input vector instates a vector of activities in feature detectors at the same time as it tries to activate the orienting system, but it does so multiplied by a parameter ρ (rho), which is a sensitivity or gain parameter; that's vigilance. As these features get instated, they try to shut off the orienting system, and if the excitation is less than the inhibition, the orienting system stays quiet, so the system can resonate and learn,
but if inhibition isn't strong enough, the orienting system gets activated, and you get reset and search for new categories. It's a very simple computation, because you have an orienting system that's complementary to the attentional system.

Well, how do you change vigilance based on predictive success? For this we have to go from unsupervised to supervised ART models. So we'll have an unsupervised ARTa module and an unsupervised ARTb module, linked together by a learned associative map, as occurs in fuzzy ARTMAP, and the key point is that you can have an input here that can create an output there, because you have both bottom-up and top-down connections at all these levels. In this way you can learn many-to-one and one-to-many maps. One example of a many-to-one map: let's say you're trying to categorize a visually processed letter A, which comes in multiple fonts. You'll learn various visual categories of A based on visual similarity; at the same time, you're learning auditory categories of saying "A," and then the associative map can map all of these visual categories of different A's to saying "A." But it could have been that these inputs were symptoms, tests, and treatments in a medical database prediction example, and you're predicting length of stay in the hospital; the possibilities here are endless, and there have been many applications. Or let's say you're trying to figure out what an image is, and you've learned to say "that's a dog," but today you say "it's Rover," and that causes a mismatch which drives a search to focus attention on the particular combination of features in this dog that will identify it as Rover. That leads to learning of a visual category of Rover, an auditory category for the name Rover, and an associative map between them, and you can now simultaneously store expert knowledge about that image.

Well, how do you conjointly minimize predictive error and maximize generalization, so that you minimize the use of memory resources? Let me read you an answer and then show what it
means in pictures. Match tracking realizes a minimax learning principle: namely, given a predictive error, vigilance increases just enough to trigger search, and thus sacrifices the minimum generalization needed to correct the error. So let's say you've made a prediction; that must mean that vigilance is less than the analog match between bottom-up and top-down. But let's say now you have a predictive mismatch. That will lead to a match-tracking signal that bumps vigilance up until it's just above the analog match, just big enough to drive a search, so you've given up the minimum amount of generalization to correct the error.

Are mechanisms like vigilance control realized in laminar cortical and thalamic circuits? The answer is yes. My PhD student Massimiliano Versace and I showed this by developing the Synchronous Matching ART, or SMART, model, which introduced a lot more neurophysiological and anatomical detail into the model, including spiking dynamics and laminar cortical circuits interacting with specific and non-specific thalamic nuclei. This is another example of laminar computing, and here's a schematic of the model: you see all the cortical layers with identified cells, and a hierarchy of cortical regions interacting with specific thalamic nuclei and non-specific thalamic nuclei. A ton of anatomical data got functionally explained in this way, and many other data as well. For example, we showed that if you have a good enough match between bottom-up and top-down, you're going to get fast gamma oscillations during attention; there was quite a bit of data about that already. But we also showed that if you have a big enough mismatch, you'll get slower beta oscillations. That wasn't well known, but since that time there have been experiments in at least four labs, in three different parts of the brain, confirming that prediction. Most important, vigilance was shown to be controlled by mismatch-mediated acetylcholine release: a big enough mismatch in the non-specific thalamic nucleus activates the nucleus basalis of Meynert, which
releases acetylcholine in layer 5 cells across the neocortex, reducing afterhyperpolarization currents and causing vigilance to go up. I also showed that breakdowns in acetylcholine modulation can help to explain the symptoms of multiple mental disorders.

Now, as to memory consolidation: we know there's a dynamic phase of memory consolidation, while the input exemplar still drives memory search and before direct access occurs. But what if the orienting system is cut out? What if you have a lesion in the hippocampus? Well then, as occurs in medial temporal amnesia, you get unlimited anterograde amnesia, because you can't search for new categories, but only limited retrograde amnesia, because you can still have direct access to previously learned categories. This is a failure of consolidation, which is mediated by the orienting system. You also get defective novelty reactions, because those too are mediated by the orienting system; memory consolidation and novelty detection are mediated by the same structure, for the same reason. There is normal priming, because priming occurs within the attentional system: learning of the first item dominates, and you can get some learning, but you can't then search. And there's an impaired ability to attend to relevant dimensions of stimuli, again because you can't search.

So now, where does inferotemporal cortex fit within the larger brain? I introduced the predictive ART, or pART, model in order to show how the prefrontal cortex, among other things, learns to control all higher-order intelligence. You can find that in a 2018 paper on my web page; I also published it open access. In this macrocircuit, the green areas of prefrontal cortex control processes like working memory, learned plans, prediction, and optimized action. The regions in red control processes like reinforcement learning, emotion, motivation, and adaptively timed learning; the category learning I've talked about is just in those two regions. Other regions control visual perception, and there are detailed
models of all of these regions and their interactions now. And each brain region, in nature and in predictive ART, carries out a different function, contrasting dramatically with the homogeneous organization of a typical deep learning network. So I've told you just a little bit about some aspects of cognition and why they're explainable, but if you put in all the biological models of perception, cognition, emotion, and action, they're all explainable. And then you can assert how perceptual and cognitive processes use ART-like excitatory matching and match-based learning to create self-stabilizing, attentive, and conscious representations of objects and events that embody increasing expertise about the world. Moreover, complementary spatial and motor processes, which I couldn't mention at all, use inhibitory matching and mismatch-based learning to continually update spatial and motor representations to compensate for bodily changes throughout life. Taken together, they provide a self-stabilizing perceptual and cognitive front end for conscious awareness and knowledge acquisition, which can intelligently manipulate the more labile spatial and motor processes that enable our changing bodies to act effectively on a changing world. And when you put them all together, they provide a blueprint for designing autonomous adaptive algorithms and mobile robots with behaviors humans can understand and control, because they're both explainable and reliable. See my web page, sites.bu.edu/steveg, for these models. And with that, I'd like to thank you very much for your attention.

Before we get started with the questions, I should say that everything I've talked about, and much more, is in my book Conscious Mind, Resonant Brain: How Each Brain Makes a Mind. For those who don't know it, it's self-contained and non-technical; it's written in a conversational style, so that people who know nothing about the mind or the brain can enjoy reading it. I have friends, a rabbi, a minister, a painter, a gallery owner, a lawyer, and a social worker, who've all been enjoying reading it. Also, it's a big book: it's almost 800 double-column pages with over 600 color figures, so everything is illustrated. But instead of costing $150, it costs $35 for the hard copy and only $17 for the Kindle, because I spent a lot of my own money so that people who are interested in the topic can read it. One other comment: if people do have questions or comments about my lecture or anything they read in the book, my email is just steve@bu.edu, and I'll be happy to try to reply. So thank you.

Now, some researchers in explainable AI, like Lina Joffania and Antonio Di Cecco, demand that any explainable AI should at the very least meet these four criteria: to be fair, that is, not biased in one way or another; to be accountable, or reliable; to be secure against malicious hacker attacks and not easily fooled; and also to be transparent. Now, you explained how adaptive resonance theory, or ART (and by the way, I have to say I love your creative and clever use of acronyms for your models; my favorite one is the SOVEREIGN model: Self-Organizing, Vision, Expectation, Recognition, Emotion, Intelligent, Goal-oriented Navigation, if I'm correct; amazing), can address and overcome the issues of accountability, security, and transparency of current deep learning approaches. But it seems that this fairness issue, a.k.a. the problem of algorithmic bias, has also been a growing concern lately, especially since it's regarded by some researchers, like Antonio Badia, as a practically intractable problem. So I wanted to ask: in what ways do you think ART can contribute to the ongoing quest for mitigating this problem?

Well, when Ali sent me this question, I said: first, I'd like you to send me a definition of algorithmic bias that will clarify what you have in mind, so that I know what I'm trying to respond to. And you wrote me that you borrowed the term from Badia's book The Information
Manifold, and you sent me a quote from page 247 that I will quote in part before I respond, as background information: "There are two main reasons why an algorithmic approach to decision-making may result in unfair outcomes, either at the individual or group level. One is that the data used is biased, and another is that the algorithm analyzes the data in such a way that it yields biased results. The basic point to remember is that algorithms are designed to achieve a certain goal, not created naturally by evolution or accident. Thus most algorithms are written to detect certain patterns of interest, or a particular objective, not just any pattern. To be able to pick out some patterns and disregard others, programmers build a model of the data by listing expectations about what the data should be like in order to qualify as relevant to the problem." Well, as I'll explain below, self-organizing learning, classification, and prediction models like adaptive resonance theory, or ART, overcome all of these problems; ART is a general-purpose device. But why don't I try to answer that as part of my replies to the subsequent questions?

Okay, thank you so much. Now, as you also mentioned in your lecture, in 1988 you pointed out 17 issues with back propagation in one of your most famous and highly cited papers on nonlinear neural networks. So it's been 34 years now. Do you see any fundamental, confirmed change of perspective happening in deep learning research, or are we still adding epicycle upon epicycle to our Ptolemaic model?

Well, you've sort of anticipated what I'm going to say. As I said in my lecture, various investigators (I mentioned Clune and Velez, for example) have recently attempted to modify deep learning to overcome some of these problems. But as Ali just mentioned, my lecture noted that, at least to my mind, these are like epicycles added to a kind of Ptolemaic model of the solar system in an attempt to overcome some of its problems. But as we all know, the Ptolemaic model ultimately crashed because it was both qualitatively and quantitatively wrong, and its problems could only be solved by throwing out the Ptolemaic model and replacing it with the Copernican model that became the basis for modern astronomy and astrophysics. So ART overcomes foundational deep learning problems that can't be solved using epicycles, and it already did so shortly after I introduced it in 1976. It can't be overemphasized: as I noted in my lecture, deep learning is untrustworthy because it's not explainable, and it's unreliable because it can experience catastrophic forgetting. And that happens for a basic reason: deep learning, just like back propagation, which is its learning engine, is just a feed-forward adaptive filter. As you noted in your question, I described these two problems, in addition to 15 others, in my oft-cited article that I published in 1988 in the first issue of Neural Networks, and I also showed that ART had already solved these problems in 1976. What I find sad is that back propagation and deep learning architects like Geoffrey Hinton, who knows all of this background, never mention this history and keep talking about making deep learning explain the brain. But it can't explain the brain, because its foundation is contradicted by basic psychological and neural data.

Yes, great. In the deep learning community I like comparative discussion and criticism, but I don't like solipsism in science. Great, thank you. Now, on slide number 50 of your presentation, you pointed out that ART is inconsistent with models where top-down matching is suppressive, such as Bayesian "explaining away." A similar view is evident on page 195 of Conscious Mind, Resonant Brain, where you also add that one of many serious problems of the Bayesian models is that fully suppressive matching circuits cannot solve the stability-plasticity dilemma. Now, would you care to further elaborate on this point?

Sure. My lecture and my book summarize (well, my book does; my lecture couldn't go into a lot of
it) some of the copious psychological, anatomical, and neurophysiological evidence that expectations, which obey what Gail Carpenter and I call the ART Matching Rule, are matched against bottom-up input patterns. As the lecture briefly noted, the ART Matching Rule is realized by a modulatory on-center, off-surround network. The modulatory on-center is excitatory; however, acting by itself it can't fully excite its target cells. It can prime them, that is, sensitize or modulate them, to be ready to fire vigorously when matched bottom-up inputs arrive. And when there is a good enough match between the bottom-up input and an active top-down expectation that's being read out through a circuit that obeys the ART Matching Rule, that's when you get what I called in my lecture a feature-category resonance, because it develops between the matched, or attended, features and the recognition category that they activate. It's this resonance that synchronizes and gain-amplifies the matched features while suppressing the mismatched features. And that sustained resonance is important because it lasts long enough to drive learning in the more slowly varying adaptive weights of the active bottom-up filter and the learned top-down expectation. It's because resonance triggers learning that I call the theory adaptive resonance theory. The ART Matching Rule avoids catastrophic forgetting, as I briefly mentioned in the lecture, because it suppresses the irrelevant features using its off-surround while amplifying and focusing attention on the critical features that regulate both bottom-up and top-down learning, as well as successful predictions. The critical features are relevant because they've been selected by previous learning experiences, which discover the set of features that are predictive, or causal, in a given situation. And along the way, not only does the ART Matching Rule achieve causality in predictions (although, as the world changes, you have to update your causal explanations), it also solves the stability-plasticity dilemma. In brief, purely suppressive matching can't do any of this: it shuts off the expected data, and so you can't focus attention on it or learn about it. There is fully suppressive matching in spatial and motor learning, but that isn't learning to be expert about the world; I could explain that more if you want to know, and it's also in my book. These two kinds of learning, the excitatory match-based learning and the inhibitory mismatch-based learning, are computationally complementary; it's another example of complementary computing. The match-based learning goes on in the ventral, or What, cortical stream, and the mismatch-based learning goes on in the Where, or dorsal, cortical stream: the What stream for perception, categorization, and prediction, the Where stream for spatial representation and action. And then you need What-to-Where and Where-to-What interactions, so that you can reach for, approach, and otherwise engage the things that you've recognized.

Thank you. Now, following from the previous question, and considering that the free energy principle and active inference framework, as works in progress, are related to predictive coding and the Bayesian brain hypothesis: what is your view on the extent of compatibility between ART and active inference? Despite some prima facie similarities between the two, do you see them as fundamentally incompatible or irreconcilable? And how could this issue be rigorously evaluated and positively resolved, in terms of reconciliation or integration of ART and active inference, or otherwise? To add some more context here: Smith et al., in their recent paper "An Active Inference Approach to Modeling Structure Learning," have stated that although they have not explicitly incorporated ART's top-down attentional and feedback mechanisms, there are mechanisms within their active-inference-based model which they believe are quite
similar to the top-down and bottom-up feedback exchanges in ART. So there seems to be some degree of disagreement about the compatibility between the two frameworks.

Well, let me try to respond to the two parts of your question separately, so I'm not going to talk about Smith et al. for a moment. Let's talk about free energy. I like getting definitions on the table, because it's really so frustrating to try to remember what something is while someone's talking about it. So I go to Wikipedia. Wikipedia writes, in part, that the free energy principle asserts that systems minimize a free energy function of their internal states, which entails beliefs about hidden states in their environment. The implicit minimization of free energy is formally related to variational Bayesian methods and was originally introduced by Karl Friston as an explanation for embodied perception in neuroscience, where it's also known as active inference. And we all know Karl is a brilliant and very insightful man. The free energy principle describes the behavior of a given system by modeling it through a Markov blanket that tries to minimize the difference between its model of the world and its sense and associated perception. This difference can be described as surprise, and it is minimized by continuous correction of the system's world model. One more part of the quote: the free energy principle has been criticized for being very difficult to understand, even for experts, and the mathematical consistency of the theory has been questioned by recent studies. Discussions of the principle have also been criticized for invoking metaphysical assumptions far removed from testable scientific prediction, making the principle unfalsifiable. And in a 2018 interview, Friston acknowledged that the free energy principle is not properly falsifiable. So that's Friston himself.

My main concern with the free energy principle, just like with any theory about how a brain makes a mind, is: how much data can it explain in a principled and unifying way? That's what we do in science: we develop theories to explain and predict data. In the case of the free energy principle, from what I can see, the answer is essentially no data, and you can correct me if I'm wrong. It therefore cannot be evaluated as a physical theory at all. And there's a basic reason for this problem: our brains are designed to autonomously learn, in real time, in response to a changing, or non-stationary, world that's filled with unexpected events. Like today: we're experiencing an unexpected event; I didn't know until recently that I'd be enjoying your company today. Optimization principles were designed to cope with stationary dynamics, whose rules and probabilities do not change through time. So it's not possible to "minimize the difference between their model of the world and their sense and associated perception," because there is no predefined model of a world that is always changing in unexpected ways; you need a theory about how the world changes. Surprise occurs in ART when there's a big enough mismatch between an input pattern and the top-down expectation of the currently active category. This mismatch activates the ART orienting system that I briefly discussed in my talk, which interacts with the attentional system, where the category learning occurs. As I illustrated in our discussions of search and vigilance, the mismatch drives hypothesis testing in a memory search to discover a better match, or to begin to learn a new category. ART discovers a better match in the case where the system was attending some other familiar features when the new input occurred, but the features of the new input have previously been categorized; that's why they're familiar. Then the orienting system very quickly shifts attention to the matching category, you resonate, and you often recognize it consciously. ART begins to learn a new category when the input represents a truly unfamiliar and novel
situation.

Now, as to Bayesian methods in science: hey, I'm a mathematician, how can I not love Bayes, right? But the beauty of Bayes is its simplicity. You just write the probability of two events A and B in two different ways: P(A and B) = P(B|A)P(A), and P(A and B) = P(A|B)P(B). Set them equal, because they're identical, divide by, say, P(A), and you get Bayes' rule: P(B|A) = P(A|B)P(B)/P(A). Then optimize. That's Bayes. It's a useful statistical method and should continue to be used in statistics, but it's just a formal identity, and therein lies all its power. It says nothing about any physical reality, whether in physics, chemistry, or biology. The Bayes rule itself tells us nothing about physical reality and contains no heuristics to discover anything about physical reality. For that, you need to develop models driven by a profound analysis of large databases. So it turns out that biological models like mine do not incorporate the Bayes rule. However, ART does routinely choose the best, or optimal, categories that represent the data, so you don't need Bayes to achieve optimality. Also, Bayes works best in a stationary world with stationary probabilities, and ART is designed to learn about a non-stationary world. One can discuss Bayes till the cows come home; it's good for what it was designed for. Some of the neuroscientists who try to apply Bayes are wonderful experimentalists, but they know no math and no theory, and Bayes is the temptation of a free lunch: there it is, waiting to be applied. But there is no free lunch in science.

Thank you. Now, as a final point of comparison, what are the...

Smith et al., I didn't reply to them yet.

Yes, yes, okay.

You quoted a sentence of Smith et al., but before that sentence, Smith et al. wrote: "It is also worth highlighting that, as our model is intended primarily as a proof of concept and a demonstration of a model expansion and reduction approach that can be used within active inference research, it does not explicitly incorporate some aspects, such as top-down attention, that are of clear importance to cognitive learning processes and that have been implemented in previous models. For example, the adaptive resonance theory (ART) model of Grossberg was designed to incorporate top-down attentional mechanisms and feedback mechanisms to address a fundamental knowledge-acquisition problem: the temporal instability of previously learned information that can occur when a system also remains sufficiently plastic to learn new and potentially overlapping information. While our simulations do not explicitly incorporate these additional complexities, there are clear analogs to the top-down and bottom-up feedback exchanges within our model, such as the prediction and prediction-error signaling within the neural process theory associated with active inference." ART addresses the temporal instability problem primarily through mechanisms that learn top-down expectancies, which provide attention, and match them against bottom-up input patterns, which, they say, is quite similar to the prior expectations and likelihood mappings used within active inference. But as I've already noted, the "prior expectations and likelihood mappings" within active inference do not have any of the key learning, attention, and memory stability properties of the ART Matching Rule. The ART Matching Rule, in its variations, is a unique solution to that problem, and it's been supported by psychological, anatomical, physiological, and biophysical data. It also occurs in many species: Nobuo Suga, for example, showed that it occurs in bats; it occurs in ferrets; and so on. Also, I think it's important to note that when learning begins in an ART model, it doesn't need prior expectations or likelihoods. In fact, typically the initial bottom-up weights are chosen to be random, because you don't know what you're going to be experiencing, and the initial top-down expectations are chosen to be large, so that whatever category happens to be learned, when it reads out its top-down expectation it can
match whatever features activated that category. So the top-down weights all start large and are then pruned to match the critical features that happen to be learned in that category. So there are no built-in models; ART discovers its own models. I should also emphasize that active inference is not explainable, whereas ART is explainable, because a currently active critical feature pattern, namely the pattern of features to which attention is paid, controls all learning and prediction by the model, and in principle can be measured by neurophysiological experiments. A model without cell activities, or short-term memory traces, that can represent the critical feature pattern can't be explainable. So I think there are qualitative differences. I don't say people shouldn't use active inference; it may be incredibly useful and powerful in technological applications. But when one is doing brain science and psychology, it just doesn't match the foundational data.

Thanks. And I guess you've somehow already answered a part of this question, but: what are the possible ways in which ART's approach to explainable AI, which, if I'm not mistaken, can be described as a model-dependent, intrinsically explainable approach, can inform active inference, or be cross-fertilized with it, given that active inference is based on abductive reasoning through constructing generative models, for example as sketched out in Parr and Pezzulo's "Understanding, Explanation, and Active Inference" paper?

Well, first, I don't think ART is model-dependent. As I just noted, one typically begins to learn in ART with random initial bottom-up weights and uniformly distributed initial top-down adaptive weights, so you can match any category that you happen to learn. But the authors you quoted write, in part (and here I want to quote them, so we can respond in a little more detail), that active inference "implies a deep generative model that includes a model of the world used to infer policies, and a higher-level model that attempts to predict which policies will be selected based upon a space of hypothetical, that is, counterfactual, explanations, and which can subsequently be used to provide retrospective explanations about the policies pursued." So again, ART works without a generative model of the world or any predefined policies. Of course, what it's trying to discover is whatever changing world it happens to be in, and nobody knows what that is a priori. In general, an ART classifier responds to a front end of preprocessors that process perceptual data from one or another sense, notably vision and audition, where we get most of our information about the world. That's why a classifier like ART begins its work in the brain in the temporal cortex, where it receives highly preprocessed perceptual representations. Decades of work went into understanding how our brains consciously see and hear. In the case of vision, ART classifies perceptual boundaries and surfaces that require multiple stages of processing because, as I mentioned briefly, they are the outcome of what I call hierarchical resolution of uncertainty. You need multiple stages to define a perceptual boundary or surface, one reason being that our sensory organs register such noisy and incomplete data. You may know that our photosensitive retina has a huge blind spot where you can't register any visual signal; the blind spot is as big as the fovea, where all of our high-resolution vision occurs, so it's not a little thing. Moreover, veins cross the retina and occlude it in multiple places, and you can't register visual signals on the veins either. So the signal you're getting is very incomplete, and it takes multiple processing stages to overcome those uncertainties. I worked for decades to explain how that happens. Maybe I'll stop there.

Thank you so much. Now, this next question is of personal interest to me, because currently I'm working on modeling some probabilistic aspects of affective
response to music, and your most recent paper, "Toward Understanding the Brain Dynamics of Music," immensely helped me gain a better understanding of entrainment. As you pointed out in the supplementary notes for this paper, violation of prior learned expectations is instrumental in inducing a wide range of affective responses in musical and non-musical situations. Some psychologists, such as Patrik Juslin, have distinguished between perception and arousal of emotions...

But you left out a question. Is it for last, or did you just skip it? I thought quite a bit about it.

Which question?

"How do you see the future of ART and neuro-inspired AI?"

I think that will be our last question.

Okay, so we'll come back to that.

Yes, yes. Thank you, because that's an important question.

Okay, so sorry to interrupt; I just wanted to be sure.

No problem. Thank you. Now, yes: some psychologists, such as Patrik Juslin, have distinguished between the perception and the arousal of emotions in the context of musical experience. Also, several studies, such as the works of Juslin and Gabrielsson from the psychology department of Uppsala University, have shown that despite music's ability to communicate a wide range of positively and negatively valenced emotions, it somehow evokes mostly positively valenced emotions. For instance, we can easily perceive rage or anger in music without necessarily getting angry. On the other hand, we're more likely to actually feel elevated and happy after listening to happy music. The evidence also shows that this disparity between perception and evocation of emotion is probably even more significant in musical experiences than in non-musical experiences. So how can this difference and diversity between perceived and aroused, or evoked, musical emotions be accounted for within the ART framework? Can it possibly be regarded as another kind of broken symmetry, as you mention on page 621 of your book, but specific to musical affect?

So, as you've noted, I haven't studied these issues in the context of music, but I'll try to venture some general comments. I should first note the LAMINART model, which is a development of ART that shows how and why all neocortical circuits that support perception and cognition typically share a canonical six-layer circuit. My colleagues and I have modeled how, just as in our brains, variations of this canonical laminar circuit can support all perceptual and cognitive processes. So there's a major generalization of ART, and we've done it for vision, speech, and cognitive working memory and planning in particular. The main point is that the laminar circuitry is basically the same in all perceptual and cognitive areas: vision, audition, and so on. That's how one can create a context for discussing music. In fact, my work on music applied such discoveries: I was able to put together discoveries that had been made based on what I believe were different evolutionary pressures on the organization of our brains, and to sketch how, if you put some of them together in a certain way, our capacity for learning and consciously performing music could arise. So now, how about arousal? Well, it's essential, of course, to all awareness and consciousness; neocortex needs to be adequately aroused for waking consciousness to occur at all. Arousal also plays a major role in the processing of emotions, and it's very relevant to musical issues. My gated dipole model explains how opponent processes are organized in all parts of our brain: perceptual, cognitive, motor, and affective. In particular, emotions are organized in pairs in such an emotional dipole, and one reason is that emotions need to compete with each other, such as fear versus relief. For example, in post-traumatic stress disorder therapy, a therapist may try to help a patient think about positive experiences that generate relief, in order to inhibit the chronic fear that's so destabilizing during PTSD. So their
opposites are competing. Another property that arousal enables concerns the sudden offset of an emotion like fear, say during escape behavior. A favorite example of mine: some cruel experimentalist puts a pigeon in a Skinner box whose floor is electrified. The pigeon is feeling pain and fear; it's dashing frantically around, trying to keep its feet off the floor. It bangs into a red buzzer, the buzzer shuts the shock off, and the animal experiences a wave of relief, or positive motivation, for learning the escape response. So the rebound from fear to relief, which can be associated with actions that lead to escape and can motivate escape in the future, is energized by arousal. In the gated dipole, when you shut off the external cue of shock, the arousal remains tonic; that is, it sustains activity in both the fear and the relief channels. Because the arousal is sustained, or tonic, through time and equally activates the fear and relief channels, when fear suddenly decreases, arousal in the relief channel wins the competition and can thereby cause what I call an antagonistic rebound from fear to relief, which activates the relief channel and thereby provides motivation for escape, whether reactive or learned through escape experiences. I also proved, and this is related to some degree to music, that the non-occurrence of an expected event can by itself cause a burst of arousal, and that the resulting antagonistic rebound can flip emotions from positive to negative in so doing. I always loved that discovery, because I especially love discoveries where the occurrence of nothing has profound effects on future behavior: it's the non-occurrence of an expected event that can flip emotions. Now, how this influences the perception of music needs more work; as I mentioned, I haven't tried to study that. My paper on music is, I feel, like a drop in the bucket, and hopefully, if I don't get around to it, someone else will.
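The rebound dynamics just described, where tonic arousal feeds both opponent channels through slowly habituating transmitter gates so that sudden offset of the fear input lets the relief channel transiently win the competition, can be sketched numerically. This is a minimal illustrative simulation, not Grossberg's published gated dipole equations; the update rule, parameter values, and the `simulate` function are assumptions chosen only to exhibit the antagonistic rebound.

```python
# Minimal gated dipole sketch: an ON (fear) and OFF (relief) channel share
# tonic arousal and compete; each signal passes through a slowly habituating
# transmitter gate. All parameters are illustrative assumptions.
def simulate(steps=4000, dt=0.01, arousal=1.0, offset_step=2000):
    A, B, C = 0.05, 1.0, 0.5        # gate recovery rate, capacity, depletion rate
    z_on = z_off = B                # both transmitter gates start fully accumulated
    rebound_peak = 0.0
    for t in range(steps):
        J = 2.0 if t < offset_step else 0.0  # phasic "shock" input, then sudden offset
        s_on, s_off = arousal + J, arousal   # tonic arousal drives BOTH channels equally
        # Habituation: each gate depletes in proportion to the signal it transmits
        z_on += dt * (A * (B - z_on) - C * s_on * z_on)
        z_off += dt * (A * (B - z_off) - C * s_off * z_off)
        # Opponent competition: each channel outputs its rectified gated advantage
        on_out = max(s_on * z_on - s_off * z_off, 0.0)
        off_out = max(s_off * z_off - s_on * z_on, 0.0)
        if t >= offset_step:        # after shock offset, the less-depleted OFF gate
            rebound_peak = max(rebound_peak, off_out)
    return rebound_peak

print(f"relief rebound peak after shock offset: {simulate():.3f}")
```

During the shock, the ON gate habituates more than the OFF gate; at offset, both channels receive only the equal tonic arousal, so the less-depleted OFF gate transiently wins, producing the rebound from fear to relief, which then decays as the ON gate recovers.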
but the above example show that arousal and emotion are not the same thing because arousal can support all emotions fear relief hunger satiety whatever the ones that win the competition are then able to support compatible behaviors by motivating them and I've also explained that the level of arousal must be chosen within an intermediate range to support normal behaviors it's a kind of golden mean there's an inverted U on the effects of having arousal from too little or too much and if you have too little arousal you have an under arousal syndrome which can support symptoms like autism and over arousal can support symptoms of these like schizophrenia there's only one of many factors in these diseases but I'm happy to say that subsequent clinical data supported those predictions that these two mental disorders are a parasitic stream of the arousal inverted U so I think and you know this is very speculative because I've never really seriously studied it and I try not to speculate but what the hell oh the kind of arousal at music activates a generally positive just like the arousal that activates exploratory behaviors is positive it somehow links you know music is a sonic adventure if you like there's no aversive cue when listening to music except perhaps music that plays so loud as to cause a headache or ear damage or even a seizure in susceptible individual stuff there's no particular reason why it shouldn't be positive so now which question you want to ask me uh I think we're left with just two other questions if you're not too tired starts with as a final point of comparison another stuff yes the last we're until last the two questions so you want to uh how do you see the future of all unknowns by the air research and before that yes before that I just wanted to ask you about the your view about artificial consciousness do you see I really need the first answer to answer the second okay as you wish yes okay so yeah and how do you see yeah yes it's quite fine and 
How do you see the future of ART and brain-inspired AI research in general? In your view, what research areas ought to gain more attention than they do today?

I'll give you quite a general answer, but it implies what I think about this. First, with the caveat I like to state: I couldn't predict the present, so I can't predict the future. That being said, I believe that all engineering, technology, and AI will increasingly embody autonomous adaptive intelligence in the coming century. We can already see its beginnings in autonomous automobiles and airplanes, and in increasingly autonomous controllers on the factory floor; many people have written about it. I think ART will play a central role in this, as will other models that are summarized in my book.

That's because already in 1980 I published a thought experiment in the journal Psychological Review, which was then, and probably still remains, the leading theory journal in psychology. You may recall that Einstein derived both special relativity and general relativity from thought experiments; let me clarify, through my thought experiment, where thought experiments derive their enormous power. My thought experiment was about how any system can autonomously, and it's all about autonomy, correct predictive errors in a changing world. The hypotheses upon which the thought experiment was built were just a few facts that are familiar to us all from daily life, and they're familiar because they represent ubiquitous environmental pressures on the evolution of our brains over the millennia. When they act together, I suggest, ART is the unique outcome. That's a huge claim, and I credit the power of the thought experiment, not any personal ego trip, for that belief. In particular, nowhere in the thought experiment are the words mind or brain mentioned. So if you accept that these facts about the world exist, which we all do, and that they're always operating on us, then you have to accept the outcome, if you believe in the scientific method
and logic. So ART is a universal solution to the problem of autonomous error correction in a changing world. Said another way: if you can't find a mistake in the thought experiment, then I think you either have to believe in ART-like dynamics, maybe expressed in LAMINART or your favorite ART variant, or you have to give up your belief in logic and the scientific method. That doesn't imply that ART can't be further developed; I expect a large number of scientists and technologists to be busy developing ART-like architectures long after I'm gone. So maybe now we can go to your philosophical question, with that as background.

Yes, thank you so much. As I mentioned earlier, I just wanted to ask about your view on artificial consciousness. Do you see consciousness as artificially producible, or engineerable, as some researchers like Mark Solms do? Is there a fundamental distinction between a biologically conscious agent and an artificial agent with a fully simulated computational model of consciousness? I know it's a big question, but I couldn't resist asking your opinion as an authority on consciousness modeling.

I'm happy to give it a shot. As I just noted, my work on ART solves the universal problem of how we can learn to correct predictive errors in a changing world. My work also shows, in its analysis of hierarchical resolution of uncertainty (remember: how you go from a noisy retina to a surface representation that can control looking and reaching), how evolution may have been driven to discover conscious states. This was a surprise to me too. Conscious states were needed in order to choose the processing level, or levels, that compute a sufficiently complete, context-sensitive, and stable representation: in the case of vision, a surface representation with which to successfully plan and act to realize valued goals.

So let me make it clear. You start with a noisy retina, and you have to go up all of these stages until you get a
sufficiently complete surface and boundary representation that you can use to regulate successful action. If you used one of the earliest stages, it would lead to incorrect actions, which would kill you off by Darwinian selection. So how the hell do you know which stage is the one where you can compute the sufficiently complete, context-sensitive, and stable representation? I proposed, in vision I predicted, that the choice is embodied in what I call a surface-shroud resonance between prestriate visual cortical area V4 and the next processing stage, posterior parietal cortex, or PPC. In V4 you get this really good surface representation, and then there's a resonance between the surface and spatial attention, which fits the surface. That spatial attention in PPC is called a shroud; Christopher Tyler gave it that name. A surface-shroud resonance allows you to pay conscious spatial attention to the surface that you're going to use to control looking and reaching behavior. So it's a way of ensuring you have a good enough representation to control action.

The shroud is computed in posterior parietal cortex, which is part of the dorsal, or Where, cortical stream, and the shroud modulates invariant category learning in the ventral, or What, cortical stream. I can't go into that right now, but my book discusses it. The category learning itself in the What cortical stream, as I indicated, is supported by a feature-category resonance. So the surface-shroud resonance is modulating invariant category learning and the feature-category resonances. The surface-shroud resonance also supports conscious seeing, while the feature-category resonance supports conscious recognition, and when they synchronize across streams on a familiar object, that's when you consciously see something that you know about.

Okay, so conscious states hereby arise due to learning requirements. This sort of fell out in the wash of how you do invariant category learning, and learning in particular without catastrophic forgetting; it's the regulation of feature-
category resonances. So given that the above solutions are computationally universal in the sense I sketched, a self-organizing machine that embodies them should be able to support internal representations whose parametric properties mimic conscious states. But whether such a machine can experience conscious qualia remains as much of a mystery in the machines as it does in the humans, and that's because no computational theory, which after all is just a set of equations, can do more than imitate the dynamics of our brains, perhaps with great precision. I don't have a clue why the representations that my colleagues and I have worked so hard on, explaining huge amounts of psychophysical data about seeing, texture, shading, 3-D form, just go through the list, why they support qualia. I don't know; ask God, or whatever God you choose to believe in in the 21st century.

Thank you so much, Professor. I think we have a couple of questions in the chat, but we are actually approaching our two-hour limit. I don't know, Daniel, if it's a good place to stop, or whatever you say.

I think that's a great place to stop. You've given us a lot to think about and digest, and I hope that these words are taken well and paid attention to; they might create some categories, activate some categories. Professor Grossberg, thanks again for this amazing live stream. We really appreciate it.

I appreciate it. I'm depending on younger people like yourself to do just what you said, Daniel. I'm not going to be around that much longer, so I hope, whether with my work directly or with related work, you have a very fulfilling intellectual adventure. I know I've been on a wild ride since I was 17, and that's 65 years of discovery, and I've loved every minute of it.