Hello and welcome. It's March 13th, 2024, and we're here in Active Inference guest stream number 75.1, "Active Inference with Empathy Mechanism for Socially Behaved Artificial Agent in Diverse Situations," with Tadayuki Matsumura, Kanako Esaki, and Hiroyuki Mizuno. Thank you all for joining, and I'm looking forward to hearing your talk and discussion. Okay, thank you for inviting us to such a good opportunity. I'm Tadayuki Matsumura, and my co-researchers Kanako and Hiroyuki have also joined this discussion. We are developing AI, especially AI inspired by human behavior models, to make socially behaved artificial agents. Recently, we published a paper on this idea in the journal Artificial Life. The title is "Active Inference with Empathy Mechanism for Socially Behaved Artificial Agent in Diverse Situations." First, I'd like to explain the paper. This is the motivation of the research: as the title shows, the goal is to develop socially behaved artificial agents. We believe that sociality is an important point for applying AI in our daily life. The difficulty in realizing social agents is that appropriate social behavior depends on the given situation or culture. As shown in the bottom figure, sociality appears even in a simple walking case: for example, how fast should we walk, or how much distance should we keep from others? If we implement sociality with a rule-based program, we have to design a lot of magic numbers for each situation. This is impossible because there is a wide variety of situations. Because of this, a unified mechanism or principle for social behavior is required. This is the challenge. We assume that we humans have such a unified behavior model, because we can behave socially in diverse situations. So we develop a social agent inspired by a human behavior model, namely the free energy principle and active inference. In this research, we use active inference as the basis of the behavior model of human-like agents.
As you know, active inference is a concept proposed within the free energy principle. The free energy principle (FEP) is a unified principle for our cognitive activities. In the FEP, our cognitive activities are explained by two points: first, we have internal models for predicting the future of the surrounding environment; second, we always try to minimize the free energy of the model. Free energy is defined by this equation, and it intuitively represents the uncertainty of prediction. The interesting point of the FEP is that the internal model covers not only the hidden state but also the actions of the agent. That is, the action to be taken is the action which minimizes the free energy of the agent. This is the basic idea of active inference, as I understand it. Intentional actions can also be explained within active inference. In this case, we consider the expected free energy, which is the free energy for future states. When considering multiple time steps ahead, the expected free energy at each future time step is accumulated, as described in the right figure. Based on the expected free energy for each action, the distribution over actions is determined such that the smaller the expected free energy, the higher the probability of selection. When we consider intentional behavior, intentions are treated as preferred future observations, and these preferences are encoded into the last equation. That is my understanding of standard active inference and the behavior model of a human-like agent based on it. We extend this active inference to generate social behavior. The idea of the extension is explained using a situation with two agents, called "I" and "other." The agent called "I" is the subject of the action in this explanation. A biological agent under the free energy principle makes predictions about the environment and takes actions to minimize the uncertainty of its predictions.
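The action-selection rule described here (smaller expected free energy, higher probability) can be sketched as a softmax over negative expected free energy. This is a minimal illustration, not the paper's code; the precision parameter `gamma` is an assumption on my part.

```python
import numpy as np

def action_distribution(efe, gamma=1.0):
    """Softmax over negative expected free energy (EFE):
    actions with smaller EFE get higher selection probability.
    `gamma` is an assumed precision (inverse-temperature) parameter."""
    logits = -gamma * np.asarray(efe, dtype=float)
    logits -= logits.max()          # for numerical stability
    p = np.exp(logits)
    return p / p.sum()

# Example: three candidate actions with expected free energies
probs = action_distribution([2.0, 0.5, 1.0])
```

Here the second action, with the smallest expected free energy, receives the highest probability of selection.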
In this figure, "I" predicts the future behavior of the other, because the other is part of the environment for "I." Namely, "I" acts to reduce uncertainty about the other. So far, this is standard active inference. Here, if we consider that the other is also a biological agent, we can assume that the other also acts under active inference. Under this idea, we can assume that the other also predicts my future behavior and tries to reduce the uncertainty of its prediction about me. With this understanding, we notice that my action can reduce not only my own free energy but also that of the other. This is because my action affects not only my own free energy but also the free energy of the other. If I can predict the other's prediction about me, I can act to reduce the free energy of the other. This is the idea of the proposed extension of active inference. We extend the FEP and active inference from an individual idea to a collective idea, as described in this conceptual image. The agents marginally share a brain, and they try to minimize the free energy of the shared brain. To reduce the free energy of the shared brain, we act not only for ourselves but also for the others. Because of this action for others, we expect this idea can generate social behavior, which is behavior for "us," not only for me or only for you. To realize this idea, it is necessary to predict the other's prediction about myself. This is implemented based on the mirror system and simulation theory, a hypothesis about the mechanism of empathy. In the mirror system and simulation theory, it is said that we understand others by simulating them with our own bodies. We take the idea of using our own body to simulate others as using our own internal model to infer the other's prediction. The overview of the mechanism is described in the figure. The agent has an internal model to predict future observations from its own observations.
Moreover, we use the same internal model to predict the other's prediction about myself. Although the model is the same, the input to the model is changed to the observation of the other. For this, a process of inferring the other's observation is included. For example, when the observation is an image, novel view synthesis, which is the problem of generating images from different viewpoints, can be used. In another case, as we will see in the following evaluation, if the observation is the location of the agent, a coordinate transformation can be used. Based on this inference procedure, the expected free energy is extended as described in this equation. As with the encoding of intentional behavior, we add a term to the first term, namely the intention to respond to the expectation of the other. If there are multiple others, the sum over each other's expectation is considered. We call the proposed idea empathic active inference. We evaluate the proposed idea using path control of autonomous mobile agents. In the evaluation, multiple agents move toward their respective target points. The shortest paths of the agents intersect each other, so the agents are required to take collision-avoidance behavior. The player, the red agent, is controlled by the proposed method, and the other agents are controlled by the social force model. The social force model is a simple social walking model. Roughly speaking, it is a dynamics model in which the walking speed and direction are determined by two forces: the first is the driving force toward the goal, and the second is the repulsive force from the others. In this evaluation, the weights of these two types of forces are changed to evaluate different situations. There are two types of others: selfish or altruistic. When the force toward the goal is larger than the force from others, the agents behave selfishly. On the other hand, when the force from the others is larger than the force toward the goal, the agents behave altruistically.
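The two-force dynamics described here can be sketched as below. This is an illustrative toy, not the paper's implementation: the function name, the exponential repulsion term, and the parameters `w_goal` and `w_other` are my assumptions, chosen only to show how the two weights trade off.

```python
import numpy as np

def social_force_step(pos, goal, others, dt=0.1,
                      w_goal=1.0, w_other=1.0,
                      desired_speed=1.0, repulse_range=1.0):
    """One update of a minimal social-force walker: velocity is the
    weighted sum of a driving force toward the goal and a repulsive
    force from nearby agents. w_goal >> w_other gives a 'selfish'
    walker; w_other >> w_goal an 'altruistic' one."""
    to_goal = goal - pos
    f_goal = desired_speed * to_goal / (np.linalg.norm(to_goal) + 1e-9)
    f_other = np.zeros(2)
    for q in others:
        d = pos - q                          # points away from the other
        dist = np.linalg.norm(d) + 1e-9
        f_other += np.exp(-dist / repulse_range) * d / dist
    vel = w_goal * f_goal + w_other * f_other
    return pos + dt * vel

new_pos = social_force_step(np.array([0.0, 0.0]),
                            goal=np.array([1.0, 0.0]),
                            others=[np.array([0.5, 0.1])])
```

With equal weights the walker still advances toward the goal but is deflected downward, away from the agent sitting slightly above its straight-line path.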
These are examples of the behavior of the others under the control of the social force model. The player, in red, is moved randomly here, and the others, in green, blue, and orange, are moved with the social force model. The top picture is an example with selfish others, and the bottom picture is an example with altruistic others. The difference comes from the difference in the magnitudes of the two forces. As you can see from the figures, when the others are altruistic, they take paths with more margin from the other agents. The setup for the player is simple. The action space is discrete: movement in five directions plus waiting at the current position, six actions in total. A simple variational autoencoder-like model is used as the internal model, and the model is trained offline. The model is used to compute the expected free energy for each action. First, the evaluation results for the case with only one other besides the player are shown. We discuss the sociality of the agent based on four points: first, if the player's travel distance is longer, the behavior is social; second, if the other's travel distance is shorter, it is social; third, if the minimum margin among agents is larger, it is social; fourth, if the inequality of the travel distances is smaller, it is social. Inequality here means the difference in travel distance among agents. As a comparison, we also show results for the standard active inference case. The two figures on the left side show the case when the others are selfish. In this case, the player with standard active inference moves straight to the destination, because the player acts optimally only for its own purpose. On the other hand, the proposed agent moves to avoid the other, just as the other does, because the player predicts that the other expects the player to avoid collision, and the player tries to respond to it.
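The four sociality indicators listed above can be computed from trajectories as follows. This is a sketch under my own assumptions: paths are synchronized (T, 2) position arrays, the function name is invented, and "inequality" is implemented as the max-minus-min spread of travel distances, which is one possible reading of "the difference in travel distance among agents."

```python
import numpy as np

def sociality_metrics(player_path, other_paths):
    """Four indicators discussed in the talk: player travel distance,
    travel distances of the others, minimum margin between the player
    and any other, and inequality (spread) of travel distances.
    Paths are (T, 2) arrays sampled at the same time steps."""
    def travel(path):
        return float(np.sum(np.linalg.norm(np.diff(path, axis=0), axis=1)))
    d_player = travel(player_path)
    d_others = [travel(p) for p in other_paths]
    min_margin = min(
        float(np.min(np.linalg.norm(player_path - p, axis=1)))
        for p in other_paths
    )
    all_d = [d_player] + d_others
    inequality = max(all_d) - min(all_d)
    return d_player, d_others, min_margin, inequality

d_p, d_o, margin, ineq = sociality_metrics(
    np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]),
    [np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])],
)
```

Longer `d_p`, shorter entries in `d_o`, larger `margin`, and smaller `ineq` would all count as more social under the four criteria.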
As a result of taking this detour path, the sociality of the behavior is increased. Namely, the travel distance of the player increased compared to the standard active inference case. At the same time, the travel distance of the other decreased, and the minimum distance between the player and the other increased. In addition, the inequality of the travel distances decreased. Next, the case where the others are altruistic is shown in the right two figures. In this case, the same tendency is observed as in the case where the other is selfish, but the player moved along a path with a larger margin from the other, compared to the selfish-other case. This result indicates that the proposed agent can behave in a social manner that is beneficial to the other, even if the result is disadvantageous to itself. Another important point is that this difference in social behavior does not depend on any reward design, as in reinforcement learning. Namely, the sociality depends on the character of the others around the player. The others around the player are not just obstacles for the agent, but teachers of social manners. In the case with three others besides the player, similar results are observed in the altruistic-others case: the travel distance of the player increased, those of the others decreased, the minimum margin increased, and the inequality of the travel distances decreased. On the other hand, in the case of selfish others, a collision occurred. This may be due to high density at the center point. In this case, the proposed agent can avoid collision by waiting for the others to cross. I repeat the important point: there is no need to design a reward or penalty for sociality in each situation. Social behavior is realized by responding to the others' expectations. Finally, I would like to summarize the explanation of the paper. As I said on the first slide, the motivation of this research is to develop a unified mechanism for social behavior.
The answer to the motivation is: run an internal model to predict the future behavior of others. This is the same as the standard free energy principle and active inference. Moreover, predict the others' expectations about me with the same internal model, and try to respond to those expectations. This makes a cyclic process, and it encourages the agent to behave like the surrounding others, because, due to the mirror system, the others' expectations of me are the same as my expectations of the others. This mechanism encourages the agent to behave like the surrounding others. This summary shows our idea of what social behavior is. The answer is that social behavior is behavior like that of the surrounding others. The others are not obstacles but teachers of social behavior. This also means that we are the teachers of AI and robots for the future society. That's all; this is the explanation of the paper. I'd like to discuss it. Thank you. Would either of the other authors like to say hello or give any remarks? Hello, I am Kanako Esaki, also a researcher at Hitachi. I have been studying control of systems, especially systems with hardware. My recent interest is how to apply active inference to robot systems. Hello, my name is Hiroyuki Mizuno. I am responsible for this research. Thank you for giving us this wonderful time. Thank you. Okay, many interesting questions, but definitely very striking results about the tolerance of the active inference algorithm to a variety of different social settings: by nature of always updating at each time step based upon the expectations, it is able to deal with a range of different settings, like different numbers of agents and different behavior types for the agents. I'll read some questions that were submitted by email by Shohei Wakayama, and I'll also look to the live chat.
So I'll start with the fourth question he wrote: I would like to receive a more detailed explanation of the no-operation action, the NOP, in general. Because I noticed in the graphics that when there's going to be a collision, the active inference agent just stops. So how did you model or calibrate that decision to not move at all versus which way to move? Okay. First, I'd like to answer what the NOP action is. NOP is just "no operation." In the evaluation, NOP means waiting at the current position: the agent takes no movement action and stays where it is. That is the no-operation action. And the question is very important, because it touches a critical problem in realizing our method. The problem is that we want to decide an action, but the procedure we proposed requires the others' actions to be decided first. As described in this figure, the red circles are the actions of the others. To decide the agent's action, the others' actions must already be decided. This cyclic requirement is a very difficult problem, so we have to assume some actions for the others. I think there are three solution strategies. The first is to assume random actions and use the results of randomly selected actions. The second is to predict the action, learning and predicting with a policy network; for example, AlphaGo used this strategy. The last one, which we selected, is to assume a specific action: no operation. No operation means "I don't take any action; you act first, please." That is the idea behind assuming NOP. I believe that humans have such sociality in nature, and this is the reason why I used this third idea. Does it make sense? Yes. Okay, next question. How do you tune or calibrate the model so that it's in an area of state space that's pro-social? Like, maybe you could have a generative model that only did NOP and was just paralyzed, or another one that just goes straight forward no matter what.
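The third strategy, breaking the cyclic "my action depends on your action" dependency by assuming the others take NOP, can be sketched as below. This is an illustration under stated assumptions: the direction set and the placeholder `efe_fn`, which stands in for the internal model's expected-free-energy estimate, are mine, not the paper's.

```python
import numpy as np

# Candidate moves: five directions plus NOP (stay put), matching the
# six-action space described in the talk; the particular directions
# here are illustrative.
ACTIONS = np.array([
    [1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0], [1.0, 1.0],
    [0.0, 0.0],  # NOP: "I don't act; you act first, please."
])

def choose_action(pos, others, efe_fn):
    """Score each of my candidate actions while holding the others'
    positions fixed (i.e., assuming they take NOP), then pick the
    action with the smallest expected free energy."""
    scores = [efe_fn(pos + a, others) for a in ACTIONS]
    return ACTIONS[int(np.argmin(scores))]

# Toy usage: EFE proxied by distance to a goal at (5, 0), no others.
chosen = choose_action(np.array([0.0, 0.0]), [],
                       lambda p, o: float(np.linalg.norm(p - np.array([5.0, 0.0]))))
```

The NOP assumption is what lets a single pass over the six candidates suffice, instead of an unbounded recursion of each agent modeling the other's choice.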
So how do you explore that state space and choose which parameters are open for updating, which parameters are explored during the design phase and then fixed at runtime? At runtime, I don't adjust any parameters. The model is trained in an offline manner, and the model parameters are explored and searched during the training phase. But the parameter space is not so huge in this case, because the model is very simple. The observation is just two-dimensional information, X and Y, the location of the agent. So in the case of four agents, it is just an eight-dimensional space. We do need some time steps, but even then the dimension is not so large, so the hidden state is also not so large. Does that make sense? Yes. The evaluation situation is very simple, so the model is also simple. I didn't search the parameters very extensively; that was not tough work. I simply tried some parameter sets and picked the best one. So this was an in-silico, software-only simulation, right? Yes, this is just a simulation. So how did you tune it? And then, I guess, a bigger question, and definitely one that a lot of people learning active inference are interested in, is: how do you go from having an in-silico simulation to bringing that onto a physical robot? What are the steps in the process between having the situation in the simulation and bringing it into, say, a room? Yes, that's a very interesting point, and it's very important. Kanako has tried to apply active inference to a real robot. What do you think, Kanako? So, I also tried, in a small space, to apply active inference to a robot. My thought is that maybe continuous space is an important key in going from simulation to the physical world. And in general, this is also a problem in reinforcement learning, I think, so the techniques from the reinforcement learning field can also be applied to active inference cases.
For example, simulation-to-real. I don't remember the exact name of the technique, but something like sim-to-real. Maybe such techniques can be applied in the case of active inference. But I don't know of an active-inference-specific technique; I don't know of one. That's interesting about the continuous state spaces, because in the discrete state space, where you have the Markov decision process, you get the big planning trees. But our experience, like walking through a crowd or waiting at an intersection, is not necessarily based upon planning. It's a lot more about the path and the curvature of the path that's planned, and maybe the paths that others are expected to take as they move through space, too. So that's possibly a simpler representation that could give natural movement behavior with just a few parameters. And there is another problem for applying this technique to the real world: the computational cost, because we use Monte Carlo simulation to search for the best action. I think this is not suited to real-time situations. So if we try to apply this technique to a real application, we have to solve the computational cost problem, I think. Yeah. Another part, even outside of the decision-making computational costs, would be going from the video camera data. When you're getting the exact location data perfectly, you're kind of playing chess in the state space, versus having noisy sensors. There might need to be a lot of observation modeling, a lot of layers, before the simple kind of active inference could even happen. Okay, I'll ask another question from Shohei. Question five, he wrote: in the study, the behaviors of the standard active inference agent and the empathetic agent are compared. I would like to know why you did not compare the proposed method with other game-theoretic approaches, because metrics like travel distance can be applicable to those methods as well.
The reason why I compared the proposed method with standard active inference is that the aim of the paper is to extend active inference toward social behavior, so I think it is natural to make that comparison. But the question is why we didn't compare the proposed method with game-theoretic methods. I'm not sure exactly what is meant by the game-theoretic method, but if that means reinforcement learning methods, the reason why I didn't compare is simply that it would take a lot of time. I think it is important, but the main focus of this paper is to evaluate the qualitative features of the proposed method, and a quantitative comparison is not so important for us at the moment. I do think it is important, though, and I should make that comparison in future work. So there is no deep reason; we just didn't have time to do it. Thank you, makes sense. Thank you. Another question: so the movement was being influenced by the relative ratio of F_other and F_goal? Yes. So here F_other is kind of like an electrostatic repulsion from other agents. How does it come into play that the red agent is entertaining beliefs about what other agents think? Sorry, can you repeat the question? So here the two forces acting on the red agent are the attraction toward the goal and the repulsion based on the proximity of other agents. Where or how does it come into play that the agent is actually thinking about what other agents are going to do? Like the bird-flocking simulations that have simple rules, like go in the same direction as the other birds and maintain a distance between them, but that doesn't require a theory of mind for the other birds. You mean how the agent knows the goals of the others? Yeah, is it just running away from others, or is it actually imagining what they think? That's a good question.
In this simulation, the answer is that the destination of the other is given. Actually, I don't remember the details of the evaluation; I think the agent doesn't have to know that information. At first I gave the information to the agent, but the agent turned out not to need it. So the goals of the others are not explicitly given, but with the internal model the agent can predict their paths, and from the predicted path it can effectively predict the goal, just not explicitly. Does the red agent explicitly model what the others are going to do, or does it just have heuristics that it follows when they move? The former, but it only predicts future positions of the others from their recent positions. The agent just learns to predict the future positions of the others; that is the whole task of the internal model. So there's no need to give it the goals of the others. Yes, I think this is a very interesting aspect: to what extent social behavior rests upon simple rules, like etiquette, like passing on the right or on the left, that don't require knowledge about the internal states of others, or even what kind of thing they are. Whereas the experience of sociality, and a lot of psychology and sociology, focuses on very cognitive mechanisms that are very expressive. Thank you. This works because the action is repeatedly decided over short time durations. This is a feature of the proposed method, as you said. Thank you, I hadn't noticed that that is important. Where are you going to go in your continuing research and development directions? In future work, I think an important point is regulating the empathy. We had a presentation at last year's Artificial Life conference, where we discussed how emotion regulates empathy. The point is that the expected free energy is extended to include the others: G becomes G_mine plus w times G_others. There is a weight of empathy, the w in this equation, and we have to regulate this parameter.
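The extended objective described here can be written as follows; the notation is inferred from the talk rather than taken verbatim from the paper:

```latex
% Empathic expected free energy with empathy weights w_j:
% the agent scores each action a by its own EFE plus the
% weighted sum of the EFEs it infers for the others.
G_{\text{total}}(a) \;=\; G_{\text{me}}(a) \;+\; \sum_{j \in \text{others}} w_{j}\, G_{j}(a)
```

Setting every $w_j = 0$ recovers standard active inference, and the future work discussed here is about how to regulate $w_j$ per other, for example lowering it for an agent one should not empathize with.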
I didn't consider that parameter in the journal paper, but we have to regulate it, because there are diverse others: there are good people and there are bad people, and we should not empathize with a bad person. So this is an important point, and I want to emphasize it. We tackle this problem with emotion, with emotional context. This is future work; we have a presentation at this year's conference, so if you're interested, please read that paper. That's very interesting. In a lot of the active inference and social sciences conversations that we've had here, there's a lot of discussion about how you balance the coefficients in the free energy calculations for self and others. Because you're making decisions at a very short time scale, like speech, second by second, but sociality is very long-term. And so you could make a decision that is aligned locally but then very out of alignment at a longer time scale, especially if these other four have complex relationships; then action is not very simple. Yes. Is there anything else the other authors want to say, or anything else you want to talk about? I don't have anything to add. Cool. Well, thank you very much for joining the discussion. Thank you. Yes, good luck with the ongoing work, and I hope to stay in touch and hear more about the research in the future. Okay, thank you. See you later. Bye.