Hello. This is Active Inference Lab guest stream number 21.1. It's April 21st, 2022, and we're here with Shannon Proksch, and Shannon will be giving a talk, Coordination Dynamics of Multi-Agent Interaction. The talk will happen for maybe around an hour, and then there'll be time for Q&A, so feel free to ask any questions in the chat during. So off to you, Shannon, looking forward to this talk, and thanks again for joining on Active Inference Lab.
Awesome. Thanks so much. So I'm Shannon. Daniel's already given you the title of our talk, and I'm coming to you from the University of California, Merced, where I'm a PhD candidate, and I'll actually be defending this summer, so maybe the next time you hear from me, I'll be talking to you from Augustana University, where I will be starting a tenure-track position later this fall as an assistant professor of psychology and neuroscience. Oh, applause. Nice. Thanks. That said, I'm not going to talk to you about neuroscience today. I'm going to talk to you a little bit about the behavior of crowds, and I stole this roadmap idea from previous Active Inference Lab sessions, and this is kind of outlining where we'll be going today. So the first thing we're going to do is really dig into a little bit of theory and data. I'm going to give a brief introduction to multi-agent interaction and what kind of human behavior I'm interested in, and I'm going to walk through a couple of ways that Active Inference folks might be interested in crowd dynamics and behavior, but primarily I'm going to be focusing on dynamical systems theory and the notion of synergies and the tools that come along with these concepts. And then finally I'm going to show you a model system where we apply these tools to a system of coupled metronomes, before I look at some empirical data from real-world interaction in musical performance and also from crowds cheering at a basketball game. So first, what is multi-agent interaction?
This is simply an interaction that's happening between two or more agents. In our case we're looking at people, and each of these people will have their own personal goals and behaviors, but their individual behaviors are influenced in some way by interacting with other agents in their shared environment. And I want to talk about two forms of multi-agent coordination today, and the best way to introduce this is through a short thought experiment. So I'll just have you imagine the sounds of a crowded coffee shop. Consider how individuals in that coffee shop might be interacting with each other. There might be some small groups, there might be pairs, but mostly it's many individuals engaging in small temporary interactions, and these individuals aren't coordinating with every other individual in some sort of cohesive coffee shop group. They're just a jumble of individuals who are cohabiting a shared space. Now instead, imagine the audience on the floor of a rock concert. They're cheering or singing along with the artists on the stage. And alternatively, imagine the fan section at your favorite sporting event erupting into a synchronous chant or a chorus of resounding boos. Imagine how individuals in these large crowds might be interacting with each other. As they cheer, sing, chant or boo, they're all sharing in similar behavioral states, engaging in similar actions. They're likely sharing similar physiological states, like breathing rates, and neural states as well. These crowds are changing together in time. They're behaving and they're coordinating like one large interdependent group. And you could refer to these as shared acoustic spaces, and they include various levels of interaction, from dyads, pairs of individuals, to larger and larger groups. And I might even call these acoustic social worlds. The acoustic signal that's generated by these crowds carries some information about the behavioral dynamics that were employed in creating that signal.
And the sounds these crowds make carry some sort of information about the status of that crowd, either as an incidental collection of lots of individual coffee goers or an integrated and coordinated crowd like a crowd of concert goers. And there are of course many important factors at play when we're evaluating these acoustic social worlds and we're trying to decide whether that crowd is acting as one integrated multi-agent system rather than a collection of independent agents. And these include the social context, since a coffee shop is different than a concert arena; the development of patterned cultural practices in these different contexts; different scripts surrounding what events happen and when these events happen; and different regimes of shared attention between each of these agents. And these are things you've talked about in the Active Inference Lab before. And basically this is all a rather long-winded way of saying that we expect coffee shop behavior in a coffee shop and concert behavior in a concert arena. If I were going to talk to you in the language of active inference, I might talk to you today about Markov blankets. And we might talk about this statistical boundary that might be around a pair of interacting individuals, separating the internal states of this interaction from the rest of the interactions happening in a coffee shop crowd. And I might talk about how the statistical boundary expands as we start to engage or interact with larger groups of people, from three to four to maybe a hundred people or more if we're in a large group interaction. And while it can be a really useful way to model multi-agent interaction to talk about a suite of nested Markov blankets, we quickly run into questions of how a researcher chooses where to draw a Markov blanket in any given system, or where to draw each nested Markov blanket in a given system.
And this is a topic that you've discussed a lot in the Active Inference Lab. You've spent some time talking with Jelle Bruineberg and colleagues after their recent BBS article and commentaries, and how there might be differences between me as a researcher applying some sort of Markov blanket around a system and the actual agents interacting in that system. But I just mention this idea of Markov blankets to situate the topic for the audience of these Active Inference Lab live streams. As far as measuring the emergence of an integrated interacting crowd, this is an empirical question that I'm asking rather than a modeling question. And I'm not a mathematician or a modeler, so for this question about groups of interacting people, I found empirical tools from dynamical systems theory and the concept of interpersonal synergies. So that's what I'm going to introduce to you guys next. So, dynamical systems theory. This considers the trajectory of behavior of a system over time. And the concept of synergies arises from the field of motor control as a solution to the degrees of freedom problem. So essentially, the degrees of freedom problem is that fine and precise control of every degree of freedom in a motor movement would be computationally intractable for the nervous system to solve. And what you see here is a miner hammering. And you see these light patterns. This is really the earliest sort of motion capture that we had. And we're seeing the trajectory of his arm hammering here. And you can see that even though the hammer is hitting the same place every time, and you'll have to trust me that it's hitting the same place every time, his arm is taking a slightly different trajectory to get there. But it generally covers this one swinging behavior.
And that's because the muscles in his arm here are self-organizing into coupled and locally constrained degrees of freedom, forming temporary soft assemblies of motor function that enable this flexible movement and coordination here. Put another way, synergies are identified by their functional specificity, dimensional compression, and reciprocal compensation. What are those? Well, the need to perform a specific function like hammering that anvil, that's functional specificity, temporarily shapes the coordination of several components into one system. That's dimensional compression. In that system, external and internal perturbations are flexibly dealt with via compensatory adjustments of the components. That's reciprocal compensation. So each muscle in his arm is compensating for the activity of the other muscles. And if he was bumped slightly, there would be compensation from other muscles to ensure that the hammer hits its location. And this is all done in order to preserve the functionality of the whole system for as long as it's needed. And that last comment, for as long as it's needed, gets to the notion of soft assembly, where any of these units in this picture here might be functionally coupled just for as long as you need to perform that task. So if we're talking about a group of people, this group of people might form some sort of complex system just while they're interacting for that meeting, or while they're at that concert, until they disperse and are ready to move on. We can differentiate mere coordination, which might happen between two people as a result of engaging with the same stimulus and similar patterns of actions, from a true interpersonal synergy. And you see here, we might see the same patterns of movement if we have two individuals who are maybe just listening to the same piece of music.
But if these individuals are interacting with each other, where the action of one person and the action of the second person are constraining each other, then this is where an interpersonal synergy happens. And these synergies are the result of the microscale structures of a system giving rise to macroscale dynamics. When a synergy develops, we no longer need to describe the independent dynamics of each microscale structure. We can instead describe the behavior of this single interaction-dominant system, which consists of complementary actions from each person's motor activity as it pertains to their role in this softly assembled complex system. And you might even notice some similarities here between these graphics and a graphic we might draw of nested Markov blankets. And these might very well be two sides of the same coin, just two perspectives on a very similar conceptual landscape. And there's a variety of methods for studying the coordination dynamics that underlie interpersonal coordination. For example, we might ask two people to sit next to each other and swing pendulums or to rock in rocking chairs. And we can measure how these individuals intentionally or unintentionally synchronize their behavior as each individual's action becomes coupled to their partner's. Or we can ask musicians to play together in a musical ensemble or a duet. And we can measure how these quartets or duets move together in time, forming these loosely coupled interpersonal synergies. But in each of these interactions, a motor or acoustic behavioral output of each interacting individual was measured and analyzed for meaningful correlations between these individuals. So importantly, in these experiments it has been possible to obtain clear measurements of individual behavior, in order to correlate the behavior of these individuals and examine the emergent coordination dynamics of multi-agent interaction and social self-organization.
But what happens when we have this massive crowd of people? How do we measure the interpersonal synergies that form in such a large crowd? How does a group of people move between novel and habitual patterns of coordination? How does a multi-agent system navigate transitions between chaos and stability? And let's make it a little bit trickier. The real world is messy and measurement is difficult. In a naturalistic social interaction in a very large crowd, I probably can't measure the behavior of every single individual person. I probably can't give everyone an individual microphone. I probably can't track each person's individual brain states. And I might not even have access to a camera, let alone a set of motion capture cameras and hundreds of markers to track everyone's head and limb movements. So let's come back to the crowds we imagined at the start of this discussion. When you were imagining this coffee shop crowd and the crowd at the rock concert, you might have also been imagining how these crowds sounded. So the difference between this disjointed noise you might hear at the coffee shop and this altogether coordinated singing at a rock concert is trivially easy for you or I to identify: an uncoordinated group of independent individuals happening to coexist in a shared space, versus a coordinated group of interdependent group members in a crowd. We can simply hear that these two groups of people sound different. Similarly, we can simply hear when we ourselves are engaged and participating as a collective sort of agent while we're interacting with a large group of people or in a very large musical ensemble. So I'm taking that idea, and I think we can measure the sounds that a crowd makes. And with a well-placed microphone or a few microphones, we can generate a single audio recording of the sounds that an interacting group makes over time.
And I know that a human can make inferences about the interactions and the acoustic social world of a crowd just by listening. So similarly, I want to know if this dynamical systems toolkit that I have can allow us to make inferences about the behavior of an entire complex system based on this one acoustic measurement that we have access to. So for the next little while, I'm going to walk through a bit of theory and then a toy example, until I eventually end this talk with data from real live multi-agent groups that we're analyzing with tools from dynamical systems theory. So one of our simplest complex dynamical systems is a metronome or a pendulum. Christiaan Huygens first discovered that the activity of two pendulums from pendulum clocks will be related to each other when these clocks are hanging from a shared surface. And what he found was that these pendulums would swing in anti-synchrony with each other. And since then, in the 1600s or so, there's been lots of research on pendulums and on coupled metronomes. And if we put these metronomes, a timekeeping device for making music, on a shared platform, and maybe that platform rotates like this, you'll see each of these metronomes come into synchrony with each other. These are just static pictures. So to really help you visualize this and hear this, I'm recruiting a little bit of help from the MythBusters. The MythBusters are on YouTube, so hopefully it is free and uncopyrighted. And I'm just going to skip ahead here to partway into their episode. Friction and momentum as I can so that only the metronome is having the effect. Now let's try two. Adam doubles his trouble with two metronomes set to the same tick mark. Look at the movement on this platform. And after two minutes, the shimmy takes over. Stay with it, boys. Nice. Nice, it's totally working. It just took them a little while to find each other's rhythm. I have an idea. If I stop the platform, they should drop out of phase.
The moving platform is totally, totally critical to this experiment. All right. So that's Adam Savage there with two metronomes. Why on Earth does this matter? Well, what he's done is replicate the findings of that centuries-old experiment that was originally conducted with the pendulums of clocks. And he noted that this moving platform here is what's totally, totally critical. But what I want you to also think of is that you heard these metronomes slowly coming into phase with each other. They were becoming interdependent on each other's behavior because they are connected by this coupling mechanism here, this shared platform. And as Adam correctly notes, this moving platform is totally, totally critical. And in fact, when you saw him stop the platform, the metronomes were no longer coupled and the behavior of one no longer affected the other. This platform here can take many forms, as long as it is freely moving and responsive to the metronomes' movement. And for the metronomes, this platform is the essential coupling mechanism for this emergence of coordinated acoustic behavior. In humans, this coupling doesn't have to be a physical platform. In fact, I argue that it is the interaction itself which serves as a coupling mechanism for multi-agent human behavior. And we're going to spend just a little bit more time with metronomes before I get on to real human groups here. So when you have many metronomes on a shared platform, and you can have more than two, they're going to transition from uncoordinated random phase to coordinated synchronous phase. And what you're going to hear next is the acoustic output of 32 metronomes on a freely moving platform. And I'm going to justify to you two things: that we can identify emergent coordinated behaviors in acoustic data, and what state space reconstruction is and how this tool from dynamical systems can help us to analyze that acoustic data.
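A reader who wants to see this transition from random phase to synchrony in numbers could try a minimal sketch like the one below. It is not the speaker's analysis and it does not model the physics of a moving platform. It swaps in the Kuramoto mean-field model, a standard textbook model of coupled oscillators, and every parameter value (32 oscillators, the coupling strength, the small spread of natural frequencies) is an assumption chosen for illustration.

```python
import cmath
import math
import random


def order_parameter(phases):
    """Kuramoto order parameter r in [0, 1]; r near 1 means the phases are aligned."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


def simulate(n=32, coupling=2.0, steps=4000, dt=0.01, seed=0):
    """Euler-integrate n phase oscillators with mean-field coupling.

    Returns the order parameter at the start and at the end of the run.
    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Natural frequencies close to 1 Hz, like metronomes set to the same tick mark.
    omega = [2 * math.pi * (1.0 + rng.gauss(0, 0.02)) for _ in range(n)]
    phases = [rng.uniform(0, 2 * math.pi) for _ in range(n)]
    r_start = order_parameter(phases)
    for _ in range(steps):
        # Mean field: each oscillator is pulled toward the group's average phase.
        z = sum(cmath.exp(1j * p) for p in phases) / n
        r, psi = abs(z), cmath.phase(z)
        phases = [p + dt * (w + coupling * r * math.sin(psi - p))
                  for p, w in zip(phases, omega)]
    return r_start, order_parameter(phases)


r_before, r_coupled = simulate(coupling=2.0)   # "platform free to move"
_, r_uncoupled = simulate(coupling=0.0)        # "platform held still"
print(f"start: {r_before:.2f}, coupled: {r_coupled:.2f}, uncoupled: {r_uncoupled:.2f}")
```

Setting `coupling=0.0` plays the role of stopping the platform: the oscillators keep their own tempo and the order parameter stays low, while a positive coupling lets the group settle into a shared rhythm.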
So I'm going to start at the beginning, with each of the metronomes in random phase. And this really takes a bit of time. So we don't have to listen to that for so long. I'm going to skip ahead. So what you heard was 32 metronomes all ticking at their own tempo, at their natural period. It sounded like a bunch of random noise. But this platform, again, served as a coupling mechanism, and synchronization between the metronomes emerged. So they began to sound like one group ticking together. And this is an example of the type of data we'll be investigating for the emergence of coordination in acoustic signals. So keep these metronomes in mind. No, don't play the metronomes again. The platform is serving as a coupling mechanism, synchronization emerges, and they begin to sound like one group. So we're going to visit some theory. I'm going to introduce you to Takens' theorem and phase space reconstruction. So if we take one time series from a system, and this is the Lorenz system here, it's defined by a set of three differential equations. But what you need to know is that we can reconstruct the behavior of this system as a whole by measuring only one of its parts. This is accomplished by embedding delayed copies of a single time series into multiple dimensions with some delay. So here's one time series extracted from the Lorenz system. And we delay that time series to make another time series, and delay once more to make a third time series. We take these three dimensions and we use them to reconstruct a projection of the Lorenz system. And this is phase space reconstruction. And you can see that it maintains the topology, the same shape, as the original Lorenz system. In the case of our acoustic data, I'm going to take a time series of that sound, that acoustic signal. And I'm going to follow these steps here to reconstruct the behavior of those metronomes over time.
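The delay-embedding steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the talk: it integrates the Lorenz equations with the classic parameter values (sigma = 10, rho = 28, beta = 8/3), keeps only the x variable, and stacks delayed copies of it into three dimensions. The function names, the step size, and the delay of 10 samples are assumptions chosen for illustration. In practice the delay and the number of dimensions are usually estimated from the data, for example with average mutual information and false nearest neighbors.

```python
def lorenz_x(n_steps=5000, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Integrate the Lorenz equations with fixed-step RK4 and return only x(t)."""
    def f(s):
        x, y, z = s
        return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

    def step(s):
        k1 = f(s)
        k2 = f(tuple(a + 0.5 * dt * b for a, b in zip(s, k1)))
        k3 = f(tuple(a + 0.5 * dt * b for a, b in zip(s, k2)))
        k4 = f(tuple(a + dt * b for a, b in zip(s, k3)))
        return tuple(a + dt / 6.0 * (p + 2 * q + 2 * r + w)
                     for a, p, q, r, w in zip(s, k1, k2, k3, k4))

    state = (1.0, 1.0, 1.0)
    xs = []
    for _ in range(n_steps):
        state = step(state)
        xs.append(state[0])  # we "measure" only one part of the system
    return xs


def delay_embed(series, dim=3, delay=10):
    """Return delay vectors (x[t], x[t + delay], ..., x[t + (dim - 1) * delay])."""
    n_vectors = len(series) - (dim - 1) * delay
    if n_vectors <= 0:
        raise ValueError("series is too short for this dim and delay")
    return [tuple(series[t + k * delay] for k in range(dim))
            for t in range(n_vectors)]


x = lorenz_x()
points = delay_embed(x, dim=3, delay=10)
print(len(points), points[0])
```

Plotting `points` as a 3-D curve gives the reconstructed, butterfly-like projection of the attractor from the single measured variable.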
And once we've completed this phase space reconstruction, I'm going to visualize the behavior of our reconstructed system by generating what we call recurrence plots. This is done by plotting recurrent points from the reconstructed phase space within some radius that we set. And we can evaluate the structures in these recurrence plots using recurrence quantification analysis, or RQA. So on the right here, you're seeing a few examples of different time series. And you're seeing their reconstructed phase space here. And on the bottom, you're seeing the recurrence plots that we generate from these reconstructed phase spaces. So one of the simplest measures that we can glean from a recurrence plot is the rate at which states recur. And this is called recurrence rate. And on the left here, we're seeing a time series of a simple sine wave. And when we reconstruct it, we end up seeing something a bit like an oval or a circle. And we plot a point on our recurrence plot here every time our system revisits a space on the circle here. So in the sine wave's recurrence plot, we're seeing these diagonal lines. These diagonal lines are reflecting deterministic behavior, and they track how similar sequences of values change over time. So this sine wave here, it's reliably deterministic. If it starts here, it's going to go here next, and it's going to go there next. And that gives us these long diagonal lines here. And as you'd probably expect, this measure is called determinism. In the middle, we see uncorrelated white noise. And you can see the signal already looks much more random than our sine wave. And when we reconstruct it in phase space, I don't really see a shape here that we can get much out of. And underneath, you see our recurrence plot, and it looks similarly random. You could think of it like white noise on your television, even. Finally, on the right, we see a time series extracted from speech.
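For the sine wave example, the recurrence plot and the recurrence rate can be computed with a short sketch like the following. It is a minimal illustration under stated assumptions, not the analysis pipeline from the talk: the period of 50 samples, the two-dimensional embedding with a delay of 12 samples (roughly a quarter period, which gives the circle shape), and the radius of 0.1 are all illustrative choices, and the function names are hypothetical.

```python
import math


def delay_embed(series, dim=2, delay=12):
    """Stack delayed copies of a series into dim-dimensional state vectors."""
    n_vectors = len(series) - (dim - 1) * delay
    return [tuple(series[t + k * delay] for k in range(dim))
            for t in range(n_vectors)]


def recurrence_matrix(points, radius):
    """Binary matrix: entry (i, j) is 1 when states i and j lie within `radius`."""
    return [[1 if math.dist(p, q) <= radius else 0 for q in points]
            for p in points]


def recurrence_rate(matrix):
    """Fraction of recurrent points in the plot.

    This count includes the main diagonal (every state recurs with itself),
    which some implementations exclude.
    """
    n = len(matrix)
    return sum(sum(row) for row in matrix) / (n * n)


period = 50
sine = [math.sin(2 * math.pi * t / period) for t in range(200)]
points = delay_embed(sine, dim=2, delay=12)
plot = recurrence_matrix(points, radius=0.1)
rate = recurrence_rate(plot)
print(f"recurrence rate: {rate:.3f}")
```

Because the sine wave returns to the same place on the circle once per period, `plot[i][i + 50]` is 1 for every valid `i`, which is what produces the long diagonal lines described above.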
It's a little bit deterministic, like our sine wave. If it goes up, it's going to come down. But it's also drifting a bit. And you can see if we reconstruct that phase space, instead of a simple oval, we're seeing a sort of transient signal here. And in the recurrence plot, we're also going to be seeing variation in the lengths of these diagonals. And finally, recurrence plots can contain vertical structures, which you don't see here, in addition to diagonals. And these are indications of periods of stability. We measure these with a value called laminarity. And you can contrast laminarity with determinism. It measures how a signal is organized in terms of similar absolute values that are stable over time. And this is one representation of recurrence plots. It takes a little bit of time to get comfortable with what a recurrence plot is, so I just have a different visualization to show you a little bit of the same thing. So these are the same set of RQA measures that we take from our system and plot on recurrence plots. This is demonstrating recurrence rate. It's just the percentage of recurrent points that we see on this recurrence plot over time. And it represents patterns of behavior that repeat over time. Determinism is the percentage of recurrent points that fall on a diagonal line. And it represents behaviors that belong to a longer sequence of behavior. Entropy, this is the variability in the lengths of these diagonal lines down here. And it represents the amount of disorder that there might be in these sequences. Finally, laminarity here is the percentage of recurrent points that fall on a vertical line. And this represents clusters of behavior that might repeat for a length of time, like when a system visits a period of behavior, leaves, and returns to that behavioral state again. All right, so that's a lot of theory. I need to bring you back to the metronomes so we know what's going on here. So what you see here, this is a picture of our uncoordinated metronomes.
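The three line-based measures just described (determinism, entropy, laminarity) can be sketched directly from a binary recurrence matrix. The code below is a minimal illustration, assuming a minimum line length of 2 and natural-log Shannon entropy. These are common conventions, but the talk does not state which settings were used, and the function names are hypothetical.

```python
import math
from collections import Counter


def run_lengths(bits):
    """Lengths of consecutive runs of 1s in a sequence of 0/1 values."""
    lengths, current = [], 0
    for b in bits:
        if b:
            current += 1
        elif current:
            lengths.append(current)
            current = 0
    if current:
        lengths.append(current)
    return lengths


def diagonal_lengths(matrix):
    """Run lengths along every diagonal parallel to, but excluding, the main one."""
    n = len(matrix)
    lengths = []
    for offset in range(1, n):
        upper = [matrix[i][i + offset] for i in range(n - offset)]
        lower = [matrix[i + offset][i] for i in range(n - offset)]
        lengths += run_lengths(upper) + run_lengths(lower)
    return lengths


def vertical_lengths(matrix):
    """Run lengths down every column (main-diagonal points are included here)."""
    n = len(matrix)
    lengths = []
    for col in range(n):
        lengths += run_lengths(matrix[row][col] for row in range(n))
    return lengths


def fraction_in_lines(lengths, min_len=2):
    """Share of recurrent points that belong to lines of at least min_len."""
    total = sum(lengths)
    return sum(v for v in lengths if v >= min_len) / total if total else 0.0


def line_entropy(lengths, min_len=2):
    """Shannon entropy (natural log) of the distribution of line lengths."""
    counts = Counter(v for v in lengths if v >= min_len)
    total = sum(counts.values())
    if not total:
        return 0.0
    return sum(-(c / total) * math.log(c / total) for c in counts.values())


# Toy plot: recurrences only along diagonals, as for a strictly periodic signal.
toy = [[1, 0, 1, 0],
       [0, 1, 0, 1],
       [1, 0, 1, 0],
       [0, 1, 0, 1]]
determinism = fraction_in_lines(diagonal_lengths(toy))
laminarity = fraction_in_lines(vertical_lengths(toy))
entropy = line_entropy(diagonal_lengths(toy))
print(determinism, laminarity, entropy)
```

In the toy plot every off-diagonal recurrent point sits on a diagonal line and none sits on a vertical line, so determinism is high and laminarity is low, the pattern described for the sine wave.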
They're moving in all sorts of directions. And then our coordinated metronomes are at the bottom. And I took a 30-second acoustic signal from each set of behaviors here. And on the right, I have signals from these uncoordinated metronomes and the recurrence plot we generated from them. And you can see this first recurrence plot resembles that kind of uncorrelated white noise. I don't see a lot of structure here. And then in our time series for the coordinated metronomes, you start to see some periodicities in this signal here. And you see a lot more of these laminar structures down here at the bottom, these vertical structures, which indicate periods of stable states in this group metronome behavior. I could break that down even smaller and I could look at five seconds of sound. And we see the same structures emerging. You can also see a bit more clearly these spikes that occur in the coordinated metronomes, which correlate with the vertical structures that we saw in the last plot. So this is great. Now you're experts. You've learned everything you need to know to evaluate the emergence of coordination when we measure it from a global acoustic signal. And you know how to visualize it in a recurrence plot and how to interpret that plot. So we're going to apply all of this to human data. And I want you to remember that for the metronomes, this platform is the essential coupling mechanism for the emergence of coordinated behavior. For humans, the interaction is totally, totally critical. So say me and my colleague Majerle Reeves from the applied mathematics department here at UC Merced. So we don't need a shared platform. Instead, we just need to interact with other human beings in order to couple our multi-agent human behavior. So now that we've got this far, I promised you some empirical data.
And I'm going to tell you about multi-agent interaction first in a specific musical ensemble performing a specific musical work, and then in the crowd at a basketball game. So first, our musical ensemble here. They're performing a work called Welcome to the Imagination World by Daisuke Shimizu. And this group of musicians, they're performing this piece that was composed to do something interesting. So the composer wrote the first few minutes of this musical piece to be completely random. In music theory terms, we call this aleatoric. It's when some element of the music is left up to chance, or to whatever the performers decide to play in that moment. So the start of this piece sounds a little bit like this. Okay, that flute gets a little loud for me. So this is not the musicians warming up on stage. This is the actual music. The musicians themselves are instructed not to interact with each other. And you can see this is kind of hard for musicians, because you do hear musicians sort of taking turns, starting each musical line a little bit. And it's a little bit hard to play completely randomly. You hear some folks playing scales up and down. But in general, this is supposed to sound random, and they're not supposed to be coordinated. And after a few minutes, the musical score actually directs the musicians to begin coordinating with each other. And when this happens, I'll show you what that sounds like. I'm going to stop it right before it gets exciting. So once they've begun coordinating with each other, they begin to sound like one interacting musical ensemble. So this musical interaction is serving as the coupling mechanism. And the coordination between the musicians arises, and they begin to sound like one interacting ensemble. Now, we're not seeing emergent or spontaneous interaction in this particular musical composition and this performance. The musicians are told by the musical score precisely when to begin coordinating their actions.
And in this particular performance, the divide between uncoordinated and joint music making is made even more apparent by a theatrical trick: the lights turn on, and then the conductor arrives on stage just as the musicians begin to play together as one ensemble. But I'm using this as a model system because I know the behavior they're supposed to be engaged in. And I know that there's going to be a transition from uncoordinated to coordinated behavior. So I can use that knowledge to ask the question: are there any differences in nonlinear recurrence dynamics, measured by RQA, between the uncoordinated acoustic behavior and the coordinated behavior of this musical ensemble? And spoiler alert, there absolutely are. So at the top, you're seeing the first and last 30 seconds of that first part of the performance. On the bottom, you're seeing the first and last 30 seconds of the second part of the performance, once the musicians have become coordinated. And we can see that the acoustic behavior from these musicians in the coordinated section of the music is more structured. We see these bursty vertical lines here, where you have these connected stretches of sound for periods of time. And again, we can look even closer at five-second samples. And we can see the same vertical structures in the coordinated but not the uncoordinated sections. So if I take just these little blocks here and zoom in to five seconds, we see a little bit more structure here and a little bit less, or a little bit noisier, structure here. These are just for visualization. This is not actually the five-second snippet here, but it works for getting the idea across. Okay, so cool. I see these structures in the recurrence plot, but is it really more structured when musicians are coordinated? We can do some statistics and we can visualize our data in another way. So these are distributions of all of those measures that we extracted from the recurrence plot.
And you can see coordinated sections in blue, uncoordinated sections in red. And we're seeing that when the musicians begin interacting with each other as a coordinated musical ensemble, recurrence and stability measures increased overall. So we have higher, and also more variable, RQA measures when they're coordinated. And we see lower and less variable RQA measures when they're uncoordinated. And we can see this even a little more clearly if we plot those same measures over time. The blue and purple lines here represent the coordinated sections of the piece, and orange and green are the uncoordinated sections. And we can see that over time, when the musicians are coordinated and acting like one interacting ensemble, they're able to explore a greater variety of recurrence dynamics. They're able to explore more behavioral states than when they're uncoordinated, where we see all of these measures, no matter what they are, hovering around a small subset of dynamics. Now, you would be forgiven if you are skeptical of these results. After all, the entire musical performance is over eight minutes long. Maybe I simply chose the most uncoordinated sections and the most coordinated sections of the song. And you'd be smart to have this worry. Our peer reviewers shared the same worry. So we graphed the time series of RQA metrics for the entire piece. And it's pretty clear, even if you don't know what these values are, that there's some difference between all of these top pink and blue kind of graphs and these bottom orange and green type of graphs. We're seeing this same pattern, greater values and more variability in each of these RQA metrics, when the musicians are coordinated than when they are performing as a bunch of individuals on stage.
And the takeaway here is that these musicians, when they're interacting together as a musical ensemble, are able to explore a greater variety of acoustic behavior when they form a single complex system, an interpersonal synergy, than when they are acting as a lot of individuals on stage. And we were able to use these nonlinear statistical methods from dynamical systems theory to measure that change. Cool. So that's a musical ensemble. I'm using it as a model system because we know what the behavior should be. And I want to measure this in a different kind of system, one that doesn't have a script dictating the behavior of what's going on on the stage. So to do that, I look at data recorded from the student section of a BYU basketball game. And this data was collected by Butler and colleagues and labeled by many hardworking undergraduate students. And we have basically recordings from the crowd that are labeled based on what kind of crowd sounds they're making: whether they're cheering, which is like a loud positive crowd vocalization; whether they're applauding, which is mostly clapping but can include some vocalization; distraction noise, when you're trying to distract someone from a free throw; positive chants, so this is the fans chanting their BYU Cougars chant or "defense"; negative chanting, when they're angry rather than cheering positively; singing; and then silence. So we're taking this data and we're going to ask a similar question. We don't have a clear transition from uncoordinated to coordinated behavior, but we do have all of these different types of behavior. So we're going to ask: is there a relationship between nonlinear recurrence dynamics measured by RQA and these different categories of acoustic behavior of the crowd at this basketball game? And we follow the same process. We extract acoustic signals and we cut them up into five-second samples here. And we use these signals to generate recurrence plots.
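The step of cutting a recording into five-second samples can be sketched as below. This is only an illustration of the windowing idea: the talk does not say whether the samples overlapped or what sample rate was used, so the non-overlapping windows, the 100 Hz rate, and the synthetic signal are all assumptions, and `split_into_windows` is a hypothetical helper name.

```python
import math


def split_into_windows(signal, sample_rate, window_seconds=5.0):
    """Cut a 1-D signal into consecutive, non-overlapping windows.

    Any remainder shorter than one full window is dropped.
    """
    size = int(round(sample_rate * window_seconds))
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    return [signal[start:start + size]
            for start in range(0, len(signal) - size + 1, size)]


# Stand-in for a crowd recording: 12 seconds of a 2 Hz tone sampled at 100 Hz.
sample_rate = 100
signal = [math.sin(2 * math.pi * 2 * t / sample_rate)
          for t in range(12 * sample_rate)]
windows = split_into_windows(signal, sample_rate, window_seconds=5.0)
print(len(windows), [len(w) for w in windows])
```

Each window would then go through the same steps as before (phase space reconstruction, recurrence plot, RQA measures), giving one set of measures per five-second sample.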
And we take the measures from that recurrence plot to see if there's anything different. And I'm going to show you just two recurrence plots in the next slide. I'm going to show you a plot from positive chanting and from distraction noise. So here's distraction noise. It looks like white noise. It looks like what you would see on your television screen. And when I show you positive chant, there's clearly a striking difference. There's almost no structure in this distraction noise here, but in the positive chant there are very clear laminar bursts of activity represented in these vertical lines. And in fact, these two plots look very similar to our unsynchronized and synchronized metronomes way back from the middle of this talk. And intuitively, this makes sense. If you're trying to distract someone during a free throw, you're making all sorts of random sounds, you might be shaking your keys around, people are stomping on the stands. But if you're cheering your team on, trying to motivate them to score more points, or if you're mounting a strong defense, you're going to be working together shouting whatever your chant is, "go team", "go defense". And it's going to occur in a very rhythmic, metronomic fashion. And we can compare how all of these distributions of RQA metrics look between each different kind of crowd noise. And you can see there's quite a lot of overlap, but there's also some difference. So positive and negative chant seem to be a bit similar to each other in every category, which makes sense because whether they're chanting with a positive or negative energy, it's still this rhythmic isochronous behavior that's happening. Distraction noise, cheer and applause, on the other hand, all seem pretty similar to each other, no matter where we look at them. And then finally, all the way on the lowest scale is angry noise here. And one thing to note is we had very, very few samples of angry noise here. So that could be why we're seeing a few of these differences.
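The contrast described here, a structured recurrence plot for rhythmic chanting versus a noise-like one for distraction noise, can be illustrated with a minimal sketch. This is not the analysis pipeline from the talk: the signals are synthetic stand-ins (a sine wave for a chant envelope, uniform random values for distraction noise), the radius of 0.1 is an arbitrary assumption, and the sketch skips the time-delay embedding that a full RQA would normally use. Only one measure is computed, determinism, taken here as the share of recurrent points that fall on diagonal lines of length two or more.

```python
import math
import random


def recurrence_matrix(signal, radius):
    """Mark every pair of time points whose values lie within `radius` of each other."""
    n = len(signal)
    return [[1 if abs(signal[i] - signal[j]) <= radius else 0 for j in range(n)]
            for i in range(n)]


def determinism(matrix, min_len=2):
    """Share of recurrent points lying on diagonal lines of at least `min_len` points.

    Only the upper triangle is scanned (the matrix is symmetric) and the main
    diagonal is skipped, because every point trivially recurs with itself.
    """
    n = len(matrix)
    recurrent = 0
    on_lines = 0
    for offset in range(1, n):
        run = 0
        # One extra step at the end flushes a run that reaches the border.
        for k in range(n - offset + 1):
            if k < n - offset and matrix[k][k + offset]:
                run += 1
            else:
                recurrent += run
                if run >= min_len:
                    on_lines += run
                run = 0
    return on_lines / recurrent if recurrent else 0.0


# Synthetic stand-ins, invented for illustration (not data from the talk).
rng = random.Random(0)
chant_like = [math.sin(2 * math.pi * t / 20) for t in range(100)]   # rhythmic
noise_like = [rng.uniform(-1.0, 1.0) for _ in range(100)]           # unstructured

det_chant = determinism(recurrence_matrix(chant_like, radius=0.1))
det_noise = determinism(recurrence_matrix(noise_like, radius=0.1))
print(f"determinism, chant-like signal: {det_chant:.2f}")
print(f"determinism, noise-like signal: {det_noise:.2f}")
```

The rhythmic signal revisits the same states in the same order, so its recurrent points line up along diagonals and determinism comes out higher than for the noise-like signal, where recurrences are mostly isolated points.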
And if we had more samples, maybe angry noise would look a bit more like distraction noise. So visually, we can see there are differences in these RQA metrics for all of these crowd sounds. And we don't want to just do visuals. So we can do some statistics here. This is just another visualization, of our estimated marginal means, showing pairwise comparisons between every crowd sound category. It's a graphical representation of the estimated marginal means: if these red lines overlap, that means there are not significant differences between, for instance, distraction noise, cheer and applause here. Positive and negative chant do look different, however. Interesting. And then angry noise is far out by itself. So knowing this, I mentioned very briefly earlier, I have a colleague in the applied mathematics department. And one thing she's interested in is whether we can give a learning algorithm some of these RQA metrics to tell the difference between these different crowd sounds. The researchers at BYU have done this with traditional sound metrics, spectral analysis, capture analysis. And we're doing it with our RQA metrics, so we're focused more on the rhythmic content rather than pitch or spectral content of the sound. And she is doing this in two ways. So one is training a neural network with image classification. And what I'm going to show you a few results from is a random forest that is using these recurrence features as input and doing some feature vector classification using decision trees and doing some stuff that, if you ask me questions about it, I will not know the answer because I'm not the mathematician in this context. But once she's trained that algorithm, it can actually learn the differences between these crowd sounds. So if these are dark blue along the diagonal, that means that it categorized here very well. If they're light blue, that means there was some confusion here.
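How a confusion matrix like this is read can be sketched as follows, along with what happens to the score when acoustically similar labels are pooled into coarser groups. Everything in the sketch is an assumption made for illustration: the ten true and predicted labels are invented toy values, not results from the study, and the `GROUPS` mapping is only one plausible way of pooling categories. No classifier is trained here; the sketch only covers how predictions are scored.

```python
from collections import Counter

LABELS = ["cheer", "applause", "distraction", "positive chant", "negative chant"]

# Hypothetical grouping of similar-sounding categories (an assumption).
GROUPS = {
    "cheer": "cheer/applause",
    "applause": "cheer/applause",
    "distraction": "distraction",
    "positive chant": "chant",
    "negative chant": "chant",
}


def confusion_matrix(true_labels, predicted_labels, labels):
    """Rows are true labels, columns are predicted labels, cells are counts."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(true_labels, predicted_labels):
    """Fraction of samples on the diagonal of the confusion matrix."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


# Invented toy labels, for illustration only (not data from the talk).
true_labels = ["cheer", "cheer", "applause", "applause", "distraction",
               "distraction", "positive chant", "positive chant",
               "negative chant", "negative chant"]
predicted = ["cheer", "applause", "applause", "cheer", "distraction",
             "cheer", "positive chant", "negative chant",
             "negative chant", "negative chant"]

matrix = confusion_matrix(true_labels, predicted, LABELS)
fine_accuracy = accuracy(true_labels, predicted)

# Pool similar categories, then score the same predictions again.
coarse_true = [GROUPS[label] for label in true_labels]
coarse_pred = [GROUPS[label] for label in predicted]
coarse_accuracy = accuracy(coarse_true, coarse_pred)

for label, row in zip(LABELS, matrix):
    print(f"{label:>15}: {row}")
print("accuracy with five labels:", fine_accuracy)
print("accuracy with pooled labels:", coarse_accuracy)
```

Large counts on the diagonal correspond to the dark blue cells, and off-diagonal counts are the confusions. Pooling categories that the classifier mixes up moves those off-diagonal counts back onto the diagonal, which is why coarser labels can score higher on the same predictions.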
And if we remember from one of the previous graphs, applause, distraction noise all kind of look similar to each other. So if we sort of smush those into one distraction category, put applause and cheer together, and put positive and negative chants together, we get even better classification. And this part is something we're still working on. Hopefully, we will be ready to submit it in a paper kind of soon. But now that we're ending on this really complex point, what have we learned? First, we took a quick dive into dynamical systems theory and the notion of interpersonal synergies. I told you that interpersonal synergies form when a functional coupling emerges between the behavior of two or more individuals. This kind of synergy can arise when individuals begin to coordinate and act as an integrated multi-agent group of people. But measuring this emergence of coordination in large groups of people is difficult. Even if we can intuitively hear the difference between a coffee shop crowd and a crowd at a concert or sporting event, without individual signals from each person in this interacting group, it's very hard for scientists to measure when the coordination occurs. But if we can hear this difference, then maybe we can measure it. So we practiced with a toy model, a pair of coupled metronomes transitioning from uncoordinated to coordinated activity, unsynchronized to synchronized. And then we applied the same nonlinear statistical analysis techniques from dynamical systems to measure uncoordinated and coordinated behavior in a musical ensemble. Transitioned once too far. My musical ensemble is behind our graphs here. Apologies about that. I can come back to this slide. So when we want to learn something about social behavior, music provides a good model system. You know roughly, and can manipulate, what roles each member of the musical group is performing.
And sometimes you can have the aid of a musical script, like we did, to be able to carefully measure the coordination dynamics that emerge in this joint music making. These same intentional interdependencies which arise from a group of musicians co-creating music together can also emerge as the result of local interaction between individuals in large interacting groups of people, like crowds at a basketball game. And we investigated that by measuring the different crowd sounds that these fans made and the different acoustic behavioral dynamics we could observe in each of these crowd sounds with fans at a BYU basketball game. And all this work was not done just by me, but also my colleague Majerle Reeves in the Applied Mathematics Department, my advisor Ramesh Balasubramaniam, and also our colleagues Chris Kello and Michael Spivey. So that's all I got. Thank you so much for having me. Awesome. Thank you. You can unshare and I'll ask some questions, and if anyone in the live chat has questions they can also ask. Sounds great. Are you going back to the slide? I was going to go back to the slide. I also cannot find the unshare button. It's fine. It's in the same place as the share button, but yeah. Cool. All right. Well, a lot of very interesting research, and I really appreciate the care in presenting it very concisely and clearly. So just if I could give one takeaway that I thought was really fascinating. So in the Fusaroli 2014 quotation he talked about how there was like synergy, and about the continuum and the spaces that are uncoordinated versus increasingly coordinated, from the sheet music, which is one type, to the emergent coordination. And I thought that was an interesting way to open up coordination, not in a framework of either strong versus weak emergence, but rather within a directly measurable dynamical systems perspective, and then at the end to do the machine learning and be able to train a model on the recurrence statistics and then identify the system.
It's like there's information about the system state as identified by experts, which is what that labeling task was about that's retained like just in the coordination pattern. And so there's something about the patterns across systems that can be generalized that captures also important information. And that's an intuition that I think people often bring up when talking about dynamical systems. But I really thought it was made clear with how you presented it. Yeah, thanks. I'm glad that I got across those kind of messages I was trying to share. One place to maybe start is this soft assembly. So where does this soft assembly come into play? And I guess just what does it mean in human teams or communication? Yeah, so this notion of soft assembly, let me think of something that's not soft, something that's hard wired. Oh, I don't have an opposite example to soft assembly. So if you take a tool, like, I don't know, a wrench, it only can move in one way, the parts of the wrench or the parts of a set of pliers, they're not softly assembled, they're always attached and they're always going to perform that function, right? But if we think about lots of kind of systems, if we think about human systems, if we think about ant colonies, or if we think about anything where an individual member of that system could leave if it wanted to or could leave when they're not interacting, that's what we would call a softly assembled system. And a group of humans, the example that I was thinking of was any group of humans that comes together for some amount of time, if we're talking about musicians on stage, they're together and they function as a string quartet or as an orchestra, so long as they're on stage and performing together. But then they can pack their instruments up and they can go home. So there's not a strong bond or like a strong physical mechanical connection that's forcing their interaction to occur. It's just happening during that process. 
Because you also connected that to like the transient motor chains. And so that was pretty interesting. Again, anyone who's watching live can ask a question, otherwise I'll just keep on asking about a few points. So you use the metronome as an example of synchronization. But also there's listening to repetitive music, whether it's like electronic music that's repetitive or other kinds of music, uh, binaural beats. So what is this repetition listening nature, and what kind of makes us select a given level of repetition to desire? Like how do we think about that in terms of how you presented it with the recurrence here? Yeah. So there's two things that I would think about. So first you mentioned binaural beats, and we want to listen to these very repetitive musical patterns. And so there's two ways that our brains might be interpreting rhythmic patterns in music. One notion that is probably very amenable to the Active Inference Lab and what you talk about is predictive coding, and the idea that we have these hierarchical predictions that are happening in the brain. So when we perceive this low-level stimulus, this rhythmic input, we're constantly predicting when the next input is going to arrive. And we're only passing up errors when our prediction is wrong. And isochronous, these very rhythmic beats are really easy to predict. But there's another option. It could be that we're not doing any prediction at all. And we just have these inherent oscillations, these natural frequencies of oscillations in our brain. And when we hear a rhythmic pattern that is similar enough, we naturally entrain to that beat, or that rhythm drives the oscillatory behavior. And so you have this emergent entrainment and no predictions necessary. And you could be in either camp. I alternate between being firmly in the predictive coding camp and being in the entrainment camp.
But either way, what seems to be really important in this activity is the activity of your motor system, which is really good at making periodic movements like walking, and the ability of the motor system to help the auditory system, to communicate with the auditory system, in predicting when the next beat occurs or in entraining to the beat of a sound like that. And that's with binaural beats. The second thing that I would bring up is we like these binaural beats. Maybe they're relaxing for some people. But when we're listening to music, there's a phenomenon called musical groove. And this is, it's very scientific, no, it is actually the scientific term; Petr Janata in 2012 coined this. And it's when the rhythm of the music is just complicated enough that it makes us want to move to the music. So it's just in that sweet spot where it's predictable, but not quite predictable enough. And it engages our motor system; instead of just passively engaging it, it actually makes us move to the beat of the music. This might be because when we move, when we tap our foot, when we dance, we're better able to predict when the next beat happens. Thanks. The groove. Again, jargon, so apologies. It reminds me of the zone of proximal development, and kind of what does it mean? If there was a buildup and too much tension being built, like a ramping, but then there's, for example, when there's quarter notes for two measures, and then eighths, and then sixteenths, and then it's like when it goes even faster, and then it's like a meme with making it more exciting, because it's reflecting a cultural prior, potentially related to biology, like our priors on an increasing rate of the sound meaning something, because it can't go on forever, so then something is exciting or something is changing.
So then that's very interesting about how our, like what is quiet, what is loud, what do those represent, and what are the priors of the musical experience that become encultured in different ways. So yeah, maybe how does it help us think about like different preferences amongst individuals in the kind of music that they want to play or engage in or listen to or how they want to engage with that music? Yeah, so so much content in what you just said. The one is this like notion of anticipation when the rhythm keeps getting faster and faster, and there's a whole book by Leonard Meyer about how music is correlated with emotion and how there's this idea that a composer is playing with these periods of tension and release. So whether that tension is from build-up of rhythmic events, and we usually expect the faster rhythm gets, maybe it's associated with a certain activity in the music, or we expect it to slow down after it's gotten faster for so long, or you're thinking about when the beat drops in a piece of electronic music, for instance. We can play with the rhythm or we can play with the musical pitch, and we might have a certain set of chords that usually lead up to a certain resolution, and we can trick everyone in the audience if we play a different chord instead. But you mentioned this kind of idea of different individuals liking different kinds of music, and this zone of proximal development, too. So this is really getting into how the music that we're exposed to shapes the expectations that we're going to have. So if you grew up hearing Western music, if you were trained in Western classical music, you have certain expectations of what rhythms are allowed of what kind of melodic or harmonic progression is going to happen. And when you hear a different kind of music, it might not even register to you as music the first time you hear it, until you learn about it, until you learn about the music of that culture. 
Even if it's music within your own culture, say you listen to hard rock for the first time, or like heavy metal for the first time, it might just sound like noise until you're exposed to it enough, or until you study the patterns of how that music moves in time. It's like the plastic learning phase, which is always continuing, but especially during early developmental phases. It's catching on to patterns and rhythms and whether the four-beat measure or different pitches and rhythms are familiar or not. That's a very interesting area. So how would one model that, and or where do you see active inference in this discussion and what you've discussed here? Yeah, so there's been predictive coding approaches at modeling this. Also the information dynamics of music is one model that comes to mind. Or if you think of any of these AI music generators, Google has an AI that plays music like Bach. Actually, that might not be Google, but there is an AI that plays music like Bach. Google has different music AIs that you can actually interact with. And what you do here is you turn all of the music into symbols for the computer, and you're going to encode what the pitch is. You're going to encode what the rhythm is in a set of however many different numbers you want. And then you can kind of take measures of maybe the Shannon entropy of how likely you are to expect this note knowing that that note has played. And if you've only trained this AI on Bach chorales, maybe then you know the notes are mostly going to be quarter notes, maybe half notes, maybe an eighth note here or there and some whole notes. So you already are constrained in the type, the length of note that you're going to hear. And Bach chorales follow a similar pattern every single chorale. So you know the probability of hearing the tonic chord at this point in time and not hearing a major second, or hearing a certain step at this point in time, because you've mapped out all those probabilities.
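The idea of encoding music as symbols and then measuring how expected the next note is can be made concrete with a small sketch. It assumes the simplest possible model, first-order transition probabilities estimated by counting, and uses an invented ten-event sequence of note durations; it is not a Bach chorale and not the method of any particular music AI.

```python
import math
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Estimate P(next symbol | previous symbol) by counting adjacent pairs."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(sequence, sequence[1:]):
        counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(following.values()) for nxt, c in following.items()}
        for prev, following in counts.items()
    }


def surprisal(probs, prev, nxt):
    """Information content, in bits, of hearing `nxt` right after `prev`."""
    return -math.log2(probs[prev][nxt])


def conditional_entropy(probs, prev):
    """Shannon entropy, in bits, of what may follow `prev` (average surprisal)."""
    return -sum(p * math.log2(p) for p in probs[prev].values())


# Invented toy sequence of note durations, for illustration only.
durations = ["quarter", "quarter", "quarter", "half", "quarter",
             "quarter", "eighth", "eighth", "quarter", "half"]

probs = transition_probabilities(durations)
print("P(next | quarter):", probs["quarter"])
print("surprisal of quarter -> quarter:", surprisal(probs, "quarter", "quarter"))
print("entropy after an eighth note:", conditional_entropy(probs, "eighth"))
print("entropy after a half note:", conditional_entropy(probs, "half"))
```

In this toy sequence a half note is always followed by a quarter note, so the entropy after a half note is zero bits: the continuation is fully predictable. After an eighth note two continuations are equally likely, which gives one bit. A corpus with tightly constrained note lengths yields low entropies of this kind throughout.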
And lots of people do that. I don't do that kind of AI or modeling work with musical scores myself. So they're creating musical scores with what imperative, or for what end? That's a good question. So I've actually worked with a co-author, we submitted a manuscript, and that's kind of one of the questions we ask. We're waiting on reviews back, but we're looking at all of these different kinds of music-making AIs. And we're coming at the question from another angle. We're kind of asking whether they can make music and whether it's good music, and what's missing if they can't. And that's because when we're talking about humans and why humans make music, a lot of it comes down to the social interaction that music allows. So music seems to facilitate social bonding. If I am drumming in time with you, even if I'm walking in time with you, if I'm doing something in time with you, whether it's drumming or tapping or making music, I probably will like you more after that event. And it's also a tool that parents use with infants to soothe them, if you think about lullabies. And there's a lot about music that seems to be rooted in social bonding. And music also activates reward systems in our brain. So we get a dopamine rush when we get to that peak experience in music. There's some recent work, and I can't remember who put it out, that if you listen to music, you're getting similar activation as if you had engaged in exercise. We use music in therapy. Like music is just, it's connected to a lot of survival and social bonding functions for humans. But why does an AI make music? Like the AI doesn't need to bond with another AI. Maybe they're getting reward in terms of like a value function that we've programmed into the AI to say like, oh, good, that was a good musical line that you made. Now do it again. But like what motivation do they have? There's also no connection to any sort of like interoceptive states or physiological states.
If we think about listening to music before we go to the gym, we listen to music to hype us up and get us pumped, and it's raising our blood pressure. It's getting our body prepared to do the activity we want to do. And at night we might listen to calming, soothing music, to lower blood pressure, lower breathing rate, to prepare to go to sleep. What drive does a computer have to sort of maintain these different sorts of homeostatic needs at different times of the day or times of your life, compared to a human? It's a great question. I don't know if the people who are making AI music are always asking that question. But have you ever heard of Holly Herndon? No. So she is an electronic music creator who has created music AI in this space, and her notion was to take this idea of cultural learning. And you could think of the zone of proximal development here: she's trained her AI baby with voices of a lot of different people and exposed it to basically a culture, in order for it to generate some sort of music from what she makes. And she has some really interesting art in that space. It's a little bit like the tree falling in the forest. Is it a noise or a sound? Does someone have to listen to it? If you have a bunch of auto-generated songs in a genre, but no human experienced writing them, no one experienced listening to them, it's like, did it ever happen? That's quite an interesting idea. And it's also a cross-media issue, with generated text and image. And that also makes me think about a John Cage quote talking about music and duration and time. So maybe what do you think about music and time? Music and time. So stereotypically, whenever we think of John Cage, we think of that four minutes and 33 seconds. He sits in front of the piano, starts a timer and is just silent. There's Mariusz Kozak who, if I remember right, actually writes about music and time and the idea that music is deeply linked to movement in time.
And it's sort of a way to express movement in time through not just the auditory lens but also through dance. So like thinking of dance as a form of music as well and seeing how not not only something like auditory music but dance is prevalent in every culture in the globe. And I think there's something important to the notion of time and maybe music as the choreographing of movement through time. Because if we think about you mentioned generated language or images, these are static pieces. You can look around the scene or you can read the text, but music has to happen in the temporal domain. You have to listen to it over a period of time. Once a note is played, it's gone after that. Or once you've enacted a particular dance, it's gone after that. And I don't know, I don't have a nice way to wrap that up about why time is important. But it's definitely an important feature of music and of dance that you're not going to find in other art forms. You might find you could think of it in a narrative, has to happen over time because you're reading it over time. And that's part of why so in my neuroscience work, I study how we perceive musical rhythm. So if we perturb the time that a musical rhythm is happening, how does our brain cope with that? Or if we perturb part of the brain that is thought to be involved with timing perception, so this ability to hear beats that are happening in a certain phase or to hear certain rhythms, if we perturb that system, are we going to impair how people perceive the rhythm in that piece of music? And these are questions I'm asking, questions a lot of people are asking. It's a very bright landscape right now for that kind of question. That's really interesting about bringing the action in and maybe even moving music outside or a little bit beyond just the auditory, because then that might open up auditory experiences that aren't music and musical experiences that have a large component through time, but they're not necessarily auditory. 
So one other thing, thinking about the development of auditory preferences and expectations: for probably a long time in human history, the sounds and the music that one heard were local and performed, if not interactive, and then there was a pretty short period of time of like top 40 records, CDs, and then now there's the ability to listen to a huge amount of music as well as listen to a small selection many times. And what is that doing to our auditory cognition, or even more broadly, that I can listen to the same song in 50 different versions in a row, versus that just not being plausible not even so long ago? Yeah, that's an interesting question. I haven't thought a lot about it, but I'm basically just tracking like from music in the home, to everybody hearing the same thing on the radio, to now we have Spotify and the internet. And it's probably a question that's not just about music, it's also about media: we went from only getting our media or our news from people that we locally interacted with, to everybody sitting for the same fireside chat, to now everybody's in their different bubble for whatever media they're consuming. And it's interesting, I know there's been some work, for instance, Mike Hove has looked at how the bass sounds in music have gotten lower over time, lower and louder, which has some impact on how we feel musical groove, since the bass is usually where the rhythm sits in pop music. But yeah, so what is it doing to like our brains and like our perception of music? That's interesting. And you mentioned memes earlier, which made me think of TikTok. And so this is like a new way of proliferating sounds. And if you've ever spent very much time on TikTok, there'll be like one sound that everybody's making a dance to, or one sound that people are just playing over the top of the video of their dogs running around, because that sound's popular.
And then when you hear it on the radio, it's a little bit jarring because you only know this like 15-second snippet, or however long a TikTok is, from that TikTok video. And then you realize this is a real song, like a real work of art that somebody made before it became a TikTok meme, which is just pure like personal anecdote, but it's interesting. So if we're getting insulated by like the media we consume, are we also getting insulated by the type of music we consume? Like are we less able to appreciate other types of music because I'm listening to the same song in a lot of different renditions? Or are we like broadening our musical consumption because I'm able to access music from all over the globe at the click of a button? It's probably a little bit of both. Yeah, it's like the space of the action opens up when there's so many affordances in our niche to select from, yet it's also quite curated and presented in a certain way that just on aggregate probably still does drive people to have shared action. And so then like it's literally the platform playing that role of the metronome coupling interaction there. That's interesting that you bring that up. And also, so if we come back to this idea that music is this drive for social bonding: we've just been in this global pandemic, or we still are. But for many people we were in various stages of lockdowns. Live music was not happening. Choir or musical ensemble rehearsals were not happening. And you saw this drive to continue to do something like make music together online. And one of these drives was that sea shanties became really popular on TikTok. And it's not something that people were doing together at the same time, because we have this delay that's happening over audiovisual communication like this. It works well when we're taking turns. It's harder for trying to synchronize and sing together. So TikTok allowed someone to record a sea shanty. And then the next person adds their voice to that recording.
And they record their song singing the sea shanty. And then the next person does the same thing. And you end up with these compilations, not compilations, but these like ensemble performances of a song that people would sing when they're off on ships at sea. And they're feeling they're like having this social bond with the people that they're out at sea with the whole time. And now it's our song that we're singing through the internet as a way to feel like we're having some sort of social bonding and interacting and engaging with people. And you saw that on TikTok you also saw that with celebrities trying to sing when they all gathered together to sing Imagine. We can argue about whether that went well or not. Or when we saw basically there was kind of an explosion of different types of online music interaction to try to replace this local interaction of going to a concert and seeing someone on stage around a group of people that you're interacting with together or going to perform or rehearse or jam with your friends in your garage and trying to find digital avenues for the same social role that music traditionally plays. Very interesting. Like, and how will it change going forward? Some people may find very immersive sensory experiences to be powerful for them. Maybe that's a scalable approach. Maybe it's not. Maybe that's a good thing. Maybe it's not. So one other part was like, when you were contrasting the musical study that you presented first with the basketball, you mentioned that that environment was unscripted or in contrast to sheet music or like a spoken dialogue that was scripted. And then in one sense it is unscripted. But then as per the paper on active inference and scripts, it's more like the strong and the weak scripts like the basketball is a scripted environment in the cultural norm weak script sense. So yes, it doesn't have the stage directions. 
But you know, there are signs helping identify where people will be, given tickets, and sit, and signals and lights and sounds that also provide a lot of cues that are going to be totally specific, like the seventh-inning stretch in baseball. Why stretch then? But then that's part of where like the embodiment of even watching the game and participating in it gets to be why it's the phenomenon of entertainment, and why it's something that people spend their life like supporting directly and indirectly and practicing for, and why this isn't just like a neural entrainment binaural beat hack or something like that. Yeah, so if I got the same experience being at a basketball game sitting in the student section that I got from sitting on my couch and watching the basketball game at home, then you would imagine nobody pays to go to basketball games and we all sit on our couch and watch them at home, right? But we spend lots of money to go to basketball games or whatever sporting events. We have student sections. We wear the colors of our sports team. So we're already signaling to people that, you know, I'm wearing this outfit, so I'm a BYU fan. I'm wearing this other outfit. So I'm a, I don't know basketball teams. I was going to name another one. I'm that kind of fan. So that's already signifying a kind of script where I know, oh yeah, you're a friend, we're going to cheer at the same time. And then the game has rules and the game has good things that happen. If my team scores a point, my response is going to be to cheer. If the other team scores a point, I'm going to be upset. So there's already a relationship where the game is driving the behavior in the stands. And then you have the interactions of the people next to you, the people you can see next to you and in front of you, the people who you can hear also behind you. And there are certain cheers that might be scripted just like that musical ensemble.
So if there's a certain idiosyncratic chant that I make that's associated with this team and this mascot, we all know this chant. We've learned it. Maybe it was written down. Maybe it's being broadcast on the big screen above the game. Maybe there's cheerleaders or a band who's leading a certain cheer and you're engaging it. So even though it's less scripted in a way than this musical ensemble, in the sense that it's not written on a piece of paper that every person's looking at, they're all aware of the social scripts that are involved in a basketball game experience. And a basketball game experience with that team or with that student section that you're sitting in. So you're absolutely right that there's scripts like this. And I think any interaction you get to, there's going to be some kind of script. There's not going to be something that's pure interaction. Nobody's interacting just on vibes. There's some social script based on where the people are, based on what context they're in, whether they know each other ahead of time, whether there's some external signal like clothing or you look like me or you don't look like me. And that's all going to change the dynamics. It might make certain forms of interactions occur more easily. It might make certain kinds of interactions dissipate more easily, which interestingly is something that we're still studying with metronomes. So metronomes don't only move into synchrony. If you have two interacting metronomes, they might actually cancel each other out or they might exhibit other like coming in and out of synchronous dynamics. And if we are still learning how to understand that in that physical system where there's physical properties that we can change and measure that are really simple, just moving back and forth in a certain period, it's that much more complicated when we're looking at humans who we seem to have some sort of free will. 
And we can behave according to some social rules that we know about, or we could behave completely contrary to those rules. Awesome point that the model system, like you brought up, helps us confirm our statistics. Like, before the linear model gets brought out in one setting, try it on the piece of graph paper. And so this is like the graph paper for thinking about semantic or narrative coordination. One wouldn't want to be in non-observable territory, kind of flying blind that way. And also, I think of the examples with the game and the context coordinating behavior to different ends. It made me think about where there are fans from different teams in the same setting versus sections that are separated. And then there are the parents at the game, where maybe the intention is just to support people in their play, versus professional sports. And then there are people who are there for pragmatic value. They have literal money on the line. Other people have identity currency on the line, like feeling that one's city has a good baseball team, and all these very complex things. And so it's kind of cool to have a model that maybe can blanket off some of that complexity, but also recognize the real data that you're getting from these systems, which is sometimes, as you pointed out, down to a single microphone, which has many advantages over visual observation or sensor tracking or anything. And that's what we're observing from the system. And then there's our process of doing analysis on that signal, because what the system is actually providing is measurements. So it kind of ties in to active inference and how we're modeling these complex settings without getting lost in how many possible ways they can be. Cool. Do you have any other things you'd like to add or any other points you want to bring up? No, I think this was really great. I really appreciated the discussion you brought.
We covered a lot more than just interacting groups of people, running from brains to music to AIs and back again to people. It was really fun. Cool. Thank you very much for joining. So see you anytime. Awesome. Sounds good. See you. Bye.