All right, thank you very much, and sorry for the delay. I think my old Mac is to blame; I should get rid of it at some point, it's quite old. I tried it yesterday, but the connection is a bit shaky. So today I'll talk about swarms and some work on machine learning. And I want to start with architecture, because there's at least one architect in the audience. As you may or may not know, it was Mies van der Rohe who coined the phrase "less is more". He also designed beautiful buildings of minimalist design, like the Barcelona Pavilion that you can see here. And it's striking to think that this was already 1929, a long time ago, way ahead of its time. The idea of using less rather than more in design is fairly universal, and many people in swarm robotics use it as well. Deborah explained well how outcome, algorithm and environment are related, so I'm not going over this again. But what we are interested in is not how much information processing a particular agent in a group needs to do, but how little, in a sense. That's the emphasis. Let's take a look at this in the context of a particular task where a bunch of robots have to gather in one place. In our framework, which we call computation-free, the robots have simple sensors that give some discrete value. For example, there can be a line-of-sight sensor that tells you the color of the object you're pointing towards. In this case, the line-of-sight sensor is just binary: at any moment in time, the robot knows whether there is another robot in front of it or not. And we assume that the range of the sensor is actually infinite. As we will show, this one bit of information is sufficient, if you have embodied robots, to solve this particular task. So how do the robots move? They have two wheels; they can turn on the spot, move forward, backward and so on. Each of these wheels can be set to a continuous velocity from minus one to one.
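As a minimal sketch of this setup, here is a differential-drive robot whose motion depends only on a binary sensor reading, plus the kind of dispersion cost described next. The parameter values, time step, wheel base, and the exact normalization of the cost are illustrative assumptions, not the optimized controller from the talk; r=0.037 is roughly an e-puck radius, also an assumption.

```python
import math

# Illustrative parameter values only; the talk's controller was obtained by
# offline search in simulation. Each entry maps the binary sensor reading
# to a pair (v_left, v_right) of wheel velocities in [-1, 1].
CONTROLLER = {
    0: (1.0, 0.5),   # no robot in sight: arc around to search
    1: (0.5, -0.5),  # robot in sight: turn on the spot
}

def step(x, y, theta, sensor, dt=0.1, wheel_base=0.05):
    """One differential-drive update, driven only by the binary sensor."""
    v_left, v_right = CONTROLLER[sensor]
    v = (v_left + v_right) / 2.0             # forward speed
    omega = (v_right - v_left) / wheel_base  # turning rate
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + omega * dt)

def aggregation_cost(positions, r=0.037):
    """Mean squared distance of the robots to their centroid, normalized by
    the robot radius r (one plausible form of the talk's cost function)."""
    n = len(positions)
    cx = sum(p[0] for p in positions) / n
    cy = sum(p[1] for p in positions) / n
    return sum((px - cx) ** 2 + (py - cy) ** 2
               for px, py in positions) / (n * r * r)
```

Note that with sensor value 1 the sketch turns on the spot, so the position does not change at all, which is exactly the behavior discussed below.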
And now the question is what the control policy is, because we're not just interested in the outcome, we're interested in a policy that maps input to output. We're assuming that we have robots without any state, so there's no runtime memory. If you make this assumption, then the only thing you can do is map input to output; everything you do is just a map from input to output. In our case we have a binary sensor value, and for each of the possible values we have to determine the velocity of the left wheel and the velocity of the right wheel. As long as the sensor reading doesn't change, you keep moving with exactly the same velocity profile, the same pair of wheel velocities. In total we have four parameters: the two wheel velocities if there is a robot in front of you, and the two wheel velocities if there isn't. Now, how do you optimize these four parameters? I want to be upfront: we use a lot of computation in the optimization process. We use a simulation framework and search algorithms to find a suitable controller, and there's a lot of computation involved in this. However, once we have found a suitable controller, we can upload it onto the robot, and from there on the robot no longer needs to perform arithmetic computations. This is the function that we are going to optimize. Here p_i(t) is the position of robot i at time t, and essentially this function tries to minimize how far a robot is away from the average position of all of the robots. It uses the squared distances, and there's some normalization; small r is just the radius of the robot. Now, when you optimize this function, I cannot show you all the information here, because we have four parameters and that's a bit difficult to visualize. But what you see here is: if you choose a particular pair of parameters in a certain way, what is the maximum performance?
That is, the maximum performance you can obtain with the other two parameters chosen to your advantage. So for example, if you are a robot and you have another robot in front of you, you actually do not know how far away the other robot is; it could be 50 meters away, it could be touching you. It is obviously not good to move away, so you don't want to move backward; that gives very bad performance. What is interesting is that the best thing to do is to turn clockwise or counterclockwise. When you're turning clockwise or counterclockwise on the spot, you're not changing your position, which means you're not approaching the other robot by even a millimeter. And that's a bit counterintuitive, because we wanted to approach other robots; we wanted to get close to them. The answer to this is that the robots approach the other robots not when they see them, but when they don't see them. You can analyze this geometrically; in the interest of time I'll skip this. But you can prove analytically that two robots moving simultaneously will find each other in quadratic time. Now, we have tested this with real robots. We have robots in the lab called e-pucks; they have two wheels and a camera on the front, and in an extreme case a single pixel is sufficient to tell whether there's a robot in front or not. You can see here an example: 40 robots, shown at 16 times real speed. We did 30 of these trials, and in almost all of them the robots aggregated into a single cluster. So, a few points about this collective behavior. First of all, looking at it, it looks complex to me, in the sense that you have collective movements of groups and not just individual robots. And while it may be inefficient in terms of the time it takes, the result is very compact: you can see some hexagonal pattern in there, which is known to be very good. Also, the robots that are inside the structure see other robots at all times. So what should they do?
They turn on the spot, which is a good choice. The alternative, to stop, wouldn't necessarily be a good choice, because then two robots that see each other but are far away would stop, and they would never aggregate. In the second part of the talk, we'll talk about machine learning, and we will come back to this particular behavior I've shown. The question, when you are interested in machine learning, is something like this: given this trajectory data (and this is actually trajectory data from one of the real robots here), what was the rule that generated it? We're not just interested in predicting trajectories; we're interested in knowing, like Deborah said, the algorithm that takes input into account to produce it. And we know the trajectory, but we do not know the input that generated it. So that's the second part of the talk, but let's finish the first. There are many other interesting problems beyond aggregation that you could look at with the same framework, like circle formation. Another problem is that of clustering, not the robots themselves but other objects. These are passive objects; they don't move, but if you program robots using our framework, they start clustering them. In this case, the robot doesn't have a binary sensor but a sensor that returns a ternary digit. The sensor is explained here on the left: the robot can either see nothing, or it can see another robot, or it can see an object. So there are three possible sensor states, and for each of these states, say zero, one, two, you have to choose a velocity profile, a pair of wheel velocities. So here we have six parameters, and we used genetic algorithms to optimize them. Some of you are interested in collective choice and consensus problems.
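The six-parameter optimization just described can be sketched with a minimal elitist evolutionary loop. This is a generic stand-in, not the talk's actual genetic algorithm: there, fitness is evaluated by simulating the swarm, whereas here `fitness` is any callable scoring a genome, and the population size, mutation strength, and selection scheme are assumptions.

```python
import random

SENSOR_STATES = 3  # 0: nothing seen, 1: another robot, 2: an object

def decode(genome):
    """Genome: six velocities in [-1, 1], one (left, right) pair per state."""
    return {s: (genome[2 * s], genome[2 * s + 1]) for s in range(SENSOR_STATES)}

def evolve(fitness, pop_size=20, generations=50, sigma=0.1, seed=0):
    """Minimal evolutionary loop over six-parameter genomes (higher
    fitness is better)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(6)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]         # truncation selection
        children = [[max(-1.0, min(1.0, g + rng.gauss(0.0, sigma)))
                     for g in p] for p in parents]
        pop = parents + children              # elitism: parents survive
    return max(pop, key=fitness)
```

The returned genome decodes into one wheel-velocity pair per sensor state, ready to be uploaded to a robot that then performs no arithmetic at runtime.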
So we have also tried whether a group of robots that doesn't perform arithmetic computations is able to choose between two equal options that are presented to them and that they are unable to discriminate between. It's purely a matter of random choice whether they go for the left one or the right one, and in 96 or 97% of our trials (we did 50 trials), the majority of the robots ended up near one of the objects. There are also some incredibly complex tasks; we learned yesterday about cooperative transport. We did some work on this, and we came up with an elegant way to transport an object towards a goal, assuming that the goal is not very far away. Suppose you are a robot, you're traveling around, and you find an object that you want to bring back to the nest. Now, if the nest is somewhere nearby, you can assume that if you can see the nest, you probably don't want to push the object, because you would be pushing it in the wrong direction. Here we assume the object is fairly tall and the robots are very small, so whenever the robots are behind the object, they won't see the goal, the nest. The robots are programmed to push perpendicular to the object, but only if they are in the occluded area. This is a particular object, but you can generalize: you can go to differentiable curves, and if they are closed and convex, you can prove mathematically that this strategy will eventually bring the object to the goal, under some assumptions. A lot of assumptions, actually: you need an infinite number of point robots, uniformly distributed, and quasi-static movement. Now, you can implement this strategy using a swarm of robots, like we did in our laboratory, and in this case they do use arithmetic computation.
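For a circular object, the occlusion-based rule just described can be sketched in a few lines: push toward the object (perpendicular to a circle's surface) exactly when the object blocks the line of sight to the goal. The circle is a simplification for illustration; the talk's result covers closed convex shapes.

```python
import math

def goal_occluded(robot, center, radius, goal):
    """Does the disk (center, radius) intersect the segment robot->goal?"""
    rx, ry = robot; gx, gy = goal; cx, cy = center
    dx, dy = gx - rx, gy - ry
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return math.hypot(rx - cx, ry - cy) < radius
    # Closest point of the segment to the disk center
    t = max(0.0, min(1.0, ((cx - rx) * dx + (cy - ry) * dy) / seg_len2))
    px, py = rx + t * dx, ry + t * dy
    return math.hypot(px - cx, py - cy) < radius

def push_direction(robot, center, radius, goal):
    """Unit push vector toward the object center (perpendicular to a
    circular object's surface) if the goal is occluded, else None."""
    if not goal_occluded(robot, center, radius, goal):
        return None
    d = math.hypot(center[0] - robot[0], center[1] - robot[1])
    return ((center[0] - robot[0]) / d, (center[1] - robot[1]) / d)
```

A robot directly behind the object (relative to the goal) pushes; a robot that can see the goal past the object does nothing, so the net push is always roughly goalward.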
The robots find the object, they use some kind of neural network to learn how to push, and we can actually predict the trajectory that the object takes and compare against the experiment. An object that is round and symmetrical will be pushed along a straight path, whereas this particular object, for example, will follow a curved trajectory, and so on, under the assumption that the robots are uniformly distributed in the occluded area, which is not always the case. Interestingly, the whole algorithm doesn't need any communication between the robots; there's no explicit information passing. The robots are, however, aware of each other: they avoid bumping into each other. We also implemented this in simulation, and this is where we used our computation-free framework. I won't go into too much detail here, but essentially in 3D you can have robots that stick to something, propel it, and bring it back to some goal. We recently developed the first robot that moves by routing fluid through itself. This is a modular robot, composed of many, many units that you can attach together. These units are put in an aquatic environment and they have pumps inside: if you turn on this pump over here, it will remove fluid from within the robot to the outside, into the environment, and all the passive pumps will let fluid in. And we can implement the occlusion-based controller that we just discussed to move such a robot towards a light source, a task called phototaxis.
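The resulting pump policy, explained in more detail next, can be sketched as a one-liner: a pump is switched on exactly when its face points away from the light, so the expelled fluid pushes the body toward the light. This is a 2D reading of the rule with unit normals as an assumed representation; the talk's robot is modular and 3D.

```python
def pump_states(face_normals, light_dir):
    """Occlusion-inspired phototaxis policy for the fluid-routing robot
    (a sketch): activate the pump on each external face whose outward
    normal points away from the light direction."""
    lx, ly = light_dir
    return [nx * lx + ny * ly < 0 for (nx, ny) in face_normals]
```

For a square module with the light to the right, only the left-facing pump fires; the on/off pattern over all external faces is exactly the 2^(number of faces) configuration space mentioned below.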
So if the light source is towards the right, like in the video, all the pumps that are facing away from the light source will be active and will help propel the robot, whereas the other pumps will let the fluid in. And you can actually prove (this is not published yet, but we are working on it) that even robots of concave shape, of arbitrary morphology, will be able to get near the goal under this simple policy, which doesn't require communication, doesn't require memory, and doesn't require arithmetic computation. These robots can potentially be quite powerful if you think about a three-dimensional version, like you see here. Think about a cube, a 10 by 10 by 10 cube: you have 1,000 modules and 600 external faces, and on each face there is a pump that can be on or off. That gives you 2 to the power of 600 possibilities for how you can fire the pumps, which is a lot of possibilities for the robot to move. Now, I have one more part before going to the machine learning; it's on synchronization. Some of you may be interested in mobile coupled oscillators. These are agents that move around in their environment, and this is about synchronizing their clocks in a distributed way. Each of these agents can fire, and once they fire, all the agents in the neighborhood are going to update their phases. One of the crucial questions, as we will see, is what the velocity of the agents is, and another is what the neighborhood model is. There are different types of neighborhood models, for example nearest neighbor, k-nearest neighbor, or cone-of-vision connectivity. Cone of vision means, for example, you have an animal with a certain perceptual field, a cone, and whenever someone flashes within this field, you update your own phase. Now you can see here some of the robots in our lab.
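The fire-and-update model just described can be sketched as follows. This is an assumed concrete instantiation, not the exact model from the work discussed: agents drift at constant speed on a torus, the neighborhood is a fixed radius (not a vision cone), and a firing agent nudges each neighbor's phase multiplicatively toward firing.

```python
import cmath
import math
import random

def simulate(n=10, speed=0.05, coupling=0.1, radius=0.3, steps=100000, seed=1):
    """Mobile pulse-coupled oscillators (a sketch): when an agent's phase
    reaches 1 it fires and resets, and every neighbour within `radius` is
    pushed closer to firing. Returns the step at which the phases are
    (nearly) aligned, or `steps` if that never happens."""
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    heading = [rng.uniform(0, 2 * math.pi) for _ in range(n)]
    phase = [rng.random() for _ in range(n)]
    dt = 0.001
    for t in range(steps):
        # Constant-speed motion in a unit square with periodic boundaries
        pos = [((x + speed * math.cos(h) * dt) % 1.0,
                (y + speed * math.sin(h) * dt) % 1.0)
               for (x, y), h in zip(pos, heading)]
        phase = [p + dt for p in phase]
        for i in range(n):
            if phase[i] >= 1.0:  # agent i fires
                phase[i] = 0.0
                for j in range(n):
                    if j == i:
                        continue
                    dx = abs(pos[i][0] - pos[j][0]); dx = min(dx, 1.0 - dx)
                    dy = abs(pos[i][1] - pos[j][1]); dy = min(dy, 1.0 - dy)
                    if math.hypot(dx, dy) < radius:
                        phase[j] = min(1.0, phase[j] * (1.0 + coupling))
        # Kuramoto-style order parameter: 1.0 means fully synchronized
        r = abs(sum(cmath.exp(2j * math.pi * p) for p in phase)) / n
        if r > 0.999:
            return t
    return steps
```

Sweeping `speed` in such a simulation is the kind of experiment behind the velocity dependence discussed next.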
These are again the e-puck robots, and they start out of sync, but after a while (it actually takes quite long) they're beautifully in sync. That's maybe not very surprising; you can program them to do this. What is more interesting (and if you want to know more, check also the original paper by Prignano et al. in Physical Review Letters) is that there's a non-monotonic dependence between the synchronization time and the velocity of each agent. All the robots move with a constant speed, and if the speed is medium, it takes very long to synchronize; it would be better for the agents to move slower or faster. This is data from the actual physical robot experiments. Here is some simulation: imagine you have n robots that start synchronized, and now you perturb a single robot by a small epsilon and measure the time it takes for the whole system to reach consensus again; a small perturbation to a single agent in a swarm. Now, if the agents move slow or fast, the time is linear in the group size, and note that this is a log scale. If the agents move in the intermediate regime, the time to reject the disturbance applied to a single agent grows exponentially with group size, which is a situation akin to stable chaos. The last part of the talk is about machine learning. In general, in machine learning you have models that produce data and you have systems that produce data, and you are interested in comparing these: you want the model to become like your system, and you will usually have a similarity metric for this. And interestingly, coming back to Deborah's talk, we're not just interested in outcomes and matching outcomes; in a sense, we want to match the algorithms behind them. Now, recently there have been attempts to replace these types of metrics.
One of the issues with metrics is that if you don't choose a suitable metric, you will get something that you thought was similar but in fact isn't. A good metric might be the KL divergence, which essentially measures the difference between two distributions. However, you do not normally have the distribution; you just have, let's say, 100 trajectories, and typically not even two of these trajectories will be identical. So how can you measure a distribution? Well, there is a new family of machine learning algorithms. In GECCO 2013 we proposed to replace the metrics by discriminators that evolve and compete against the models. A year later, GANs emerged at the NIPS conference, which I'm sure many of you will have heard about. And at the latest NIPS we presented a paper on how to generalize GANs, using a framework that a couple of years ago we termed Turing learning. So what's the difference? How does it work? In Turing learning, and in any type of GAN algorithm, rather than a fixed metric you have a metric that evolves: you have a discriminator that is trained to label data that comes from the other side, which can be from the model or from the system, and you train the discriminator to label the data correctly. At the same time, you train the model to fool the discriminator. Since this setup is so similar to the Turing test (you essentially have two agents that are learning and competing against each other), we call it Turing learning; generative adversarial networks are just one example, where the model is a generative model. Now let's look at one case study. Imagine you have a swarm behavior and you want to understand what the robots do. These robots are clustering some objects, similar to what you have seen before. First we need to explain what the training data is. The training data in this instance could be the trajectory data.
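The KL divergence mentioned above, for discrete distributions, is a few lines of code; the sketch below also shows why it presupposes having the distributions themselves rather than raw samples, and that it is not symmetric.

```python
import math

def kl_divergence(p, q):
    """KL divergence D(P||Q) between two discrete distributions given as
    lists of probabilities over the same support (q must be positive
    wherever p is)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

It is zero exactly when the two distributions agree; the practical catch from the talk is that with 100 mutually distinct trajectories you have no such probability vectors to plug in.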
So these robots, the blue robots, are moving around; they could also be real animals that you're tracking. In our case this is in simulation, so we actually know what rules they follow, but we are not using this knowledge. The good thing is that at the end, because we know the ground truth, we can check whether what we inferred is actually correct. We also introduce a replica agent: we have a simulator, we just add another agent, and we assume that we know the morphology of this agent but not the controller. So here we are inferring the six parameters that the other agents use; we try to learn what parameters they use in order to cluster. Whenever we put a model onto the replica, this produces a trajectory. The discriminators are neural networks that take a trajectory as input and have to tell: is this trajectory coming from a red agent, the replica, or from a blue agent? That's the task of the discriminator. And then we use an optimization process: we have a population of discriminators and a population of models, and they are both competing, in the coevolutionary sense. We are using an evolutionary algorithm for the discriminator, which is a recurrent neural network; the genotype is the neural network weights, and they are trained using a standard evolutionary algorithm. The fitness function is the fraction of correct judgments. That's a fitness function that is generic, not task-dependent: it doesn't matter whether you want to learn about a swarm or about a human. And for the model, the fitness is the fraction of times it misleads the classifier, the discriminator. So again, these are fitness functions that are fairly universal; they don't depend on the particular task. So now you can see some example results.
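The coevolutionary structure just described can be sketched with deliberately toy ingredients: a "system" that is simply a Gaussian with a hidden mean stands in for the swarm, a threshold interval stands in for the recurrent-network discriminator, and truncation selection stands in for the actual evolutionary algorithm. Only the two fitness definitions (fraction of correct judgments, fraction of fooled judgments) follow the talk directly; everything else is an assumption.

```python
import random

rng = random.Random(0)
TRUE_MEAN = 2.0  # hidden parameter of the "system" (toy ground truth)

def system_data():
    return rng.gauss(TRUE_MEAN, 1.0)          # one genuine observation

def model_data(mean):
    return rng.gauss(mean, 1.0)               # one observation from a model

def judge(disc, x):
    """Toy discriminator (genome = interval center and width):
    label x 'genuine' if it falls inside the interval."""
    center, width = disc
    return abs(x - center) < width

def disc_fitness(disc, models, samples=30):
    """Fraction of correct judgments -- the generic discriminator fitness."""
    correct = sum(judge(disc, system_data()) for _ in range(samples))
    correct += sum(not judge(disc, model_data(rng.choice(models)))
                   for _ in range(samples))
    return correct / (2 * samples)

def model_fitness(mean, discs, samples=30):
    """Fraction of times the model's data fools a discriminator."""
    return sum(judge(rng.choice(discs), model_data(mean))
               for _ in range(samples)) / samples

def evolve(pop, fitness, mutate, keep=5):
    """Truncation selection plus mutation (one coevolutionary step)."""
    survivors = sorted(pop, key=fitness, reverse=True)[:keep]
    return survivors + [mutate(p) for p in survivors]

models = [rng.uniform(-5.0, 5.0) for _ in range(10)]
discs = [(rng.uniform(-5.0, 5.0), rng.uniform(0.1, 3.0)) for _ in range(10)]
for _ in range(30):
    discs = evolve(discs, lambda d: disc_fitness(d, models),
                   lambda d: (d[0] + rng.gauss(0, 0.3),
                              abs(d[1] + rng.gauss(0, 0.3))))
    models = evolve(models, lambda m: model_fitness(m, discs),
                    lambda m: m + rng.gauss(0, 0.3))
```

The point of the sketch is the arms race itself: neither fitness function mentions the task, only judgments and fooling.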
This is after 100 generations. You can see the replica trying to join the group, but to us humans it's very easy to see that it's not doing the right thing. It's like a continental driver in Great Britain moving on the wrong side of the road, as I did quite often. After 200 generations it is getting closer (and after living for eight years in Sheffield, I've also gradually adapted and am more likely to drive on the correct side). And after 1000 generations, you actually cannot tell the difference. The nice thing is that not only do we get fooled when we look at it, but the parameter values, when you compare them to the ground truth, are where they're supposed to be; the ground truth is indicated here with the dashed lines. So these are the parameters of the controller we inferred. Had we used a metric-based evolution, this is what we would have obtained instead. In this paper we have also formally proven that the global optimum of a least-squares metric is not the correct behavior. It's a bit like this: think about an agent that moves 50% of the time to the left and 50% of the time to the right, and let's say whether it moves left or right depends on an input. But if you learn from trajectory data, you do not have access to the input. So what the least-squares method tells you is that you're moving forward 100% of the time, which means an error of 90 degrees, and that is what minimizes the least-squares error. So it's giving you the wrong answer in 100% of the cases. Turing learning instead, without knowing the input, can infer the correct distribution, as you can see over here. The next step was something straightforward: we didn't want to continue in simulation. Rather we asked, can we do this in the real world, without any simulation whatsoever? So here we are learning directly from trajectory data generated by real robots, and we use aggregation as an example.
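The left/right example can be made concrete in a few lines; the ±90-degree turn angles and the sample size are illustrative assumptions.

```python
import random

rng = random.Random(0)
# Observed turn decisions: the agent turns left (+90 deg) or right (-90 deg)
# with equal probability, depending on an input we cannot observe.
data = [rng.choice([-90.0, 90.0]) for _ in range(1000)]

# A least-squares fit of a single turn angle yields the sample mean ...
lsq_estimate = sum(data) / len(data)      # close to 0 deg: "go straight"

# ... which is roughly 90 degrees wrong for every single observation:
errors = [abs(x - lsq_estimate) for x in data]

# Matching the distribution instead, as Turing learning does, recovers the
# 50/50 left/right split without ever seeing the hidden input:
p_left = sum(x > 0 for x in data) / len(data)
```

The least-squares optimum averages away exactly the structure one wanted to infer, while the distribution match keeps it.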
We have an overhead camera in our lab that monitors the replica and also the other agents. The blue agents have some known rule to aggregate, and to the red agent we can upload, via Bluetooth, some candidate model, and then we can generate trajectory data. The whole Turing learning algorithm is running on an external computer. And these are the results: the four parameters we inferred in 10 independent runs. Each time, despite noise on the sensors, noise on the actuators and noise on the tracking system, we accurately inferred the parameters of the system without seeing the inputs, just from the trajectory data. So we are not just inferring the outcome; we are inferring a policy that produces output from input. Some further results in the paper are that we infer the control structure as well; we also infer the morphology to some extent, like the perceptual field of view; and we study what happens if you separate the swarm of replicas from the swarm of agents. Sometimes you don't want to mix things, because the replica may influence the rest of the group. So you could have a swarm of simulated replicas and a swarm of, let's say, real animals, and so forth. In the last part, we now talk about active learning, and that's another way we are generalizing over what's currently possible using GANs. In GANs, you're learning passively: the data you're presented with is just passively taken, you're confronted with this data, and then as a discriminator you have to decide whether it is genuine or not. But some models are very good in some aspects and very bad in others. So training these models using uniformly sampled data is not smart. What you really want is to put the model into a situation where it is most likely to perform poorly.
You want to reveal the weaknesses of the model, and that's exactly what happens in the Turing test, because in the Turing test you don't have a passive human making judgments; you have an interrogator that asks questions and has a dialogue. With this active learning approach, you can learn much faster and more accurately if the behavior you want to learn is complex. Here, the behavior we want to learn is a probabilistic finite state machine. The agent that we want to learn about is responding to stimuli in the environment, and that's important for biologists too, because whatever behavior you observe, it is only meaningful in the context in which it was produced. So the idea is, rather than letting the context drift at random, we control the context. The discriminator in our work is not only making a judgment on whether the data is genuine or not; it also sets the conditions under which the data is produced, in this case turning the light on and off, controlling a stimulus. By doing so, we are able to determine the parameters more accurately. In red, you can see how close we are to the ground truth of the parameters of this finite state machine, like the probabilities of transitions and the particular behaviors in the states. The alternative is not to control the stimulus but to let it drift, using a random process for example. Then you may be aware of the stimulus, but you cannot control it, and you are unable to infer some of these states, because you're unlikely to observe them. It's a bit like when you're interacting with a human and you really want to understand their full repertoire of behavior: you have to put them in lots of different states, right? Maybe you increase the temperature of the room, or something; there are a lot of possibilities. But if you do not do this, you're unlikely to observe these behaviors.
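The advantage of controlling the stimulus can be sketched with a toy probabilistic finite state machine. The two-state structure, the transition probabilities, and the "light rarely on" passive policy are all assumptions for illustration; the point is only that an interrogator who sets the stimulus collects far more observations of the transition it wants to pin down.

```python
import random

class PFSM:
    """Toy probabilistic finite-state machine reacting to a light stimulus.
    Two states; the switching probabilities (the quantities one would want
    to infer) are assumed values for illustration."""
    def __init__(self, p_on=0.8, p_off=0.1, rng=None):
        self.state = 0
        self.p_switch = {True: p_on, False: p_off}
        self.rng = rng or random.Random(0)

    def step(self, light):
        if self.rng.random() < self.p_switch[light]:
            self.state = 1 - self.state
        return self.state

def estimate_switch_prob(light_policy, trials=5000, seed=1):
    """Estimate P(switch | light on) under a given stimulus policy."""
    rng = random.Random(seed)
    agent = PFSM(rng=rng)
    switches = observations = 0
    for _ in range(trials):
        light = light_policy(rng)
        before = agent.state
        agent.step(light)
        if light:
            observations += 1
            switches += (agent.state != before)
    return switches / max(observations, 1), observations

# Passive observer: the stimulus drifts at random and is rarely on.
passive_est, passive_n = estimate_switch_prob(lambda r: r.random() < 0.02)
# Active "interrogator": keep the light on to probe exactly this transition.
active_est, active_n = estimate_switch_prob(lambda r: True)
```

With the drifting stimulus, the light-on transition is observed only a handful of times; the interrogator observes it on every step and estimates it accurately.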
That's why learning via interaction is potentially more powerful, and here is just another example of it. Here we didn't want to learn the behavior of the agent; we wanted to learn its morphology. In a sense, there is a robot that doesn't know where its sensors are. It has zero calibration; it has to find out by itself where its sensors are. It has eight sensors to detect obstacles, but the sensors could all be pointing to the front, or to the back, or be arranged in some other way. And not only does the robot not know where its sensors are, it also does not know where it starts in the environment. So it's a chicken-and-egg problem: you don't know where the obstacle is, and you don't know where the sensor is that senses the obstacle. You can solve this using Turing learning, but only if the discriminator is allowed to interact. In this case, the discriminator is a neural network; it takes sensor readings and has to tell whether they come from a calibrated robot or not. But not only this: the discriminator controls the movements of the robot while the sensor readings are produced. This is what we call closed-loop control, and the discriminator learns to move the robot in a way that maximally helps its discrimination task. That's what we call active learning. And as you can see, similar to before, we obtain better results using this. So, to conclude. We initially talked about this computation-free swarming. That by no means implies there is no computation at all, because first of all, to generate the controllers, we use lots of computation. And you can also argue that there's morphological computation, as we heard about in the last few days. But what our robots don't need is an arithmetic logic unit to perform any computations, because what they do is so simple that you can directly store it using a couple of transistors.
And we have seen robots that are so basic that they can move using binary actuators and binary sensors, and use this framework without any runtime memory and any sophistication, not even communication among them. So this is a possible way to aggressively push towards simpler and simpler robots that maybe one day can be implemented at the micron scale or even below. In the second part, I talked about Turing learning, and this is about learning, like in GANs, without predefined metrics. Because whenever you have a metric, you will get what the metric gives you. So in the Turing test and in Turing learning, you have discriminators that take on the role of finding a suitable metric. And the only thing you care about is indistinguishability: you want the data that is produced by your model to be indistinguishable from the data that is produced by the system. You don't care about how similar they are in any particular sense; it's just indistinguishability. We have applied this Turing learning to behavioral inference, and we have shown, with the interactive learning approach, that for some problems you can learn faster if you are allowed to interact with the process. And I think this is a powerful possibility, because if you are interested in animal behavior, it is vital to understand the context in which the animals are placed, the environment. We can now close the loop by making experiments where computers control the conditions under which animals are observed, and use this information to reason about how the animals perform. And if the models are weak and do not capture well how the animal performs under certain conditions, then certainly the discriminators will pick exactly these conditions, and the models lose out.
And then there's pressure, adaptive value so to speak, for the models in the optimization process to improve on the parts where they are weak, and this becomes a continuous arms race, like you know from coevolution. I would like to mention that there is some more research that I didn't present: one piece on the Brazil nut effect, where we study segregation in robotics using this; some work on cooperative transport versus solitary transport, using rules from solitary transport to see whether you can transport in groups; rodent huddling and thermal regulation; self-assembly; and a few more. The work I presented today was mostly done by my PhD students on the left, particularly the ones in bold face, and some other collaborators like Andreas Kolling and Rudika Zilma from Unilever. And I think this was my last slide. Thank you very much.