Okay, so let me start by saying thank you to the organisers; it's been a really interesting few days so far. Today I'm going to talk to you about decision making and cues, and how collective systems make use of social and personal information. We're going to be mainly focusing on migratory ungulates. Over the last few years I've been working on wildebeest and caribou, which are migratory species, and I'm also going to talk briefly about some modeling work that I've done in the past which is more abstract and more generic.

So we're going to start with caribou migration in the Arctic. Just to set the scene, I'm going to show you a movie here. Hopefully you can see these dots moving around: each dot represents a caribou that's moving through its landscape, and the different colours represent different herds of caribou. The one I want to draw your attention to is these caribou up here on Victoria Island, in the far north of Canada. What you're watching is them making their southern migration. The date is shown here, going from September to December, so as winter approaches they start to move south. You can see that they start to build up on the coastline: they're moving south and aggregating on this coastline, and what they're doing there is waiting for the ice to freeze. When the sea ice freezes, that's when they begin their crossing and continue on their migration. We thought this was a very interesting system, with potential effects of collective behaviour going on. An interesting fact about this migration is that when the population decreased for a number of years the migration ceased, and then when the population recovered it resumed again. We went up there a couple of years ago to try and study these animals, and while we were there there was also a film crew trying to make a documentary, and they
shared some of their footage with us. So this is from the film crew, and it's just to give you an idea of the environment these animals are moving through and the navigational challenges. Obviously there's a very high cost associated with this crossing, and a lot of these animals don't make it across the ice. So these caribou are out there on the sea ice making their journey, and we were very keen to understand how they're interacting and how they're making use of social cues as they make this migration.

We weren't able to actually film them on the ice, but we were able to film them as they're approaching the ice and moving up and down the coastline before they make their crossing. This is footage that we took using a drone. It's just a standard off-the-shelf drone, a 3D Robotics Solo, with a GoPro fixed to it. We fly the drone above the caribou herds and take videos of their movement dynamics. These are quite small herds, and in order to keep the animals within the frame we have to follow them around, so the drone actually has to follow the herds as they move.

In order to process this footage we first had to subtract this motion of the camera. We want to know how the animals are moving, but we want to differentiate that from the fact that the drone is having to move to follow them. So we use the features of the landscape, the different regions where it's dark or light, to work out how the landscape is moving, and from that movement we can work out how the camera is moving. Once we have that we subtract it, and we're left with just the motion of the animals. Next we need to find individuals within this footage, so we use a machine learning algorithm to locate caribou within the video, and this is based on the shape of the animal. Then, once we have the individual caribou, the trajectories are linked, and we've worked out what's animal movement and what's camera
movement, what we're left with is a series of trajectories and positions of each animal within the herd. We're getting all the animals in the herd within the frame of our drone, so with this we can begin to look at how they're responding to one another and how social interactions are governing the structure of the herd.

The first thing we did was to look at where individuals are relative to one another. In this plot, imagine you have the caribou here; this is an arbitrary animal that we choose from the herd, and it's moving in this direction, from left to right. Once we have this animal we just work out where its neighbors are most often, so the heat map shows the most common locations of its neighbors. What we see is these very clearly defined line formations: if you have any caribou, the most likely place for another caribou is either in front or behind, so they're forming these lanes as they move.

Similarly we can look at how well aligned they are, so this is showing you how much structure there is in these herds. The low values, in blue, mean that there's low variance, so they're more aligned. We see there is this line front to back where we see more aligned animals, and either side is much more disordered. So there are these structured lines which the caribou are forming, and either side there's a bit more noise and a bit more disorder within the herd if we look at a focal individual.

And this is the relative heading. This figure is really showing you what you would expect; it's quite obvious how these animals are behaving from the first two images. It shows how much an individual here is pointing towards this individual: if it's one it means it's pointing directly at this individual, if it's minus one it's pointing directly away, and if it's zero it's moving parallel. So we see these aligned
herds which are moving in a consistent direction. That was our first pass at trying to understand these trajectories that we'd collected.

Next we wanted to look at how social interactions were determining the movement decisions they made, so how they were responding to one another when they decided on the directions they were taking. What we use is called a correlated random walk movement model, where we split up these trajectories into discrete time steps and imagine each caribou is effectively making a random draw from some probability distribution. At each discrete time step the animal is deciding which way to turn, and we model that as a random distribution governed by two parameters, rho and lambda, which between them effectively give us the mean and the variance of our probability distribution.

The first parameter, rho, is really telling us how well we're able to predict which way the caribou is going to turn. If we have values close to one, that means our movement steps are very centered around our angle lambda, so we can predict which way they're going to turn with very high accuracy. As we decrease rho our distribution gets more spread out and our predictions are less accurate, and as we get down to low values of rho we really don't know from one time step to the next which way the animal is going to turn; it's effectively a uniform draw between minus pi and pi.

The second parameter is lambda t. This is the angle, the mean of this distribution: what is the most likely angle it's going to turn towards? We model this as being made up of three different factors, so it's going to be a weighted average. There's directional persistence: we know that the animal is more likely to go in the same direction it was moving in
the previous time step. It's also going to be influenced by its environment, so there's going to be lanes, there's going to be obstacles. But it's also going to be influenced by social factors. So we have these three different components that go into our expected turn angle, and to get our resolved angle we just take a weighted average of all three components.

The first component is our persistence, which is just the previous heading. This is a very straightforward one: we put into our model the fact that the most likely heading, without any other factors, is going to be its previous heading.

Next we have the environmental features. Here we don't actually know what the environment is. We've detected the caribou, but from our videos we can't determine the lanes and we can't really see the obstacles, so we don't have a clear idea of the environment the caribou are navigating through. What we do is try and construct this environmental field based on the movement of all the animals in the herd. We take a weighted average over all animals and over all time at that point in space, and the closer an animal is to that point in space the more heavily we weight that animal. So we're building up a picture of the average heading at that point in space, irrespective of the time the caribou move through it. We're building up a sort of mean for the whole herd and then looking at how individuals deviate from that mean. If an animal makes a turn and every other animal in the herd made that same turn, then we consider that to be driven by the environment; we don't consider that to be a social turn or a random turn.

Next we looked at social interactions. We have this angle phi s, which is a function of the headings and the positions of the neighbors that the animal can see. The influence of
these neighbors is going to depend on the relative position, so where the neighbor is relative to our focal individual. In our model we assume these interactions could be metric, they could be topological, or they could be somewhere in the middle. In our modeling framework we include different factors that we think could be important: we add in alignment forces, a repulsion radius, exponentially decaying interaction ranges and visual angles. We put all these into our model and then just try and work out what the data supports most, so what the data tell us are the most important factors.

We use some interaction functions to convert these metric, topological or decaying models into a specific mathematical form that we can put into our inference procedure. We have a weighting function for the metric interaction range: it just gives a weighting of one to an individual if it's within a fixed radius and zero if it's outside that radius, so just a standard metric interaction model. Similarly for our topological interaction range we do the same thing, except instead of having a fixed distance we look at a fixed number of neighbors. We just count the number of neighbors an individual is away from our focal individual and weight it if it's within some threshold. And then finally we have this decaying interaction range model, where the weighting of the individual depends on how far away that individual is from our focal individual. The further away the individual is the less weight it's given, and this decays exponentially. Effectively this is somewhere in between the other two models: distance plays a role, but we don't have the very steep thresholds that the metric model imposes.

We put all these into our inference procedure, we infer the parameters, and then we score the models based on how well they fit our data. So if we take our random walk model as a
baseline, we're comparing against our simplest model, where we don't have any social or environmental forces, and then we just add in more complexity. For each model we calculate the WAIC score, which is just how well the model fits the data plus some penalty term that penalizes the complexity of the model. As the models get more complex it's easier for them to match our data, so we have to impose some penalty for the more complex models. As we go through, we can see that if we include the environment we get a much, much better match to our data, and then again when we start to include social forces we see that there is this huge improvement over when we just look at the environmental forces. For the different models we found that alignment always plays a role: all the models that performed best had alignment, but the best model overall was this exponential decay model which included an alignment force.

From this we can build up a map of the social weighting based on the parameters that best match our data, so this is the optimal weighting for social interactions. We see we have this exponential decay. This region here represents the region where if you're too close then you're not influencing, or the individual is moving away. Then we have this really strong region of interaction directly in front, within this quite small visual cone. As we go further out these individuals do play a role, but their weight is much less. So effectively near neighbors have a far more significant role than further neighbors, but there's no sharp cutoff, so it's not a metric interaction. And also when individuals are far away, so if we have no individuals in this region, this individual out here doesn't have the same force on the focal individual, so it's not quite topological either. It's somewhere in the middle, with this decaying interaction range. So using this
approach we can also look at how individuals vary in their behavior. We classified individuals based on their life stage: from the aerial footage we classified whether it was a calf, a mature adult or a large bull. We can distinguish them from the video footage, so we classify them according to these categories. We don't know if the adults are females or small bulls, but we know that they're not the large dominant bulls. Then we can look at how each class uses social information. As we would expect, we see that the calves' movements are really strongly driven by social cues: they're really strongly tied to their mother, they're not using any other cues, and they're really strongly focused on that one individual in front. The adults are somewhere in between, but the bulls are completely independent: they're giving very low weighting to what others around them are doing and they're much more autonomous.

So this paper just came out a few weeks ago. Also, if you are interested in these types of topics and thinking about collective movement in terms of ecology and larger-scale processes, we do have a special issue out this month which has a lot of excellent papers all about this topic in different species and different study systems.

The next project I'm going to talk to you about is a slightly different topic. We're talking about wildebeest, and not their movement but how we can actually count them using either collective intelligence or artificial intelligence, so using collective systems as a tool to help us understand or study populations in the wild. The question we're interested in is: how many wildebeest are there in the Serengeti National Park? To answer this question, what typically happens is aerial surveys are run. Every couple of years people go out in a small
plane with a camera fixed into the floor of the plane, and they fly transects. They wait till the wildebeest are in a certain configuration: around March and April they're all concentrated in the short-grass plains in the Serengeti, and they're all quite still, not moving around that much. At that time they go out, they fly transects and they take lots of pictures of the wildebeest. As they fly along, every 10 seconds they take a picture pointing straight down, and the end result of the transects is that you get thousands of images which look like this. You probably can't see this, but dotted around within these images there are wildebeest, tiny little dots in these large high-resolution images, and currently somebody has to go through and try and count all the wildebeest in these thousands of images. Once the images are counted, from this sub-sample of the population it's then inferred what the total population is at that point in time.

This has been done since the 1950s. There was a rinderpest epidemic in the wildebeest which was eradicated around this time, and after that the population started to recover. There's also been better protection of the park and better protection of the wildebeest, so the population recovered, and it's been at this fairly steady state of 1.3 million since the 1970s. It is important that we understand how many wildebeest there are in the park. The wildebeest are the dominant species, they far outnumber any other species, and if the population declines it can have a dramatic effect on many other species in the ecosystem. The last time we have an estimate for is 2010, so that's going back eight years. We don't have a current estimate since then, despite the fact that the transects have been run in 2012 and 2015. Why this is important is because things are changing quite rapidly. Within the Serengeti you're seeing a lot of changes, so this
was a paper that came out recently looking at how fencing has just really exploded in the Mara region. This is in the northern region of the Serengeti, in Kenya. There's lots of new fencing going up, and we can see that this is having a significant impact on the wildebeest. So we have this new perturbation to the environment. This is a picture taken by my collaborator Grant Hopcraft two months ago. We have all these changes to the environment, and we can observe that these fences are killing wildebeest, so they are having an impact on the mortality rates of the species. So we really want to know quickly what's going on with the population dynamics.

In order to do this, the question we're asking is: can we replace this manual counting by experts? Right now we have these thousands of images, and we need trained experts, wildlife researchers, to spend two or three weeks going through and counting each image one by one. We have two people counting them, and when they disagree there has to be a reconciliation until we get an accurate count. We've been working on this for a number of years now. I think in 2016 we published a paper using a machine learning approach to try and automate this counting. This is sort of old-style machine learning, where we design the features that we think describe a wildebeest the best and feed them into a classifier. The results we got were reasonable, but they weren't excellent; they weren't usable as a replacement for the professional counters. So this is the machine learning algorithm as we increase the number of training samples it sees, and what I'm showing you here is the error per image, so the root mean squared error per image, how many wildebeest it was off by. For comparison, these are the two human counters. Each of these counters made
a number of errors, and when there was a discrepancy the two counts were reconciled to give us the actual baseline count that we're comparing to. We see that the machine learning algorithm has a much larger error than our two counters, even though the error for the human counters is still quite high. But what we found was that if we look at the total count for the algorithm and for the people, then the algorithm actually was better than both individuals. For each person there was a systematic bias in one direction or the other: we have one professional counter that tended to overestimate the wildebeest in each image and one that tended to underestimate. The machine learning algorithm, although it had a larger error overall, had no systematic bias, so its errors tended to average out and the final total was much closer.

After that result we thought, okay, this is promising, but it's not something we can take back to the research authorities and say we can replace your manual counting; it's too large an error for that. So something else we were trying is using collective intelligence. Instead of having professional counters count these images, we put them on Zooniverse and have lots of people try and count them, take a mean or a median, and see if the crowd can do better than the expert. The 2015 images were put on Zooniverse, and there were around 2,000 volunteers who went on, clicked on these images and helped count the wildebeest. That resulted in almost 170,000 classifications. There were nearly 10,000 different images they had to count, and this represented 800 of the original images. We thought the images were too large for a single volunteer, so they were split up into little sections so that people could go on and count a few wildebeest, and then we aggregate the total and see how closely it matches the expert count.
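
The tile-then-aggregate idea just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `aggregate_image` and the counts are hypothetical and are not taken from the Zooniverse data, and the talk does not say exactly how the per-tile estimates were combined.

```python
from statistics import mean, median

def aggregate_image(tile_counts, reducer):
    """Combine volunteer counts for one original image.

    tile_counts: one list of volunteer counts per tile (section) of the image.
    reducer: function that turns a tile's counts into a single consensus value.
    """
    return sum(reducer(counts) for counts in tile_counts)

# Hypothetical example: one image split into three tiles, five volunteers each.
tiles = [
    [0, 0, 1, 0, 0],       # nearly empty tile, one spurious click
    [4, 5, 5, 6, 30],      # one volunteer wildly overcounts
    [12, 10, 11, 3, 12],   # one volunteer gives up early
]

mean_total = aggregate_image(tiles, mean)      # pulled around by the outliers
median_total = aggregate_image(tiles, median)  # less sensitive to single outliers

print(round(mean_total, 1), median_total)
```

The per-image totals produced this way are what would then be compared against the reconciled expert count.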
So what I'm showing you here is the results from the Zooniverse counts when we just take the mean. We have 15 individuals for each image; we take the mean of their estimates and compare that to the true count, which is the expert count that we think is the accurate number. What we see is quite a lot of scatter. This is on a log plot so you can see more: there are lots of images with few wildebeest, so one, two, up to ten, and there are some images which have up to a thousand wildebeest. But we see that there is this trend, a sort of systematic undercount. This is the cumulative total over images, so for a thousand images we see how well Zooniverse does if we look at the cumulative total, and if we take the final number we see there's quite a large difference between the expert count and the mean of the Zooniverse counts.

So then we had a look at the median. The median is more of a robust measure of what people think, so it's less susceptible to the one crazy individual who thinks there's a million wildebeest, and maybe that would be more accurate. And we see that using the median instead of the mean does help and does improve things: our final estimate is getting closer to the true count. So the median is a better measure for the Zooniverse data.

What's going on here is that we're not really seeing this sort of many-wrongs phenomenon, where independent errors tend to cancel out and we home in on the true mean. What we're seeing is something slightly different. We have these individuals making these counts, but what we see is that the larger the true count, so the more wildebeest there are in the image, the greater the undercount, and the smaller the number of wildebeest in the image, the greater the overcount. So effectively what's happening is that if images don't have any wildebeest in them, people find wildebeest somewhere, and they make
mistakes, and that's what's happening here. But also if the images have a lot of wildebeest then people get bored, they get a bit tired; if there are 300 wildebeest maybe they count 50 and then they give up. So it's not quite these independent errors being averaged out. It's just that some people overcount some of the smaller images, and then people tend to give up on these really dense images of wildebeest.

After looking at that, we looked at another metric: instead of using the median directly, maybe we just throw away the smallest count for each image and then take the median of what remains. We found that this does a much better job, so this pretty much matches our total count perfectly. It's still not quite satisfying, because we don't really know how robust this is. Perhaps it would work on this data set for 2015, but if we try it again maybe we'd have to drop two, or drop the highest count, so we don't really know how well this is working. So it seems to be a bit of a mixed message for the Zooniverse data: the collective intelligence didn't seem to work as we'd hoped it would.

So we also went back to the machine learning approach. As we heard this morning, there's been a lot of progress in machine learning, and especially deep learning, over the last few years. Things are changing very rapidly and the amount of progress that's been made is quite impressive. I like this figure because it explains the difference between deep learning and the previous old-style machine learning. As we saw for the machine learning algorithm, we added in more and more data and it just saturated; it stopped improving. But with deep learning in general we see that as we add in more data it has a greater capacity to learn these things. Especially for object detection in images, as we add in more data we get really high performance
out of these algorithms. We looked at a few different approaches, a few different networks, and then we came across this deep learning algorithm called YOLO, which stands for "you only look once". This is an object detection system that takes in an image, puts it through one pass of the network, and comes out with a load of classifications for each grid cell and a bounding box for each object. As I was saying, things are moving fast. This was originally published in 2016, a year later they updated it and it was much better and could now detect 9,000 different objects, and then last month they released another version. So every year they're releasing a version, and it seems to be getting a lot better with each version, especially for the wildebeest project.

This is a quote from the 2018 paper, which came out I think in March or April. What they're saying here is that the previous versions struggled with small objects. A wildebeest is a small object: it's a very large image where we have this tiny wildebeest located somewhere in there. But with this new version the performance has flipped. YOLO used to struggle with small objects and was good with medium or large ones, but this new version, YOLO version 3, is very good for small objects. We had been working with YOLO version 2 and getting okay results, but not great, so we saw this and said great, we'll switch. Now we've used this, and it is true that we get much better performance.

We do make some small modifications. The original object detection framework has these anchor boxes, which are effectively templates for the objects it looks for. It can detect 9,000 different objects, so these anchor boxes are all different shapes and sizes. For our problem we have a much simpler task: we have wildebeest and that's all we care about. So we don't need as many different shapes for our objects; pretty much every wildebeest is the same shape, it's just a slightly
different orientation. So we reduce the number of boxes down to three. You can think of this as a box for a wildebeest pointing diagonally, one for pointing up, and one for pointing left to right, and these are the three shapes our algorithm is looking for.

The other issue is that the original YOLO version 3 is used to dealing with a lot of objects in images, so it's used to seeing images that have a lot of objects in them that it has to detect. Our problem is slightly different in that we have a lot of empty space. We have a lot of nothing that we just want to ignore, then occasionally we have these wildebeest dotted around, and sometimes we have these very dense images. So we had to tune the loss function so that we suppress false positives; we're tuning our algorithm so that it doesn't try and find objects where there aren't any.

We split up the 2015 data set: we took 500 images and used them for training, and then we took a thousand images as our test set to count and see how that matched our experts. The training takes around 48 hours, and this is with transfer learning. That means we take a network that's already been trained, so the network can already identify lots of different types of objects. It's learned how to find objects generically, so it has this ability to find what makes an object, and what we're doing is just tuning it for this specific object, for the wildebeest. So the training is a lot shorter than we might expect if we trained from scratch. Once we've trained it, if you want to count a thousand images it takes around two hours using a GPU. In 2015 we had around 3,000 images, and in 2018 we have around 7,000 images because the wildebeest were more dispersed, so we can do the full count within a day; overnight we could do this full count.
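
As a back-of-the-envelope check on the "within a day" figure, the timings quoted above can be scaled up directly. This sketch assumes that inference time grows linearly with the number of images and that a single GPU is used; both are assumptions, and the helper name `inference_hours` is hypothetical.

```python
# Figures quoted in the talk: roughly 2 hours of GPU time per 1,000 images.
IMAGES_PER_BATCH = 1000
HOURS_PER_BATCH = 2.0

def inference_hours(n_images):
    """Estimated counting time, assuming time scales linearly with image count."""
    return n_images * HOURS_PER_BATCH / IMAGES_PER_BATCH

# Approximate survey sizes quoted in the talk.
surveys = {2015: 3000, 2018: 7000}

for year, n_images in surveys.items():
    print(year, inference_hours(n_images), "hours")
# 2015 6.0 hours
# 2018 14.0 hours
```

Under these assumptions even the larger 2018 survey fits in an overnight run, excluding the one-off training time of around 48 hours.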
And how do the results look? This is a sort of standard image. We have the wildebeest dotted around, we also have some zebra in here and some landscape features, and we see some little white dots which are just scarring on the landscape. Then we plug it into our algorithm, and we see that it's pretty much picked out every wildebeest in the image. It has made the odd mistake: I think there's one or two zebras it thinks are wildebeest, but it has ignored a lot of the zebra, so it's starting to learn the difference between a wildebeest and a zebra. I think it's still a little bit confused by the zebra foals, but it's still getting very high accuracy.

These types of images really gave us a problem with the old algorithm. There are these rock formations in the Serengeti, kopjes, where we just have loads of tiny little rocks, and our old algorithm would see this and count thousands of wildebeest; it would just see wildebeest everywhere. Somehow in these little contours, these small little rocks, it would find the shape of a wildebeest it was looking for. But the new YOLO algorithm sees this and gives us back nothing. So it's not that it can only deal with grasslands; it can deal with these structures too, and it's not just finding objects, it's finding specifically wildebeest.

So this is what the error looks like, and we get a much tighter error. This is the true count against the YOLO count, so we want our points to be on this line, and we're seeing that the YOLO count really matches the expert count much better than the Zooniverse count did, at least for the mean and the median, and our total count is very accurate as well.

So just to sum up this project, we're going to look at the different error rates we get from the different approaches that we've tried over the last number of years. So our first machine
learning error rate is here: our standard machine learning approach gives us a root mean squared error of more than 30. Here our experts still have quite a large error. These are in blue because they're from a different year, 2009, which had a lower resolution, so we shouldn't be too critical of our experts. Then here we have the Zooniverse data, so our collective intelligence actually doesn't perform that well. And then down to our deep learning, which is giving us a really significant improvement: it's much faster, it's more accurate than an individual expert, and it's also more accurate than the crowd.

Okay, so I just want to switch gears slightly and talk for a few minutes... okay, maybe I'll just skip this section then. I just want to talk briefly about collective behavior in wildebeest. I don't actually have any scientific analysis to present here, but I just want to show you some movies of the types of things we're looking at and what we're observing in these large herds of wildebeest. Again we're filming wildebeest, as with the caribou, with drones, taking them out to the Serengeti and getting these nice shots of the landscape but also of the wildebeest herds. We had some issues with permits over the last few years, so we've been using balloons as well recently; we fly these helium balloons over the herds. This is some footage from the balloon, using a near-infrared camera because we were trying to pull out the vegetation and see how they were responding to that. So we're interested in the formations of these herds and the fronts as they're grazing.

I also want to show you some drone footage. This is a large herd that's quite nervous; I think there are a few predators in the area, so they're quite tense, and you just see this huge wave of
agitation moving through the herd. It starts up here, so hopefully you can see this, and you just see that the information propagates throughout the whole herd. I have done some very rough tracking of this just to try to make it clearer what's going on. So you see the herd; this is the wave beginning up at the top, and it moves through. I think what's interesting is you do see these very strange little instabilities where the density just collapses. You have this sort of bubble formation or bubble popping type effect, where as soon as the density starts to get a bit low, all the wildebeest leave that area. So there's this instability there: you see these little vacuoles inside the herd, and then we see these quite dense regions at the front. If anyone has any suggestions about what's causing this, I'd be very interested to hear them. And the last one: I just want to show you this because it's a river crossing, and it's very striking just how much like a physical system these herds look. We're seeing these herds as they're crossing this river. They're coming in quite large, so they're quite spread out, and they just close in and cross the river. As they're making this crossing, every now and again there's some disturbance which interrupts their movement, and then the whole thing stops. They're happily moving across, maybe some individual gets a bit scared and stops, and then you see that the whole thing just stops for a little while before picking up again. So we have this fluid-like behaviour, but also this intermittency where they just seem to get jammed up, and then that propagates backwards. Like here, we see that the crossing has stopped for some reason, and after a certain amount of time it just resumes and off it goes. This just keeps happening as they make this crossing.

Okay, so I just want to finish with a list of collaborators. Andrew Badal and Leon DeBell were up there in
the Arctic with me, and my collaborators on the wildebeest project are Grant and Lacey. I didn't get to talk about the work with Ian and Simon Levin, but yeah, thanks for your attention.