I'll be talking a little bit more about the technicalities of how deep learning works, just to show you an example, but we're also going to connect to decision making and to sound, so I hope we can draw some connections with the talks we've just seen. I'm going to be talking about birdsong and AI, and how we create high-resolution representations of birdsong; I'll get into the detail of that and tell you what I mean later. Maria has already given an introduction, thank you very much, so I don't need to tell you that I'm a naturalist as well as being at Tilburg, and she also highlighted the biodiversity issue.

The World Economic Forum, in its review of global risks, has only recently placed biodiversity loss into one of the top categories, alongside many of the other crises we think about. Biodiversity loss is incredibly important economically as well as for many other reasons. If you look at this graph, you can see that farmland birds have declined massively over the past 30 to 40 years, and that's because of land-use changes. This is just one example of how, at a very large scale, across the whole of Europe and in fact the whole world, we have a problem to solve. But for us, and for me especially, we also have an opportunity, because audio is actually the best way to monitor many of the species we care about: birds, insects, whales, bats. These are all important species for biodiversity reasons, and they can all be detected by sound.

Here's a visualization, just to give you a flavour of what the data is like when you're working with sound. We can see a spectrogram here, in fact four spectrograms, showing time on the x-axis and frequency on the y-axis, from low to high. In the bottom left we have the dawn chorus, the birds singing in the morning, and you can see there's some complexity in there. That's good news for us, because complexity means there's information.
There's also an increasing number of acoustic sensing devices: dedicated devices that people will mount on a tree and leave in a forest somewhere, but also mobile phones, where we can take citizen-science approaches with smartphones. Warblr is an app that I've been involved in over the past few years. So there's a lot of data, and a lot of information in that data. Through my role here at Tilburg and at Naturalis, we're involved in a Dutch project called ARISE, a very big project in which we want to monitor all Dutch species, so we're building a large infrastructure for monitoring biodiversity in the Netherlands.

What I want to talk about today is a little more of the technicalities of how we do that, just to show you how this works. You might have seen diagrams like this before. It shows a sound file, let's say a five-second audio recording, going into a deep learning system, which is often multiple layers of processing (that's what we're showing here), to produce some kind of class label. For animal sound that might be a species label: is it a merel (blackbird) or a roodborst (robin)? This is in fact the kind of problem most commonly addressed in bioacoustics.
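To make this pipeline concrete, here is a minimal sketch, in NumPy, of the diagram just described: a spectrogram passes through a few layers of processing and comes out as a species label. The layer sizes, random weights, and species names are all invented for illustration; a real bioacoustic classifier would be a trained convolutional network, not this toy.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Pretend input: a five-second clip as a flattened spectrogram
# (128 frequency bins x 100 time frames).
spectrogram = rng.random(128 * 100)

# Two hidden "layers of processing", then a final layer mapping to species scores.
W1, b1 = rng.standard_normal((64, 128 * 100)) * 0.01, np.zeros(64)
W2, b2 = rng.standard_normal((32, 64)) * 0.1, np.zeros(32)
W_out, b_out = rng.standard_normal((3, 32)) * 0.1, np.zeros(3)

species = ["merel", "roodborst", "zebra finch"]

h1 = relu(W1 @ spectrogram + b1)
h2 = relu(W2 @ h1 + b2)               # the penultimate layer (we return to this later)
probs = softmax(W_out @ h2 + b_out)   # one probability per species label

print(species[int(np.argmax(probs))])
```

The point of the sketch is only the shape of the computation: audio in, several layers of transformation, a class label out.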
Now, I'm showing you this for a specific reason. When we have these layers of processing, we start with the sound and process it until the final layer produces a species label. But in fact we can use this for other purposes. If we take off that final layer that produces the class label and look inside, we find that the audio has been transformed into some kind of vector coordinate: a vector in the mathematical sense, a point on a map. We've turned the audio into a location, and there's still a lot of information in there, because part of what's useful about deep learning is that it extracts very rich representations which are not so reductive; they contain quite a lot of information. We call this the embedding. That's now a standard term, which originated in text processing, but we just say "embedding" for one of these maps that we produce using deep learning or another technique. And we can use this embedding for lots of other purposes, not just the original one we trained it for.

So here's a question: can we recognize individual birds? If you look at the screen, can you recognize that bird? I'm not actually asking whether you can recognize the species; I'm asking whether you can recognize the individual. Is it the same individual that came and landed on your loudspeaker yesterday? These are the kinds of problems we would like to solve, because we can use them to count a population. In a particular forest, for example, we can estimate the population fairly precisely without having to disturb the animals at all. So there's the species level and there's the individual level, and those two categories form a kind of hierarchy. What my PhD student Inês Nolasco is currently working on is using these embeddings for hierarchical classification, and that's what we're illustrating there: it shows that
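The idea of "taking off the final layer" can be sketched as follows: run the same kind of network but stop at the penultimate layer, and treat the hidden activations as a coordinate. Clips from the same individual should then land close together in that space. The network, weights, and input sizes below are invented for illustration, and the comparison uses plain cosine similarity as one common choice of distance.

```python
import numpy as np

rng = np.random.default_rng(1)

def relu(x):
    return np.maximum(0.0, x)

# Hidden layers only: the species-label layer has been removed.
W1, b1 = rng.standard_normal((64, 256)) * 0.05, np.zeros(64)
W2, b2 = rng.standard_normal((32, 64)) * 0.1, np.zeros(32)

def embed(clip):
    """Map an audio clip (here: a flattened spectrogram) to a 32-d embedding."""
    return relu(W2 @ relu(W1 @ clip + b1) + b2)

def cosine_similarity(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

# Two recordings of "the same individual" (one is a slightly perturbed copy)
# and one recording from a different bird.
clip_a = rng.random(256)
clip_a_again = clip_a + 0.01 * rng.standard_normal(256)
clip_b = rng.random(256)

same = cosine_similarity(embed(clip_a), embed(clip_a_again))
diff = cosine_similarity(embed(clip_a), embed(clip_b))
print(f"same individual: {same:.3f}, different bird: {diff:.3f}")
```

Once audio lives in such a space, tasks the network was never trained for, like individual recognition or hierarchical grouping, become geometric questions about distances between points.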
in this embedding space there is a kind of semantic representation, which we can use for high-resolution analysis of animal sound. Now, this is derived from what you might call human-annotated data: we have annotated the taxonomic class, we have annotated whether it's the same individual. So we're starting from a pretty human-centric perception of what is important here. But does this bird agree with us?

This is a different bird, a zebra finch, and the reason it's different is that this is a kind of bird that various research groups quite often work with in the lab. So we could ask this zebra finch whether two sounds are similar or different, and that's what we've done in a project that has been running for the past two years. We have zebra finches in the lab, and we designed a listening test asking whether one sound is similar to another. What you see in the picture is that the birds use a bird feeder to get some extra seed, and when they do that, they have to solve a little task: is the correct sound on the left or on the right? Let me show you in a little more detail. Here's the device taken out of the aviary, in diagrammatic form. During training, we play sound A and the birds should go to the left; we play sound B and they should go to the right. Then we play mystery sound X, and now the question is: which side should they go to?

What we can do through that, in the bird lab, is collect data about decisions, call them preference-based or similarity-based decisions, about which sound is similar to which other sound, and then we treat that as data. If you want to know whether sound X is similar to sound A, well, as before, we put it through a neural network and produce a coordinate in this embedding. What we're going to do, though, is train the network not using some classification objective, which is what I told you about before;
we're actually going to use similarity: whether the bird told us that X was more similar to A or more similar to B. This is implemented in what we call the loss function, which is just the part of how we optimize a deep learning algorithm. Instead of a loss function about categorizing things, here we have a loss function which says: if A is what the birds chose in this example, then we want to push X closer to A and push it further away from B. So during the training of the algorithm we're shaping the space so that it better reflects the birds' perception.

This takes a lot of work, primarily because it takes months and months to gather enough decisions from the birds to train a deep learning algorithm. But the results, which are in press at the moment, are that our neural net agrees with the birds' choices much more often than other audio representations do. The green line here is a kind of gold standard, matching the accuracy that the birds themselves achieve, and our method outperforms all the signal-derived ways of analyzing the sound. So what I'm showing you there is a little more of how we get more and more precise, driven partly by perceptual and cognitive inspiration, and partly by the data-analysis motivation of wanting to monitor biodiversity at higher resolution.

To give credit: as I mentioned, Inês Nolasco is working on the hierarchical embeddings, and the perceptual embeddings are a collaboration with many people, but Veronica Morfi, my postdoc, was working on the deep learning for the perceptual embeddings. All of this comes together to help us understand animal sound and animal behavior, so it's for scientific purposes as well as for conservation purposes. And sound is fascinating, so it's really interesting work to be doing. Thank you very much.
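The "push X closer to A, further from B" objective can be sketched as a triplet-style hinge loss. This is an illustrative stand-in under my own assumptions, not the exact loss used in the project: it simply penalizes the network whenever X's embedding is not closer to the bird-chosen sound A than to the rejected sound B by some margin.

```python
import numpy as np

def triplet_loss(x, a, b, margin=1.0):
    """Hinge loss on embeddings: we want dist(x, a) + margin <= dist(x, b)."""
    d_chosen = np.linalg.norm(x - a)   # distance to the sound the bird chose
    d_other = np.linalg.norm(x - b)    # distance to the rejected sound
    return max(0.0, d_chosen - d_other + margin)

# Toy 2-d embeddings. Here X already sits much nearer A than B,
# so the margin constraint is satisfied and the loss is zero.
x = np.array([0.0, 0.0])
a = np.array([0.5, 0.0])   # the bird said X sounds like A
b = np.array([3.0, 0.0])
print(triplet_loss(x, a, b))   # 0.0

# If the bird had instead chosen B, the same geometry violates the
# constraint, the loss is positive, and gradient descent on it would
# reshape the space by pulling X toward B.
print(triplet_loss(x, b, a))   # 3.5
```

Averaged over many bird decisions, minimizing a loss like this is what "shaping the space to reflect the birds' perception" means in practice.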