Yeah, thank you very much. Nice to be here. It's my second EuroPython. Last year I was here just as a visitor, and this year I thought: let's tell something about the work we do at the Netherlands Forensic Institute.

The summary of my talk is basically this: I'm going to tell you how we at the Netherlands Forensic Institute used open source Python software (that's why we're here) to help prevent murders in a major criminal case by training a deep learning model, a language model I should say, using active learning. This is also basically the structure of my talk, so let's start at the top.

What is the Netherlands Forensic Institute? Well, you might have guessed it's in the Netherlands. It's in The Hague, to be specific. This is the building, and if you were to go in through the entrance you would see people like this. The NFI, the Netherlands Forensic Institute, is basically the crime scene investigation, the CSI, of the Netherlands. If you go in, you would see people analyzing DNA samples, drugs, bullets, pistols, you name it. I'm from the digital department: we have a big digital forensics department where we deal with hardware and software as well.

A very important detail, which differs between countries, is that in the Netherlands the NFI is not part of the police. So I'm not a police officer. I'm just a simple government employee, and I work for the Ministry of Justice. The forensic institute is responsible for analyzing traces and evaluating evidence at the request of the police, public prosecutors, or even defense lawyers in criminal cases. We use our subject matter expertise to help with that, and also to develop new techniques that help ensure justice in the Netherlands.

One of these new techniques: you might be familiar with the buzzword data science. Well, I'm a data scientist and I work at the NFI, so I call myself a forensic data scientist. This is what we do. We investigate.
We use the newest methods in machine learning and data science for the betterment of the Netherlands. So that's who I am, that's who we are.

Let's talk a bit about open source at the NFI. We love open source. I put up some of the packages we use, our favorites, and you can see it's mostly the data science stack plus some specialized things. We're very happy with that and it works great.

There's a bit of a problem, though, because when people like me say "we love open source", what we basically mean is that we love using open source. These are really nice libraries. And yes, sometimes we find a bug and we submit a bug report or, if you're very lucky, a patch. But we can do better, and I think we're trying to do better. Maybe you've seen these stickers around. This is a campaign, not by the government but by an organization, that says: if you use public money to build something, you should publish the code. Well, I work at the government, and it's not that easy: you can't just open source anything you write. But we can try to do better, right?

So at the NFI, not too long ago, we made a GitHub page. Right now we have 23 followers, so I hope that at the end of this talk we have some more. The link is here, and I'll put it up at the end of the talk as well. Just to show you a few of the things we've put there: if you happen to own a Tesla, a Tesla is basically a computer on wheels. It has cameras.
It has logs. We have a package called Tesla logs that does what you think it does: it parses and analyzes logs and puts them together with the video streams. Does anyone here have a Tesla? Okay, so you acquire a Tesla, and then... that's what we do. Sometimes we have a government expense like "we need a Tesla for this case". Okay, I need to go faster.

We also put some data there. These are experiments that have been done with iPhones to analyze the data that comes out of the Health app. Basically experiments like walking, running and sitting: what kind of data does the iPhone generate, and what can you do with it? It's open source data. If you have a nice project, you can use this.

Is there a statistician in the room? No? Okay. Well, this is very nice: a likelihood ratio library. If you're into Bayesian statistics, the evaluation of evidence, and calculating and calibrating models, we have a library for that. It's called the likelihood ratio library. And finally, a colleague of mine made Confidence. It has nothing to do with forensics: it's a library that makes it very easy to load configuration, for example from the environment, and use it in a nice and structured way in your Python programs.

Okay, back on topic, because you came here for the murders, right? I hope you can appreciate that I cannot tell you everything about this major criminal case, but I can tell you the highlights. It is about EncroChat. Has anyone here ever heard of EncroChat? Okay, quite a few people. For the people who don't know it: it is something called a crypto phone. There are many types of crypto phones, and EncroChat is one of them.
It is actually a phone and an application. According to Europol, it was widely used by organized crime groups, and I do mean the serious organized crime groups that deal in drugs and in murders. It was operational from somewhere in 2016 to 2020, so most of what I'm going to tell you happened at the end of its operational life, in 2020. What they sold were modified Android and BlackBerry devices. You could buy them from EncroChat, the company, and you could only communicate between those EncroChat devices. They were specialized devices to communicate in a secure and private way, which by itself is not illegal. The price was a thousand euros for the phone and 1,500 euros per six months of subscription, so you were paying a lot for your privacy.

You can imagine, in broad strokes, that these phones showed up around heavy crime: suspects of murders, assassinations, things like that, in a lot of places. A lot in the Netherlands, but also in Spain, Germany and the UK. A very complicated long story short: the court ordered that the police could try to intercept the messages from this specific service for a limited time, because it was used so much for criminal purposes. These people sent a lot of messages, tens of millions of messages, and for a period of time the police were able to have a live tap on that: to get the messages in real time as they were sent.

This is where we come in, where the data science comes in. The court gives this rather broad permission and expects something in return: you, the police, are able to read these messages, and some of them will discuss preparations of heavy crime that could hurt people. If you have that, you have to make sure you can act on it to protect people's lives where possible. So they came to us with this question: which of these millions of messages discuss what we call in Dutch liquidations? So I'm not talking about murders like
crimes of passion. I'm talking about premeditated murders: people of one gang assassinating people of another gang.

The problem is, and it's a bit weird to say it this way, that the messages that discuss these liquidations are very rare. That's a good thing, but we need to find them, and we would like to know fast. It's no use if a murder is being prepared for tomorrow and we only know it a week from now. That doesn't help anyone. So they came to us, the data science team: can you help with this?

So let's go to the data science part. We have a low base rate: the fraction of messages that talk about the preparation of murders is very low. Let's take as an assumption one in ten thousand. I tried to put on a slide what that looks like. This is ten thousand times the word "no". Well, actually not: there is one "yes" in there, written so that it doesn't break the pattern. I think you have to be in the front row to be able to spot it. Can anyone spot it? No, I didn't think so. It's here; you have to trust me. So ten thousand is a lot, and here you just have "no"s. But you have to imagine that each of these "no"s is a message talking about something in context. You have to read it, try to understand the context, and evaluate it, and it takes a long, long time to get to the "yes". That is simply not feasible to do by hand. The police are really not afraid to put a lot of manpower on things, but this is just too much.

I'm going to show you some examples. Of course they have to be fictionalized; I cannot show you the real ones. And they have to be in English because I'm doing this talk in English, while the messages we're talking about were mostly in Dutch. So something might be lost in a double translation. But here are some examples. I'm going to read the first one. Bro.
He went to visit him and gave that order. He's going to lure him and then they will shoot him. There is nothing cryptic about this, right? This is stating the obvious. And it is not unrealistic, because remember, these people were communicating under the assumption that they were completely safe. So they were talking completely in the open about criminal things.

Other examples: "He has to disappear, I'm done with him", with "disappear" misspelled. "No worries dude, he's going to put that nephew to sleep." Something is lost in translation with "nephew" here, but putting someone to sleep basically means killing them. And stuff you don't come up with yourself, like "will torpedo the mascot". These are the types of messages that you get.

The task of the humans, and the mission of the data science analysis we're going to do, is to create a model that makes the distinction between the two: are they talking about planned murders or not?

The content length in this service was typically somewhere between chat messages and email. They were using PGP under the hood, so it was kind of like an email service where people send short messages.
So it's not your typical chat message. There is extensive use of slang, something we call street language: criminal slang, the slang of certain subcultures. It's not your typical Wikipedia text, to put it mildly, which is what all our models are trained on. And then there are the spelling and grammar mistakes.

One strategy, as a baseline, is: okay, let's create a list of words that occur in these messages and just execute a search query. Then you will get some of the results. The problems are quite obvious. You will get too many false positives: "sleep" will occur in contexts that have nothing to do with murder. And you get false negatives: I would not have thought to put the word "torpedo" on a list of keywords. That's why we are going for a deep learning model. We want to use something that has a bit more understanding of language, that can understand the nuances.

So let's go forensic data science. You can translate this into quite a simple data science task, right? This is binary classification. You have a text and you want to classify it: yes or no, does this talk about a threat to life, which is what we call the general category. You can give it a score. What you need is a suitable language model and labeled examples to train on.

Well, this was 2020. So what language model do you think we used? Correct, BERT. It was the thing to use, and it still is for many use cases. For people who don't know BERT: it's a language model trained by Google that you can fine-tune to do certain tasks. It comes pre-trained on language, with something you could call language understanding, and you can use that in your fine-tuning. So, for example, it would know that "torpedo" has something to do with destroying, that it's a negative thing, even though you never taught it that in the context of this task. So it looks like this: if you create a fine-tuned BERT, you can put a message in there.
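As an aside, the keyword baseline described above, and why it falls short, can be sketched in a few lines of Python. This is only an illustration: the keyword list and the `keyword_hit` helper are hypothetical, and the messages are the fictionalized examples from this talk, not real data.

```python
import re

# Hypothetical keyword list; the list actually used in the case is not public.
KEYWORDS = {"sleep", "shoot", "kill", "disappear"}


def keyword_hit(message: str, keywords: set[str] = KEYWORDS) -> bool:
    """Return True if any keyword occurs as a whole word in the message."""
    tokens = re.findall(r"[a-z]+", message.lower())
    return any(token in keywords for token in tokens)


# Fictionalized examples from the talk, paired with the intended label.
examples = [
    ("He's going to put that nephew to sleep", True),
    ("I really want to go to sleep", False),  # the list flags it: false positive
    ("Will torpedo the mascot", True),        # the list misses it: false negative
]

for text, is_threat in examples:
    print(f"hit={keyword_hit(text)!s:5} label={is_threat!s:5} {text}")
```

The second and third examples are exactly the two failure modes mentioned: a keyword in an innocent context, and a threat phrased with a word nobody thought to put on the list.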
You put in "I'm going to kill him" and it's going to give you a score. I don't know if everyone can read it, but it's 98% threat to life and 2% not threat to life. This is the idea: simple binary classification.

You need a model to start with. Well, you can just get one from Hugging Face models. You can get bert-base-multilingual-cased. I actually forgot how many languages it can do, but Dutch is one of them, so that's nice to use. Or you can go more specific, specifically for Dutch: there is a model called BERTje, bert-base-dutch-cased, a model trained by the University of Groningen specifically on Dutch texts. So getting the model is quite easy, and if you want to, you can even fine-tune this model unsupervised on your own type of data to give it even more understanding.

So now we just need labels, right? We need to find some of the messages that discuss the preparation of murders, which are pretty rare. That was exactly the question I started with. We have this wall of text and we need to find the "yes"es. The only thing is that you don't need to analyze everything. You just need a sample big enough to train the model. But still, they're very rare.

That brings me to what we did: an active learning loop. So what is active learning? It is a loop. You start with some small labeled set. You train your model, your BERT model in this case, to predict whether or not a message discusses a threat to life. Based on that first sample it's going to learn something. It's not going to be very good, but better than random. Then you take the output of the model: you train the model for a bit, you run it over your whole data set, it gives scores to everything, and you use those scores as a prioritization for labeling. You have humans look at the output of your model and judge it: this was wrong.
This was correct, that was wrong, and that way you generate more labeled data.

So you need a start. You can start with just a keyword list. Take some of the words, like "sleep", as I said, and you can get some of the true labels you want to look for. Then you add messages at random. If you want to bootstrap, you can just assume, with a low base rate, that if you take something at random it's going to be a negative example. You might get unlucky once, but on average it's good enough to start with. Then you have this set, you train your model, you have its output, and you have humans look at it and label it: yes, no, correct.

There's a problem with this strategy: it's very easy to train yourself into a circle of only what the model already knows. It will give a high score, you give it to the labelers, and the label will say yes. So in active learning you need a strategy to combat that, and that's basically a sampling strategy. You can do very complex sampling strategies, but it comes down to simple ones. For the next loop, you can sample from your data set based on the scores. Let's highlight this one: the prediction, the model's estimate that this message is a threat to life, is between 0.4 and 0.6.
That basically translates to "I don't know". The model is not able to infer anything from these messages. So this would be a nice message to label, to tell it the truth: this is actually discussing a threat to life, or this is not discussing a threat to life. You can do the same with lower predictions and higher predictions. But what I think is most important in these kinds of contexts is that you keep adding a random sample from your data set. If your model trains itself into a corner, it can get from those samples examples of the things it is doing wrong or right.

So let's look at some examples of what you have to label, again fictionalized and translated. I have three examples here. "Bro, you're totally right. We will get him." Is this a threat to life? It's certainly not easy, right? It could be about football, that they're going to have a football match. You have to imagine that people are chatting a lot on this service. Even though it's only EncroChat phones talking to other EncroChat phones, some of these are actual communities, talking about other things than planning murders. So this one is hard to say; maybe you can get it from context. "Got my AK." I mean, yeah, is he going to use it on someone tomorrow or not? That's the question. This is a hard problem. And "Yes, his wife is going to die": that could be a direct threat, or someone could be sick and it can be just a factual statement. There are many, many more of these examples that make it hard.

Labeling is harder than you think, is my message, and in my experience no one in data science ever talks about this. It's always: now you just acquire some labels, and then we're going to talk about the model, fine-tuning the model, whether you need dropout or not. But the labels are very important.
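The sampling strategy described above (label what the model is unsure about, and always add a random subset) could be sketched roughly as follows. This is a minimal illustration, not the NFI implementation: the function name `select_for_labeling`, its parameters and the example scores are assumptions made for this sketch; only the 0.4 to 0.6 band comes from the talk.

```python
import random


def select_for_labeling(scores, n_uncertain, n_random, low=0.4, high=0.6, seed=0):
    """Pick message ids to label next.

    scores maps a message id to the model score in [0, 1].
    Returns the most uncertain ids first, followed by a random subset
    drawn from everything that was not already picked.
    """
    rng = random.Random(seed)

    # Messages where the model effectively says "I don't know".
    uncertain = [mid for mid, s in scores.items() if low <= s <= high]
    uncertain.sort(key=lambda mid: abs(scores[mid] - 0.5))
    picked = uncertain[:n_uncertain]

    # Random sampling guards against the model training itself into a corner.
    picked_set = set(picked)
    remaining = [mid for mid in scores if mid not in picked_set]
    picked += rng.sample(remaining, min(n_random, len(remaining)))
    return picked


# Made-up scores for five messages.
scores = {"a": 0.02, "b": 0.45, "c": 0.55, "d": 0.97, "e": 0.50}
print(select_for_labeling(scores, n_uncertain=2, n_random=2))
```

The same function could be extended with bands for low and high predictions, as mentioned above; the random part is the piece that should never be dropped.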
You need to think about that. Luckily, we had some help: police domain experts were available for labeling. These police experts know more of the slang and the context, and they can give a much better judgment of what we're talking about than I can as a forensic data scientist.

So these are some of the results. You train this model in a loop, and what I highlighted here are some messages that have "sleep" in the content. I did modify them a bit, but you get the gist. You can see there are sentences like "Did they sleep over, Baru and Yiza?" (I'm assuming these are names) and "Baru is the news because I really want to go to sleep". These have the keyword "sleep" that is used in threat to life messages, but they get a low score. Presumably the model learned the context.

Then there are some other ones, like "Brother, Tilo (I guess that's a person) has put Baldi (I guess that's a bald person) to sleep", with a lot of exclamation marks. So this is actually someone who got murdered yesterday, and they are communicating about that. And "He has to sleep first thing, they're going to hunt tomorrow": this is also talking about preparation.

But then the other one, maybe the smallest one, says "Go to his house fast and get that phone. You keep that, because people who are only sleeping are no use to me." That is actually not a threat to life. It's just, yeah, "no use to me". But this gets a somewhat higher score. That's not perfect. It's not like 0.9, but yeah, a model is never perfect.
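Putting the pieces together, the active learning loop as a whole (train, score everything, have humans label the prioritized messages plus a random one, repeat) might look like the sketch below. Note the assumptions: a toy word-count scorer stands in for the fine-tuned BERT model so that the example runs without any dependencies, `ask_human` is a hypothetical callback representing the police domain experts, and the messages are fictionalized.

```python
import random
from collections import Counter


def train(labeled):
    """Toy stand-in for fine-tuning: count words in positive and negative messages."""
    pos, neg = Counter(), Counter()
    for text, label in labeled.items():
        (pos if label else neg).update(set(text.lower().split()))
    return pos, neg


def score(model, text):
    """Average smoothed fraction of positive messages among those containing each word."""
    pos, neg = model
    words = text.lower().split()
    if not words:
        return 0.5
    return sum((pos[w] + 1) / (pos[w] + neg[w] + 2) for w in words) / len(words)


def active_learning(messages, seed_labels, ask_human, rounds=3, batch=2, seed=0):
    """Run a few rounds of: train, score, prioritize, label."""
    rng = random.Random(seed)
    labeled = dict(seed_labels)
    for _ in range(rounds):
        model = train(labeled)
        unlabeled = [m for m in messages if m not in labeled]
        if not unlabeled:
            break
        # Highest scores first: these go to the human labelers.
        ranked = sorted(unlabeled, key=lambda m: score(model, m), reverse=True)
        to_label, rest = ranked[:batch], ranked[batch:]
        # Always add a random message as well, to avoid a self-confirming loop.
        if rest:
            to_label.append(rng.choice(rest))
        for m in to_label:
            labeled[m] = ask_human(m)
    return train(labeled), labeled


# Fictionalized messages; the dict plays the role of the human labelers.
truth = {
    "he will put him to sleep": True,
    "i want to go to sleep": False,
    "will torpedo the mascot": True,
    "see you at the match": False,
}
seed_labels = {"he will put him to sleep": True, "see you at the match": False}
model, labeled = active_learning(list(truth), seed_labels, truth.__getitem__, rounds=1, batch=1)
print(len(labeled), "messages labeled")
```

In a real setup the `train` and `score` functions would be replaced by fine-tuning and running the language model; the structure of the loop stays the same.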
So what do you do? We've trained this model and we apply it to all the messages; you apply it live to the messages that stream in. Then you basically give it to the police and you say: okay, you can sort these messages by these scores. Maybe you have some other filters if you want. If you're going to read and investigate these messages, maybe start here, because these may save someone's life.

Precautions about active learning. I already said most of this. Always keep random sampling: for any system that is self-learning in production, always make sure there's some kind of random labeling going on, so that you can see its results in practice. Always validate your models. I guess everyone does that, always. This is not trivial for, say, lawyers: that you keep a data set aside and you validate on that. It's important. And as I said, high-scoring messages are read by humans. Make sure you get feedback on missed signals: if investigations lead to messages that were not given high scores by your model, make sure that feedback comes back to you. And very important: only humans decide on taking action. You would not want this model to give a high priority to someone's message and have that person arrested for it. That's not something you ever want to build.

So, the results of what happened in this case. I'm going to go quickly because I'm running out of time and I want to have time for questions. This was a press release by the police a couple of months after the operation: hundreds of suspects arrested, 8,000 kilos of cocaine seized, dozens of firearms, 19 synthetic drug labs dismantled that were hiding in the Netherlands. And specifically on this topic, they said: in the Netherlands alone, more than 3,000 signals that appear to be life-threatening have been processed in recent months. They were always intervening in a timely manner.
The police have prevented dozens of serious violent crimes, including imminent kidnappings, extortion, assassinations and torture. And this is no joke. They actually found shipping containers rigged with something like a dentist chair, ready for torture. It's quite a famous scene. So it's very nice that they were able to prevent the use of that.

If you ask Europol two years later, they give you numbers for the EncroChat case in general: over six thousand arrests, almost a billion euros in funds. You can look on the Europol website to see the numbers.

The personal feedback we got from the police is that without the model they would have been overwhelmed by the number of false positives. Besides that, the model recognized extra threat to life signals, so those are false negatives we were able to catch that could not have been detected if we had only used a word list. So they were quite pleased. And that is how we at the Netherlands Forensic Institute... you know the rest by now. Thank you.

Thank you, Edwin. It was a great talk. We'll have a lot of time for questions.

Hello. You talked about a live stream of messages that was fully encrypted with PGP. How did you manage to decrypt all those messages?

Ah, that's a good question. I do have a slide; I'll show you the slide of the advertisement, because there are a lot of buzzwords here. This one, right? It has RSA, it has PGP. Actually, I'm not part of the team who broke it; that's the police. But what you have to consider is that these are services that build some kind of custom infrastructure to sell you a private service, and they claim to use all of these things. And do they have weaknesses?
I don't know. The official story, as they said in the media and in the press release, is that the police hacked this service. So somewhere there was a weak link, and there were servers involved. That's basically all I know.

So the keys were stored on the servers?

I honestly don't know, but you can expect that mistakes were made.

Hi, the talk was very interesting, thank you. I have two questions. Do the police use models to track other platforms? And if EncroChat doesn't work anymore, what is the new EncroChat for a crime organization?

Okay, two questions in one. Do the police... well, basically, as this slide says, I'm not the police. I honestly don't know. You should ask the police. What was the second question?

I mean, crime organizations still communicate in some way, right? So what is the new EncroChat? How do you find out?

I'm not really into that world, so I don't know. What I know is that this goes in fashions, and every few years there is a new service. But I only hear about it when it gets cracked by the police. So I don't know what it is currently. Maybe some open source things would be nice to use if you want privacy.

All right, thank you.

Hi. I was immediately thinking about this: with Europe being so open, and it being quite easy for everyone to get into Europe from outside and then move around inside Schengen, you cannot expect the criminals to be nice and always speak just Dutch, right? So you have an obvious bias: you're going to remove the Dutch gangs from the scene and make it easier for non-Dutch-speaking gangs in the Netherlands to thrive, no?

I guess this is a very big investigation.
You saw the word Europol there. So yes, we always deal with the problem of multiple languages at the same time. You have to take precautions. Basically you have goals, and you know that every machine learning model is going to have blind spots. You have to be aware of them and ask: what is the effect of that, and what can you do to combat it?

Thank you, very nice talk. Is your life at risk?

I hope not.

Criminals are quite famous for not really liking it when someone interferes with their cocaine.

That's very true. Of course we take these things into consideration before we publicly talk about what we talk about. What I can say is that everything I told you has been in press releases. It's public information, summarized by me for you with a little bit of data science detail. So if I were a criminal, I would not start with me. And you can see my answers to these questions: "I don't know".

But you're helping the police.

Thank you for the talk. One of the blind spots I was thinking about is code switching: when criminals start to use two languages in the same sentence. Is that something you came across, and how did you deal with that?

I don't know about purposeful code switching, but we do have that problem. I called it street language, or heavy slang. There are many very interesting linguistic aspects to these messages: different subcultures, maybe different cultural backgrounds, different language backgrounds, all mashed together. That's what makes the data science on this challenging: you cannot just take any model that's trained on Wikipedia and expect good results. But you can imagine that in a case like this many different kinds of expertise come together, including linguistic expertise.

Thank you.