Machine learning can be used for content moderation online, a complex area that is currently highly labour intensive. One of the goals is to remove the need for human moderators entirely, because they can experience severe mental health issues due to the nature of their work. Laura Hanu is a machine learning engineer at a company called Unitary, which develops machine learning software to detect hate speech and toxic comments, as well as to recognise harmful content in imagery, such as graphic violence or nudity.

Our main mission at Unitary is to build a safer online world, and we're trying to do that by using artificial intelligence to detect harmful content. We do that by teaching machine learning models to interpret videos in context. In other words, we're not just looking at the visual aspect of a video: we look at the audio, we extract the speech from the audio, we look at the text surrounding the video, whether it's on screen or in captions or comments, and we interpret them as a whole, so we make sure we get the tone and setting right.

Ensuring online safety is one of the biggest challenges we face today, and the scale of the problem is huge: 80 years' worth of video footage is uploaded to the internet every day. It would be very hard for humans to manually go through each and every video and check it for harmful content, and having to look at this sort of extreme content takes a huge toll on moderators' mental health. This is where automating the process comes in handy. We can process up to 25,000 video frames per second, which is an impressive feat, and we have a great platform engineering team making that possible.

So how does the software work? A machine learning model learns by looking at lots and lots of examples. If you're trying to teach a model to distinguish between what makes a cat a cat and a dog a dog, you feed it lots of examples of what a cat looks like and lots of examples of what a dog looks like, and it learns to find the patterns that make a cat a cat: pointy ears in the case of a cat, say, or a bigger nose in the case of a dog.
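To make that concrete, here is a minimal sketch of how such a cat-versus-dog classifier might be trained in Python with PyTorch. The folder layout, the pretrained ResNet and the hyperparameters are illustrative assumptions, not a description of Unitary's actual pipeline.

```python
# Minimal sketch: teaching a model to tell cats from dogs by example.
# Assumes images are arranged as data/train/cat/*.jpg and data/train/dog/*.jpg.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Each subfolder name ("cat", "dog") becomes a class label.
train_set = datasets.ImageFolder("data/train", transform=transform)
loader = DataLoader(train_set, batch_size=32, shuffle=True)

# Start from a network pretrained on ImageNet and replace its final
# layer with a two-class head for cat vs dog.
model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 2)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()  # nudge the weights to reduce the current mistakes
        optimizer.step()
```

After enough passes over the examples, the adjusted weights encode visual patterns like the ones described above, which is what lets the model label images it has never seen before.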
Programmers working on machine learning software face many challenges. The biggest challenge we face is building diverse and representative datasets. This means we want to give our machine learning models examples that are representative of the real world, so that they don't learn biases or discriminate against certain groups of people. If a model only ever sees someone's ethnicity or gender in a negative context, and never in a positive one, it may learn to associate that characteristic with negative traits, which we don't want. Another challenge we face is hiring: we want to hire a diverse range of people who have the right balance of strong technical and machine learning backgrounds and a passion for social good.

What skills and qualifications do you need to get into the field of machine learning? The most important subjects to study at school are maths, further maths and scientific subjects such as physics, since they develop the creative problem solving that is really important in this field. If your school offers the opportunity to learn computer science, that's definitely a good way to get into the field and a useful skill to have. It's not a prerequisite, though: you can go into a technical or scientific field first and pick up computer science and programming skills as you go. Most people I know who work in this field are deeply curious people who love to get to the bottom of really challenging problems. The most important skills for getting into this area are a background in mathematics and statistics, and learning how to program, particularly in a language such as Python, which is very widely used in machine learning. What I love about this job is that it's really fun to teach machine learning models to interpret challenging content that would be difficult even for a human to interpret.

With 80 years' worth of video footage uploaded to the internet every day, is the technology going to be able to keep up?

It's definitely a challenge, but what's certainly the case is that humans can't keep up: today a lot of people are involved in manually reviewing uploaded content, and technology has a much better chance of keeping up. It can definitely scale. What's a really interesting challenge for technology is keeping up with change, because the type of content that people post online is changing all the time. People find new ways of communicating and of referencing things, and we have to build AI technology that can keep up with that.

So it's got to learn: it's not only 80 years' worth of content like what it has seen before, it's 80 years' worth of content including new ways of expressing yourself that the AI doesn't yet understand.

Exactly. There's always a new way of referring to politicians, or to types of drugs, or to whatever is currently trending, and you have to constantly keep up to date with the way people are interacting and communicating on the internet.

So we're asking the AI to identify things that are inappropriate, unacceptable or harmful in some way. How is that defined? How do we train the AI to say this is acceptable but this is unacceptable?

It's a really good question. At Unitary we're not the arbiter of what's good and bad content. We classify content according to what we find inside it, and we give that information to a social network or a company to decide how they want to deal with it. For example, we might say that a video contains people being beaten up, and the platform can then respond how it wants. We tend to give content a risk level, saying it has a very high likelihood of containing content in a given category, and we classify content against these different policies, but we don't determine what should be taken down, and we don't decide what's harmful in advance.

Do you have any examples where the AI has said yes, that's unacceptable, and you've looked at it and it's just some puppies or something?

Not quite as extreme as that, but it happens all the time, because the model is definitely not 100% accurate: it's predictive. The way it works is that you train your model with lots of examples of one type of content, and you train it to recognise further examples of that type. For example, we built an initial model to detect guns, so we trained it with lots of example images of guns, and an early version kept flagging microphones and telescopes as guns. We had a look and realised there was a common pattern, because they looked a bit similar. The way you deal with that is to proactively collect lots of images of telescopes and microphones, label them as not guns, and let the model learn from those too.
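Collecting the examples a model gets wrong and feeding them back in as labelled negatives is often called hard negative mining. A minimal sketch of the data side of that fix, using the same assumed PyTorch layout as the earlier example (the folder paths and two-class scheme are illustrative, not Unitary's actual setup):

```python
# Sketch: folding "hard negatives" (microphones and telescopes that an
# early gun detector confused for guns) back into the training set.
from torch.utils.data import ConcatDataset
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Original data: data/guns/{gun,not_gun}/... ; ImageFolder assigns class
# indices alphabetically, so gun -> 0 and not_gun -> 1.
base = datasets.ImageFolder("data/guns", transform=transform)

# Newly collected false positives, e.g. data/hard_negatives/{microphone,telescope}/...
# Whatever subfolder an image came from, force its label to 1 (not_gun).
hard_negatives = datasets.ImageFolder(
    "data/hard_negatives",
    transform=transform,
    target_transform=lambda _: 1,
)

train_set = ConcatDataset([base, hard_negatives])
```

Fine-tuning on `train_set` then gives the model explicit evidence that long, thin, metallic objects are not automatically guns.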
At the moment, what are the limiting factors? Is it computing power? Is it how smart we are at programming and setting up the AIs in the first place?

There are lots of limiting factors. One is computing power: we're constantly demanding more and more access to GPUs. But something just as important is the training data, getting that right and getting a good distribution of data that's well labelled. It's very easy to have one class or another over-represented in your training set, so we want to make sure we have a really well-rounded set of data, so that our model can learn in a way that's representative of the real world.

Are there any other applications of this technology?

Definitely, there are loads. What we're building is the ability to understand visual content in context, to really understand what's in a video using AI. Our mission is to use that technology to make the internet safer and to help identify harmful content, but you could use it to understand or identify any content. You can imagine it being used in a future kind of video search, for example: rather than searching on TikTok across all the captions and titles, you could actually search inside the video. If you can really understand the content of a video, it opens up so many new applications.

Thank you, Sasha. Thanks.
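To make the video-search idea in that closing answer concrete, here is a minimal sketch of searching inside a video by content rather than by captions, using the openly available CLIP model to score sampled frames against a text query. The model choice, frame sampling and scoring are illustrative assumptions, not a description of Unitary's system.

```python
# Sketch: "search inside the video" by scoring sampled frames against a
# text query with CLIP, instead of searching captions and titles.
import cv2
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def search_video(path: str, query: str, every_n_frames: int = 30):
    """Return (timestamp_seconds, score) of the frame best matching the query."""
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    frames, times = [], []
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % every_n_frames == 0:
            # OpenCV decodes frames as BGR; CLIP expects RGB images.
            frames.append(Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)))
            times.append(index / fps)
        index += 1
    cap.release()

    inputs = processor(text=[query], images=frames, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # logits_per_image holds each frame's similarity to the query.
    scores = outputs.logits_per_image.squeeze(1)
    best = int(scores.argmax())
    return times[best], float(scores[best])

# Hypothetical usage:
# timestamp, score = search_video("clip.mp4", "a person holding a microphone")
```

A real system would index precomputed frame embeddings rather than re-encoding every video per query, but the sketch shows the core idea: the query and the frames live in a shared embedding space, so searching inside a video becomes a nearest-neighbour lookup.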