Hi, everyone, and welcome to this talk. My name is Cécile Robin, and I am a research associate working at NUI Galway. Today, I would like to introduce you to my world, the world of natural language processing. It's a domain where artificial intelligence and computer science meet language. Funny mix, isn't it? In this domain, we program computers to understand, write, speak, and analyze human language: natural language, as opposed to programming languages, computer languages.
Now I will tell you a story about how companies listen to their customers, and how people like me, using specific natural language processing techniques, help companies take your comments into consideration. So let me introduce you to the story.
Try to imagine that it is March 2023, and we are in the office of ITech, a popular Irish tech company. Today, their team is busy finalizing the details before their big event next week: the release of their new messaging app, Messagram 1.0. So let's explore this in more detail.
Welcome to the ITech office. Look at this team, so busy. They're a bit stressed, as you can imagine, but they're very excited, because they're working on the small final details before the big release next week.
And now we're one week later, and Messagram 1.0 is officially launched. What next? Once the app is released, what do you think will happen? Yeah, that's right. A huge wave of comments and suggestions is going to come in from people. No time to rest. The Messagram team needs to know if the app is doing well. So what do they do? They monitor user feedback.
But what do they hope to find in the user feedback? They hope to identify issues and bugs, any major problems. They hope to monitor satisfaction: are people happy with the app? Are there features they like or dislike? And they want to monitor success: is the app successful? Is it widely used? Basically, are they doing better than their competitors? So now they know what they are looking for.
But where can they get it from? Yes, from you, and your friends, and your neighbors, and anyone potentially using the app. I don't know about you, but personally, when I have anything good or bad to say about something, my first go-to places are either Facebook or Twitter. And I don't hesitate to tag the company and see if they react when there's a real problem. So social media is the place where people are the most direct and honest, and that is very precious feedback to get.
But apart from social media, you also have reviews from app stores. You have reviews from YouTubers and from bloggers. You have articles from newspapers, tech news, or more general newspapers. And these ones are slightly more structured reviews. So you see, there are many, many different places to search. And all these people are very vocal. It's constantly changing. New comments arrive all the time, every single day. How could a human possibly go through all of this? That'd be crazy.
So what do you think the Messagram team will do? Yes, they will use automatic methods. They will use natural language processing, or NLP for those in the know. Natural language processing, or how to get the computers to do the work for you.
So let's go back to our Messagram team. First things first, the team has identified where to look for the comments. They have collected all the different website links. But now they need to collect the text itself, every single thing that is of interest to ITech. So they use a program that will crawl the data from the websites. In other words, it's going to collect the text from the websites and gather it in a format that can be used for the next steps.
And once they have that, they clean it up. They clear out all the things they don't want. They filter out images, weird characters, emojis, memes, and comments or reviews that are not about Messagram or ITech. And this way, they get the data ready for the computer analysis.
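To make the clean-up step a little more concrete, here is a minimal sketch in Python. It assumes the crawling has already happened and the raw comments are available as plain strings. The keyword list, the function names, and the sample comments are all made up for illustration; they are not ITech's actual pipeline.

```python
import re

# Hypothetical keywords that mark a comment as relevant (an assumption for this sketch).
RELEVANT_KEYWORDS = ("messagram", "itech")

# Web links, which carry no useful text for the analysis.
URL_PATTERN = re.compile(r"https?://\S+")
# Anything that is not a letter, digit, underscore, whitespace, or basic punctuation.
# This removes emojis and other unusual symbols.
NOISE_PATTERN = re.compile(r"[^\w\s.,!?'-]")


def clean_comment(text):
    """Remove links, emojis, and odd symbols, then normalise whitespace."""
    text = URL_PATTERN.sub(" ", text)
    text = NOISE_PATTERN.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


def build_dataset(raw_comments, keywords=RELEVANT_KEYWORDS):
    """Clean every comment and keep only those mentioning the app or the company."""
    dataset = []
    for raw in raw_comments:
        cleaned = clean_comment(raw)
        if any(keyword in cleaned.lower() for keyword in keywords):
            dataset.append(cleaned)
    return dataset


# Invented sample comments, standing in for crawled text.
raw_comments = [
    "Loving Messagram 😍😍 https://example.com/pic.jpg",
    "Anyone watching the match tonight?",
    "@ITech the group call keeps crashing!!! 😡",
]
dataset = build_dataset(raw_comments)
print(dataset)
```

The second sample comment is dropped because it mentions neither the app nor the company, and the other two come out without links, emojis, or the `@` sign. A simple keyword filter like this is only one possible way to decide what is relevant.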
And that is what they will obtain after all this preprocessing: a data set, all prepared and ready for the next steps. But what are the next steps? Well, the team is going to dig into this data to further analyze the user feedback about the app. By hand? No, no, no: automatically.
So here's one of these processing steps. It's called a word cloud. And as you may have guessed from its name, it makes a cloud with words, words from your data set, the data set we were just talking about. This nice process allows ITech to get a broader view of what the customers are talking about when they refer to Messagram. And it highlights which words dominate the data set.
But single words are quite limited if you want to go deeper and understand what people are really talking about. What if you want to extract specific terms, like people talking about one specific feature, the group call feature, as opposed to any feature? By using NLP techniques, this can be done automatically by a process that we call term extraction.
But the big desire of the Messagram team is not only to know what people are talking about, but how they are talking about it. So that's where the magic of NLP happens again: sentiment analysis. Because knowing what people talk about is great, but knowing whether they are talking about it in a positive or negative way, that's what is really helpful to ITech. Sometimes it's clearly positive, and sometimes it's clearly negative, but sometimes it's also quite nuanced. So then it is the algorithm in the computer, that is, a set of instructions that carries out a task, that has to decide which side to lean towards: whether, overall, the positive outweighs the negative, or vice versa. And this is done at the level of the whole comment.
And now, even more specific. Imagine you take the term extraction phase that we mentioned earlier, and the sentiment analysis step.
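The word counts behind a word cloud, and the comment-level sentiment decision, can be sketched roughly as follows. This is only an illustration under strong assumptions: the stop-word list, the positive and negative word lists, and the sample comments are tiny and invented, and counting words from a fixed list is just one simple way to do sentiment analysis, not necessarily the method a real team would use.

```python
import re
from collections import Counter

# Tiny illustrative word lists (assumptions); real systems use much larger resources.
STOPWORDS = {"the", "a", "is", "and", "but", "i", "it", "to", "of", "so", "keeps"}
POSITIVE = {"love", "great", "nice", "fast"}
NEGATIVE = {"crashing", "slow", "bad", "hate"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(comments):
    """Count how often each non-stop-word appears; a word cloud sizes words by these counts."""
    counts = Counter()
    for comment in comments:
        counts.update(word for word in tokenize(comment) if word not in STOPWORDS)
    return counts


def comment_sentiment(comment):
    """Label a whole comment by comparing its positive and negative word counts."""
    words = tokenize(comment)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented sample comments.
comments = [
    "I love the design of Messagram",
    "The group call is slow and keeps crashing",
    "Great design but the group call is slow",
]
frequencies = word_frequencies(comments)
labels = [comment_sentiment(comment) for comment in comments]
print(frequencies.most_common(4))
print(labels)
```

The third comment is the nuanced case: one positive word and one negative word cancel out, so the whole comment ends up labelled neutral, and the detail about what was good and what was bad is lost.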
Mix term extraction and sentiment analysis together, and you obtain aspect-based sentiment analysis. This allows you to attach a sentiment to a specific term. So it's much more precise: you know exactly what people are happy or unhappy about. Plus, you lose less information, as you can keep both the positive and the negative aspects, even if they appear together in the same comment. So this is the type of process that will make it much easier for the ITech team to make a detailed analysis and summary of the good and the bad.
So now let's go back to the office and check this all out. It's time to analyze the results given by the algorithms. And this is where the human has a key role: to interpret the results and to make decisions accordingly. And here it's a pretty pro team, as you can see, because most of the comments were actually pretty positive. People liked the features and the design, so it's really good feedback.
But nothing can ever be perfect, as you all know. So the Messagram team also has to deal with the less happy reviews. And in the end, these are the most important ones, because they are the ones the team will use to make decisions to improve the app, either in the short term or in the longer term. So for example, thanks to the analysis, they can fix direct problems and bugs. These are the urgent ones to tackle. But they have also identified less urgent problems that can be improved in future versions of the app. And sometimes there are just things you can't do anything about. So don't tell anyone, let's call it a feature, not a bug.
So that's it, the Messagram team is finally finished. Well done to all of them, and now they can relax for a little bit until their next report. Thank you very much for your attention, and I'm very happy to take any comments, reviews, feedback, suggestions, or simply questions.
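The aspect-based sentiment analysis step described in the talk can be illustrated with a very small sketch. It assumes the term extraction step has already produced a list of aspect terms, and it uses a naive rule: split the comment into clauses, then score each clause that mentions an aspect. The term list, word lists, and function names are hypothetical, and real systems use far more robust techniques.

```python
import re

# Assumed output of a term extraction step (hypothetical terms).
ASPECT_TERMS = ("group call", "design")
# Tiny illustrative sentiment word lists (assumptions).
POSITIVE = {"love", "great", "nice"}
NEGATIVE = {"slow", "crashing", "bad"}


def split_clauses(comment):
    """Split a comment on punctuation and on the word 'but'."""
    parts = re.split(r"\bbut\b|[.,;!?]", comment.lower())
    return [part.strip() for part in parts if part.strip()]


def clause_sentiment(clause):
    """Label one clause by comparing its positive and negative word counts."""
    words = re.findall(r"[a-z']+", clause)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def aspect_sentiments(comment, aspects=ASPECT_TERMS):
    """Attach a sentiment to each aspect term, based on the clause that mentions it."""
    results = {}
    for clause in split_clauses(comment):
        for aspect in aspects:
            if aspect in clause:
                # If an aspect appears in several clauses, the last clause wins.
                results[aspect] = clause_sentiment(clause)
    return results


result = aspect_sentiments("Great design, but the group call is slow")
print(result)
```

Here the same comment yields a positive label for the design and a negative one for the group call, so both pieces of information are kept instead of cancelling each other out.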