Hello, I'm Violeta Ruizman and I'm presenting Outlier Redemption for the useR! conference.

Let's start with the definition from Wikipedia: an outlier is a data point that differs significantly from other observations. What people teach you about outliers is that they are bad and that they can ruin your model. So you think of them as a bomb that will blow up your model when you feed it your data. Then, once you know that your dataset has some outliers, you are sad, and you are supposed to spot the outliers and either get rid of them or use methods that ignore them.

But how can we spot outliers? You can use, for example, a box plot, which shows outliers as isolated data points, or a histogram, where you will see outliers as small bars that lie far from the rest of your data.

But what people forget to teach you about outliers is that they can actually be cool. You can tell stories about them, and they can be fun. So what I propose is, instead of outlier rejection, outlier redemption: spot outliers and tell stories about them. I will tell you two stories that I found while procrastinating with data.

The first one is about mobile apps. I always find myself wondering whether or not I should buy an application given its price. And I usually think that if it's more expensive, then it's probably better, right? So I was playing with this dataset from Kaggle that has the prices, in dollars, of the applications from one platform. And I got this histogram that doesn't look very good: it has a huge bar close to zero. That's normal, because applications tend to be cheap or free, but there is also a tiny bar around 400, and I wanted to know more. So I zoomed in on this histogram and realized that there were around 15 applications priced at around $400. That's quite expensive. So I was curious and I searched for the names of these apps, and I realized that most of them were something like "I am rich".
So these are actually the most useless apps on the platform, because they do nothing. They are only there to show off your money.

The other story is about Wikipedia trends. I've seen some people on Twitter wondering about the top visited articles on Wikipedia in Spanish. Here we have a screenshot of two tweets asking why Cleopatra is always on top as a top-read article on Wikipedia. So I plotted the number of visits for the top articles in Spanish that I found. Cleopatra was among them, but I also wanted to compare it with the Marie Curie article, which is this big spike here, and the periodic table, which is this periodic pattern here at the bottom. And I realized that Cleopatra was a fairly constant line, and super high compared to the rest. So it was an outlier, but why? Thanks to some Twitter folks, I learned that it seems to be because the Google Assistant recommends that you search for Cleopatra on Wikipedia, so a lot of people are doing this.

So these were two stories about outliers. Check out more on the Outlier Redemption site, where you will find the datasets and also the article. Thank you. This is the website, and this is my Twitter handle if you want to chat.
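
A note on the box plot mentioned near the start of the talk: the isolated points it draws are usually the values that fall more than 1.5 times the interquartile range outside the quartiles. Below is a minimal Python sketch of that rule, not the code used in the talk. The function name `iqr_outliers` and the price list are made up for illustration; they are not the Kaggle data.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR], the usual box plot fences."""
    # quantiles() sorts the data itself; n=4 returns the three quartile cut points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical prices in dollars: mostly free or cheap apps, plus two near $400.
prices = [0, 0, 0, 0, 0.99, 0.99, 1.99, 2.99, 4.99, 9.99, 399.99, 400.0]
flagged = iqr_outliers(prices)
print(flagged)
```

On this toy list, only the two prices near $400 fall outside the fences. On a real price list where most apps are free, both quartiles can be zero, and then the rule flags every paid app, so looking at the histogram as well, as in the talk, is still worthwhile.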