Thank you very much for this introduction and for the invitation to this conference. Let me just take a moment to start the presentation. Okay, so the title of my presentation is "Am I Talking to a Human?", and with it I would like to open the discussion about the state of the art in NLP in 2020. The question is whether the time has come when we actually have to ask ourselves this when we interact with machines, and if so, why and how the situation has changed in recent years.

To start the discussion, let's first take a quick look at the historical background of NLP. We have actually been working on NLP since at least the 1950s, but what I want to show with this slide is how the approach to NLP research has changed rather quickly over the years. One of the first approaches applied to NLP was based on statistics. This is illustrated by the saying of John Rupert Firth from the 1950s: "You shall know a word by the company it keeps." The idea was that we should look at the text itself, at the language itself, that from that language we would discover important knowledge and important relationships, and that applying statistical machine learning methods was the way to go.

But then came the so-called AI winter of the 1970s. Minsky and Papert published work showing the limitations of neural networks, and Chomsky criticized the statistical approach to language. Funding was cut, much of AI research stalled, and most work concentrated on applying expert linguistic knowledge to language processing. People concentrated on creating grammars, rules, and dictionaries, and this was supposed to give better results than the earlier statistical approach.

Then things changed again in the 1990s, when more data and more processing power became available, and it turned out that the old statistical and machine learning approaches were starting to give better and better results. As an illustration, we have a quote from Fred Jelinek, who worked on speech recognition, who said that anytime a linguist leaves the group, the recognition rate goes up. This illustrates the change in approach at that time: we no longer needed the expert knowledge, and we relied more and more on statistics and on the language itself.

Looking at these dates, we asked ourselves around 2010: are we going back to the linguistic approach? That would be the most obvious trend here, going back and forth between the statistical and the linguistic approach to NLP. Are we reaching a point at which we cannot rely on machine learning alone and need linguistic knowledge again to move forward? But then came various new developments, which I would like to talk about today, most obviously deep learning, and they changed the situation in such a way that we still rely mostly on machine learning and artificial intelligence when we talk about natural language processing.

So let's discuss the most important developments of the 21st century that really moved natural language processing forward and created the situation in which we rely on artificial intelligence in this area. In my opinion, there are four pillars. The first is the algorithms: we have the right tools to efficiently model the data and to discover knowledge from it.
Of course, these algorithms are not completely new, but they have been optimized and designed for the large amounts of data we deal with today, so that we can efficiently mine and discover knowledge from language. The second pillar is the big data trend, or revolution: we now have so much data available, coming from various sources, from the internet, from the Internet of Things, from our phones, that these enormous amounts of linguistic data give us more knowledge than ever before. The third pillar is the available processing power. We actually have the processing power to discover knowledge in the data: we can use cloud infrastructure and new processing architectures to transform the data, and we are able to process it in a reasonable time. And the fourth pillar is the need, that is, the business need for NLP methods and applications that really change the world, the business world, and the services we use every day.

Discussing these four pillars in a bit more detail, I would like to start with the algorithms. I don't want to go into much technical detail here, just to give an intuition of what has changed in recent years. Let's stop for a moment and ask: how does a machine see written language? How do we represent text when we want to process it, transform it, or mine some knowledge from it? Just 10 or 15 years ago, what we most commonly did was use the so-called bag-of-words representation. We would simply count the words occurring in a particular text, and such simple frequency lists, these bags of words, would be fed to algorithms to classify texts, to segment them, and so on. This representation is very primitive and, as you may guess, it causes a large number of problems related to ambiguity and to the lack of semantic information.

What has changed in recent years is the approach that we now apply most often, and it is actually a comeback of the saying of John Rupert Firth from the 1950s. We are returning to the idea that we shall know a word by the company it keeps, and we are actually doing it in NLP. What we do now is look at each word and at a particular window around it, looking at the other words that appear in a similar context. Here, shown in green, is the word "it" in a movie review, and the word "it" appears in the context of other words such as "movie" (in yellow), "fun", "recommend", "scene", "see". If we have a big enough corpus of such movie reviews, it will soon turn out that in this kind of review the word "it" appears in contexts similar to those of the word "movie". So we may say that the representation of the word "it" is similar to that of the word "movie", because they appear in similar contexts.

This intuition, turned into actual mathematical representations, leads to something very interesting: it turns out that semantic relationships between words translate into the mathematical space of word representations. If we know that there is a relationship between "man" and "woman" in the real world, namely the change of gender, then the same relationship holds in the mathematical space of word representations. We can then, in a sense, calculate with words and find words which hold the same relationship as another pair of words.
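To make this "calculating with words" concrete, here is a minimal sketch of the analogy queries discussed next; the gensim library and its downloadable word2vec-google-news-300 vectors are assumptions made purely for illustration:

```python
# Minimal sketch of analogy queries on pre-trained word vectors.
# Assumes the gensim library and its downloadable Google News word2vec vectors.
import gensim.downloader as api

vectors = api.load("word2vec-google-news-300")  # large download on first use

# "king" is to "kings" as "queen" is to ...?
print(vectors.most_similar(positive=["kings", "queen"], negative=["king"], topn=1))

# France is to Paris as Italy is to ...?
print(vectors.most_similar(positive=["Paris", "Italy"], negative=["France"], topn=1))
```

Under the hood, most_similar simply adds and subtracts the word vectors and returns the nearest remaining word; there is no other mechanism behind it.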
For example, if we are interested in a relationship similar to the one between "king" and "kings", we can ask the representation what the equivalent word holding this relationship for "queen" is, and the representation responds with "queens". This is really incredible, and something almost magical happens when we ask these representations about other types of relationships. It turns out it does not have to be a grammatical relationship; it can be a real-world analogy, for example between countries and their capitals, or between people and their roles. Given the relationship between France and Paris, we can ask the representation for the equivalent word for Italy, and it responds with "Rome"; or we can ask about Einstein being a scientist, and we get that Messi is a midfielder. This is really magical, and it turned out to be a very important milestone in NLP research in recent years. But we may ask ourselves: how is this possible? Is this algorithm so incredibly intelligent? Well, the algorithm is actually not that new; what has changed is that the quantity of data we are able to process today is vastly greater than what we could handle in the 1950s or even the 1990s. Our algorithm knows these relationships because the relationships are in the data, and if we have enough data, we can find them automatically.

And this is the second pillar of this NLP revolution in recent years: big data, which allows us to collect and process vast amounts of data. This is illustrated here by the total size of Wikipedia article text in gigabytes, or by the size of the Common Crawl project, the number of web pages it collects and downloads from the internet. Having these amounts of data allows us to find the relationships hidden in the language itself, in the text itself, similarly to how a human child learns more each year by reading and hearing language.

But the third pillar is also necessary, the processing power, so that we can actually perform calculations on petabytes of data. One change, of course, was in the algorithms and calculation tricks, but it was also the new processing architectures, such as GPUs and then TPUs, that allowed us to use much larger neural networks than the ones of the 1950s that were criticized by Minsky and Papert. And we also simply have much more raw processing power, using, for example, cloud infrastructure. This allows us to apply these not-so-new approaches to petabytes of text.

And finally, the fourth pillar is the business need. Natural language processing is no longer tied to academic research; NLP is a core technology enabling many of today's products and services. This research is conducted in many privately held companies, which moves it forward faster than ever before, and the funding is much higher. One example that you may know from everyday life are the so-called voice assistants, either on your phone or as a standalone device. As you probably know, such assistants are created by Google and Amazon, but also, on your phone, by Samsung and many other companies. This application is an example of the progress made in NLP in recent years, because so many NLP tasks had to be solved in one device. The voice assistant solves at least the problem of speech recognition, that is, of turning your voice into text.
It also solves the problem of a chatbot: when you talk to it, the machine understands your intention and the objects you are talking about. It solves the problem of question answering: when you ask it a question, it responds. And it solves many, many other problems, all in one device, that had to be addressed for this device to be successful.

Staying with this example for a moment, I would like to focus on these NLP problems. One of the problems mentioned is that of a dialogue agent, commonly called a chatbot, which you may know from web pages or from these voice assistants. The idea is that we talk with a chatbot as with a regular person, and the answers are generated based on our questions. There is a memory of the dialogue, a state of the dialogue, so the machine knows the context and knows what it has been asked before when generating new responses. All of this is possible thanks to the advanced language modeling taking place nowadays: we take these large amounts of data, these petabytes of data, and create language models that extract knowledge about the language itself, together with various additional processing steps that allow us to discover the intents of the person using the chatbot and the entities related to the context, and to perform the actions requested by the user.

This example of a chatbot or a voice assistant also brings us to the previously mentioned problem of question answering. This is another problem that is being solved quite well right now by various services, including commercial ones. Here is the example of the Google search engine, where you can simply ask "when was Lincoln born", and the search engine will correctly identify that you are talking about Abraham Lincoln and show his photograph. It will correctly find the answer to your question, print it in large letters, and also come up with other related questions. This is one of the more specific NLP tasks that need to be solved for those voice assistants to work, but there are still many smaller problems that had to be solved for question answering itself to work correctly. We have the problem of named entity recognition, that is, how to identify that "Lincoln" is a person and that it is Abraham Lincoln; the problem of analyzing the structure of the sentence and identifying that we are actually asking about the time of birth of Abraham Lincoln; and the problem of extracting information from various sources, probably from Wikipedia and other web sources, and finding in long stretches of text that the actual birth date of Abraham Lincoln was February 12th.

This is all very interesting, but an example probably more related to the topic of this conference is automatic cyberbullying detection. This is one of the tasks organized during the PolEval competition that was already mentioned in the introduction. In 2019 we had the task of recognizing cyberbullying on the internet, and training data containing examples of such cyberbullying was given to the participants of the competition: annotated examples of disclosure of private information, personal attacks, threats, blackmail, ridicule, gossip, and so on. The task of the participants was to come up with an algorithm, a method, a model that would automatically identify such examples and annotate them with the particular categories.
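As a rough illustration only (the actual competition systems are not described here, and the example texts and labels below are invented), a minimal baseline for such a classifier could be wired up with scikit-learn roughly like this:

```python
# Minimal baseline sketch for harmful-message detection.
# The tiny texts/labels are invented stand-ins for the annotated training data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "you are worthless and everyone at school knows it",    # personal attack
    "great match yesterday, congratulations to the team",   # harmless
    "delete that post or I will make you regret it",        # threat
    "thanks for sharing the article, very interesting",     # harmless
]
labels = [1, 0, 1, 0]  # 1 = cyberbullying, 0 = non-harmful

# Bag-of-words-style features weighted by TF-IDF, fed to a linear classifier.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["nobody likes you, just disappear"]))  # classify a new message
```

The systems submitted to the competition, and the newer language models mentioned in a moment, go well beyond such a baseline, but the overall shape is the same: annotated examples go in, a category label comes out.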
And it turned out that this went very well. There was very large interest in the task, and the results were very promising; they have since been further improved by even newer models, the so-called language models that are still being developed and are being evaluated on this dataset, denoted here as CBD, where we now see an accuracy of over 70% in identifying such instances of cyberbullying in unstructured text. So this is very promising.

One of the hot topics in NLP is, of course, natural language generation. It became even more widely known when the OpenAI group announced that one of the language models they developed, namely GPT-3, was so advanced that it would be too dangerous to actually release it. This is of course somewhat of a publicity stunt, but these models really are getting better and better and are being trained on more and more data. You can test them yourself on the internet, just look for GPT-2 or GPT-3, and the model will complete your sentence purely from the statistical probabilities in the language model. The example here is of course wrong: we are, in a way, asking the model where Lincoln was born by asking it to complete the sentence "Lincoln was born in", and the prediction is "in Germany". That is because this test was made with an older GPT-2 model and a smaller dataset, but also because this model is not trained for question answering, only for continuing the sentences provided by the user.

These NLP applications are appearing in more and more business applications and academic projects. We could discuss a great many of them, but one important application I would like to point to is the combination of image recognition and natural language processing. What we can do now is combine the methods developed for image recognition and train models that automatically label and even describe images and photographs, describing not only the objects but also the situation taking place in a photograph. This is a very important application that allows, for example, creating documents that are accessible to people with disabilities, and this is one of the projects I am working on right now.

In conclusion, NLP has come a long way to this date, but the recent years have had the greatest share in this revolution, based on those four pillars: the algorithms, the available processing power, the available data, and the business need that is the driving force of this work on NLP. Thank you very much.