I hope you've had a great coffee — I know this is the very last session. Today I'm presenting on Natural Language Processing. My name is Govind; I'm from India, and I'm here both to attend and to present. I have around 13 years of experience with Drupal and other web technologies, and I work as a technical lead at Salsa. Today I'll discuss Natural Language Processing, which is a part of AI. I don't know if you attended yesterday's session on AI-powered Drupal — thanks to that speaker for giving good context on how you can use AI with Drupal. I'll talk more about Natural Language Processing behind the scenes, so you can see what's actually going on. So, what's inside? I'll introduce NLP and its capabilities, talk about how NLP actually works and how we can use it in Drupal, and also discuss the Drupal modules we already have and what else we could implement. First, the introduction: what is NLP? It says "natural" — what exactly does natural mean? Anyone? Yes, definitely. If I say "hi, man", you can understand. If I say "hello, brother", you can understand. If I ask "how are you?" in another language, can you understand? A few people can. That's what "natural" is about: people understanding language naturally. By definition, NLP refers to the branch of computer science — more specifically, the branch of artificial intelligence — concerned with giving computers the ability to understand text or spoken words in much the same way a human being can. So it's all about the interaction between a human and a computer: how a computer can understand human language. How did this all start? NLP actually started at a very early stage, before AI as we know it, though today it's considered a part of AI.
It started when people wanted to understand one another's languages. In 1950, Alan Turing published his article "Computing Machinery and Intelligence", in which he asked: can a computer think? What do you think — can a computer think? He then defined the Turing test as a criterion of intelligence: if you want to judge how smart a computer is, the Turing test is one thing you can look at. After that there was only modest progress in NLP and AI until the 1980s, when IBM developed several statistical models for natural language processing. Then around 2000 there was real progress, because Yoshua Bengio and his team proposed a neural language model using a feed-forward neural network. That is where NLP combined with deep learning and machine learning. In 2011 everyone became aware of Siri — "Hey Siri, how are you?" Apple introduced a speech recognition module so Siri could understand what humans are saying, in whichever language they speak (though in practice it depends on which languages are actually implemented and supported). This is how Siri uses NLP, and the same goes for Google Assistant and Alexa: they understand language the way a human does, and they respond the way a human would. NLP consists of two components. One is NLU, natural language understanding; the other is NLG, natural language generation. If you're interacting with a computer, just as with a person, what do you need? It must understand you, and it must respond back to you. That's exactly the split. NLU deals with understanding a given text: interpreting it, extracting its meaning, and structuring the data we give it.
A computer can only work with structured data, not unstructured data, so the first step in understanding language is to structure it: NLU converts unstructured data into structured data. The other part — how the computer responds back in natural language — is natural language generation. Why am I talking about NLP at all? You can see it in the statistics, which come from a good statistics provider; you can also look them up online. This is the market value of NLP — look into it, implement it, and earn money; it's all about money. And this is the revenue stream you can see with NLP. I won't go into the details here. These are the basic use cases you can think of for NLP. We already discussed a few in yesterday's presentation, but I'd still like to mention these. Language translation: everybody wants to understand one another. Do you know how many languages are spoken in the world? You're right — it's more than 7,000, actually. I can only understand Indian English; I don't know any other language. Yes, Marwari — that's a dialect. For language translation, NLP is the best approach an application can follow so it can translate well. The thing is, when people talk they use slang and sarcasm, and a computer can't understand those on its own — that is where NLP comes in. Search engine results: if you Google, say, "I want to book a flight", it won't just show you a list of results; it will show that you can book the flight directly from Google. This is where NLP comes in.
It understands your context and gives you better results. There are a few other use cases. You're already aware of Siri and Alexa — virtual assistants. Chatbots are mainly used in customer-facing industries, and you can automate them for better customer service. Then there are email filtering and spam detection, sentiment analysis, social media monitoring, text analytics, and predictive text. Okay, so how does NLP actually work behind the scenes? At the highest level it's a simple process: it takes input and produces output. In the image you can see that NLP input can be text, video, or audio. Someone asked in the previous presentation whether we can extract something from a video — you can do that with NLP: process the video transcript and get a summary, tags, or whatever entities you want. Within NLP there are sub-tasks that get performed. There are also phases in NLP, which we'll discuss shortly, and each phase uses these sub-tasks. The first is segmentation: a very simple step that takes the whole text and divides it into individual sentences so we can perform the further steps. The next is tokenization, which is as simple as it sounds: every word is a token. Take "this is a sample" — tokenizing it gives the tokens "this", "is", "a", "sample", and if there's a full stop at the end, that can also be a token. The next step is stemming or lemmatization. This is crucial for NLP, because without it you can't get at the meaning of a word. Stemming and lemmatization are similar, but stemming just reduces a word to its root stem: it looks at any suffix or prefix in the word.
Stemming strips those affixes off and gives you a base form, but it might not be a real word. That's why we also have lemmatization. Lemmatization does the same kind of reduction, but it also checks a dictionary to confirm that the resulting word actually exists. You can see it on the screen: with stemming, "changing" and "changes" become "chang", which is not a word; with lemmatization you get the proper base form, "change". That's how stemming and lemmatization work. The next step is removing stop words. We don't remove everything — only the words that don't affect the final context of the statement, so the meaning of the sentence becomes easier to extract. Then comes part-of-speech tagging. Think of when you were a small child: to frame sentences, you first had to understand the building blocks — which words are nouns, verbs, conjunctions, and so on. You can see them in this diagram: noun, verb, and the rest. Once you understand those, you can frame sentences, and this is exactly what NLP does. After that we can extract entities. This is named entity recognition, and an entity can be anything: an organization, a location, a date, a time. In this example it's a big paragraph, but you can see the coloured tags: location, terms, date, condition, process, people. After this stage, NLP has a good sense of what the text is about. These are the phases I mentioned earlier — NLP has five: lexical analysis, syntax analysis, semantic analysis, discourse integration, and pragmatic analysis. Lexical analysis recognizes and analyzes word structure; the collection of words and phrases in a language is referred to as its lexicon.
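The preprocessing steps above — segmentation, tokenization, stop-word removal, and stemming — can be sketched in plain Python. This is a toy illustration, not what a real NLP library does: the stop-word list is tiny, and the stemmer is a crude suffix-stripper, which is exactly why it produces non-words like "chang" where a lemmatizer (with its dictionary lookup) would return "change".

```python
import re

# A tiny illustrative stop-word list; real pipelines (NLTK, spaCy) ship far larger ones.
STOP_WORDS = {"a", "an", "the", "is", "are", "this", "that", "of", "and", "to"}

def segment(text):
    """Segmentation: split text into sentences on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Tokenization: split a sentence into lowercase word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())

def naive_stem(token):
    """Crude stemming by suffix stripping. Unlike lemmatization, there is no
    dictionary check, so the result may not be a real word ("chang")."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Run segmentation, tokenization, stop-word removal, then stemming."""
    result = []
    for sentence in segment(text):
        tokens = [t for t in tokenize(sentence) if t not in STOP_WORDS]
        result.append([naive_stem(t) for t in tokens])
    return result

print(preprocess("This is a sample. The changes are changing everything."))
# -> [['sample'], ['chang', 'chang', 'everyth']]
```

Note how "changes" and "changing" both collapse to the non-word "chang" — the same pitfall the talk describes, and the reason lemmatization adds the dictionary check.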
In this stage the sentence is broken down into words, and all the sub-tasks we already discussed — tokenization, part-of-speech (POS) tagging, and so on — are performed as part of lexical analysis. Once lexical analysis has given us the individual word meanings, we pass the result to syntax analysis to check whether the sentence is syntactically correct. Syntax analysis looks at how the sentence is framed. "This is an apple" is a syntactically correct sentence. If I use the same words in a different sequence — with "apple" at the start, say — the words are all valid, but the sentence has no proper structure. After syntax analysis comes semantic analysis, which checks that the sentence actually means something. It's already grammatically correct, because it passed syntax analysis — but is it meaningful? Suppose I pass a sentence like "my mobile is eating a banana." Do you really think that's possible? With semantic analysis, NLP can discard that sentence because it isn't logical. After that comes discourse integration, which is about resolving references and the context of the whole text. Sometimes you have two sentences, and to understand the second one you need the first. For example: "This is my mobile. I bought it." If I only say "I bought it", you can't tell what "it" is actually referring to.
This is how discourse analysis in NLP provides context across sentences, including the different meanings a particular word can take. Take another example: "I'm reading a book" versus "book a flight". The word "book" has two different meanings there, and NLP also needs to resolve that kind of ambiguity. The final stage is pragmatic analysis. By this point we have the references resolved and a grammatically correct, meaningful sentence; now we need to understand its intent. If I say, "hey, close it over there" — what is the intent? An order, a request? It could be anything. As a human you can read my tone and how I'm saying it, but a computer can't; it just has the words "close it over". Same with "share your screen": without context, am I asking for my computer screen or some other screen? There are very good products available — I'm just giving you the context so you can use them, and a few are freely available. IBM Watson is a very good product for playing around with NLP; you can also write your own machine learning or deep learning algorithm to get the output you want. Google NLP (the Cloud Natural Language API) is available, Amazon provides services as well, and there are Azure, NLTK, and OpenAI, which we discussed a lot yesterday — OpenAI is also a very good tool right now. Okay, so let me show you how these products do NLP: whatever the phase, if I provide text as input, what does the output look like, and how can you use it in your system? Just give me a moment while I share my screen. This is the Watson demo. One second. Sorry about that.
I can provide a URL or some text — anything. I'm providing a URL to a good article from Gartner and trying to extract information from it. If I analyze this URL, you can see what the NLP service does: all the tasks I mentioned — the different entities, the categorization. This is the IBM Watson demo. In the API output you can see the linguistic analysis. The first part is syntax analysis: it does the tokenization and gets the word lemmas and part-of-speech tags. Every NLP system should perform these basic tasks. Then you can extract the key information — the keywords, each with a relevance score. Using an NLP algorithm you can identify different kinds of entities, categories, and keywords, and those can be used in a CMS to categorize your content: auto-tagging, auto-summarization, things like that. Here are the entities — you can see Gartner identified as an organization, along with "IT executive" and "Gartner IT". This is NLP: there's an algorithm running behind this, producing this kind of data. I have a Google API demo as well; it provides similar capabilities with a different output format. You just need to call the API, or you can build your own Drupal integration. I have a simple algorithm — a static one, not AI-based — but there's a module available, and we'll discuss it in a bit. So what can we do with Drupal? Lots of things, but the simplest is content classification. Every content editor wants to classify content, because we need to show not just the content itself but related content too. If you categorize your content, you can manage it easily, and NLP is a good way to do that — the results are quite good.
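As a toy illustration of the entity extraction just demonstrated — nothing like the trained statistical models Watson or Google NLP actually use — here is a naive heuristic that spots runs of capitalized words. Its false positives (a capitalized sentence opener gets swept into an "entity") are precisely why real named entity recognition needs trained models rather than pattern matching.

```python
import re

def naive_entities(text):
    """Toy entity spotting: find runs of two or more capitalized words,
    e.g. "IBM Watson" or "Drupal South". A real NER model also classifies
    each entity (organization, location, date, ...) and avoids the false
    positives this heuristic produces at sentence starts."""
    return re.findall(r"\b(?:[A-Z][A-Za-z]+\s+)+[A-Z][A-Za-z]+\b", text)

text = "Yesterday Gartner published a report, and IBM Watson analyzed the Drupal South content."
print(naive_entities(text))
# -> ['Yesterday Gartner', 'IBM Watson', 'Drupal South']
# "Yesterday Gartner" is a false positive: the heuristic cannot tell a
# sentence-initial capital from a proper noun, but a trained model can.
```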
If you have the right algorithm, you will get the right keywords to tag with, and you can use those for content classification with NLP. The next is sentiment analysis, which is genuinely useful: if you're a business person and want to gauge user sentiment, you can analyze feedback, blog comments — anything you can treat as text — and find out whether it's positive, negative, or neutral. You can also get the emotion: not just the sentiment, but whether it's sadness, joy, and so on. These things can also be predicted by an NLP algorithm. Then there's automatic text summarization. In text summarization, the algorithm builds a matrix from the text, uses that matrix to score the word tokens, and from those scores produces the summary. I'll give you a quick demo of how that works in a moment. Image tagging is also very useful: most of the time you take an image from Google or somewhere else, but if you want to extract the actual meaning from the image, you can use NLP behind the scenes to tag and classify it, or to generate captions and alternative text. And there are lots more — for SEO metadata it's really good at generating automatic tags and automatic descriptions for a page, so you don't need to worry about SEO anymore: implement NLP and you're good to go. There are other use cases with Drupal as well. Let me show you a quick demo. It's not an AI algorithm — it's TextRank, a static algorithm for ranking text: you provide input text, it generates a matrix, and according to that matrix it gives you the summarization. This is a basic site.
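Before the demo, here is a much-simplified sketch of extractive summarization plus keyword extraction. TextRank proper builds a graph of sentences and runs a PageRank-style iteration over it; this stand-in just scores each sentence by the frequency of its content words, which is enough to show the idea of ranking sentences by a computed score rather than by any AI model. The sample text and stop-word list are my own illustrative choices.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; real implementations use larger ones.
STOP_WORDS = {"a", "an", "the", "is", "are", "it", "of", "and", "to", "in", "for", "with"}

def summarize(text, n_sentences=1, n_keywords=3):
    """Score sentences by summed frequency of their content words, then keep
    the top-scoring ones in their original order. Also return the most
    frequent content words as keywords (usable for auto-tagging)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]
    freq = Counter(words)
    keywords = [w for w, _ in freq.most_common(n_keywords)]

    def score(sentence):
        tokens = re.findall(r"[a-z]+", sentence.lower())
        return sum(freq.get(t, 0) for t in tokens)

    top = set(sorted(sentences, key=score, reverse=True)[:n_sentences])
    summary = " ".join(s for s in sentences if s in top)
    return summary, keywords

text = ("Drupal can classify content automatically. "
        "NLP keywords help editors tag content. "
        "Editors love automatic tagging.")
summary, keywords = summarize(text)
print(summary)   # -> NLP keywords help editors tag content.
print(keywords)  # 'content' and 'editors' rank first (each appears twice)
```

The sentence that shares the most high-frequency words with the rest of the text wins, which is the same intuition TextRank formalizes with its sentence-similarity graph.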
I'm creating an article, using content from Drupal South Brisbane — just copying it in and adding a title. There are tags and summary fields; I'm leaving them empty for now, and we'll see what happens. If I go to my home page, there's the Drupal South article, and an automatic summary and tags are already there. If I go to the page itself, you can see them — but this is not an AI algorithm, I must say, because it's TextRank. You can do this yourself; you can create your own algorithm. You just need to understand NLP to do these things. So there are the tags, and there's the summary. That's the Drupal integration — there's a lot more, but with the limited time I can't show more here. There are a couple of modules available that you can use and experiment with. A few of them are not stable, but I did some research and tried them out, and I found very good output from Google NLP — there's a free tier available right now, so you can try it in your own integration. I must also mention Augmentor; that's a very good module they've started. Okay guys, don't forget to attend the code sprint tomorrow. Any questions? Note that I'm not a subject matter expert, but I can definitely try to give more context. Thanks a lot.

Q: The demonstration you had was bringing back structured content — the Wikipedia-style example where you had tags and the type. How do you see that kind of content being integrated into Drupal, where you have more structured data? So it's not just tags, but entities with types and things like that. Do you have ideas on how that might help?

A: That's why I mentioned it should be AI-powered, so you have more context around the tags. This is why machine learning and deep learning sit behind NLP.
NLP is about how you understand human language and behaviour, not about how the computer responds. First the system needs to understand the language; only then can it give you better content. That's why there's machine learning behind it: we can feed it more data so it can learn, and after that it processes those tags and gives you good content. That's why I mentioned Google NLP and IBM Watson — they've put years of work into this. If you have the capability, you can also create your own algorithm; there are free resources on the internet for getting more context out of data.

Q: If you're starting out in this area, which tool would you recommend?

A: It's hard to say, but I've analyzed a few. Among static methods, TextRank is quite good — if you want to create your own algorithm, you can start there. But if you just want to use something existing and don't want to reinvent the wheel, go with IBM Watson or Google NLP; they're quite good, actually.

Any more questions?

Q: The tools you're showing seem to be making external API calls. Are there any that run as part of Drupal — a Drupal module, in the Drupal code base?

A: Yes — what I showed is a Drupal module, not an external API.

Q: So they're actually doing the analysis on the server where you're running Drupal?

A: Yes, it's a PHP script. That's why I pointed out that TextRank is an algorithm; I also included a reference for TextRank — you can see it here. There's a project on GitHub and Drupal.org you can definitely look into.

Q: Cool, thank you.

Any more questions? Thanks, guys.