I hope no one here is thinking that the relationships in the title mean human relationships; that was a mistake I did not realize when I wrote the title, and someone brought it up later. This slide is the mandatory disclaimer from my company.

Today we are going to talk about relationship extraction, a key part of information extraction that is required to move from software that can read text to software that can understand text. We will cover what relationship extraction is and why it is important. Deep learning and relationship extraction are, in my opinion, a match made in heaven, and we will see in this talk how deep learning facilitates and simplifies the task. If time permits, we will then talk about how to build a knowledge graph using the concepts from the earlier slides, and finally the conclusion.

In very simple words, information extraction is the application of human knowledge and machine learning to convert human text into a form that can be easily consumed by software agents. I have a simple example here: it is basically converting unstructured text into a structured form, and I think most of you will have seen this a lot, both in today's talks and in your day-to-day lives.

Information extraction follows a very specific pipeline. First you identify the entities of interest in your text; this problem is called named entity recognition. Once you have identified the entities, you identify the relationships between them; we are going to talk about both of these in the next couple of slides. After that, you identify which events the entities are participating in.
For example, you might say that an airline is going to announce a price hike, or that there is going to be a rise in the price of fuel; you find the events in which the entities are participating. The final, and in my opinion most difficult, part is resolving temporal expressions. If I say "Atul Singh is giving a talk today at ODSC", what does that "today" map to? Mapping phrases like "today", "yesterday", or "three days ago" to exact dates in the calendar is the last task you need to do to achieve information extraction.

Named entity recognition is a classification problem: the whole idea is to take the proper nouns in a sentence and put them into classes. There are a lot of classes that can be used; on the right you can see some of the typical ones, such as person, organization, location, and geopolitical entity. Named entity recognition is becoming fairly commoditized in my opinion; spaCy, NLTK, and Stanford NLP all do a good job of it. Still, I have shown here the code from spaCy to do named entity recognition, which is just five lines of code, and it will show you in a very nice manner that Narendra Modi is a person, Sunday is a date, Bansagar is a location, and Yogi Adityanath is a person. If you have a desire to learn, or a specific need in your organization, and you want to develop your own custom NER classifier, you can use existing knowledge bases: the Groningen Meaning Bank is a data source with labeled data that you can use for building your own supervised NER classifier. So that is named entity recognition, which is used to identify entities.
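As a concrete illustration of the five-line idea, here is a minimal sketch in the spirit of the spaCy snippet on the slide. To stay self-contained it builds a blank pipeline with a rule-based EntityRuler instead of downloading a pretrained model (in practice you would use `spacy.load("en_core_web_sm")` as on the slide), so the patterns below are illustrative assumptions, not learned behavior.

```python
# Sketch of spaCy-style entity tagging. A real NER setup would load a
# pretrained model; this blank pipeline with hand-written patterns only
# mimics the output format so the example runs anywhere spaCy is installed.
import spacy

nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "PERSON", "pattern": "Narendra Modi"},
    {"label": "PERSON", "pattern": "Yogi Adityanath"},
    {"label": "LOC", "pattern": "Bansagar"},
    {"label": "DATE", "pattern": "Sunday"},
])

doc = nlp("Narendra Modi inaugurated the Bansagar project on Sunday "
          "with Yogi Adityanath.")
for ent in doc.ents:
    print(ent.text, ent.label_)
```

With a pretrained model the loop stays exactly the same; only the patterns are replaced by learned predictions.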
Once you have done named entity recognition and identified the entities in your text, the next task is to identify the relationships between those entities. This is slightly more complex. The first task was just to take the proper nouns and classify them into classes; now, given two entities in a sentence, can you take them and identify the relationship between them? Again, you can treat it as a supervised machine learning problem: given some labeled training data, you can train a classifier and decide on the classes. The catch is that the nuances vary with the domain. The relationships that are important for finance will not be important for medicine, and the things that are important for medicine might not be relevant for security. That is one of the reasons I thought we should talk about relationship extraction here: it is very domain specific, and it is not easy to just take something out of the box and use it.

What I have at the top here is a taxonomy of relationships that you might use to identify the relations relevant to your domain or problem space. At the bottom are some examples: a physical-location relation, "he was in Tennessee", and a part-whole subsidiary relation, "XYZ is the parent company of ABC". The items in blue are the entities. Remember, we said that we first identify the entities in the sentence, and then we classify the relationship those entities belong to.

So why do we want to do information extraction and relationship extraction? It is the typical problem that data generation never ceases; we keep generating data at an ever faster pace, every day.
What I have here are some examples of data generation: roughly 473k tweets generated per minute, 12 million text messages sent per minute, 2k comments per minute. Absorbing all this information and converting it into useful, actionable insights is becoming a challenge that organizations want to solve, and that is where knowledge of relationship extraction and information extraction becomes very relevant.

It is interesting because the approach of converting unstructured text to a structured format is not new. WikiLeaks leaked information about a software called Palantir, which is used by US intelligence. Palantir is a private company and does not disclose how it takes unstructured information from different sources on the web and converts it into a structured format, but I thought this would be interesting because the ideas you learn here could be used to build something like it, which could be extremely relevant for security. NELL is another good example of a knowledge graph; it is a long-running project from CMU that crawls the web and tries to identify different relationships on the web.

On the left I am trying to highlight the scale of the problem: there are about 4000 listed stocks on NASDAQ, and an investment company investing in these stocks would not have enough resources to absorb, process, and consume the information being continuously generated on the internet about them. That is just a point to highlight the importance of this.
Security is very important as well. There are statistics saying that around 11000 terror attacks happened in 2016, and if security organizations can process the information moving through social media and surface insights about a potential terror attack, that could save a lot of lives. So what you are learning here is extremely relevant in terms of utility.

This slide shows how you can do relationship extraction using the older machine learning algorithms, such as random forest or SVM. You take the text and the entities you identified using named entity recognition, and then you go through a feature generation phase where you generate a whole lot of features: each entity's NER type and lemma, concatenations of the lemmas and types of both entities, word-based features such as the text between the entities and the text to their left and right, and syntactic features such as the dependency path and the constituent path. Once you have generated all of these features, you have to convert them into numbers, because your supervised ML algorithm will only accept numbers; you pass them through a word embedding phase and then through the supervised ML algorithm of your choice.

You have to do all of this to do relationship extraction. With deep learning, the advantage we get is that all of this collapses: you just take the sentence and the entities, do some preprocessing, the word embedding and feature extraction become part of the model, and you get the relationship.
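The hand-crafted feature generation described above can be sketched in plain Python. The feature names, the helper function, and the toy sentence are my own illustrative assumptions; in a real pipeline the lemmas, dependency paths, and constituent paths would come from an NLP library rather than the simple lowercasing used here.

```python
# Illustrative sketch of classic feature generation for relation extraction.
# "Lemmas" here are just lowercased tokens to keep the sketch dependency-free.

def extract_features(tokens, e1_span, e2_span, e1_type, e2_type):
    """tokens: list of words; e1_span/e2_span: (start, end) token indices."""
    lemma1 = " ".join(t.lower() for t in tokens[e1_span[0]:e1_span[1]])
    lemma2 = " ".join(t.lower() for t in tokens[e2_span[0]:e2_span[1]])
    return {
        "e1_type": e1_type,                      # NER type of entity 1
        "e2_type": e2_type,                      # NER type of entity 2
        "type_concat": e1_type + "_" + e2_type,  # concatenated entity types
        "lemma_concat": lemma1 + "_" + lemma2,   # concatenated entity lemmas
        # word-based features: text between the entities, and context windows
        "between": " ".join(tokens[e1_span[1]:e2_span[0]]),
        "left_of_e1": " ".join(tokens[max(0, e1_span[0] - 2):e1_span[0]]),
        "right_of_e2": " ".join(tokens[e2_span[1]:e2_span[1] + 2]),
    }

sent = "Bill Gates founded Microsoft in 1975".split()
feats = extract_features(sent, (0, 2), (3, 4), "PERSON", "ORG")
print(feats["between"])      # -> founded
print(feats["type_concat"])  # -> PERSON_ORG
```

Each of these string features would then be embedded or one-hot encoded before reaching the random forest or SVM.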
Remember, this is again a classic supervised ML problem, but instead of doing all the feature generation we had to do earlier, we can let the deep learning network do it.

[Audience question] Yes, you have to identify the classes; I will come back to that, there is a slide about it. Thanks for pointing it out. What was asked is: if this is a supervised ML problem, then what are the relationships? I have a whole pipeline for how you would do it; it essentially involves identifying the relationships of interest to you, generating data for those relationships, and then building the classifier. We will talk about it.

With gradient descent and backpropagation we can train very large neural networks. Recurrent neural networks are a special type of neural network where information is passed across the neurons within a layer, unlike normal feedforward networks. There are multiple types of recurrent neural networks; GRU and LSTM are some prominent examples in the literature. It is difficult to say which one is better for your problem; that is something you will have to experiment with and see for yourself. We are using LSTMs in our work.

Recurrent neural networks are very good for text processing problems because they maintain structural information. If I have the sentence "I am going to present a teddy bear", the resolution of "teddy" depends on "bear" next to it. In another sentence, "Teddy Roosevelt was a president of the US", the resolution of "Teddy" depends on the presence of "Roosevelt" next to it.
What recurrent neural networks do is maintain information about the order in which they have seen the words in the sequence, which is not possible in normal feedforward networks, and that makes them very good for text processing problems; almost everyone is trying to use them for this.

This is the topology of the deep learning model. We take the sentences and replace the entities in them with markers, then pass them through an input layer, then through an embedding layer, which transforms the words into a vector space representation. We pass this representation into a bidirectional LSTM and after that into a dense layer with a softmax activation function. We used cross entropy as the loss function, and what you see at the end are the different relationship classes.

This is basically the workflow you were asking about. Let us say your manager comes in and says "I want relationship extraction". First you need to identify the relationships that are of interest to the problem at hand; with a supervised approach you cannot say "I will identify any relationship that comes to anyone's mind", you have to build with respect to certain particular relationships. Then you have to generate labeled data. For labeled data, one trick used very often in this problem space is called distant supervision. It is not the cleanest technique, but it works well. Say I want labeled data for the "founder" relationship: I can go and crawl the web, find different sentences that contain Bill Gates and Microsoft together, and say that in each such sentence Bill Gates and Microsoft represent the founder relationship.
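The topology described above, an input layer feeding an embedding layer, a bidirectional LSTM, and a dense softmax layer trained with cross entropy, can be sketched in Keras. The vocabulary size, dimensions, and number of relation classes below are placeholder assumptions, not the values from the talk.

```python
# Sketch of the described topology: embedding -> BiLSTM -> softmax.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Embedding, Bidirectional, LSTM, Dense

VOCAB_SIZE = 20000   # tokens after preprocessing (entities replaced by markers)
MAX_LEN = 50         # padded sentence length
NUM_RELATIONS = 5    # relationship classes, e.g. founder, headquartered-in

model = Sequential([
    Input(shape=(MAX_LEN,)),                     # marker-substituted, padded word ids
    Embedding(VOCAB_SIZE, 100),                  # words -> vector space representation
    Bidirectional(LSTM(64)),                     # reads the sentence in both directions
    Dense(NUM_RELATIONS, activation="softmax"),  # one probability per relation class
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```

Training is then the usual `model.fit` on padded word-id sequences with one-hot relation labels.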
That is the distant supervision approach: we crawl a corpus of text, such as the web, to find sentences that contain both entities together, and because we already know the relationship between those two entities, we label each such sentence with that relationship. Then you train the model on that data and evaluate it.

Concretely, as I mentioned, say I am interested in five relationships, for example "founder" and "headquartered in". For "founder" I generate ten tuples where I know the two entities are related by that relationship, for example (Elon Musk, Tesla), (Bill Gates, Microsoft), and so on and so forth. Similarly for "headquartered in", I know, for example, that Facebook is headquartered in California. So I have these (entity one, entity two) tuples that are related by a known relationship. You have, say, five relationships and ten tuples for each; you then find sentences on the web, or in a corpus of text specific to your domain, that contain the two entities together, and you label each one saying that in this sentence these two entities are related by the relationship I am already aware of. Does that answer your question? Yes.
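The distant supervision recipe above can be sketched in a few lines of plain Python. The seed tuples and the tiny corpus are toy examples; a real system would crawl a much larger corpus and match entity mentions more carefully than the simple substring tests used here.

```python
# Sketch of distant supervision: label any sentence that mentions both
# entities of a known seed tuple with that tuple's relationship.
SEED_TUPLES = {
    ("Bill Gates", "Microsoft"): "founder",
    ("Elon Musk", "Tesla"): "founder",
    ("Facebook", "California"): "headquartered_in",
}

corpus = [
    "Bill Gates started Microsoft with Paul Allen.",
    "Microsoft was led by Bill Gates for decades.",
    "Facebook moved its California offices in 2018.",
    "Tesla reported record deliveries this quarter.",
]

def distant_label(sentences, seed_tuples):
    labeled = []
    for sent in sentences:
        for (e1, e2), relation in seed_tuples.items():
            if e1 in sent and e2 in sent:  # both entities co-occur
                labeled.append((sent, e1, e2, relation))
    return labeled

for sent, e1, e2, rel in distant_label(corpus, SEED_TUPLES):
    print(rel, "|", sent)
```

Note the noise this introduces: the second sentence mentions Bill Gates and Microsoft without actually expressing founding, yet it still receives the "founder" label, which is exactly why this is not the cleanest technique.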
For testing, yes, you will have to go and manually label data if you want to evaluate the model; you have to manually find sentences and label them. But you can simplify the training side using the distant supervision technique, where you say that if these two entities are related, if Microsoft and Bill Gates are related by the relationship "founder", then every sentence that contains both Microsoft and Bill Gates expresses the founder relationship.

Okay, go ahead, you can ask the question. Yes, that is what I wanted to come to; good question. If you want to play with these ideas, in your personal time or otherwise, there are two data sets available. One is the Wiki-links data set. At some point Google annotated different entity mentions with their corresponding Wikipedia URLs, so that it becomes easy to disambiguate names within sentences; I do not remember the full context right now. A university, I think UMass, then extended Wiki-links by taking the corresponding sentences for these entities from Wikipedia.
So with Wiki-links you get something like entity one, entity two, and the sentence as left context, mention, and right context, so you get whole sentences with two entities in them. The next thing you want to do is classify or label these sentences. How do you do that? There is something called Freebase. Freebase was a collaborative effort where people manually labeled relationships between entities; it is no longer active, but you can still get a data dump of it. Once you take the Freebase data dump, you get relationships between known entities, and you can merge the two data sets together to label your data. Once you have labeled the data, you do a little bit of preprocessing, then the train/test split, and then you build the model, which is the same thing I said before: in Keras, in Python, you add an embedding layer, a bidirectional LSTM, and a dense layer, and then you train and test the model. So that is the information extraction part of it.

[Audience question] Yes, the Stanford library has an information extraction annotator that does a good job of relationship extraction, but what it does is take the two entities and try to give a very simple relation phrase; I think it uses a clustering kind of approach where it tries to simplify the text and pull the relationship out of the information in the text. OpenIE is another one that can do this to some extent. But the extracted relation phrases will be fairly random, so you will again have to take them and pass them through a classifier to get fixed labels.

Okay, that covers it, so let me come to the conclusion. We have demonstrated how you can extract entities and relationships. I am sorry we are not able to
cover the knowledge graph part. There are at least two key limitations that you should keep in mind. We have not talked about coreference resolution, which is finding all the expressions that refer to the same entity, for example resolving "he" or "she" back to the noun being referred to, and we have not talked about duplicate resolution while populating a knowledge graph. Thank you.