So, as an example, behind the BBC website there is a database that is updated something like ten times a second, with all the inference and so on going on, while hundreds of read queries are used to generate the web pages. So if you occasionally go to the sports section of the BBC website, you're seeing this in action all the time. We do use text analytics; we have all the weaponry that people usually have for this purpose. And at the end of the day, the result often looks like this: you have text, and you recognize some things in the text. What's more important, each recognized thing points to something that is in the knowledge graph. Of course, you have relevance and confidence scores for everything that you recognize. That's pretty much what everyone has.

So I'll switch over to Peyon while he's getting on the computer. We have a project called OpenPolicy, with an organization called the Logistics Management Institute, about 1,000 people in Washington, D.C. Together with them, we deliver a vocabulary management solution to various agencies, be it the Department of Defense, Health and Human Services, and so on.

Thank you. We thought that we should share this story because it is a good story about two things. The first is the collaboration in the FIBO world, and the second is the immediate use of the work that has been created. So first, how the FIBO world actually works together. We got involved most deeply with FIBO thanks to Jacob, who was really passionate about convincing us that this is the thing we should be doing. Then, at the last meeting with David, we were discussing how we approach it, how we do things, and he gave us the cool idea that this might be used in various other ways: why don't you do this? Then, thanks to Dean's work to transform this to SKOS, we were able to make immediate use of it, because FIBO might be a top-level ontology.
But there is a lot of value, actually enormous value, in the terms: if represented as a vocabulary, they can be put to immediate use. And this is an example of how we were able to get this value out of the SKOS version, in this product of ours, OpenPolicy.

So what OpenPolicy is: it is a three-column application. In the left column is the actual regulation that somebody is interested in. In the right column are all the facets. We have a small SKOS editor where you can see all the concepts, and of course somebody can facet on top of them. And it enables us to attach the relevant FIBO terms to all the paragraphs and all the segments of a regulatory document. So the use case is basically discovery: being able to take the knowledge model of FIBO, store it in the database, and run a not-so-sophisticated text analysis on top of it. We are able to segment and annotate each paragraph with all the FIBO concepts relevant to it. This way we can present relevant content to somebody who is looking for a FIBO concept, and depending on what's loaded on the right side, you get really quick access to the relevant knowledge.

This is a rather simple but powerful example of how different derivatives of FIBO can be put to use, because now we are working on it to provide a specialized feed for regulations. So I don't know if you... yes, now we're working with it to provide feeds to people who are interested in changes to regulation. We might use one or more of the FIBO concepts to let them know what's pending or what has currently been amended, and by using text analytics we can even assess the priority of a change. So we put FIBO to immediate use and help people make sense of the complexity of regulatory documents.
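The paragraph-annotation step described above can be sketched in a few lines. This is a toy illustration, not the actual OpenPolicy implementation: the vocabulary labels and the `fibo-be:`/`fibo-fnd:` identifiers below are invented stand-ins, and real systems use proper text analytics rather than substring matching.

```python
# Toy sketch of annotating regulation paragraphs with FIBO concepts
# via simple label matching. Labels and IRIs are illustrative only.
FIBO_VOCAB = {
    "legal entity": "fibo-be:LegalEntity",
    "stock corporation": "fibo-be:StockCorporation",
    "contract": "fibo-fnd:Contract",
}

def annotate(paragraph: str) -> list:
    """Return the FIBO concepts whose labels occur in the paragraph."""
    text = paragraph.lower()
    return [iri for label, iri in FIBO_VOCAB.items() if label in text]

paragraphs = [
    "Each legal entity shall register before entering a contract.",
    "A stock corporation must publish annual statements.",
]
# Index: paragraph number -> relevant FIBO concepts (drives the facets).
index = {i: annotate(p) for i, p in enumerate(paragraphs)}
```

Once each paragraph carries its concept list, faceting is just filtering this index by the concept the user selected on the right-hand side.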
And this is something that was done without much effort, and something which proves the immediate value of FIBO to the community. Thank you.

This was actually the result of probably a couple of hours of work, and that's the case with many of these demonstrations. It's not something that takes a year of effort; on the FIBO side it was literally a couple of hours. And most of the stuff that I will present now was a couple of weeks of work to put together. So the message I'm trying to convey is that with this technology and with FIBO it's fairly easy to do these kinds of things.

We are basically aiming at a case where you can detect certain patterns. Like: if you have organizations that control each other through some chain of relationships, and they are located somewhere, and there are relationships between the locations — like Seattle being part of Washington, which is part of the USA, and so on and so forth — then you can detect patterns like a US company controlling another US company through a company on the Cayman Islands, this kind of stuff. And you can combine this with news analysis. So that's where we are going.

Now, for those of you who don't know what linked data is: it's a web of data that started around 2007, relatively small, with all sorts of things in it. You have DBpedia, which we will be using; this is a structured, RDF version of Wikipedia. GeoNames is a very well maintained, good-quality database of geographic features on Earth, millions of them. Then there are dictionaries and many other things. This thing has been growing quite well through the years; it really grows exponentially. I hate it when people who don't know mathematics speak about exponential things, but this one really is exponential. So it really grows, and a bigger and bigger volume of data is becoming available.
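The control-chain pattern described above can be sketched as a small graph search. The data here is invented for illustration; in the live system this would be a SPARQL query over the `controls` relationships in the graph, but the logic is the same.

```python
# Sketch of the pattern: a US company controlling another US company
# through an intermediary on the Cayman Islands. All data is made up.
CONTROLS = {              # controller -> list of controlled companies
    "AcmeUS": ["CaymanShell"],
    "CaymanShell": ["WidgetsUS"],
}
COUNTRY = {"AcmeUS": "US", "CaymanShell": "KY", "WidgetsUS": "US"}

def control_chains(start, path=None):
    """Enumerate all control chains starting from `start` (depth-first)."""
    path = (path or []) + [start]
    yield path
    for child in CONTROLS.get(start, []):
        yield from control_chains(child, path)

# Chains where a US company controls a US company via a non-US entity.
suspicious = [
    chain for chain in control_chains("AcmeUS")
    if len(chain) >= 3
    and COUNTRY[chain[0]] == "US" and COUNTRY[chain[-1]] == "US"
    and any(COUNTRY[c] != "US" for c in chain[1:-1])
]
```

In SPARQL this kind of chain is typically expressed with a property path over the control predicate, filtered by the countries of the endpoints and intermediaries.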
So the point of this presentation is how we can get a few datasets from linked data, do some polishing and small adjustments, map them to FIBO, and get something useful out of it. The concrete setup: we have the English version of DBpedia, which is about half a billion triples, and GeoNames, about 150 million triples. They are well mapped to one another, which is useful. Then we have some 130 million statements that are just links between news articles and this knowledge graph. We have also loaded some data from the Legal Entity Identifier initiative; I'll talk about it later. That's as much as we've been able to get. Kevin recommended that we have a look at the GLEIF data; we took the biggest dump, and it was about three million statements. Still interesting stuff — I'll show you some statistics later.

So altogether we ended up with a repository of about a billion triples, two-thirds of which are explicit, and the rest were inferred. On top of this repository, beyond the bare-bones triple store, we have plenty of useful things like geospatial indexing and full-text indexing, which allows you to score results, and so on and so forth.

About the news metadata... let me find where it is. No, not this one. We have been in news for quite a long time, analyzing news using these big knowledge graphs. We have a public service, now.ontotext.com. You don't see it so well here, but it is constantly fed with news; these are news coming from, I think, Google. We process them — this one came in an hour ago. And for each of them, I'll try to make it a bit bigger, you see the metadata, so you can click on a specific concept, explore what we have about that concept in the graph, and see popularity trends. It's not loading too fast. Oh, I clicked on Brussels. And you see that all of a sudden it became very popular on the 23rd of March. It was for a sad reason, but still.
Yes, the bombings at the Brussels airport. You can see which other entities it co-occurs with, like organizations and people. It's no surprise, sadly, to see that Angela Merkel, Alexis Tsipras, and other European politicians are those that most often appear together with it. Yeah. And of course you can jump again into the news, and this kind of thing. So that's the same platform that is actually being used by the likes of the Financial Times and the BBC for their news.

What I'm trying to demonstrate here is how we can take all the metadata that we generate over this news, do some statistics, and get FIBO involved in the game. So these are about 10 million statements. Sorry — 10,000 news articles a month, not too many, not too few, around 300 news items per day. And about 70 tags: 70 annotations, 70 links between a news article and something in the graph, on average. It's all sorts of news — mostly international, business, sports, everything. You can see the distribution of the kinds of things we recognized in the text. Quite a big number of those are key phrases that we automatically extracted from the text, but you see that organizations, locations, and people lead in the number of references in the text that we've been able to find.

Good. So let's try to get a feeling for what this graph looks like. We call this diagram the Bose of Zetso, after the colleague who designed it. It represents the classes of things that are loaded in the repository. You see agents, places, and other things. If you zoom into agent, you see that we have about 3 million agents; that's the superclass of person and organization. You can drill down further to see what sorts of subclasses of person we have, click on any of them to get additional information, and so on and so forth. We essentially need these kinds of tools and diagrams to let us easily explore datasets that are combined from multiple sources.
And every engineer, or even data architect, needs some sort of tool to figure out what's in there, what has been loaded.

As for FIBO, we loaded the Foundations and the Business Entities components. Altogether these are about 5,000 statements, and when we load them with the OWL RL profile we get another 15,000, which are materialized and stored in the triple store. In these two modules of FIBO we see about 337 classes. They are visible here — though to be able to take this picture, I first loaded them into a separate repository, so that you can see just FIBO. And here you see FIBO in the context of the bigger thing: we have autonomous agent, with organization and person, and you can drill down further into organizations and so on. But what you see here is that we have about 2.5 million instances of FIBO's autonomous agent. That's because of the mapping that we did between FIBO and the DBpedia ontology, so that you can essentially see all this data through the FIBO classes and properties. That's not rocket science: you can see properties, a sort of domain-range graph for the classes.

In this case we did a minimalistic mapping. We basically mapped person, company, and organization to the relevant concepts in FIBO, and we mapped the DBpedia ontology predicate subsidiary to controls in FIBO. It is important that we mapped them in such a way that the classes in the bigger dataset — the DBpedia classes — are subclasses of those in FIBO, so that one can really see the instances of those classes through the FIBO primitives. If we had done it the other way around, this wouldn't work. And we don't want to make them equivalent, because they are not precisely equivalent.

So that's pretty much it. Let's go to some examples. One thing to mention is that in this case we can enjoy what we call semantic press clipping — the fact that we have annotated and found references to concrete objects in the text.
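The point about mapping direction can be made concrete with a tiny RDFS-style inference sketch. The class and instance names below are shortened stand-ins, not exact FIBO or DBpedia IRIs; the mechanism — every instance of a class is also an instance of its superclasses — is standard rdfs:subClassOf entailment.

```python
# DBpedia classes declared as subclasses of (illustrative) FIBO classes.
SUBCLASS_OF = {
    "dbo:Company": "fibo:FormalOrganization",
    "dbo:Person": "fibo:Person",
    "fibo:FormalOrganization": "fibo:AutonomousAgent",
    "fibo:Person": "fibo:AutonomousAgent",
}
# Instance data as it comes from DBpedia: typed with DBpedia classes only.
TYPES = {"dbr:Volkswagen_Group": "dbo:Company", "dbr:Angela_Merkel": "dbo:Person"}

def all_classes(cls):
    """Walk the subclass chain upward, collecting every superclass."""
    out = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        out.append(cls)
    return out

def instances_of(target):
    """Instances visible through `target` after subclass inference."""
    return sorted(i for i, c in TYPES.items() if target in all_classes(c))
```

Querying by the FIBO class now returns the DBpedia instances, which is exactly why the 2.5 million autonomous agents show up. Had the arrows gone the other way (FIBO classes as subclasses of DBpedia's), a query for `fibo:AutonomousAgent` would return nothing, since inference only propagates types upward.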
We did the disambiguation, so that if you have a Paris in the text, we know whether it is Paris in France, Paris in Texas, Paris Hilton, or Paris the Greek hero, or whoever. That's important. All sorts of variations of the names, these kinds of things — that's half our business anyway. The more interesting thing in this case, which I'll demonstrate in a moment, is that we can also trace references and appearances in the text of related objects, like daughter companies or related people, this kind of stuff. We can of course use all sorts of information that we have in the graph: making statistics about banks, making statistics about automotive companies. And I'll make a few live queries now.

To start with, I'll get... just a second. I have a SPARQL query which takes this as a parameter; in this case, Volkswagen Group. Is it visible? I should probably make it a bit bigger. So I bind Volkswagen Group to the entity. Then, via the FIBO controls predicate, I get the entities that are related to it. I also bind Volkswagen Group, the entity itself, to be part of this set. Then — this is a bit technical — I'm essentially saying I need news articles that refer to any of these objects, and then I filter by date; in this case it's April 2005. Good. So that's easy. And I'm getting back, as you see, news articles that refer not necessarily to Volkswagen but to Audi, to Porsche, to Lamborghini, because we have information that all these companies are part of the Volkswagen Group. And this comes directly out of DBpedia.

I personally made the exercise of polishing this company-control information in DBpedia a little, because it's very messy. As you can imagine, they use probably 10 different predicates to designate these relationships. But 10 is not that much, because you make one query giving the most popular relationships between organizations, and within 10 minutes you know which are the 20 that matter.
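The Volkswagen Group query described above can be sketched without a triple store. This is a pure-Python stand-in with invented toy data; the real thing is a SPARQL query over the FIBO controls predicate and the news-to-graph annotation links.

```python
# Sketch: find news that mention Volkswagen Group or any company it
# (transitively) controls. Toy data; real data comes from DBpedia
# mapped to the FIBO controls predicate.
CONTROLS = {"VW_Group": ["Audi", "Porsche", "Lamborghini"]}
NEWS_MENTIONS = {          # article id -> entities it refers to
    "n1": ["Audi"],
    "n2": ["Porsche", "Toyota"],
    "n3": ["Toyota"],
}

def control_closure(entity):
    """The entity itself plus everything reachable via `controls`."""
    seen, stack = set(), [entity]
    while stack:
        e = stack.pop()
        if e not in seen:
            seen.add(e)
            stack.extend(CONTROLS.get(e, []))
    return seen

group = control_closure("VW_Group")
hits = sorted(n for n, ents in NEWS_MENTIONS.items() if group & set(ents))
```

Note that an article mentioning only Audi counts as a hit for Volkswagen Group — that is precisely the "related entities" effect the demo shows.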
You map them, and one hour later you're in the game. And there is plenty of this information, like parent-child relationships between companies, available. And even though we have access to some good-quality databases from commercial vendors, this is just a different view: very often there is information in DBpedia that is true and that, for whatever reason, is still missing from the commercial databases.

Okay, let's move on. We can make various statistics to give you a feeling for what's in this dataset — for instance, how many organizations we have within each industry. You see finance at the top with 5,000, then transportation, software, telecommunications, and so on. This needed a bit of cleanup too, because again there are plenty of different predicates used to designate which industry an organization is part of, and also there is no agreement on the identifier for what finance is — finance can be found in five different ways. But again, this is something one can sort out within a day of work and get a decent, hierarchical industry classification, so that bank is defined.

Next, the relative popularity of different companies: which are the top performers across each and every industry. For automotive it's General Motors, Tesla, Volkswagen. But then we say, hey, those are actually direct references to the companies, without related companies and related entities. So we have an augmented version of this query that also includes related companies. In this case we set the industry to software — you can put anything; there are probably a few hundred industry identifiers at different levels — and you see Alphabet, Microsoft, and Yahoo coming out on top.

So, what industry would you be interested in? I just want to make the case that it's bloody live. Just a second, I need my autocomplete again. Okay, it's probably gaming. Video game industry. Here you are.
Nexon, Nintendo, Nvidia, Blizzard. Does it make sense? Good. So, we've got to finish soon, right? Okay, I'll scroll through the next ones and won't go into that much detail. In the presentation that's available you can see how the rankings can vary substantially depending on whether you count references to the companies themselves, or to the companies and their child companies. Like this one, which appears to be the second most popular if you count just direct references, and goes down to fourth place if you consider all the sub-companies. Finance is really messy, because finance covers too many things, but if you go into banking, you see Goldman Sachs and JPMorgan Chase at the top. If you consider them alone, there is some noise, with China Merchants Bank; but JPMorgan Chase and Goldman Sachs switch places if you consider all the related companies that are referred to in the news.

You can also do things like the regional exposure of a company. So — and that's the last query I'm going to show, I promise — for instance, for Toyota we can see which countries' locations are mentioned together with Toyota in the news, in order to figure out which regions, which countries, are most related to Toyota in the news. In this case you see the United States, Japan, China, and so on. But that's partly because China and the United States are simply in the news too often. If you normalize this query by the number of references you usually have for a country — the number of news items about that country — you get a slightly more interesting picture. For British Petroleum, for example, you see Angola, Mexico, Norway, and other countries that have something to do with the business of British Petroleum.

You can also see my experiments with the GLEIF data in the presentation. It's still very strange: the US is very high, Goldman Sachs alone accounts for probably one tenth of the entire dataset, these kinds of things.
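The normalization step mentioned above is simple arithmetic, shown here with invented counts: divide each country's co-occurrence count by that country's total news volume, so that heavily covered countries like the US no longer dominate.

```python
# Sketch of normalizing co-occurrence counts by overall news volume.
# All numbers are invented for illustration.
cooccur_with_bp = {"US": 900, "Angola": 60, "Norway": 80}      # news mentioning BP and the country
total_news = {"US": 90_000, "Angola": 1_200, "Norway": 4_000}  # all news mentioning the country

# Share of a country's news that co-occurs with BP.
normalized = {c: cooccur_with_bp[c] / total_news[c] for c in cooccur_with_bp}
ranked = sorted(normalized, key=normalized.get, reverse=True)
```

On raw counts the US wins easily (900 vs. 60), but after normalization Angola comes out on top (0.05 vs. 0.01), which is the "slightly more interesting picture" the demo refers to.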
So the overall message is that we can map linked data to FIBO, and we can integrate external data sources — I only briefly touched on what happened with the LEI data. Within literally hours of work you can get a reasonable mapping between this kind of dataset and public data, and you can integrate this with text analytics; we know how to get this done with the existing products. Any questions?