Thank you, and good afternoon everyone, for joining this session. It's the third day of the conference, almost towards the end, and I hope I don't bore you to death or put you to sleep. My name is Sumit. I am based out of Boston in the U.S., and I'm part of the Ontotext team as a strategic technology director. Ontotext is a company based out of Sofia, Bulgaria. I myself have been doing graphs since the last millennium, since my bachelor days when I was doing a Bachelor of Engineering in computer science. Graphs are a very interesting topic that has gained a lot of momentum, especially since around 2018, which was sometimes called the year of graphs. Before Ontotext I was a research analyst at Gartner, where I wrote and published a lot of papers around data engineering and data management: stream processing, time series databases, graph databases, and so on. I'm very interested in knowledge graphs and their power, and that's what brought me to Ontotext. I've been with Ontotext about six months now.
So with that said, let me ask you a question to show you the power of graphs. Take any celebrity you know, and on the other end of the spectrum take any person in a remote tribe on a remote island, on a planet of eight billion people. How many connections do you think you need to connect these two people at the two ends of the spectrum? Any guesses? One? Yes, one is the minimum, the best case, but the average is actually six. It's called six degrees of separation, the idea you may know from the "Six Degrees of Kevin Bacon" game, and it's very hard to imagine that on a planet of eight billion people the average number of connections to go from one extreme to the other is just six. That's why I would recommend you all see the documentary on Netflix called Connected. All of us are very highly connected, one way or the other, and that documentary shows the power of graphs.

When you hear about data scientists: data scientists are always data hungry. They are always looking for more data. Why? Because they want to get more signals from the data; they want to put together all the connections. Relational databases, to me, are sort of an oxymoron. Yes, they can manage relations, but take any social media company, whether it's Facebook, LinkedIn, or Twitter: they did not implement their systems with relational technology. They had to write their own graph databases, because social media is all about connections. One of the things that made graphs even more famous, way back in 2016, was the Panama Papers analysis, where a huge number of documents were text-mined and connected to figure out how a lot of rich people were siphoning money into banks outside their countries. A team of journalists used graph databases to figure out the connections and how the money was being laundered.
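To make the six-degrees idea concrete, here is a minimal sketch (the tiny network and the names are invented for illustration, not real data) of how a breadth-first search counts the hops between two people in a graph:

```python
from collections import deque

# Toy "social network" as an adjacency list (all names are made up).
friends = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Alice", "Dan"],
    "Carol": ["Alice", "Dan"],
    "Dan": ["Bob", "Carol", "Erin"],
    "Erin": ["Dan", "Frank"],
    "Frank": ["Erin"],
}

def degrees_of_separation(graph, start, goal):
    """Breadth-first search: number of hops on the shortest path, or -1."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, dist = queue.popleft()
        for friend in graph.get(person, []):
            if friend == goal:
                return dist + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, dist + 1))
    return -1

print(degrees_of_separation(friends, "Alice", "Frank"))  # -> 4
```

On a real social graph the same traversal is what a friend-of-a-friend query does, just at a vastly larger scale.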
You will see today that a lot of the fraud detection technologies that financial organizations continuously use rely on graphs under the cover, because fraud detection with plain machine learning reached a threshold, and fraudsters are always outsmarting it. Graphs provided that extra insight, because fraudsters typically work as cohorts, as clusters, so a lot of financial organizations use graph technologies for fraud detection.

Graph technologies are all around us. The device you are holding is using graphs: it has social media, and the moment you use GPS you are using graphs, where the road intersections are the nodes and the roads themselves are the edges. Whenever you use a cell phone, as you are moving and the connection goes from one cell tower to the next, graph technology is used under the cover to map it. Supply chain is all about graphs; bills of materials are all about graphs. There are numerous examples. In computer science, for those of you who are from computer science, relational databases use deadlock detection: when there are deadlocks because we are trying to modify the same data concurrently, they use graphs to figure out if there is a cycle, and then they kick out one of the links to resolve the deadlock. Compilers use graph data structures under the cover too. So graphs are actually all around us.

Knowledge graphs are one level higher than graphs. A graph is just a graph database where you have mapped the connections in your data as nodes, edges, and attributes, but knowledge graphs go one level deeper. The term knowledge graph was popularized by Google in 2012, when Google changed its algorithm from using an inverted
index to using knowledge graphs. So when you google, or use any search engine for that matter: if you search for a dentist, Google will also give you oral surgeons and orthodontists. How does Google know that dentists and orthodontists are related? Orthodontist is a subclass of dentist, and Google uses a knowledge graph to know that. Google has an extensive knowledge graph which is not publicly available, but there are other knowledge graphs that are publicly available. Most big companies today use knowledge graphs extensively: take JPMorgan Chase, take eBay, they all have knowledge graphs under the cover to bring out the more inherent relationships, the unknown unknowns. That is a very difficult thing: the known knowns can be figured out easily, the unknown knowns can be figured out, but the unknown unknowns are very difficult to figure out, and that is where knowledge graphs come in.

These are just a handful of examples. The moment you use any search engine, as I said, you use a knowledge graph. The moment you use assistants like Siri or Alexa, they use knowledge graphs. All the banks use what is called KYC, know your customer (I call it "kill your customer": they ask you for so much information that you are exasperated), and KYC under the cover uses knowledge graphs, because now you are getting data from so many different sources and you have to integrate them meaningfully. "Meaningfully" is where the knowledge graph becomes very important: the semantics associated with the data. I'll use two terms very extensively here, ontologies and taxonomies (there is a gentleman here who has already worked with ontologies). Ontologies and taxonomies are the DNA and RNA of a knowledge graph, which is built on top of a graph database. Knowledge graphs require a graph database, but it's the things on
top of it that give you the meaning and the context.

So there are basically two forms of graphs. First, LPG, and no, it's not liquefied petroleum gas: it's the labeled property graph, which is very simple to understand. The advantage of graphs is the way you model them: it's very intuitive, the way your brain thinks, the way you whiteboard it. For those of you who have done relational databases: in relational, if you have to model a simple many-to-many relationship, say many products and many suppliers, you cannot do it directly. You have to break it down into an association table that turns the m-to-n mapping into two one-to-many mappings, and that association table has no meaning in real life; it's an extra burden on the join as well as on the conceptual side. That's where graphs come in: you can model the way your brain thinks. In a labeled property graph you have nodes, nodes have attributes, nodes are connected by edges, and edges have attributes. Very simple for the brain to understand. You are in this conference, going from one room to the other, and your brain internally is using a graph: each of the rooms is a node, and the corridors and passageways are the connections. Graphs are omnipresent. This slide is an example from the Ontotext website of how you can model a pizza with its different ingredients (there's a small error there, but overlook that). Each ingredient has its own attributes, whether it can be delivered, and so on. And note that you cannot escape modeling: you still have to decide what is a node, what is a relationship, what are the attributes.
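As a rough sketch of the labeled-property-graph model just described (the pizza data here is invented, not the actual slide content): nodes and edges each carry a label plus a map of attributes, and a query simply follows typed edges:

```python
# A minimal labeled-property-graph sketch: nodes and edges both carry
# labels and attribute maps. The pizza data is invented for illustration.
nodes = {
    "p1": {"label": "Pizza", "name": "Margherita", "deliverable": True},
    "i1": {"label": "Ingredient", "name": "Mozzarella", "vegetarian": True},
    "i2": {"label": "Ingredient", "name": "Tomato", "vegetarian": True},
}
edges = [
    {"from": "p1", "to": "i1", "type": "HAS_INGREDIENT", "grams": 120},
    {"from": "p1", "to": "i2", "type": "HAS_INGREDIENT", "grams": 80},
]

def ingredients_of(pizza_id):
    """Follow HAS_INGREDIENT edges from a pizza node to ingredient names."""
    return [nodes[e["to"]]["name"] for e in edges
            if e["from"] == pizza_id and e["type"] == "HAS_INGREDIENT"]

print(ingredients_of("p1"))  # -> ['Mozzarella', 'Tomato']
```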
And then this is where the real fun begins: what is called RDF, the Resource Description Framework, which is another way to model a graph. The advantage of RDF is that you break your statements, your facts, into the most granular form: subject, predicate, and object. "Socrates is a man." That's the simplest form, and when you break your facts down that far, as in RDF, it definitely explodes the amount of data you have to store, but it gives you the extra capability of reasoning. Socrates is a man; all men are mortal; Sumit is a man; so we can find out that Sumit is mortal too. A knowledge graph can deduce that new fact for you, because you have modeled your data in the most granular form, and that gives you extra power.

Let me quickly go back to LPG graphs, which have one big disadvantage. Any LPG graph can be converted to an RDF graph and vice versa; only the size differs. But when you use a GPS to go from one point to another, walking or driving, typically LPG graphs are used, because there are so many graph algorithms out there. When you do a Google search, the PageRank algorithm is at work: PageRank is basically an influence algorithm that asks which node in this whole connected structure is the most influential, the one all connections go through, and Google puts it at the top, apart from the knowledge graph part of the query. So LPG graphs are mostly used for graph analytics and graph traversal, but RDF graphs are much more powerful for doing semantics, because breaking things up with so much granularity makes it much simpler to model your domain at that granular, conceptual level.

This is the same pizza example, now as an RDF model. What you can also do here, see the top left, is take a recipe or allergy ontology or taxonomy, what sort of allergy you can get, and overlay it on top of your graph to get much more information. And these kinds of ontologies and taxonomies, as we will see, are publicly available.
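The Socrates example can be sketched as a tiny forward-chaining reasoner over subject-predicate-object triples (the predicate names here are simplified stand-ins for the real RDF/RDFS terms such as rdf:type and rdfs:subClassOf):

```python
# RDF-style facts as (subject, predicate, object) triples, plus one
# forward-chaining rule: anything of a class is also of its superclass.
triples = {
    ("Socrates", "is_a", "Man"),
    ("Sumit", "is_a", "Man"),
    ("Man", "subclass_of", "Mortal"),
}

def forward_chain(kb):
    """Materialize inferred triples until nothing new is derived."""
    kb = set(kb)
    changed = True
    while changed:
        changed = False
        new = {(s, "is_a", sup)
               for (s, p, o) in kb if p == "is_a"
               for (cls, p2, sup) in kb
               if p2 == "subclass_of" and cls == o}
        if not new <= kb:
            kb |= new
            changed = True
    return kb

inferred = forward_chain(triples)
print(("Sumit", "is_a", "Mortal") in inferred)  # -> True
```

Materializing inferred triples like this at load time is the forward-chaining style of reasoning.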
You could create your own domain-specific ontology or taxonomy, but you can also use other publicly available ones, which are written in standards. That is the advantage of knowledge graphs and RDF: the formats are very standardized, which makes them very interoperable. You can quickly import them, just like when you write C or Java code and do imports; you can import these ontologies the same way.

One more thing I wanted to show you: each subject, predicate, and object has an identifier which is globally unique, and that way you resolve ambiguity. So knowledge graphs are very useful in data management, especially when you do entity resolution and entity disambiguation, where you want to adhere to standards. For example, today we hear about terms like data mesh, where data contracts need to be established between data producers and data consumers. Data contracts can be modeled as a knowledge graph, saying: I want this data in this form, with these types.

So I think I've now said what a knowledge graph is: under the cover it's a graph database, and on top of the graph database are your ontologies and taxonomies, where you model your facts, and that gives you the inferencing and reasoning power.

Typically, at a high level, you build a knowledge graph like this: give it any content, especially unstructured content. In most organizations, 75 to 80 percent of the data is unstructured, meaning video, audio, and, most importantly, text. Text data has become so important now with ChatGPT and all these technologies. That unstructured data is in the form of emails, blogs, your PDF documents, document files. A knowledge graph can extract the entities from different
documents in your organizational data repository and then interlink them. That's why knowledge graphs are very heavily used in the chemical industry and in drug discovery. Ontotext specializes in life sciences and finance, as well as the energy sector and things like digital twins and IoT. The reason is that especially in energy and in finance there are a lot of regulatory requirements, and a lot of these regulatory and compliance rules are embedded in text, where meaning can be very ambiguous. You need to make sure your meanings are correct, and that's where knowledge graphs come in. You can extract these entities, then build the connections through entity linking and semantic enrichment by providing the ontology and the taxonomy, build the knowledge graph, and then through that you can do classification, comprehension, recommendation, and things like that.

The knowledge graph has three inputs: your basic raw data, your ontology, which could be very domain-specific, and your taxonomy. The BBC uses knowledge graphs for sports and for wildlife (and for one more thing, I forget); they use them pretty extensively to make sure that the classification and categorization of the news and so on are very well defined.

The advantages of a knowledge graph: it's reusable, because, as I said, you can reuse ontologies. There are a lot of ontologies available. The financial domain has a very well known ontology called FIBO, where all financial terms and their interrelationships are already defined. And the fun part is that these ontologies and taxonomies are themselves also modeled as graphs. So when you overlay that ontology graph on top of your data graph, as you saw in the pizza example when you overlay the allergy or the recipe graph, your graph gets much richer; it gets semantically much more meaningful. Then there's data integration:
you hear about the topic of data fabric. Data fabric is one of those buzzwords that are always floating around; I wouldn't say it's very common, but it's basically an integration layer, and under the cover a data fabric uses knowledge graphs. Knowledge graphs are very powerful for data integration: data coming from different silos, 360-degree data coming from different areas, and you need to tie it together meaningfully. Then reasoning: as in the Socrates example I gave, a knowledge graph can quickly figure out new facts for you. And standards: all these ontologies and taxonomies have to be written in specific languages and specific formats so that everything stays interoperable.

Now, I have a lot of slides here. I know I won't be able to cover them all in the 40 minutes, but they are for you to reference as well: there are a lot of standards and a lot of products used for ontology and taxonomy building that you can look up later for your own use.

Discovering the unknown unknowns: intermediate hops. When you do a friend-of-a-friend query in social media, or follow any link, it's basically a graph traversal, going from one node to the other and then the next, and you can go to any depth. And similar to the PageRank algorithm for search, there is a notion sometimes called graph rank, which lets you figure out the ranking of a particular node.

Providing semantics for concepts: you can use knowledge graphs in an enterprise setting; there is a term for this, enterprise knowledge graphs. Enterprise knowledge graphs can help you model your specific domain. If you are in the energy domain, you can use publicly available energy ontologies and import them, especially in oil and gas or in
the electrical industry, where there are so many standards. Let's say two pipes are getting connected, and each of these pipes has different standards it needs to operate under: the temperature, the width, the depth, all those sorts of things. When you have constructed something, when you have made an architecture, you need to validate it, and knowledge graphs can help you do that, because all those standards can be expressed in OWL, as ontologies or taxonomies, which are again very standard. So with enterprise knowledge graphs you can organize your knowledge domain.

This is a mind map of the different use cases for knowledge graphs, which I think I've mostly covered. They can be used in data management, and they are especially good in the content management space, where you have content that needs to be parsed so the entities can be extracted and then interrelated. Knowledge graphs are also used in ML explainability: say you are given a picture, how do you figure out that this picture is telling you a story or giving you some information? You can use knowledge graphs. We have all heard about LLMs; we are all overdosed with noise about LLMs. LLMs hallucinate, and at Ontotext we have developed techniques to minimize hallucinations, because the output of the LLM can be validated against a knowledge graph. The other way around works too: creating a knowledge graph is very challenging, which is why there is a steep learning curve, but now you can use LLMs. You provide your ontology as a prompt to an LLM, and the LLM can generate a knowledge graph for you from that ontology. So these are the different use cases, and there are many more uses of graphs and knowledge graphs.

So typically a knowledge graph stack would be something like
this. You have the data integration layer: that's the raw data, the raw material. You bring data from different sources, parse it out, and create the subject, predicate, and object triples I mentioned earlier. You store them in GraphDB, or whatever graph database you have, and then you apply your ontologies and taxonomies to build the semantic layer. On top of the semantic layer, if you want, you can have reasoning. Reasoning works in two different ways, forward chaining and backward chaining, and depending on your query-time latency needs, different engines use different approaches. GraphDB uses forward chaining: when you ingest the data, it materializes the inferred information up front, so when you query, results come back very quickly. Then you have the query layer. You can query in multiple ways: through APIs, or with SPARQL, the standard query language; it's a lot like SQL, but here you query your RDF database. You can use LLMs here as well: a lot of times queries become very complex, and LLMs can help validate your queries or provide explanations for them. You can also use GraphQL. And on top of the query layer you can build semantic applications. So it's a pretty long stack, and there is a lot inside each of the layers, but at a high level this is what happens in a knowledge graph.

So what are the main components you need? As I said, knowledge graphs need three things. You have your raw material, which is the data you have. You have the ontology and taxonomy: you can import them from public sources, or import them and then customize them for your own domain; a lot of the ontology and taxonomy tools provide ways to do that. Taxonomy and ontology are sort of the RNA and DNA for building your knowledge graph, and of course you need a graph database.
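To give a feel for what the SPARQL query layer does under the cover, here is a toy basic-graph-pattern matcher (the data and the "?variable" convention are simplified illustrations in plain Python, not real SPARQL):

```python
# A toy SPARQL-style pattern match over (subject, predicate, object)
# triples: strings starting with "?" are variables. Data is invented.
triples = [
    ("Margherita", "hasIngredient", "Mozzarella"),
    ("Margherita", "hasIngredient", "Tomato"),
    ("Mozzarella", "mayTrigger", "MilkAllergy"),
]

def match(pattern, triple, binding):
    """Unify one pattern with one triple; return extended bindings or None."""
    b = dict(binding)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if b.get(p, t) != t:
                return None
            b[p] = t
        elif p != t:
            return None
    return b

def query(patterns, data, binding=None):
    """Join patterns left to right, like a basic graph pattern in SPARQL."""
    binding = binding or {}
    if not patterns:
        return [binding]
    results = []
    for triple in data:
        b = match(patterns[0], triple, binding)
        if b is not None:
            results.extend(query(patterns[1:], data, b))
    return results

# "Which allergies could the Margherita trigger?" (a two-hop join)
rows = query([("Margherita", "hasIngredient", "?i"),
              ("?i", "mayTrigger", "?a")], triples)
print([r["?a"] for r in rows])  # -> ['MilkAllergy']
```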
You need a vocabulary: the glossary, the business terms your domain is using. You need a data mapping framework; Ontotext, for example, provides a way to integrate with your relational databases and bring your relational data in as a graph, but it needs that mapping layer to map the components of the relational schema to your ontologies, taxonomies, and knowledge graph schema. And of course you need data extraction tools.

Taxonomy: we have all been using taxonomies since we were two or three years old. Think of a taxonomy as a way to classify things, and I'll show you an example that will probably make more sense. Take machine learning: there are a lot of different ways of measuring the performance of a machine learning model, and these can be classified in different ways, so it's basically a classification. Then take data: data can be nested, data can come in many forms, CSV format, Parquet format, other formats. And there are public taxonomies available too. NASA uses knowledge graphs, taxonomies, and ontologies extensively; they have to.

The other thing knowledge graphs are heavily used for is data validation: because your data follows a certain taxonomy and ontology, you can quickly validate it. A lot of organizations now talk about AI and machine learning without having good data, and it's garbage in, garbage out: without good data quality, your machine learning algorithms are not really useful for the business. Knowledge graphs can step in here; at Ontotext we have implemented this for a number of European energy companies to validate their data, and found that a lot of the data violated certain regulations and compliance rules. So,
going back: a taxonomy is basically a way of classifying information, typically as a hierarchy. It's sort of a knowledge map, and it's the foundation for a lot of semantic search, discovery, metadata tagging, faceting, things like that. Findability: most organizations build data lakes with the attitude of "let's first bring in the data, then figure out what to do with it." That's totally the wrong approach, and it's why first-generation data lakes all failed and became data swamps. The biggest problem with data lakes was findability: you could not semantically find your data. Who brought what data, and when? What are the data quality profiles? How can I use it? Who is using it? That's where findability becomes very important, and with knowledge graphs you are much better able to find things semantically.

Let me see how I'm doing on time. Okay. These are some of the taxonomy standards. Standards are the most important part of knowledge graphs. Sir Tim Berners-Lee has been talking about this since the late 90s: the semantic web. It still hasn't fully happened, but that's the idea: we stop searching with strings, because strings are just characters and characters are just zeros and ones, and start searching with things, which are concepts. That's the whole idea of the semantic web. When you develop taxonomies, you follow these standards, and that makes your taxonomy interoperable and reusable across different departments within your organization; most of the tools follow these standards.

Remember earlier when I described RDF: subject, predicate, object, each with a unique identifier. So you can uniquely identify an "is a" relationship, a "has a" relationship, a "composed of" relationship, and because they are unique, your knowledge graph engine can quickly give you the semantics that this is related to that.
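A sketch of how a taxonomy hierarchy improves findability (the terms and the hierarchy are invented for illustration): a document tagged with a narrow concept should still match a query for a broader one, the way a search for "dentist" also finds orthodontists:

```python
# Taxonomy sketch: narrower-to-broader links, SKOS-style (terms invented).
broader = {
    "Orthodontist": "Dentist",
    "OralSurgeon": "Dentist",
    "Dentist": "MedicalProfessional",
}

def broaden(term):
    """Walk up the hierarchy: the term plus all of its ancestors."""
    chain = [term]
    while chain[-1] in broader:
        chain.append(broader[chain[-1]])
    return chain

def semantic_match(query_term, doc_term):
    """A document tagged with a narrower term matches a broader query."""
    return query_term in broaden(doc_term)

print(semantic_match("Dentist", "Orthodontist"))  # -> True
```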
That standardization is one of the core things to understand: it's what gives knowledge graphs their reusability and their power to infer.

These are a number of taxonomy tools, and there are ontology tools as well. There is a very well known tool called Protégé, from Stanford University, which is very usable. Especially those of you with a background in object-oriented analysis, design, and development will find it easy to pick up: a lot of it falls into that category, with classes, subclasses, superclasses, "is a subclass of", and so on. It's not exactly how taxonomies and ontologies work, but it gives you a good way to start and get your head around them.

But taxonomies have problems, for example with polyhierarchies, multiple hierarchies, one thing belonging to multiple parents, just as object-oriented design has challenges with multiple inheritance. Nested data is fine, but very strict hierarchies often limit your conceptual orientation: one object can belong to multiple hierarchies, and that happens in real life. There, taxonomies become limited, and that's where ontologies come in and provide much more depth.

So think about an ontology as the schema for your graph data. Now you may be a little confused here: we had relational databases, which were very schema-oriented, and then we had non-relational databases, like key-value, graph, time series, and so on. The advantage typically claimed for NoSQL databases is that they are schema-agnostic, but at some point you have to put a schema on everything to get meaning out of it, and the ontology provides that schema. And again, remember that ontologies and taxonomies are themselves also
modeled as graph elements, as subject, predicate, and object. Everything is a graph here.

The other thing about an ontology is that it helps you minimize ambiguity. Once you have an ontology on top of your data, you can quickly figure out what belongs to which class, which hierarchy, which categorization, and it's all very formal. That's the beauty of knowledge graphs: everything is machine readable, so the engine, Ontotext's GraphDB engine or any other graph database engine, can automatically run your data validations and work out where your elements belong. We use the term IRI, Internationalized Resource Identifier: like a URI, a Uniform Resource Identifier, but internationalized, so it's global and standard. That is what gives you the machine-readable capability, so you can automate your data validations and automate a lot of the generation of your knowledge graphs.

And once you have an ontology: in big enterprises, different departments have ambiguities in terms of meaning, in terms of data, in terms of what data types are going to come, all that. With a good shared ontology you can make data much more shareable. Data sharing is what a lot of the cloud vendors are doing now with data exchanges, or data marketplaces as they call them. These are based on data contracts, and the knowledge graph is the engine underneath that makes sure incoming data conforms to a specification, to a contract, using the right kind of ontologies. So basically the ontology gives you
the blueprint. It gives you semantic typing capabilities for relations as well as for your entities. It gives you a common, shareable data model, and that makes everything interoperable. I think I already talked about all this, so we don't have to go through it; we're doing good on time.

So think about ontologies as a schema; I have an example below that shows the type, the class to which a concept belongs. Each concept has different properties, and concepts are interrelated. You can have different kinds of relationships: I have a relationship to my spouse, I have a relationship to my son or daughter. Relationships come in different types, and the ontology helps you specify that.

The other part is what are called constraints. Just as in relational databases you specify constraints, say that the value of a field has to be between this and this, it's very similar here. Constraints are specified in a standard language called SHACL, the Shapes Constraint Language; it's a pretty easy language, not very difficult. It says that a particular data element has to belong to certain classes, or multiple classes, or fall within certain values, and that helps you do data validation: a SHACL engine reads the shapes and makes sure your data is valid. And then there is the taxonomy, which covers the concepts, the classes, and the hierarchies.

So this is what the core components of an ontology look like: the classes and the properties. Properties can have subproperties and superproperties they inherit from, and the ontology defines all that. You see there is an "is mother of" relationship; each of these relationships has a unique identifier within a given standard, and that gives knowledge graphs the additional power to quickly interpret, infer, and reason.
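The constraint idea can be sketched like this (a SHACL-flavored illustration in plain Python, not real SHACL shape syntax; the pipe properties and allowed values are invented):

```python
# Shapes declare the expected type and allowed range or value set for
# each property of a class; validation reports every violation found.
shapes = {
    "Pipe": {
        "diameter_mm": {"type": float, "min": 10.0, "max": 500.0},
        "material": {"type": str, "in": {"steel", "copper", "pvc"}},
    }
}

def validate(entity_class, record):
    """Return a list of constraint violations for one record."""
    errors = []
    for prop, rule in shapes[entity_class].items():
        value = record.get(prop)
        if not isinstance(value, rule["type"]):
            errors.append(f"{prop}: wrong type")
            continue
        if "min" in rule and value < rule["min"]:
            errors.append(f"{prop}: below minimum")
        if "max" in rule and value > rule["max"]:
            errors.append(f"{prop}: above maximum")
        if "in" in rule and value not in rule["in"]:
            errors.append(f"{prop}: not an allowed value")
    return errors

print(validate("Pipe", {"diameter_mm": 60.0, "material": "steel"}))  # -> []
print(validate("Pipe", {"diameter_mm": 5.0, "material": "wood"}))    # two violations
```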
Ontology best practices: in big organizations, don't try to build the whole enterprise ontology first. Start small; that's the mantra. Have big ideas, yes, but start small, iterate, and refine, and the same goes for ontologies. The tools out there are pretty good now, well matured enough to help you. Also, don't start from scratch when building ontologies: import existing ontologies that are available in the public domain and then change them, removing what you don't need, the way you do with everything else: you import a template and then customize it.

On ontology tools and libraries: Ontotext partners with other vendors here. Ontotext is not an end-to-end knowledge graph product; we integrate with partners who help us bring in data from external sources, build the ontologies, and visualize the graph. We have our own visualization, but we also integrate with external graph visualization tools. What we provide is the core graph database to bring in your data and model it as RDF, plus the graph engine to do inferencing and reasoning. There are a lot of well known ontologies; some of the ones I have highlighted are very well known, and there is an ontology for almost anything you can think of. So when you start building your knowledge graph, you don't have to start from scratch; you can import these and start using them. These are some ontology standards. Some, like SHACL, as I said, give you automatic data validation: write your data specifications in SHACL, import them into the GraphDB engine, and find out whether your data is valid or not.

And these are certain questions to ask before you start building your knowledge graph. What
Do I need to build an enterprise-wide knowledge graph? How is the knowledge shared? How is my data refreshed on a regular basis? (I have five minutes more, I think, and a few more slides.) These are some of the questions you should ask when you start building a knowledge graph. Especially: start building it in a departmental unit, and collaborate with the domain specialists; when you build a knowledge graph, you need domain specialists for your particular domain. And this is how you typically go about building one. You have data virtualization, on the top left, where you bring in the data for processing and enrichment; you do the schema mapping; and on this side are all the different things we talked about, to build the model repository and then generate the graph on top of it. Begin with a single use case, identify where your content gets imported with the right kind of data integration tools, and then reuse your ontologies; define your ontologies and taxonomies first. There is also a well-known standard called R2RML, relational-to-RDF mapping, which can bring data from relational databases into an RDF graph database engine; that's one data integration tool you can use. And, as I talked about, you can use LLMs to build your knowledge graphs by giving them an ontology as a prompt. Again, when you do ontologies, these are some of the questions you should be asking: can I reuse? Typically, don't go and build ontologies yourself; reuse, and involve your SMEs, your domain experts. And this is how you go about building an ontology: typically, model nouns as concepts and verbs as relationships; define the different facets as attributes; and define your classes, superclasses, and subclasses to which your concepts belong.
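A hedged sketch of the relational-to-RDF idea behind R2RML, again in plain Python. Real R2RML mappings are declarative Turtle documents (subject URI templates plus column-to-predicate maps) executed by the triple store; the table, column names, and URI templates below are invented purely for illustration.

```python
# Sketch: map rows of a relational "employee" table to RDF triples,
# the way an R2RML triples map would: a subject URI template built from
# a key column, plus a predicate per mapped column. All names invented.

rows = [
    {"id": 1, "name": "Alice", "dept": "Sales"},
    {"id": 2, "name": "Bob", "dept": "Engineering"},
]

SUBJECT_TEMPLATE = "ex:employee/{id}"                     # rr:template analogue
PREDICATES = {"name": "foaf:name", "dept": "ex:department"}  # rr:predicateMap analogue

def rows_to_triples(rows):
    """Turn each relational row into one rdf:type triple plus one triple per column."""
    triples = []
    for row in rows:
        subject = SUBJECT_TEMPLATE.format(**row)
        triples.append((subject, "rdf:type", "ex:Employee"))
        for column, predicate in PREDICATES.items():
            triples.append((subject, predicate, row[column]))
    return triples

for t in rows_to_triples(rows):
    print(t)
```

The design point is the same as with SHACL: the mapping is data, not code, so the same relational source can be re-projected into the graph whenever the tables refresh.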
Well-known knowledge graph platform vendors include Ontotext, Cambridge Semantics, and Stardog, and of these, Ontotext is the only one with a graph database engine benchmarked against a worldwide standard; most of the other vendors don't have a performance benchmark for these graph analytic queries. For querying, as I said earlier, there are different ways. There are SQL-to-SPARQL converters, so you can query with SQL as well. You can query DBpedia because it has a SPARQL endpoint; if you know SPARQL, you can just query DBpedia and get your answers. We also have an endpoint for certain example demos; if you're interested, I can give it to you. Most querying nowadays is with SPARQL or GraphQL. A knowledge graph is not a one-off engineering project: do it in an iterative mode, start small and iterate, bring in different data sets, and ensure you have the right ontologies. Typically, don't build knowledge graphs component by component; go for a company that gives you the full platform, otherwise you will struggle. Yes, you can do it with open source, putting things together, but that's a problem in the cloud too: cloud vendors give you all the services and then you have to stitch them together, which is a huge integration effort. So typically, go with platform vendors. These are certain guidelines you can refer to as you go and build your knowledge graph. With that I will end. Thank you for being very good listeners and thank you for joining; I hope this was useful. If you have questions you can ask them now, or I'm here. Let me see if it goes back... thank you. Oh, the next one. You can download them as well; all these slides are available. Yes, please, go ahead. [Audience] Hi, I had a question about the general approach of using knowledge graphs. What do
you think about these new statistics-based methods that are coming, where you take the raw data and can directly query and ask questions of the raw data, without having these intermediate knowledge graphs? As people invest more time and energy into those statistical methods, do you think they can ever reach a point where they can compare with knowledge graphs? [Speaker] They can, but the challenge is hallucinations. You may have to take out liability insurance before you start using LLMs widely across your data engineering, because you never know when the output is wrong. In data engineering you typically want 100% correctness, and that's where knowledge graphs complement LLMs: the output of the LLM is fed to a knowledge graph, verified, and only then fed into your automated or continuous system. That's the direction we see things going; it's more complementing each other than competing. [Audience] Thank you. I have a question about the parent relationship. Can you express this relationship with more general rules? For example, can I say that the parent of a parent is a grandparent, or something about the sibling of a sibling? So I can express it not just as individual instances, but based on classes and rules? [Audience] So, I understand you would have an ontology before you take some unstructured data to create a graph out of it, and the ontology will dictate how the graph gets created. But the problem with unstructured data is: how do you know where the entities are? I've personally had to deal with this a lot. We would first pass the data through NER or something, and then it's kind of a chicken-and-egg problem, because the NER could just not be as accurate.
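The "parent of a parent is a grandparent" rule the questioner asks about is exactly what an OWL 2 property chain expresses over the graph (in Turtle, roughly `:hasGrandparent owl:propertyChainAxiom (:hasParent :hasParent)`), so the reasoner infers every grandparent fact instead of you asserting each one. A minimal, hedged Python sketch of that forward-chaining idea; the facts and predicate names are invented for the example, and a real RDF engine would apply the rule during inferencing.

```python
# Sketch of rule-based inference over a graph with one rule:
#   hasParent(x, y) AND hasParent(y, z)  =>  hasGrandparent(x, z)
# Facts are (subject, predicate, object) tuples; all names are made up.

parent_facts = [
    ("carol", "hasParent", "alice"),
    ("alice", "hasParent", "grandma"),
    ("bob", "hasParent", "grandma"),
]

def infer_grandparents(triples):
    """Chain hasParent with itself to derive hasGrandparent triples."""
    parents = {}
    for s, p, o in triples:
        if p == "hasParent":
            parents.setdefault(s, []).append(o)
    inferred = []
    for child, child_parents in parents.items():
        for parent in child_parents:
            for grandparent in parents.get(parent, []):
                inferred.append((child, "hasGrandparent", grandparent))
    return inferred

print(infer_grandparents(parent_facts))  # → [('carol', 'hasGrandparent', 'grandma')]
```

Stating the rule once at the class/property level, rather than per instance, is what distinguishes this from simply loading individual grandparent facts: add a new parent triple and the inferred grandparent appears automatically.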